KubeCon 2019 insights

Attended KubeCon 2019 in San Diego last week. Great conference – I liked the 30-minute presentation format, and I got more out of this one than any I’ve attended. Late November is also a great time to be in Southern California, though it’s almost disorienting to see the sun every day.

Key Takeaways

  1. Teams should be structured to minimize cognitive load in order to maximize effectiveness: “minimize cognitive load for others”; “use small, long-lived teams as the standard” – from the Team Topologies book authors
  2. DevEx (Developer Experience) – a term (new to me) used by the Team Topologies book authors. Though easily confused with devx.com, I like it. “DevEx must be optimized to maximize feature flow”
  3. CD platform Spinnaker has really taken off, with wide adoption and a healthy ecosystem – Spinnaker Summit, now in its third year, drew 500+ attendees over three days, just prior to KubeCon / CloudNativeCon. Armory offers a hosted Spinnaker solution. A couple of my ex-co-workers have presented at Spinnaker Summit and are active members of the community: Joel Vasallo, Steven Basgall
  4. “Serverless” can mean different things, ranging from simple scale-to-zero to full functions-as-a-service. Examples of the latter are OpenFAAS (which supports k8s, Swarm, ECS / Fargate) and the vendor-specific AWS Lambda, Google Cloud Functions, and Azure Functions; an example of the former is Knative (more of a serverless building-blocks technology, one which appears destined to be commonly paired with k8s)
  5. Platform as a Product leads to a reliable platform
  6. Observability is largely about being smarter about unanticipated failure events (the “weird zone”), and about refining one’s craft with the monitoring / alerting tooling one already has
  7. Security requires ongoing effort, “treat vulnerabilities like earthquakes” (they’re going to happen, maximize resiliency / recovery)
  8. Kubernetes (k8s) ecosystem is strong and getting stronger, many using it in production, GKE currently more widely used than EKS, but EKS is certainly working for some. “Kubernetes is the OS for the cloud”
  9. Service Mesh delivers the most benefit when most or all applications are on a service-mesh-ready deployment technology (incl. k8s, AWS ECS / Fargate). There are competing service mesh solutions (incl. Istio, Linkerd, AWS-specific-AppMesh, a la carte Envoy, and others), with perhaps no clear leader yet (though Knative may drive Istio adoption). “Service mesh does for service-to-service communication what Kubernetes has done for orchestration”

Favorite talks / workshops

The Elephant in the Kubernetes Room: Team Interactions at Scale – Manuel Pais, Independent (co-author of “Team Topologies”)  **  slides  **  Team Topologies book  **  teamtopologies.com
See also: How Airbnb Simplified the Kubernetes Workflow for 1000+ Engineers

I’m a third of the way into the book – the best I’ve read since Lean Enterprise. Really well-written, with just the right balance of theory and case study, and the diagrams are well chosen.

  • DevEx == Dev Experience – simplify, simplify, simplify
  • Platform as a Product – leads to a Reliable Platform
  • Cognitive Load – minimize it within any team, and optimize DevEx by minimizing it for Dev team customers (by providing easy-to-use abstractions)
  • Platform Team: Platforms fit for purpose, optimized for DevEx
  • The 3 primary team types: Stream-aligned (feature delivery), Enabling (DevOps embedding / support for stream-aligned), and Platform
  • Platform should make it easier to do the right thing, encouraging dev teams to use the platform and not diverge and be on their own
  • Kubernetes should be a hidden implementation detail, with the Platform providing abstractions for good DevEx

OpenFaaS Cloud + Linkerd: A Secure, Multi-Tenant Serverless Platform – Charles Pretzer, Buoyant & Alex Ellis, OpenFaaS, LTD  ** slides

  • OpenFAAS is “Serverless 2.0” (Any code, Anywhere), vs. “Serverless 1.0” vendor-specific platforms incl. AWS Lambda, Google Cloud Functions, Azure Functions
  • Anywhere means k8s, Swarm, ECS / EKS, Datacenter, Local
  • Two options: OpenFAAS, and OpenFAAS Cloud, which bundles Git-based CI / CD, runtime secrets, OAuth, and a Linkerd-based service mesh
  • Simplicity (a short stack.yml + handler.js for a simple node.js example, vs. six distinct, longer configs for k8s). Note an OpenFAAS Dockerfile is optional but supported – see the sketch after this list
  • Linkerd (bundled with “OpenFAAS Cloud”) is the only service mesh currently in the CNCF (incubating): actionable metrics, deep runtime diagnostics, CLI debugging, 60-second install, lightweight. Traffic splitting, mTLS, and a dashboard showing routing paths.
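To make the simplicity point concrete, here is a minimal sketch of the basic OpenFAAS workflow using faas-cli (not from the talk; the function name and local gateway URL are assumptions):
faas-cli new hello --lang node
faas-cli up -f hello.yml --gateway http://127.0.0.1:8080
“faas-cli new” generates the hello.yml stack file and a handler.js; “faas-cli up” builds, pushes, and deploys in one step.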

No-Nonsense Observability Improvement – Cory Watson, SignalFx ** slides

  • “The Normal Zone” includes monitoring for Anticipated behaviors
  • The “Weird Zone” is about Observability of Unanticipated behaviors
  • Observability will be one of your most expensive projects
  • Incident measures – traditional ones (incl. MTTD, MTTR) can be lame; instead look to Nora Jones’ cyclic approach: difficulties in understanding, system-specific failure rates, surprises, lack of ownership, near misses
  • Automation – avoid taking the human “out of the loop”
  • Invest according to risk and need
  • Understand the use cases

Making an Internal Kubernetes Offering Generally Available – James Wen, Spotify  **  slides

  • “Take complexity for your developers” (more complex devops tooling with better abstractions can be worth it for a better DevEx)
  • Between the extremes “Complete Team Autonomy” and “Centralized Ops” is their happy medium: Ops (embedded) in teams, Core-Infra Org, Golden Path
  • “Establish trust through monitoring”
  • Metrics incl.: Status of backups
  • “If you don’t have restored backups, you don’t actually have backups”

Doing Things Prometheus Can’t Do with Prometheus – Tim Simmons, DigitalOcean **  slides

  • Metrics need to be Actionable, Contextual
  • Learn existing tools deeply – more valuable than shiny new ‘observability’ tools
  • Jeff Smith: “Maintenance is Revenue Protection”
  • Anomaly detection can easily be done with custom code – you don’t always need a product with that feature (see the sketch below)
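As a hypothetical illustration of that last point, this PowerShell sketch queries the standard Prometheus HTTP range-query API and flags the newest sample when it falls more than three standard deviations from the past hour’s mean (the Prometheus URL and metric name are assumptions):
# Pull an hour of samples via /api/v1/query_range (assumed server and metric)
$end = [DateTimeOffset]::UtcNow.ToUnixTimeSeconds()
$start = $end - 3600
$resp = Invoke-RestMethod -Uri "http://prometheus:9090/api/v1/query_range?query=http_requests_total&start=$start&end=$end&step=60"
$values = $resp.data.result[0].values | ForEach-Object { [double]$_[1] }
# Flag the newest sample if it sits more than 3 standard deviations from the mean
$mean = ($values | Measure-Object -Average).Average
$sd = [math]::Sqrt(($values | ForEach-Object { [math]::Pow($_ - $mean, 2) } | Measure-Object -Average).Average)
if ([math]::Abs($values[-1] - $mean) -gt (3 * $sd)) { Write-Output "Anomaly: $($values[-1]) vs mean $mean" }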

How Yelp Moved Security From the App to the Mesh with Envoy and OPA – Daniel Popescu, Yelp & Ben Plotnick, Cruise ** slides

  • OPA (Open Policy Agent) case study
  • OPA incl. unit-testability
  • OPA decision logs published to log collector (Splunk)
  • AuthN, AuthZ via Envoy sidecar
  • For projects like this, start from the use cases, be mindful of scope creep

Design Decisions for Communication Systems – Eric Anderson, Google  **  slides

  • Excellent historical context / pros & cons of Messaging mechanisms incl. gRPC, REST, Unix socket, TCP socket, older ones incl. DCOM / CORBA

Weaveworks EKS AppMesh Gitops workshop

  • Pretty well-constructed workshop targeting EKS & AppMesh with a GitOps workflow, using Flux as a k8s operator to promote container images
  • If you were at the conference, you got a nice Cuttlefish shirt with proof of completing this, pictured at the link above

Dockerize SQL Server using Windows Server 2016 Containers

how to build a SQL Server 2014 Express image from a single “docker build” command

I lead an automation group where we need to easily spin up and down multiple locally-administered database instances, in support of automated testing of web and other applications. Some applications use MySQL, PostgreSQL or MongoDB, which are supported as Docker Linux Containers. For projects using Windows-based SQL Server, there had been no Container option until the advent of Windows Server 2016, now available in preview (TP5).

To dive immediately into how to create a Dockerized SQL Server, see: Windows Server 2016 Support for Docker Containers.

Automation Server Scenario: SQL Server Database Instance private to each test run

[Diagram: simplified automated build / test environment – CI-built images, a pool of generic VMs, and per-test-run application + SQL Server containers]

The simplified environment diagram above shows how containers are envisioned to be used in the automated build / test environment:

  • Docker images are CI-built when source code and / or dockerfile changes (the SQL Server image is rebuilt at most once / week, mainly to pull the latest OS and SQL Server updates into the image)
  • Docker containers are spun up on-demand on a pool of generic VMs – nothing to install on the generic VMs except docker (and build server agent) – no dependency on external sql server instance
  • Each test container instance runs against its own dedicated SQL Server container instance, and each is torn down at the end of the test (see the sketch below)
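As a sketch of one test run’s container lifecycle (the container name is hypothetical; the CI agent would generate a unique one per run):
# Dedicated database container for this test run
docker run --name sql-run42 -d -p 1433:1433 sqlexpress
# ... the test container runs against this instance here ...
# Tear down at the end of the run – nothing is left behind on the generic VM
docker stop sql-run42
docker rm sql-run42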

Another scenario – making private SQL Server instances more readily available to Mac developers:

Mac Developer Scenario: SQL Server Local Instance easily installed

Mac Developers, when working on a project using SQL Server, should have access to their own instance of SQL Server. The SQL Server Express Docker image makes that easier – they need only create a Windows Container Host VM (Windows 2016 TP5 preview) using VirtualBox as described here.

Windows Server 2016 Support for Docker Containers

Windows Server 2016 (available now in a preview edition, TP5) introduces support for Windows Containers, with Docker-compatible command support.

Docker Hub currently has Windows Container images for MongoDB, MySQL and SQLite databases (as well as Redis) – but oddly, not for SQL Server.

See below for info on how to build a SQL Server 2014 Express image from a single “docker build” command. It configures SQL Server Express to listen on static port 1433 and supports a runtime choice of data / backups stored internal-to-container or on a host volume. This was created and tested using Windows Server 2016 TP4 Core.

These steps assume you have already established a Windows Container Host VM (with at least 8G free space) – if not see Creating a Windows Container Host VM.

Building the SQL Server Docker image

If you have git installed on the host VM, you can clone the repo https://github.com/bucrogers/Dockerfiles-for-windows.git

Otherwise, pull the files as follows – from powershell:
wget https://github.com/bucrogers/Dockerfiles-for-windows/archive/master.zip -outfile master.zip

expand-archive master.zip
cd master\dockerfiles-for-windows-master\sqlexpress
Build an image named ‘sqlexpress’ using Docker – this will take up to an hour, longer the first time since the Docker Hub base image must be downloaded (it is then cached in the local Docker repo) – then list the images:
docker build -t sqlexpress .
docker images

Running a SQL Server Docker Container

The firewall on the host VM must be opened for each container port that is to be accessed from outside the host VM – for a single instance of the SQL Server container, open port 1433 – from powershell:
if (!(Get-NetFirewallRule | where {$_.Name -eq "SQL Server 1433"})) { New-NetFirewallRule -Name "SQL Server 1433" -DisplayName "SQL Server 1433" -Protocol tcp -LocalPort 1433 -Action Allow -Enabled True}
Run a single instance of a SQL Express container, in Docker “interactive” mode, with data hosted inside the container – you’ll end up at a powershell inside the container – when you exit powershell the container will stop:
docker run -it -p 1433:1433 sqlexpress "powershell ./start"
Connect using SQL Server Management Studio (I ran this from Windows 10)  – you’ll need the hostname / IP address of the Windows Container Host VM (via “ipconfig”). The sa password is ‘thepassword2#’.
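Alternatively, a quick connectivity check with sqlcmd (substitute your host VM’s IP address for the example 10.0.0.4):
sqlcmd -S 10.0.0.4,1433 -U sa -P "thepassword2#" -Q "SELECT @@VERSION"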

Run two SQL Express containers, this time in “detached” mode, with data volumes on the Host VM, first removing the prior container – from powershell:
..\stop-rm-all
md \sql
md \sql2
docker run --name sql -d -p 1433:1433 -v c:\sql:c:\sql sqlexpress
docker run --name sql2 -d -p 1434:1433 -v c:\sql2:c:\sql sqlexpress
You can now connect to either of these from SSMS on their respective ports – when the containers are stopped and removed, the data and any backups remain on the Host VM, to be automatically reused on a future container run.
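For example, to convince yourself the data survives, stop and remove the first container, then re-run it against the same host folder – databases created earlier will still be attached:
docker stop sql
docker rm sql
docker run --name sql -d -p 1433:1433 -v c:\sql:c:\sql sqlexpress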

SQL Server dockerfile details

The dockerfile full source is here. It facilitates the following:

  • Configuration of SQL Server Express  to listen on static port 1433
  • Installation from SQL Server 2014 Express download link
  • -or- Provided SQL Server edition setup.exe
  • Run with data / backups inside the container (not persisted)
  • -or- Data / backups on host VM (persisted)
  • Bootstrap first-time data initialization for persisted mode
  • Running container either detached or interactive

Below is a walk-through of the key statements in the dockerfile:

Use a base image from Docker Hub including .NET Framework 3.5, since SQL Server requires it –  the first time this is run, it will download to the local Docker registry running on the host, after which it is pulled from the local registry:
FROM docker.io/microsoft/dotnet35:windowsservercore
Declare Docker environment variables for data folder usage etc. – these may be overridden at container-runtime (though that is not useful for this image):
ENV sqlinstance SQL
ENV sql c:\\sql
ENV sqldata c:\\sql\\data
ENV sqlbackup c:\\sql\\backup
Copy files from host into container for use at build time:
COPY . /install
Download the SQL Server 2014 Express installer from the link and extract its files to the expected setup folder (this link may be customized for any other edition of SQL Server, or the dockerfile may be modified to copy existing setup\* files from the host):
RUN powershell invoke-webrequest "http://download.microsoft.com/download/E/A/E/EAE6F7FC-767A-4038-A954-49B8B05D04EB/Express%2064BIT/SQLEXPR_x64_ENU.exe" \
-OutFile sqlexpr_x64_enu.exe

RUN /install/sqlexpr_x64_enu.exe /q /x:/install/setup
Run SQL Server installer, specifying folders, sa password, TCP protocol enabled – special note: the SYSTEM account must be used for the SQL Server Engine service as of Windows 2016 TP4 (the default “Network Service” account does not work):
RUN /install/setup/setup.exe /q /ACTION=Install /INSTANCENAME=%sqlinstance% /FEATURES=SQLEngine /UPDATEENABLED=1 \
/SECURITYMODE=SQL /SAPWD="thepassword2#" /SQLSVCACCOUNT="NT AUTHORITY\System" /SQLSYSADMINACCOUNTS="BUILTIN\ADMINISTRATORS" \
/INSTALLSQLDATADIR=%sqldata% /SQLUSERDBLOGDIR=%sqldata% /SQLBACKUPDIR=%sqlbackup% \
/TCPENABLED=1 /NPENABLED=0 /IACCEPTSQLSERVERLICENSETERMS

Change TCP protocol configuration to listen on static port 1433, since SQL Server Express defaults to a dynamic port:
RUN powershell ./Set-SqlExpressStaticTcpPort %sqlinstance%
Remove the data and backup directories, in order to facilitate running this image with data and / or backups hosted either internal-to-container or on an external volume.

This is necessary since Windows Docker, as of TP4, does not support mapping to an existing container folder. But even when that support is added, it may be preferable to leave this as-is since this approach supports a first-time automatic “bootstrap” copy of data files to an external host volume:
RUN powershell ./Move-dirs-and-stop-service %sqlinstance% %sql% %sqldata% %sqlbackup%

Declare a default CMD, which simplifies invoking “docker run” later – this first invokes start.ps1 to set up files appropriately for an internal or external data volume, then runs a detached-friendly sleep loop:
CMD powershell ./start detached %sqlinstance% %sqldata% %sqlbackup%
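The detached-friendly sleep loop likely amounts to something like the following minimal sketch (the actual start.ps1 in the repo also handles the data-folder bootstrap):
# Keep a foreground process alive so a detached ("docker run -d") container doesn't exit
while ($true) { Start-Sleep -Seconds 60 }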

Creating a Windows Container Host VM on Azure

See here for an example Azure Resource Template to create a Container-ready Windows Server 2016 TP5 VM on Azure.

Creating a Windows Container Host VM on Hyper-V

The following instructions are adapted from Deploy a Windows Container Host to a New Hyper-V Virtual Machine:

Prerequisites

You may run a Windows Container Host VM on any Hyper-V-capable host; I performed my testing on both Windows 10 and Azure.

Steps

Open Command Prompt (Admin), then start powershell:
powershell
Determine whether a VM Switch is configured – if nothing is returned below, create a new *external* VM switch, then proceed:
Get-VMSwitch | where {$_.SwitchType -eq "External"}
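If no external switch exists, one can be created as follows (the adapter name "Ethernet" is an assumption – check Get-NetAdapter for yours):
New-VMSwitch -Name "External" -NetAdapterName "Ethernet" -AllowManagementOS $true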
Pull the create container host script – important note: before running it, modify the script to increase the disk size from 8GB to 20GB by searching for "-SizeBytes 8GB":
wget -uri https://aka.ms/tp5/New-ContainerHost -OutFile c:\New-ContainerHost.ps1
Run the script to create the Container Host VM – this will take up to 1.5 hrs. Note there is an optional “-Hyperv” argument which will work only if running on Windows 2016 TP4 installed on a physical box – this is necessary only to evaluate “Hyper-V Containers”, not for “Windows Containers”:
c:\New-ContainerHost.ps1 -VmName <yourVmName> -WindowsImage ServerDatacenterCore

If you get a “running scripts is disabled” error, execute the following, then try again:
Set-ExecutionPolicy RemoteSigned


Whale riding Docker in a sea of Microservices

make development more consistent and deployment more reliable


Saw a couple of interesting talks on Docker / Microservices last week: “State of the Art in Microservices”, the DockerCon Europe 2014 keynote, by Adrian Cockcroft; and “Docker in Production – Reality, Not Hype”, at the March 2015 DevOps-Chicago meetup, by Bridget Kromhout (links below).

Adrian’s Microservices talk was interesting in that it was not limited to the purely technical realm of Microservices and Docker, but also described the organizational culture and structure needed to make it work:

  • Breaking down the silos – in his example, a traditional “Monolithic Delivery” team must interface with each of 8 autonomous silo groups, often via ticket-driven, bottleneck-prone workflows; contrast this with two cross-functional “Microservices” teams (Product Team, Platform Team), each spanning formerly-silo’d areas of expertise – making the point that introducing these DevOps-oriented cross-functional teams is a re-org
  • Microservice-based production updates may be made independently of other service updates, facilitating continuous delivery by each Microservice team and the reduced-bottleneck, high-throughput that results from it – contrasted with Monolithic Delivery deployments, which work well only with a small number of developers and single language in use
  • Docker containers facilitate the above by isolating configurations for each Microservice in their own containers, which are resource-light and start in seconds (and might live for only minutes), vs. a traditional VM-based approach which is more resource-hungry, starts in minutes and is typically up for weeks
  • Microservice definition: loosely coupled service-oriented architecture with bounded contexts – the most succinct definition I’ve seen, contrasted with the broader SOA term, which can describe either a loosely or a tightly coupled architecture (the latter often in the form of RPC-like WSDL / SOAP implementations). Loose coupling is essential for the independent production updates mentioned above, with bounded contexts (how much a service has to know about other services) an indication of loose coupling. A common example of a tightly-coupled system is a centralized database schema, with the database being the “contract” between two or more components
  • AWS Lambda is an interesting service that scales on demand with 100ms granularity (currently in preview)
  • Example Microservice architectures shown for: Netflix, Twitter, Gilt, Hailo
  • Opportunity identified – of Docker Hub as an enterprise app store for components
  • Book Recommendation – Lean Enterprise: Adopting Continuous Delivery, DevOps and Lean Startup at Scale

Bridget’s talk about how DramaFever uses Docker in production (since late 2013) described some of the benefits of using Docker:

  • Development more consistent – when developers share docker containers for their environment, it both reduces friction during development and eases deployment handoff to shared-dev, QA, staging, and production environments. Another side benefit: a production container can be easily and quickly pulled by a developer to a local environment for troubleshooting. In their case they went from a 17-minute Vagrant-based developer setup (which also differed from production in its configuration) to a < 1-minute Docker-based one
  • Deployment more repeatable – scaling via provision-on-demand may be done more confidently and in a more automated fashion knowing that the containers are correct. They take the exact image from the QA environment and promote it to Staging then Prod

… and some technical details / challenges:

  • Docker containers in the build pipeline – Docker base images (main app, MySQL emulation of AWS RDS) are built weekly, with Microservice-specific builds of Docker containers dictated by the Dockerfiles in Git source control – she heavily emphasized the importance of a build-server-driven, fully-automated build and deployment pipeline (Jenkins in their case): no laptops in the build pipeline
  • Monitoring beyond the high-level offered by AWS CloudWatch implemented via Graphite, Sentry
  • Fig (now named “compose”) used to help containers find each other
  • “Our Own Private Registry” – they found it worked best to run local registries on each workstation rather than a centralized private registry
  • “Getting the Logs out” – the host filesystem may be mounted from within the Docker container to facilitate log export (see the example after this list)
  • “Containerize all the things” – they use Docker for most things, but have found Chef more effective for some of the infrastructure pieces such as Graphite. As she put it, you need to decide “what do you bake in your images vs. configure on the host after the fact”
  • “About those Race Conditions” – they use the Jenkins “Naginator” plugin to automatically re-run jobs which fail with certain messages such as “Cannot destroy container”
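For the “Getting the Logs out” point, mounting a host directory into a container is a one-flag affair – a sketch with hypothetical image name and paths:
docker run -d -v /var/log/myapp:/var/log/myapp myapp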

I’m looking forward to leveraging Docker to help optimize the deployment process for my current project, which will become even more important as we move toward a more Microservice-based architecture.

References: