How to make the right technical choices on your cloud native journey
When you embark on your cloud native journey there will be important choices to make about cloud providers, continuous deployment, environments’ setup and separation. This guide will help you make the right choices by sharing lessons learnt from running cloud native apps in production.
Kubernetes has become the de facto container orchestration platform. When we help clients of different sizes and domains start their cloud native journeys in Kubernetes, we assist them in making sound decisions and technology choices. There is no one-size-fits-all solution when it comes to choosing cloud providers, CI tools, continuous deployment pipelines etc., so it is important to make the right decisions at the start. Failing to do so can be very costly in terms of lost time and money.
In this blog we share some of the important considerations you will face when making decisions and technical choices to get started in Kubernetes. This post targets anyone who is starting their cloud native and Kubernetes path, but also anyone who has already started and wants to verify their choices.
Kubernetes clusters can be self-managed or managed by a cloud provider. The rule of thumb is to use a managed Kubernetes service unless you have a sound technical or legal reason to use an on-prem self-managed cluster. Self-managed clusters require highly specialized competence within your team and ongoing infrastructure and maintenance costs.
If you choose to go with a managed Kubernetes service make sure you choose the right provider. This depends on several factors, including:
Generally speaking, if you need a simple Kubernetes cluster you’ll be fine using one of the three big cloud providers: AWS, Google, or Azure. However, in our experience, Google’s GKE is the simplest to start with and the most mature managed Kubernetes offering in the market.
Most of the time, you will have at least two environments: dev/test and prod. How should those be split in Kubernetes? You can either have one cluster per environment or a namespace per environment. While using namespaces to split environments reduces costs by sharing the same infrastructure, it is more risky because of the expected human errors from developers and operators. Avoiding such errors and access issues is possible but requires high competence within the team to implement and monitor. There is a trade-off between infrastructure cost and the man-power needed for access control and security.
You could also place the clusters in separate virtual/physical networks and/or separate cloud projects/accounts. This offers a clearer separation that makes it easy to maintain strict access to prod environments while also allowing for experimentation on the Kubernetes setup itself in the dev cluster. This makes it safe to enable certain Kubernetes features or to test newer Kubernetes versions.
Once you have chosen a platform to deploy your Kubernetes clusters on the question becomes how do you deploy and maintain your clusters? In most cases you will have to manage more than one cluster, and it is important to avoid configuration drifts between them. For example, you may have dev and prod clusters and you want to make sure that they both have the same configurations and their nodes have the same access/auth scopes to other cloud services. This is important because you will be promoting your application deployed in dev to be deployed in prod and you won’t want to have any bad surprises.
To do this, we recommend using an infrastructure-as-code approach where you define the cluster configuration as code and maintain these configurations in a version control system. When you have your cluster config as code, you can spin up a new identical cluster to test something new without even disrupting developer workflow.
We also recommend using a pipeline to deploy and maintain your clusters based on changes in your version-controlled configurations. This is often referred to as GitOps.
Utilities are third party tools/services that you deploy in your cluster. This can be monitoring tools like Prometheus or SSL certificates manager like cert-manager. These utilities should be treated as part of the infrastructure i.e. they are deployed from code when the cluster is created, and are maintained through a pipeline similar to the cluster itself. It could even be a separate job in the same pipeline which creates the cluster. The advantage of this is that you can easily recreate/replicate the cluster with all its third party utilities. Our own open source tool Helmsman can be used to orchestrate deployment of third party utility apps.
Kubernetes namespaces are virtual dividers within a single cluster. They are often used to separate teams and applications. When using namespaces consider the following:
Helm is the package manager for Kubernetes. It is a useful tool to package and share your configurable application Kubernetes deployment templates. We recommend using Helm over using plain Kubernetes yaml templates to deploy your applications. Helm packages -aka charts- allow reusing templates with configuration parameters. This makes it easy to deploy multiple instances of your application in different environments with different configurations without redundant code or templates.
Helm 2.x requires a server-side component called Tiller to be deployed in the cluster. Helm 3.0 will remove Tiller, but it is currently an alpha release. As you use Helm 2.x for the time being, you will have to think about how to deploy and secure your Tiller. Consider the following:
Frequent application releases are considered a good practice because it correlates with lowering the risks of failure and speeding incident recovery time1. When using Kubernetes and Helm you should have an automated pipeline to continuously deploy your changes to K8S. Figure 1 illustrates a typical CD pipeline for cloud native apps deployed in K8S.
When designing your pipeline think about how you want to promote your application from the dev/test environment to the prod environment. This could be done with manual approvals in a single pipeline or you can split the dev and prod pipelines. There is no one-size-fits-all solution, but always try to use the simplest pipeline configuration until you have good reasons to make it more complex. Additionally, keep in mind developer convenience when designing the workflow. For example, if a developer has to make changes in multiple repos to get their small bug fix into prod, that’s probably not convenient enough to encourage faster and frequent deploys!
Helm charts and repos
When packaging your applications as Helm charts consider the following:
Another aspect to consider is where will you store your helm charts and how will you use them. Three patterns are commonly used:
1) Have your helm chart directory as part of your application VCS repo.
The benefits of this approach is that you have everything in one place. However, the drawbacks include:
2) Have a separate VCS repo for Helm charts and serve them from a Helm repo.
The pros of this approach include:
The biggest downside of this approach is that when your application changes in a way that requires a change in the Helm chart, you would have to make the Helm chart change separately at first, then deploy the application code change.
3) Since chart changes will be frequent in the early stages of application development you can combine approaches 1 and 2 at different stages of your application development. The first approach can be used at the start and once the application stabilizes the second approach can be used.
While the patterns above are viable in different cases your choice should be based on your use-case and weighing up the pros and cons of each.
It is important that you plan for failure. And you should think about scenarios like: what happens when applications are deleted ? Or, what happens if the cluster is deleted altogether? What if we want to create an identical copy of the cluster in a different cloud region?
To be ready for all these scenarios you need to backup and store the cluster state somewhere safe. Luckily, there are tools that can help you with that. We have been successfully using Velero from VMWare. Velero allows you to setup a regular backup of the cluster state (deployments, volumes,etc.) and store it in cloud storage. The cluster snapshots can be used to restore the cluster state at any time. But what if we need to recreate the cluster itself? If you deploy the cluster from code, as we discussed above, then you can recreate the cluster using the same version-controlled code you used to create it in the first place, and then restore its state using a Velero snapshot.
It is easy and tempting to make shortcuts when you are getting started. That’s fine when you are in the learning and experimenting phase. However, once you start setting up your Kubernetes environments, make sure that you do the right thing from the start even if it means more time and effort. It will pay off in the future!
Had enough of sluggish polling? With instant Artifactory event triggers you can give responsiveness in Jenkins a real boost. Here’s an easy way to set it up.
A super easy configuration guide
With the arrival of microservices code is becoming disposable. Does this mean that we no longer need maintainable code? Is it the end of refactoring?
Still relevant or increasingly redundant?
In software development tight coupling is one of our biggest enemies. On the function level it makes our application hard to change and fragile. Unfortunately, tight coupling is like the entropy of software development, so we have always have to be working to reduce it.
How to safely introduce modular architecture to legacy software.
I am an Atlassian certified trainer and over the years I have been spending much time with clients and their Jiras. In this blogpost, I have collected some small tips and tricks that will make your Jira usage better.
Jira Software is a powerful tool deployed in so many organizations, yet in day to day usage people are missing out on improvements, big and small.
In this post, I’ll take a closer look at the version of Jenkins X using Tekton, to give you an idea of how the general development, build, test, deploy flow looks like with Jenkins X. How does it feel to ship your code to production using a product coming from the Jenkins community that has very little Jenkins in it?
A crash course in Jenkins X and how to test it out on a local Kubernetes cluster
In this blog I will show you how to create snapshots of Persistent volumes in Kubernetes clusters and restore them again by only talking to the api server. This can be useful for either backups or when scaling stateful applications that need “startup data”.
Sneak peak at CSI Volume snapshotting Alpha feature
When I read Fowler’s new ‘Refactoring’ book I felt sure the example from the first chapter would make a good Code Kata. However, he didn’t include the code for the test cases. I can fix that!
Writing tests for ‘Theatrical Players’
Nicole Forsgren and the Accelerate DORA team has just released the newest iteration of the State of DevOps report. The report investigates what practices make us better at delivering valuable software to our users as measured by business outcomes. Read on for our analysis of the report, and how it can be best put to use.
The latest drivers of software delivery performance
A major challenge of software development is that our work is by and large invisible. This makes our folklore essential in business matters. Some of our commonly used arguments and visualizations are digital urban legends rather than solid foundations for informed decisions. Here, we’ll go through a few examples and some measures to address our misconceptions.
How the stories we tell influence our decisions
Learn how Docker and Kubernetes work and the key benefits they bring. Using real demos, I show how Docker is a great packaging and distribution technology, and how Kubernetes provides a powerful runtime for containerized applications.
Watch this introduction to Docker and Kubernetes at the Trondheim Developer Conference (TDC)
In the world of Agile and DevOps we use many figures, charts and diagrams to argue and reason about our world and how we prioritize and make choices. However, at all levels of the organization, we misuse and misinterpret figures. It’s time to be explicit, measure the right things and act on them. Watch this talk from DevOpsDays Zurich in May 2019.
Watch this talk from DevOpsDays Zurich
Hear about upcoming events in Scandinavia, latest tech blogs, and training in the field of Continuous Delivery and DevOps