Can you really trust your Docker images?
Pulling a Docker image from Docker Hub is like pulling an arbitrary binary blob from somewhere, executing it without really knowing what’s in it, and hoping for the best!
At least, for some images. How can we decide if we trust Docker images?
Background: At our customers we use containers heavily, so once in a while we use a Docker image from Docker Hub as a starting point for infrastructure or build automation. I then have to consider: do I trust the Docker image enough to give it access to all the secrets and intellectual property of my customer? I might not be the one to make the decision, but I’m definitely the one supplying the information needed to make it.
I’m not trying to explain how to do an elaborate and exhaustive security review of a Docker image, as the scope of such a review varies a lot. What matters is that you as the user have the information you need to gain the level of trust that you need. That might not extend to each individual binary file in the image, but it should at least reach a high level that you find reasonable for your business.
No matter how deep you need to go when analyzing a Docker image, the simple answer to gaining the level of trust you need is to make sure that you have full traceability of how the Docker image was created. Without this you can’t trust anything. With full traceability, however, you can gain insight to whatever level fits you.
To trust the content of a Docker image I have three requirements:
I will look into the first, as I am comfortable with the last two requirements, given that I trust the Docker tools in these matters. I still want to know for sure what’s inside the image.
Before I run a Docker container, I always look at the Dockerfile behind it. Assume for now that I know which one that is.
Here are my thoughts on what I look for, using the example from Praqma/Yocto-build
FROM, because if the Dockerfile uses a base image, I need to start my evaluation there.
ENTRYPOINT is interesting, as this is what the container does. I’m especially concerned with scripts here.
Depending on the level of trust I need to have in the image, I need to evaluate each and every part of the Dockerfile at the next level. In the above example, that means reading the scripts, evaluating them, and looking at the downloaded content.
Let’s now assume that I have evaluated all the content that goes into the Dockerfile to an extent where it fits my level of trust. How would I know if these things are actually what goes into the final image?
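As a rough aid for that first pass over a Dockerfile, the review points can be listed mechanically. The following is only a sketch under my own assumptions: the function name `review_points`, the notes it prints, and the choice of instructions to flag are all hypothetical, and it does not join multi-line `RUN` commands. It does not replace reading the Dockerfile and the scripts yourself.

```python
import re

# Shell commands that fetch content at build time (an assumed starting set).
DOWNLOAD_PATTERN = re.compile(r"\b(wget|curl)\b")


def review_points(dockerfile_text):
    """Return (line_number, instruction, note) tuples for lines worth reviewing.

    Simplification: continuation lines of a multi-line RUN are treated as
    separate lines, so their first word is reported as the instruction.
    """
    points = []
    for number, line in enumerate(dockerfile_text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blank lines and comments.
        if not stripped or stripped.startswith("#"):
            continue
        instruction = stripped.split(None, 1)[0].upper()
        if instruction == "FROM":
            points.append((number, instruction, "evaluate the base image first"))
        elif instruction in ("ENTRYPOINT", "CMD"):
            points.append((number, instruction, "read what the container runs"))
        elif instruction in ("ADD", "COPY"):
            points.append((number, instruction, "check the files that are added"))
        elif DOWNLOAD_PATTERN.search(stripped):
            points.append((number, instruction, "inspect the downloaded content"))
    return points
```

Calling `review_points(open("Dockerfile").read())` gives a checklist to work through; every entry still has to be evaluated by a person to the depth your level of trust requires.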
The build environment becomes important now, and again this leads back to traceability. This time it’s not the content, but the process.
Almost any Docker image uses the FROM statement, and many use the base Linux distro images, e.g. FROM ubuntu:14.04. Personally, let’s say I trust the Ubuntu base image on Docker Hub, but there is no guarantee that it actually contains what it says. If I build the Dockerfile on my local machine, my ubuntu:14.04 could be tricked into being anything. So I need to trust the build environment to ensure my base image is what the name says it is.
A similar concern relates to the wget command, which downloads the script that I have now reviewed and decided to trust. In the build environment, the DNS could be tampered with, or the script itself might have been replaced with a different one. I would need to trust that DNS is correct in the build environment.
In the above example, I would also like to ensure that the downloaded script has the same checksum as the script I evaluated myself, so that what gets built matches what was reviewed.
One thing you need to require from your Docker images is that they are built using Docker automated builds. This improves traceability, and with it trust. With Docker automated builds you get traceability between the source of the Dockerfile, the version of the image, and the actual build output. You can easily follow that, as I will show in the following.
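The checksum comparison can be sketched in a few lines. This assumes you recorded a SHA-256 checksum of the script when you reviewed it; the helper names `sha256_of` and `matches_reviewed` are hypothetical, and SHA-256 is my choice of hash here, not something the example Dockerfile prescribes.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_reviewed(path, reviewed_sha256):
    """True if the downloaded file has the checksum recorded during review."""
    return sha256_of(path) == reviewed_sha256.strip().lower()
```

A build step could call `matches_reviewed("build.sh", recorded_checksum)` right after the download and abort when it returns `False`, so a tampered or replaced script never ends up in the image.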
When configuring automated builds you ensure that the Full Description is always correct, as it points to the latest readme file from the repo. Moreover, you can trust the repository URL. The Short Description, however, is up for edits from the authors.
Feels good for the trust to know these are correct, at least for “latest”.
Next step is to look at the build process, and as stated earlier the traceability is important.
With automated builds you get the Build Details tab, with a list of automated builds. You typically see several latest tagged builds, but hopefully also specific version builds.
Now click one of the lines in the build details, and you get all the details about how the image was built, as shown briefly in the image below:
Each build’s details show me a lot of information that I can use for improved trust in the image. Among others, I find these important:
Require automated builds from your Docker images, so you can get traceability to the level you need for trusting it.
If there are no automated builds, you should build the image yourself.
You could do even more than what I’ve proposed; the level of depth just needs to match your security level. There are a few other easy steps related to container security that are worth mentioning. A search will show many other good efforts towards security.
Even though we might assume that we can trust the Docker tools, content can be changed after my traceability analysis, so I would need to look into Content trust in Docker.
If you want to evaluate security, not only the specific image, you should definitely look at the Docker Bench for Security and the related Docker blog post about understanding Docker security and some best practices.
Container Solutions have a nice Cheat Sheet for Docker Security, which is closely related to a complete security review of using containers. Definitely worth thinking about. Related to my thoughts above, note their recommendation in “VERIFY IMAGES” about pulling images by digest or building them yourself.
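To make the "pull by digest" recommendation concrete: an image reference pinned by digest has the form `name@sha256:<64 hex characters>`, whereas a tag such as `ubuntu:14.04` can be moved to point at different content later. The check below is a minimal sketch of how a script could flag references that are not pinned. The function name is hypothetical, and it assumes SHA-256 digests only.

```python
import re

# name@sha256:<64 lowercase hex characters>; other digest algorithms are
# deliberately not handled in this sketch.
DIGEST_REFERENCE = re.compile(r"^[^@\s]+@sha256:[0-9a-f]{64}$")


def is_pinned_by_digest(image_reference):
    """True if the image reference is pinned to a SHA-256 content digest."""
    return DIGEST_REFERENCE.match(image_reference) is not None
```

Run against the FROM lines of a Dockerfile, this points out every base image whose content could change between your review and the build.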