kylegalbraith 13 hours ago [-]
After building Depot [0] for the past three years, I can say I have a ton of scar tissue from running BuildKit to power our remote container builders for thousands of organizations.
It looks and sounds incredibly powerful on paper. But the reality is drastically different. It's a big glob of homegrown thoughts and ideas. Some of them are really slick, like build deduplication. Others are clever and hard to reason about, or in the worst case, terrifying to touch.
We had to fork BuildKit very early in our Depot journey. We've fixed a ton of things in it that we hit for our use case. Some of them we tried to upstream early on, only for them to die on the vine for one reason or another.
Today, our container builders are our own version of BuildKit, so we maintain 100% compatibility with the ecosystem. But our implementation is greatly simplified. I hope someday we can open-source that implementation to give back and show what is possible with these ideas applied at scale.
[0] https://depot.dev/products/container-builds
> It's a big glob of homegrown thoughts and ideas. Some of them are really slick, like build deduplication. Others are clever and hard to reason about, or in the worst case, terrifying to touch.
This is true of packaging and build systems in general. They are often the passion projects of one or a handful of people in an organization - by the time they have active outside development, those idiosyncratic concepts are already ossified.
It's really rare to see these sorts of projects decomposed into building blocks, or even just given code organization that helps a newcomer understand. Despite all the code being out in public, all the important reasoning about why certain things are the way they are is trapped inside a few devs' heads.
mikepurvis 9 hours ago [-]
As someone who has worked in the space for a while and been heavily exposed to nix, bazel, cmake, bake, and other systems, and also been in that "passion project" role, I think what I've found is that these kinds of systems are just plain hard to talk about. Even the common elements like DAGs cause most people's eyes to immediately glaze over.
Managers and executives are happy to hear that you made the builds faster or more reliable, so the infra people who care about this kind of thing don't waste time on design docs and instead focus on getting to a minimum prototype that demonstrates those improved metrics. Once you have that, then there's buy-in and the project is made official... but by then the bones have already been set in place, so design documentation ends up focused on the more visible stuff like user interface, storage formats, etc.
OTOH, bazel (as blaze) was a very intentionally designed second system at Google, and buildx/buildkit is similarly a rewrite of the container builder for Docker, so both of them should have been pretty free of accidental engineering in their early phases.
throwway120385 7 hours ago [-]
I don't think you can ever get away from accidental engineering in build systems because as soon as they find their niche something new comes along to disrupt it. Even with something homegrown out of shell scripts and directory trees the boss will eventually ask you to do something that doesn't fit well with your existing concepts.
A build system is meant to yield artifacts, run tools, parallelize things, calculate dependencies, download packages, and more. And these are all things that have some algorithmic similarity which is a kind of superficial similarity in that the failure modes and the exact systems involved are often dramatically different. I don't know that you can build something that is that all-encompassing without compromising somewhere.
thayne 3 hours ago [-]
Blaze and bazel may have been intentionally designed, but it was designed for Google's needs, and it shows (at least from my observations of bazel; I don't have any experience with blaze). It is better now than it was, but it was obviously designed for a system where most dependencies are vendored, and it worked better for the languages Google used, like C++, Java, and Python.
nl 8 hours ago [-]
> This is true of packaging and build systems in general. They are often the passion projects of one or a handful of people in an organization
This is a very insightful comment
mikepurvis 9 hours ago [-]
I introduced Depot at my org a few months ago and I've been very happy with it. Conceptually it's simple: a container builder that starts warm with all your previously built layers right there, same as it would be running local builds. But a lot goes into making it actually run smoothly, and the performance-focused breakdown that shows where steps depend on each other and how much time each is taking is great.
It's clear a ton of care has gone into the product, and I also appreciated you personally jumping onto some of my support tickets when I was just getting things off the ground.
kylegalbraith 1 hours ago [-]
Thank you for the very kind words and for your support. Depot is full of incredible people who love helping others. So while you might see me on a ticket from time to time, it’s really an entire team that is behind everything we do.
tuananh 7 hours ago [-]
Thanks for the insight Kyle. If Depot can open-source it, that would be amazing for the community.
kylegalbraith 1 hours ago [-]
Thank you for the post! It’s well done and you captured a lot of the concepts in BuildKit in an easy to understand way. Not an easy thing to do at all.
bmitch3020 16 hours ago [-]
I don't use buildkit for artifacts, but I do like to output images to an OCI Layout so that I can finish some local checks and updates before pushing the image to a registry.
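For reference, the OCI-layout flow described above can be produced directly from buildx; a rough sketch (output names are placeholders, and the `tar=false` option requires a reasonably recent BuildKit):

```shell
# Write the build result as an OCI archive instead of loading it
# into the local image store
docker buildx build --output type=oci,dest=./app-oci.tar .

# Or as an unpacked OCI layout directory, which tools like skopeo
# or crane can inspect before anything is pushed
docker buildx build --output type=oci,tar=false,dest=./app-oci .
```

From there, something like `skopeo copy oci:./app-oci docker://registry.example.com/app:latest` handles the final push once local checks pass.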
But the real hidden power of buildkit is the ability to swap out the Dockerfile parser. If you want to see that in action, look at this Dockerfile (yes, that's yaml) used for one of their hardened images: https://github.com/docker-hardened-images/catalog/blob/main/...
kylegalbraith 1 hours ago [-]
I agree on both fronts! BuildKit frontends are not very well known but can be very powerful if you know how they work and how BuildKit transforms them.
The --mount=type=cache for package managers is genuinely transformative once you figure it out. Before that, every pip install or apt-get in a Dockerfile was either slow (no caching) or fragile (COPY requirements.txt early and pray the layer cache holds).
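As a concrete sketch of what that looks like (base image and paths are illustrative):

```dockerfile
# syntax=docker/dockerfile:1
FROM python:3.12-slim
COPY requirements.txt .
# pip's download/wheel cache persists across builds on this builder,
# even when the layer itself is invalidated by a changed requirements.txt
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt

# The apt equivalent; sharing=locked serializes concurrent builds
# touching the same cache directory
# RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
#     apt-get update && apt-get install -y build-essential
```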
What nobody tells you is that the cache mount is local to the builder daemon. If you're running builds on ephemeral CI instances, those caches are gone every build and you're back to square one. The registry cache backend exists to solve this but it adds enough complexity that most teams give up and just eat the slow builds.
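For completeness, the registry cache backend mentioned above looks roughly like this (the cache ref is a placeholder):

```shell
docker buildx build \
  --cache-from type=registry,ref=registry.example.com/app:buildcache \
  --cache-to type=registry,ref=registry.example.com/app:buildcache,mode=max \
  -t registry.example.com/app:latest \
  --push .
```

The `mode=max` part matters for multi-stage Dockerfiles: it also exports cache for intermediate stages, whereas the default `mode=min` only covers layers that end up in the final image.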
The other underrated BuildKit feature is the ssh mount. Being able to forward your SSH agent into a build step without baking keys into layers is the kind of thing that should have been in Docker from day one. The number of production images I've seen with SSH keys accidentally left in intermediate layers is genuinely concerning.
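A sketch of the ssh mount for anyone who hasn't used it (the repo URL is hypothetical):

```dockerfile
# syntax=docker/dockerfile:1
FROM alpine:3.20
RUN apk add --no-cache git openssh-client \
 && mkdir -p -m 0700 ~/.ssh \
 && ssh-keyscan github.com >> ~/.ssh/known_hosts
# The agent socket is mounted only for this RUN step; no key
# material ever lands in a layer
RUN --mount=type=ssh git clone git@github.com:example/private-repo.git /src
```

Built with `docker buildx build --ssh default .` so BuildKit forwards the local SSH agent into that step.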
amluto 5 hours ago [-]
There is something wrong with the industry in which we think that, when a production build requires SSH keys, the problem is that the keys might leak into the build artifact.
candiddevmike 9 hours ago [-]
I hate the nanny state behavior of docker build and not being allowed to modify files/data outside of the build container and cache, like having a NFS mount for sharing data in the build or copying files out of the build.
Let me have side effects, I'm a consenting adult and understand the consequences!!!
moochmooch 16 hours ago [-]
Unfortunately, make is better-written software. I think Dockerfile was ultimately a failed iteration on Makefile. YAML and Dockerfile are poor interfaces for these types of applications.
The code first options are quite good these days, but you can get so far with make & other legacy tooling. Docker feels like a company looking to sell enterprise software first and foremost, not move the industry standard forward
great article tho!
kccqzy 16 hours ago [-]
Make is timestamp based. That is a thoroughly out-of-date approach only suitable for a single computer. You want distributed hash-based caching in the modern world.
moochmooch 9 hours ago [-]
So use Bazel or buck2 if you need an iteration on make's handling of changed files. Bazel is a much more serious project than buildkit. I'm not saying make is more functional than buildkit (it might be to some); I'm saying it's better-written software than buildkit. Two separate things.
literalAardvark 9 hours ago [-]
Bazel just seems so... Academic. I can't make heads or tails of it.
Compared to a Dockerfile it's just too hard to follow
kccqzy 8 hours ago [-]
Oh I love Bazel. The problem is that it’s harder to adopt for teams used to just using make. For a particular project at work, I argued unsuccessfully for switching from plain make to bazel, and it ended up switching to cmake.
dilyevsky 1 hours ago [-]
Now with AI, bazel maintenance is an almost entirely painless experience. I have fewer issues with it than with the standard Go toolchain, and the C++ experience was always quite smooth.
craftkiller 16 hours ago [-]
Along similar lines, when I was reading the article I was thinking "this just sounds like a slightly worse version of nix". Nix has the whole content-addressed build DAG with caching, the intermediate language, and the ability to produce arbitrary outputs, but it is functional: 100% of the inputs must be accounted for in the hashes/lockfile. Docker, by contrast, lets you run commands like `apk add firefox` that pull data from outside sources that can change from day to day, so two docker builds can end up with the same hash but different output, making it _not_ reproducible like the article falsely claims.
Edit: The claim about the hash being the same is incorrect, but an identical Dockerfile can produce different outputs on different machines/days whereas nix will always produce the same output for a given input.
ricardobeat 15 hours ago [-]
> so two docker builds can end up with the same hash but different output
The cache key includes the state of the filesystem so I don’t think that would ever be true.
Regardless, the purpose of the tool is to generate [layer] images to be reused, exactly to avoid the pitfalls of reproducible builds, isn’t it? In the context of the article, what makes builds reproducible is the shared cache.
xyzzy_plugh 15 hours ago [-]
It's not reproducible then, it's simply cached. It's a valid approach but there's tradeoffs of course.
verdverm 7 hours ago [-]
it's not an either or, it can be reproducible and cached
similarly, nix cannot guarantee reproducibility if the user does things to break that possibility
xyzzy_plugh 6 hours ago [-]
The difference is that you can blow the Nix cache away and reproduce it entirely. The same cannot be said for Docker.
verdverm 5 hours ago [-]
That's not true
Docker has a `--no-cache` flag, even easier than blowing it away, which you can also do with several built-in commands or a rm -rf /var/lib/docker
Perhaps worth revisiting: https://docs.docker.com/build/cache/
Ah you're right, the hash wouldn't be the same but a Dockerfile could produce different outputs on different machines whereas nix will produce identical output on different machines.
cpuguy83 2 hours ago [-]
Producing different outputs isn't dockerfile's fault.
Dockerfile doesn't enforce reproducibility but reproducibility can be achieved with it.
Nix isn't some magical thing that makes things reproducible either.
nix is simply pinning build inputs and relying on caches.
nixpkgs is entirely git based so you end up pinning the entire package tree.
verdverm 7 hours ago [-]
If you are building a binary on different arches, it will not be the same. I have many container builds that I can run with the cache disabled and get the same hash/bytes in the end, i.e. reproducible across machines; that also requires that whatever you build inside is itself byte-reproducible (like Go).
Izkata 10 hours ago [-]
> whereas nix will always produce the same output for a given input.
If they didn't take shortcuts. I don't know if it's been fixed, but at one point Vuze in nix pulled in an arbitrary jar file from a URL. I had to dig through it because the jar had been updated at some point but not the nix config and it was failing at an odd place.
XYen0n 8 minutes ago [-]
This should result in a hash mismatch error rather than an output different from the previous one. If there is a way to locate the original jar file (hash matching), it will still produce the same output as before.
jasonpeacock 5 hours ago [-]
Flakes fixes this for Nix, it ensures builds are truly reproducible by capturing all the inputs (or blocking them).
cpuguy83 2 hours ago [-]
No it doesn't.
If the content of a url changes then the only way to have reproducibility is caching.
You tell nix the content hash is some value and it looks up the value in the nix store.
Note, it will match anything with that content hash so it is absolutely possible to tell it the wrong hash.
Izkata 3 hours ago [-]
Apparently I made note of this in my laptop setup script (but not when this happened so I don't know how long ago this was) so in case anyone was curious, the jar file was compiled with java 16, but the nix config was running it with java 8. I assume they were both java 8 when it was set up and the jar file upgraded but don't really know what happened.
jasonpeacock 16 hours ago [-]
You can network-jail your builds to prevent pulling from external repos and force the build environment to define/capture its inputs.
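A sketch of both knobs, in case anyone wants to try this (the vendored install script is hypothetical):

```dockerfile
# syntax=docker/dockerfile:1
FROM alpine:3.20
COPY vendor/ /build/vendor/
# This step gets no network at all; anything it needs must already
# be in the build context, so hidden downloads fail loudly
RUN --network=none sh /build/vendor/install.sh
```

You can also pass `--network=none` to the whole `docker buildx build` invocation to jail every step at once.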
verdverm 7 hours ago [-]
just watch out for build timestamps
stackskipton 14 hours ago [-]
SRE here. I feel like both are just instructions for how to get source code -> executable, with docker/containers providing a "deployable package" even if the language does not compile into a self-contained binary (Python, Ruby, JS, Java, .Net).
Also, there is nothing stopping you from creating a container that has make plus the tools required to compile your source code, writing a dockerfile that uses those tools to produce the output, and leaving it on the file system. Why that approach? Less friction for compiling, since I find most make users have more pet build servers than cattle, and making modifications can have a lot of friction due to conflicts.
zaphirplane 13 hours ago [-]
This is a strange double submission, the one with caps made it! https://news.ycombinator.com/item?id=47152488
BuildKit also comes with a lot of pain. Dagger (a set of great interfaces to BuildKit in many languages) is working to remove it. Even their BuildKit maintainers think it's a good idea.
BuildKit is very cool tech, but painful to run at volume
Fun gotcha in BuildKit direct versus Dockerfiles: is the iteration order of the map you loaded those ENV vars from consistent? No? That's why your cache keeps getting busted. You can't run into this in a linear Dockerfile.
kodama-lens 12 hours ago [-]
I switched our entire container build setup to buildkit. No kaniko, no buildah, no dind. The great part is that you can split buildkitd and buildctl.
Everything runs in its own docker runner. New buildkitd service for every job. Caching only via buildkit native cache export. Output format oci image compressed with zstd.
Works pretty great so far, same or faster builds and we now create multi arch images. All on rootless runners by the way
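For anyone curious what that split looks like in practice, a rough sketch (socket path and cache locations are illustrative; rootless setups typically launch the daemon under rootlesskit):

```shell
# One throwaway daemon per CI job
buildkitd --addr unix:///run/user/1000/buildkit/buildkitd.sock &

# buildctl drives the build against that daemon: Dockerfile frontend,
# multi-arch platforms, zstd-compressed OCI output, and BuildKit-native
# cache import/export instead of a registry cache
buildctl --addr unix:///run/user/1000/buildkit/buildkitd.sock build \
  --frontend dockerfile.v0 \
  --local context=. \
  --local dockerfile=. \
  --opt platform=linux/amd64,linux/arm64 \
  --output type=oci,dest=image.tar,compression=zstd \
  --export-cache type=local,dest=/tmp/buildkit-cache \
  --import-cache type=local,src=/tmp/buildkit-cache
```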
verdverm 12 hours ago [-]
That's pretty cool, rootless would be nice, but more effort than we see in ROI currently. I'm using the Dagger SDK directly, no CLI or modules.
Recently I had to make it so multiple versions can run on the same host, so that as developers change branches (which may be on different IaC'd versions; we launch on demand), we don't break LTS release branches.
ltbarcly3 3 hours ago [-]
But the problem is it's trash? I've repeatedly tried to do things documented to work that weren't implemented, or where the implementation is buggy.
Docker is profoundly bad software.
cyberax 13 hours ago [-]
Buildkit...
It sounds great in theory, but it JustDoesn'tWork(tm).
Its caching is plain broken, and the overhead of transmitting the entire build state to the remote computer every time is just busywork for most cases. I switched to Podman+buildah as a result, because it uses the previous dead simple Docker layered build system.
If you don't believe me, try to make caching work on Github with multi-stage images. Just have a base image and a couple of other images produced from it and try to use the GHA cache to minimize the amount of pulled data.
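For anyone hitting the same wall: the usual advice is `mode=max`, since the default `mode=min` doesn't export cache for non-final stages at all. A sketch, assuming a GitHub Actions job where a buildx builder is already set up (the gha backend picks up its token from the Actions runtime environment):

```shell
docker buildx build \
  --cache-from type=gha \
  --cache-to type=gha,mode=max \
  -t myorg/app:latest .
```

No guarantee this fixes the shared-base-image case described above, but without `mode=max`, multi-stage caching definitely won't work.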
plantain 2 hours ago [-]
Never figured out how to get the buildx cache to actually work reliably on ARM OS X. Horrible if you have to build x86 images regularly.
hanikesn 12 hours ago [-]
Why would you use the horrible GHA cache and not a much more efficient registry based cache?
cyberax 10 hours ago [-]
Registry cache...
It's yet one more incomprehensible Buildkit decision. The original Docker builder had a very simple cache system: it computed the layer hash and then checked the registry for its presence. Simple content-addressable caching.
Buildkit can NOT do this. Instead, it uses a single image as a dumping ground for the caches. If you have two builders using the same image, they'll step on each other's toes. GHA at least side-steps this.
But I tried the registry cache, and it didn't improve anything. So far, I was not able to get caching to work with multi-stage builds at all. There are open issues for that, dating back to 2020.
mid-kid 13 hours ago [-]
How do you use buildah? with dockerfiles?
I find that buildah is sort of unbearably slow when using dockerfiles...
cyberax 11 hours ago [-]
It has braindead cache checking; I've fixed it locally and I'm cleaning it up for upstream submission. But otherwise, it's always faster for me than Buildkit.
Avamander 10 hours ago [-]
Except anything that requires any non-trivial networking or hermetic building.
jccx70 13 hours ago [-]
[dead]
whalesalad 16 hours ago [-]
Folks, please fix your AI generated ascii artwork that is way out of alignment. This is becoming so prevalent - instant AI tell.
scuff3d 14 hours ago [-]
The "This is the key insight -" or "x is where it gets practical -" phrasings are dead giveaways too. If I wanted an LLM's explanation of how it works, I can ask an LLM. When I see articles like this I'm expecting an actual human expert.
croes 13 hours ago [-]
And waste time and energy again to get a similar result?
scuff3d 9 hours ago [-]
An article written by an expert is nothing like this. You might be able to get something similar out of an LLM, but it's gonna take a lot more effort than was put into this.
slekker 14 hours ago [-]
This one too: "It’s a proven pattern."
unshavedyak 16 hours ago [-]
I imagine it's not the AI then, but the site font/css/something. Seeing as it looks fine for me (Brave, Linux).
craftkiller 16 hours ago [-]
Are you on a phone? I loaded the article with both my phone and laptop. The ascii diagram was thoroughly distorted on my phone but it looked fine on my laptop.
whalesalad 16 hours ago [-]
Firefox on a 27" display. Could be the font being used to render.
antonvs 15 hours ago [-]
The only ASCII image I see on that page is actually a PNG: https://tuananh.net/img/buildkit-llb.png
Maybe the page was changed? If you're just talking about the gaps between lines, that's just the line height in whatever source was used to render the image, which doesn't say much about AI either way.
tuananh 15 hours ago [-]
looks fine to me, but since it messed up for some, I replaced it with a png
seneca 15 hours ago [-]
I found it more jarring that they chose to use both Excalidraw and ascii art. What a strange choice.
tuananh 15 hours ago [-]
the hugo theme requires an image thumbnail. i just find one and use it :D