That's not going to work if both parties get different hashes when they build the image, which won't happen as long as file modification timestamps (and other such hazards) are part of what gets hashed.
It's not just the timestamps you need to worry about. Tar needs to be consistent with the uid vs username, gzip compression depends on implementations and settings, and the json encoding can vary by implementation.
And all this assumes the commands being run are reproducible themselves. One issue I encountered there was how alpine tracks their package install state from apk, which is a tar file that includes timestamps. There are also timestamps in logs. Not to mention installing packages needs to pin those package versions.
All of this is hard, and the Dockerfile didn't make it easy, but it is possible. With the right tools installed, reproducing my own images has a documented process [2].
The flip side is that the world still hasn’t settled on a language-neutral build tool that works for all languages. Therefore we resort to running arbitrary commands to invoke language-specific package managers. In an alternate timeline where everyone uses Nix or Bazel or some such, docker build would be laughed out of the window.
> running arbitrary commands to invoke language-specific package managers.
This is exactly what we do in Nix. You see this everywhere in nixpkgs.
What sets apart Nix from docker is not that it works well at a finer granularity, i.e. source-file-level, but that it has real hermeticity and thus reliable caching. That is, we also run arbitrary commands, but they don't get to talk to the internet and thus don't get to e.g. `apt update`.
In a Dockerfile, you can `apt update` all you want, and this makes the build layer cache a very leaky abstraction. This is merely an annoyance when working on an individual container build but would be a complete dealbreaker at linux-distro-scale, which is what Nix operates at.
And in languages with insufficient abstraction power like C and Go, you often need to invoke a code generation tool to generate the sources; that's an extremely arbitrary command. These are just non-problems if you have hermetic builds and reliable caching.
Personally I love using mkosi and while it has all the composability and deployment options I'd care for, its clear not everyone wants to build starting only with a blank set of OS templates.
In Spack [1] we do one layer per package; it's appealing, but I never checked if besides the layer limit it's actually bad for performance when doing filesystem operations.
Want to throw a requirements.txt in there? No no, why would you even ask that? Meanwhile docker says yeah sure just run pip install, why should I care?
Like all LLM boosters, you've ignored the fact that the largest time sink in many kinds of software is not initial development, but perpetual maintenance.
If you care about getting it to work with minimal effort right now more thar about it being sustainable later, then sure.
I wish we had standardized on something other than shell commands, though. Puppet or terraform or something more declarative would have been such a better alternative to “everyone cargo cults ‘RUN apt-get upgrade’ onto the top of their dockerfiles”.
Like, the layer/stage/caching behavior is fine. I just wish the actual execution parts had been standardized using something at a higher level of abstraction than shell.
Until you need to do something that isn't covered with its DSL, and you extend it with an external command execution declaration... At which point people will just write bash scripts anyway and use your declarative language as a glorified exec.
In Azure YAML I had an odd bug because I used succeeded() instead of not(failed()) as a condition. I had no way of testing the pipeline without executing it. And each DSL has its own special set of sharp edges.
At least Bash's common edges are well known.
However, Dockerfiles are so popular because they run shell commands and permit 'socially' extending someone else shell commands; tacking commands onto the end of someone else's shell script is a natural process. /bin/sh is unreasonably effective at doing anything you need to a filesystem, and if the shell exposes a feature, it has probably been used in a Dockerfile somewhere.
Every other solution, especially declarative ones, tend to come up short when _layering_ images quickly and easily. However, I agree they're good if you control the entire declarative spec.
And if you want something weird that's not supported by your particular tool of choice, you have the escape hatch of running arbitrary commands in the Dockerfile.
What more do you want?
Its a Buildkit frontend, so you still use "docker build".
They sounded nice on paper but the work they replaced was somehow more annoying.
I moved over to Docker when it came out because it used shell.
I'd get much better results it I used something else to do the foreach and gave terraform only static rules.
If your dockerfile says “ensure package X is installed at version Y” that’s a lot clearer (and also more easy to make performant/cached and deterministic) than “apt-get update; apt-get install $transitive-at-specific-version; apt-get install $the-thing-you-need-atspecific-version”. I’m not thrilled at how distro-locked the shell version makes you, and how easy it is for accidental transitive changes to occur too.
But neither of those approaches is at a particularly low abstraction level relative to the OS itself; files and system calls are more or less hidden away in both package-manager-via-bash and puppet/terraform/whatever.
But as long as people want to use scripting languages (like php, python etc) i guess docker is the neccessary evil.
I'll tell that to my CI runner, how easy is it for Go to download the Android SDK and to run Gradle? Can I also `go sonarqube` and `go run-my-pullrequest-verifications` ? Or are you also going to tell me that I can replace that with a shitty set of github actions ?
I'll also tell Microsoft they should update the C# definition to mark it down as a scripting language. And to actually give up on the whole language, why would they do anything when they could tell every developer to write if err != nil instead
Just because you have an extremely narrow view of the field doesn't mean it's the only thing that matters.
E.g. systemd exposes a lot of resource control as well as sandboxing options, to the point that I would argue that systemd services can be very similar to "traditional" runtime containers, without any image involved.
Interesting. How does go build my python app?