TL;DR:
- add `?mounts:(Mount.t list)` to `Dockerfile.run`, and support it in `Dockerfile.crunch`
- add a simple `Mount.t`: a mount type plus a list of options (or a string map of options)
- introduce `?mount_cache` in dockerfile-opam's package managers (default off), which would mount `/var/cache/yum` with an appropriately computed cache id (perhaps overridable with `?cache_id`)
- provide some helpers for an `opam` command with appropriate caches mounted (`~/.cache` for dune, and a symlinked `~/.opam/download-cache/{md5,sha256,sha512}` to `.cache`), usable by both `opam install` and `opam monorepo pull` for downstream containers to use
- would this be useful for you in https://github.com/ocurrent/docker-base-images too? Is there anything in particular I should be aware of to make it useful there?
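As a very rough sketch of what such a generic `Mount.t` could look like: a record holding the mount type and an option list, rendered into the `--mount=...` flag syntax. All names here (`Mount`, `to_string`) are illustrative, not existing dockerfile API:

```ocaml
(* Hypothetical sketch, not actual library code: a generic mount type. *)
module Mount = struct
  type t = {
    typ : string;                     (* "cache", "bind", "tmpfs", ... *)
    options : (string * string) list; (* e.g. ("target", "/var/cache/yum") *)
  }

  (* Render as a BuildKit/Podman RUN flag: --mount=type=cache,target=... *)
  let to_string { typ; options } =
    let opts =
      List.map (fun (k, v) -> Printf.sprintf ",%s=%s" k v) options
      |> String.concat ""
    in
    Printf.sprintf "--mount=type=%s%s" typ opts
end

let () =
  let m = { Mount.typ = "cache"; options = [ ("target", "/var/cache/yum") ] } in
  print_endline (Mount.to_string m)
  (* prints: --mount=type=cache,target=/var/cache/yum *)
```

Keeping the options as a plain association list (rather than a closed variant per mount type) is what lets the library stay agnostic about options that Docker and Podman may add or change later.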
I've got some very early experimental code to add support for `--mount=type=cache` on `RUN` lines that works with both Docker BuildKit and Podman 4.x (see https://docs.docker.com/engine/reference/builder/#run---mounttypecache).
For now it is a layer on top of `dockerfile` and `dockerfile-opam`, but I'd like to contribute at least some of the changes back to this library. In particular `Dockerfile.crunch` needs to know about the mounts, because the correct way to crunch this:
```
RUN --mount=type=cache,target=/var/cache sudo yum install -y foo
RUN --mount=type=cache,target=/home/opam/.cache opam install bar
```
is
```
RUN --mount=type=cache,target=/var/cache --mount=type=cache,target=/home/opam/.cache sudo yum install -y foo && opam install bar
```
(i.e. the mounts need to stay grouped together at the beginning)
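To illustrate the grouping rule, here is a hedged sketch (plain OCaml, not the real `Dockerfile.crunch`) of how the mounts could be hoisted to the front while the commands are joined with `&&`. The `crunch_runs` function and its string-based mount representation are assumptions for illustration only:

```ocaml
(* Illustrative only: crunch a list of (mounts, command) RUN lines into a
   single RUN line with all mounts grouped at the beginning. *)
let crunch_runs (runs : (string list * string) list) : string =
  let mounts = List.concat_map fst runs in
  (* de-duplicate mounts while keeping first-seen order *)
  let mounts =
    List.fold_left
      (fun acc m -> if List.mem m acc then acc else acc @ [ m ])
      [] mounts
  in
  let cmds = List.map snd runs in
  String.concat " " ("RUN" :: mounts) ^ " " ^ String.concat " && " cmds

let () =
  print_endline
    (crunch_runs
       [ ([ "--mount=type=cache,target=/var/cache" ], "sudo yum install -y foo");
         ([ "--mount=type=cache,target=/home/opam/.cache" ], "opam install bar") ])
```

This prints exactly the crunched line shown above. A real implementation would of course work on the library's own AST rather than strings, and would need to decide what to do when two RUN lines mount different caches at the same target path.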
And once that is in place, the various package managers in dockerfile-opam could be taught to take a `?use_cache` parameter that enables a package download cache (shared among all Dockerfiles, not just the current one, taking care to use a proper cache ID per OS/architecture as needed!) and skips the 'clean all' at the end.
This is beneficial even on fast networks: the package mirrors are sometimes very slow, and on CentOS/Fedora in particular just refreshing the mirror/package metadata can take significantly longer than downloading the package itself.
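As a sketch of the per-OS/architecture cache-ID idea (the `?cache_id` override and the naming scheme below are assumptions, not settled API), the default id could simply be derived from the distro and architecture so that incompatible package databases never share a cache:

```ocaml
(* Hypothetical helper: compute a BuildKit cache id that keeps caches
   separate per OS/architecture, overridable via ?cache_id. *)
let yum_cache_id ?cache_id ~distro ~arch () =
  match cache_id with
  | Some id -> id
  | None -> Printf.sprintf "%s-%s-var-cache-yum" distro arch

let () =
  print_endline (yum_cache_id ~distro:"fedora-36" ~arch:"x86_64" ());
  print_endline (yum_cache_id ~cache_id:"shared" ~distro:"fedora-36" ~arch:"x86_64" ())
```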
Also a cache mount type is very useful for downstream container builds that do 'opam install' or 'dune build' or 'opam depext',
or 'opam monorepo pull', all of which can be cached.
(There are other useful mount types as well. 'tmpfs' comes with caveats, such as the directory disappearing unless you specify it consistently once you've started using it. 'bind' mounts are an efficient alternative to copying from another build stage, but work slightly differently between Docker and Podman: the latter requires the 'z' option, and the former doesn't support it.)
Caching would be opt-in (who knows what I'd break otherwise).
I'll try to keep the changes minimal: just add the mechanism to support crunching mounts and the basic mount types supported by both Podman and Docker, and leave the actual management of those caches, cached paths, and stages (computing cache ids from OS/arch/etc., checking that you don't use overlapping paths) to another library.
However, the mount types are likely to evolve (and converge or diverge between Podman and Docker), and supporting that might be best left to the application using them, so I'd keep the mount options very generic here: `(string * string) list` or `string String.Map.t`.
For context: the end goal is a tool that builds development and CI containers for monorepos, but that is a separate project (it sort of "works" on the XAPI project already, but not yet ready for release).
I'm opening this issue to give some background on some PRs that I may open shortly, I'll try to feed the changes in as small chunks.