I dug a little deeper into buildx, and I am confident that docker load is not strictly required by buildx.
Excerpt from the official documentation: https://docs.docker.com/engine/reference/commandline/buildx_create/#driver
docker driver
Uses the builder that is built into the docker daemon. With this driver, the --load flag is implied by default on buildx build. However, building multi-platform images or exporting cache is not currently supported.
A PoC is available as well: https://gist.github.com/knight42/6c128a2edf7cebcb6816343da833295a. The built image is present in docker images without docker load.
After learning about that, I have been trying to get rid of docker load in envd, but unfortunately the version of the bundled buildkitd in docker engine 20.10.14 is v0.8.3-4-gbc07b2b8, while mergeop was introduced in v0.10.3.
That said, even if the bundled buildkitd in docker becomes new enough to support mergeop in the future, I think we still need a fallback mechanism, such as using the docker-container driver as we do now.
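The difference between the two drivers described above can be sketched with hypothetical commands (the builder name and image tag here are made up for illustration):

```shell
# docker driver: uses the buildkitd bundled in dockerd; --load is implied,
# so the result appears in `docker images` directly.
docker buildx build --tag demo:latest .

# docker-container driver: runs buildkitd in a helper container and needs
# an explicit --load to export the result into the local image store.
docker buildx create --name demo-builder --driver docker-container --use
docker buildx build --load --tag demo:latest .
```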
from envd.
https://github.com/tensorchord/buildkit/pull/1/files
I am working on it. It is not easy.
Things we may need to change:
- source/containerimage/pull.go to use docker/docker/image.Store
- worker
from envd.
It is possible! https://github.com/gaocegege/buildkit/pull/1/files reuses the docker image store when caching the docker image in the buildkit.
The Buildkit instance in the container owns its own image cache. This PR reuses /var/lib/docker/image/overlay2/ instead of using its own separate cache.
from envd.
buildx does not build into the local docker daemon either; we need to specify --load to load the artifact into docker.
But there is an optimization we could use:
$ docker volume inspect buildx_buildkit_amazing_albattani0_state
[
    {
        "CreatedAt": "2022-05-05T17:30:23+08:00",
        "Driver": "local",
        "Labels": null,
        "Mountpoint": "/var/lib/docker/volumes/buildx_buildkit_amazing_albattani0_state/_data",
        "Name": "buildx_buildkit_amazing_albattani0_state",
        "Options": null,
        "Scope": "local"
    }
]
$ docker inspect container
{
    "Type": "volume",
    "Source": "buildx_buildkit_amazing_albattani0_state",
    "Target": "/var/lib/buildkit"
}
We could create a volume to keep the cache persistent.
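One way to realize that optimization is to pre-create a named volume and mount it at buildkitd's default state directory, /var/lib/buildkit. This is only a sketch; the volume and container names are illustrative:

```shell
# Keep buildkitd's cache on a named volume so it survives container recreation.
docker volume create envd_buildkitd_state
docker run -d --name envd_buildkitd --privileged \
  -v envd_buildkitd_state:/var/lib/buildkit \
  moby/buildkit:v0.10.1
```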
from envd.
Now we use diff and merge to reduce the size. But docker load is still slow:
loading a 7GB image takes ~17s.
from envd.
Yeah, it comes from containerd diff, I think.
Maybe we can have a look at what docker build does when loading the image into its local image store.
from envd.
The Docker daemon was explicitly designed to have exclusive access to /var/lib/docker. Nothing else should touch, poke, or tickle any of the Docker files hidden there.
Why is that? It's one of the hard learned lessons from the dotCloud days. The dotCloud container engine worked by having multiple processes accessing /var/lib/dotcloud simultaneously. Clever tricks like atomic file replacement (instead of in-place editing), peppering the code with advisory and mandatory locking, and other experiments with safe-ish systems like SQLite and BDB only got us so far; and when we refactored our container engine (which eventually became Docker) one of the big design decisions was to gather all the container operations under a single daemon and be done with all that concurrent access nonsense.
This means that if you share your /var/lib/docker directory between multiple Docker instances, you're gonna have a bad time. Of course, it might work, especially during early testing. "Look ma, I can docker run ubuntu!" But try to do something more involved (pull the same image from two different instances...) and watch the world burn.
from envd.
A new exporter, envd, is introduced in the buildkit container.
The image is loaded into the docker host successfully, but it requires a dockerd restart before the new image shows up. It seems that dockerd does not watch the filesystem; I will figure it out.
buildctl build ... --output type=envd,name=gaoce
[+] Building 1.5s (4/4) FINISHED
=> docker-image://docker.io/library/python:3.8 1.5s
=> => resolve docker.io/library/python:3.8 1.5s
=> CACHED ls 0.0s
=> CACHED pip install -i https://mirror.sjtu.edu.cn/pypi/web/simple jupyter 0.0s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:470747d54520023ee32931048063d1f383d52046ba95625a3d41411805850893 0.0s
=> => naming to gaoce 0.0s
from envd.
We merged multiple base layers into the image, thus the size is large. diff should be used to reduce the size. Ref e64786b
from envd.
The base image nvidia/cuda:11.2-devel-ubuntu2004 is 4GB, but our base image is 7GB. We should figure out where the extra 3GB comes from.
from envd.
Perhaps we could make MIDI work in a way similar to docker-buildx, which uses the BuildKit library bundled into the Docker daemon with the docker driver, so that the image is actually built by dockerd and we don't need to load the image manually.
from envd.
Cool! I think it is a great idea. Let's investigate how the docker buildx plugin does it.
from envd.
Buildx also relies on docker load.
w = &waitingWriter{
	PipeWriter: pw,
	f: func() {
		resp, err := c.ImageLoad(ctx, pr, false)
		defer close(done)
		if err != nil {
			pr.CloseWithError(err)
			w.mu.Lock()
			w.err = err
			w.mu.Unlock()
			return
		}
		prog := progress.WithPrefix(status, "", false)
		progress.FromReader(prog, "importing to docker", resp.Body)
	},
	done:   done,
	cancel: cancel,
}
return w, func() {
	pr.Close()
}, nil
}
from envd.
7GB image load takes ~17s
Wow! It is really awesome!
from envd.
I found that sending tarball still takes ~30s on my machine. Is this expected?
from envd.
Yeah, it is expected in the current design. The send tarball step is the docker load process.
from envd.
@knight42 Thanks for the research!
but it is unfortunate that the version of bundled buildkitd in docker engine 20.10.14 is v0.8.3-4-gbc07b2b8, while mergeop is introduced in v0.10.3
I am wondering why we should use docker 20.10.14. Is it the version that supports built-in load?
from envd.
while mergeop is introduced in v0.10.3.
Currently, we use buildkit v0.10.1, and merge op is supported in that version. I am not sure whether it only works after v0.10.3.
from envd.
Got the problem here.
failed to solve LLB: failed to solve: failed to load LLB: unknown API capability mergeop
The client returns this error, meaning we cannot use merge op if we eliminate docker load.
from envd.
I am wondering why we should use docker 20.10.14, is it the version that supports built-in load?
Nope, it is just the version of the dockerd on my laptop.
while mergeop is introduced in v0.10.3.
Currently, we use buildkit v0.10.1, and merge op is supported in that version. I am not sure whether it only works after v0.10.3.
Sorry, I double-checked the MergeOp PR; merge op was actually introduced in v0.10.0.
The client returns the error that we cannot use merge op if we eliminate docker load.
Yeah, since we heavily leverage merge op in envd, if we want to get rid of docker load, we need to make sure the bundled buildkitd in dockerd supports merge op.
from envd.
20.10.16 still uses v0.8.3-4-gbc07b2b8. I am afraid we need to wait until the next docker milestone.
https://github.com/moby/moby/blob/v20.10.16/vendor.conf#L36
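The buildkit revision vendored by a given moby release can be checked directly from its vendor file, for example:

```shell
# Inspect which buildkit commit the moby v20.10.16 tag vendors.
curl -s https://raw.githubusercontent.com/moby/moby/v20.10.16/vendor.conf \
  | grep buildkit
```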
from envd.
Things that we need to confirm:
- How buildkit stores layers
- How buildx loads images into the local image store
- How buildkit is initialized in docker
- What has changed in earthly's customized buildkit? moby/buildkit@master...earthly:earthly-main
from envd.
Using docker's bundled buildkitd directly is not possible for now. We will figure out whether we can mount some directory into the envd_buildkitd container to achieve a similar experience.
from envd.
Here https://github.com/moby/moby/blob/master/builder/builder-next/controller.go#L44:#L220 docker creates a buildkit daemon (control.Controller).
And the most important part is https://github.com/moby/moby/blob/master/builder/builder-next/worker/worker.go#L83: Docker has a new worker type, moby.
from envd.
bk, err := buildkit.New(buildkit.Opt{
	SessionManager:      sm,
	Root:                filepath.Join(config.Root, "buildkit"),
	Dist:                d.DistributionServices(),
	NetworkController:   d.NetworkController(),
	DefaultCgroupParent: cgroupParent,
	RegistryHosts:       d.RegistryHosts(),
	BuilderConfig:       config.Builder,
	Rootless:            d.Rootless(),
	IdentityMapping:     d.IdentityMapping(),
	DNSConfig:           config.DNSConfig,
	ApparmorProfile:     daemon.DefaultApparmorProfile(),
})
buildkit (builder-next.Controller) uses dockerd's DistributionService, so the images' blobs and metadata are stored in the docker image store directly, and there is no need to load images.
from envd.
The maintainers said it is not possible to run multiple docker daemons on one data root (/var/lib/docker).
from envd.
But I still think it is possible to run a (minimal) docker daemon in our envd_buildkitd container, since only /var/lib/docker/image is needed in envd_buildkitd.
The main concern above is that multiple daemons on the same data root (/var/lib/docker/) may break consistency. Let's have a look at the directory structure of the image part of /var/lib/docker:
/var/lib/docker/image/overlay2
├── distribution
│   ├── diffid-by-digest
│   └── v2metadata-by-diffid
├── imagedb
│   ├── content
│   └── metadata
├── layerdb
│   ├── mounts
│   ├── sha256
│   └── tmp
└── repositories.json
distribution is used to communicate with the OCI image registry, so it is not used in envd_buildkitd. imagedb and layerdb are effectively key-value stores whose keys are file names (hex digests), so they should not be affected by concurrent daemons.
The last one, repositories.json, stores the map from image tag to image ID:
{
    "ubuntu": {
        "ubuntu:20.04": "sha256:53df61775e8856a464ca52d4cd9eabbf4eb3ceedbde5afecc57e417e7b7155d5",
        "ubuntu@sha256:47f14534bda344d9fe6ffd6effb95eefe579f4be0d508b7445cf77f61a0e5724": "sha256:53df61775e8856a464ca52d4cd9eabbf4eb3ceedbde5afecc57e417e7b7155d5"
    }
}
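To make the layout concrete, here is a small Go sketch that resolves a tag to an image ID from the raw bytes of repositories.json. lookupImageID is an illustrative helper, not a dockerd API:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// lookupImageID resolves a reference (tag or digest) to an image ID from
// the raw bytes of repositories.json, which maps
// repository -> reference -> image ID.
func lookupImageID(data []byte, repo, ref string) (string, error) {
	var repos map[string]map[string]string
	if err := json.Unmarshal(data, &repos); err != nil {
		return "", err
	}
	return repos[repo][ref], nil
}

func main() {
	sample := []byte(`{"ubuntu": {"ubuntu:20.04": "sha256:53df61775e8856a464ca52d4cd9eabbf4eb3ceedbde5afecc57e417e7b7155d5"}}`)
	id, err := lookupImageID(sample, "ubuntu", "ubuntu:20.04")
	if err != nil {
		panic(err)
	}
	fmt.Println(id) // prints the image ID for the tag
}
```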
It may be affected by concurrent daemons, but we may have some workarounds. For example, we can rename the image via the docker API instead of tagging it at the low level, so that we avoid manipulating this JSON file directly.
Thus in the buildkit exporter code, we should remove logic like this:
if e.opt.ReferenceStore != nil {
	targetNames := strings.Split(e.targetName, ",")
	for _, targetName := range targetNames {
		tagDone := oneOffProgress(ctx, "naming to "+targetName)
		tref, err := distref.ParseNormalizedNamed(targetName)
		if err != nil {
			return nil, err
		}
		if err := e.opt.ReferenceStore.AddTag(tref, digest.Digest(id), true); err != nil {
			return nil, tagDone(err)
		}
		_ = tagDone(nil)
	}
}
Or we set ReferenceStore to nil
from envd.
We can create the image service outside of docker. Then I will see whether it is possible to embed it into the buildkitd process.
package main

import (
	"context"
	"path/filepath"

	"github.com/docker/docker/api/types"
	_ "github.com/docker/docker/daemon/graphdriver/overlay2"
	"github.com/docker/docker/daemon/images"
	dmetadata "github.com/docker/docker/distribution/metadata"
	"github.com/docker/docker/image"
	"github.com/docker/docker/layer"
	"github.com/docker/docker/pkg/idtools"
	refstore "github.com/docker/docker/reference"
)

func main() {
	root := "/var/lib/docker"
	graphDriver := "overlay2"

	// Open the layer store that dockerd normally owns.
	layerStore, err := layer.NewStoreFromOptions(layer.StoreOptions{
		Root:                      root,
		MetadataStorePathTemplate: filepath.Join(root, "image", "%s", "layerdb"),
		GraphDriver:               graphDriver,
		GraphDriverOptions:        []string{},
		IDMapping:                 idtools.IdentityMapping{},
		ExperimentalEnabled:       false,
	})
	if err != nil {
		panic(err)
	}
	for k, v := range layerStore.Map() {
		println(k, v)
	}

	imageRoot := filepath.Join(root, "image", graphDriver)
	ifs, err := image.NewFSStoreBackend(filepath.Join(imageRoot, "imagedb"))
	if err != nil {
		panic(err)
	}
	imageStore, err := image.NewImageStore(ifs, layerStore)
	if err != nil {
		panic(err)
	}
	for k, v := range imageStore.Map() {
		println(k, v.Size)
	}

	rs, err := refstore.NewReferenceStore(filepath.Join(imageRoot, "repositories.json"))
	if err != nil {
		panic(err)
	}

	distributionMetadataStore, err := dmetadata.NewFSMetadataStore(filepath.Join(imageRoot, "distribution"))
	if err != nil {
		panic(err)
	}

	// Assemble the image service on top of the stores above.
	imageService := images.NewImageService(images.ImageServiceConfig{
		DistributionMetadataStore: distributionMetadataStore,
		ImageStore:                imageStore,
		LayerStore:                layerStore,
		ReferenceStore:            rs,
	})
	is, err := imageService.Images(context.TODO(), types.ImageListOptions{})
	if err != nil {
		panic(err)
	}
	imageService.DistributionServices()
	for _, i := range is {
		println(i.ID)
	}
}
from envd.
nerdctl uses a newer buildkit https://github.com/containerd/nerdctl/blob/e77e05b5fd252274e3727e0439e9a2d45622ccb9/Dockerfile.d/SHA256SUMS.d/buildkit-v0.10.3. Can we leverage this?
from envd.
nerdctl uses a newer buildkit https://github.com/containerd/nerdctl/blob/e77e05b5fd252274e3727e0439e9a2d45622ccb9/Dockerfile.d/SHA256SUMS.d/buildkit-v0.10.3. Can we leverage this?
We are using a newer one than nerdctl.
from envd.
Found the root cause. docker/docker/layer.Store loads /var/lib/docker/image/overlay2/layerdb in its New func, so a layer added afterwards cannot be found.
// newStoreFromGraphDriver creates a new Store instance using the provided
// metadata store and graph driver. The metadata store will be used to restore
// the Store.
func newStoreFromGraphDriver(root string, driver graphdriver.Driver) (Store, error) {
	caps := graphdriver.Capabilities{}
	if capDriver, ok := driver.(graphdriver.CapabilityDriver); ok {
		caps = capDriver.Capabilities()
	}
	ms, err := newFSMetadataStore(root)
	if err != nil {
		return nil, err
	}
	ls := &layerStore{
		store:       ms,
		driver:      driver,
		layerMap:    map[ChainID]*roLayer{},
		mounts:      map[string]*mountedLayer{},
		locker:      locker.New(),
		useTarSplit: !caps.ReproducesExactDiffs,
	}
	ids, mounts, err := ms.List()
	if err != nil {
		return nil, err
	}
	for _, id := range ids {
		l, err := ls.loadLayer(id)
		if err != nil {
			logrus.Debugf("Failed to load layer %s: %s", id, err)
			continue
		}
		if l.parent != nil {
			l.parent.referenceCount++
		}
	}
	for _, mount := range mounts {
		if err := ls.loadMount(mount); err != nil {
			logrus.Debugf("Failed to load mount %s: %s", mount, err)
		}
	}
	return ls, nil
}
from envd.
It's not really possible without significant changes in the code base (and adding a lot of complexity); as mentioned, those daemons won't know what's still being used by the other daemons. If one daemon pulls an image, the other daemons won't know it's being pulled (so they don't "see" the image in the list of locally available images), and if a daemon removes an image, the other daemons will fail (because an image they expected to be there is suddenly gone).
from envd.
I think it is the end game. I am closing the issue since it is not possible.
from envd.
docker 22.06-beta supports merge op. We can add a check in envd:
if the docker version is 20.xx, use the runc worker; if it is 22.xx, use the moby worker.
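That check could look roughly like the following Go sketch. chooseWorker is a hypothetical helper; envd's real version detection may differ:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// chooseWorker picks a buildkit worker based on the dockerd version string:
// moby's bundled buildkitd only supports merge op from 22.06 on, so older
// daemons fall back to the runc worker in a separate buildkitd container.
func chooseWorker(dockerVersion string) string {
	major, err := strconv.Atoi(strings.SplitN(dockerVersion, ".", 2)[0])
	if err != nil || major < 22 {
		return "runc"
	}
	return "moby"
}

func main() {
	fmt.Println(chooseWorker("20.10.14")) // runc
	fmt.Println(chooseWorker("22.06.0"))  // moby
}
```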
from envd.