Comments (8)
Yeah, this is an unfortunate consequence of some strange choices Docker made about the v2.2 manifest format and extremely poor Python gzip performance**. :(
Our new intermediate format can be so fast on the happy path because it's basically designed to avoid Python g[un]zip and random accesses within a tarball.
There are other nuances that enable Bazel to prune other optional work depending on the result the user is looking for.
** - With fresh eyes, we could probably gunzip
layers in a threadpool here, but (I'm guessing) this will still be several times slower than switching to v2.2.
from containerregistry.
There are two pullers, which were you using?
The version that spits out a docker save
tarball is going to be slow because that form isn't great.
The version that spits out the more proprietary form we consume here should be fast, but one exception to this is if it needs to convert a v2 image to v2.2 here.
Can you share a little more about how you are invoking this, so I can help diagnose?
from containerregistry.
Hey @mattmoor thanks for the quick reply. So I have been hoping to use it as part of bazelbuild/rules_docker so the proprietary form would be fine, but was getting timeouts after 600s so I started debugging with the underlying :puller
target from this repo.
I think we might be falling into the "if it needs to convert a v2 image to v2.2" use case, I am doing some benchmarking locally:
$ time bazel-bin/puller --directory openjdk8 --name index.docker.io/library/openjdk:8
real 0m33.088s
user 0m3.014s
sys 0m1.985s
But for our private registry (same image pushed up to it):
$ time bazel-bin/puller --directory private-openjdk8 --name tld.for.private.registry/library/openjdk:8
real 6m17.294s
user 0m11.760s
sys 0m6.030s
We are running image: distribution/registry:2.1.1
, so I think we may just need to upgrade to take advantage of the faster speeds?
from containerregistry.
If you are able to bazel run :puller
can you try instrumenting that line and see if it hits?
The 2.1.1
tag reports as being built 2 years ago
, so I'd definitely update. You may need to republish the image for the change to take effect.
from containerregistry.
Logging patch:
diff --git a/tools/fast_puller_.py b/tools/fast_puller_.py
index 64c61a5..1ff9d2c 100755
--- a/tools/fast_puller_.py
+++ b/tools/fast_puller_.py
@@ -19,6 +19,7 @@ Unlike docker_puller the format this uses is proprietary.
import argparse
+import logging
from containerregistry.client import docker_creds
from containerregistry.client import docker_name
@@ -73,12 +74,16 @@ def main():
creds = docker_creds.DefaultKeychain.Resolve(name)
with v2_2_image.FromRegistry(name, creds, transport, accept) as v2_2_img:
+ logging.warning("with v2_2_image.FromRegistry")
if v2_2_img.exists():
+ logging.warning("saving fast for v2_2 image")
save.fast(v2_2_img, args.directory, threads=_THREADS)
return
with v2_image.FromRegistry(name, creds, transport) as v2_img:
+ logging.warning("with v2_image.FromRegistry")
with v2_compat.V22FromV2(v2_img) as v2_2_img:
+ logging.warning("with v2_compat.V22FromV2")
save.fast(v2_2_img, args.directory, threads=_THREADS)
return
time bazel-bin/puller --directory openjdk8 --name index.docker.io/library/openjdk:8
WARNING:root:with v2_2_image.FromRegistry
WARNING:root:saving fast for v2_2 image
real 0m28.850s
user 0m3.495s
sys 0m2.366s
$ time bazel-bin/puller --directory private-openjdk8 --name tld.for.private.registry/library/openjdk:8
WARNING:root:with v2_2_image.FromRegistry
WARNING:root:with v2_image.FromRegistry
WARNING:root:with v2_compat.V22FromV2
real 5m13.048s
user 0m14.126s
sys 0m7.164s
Looks like that could be it, thanks @mattmoor.
from containerregistry.
@mattmoor, so we have upgraded our registry to 2.5.2
in order to take advantage of the new manifest version. I have also bumped my version of this repo to v0.0.22
and re-built the puller
and am getting the following results:
$ time bazel-bin/puller --stderrthreshold DEBUG --directory openjdk8 --name index.docker.io/library/openjdk:8
I1212 14:56:00.225013 75771 docker_creds_.py:228] Loading Docker credentials for repository 'index.docker.io/library/openjdk:8'
I1212 14:56:00.225629 75771 fast_puller_.py:79] Pulling v2.2 image from <containerregistry.client.docker_name_.Tag object at 0x10f1980d0> ...
real 0m26.948s
user 0m2.628s
sys 0m2.335s
$ time bazel-bin/puller --stderrthreshold DEBUG --directory private-openjdk8 --name tld.for.private.registry/library/openjdk:8
I1212 14:57:01.843225 75821 docker_creds_.py:228] Loading Docker credentials for repository 'tld.for.private.registry/library/openjdk:8'
I1212 14:57:01.843544 75821 fast_puller_.py:79] Pulling v2.2 image from <containerregistry.client.docker_name_.Tag object at 0x1108d90d0> ...
real 3m32.554s
user 0m3.144s
sys 0m2.464s
When observing the network pane of Actibity monitor in OSX, I am pulling much slower from our private registry in comparison to the public one. We are not running anything fancy for our registry, any tips on speeding up the pulls? Is there any additional logging I can add to try to isolate where the slow-downs are?
from containerregistry.
That's interesting. These should be following nearly identical code paths, so I'm wondering if this is just registry slowness? Hard to say without further instrumentation.
e.g. the major registries I've played with (or built) serve blobs by redirecting to the underlying storage (e.g. GCS, S3) because those stacks are optimized for delivering large data very fast. For the registry to serve the data directly, it would have to read the data itself and then proxy it to the client. Knowing nothing about your private stack, I'm left to wonder about this :)
I'd probably instrument this code to find out how much time is spent in accessor(arg)
, which is effectively the blob or config fetch.
from containerregistry.
Has this been resolved? I think I'm having similar issues.
from containerregistry.
Related Issues (20)
- foreign layers not pulled properly HOT 2
- Example for pulling an image from a remote registry into the docker daemon HOT 3
- error wnhen using docker 2.0.0.3
- error wnhen using docker 2.0.0.3
- error wnhen using docker 2.0.0.3
- error wnhen using docker 2.0.0.3 HOT 2
- error wnhen using docker 2.0.0.3
- auth error when using default config.json from docker 2.0.0.3 HOT 4
- fast_puller_.py pulls the wrong image from GCR HOT 2
- forced url server specification fails to match for docker-credentials-secretservice
- fast_puller_.py pulls an image with an unknown SHA256 from quay.io HOT 1
- Pushes to gitlab fail with SSL error HOT 6
- Pushes to gitlab fail with SSL error HOT 1
- save.py: digest file writing thread bugs
- Doens't work with Bazel 0.25.0 because it uses Python 3 by default HOT 7
- containerregistry in Pypi
- image_digester.py fails to run on Python3-default systems HOT 1
- Is there a unittest suite for this code? HOT 1
- Support images larger than 2GiB in docker_session.upload() HOT 3
- Docker Puller tool does not work with Dockerhub images HOT 2
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from containerregistry.