
Comments (8)

mattmoor commented on May 18, 2024

Yeah, this is an unfortunate consequence of some strange choices Docker made about the v2.2 manifest format and extremely poor Python gzip performance**. :(

Our new intermediate format can be so fast on the happy path because it's basically designed to avoid Python g[un]zip and random accesses within a tarball.

There are other nuances that enable Bazel to prune other optional work depending on the result the user is looking for.

** - With fresh eyes, we could probably gunzip layers in a threadpool here, but (I'm guessing) this will still be several times slower than switching to v2.2.
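
Roughly, that threadpool idea would look something like this (an untested sketch with hypothetical names, not actual containerregistry code; assumes Python 3):

import concurrent.futures
import gzip

def _gunzip(blob):
  # Each call still pays the Python gzip cost described above; the
  # threadpool only lets the per-layer work overlap.
  return gzip.decompress(blob)

def gunzip_layers(compressed_layers, threads=8):
  # Hypothetical helper: decompress a list of layer blobs concurrently.
  with concurrent.futures.ThreadPoolExecutor(max_workers=threads) as pool:
    return list(pool.map(_gunzip, compressed_layers))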


mattmoor commented on May 18, 2024

There are two pullers, which were you using?

The version that spits out a docker save tarball is going to be slow because producing that format means gunzipping every layer in Python.

The version that spits out the more proprietary form we consume here should be fast, but one exception to this is if it needs to convert a v2 image to v2.2 here.

Can you share a little more about how you are invoking this, so I can help diagnose?
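
For what it's worth, you can ask the registry directly which manifest format it will serve. A minimal stdlib sketch (assumes the registry allows anonymous pulls; Docker Hub would need a bearer token first):

import urllib.request

MANIFEST_V22 = 'application/vnd.docker.distribution.manifest.v2+json'

def manifest_media_type(registry, repository, tag):
  # Advertise v2.2 support via Accept; the Content-Type of the reply
  # shows whether the registry served v2.2 or fell back to v2, which
  # is the case that forces the slow conversion.
  url = 'https://%s/v2/%s/manifests/%s' % (registry, repository, tag)
  req = urllib.request.Request(url, headers={'Accept': MANIFEST_V22})
  with urllib.request.urlopen(req) as resp:
    return resp.headers.get('Content-Type')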


promiseofcake commented on May 18, 2024

Hey @mattmoor, thanks for the quick reply. I have been hoping to use this as part of bazelbuild/rules_docker, so the proprietary form would be fine, but I was getting timeouts after 600s, so I started debugging with the underlying :puller target from this repo.

I think we might be falling into the "if it needs to convert a v2 image to v2.2" case, so I did some benchmarking locally:

$ time bazel-bin/puller --directory openjdk8 --name index.docker.io/library/openjdk:8

real	0m33.088s
user	0m3.014s
sys	0m1.985s

But for our private registry (same image pushed up to it):

$ time bazel-bin/puller --directory private-openjdk8 --name tld.for.private.registry/library/openjdk:8

real	6m17.294s
user	0m11.760s
sys	0m6.030s

We are running image: distribution/registry:2.1.1, so I think we may just need to upgrade to a registry version that supports the v2.2 manifest format?


mattmoor commented on May 18, 2024

If you are able to bazel run :puller, can you try instrumenting that line and see if it hits?

The 2.1.1 tag reports as being built 2 years ago, so I'd definitely update. You may need to republish the image for the change to take effect.


promiseofcake commented on May 18, 2024

Logging patch:

diff --git a/tools/fast_puller_.py b/tools/fast_puller_.py
index 64c61a5..1ff9d2c 100755
--- a/tools/fast_puller_.py
+++ b/tools/fast_puller_.py
@@ -19,6 +19,7 @@ Unlike docker_puller the format this uses is proprietary.
 
 
 import argparse
+import logging
 
 from containerregistry.client import docker_creds
 from containerregistry.client import docker_name
@@ -73,12 +74,16 @@ def main():
   creds = docker_creds.DefaultKeychain.Resolve(name)
 
   with v2_2_image.FromRegistry(name, creds, transport, accept) as v2_2_img:
+    logging.warning("with v2_2_image.FromRegistry")
     if v2_2_img.exists():
+      logging.warning("saving fast for v2_2 image")
       save.fast(v2_2_img, args.directory, threads=_THREADS)
       return
 
   with v2_image.FromRegistry(name, creds, transport) as v2_img:
+    logging.warning("with v2_image.FromRegistry")
     with v2_compat.V22FromV2(v2_img) as v2_2_img:
+      logging.warning("with v2_compat.V22FromV2")
       save.fast(v2_2_img, args.directory, threads=_THREADS)
       return

$ time bazel-bin/puller --directory openjdk8 --name index.docker.io/library/openjdk:8
WARNING:root:with v2_2_image.FromRegistry
WARNING:root:saving fast for v2_2 image

real	0m28.850s
user	0m3.495s
sys	0m2.366s
$ time bazel-bin/puller --directory private-openjdk8 --name tld.for.private.registry/library/openjdk:8
WARNING:root:with v2_2_image.FromRegistry
WARNING:root:with v2_image.FromRegistry
WARNING:root:with v2_compat.V22FromV2

real	5m13.048s
user	0m14.126s
sys	0m7.164s

Looks like that could be it, thanks @mattmoor.


promiseofcake commented on May 18, 2024

@mattmoor, so we have upgraded our registry to 2.5.2 in order to take advantage of the new manifest version. I have also bumped my version of this repo to v0.0.22 and re-built the puller and am getting the following results:

$ time bazel-bin/puller --stderrthreshold DEBUG --directory openjdk8 --name index.docker.io/library/openjdk:8
I1212 14:56:00.225013   75771 docker_creds_.py:228] Loading Docker credentials for repository 'index.docker.io/library/openjdk:8'
I1212 14:56:00.225629   75771 fast_puller_.py:79] Pulling v2.2 image from <containerregistry.client.docker_name_.Tag object at 0x10f1980d0> ...

real	0m26.948s
user	0m2.628s
sys	0m2.335s
$ time bazel-bin/puller --stderrthreshold DEBUG --directory private-openjdk8 --name tld.for.private.registry/library/openjdk:8
I1212 14:57:01.843225   75821 docker_creds_.py:228] Loading Docker credentials for repository 'tld.for.private.registry/library/openjdk:8'
I1212 14:57:01.843544   75821 fast_puller_.py:79] Pulling v2.2 image from <containerregistry.client.docker_name_.Tag object at 0x1108d90d0> ...

real	3m32.554s
user	0m3.144s
sys	0m2.464s

When observing the network pane of Activity Monitor on macOS, I can see we are pulling much more slowly from our private registry than from the public one. We are not running anything fancy for our registry; any tips on speeding up the pulls? Is there any additional logging I can add to try to isolate where the slowdowns are?


mattmoor commented on May 18, 2024

That's interesting. These should be following nearly identical code paths, so I'm wondering if this is just registry slowness? Hard to say without further instrumentation.

e.g. the major registries I've played with (or built) serve blobs by redirecting to the underlying storage (e.g. GCS, S3) because those stacks are optimized for delivering large data very fast. For the registry to serve the data directly, it would have to read the data itself and then proxy it to the client. Knowing nothing about your private stack, I'm left to wonder about this :)
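
A quick way to tell which camp a registry falls into is to fetch a blob without following redirects (a stdlib sketch, anonymous pulls assumed):

import urllib.error
import urllib.request

class _NoRedirect(urllib.request.HTTPRedirectHandler):
  def redirect_request(self, req, fp, code, msg, headers, newurl):
    # Returning None makes urlopen raise HTTPError on a 3xx instead of
    # silently following it.
    return None

def blob_location(registry, repository, digest):
  # Object-storage-backed registries (GCS, S3) typically answer a blob
  # GET with a 307 plus a Location header; a registry that proxies the
  # bytes itself answers 200 with the data.
  url = 'https://%s/v2/%s/blobs/%s' % (registry, repository, digest)
  opener = urllib.request.build_opener(_NoRedirect)
  try:
    return opener.open(url).getcode(), None
  except urllib.error.HTTPError as err:
    return err.code, err.headers.get('Location')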

I'd probably instrument this code to find out how much time is spent in accessor(arg), which is effectively the blob or config fetch.
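
Something like this wrapper (hypothetical, not in the codebase) would do it:

import logging
import time

def timed(accessor):
  # Wrap the blob/config fetch so every call logs its wall-clock time;
  # apply it at the accessor(arg) call site mentioned above.
  def wrapper(arg):
    start = time.time()
    result = accessor(arg)
    logging.warning('accessor(%r) took %.2fs', arg, time.time() - start)
    return result
  return wrapper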


lshamis commented on May 18, 2024

Has this been resolved? I think I'm having similar issues.

