platform-registry-api's Introduction

platform-registry-api

platform-registry-api's People

Contributors

dependabot[bot], dependabot-preview[bot], asvetlov, pre-commit-ci[bot], dalazx, zubenkoivan, neu-ro-github-bot[bot], romasku, anayden, alin999, serhiy-storchaka, didcv, yevheniisemendiak, andriihomiak, eslam-yousry, shagren

platform-registry-api's Issues

AWS registry tags issue

Got this in logs:

2021-07-22 12:58:34,725 - aiohttp.server - ERROR - Error handling request
Traceback (most recent call last):
  File "/root/.local/lib/python3.8/site-packages/aiohttp/web_protocol.py", line 422, in _handle_request
    resp = await self._request_handler(request)
  File "/root/.local/lib/python3.8/site-packages/sentry_sdk/integrations/aiohttp.py", line 123, in sentry_app_handle
    reraise(*_capture_exception(hub))
  File "/root/.local/lib/python3.8/site-packages/sentry_sdk/_compat.py", line 54, in reraise
    raise value
  File "/root/.local/lib/python3.8/site-packages/sentry_sdk/integrations/aiohttp.py", line 113, in sentry_app_handle
    response = await old_handle(self, request)
  File "/root/.local/lib/python3.8/site-packages/aiohttp/web_app.py", line 499, in _handle
    resp = await handler(request)
  File "/root/.local/lib/python3.8/site-packages/aiohttp/web_middlewares.py", line 119, in impl
    return await handler(request)
  File "/root/.local/lib/python3.8/site-packages/aiohttp_remotes/x_forwarded.py", line 94, in middleware
    return await handler(request)
  File "/root/.local/lib/python3.8/site-packages/aiohttp/web_middlewares.py", line 119, in impl
    return await handler(request)
  File "/root/.local/lib/python3.8/site-packages/platform_registry_api/api.py", line 437, in handle_repo_tags_list
    response = await self._handle_aws_ecr_tags_list(
  File "/root/.local/lib/python3.8/site-packages/platform_registry_api/api.py", line 534, in _handle_aws_ecr_tags_list
    "tags": [image["imageTag"] for image in data["imageIds"]],
KeyError: 'imageIds'
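
One possible defensive rewrite of the failing list comprehension is sketched below. It assumes that a response without an `imageIds` key can be treated as "no images" (it may equally be an upstream error payload that deserves separate handling), and that untagged images carry no `imageTag` key. The function name `extract_tags` is hypothetical and is not taken from the repository.

```python
from typing import Any, Dict, List


def extract_tags(data: Dict[str, Any]) -> List[str]:
    """Collect image tags from an ECR-style payload without raising KeyError."""
    # Assumption: a missing "imageIds" key means "no images", not a failure.
    image_ids = data.get("imageIds", [])
    # Untagged images only carry "imageDigest", so entries without a tag are skipped.
    return [image["imageTag"] for image in image_ids if "imageTag" in image]
```

With this shape the handler would return an empty tag list instead of a 500 when the key is absent.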

/v2/_catalog should proxy HTTP errors properly

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/aiohttp/web_protocol.py", line 390, in start
    resp = await self._request_handler(request)
  File "/usr/local/lib/python3.6/site-packages/aiohttp/web_app.py", line 366, in _handle
    resp = await handler(request)
  File "/usr/local/lib/python3.6/site-packages/aiohttp/web_middlewares.py", line 106, in impl
    return await handler(request)
  File "/usr/local/lib/python3.6/site-packages/aiohttp_remotes/x_forwarded.py", line 73, in middleware
    return await handler(request)
  File "/usr/local/lib/python3.6/site-packages/aiohttp/web_middlewares.py", line 106, in impl
    return await handler(request)
  File "/neuromation/platform_registry_api/api.py", line 259, in handle_catalog
    client_response.raise_for_status()
  File "/usr/local/lib/python3.6/site-packages/aiohttp/client_reqrep.py", line 853, in raise_for_status
    headers=self.headers)
aiohttp.client_exceptions.ClientResponseError: 403, message='Forbidden'
2019-06-11 15:02:44,096 - aiohttp.access - INFO - 10.138.0.8 [11/Jun/2019:15:02:43 +0000] "GET /v2/_catalog HTTP/1.1" 500 330 "-" "Python/3.6 aiohttp/3.5.4" 
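
The log above shows an upstream 403 surfacing as a 500 because `raise_for_status()` is not handled. A minimal sketch of one possible status mapping follows. The function name `client_status_for` and the choice to report upstream 5xx failures as 502 are assumptions for illustration, not behaviour confirmed by the project.

```python
def client_status_for(upstream_status: int) -> int:
    """Pick the status the proxy could return for a given upstream status."""
    if 400 <= upstream_status < 500:
        # Client errors (401, 403, 404, ...) are passed through unchanged.
        return upstream_status
    if upstream_status >= 500:
        # Assumption: upstream failures are reported as 502 Bad Gateway.
        return 502
    # Successful and redirect statuses are left as they are.
    return upstream_status
```

Under this mapping the request in the log would have been answered with 403 rather than 500.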

internal server error during image push

neuro image push synthetic-reverse
Using local image 'synthetic-reverse:latest'
Using remote image 'image://adavydow/synthetic-reverse:latest'
ERROR: Docker API error: received unexpected HTTP status: 500 Internal Server Error

Just got this error.
Pushed the image again and it was pushed without any problems.
Image was pretty large (7.5 GB).

Not Authorized `neuro image tags` for images with >30 tags

Message from @mariyadavydova:

Team, I'm seeing weird behaviour with tags. I remember that we discussed the problem with AWS, but I don't remember the current state of this issue. It works as follows: if an image has more than 30 tags, you get an error (when requesting the next 30 tags):

# an image with a ton of tags
(neuro) mariyadavydova@Mariyas-MacBook-Pro platform-web-ui % neuro image tags image://neuro-public/cookiecutter-e2e/extras-e2e-custom-dockerfile
ERROR: Cannot authenticate (Not Authorized)
# an image with one tag
(neuro) mariyadavydova@Mariyas-MacBook-Pro platform-web-ui % neuro image tags image:image-1                                                     
image://neuro-public/mrsmariyadavydova/image-1:v2

Example of failed CI: https://github.com/neuromation/neuro-extras/actions/runs/259203480

Check that the upstream error messages are converted properly

Example of a message that has the upstream project name in its payload.

{'errors': [{
                 'code': 'NAME_UNKNOWN',
                 # TODO: this has to be fixed ASAP:
                 'detail': {'name': 'testproject/neuromation/unknown'},
                 'message': 'repository name not known to registry',
             }]}
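
A minimal sketch of how such a payload could be rewritten before it is returned to the client. It assumes the upstream project name (`testproject` in the example above) is known to the service and only handles the shape shown: a dict `detail` with a string `name`. The helper name `strip_upstream_project` is hypothetical.

```python
import copy
from typing import Any, Dict


def strip_upstream_project(payload: Dict[str, Any], project: str) -> Dict[str, Any]:
    """Return a copy of a registry error payload without the upstream project prefix."""
    prefix = project + "/"
    result = copy.deepcopy(payload)  # leave the caller's payload untouched
    for error in result.get("errors", []):
        detail = error.get("detail")
        # Only the shape shown above is rewritten; anything else is kept as is.
        if isinstance(detail, dict) and isinstance(detail.get("name"), str):
            if detail["name"].startswith(prefix):
                detail["name"] = detail["name"][len(prefix):]
    return result
```

Applied to the payload above with `project="testproject"`, the `name` field becomes `neuromation/unknown`.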

Exception on docker pull

Run tests/e2e/test.sh. When it executes docker pull localhost:5000/1ce6fb32-e097-426b-a140-f3136c5b2e93/ubntu:latest, the following error appears in the server logs:

registry_1     | 2018-12-28 12:09:42,401 - aiohttp.server - ERROR - Error handling request
registry_1     | Traceback (most recent call last):
registry_1     |   File "/usr/local/lib/python3.6/site-packages/aiohttp/web_protocol.py", line 242, in data_received
registry_1     |     messages, upgraded, tail = self._request_parser.feed_data(data)
registry_1     |   File "aiohttp/_http_parser.pyx", line 523, in aiohttp._http_parser.HttpParser.feed_data
registry_1     | aiohttp.http_exceptions.BadStatusLine: invalid HTTP method
registry_1     | 2018-12-28 12:09:42,407 - platform_registry_api.api - DEBUG - registry request: <Request GET /v2/ >; headers: <CIMultiDictProxy('Host': 'localhost:5000', 'User-Agent': 'docker/18.09.0-ce go/go1.11.2 git-commit/4d60db472b kernel/4.19.4-arch1-1-ARCH os/linux arch/amd64 UpstreamClient(Docker-Client/18.09.0-ce \\(linux\\))', 'Accept-Encoding': 'gzip', 'Connection': 'close')>

Do not rely on the hard-coded limit of number of images for _catalog

In PR #34 we hard-coded the number of images that the platform receives when it requests _catalog to 1000 (discussion in Russian: https://neuromation.slack.com/archives/CE76BT03G/p1547576598103500).
This means that once this limit is exceeded, only the first 1000 images are visible. Instead, the registry-api should query the docker-registry iteratively until no images remain. Note that we should use the GET parameters n and last (see https://docs.docker.com/registry/spec/api/#catalog).
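
The iterative approach could look roughly like the sketch below. `fetch_page` is a hypothetical callable standing in for `GET /v2/_catalog?n=<n>&last=<last>`; it is not part of the project. The registry specification signals further pages through a Link header, and the sketch approximates that by stopping on the first page shorter than the requested size.

```python
from typing import Callable, Iterator, List


def iter_catalog(
    fetch_page: Callable[[int, str], List[str]], page_size: int = 100
) -> Iterator[str]:
    """Yield every repository name by following n/last pagination."""
    last = ""  # an empty marker requests the first page
    while True:
        page = fetch_page(page_size, last)
        yield from page
        if len(page) < page_size:
            # A short (or empty) page is treated as the end of the catalog.
            return
        last = page[-1]  # continue after the last repository seen so far
```

Because this is a generator, the caller can stop early without fetching the remaining pages.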

AWS registry is not working properly with `21.4.26`

We cannot build images using Kaniko with the newest version of platform-registry-api on AWS clusters.

The problem appears when Kaniko pulls a cached layer from our registry to speed up the image build.
At this step, our registry fails with the error:

aiohttp.client_exceptions.ClientPayloadError: Response payload is not completed
2021-06-29 07:55:44,393 - aiohttp.server - ERROR - Error handling request
Traceback (most recent call last):
  File "/root/.local/lib/python3.7/site-packages/aiohttp/web_protocol.py", line 422, in _handle_request
    resp = await self._request_handler(request)
  File "/root/.local/lib/python3.7/site-packages/sentry_sdk/integrations/aiohttp.py", line 123, in sentry_app_handle
    reraise(*_capture_exception(hub))
  File "/root/.local/lib/python3.7/site-packages/sentry_sdk/_compat.py", line 54, in reraise
    raise value
  File "/root/.local/lib/python3.7/site-packages/sentry_sdk/integrations/aiohttp.py", line 113, in sentry_app_handle
    response = await old_handle(self, request)
  File "/root/.local/lib/python3.7/site-packages/aiohttp/web_app.py", line 499, in _handle
    resp = await handler(request)
  File "/root/.local/lib/python3.7/site-packages/aiohttp/web_middlewares.py", line 119, in impl
    return await handler(request)
  File "/root/.local/lib/python3.7/site-packages/aiohttp_remotes/x_forwarded.py", line 94, in middleware
    return await handler(request)
  File "/root/.local/lib/python3.7/site-packages/aiohttp/web_middlewares.py", line 119, in impl
    return await handler(request)
  File "/root/.local/lib/python3.7/site-packages/platform_registry_api/api.py", line 599, in handle
    auth_headers=auth_headers,
  File "/root/.local/lib/python3.7/site-packages/platform_registry_api/api.py", line 711, in _proxy_request
    async for chunk in client_response.content.iter_any():
  File "/root/.local/lib/python3.7/site-packages/aiohttp/streams.py", line 39, in __anext__
    rv = await self.read_func()
  File "/root/.local/lib/python3.7/site-packages/aiohttp/streams.py", line 386, in readany
    raise self._exception
aiohttp.client_exceptions.ClientPayloadError: Response payload is not completed

This means that the image layer was not downloaded completely and, as a result, Kaniko fails the layer checksum verification.
Example of the Kaniko error message:

INFO[0153] Found cached layer, extracting to filesystem
error building image: error building stage: failed to execute command: extracting fs from image: error verifying sha256 checksum; got "9a593afe07a07cc862136409824a9b391a352945a4dcbbd7de8084ab8b13d572", want "92c6c2f9925e03fe28f4e343e6aa128aeb78eccc59c50589cefa15db665857c8"

This problem is not reproducible with platform-registry-api v21.2.11.
As a result, we downgraded platform-registry-api on AWS clusters from v21.4.26 to v21.2.11.

The cause is not yet known.

Image is not deleted from the catalog in onprem

When you delete the last tag of an image, the image itself should no longer be displayed in neuro image ls.
This applies only to cluster deployments where we do not use a managed image registry.

@anayden you said we have enabled the garbage collector, but I couldn't find where it is.
Could you elaborate, please?

Implement simple GCR token caching

Currently we request a GCR oauth2 token each time we receive a request. We should try to cache tokens per scope and invalidate according to the expiration information in the token payload.
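
A minimal sketch of per-scope caching, assuming the token response exposes a lifetime in seconds (called `expires_in` here). `TokenCache` and `fetch_token` are hypothetical names; `fetch_token(scope)` stands in for the real OAuth2 request and returns a `(token, expires_in)` pair. The sketch is synchronous for brevity, whereas the service itself is asynchronous.

```python
import time
from typing import Callable, Dict, Tuple


class TokenCache:
    """Cache one token per scope until shortly before it expires."""

    def __init__(
        self,
        fetch_token: Callable[[str], Tuple[str, float]],
        clock: Callable[[], float] = time.monotonic,
        leeway: float = 30.0,
    ) -> None:
        self._fetch_token = fetch_token
        self._clock = clock
        self._leeway = leeway  # refresh this many seconds before expiry
        self._tokens: Dict[str, Tuple[str, float]] = {}

    def get(self, scope: str) -> str:
        now = self._clock()
        cached = self._tokens.get(scope)
        if cached is not None and cached[1] - self._leeway > now:
            return cached[0]
        # Missing or about to expire: request a fresh token for this scope.
        token, expires_in = self._fetch_token(scope)
        self._tokens[scope] = (token, now + expires_in)
        return token
```

The injectable `clock` is only there to make expiry testable without sleeping.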

Investigate docker login -u '<token>' -p ACTUAL_TOKEN host

It is not clear how to force the docker client to pass the specified token as the Bearer token value.
It might be possible to respond with 401 and WWW-Authenticate: Bearer realm=Docker Registry instead of WWW-Authenticate: Basic realm=Docker Registry, as we do now.

Output proper bodies for 403 responses

rpc error: code = Unknown desc = Error response from daemon: error parsing HTTP 403 response body: invalid character \':\' after top-level value: "403: Forbidden"
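
The daemon fails because the body "403: Forbidden" is not JSON. A minimal sketch of a body in the registry error format (an `errors` list with `code`, `message` and `detail`) is shown below. The helper name `denied_body` is hypothetical; `DENIED` is the error code the registry specification defines for denied access.

```python
import json


def denied_body(message: str = "requested access to the resource is denied") -> str:
    """Build a JSON body for a 403 response in the registry error format."""
    payload = {"errors": [{"code": "DENIED", "message": message, "detail": None}]}
    return json.dumps(payload)
```

The response carrying this body would also need a JSON content type so that clients parse it as such.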

Support SHA tags

Purpose: improve reproducibility and give users more control over images.

$ neuro pull tomcat@sha256:c34ce3c1fcc0c7431e1392cc3abd0dfe2192ffea1898d5250f199d3ac8d8720f
...

$ neuro images --digests
image:tomcat:latest sha256:c34ce3c1fcc0c7431e1392cc3abd0dfe2192ffea1898d5250f199d3ac8d8720f

Idea by @adavydow

No tags repo removal for AWS ECR

In AWS, a repo can still exist after its last tag has been removed. This leads to "unremovable" images in neuro image ls. To overcome this problem, this fix was introduced, but it can cause problems if the user tries to list tags before uploading the first image, because it will drop the repo.
The solution is to add a special endpoint for removing the repo and to call it from the client when neuro image rm is invoked for an AWS ECR repo.

AWS registry forbids two consecutive dashes in image names

From @zubenkoivan:

In our tests we pushed an image named neuro-f7b0ba1a9278974e-1/platform-e2e--6ad98d85-5516-492c-bd99-13f438aff0c1-date202106241254-date. This test is failing in AWS compute clusters because the AWS registry changed its API: it now forbids two consecutive dashes (--) in the image name.
Therefore, we need to:

  • change our CLI (or API) to reject user images with two consecutive dashes (currently allowed)
  • change the registry tests that run when the cluster is deployed
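
The rejection rule from the first item can be sketched as a small check. The function name `is_valid_image_name` is hypothetical, and the sketch covers only the double-dash rule described above, not any other naming constraint of the registry.

```python
def is_valid_image_name(name: str) -> bool:
    """Return False for image names containing two consecutive dashes."""
    return "--" not in name
```

The CLI or API could call such a check before pushing and report a clear error instead of the upstream failure.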

v2/*/tags/list returns the upstream project

curl -vv -u reg:$REGISTRY_TOKEN https://registry-staging.neu.ro/v2/dalazx/python/tags/list | jq
{
  "child": [],
  "manifest": {
    "sha256:32e8ae6d055cd64e02655d2617c0863c948d204a7ed5f1c8a149a34020c5b210": {
      "imageSizeBytes": "352124901",
      "layerId": "",
      "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
      "tag": [
        "3.6.6-stretch"
      ],
      "timeCreatedMs": "1532541907437",
      "timeUploadedMs": "1558971642550"
    }
  },
  "name": "light-reality-205619/dalazx/python",
  "tags": [
    "3.6.6-stretch"
  ]
}

We should be removing the light-reality-205619/ prefix from "name" in the response above.

Note that the fix may require implementing a dedicated handler.

Support cluster names in permission URI authority

We want to add support of extended ACL where cluster names are in the authority component of permission URIs.

This functionality should be optional. We need to introduce a feature flag set by an environment variable.

The cluster name should also be passed via an env var and should be required if the feature flag is on.
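
A minimal sketch of the configuration logic described above. The environment variable names `NP_REGISTRY_USE_CLUSTER_NAMES` and `NP_CLUSTER_NAME`, both function names, and the resulting `image://<cluster>/<repo>` form are assumptions made for illustration; the issue does not specify them.

```python
from typing import Mapping, Optional


def load_cluster_name(environ: Mapping[str, str]) -> Optional[str]:
    """Return the cluster name if the feature flag is on, otherwise None."""
    flag = environ.get("NP_REGISTRY_USE_CLUSTER_NAMES", "")  # hypothetical name
    if flag.lower() not in ("1", "true", "yes"):
        return None
    cluster_name = environ.get("NP_CLUSTER_NAME")  # hypothetical name
    if not cluster_name:
        # The cluster name is required once the feature flag is enabled.
        raise ValueError("cluster name is required when the flag is on")
    return cluster_name


def permission_uri(repo: str, cluster_name: Optional[str]) -> str:
    """Build a permission URI, placing the cluster name in the authority."""
    if cluster_name:
        return f"image://{cluster_name}/{repo}"
    return f"image://{repo}"
```

In the service, `load_cluster_name(os.environ)` would be called once at startup and the result passed to wherever permission URIs are built.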

Support image removal

Let's implement the following syntax:

neuro image remove image://user/image1:tag1
neuro image remove --all-tags  image://user/image1

docker client is not sending the Authorization header in PATCH requests

It looks like our ingress (traefik) is losing some important headers before delivering requests to our services.
Specifically, the Authorization header is missing in PATCH requests.

I used the same docker client against different servers.
The server I ran locally received the required header successfully:

2018-09-05 19:41:12,005 - platform_registry_api.api - DEBUG - registry request:
<Request PATCH /v2/ubuntu/blobs/uploads/APy59-BfDUiFCq_0jeEL1aldt-l56VrWlk9kCtkgNCPw9QuHBjyDpw3IYi-Zg6Oo6ZiyydOKZ5jxtX0Ki5qddC4 >;
headers: <CIMultiDictProxy('Host': 'localhost:5000',
'User-Agent': 'docker/18.03.1-ce go/go1.9.5 git-commit/9ee9f40 kernel/4.9.87-linuxkit-aufs os/linux arch/amd64 UpstreamClient(Docker-Client/18.03.1-ce \\(darwin\\))',
'Transfer-Encoding': 'chunked',
'Authorization': 'Basic TOKEN',
'Accept-Encoding': 'gzip',
'Connection': 'close')>

But the server deployed in K8S (in both envs) did not:

2018-09-05 19:34:27,523 - platform_registry_api.api - DEBUG - registry request: <Request PATCH /v2/ubuntu/blobs/uploads/APy59-Aj-oUGoSQzTfIpE0QTJWrDLKF-pramJVVXdqlThGedltaAe_TdATNyyVsNrz3AcOR69HH37ZIFmgtL5rw >;
headers: <CIMultiDictProxy('Host': 'registry.dev.neuromation.io',
'User-Agent': 'docker/18.03.1-ce go/go1.9.5 git-commit/9ee9f40 kernel/4.9.87-linuxkit-aufs os/linux arch/amd64 UpstreamClient(Docker-Client/18.03.1-ce \\(darwin\\))',
'Transfer-Encoding': 'chunked',
'Accept-Encoding': 'gzip',
'X-Forwarded-For': '10.128.0.9',
'X-Forwarded-Host':'registry.dev.neuromation.io',
'X-Forwarded-Port': '80',
'X-Forwarded-Proto': 'http',
'X-Forwarded-Server': 'traefik-548fc44c6c-hpj2w',
'X-Real-Ip': '10.128.0.9')> 

HEAD requests are sometimes failing

HEAD /v2/dalazx/python/blobs/sha256:c4eb586021290cde57d793b8b8a1248272bc7ff438999cecafc37cf2c09650ad is failing with

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/aiohttp/client_reqrep.py", line 757, in start
    message, payload = await self._protocol.read()
  File "/usr/local/lib/python3.6/site-packages/aiohttp/streams.py", line 543, in read
    await self._waiter
  File "/usr/local/lib/python3.6/site-packages/aiohttp/client_proto.py", line 195, in data_received
    messages, upgraded, tail = self._parser.feed_data(data)
  File "aiohttp/_http_parser.pyx", line 523, in aiohttp._http_parser.HttpParser.feed_data
aiohttp.http_exceptions.BadHttpMessage: 400, message='invalid character in chunk size header'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/aiohttp/web_protocol.py", line 390, in start
    resp = await self._request_handler(request)
  File "/usr/local/lib/python3.6/site-packages/aiohttp/web_app.py", line 366, in _handle
    resp = await handler(request)
  File "/usr/local/lib/python3.6/site-packages/aiohttp/web_middlewares.py", line 106, in impl
    return await handler(request)
  File "/usr/local/lib/python3.6/site-packages/aiohttp_remotes/x_forwarded.py", line 73, in middleware
    return await handler(request)
  File "/usr/local/lib/python3.6/site-packages/aiohttp/web_middlewares.py", line 106, in impl
    return await handler(request)
  File "/neuromation/platform_registry_api/api.py", line 307, in handle
    request, url_factory=url_factory, url=upstream_repo_url.url, token=token
  File "/neuromation/platform_registry_api/api.py", line 352, in _proxy_request
    timeout=timeout,
  File "/usr/local/lib/python3.6/site-packages/aiohttp/client.py", line 855, in __aenter__
    self._resp = await self._coro
  File "/usr/local/lib/python3.6/site-packages/aiohttp/client.py", line 391, in _request
    await resp.start(conn)
  File "/usr/local/lib/python3.6/site-packages/aiohttp/client_reqrep.py", line 762, in start
    message=exc.message, headers=exc.headers) from exc
aiohttp.client_exceptions.ClientResponseError: 400, message='invalid character in chunk size header'

Unreliable cache variable in AWSECRUpstream

The method AWSECRUpstream.create_repo uses the variable _existing_repos to cache already-created repositories. But because it is a simple in-memory set, it can contain a repo that has since been deleted. For example, suppose the following:

  1. A user uploads to repo foo -> it is created in ECR -> 'foo' is stored in _existing_repos
  2. Someone manually removes repo foo from ECR
  3. The user uploads to repo foo again -> it is in _existing_repos, so no repo is created in ECR -> the request fails with an error.

Example error from CI run:

name unknown: The repository with name 'dev/neuro-cli-e2e/e2e-banana-image' does not exist in the registry with id '771188043543'
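
One way to make such a cache tolerant of out-of-band deletions is to invalidate the entry and retry once when the upstream reports an unknown repository. The sketch below is not the project's implementation: `RepoCache`, `upload_with_retry` and the use of `LookupError` to signal "name unknown" are all hypothetical stand-ins.

```python
from typing import Callable, Set


class RepoCache:
    """In-memory record of repositories believed to exist upstream."""

    def __init__(self, create_repo: Callable[[str], None]) -> None:
        self._create_repo = create_repo
        self._existing: Set[str] = set()

    def ensure(self, name: str) -> None:
        if name not in self._existing:
            self._create_repo(name)
            self._existing.add(name)

    def invalidate(self, name: str) -> None:
        self._existing.discard(name)


def upload_with_retry(cache: RepoCache, upload: Callable[[str], str], name: str) -> str:
    """Upload to a repo, recreating it once if the cached entry proved stale."""
    cache.ensure(name)
    try:
        return upload(name)
    except LookupError:
        # Assumption: LookupError stands for the upstream "name unknown" error.
        cache.invalidate(name)
        cache.ensure(name)
        return upload(name)
```

The retry is attempted only once, so a repository that keeps disappearing still surfaces as an error.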

Integrate neuro_auth_client

  1. Add aiohttp-security;
  2. Implement IdentityPolicy;
  3. Implement AuthorizationPolicy;
  4. Update handlers to query the Auth service for user permissions.

Force username repo namespacing

We need to enforce repo namespacing so that a user can manage images only within the namespace that corresponds to their username.
If a user has the username testuser, the following requests should be accepted:

docker pull registry.staging.neuromation.io/testuser/name:tag
docker push registry.staging.neuromation.io/testuser/name:tag

whereas the following should be rejected:

docker pull registry.staging.neuromation.io/anotheruser/name:tag
docker push registry.staging.neuromation.io/anotheruser/name:tag
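
The rule can be sketched as a check on the repository path taken from the request. The function name `is_repo_allowed` is hypothetical, and the sketch assumes the repository path has the form `<namespace>/<name>` with the registry host and tag already stripped.

```python
def is_repo_allowed(username: str, repo: str) -> bool:
    """Allow access only to repositories inside the user's own namespace."""
    namespace, separator, name = repo.partition("/")
    # A path without a namespace separator or without a name is rejected.
    return bool(separator) and bool(name) and namespace == username
```

For the user testuser this accepts `testuser/name` and rejects `anotheruser/name`.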

Implement image deletion

This functionality is crucial for writing CLI E2E tests.
Without deletion, tests will litter the dev registry with temporary images.

Implement the /v2/_catalog route

Implement the missing catalog request:

https://github.com/neuromation/platform-registry-api/blob/f341e1c8980d4ac307f107650de9db3d4d5407d6/platform_registry_api/api.py#L203-L206

The task can be implemented step by step: the very first version lists only user-specific images, and the next step/enhancement implements shared containers. The latter would give a powerful way to share predefined/pre-built images that contain all required libraries, and would thus reduce the amount of work required for a typical research lifecycle.

Make OAuth2 scopes configurable

The Oracle Cloud Registry requires the scope actions to be passed explicitly.
Instead of * they require push,pull for repositories.
They also do not support the registry scopes whatsoever.
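
A minimal sketch of what configurable scopes might look like, using the `repository:<name>:<actions>` scope form of the Docker token authentication scheme. `ScopeConfig`, `build_scopes` and both field names are hypothetical, as is the choice of `registry:catalog:*` as the registry-level scope.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ScopeConfig:
    repository_actions: str = "*"  # e.g. "push,pull" for Oracle Cloud Registry
    registry_scopes_supported: bool = True


def build_scopes(config: ScopeConfig, repo: Optional[str] = None) -> List[str]:
    """Build the scope list for a token request from the configuration."""
    if repo is not None:
        return [f"repository:{repo}:{config.repository_actions}"]
    if config.registry_scopes_supported:
        return ["registry:catalog:*"]
    # Registry-level scopes are not supported: request no scope at all.
    return []
```

With `ScopeConfig(repository_actions="push,pull", registry_scopes_supported=False)` a repository request yields `repository:<name>:push,pull` and a catalog request yields no scope.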
