We have a bunch of container images in Azure's container registry, and want to move th

Move from Azure to AWS - tracking about chirpycardinal HOT 4 CLOSED

stanfordnlp commented on July 24, 2024

Move from Azure to AWS - tracking

from chirpycardinal.

Comments (4)

shashank2000 commented on July 24, 2024

I tried downloading the images to my own computer, but they are far too large and the download takes a very long time. I may be doing something wrong: perhaps we need only the latest image rather than the whole repository, though most repositories seem to contain only one image. I may also be misreading the output.

 az acr repository show-manifests --name <redacted> --repository chirpy/blenderbot --detail --query '[].{Size: imageSize, Tags: tags}'
This command has been deprecated and will be removed in a future release. Use 'acr manifest list-metadata' instead.
[
  {
    "Size": 15018375090,
    "Tags": [
      "latest"
    ]
  }
]
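
For scale, the Size field above is in bytes. A minimal sketch of a conversion helper follows; the function name bytes_to_gib is hypothetical and not part of the az CLI.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): convert a byte count to GiB.
bytes_to_gib() {
  awk -v b="$1" 'BEGIN { printf "%.2f\n", b / (1024 * 1024 * 1024) }'
}

# The chirpy/blenderbot size reported above.
bytes_to_gib 15018375090   # prints 13.99
```

At roughly 14 GiB for a single image, pulling over a home connection is slow, which is consistent with moving the work to a cluster node.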

shashank2000 commented on July 24, 2024

Going to try running a job on the cluster. The first goal is simply to download a container image from Azure. The eventual goal is a script that takes a container image URL, downloads the image, and uploads it to AWS ECR. All of this will run as a job on a john node.

To download a container image from Azure locally, I first run az login and then pull from the container registry. The issue is that az login is interactive, so I need to get a client secret from Azure and add it to the script so that it can run as an actual Slurm job.
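
A non-interactive login could look roughly like the sketch below. It assumes a service principal has been created for the registry and that its credentials are exported in environment variables; the AZ_* variable names and both function names are assumptions, not something from this thread.

```shell
#!/bin/sh
# Sketch only: AZ_CLIENT_ID, AZ_CLIENT_SECRET and AZ_TENANT_ID are assumed
# variable names. Returns non-zero and names the first missing variable.
require_env() {
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      return 1
    fi
  done
}

# Log in to Azure and to the container registry without a browser prompt.
# $1 = registry name (without the .azurecr.io suffix).
azure_login() {
  require_env AZ_CLIENT_ID AZ_CLIENT_SECRET AZ_TENANT_ID || return 1
  az login --service-principal \
    --username "$AZ_CLIENT_ID" \
    --password "$AZ_CLIENT_SECRET" \
    --tenant "$AZ_TENANT_ID" >/dev/null || return 1
  az acr login --name "$1"
}
```

Keeping the secret in the environment rather than in the script itself avoids committing it by accident; az acr login also needs a running Docker daemon on the node.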

shashank2000 commented on July 24, 2024

For each repository (a group of image versions of the same container) in Azure, we get the most recent image by running az acr repository show-tags --name stanfordoval --repository chirpy/chatserver --orderby time_desc --top 1. I wrote a script that lists all the images in Azure that we care about, and I am running each command in sequence in a Slurm job on the cluster.

Here are the commands:

docker pull stanfordoval.azurecr.io/chirpy/blenderbot:latest
docker pull stanfordoval.azurecr.io/chirpy/chatserver:20220924.7
docker pull stanfordoval.azurecr.io/chirpy/convpara:latest
docker pull stanfordoval.azurecr.io/chirpy/corenlp:20220715
docker pull stanfordoval.azurecr.io/chirpy/dialogact:20201111.1
docker pull stanfordoval.azurecr.io/chirpy/entitylinker:20210107
docker pull stanfordoval.azurecr.io/chirpy/g2p:20201111.1
docker pull stanfordoval.azurecr.io/chirpy/gpt2ed:20201116.1
docker pull stanfordoval.azurecr.io/chirpy/infiller:20210107
docker pull stanfordoval.azurecr.io/chirpy/questionclassifier:20201111.1
docker pull stanfordoval.azurecr.io/chirpy/responseranker:20210108
docker pull stanfordoval.azurecr.io/chirpy/stanfordnlp:20201111.1
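
A list like the one above could be generated along these lines. This is a sketch, not the actual script: the function names are hypothetical, it assumes an already logged-in az CLI, and it does not include whatever filtering was used to pick the images we care about.

```shell
#!/bin/sh
# Build the full reference for an image in Azure Container Registry.
# $1 = registry name, $2 = repository, $3 = tag
acr_image_ref() {
  printf '%s.azurecr.io/%s:%s\n' "$1" "$2" "$3"
}

# Print one "docker pull" line per repository, using its most recent tag.
# $1 = registry name. Requires a logged-in az CLI.
print_pull_commands() {
  registry="$1"
  az acr repository list --name "$registry" --output tsv |
  while IFS= read -r repo; do
    tag=$(az acr repository show-tags --name "$registry" \
            --repository "$repo" --orderby time_desc --top 1 --output tsv)
    echo "docker pull $(acr_image_ref "$registry" "$repo" "$tag")"
  done
}

# Usage: print_pull_commands stanfordoval > pull_images.sh
```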

It's hard to see the status of a job on Slurm. You can log in to the john node the job is running on, but I still haven't figured out how to see stdout. I tried writing to a file, but that seems to be erroring out. Writing to a file still seems promising, though.
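
One common way to capture a job's stdout is to let Slurm write it to a log file through the sbatch --output and --error options. The sketch below writes such a batch script; the job name, log file name, and the wrapped pull_images.sh script are placeholders, and it assumes the submission directory is on a filesystem the compute node can write to.

```shell
#!/bin/sh
# Write a minimal Slurm batch script to the path given as $1.
# Job name, log path and the wrapped script name are placeholders.
write_job_script() {
  cat > "$1" <<'EOF'
#!/bin/bash
# %j in the log file name expands to the Slurm job ID.
#SBATCH --job-name=acr-pull
#SBATCH --output=acr-pull-%j.log
#SBATCH --error=acr-pull-%j.log
# Echo each command into the log as it runs.
set -x
./pull_images.sh
EOF
}

# Usage:
#   write_job_script pull.sbatch && sbatch pull.sbatch
#   tail -f acr-pull-<jobid>.log
```

Sending stderr to the same file keeps error messages next to the command that produced them.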

There seems to be a pretty clear pipeline for replicating the exact repository structure on AWS ECR; this page had pretty much everything I needed. I'm now testing with some toy examples on my local computer.
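
The ECR side of that pipeline can be sketched as below, assuming AWS credentials are already configured for the aws CLI. The function names are hypothetical, and the account ID and region are values you would supply.

```shell
#!/bin/sh
# Registry host for an AWS account and region.
# $1 = AWS account ID, $2 = region
ecr_host() {
  printf '%s.dkr.ecr.%s.amazonaws.com\n' "$1" "$2"
}

# Authenticate Docker against ECR.
# $1 = AWS account ID, $2 = region
ecr_login() {
  aws ecr get-login-password --region "$2" |
    docker login --username AWS --password-stdin "$(ecr_host "$1" "$2")"
}

# Create the repository only if it does not already exist, so the same
# path (for example chirpy/blenderbot) can be reused from Azure.
# $1 = repository name, $2 = region
ensure_ecr_repo() {
  aws ecr describe-repositories --region "$2" \
      --repository-names "$1" >/dev/null 2>&1 ||
    aws ecr create-repository --region "$2" \
      --repository-name "$1" >/dev/null
}
```

Unlike Azure CR, ECR does not create a repository on first push, which is why the repository is created explicitly before uploading.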

shashank2000 commented on July 24, 2024

The final fix was a script that also handles AWS CLI authentication. The overall workflow: list all repos in Azure CR, loop through each repo, find the latest image, download it to an NLP cluster node, create the same repo structure in AWS ECR, and upload the image to the appropriate place.
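
The per-image copy step of that workflow can be sketched as follows. This is not the actual script: the function names and the DRY_RUN switch are assumptions, and it expects Docker to be logged in to both registries and the target ECR repository to exist.

```shell
#!/bin/sh
# Map an Azure image reference to the matching ECR reference by swapping
# the registry host and keeping the repository path and tag.
# $1 = Azure reference, $2 = ECR registry host
to_ecr_ref() {
  printf '%s/%s\n' "$2" "${1#*/}"
}

# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

# Copy one image: pull from Azure, retag for ECR, push.
# $1 = Azure reference, $2 = ECR registry host
migrate_image() {
  src="$1"
  dst=$(to_ecr_ref "$1" "$2")
  run docker pull "$src" &&
    run docker tag "$src" "$dst" &&
    run docker push "$dst"
}

# Usage (one Azure reference per line on stdin; host is a placeholder):
#   while IFS= read -r ref; do
#     migrate_image "$ref" "<account>.dkr.ecr.<region>.amazonaws.com"
#   done < images.txt
```

Running once with DRY_RUN=1 prints the planned commands, which is a cheap check before moving images of this size.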

