Comments (4)
I tried downloading the images to my computer, but they are very large and the download takes a long time. I may be doing something wrong: perhaps we need only the latest image rather than the whole repository, although most repositories seem to contain only one image, so I may be misreading this.
az acr repository show-manifests --name <redacted> --repository chirpy/blenderbot --detail --query '[].{Size: imageSize, Tags: tags}'
This command has been deprecated and will be removed in a future release. Use 'acr manifest list-metadata' instead.
[
{
"Size": 15018375090,
"Tags": [
"latest"
]
}
]
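The deprecation warning above points to `az acr manifest list-metadata`. A rough sketch of the same query with that command follows, assuming it reports the same `imageSize` and `tags` fields as the old one (not verified here). The helper names `bytes_to_gib` and `list_image_sizes` are made up for illustration.

```shell
# Convert a byte count to GiB with two decimals (illustrative helper).
bytes_to_gib() {
  awk -v b="$1" 'BEGIN { printf "%.2f GiB\n", b / (1024 * 1024 * 1024) }'
}

# Sketch of the non-deprecated query. The field names in --query are
# assumed to match the output of the old show-manifests command.
list_image_sizes() {
  local registry="$1"
  local repository="$2"
  az acr manifest list-metadata \
    --registry "$registry" \
    --name "$repository" \
    --query '[].{Size: imageSize, Tags: tags}'
}

bytes_to_gib 15018375090   # prints 13.99 GiB
```

So the single blenderbot image is roughly 14 GiB, which would explain the slow download even when only the latest tag is pulled.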
from chirpycardinal.
Going to try running a job on the cluster. The first goal is simply to download a container image from Azure. The eventual goal is a script that takes a container image URL, downloads the image, and uploads it to AWS ECR. All of this will run as a job on a john node.
To download a container image from Azure locally, I first run az login and then pull from the container registry. The issue is that az login is interactive, so I need to get a client secret from Azure and add it to the script so that it can run as an actual Slurm job.
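One possible shape for the non-interactive login is a service principal whose credentials are read from the environment. This is a sketch only: the variable names `AZ_CLIENT_ID`, `AZ_CLIENT_SECRET`, and `AZ_TENANT_ID` and both function names are hypothetical, and it assumes a service principal with pull access to the registry already exists.

```shell
# Fail early if any named environment variable is missing or empty.
require_env() {
  local name value
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      return 1
    fi
  done
}

# Log in with a service principal instead of the interactive browser flow,
# then authenticate Docker against the registry.
azure_login() {
  local registry="$1"
  require_env AZ_CLIENT_ID AZ_CLIENT_SECRET AZ_TENANT_ID || return 1
  az login --service-principal \
    --username "$AZ_CLIENT_ID" \
    --password "$AZ_CLIENT_SECRET" \
    --tenant "$AZ_TENANT_ID" >/dev/null || return 1
  az acr login --name "$registry"
}
```

Reading the secret from the environment rather than hardcoding it keeps it out of the script file and out of version control. Calling `azure_login stanfordoval` at the top of the job script would then replace the manual login.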
from chirpycardinal.
For each repository (a group of image versions of the same container) in Azure, we get the most recent image by running
az acr repository show-tags --name stanfordoval --repository chirpy/chatserver --orderby time_desc --top 1
I wrote a script that lists all the images in Azure that we care about, and I am running each command in sequence in a Slurm job on the cluster.
Here are the commands:
docker pull stanfordoval.azurecr.io/chirpy/blenderbot:latest
docker pull stanfordoval.azurecr.io/chirpy/chatserver:20220924.7
docker pull stanfordoval.azurecr.io/chirpy/convpara:latest
docker pull stanfordoval.azurecr.io/chirpy/corenlp:20220715
docker pull stanfordoval.azurecr.io/chirpy/dialogact:20201111.1
docker pull stanfordoval.azurecr.io/chirpy/entitylinker:20210107
docker pull stanfordoval.azurecr.io/chirpy/g2p:20201111.1
docker pull stanfordoval.azurecr.io/chirpy/gpt2ed:20201116.1
docker pull stanfordoval.azurecr.io/chirpy/infiller:20210107
docker pull stanfordoval.azurecr.io/chirpy/questionclassifier:20201111.1
docker pull stanfordoval.azurecr.io/chirpy/responseranker:20210108
docker pull stanfordoval.azurecr.io/chirpy/stanfordnlp:20201111.1
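A list like the one above could be generated along these lines. This is not the original script, just a sketch of the same idea: list the repositories, ask for the newest tag of each, and print one pull command per repository. The function names are illustrative, and filtering on the `chirpy` prefix is an assumption about which images matter.

```shell
# Print the newest tag of one repository (same query as above, plain-text output).
latest_tag() {
  az acr repository show-tags --name "$1" --repository "$2" \
    --orderby time_desc --top 1 --output tsv
}

# Build the fully qualified image reference used by docker pull.
image_ref() {
  printf '%s.azurecr.io/%s:%s\n' "$1" "$2" "$3"
}

# List every repository under a prefix and print one docker pull command each.
print_pull_commands() {
  local registry="$1"
  local prefix="$2"
  local repo tag
  az acr repository list --name "$registry" --output tsv |
    grep "^${prefix}/" |
    while read -r repo; do
      tag="$(latest_tag "$registry" "$repo")"
      echo "docker pull $(image_ref "$registry" "$repo" "$tag")"
    done
}
```

Running `print_pull_commands stanfordoval chirpy > pull_commands.sh` would write the commands to a file that the Slurm job can execute line by line.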
It's hard to see the status of a job on Slurm. You can log in to the john node that the job is running on, but I still haven't figured out a way to see stdout. I tried writing to a file, but that seems to be erroring out. Writing to a file still seems promising, though.
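One way to get stdout into a file is to let Slurm do the redirection through `#SBATCH --output`. The sketch below writes a minimal batch script that does this. The job name, log file names, and the function name `write_batch_script` are placeholders, and the single `docker pull` line stands in for the real list of commands.

```shell
# Write a minimal batch script whose stdout and stderr go to per-job log files.
# Slurm replaces %j with the job ID.
write_batch_script() {
  local script_path="$1"
  local log_dir="$2"
  cat > "$script_path" <<EOF
#!/bin/bash
#SBATCH --job-name=pull-images
#SBATCH --output=${log_dir}/pull-%j.out
#SBATCH --error=${log_dir}/pull-%j.err

docker pull stanfordoval.azurecr.io/chirpy/corenlp:20220715
EOF
}

# Typical use (needs Slurm, so shown as comments only):
#   mkdir -p "$HOME/slurm-logs"
#   write_batch_script pull.sbatch "$HOME/slurm-logs"
#   sbatch pull.sbatch
#   squeue -u "$USER"                          # is the job pending or running?
#   tail -f "$HOME/slurm-logs"/pull-<jobid>.out
```

Slurm does not create the log directory itself, so a missing directory is one possible cause of the write errors mentioned above, though that is a guess.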
There seems to be a pretty clear pipeline to replicate the exact repository structure on AWS ECR - this page had pretty much everything I needed. I'm testing with some toy examples on my local computer now.
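The pipeline for a single image might look like the sketch below, which keeps the repository path and tag identical on the ECR side. The function names are illustrative, and the AWS account ID and region are caller-supplied placeholders rather than values from this thread.

```shell
# Map an Azure image reference to the matching ECR reference,
# keeping the repository path and tag unchanged.
ecr_ref() {
  local azure_image="$1"
  local ecr_host="$2"
  echo "${ecr_host}/${azure_image#*/}"
}

# Strip the registry host and the tag, e.g. chirpy/corenlp.
repo_name() {
  local path="${1#*/}"
  echo "${path%:*}"
}

# Log in to ECR, make sure the repository exists, then retag and push.
push_to_ecr() {
  local azure_image="$1"
  local account_id="$2"
  local region="$3"
  local ecr_host="${account_id}.dkr.ecr.${region}.amazonaws.com"
  local target
  target="$(ecr_ref "$azure_image" "$ecr_host")"
  aws ecr get-login-password --region "$region" |
    docker login --username AWS --password-stdin "$ecr_host" || return 1
  # Creating a repository that already exists fails, so the error is ignored.
  aws ecr create-repository --region "$region" \
    --repository-name "$(repo_name "$azure_image")" >/dev/null 2>&1 || true
  docker tag "$azure_image" "$target" && docker push "$target"
}
```

Ignoring the `create-repository` error keeps the script rerunnable, at the cost of hiding real failures such as missing permissions. Those would still surface when `docker push` fails.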
from chirpycardinal.
The final fix was a script that also handles AWS CLI authentication. The overall workflow: list all repos in Azure CR, loop through each repo, find the latest image, download it to an NLP cluster node, create the same repo structure in AWS ECR, and upload the image to the appropriate place.
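The loop part of that workflow could be sketched as follows. This is not the script that was actually used. It assumes authentication and ECR repository creation have already happened, it reads image references from stdin, and the `DRY_RUN` switch and function names are inventions for this example.

```shell
# Run a command, or only print it when DRY_RUN=1.
# stdin is redirected so the command cannot consume the image list.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@" </dev/null
  fi
}

# Read Azure image references from stdin and copy each one to ECR.
# A failed image is reported and skipped so one bad pull does not stop the job.
migrate_images() {
  local ecr_host="$1"
  local image target
  local failed=0
  while read -r image; do
    [ -n "$image" ] || continue
    target="${ecr_host}/${image#*/}"
    if run docker pull "$image" &&
       run docker tag "$image" "$target" &&
       run docker push "$target"; then
      # Remove local copies so multi-gigabyte images do not fill the node's disk.
      run docker rmi "$image" "$target" || true
    else
      echo "FAILED: $image" >&2
      failed=$((failed + 1))
    fi
  done
  [ "$failed" -eq 0 ]
}
```

With `DRY_RUN=1` set, piping a list of image references into `migrate_images <ecr-host>` prints the planned docker commands without touching either registry, which is a cheap check before submitting the real job.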
from chirpycardinal.