Comments (13)
@awalford16 I'm able to reproduce this on my side, so I'll be working on a fix. The ETA for a fix is sometime in the next month.
cc: @northtyphoon @ganeshkumarashok
from acr.
Thanks for reporting this. Does the nodepool have access to both registries?
Hi, yes, we have a service principal tied to the AKS cluster that has pull permissions for both ACRs.
Got it. Can you confirm a couple of things to help narrow down the root cause?
- If artifact streaming is disabled in the cluster, are there no issues?
- Are there any issues when only a single registry with streaming enabled is used?
Meanwhile, I will try to reproduce the issue on my end as well.
I can confirm there are no issues if artifact streaming is disabled in the cluster. We have only started seeing this since we enabled it, and we only see it on the one nodepool where we enabled it.
I was able to get the image to pull from the same ACR when it was not using streaming. However, the issue appears to be temperamental and hard to reproduce, as it affects random nodes (even though they have not interacted with our streaming-enabled ACR at any point).
Could you please share the command to disable artifact streaming on a nodepool? I could disable it on the pool where we are seeing issues and validate that the issue goes away.
@awalford16 so I did a bit more digging and I have a workaround you could try. It appears to happen when you have the same image-reference w/ different registries in the same pod.
For example,
Fails:
apiVersion: v1
kind: Pod
metadata:
  name: &name mix-wordpress
spec:
  containers:
  - name: wordpress-streaming
    image: streaming.azurecr.io/wordpress:latest
  - name: wordpress-nonstreaming
    image: non-streaming.azurecr.io/wordpress:latest
Works:
apiVersion: v1
kind: Pod
metadata:
  name: non-wordpress
spec:
  containers:
  - name: wordpress-nonstreaming
    image: non-streaming.azurecr.io/wordpress:latest
Works:
apiVersion: v1
kind: Pod
metadata:
  name: wordpress
spec:
  containers:
  - name: wordpress-streaming
    image: streaming.azurecr.io/wordpress:latest
(I tested these all running on the same node pool w/ node-selectors)
I am working on figuring out the root cause and a fix, but just wanted to share a possible workaround you could try for your own evaluation.
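For anyone evaluating the workaround, the failing pattern above (the same image reference pulled from different registries within one pod) could be checked for before deploying. The following is only a rough sketch: it assumes the pod manifest has already been loaded into a plain dict (for example with a YAML parser), and the helper names `split_image` and `find_cross_registry_duplicates` are hypothetical, not part of any AKS or ACR tooling.

```python
from collections import defaultdict


def split_image(image):
    """Split 'registry/repo:tag' into (registry, 'repo:tag').

    Assumption: images are fully qualified with a registry host.
    An image without any '/' is treated as having no registry.
    """
    registry, sep, remainder = image.partition("/")
    if not sep:
        return "", image
    return registry, remainder


def find_cross_registry_duplicates(pod):
    """Return {'repo:tag': [registries]} for references used from more than one registry."""
    registries = defaultdict(set)
    for container in pod.get("spec", {}).get("containers", []):
        registry, reference = split_image(container["image"])
        registries[reference].add(registry)
    return {ref: sorted(regs) for ref, regs in registries.items() if len(regs) > 1}


# Dict equivalents of the "Fails" pod and one of the "Works" pods from this comment.
mixed_pod = {
    "spec": {
        "containers": [
            {"name": "wordpress-streaming",
             "image": "streaming.azurecr.io/wordpress:latest"},
            {"name": "wordpress-nonstreaming",
             "image": "non-streaming.azurecr.io/wordpress:latest"},
        ]
    }
}
single_pod = {
    "spec": {
        "containers": [
            {"name": "wordpress-streaming",
             "image": "streaming.azurecr.io/wordpress:latest"},
        ]
    }
}

print(find_cross_registry_duplicates(mixed_pod))
print(find_cross_registry_duplicates(single_pod))
```

A non-empty result only flags the specific pattern described in this comment; it says nothing about the root cause.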
So it looks like this can affect any pod spec that has multiple containers. It looks like there's been a regression in AKS/containerd but I'm still trying to narrow it down.
@awalford16 Could you provide the value of this label from the nodepool that has this issue?
`kubernetes.azure.com/node-image-version`
For example, it should be a value that looks like this: AKSUbuntu-2204gen2containerd-202403.13.0
@juliusl thanks for looking into this. The label value is AKSUbuntu-2204gen2containerd-202401.17.1
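To compare node image versions such as the two quoted above, the trailing numeric part of the label can be parsed into a tuple. This is a minimal sketch: the function name `parse_node_image_version` is hypothetical, and reading the suffix as three ordered numeric fields is an assumption based only on the two example values in this thread. The thread does not state which image version contains the fix.

```python
import re


def parse_node_image_version(label):
    """Parse the numeric suffix of a node-image-version label into a tuple of ints.

    Assumption: the label ends in '-<6 digits>.<number>.<number>', as in the
    two example values quoted in this thread.
    """
    match = re.search(r"-(\d{6})\.(\d+)\.(\d+)$", label)
    if match is None:
        raise ValueError(f"unrecognized node image version label: {label!r}")
    return tuple(int(part) for part in match.groups())


reported = parse_node_image_version("AKSUbuntu-2204gen2containerd-202401.17.1")
example = parse_node_image_version("AKSUbuntu-2204gen2containerd-202403.13.0")

# Tuples compare field by field, so this is True when the reported image is older.
print(reported < example)
```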
@awalford16 so good news, I figured out the issue and I have a fix. I'm working on the release, so it should be about a week or two for it to make its way upstream.
@awalford16 Hey there, just to close the loop: the fix has been rolled out to all AKS regions for about a week or two now. Are you able to update your node images and give it a try?
Thanks @juliusl! Looks like it is working on my end now. For confirmation, these are the versions on my nodes: 5.15.0-1061-azure and containerd://1.7.15-1
Has the fix (#739) been distributed to UK South? And how can I roll back "az aks nodepool update --enable-artifact-streaming"?