
Digester

Digester resolves tags to digests for container and init container images in Kubernetes Pod and Pod template specs.

It replaces container image references that use tags:

spec:
  containers:
  - image: gcr.io/google-containers/echoserver:1.10

With references that use the image digest:

spec:
  containers:
  - image: gcr.io/google-containers/echoserver:1.10@sha256:cb5c1bddd1b5665e1867a7fa1b5fa843a47ee433bbb75d4293888b71def53229

Digester can run either as a mutating admission webhook in a Kubernetes cluster, or as a client-side Kubernetes Resource Model (KRM) function with the kpt or kustomize command-line tools.

If a tag points to an image index or manifest list, digester resolves the tag to the digest of the image index or manifest list.
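You can check which digest a tag currently resolves to with a tool such as crane (from the go-containerregistry project, assuming it is installed); for a multi-arch image, this prints the digest of the image index:

    crane digest gcr.io/google-containers/echoserver:1.10
    # sha256:cb5c1bddd1b5665e1867a7fa1b5fa843a47ee433bbb75d4293888b71def53229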

The webhook is opt-in at the namespace level by label; see Deploying the webhook.

If you use Binary Authorization, digester can help ensure that only verified container images are deployed to your clusters. A Binary Authorization attestation is valid for a particular container image digest, so container images must be deployed by digest for Binary Authorization to verify their attestations. You can use digester to deploy container images by digest.

Running the KRM function

  1. Download the digester binary for your platform from the Releases page.

    Alternatively, you can download the latest version using these commands:

    VERSION=v0.1.13
    curl -Lo digester "https://github.com/google/k8s-digester/releases/download/${VERSION}/digester_$(uname -s)_$(uname -m)"
    chmod +x digester
  2. Install kpt v1.0.0-beta.1 or later, and/or install kustomize v3.7.0 or later.

  3. Run the digester KRM function using either kpt or kustomize:

    • Using kpt:

      kpt fn eval [manifest directory] --exec ./digester
    • Using kustomize:

      kustomize fn run [manifest directory] --enable-exec --exec-path ./digester

    By running as an executable, the digester KRM function has access to container image registry credentials in the current environment, such as the current user's Docker config file and credential helpers. For more information, see the digester documentation on Authenticating to container image registries.
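As a concrete example (the directory and file names are illustrative), suppose manifests/pod.yaml contains a Pod that references an image by tag:

    apiVersion: v1
    kind: Pod
    metadata:
      name: echoserver
    spec:
      containers:
      - name: echoserver
        image: gcr.io/google-containers/echoserver:1.10

Running the function rewrites the manifest in place:

    kpt fn eval manifests --exec ./digester

Afterwards, the image field carries the resolved digest:

    image: gcr.io/google-containers/echoserver:1.10@sha256:cb5c1bddd1b5665e1867a7fa1b5fa843a47ee433bbb75d4293888b71def53229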

Deploying the webhook

The digester webhook requires Kubernetes v1.16 or later.

  1. If you use Google Kubernetes Engine (GKE), grant yourself the cluster-admin Kubernetes cluster role:

    kubectl create clusterrolebinding cluster-admin-binding \
        --clusterrole cluster-admin \
        --user "$(gcloud config get core/account)"
  2. Install the digester webhook in your Kubernetes cluster:

    VERSION=v0.1.13
    kubectl apply -k "https://github.com/google/k8s-digester.git/manifests/?ref=${VERSION}"
  3. Add the digest-resolution: enabled label to namespaces where you want the webhook to resolve tags to digests:

    kubectl label namespace [NAMESPACE] digest-resolution=enabled
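To verify that the webhook mutates new Pods (the namespace and Pod names here are illustrative), create a Pod in a labeled namespace and inspect the image reference stored by the API server:

    kubectl create namespace digest-test
    kubectl label namespace digest-test digest-resolution=enabled
    kubectl run echoserver --namespace digest-test \
        --image gcr.io/google-containers/echoserver:1.10
    kubectl get pod echoserver --namespace digest-test \
        --output jsonpath='{.spec.containers[0].image}'
    # gcr.io/google-containers/echoserver:1.10@sha256:cb5c1bddd1b5665e1867a7fa1b5fa843a47ee433bbb75d4293888b71def53229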

To configure how the webhook authenticates to your container image registries, see the documentation on Authenticating to container image registries.

If you want to install the webhook using kustomize or kpt, follow the steps in the package documentation.

If you want to apply a pre-rendered manifest, you can download an all-in-one manifest file for a released version from the Releases page.

Private clusters

If you install the webhook in a private Google Kubernetes Engine (GKE) cluster, you must add a firewall rule. In a private cluster, the nodes only have internal IP addresses. The firewall rule allows the API server to access the webhook running on port 8443 on the cluster nodes.

  1. Create an environment variable called CLUSTER. The value is the name of your cluster that you see when you run gcloud container clusters list:

    CLUSTER=[your private GKE cluster name]
  2. Look up the IP address range for the cluster API server and store it in an environment variable:

    API_SERVER_CIDR=$(gcloud container clusters describe $CLUSTER \
        --format 'value(privateClusterConfig.masterIpv4CidrBlock)')
  3. Look up the network tags for your cluster nodes and store them comma-separated in an environment variable:

    TARGET_TAGS=$(gcloud compute firewall-rules list \
        --filter "name~^gke-$CLUSTER" \
        --format 'value(targetTags)' | uniq | paste -d, -s -)
  4. Create a firewall rule that allows traffic from the API server to the cluster nodes on TCP port 8443:

    gcloud compute firewall-rules create allow-api-server-to-digester-webhook \
        --action ALLOW \
        --direction INGRESS \
        --source-ranges "$API_SERVER_CIDR" \
        --rules tcp:8443 \
        --target-tags "$TARGET_TAGS"
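You can verify the rule afterwards:

    gcloud compute firewall-rules describe allow-api-server-to-digester-webhook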

You can read more about private cluster firewall rules in the GKE private cluster documentation.

Disclaimer

This is not an officially supported Google product.


k8s-digester's Issues

Helm chart

Are there any plans to package this as a Helm chart, so it can be deployed using Argo CD?

Offline authentication fails for public GCR and AR images when gcloud is installed but not initialized

Situation

Offline authentication is used to resolve the digest for a public image in Google Container Registry or Artifact Registry, e.g., gcr.io/google-containers/pause-amd64:3.2.

This fails when all of the following are true:

  • Digester is not running on GCP (GKE, GCE, Cloud Run, etc) with access to the metadata server.
  • The GOOGLE_APPLICATION_CREDENTIALS environment variable is not set or doesn't point to a valid Google service account key file.
  • There are no credentials available at the expected location ($HOME/.config/gcloud/application_default_credentials.json).
  • The gcloud command-line tool is installed.
  • The gcloud tool has not been set up with credentials (i.e., the user hasn't run gcloud init or gcloud auth login).

If any of the above are not true, this issue does not occur.
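Any one of the following standard setup commands breaks one of the conditions above and therefore avoids the failure (the key file path is illustrative):

    # Set up user credentials for gcloud:
    gcloud auth login

    # Or set up Application Default Credentials:
    gcloud auth application-default login

    # Or point GOOGLE_APPLICATION_CREDENTIALS at a service account key file:
    export GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json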

Behavior

The webhook fails to resolve the digest for the public GCR image with this error:

handler.go:100] "msg"="admission error" "error"="could not get digest for gcr.io/google-containers/pause-amd64:3.2: failed to create token source from env: google: could not find default credentials. See https://developers.google.com/accounts/docs/application-default-credentials for more information. or gcloud: error executing gcloud config config-helper : exit status 1"

Full details: https://github.com/google/k8s-digester/runs/2997015198#step:9:249

Investigation

authn.multiKeychain.Resolve() returns an error on the first attempt to resolve the image reference.

This happens because google.Keychain.resolve() fails to create both the environment Authenticator and the gcloud Authenticator.

Feature Request: skip digester resolution for images from certain repo base paths

We have been experiencing an issue, supposedly resolved by a recent release of the Anthos Service Mesh mdp-controller, where mdp-controller would restart pods that it believed to be out of date against the currently deployed version of the mesh.

Having continued to experience this issue, we investigated further.

It appears that mdp-controller is not aware of the @sha256 form of the image. Instead, it inspects running pods for the tag version; if the tag does not match, it deems the pod out of date, patches the deployment, and restarts the pod. Digester then patches the image back to the @sha256 form, and the process starts again: the already-updated pod is restarted once more, because mdp-controller does not yet support the resolved @sha256 form of the image.

The actual images concerned fall within the default Binary Authorisation whitelist paths, under gcr.io/gke-release/asm/proxyv2, so skipping digester functionality for these paths should allow the sidecar container to start from those paths whilst digester continues to function for our main container images.

I'll monitor this thread and perhaps raise a pull request for this in a couple of days if nobody else can get to it. In the meantime, I believe the ASM product team are now aware of the issue and a fix may be in the works, but that may take longer to arrive. In any case, this seems like a sensible feature for images within the Binary Auth whitelist section.

The proposal is to let users provide a list of path prefixes that filterImage in resolve.go could use to selectively skip processing.
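As a purely hypothetical sketch of what such a configuration could look like (the skipPrefixes field does not exist in digester; it only illustrates the proposal):

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: digester-config
    data:
      # Hypothetical field: image references starting with any of these
      # prefixes would be skipped by filterImage in resolve.go.
      skipPrefixes: |
        gcr.io/gke-release/asm/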

Improve error message on tag resolution failure

Digester currently uses HEAD requests to look up image digests.

If the image registry returns an error, there is no response body to provide additional information about the error.

As long as the HTTP status code was not 429, digester could fall back to a GET request in order to get the error message and surface it to the user.
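To illustrate with the Docker Registry HTTP API v2 (the repository name is illustrative): a HEAD request returns only status and headers, whereas a GET for the same manifest URL also returns a JSON error body on failure:

    # HEAD: an error yields only a status code, with no body
    curl -sI https://gcr.io/v2/my-project/my-image/manifests/latest

    # GET: on failure the registry also returns a JSON body, e.g.
    # {"errors":[{"code":"MANIFEST_UNKNOWN","message":"..."}]}
    curl -s https://gcr.io/v2/my-project/my-image/manifests/latest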

Does k8s-digester work with the CronJob kind?

Hey guys -- thanks for the amazing work here...

Digester doesn't seem to get the hash from the registry when the manifest uses the CronJob kind.

apiVersion: batch/v1
kind: CronJob
metadata:
  name: my-cronjob
spec:
  schedule: '*/15 * * * *'
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: my-cronjob
              image: myorg/my-cronjob-image:latest

The command used:

kpt fn eval my-cronjob --exec ./scripts/digester

Use of unscoped labels to control injection

The webhook uses a standard, unprefixed matchLabel to determine whether it should be enabled for a namespace.

According to the Kubernetes documentation at https://kubernetes.io/docs/concepts/overview/working-with-objects/common-labels/, which uses the app.kubernetes.io prefix in its examples:

'Shared labels and annotations share a common prefix: app.kubernetes.io. Labels without a prefix are private to users. The shared prefix ensures that shared labels do not interfere with custom user labels'

It seems that the plain label used in the config should really be deemed a user label, and that an issuer creating a label standard should be expected to use a prefix of their choice.

We have to trust that the issuer (you) does this; for us to simply choose a different prefix, or to add a label to a well-known one from Google, is clearly against that pattern.

Is there any reason that you don't use a prefix?

digester should use failurePolicy: Fail in the example MutatingWebhookConfiguration to avoid a specific edge case

We use GKE with Binary Auth, and for system components we like to use latest tags from our own Artifact Registry, so we use digester.

We have a daily Terraform deployment run that deploys a variety of configs to our environment, including some system components like digester, but also some other deployments that utilise digester.

One of these components is a Kubernetes API proxy (we use Private Service Connect), so we have this as a system component that is delivered daily.

Now, we should have set ignore_changes for the image field in the Terraform run, but up to now we hadn't. It didn't really matter to us: we knew that when the deployment happened every day, digester would mutate the deployment arriving without the @sha256 value, and since the result was in keeping with the current ReplicaSet, nothing changed.

Yesterday I discovered about 18,000 ReplicaSets for this deployment, with new ones being created every 2-3 minutes, rotating the proxy out of service and breaking people's connections.

Inspection revealed that the k8s API proxy deployment had gotten past digester without being mutated and carried just the tag 'latest', so it now differed from the ReplicaSet. Then the fight began: the deployment controller, seeing the difference, created a new ReplicaSet using just the tag it had; digester then updated the ReplicaSet, pods started rotating from old to new, and the process began again two minutes later.

Takeaways:

  1. For us: use a Terraform lifecycle policy to ignore changes on the image field, to stop these services even touching the deployment.
  2. A failurePolicy of Fail instead of Ignore would have stopped the deployment from being updated when there was a service timeout to the webhook deployment.

We're implementing these changes ourselves now, but we submit this edge case for your consideration.
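For reference, a minimal sketch of a MutatingWebhookConfiguration with fail-closed behaviour (the service name, namespace, and path are illustrative; a real configuration also needs a CA bundle or certificate injection):

    apiVersion: admissionregistration.k8s.io/v1
    kind: MutatingWebhookConfiguration
    metadata:
      name: digester-mutating-webhook-configuration
    webhooks:
    - name: digester-webhook-service.digester-system.svc
      admissionReviewVersions: [v1]
      sideEffects: None
      # Fail closed: if the webhook is unreachable, reject the API request
      # rather than admitting a Pod whose tag was never resolved.
      failurePolicy: Fail
      namespaceSelector:
        matchLabels:
          digest-resolution: enabled
      rules:
      - apiGroups: [""]
        apiVersions: [v1]
        operations: [CREATE]
        resources: [pods]
      clientConfig:
        service:
          name: digester-webhook-service
          namespace: digester-system
          path: /webhook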
