Comments (4)

pflorek commented on June 10, 2024

@DanielSWTP Do you have an example?

You can add an inline policy to your runner's role, e.g.:

// Entry for the `inlinePolicies` map of the runner's role
ecrDeploy: new PolicyDocument({
  statements: [
    // GetAuthorizationToken is needed for the ECR login itself
    new PolicyStatement({
      effect: Effect.ALLOW,
      actions: ["ecr:GetAuthorizationToken"],
      resources: ["*"],
    }),
    // Read (pull) and write (push) access to the repositories
    new PolicyStatement({
      effect: Effect.ALLOW,
      actions: [
        "ecr:BatchCheckLayerAvailability",
        "ecr:GetDownloadUrlForLayer",
        "ecr:GetRepositoryPolicy",
        "ecr:DescribeRepositories",
        "ecr:ListImages",
        "ecr:DescribeImages",
        "ecr:BatchGetImage",
        "ecr:GetLifecyclePolicy",
        "ecr:GetLifecyclePolicyPreview",
        "ecr:ListTagsForResource",
        "ecr:DescribeImageScanFindings",
        "ecr:InitiateLayerUpload",
        "ecr:UploadLayerPart",
        "ecr:CompleteLayerUpload",
        "ecr:PutImage",
      ],
      resources: ["*"],
    }),
  ],
}),

from cdk-autoscaling-gitlab-runner.

grantlerduck commented on June 10, 2024

@pflorek Deploying to ECR was never a problem.

So let's imagine an example pipeline with a job that uses an image from my private ECR:

stages:
  - build

my-build-job:
  stage: build
  image: ${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com/node18-cdk-typescript-python3-alpine:latest
  script:
    - echo "Hello Job"

I want to pull a cached image from my ECR to reduce the traffic from my VPC to the internet, and thus reduce EC2-Other costs.

I gave the runner instance the following role:

const runnerRole = new Role(this, 'GlRunnerBuildRole', {
  assumedBy: new ServicePrincipal('ec2.amazonaws.com', {}),
  inlinePolicies: {
    CdkSynthLookUp: PolicyDocument.fromJson({
      Version: '2012-10-17',
      Statement: [
        {
          Effect: 'Allow',
          Action: ['s3:*Object', 's3:ListBucket', 's3:GetBucketLocation'],
          Resource: ['arn:aws:s3:::cdktoolkit-*'],
        },
        {
          Sid: 'assumerole',
          Effect: 'Allow',
          Action: ['sts:AssumeRole', 'iam:PassRole'],
          Resource: ['arn:aws:iam::*:role/cdk-readOnlyRole', 'arn:aws:iam::*:role/cdk-hnb659fds-lookup-role-*'],
        },
        {
          Sid: 'pullEcrImages',
          Effect: 'Allow',
          Action: [
            'ecr:BatchCheckLayerAvailability',
            'ecr:BatchGetImage',
            'ecr:DescribeImages',
            'ecr:DescribeRepositories',
            'ecr:GetDownloadUrlForLayer',
            'ecr:GetAuthorizationToken',
          ],
          Resource: ['*'],
        },
      ],
    }),
  },
});

Moreover, other roles exist that are assumed within a job, e.g. for pushing to ECR, for CDK deployments, etc.

The roles work as expected, except for pulling the job image from ECR instead of from my GitLab registry.

I also tried to pass a specific Docker environment to the runners:

const dockerEnvDev = [
  'DOCKER_DRIVER=overlay2',
  'DOCKER_TLS_CERTDIR=/certs',
  'DOCKER_AUTH_CONFIG={"credHelpers": { "public.ecr.aws": "ecr-login", "MY_ACCOUNT.dkr.ecr.MY_REGION.amazonaws.com": "ecr-login" } }',
];

I investigated CloudTrail, and the runners do not perform the ECR login needed to pull the job image directly from my ECR.
After many attempts, and out of desperation, I also gave the runner role full access to ECR, with the same result. No matter what I do, the runner never performs the ECR login.

HaKePlan commented on June 10, 2024

I also seem to struggle with the ECR configuration.

I tried to follow the example from the README.md and got the same error.

```typescript
new GitlabRunnerAutoscaling(this, "Runner", {
  runners: [
    {
      // ...
      environment: [
        "DOCKER_DRIVER=overlay2",
        "DOCKER_TLS_CERTDIR=/certs",
        'DOCKER_AUTH_CONFIG={"credHelpers": { "public.ecr.aws": "ecr-login", "<aws_account_id>.dkr.ecr.<region>.amazonaws.com": "ecr-login" } }',
      ],
    },
  ],
});
```

ERROR: Job failed: failed to pull image "<aws-id>.dkr.ecr.<region>.amazonaws.com/my-registry:latest" with specified policies [always]: Error response from daemon: Head "https://<aws-id>.dkr.ecr.<region>.amazonaws.com/v2/my-registry/manifests/latest": no basic auth credentials (manager.go:250:0s)

After some investigation, I came across this ticket from GitLab. As someone stated in a comment, the amazon-ecr-credential-helper needs to be installed on the manager instance. This is because everything related to authenticating against ECR is done by the manager, not by the runner instances.

As far as I can tell, the implementation (PR #297) installs the amazon-ecr-credential-helper only on the runner machines and not on the manager. I checked the manager, and indeed no amazon-ecr-credential-helper was installed there. As suggested in the GitLab ticket, I took the following steps:

  • manually installed the amazon-ecr-credential-helper on the manager instance
  • added the AWS_REGION variable to the GitLab CI/CD variables (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY of a user permitted to use ECR must also be set)
  • added the credHelpers config to /root/.docker/config.json (as suggested in the comment on the GitLab ticket mentioned above)
  • looked up the role automatically generated by cdk-autoscaling-gitlab-runner in the IAM console and added a custom policy with the following statement:
{
	"Version": "2012-10-17",
	"Statement": [
		{
			"Sid": "VisualEditor0",
			"Effect": "Allow",
			"Action": [
				"ecr:BatchGetImage",
				"ecr:GetAuthorizationToken",
				"ecr:GetDownloadUrlForLayer"
			],
			"Resource": "*"
		}
	]
}
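
The policy above allows the three actions on every resource. As a rough sketch of how it could be narrowed (not something the package does, and not what I tested), the login action can be kept on `"*"`, since `ecr:GetAuthorizationToken` does not support resource-level permissions, while the pull actions are limited to specific repositories. The helper name `ecrPullPolicy`, the `Sid` values, and the account ID, region, and repository name in the usage line are made up for illustration, and the ARN assumes the standard `aws` partition.

```typescript
// Hypothetical helper for illustration; not part of cdk-autoscaling-gitlab-runner.
interface PolicyStatementJson {
  Sid: string;
  Effect: "Allow";
  Action: string[];
  Resource: string[];
}

interface PolicyDocumentJson {
  Version: "2012-10-17";
  Statement: PolicyStatementJson[];
}

// Builds a pull-only policy document for the given repositories.
function ecrPullPolicy(
  accountId: string,
  region: string,
  repositories: string[],
): PolicyDocumentJson {
  return {
    Version: "2012-10-17",
    Statement: [
      {
        // GetAuthorizationToken cannot be scoped to a repository, so it stays on "*".
        Sid: "EcrLogin",
        Effect: "Allow",
        Action: ["ecr:GetAuthorizationToken"],
        Resource: ["*"],
      },
      {
        // Same pull actions as in the policy above, limited to the listed repositories.
        // Assumption: standard "aws" partition.
        Sid: "EcrPull",
        Effect: "Allow",
        Action: ["ecr:BatchGetImage", "ecr:GetDownloadUrlForLayer"],
        Resource: repositories.map(
          (name) => `arn:aws:ecr:${region}:${accountId}:repository/${name}`,
        ),
      },
    ],
  };
}

// Placeholder account ID, region, and repository name.
console.log(
  JSON.stringify(ecrPullPolicy("123456789012", "eu-central-1", ["my-registry"]), null, 2),
);
```

The resulting object has the same shape as the JSON above, so it could be pasted into the IAM console or passed to `PolicyDocument.fromJson` as in the role definition earlier in this thread.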

After all this manual configuration, I finally got it running. My jobs can now use images from my private ECR registry.

An interesting note: DOCKER_AUTH_CONFIG in the runner configuration seems to play no role in this use case (pulling a job or service image from a private ECR). The configuration needs to be done in /root/.docker/config.json. (This also makes a private pull-through cache for the job images possible.)
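
For anyone scripting the /root/.docker/config.json change, here is a minimal sketch of merging the `credHelpers` entries into an existing Docker config without dropping other keys. The function name `withEcrCredHelpers` and the registry host in the usage lines are placeholders I made up; the value `ecr-login` is the helper suffix Docker resolves to the `docker-credential-ecr-login` executable. Reading and writing the file itself is left out.

```typescript
// Minimal shape of a Docker client config; other keys are passed through untouched.
interface DockerConfig {
  credHelpers?: Record<string, string>;
  [key: string]: unknown;
}

// Hypothetical helper: returns a copy of `existing` in which every given
// registry host is mapped to the "ecr-login" credential helper.
function withEcrCredHelpers(existing: DockerConfig, registries: string[]): DockerConfig {
  const credHelpers: Record<string, string> = { ...(existing.credHelpers ?? {}) };
  for (const registry of registries) {
    credHelpers[registry] = "ecr-login";
  }
  return { ...existing, credHelpers };
}

// Placeholder registry hosts; replace the account ID and region with your own.
const updatedConfig = withEcrCredHelpers({}, [
  "public.ecr.aws",
  "123456789012.dkr.ecr.eu-central-1.amazonaws.com",
]);
console.log(JSON.stringify(updatedConfig, null, 2));
```

Merging rather than overwriting keeps any `auths` or other helper entries that are already present on the manager.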

@pflorek I am not sure whether my approach is the intended one, or whether you solved this differently. But I would suggest adding the manual steps I did to the CDK code for the manager. If you think that is desirable, I'll try to create a PR to get these adjustments into the manager CDK code. (IMO it should definitely be done; otherwise I have to redo the manual adjustments after each redeploy 😉)

blsmth-visualboston commented on June 10, 2024

I was also able to get this working in my setup by doing what @HaKePlan points out: installing the amazon-ecr-credential-helper so that the docker-credential-ecr-login executable is available on the manager (edited!), and adding the IAM policy. Things are finally working. I spent the better part of two days trying to debug this, and this was the solution. I am now up and running with autoscaling instances that have permission to read from and write to my registry.

@pflorek Would you like one of us to take a stab at producing a PR to fix this? And thanks for creating this package. It saved me a lot of headache trying to get a similar setup up and running.
