cdk-autoscaling-gitlab-runner's People

Contributors

acfo, andrew-g-mcdonald, anux-linux, blsmth-visualboston, dependabot[bot], didrux, marco-streng, mergify[bot], nlianna, pflorek, unerty

cdk-autoscaling-gitlab-runner's Issues

Reporting a vulnerability

Hello!

I hope you are doing well!

We are a security research team. Our tool automatically detected a vulnerability in this repository, and we want to disclose it responsibly. GitHub has a feature called Private vulnerability reporting, which enables security researchers to privately disclose a vulnerability. Unfortunately, it is not enabled for this repository.

Can you enable it, so that we can report it?

Thanks in advance!

PS: you can read about how to enable private vulnerability reporting here: https://docs.github.com/en/code-security/security-advisories/repository-security-advisories/configuring-private-vulnerability-reporting-for-a-repository

use v0.16.2-gitlab.19 to support docker 23+

@danielbayerlein I am opening this to track the problem until all follow-up errors are resolved.

Since yesterday the GitLab Runner has stopped working because of a Docker update.

With the Docker 23.0.0 release (which happened today), installing Docker no longer creates an /etc/docker directory by default.

The latest docker-machine version (v0.16.2-gitlab.19) solves this issue.

https://gitlab.com/gitlab-org/ci-cd/docker-machine/-/merge_requests/102
https://gitlab.com/gitlab-org/gitlab-runner/-/issues/29594
https://gitlab.com/gitlab-org/gitlab-runner/-/issues/29593

relates to #533

IAM issue in GovCloud

After attempting to deploy a zero config Stack to GovCloud, I found that the runners were failing to be created due to an IAM issue. Here's a sanitized snippet from /var/log/gitlab-runner.log:

Jan 4 20:10:30 ip-REDACTED gitlab-runner: ERROR: Error creating machine: Error in driver during machine creation: Error request spot instance: UnauthorizedOperation: You are not authorized to perform this operation. User: arn:aws-us-gov:sts::REDACTED:assumed-role/GitLabRunnerStack-GitlabRunnerManagerRole2F9BC927-REDACTED/i-REDACTED is not authorized to perform: ec2:RequestSpotInstances on resource: arn:aws-us-gov:ec2:us-gov-west-1:REDACTED:subnet/subnet-REDACTED because no identity-based policy allows the ec2:RequestSpotInstances action.

Deeper inspection found the culprit at

"ec2:Vpc": `arn:aws:ec2:${Stack.of(this).region}:${Stack.of(this).account}:vpc/${
The arn:aws prefix is hard-coded there, whereas the ARN prefix in GovCloud is arn:aws-us-gov.
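
For illustration, here is a minimal sketch of building the ARN with a partition that depends on the region instead of a fixed arn:aws prefix. The helper names `partitionForRegion` and `vpcArn` are hypothetical and only cover the three public partitions. Inside a CDK construct, `Stack.of(this).partition` would be the natural source for the partition value; whether the maintainers fix it that way is an assumption.

```typescript
// Hypothetical helper: derives the AWS partition from a region name.
// Only the public partitions are handled in this sketch.
function partitionForRegion(region: string): string {
  if (region.startsWith("us-gov-")) return "aws-us-gov";
  if (region.startsWith("cn-")) return "aws-cn";
  return "aws";
}

// Hypothetical helper: builds a VPC ARN without hard-coding "arn:aws".
function vpcArn(region: string, account: string, vpcId: string): string {
  const partition = partitionForRegion(region);
  return `arn:${partition}:ec2:${region}:${account}:vpc/${vpcId}`;
}

// Placeholder account and VPC IDs, for demonstration only.
console.log(vpcArn("us-gov-west-1", "123456789012", "vpc-0abc"));
console.log(vpcArn("eu-central-1", "123456789012", "vpc-0abc"));
```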

Error during deployment: response doesn't contain SecurityGroups.0.GroupName

We are currently unable to deploy the stack. We did not face this problem before.

Deployment error:

CustomResource attribute error: Vendor response doesn't contain SecurityGroups.0.GroupName key in object arn:aws:cloudformation:***:stack/GitlabRunnerStack/***RunnerRunnersSecurityGroupDescribeSGCustomResource*** in S3 bucket cloudformation-custom-resource-storage-***

We use the latest versions of AWS CDK and @pepperize packages:

"aws-cdk": "^2.73.0",
"aws-cdk-lib": "^2.73.0",

"@pepperize/cdk-autoscaling-gitlab-runner": "^0.2.456",
"@pepperize/cdk-private-bucket": "^0.0.360",
"@pepperize/cdk-security-group": "^0.0.433",
"@pepperize/cdk-vpc": "^0.0.575",

The logs of the Lambda that runs the describeSecurityGroups command look like this. To me, it looks like the response data object is empty.

{
    "Status": "SUCCESS",
    "Reason": "OK",
    "PhysicalResourceId": "sg-***",
    "StackId": "arn:aws:cloudformation:e***:stack/GitlabRunnerStack/***",
    "RequestId": "***",
    "LogicalResourceId": "RunnerRunnersSecurityGroupDescribeSGCustomResource***",
    "NoEcho": false,
    "Data": {}
}
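
The dotted key in the error message suggests that attributes are looked up in a flattened form of the API response. The following is an illustrative model of that idea, not the library's actual code: `flatten` and `getAttribute` are hypothetical names, and the flattening rule is an assumption. It shows why an empty Data object makes the lookup for SecurityGroups.0.GroupName fail.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Illustrative model only: flattens a nested response into dotted keys,
// e.g. { SecurityGroups: [{ GroupName: "x" }] } -> "SecurityGroups.0.GroupName".
function flatten(value: Json, prefix = "", out: Record<string, Json> = {}): Record<string, Json> {
  if (value !== null && typeof value === "object") {
    const entries: [string, Json][] = Array.isArray(value)
      ? value.map((v, i) => [String(i), v] as [string, Json])
      : Object.entries(value);
    for (const [key, child] of entries) {
      flatten(child, prefix ? `${prefix}.${key}` : key, out);
    }
  } else {
    out[prefix] = value;
  }
  return out;
}

// Looks up a dotted key and fails loudly when it is absent.
function getAttribute(data: Record<string, Json>, key: string): Json {
  if (!(key in data)) {
    throw new Error(`Response doesn't contain ${key} key`);
  }
  return data[key];
}

// A populated response resolves; an empty one (as in the log above) would throw.
const populated = flatten({ SecurityGroups: [{ GroupName: "runners" }] });
console.log(getAttribute(populated, "SecurityGroups.0.GroupName"));
```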

KeyPair name for runner instances

According to the documentation it is possible to specify a custom keyPairName which is applied to the Manager Instance and the Auto Scaling Group.

The launched Runner Instances still use newly created key pairs. For compliance reasons, however, it is sometimes only possible to use imported key pairs.

Is there a way to also set a keyPairName for the Runner Instances?

Runners always get a public IP address

Currently, the runners get a public IP address. I would like to disable this so that the communication goes over the NAT gateway. My current configuration looks like this:

export class GitlabRunnerStack extends Stack {
  constructor(scope: Construct, id: string, props: GitlabRunnerStackProps) {
    super(scope, id, props);

    const { gitlabToken, gitlabUrl, cidr } = props;

    const token = new aws_ssm.StringParameter(this, 'Token', {
      parameterName: '/gitlab-runner/token',
      stringValue: gitlabToken,
      type: aws_ssm.ParameterType.STRING,
      tier: aws_ssm.ParameterTier.STANDARD,
    });

    const vpc = new aws_ec2.Vpc(this, 'Vpc', {
      cidr,
      natGateways: 1,
      subnetConfiguration: [
        {
          name: 'Public',
          subnetType: aws_ec2.SubnetType.PUBLIC,
          mapPublicIpOnLaunch: false,
        },
        {
          name: 'Private',
          subnetType: aws_ec2.SubnetType.PRIVATE_WITH_NAT,
        },
      ]
    });

    new GitlabRunnerAutoscaling(this, 'Runner', {
      runners: [
        {
          instanceType: aws_ec2.InstanceType.of(
            aws_ec2.InstanceClass.T3,
            aws_ec2.InstanceSize.MEDIUM,
          ),
          token: token,
          configuration: {
            url: gitlabUrl,
            machine: {
              machineOptions: {
                spotPrice: 0.04,
              },
            },
          },
        },
      ],
      network: { vpc: vpc },
    });
  }
}
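
As background for this question: the docker-machine amazonec2 driver has a private-address-only flag, and GitLab Runner passes driver flags as `key=value` strings in MachineOptions. The sketch below shows that mapping in plain TypeScript. The `MachineOptionsSketch` interface and both functions are hypothetical; whether the construct's machineOptions exposes an equivalent property is an assumption that should be checked against its API documentation.

```typescript
// Hypothetical option bag: camelCase versions of docker-machine amazonec2 flags.
interface MachineOptionsSketch {
  privateAddressOnly?: boolean; // amazonec2-private-address-only
  usePrivateAddress?: boolean; // amazonec2-use-private-address
  spotPrice?: number; // amazonec2-spot-price
}

// Converts camelCase to kebab-case, e.g. "spotPrice" -> "spot-price".
function toKebab(name: string): string {
  return name.replace(/[A-Z]/g, (c) => `-${c.toLowerCase()}`);
}

// Renders the option bag as the key=value strings used in MachineOptions.
function toMachineOptions(options: MachineOptionsSketch): string[] {
  return Object.entries(options)
    .filter(([, value]) => value !== undefined)
    .map(([key, value]) => `amazonec2-${toKebab(key)}=${value}`);
}

console.log(toMachineOptions({ privateAddressOnly: true, spotPrice: 0.04 }));
```

With only a private address, the runner instances depend on the private subnet's route through the NAT gateway for outbound traffic.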

How to use the keypairName option for the runner instances?

Hi, I created a stack with the following code:

const keyPair = aws_secretsmanager.Secret.fromSecretNameV2(this, 'SshKeyPair', 'SshKeyPair');
new GitlabRunnerAutoscaling(this, 'GitlabRunner', {
  network: {
    vpc: vpc,
  },
  manager: {
    keyPairName: 'SshKeyPair', // assume there is already the ec2 key pair "SshKeyPair" created beforehand
  },
  runners: [
    {
      keyPair: keyPair,
      instanceType: aws_ec2.InstanceType.of(aws_ec2.InstanceClass.T3, aws_ec2.InstanceSize.MEDIUM),
      token: xxx,
      configuration: {
        machine: {
          machineOptions: {
            keypairName: 'theKeyPairName',
          },
        },
      },
    },
  ],
});

In AWS Secrets Manager, I had already created the "SshKeyPair" secret with two keys:

"theKeyPairName": "-----BEGIN RSA PRIVATE KEY----- xxx ",
"theKeyPairName.pub": "-----BEGIN PUBLIC KEY-----  xxx"

But the deployment of the CDK stack failed with: Received 1 FAILURE signal(s) out of 1. Unable to satisfy 100% MinSuccessfulInstancesPercent requirement on the manager instance ASG.

Since this is related to the cfn-init script, I checked the logs by connecting to the manager instance via SSH and running cat /var/log/messages | grep cloud-init, and found this error message:

[DEBUG] Command 999-retrieve-ec2-key-pair output: /bin/sh: line 1: -----BEGIN: command not found

It seems that the cfn-init script performs a shell command substitution "$()" on the secret value, based on this line: f5a173f#diff-38c267fcf5e98b1bf0a4bc4c84a8b1f97d08aac16be65b4908e6c3de8616dfcfR328

What am I supposed to put inside the secret value, then? Any help would be appreciated 👍

MachineImage.latestAmazonLinux is deprecated

During stack synth:

[WARNING] aws-cdk-lib.aws_ec2.MachineImage#latestAmazonLinux is deprecated.
  use MachineImage.latestAmazonLinux2 instead
  This API will be removed in the next major release.
Searching for AMI in 753334223818:us-east-1
[WARNING] aws-cdk-lib.aws_ec2.MachineImage#latestAmazonLinux is deprecated.
  use MachineImage.latestAmazonLinux2 instead
  This API will be removed in the next major release.

Feature Request: environment variables (e.g. for proxy support)

As an enterprise user that is working behind an internet proxy, I would like to be able to pass environment variables that are used during the initialization of the runner to the constructs so that I can correctly configure the proxy URL and no-proxy list.

Customize runner instance (install amazon-ecr-credential-helper)

I would like to be able to access an AWS ECR repository from the GitLab runner, i.e. specify a private AWS ECR repository as the image for a GitLab CI job. To do so, the amazon-ecr-credential-helper has to be installed on the runner machine.
Currently I do not see a way to customise the runner machine (not the agent).
Poking around the code, I think it should go here

InitPackage.yum("docker"),
as another package:
InitPackage.yum("amazon-ecr-credential-helper"),
Any advice on how to accomplish that?

Pulling job images from private ECR not working

Hi,

I tried to pull job images from a private ECR repository. However, the job fails to pull the image from ECR with the following error:
ERROR: Job failed: failed to pull image ... with specified policies [always]: Error response from daemon: Head ...: no basic auth credentials (manager.go:237:0s)

The runner role has permission to pull every image from my account, just for testing purposes.
Moreover, I tried different configurations for the Docker environment. Nothing helped.
It appears that many people struggle with GitLab runners and ecr-login when using images from ECR for jobs.
I tried to follow various guides, tips and tricks, including this one: https://gist.github.com/dreampuf/a0d416a15299a2ac74a0a5cb8f2871c0
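
For reference, GitLab Runner can read registry credentials from a DOCKER_AUTH_CONFIG variable, and a `credHelpers` entry pointing at `ecr-login` delegates authentication to the amazon-ecr-credential-helper. The sketch below only builds that JSON value. The function name is hypothetical, the registry host assumes the standard `amazonaws.com` domain, and the account ID is a placeholder. Where the helper binary has to be installed in a docker+machine setup is not confirmed here.

```typescript
// Hypothetical helper: builds a DOCKER_AUTH_CONFIG value that delegates
// authentication for one ECR registry to the "ecr-login" credential helper.
function ecrDockerAuthConfig(accountId: string, region: string): string {
  // Assumes the standard partition's registry host name.
  const registry = `${accountId}.dkr.ecr.${region}.amazonaws.com`;
  return JSON.stringify({ credHelpers: { [registry]: "ecr-login" } });
}

// Placeholder account ID, for demonstration only.
console.log(ecrDockerAuthConfig("123456789012", "eu-central-1"));
```

The resulting string could be set as a CI/CD variable named DOCKER_AUTH_CONFIG or passed through the runner's environment configuration.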

No connection of the runner to GitLab

Unfortunately, there is no registration of the GitLab Runner. In GitLab I only see the message New runner, has not contacted yet. Do you have any idea what the problem is?

import { GitlabRunnerAutoscaling } from '@pepperize/cdk-autoscaling-gitlab-runner';
import { aws_ec2, aws_iam, aws_ssm, Stack, StackProps } from 'aws-cdk-lib';
import { Construct } from 'constructs';

interface GitlabRunnerStackProps extends StackProps {
  readonly gitlabToken: string;
  readonly gitlabUrl: string;
  readonly cidr: string;
}

export class GitlabRunnerStack extends Stack {
  constructor(scope: Construct, id: string, props: GitlabRunnerStackProps) {
    super(scope, id, props);

    const { gitlabToken, gitlabUrl, cidr } = props;

    const token = new aws_ssm.StringParameter(this, 'Token', {
      parameterName: '/gitlab-runner/token',
      stringValue: gitlabToken,
      type: aws_ssm.ParameterType.STRING,
      tier: aws_ssm.ParameterTier.STANDARD,
    });

    const vpc = new aws_ec2.Vpc(this, 'Vpc', {
      cidr,
      natGateways: 1,
    });

    new GitlabRunnerAutoscaling(this, 'Runner', {
      runners: [
        {
          instanceType: aws_ec2.InstanceType.of(
            aws_ec2.InstanceClass.M6G,
            aws_ec2.InstanceSize.MEDIUM,
          ),
          token: token,
          role: new aws_iam.Role(this, 'Role', {
            assumedBy: new aws_iam.ServicePrincipal('ec2.amazonaws.com'),
            managedPolicies: [
              aws_iam.ManagedPolicy.fromAwsManagedPolicyName(
                'AmazonSSMManagedInstanceCore',
              ),
            ],
          }),
          configuration: {
            name: 'gitlab-runner',
            url: gitlabUrl,
          },
        },
      ],
      network: { vpc: vpc },
    });
  }
}

Cannot create runner instances

With the latest release (v0.2.361), runner instances can no longer be started.

Error message:

Error creating machine: Error in driver during machine creation: Error launching instance: UnauthorizedOperation: You are not authorized to perform this operation. 

Runner Manager Role:

"aws:TagKeys": ["InstanceProfile"],

Seems like the tag InstanceProfile is missing, because when I delete the condition from the role it works.

install amazon-ecr-credential-helper fails occasionally

I was using the latest release, which includes the ECR credential helper.
Occasionally this fails in a spectacular manner. It starts spinning up VMs, those fail, and new ones are spun up in a loop. At one point the VMs were no longer being deleted, and I ended up with 8 quite large VMs before hitting my resource limit. Luckily I noticed it early, or it would have been 💰💰💰💰 :)
I spent some time investigating the issue. It turns out that it fails at the apt-get update step. Sorry, I do not have the logs handy.
I found entries in the logs of the actual runner (not the manager) complaining that acquiring the apt lock failed. This is a known issue on Ubuntu-like systems. When Ubuntu starts, apt-daily.service runs very early in the boot sequence. If you are unlucky, apt-get update or apt-get install ... will conflict with it, because they cannot run concurrently and use a file-based lock.
I am not sure how to solve it; there does not seem to be a universal solution. The best description I found: https://saveriomiroddi.github.io/Handling-the-apt-lock-on-ubuntu-server-installations/

For now can we make the installation of amazon-ecr-credential-helper optional?

There is already a Construct with name 'Network' in GitlabRunnerStack

Deploying two GitlabRunnerAutoscaling constructs in the same CDK project (for example one with runners for light workloads and one for heavy workloads) is currently not possible. Every GitlabRunnerAutoscaling instantiates a new Network construct called "Network", a new SecurityGroup called "ManagerSecurityGroup", ... and those names will clash.
Can this Network not be shared between two GitlabRunnerAutoscaling constructs, just like for the cache bucket?
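
In the CDK, construct IDs only have to be unique among the children of the same parent scope. The sketch below uses a simplified stand-in for the construct tree (the `Scope` class is hypothetical and is not the real `constructs` library) to show how the name clash arises and how separate parent scopes avoid it. Whether wrapping each GitlabRunnerAutoscaling in its own parent construct works with this library is an assumption, since it depends on which scope the library uses for "Network".

```typescript
// Simplified stand-in for the CDK construct tree; not the real library.
class Scope {
  private readonly children = new Map<string, Scope>();

  constructor(readonly id: string, parent?: Scope) {
    if (parent) parent.addChild(this);
  }

  // Child IDs must be unique within one parent, as in the CDK.
  private addChild(child: Scope): void {
    if (this.children.has(child.id)) {
      throw new Error(`There is already a Construct with name '${child.id}' in ${this.id}`);
    }
    this.children.set(child.id, child);
  }
}

const stack = new Scope("GitlabRunnerStack");
const light = new Scope("LightRunners", stack);
const heavy = new Scope("HeavyRunners", stack);

// The same ID under different parents does not clash.
new Scope("Network", light);
new Scope("Network", heavy);

// Creating "Network" twice directly under the stack would throw instead.
```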

default sshKeyPath causes machine creation errors

When I deploy an instance of the runner stack, the manager fails to create runners with an error log like this:

Running pre-create checks...                        driver=amazonec2 name=runner-abcdef-gitlab-runner-01-02 operation=create  
Creating machine...                                 driver=amazonec2 name=runner-abcdef-gitlab-runner-01-02 operation=create  
(runner-abcdef-gitlab-runner-01-02) Launching instance...  driver=amazonec2 name=runner-abcdef-gitlab-runner-01-02 operation=create  
(runner-abcdef-gitlab-runner-01-02) Missing instance ID, this is likely due to a failure during machine creation  driver=amazonec2 name=runner-abcdef-gitlab-runner-01-02 operation=create  
(runner-abcdef-gitlab-runner-01-02) Missing key pair name, this is likely due to a failure during machine creation  driver=amazonec2 name=runner-abcdef-gitlab-runner-01-02 operation=create  
IdleCount is set to 0 so the machine will be created on demand in job context  creating=1 idle=0 idleCount=0 idleCountMin=0 idleScaleFactor=0 maxMachineCreate=0 maxMachines=10 removing=0 runner=Abcdef total=1 used=0  
ERROR: Error creating machine: Error in driver during machine creation: unable to create key pair: open /etc/gitlab-runner/ssh: no such file or directory  driver=amazonec2 name=runner-abcdef-gitlab-runner-01-02 operation=create  
ERROR: Machine creation failed                      error=exit status 1 name=runner-abcdef-gitlab-runner-01-02 time=5.644731027s  

The stack looks like this:

    new GitlabRunnerAutoscaling(this, "GitlabRunnerAutoscaling", {
      network: {
        vpc: vpc,
      },
      runners: [
        {
          instanceType: ec2.InstanceType.of(
            ec2.InstanceClass.T3,
            ec2.InstanceSize.SMALL,
          ),
          token: token,
          role: runnerExecutionRole,
          configuration: {
            machine: {
              machineOptions: {
                requestSpotInstance: false,
              },
            },
          },
        },
      ],
    });

If I set configuration.machine.sshKeyPath to "", the manager is able to create the runner just fine.

I'm using @pepperize/cdk-autoscaling-gitlab-runner@^0.2.111 with aws-cdk@^2.30.0
