awslabs / genomics-secondary-analysis-using-aws-step-functions-and-aws-batch

This solution provides a framework for Next Generation Sequencing (NGS) genomics secondary-analysis pipelines using AWS Step Functions and AWS Batch.

Home Page: https://aws.amazon.com/solutions/implementations/genomics-secondary-analysis-using-aws-step-functions-and-aws-batch/

License: Apache License 2.0


Deprecation Notice

In 2022, AWS launched AWS HealthOmics, a purpose-built service to store, query, and analyze genomics and other omics data securely and at scale. Because HealthOmics lets users run bioinformatics workflows using industry-specific workflow languages and abstracts the AWS infrastructure and its management away from the user, we recommend using AWS HealthOmics. Updated guidance is available here: https://aws.amazon.com/solutions/guidance/development-automation-implementation-monitoring-of-bioinformatics-workflows-on-aws/

Genomics Secondary Analysis Using AWS Step Functions and AWS Batch

This solution provides a framework for Next Generation Sequencing (NGS) genomics secondary-analysis pipelines using AWS Step Functions and AWS Batch. It deploys AWS services to develop and run custom workflow pipelines, monitor pipeline status and performance, fail over to on-demand compute, handle errors, optimize for cost, and secure data with least-privilege access.

The solution is designed to be a starting point for developing your own custom genomics workflow pipelines using Amazon States Language and AWS Step Functions, following continuous integration / continuous deployment (CI/CD) principles. That is, everything, from the workflow definitions to the resources they run on top of, is code: tracked in version control, and automatically built, tested, and deployed when developers make changes.
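Concretely, once the solution is deployed, a development loop might look like the following (a sketch; the GenomicsWorkflowPipe stack name and RepoCloneUrl output follow the solution defaults, and git credentials for AWS CodeCommit are assumed to be configured):

# look up the CodeCommit clone URL exported by the solution's pipeline stack
REPO_URL=$(aws cloudformation describe-stacks \
    --stack-name GenomicsWorkflowPipe \
    --query 'Stacks[].Outputs[?OutputKey==`RepoCloneUrl`].OutputValue' \
    --output text)

git clone "$REPO_URL" GenomicsWorkflowCode
cd GenomicsWorkflowCode

# edit workflow definitions, Dockerfiles, or CloudFormation templates, then push;
# the CI/CD pipeline rebuilds, tests, and redeploys automatically
git add -A
git commit -m "tune variant calling workflow"
git push origin master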

Standard deployment

To deploy this solution in your account, use the "Launch in the AWS Console" button found on the solution landing page.

We recommend deploying the solution this way for most use cases.

This creates all of the resources you need to get started developing and running genomics secondary-analysis pipelines, including an example containerized toolset and a definition for a simple variant calling pipeline using BWA-MEM, Samtools, and BCFtools.
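Conceptually, each stage of that example pipeline wraps a standard command-line tool. A simplified sketch of the underlying commands (not the exact invocations used by the deployed containers, and assuming the reference has already been indexed with bwa index and samtools faidx):

# align reads to the reference and sort the alignments
bwa mem reference.fasta sample_R1.fastq.gz sample_R2.fastq.gz \
    | samtools sort -o sample.bam
samtools index sample.bam

# call variants from the sorted alignments
bcftools mpileup -f reference.fasta sample.bam \
    | bcftools call -mv -Oz -o sample.vcf.gz

In the deployed solution, each stage runs in its own container as an AWS Batch job, orchestrated by a Step Functions state machine.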

Install options

Customized deployment

A fully customized solution can be deployed for the following use cases:

  • Modifying or adding to the resources deployed during installation
  • Modifying the "Landing Zone" of the solution, e.g. adding artifacts or customizing the "Pipe" CodePipeline

Fully customized solutions must be self-hosted in your own AWS account, and you are responsible for any costs incurred in doing so.

To deploy and self-host a fully customized solution, use the instructions below.

Note: All commands assume a bash shell.

Customize

Clone the repository and make your desired changes.
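For example, cloning from the awslabs GitHub repository:

git clone https://github.com/awslabs/genomics-secondary-analysis-using-aws-step-functions-and-aws-batch.git
cd genomics-secondary-analysis-using-aws-step-functions-and-aws-batch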

File Structure

.
├── CHANGELOG.md
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── LICENSE.txt
├── NOTICE.txt
├── README.md
├── deployment
│   ├── build-s3-dist.sh
│   └── run-unit-tests.sh
└── source
    ├── code
    │   ├── buildspec.yml
    │   ├── cfn
    │   │   ├── cloudwatch-dashboard.cfn.yaml
    │   │   ├── core
    │   │   │   ├── batch.cfn.yaml
    │   │   │   ├── iam.cfn.yaml
    │   │   │   └── networking.cfn.yaml
    │   │   └── workflow-variantcalling-simple.cfn.yaml
    │   ├── containers
    │   │   ├── _common
    │   │   │   ├── README.md
    │   │   │   ├── aws.dockerfile
    │   │   │   ├── build.sh
    │   │   │   ├── entrypoint.aws.sh
    │   │   │   └── push.sh
    │   │   ├── bcftools
    │   │   │   └── Dockerfile
    │   │   ├── buildspec.yml
    │   │   ├── bwa
    │   │   │   └── Dockerfile
    │   │   └── samtools
    │   │       └── Dockerfile
    │   └── main.cfn.yml
    ├── pipe
    │   ├── README.md
    │   ├── buildspec.yml
    │   ├── cfn
    │   │   ├── container-buildproject.cfn.yaml
    │   │   └── iam.cfn.yaml
    │   └── main.cfn.yml
    ├── setup
    │   ├── lambda
    │   │   ├── lambda.py
    │   │   └── requirements.txt
    │   ├── setup.sh
    │   ├── teardown.sh
    │   └── test.sh
    ├── setup.cfn.yaml
    └── zone
        ├── README.md
        └── main.cfn.yml

Path Description
deployment Scripts for building and deploying a customized distributable
deployment/build-s3-dist.sh Shell script for packaging distribution assets
deployment/run-unit-tests.sh Shell script for executing unit tests
source Source code for the solution
source/setup.cfn.yaml CloudFormation template used to install the solution
source/setup/ Assets used by the installation and un-installation process
source/zone/ Source code for the solution landing zone - location for common assets and artifacts used by the solution
source/pipe/ Source code for the solution deployment pipeline - the CI/CD pipeline that builds and deploys the solution codebase
source/code/ Source code for the solution codebase - source code for containerized tooling, workflow definitions, and AWS resources for workflow execution

Run unit tests

cd ./deployment
chmod +x ./run-unit-tests.sh
./run-unit-tests.sh

Build and deploy

Create deployment buckets

The solution requires two buckets for deployment:

  1. <bucket-name> for the solution's primary CloudFormation template
  2. <bucket-name>-<aws_region> for additional artifacts and assets that the solution requires - these are stored regionally to reduce latency during installation and avoid inter-regional transfer costs
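For example, assuming a deployment bucket named my-genomics-dist in us-east-1 (bucket names below are placeholders and must be globally unique):

aws s3 mb s3://my-genomics-dist
aws s3 mb s3://my-genomics-dist-us-east-1 --region us-east-1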

Configure and build the distributable

export DIST_OUTPUT_BUCKET=<bucket-name>
export SOLUTION_NAME=<solution-name>
export VERSION=<version>

chmod +x ./build-s3-dist.sh
./build-s3-dist.sh $DIST_OUTPUT_BUCKET $SOLUTION_NAME $VERSION
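For example, with illustrative values filled in:

./build-s3-dist.sh my-genomics-dist genomics-secondary-analysis v1.0.0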

Deploy the distributable

Note: you must have the AWS Command Line Interface (CLI) installed for this step. Learn more about the AWS CLI at https://aws.amazon.com/cli/.

cd ./deployment

# deploy global assets
# this only needs to be done once
aws s3 cp \
    ./global-s3-assets/ s3://<bucket-name>/$SOLUTION_NAME/$VERSION \
    --recursive \
    --acl bucket-owner-full-control

# deploy regional assets
# repeat this step for as many regions as needed
aws s3 cp \
    ./regional-s3-assets/ s3://<bucket-name>-<aws_region>/$SOLUTION_NAME/$VERSION \
    --recursive \
    --acl bucket-owner-full-control
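To confirm both uploads, you can list the deployed assets (paths follow the copy commands above):

aws s3 ls s3://<bucket-name>/$SOLUTION_NAME/$VERSION/ --recursive
aws s3 ls s3://<bucket-name>-<aws_region>/$SOLUTION_NAME/$VERSION/ --recursive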

Install the customized solution

The link to the primary CloudFormation template will look something like:

https://<bucket-name>.s3-<region>.amazonaws.com/genomics-secondary-analysis-using-aws-step-functions-and-aws-batch.template

Use this link to install the customized solution into your AWS account in a specific region using the AWS CloudFormation console.
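Alternatively, the stack can be launched from the command line; a minimal sketch (the stack name is a placeholder, and your template may require additional parameters):

aws cloudformation create-stack \
    --stack-name GenomicsWorkflow \
    --template-url https://<bucket-name>.s3-<region>.amazonaws.com/genomics-secondary-analysis-using-aws-step-functions-and-aws-batch.template \
    --capabilities CAPABILITY_IAM CAPABILITY_NAMED_IAM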


This solution collects anonymous operational metrics to help AWS improve the quality and features of the solution. For more information, including how to disable this capability, please see the implementation guide.


Copyright 2021 Amazon.com, Inc. or its affiliates. All Rights Reserved.

Licensed under the Apache License Version 2.0 (the "License"). You may not use this file except in compliance with the License. A copy of the License is located at

http://www.apache.org/licenses/

or in the "license" file accompanying this file. This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, express or implied. See the License for the specific language governing permissions and limitations under the License.

Contributors

amazon-auto, nbulsara, rulaszek, wleepang

Issues

No space left on device

I am running into an issue where I get a java.lang.RuntimeException: java.io.IOException: No space left on device in my Batch jobs. I thought that the EBS volume used as the mount directory had EBS auto-scaling.

Issues when executing build-s3-dist.sh

Hi,
A couple of issues when running build-s3-dist.sh (running from Ubuntu on AWS Cloud9):

pip install -t . crhelper returns an error:

DistutilsOptionError: can't combine user with prefix, exec_prefix/home, or install_(plat)base

Following Stack Overflow answers, changing it to pip install --user --install-option="--prefix=" crhelper resolved it.

Also, some of the commands in the script produce output which I'm not sure is expected:

copy yaml templates and rename
cp: cannot stat '/home/ubuntu/environment/genomics-secondary-analysis-using-aws-step-functions-and-aws-batch/deployment/*.yaml': No such file or directory
mv: cannot stat '*.yaml': No such file or directory
Updating code source bucket in template with XXXX-pipeline-distribution
sed -i '' -e s/%%BUCKET_NAME%%/XXXX-pipeline-distribution/g /home/ubuntu/environment/genomics-secondary-analysis-using-aws-step-functions-and-aws-batch/deployment/global-s3-assets/*.template
sed: can't read : No such file or directory
sed -i '' -e s/%%SOLUTION_NAME%%/XXXX-Pipeline/g /home/ubuntu/environment/genomics-secondary-analysis-using-aws-step-functions-and-aws-batch/deployment/global-s3-assets/*.template
sed: can't read : No such file or directory
sed -i '' -e s/%%VERSION%%/1.0/g /home/ubuntu/environment/genomics-secondary-analysis-using-aws-step-functions-and-aws-batch/deployment/global-s3-assets/*.template
sed: can't read : No such file or directory

Finally (and maybe related to the above), there's no ./dist folder to upload to S3 when the script is done.

Thanks in advance,
Ury

403 downloading s3 assets

Looks to me like a problem with the upstream genomics-solutions-shared-assets bucket when the build script runs.
No sample data is retrievable.

Deploy customized solution

Hello!

I'm trying to customize the workflow (making some changes to the AWS::EC2::LaunchTemplate resource, mainly) and so far failing to deploy the customized solution from my own S3 bucket. I managed to interact with the solution using the "as-is" template.

I've made the changes in the files (batch.cfn.yaml) and tried following the README to get my solution ready.

First, I'm a bit confused by how the S3 bucket for the source code should be named. Is it my-bucket-name-us-east-1 or my-bucket-name? Because when looking at the template file:

Mappings:
  Send:
    AnonymousUsage:
      Data: Yes
  SourceCode:
    General:
      S3Bucket: 'my-bucket-name'

The region identifier is not present.

Anyway, I've uploaded the global-s3-assets folder (not dist) to both my-bucket-name and my-bucket-name-us-east-1.

When I want to upload my template into CloudFormation using the S3 path of the template in my buckets directly, the template is read, but when clicking Next I get the following error:

Domain name specified in my-bucket-name-us-east-1 is not a valid S3 domain

However, when uploading the template file generated on my hard drive directly, the CloudFormation creation starts.

However, I'm afraid I'm getting the same error as in #1:

Failed to create resource. Code Build job 'GenomicsWorkflow2-Setup:c3605228-d405-4413-b3a4-134ca89e97d8' in project 'GenomicsWorkflow2-Setup' exited with a build status of 'FAILED'.

Thanks for your help!

Sorry, I found the error:

aws s3 cp s3://customgenomicsworkflow-us-east-1/myawssolution/1/samples/NIST7035_R1_trim_samp-0p1.fastq.gz s3://genomicsworkflow2zone-zonebucket-1icns0ac2ai79/samples/NIST7035_R1_trim_samp-0p1.fastq.gz
fatal error: An error occurred (403) when calling the HeadObject operation: Forbidden

I copied the fastq files and I'm now resuming... I'm leaving this open in case I don't manage to build to the end. But I guess it's really the regional-s3-assets folder that I should have uploaded, since it's the one with the fastqs?

Anyway, I'm still puzzled by why I can't start the CloudFormation deployment by inputting the S3 path of the template.

Edit 2: Success!

Edit 3: And my change to the LaunchTemplate appears to be working too, so I'm closing this now!

Error building when modifying the solution

I can get this to run unmodified; however, I made a few modifications:

  1. Added additional Docker images (tested locally, and these build correctly); also, if I don't delete the stack on failure, these images are present.
  2. Added additional Batch jobs for the Docker images
  3. Removed the sections of the code that upload the sample data.

I updated the policy for the sample bucket to :

      Policies:
        - PolicyName: S3Access
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - s3:ListBucket
                  - s3:GetObject
                  - s3:GetObjectVersion
                  - s3:PutObject
                Resource:
                  - !Sub arn:aws:s3:::${JobResultsBucket}
                  - !Sub arn:aws:s3:::${JobResultsBucket}/*
              - Effect: Allow
                Action:
                  - s3:ListBucket
                  - s3:GetObject
                Resource:
                  - arn:aws:s3:::my_data_folder
                  - arn:aws:s3:::my_data_folder/*
                  # - !Sub arn:aws:s3:::${SamplesBucket}
                  # - !Sub arn:aws:s3:::${SamplesBucket}/*

I get the following error when building, and I am unclear what it means or how to debug it.

2021-09-07T08:53:14.924-07:00	++ get-repo-url GenomicsWorkflowPipe
2021-09-07T08:53:14.924-07:00	++ local stack_name=GenomicsWorkflowPipe
2021-09-07T08:53:14.924-07:00	+++ aws cloudformation describe-stacks --stack-name GenomicsWorkflowPipe --query 'Stacks[].Outputs[?OutputKey==`RepoCloneUrl`].OutputValue' --output text
2021-09-07T08:53:14.924-07:00	++ local url=https://git-codecommit.us-west-1.amazonaws.com/v1/repos/GenomicsWorkflowCode
2021-09-07T08:53:14.924-07:00	++ echo https://git-codecommit.us-west-1.amazonaws.com/v1/repos/GenomicsWorkflowCode
2021-09-07T08:53:14.924-07:00	+ git remote add origin https://git-codecommit.us-west-1.amazonaws.com/v1/repos/GenomicsWorkflowCode
2021-09-07T08:53:14.924-07:00	+ git push -u origin master
2021-09-07T08:53:18.981-07:00	To https://git-codecommit.us-west-1.amazonaws.com/v1/repos/GenomicsWorkflowCode
2021-09-07T08:53:18.981-07:00	* [new branch] master -> master
2021-09-07T08:53:18.981-07:00	Branch 'master' set up to track remote branch 'master' from 'origin'.
2021-09-07T08:53:18.981-07:00	+ wait-for-stack GenomicsWorkflowCode
2021-09-07T08:53:18.981-07:00	+ local stack_name=GenomicsWorkflowCode
2021-09-07T08:53:18.981-07:00	+ local exists_attempts=6
2021-09-07T08:53:18.981-07:00	+ local status=0
2021-09-07T08:53:18.981-07:00	+ set +e
2021-09-07T08:53:18.981-07:00	+ echo 'Creating stack: GenomicsWorkflowCode'
2021-09-07T08:53:18.981-07:00	Creating stack: GenomicsWorkflowCode
2021-09-07T08:53:18.981-07:00	+ (( attempt=1 ))
2021-09-07T08:53:18.981-07:00	+ (( attempt<=6 ))
2021-09-07T08:53:18.981-07:00	+ echo 'Waiting for stack creation - attempt: 1'
2021-09-07T08:53:18.981-07:00	Waiting for stack creation - attempt: 1
2021-09-07T08:53:18.981-07:00	+ aws cloudformation wait stack-exists --stack-name GenomicsWorkflowCode
2021-09-07T08:54:55.034-07:00	
2021-09-07T08:54:55.034-07:00	Waiter StackExists failed: Max attempts exceeded. Previously accepted state: Matched expected service error code: ValidationError
2021-09-07T08:54:55.034-07:00	+ status=255
2021-09-07T08:54:55.034-07:00	+ '[' 255 -eq 0 ']'
2021-09-07T08:54:55.034-07:00	+ (( attempt++ ))
2021-09-07T08:54:55.034-07:00	+ (( attempt<=6 ))
2021-09-07T08:54:55.034-07:00	+ echo 'Waiting for stack creation - attempt: 2'
2021-09-07T08:54:55.034-07:00	Waiting for stack creation - attempt: 2
2021-09-07T08:54:55.034-07:00	+ aws cloudformation wait stack-exists --stack-name GenomicsWorkflowCode
2021-09-07T08:56:31.112-07:00	
2021-09-07T08:56:31.112-07:00	Waiter StackExists failed: Max attempts exceeded. Previously accepted state: Matched expected service error code: ValidationError
2021-09-07T08:56:31.112-07:00	+ status=255
2021-09-07T08:56:31.112-07:00	+ '[' 255 -eq 0 ']'
2021-09-07T08:56:31.112-07:00	+ (( attempt++ ))
2021-09-07T08:56:31.112-07:00	+ (( attempt<=6 ))
2021-09-07T08:56:31.112-07:00	+ echo 'Waiting for stack creation - attempt: 3'
2021-09-07T08:56:31.112-07:00	Waiting for stack creation - attempt: 3
2021-09-07T08:56:31.112-07:00	+ aws cloudformation wait stack-exists --stack-name GenomicsWorkflowCode
2021-09-07T08:58:07.196-07:00	
2021-09-07T08:58:07.196-07:00	Waiter StackExists failed: Max attempts exceeded. Previously accepted state: Matched expected service error code: ValidationError
2021-09-07T08:58:07.196-07:00	+ status=255
2021-09-07T08:58:07.196-07:00	+ '[' 255 -eq 0 ']'
2021-09-07T08:58:07.196-07:00	+ (( attempt++ ))
2021-09-07T08:58:07.196-07:00	+ (( attempt<=6 ))
2021-09-07T08:58:07.196-07:00	+ echo 'Waiting for stack creation - attempt: 4'
2021-09-07T08:58:07.196-07:00	Waiting for stack creation - attempt: 4
2021-09-07T08:58:07.196-07:00	+ aws cloudformation wait stack-exists --stack-name GenomicsWorkflowCode
2021-09-07T08:59:43.284-07:00	
2021-09-07T08:59:43.284-07:00	Waiter StackExists failed: Max attempts exceeded. Previously accepted state: Matched expected service error code: ValidationError
2021-09-07T08:59:43.284-07:00	+ status=255
2021-09-07T08:59:43.284-07:00	+ '[' 255 -eq 0 ']'
2021-09-07T08:59:43.284-07:00	+ (( attempt++ ))
2021-09-07T08:59:43.284-07:00	+ (( attempt<=6 ))
2021-09-07T08:59:43.284-07:00	+ echo 'Waiting for stack creation - attempt: 5'
2021-09-07T08:59:43.284-07:00	Waiting for stack creation - attempt: 5
2021-09-07T08:59:43.284-07:00	+ aws cloudformation wait stack-exists --stack-name GenomicsWorkflowCode
2021-09-07T09:01:21.383-07:00	
2021-09-07T09:01:21.383-07:00	Waiter StackExists failed: Max attempts exceeded. Previously accepted state: Matched expected service error code: ValidationError
2021-09-07T09:01:21.383-07:00	+ status=255
2021-09-07T09:01:21.383-07:00	+ '[' 255 -eq 0 ']'
2021-09-07T09:01:21.383-07:00	+ (( attempt++ ))
2021-09-07T09:01:21.383-07:00	+ (( attempt<=6 ))
2021-09-07T09:01:21.383-07:00	+ echo 'Waiting for stack creation - attempt: 6'
2021-09-07T09:01:21.383-07:00	Waiting for stack creation - attempt: 6
2021-09-07T09:01:21.383-07:00	+ aws cloudformation wait stack-exists --stack-name GenomicsWorkflowCode
2021-09-07T09:02:58.238-07:00	
2021-09-07T09:02:58.238-07:00	Waiter StackExists failed: Max attempts exceeded. Previously accepted state: Matched expected service error code: ValidationError
2021-09-07T09:02:58.238-07:00	+ status=255
2021-09-07T09:02:58.238-07:00	+ '[' 255 -eq 0 ']'
2021-09-07T09:02:58.238-07:00	+ (( attempt++ ))
2021-09-07T09:02:58.238-07:00	+ (( attempt<=6 ))
2021-09-07T09:02:58.238-07:00	+ '[' '!' 255 -eq 0 ']'
2021-09-07T09:02:58.238-07:00	+ echo '[ERROR] Stack creation could not be started.'
2021-09-07T09:02:58.238-07:00	[ERROR] Stack creation could not be started.
2021-09-07T09:02:58.238-07:00	+ return 255
2021-09-07T09:02:58.238-07:00	+ status=255
2021-09-07T09:02:58.238-07:00	+ set -e
2021-09-07T09:02:58.238-07:00	+ '[' '!' 255 -eq 0 ']'
2021-09-07T09:02:58.238-07:00	+ echo '[ERROR] GenomicsWorkflowCode Stack FAILED'
2021-09-07T09:02:58.238-07:00	[ERROR] GenomicsWorkflowCode Stack FAILED
2021-09-07T09:02:58.238-07:00	+ exit 255
2021-09-07T09:02:58.238-07:00	
2021-09-07T09:02:58.238-07:00	[Container] 2021/09/07 16:02:56 Command did not exit successfully ./setup/$SOLUTION_ACTION.sh exit status 255
2021-09-07T09:02:58.238-07:00	[Container] 2021/09/07 16:02:56 Phase complete: INSTALL State: FAILED
                               [Container] 2021/09/07 16:02:56 Phase context status code: COMMAND_EXECUTION_ERROR Message: Error while executing command: ./setup/$SOLUTION_ACTION.sh. 
                               Reason: exit status 255

[Errno 28] No space left on device

Dear all,

First of all, thank you very much for creating this repository. It has helped my team and me tremendously.

I have implemented the solution as described in the README using the "standard deployment".

I have made modifications to the state machine. I'm no longer using the JOB_INPUTS, JOB_OUTPUTS, JOB_OUTPUT_PREFIX... arguments. I am just giving an S3 path for a config file, which the container downloads and executes a script against accordingly. The issue arises when the batch job downloads fasta files from the S3 bucket: (1) it seems to be slow, and (2) I get the [Errno 28] No space left on device error.

  1. I know that this no longer uses entrypoint.aws.sh for input, output, and avoiding file-path clobbering, and I have no problem going back to it; I just wanted to get our setup working in the new development stack.

  2. I had to add the following two arguments to build.sh:
    --build-arg AWS_DEFAULT_REGION=$AWS_DEFAULT_REGION \
    --build-arg AWS_CONTAINER_CREDENTIALS_RELATIVE_URI=$AWS_CONTAINER_CREDENTIALS_RELATIVE_URI \
    and our base Docker image also has the following arguments added:
    ARG AWS_DEFAULT_REGION
    ARG AWS_CONTAINER_CREDENTIALS_RELATIVE_URI

  3. I made sure that the shell scripts we run in the Docker batch job do their work specifically in /scratch.

  4. I tried changing the size of the attached EBS volume, although I was under the impression that it should automagically change its size via the amazon-ebs-autoscale repository.

Also, any changes I make to the VolumeSize of either /dev/xvdcz or /dev/sdc don't take effect when the EC2 instance starts running.

I've tried to describe the changes I made and some of the things I tried in order to solve this issue. Any help you could throw my direction would be greatly appreciated.

Thanks in advance

Matt

Deployment fails at Setup during stack creation

Dear all,

first of all, thank you for the solution. I got it up and running in Virginia without any problems. Then I realized that Ireland is my preferred region. I changed the region in the drop-down and tried again. The first time, the deployment failed at the Setup step. I tried again and it worked. I subsequently deleted the stack and tried again. Now it doesn't go through. I get the following error in the CloudFormation process:

2020-07-02 23:38:56 UTC+0200 MARVL-Stack ROLLBACK_IN_PROGRESS The following resource(s) failed to create: [Setup]. . Rollback requested by user.
2020-07-02 23:38:54 UTC+0200 Setup CREATE_FAILED Custom Resource failed to stabilize in expected time

Do you know what might be preventing the CloudFormation deployment from going through?

Thanks for any help you might be able to offer on this topic!

Best Regards

Matt

Error while downloading sample files

Hi,
I was following the setup instructions and am running into the following issue while executing
./build-s3-dist.sh $DIST_OUTPUT_BUCKET $SOLUTION_NAME $VERSION:

Downloading sample files.
--2022-04-15 10:54:56-- https://genomics-solutions-shared-assets.s3-us-west-2.amazonaws.com/secondary-analysis/example-files/fastq/NIST7035_R1_trim_samp-0p1.fastq.gz
Resolving genomics-solutions-shared-assets.s3-us-west-2.amazonaws.com (genomics-solutions-shared-assets.s3-us-west-2.amazonaws.com)... 52.218.220.225
Connecting to genomics-solutions-shared-assets.s3-us-west-2.amazonaws.com (genomics-solutions-shared-assets.s3-us-west-2.amazonaws.com)|52.218.220.225|:443... connected.
HTTP request sent, awaiting response... 403 Forbidden
2022-04-15 10:54:56 ERROR 403: Forbidden.

Are the sample files not accessible anymore?

Lacking Lambda permissions for the CloudFormation Zone role?

Hello,

I'm trying to redeploy from scratch just to see whether I can migrate my solution efficiently to another AWS account, and I'm bumping into a lot of missing permissions for the CloudFormation Zone role:

User: arn:aws:sts:::assumed-role/AwesomeGenomicsZone-CloudFormationRole-EEUTS26ANEP4/AWSCloudFormation is not authorized to perform: lambda:AddPermission on resource: arn:aws:lambda:us-east-2::function:AwesomeGenomicsPipe-LambdaTriggerCodeBuild-A7NZMzdgpQjy because no identity-based policy allows the lambda:AddPermission action (Service: AWSLambdaInternal; Status Code: 403; Error Code: AccessDeniedException; )

At the moment I'm adding these:

              - Effect: Allow
                Action:
                  - lambda:CreateFunction
                  - lambda:DeleteFunction
                  - lambda:GetFunction
                  - lambda:AddPermission
                Resource:
                  - "*"

into source/zone/main.cfn.yaml for the CloudFormationRole.

Does it make sense to you that these are now required and did not prompt an error in the past? Or am I missing something?

Edit: yeah, sorry about that; I know why I needed the Lambda permission in my own custom solution.

Is there any way to increase the vCPU and memory when starting a state machine?

I've just been testing running a state machine by passing params, e.g.:

{
    "params": {
        "queue": "DataQualityWorkflowsLowPriority",
        "environment": {
            "REFERENCE_NAME": "Homo_sapiens_assembly38",
            "SAMPLE_ID": "NIST7035",
            "SOURCE_DATA_PREFIX": "s3://dataqualityworkflowszone-zonebucket-mtwqpz9vrde0/samples/",
            "JOB_OUTPUT_PREFIX": "s3://dataqualityworkflowscode-jobresultsbucket-1b2pb00b6cyfl",
            "JOB_AWS_CLI_PATH": "/opt/miniconda/bin"
        },
        "chromosomes": [
            "chr19",
            "chr20",
            "chr21",
            "chr22"
        ]
    }
}

Is there a way to increase the default RAM when starting a new job?
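For reference, AWS Batch does accept per-job overrides of vCPU and memory via containerOverrides at submission time, and a Step Functions Batch task can pass the same block through its Parameters. A minimal CLI sketch (the job definition name is a placeholder; the queue name is taken from the example above):

aws batch submit-job \
    --job-name test-bigger-resources \
    --job-queue DataQualityWorkflowsLowPriority \
    --job-definition my-job-definition \
    --container-overrides '{"resourceRequirements":[{"type":"VCPU","value":"8"},{"type":"MEMORY","value":"16384"}]}'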

Multiple input files have the same name.

I have a step in one of my workflows where the input files have the same name, just in different directories. For example:

JOB_OUTPUT_PREFIX="s3://job-output-bucket/assemblies/sample1/contigs.fasta s3://job-output-bucket/assemblies/sample2/contigs.fasta s3://job-output-bucket/assemblies/sample3/contigs.fasta"

However, each file is downloaded to ./contigs.fasta and gets overwritten. Is there a way I could download to either ./sample1/contigs.fasta or rename each downloaded file to ./sample1-contigs.fasta?
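One possible workaround is to stage each input into its own directory; a minimal sketch, assuming the AWS CLI is available in the container and JOB_INPUTS holds the space-separated list of S3 URIs:

# derive a per-sample directory from the parent S3 prefix of each input
for uri in $JOB_INPUTS; do
    sample=$(basename "$(dirname "$uri")")   # e.g. sample1, sample2, ...
    mkdir -p "./$sample"
    aws s3 cp "$uri" "./$sample/$(basename "$uri")"
done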
