
aws-cloudhsm-cloudformation-template's Issues

Enhancement: AL2023 support

I really appreciate this template. Since AL2 is going EOL in 2025, it would be nice to see AL2023 support added for the EC2 instance.

Enhancement: support the stack to work with private VPCs

Although the documentation states that Internet connectivity is required to deploy the stack, the stack could be changed to work without it. Most of the software (with the exception of AWS CLI v2) is hosted on S3, so a private VPC with an S3 VPC endpoint works fine, provided one comments out the forced update of AWS CLI v2. I've raised a corresponding feature request in the AWS CLI v2 repo to reconsider moving release hosting to S3 as well.

Given that the auxiliary EC2 instance is spawned from Amazon Linux 2 which already has AWS CLI installed, do we really need to forcibly re-install the package?

Update to use CloudHSM client v5

Update the automation to download and use v5 of the CloudHSM client. Currently, v3 is used.

Examples of change to the CLI:

New explicit command for cluster activation

The current AWS CloudHSM documentation describes the cluster activation process as follows:

https://docs.aws.amazon.com/cloudhsm/latest/userguide/activate-cluster.html

Note the use of the cluster activate command.

The current implementation still works, but it uses a slightly different and perhaps outdated means to trigger cluster activation. It uses the process of setting the Crypto Officer (CO) password to an initial value to trigger the activation.

Rework format of stack outputs

Rework the custom resource and oClusterInfo | oCloudHsmKeyStoreId output to be more aligned with the format of properties of other resources that have IDs. e.g. EC2 instances.

Security: Add documentation and guidance for the EC2 client's egress 443 rule

Also add inline comments around the egress rule for 443 to explain its purposes (AWS package downloads, connectivity to AWS Systems Manager and other AWS services - be specific). Also highlight that in a more formal implementation, the CIDR should be narrowed from 0.0.0.0/0 to align with how connectivity to such endpoints is established.

Implement test automation

Implement some extent of test automation so that the effort required to validate changes is greatly reduced. See the TESTING.md file for an initial set of test cases to consider automating.

cloudhsm-cli: when selected for installation, run as ssm-user results in warning and log messages to terminal

When cloudhsm-cli is selected for installation during stack creation, running the CLI manually as ssm-user after the stack is created produces warning messages, and the CLI's log output is written to the terminal. This happens, for example, when a user accesses the EC2 client through AWS Systems Manager Session Manager and runs the cloudhsm-cli command. In this case, the user is ssm-user.

You can still use the CLI, but the output messages are annoying.

sh-4.2$ /opt/cloudhsm/bin/cloudhsm-cli interactive
thread 'CloudHSM Worker' panicked at 'failed to create appender: Os { code: 13, kind: PermissionDenied, message: "Permission denied" }', /root/.cargo/registry/src/github.com-1ecc6299db9ec823/tracing-appender-0.2.2/src/rolling.rs:499:53
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Error writing to log file. Falling back to standard error.
2023-04-27T18:33:55.150Z INFO  [793] ThreadId(1) [cloudhsm_provider::hsm1::connection::connection_pool] Adding HSM connection to connection pool: HsmConnection { hsm_info: HSM { IP: "10.4.12.221", Port: 2223 } }
2023-04-27T18:33:55.150Z INFO  [793] ThreadId(1) [cloudhsm_provider::hsm1::connection::connection_pool] Adding HSM connection to connection pool: HsmConnection { hsm_info: HSM { IP: "10.4.19.44", Port: 2223 } }
2023-04-27T18:33:55.150Z INFO  [793] ThreadId(1) [cloudhsm_provider::hsm1::hsm_connection::hsm_connection_impl] HSM 10.4.12.221:2223 is connecting
2023-04-27T18:33:55.159Z INFO  [793] ThreadId(1) [cloudhsm_provider::hsm1::hsm_connection::server_connection::common] Initializing new connection: HSM { IP: "10.4.12.221", Port: 2223 }
2023-04-27T18:33:55.160Z INFO  [793] ThreadId(1) [cloudhsm_provider::hsm1::hsm_connection::hsm_connection_impl] HSM 10.4.19.44:2223 is connecting
2023-04-27T18:33:55.165Z INFO  [793] ThreadId(1) [cloudhsm_provider::hsm1::hsm_connection::server_connection::common] Initializing new connection: HSM { IP: "10.4.19.44", Port: 2223 }
2023-04-27T18:33:55.216Z INFO  [793] ThreadId(1) [cloudhsm_provider::hsm1::hsm_connection::server_properties] Version handshake with server succeeded. Received version: ComponentVersion { major: 2, minor: 8 }
2023-04-27T18:33:55.216Z INFO  [793] ThreadId(1) [hsm1_marshaling::server_handshake] Reporting sdk version CLI:5.8.0-el7:CodeBuildBatchProject-uFu5sNXfquqK:1ce78aba-ddf5-4c08-aaab-3d9eda62e152
2023-04-27T18:33:55.217Z INFO  [793] ThreadId(1) [cloudhsm_provider::hsm1::hsm_connection::server_properties] Version handshake with server succeeded. Received version: ComponentVersion { major: 2, minor: 8 }
2023-04-27T18:33:55.217Z INFO  [793] ThreadId(1) [hsm1_marshaling::server_handshake] Reporting sdk version CLI:5.8.0-el7:CodeBuildBatchProject-uFu5sNXfquqK:1ce78aba-ddf5-4c08-aaab-3d9eda62e152
2023-04-27T18:33:55.309Z INFO  [793] ThreadId(2) [cloudhsm_provider::hsm1::connection::connection_pool::cluster_info_message] Current cluster version is 0; incoming cluster version is 199391178
2023-04-27T18:33:55.309Z INFO  [793] ThreadId(2) [cloudhsm_provider::hsm1::connection::connection_pool::cluster_info_message] HSMs to be added: {HSM { IP: "10.4.19.44", Port: 2223 }, HSM { IP: "10.4.12.221", Port: 2223 }}
2023-04-27T18:33:55.309Z INFO  [793] ThreadId(2) [cloudhsm_provider::hsm1::connection::connection_pool::cluster_info_message] HSMs to be removed: {}
2023-04-27T18:33:55.311Z INFO  [793] ThreadId(1) [cloudhsm_provider::hsm1::hsm_connection::hsm_connection_impl] Updating the state of HSM 10.4.19.44:2223
2023-04-27T18:33:55.311Z INFO  [793] ThreadId(1) [cloudhsm_provider::hsm1::hsm_connection::hsm_connection_impl] HSM 10.4.19.44:2223 is connected and ready
2023-04-27T18:33:55.317Z INFO  [793] ThreadId(1) [cloudhsm_provider::hsm1::hsm_connection::hsm_connection_impl] Updating the state of HSM 10.4.12.221:2223
2023-04-27T18:33:55.318Z INFO  [793] ThreadId(1) [cloudhsm_provider::hsm1::hsm_connection::hsm_connection_impl] HSM 10.4.12.221:2223 is connected and ready
2023-04-27T18:33:55.320Z INFO  [793] ThreadId(1) [cloudhsm_provider::hsm1::connection::connection_pool] HSM Connection already in pool 10.4.19.44:2223
aws-cloudhsm > 2023-04-27T18:33:55.320Z INFO  [793] ThreadId(3) [cloudhsm_provider::hsm1::connection::connection_pool::cluster_info_message] Current cluster version is 199391178; incoming cluster version is 199391178

The issue is likely due to the activate operation being carried out via the root user and the underlying CLI log file being created using the root user's ID and group ID. Subsequent attempts by non-root users to execute the CLI result in the warning message and log output being written to the terminal.

Prior to running the command as the ssm-user, the run/ directory looks like this:

sh-4.2$ ls -alR /opt/cloudhsm/run
/opt/cloudhsm/run:
total 4
drwxrwxrwt 2 root root   41 May 25 18:49 .
drwxr-xr-x 7 root root   61 May 25 18:33 ..
-rw-r--r-- 1 root root 3193 May 25 18:49 cloudhsm-cli.log.2023-05-25
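
One way to check this diagnosis is to look for log files in the run directory that other users cannot write to. The following is a minimal sketch and not part of the template. The function name files_not_world_writable is hypothetical, and the check only inspects permission bits; it ignores file ownership and group membership, so it approximates what a non-root user such as ssm-user would encounter.

```python
import os
import stat


def files_not_world_writable(directory):
    """Return sorted names of regular files in `directory` lacking the other-write bit."""
    blocked = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        mode = os.stat(path).st_mode
        # Only regular files matter here; skip subdirectories and special files.
        if stat.S_ISREG(mode) and not mode & stat.S_IWOTH:
            blocked.append(name)
    return blocked


if __name__ == "__main__":
    run_dir = "/opt/cloudhsm/run"  # path taken from the listing above
    if os.path.isdir(run_dir):
        for name in files_not_world_writable(run_dir):
            print(f"{name}: not writable by other users")
```

With the listing above, the root-owned cloudhsm-cli.log.2023-05-25 file (mode -rw-r--r--) would be reported.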

Reproduce

Two methods:

1. Download cloudhsm-cli package

On a suitable Linux instance:

  1. Download the cloudhsm-cli package
  2. As root, execute /opt/cloudhsm/bin/cloudhsm-cli interactive
  3. As a non-root user, execute the same command

2. Use this CloudFormation template

  1. Create a stack but select the option to install the cloudhsm-cli at stack creation.
  2. After stack is created, use Session Manager to access the EC2 client and run the cloudhsm-cli as the ssm-user.

Attempt to reconnect existing keystore when creating cluster from backup

Add support for being able to create a cluster from a backup and automatically reconnect an existing disconnected keystore that was associated with the cluster from which the backup was made.

The template currently supports creation of a new CloudHSM cluster from a backup when pStackScope is set to cluster-and-client-only, but support has not been added to enable the same operation when the scope is set to with-custom-key-store. Currently, under that condition, the template will create a new custom keystore and connect it to the newly instantiated cluster. The desire is to at least have an option to attempt to reconnect an existing keystore to the new cluster.

Update supported AWS Regions

Update the template with the currently supported set of AWS Regions, given that ap-northeast-3 and sa-east-1 now support CloudHSM, whereas the cn-* Regions do not.

Enhance create cluster from backup to highlight dependency on customer CA cert from original cluster

In this section of the doc:

https://github.com/aws-samples/aws-cloudhsm-cloudformation-template#creating-a-cloudhsm-cluster-from-a-backup

Make clear that the customer CA cert associated with the cluster from which the backup was taken must exist in Secrets Manager under the name:

/{system_id}/{backup_cluster_id}/customer-ca-cert

Where {system_id} is the value of the pSystem parameter used for both the stack associated with the original cluster from which the backup was taken and the stack for the new cluster to be created from the backup. Also highlight that the pSystem parameter value must be the same for both stacks.

The original customer CA cert is used during the process of creating a new cluster from a backup to configure the EC2 client with the proper CA cert so that the CloudHSM client tools can interact with HSMs in the newly created cluster.
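
To illustrate the naming convention, here is a minimal sketch that builds the secret name from its two parts. The function name and the sample values are hypothetical; only the name format comes from the issue text.

```python
def customer_ca_cert_secret_name(system_id, backup_cluster_id):
    """Build the Secrets Manager name for the original cluster's customer CA cert."""
    if not system_id or not backup_cluster_id:
        raise ValueError("both system_id and backup_cluster_id are required")
    return f"/{system_id}/{backup_cluster_id}/customer-ca-cert"


# "system1" and "cluster-example" are placeholder values for illustration only.
print(customer_ca_cert_secret_name("system1", "cluster-example"))
# prints /system1/cluster-example/customer-ca-cert
```

Because {system_id} is part of the name, using a different pSystem value for the new stack would produce a name that does not match the stored secret.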

Provide Template URL

Hi, since this is a CloudFormation template, it would be nice to have a template URL associated with the current commit SHA for easier deployment.
The reason is that cloudhsm.yaml is larger than 50 kB, so passing it inline fails with: at 'templateBody' failed to satisfy constraint: Member must have length less than or equal to 51200
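
As a rough illustration of the constraint, the sketch below checks whether a template is small enough to be passed inline. The function name needs_template_url is hypothetical, and the 51200 figure is taken from the error message quoted above.

```python
MAX_TEMPLATE_BODY_BYTES = 51200  # limit quoted in the error message above


def needs_template_url(template_text):
    """Return True when the template is too large to pass inline as the template body."""
    return len(template_text.encode("utf-8")) > MAX_TEMPLATE_BODY_BYTES
```

A deployment script could call this on the contents of cloudhsm.yaml and fall back to a hosted template URL when it returns True.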

Support external cluster cert signing processes

Enable users to use their own cluster cert signing process as an option. Continue with the default behavior of using self-signed CA cert and automating the process of signing the cluster cert.

Provide option to install CloudHSM client SDK v3

Since CloudHSM SDK v3 provides key management commands that are not in SDK v5, add an option to support installing v3 in place of v5. Continue to default to using v5.

While making this change, determine if there's a "latest" reference for the v5 SDK so that the example can avoid pinning to a potentially old version of v5.

Check CloudHSM support for selected subnets/AZs prior to creating cluster

In Regions where more than 3 AZs exist, CloudHSM might not be supported in all AZs. Currently, the template may either encounter an unsupported AZ during creation of the first HSM or during creation of subsequent HSMs. In the latter case, there can be a delay of minutes prior to recognition that an unsupported subnet/AZ has been specified.

Ideally, the template would check up front the specified AZs and issue an error prior to creation of the first HSM when an unsupported subnet/AZ has been specified.
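
The up-front check could look something like the following sketch. It assumes the caller has already resolved each subnet to its AZ and has some way of obtaining the set of AZs in which CloudHSM is supported; how that set is obtained is left open here. The function name and the sample subnet IDs are hypothetical.

```python
def unsupported_subnets(subnet_to_az, supported_azs):
    """Return {subnet_id: az} for subnets whose AZ is not in supported_azs."""
    supported = set(supported_azs)
    return {
        subnet_id: az
        for subnet_id, az in subnet_to_az.items()
        if az not in supported
    }


# Placeholder data for illustration only.
selected = {"subnet-aaa": "us-east-2a", "subnet-bbb": "us-east-2d"}
bad = unsupported_subnets(selected, ["us-east-2a", "us-east-2b"])
if bad:
    print(f"Unsupported subnets/AZs: {bad}")
```

Failing fast on a non-empty result would avoid the delay of minutes described above.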

Key store: Support stack updates

See Editing CloudHSM key store

Implement an update state machine to support:

  • Renaming key store
  • Disconnecting and reconnecting the key store in support of maintenance operations
    • Informing KMS of an update to the kmsuser password in support of common operations
  • Disconnecting the key store and reconnecting it to a different CloudHSM cluster
    • For example, in support of connecting to a CloudHSM cluster created from a backup
  • Changing the option to delete the key store upon stack deletion

Default to CloudHSM cluster only for `pStackScope`

Consider changing the default value of the pStackScope parameter to cluster-and-client-only from the current default of with-custom-key-store. The goal of this change is to make the simpler, faster-to-deploy configuration the default.

The template would still default to 2 HSMs so that it demonstrates the best practice of deploying HSMs to at least 2 AZs.

BYO PKI: Help address situations in which first HSM gets replaced before initialization

HSMs can end up being replaced by AWS due to internal failures and other circumstances. When the first HSM in a cluster gets replaced prior to the cluster being initialized, the cluster private key and consequently the CSR are also replaced. This means that the external/BYO PKI process needs to be restarted to use the new CSR. The longer the BYO PKI process takes, the greater the exposure of the initial HSM to being replaced.

This issue calls for the IaC to be enhanced to help minimize the impact of this situation.

Delete: Optimization when no HSMs exist

During a stack delete operation, detect if zero HSMs exist and skip 30 second wait before checking state of HSMs. For example, if an error occurs during stack create and before any HSMs have been created, a delete operation will currently still wait for 30 seconds before checking the state of HSMs in the cluster and recognizing that none exist.
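
A minimal sketch of the proposed optimization follows. It is not the template's actual code: the function name is hypothetical, and the sleep function is injectable only so the behavior can be exercised without waiting.

```python
import time


def wait_before_hsm_state_check(hsms, delay_seconds=30, sleep=time.sleep):
    """Sleep before polling HSM state, unless the cluster has no HSMs.

    Returns the number of seconds waited.
    """
    if not hsms:
        # Nothing to poll, so skip the wait entirely.
        return 0
    sleep(delay_seconds)
    return delay_seconds
```

Here hsms stands for the list of HSMs reported for the cluster; an empty list short-circuits the 30 second wait.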

Detect and handle HSMs in `DEGRADED` state

Address scenario in which an HSM enters the "degraded" state during creation of the HSM.

During testing of our IaC, we have seen cases in which, during cluster creation, the first HSM to be created doesn't enter the ACTIVE state but enters a degraded state. With the current code, this state is not caught. Eventually, the create operation times out and an auto rollback of the stack is attempted.

When additional HSMs are created beyond the first HSM, any of the create actions could result in an HSM entering the degraded state.

An HSM in a degraded state can be deleted; the HSM won't automatically transition to an ACTIVE state later of its own accord.

This is what the state of such an HSM looks like:

            "Hsms": [
                {
                    "AvailabilityZone": "us-east-2a",
                    "ClusterId": "cluster-quhwuyosn7k",
                    "SubnetId": "subnet-04a76758b58c05023",
                    "EniId": "eni-09422fd6133d93b07",
                    "EniIp": "10.4.14.83",
                    "HsmId": "hsm-n5yjka6nfos",
                    "State": "DEGRADED",
                    "StateMessage": "HSM creation failed. Please delete this HSM and try again."
                }
            ],
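
Detection could be as simple as scanning the cluster description for HSMs in this state. The sketch below assumes a cluster dictionary shaped like the fragment above; the function name degraded_hsms is hypothetical.

```python
def degraded_hsms(cluster):
    """Return (HsmId, StateMessage) pairs for HSMs whose State is DEGRADED."""
    return [
        (hsm.get("HsmId"), hsm.get("StateMessage", ""))
        for hsm in cluster.get("Hsms", [])
        if hsm.get("State") == "DEGRADED"
    ]


# Trimmed copy of the fragment shown above.
cluster = {
    "Hsms": [
        {
            "HsmId": "hsm-n5yjka6nfos",
            "State": "DEGRADED",
            "StateMessage": "HSM creation failed. Please delete this HSM and try again.",
        }
    ]
}
for hsm_id, message in degraded_hsms(cluster):
    print(f"{hsm_id}: {message}")
```

A create workflow could call this while polling and fail immediately, or delete and retry, instead of waiting for the operation to time out.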
