
cfn-artifactory's Introduction

Artifactory

The cfn-artifactory project is a sub-project of the overarching DevOps Tool-Chain (DOTC) project. This project — and its peer projects — is designed to handle the automated deployment of common DevOps tool-chain services onto STIG-hardened, EL7-compatible Amazon EC2 instances and related AWS resources. The first part of this automation consists of CloudFormation (CFn) templates. Included in this project are the following templated activities:

The above currently do not support Artifactory Enterprise or use of AWS AutoScaling functionality. Both are pending features.

Additionally, automation-scripts are provided to automate the deployment of the Artifactory Server software onto the relevant EC2 instances. They have been tested on stand-alone Artifactory deployments but will be updated for use with AWS's AutoScaling service and Artifactory Enterprise as part of supporting those use-cases.

Design Assumptions

These templates are intended for use within AWS VPCs. It is further expected that the deployed-to VPCs will be configured with public and private subnets. All Artifactory elements other than the Elastic Load Balancer(s) are expected to be deployed into private subnets. The Elastic Load Balancers provide transit of Internet-originating web UI requests to the Artifactory node's web-based interface.

Notes on Templates' Usage

It is generally expected that the various individual-service templates will be run via the "parent" template(s). The "parent" template allows for a kind of "one-button" deployment method where all the user needs to worry about is populating the template's fields and ensuring that CFn can find the child templates.

In order to use the "parent" template, it is recommended that the child templates be hosted in an S3 bucket separate from the one created for backups by this stack-set. Neither the template-hosting bucket nor the template files need to be made public: CFn typically has sufficient privileges to read the templates from a private bucket. Use of S3 for hosting eliminates the need to find other hosting-locations or sort out the access-particulars of those hosting locations.
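As a hedged illustration, a parent-template reference to an S3-hosted child looks something like the following (the resource name and the passed parameter are hypothetical):

{
  "Resources": {
    "Ec2NodeStack": {
      "Type": "AWS::CloudFormation::Stack",
      "Properties": {
        "TemplateURL": "https://s3.amazonaws.com/<TEMPLATE_BUCKET>/make_artifactory-PRO_EC2-node.tmplt.json",
        "Parameters": {
          "InstanceType": { "Ref": "InstanceType" }
        }
      }
    }
  }
}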

The EC2-related templates currently require that the scripts be anonymously curlable. The scripts can still be hosted in a non-public S3 bucket, but the scripts' file-ACLs will need to allow public-read. This may change in future releases — likely via an enhancement to the IAM template.
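A rough sketch of staging a script that way (the script, bucket and key names are placeholders):

# Upload a provisioning script with a public-read object ACL; the bucket itself can stay private
aws s3 cp ./setup.sh s3://<TOOLS_BUCKET>/scripts/setup.sh --acl public-read
# Confirm the object is anonymously curlable (expect a 200)
curl -skL -o /dev/null -w '%{http_code}\n' https://s3.amazonaws.com/<TOOLS_BUCKET>/scripts/setup.sh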

These templates do not include Route53 functionality. It is assumed that the requisite Route53 or other DNS alias will be configured separate from the instantiation of the public-facing ELB.

Resultant Service Architecture

The templates and scripts act together to make standing up a new service quick and (reasonably) easy. Application-level configuration (beyond JDBC configuration) is not handled by these templates and scripts.

These templates and scripts are also designed to ensure that Artifactory data is persisted and backed up. This ensures that the Artifactory service can be quickly and easily reconstituted as necessary.

  • As part of this design, the Artifactory artifact-repository is designed to be placed on external, persistent, network-attached storage. The supported storage option is currently limited to NFS (e.g., EFS). Some hooks for use with GlusterFS are included but not well-tested.
  • Artifactory configuration data is expected to be hosted within an external PostgreSQL database (typically hosted via RDS); a sketch of the associated connection settings follows this list.
  • Backup cron-jobs for the Artifactory contents need to be configured within Artifactory itself. These tools include configuration of a "sweep to S3" cron job. If Artifactory is not configured to create backups, nothing will be swept to S3, which will adversely impact the ability to recover or migrate Artifactory data.
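For the PostgreSQL item above, the JDBC connection settings handled by the automation look roughly like the following (an Artifactory db.properties-style sketch; the endpoint, database name and credentials are placeholders):

# Illustrative only -- actual values come from the RDS instance created by the stack-set
type=postgresql
driver=org.postgresql.Driver
url=jdbc:postgresql://<RDS_ENDPOINT>:5432/artifactory
username=<DB_USER>
password=<DB_PASSWORD>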

Closing Notes

  • The ability to destroy and recreate the service at will, while retaining all configuration and hosted data, has been tested. It is expected that most such actions will happen via stack-update or autoscaling actions (manual, scheduled or reactive). In the event that a stack-update results in two instances being "live" simultaneously, it will typically be necessary to restart the new instance after the pre-update instance terminates. This requirement results from Artifactory's built-in data-integrity protections.
  • Due to a bug in the systemd/nfs-client implementation in RHEL/CentOS 7, reboots of instances are highly likely to hang. If a hang occurs, it may be necessary to issue a force-shutdown to clear the hang (paired with a start if the goal was a reboot).
  • The EC2 template runs watchmaker after the EC2 instance launches but before Artifactory has been installed. Watchmaker ensures that the resultant system is STIG-hardened. See the Watchmaker documentation (https://watchmaker.readthedocs.io/) for a description of what Watchmaker does, how it does it, and any additional, environment-specific fine-tuning that may be desired or needed.



cfn-artifactory's Issues

Capture Application Logs via CWA Logging

Currently, CWA logging (where configured at all) only captures the log-files specified in the watchmaker 1.5.6 templates. Need to extend the logging definitions to include the Artifactory application logs (a hedged sketch follows). Probably best to work this issue in coordination with #34.
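A sketch of the kind of collect_list addition involved (standard amazon-cloudwatch-agent logs-collection syntax; the specific log-file and the log-group name are placeholders/assumptions):

{
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          {
            "file_path": "/var/opt/jfrog/artifactory/logs/artifactory.log",
            "log_group_name": "<ARTIFACTORY_LOG_GROUP>",
            "log_stream_name": "{instance_id}"
          }
        ]
      }
    }
  }
}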

Enable CloudWatch Agent install/config in Pro templates

Problem Description:

The Pro templates were created prior to the source-templates being configured to install/configure the CloudWatch Agent. Need to fix this and ensure that the IAM policy is updated similarly to #20's fix for EE.

Expected Behavior:

EC2 instances deploy in CloudWatch-enabled regions with CloudWatch agent installed and configured

Actual Behavior:

CloudWatch agent not installed and configured — whether in supportable regions or not

(Detailed) Steps to reproduce:

(Optional) Fix recommendation:

Borrow logic from EE templates.

ELBs Should Work Whether or Not ACM Is Available

Problem Description:

AWS Certificate Manager (ACM) is not available for use in all regions/partitions. In those regions/partitions, it is necessary to use Identity and Access Management (IAM) to host the SSL certificates used for ELB-based SSL-termination. To maximize portability, the ELB templates should allow use of either ACM- or IAM-hosted SSL certificates.

Expected Behavior:

ELBs support SSL-termination whether or not ACM is available for use in a given region/partition.

Actual Behavior:

ELBs do not currently support SSL-termination when ACM is unavailable for use in a given region/partition.

Affected Components

The following templates need remediation:

  • make_artifactory_ELBv1.tmplt.json
  • make_artifactory_ELBv2.tmplt.json

Fix recommendation:

Add Condition{} and Parameters{} components, plus associated logic within the Resources{} sections, to support selection of ACM- or IAM-hosted SSL certificates when launching an ELB template.
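A minimal sketch of the intended selection logic, using hypothetical parameter and condition names:

{
  "Parameters": {
    "CertHostingService": {
      "Type": "String",
      "AllowedValues": [ "ACM", "IAM" ],
      "Default": "ACM"
    }
  },
  "Conditions": {
    "UseAcmCert": { "Fn::Equals": [ { "Ref": "CertHostingService" }, "ACM" ] }
  }
}

Within the Resources{} sections, the listener's certificate ARN would then be wrapped in an Fn::If that selects between ACM- and IAM-hosted certificate parameters based on the UseAcmCert condition.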

Move Nginx's Upload-Staging Directory

Problem Description:

Uploads of large files fail with HTTP 500 errors

Expected Behavior:

Large files should upload successfully

Actual Behavior:

Uploads of large files (currently > 1GiB, but may vary upwards or downwards over the lifecycle of the hosting EC2) fail with HTTP 500 errors

Steps to reproduce:

  1. Attempt to upload a project that contains files greater than 1GiB in size. Problem has, to date, only manifested for Docker layers
  2. Monitor upload process
  3. Small files in a data-set succeed; "large" files fail with HTTP 500 errors

Fix recommendation:

Issue appears to lie in the Nginx internal reverse-proxy. While the file-size limits (client_max_body_size 0) and the send/receive timeouts (proxy_send_timeout and proxy_read_timeout, respectively) have been removed, Nginx defaults to staging uploads to /var/lib/nginx/tmp/. The root filesystem on the AMIs is too small to accommodate staging of large files. Need to either add the client_body_temp_path setting and explicitly point Nginx to a more-suitable location, or change /var/lib/nginx/tmp/ to a symlink that accomplishes the same thing.

Either way, there may be SELinux implications to address. Will want to add something like the following to the updated automation:

setsebool -P httpd_use_nfs 1    # let httpd-labeled processes (Nginx) use NFS-backed paths
semanage fcontext -a -t "httpd_sys_rw_content_t" "/path/to/upload_temp(/.*)?"
restorecon -R /path/to/upload_temp    # apply the newly-recorded label to the staging path
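A minimal sketch of the client_body_temp_path option (the path shown is an assumption; it must exist and be writable by the Nginx user):

# Stage large request-bodies somewhere other than the root filesystem
client_body_temp_path /var/opt/nginx/client_body_temp 1 2;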

Ensure Site-Logo Location is *NOT* a Directory

Problem Description:

Artifactory 6.x creates the custom site-logo location as a directory on a new build. If the config DB is already referencing a custom logo file, this breaks the ability to set/reset the logo to a desired image-file

Expected Behavior:

Rebuilds of Artifactory should either persist the custom logo across rebuilds or allow it to be reset via the web UI.

Actual Behavior:

After a rebuild, the custom logo shows a broken-link image on the page and cannot be reset via the admin UI

(Detailed) Steps to reproduce:

Rebuild a node (e.g., via ASG) after setting custom logo

(Optional) Fix recommendation:

Until vendor fixes the behavior:

  • (Minimum) ensure ${ARTIFACTORY_HOME}/etc/ui/logo does not exist as a directory (a shell sketch of this check follows the list)
  • (Complete) ensure that the ${ARTIFACTORY_HOME}/etc/ui/logo file is persisted across builds
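A minimal sketch of the "(Minimum)" remediation, assuming it runs from the provisioning scripts (the variable name follows the vendor's ARTIFACTORY_HOME convention):

# If the logo path was (re)created as a directory, remove it so a logo file can be set/reset
LOGO_PATH="${ARTIFACTORY_HOME}/etc/ui/logo"
if [[ -d "${LOGO_PATH}" ]]
then
   rm -rf "${LOGO_PATH}"
fi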

Ensure IAM Role provides permissions to the Bucket hosting the CW-agent

Problem Description:

Current IAM role doesn't provide access to requested CW-agent location

Expected Behavior:

IAM role provides access to requested CW-agent location

Actual Behavior:

Without manual edit of IAM role, fetch of S3-hosted CW-agent fails

(Detailed) Steps to reproduce:

Launch an EC2 instance referencing the IAM role created by this project's IAM stack: provisioning fails with a permission-denied error during the CW-agent fetch step

(Optional) Fix recommendation:

Add CW-agent bucket-name to IAM policy-sets.

[note: needs explicit s3:GetObject permission set; s3:* is not sufficient]

Adding the amazoncloudwatch-agent bucket name to the same permission-block granting access to SSM downloads should be sufficient.
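A hedged sketch of the sort of policy-statement intended (bucket names are illustrative and should be verified for the target region/partition):

{
  "Effect": "Allow",
  "Action": [ "s3:GetObject" ],
  "Resource": [
    "arn:aws:s3:::amazoncloudwatch-agent/*",
    "arn:aws:s3:::amazoncloudwatch-agent-<REGION>/*"
  ]
}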

Implement S3 Cost Control Defaults

The solution currently leverages S3 for hosting the service's daily backups. No lifecycle tiering or expiration is enabled. It would probably be useful to add a lifecycle policy similar to:

{
    "Rules": [
        {
            "Status": "Enabled",
            "NoncurrentVersionExpiration": {
                "NoncurrentDays": 180
            },
            "NoncurrentVersionTransitions": [
                {
                    "NoncurrentDays": 3,
                    "StorageClass": "GLACIER"
                }
            ],
            "Filter": {
                "Prefix": "Backups/"
            },
            "Expiration": {
                "Days": 45
            },
            "AbortIncompleteMultipartUpload": {
                "DaysAfterInitiation": 7
            },
            "Transitions": [
                {
                    "Days": 5,
                    "StorageClass": "GLACIER"
                }
            ],
            "ID": "BackupTiering"
        }
    ]
}
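If applied outside of the CFn templates, the above could be attached with something like the following (the bucket name is a placeholder):

aws s3api put-bucket-lifecycle-configuration --bucket <BACKUP_BUCKET> --lifecycle-configuration file://lifecycle.json

Within the templates themselves, the equivalent would be a LifecycleConfiguration property on the backup bucket's AWS::S3::Bucket resource.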

Make Sure Buckets Are Instrumented for Inventory Analysis

Probably want to attach an inventory policy to the backup S3 bucket. Something similar to the following:

{
    "InventoryConfiguration": {
        "Schedule": {
            "Frequency": "Daily"
        },
        "IsEnabled": true,
        "Destination": {
            "S3BucketDestination": {
                "Prefix": "StorageReports",
                "Bucket": "arn:aws:s3:::<DESTINATION_BUCKET_NAME>",
                "Format": "CSV"
            }
        },
        "OptionalFields": [
            "Size",
            "LastModifiedDate",
            "StorageClass"
        ],
        "IncludedObjectVersions": "Current",
        "Id": "ArtifactoryLayout"
    }
}
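As with the lifecycle policy, this could be attached out-of-band with something like the following (note that the --inventory-configuration argument takes the inner InventoryConfiguration object; names are placeholders):

aws s3api put-bucket-inventory-configuration --bucket <BACKUP_BUCKET> --id ArtifactoryLayout --inventory-configuration file://inventory.json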

Though this is probably of less-immediate importance than the analytics enablement.

Ensure all EC2-Deploying Templates Are Using A Common Baseline

Since the initiation of this project, the source watchmaker templates for EC2 have been continually upgraded. The Artifactory project's templates have not generally been re-baselined to capture the newer functionality found in the watchmaker templates.

Each EC2-deploying template should be rebaselined against the watchmaker 1.5.6 baseline.

Investigate Adding Support for t3 and m5 Instance-Types

Problem Description:

AWS has released new instance types that might better align to some deployment-scopes

Expected Behavior:

Support t3 and m5 instance-types where possible

Actual Behavior:

The templates do not currently support t3 or m5 instance-types

(Optional) Fix recommendation:

Update template logic to allow for t3 and m5 instance-types

Update Templates to Make Partition-Agnostic

Problem Description:

Templates may not be sufficiently portable if their ARNs hardcode the :aws: partition-element (they won't work in specialty partitions like aws-cn). See the AWS::Partition pseudo-parameter documentation.

Expected Behavior:

All templates should work in all AWS partitions

Actual Behavior:

Some templates will fail if not launched into the default/commercial AWS partition:

  • make_artifactory-Enterprise_IAM-instance.tmplt.json
  • make_artifactory_ELBv1.tmplt.json
  • make_artifactory_instance_role.tmplt.json

Fix recommendation:

Update the enumerated template-files to change all "arn:aws:…" string-literals to something more like:

            {
              "Fn::Join": [
                ":",
                [
                  "arn",
                  { "Ref": "AWS::Partition"},
                  …,
                  …
                ]
              ]
            }
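Alternatively (and assuming a hypothetical BackupBucketName parameter for illustration), an Fn::Sub of the same pseudo-parameter is more compact:

            { "Fn::Sub": "arn:${AWS::Partition}:s3:::${BackupBucketName}/*" }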

Regression: Fetch Of Group Admin-Keys Files Fails Under FIPS Mode

Problem Description:

When operating under FIPS mode, launch of either Artifactory-EE EC2 template fails during the fetch of the Admin-Keys file with an error of the form:

ToolError: Failed to retrieve https://s3.amazonaws.com/<BUCKET>/<KEY>/<GROUPKEY_FILE>:
error:060800A3:digital envelope routines:EVP_DigestInit_ex:disabled for fips

Steps to reproduce:

  1. Launch either Artifactory-EE EC2 template
  2. Wait for launch to fail
  3. Login to instance
  4. View cfn-init.log file
  5. Find previously shown error-snippet

Fix recommendation:

Add launch-logic to update the default hash-method in the /usr/lib/python${PYVERS}/site-packages/cfnbootstrap/util.py file. Can use a simple sed-type mechanism:

sed -i '/^[ \t][ \t]*self._etag/s/etag$/None/' "/usr/lib/python${PYVERS}/site-packages/cfnbootstrap/util.py"

Update "Sweep Backups to S3" method to be more performant

Problem Description:

Current Artifactory backups-design causes sub-optimal I/O due to insufficient uniqueness in the uploaded objects' keys.

Expected Behavior:

Better upload, download and delete speeds (closer to those outlined in AWS documentation)

Actual Behavior:

Operations that require enumeration of a significantly-sized portion of the S3-hosted objects are slow

(Detailed) Steps to reproduce:

Perform an upload or download of an Artifactory-produced export-structure (or attempt to replicate or delete an entire S3 bucket)

(Optional) Fix recommendation:

Change the current backup method from an aws s3 sync of the Artifactory-produced export to a tar cf - <ARTIFACTORY_EXPORT> | aws s3 cp - s3://<BUCKET>/<KEY>/<TAR_FILE> streaming method (see the sketch below).
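A rough sketch of that streaming method (paths, bucket and key names are placeholders):

# Stream a single tarball of the export to S3 instead of syncing many small objects
tar cf - <ARTIFACTORY_EXPORT> | aws s3 cp - s3://<BUCKET>/<KEY>/<TAR_FILE>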

Add a "Tools Folder" Parameter to EC2 and ASG Templates

Problem Description:

Templates currently allow selection of a "tools" bucket but not a root-folder. This means that multiple instantiations using a common "tools" bucket will create (potentially breaking) configuration collisions.

Desired Behavior:

Allow each deployment from a common "tools" bucket to be fully isolated from another deployment

Actual Behavior:

Currently, if two builds reference the same "tools" bucket, they will pull the same Licenses, SupportFiles and Templates directories. Need to introduce a "deployment root" folder so that each discrete build has non-overlapping Licenses, SupportFiles and Templates directories.

(Optional) Fix recommendation:

Add a "Tools Folder" parameter to allow per-instantiaion "rooting" of a deployment even if multiple deployments use the same "tools" bucket.

Update cloud-init-per Logic for compatibility with 7.6

Problem Description:

With EL 7.6's rebasing of cloud-init, the current cloud-init-per declaration in UserData results in the secondary EBS being mkfsed each time the instance boots.

Expected Behavior:

Secondary EBS is only mkfsed during initial boot

Actual Behavior:

Secondary EBS being mkfsed each time the instance boots.

Fix recommendation:

Update UserData. Change:

"  - cloud-init-per instance mkfs-appvolume mkfs -t ext4 ",

To:

"  - cloud-init-per instance appvolume mkfs -t ext4 ",

Synchronize (or consolidate) Pro and EE Templates

The PRO templates were developed months prior to the EE templates. For maximum deployment-flexibility, the automation for both PRO and EE needs to be equally deployable (with maximum configurational equivalence: see other issues around CWA logging, etc.). See #31 for common baselines to target.

Ensure all EC2 Templates support GenFive instance types

Currently, only the make_artifactory-PRO_EC2-node.tmplt.json template enables support of fifth-generation instance types (c5*.*, m5*.*, t3.*, etc.). Borrow the GenFive logic from the make_artifactory-PRO_EC2-node.tmplt.json template and apply to the make_artifactory-EE_EC2-node.tmplt.json template (and any templates created for #32).

Feature Request: Use custom DB parameter group enhancement

Problem Description:

It may be desirable to offer the ability to customize database tuning-options. Need the DB to use a custom — rather than the currently used RDS-default — parameter group.

Expected Behavior:

Ability to tune DB behavior via DB parameter-group settings

Actual Behavior:

Current use of RDS-default DB parameter-group precludes tuning customizations

(Detailed) Steps to reproduce:

Deploy RDS DB from existing templates

(Optional) Fix recommendation:

Add an AWS::RDS::DBParameterGroup resource into the current RDS templating (a sketch follows).
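A hedged sketch of such a resource (the family and the single parameter shown are illustrative, not recommendations):

"ArtifactoryDbParmGroup": {
    "Type": "AWS::RDS::DBParameterGroup",
    "Properties": {
        "Description": "Custom parameter-group for the Artifactory PGSQL instance",
        "Family": "postgres9.6",
        "Parameters": {
            "log_min_duration_statement": "2000"
        }
    }
}

The templates' AWS::RDS::DBInstance resource would then point at it via the DBParameterGroupName property.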

Tune JVM memory allocation based on hosting-EC2's memory configuration (Pro templates)

Problem Description:

Currently, the Artifactory application is deployed with the default 2GiB/512MiB (max/min) memory configuration. When deployments are shifted to larger instance-sizes, these settings should better reflect the increased memory-capacities.

Expected Behavior:

JVM is allocated memory based on amount available within the hosting EC2

Actual Behavior:

JVM is allocated the default (2GiB/512MiB) values regardless of the amount available within the hosting EC2

Steps to reproduce:

Launch the stack/stack-set, log in to the resultant EC2, and note that Artifactory's Java process is running with -Xmx2g

Fix recommendation:

Update the application-installation script to update the ${ARTIFACTORY_HOME}/bin/artifactory.default file's JAVA_OPTIONS parameter to a more sane value (a rough shell sketch follows below). Need to determine an optimal (integer) percentage of:

  • /proc/meminfo's MemTotal value
  • /proc/meminfo's MemFree value
  • dmidecode -t 16's Maximum Capacity value
  • dmidecode -t 17's Size value

To allocate.

See tuning guide
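A rough shell sketch of the MemTotal-based option (the half-of-MemTotal max-heap and quarter-of-max min-heap split is an assumption, not a vendor recommendation):

# Compute heap sizes from the instance's physical memory (max = half of MemTotal)
MEMTOTAL_KB=$(awk '/^MemTotal:/{ print $2 }' /proc/meminfo)
HEAP_MAX_MB=$(( MEMTOTAL_KB / 1024 / 2 ))
HEAP_MIN_MB=$(( HEAP_MAX_MB / 4 ))
# Emit the JAVA_OPTIONS line the install-script would write into ${ARTIFACTORY_HOME}/bin/artifactory.default
printf 'export JAVA_OPTIONS="-Xms%sm -Xmx%sm"\n' "${HEAP_MIN_MB}" "${HEAP_MAX_MB}"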

Update PGSQL RDS Templates

Since initial authoring, AWS has updated available PGSQL versions. Per today's (2018-12-10) notifications, AWS is recommending updating running versions to at least 9.6.9.

AWS's currently-supported versions are (application support may vary: test if moving to a higher major):

10.4
10.3
10.1
9.6.10
9.6.9
9.6.8
9.6.6
9.6.5
9.6.3
9.6.2
9.6.1
9.5.14
9.5.13
9.5.12
9.5.10
9.5.9
9.5.7
9.5.6
9.5.4
9.5.2

Add Log-Rotation (Since JFrog Didn't Bother)

Problem Description:

On long-running systems, ${ARTIFACTORY_HOME}/logs will cause its filesystem to fill up. Implement log-rotation to prevent this.

Expected Behavior:

Vendor should include automated log-rotation

Actual Behavior:

On long-running instantiations, logs fill up their directory.

Fix recommendation:

Add a logrotate.d file similar to:

/var/opt/jfrog/artifactory/logs/catalina/catalina.out
/var/opt/jfrog/artifactory/logs/catalina/catalina.[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9].log
{
        copytruncate
        daily
        rotate 7
        compress
        missingok
        minsize 5M
}
