enterprise-cmcs / macpro-quickstart-serverless
License: Other
Elasticsearch can occasionally take more than 30 minutes to be created. The cause of this isn't known and isn't being pursued.
The 'no output' timeout for the serverless deploy is set to 30 minutes right now. Bumping it to 60 would cover all of the long build time scenarios seen so far.
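Assuming the timeout in question is CircleCI's per-step no_output_timeout, the bump might look like this (the step name and command are placeholders, not the real config):

```yaml
# Hypothetical CircleCI step fragment; only no_output_timeout is the point here.
- run:
    name: deploy
    command: ./deploy.sh
    no_output_timeout: 60m
```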
AC:
There are several variables needed to deploy to AWS. Some are optional, some are not.
This should all be documented in the README and the config.yml
AC:
The lock script occasionally fails to work as intended.
This is what is believed to be happening
The impact is minimal at the moment, since the effect is running tests against an environment that has just begun deploying. The tests would finish before the deployment really starts, and the tests are just smoke tests anyway.
A fix should be introduced though. Maybe the 'check the api for no earlier jobs' has to pass twice in a row? This might be difficult to test.
AC:
You can click and view the current attachment. There could be a button for the user to download it to disk.
AC:
Eventually that's gonna bite... so at least we'll capture it here? While I was PR-ing the branch to move to Actions, I poked around to see if it was easy enough for me to "just update". When less is at stake, I will update and test, though I am happy to let someone else do this, too :)
The set-env command is deprecated and will be disabled soon:
A moderate security vulnerability has been identified in the GitHub Actions runner that can allow environment variable and path injection in workflows that log untrusted data to stdout. This can result in environment variables being introduced or modified without the intention of the workflow author. To address this issue we have introduced a new set of files to manage environment and path updates in workflows.
Here's what I found as a solution:
Setting an environment variable
echo "{name}={value}" >> $GITHUB_ENV
Creates or updates an environment variable for any actions running next in a job. The action that creates or updates the environment variable does not have access to the new value, but all subsequent actions in a job will have access. Environment variables are case-sensitive and you can include punctuation.
Example
echo "action_state=yellow" >> $GITHUB_ENV
Running $action_state in a future step will now return yellow
Multiline strings
For multiline strings, you may use a delimiter with the following syntax.
{name}<<{delimiter}
{value}
{delimiter}
Example
In this example, we use EOF as a delimiter and set the JSON_RESPONSE environment variable to the value of the curl response.
steps:
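The example above was truncated. A rough reconstruction of the pattern it describes, sketched so it runs outside of Actions too (GITHUB_ENV falls back to a temp file, and a literal value stands in for the curl response):

```shell
# Hedged sketch of the multiline-value pattern: EOF is the delimiter, and the
# value between the markers becomes the JSON_RESPONSE environment variable.
GITHUB_ENV="${GITHUB_ENV:-/tmp/github_env_demo}"
: > "$GITHUB_ENV"

# Stand-in for the curl response from the surrounding text.
value='{"status":"yellow","note":"multi
line value"}'

{
  echo "JSON_RESPONSE<<EOF"
  printf '%s\n' "$value"
  echo "EOF"
} >> "$GITHUB_ENV"

cat "$GITHUB_ENV"
```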
Adding a system path
Prepends a directory to the system PATH variable for all subsequent actions in the current job. The currently running action cannot access the new path variable.
Example
echo "/path/to/dir" >> $GITHUB_PATH
The roles created by the uploads-scan service don't allow for iam_path or iam_permissions_boundary to be set.
These are typically not needed, but in Greenfield accounts, we need to specify those. This is a pattern we have in other services.
Note: This quickstart isn't deploying in a Greenfield account, so this issue understandably went unnoticed.
AC:
Adding PII scanning to the uploaded objects has been mentioned and would be a great addition to the quickstart.
This is targeted as a use case for AWS Macie, but this issue won't prescribe the technology.
It also won't prescribe the exact actions to take upon finding potential PII.
What's important for this quickstart is that the plumbing is there, and there's an existing framework to scan for PII, trigger an event on findings, and take action. The details of those actions aren't super important. Maybe send an email, tag the object, or delete it.
Do what you think is best. Keep it flexible.
AC:
This stems from a CMS specific requirement.
In CMS Greenfield accounts, admins can only create IAM objects under a "/some/iam/path" IAM path.
The current codebase can't support this, as the default "/" path is implied everywhere.
The quickstart should be updated to optionally take a global iam path; that's the idea.
AC:
A WAF can be put in front of CloudFront to restrict access by IP range. This can be used to essentially allow only VPN traffic to reach the site. It would be nice if the quickstart had this pattern laid out, ideally as a configurable option. Restricting site access to a VPN is a common ask.
AC:
By default, serverless creates a deployment bucket for each service. This results in a lot of buckets.
It seems preferable to create a single bucket for all services to use.
AC:
A profile or settings page should be added. A user should be able to update or add any mutable user attributes, like name and phone number.
AC:
S3 buckets typically need to be emptied prior to deletion. The destroy.sh script currently finds the S3 bucket's name, empties it using the AWS CLI, then continues with the delete.
It works, but there's a plugin to do this: https://www.serverless.com/plugins/serverless-s3-remover
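A sketch of what the plugin's configuration might look like in a service's serverless.yml; the bucket reference here is a placeholder, not the service's actual bucket name:

```yaml
# Hypothetical serverless.yml fragment for serverless-s3-remover.
plugins:
  - serverless-s3-remover
custom:
  remover:
    buckets:
      - my-attachments-bucket-${self:custom.stage}
```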
AC:
CircleCI supports caching of dependencies. This should be used for node modules in the name of speed. The .serverless directories should be investigated for caching too, but that'll be another issue.
AC:
Creating ES indices and configuring mappings is understood to be a common workflow for systems using ES.
Adding a pattern for creating an index and configuring mappings would be a solid addition to the quickstart.
It's not totally clear where this config should ideally live.
At the moment, a separate es config service that does configuration via lambda makes sense.
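For illustration, a hedged sketch of what such a config lambda would ultimately do: create an index with explicit field mappings via the Elasticsearch REST API. The index name and fields are illustrative, and the curl call is left commented so the snippet is safe to run anywhere:

```shell
# Hypothetical mapping for an index; nothing here is prescriptive.
mapping='{
  "mappings": {
    "properties": {
      "email":     { "type": "keyword" },
      "createdAt": { "type": "date" }
    }
  }
}'
echo "$mapping"

# An es-config lambda (or one-off script) would then run something like:
# curl -X PUT "$ES_ENDPOINT/amendments" -H 'Content-Type: application/json' -d "$mapping"
```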
AC:
Seven buckets per deployment kind of seems like a lot?
The following might be useful --
If you have defined multiple buckets, you can limit your deployment to a single bucket with the --bucket option:
$ sls s3deploy --bucket my-bucket
The ultimate goal is not running into the account bucket limit... it will help a lot to move to a CI where the destroy happens automatically, but even with decent cleanup, it only takes 14 branches to blow the limit... I count 6 people already making branches, with at least 3 more onboarding this quarter...
Let's please optimize the bucket usage and cleanup :)
Upon first logging into the site, the Profile/logout drop down does not display properly.
The clickable nav item is intended to display the logged-in user's email. But, when logging in, the element is rendered before the user is authenticated, so it appears blank.
Screenshot of home page after first logging in (note the top right drop down menu with no label, just an arrow):
Screenshot of home page after refreshing:
Two options:
When first logging into the site, users are shown a blank/confusing drop down for their profile and settings. While there is still a dropdown arrow, it's not totally obvious that there is a menu available or what it does.
This is considered a low impact bug, as the menu label only behaves incorrectly once, upon first logging in.
Log in to the app. Use Incognito/Private mode if there's a possibility you have a live session.
The top right menu will be blank.
Refresh the page and your email will appear.
Screenshots above.
The WAF logs for CloudFront and APIGW should be piped to S3 for storage and analysis. This should allow for insights into what requests are getting denied, what IPs are hammering the site... take your pick.
AC:
This is simple but an issue is good for tracking purposes.
The deploy workflow for GitHub Actions is currently named config.yml.
That should be renamed to deploy.yml.
Likewise, the workflow itself should be renamed from Build to Deploy.
This could temporarily break the GitHub lock script if the right order of events happened, so I want to include this as part of the other breaking changes in release 1.x.
AC:
The quickstart currently supports specifying AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY or specifying CTKEY_USER/CTKEY_PASSWORD.
CTKEY is a tool that lets you trade a fixed CloudTamer username/password for temporary Amazon credentials.
The authors' use case for using CloudTamer programmatically (CTKEY) is going away, and the support for it should too.
It complicates masking sensitive variables, and the workflow was never broadly applicable.
AC:
Pull request builds should be required on every pull request.
Right now, if you push a branch matching cci-, it will be built. If you make a pull request for one of these branches, it will be required to pass its status check (the cci build).
But branches not matching cci- have no build.
This was seen as a good thing, since we can get code in quickly without a build (like README updates). Also, the build process for a new env takes significant time (30 minutes or so) because of Elasticsearch.
However, we could have a designated PR environment... each PR kicks off a deployment of that environment and locks it, so only one PR runs at a time. So, a persistent PR environment. This takes care of the speed issue for new environments... the env isn't new. There will be times, when changing a good deal of infrastructure, that this becomes problematic and slow. I think we take those as they come. Most changes will be to the UI code or the fast-deploying lambdas.
AC:
This isn't a problem yet, but has been seen when implementing Okta which uses a cognito custom domain. Can/should be fixed ahead of time. found by @sethsacher
An error occurred: UserPoolDomain - The domain name contains an invalid character. Domain names can only contain lower-case letters, numbers, and hyphens. Please enter a different name that follows this format: ^[a-z0-9](?:[a-z0-9\-]{0,61}[a-z0-9])?$ (Service: AWSCognitoIdentityProviderService; Status Code: 400; Error Code: InvalidParameterException; Request ID: 013e0b42-cac6-4420-ab34-27720db86024; Proxy: null).
I recommend adding to the branch verification in GitHub Actions and CircleCI to fail if there's an uppercase character.
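A hedged sketch of that verification: fail when the branch name contains an uppercase character, since Cognito domains must be lowercase. The function name and wiring are illustrative:

```shell
# Fail the build on uppercase branch names (Cognito domains must be lowercase).
check_branch() {
  if [[ "$1" =~ [A-Z] ]]; then
    echo "ERROR: branch name '$1' contains uppercase characters" >&2
    return 1
  fi
  echo "ok: $1"
}

check_branch my-feature          # → ok: my-feature
check_branch My-Feature || true  # prints the error and would fail the build
```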
AC:
User-uploaded S3 objects should be scanned for viruses/malware/whatever.
This should probably be driven by an S3 event triggering a lambda-based scan. Tool TBD; ClamAV is the only tool that's been mentioned by name.
Perhaps a user should only be allowed to download a file if it's been scanned.
Perhaps the file should be removed if it's found to be a risk.
Perhaps the admin team and/or the user should be notified if it was deemed a risk.
There's wiggle room here to pick what's best.
AC:
Right now the user's email is attached to the form, and they are notified on create/update/delete.
There should be a space to add additional emails that should be notified.
AC:
The CCI secret/env-specific injection pattern fails to work properly when building branches beginning with a number.
AC:
The elasticsearchIndexer function currently uses dynamo-to-elasticsearch-nodejs. It's a publicly available package specifically made for just that... flowing data from a dynamodb stream to elasticsearch.
This should be replaced for two reasons:
AC:
Make the deployment fast. Caching, package.json consolidation, conditional npm installs, npm ci.
Probably skip the "only deploy what's changed since the last success"... serverless only updates stacks if it needs to, so if the total deployment is fast enough, we won't need that logic (I don't want that logic either, it's a mess).
AC:
This is observed on branches where INFRASTRUCTURE_TYPE=production is set (currently only master).
Serverless: WarmUp: app-api-master-delete
Serverless: Bundling with Webpack...
⚠ 「copy-webpack-plugin」: unable to locate '_warmup' at '/home/runner/work/macpro-quickstart-serverless/macpro-quickstart-serverless/services/app-api/node_modules/serverless-bundle/src/_warmup' (warning repeated five times)
The warmup plugin is only used on branches where INFRASTRUCTURE_TYPE is set to production; this is thought to explain why this issue is only seen on master.
This is seen as the main delta between master and dev branches, and could explain the 20-30 second speed diff.
AC:
CloudTamer service users must use CTKey to exchange their CloudTamer identity for temporary AWS CLI access keys. This should be implemented to stay aligned with the original consumers of this repo, unfortunately, but hopefully it can be made optional.
Maybe if CT info is not detected, the CTKey exchange is skipped.
AC:
Currently, on deployment, the entire ui cloudfront distribution is invalidated. So we upload our static archive to s3, then invalidate the whole cache.
This should be refactored; the suggested way is:
Create a lambda to do individual object invalidations. Set that lambda to trigger on each new or updated file in the s3 bucket via an s3 event. That lambda invalidates that object. So only the objects that need to be invalidated are. And, the pipeline can quit concerning itself with invalidations.
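A hedged sketch of the per-object approach: the lambda receives the S3 object key from the event and invalidates just that path. Only the path construction runs here; the aws call and the DISTRIBUTION_ID/OBJECT_KEY names are illustrative assumptions, shown commented:

```shell
# CloudFront invalidation paths must begin with '/', so prefix the object key.
key_to_invalidation_path() {
  printf '/%s\n' "${1#/}"
}

key_to_invalidation_path index.html      # → /index.html
key_to_invalidation_path /css/app.css    # → /css/app.css

# The lambda (or a script) would then run something like:
# aws cloudfront create-invalidation --distribution-id "$DISTRIBUTION_ID" \
#   --paths "$(key_to_invalidation_path "$OBJECT_KEY")"
```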
AC:
The scan definition updater is run on a schedule.
On deployments of new environments, there is a time where the environment is up and usable, but the scan definition uploader hasn't run yet.
This breaks the app; all scans fail saying 'key doesn't exist'... that's the definitions key.
Although it's a second-or-two time hit, simply invoking the definition uploader lambda on deploy should fix this. The pattern that's been used is a serverless-plugin-scripts hook calling serverless invoke -f at deploy:finalize. Although this specific code was removed later on, you can see the pattern here: https://github.com/CMSgov/macpro-quickstart-serverless/pull/36/files
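The hook pattern described above might look like this in the service's serverless.yml; the function name avDownloadDefinitions is an assumption, not the service's actual function:

```yaml
# Hypothetical serverless.yml fragment: invoke the uploader when deploy finishes.
plugins:
  - serverless-plugin-scripts
custom:
  scripts:
    hooks:
      deploy:finalize: serverless invoke --function avDownloadDefinitions --stage ${self:custom.stage}
```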
AC:
This is entirely subjective.
We intend to use Nightwatch for browser UI testing.
Consumers of this repo can surely use any testing framework they'd like, but for our purposes, Nightwatch will be the standard unless there's a need to divert.
TestCafe and Nightwatch are both simple npm packages, so replacing one with the other should be relatively straightforward.
AC:
In the deploy GitHub Action, there are two tasks related to running the Nightwatch tests: first the test itself, and then an action to upload the results of that test to a store for that given run. By default, failed actions halt execution of future tasks, so in the case of a failure nothing will be uploaded.
I believe this means that results like screenshots of a test failure will not be recoverable from a failed CI run. We should sort out how to perform that upload as part of the same action, or, if there is some way to force the task to run regardless, we could do that.
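One option for forcing the upload step to run is GitHub Actions' `if: always()` conditional. A sketch, where the step names, action version, and paths are placeholders rather than the workflow's actual values:

```yaml
# Hypothetical workflow fragment: upload artifacts even when tests fail.
- name: Run Nightwatch tests
  run: npm test
- name: Upload test results
  if: always()   # run this step even if the test step above failed
  uses: actions/upload-artifact@v2
  with:
    name: nightwatch-results
    path: tests/output
```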
Right now, all branches/stages get deployed to the same amazon account.
We want the ability to deploy certain branches (stages) to specific amazon accounts.
AC:
Allow users to upload multiple attachments.
AC:
A browser smoke test would be a good addition. The application endpoint should be fetched and a test suite run.
AC:
See details here: #71 (comment)
AC:
The CCI builds deploy target environments and will soon run tests.
We want to block concurrent builds.
It seems it's not super straightforward, but there are helper scripts publicly available on GitHub.
AC:
There should be a suggested pattern for promoting code.
Merges from master -> qa/staging -> prod should be simple and effective for now.
What should probably be done is build EVERY branch. Simple as that. And perhaps add a "don't build branches that start with nocci-" rule so we have a way to get README updates in without building.
AC:
SES_SOURCE_EMAIL_ARN and IAM_PERMISSIONS_BOUNDARY_ARN are optional environment variables during deployment.
Each is of a predictable format given the email address / boundary policy and the account id.
Instead of requiring the user to set an account-specific ARN, they should only be required to supply an account-agnostic name (like an email address), and the code should generate the ARN based off the known format.
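A sketch of the idea for the SES case, assuming the standard SES identity ARN format arn:aws:ses:&lt;region&gt;:&lt;account-id&gt;:identity/&lt;email&gt;; all values here are placeholders:

```shell
# Derive the SES identity ARN from the region, account id, and email address.
build_ses_arn() {
  local region="$1" account_id="$2" email="$3"
  printf 'arn:aws:ses:%s:%s:identity/%s\n' "$region" "$account_id" "$email"
}

build_ses_arn us-east-1 123456789012 no-reply@example.com
# → arn:aws:ses:us-east-1:123456789012:identity/no-reply@example.com
```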
AC:
CMS' CircleCI does not support v2.1 out of the box.
The workflow to enable it is unclear; we think it's probably a Jira ticket to somewhere.
For now, the config.yml should be converted to work with 2.0 to keep this targeted to CMS use cases.
AC:
Our CircleCI use case is no more, and I'd prefer we not try to maintain support for it.
AC:
Stage deletion should be automated as much as possible.
Since all branches (excluding those that begin with skipci-) are deployed automatically, deletion of those environments should occur when the corresponding branch is deleted.
Manually deployed stages, as in stages deployed outside of the CI system (not common right now), will require a person to run the destroy script... this issue is not for automating the cleanup of those environments, just the envs built by the CI system.
AC:
set-env is being deprecated. See: https://github.blog/changelog/2020-10-01-github-actions-deprecating-set-env-and-add-path-commands/
I haven't looked at the details here, but there are likely some steps to take.
AC:
Background:
We have a list of deploy time variables to inject. Things like SES_SOURCE_EMAIL_ADDRESS, ROUTE_53_DOMAIN_NAME,
INFRASTRUCTURE_TYPE, etc. These are currently injected as GitHub Secrets.
On projects where the core of this quickstart is being used, that list is growing rapidly. In particular, projects that make use of VPC based resources suddenly have a lot of info to inject: vpc id, private subnet ids, public subnet ids. The current setup dictates all of this information go into GitHub Secrets.
Now take that list and multiply it, since VPC info will vary for some branches, particularly val/prod.
The problem:
Proposed solution:
All non sensitive deploy configuration should be migrated to AWS SSM Parameter Store (this is often referred to as just 'SSM' in conversation and writing, including this issue).
GitHub Actions will need AWS access keys to authenticate to amazon, but once authenticated, it may fetch deploy information from SSM.
Take the example of VPC information... That's account specific information, and it is much more appropriate to keep that information inside the AWS account (SSM) than to give it to github secrets to inject. The build may authenticate to the correct Amazon account, and then pull that VPC info from a standard place in SSM.
Or look at the example of an SES source email address... That's a value that is not bound to an account, but can vary environment to environment, branch to branch. In this case, an SSM naming convention can be used. A default value may be published at /configuration/default/sesSourceEmailAddress, but it can be overridden with a branch-specific variable at /configuration/mybranchname/sesSourceEmailAddress. The Serverless code may then simply fetch that value as ${ssm:/configuration/${self:custom.stage}/sesSourceEmailAddress, ssm:/configuration/default/sesSourceEmailAddress}. This has the effect of fetching the branch-specific var if it exists, and falling back to the default if it doesn't.
In a nutshell, aside from the AWS keys, all of this info can be much better organized, maintained, and fetched using SSM.
Note: This is predicated on authenticating to Amazon, so the AWS access key id and secret access key will need to stay in GitHub Secrets. That's been said a few times. What hasn't been addressed is 'what about other sensitive information?'. Currently, we don't have sensitive info to supply outside of the AWS keys. But if there were some, could we store that in SSM too? Answer: yes, but the masking provided by injecting true secrets as GitHub Secrets prevents people from printing sensitive information. That masking is not provided out of the box by serverless, as it doesn't know the value is a secret.
So, this is the reasoning behind billing this issue as 'move all non-sensitive info to SSM'... it is our current understanding that we are better off keeping any future sensitive information as true GitHub Secrets. There may be a way to denote a serverless variable as secret; we haven't looked into it. And finally, we might be better off just not injecting secret strings into the build and avoiding the issue altogether.
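The branch-then-default naming convention above can be sketched as a pair of path builders; the path layout mirrors the examples in this issue, and the actual SSM lookups are shown commented since they need AWS access:

```shell
# Build the branch-specific and default SSM parameter paths.
branch_path()  { printf '/configuration/%s/%s\n' "$1" "$2"; }
default_path() { printf '/configuration/default/%s\n' "$1"; }

branch_path mybranchname sesSourceEmailAddress
# → /configuration/mybranchname/sesSourceEmailAddress
default_path sesSourceEmailAddress
# → /configuration/default/sesSourceEmailAddress

# A build step would try the branch path first and fall back to the default:
# aws ssm get-parameter --name "$(branch_path "$BRANCH" sesSourceEmailAddress)" \
#   --query Parameter.Value --output text 2>/dev/null \
#   || aws ssm get-parameter --name "$(default_path sesSourceEmailAddress)" \
#   --query Parameter.Value --output text
```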
AC:
Right now, some variables are set per branch...
For instance, you can set <branch_name>_CLOUDFRONT_DOMAIN_NAME to set CLOUDFRONT_DOMAIN_NAME for a specific branch and only that branch.
For example, master_CLOUDFRONT_DOMAIN_NAME is set in the CircleCI project settings to aps.cl-demo.com.
That works well enough, but there are more variables that developers will want and need to set for specific branches.
We want a more fleshed-out way/pattern to do this.
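For reference, the lookup behind the <branch_name>_VAR pattern can be sketched with bash indirect expansion; the variable values here are taken from the example above, and the function itself is illustrative, not the CCI implementation:

```shell
# Use the branch-specific value if set, otherwise fall back to the plain variable.
resolve_var() {
  local name="$2"
  local branch_var="${1}_${2}"
  printf '%s\n' "${!branch_var:-${!name}}"
}

CLOUDFRONT_DOMAIN_NAME="default.example.com"
master_CLOUDFRONT_DOMAIN_NAME="aps.cl-demo.com"

resolve_var master CLOUDFRONT_DOMAIN_NAME    # → aps.cl-demo.com
resolve_var feature CLOUDFRONT_DOMAIN_NAME   # → default.example.com
```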
AC:
configureLocal.sh doesn't take a stage parameter. It should.
AC:
We want a better way to destroy stages (branches) built from our project.
The serverless framework uses CloudFormation under the hood to build stuff in Amazon. So all state required to destroy is in Amazon itself, in a CloudFormation stack. So for instance, if we have 10 services in our services folder, we will have 10 CloudFormation stacks for any given stage/branch.
It's possible to search for CloudFormation stacks matching a pattern, and delete accordingly. This is what I suggest we do. I've written a script that does this, but it needs some improvement and code changes to make it ready to be a standard workflow.
There are a few gotchas to know about.
For those reading that have dealt with terraform destroys in the past.... this is so much cleaner.
This can all be run in lambda, but I'd consider that a future enhancement. I think step 1 is to get the tagging in place and the destroy script in source. Where and when to run destroy is something to think about, and maybe something best left to the teams.
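The selection step of such a destroy script might look like this; the stack names and the "-&lt;stage&gt;" suffix convention are assumptions, and the deletion itself is shown commented so the snippet is safe to run anywhere:

```shell
# Pick every stack whose name ends in "-<stage>".
stacks_for_stage() {
  local stage="$1"; shift
  local s
  for s in "$@"; do
    case "$s" in
      *-"$stage") echo "$s" ;;
    esac
  done
}

stacks_for_stage mybranch app-api-master ui-auth-mybranch app-api-mybranch
# → ui-auth-mybranch
# → app-api-mybranch

# The real script would feed stack names from CloudFormation and delete matches:
# for stack in $(stacks_for_stage "$STAGE" $all_stack_names); do
#   aws cloudformation delete-stack --stack-name "$stack"
# done
```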
AC: