nypl / engineering-general

Standards, values, and other information relevant to the NYPL Engineering Team.

engineering-general's Issues

Make "Fixing Quickly..." more generic?

I definitely ❤️ the "Fixing Quickly vs Ideally" sentiment, but I wonder if there's a way to express it more generically so it's more universally applicable. Maybe something more in line with the principles expressed in the Agile Manifesto.

Two relevant principles come to mind:

Simplicity--the art of maximizing the amount of work not done--is essential.
Our highest priority is to satisfy the customer through early and continuous delivery of valuable software.

I'll take a first try though it's really rough:

Velocity vs. Ideal
Simple solutions, delivered quickly, are often better than ideal solutions. “Quick fixes” are acceptable if they are documented and agreed upon by other developers.

Standardizing Github Teams

tl;dr: Teams are a mess. By project maybe?

Our Github Teams situation is a mess. Our teams include a mix of teams created for individual projects and teams reflecting prior institutional hierarchy. I wasn't able to find useful online opinion on patterns for using Github Teams to simplify repository access across multiple interrelated projects & repos. Our challenge is to 1) maximize the ease with which we can grant Read and Write access to relevant staff while simultaneously 2) withholding access that is unwanted, unnecessary, or dangerous. We want things locked down in general, but we're willing to be a little permissive with permissions if it means we save time tracking down owners/admins to manage account-specific access.

As an aside, I think our repositories should be public by default. If we're following our own recommendations, there's no danger making them so. This simplifies Read access concerns, although public repos would still benefit from better organized Teams for granting Write access.

The major patterns of Team organization seem to be:

I. Recreate institutional hierarchy (i.e. group by role)

The vanilla use case favored by Github's Nested teams documentation doesn't seem well suited to our environment on its own. I don't think we'd benefit from teams solely organized by role like "NYPL Digital Engineers", "NYPL Digital PMs", "NYPL Digital Design/UX", etc. We don't often need to grant read/write access only to Developers - and to all Developers. In practice our projects involve certain developers, only one PM, and a selection of Design/UX, and subject specialists - some outside Digital.

II. One team per project

Our work is typically organized by "project", a thematically squashed collection of products supported by a number of repositories. A project tends to have a regular stand-up and a dedicated Jira/Waffle board. Contributors to a project often benefit from having unified access to all of the repositories involved in that project. Thus, one pattern we could adopt is to create Github teams around projects. At the start of a project, for the purpose of easily giving people read/write access to the repositories for a project, a Github team could be created that matches the product name (e.g. "Best Books", "Discovery/ReCAP", "SimplyE"). The team would include every Github account that might be interested in seeing the contents of the repo. A more selective "Write-Access" sub-team could be created to represent those contributors likely to need write access to the majority of the repositories associated with the project (e.g. "Best Books Write-Access", "Discovery/ReCAP Write-Access"). Anyone added to the project would need only be added to the relevant project team to gain read/write access to all relevant repositories.

(I really don't like "Write-Access" as a suffix, but "Developer"/"Contributor" are presumptuous/limiting.)

Github project teams would not be deleted when project work is complete because doing so may revoke read/write access to the affected repositories, which may make troubleshooting issues post-launch difficult/impossible. Thus, under this model, the list of Github teams is expected to increase steadily forever and that's okay. Teams are free and their persistence may be valuable as a kind of rolling secondary documentation of repository-project relationships (in lieu of using Github's "Projects" feature).

Note that a given repository may be implicated in multiple projects and thus may grant Read/Write access to multiple Teams. For example, the repositories representing the Inventory Service may grant Write access to both "Discovery/ReCAP Write-Access" and "New Arrivals Write-Access".

For common repositories not tied to a specific project (e.g. NYPL/nypl-data-api-client, NYPL/nypl-core, NYPL/engineering-general, etc.), we would be forced to make them Public or grant Read access to a "NYPL" catch-all team and Write access to either:
a) individual accounts on a case by case basis, or
b) role-based teams like "NYPL Engineers", "NYPL Metadata Services".

III. Grant `Read` to "NYPL", `Write` by project

This option abandons granular, project-specific Read Teams in favor of simply granting Read to every member of a catch-all "NYPL" team. We gain the convenience of not having to manage project or role-specific teams for Read access. The cost of this is that every member of the "NYPL" catch-all team gets Read access to every NYPL private repo. (I see no danger in that provided we're not storing secrets.) To retain granular control over Write access, project specific Teams would be created (e.g. "Best Books Write-Access", "Discovery/ReCAP Write-Access").

Our Teams listing might resemble:

NYPL
- NYPL Engineers
- NYPL Metadata Services
- NYPL Design/UX
Best Books Write-Access
SimplyE Write-Access

For common repositories not tied to a specific project, our options resemble those in II.
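
To make option III concrete, here is a rough sketch of how access would resolve for a single private repository. The team names come from the examples above; the account names, the `Team` class, and the `effective_access` helper are invented for illustration and do not correspond to anything in the Github API.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Permission levels, ordered from least to most access.
LEVELS = {"none": 0, "read": 1, "write": 2}


@dataclass
class Team:
    name: str
    members: set[str] = field(default_factory=set)


def effective_access(user: str, grants: list[tuple[Team, str]]) -> str:
    """Return the highest permission any of the user's teams grants on a repo."""
    best = "none"
    for team, level in grants:
        if user in team.members and LEVELS[level] > LEVELS[best]:
            best = level
    return best


# Hypothetical accounts; team names follow the examples in the proposal.
nypl = Team("NYPL", {"alice", "bob", "carol"})
best_books_write = Team("Best Books Write-Access", {"alice"})

# Option III: Read for the catch-all team, Write for the project team.
repo_grants = [(nypl, "read"), (best_books_write, "write")]

print(effective_access("alice", repo_grants))  # write
print(effective_access("bob", repo_grants))    # read
print(effective_access("dave", repo_grants))   # none
```

Under this model, adding someone to a project means adding them to one Write-Access team; Read comes for free with membership in "NYPL".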

Questions

What option or mix of options above maximizes convenience and security for us?
What do we do about Admin access, which by default is held solely by repository creators? If we create project-based teams, the only time we should need Admin access is when initially granting team access to the repo, so perhaps it is sufficient for that power to be shared solely by the repo owner and NYPL org admin.

Log Message Standard?

Hello (especially @NYPL/recap-data-search, @NYPL/recap-request, @NYPL/recap-ui),

There has been some discussion about logging and standardizing our logging messages. Currently, we're using a few different libraries (winston, bunyan, and monolog). Instead of standardizing on the library, maybe we can agree on log message standard with a "minimum" set of key/values. I'll kick off the discussion with the following proposal:

Log Message Standard

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119.

The message format MUST be JSON.

The message MUST contain the following top-level keys: level, message:

level MUST be a string with one of the following values (case-sensitive), listed in order of severity from least to greatest: DEBUG, INFO, NOTICE, WARNING, ERROR, CRITICAL, ALERT, or EMERGENCY.
message MUST be a string and SHOULD contain a message useful for debugging/error reporting.

Additional key/values of any type MAY be included and MUST NOT break functionality.
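
As a rough illustration of the proposal, the sketch below builds and checks a log line using only the Python standard library. The helper names `format_log` and `is_valid` and the `requestId` key are invented for this example; they are not part of the proposed standard or of winston, bunyan, or monolog.

```python
import json

# Severity levels from the proposal, least to most severe.
LEVELS = ["DEBUG", "INFO", "NOTICE", "WARNING", "ERROR", "CRITICAL", "ALERT", "EMERGENCY"]


def format_log(level: str, message: str, **extra) -> str:
    """Build one JSON log line with the required keys plus optional extras."""
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level!r}")
    if not isinstance(message, str):
        raise TypeError("message must be a string")
    record = {**extra, "level": level, "message": message}
    return json.dumps(record)


def is_valid(line: str) -> bool:
    """Check that a log line meets the minimum requirements of the proposal."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return False
    return (
        isinstance(record, dict)
        and record.get("level") in LEVELS
        and isinstance(record.get("message"), str)
    )


# "requestId" is a hypothetical additional key, allowed by the MAY clause.
line = format_log("ERROR", "Could not reach upstream service", requestId="r-1")
print(is_valid(line))                                    # True
print(is_valid('{"level": "error", "message": "x"}'))    # False: level is case-sensitive
```

A consumer that only relies on `level` and `message` keeps working whatever extra keys a service adds, which is the point of the "MUST NOT break functionality" clause.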

Any thoughts? Should we RECOMMEND or REQUIRE additional key/values?

Create onboarding documentation for new developers

I think it might be a good idea to add an "Onboarding" document to this repo to help new developers get up to speed.

Do you know of any existing documents that we can build on for @gkallenberg @nodanaonlyzuul?

@thisisstephenbetts, what was helpful to you in the onboarding process?

Maybe we can use this issue to help create this document.

CC: @jvarghese01

Graceful Slack Offboarding

After a wonderful conversation with @winniequinn regarding offboarding staff on Slack, she suggested the following wonderful ideas:

Convert the account to a guest account with access only to the #general channel.
Keep the guest access for 3 months as part of the offboarding process.

That way, we can DM the guest while the account is still active for those three months. Guest access limited to the #general channel will not incur new charges on our nypl.slack.com account.

Let me know what you think, @kfriedman, @nodanaonlyzuul. Please ping anyone who you think should join this conversation!

Create First Draft Recommendation for Naming Conventions

I will open a first draft PR that tackles naming conventions.

Inconsistent naming between GitHub repos and AWS

At the moment it can be difficult to find out where code that is deployed to AWS is stored on GitHub just from looking at AWS. I thought it would be useful to add something to our standards about indicating corresponding GitHub repos on AWS. For example, perhaps we could ask to have the repo url stored in an environment variable with a standard name, or included in the "description" field.
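
For the environment variable option, a deployed app or a small audit script could read the value back like this. The variable name `SOURCE_REPO_URL` is purely a placeholder, since no standard name has been agreed on, and the URL in the usage lines is a made-up example.

```python
import os
from typing import Mapping, Optional

# Placeholder name; the actual standard name is still to be decided.
REPO_URL_VAR = "SOURCE_REPO_URL"


def source_repo_url(environ: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the GitHub URL recorded for this deployment, or None if unset."""
    env = os.environ if environ is None else environ
    url = env.get(REPO_URL_VAR, "").strip()
    return url or None


# Hypothetical environment for illustration.
print(source_repo_url({"SOURCE_REPO_URL": "https://github.com/NYPL/example-service"}))
print(source_repo_url({}))  # None
```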

Requesting addition to Documentation.md

Is it okay to include this quote at the beginning of the documentation standards page? It's been on my mind. 😄

“Documentation is a love letter that you write to your future self.”
— Damian Conway

Add CI section to testing standard?

Perhaps we should add a CI section to the testing standard?

Or, if we're not doing full CI, at least spell out our usage of a CI service like Travis?

AWS Profile names

I noticed that some of our machines use different names for our AWS profiles. I'd like to recommend that each profile be named after the account ID or alias: nypl-sandbox, nypl-digital-dev, and nypl (for the AWS account run by Systems Engineering). That way:

The names are easier to remember and track on our local machines.
When we write automation scripts, the profile names are unified, so the scripts are easier to share or install on our local machines.

Right now I'm only concerned about those three profiles, and engineers should be free to name other profiles to their preference.
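
If we adopt this, a quick check like the one below could tell an engineer which of the three profiles are missing from their machine. It is a sketch that assumes the usual `~/.aws/config` layout, where named profiles appear as `[profile NAME]` sections; the function names and the sample text are made up for illustration.

```python
import configparser

# Profile names suggested in this issue.
EXPECTED_PROFILES = {"nypl-sandbox", "nypl-digital-dev", "nypl"}


def profile_names(config_text: str) -> set:
    """Return the profile names defined in the text of an ~/.aws/config file."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    names = set()
    for section in parser.sections():
        # Named profiles appear as "[profile NAME]" in ~/.aws/config.
        if section.startswith("profile "):
            names.add(section[len("profile "):].strip())
        else:
            names.add(section)
    return names


def missing_profiles(config_text: str) -> set:
    """Return the expected profile names that the config text does not define."""
    return EXPECTED_PROFILES - profile_names(config_text)


# Sample config text for illustration only.
SAMPLE = """
[profile nypl-sandbox]
region = us-east-1

[profile nypl]
region = us-east-1
"""

print(sorted(missing_profiles(SAMPLE)))  # ['nypl-digital-dev']
```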

No default profile for AWS

Hello, @nonword and I went on an adventure with AWS Lambda deployment. We discovered that if our machines have a [default] profile in the .aws/config and .aws/credentials files, the node-lambda npm library looks for the default settings first and overrides the necessary KMS decryption settings. I recommend that we add a note about AWS Lambda telling engineers to make sure their machines have no [default] profile in their AWS credentials.
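
A small pre-deploy check could catch this before node-lambda runs. The sketch below only inspects the two files for a `[default]` section; the function names are invented, and it says nothing about how node-lambda itself resolves profiles.

```python
import configparser
from pathlib import Path


def has_default_profile(file_text: str) -> bool:
    """Return True if the text of an AWS config or credentials file defines [default]."""
    # Interpolation is disabled so "%" characters in secrets are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(file_text)
    return parser.has_section("default")


def files_with_default(aws_dir: Path) -> list:
    """Return which of config/credentials under aws_dir contain a [default] section."""
    found = []
    for name in ("config", "credentials"):
        path = aws_dir / name
        if path.exists() and has_default_profile(path.read_text()):
            found.append(name)
    return found


print(has_default_profile("[default]\nregion = us-east-1\n"))               # True
print(has_default_profile("[profile nypl-sandbox]\nregion = us-east-1\n"))  # False
```

Calling `files_with_default(Path.home() / ".aws")` would then list the files that still need cleaning up on a given machine.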

Test coverage on all microservices > 70%.

I understand that 100% might be counter-productive or not feasible, but it still seems like we should hold our own feet to the fire. We're developing something that people are going to need to count on, and with as many pieces as this thing has, we need to make sure they all work.

Log Level Integer is inconsistent

Hi.

Our current logging guides address an optional levelCode key and give a guide to how those integers map to the level key.

That mapping is different across programming languages:

In Ruby it looks like this:

# Logging severity.
module Severity
  DEBUG = 0
  INFO = 1
  WARN = 2
  ERROR = 3
  FATAL = 4
  UNKNOWN = 5
end

In Python it looks like:

Level	Numeric Value
CRITICAL	50
ERROR	40
WARNING	30
INFO	20
DEBUG	10
NOTSET	0

So - I feel weird that our existing documentation calls out the log levels that we do.
I'm not sure what the answer is:

1. A conspicuous caveat saying that the levels vary from language to language?
2. Making the MAY of the logLevel key more conspicuous?
3. 1 & 2?
4. Removing mention of logLevel altogether?
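
To show the mismatch side by side, here is a short sketch that looks up the integer each runtime assigns to a level name. The Ruby values are copied from the listing above and the Python values come from the standard `logging` module; the `level_codes` helper is invented for illustration.

```python
import logging

# Ruby's severity constants, copied from the listing above.
RUBY_SEVERITY = {"DEBUG": 0, "INFO": 1, "WARN": 2, "ERROR": 3, "FATAL": 4, "UNKNOWN": 5}

# Python's numeric values, read from the standard logging module.
PYTHON_LEVELS = {
    name: getattr(logging, name)
    for name in ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")
}


def level_codes(name: str) -> dict:
    """Return the integer each runtime uses for a level name, or None if it has none."""
    return {
        "ruby": RUBY_SEVERITY.get(name),
        "python": PYTHON_LEVELS.get(name),
    }


print(level_codes("ERROR"))    # {'ruby': 3, 'python': 40}
print(level_codes("WARNING"))  # {'ruby': None, 'python': 30}
```

Neither the integers nor all of the names line up, so anything that filters logs across services is probably safer keying on the string level than on an integer code.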

Should talk about limiting Travis builds in travis-ci documentation

In our Travis documentation, we should note ways to better leverage our common Travis account. Suggestion from @nodanaonlyzuul and @gkallenberg:

I think we can get more mileage out of the limited number of concurrent builds our Travis plan gives us by:

1. Doing fewer builds.
2. Making builds faster.

Doing fewer builds is easy - by doing:

  branches:
    only:
      - master
      - deployable-branch1
      - deployable-branch2

If you do this - TravisCI will not do builds for feature branches but WILL do builds for PRs.
We can make builds faster by asking travis to cache gems, node modules, (maybe even apt packages) https://docs.travis-ci.com/user/caching/.

Create Microservice Guidelines

In order to provide guidance and a common approach to building microservices at NYPL, a set of guidelines was initially drafted. This is a first, rough draft:

Microservice Guidelines

Services SHOULD follow common microservice patterns.
It's RECOMMENDED that they:

Have independent development and deployment.
Have private data stores not accessed by other services.
Are small enough to deliver value but not “too” small.

See: Best Practices for Building a Microservice Architecture and Pattern: Microservice Architecture

Services with HTTP endpoints SHOULD follow RESTful design principles.
It's RECOMMENDED that they follow Best Practices for Designing a Pragmatic RESTful API.

Services SHOULD provide adequate documentation.
It's RECOMMENDED that they:

Use the Swagger specification for models, resources, etc.
Include any necessary API Gateway annotations.

Services SHOULD publish data events to the NYPL data-streaming platform.
Services that publish events MUST:

Encode messages using Avro.
Publish their schema to the Schema API.

Services MUST be designed with adequate monitoring/alerting.
It's RECOMMENDED that they:

Publish metrics using a tool like CloudWatch.
Have automatic alerts generated from metrics or log filters.

Services MUST log error messages.
Error messages MUST:

Be easily viewable, retrievable, and searchable.
Be logged consistently using agreed-upon standards.

Services SHOULD provide a SLA/performance standard.
It's RECOMMENDED that they are load-tested with results published.

Services with HTTP endpoints SHOULD be deployable via an API Gateway.
It's RECOMMENDED that they:

Are compatible with API Gateway path and query parameter patterns.
Are designed for optimal caching from the API Gateway.
Use the X-NYPL-Identity header for authentication.

Services MUST follow security practices.
It's RECOMMENDED that they:

Check OpenID Connect claims (subject, scope).
Are on a private network and pass other established security policies.

Services MUST follow agreed-upon engineering practices.
It's RECOMMENDED that they:

Are "owned" by one or more engineers responsible to be the point of contact for troubleshooting.
Follow NYPL coding and testing conventions.

Please add your thoughts, suggestions, and comments!

Document deployment patterns & configuration

I'd like to continue the discussion started in NYPL/discovery-api#93 , which proposed placing common deployment config in a central place like engineering-general. There are several patterns for deployment emerging from our work, with lots of common config. I like @nodanaonlyzuul's idea that we collect that information in a central place, perhaps organized by technology (e.g. EB, lambda, EKS). For any kind of deployment, we could document the distinct strategies we consider best practice and link to a boilerplate repo, yeoman generator, etc. This would help bring up new apps, but could also allow us to slim down our READMEs by letting us link directly to the deployment standard the app uses, reducing the app's own responsibility for documenting all of the relevant eb subcommands, for example. (Separately, but in that same spirit, I'd love for us to develop a vocabulary of git deploy strategies to make it easier to read at a glance what kind of app we're dealing with.)

Broken links / housekeeping

Several links in the README.md file are broken, namely:

Coding Style/Javascript
Coding Style/Python
Coding Style/Ruby on Rails
Coding Style/PHP
Production Readiness (was this meant to link to https://github.com/NYPL/engineering-general/blob/main/standards/deployment.md#production-readiness?)

In addition, since this Repo builds out a GitHub pages site here: https://nypl.github.io/engineering-general/ it would be nice to embed that link directly into the README.md file for folks to read the information there, if they wish.

Create a list of standards and write a first draft

Based on our meeting on 7/28, we agreed to create a first draft of standards (and assigned owners) for the following:

^ Test coverage @emu47
^ Logging @nodanaonlyzuul
Peer review @holingpoon
Git workflow - branch naming, PRs, release tagging @emu47
Deployment @jobinthomasnypl
Development environments @jobinthomasnypl
CI @jobinthomasnypl
^ Documentation - README, functional, license, API/Swagger, talks to/listens to/dependencies, diagrams, workflows @gkallenberg @ktp242
Accessibility
Performance @jbdalton
^ Alerts/downtime @kfriedman
Mission @emu47
Code of ethics
Naming conventions - Lambdas, repos, errors, etc. - @rhernand3z
Ownership/responsibilities
^ Security - proper OAuth scopes, MFA enabled on AWS and other accounts @kfriedman @rhernand3z
^ Error handling @gkallenberg
Coding styles @nodanaonlyzuul
Code quality @gkallenberg @rhernand3z

^ indicates first priority as it's a blocker/critical for the ReCAP project

Next steps:

Each assignee will create an issue for their standard in this repo
Any developer interested in collaborating on the standard can comment on the issue
The first draft of the standard will be created by end of the current sprint (8/9)
Standards SHOULD follow the RFC 2119 language.

@EdwinGuzman, @nonword, @ktp242, and @rhernand3z: Since you weren't able to attend: 1. Do you have any standards you'd like to see created? 2. Would you like to volunteer for any of the unassigned standards above (or any other standard)?

Thanks everyone. Please let me know if I missed anything or if you have any more thoughts. Y'all rock! 🎸

cc: @jvarghese01, @ablwr

Tweak title?

I know this might be a little too touchy-feely for some (or splitting hairs), but "Rights & Responsibilities" sounds a little too, um, rigid for my liking.

Maybe "Engineering Values" like Medium or Buffer:
https://medium.engineering/engineering-values-7143c0db0bd6
https://buffer.com/about

Or "Principles" like the Agile Manifesto:
http://agilemanifesto.org/principles.html

I recognize this could just be personal preference though. 😄

Create service discovery standard

I think it might be helpful to have a standard that describes how services should be discovered.

Right now, it could just be a file in this Github repo somewhere. Maybe it should list:

Service name
Service owner
Service description
Link to repo

This would greatly help with discovery and uniform naming of things like SNS topics, etc.
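
As a starting point, each entry in that file could be checked against the four fields listed above. The sketch below is only one possible shape: the `ServiceEntry` class, the field names, and the sample values (including the owner handle and repo URL) are all invented for illustration.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ServiceEntry:
    """One row of the proposed service listing."""
    name: str
    owner: str
    description: str
    repo_url: str


def missing_fields(entry: dict) -> list:
    """List the required fields that are absent or blank in a raw entry."""
    missing = []
    for f in fields(ServiceEntry):
        value = entry.get(f.name)
        if not isinstance(value, str) or not value.strip():
            missing.append(f.name)
    return missing


# Hypothetical entry; the owner and URL are placeholders.
raw = {
    "name": "Inventory Service",
    "owner": "@example-owner",
    "description": "Tracks item inventory.",
    "repo_url": "https://github.com/NYPL/example-service",
}

print(missing_fields(raw))                            # []
print(missing_fields({"name": "Inventory Service"}))  # ['owner', 'description', 'repo_url']
entry = ServiceEntry(**raw)
```

A check like this could run in CI on the listing file so that new services can't be registered without an owner or a repo link.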

Add deployment to CI

Who should review the Pull Requests?

I was having a conversation with @katesweeney about who should review pull requests for nypl-core, and learned that I should include @saverkamp as a reviewer.

I'm thinking that, rather than asking around every time about who should be included as reviewers, a Markdown file within each code repository listing the maintainers/points of contact I should include would work wonders. I have learned recently that Github gives special meaning to a CONTRIBUTING.md file at a code repository's root level, and it is also a standard practice used at companies such as Pantheon and CircleCI.

While we can figure out which parts of the standards we can and should adopt, some instructions on how to submit a Pull Request within a file such as CONTRIBUTING.md would give me pointers on the workflow of a repo.

cc: @thisisstephenbetts @kfriedman @nodanaonlyzuul @nonword

Renaming ci-and-deployment.md to ci-coverage.md?

@jobinthomasnypl I got a little confused thinking that the file ci-and-deployment.md covers both CI and Deployment. IMO Deployment may need another page. Is it possible to rename the file to ci-coverage.md or something else? Thanks!

Where should documentation live?

I've been staring at some old documentation, and these pages are everywhere: on Confluence via Atlassian Cloud, on confluence.nypl, in various Google Docs, etc. Our newest spot was to park the documentation on GitHub Wiki pages. In a discussion with @nonword, he pointed out that the search function on the Wiki is broken. It also occurred to me that if we ever have to move our repos somewhere else, we will lose the GitHub Wiki pages because they are specific to GitHub.

I have a suggestion: I think it'll make more sense to embed documentation in the repos themselves, e.g. in a doc or docs folder in the code repo, and park there all of our Markdown documentation that doesn't belong in the README. That way the file structure is vendor-agnostic, and as a plus, most repo management vendors such as BitBucket and GitHub render Markdown files.