
enseada's People

Contributors

dependabot[bot], fossabot, marcoieni, matteojoliveau, snyk-bot


enseada's Issues

Agnostic error reporting

Right now the error reporting system in Enseada is purely log-based. While this is acceptable for now, it would be better to provide integrations with error reporting systems such as Sentry, Airbrake/Errbit and Stackdriver.

To achieve this, we need to define an agnostic reporter interface with different adapters so that users can configure the system they prefer.
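As a rough illustration of the reporter interface described above, here is a minimal Go sketch. The names Reporter, LogReporter, MemoryReporter and MultiReporter are hypothetical and not taken from the Enseada codebase; real adapters for Sentry or Airbrake would implement the same interface using their own client libraries.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"os"
)

// Reporter is a hypothetical, backend-agnostic error reporting interface.
// Adapters for Sentry, Airbrake/Errbit or Stackdriver would implement it.
type Reporter interface {
	Report(err error, tags map[string]string)
}

// LogReporter keeps the current log-based behaviour as the default adapter.
type LogReporter struct {
	Logger *log.Logger
}

func (r LogReporter) Report(err error, tags map[string]string) {
	r.Logger.Printf("error=%q tags=%v", err, tags)
}

// MemoryReporter records errors in memory, which is handy in tests.
type MemoryReporter struct {
	Errors []error
}

func (m *MemoryReporter) Report(err error, _ map[string]string) {
	m.Errors = append(m.Errors, err)
}

// MultiReporter fans a report out to every configured adapter.
type MultiReporter []Reporter

func (m MultiReporter) Report(err error, tags map[string]string) {
	for _, r := range m {
		r.Report(err, tags)
	}
}

// errUploadFailed is a sample error used by the demo below.
var errUploadFailed = errors.New("blob upload failed")

func main() {
	mem := &MemoryReporter{}
	var reporter Reporter = MultiReporter{
		LogReporter{Logger: log.New(os.Stderr, "enseada ", log.LstdFlags)},
		mem,
	}
	reporter.Report(errUploadFailed, map[string]string{"repo": "maven"})
	fmt.Println("errors recorded in memory:", len(mem.Errors))
}
```

Keeping the log-based reporter as one adapter among others means the current behaviour stays the default while users opt into an external system through configuration.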

Access Tokens cleanup

We need a background job that deletes expired access and refresh tokens at a configurable interval (seconds, minutes, hours or longer).

We also need to delete all of a user's personal access tokens when that user is deleted.
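The cleanup job could look roughly like the Go sketch below. It is only an illustration under assumptions: Token, MemoryStore and RunCleanup are hypothetical names, and the in-memory map stands in for whatever storage actually holds the tokens.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// Token is a simplified, hypothetical token record.
type Token struct {
	ID        string
	UserID    string
	ExpiresAt time.Time
}

// MemoryStore stands in for the real token storage.
type MemoryStore struct {
	mu     sync.Mutex
	tokens map[string]Token
}

func NewMemoryStore(tokens ...Token) *MemoryStore {
	s := &MemoryStore{tokens: make(map[string]Token)}
	for _, t := range tokens {
		s.tokens[t.ID] = t
	}
	return s
}

// DeleteExpired removes every token that has expired at time now.
func (s *MemoryStore) DeleteExpired(now time.Time) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	removed := 0
	for id, t := range s.tokens {
		if !t.ExpiresAt.After(now) {
			delete(s.tokens, id)
			removed++
		}
	}
	return removed
}

// DeleteByUser removes all tokens owned by a user, e.g. on user deletion.
func (s *MemoryStore) DeleteByUser(userID string) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	removed := 0
	for id, t := range s.tokens {
		if t.UserID == userID {
			delete(s.tokens, id)
			removed++
		}
	}
	return removed
}

func (s *MemoryStore) Len() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.tokens)
}

// RunCleanup sweeps once at startup and then on every tick until ctx is done.
func RunCleanup(ctx context.Context, s *MemoryStore, interval time.Duration) {
	s.DeleteExpired(time.Now())
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case now := <-ticker.C:
			s.DeleteExpired(now)
		}
	}
}

// demo runs the job briefly and reports what is left and what alice owned.
func demo() (left, removedForAlice int) {
	now := time.Now()
	store := NewMemoryStore(
		Token{ID: "a", UserID: "alice", ExpiresAt: now.Add(-time.Hour)},
		Token{ID: "b", UserID: "alice", ExpiresAt: now.Add(time.Hour)},
		Token{ID: "c", UserID: "bob", ExpiresAt: now.Add(time.Hour)},
	)
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Millisecond)
	defer cancel()
	RunCleanup(ctx, store, 10*time.Millisecond) // blocks until the timeout
	left = store.Len()
	return left, store.DeleteByUser("alice")
}

func main() {
	left, removed := demo()
	fmt.Println("tokens left after cleanup:", left)
	fmt.Println("tokens removed for deleted user:", removed)
}
```

In a real server RunCleanup would be started in its own goroutine with a context tied to the server lifecycle.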

Support RubyGems API

This shares the fate of the Maven and NPM APIs: it's not that well documented and will require some reverse engineering of both existing servers and the gem client.

Reference implementations:

Personal Access Tokens

Personal Access Tokens (PAT) are special OAuth tokens generated by Enseada that are user-configurable.

They can only be manually created by a user with a specified scope and expiration, which can be infinite. They are used as a substitute for passwords for those systems unable to perform a classic Authorization Code Flow to obtain a temporary OAuth token from the registry.

Since Enseada supports passing tokens both via the Authorization header and as Basic Auth credentials (with the string x-oauth-token as the username), usage of PATs is recommended for systems compatible with Basic Auth, such as Maven or Docker.

PATs can have a label so that they can easily be identified by the user. They are exposed in the web UI and REST API (though only their metadata is visible, not the token value itself, which is not stored by the registry) and can be revoked at any time by the user.
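To make the two transport options concrete, here is a small Go sketch that extracts a token from either a Bearer header or Basic Auth credentials with x-oauth-token as the username. TokenFromRequest and the two request helpers are hypothetical names for illustration, not Enseada's actual code.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// oauthTokenUser is the special Basic Auth username described above.
const oauthTokenUser = "x-oauth-token"

// TokenFromRequest returns the OAuth token carried by the request, if any.
// It accepts "Authorization: Bearer <token>" or Basic Auth credentials
// whose username is x-oauth-token and whose password is the token.
func TokenFromRequest(r *http.Request) (string, bool) {
	const prefix = "Bearer "
	auth := r.Header.Get("Authorization")
	if len(auth) > len(prefix) && strings.EqualFold(auth[:len(prefix)], prefix) {
		if tok := strings.TrimSpace(auth[len(prefix):]); tok != "" {
			return tok, true
		}
	}
	if user, pass, ok := r.BasicAuth(); ok && user == oauthTokenUser && pass != "" {
		return pass, true
	}
	return "", false
}

// bearerRequest builds a sample request carrying a Bearer token.
func bearerRequest(token string) *http.Request {
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	req.Header.Set("Authorization", "Bearer "+token)
	return req
}

// basicRequest builds a sample request carrying Basic Auth credentials.
func basicRequest(user, pass string) *http.Request {
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	req.SetBasicAuth(user, pass)
	return req
}

func main() {
	tok, ok := TokenFromRequest(bearerRequest("abc"))
	fmt.Println(tok, ok)
	tok, ok = TokenFromRequest(basicRequest(oauthTokenUser, "my-pat"))
	fmt.Println(tok, ok)
	tok, ok = TokenFromRequest(basicRequest("alice", "password"))
	fmt.Println(tok, ok)
}
```

Validation of the extracted token (signature, expiry, scopes) is deliberately left out; the sketch only covers how the token reaches the server.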

Personal access tokens management

We use OAuth tokens handled with fosite as our only authentication method. All endpoints support them either as a Bearer token in the Authorization header or as a Basic Auth password (with the special user x-oauth-token). The latter is available for clients that are unable to handle the full OAuth authorization flow and/or are unable to use Bearer tokens, such as Maven.
Because they cannot refresh expired tokens, these clients need long-lived access tokens that don't expire (or expire after a very long time). We'll call these tokens personal access tokens. A personal access token can be generated by a user and will inherit the ACL permissions of its creator. A personal access token can be revoked by its creator.

We need to implement a custom fosite handler similar to the refresh token handler that implements a custom OAuth grant type personal_access_token. The flow will accept POST requests on the token endpoint with the following form-urlencoded body:

grant_type=personal_access_token&access_token=<valid-access-token>&scope=<scope-for-pat>

valid-access-token is a valid OAuth access token that was issued to a client allowed to use the personal_access_token grant type.
scope is the space-separated list of scopes requested for the PAT.

The response body will be the classic OAuth token response:

{
  "access_token": "<PAT>",
  "token_type":"bearer",
  "expires_in": 3153600000 // 100 years, since fosite does not allow non-expiring access tokens
}
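For reference, a client could build the request body for this grant as in the following Go sketch. PATRequestBody and PATExpiresIn are hypothetical helpers, the scope names are made up, and the sketch does not call fosite at all; it only shows the form encoding and the 100-year arithmetic behind expires_in.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
	"time"
)

// grantTypePAT is the custom grant type proposed above.
const grantTypePAT = "personal_access_token"

// patLifetime is 100 years of 365 days each.
const patLifetime = 100 * 365 * 24 * time.Hour

// PATRequestBody builds the form-urlencoded body for the token endpoint.
// url.Values.Encode sorts keys, so the parameter order differs from the
// example above, which is irrelevant for form-encoded bodies.
func PATRequestBody(accessToken string, scopes ...string) string {
	form := url.Values{}
	form.Set("grant_type", grantTypePAT)
	form.Set("access_token", accessToken)
	form.Set("scope", strings.Join(scopes, " "))
	return form.Encode()
}

// PATExpiresIn returns the expires_in value in seconds.
func PATExpiresIn() int64 {
	return int64(patLifetime / time.Second)
}

func main() {
	// "read" and "write" are placeholder scope names.
	fmt.Println(PATRequestBody("abc", "read", "write"))
	fmt.Println(PATExpiresIn())
}
```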

Expose application metrics

To improve observability and monitoring, we need to expose Prometheus metrics as an HTTP endpoint.

We will use OpenCensus Metrics as the reporting library since it provides pluggable reporters and will allow us to easily support other systems such as Statsd.
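To show what such an endpoint returns, the Go sketch below serves a single counter in the Prometheus text exposition format using only the standard library. It does not use OpenCensus, which is what the issue actually proposes; the Counter type, the metric name and the /metrics path are assumptions for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// Counter is a minimal, hypothetical monotonic counter.
type Counter struct {
	value uint64
	name  string
	help  string
}

func NewCounter(name, help string) *Counter {
	return &Counter{name: name, help: help}
}

// Inc atomically increments the counter by one.
func (c *Counter) Inc() {
	atomic.AddUint64(&c.value, 1)
}

// MetricsHandler renders the counters in the Prometheus text format.
func MetricsHandler(counters ...*Counter) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.Header().Set("Content-Type", "text/plain; version=0.0.4")
		for _, c := range counters {
			fmt.Fprintf(w, "# HELP %s %s\n", c.name, c.help)
			fmt.Fprintf(w, "# TYPE %s counter\n", c.name)
			fmt.Fprintf(w, "%s %d\n", c.name, atomic.LoadUint64(&c.value))
		}
	})
}

// scrape performs an in-process GET against the handler.
func scrape(h http.Handler) string {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/metrics", nil))
	return rec.Body.String()
}

func main() {
	requests := NewCounter("enseada_http_requests_total", "Total HTTP requests served.")
	requests.Inc()
	requests.Inc()
	fmt.Print(scrape(MetricsHandler(requests)))
}
```

With a pluggable library such as OpenCensus the handler would come from the Prometheus exporter instead of being hand-written, and the same measurements could be sent to other backends.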

Improve test coverage

Right now Enseada is still in the PoC state. Now that it's approaching a more stable status, test coverage is definitely something we should work on.

Three types of tests are required:

  • Unit tests for smaller logic
  • Integration tests for bigger logic (they provide more value than unit IMHO, but that's debatable)
  • Spec compliance tests for package registry APIs (Maven, Docker etc)

The goal is at least 80% coverage in CodeClimate and full API coverage for compliance tests.

Support RBAC

At the moment Enseada supports per-resource ACL rules. This allows the most granular level of flexibility over access control but is not always the most convenient.

Supporting RBAC will allow us to retain this flexibility, since the regular ACL rules will still work, while allowing admins to create roles and namespaces to manage users and rules more easily.

The Casbin model to use is the following.

[request_definition]
r = sub, obj, act, ns

[policy_definition]
p = sub, obj, act

[role_definition]
g = _, _, _

[policy_effect]
e = some(where (p.eft == allow))

[matchers]
m = g(r.sub, p.sub, r.ns) && r.obj == p.obj && r.act == p.act || r.sub == "root" 

We introduce the concept of roles, which are classic RBAC roles assigned to users. An access rule can have either a role or a user as its subject.
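The intent of the matcher can be mirrored in plain Go, as in the sketch below. This is not Casbin code: it assumes that every request carries a namespace (the r.ns used by the matcher), it only resolves direct role assignments rather than role hierarchies, and the Request, Policy and Grouping types are hypothetical.

```go
package main

import "fmt"

// Request mirrors the request fields used by the matcher, including r.ns.
type Request struct {
	Sub, Obj, Act, NS string
}

// Policy mirrors a "p" rule: a subject (user or role), object and action.
type Policy struct {
	Sub, Obj, Act string
}

// Grouping mirrors a "g" rule: a user holds a role within a namespace.
type Grouping struct {
	User, Role, NS string
}

// hasRole approximates g(r.sub, p.sub, r.ns): a subject always matches
// itself, otherwise it must hold the role in the given namespace.
func hasRole(groupings []Grouping, user, role, ns string) bool {
	if user == role {
		return true
	}
	for _, g := range groupings {
		if g.User == user && g.Role == role && g.NS == ns {
			return true
		}
	}
	return false
}

// Allowed evaluates the matcher against every policy; root is always allowed.
func Allowed(r Request, policies []Policy, groupings []Grouping) bool {
	if r.Sub == "root" {
		return true
	}
	for _, p := range policies {
		if hasRole(groupings, r.Sub, p.Sub, r.NS) && r.Obj == p.Obj && r.Act == p.Act {
			return true
		}
	}
	return false
}

func main() {
	policies := []Policy{
		{Sub: "developer", Obj: "repo:maven", Act: "read"}, // rule on a role
		{Sub: "bob", Obj: "repo:docker", Act: "write"},     // rule on a user
	}
	groupings := []Grouping{{User: "alice", Role: "developer", NS: "team-a"}}

	fmt.Println(Allowed(Request{"alice", "repo:maven", "read", "team-a"}, policies, groupings))
	fmt.Println(Allowed(Request{"alice", "repo:maven", "read", "team-b"}, policies, groupings))
	fmt.Println(Allowed(Request{"bob", "repo:docker", "write", "team-a"}, policies, groupings))
	fmt.Println(Allowed(Request{"root", "anything", "delete", ""}, policies, groupings))
}
```

The second rule shows why plain ACL rules keep working: a policy whose subject is a user name matches that user directly, without any role assignment.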

Remove Echo

Enseada is not really using any particular feature of Echo (data binding, auto-TLS) that is not available on net/http too. I would like to remove Echo entirely and switch to a simpler mux like gorilla/mux.

This also means removing echo.Logger from all over the app, but it's something I was going to do anyway in favour of an internal Logger interface, probably backed by zap.
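A minimal sketch of what this could look like is shown below. It uses only net/http and the standard log package; a zap-backed implementation of the same interface could be swapped in later. The Logger method set, NewStdLogger, NewRouter and the /health route are assumptions made for this example.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
)

// Logger is a hypothetical internal logging interface replacing echo.Logger.
type Logger interface {
	Infof(format string, args ...interface{})
	Errorf(format string, args ...interface{})
}

// stdLogger adapts the standard library logger to the Logger interface.
type stdLogger struct {
	l *log.Logger
}

func NewStdLogger(w io.Writer) Logger {
	return stdLogger{l: log.New(w, "", 0)}
}

func (s stdLogger) Infof(format string, args ...interface{}) {
	s.l.Printf("INFO "+format, args...)
}

func (s stdLogger) Errorf(format string, args ...interface{}) {
	s.l.Printf("ERROR "+format, args...)
}

// NewRouter wires handlers on a plain net/http mux; handlers depend only
// on the Logger interface, not on a web framework.
func NewRouter(logger Logger) http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
		logger.Infof("%s %s", r.Method, r.URL.Path)
		fmt.Fprint(w, "ok")
	})
	return mux
}

// demo sends one in-process request and returns the body and the log output.
func demo() (body, logged string) {
	var buf bytes.Buffer
	router := NewRouter(NewStdLogger(&buf))
	rec := httptest.NewRecorder()
	router.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/health", nil))
	return rec.Body.String(), buf.String()
}

func main() {
	body, logged := demo()
	fmt.Println("response:", body)
	fmt.Print("log: ", logged)
}
```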

Cache control

To be CDN and cache friendly, we need to set HTTP caching headers appropriately on all HTTP endpoints. This is especially important for registry APIs since they are the ones serving packages, and will therefore benefit the most from caching.

RFC 7234: HTTP/1.1 Caching

Storage engine improvements

The current storage engine is a simple abstraction over cloud object storage services such as S3. A file is received by the server and stored as-is in the backing service. This can be inefficient for large files.

The following activities can be carried out to improve the storage engine:

  • Build an abstraction around the hold crate that can implement custom logic around files and store metadata in CouchDB
  • Split files into equally sized chunks and store those as blobs
  • Compress chunks before writing them to reduce storage space
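The chunking and compression steps above can be sketched as follows. The example is written in Go for illustration only (the hold crate mentioned above is a Rust library), gzip is an assumed compression format, and SplitChunks, Compress, Decompress and RoundTrip are hypothetical names. Note that the last chunk is shorter when the file size is not a multiple of the chunk size.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// SplitChunks cuts data into chunks of at most size bytes.
// A non-positive size returns the data as a single chunk.
func SplitChunks(data []byte, size int) [][]byte {
	if size <= 0 {
		return [][]byte{data}
	}
	var chunks [][]byte
	for start := 0; start < len(data); start += size {
		end := start + size
		if end > len(data) {
			end = len(data)
		}
		chunks = append(chunks, data[start:end])
	}
	return chunks
}

// Compress gzips a single chunk before it is written as a blob.
func Compress(chunk []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(chunk); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// Decompress restores a chunk from its stored blob.
func Decompress(blob []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(blob))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

// RoundTrip chunks and compresses content, then reassembles it.
// It returns the number of chunks and the restored content.
func RoundTrip(content string, size int) (int, string, error) {
	var restored bytes.Buffer
	chunks := SplitChunks([]byte(content), size)
	for _, chunk := range chunks {
		blob, err := Compress(chunk)
		if err != nil {
			return 0, "", err
		}
		plain, err := Decompress(blob)
		if err != nil {
			return 0, "", err
		}
		restored.Write(plain)
	}
	return len(chunks), restored.String(), nil
}

func main() {
	n, restored, err := RoundTrip("hello enseada", 4)
	fmt.Println(n, restored, err)
}
```

In a real engine the chunk order and checksums would be part of the metadata stored alongside the blobs so that files can be reassembled and verified.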

Document and test CouchDB module

Since we are approaching the first usable release we should start to pour more effort into documenting and cleaning up the codebase. One of the simplest modules to start with is the CouchDB client.

TODO

  • Add integration tests for methods that interact with CouchDB. We can use the mockito library to mock the CouchDB API and provide assertions
  • Add documentation comments on every public type, function and method
  • Clean up any Clippy warnings. This can be tested by running cargo clippy in the couchdb directory

Code documentation and style

Code is not documented at all at the moment. We need to properly document exported functions and modules. Ideally, running make lint should return no issues.

We also need to properly configure CodeClimate and resolve the related issues.

Manage Personal Access Tokens in API

Personal Access Tokens are special kinds of OAuth tokens that do not expire and are generally used for long-lived systems that need to interact with the registry. Example use cases are CI/CD pipelines, automated consumers of artefacts (such as Kubernetes, which can use them as image pull secrets) and for systems that cannot initiate an OAuth 2.0 refresh token flow, such as Maven.

Since they are manually generated by users, we need to give users a way to manage them through the API.

This means adding a new Twirp API to the auth.v1beta1 package with the following methods:

  • TokensAPI/GeneratePersonalAccessToken
  • TokensAPI/ListPersonalAccessTokens
  • TokensAPI/GetPersonalAccessToken
  • TokensAPI/RevokePersonalAccessToken

Personal Access Tokens are not stored in plain text, so they are not retrievable after they are created. They have the following properties:

  • label: a human-readable label. Not unique, required.
  • expiration: the date when the token will expire. Optional, defaults to 100 years from the current date.
  • scopes: a list of scopes granted to the token. The OAuth client requesting the token must have them in its granted scopes list.

Tokens are immutable after they are created. They can be revoked, which will delete them from the database.
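The properties and rules above could translate into something like the Go sketch below. It is an illustration under assumptions: the issue only says tokens are not stored in plain text, so storing a SHA-256 hash is an assumed scheme, and PersonalAccessToken, ValidatePAT and NewPAT are hypothetical names rather than the Twirp-generated types.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"errors"
	"fmt"
	"time"
)

// PersonalAccessToken is the stored record; the secret itself is not kept.
type PersonalAccessToken struct {
	Label     string
	ExpiresAt time.Time
	Scopes    []string
	Hash      string // hex-encoded SHA-256 of the secret (assumed scheme)
}

func hashSecret(secret string) string {
	sum := sha256.Sum256([]byte(secret))
	return hex.EncodeToString(sum[:])
}

// Matches reports whether secret is the value this record was created for.
func (p PersonalAccessToken) Matches(secret string) bool {
	return subtle.ConstantTimeCompare([]byte(hashSecret(secret)), []byte(p.Hash)) == 1
}

// ValidatePAT checks that the label is set and that every requested scope
// is among the scopes granted to the requesting OAuth client.
func ValidatePAT(label string, scopes, granted []string) error {
	if label == "" {
		return errors.New("label is required")
	}
	allowed := make(map[string]bool, len(granted))
	for _, s := range granted {
		allowed[s] = true
	}
	for _, s := range scopes {
		if !allowed[s] {
			return fmt.Errorf("scope %q is not granted to the client", s)
		}
	}
	return nil
}

// NewPAT returns the secret (shown to the user once) and the stored record.
// A zero expiresAt defaults to 100 years from now.
func NewPAT(label string, expiresAt time.Time, scopes, granted []string, now time.Time) (string, PersonalAccessToken, error) {
	if err := ValidatePAT(label, scopes, granted); err != nil {
		return "", PersonalAccessToken{}, err
	}
	if expiresAt.IsZero() {
		expiresAt = now.AddDate(100, 0, 0)
	}
	raw := make([]byte, 32)
	if _, err := rand.Read(raw); err != nil {
		return "", PersonalAccessToken{}, err
	}
	secret := hex.EncodeToString(raw)
	pat := PersonalAccessToken{Label: label, ExpiresAt: expiresAt, Scopes: scopes, Hash: hashSecret(secret)}
	return secret, pat, nil
}

// demo creates a token with the default expiration and checks it.
func demo() (matches bool, years int, err error) {
	now := time.Now()
	secret, pat, err := NewPAT("ci pipeline", time.Time{}, []string{"read"}, []string{"read", "write"}, now)
	if err != nil {
		return false, 0, err
	}
	return pat.Matches(secret), pat.ExpiresAt.Year() - now.Year(), nil
}

func main() {
	fmt.Println(demo())
}
```

Because only the hash is stored, the Get and List methods can return the label, expiration and scopes but never the token value, which matches the behaviour described above.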

Cache Control

Implement proper Cache-Control header configuration so that files can be cached correctly by clients, CDNs and proxies.
