ct-diag-server

ct-diag-server is an HTTP server written in Go for storing and retrieving Diagnosis Keys, as defined in Apple/Google's draft specification of their Exposure Notification framework. It aims to respect the privacy of its users and store only the bare minimum of data needed for anonymous exposure notifications.

In anticipation of the general release of Apple and Google's native APIs (planned for May 2020) to assist health organizations with contact tracing, this application provides a reference implementation for the framework's server component: a central repository for submitting Diagnosis Keys after a positive test, and retrieving a collection of all previously submitted Diagnosis Keys, to be used on the device for offline key matching.

ℹ️ The terminology and usage corresponds with v1.2 of the specification, as found here and here.

⚠️ Apple/Google released sample code on May 4, which clarifies some terminology and states best practices for apps and server implementations. Check out the issue tracker for an up to date overview of the ongoing work as this project is being updated accordingly.

👉 Are you an app developer or working for a government and/or health authority looking to implement this server? Please contact me if you have questions, or open an issue.

Goals

  • Privacy by design: Doesn't store or log any personally identifiable information.
  • Built for high workloads and heavy use:
    • Aims to have a small memory footprint.
    • Minimal data transfer: Diagnosis Keys are uploaded/downloaded as bytestreams, easily cachable by CDNs or upstream (government) proxy services.
    • Ships with a Dockerfile, for easy deployment as a workload on a wide range of hosting platforms.
  • Security: relies on Go's standard library where possible, and has minimal vendor dependencies.
  • Solid test coverage, for easy auditing and review.
  • Permissive license, easily forkable for other developers.

Features

  • HTTP server for storing and retrieving Diagnosis Keys. Uses bytestreams for sending and receiving as little data as possible over the wire: 21 bytes per Diagnosis Key (16 bytes for the TemporaryExposureKey, 4 bytes for the RollingStartNumber and 1 byte for the TransmissionRiskLevel).
  • Ships with PostgreSQL adapter for storage of Diagnosis Keys, but can easily be forked for different adapters.
  • Caching interface, with in-memory implementation.
  • Cursor based offsetting for listing Diagnosis Keys, with support for byte ranges and cache control headers.

API reference

💡 Check out the OpenAPI reference or import openapi.yaml in a compatible client for exploring the API and creating client code stubs. Also check out the example client code.

Listing Diagnosis Keys

To be used for fetching a list of Diagnosis Keys. A typical client is either a mobile device or the intermediate platform/server of an app developer, for manual/custom distribution of the payload to clients. In either case, the keyset can be regarded as public; it doesn't contain PII.

Request

GET /diagnosis-keys

The endpoint supports byte range requests as defined in RFC 7233. The HEAD method may be used to obtain Last-Modified and Content-Length headers for cache control purposes.
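Because every Diagnosis Key is exactly 21 bytes, a byte range can be aligned to key boundaries. The following is a minimal sketch of how a client might build such a request; `keyRange`, `newRangeRequest` and the `https://example.com` base URL are hypothetical names used for illustration, not part of this project.

```go
package main

import (
	"fmt"
	"net/http"
)

// keySize is the fixed wire size of one Diagnosis Key:
// 16-byte key + 4-byte RollingStartNumber + 1-byte TransmissionRiskLevel.
const keySize = 21

// keyRange returns an RFC 7233 Range header value that covers `count` keys,
// starting at the zero-based key index `first`. It assumes count > 0.
func keyRange(first, count int) string {
	start := first * keySize
	end := start + count*keySize - 1 // byte ranges are inclusive
	return fmt.Sprintf("bytes=%d-%d", start, end)
}

// newRangeRequest builds a GET request for a slice of the keyset.
// baseURL is a hypothetical deployment URL.
func newRangeRequest(baseURL string, first, count int) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, baseURL+"/diagnosis-keys", nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Range", keyRange(first, count))
	return req, nil
}

func main() {
	req, err := newRangeRequest("https://example.com", 0, 2)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL, req.Header.Get("Range"))
}
```

A client sending this request should be prepared for either 206 Partial Content or a plain 200 OK, since an intermediate cache may not honor the range.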

A query parameter (after) allows clients to only fetch keys that haven't been handled on the device yet, to minimize redundant network traffic and parsing time. Pass the last known/handled key (hexadecimal encoding) to retrieve only new keys uploaded after the given key.

Query parameters

| Name | Description |
|---|---|
| after | Used for listing diagnosis keys uploaded after the given key. Format: hexadecimal encoding of a Temporary Exposure Key. Example: a7752b99be501c9c9e893b213ad82842. (Optional) |
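As an illustration of the `after` parameter, here is a hedged sketch of building the request URL from the last handled key. The helper `listURL` and the base URL are assumptions made for this example; only the path and the hexadecimal encoding come from the description above.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"net/url"
)

// listURL returns the URL for listing Diagnosis Keys. If lastKey is non-nil,
// it is hex encoded and passed as the `after` cursor.
func listURL(baseURL string, lastKey []byte) (string, error) {
	u, err := url.Parse(baseURL)
	if err != nil {
		return "", err
	}
	u.Path = "/diagnosis-keys"
	if lastKey != nil {
		// A Temporary Exposure Key is 16 bytes, i.e. 32 hex characters.
		if len(lastKey) != 16 {
			return "", fmt.Errorf("temporary exposure key must be 16 bytes, got %d", len(lastKey))
		}
		q := u.Query()
		q.Set("after", hex.EncodeToString(lastKey))
		u.RawQuery = q.Encode()
	}
	return u.String(), nil
}

func main() {
	// The example key from the table above.
	lastKey, err := hex.DecodeString("a7752b99be501c9c9e893b213ad82842")
	if err != nil {
		panic(err)
	}
	u, err := listURL("https://example.com", lastKey)
	if err != nil {
		panic(err)
	}
	fmt.Println(u)
}
```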

Response

A 200 OK response should be expected for normal requests (non-empty and empty), and 206 Partial Content for responses to byte range requests. In case of an empty reply, a Content-Length: 0 header is written.

A 500 Internal Server Error response indicates server failure, and warrants a retry.

Response headers

| Name | Description |
|---|---|
| Content-Type: application/octet-stream | The HTTP response is a bytestream of Diagnosis Keys (see below). |
| Content-Length: {n * 21} | Content length is n * 21, where n is the number of returned Diagnosis Keys (byte range requests may yield different lengths). |
| Cache-Control: public, max-age=0, s-maxage=600 | For (upstream) caching purposes, this header may be used. |

Response body

The HTTP response body is a bytestream of Diagnosis Keys. A Diagnosis Key is 21 bytes and consists of three parts: the TemporaryExposureKey itself (16 bytes), the RollingStartNumber (4 bytes, big endian) and the TransmissionRiskLevel (1 byte). Because the number of bytes per Diagnosis Key is fixed, there is no delimiter.
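The fixed 21-byte layout makes the stream straightforward to parse. Below is a minimal client-side sketch, assuming a hypothetical `DiagnosisKey` struct and helper names; it is not the project's own parsing code.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

const keySize = 21 // 16 + 4 + 1 bytes

// DiagnosisKey is a hypothetical in-memory form of one 21-byte record.
type DiagnosisKey struct {
	TemporaryExposureKey  [16]byte
	RollingStartNumber    uint32
	TransmissionRiskLevel byte
}

// readKeys reads fixed-size Diagnosis Keys from r until the stream ends.
// A stream that ends in the middle of a key is reported as an error.
func readKeys(r io.Reader) ([]DiagnosisKey, error) {
	var keys []DiagnosisKey
	buf := make([]byte, keySize)
	for {
		_, err := io.ReadFull(r, buf)
		if errors.Is(err, io.EOF) {
			return keys, nil // clean end of stream, no partial key
		}
		if err != nil {
			// Includes io.ErrUnexpectedEOF for an incomplete key.
			return nil, fmt.Errorf("reading diagnosis key: %w", err)
		}
		var k DiagnosisKey
		copy(k.TemporaryExposureKey[:], buf[:16])
		k.RollingStartNumber = binary.BigEndian.Uint32(buf[16:20])
		k.TransmissionRiskLevel = buf[20]
		keys = append(keys, k)
	}
}

// decodeKeys is a convenience wrapper for an in-memory payload.
func decodeKeys(b []byte) ([]DiagnosisKey, error) {
	return readKeys(bytes.NewReader(b))
}

func main() {
	raw := make([]byte, 2*keySize) // two all-zero keys as placeholder data
	raw[20] = 3                    // TransmissionRiskLevel of the first key
	keys, err := decodeKeys(raw)
	if err != nil {
		panic(err)
	}
	for _, k := range keys {
		fmt.Printf("%x start=%d risk=%d\n", k.TemporaryExposureKey, k.RollingStartNumber, k.TransmissionRiskLevel)
	}
}
```

In a real client, `readKeys` could be passed the HTTP response body directly, so the keyset never has to be held in memory as one byte slice.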

Uploading Diagnosis Keys

To be used for uploading a set of Diagnosis Keys by a mobile client device. Note: It's still undecided whether this server should authenticate requests. Given the wide range of per-country use cases and processes, shielding this endpoint against unauthorized access is currently delegated to the server operator, who can provide an upstream proxy, e.g. one tailored to handle authorization for health personnel.

Request

POST /diagnosis-keys

No request headers (e.g. Content-Length and Content-Type) are required.

Body

The HTTP request body should be a bytestream of at least 1 and at most n Diagnosis Keys, where n is the maximum upload batch size configured on the server (default: 14). A Diagnosis Key consists of three parts: the TemporaryExposureKey itself (16 bytes), the RollingStartNumber (4 bytes, big endian) and the TransmissionRiskLevel (1 byte). Because the number of bytes per Diagnosis Key is fixed, there is no delimiter.

An unexpected end of the bytestream (e.g. incomplete key) results in a 400 Bad Request response.

Duplicate keys are silently ignored.
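To illustrate the request body format, here is a hedged sketch of encoding a batch of keys on the client. The `DiagnosisKey` struct and `encodeKeys` are hypothetical names, and the batch limit of 14 is the server default mentioned above; a given deployment may be configured differently.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	keySize            = 21 // 16 + 4 + 1 bytes
	maxUploadBatchSize = 14 // server default; assumed here, configurable server side
)

// DiagnosisKey is a hypothetical in-memory form of one 21-byte record.
type DiagnosisKey struct {
	TemporaryExposureKey  [16]byte
	RollingStartNumber    uint32
	TransmissionRiskLevel byte
}

// encodeKeys serializes keys back to back, without a delimiter.
func encodeKeys(keys []DiagnosisKey) ([]byte, error) {
	if len(keys) < 1 || len(keys) > maxUploadBatchSize {
		return nil, fmt.Errorf("batch must contain 1 to %d keys, got %d", maxUploadBatchSize, len(keys))
	}
	buf := make([]byte, 0, len(keys)*keySize)
	for _, k := range keys {
		buf = append(buf, k.TemporaryExposureKey[:]...)
		var rsn [4]byte
		binary.BigEndian.PutUint32(rsn[:], k.RollingStartNumber) // big endian, per the format above
		buf = append(buf, rsn[:]...)
		buf = append(buf, k.TransmissionRiskLevel)
	}
	return buf, nil
}

func main() {
	body, err := encodeKeys([]DiagnosisKey{
		{RollingStartNumber: 2641680, TransmissionRiskLevel: 4}, // placeholder values
	})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d bytes: %x\n", len(body), body)
}
```

The resulting slice could then be sent as the body of a POST request to /diagnosis-keys, for example by wrapping it in `bytes.NewReader` and passing it to `http.Post`.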

Response

A 200 OK response with body OK should be expected on successful storage of the keyset in the database. A 400 Bad Request response is used for client errors. A 500 Internal Server Error response is used for server errors, and warrants a retry. Error reasons are written in a text/plain; charset=utf-8 response body.

Retrieving exposure configuration

To be used for fetching an ENExposureConfiguration object (see Apple's sample code article).

Request

GET /exposure-config

Response headers

| Name | Description |
|---|---|
| Content-Type: application/json | The response contains an object in JSON (see below). |

Response

A 200 OK response should be expected. A 500 Internal Server Error response indicates server failure, and warrants a retry.

Response body

The HTTP response body is an ENExposureConfiguration object, encoded in JSON.

Example:

{
  "minimumRiskScore": 0,
  "attenuationLevelValues": [1, 2, 3, 4, 5, 6, 7, 8],
  "attenuationWeight": 50,
  "daysSinceLastExposureLevelValues": [1, 2, 3, 4, 5, 6, 7, 8],
  "daysSinceLastExposureWeight": 50,
  "durationLevelValues": [1, 2, 3, 4, 5, 6, 7, 8],
  "durationWeight": 50,
  "transmissionRiskLevelValues": [1, 2, 3, 4, 5, 6, 7, 8],
  "transmissionRiskWeight": 50
}
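For clients written in Go, the example above could be decoded as follows. This is a sketch only: the `ExposureConfiguration` struct is a hypothetical type whose field names mirror the JSON keys shown, and the use of plain integers is an assumption based on the example values.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ExposureConfiguration mirrors the JSON keys of the example response.
// Integer types are assumed from the example values.
type ExposureConfiguration struct {
	MinimumRiskScore                 int   `json:"minimumRiskScore"`
	AttenuationLevelValues           []int `json:"attenuationLevelValues"`
	AttenuationWeight                int   `json:"attenuationWeight"`
	DaysSinceLastExposureLevelValues []int `json:"daysSinceLastExposureLevelValues"`
	DaysSinceLastExposureWeight      int   `json:"daysSinceLastExposureWeight"`
	DurationLevelValues              []int `json:"durationLevelValues"`
	DurationWeight                   int   `json:"durationWeight"`
	TransmissionRiskLevelValues      []int `json:"transmissionRiskLevelValues"`
	TransmissionRiskWeight           int   `json:"transmissionRiskWeight"`
}

// parseExposureConfig decodes a JSON response body.
func parseExposureConfig(data []byte) (ExposureConfiguration, error) {
	var cfg ExposureConfiguration
	if err := json.Unmarshal(data, &cfg); err != nil {
		return ExposureConfiguration{}, fmt.Errorf("decoding exposure config: %w", err)
	}
	return cfg, nil
}

func main() {
	body := []byte(`{
	  "minimumRiskScore": 0,
	  "attenuationLevelValues": [1, 2, 3, 4, 5, 6, 7, 8],
	  "attenuationWeight": 50
	}`)
	cfg, err := parseExposureConfig(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg.MinimumRiskScore, cfg.AttenuationLevelValues, cfg.AttenuationWeight)
}
```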

TODO

👉 See issue tracker.

Status

The project is currently under active development.

Contributors

David Stotijn, Martin van de Belt, Milo van der Linden, Peter Hellberg, Arian van Putten.

Acknowledgements

Thanks to the community of Code for NL (#corona-apps and #corona-ct-diag-server on Slack) for all the valuable feedback and discussions!

License

MIT


© 2020 David Stotijn — Twitter, Email

ct-diag-server's Issues

Maybe remove the httprouter dependency?

The README states:

Security: rely on Go's standard library where possible, minimal usage of vendor dependencies.

And it seems to me that net/http would work well for this project instead of relying on the third party httprouter package.

@dstotijn What do you think about removing the httprouter dependency?
(If you agree then I’d be happy to open a PR, just wanted to ask first)

Change offset behavior from date based `since` param to cursor based, using key data

As (implicitly) recommended in Apple's sample code, the server should support downloading Diagnosis Keys with a cursor based offset, e.g. the latest known Temporary Exposure Key data.

Note: This probably means the server shouldn't order by uploaded_at (not accurate and also may lead to race conditions when handling concurrent uploads), but introduce an auto incrementing column and order by that.

The in-memory cache will also have to be adapted. When given a key cursor, it should scan its buffer for the key and stream data starting from there (skipping, of course, the given cursor's key).

For the query parameter, the hex encoded string representation of the key data should be used. E.g. 9aa2837955831298863798daee32b999.
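The buffer scan described in this issue could look roughly like the sketch below. It assumes the cache holds the keyset as one contiguous byte slice of 21-byte records; `keysAfter` and `errCursorNotFound` are hypothetical names, and returning an error for an unknown cursor is an assumption, since the issue does not say how that case should be handled.

```go
package main

import (
	"bytes"
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

const (
	keySize = 21 // full Diagnosis Key record
	tekSize = 16 // Temporary Exposure Key part used as cursor
)

var errCursorNotFound = errors.New("cursor key not found")

// keysAfter returns the part of buf that follows the record whose
// Temporary Exposure Key equals the hex encoded cursor.
func keysAfter(buf []byte, cursorHex string) ([]byte, error) {
	cursor, err := hex.DecodeString(cursorHex)
	if err != nil || len(cursor) != tekSize {
		return nil, fmt.Errorf("invalid cursor %q", cursorHex)
	}
	// Step one fixed-size record at a time, so a match can only
	// occur at a key boundary and never inside another record.
	for off := 0; off+keySize <= len(buf); off += keySize {
		if bytes.Equal(buf[off:off+tekSize], cursor) {
			return buf[off+keySize:], nil // skip the cursor's own record
		}
	}
	return nil, errCursorNotFound
}

func main() {
	// Three placeholder keys whose first byte is 1, 2 and 3.
	buf := make([]byte, 3*keySize)
	buf[0], buf[keySize], buf[2*keySize] = 1, 2, 3

	cursor := "02" + strings.Repeat("0", 30)
	rest, err := keysAfter(buf, cursor)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(rest)/keySize, "key(s) after cursor")
}
```

A linear scan is the simplest option; for large keysets an index from key to offset might be worth considering, at the cost of extra memory.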

Add TLS support for the server

It's only a few extra lines of Go.

Even if this is not the app that will directly be serving the traffic; having mutual TLS authentication between all components cache <-> proxy <-> ct-diag-server <-> database is good practice.

Would also like to configure TLS verification for the postgres connection if possible.
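As a rough idea of what those extra lines might be, the sketch below serves HTTPS with Go's standard library. The helper `newTLSServer`, the `TLS_CERT_FILE` and `TLS_KEY_FILE` environment variables, and the port are all hypothetical. Mutual TLS would additionally need client certificate settings on the `tls.Config`, which this sketch does not cover.

```go
package main

import (
	"crypto/tls"
	"log"
	"net/http"
	"os"
)

// newTLSServer returns an HTTP server that refuses TLS versions below 1.2.
func newTLSServer(addr string, h http.Handler) *http.Server {
	return &http.Server{
		Addr:      addr,
		Handler:   h,
		TLSConfig: &tls.Config{MinVersion: tls.VersionTLS12},
	}
}

func main() {
	// Hypothetical environment variables pointing at PEM encoded files.
	certFile, keyFile := os.Getenv("TLS_CERT_FILE"), os.Getenv("TLS_KEY_FILE")
	if certFile == "" || keyFile == "" {
		log.Println("TLS_CERT_FILE and TLS_KEY_FILE not set, not starting")
		return
	}
	// http.NotFoundHandler stands in for the real API handler.
	srv := newTLSServer(":8443", http.NotFoundHandler())
	log.Fatal(srv.ListenAndServeTLS(certFile, keyFile))
}
```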

Data deletion after 14 days

It is also important to come up with a mechanism to invalidate any upstream caches / CDNs that are caching the tracing keys.

Add `since` query parameter

Add since query parameter to listDiagnosisKeys endpoint to allow offsetting the Diagnosis Keys from the database by created_at to minimize data sent over the wire.

  • Add created_at column.
  • Add since search support in diag, api and postgres packages.

Key integrity

Due to the primary key constraint on the diagnosis key, duplicate submissions will fail with a 500 Internal Server Error.

The question is whether feedback of duplicate submissions is wanted from a client point-of-view.

If not, the insert could fail silently or simply do an upsert instead (responsibility of the adapter).

Complexity will go up slightly if this feedback is wanted, considering:

  • that the storage adapter might not have all information at hand upon inserting (think a HA master-master cluster)
  • the format of the feedback itself (duplicate submissions within a batch).

Don't append to cache, only replace with repository data

When horizontally scaled (e.g. multiple containers running the server), the in-memory cache (which isn't distributed) might be stale. This is fine; cache will be eventually consistent after 5 minutes and clients will get any missing key data next time they make a request.

However, currently when we've stored a key set in the database, we append the new keys to the in-memory cache of the container handling the request. Because between the 5 minute cache refresh interval other containers might have handled uploads as well, there's the risk of a data gap in the cache, and the client won't notice this, especially if they'll use cursor based pagination.

To solve this for the in-memory cache implementation, we should simply not append to the in-memory cache. Cache will remain stale for a maximum of 5 minutes which is fine (see first paragraph), after which it will be set afresh from data from the repository (which is strongly consistent).
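The proposed behavior can be sketched as a cache that only offers a replace operation. The type and function names below (`memoryCache`, `refresh`, `load`) are hypothetical and do not reflect the project's actual cache interface.

```go
package main

import (
	"fmt"
	"sync"
)

// memoryCache holds the full keyset as one byte slice. It has no append
// operation: the contents can only be replaced as a whole.
type memoryCache struct {
	mu  sync.RWMutex
	buf []byte
}

// Set replaces the cached keyset with a copy of data.
func (c *memoryCache) Set(data []byte) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.buf = append([]byte(nil), data...)
}

// Get returns the cached keyset. Callers must not modify the result.
func (c *memoryCache) Get() []byte {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.buf
}

// refresh reloads the whole keyset from the repository (represented here
// by the load function) and swaps it into the cache.
func refresh(c *memoryCache, load func() ([]byte, error)) error {
	data, err := load()
	if err != nil {
		return fmt.Errorf("loading keys from repository: %w", err)
	}
	c.Set(data)
	return nil
}

func main() {
	c := &memoryCache{}
	// A stand-in for the repository query.
	load := func() ([]byte, error) { return make([]byte, 42), nil }
	if err := refresh(c, load); err != nil {
		panic(err)
	}
	fmt.Println(len(c.Get()), "bytes cached")
}
```

In a running server, `refresh` would be called on the 5 minute interval mentioned above, for instance from a goroutine driven by a `time.Ticker`.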

Upload limit is hardcoded

The 14 days in the spec is phrased as an example and not a hard limit. Health authorities may want to change this based on epidemiological characteristics. So it’s probably better to move this value to a config:

MaxUploadBatchSize = 14
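One way to make this value configurable is to read it from the environment with a fallback to the current default. The sketch below assumes a hypothetical `MAX_UPLOAD_BATCH_SIZE` variable; the name and the choice of environment variables over flags or a config file are illustrative only.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

const defaultMaxUploadBatchSize = 14

// maxUploadBatchSize reads the batch size via getenv (os.Getenv in
// production, a stub in tests) and falls back to the default when unset.
func maxUploadBatchSize(getenv func(string) string) (int, error) {
	v := getenv("MAX_UPLOAD_BATCH_SIZE") // hypothetical variable name
	if v == "" {
		return defaultMaxUploadBatchSize, nil
	}
	n, err := strconv.Atoi(v)
	if err != nil || n < 1 {
		return 0, fmt.Errorf("invalid MAX_UPLOAD_BATCH_SIZE %q", v)
	}
	return n, nil
}

func main() {
	n, err := maxUploadBatchSize(os.Getenv)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("max upload batch size:", n)
}
```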

Write benchmarks

Write benchmarks for api and diag packages, and research options to benchmark postgres.

create-db.sh is missing and should not be used?

I am assuming that create-db.sh is used to generate the database template, but it is missing from the source. However, when in a kubernetes cluster, you may not have access to the database server shell; running scripts on the postgres container should be avoided in my opinion. Is it an option to let golang detect the presence of the database and run migrations?
