pa-drs-crypt4gh-poc's Introduction

DRS-filer

Synopsis

Microservice implementing the Global Alliance for Genomics and Health (GA4GH) Data Repository Service (DRS) API specification.

Description

Building

To build the image locally and push it to Docker Hub, complete the following steps (TODO: automate this):

docker login
docker build . -t pacificanalytics/pa-drs-crypt4gh-poc:x.x.x
docker push pacificanalytics/pa-drs-crypt4gh-poc:x.x.x

Deployment

Preliminary setup

(1) Create a Python environment (as a virtual environment, via Conda, etc.) and install the server application and its dependencies. From the root of the repo, run:

pip install -r requirements.txt
pip install -e . -v

(2) Create a public and a private key for the server:

crypt4gh-keygen --sk server-sk.key --pk server-pk.key

Via Docker

  1. Build the container
docker build . -t crypt
  2. Run the container
docker run \
  -e MONGO_DBNAME=drsstore \
  -e MONGO_HOST=localhost \
  -e MONGO_USERNAME=admin \
  -e MONGO_PASSWORD=password123 \
  -e ACCESS_KEY=123 \
  -e SECRET_KEY=456 \
  -e STORAGE_BUCKET=mybucket \
  -e STORAGE_SECURE=false \
  crypt

Via Kubernetes

To deploy into an existing Kubernetes cluster, the cluster must already have certain dependencies installed, such as the NGINX ingress, cert-manager, and the MongoDB Community Operator.

  1. Ensure the Docker registry secret exists (you can create the token via Docker Hub):
kubectl create secret docker-registry dockerhub --docker-username=%username% --docker-password=%token%
  2. Ensure the application secret exists, holding the MongoDB password, the storage settings, and the Crypt4GH keys (replace with real values):
kubectl create secret generic pa-drs-crypt4gh-poc-secrets \
  --from-literal=MONGOPASS=password123 \
  --from-literal=ACCESS_KEY=access-key \
  --from-literal=SECRET_KEY=secret-key \
  --from-literal=STORAGE_BUCKET=staging-pa-drs-crypt4gh-poc \
  --from-literal=STORAGE_HOST=s3.ap-southeast-2.amazonaws.com \
  --from-file=PUB_KEY=key.pub \
  --from-file=SEC_KEY=key
  3. Install the Helm chart:
helm upgrade -i crypt4gh-poc deployment

Via docker-compose

  1. Run the following (edit to taste):
export MONGO_URI="mongodb://admin:password@db:27017/drsStore?authSource=admin"
export SEC_KEY="-----BEGIN CRYPT4GH PRIVATE KEY-----
YzRnaC12MQAEbm9uZQAEbm9uZQAg5eYgf1QUl1cFyquP6OgMz2faF2uSc4s8OXf0L4MLRQM=
-----END CRYPT4GH PRIVATE KEY-----"
export PUB_KEY="-----BEGIN CRYPT4GH PUBLIC KEY-----
dq/9iq2WMYpYQqnxVpfd0pwRp2PToAccVWldr+kynCI=
-----END CRYPT4GH PUBLIC KEY-----"
export STORAGE_HOST=https://s3.ap-southeast-2.amazonaws.com
export STORAGE_BUCKET=staging-pa-drs-crypt4gh-poc
export STORAGE_SECURE=true
export ACCESS_KEY=key
export SECRET_KEY=secret

docker-compose up --build
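
If the MongoDB password contains characters such as @ or /, it has to be percent-encoded before it goes into MONGO_URI. The following is a minimal sketch of assembling the URI; build_mongo_uri is a hypothetical helper, not part of this repository, and the default authSource of admin simply mirrors the example above.

```python
from urllib.parse import quote_plus


def build_mongo_uri(username, password, host, port, dbname, auth_source="admin"):
    """Assemble a MongoDB connection URI, percent-encoding the credentials.

    Hypothetical helper for illustration; the server itself only reads MONGO_URI.
    """
    user = quote_plus(username)
    pwd = quote_plus(password)
    return f"mongodb://{user}:{pwd}@{host}:{port}/{dbname}?authSource={auth_source}"


# Reproduces the example value exported above.
uri = build_mongo_uri("admin", "password", "db", 27017, "drsStore")
```

The resulting string can then be exported as MONGO_URI before running docker-compose.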

Troubleshooting

Logs are sent to Elasticsearch via Fluentd. Failing that, you can access the logs directly using kubectl by completing the following steps:

  1. Install awscli and ensure it is authenticated:
aws configure
  2. Install kubectl and run the AWS command that configures kubectl to work with EKS:
aws eks update-kubeconfig --region ap-southeast-2 --name staging-pa-drs-kubernetes
  3. View the pods currently running in the cluster:
kubectl get pods
  4. View the logs for the pod you are interested in (-f follows the log, similar to tail -f). In our case that would be:
kubectl logs -f pa-drs-crypt4gh-poc-684d56f666-c6z9c

Local development

The source code repository is mounted as a volume inside the Docker container for the server. That means that you can edit the code in this repo, and the server will automatically restart when any changes are detected. There is no need to rebuild or even restart the Docker container.

To run the unit test suite, first activate the Python environment set up above, then run the following command:

pytest tests

If pytest cannot be found, install the testing requirements via pip install -r requirements-test.txt.

The unit test suite does not require the server to be running.

Running the integration tests

The server comes with a number of basic integration or "smoke" tests that exercise end-to-end functionality of the entire server stack. The goal is not to test the server exhaustively or to make manual QA superfluous, but rather to detect, early on, simple breakage that the unit tests cannot catch.

Currently the following scenarios are tested:

  1. Querying the service-info endpoint and comparing the output with what is expected.

To run the integration test suite, bring up the server as described above, and then run (from within the development environment, and with the same environment variables set as were used to bring up the server):

pytest integration

The integration test suite is also run automatically under GitHub Actions for every PR.
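
The service-info scenario listed above could be sketched roughly as follows. This is not the test shipped in the integration directory: the helper names are hypothetical, the host and port are assumptions about a local deployment, and only the /ga4gh/drs/v1/service-info path is taken from the DRS specification.

```python
import json
from urllib.request import urlopen

# Host and port are assumptions; the path follows the GA4GH DRS specification.
SERVICE_INFO_URL = "http://localhost:8080/ga4gh/drs/v1/service-info"


def fetch_json(url, timeout=10):
    """GET a URL and decode the JSON response body."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def mismatches(actual, expected):
    """Return {key: (expected, actual)} for each expected key whose value differs."""
    return {
        key: (value, actual.get(key))
        for key, value in expected.items()
        if actual.get(key) != value
    }


def check_service_info(expected, fetch=fetch_json, url=SERVICE_INFO_URL):
    """Fail with a readable message if the endpoint deviates from `expected`."""
    diff = mismatches(fetch(url), expected)
    assert not diff, f"service-info mismatch: {diff}"
```

Passing the fetch function as a parameter keeps the comparison logic testable without a running server; only keys present in the expected document are compared, so extra fields returned by the server do not cause a failure.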

Configuring the server

The file config.yaml in the drs_filer directory can be used to set various configuration options for the server. The Crypt4GH-related options and some others can also be passed in as environment variables. Currently the following options are supported:

  • STORAGE_HOST: The FQDN of the storage host (e.g. https://s3.eu-west-1.amazonaws.com for AWS or http://localhost:9000 for Minio).
  • STORAGE_BUCKET: The name of the bucket.
  • STORAGE_SECURE: Whether or not to check the TLS certificate of the storage host.

Note that the server must have the following environment variables set as well, as detailed above:

  • ACCESS_KEY: Access key ID providing access to the storage bucket.
  • SECRET_KEY: Secret access key providing access to the storage bucket.
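
A minimal sketch of how these variables could be read and validated at startup follows. StorageSettings and load_storage_settings are hypothetical names, not part of the server code, and both the accepted boolean spellings and the default of true for STORAGE_SECURE are assumptions.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageSettings:
    host: str
    bucket: str
    secure: bool
    access_key: str
    secret_key: str


def parse_bool(value):
    """Interpret common textual spellings of a boolean (assumed, not from the server)."""
    text = value.strip().lower()
    if text in {"1", "true", "yes", "on"}:
        return True
    if text in {"0", "false", "no", "off"}:
        return False
    raise ValueError(f"not a boolean: {value!r}")


def load_storage_settings(env=None):
    """Collect the storage options from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    required = ["STORAGE_HOST", "STORAGE_BUCKET", "ACCESS_KEY", "SECRET_KEY"]
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return StorageSettings(
        host=env["STORAGE_HOST"],
        bucket=env["STORAGE_BUCKET"],
        # Assumption: certificate checking stays on unless explicitly disabled.
        secure=parse_bool(env.get("STORAGE_SECURE", "true")),
        access_key=env["ACCESS_KEY"],
        secret_key=env["SECRET_KEY"],
    )
```

Failing fast on a missing variable gives a clearer error than a later authentication failure against the storage host.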

Contributing

This project is a community effort and lives off your contributions, be it in the form of bug reports, feature requests, discussions, or fixes and other code changes. Please refer to our organization's contributing guidelines if you are interested in contributing. Please mind the code of conduct for all interactions with the community.

Versioning

The project adopts the semantic versioning scheme. The service is currently in beta, so the API may change without further notice.

License

This project is covered by the Apache License 2.0, which is also shipped with this repository.

Contact

The project is a collaborative effort under the umbrella of ELIXIR Cloud & AAI. Follow the link to get in touch with us via chat or email. Please mention the name of this service for any inquiry, proposal, question etc.

pa-drs-crypt4gh-poc's People

Contributors

jvkersch, sarthakgupta072, kenj1h, uniqueg, akash2237778, alexandersenf, lvarin, chinvib66, jamesla, dependabot[bot]

pa-drs-crypt4gh-poc's Issues

Replace Minio client library with Boto3

Testing against a staging bucket reveals that Minio requires permissions (beyond s3:PutObject etc.) that it currently doesn't have to upload a file, while Boto3 plays nicely with the provided permissions. Given that we've moved away from Minio on the server side, it makes sense to also use the standard client library, Boto3, on the client side.

Replacing minio by boto3 should only touch this file: https://github.com/PacificAnalytics/pa-DRS-Crypt4GH-PoC/blob/dev/uploader/uploader/store.py.
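
A rough sketch of what a Boto3-based upload could look like is given below. It does not reproduce the contents of store.py: the function names and parameters are hypothetical, and boto3 is imported lazily so that the module can be loaded without it installed. Note that endpoint_url must include the scheme (for example https://).

```python
import os


def make_s3_client(host, access_key, secret_key, secure=True):
    """Create an S3 client for the given endpoint (hypothetical helper)."""
    import boto3  # third-party dependency, imported only when a client is needed

    return boto3.client(
        "s3",
        endpoint_url=host,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
        verify=secure,  # False disables TLS certificate verification
    )


def upload_file(client, path, bucket, key=None):
    """Upload a local file and return the object key that was used."""
    key = key or os.path.basename(path)
    client.upload_file(path, bucket, key)
    return key
```

Taking the client as an argument means the upload logic can be exercised in unit tests with a stand-in object rather than a real bucket.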

Prepare version of code for open-sourcing

This ticket is to prepare a version of the code for open-sourcing, in time for the GA4GH plenary meeting in SF (19-22 September, https://www.ga4gh.org/event/11th-plenary/). After discussion on Slack, the idea is to create a PR against the original drs-filer repo with the principal changes that were made to this repo to support Crypt4GH. In order to do so, the following need to happen:

  • Merge the deployment PR #20
  • Open source the drs client repo (the one that we forked). We made 1 small change here (passing in custom headers), but since it is not part of the PoC repo, it would be better to just make it available as a public repo. It can then be added to the requirements.
  • Prepare a separate open source repo with the client code (currently in the uploader repo). Proposed name: pa-DRS-Crypt4GH-Client
  • Move the code that is common between server and client to a separate Python package, hosted out of this repo. The idea being that the server should be entirely standalone, and should not import any client code, as is now the case.
  • Create a PR with the server code against the original drs-filer repo.

GitHub repository token for pa-DRS-Crypt4GH-Downloader repo

To set up an integration test for uploading and downloading file data (started in #13), the integration test GitHub action needs to be able to install the pa-DRS-Crypt4GH-Downloader repository. To do this, we need a personal access token for that repo.

Concretely:

  1. We need to create a personal access token for the https://github.com/PacificAnalytics/pa-DRS-Crypt4GH-Downloader repository, with download privileges. This could be a fine-grained access token (accessible under Settings > Developer Settings > Personal Access Tokens > Fine Grained Tokens).
  2. The token's name should be PA_DRS_CRYPT4GH_DOWNLOADER and its expiry date set sometime in the distant future. The token should give read-only access to the pa-DRS-Crypt4GH-Downloader repo only.
  3. The token should be added as a secret in this repository.

[Screenshot: creating a token]

Remove foca 0.6.0

#1 upgraded foca to 0.12.0 but relied on an ugly hack to do so (see below). Removing this hack is not difficult, but requires replacing config['FOCA'] everywhere by config.foca.

# FIXME: The line below is a hack, in order to support code written for
# foca 0.6.0. To address it, update the code to refer to the foca
# object via the syntax on the RHS.
app.app.config['FOCA'] = app.app.config.foca
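
The replacement described above is mechanical, so it could be scripted. The sketch below is one possible approach using a regular expression; modernize is a hypothetical helper, and it assumes the old accessor is always written literally as config['FOCA'] or config["FOCA"].

```python
import re

# Matches config['FOCA'] and config["FOCA"]; the quote style is captured and
# must be the same on both sides.
_LEGACY_ACCESS = re.compile(r"""config\[(['"])FOCA\1\]""")


def modernize(source):
    """Rewrite foca 0.6.0 style dictionary access to attribute access."""
    return _LEGACY_ACCESS.sub("config.foca", source)
```

The hack line itself should be deleted rather than rewritten, since applying the substitution to it would yield a self-assignment.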

Bundle client functionality in single app/Docker image

The client functionality currently requires installing two Python applications (the uploader, in this repo, and the DRS client) into a Python environment, and both have different calling conventions. This makes working with the client side difficult.

This proposal is to

  • Build a small client shim that will expose a uniform interface to both the downloader and the uploader (drs-client get and drs-client upload), calling through to the existing functionality.
  • Provide a Docker image that will make this shim available without having to prepare a Python environment.

In the end, the goal is to provide a one-command install experience, and a one-command user experience.
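
As an illustration of the proposed uniform interface, the following sketch wires up the two subcommands with argparse. Only the command names drs-client get and drs-client upload come from the proposal; the positional arguments and the handler mechanism are assumptions, and the calls into the existing downloader and uploader are left to the handlers.

```python
import argparse


def build_parser():
    """Build the command-line parser for the hypothetical drs-client shim."""
    parser = argparse.ArgumentParser(prog="drs-client")
    sub = parser.add_subparsers(dest="command", required=True)

    get = sub.add_parser("get", help="download an object from the DRS server")
    get.add_argument("object_id", help="identifier of the object to fetch (assumed argument)")

    upload = sub.add_parser("upload", help="upload a file and register it")
    upload.add_argument("path", help="local file to upload (assumed argument)")
    return parser


def main(argv=None, handlers=None):
    """Parse argv and dispatch to the handler registered for the subcommand."""
    args = build_parser().parse_args(argv)
    handler = (handlers or {}).get(args.command)
    if handler is None:
        raise SystemExit(f"no handler registered for {args.command!r}")
    return handler(args)
```

Each handler would call through to the existing functionality, so the shim stays a thin layer over the two applications.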
