Data certification API

Le Wagon Data Science certification exam starter pack for the predictive API test.

💡 This challenge is completely independent of the other challenges. You do not need to complete any other challenge to work on this one.

Setup

Duplicate the repository for the API challenge

📝 Let's duplicate the repository of the API challenge.

Go to https://github.com/lewagon/data-certification-api:

  • Click on Use this template
  • Enter the repository name data-certification-api
  • Set it as Public
  • Click on Create repository from template
  • Click on Code
  • Select SSH
  • Copy the SSH URL of the repository (the format is git@github.com:YOUR_GITHUB_NICKNAME/data-certification-api.git)

Clone the repository for the API challenge

📝 Now we will clone your new repository.

Open your terminal and run the following commands:

👉 Replace YOUR_GITHUB_NICKNAME with your GitHub nickname and PASTE_REPOSITORY_URL_HERE with the SSH URL you just copied:

cd ~/code/YOUR_GITHUB_NICKNAME
git clone PASTE_REPOSITORY_URL_HERE
cd data-certification-api

Look around

💡 The content of the challenge should look like this:

tree
.
├── Dockerfile
├── MANIFEST.in
├── Makefile
├── README.md
├── api
│   ├── __init__.py
│   └── app.py
├── exampack
│   ├── __init__.py
│   ├── data
│   ├── models
│   ├── predictor.py
│   ├── tests
│   │   └── __init__.py
│   └── utils.py
├── notebooks
├── requirements.txt
├── scripts
│   └── exampack-run
└── setup.py

Open your favourite text editor and proceed with the challenge.

API challenge

📝 In this challenge, you are provided with a trained model saved as model.joblib. The goal is to create an API that will predict the popularity of a song based on its other features.

👉 You will only need to edit the code of the API in api/app.py 🚨

👉 The package versions listed in requirements.txt should work out of the box with the pipelined model saved in model.joblib

Install the required packages

The requirements.txt file lists the exact package versions required to load the pipelined model that we provide.

pip install -r requirements.txt

👉 If you encounter a version conflict while installing the packages 👈

In this case, you will need to create a new virtual environment in order to load the pipeline.

👉 Only execute these commands if you encounter an issue while installing the packages 🚨

pyenv install 3.8.6
pyenv virtualenv 3.8.6 certif
pyenv local certif
pip install -r requirements.txt
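
Because the pipeline only loads with the pinned versions, it can help to confirm what is actually installed. The following is a minimal sketch, assuming that requirements.txt pins packages with name==version lines. The helpers parse_pins and find_mismatches are hypothetical names and are not part of the starter pack.

```python
from importlib import metadata  # part of the standard library since Python 3.8
from typing import Dict


def parse_pins(text: str) -> Dict[str, str]:
    """Return {package: version} for every `name==version` line."""
    pins = {}
    for line in text.splitlines():
        # Drop trailing comments and environment markers, then whitespace.
        line = line.split("#", 1)[0].split(";", 1)[0].strip()
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins: Dict[str, str]) -> Dict[str, str]:
    """Return {package: installed version} for each package that differs from its pin.

    Packages that are not installed at all are reported as "not installed".
    """
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = "not installed"
        if installed != wanted:
            mismatches[name] = installed
    return mismatches
```

Calling find_mismatches(parse_pins(open("requirements.txt").read())) from the root of the repository returns an empty dictionary when every pin is satisfied.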

Run a uvicorn server

📝 Start a uvicorn server to make sure that the setup works correctly.

Run the server:

uvicorn api.app:app --reload

Open your browser at http://localhost:8000/

👉 You should see the response { "ok": true }

You will now be able to work on the content of the API while uvicorn automatically reloads your code as it changes.
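
The same check can be scripted instead of done in the browser. This is a minimal sketch using only the standard library, assuming the server is running locally on port 8000 as above. The function names is_ok_payload and check_root are hypothetical.

```python
import json
from urllib.request import urlopen


def is_ok_payload(body: bytes) -> bool:
    """Return True when the body decodes to the JSON object {"ok": true}."""
    try:
        return json.loads(body) == {"ok": True}
    except ValueError:  # the body is not valid JSON
        return False


def check_root(base_url: str = "http://localhost:8000/") -> bool:
    """Fetch the root endpoint of the running server and validate its body."""
    with urlopen(base_url, timeout=5) as response:
        return response.status == 200 and is_ok_payload(response.read())
```

With uvicorn running in another terminal, check_root() should return True. If the server is not reachable, urlopen raises an error instead.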

API specification

Predict the popularity of a Spotify song

GET /predict

Parameter         Type    Description
acousticness      float   whether the track is acoustic
danceability      float   describes how suitable a track is for dancing
duration_ms       int     duration of the track in milliseconds
energy            float   represents a perceptual measure of intensity and activity
explicit          int     whether the track has explicit lyrics
id                string  id for the track
instrumentalness  float   predicts whether a track contains no vocals
key               int     the key the track is in
liveness          float   detects the presence of an audience in the recording
loudness          float   the overall loudness of a track in decibels
mode              int     modality of a track
name              string  name of the track
release_date      string  release date
speechiness       float   detects the presence of spoken words in a track
tempo             float   overall estimated tempo of a track in beats per minute
valence           float   describes the musical positiveness conveyed by a track
artist            string  artist who performed the track
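
Query-string values arrive as text, so they have to be converted to the types listed above before they reach the pipeline. Your web framework may already do this conversion from the endpoint's type annotations. The sketch below only illustrates the mapping with plain Python, and PARAM_TYPES and coerce_params are hypothetical names.

```python
from typing import Callable, Dict, Mapping

# Types copied from the parameter table of GET /predict.
PARAM_TYPES: Dict[str, Callable[[str], object]] = {
    "acousticness": float,
    "danceability": float,
    "duration_ms": int,
    "energy": float,
    "explicit": int,
    "id": str,
    "instrumentalness": float,
    "key": int,
    "liveness": float,
    "loudness": float,
    "mode": int,
    "name": str,
    "release_date": str,
    "speechiness": float,
    "tempo": float,
    "valence": float,
    "artist": str,
}


def coerce_params(raw: Mapping[str, str]) -> Dict[str, object]:
    """Convert raw query-string values to the types listed in the table.

    Raises KeyError if a parameter is missing and ValueError if a value
    cannot be converted (for example explicit="yes").
    """
    return {name: cast(raw[name]) for name, cast in PARAM_TYPES.items()}
```

Note that release_date stays a string, as the table specifies, even when it looks like a number.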

Returns a dictionary with the artist, the name of the song, and the predicted popularity as an integer.

Example request:

/predict?acousticness=0.654&danceability=0.499&duration_ms=219827&energy=0.19&explicit=0&id=0B6BeEUd6UwFlbsHMQKjob&instrumentalness=0.00409&key=7&liveness=0.0898&loudness=-16.435&mode=1&name=Back%20in%20the%20Goodle%20Days&release_date=1971&speechiness=0.0454&tempo=149.46&valence=0.43&artist=John%20Hartford
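
Writing such a URL by hand is error-prone because of the escaped spaces. As a sketch, the example request above can be rebuilt with the standard library. The names EXAMPLE_PARAMS and build_predict_path are hypothetical.

```python
from typing import Mapping
from urllib.parse import quote, urlencode

# Values taken from the example request, kept as strings so that they are
# sent exactly as written.
EXAMPLE_PARAMS = {
    "acousticness": "0.654",
    "danceability": "0.499",
    "duration_ms": "219827",
    "energy": "0.19",
    "explicit": "0",
    "id": "0B6BeEUd6UwFlbsHMQKjob",
    "instrumentalness": "0.00409",
    "key": "7",
    "liveness": "0.0898",
    "loudness": "-16.435",
    "mode": "1",
    "name": "Back in the Goodle Days",
    "release_date": "1971",
    "speechiness": "0.0454",
    "tempo": "149.46",
    "valence": "0.43",
    "artist": "John Hartford",
}


def build_predict_path(params: Mapping[str, str]) -> str:
    """Return the path and query string for GET /predict.

    quote_via=quote encodes spaces as %20, as in the example request
    (the default encoder would use "+" instead).
    """
    return "/predict?" + urlencode(params, quote_via=quote)
```

Prefixing the result of build_predict_path(EXAMPLE_PARAMS) with http://localhost:8000 gives a URL that can be opened against the local uvicorn server.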

Example response:

{
  "artist": "John Hartford",
  "name": "Back in the Goodle Days",
  "popularity": 22
}
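
The response body can be assembled with a small helper. This is only a sketch: build_response is a hypothetical name, and rounding the prediction to the nearest integer is an assumption, since the specification only says that the popularity is returned as an integer.

```python
from typing import Dict, Union


def build_response(artist: str, name: str, prediction: float) -> Dict[str, Union[str, int]]:
    """Assemble the response dictionary described in the specification.

    Assumption: the raw prediction is rounded to the nearest integer.
    """
    return {
        "artist": artist,
        "name": name,
        "popularity": int(round(prediction)),
    }
```

If the model returns an array-like of predictions rather than a single number, pass its first element to the helper.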

👉 It is your turn: code the endpoint in api/app.py. To verify which data types the pipeline expects, have a look at the docstring of the create_pipeline method in exampack/trainer.py.

API in production

📝 Push your API to production on the hosting service of your choice.

👉 If you opt for Google Cloud Platform 👈

Once you have set GCP_PROJECT_ID in the Makefile, run the directives of the Makefile to build your containerized API, push it to Container Registry, and finally deploy it to Cloud Run.

Contributors

gmanchon, krokrob, ssaunier
