movie_intake's Introduction

moviedb.template

This template shows example usage of the Metis Machine platform for data ingestion and curation. The task runs each morning: it fetches a list of valid movie IDs from www.themoviedb.org (TMDb) and then retrieves additional data about each film (genres, release date, length, etc.).

Dependencies

  1. User must sign up and acquire a free API key from TMDb.
  2. Set the API key as an environment variable with the Skafos CLI.
  • Run from the terminal (in your project directory): skafos env MOVIE_DB --set <API KEY>
  3. User must have git installed and a GitHub account created --> https://git-scm.com/
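
As a minimal sketch of how the task code might pick up the key set in the step above: this assumes the Skafos runtime exposes MOVIE_DB as an ordinary process environment variable. The helper name get_api_key is hypothetical and is not taken from the template's own modules.

```python
import os


def get_api_key(env=os.environ):
    """Return the TMDb API key stored in MOVIE_DB (hypothetical helper)."""
    key = env.get("MOVIE_DB", "").strip()
    if not key:
        # Fail early with a hint that points back to the dependency step.
        raise RuntimeError(
            "MOVIE_DB is not set; run: skafos env MOVIE_DB --set <API KEY>"
        )
    return key
```

Failing early here keeps a missing key from surfacing later as a less obvious HTTP authorization error.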

Project Structure

  • movies
    • __init__.py
    • constants.py
    • logger.py
    • movie_fetch.py
    • movie_info.py
  • metis.config.yml
  • environment.yml
  • README.md
  • main.py

Flow

  • movie_fetch.py and movie_info.py contain the classes that handle the ingestion using TMDb API calls. Any methods can be expanded to retrieve more or less data as desired.
  • main.py is the primary driver for this task. At the end of a run, a list of valid movie IDs and their associated information is written to a project keyspace using the Skafos Data Engine.
  • metis.config.yml lets the user configure project-specific or runtime-specific requirements (schedule, resources, run count, etc.). See more information here --> https://metismachine.readme.io/docs/deploying-tasks.
  • Once the user is ready to fire it off (after following the dependency steps above):
    • CREATE: a GitHub repository and attach it to the Skafos app here --> https://github.com/apps/skafos
    • OPEN: one of the files and make some sort of change (add a comment, add new functionality, change config file, etc)
    • From Project Directory RUN:
      $ git init
      $ git add .
      $ git commit -am "<message>"
      $ git remote add origin git@github.com:<organization>/<repository-name>.git
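
The two ingestion steps described in this section (collect valid movie IDs, then look up details for each one) can be sketched roughly as follows. This is not the code in movie_fetch.py or movie_info.py. It assumes the ID list arrives as JSON lines with "id" and "popularity" fields, that the popularity threshold is inclusive, and that details are served from the TMDb v3 endpoint /movie/{id} with an api_key query parameter. The function names are hypothetical.

```python
import json
from urllib.parse import urlencode

# Assumption: root of the TMDb v3 REST API.
TMDB_API_ROOT = "https://api.themoviedb.org/3"


def parse_id_export(lines, min_popularity=15.0):
    """Yield movie IDs from JSON-lines records at or above min_popularity."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the export
        record = json.loads(line)
        if record.get("popularity", 0.0) >= min_popularity:
            yield record["id"]


def movie_details_url(movie_id, api_key):
    """Build the URL for one movie's detail record (assumed endpoint shape)."""
    query = urlencode({"api_key": api_key})
    return f"{TMDB_API_ROOT}/movie/{movie_id}?{query}"
```

Keeping the URL construction separate from the HTTP call makes the filtering and addressing logic easy to check without network access.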
    

Options

These environment variables can be set from the terminal in the project directory using the Skafos CLI. Only MOVIE_DB is required; the others are optional. See below for details.

required

  • MOVIE_DB: API key from TMDb. See above.

optional

  • POPULARITY, default=15: Lower threshold on popularity score for a particular movie. Note: most films fall above 15.
  • BATCH_SIZE, default=10: Number of rows written to the database at a time.
  • BACKFILLED_DAYS: Number of consecutive days in the past for which to fetch movie data.
  • FILE_DATE, format=%Y-%m-%d: Single date for which to fetch movie data.
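
One way the optional variables above could be resolved at runtime is sketched below. The defaults for POPULARITY and BATCH_SIZE come from this list. Everything else is an assumption made for illustration: that FILE_DATE takes precedence over BACKFILLED_DAYS, that a backfill covers the days before today, and that today is used when neither is set. The names resolve_options and batches are hypothetical.

```python
import os
from datetime import date, datetime, timedelta


def resolve_options(env=os.environ, today=None):
    """Read the optional settings and work out which dates to fetch."""
    today = today or date.today()
    popularity = float(env.get("POPULARITY", 15))
    batch_size = int(env.get("BATCH_SIZE", 10))

    if "FILE_DATE" in env:
        # Assumption: a single explicit date wins over a backfill window.
        dates = [datetime.strptime(env["FILE_DATE"], "%Y-%m-%d").date()]
    elif "BACKFILLED_DAYS" in env:
        # Assumption: the window is the N days immediately before today.
        n = int(env["BACKFILLED_DAYS"])
        dates = [today - timedelta(days=i) for i in range(1, n + 1)]
    else:
        dates = [today]

    return {"popularity": popularity, "batch_size": batch_size, "dates": dates}


def batches(rows, size):
    """Split a list of rows into consecutive chunks of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]
```

For example, with BATCH_SIZE left at its default, 25 rows would be written as chunks of 10, 10, and 5.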

Deployment

Once the steps above are completed, the user has deployed their movie data ingester. To check the status of the build or job, run skafos logs --tail to see real-time updates.
