
motion's People

Contributors

dependabot[bot], github-actions[bot], nicholasarvin, rguptar, shreyashankar

Forkers

rguptar ai-jie01

motion's Issues

Spec out train-once and run views

There are 2 views I am thinking of:

  • train-once
  • run

For each view, describe the operations a user can run in the UI

--

Explore

Goal: enable developers to build the pipeline. They can iterate quickly on models and slice and dice intermediates to inspect the data as they please. Success: the time it takes to build a pipeline is less than half the time it takes without motion.

Key aspects: runs training ops only once, using the minimum number of data points specified. Checkpoints inputs and outputs for every transform. Allows the user to run each transform, as well as the whole pipeline.

3 types of cells:

  • type: feature or label type creation
  • transform: create a transform, set dependencies, can execute it
  • free: forks/copies state and allows users to perform any read operations on intermediate state. This is kind of like a Jupyter cell/free-for-all.

Test

Goal: simulate deployment on a batch of data. Understand performance as it would be, deployed. Success: when user deploys the pipeline to prod, they don't see a big performance drop soon after.

Key aspects: auto-runs retraining whenever models are getting stale. Users cannot add type or transform cells here. They simply run the pipeline on the ids they care about. They can also evaluate a function to measure performance.

2 types of cells:

  • free: same as above
  • evaluator: takes predictions and labels, and computes an evaluation metric. Can be either a function or a class with state (to do incremental maintenance).
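
The two evaluator shapes described above (a plain function, or a class with state for incremental maintenance) could look roughly like this. It is a minimal sketch, not motion's API: the names `accuracy` and `RunningAccuracy` and the choice of accuracy as the metric are assumptions made for illustration.

```python
from typing import Sequence


def accuracy(predictions: Sequence, labels: Sequence) -> float:
    """Stateless evaluator: fraction of predictions equal to their label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(labels) == 0:
        return 0.0
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


class RunningAccuracy:
    """Stateful evaluator (hypothetical): keeps running counts so that each
    new batch updates the metric without revisiting earlier batches."""

    def __init__(self) -> None:
        self.correct = 0
        self.total = 0

    @property
    def value(self) -> float:
        return self.correct / self.total if self.total else 0.0

    def update(self, predictions: Sequence, labels: Sequence) -> float:
        if len(predictions) != len(labels):
            raise ValueError("predictions and labels must have the same length")
        self.correct += sum(p == y for p, y in zip(predictions, labels))
        self.total += len(labels)
        return self.value
```

The stateful version only does work proportional to the new batch, which is the point of incremental maintenance when the pipeline is run on ids over time.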

Write query method to avoid a join

Currently we have a call like:

results = store.con.execute(
    "SELECT fashion.query.query, fashion.query.text_suggestion, "
    "fashion.catalog.permalink, fashion.catalog.img_url, fashion.query.img_score "
    "FROM fashion.query "
    "JOIN fashion.catalog ON fashion.query.img_id = fashion.catalog.id "
    "WHERE fashion.query.img_id IS NOT NULL AND fashion.query.query_id = ? "
    "ORDER BY fashion.query.img_score ASC",
    (query_id,),
).fetchdf()
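
One possible shape for such a query method is a helper that assembles the join from structured arguments, so callers never write the SQL by hand. This is only a sketch: `build_join_query` and all of its parameters are hypothetical and are not part of motion's store. It returns the SQL text and the parameter tuple, which could then be passed to a call like the one above.

```python
from typing import Any, Dict, List, Optional, Sequence, Tuple


def build_join_query(
    columns: Sequence[str],
    left: str,
    right: str,
    left_key: str,
    right_key: str,
    not_null: Sequence[str] = (),
    equals: Optional[Dict[str, Any]] = None,
    order_by: Optional[str] = None,
) -> Tuple[str, tuple]:
    """Hypothetical helper: build (sql, params) for a two-relation lookup.

    Identifiers are interpolated directly, so they must come from trusted
    code; only the values in `equals` are bound through `?` placeholders.
    """
    sql = (
        f"SELECT {', '.join(columns)} FROM {left} "
        f"JOIN {right} ON {left}.{left_key} = {right}.{right_key}"
    )
    clauses: List[str] = [f"{col} IS NOT NULL" for col in not_null]
    params: List[Any] = []
    for col, value in (equals or {}).items():
        clauses.append(f"{col} = ?")
        params.append(value)
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    if order_by:
        sql += f" ORDER BY {order_by}"
    return sql, tuple(params)
```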

Handle DB migrations seamlessly

We want to support schema migrations, for example a user adding a new key to a relation. Ideally the user does not have to run a separate command; Motion should automatically detect when the stored schema does not match the declared one and migrate the stored schema to match.

It would be good to prompt the user to confirm they want to migrate the schema.
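
The detect-then-confirm flow could be sketched as below. This covers only the "new key added" case mentioned above. The function names, the dict-based schema representation, and the `execute` and `confirm` callbacks are all assumptions for illustration, not how Motion does or will do it.

```python
from typing import Callable, Dict, List


def plan_migration(
    table: str, expected: Dict[str, str], actual: Dict[str, str]
) -> List[str]:
    """Return ALTER TABLE statements for columns that are declared in
    `expected` (name -> SQL type) but missing from `actual`."""
    return [
        f"ALTER TABLE {table} ADD COLUMN {name} {sql_type}"
        for name, sql_type in expected.items()
        if name not in actual
    ]


def migrate(
    table: str,
    expected: Dict[str, str],
    actual: Dict[str, str],
    execute: Callable[[str], None],
    confirm: Callable[[List[str]], bool],
) -> bool:
    """Apply the planned statements only if the user confirms.

    `execute` runs one SQL statement; `confirm` shows the plan to the user
    and returns True to proceed. Returns True if a migration was applied.
    """
    statements = plan_migration(table, expected, actual)
    if not statements or not confirm(statements):
        return False
    for statement in statements:
        execute(statement)
    return True
```

Keeping the planning step separate from execution makes it easy to show the user exactly what will change before anything runs.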

Documentation

  • Requirements.txt
  • Schema documentation
  • Store methods (common ones one will use)
  • Transform class
  • end to end example (including how to log feedback)

Don't need to make it official yet.

Incorporating feedback

Figure out how to take in new labels or feedback from the user. Also figure out how to create an evaluator.

Write `get` method to access DB

  • Basic get method
  • Get method with guardrail (e.g., can't peek into future)
  • Copy method (takes in id, doesn't execute triggers)
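
As a rough illustration of the guardrail idea, a `get` can take the caller's point in time and refuse to return rows created after it. Everything here is hypothetical: `MiniStore`, `FutureReadError`, the integer `created_at` timestamps, and the `as_of` parameter are stand-ins, and the real store is backed by a database rather than a dict.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


class FutureReadError(LookupError):
    """Raised when a read would expose a row newer than the caller's time."""


@dataclass
class Row:
    created_at: int
    values: Dict[str, Any]


class MiniStore:
    """Toy in-memory stand-in for the store (assumption for illustration)."""

    def __init__(self) -> None:
        self._rows: Dict[int, Row] = {}

    def put(self, id: int, created_at: int, **values: Any) -> None:
        self._rows[id] = Row(created_at, dict(values))

    def get(self, id: int, key: str, as_of: Optional[int] = None) -> Any:
        """Basic get when `as_of` is None; guarded get otherwise."""
        row = self._rows[id]
        if as_of is not None and row.created_at > as_of:
            raise FutureReadError(
                f"row {id} was created at {row.created_at}, after as_of={as_of}"
            )
        return row.values[key]
```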

[EPIC] Generative Fashion

Use a generative image model like stable diffusion to:

  • generate for a user's prompt

Needs fine-tuning on fashion images. We can use our catalog and fine-tune on text_suggestion, image pairs that the user likes.

Productionization

Handle authentication and allow for nginx and/or gunicorn serving.

[EPIC] Create representation of someone's closet

Steps:

  • User uploads camera roll + photo of themselves
  • Model prunes the set of images down to those that show the user's outfits
  • Find the k most "worn" items for the user and store them
  • Find clothing items similar to what the user already has

Models:

  • Face recognition to filter camera roll for pics of user
  • Outfit segmentation to find most "worn" items
  • Retrieval model to find neighboring outfits (scraped from online catalog) that the user doesn't own

First pass w/o fine-tuning
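
The "k most worn items" step can be approximated by counting how many photos each item appears in. The sketch below assumes the segmentation model has already produced a list of item labels per photo; that input format and the `most_worn` name are assumptions, not part of the plan above.

```python
from collections import Counter
from typing import Iterable, List


def most_worn(detections_per_photo: Iterable[Iterable[str]], k: int) -> List[str]:
    """Return the k item labels that appear in the most photos.

    Each element of `detections_per_photo` is the list of item labels that
    a (hypothetical) outfit segmentation model found in one photo.
    """
    counts: Counter = Counter()
    for items in detections_per_photo:
        # Count an item at most once per photo, even if detected twice.
        counts.update(set(items))
    return [item for item, _ in counts.most_common(k)]
```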

Save model state when motion stops

It looks like the on-disk version of duckdb is too slow. We may want to save the data directly in arrow tables and use duckdb to query. When a session shuts down, we need to:

  • persist data/table information
  • persist trigger state
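
A shutdown hook along these lines would cover both bullets. This sketch uses JSON files purely as a stand-in so that it runs with the standard library; the issue proposes Arrow tables for the data, and the file names, function names, and the dict shapes for `tables` and `trigger_state` are all assumptions.

```python
import json
from pathlib import Path
from typing import Any, Dict, Tuple, Union


def save_session(
    directory: Union[str, Path],
    tables: Dict[str, Any],
    trigger_state: Dict[str, Any],
) -> None:
    """Write table data and trigger state to `directory` at shutdown."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    (directory / "tables.json").write_text(json.dumps(tables))
    (directory / "triggers.json").write_text(json.dumps(trigger_state))


def load_session(
    directory: Union[str, Path],
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Read back what `save_session` wrote, for the next session start."""
    directory = Path(directory)
    tables = json.loads((directory / "tables.json").read_text())
    trigger_state = json.loads((directory / "triggers.json").read_text())
    return tables, trigger_state
```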

Experiment Mode

In motion test, allow tests to run on clean data. We can essentially create a new version of the application.
