dm4ml / motion
Framework for building and maintaining self-updating prompts for LLMs
Home Page: https://dm4ml.github.io/motion/
Users should be allowed to write tests for application logic
Whenever duplicating a record, the old record typically isn't written to. So there are null-valued records that exist. Figure out how to cleanly dispose of them.
Including cron-scheduled triggers
Right now, after fit is called, model outputs aren't regenerated. We need to do this.
See if fashion pipeline can be effectively rewritten with one schema instead of 2
There are 2 views I am thinking of:
For each view, describe the operations a user can run in the UI
--
Goal: enable developers to build the pipeline. Quick iteration on models, can slice and dice intermediates to inspect the data as they please. Success: time it takes to build a pipeline is less than half the time it takes without motion.
Key aspects: runs training ops only once using the minimum number of data points specified. Checkpoints inputs and outputs for every transform. allows user to run each transform, as well as whole pipeline.
3 types of cells:
Goal: simulate deployment on a batch of data. Understand performance as it would be, deployed. Success: when user deploys the pipeline to prod, they don't see a big performance drop soon after.
Key aspects: auto-runs retraining whenever models are getting stale. Users cannot add type or transform cells here. They simply run the pipeline on the ids they care about. They can also evaluate a function to measure performance.
2 types of cells:
Try redis and polars
Currently we have a call like:
results = store.con.execute(
"SELECT fashion.query.query, fashion.query.text_suggestion, fashion.catalog.permalink, fashion.catalog.img_url, fashion.query.img_score FROM fashion.query JOIN fashion.catalog ON fashion.query.img_id = fashion.catalog.id WHERE fashion.query.img_id IS NOT NULL AND fashion.query.query_id = ? ORDER BY fashion.query.img_score ASC",
(query_id,),
).fetchdf()
We want to support schema migrations, for example a user adding a new key to a relation. Ideally the user does not have to run a separate command; Motion should automatically detect when the stored schema doesn't match the declared one and migrate it to match.
It would be good to prompt the user to confirm they want to migrate the schema.
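A minimal sketch of the detect, confirm, then migrate flow described above, not Motion's actual implementation. The names plan_migration and migrate, the dict-based schema representation, and the execute and confirm callbacks are all hypothetical; the sketch only handles added columns, which is the example case mentioned.

```python
def plan_migration(table, declared, existing):
    """Return ALTER TABLE statements for columns declared but not yet stored.

    `declared` and `existing` map column name -> SQL type (hypothetical format).
    Only added columns are handled; drops and type changes are out of scope.
    """
    statements = []
    for column, sql_type in declared.items():
        if column not in existing:
            statements.append(
                f"ALTER TABLE {table} ADD COLUMN {column} {sql_type}"
            )
    return statements


def migrate(table, declared, existing, execute, confirm=input):
    """Ask the user to confirm, then run each statement through `execute`.

    `execute` is any callable that takes a SQL string, for example a thin
    wrapper around a database connection. Returns the statements applied.
    """
    statements = plan_migration(table, declared, existing)
    if not statements:
        return []  # schemas already match, nothing to do
    answer = confirm(
        f"Schema for {table} changed; apply {len(statements)} change(s)? [y/N] "
    )
    if answer.strip().lower() != "y":
        return []  # user declined, leave the stored schema untouched
    for statement in statements:
        execute(statement)
    return statements
```

Passing confirm as a parameter keeps the prompt out of the migration logic, so the same function can run non-interactively in tests or scripts.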
Components:
Don't need to make it official yet.
Triggers:
The example project is currently outdated.
Figure out how to take in new labels or feedback from the user. Also figure out how to create an evaluator.
Helps with validation and removes the need for python 3.10
Use a generative image model like stable diffusion to:
Needs fine-tuning on fashion images. We can use our catalog and fine-tune on text_suggestion, image pairs that the user likes.
Handle authentication and allow for nginx and/or gunicorn serving.
Right now the thread will stop if there's a failure in a cron thread. Make it fail gracefully.
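One way to make the cron thread fail gracefully is to catch and log exceptions inside the loop, so a single failed run does not end the thread. This is a sketch under assumptions, not Motion's code: run_cron, its parameters, and the logger name are hypothetical.

```python
import logging
import threading

logger = logging.getLogger("cron_sketch")  # hypothetical logger name


def run_cron(trigger, interval_seconds, stop_event):
    """Call `trigger` repeatedly until `stop_event` is set.

    Any exception raised by the trigger is logged with its traceback and
    the loop continues with the next tick instead of killing the thread.
    """
    while not stop_event.is_set():
        try:
            trigger()
        except Exception:
            logger.exception(
                "cron trigger %s failed; continuing with next tick",
                getattr(trigger, "__name__", repr(trigger)),
            )
        # wait() returns early if stop_event is set, so shutdown is prompt
        stop_event.wait(interval_seconds)
```

A caller would start it with threading.Thread(target=run_cron, args=(trigger, interval, stop_event)) and set stop_event on shutdown.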
Steps:
Models:
First pass w/o fine-tuning
Default can be async
It looks like the on-disk version of duckdb is too slow. We may want to save the data directly in arrow tables and use duckdb to query. When a session shuts down, we need to:
Allow users to rerun a trigger several times if it fails. They could set the number of retries as a parameter in the class constructor?
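A rough sketch of what a constructor-level retry parameter could look like. The Trigger class here, along with max_retries and backoff_seconds, is hypothetical and is not Motion's trigger API.

```python
import time


class Trigger:
    """Wraps a callable and reruns it on failure (illustrative only)."""

    def __init__(self, fn, max_retries=0, backoff_seconds=0.0):
        self.fn = fn
        self.max_retries = max_retries          # extra attempts after the first
        self.backoff_seconds = backoff_seconds  # fixed pause between attempts

    def run(self, *args, **kwargs):
        attempts = self.max_retries + 1
        for attempt in range(1, attempts + 1):
            try:
                return self.fn(*args, **kwargs)
            except Exception:
                if attempt == attempts:
                    raise  # out of retries, surface the original error
                time.sleep(self.backoff_seconds)
```

With max_retries=0 as the default, existing triggers would keep their current fail-once behavior unless the user opts in.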
In motion test, allow tests to run on clean data. We can essentially create a new version of the application.
This should just be an alias for executing a python script, checking if they are in the right directory.