course-ds-base's People

Contributors

caropen, mnrozhkov, pared

course-ds-base's Issues

Task 4.3 - Tracking changes and switching between versions

From course-ds-base created by mnrozhkov: iterative#13

Branch: step-6-data-and-model-version-control
(continue from iterative#11)

Tasks:

  • add remote storage
  • store file.txt to remote storage
  • delete file.txt in workspace
  • restore file.txt from DVC cache with dvc checkout file.txt
  • clean cache with dvc gc
  • try to restore file.txt from DVC cache with dvc checkout file.txt (failure is expected)
  • restore file.txt from DVC remote storage with dvc pull (success is expected)
  • update 'file.txt' (version 2) and push it to remote storage
  • switch between Git branches and 'file.txt' versions

Task 3.5 - Step 5: Automate pipelines with DVC

From course-ds-base created by mnrozhkov: iterative#9

Branch: step-5-automate-ml-pipeline

Tasks:

  • install DVC (in a virtual environment)
  • automate ML pipeline with DVC
  • setup dependencies and outputs
  • setup parameters

Requirements:

  • for each stage, explicitly define dependencies (code and data)
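The requirement above can be illustrated with a small check. This is only a sketch, not the course's solution: the STAGES dict mirrors the stages section of a dvc.yaml file (cmd, deps, params, and outs are real dvc.yaml fields), but the stage names, file paths, and parameter keys are hypothetical, and the helper missing_dependency_kinds is introduced here purely for illustration.

```python
# Hypothetical pipeline description mirroring the `stages:` section of dvc.yaml.
# Stage names, file paths, and parameter keys are illustrative assumptions.
STAGES = {
    "featurize": {
        "cmd": "python src/stages/featurize.py --config=params.yaml",
        "deps": ["src/stages/featurize.py", "data/raw/dataset.csv"],
        "params": ["featurize"],
        "outs": ["data/processed/features.csv"],
    },
    "train": {
        "cmd": "python src/stages/train.py --config=params.yaml",
        "deps": ["src/stages/train.py", "data/processed/features.csv"],
        "params": ["train"],
        "outs": ["models/model.joblib"],
    },
}


def missing_dependency_kinds(stages):
    """Map each stage name to the dependency kinds ("code", "data") it lacks."""
    problems = {}
    for name, spec in stages.items():
        deps = spec.get("deps", [])
        missing = []
        # A stage should depend on at least one Python source file ...
        if not any(dep.endswith(".py") for dep in deps):
            missing.append("code")
        # ... and on at least one non-code (data) file.
        if not any(not dep.endswith(".py") for dep in deps):
            missing.append("data")
        if missing:
            problems[name] = missing
    return problems
```

An empty result from missing_dependency_kinds(STAGES) means every stage lists both its script and its input data, so a change to either would be visible to DVC.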

Task 5.2 - Plots and graphics with DVC

From course-ds-base created by mnrozhkov: iterative#16

Branch: step-7-metrics-and-experiments

Tasks:

  • save 'reports/classes.csv' file
  • get confusion matrix plot with dvc plots show reports/classes.csv --template confusion
  • dvc plots show
  • dvc plots diff
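One way the first task might look in Python, using only the standard library. This is a sketch under assumptions: the function name save_classes and the column names actual and predicted are hypothetical, and the course code may produce the file differently.

```python
import csv
from pathlib import Path


def save_classes(actual, predicted, path="reports/classes.csv"):
    """Write one row per sample with its true and predicted class label."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)  # create reports/ if needed
    with out.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["actual", "predicted"])  # hypothetical column names
        writer.writerows(zip(actual, predicted))
    return out
```

With these column names, the axes can be selected through the -x and -y options of dvc plots show, for example -x actual -y predicted.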

Task 6.1 - Experimenting Workflow

From course-ds-base created by mnrozhkov: iterative#19

Branch: step-9-experimenting-workflow

Tasks:
Experiment 1

  • run a new experiment with updated parameter:
    dvc exp run -S train.cv=2
  • show metrics with CLI:
    dvc exp show
  • apply to the current branch, then commit and push to remote:
    dvc exp apply <exp>
    git add . && git commit -m "Experiment 1: train.cv=2"

Experiment 2

  • run a new experiment:
    dvc exp run -S train.estimator_name=svm
  • create a branch / commit / push to remote / check updates in Studio:
    dvc exp branch <exp> experiment-2
    git add . && git commit -m "Experiment 2: SVM"
    git push origin experiment-2
  • merge to step-9-experimenting / check updates in Studio
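The -S (--set-param) option used in both experiments overrides a value from params.yaml for a single run. The effect of a dotted key such as train.cv=2 on nested parameters can be pictured roughly as follows. This is a conceptual illustration, not DVC's implementation; set_param and the starting values are hypothetical.

```python
import copy


def set_param(params, dotted_key, value):
    """Return a copy of `params` with the nested key `dotted_key` set to `value`."""
    updated = copy.deepcopy(params)  # leave the baseline parameters untouched
    node = updated
    *parents, leaf = dotted_key.split(".")
    for key in parents:
        node = node.setdefault(key, {})  # descend, creating sections if missing
    node[leaf] = value
    return updated


# Hypothetical starting values; the real params.yaml may differ.
params = {"train": {"cv": 3, "estimator_name": "logreg"}}
exp1 = set_param(params, "train.cv", 2)
exp2 = set_param(params, "train.estimator_name", "svm")
```

Each experiment starts from the same baseline and changes one value, which is what makes the rows of dvc exp show comparable.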

Task 3.6 - Reproduce end-to-end ML pipelines

From course-ds-base created by mnrozhkov: iterative#10

Branch: step-5-automate-ml-pipeline_SOLUTION

Tasks:

  • run dvc repro (assuming this is the first run, all stages execute)
  • run dvc repro (no changes, DVC skips all stages)
  • run dvc repro -f (force running ML pipeline)
  • run dvc repro train (run only train stage from dvc.yaml)
  • update train stage configs and run dvc repro
  • update src/stages/train.py code and run dvc repro
  • remove reports/metrics.json and run dvc repro
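The tasks above exercise the rule that decides whether a stage is skipped or re-run. A much simplified model of that rule is sketched below. It is not DVC's code: the `recorded` dict is a stand-in for the hashes kept in dvc.lock, and the real check also covers parameters and output contents.

```python
import hashlib
from pathlib import Path


def file_md5(path):
    """Return the MD5 hex digest of a file's contents."""
    return hashlib.md5(Path(path).read_bytes()).hexdigest()


def needs_rerun(deps, outs, recorded):
    """Decide whether a stage must run again.

    `recorded` maps each dependency path to the hash stored after the last
    run (a simplified stand-in for dvc.lock).
    """
    if any(not Path(out).exists() for out in outs):
        return True  # a missing output, e.g. a deleted reports/metrics.json
    # Any dependency whose current hash differs from the recorded one
    # (or that has no recorded hash yet) makes the stage stale.
    return any(recorded.get(dep) != file_md5(dep) for dep in deps)
```

In this model an unchanged stage returns False and is skipped, an edited script or a removed output returns True, and forcing a run with -f corresponds to ignoring the result altogether.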

Task 3.4 - Step 4: Build ML experiment pipeline

From course-ds-base created by mnrozhkov: iterative#8

Branch: step-4-build-ml-pipeline

Tasks:

  • create src/stages directory

  • create .py modules for each pipeline stage:

    • data_load.py
    • data_split.py
    • featurize.py
    • train.py
    • evaluate.py
  • run each stage

Requirements:

  • use params.yaml to manage stages configuration
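A stage module that takes its settings from params.yaml might be structured like the sketch below. The configuration keys (base.random_state, data_split.test_size), the --config option, and the function names are assumptions made for illustration, and PyYAML is an assumed dependency; the course's own modules may look different.

```python
import argparse
import random


def load_config(path):
    """Read params.yaml into a dict. PyYAML is an assumed dependency."""
    import yaml  # imported here so split_rows() stays usable without PyYAML

    with open(path) as f:
        return yaml.safe_load(f)


def split_rows(rows, config):
    """Shuffle rows and split them into (train, test) according to config."""
    test_size = config["data_split"]["test_size"]  # hypothetical key
    seed = config["base"]["random_state"]          # hypothetical key
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)          # seeded, so reproducible
    n_test = round(len(shuffled) * test_size)
    return shuffled[n_test:], shuffled[:n_test]


def main(argv=None):
    """Command-line entry point: parse --config, load it, and run the split."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--config", required=True)
    args = parser.parse_args(argv)
    train, test = split_rows(range(10), load_config(args.config))
    print(len(train), len(test))
```

In a real module, main() would be called under the usual if __name__ == "__main__": guard, so that the stage can be run as python src/stages/data_split.py --config=params.yaml. Keeping every tunable value in params.yaml means a stage's behavior changes only when that file changes.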
