
dotscience's Issues

[1d] publish a python client library for printing dotscience metadata

Context

We want to make it easier for people to emit the DOTSCIENCE_ annotations, and the print statements have the following problems:

  • They're hard to read; the json.dumps call isn't obvious to a data scientist.
  • They rely on having variables in global scope, but parameters may be used inside a function. Technically, users can put the print statements inside a function, but it isn't obvious to them that this will work.
  • There's no validation of input and output datasets, or output format.
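For illustration, a hand-written annotation of the kind described above might look like the sketch below. The exact statements users write today aren't shown in this issue, so the shape is an assumption inferred from the DOTSCIENCE_ output format further down; `batch_size`, `f_score` and their values are hypothetical.

```python
import json

# Hypothetical values that would normally come from the training code.
batch_size = 32
f_score = 0.87

# Assumed shape of today's hand-written annotations: a DOTSCIENCE_ prefix
# followed by a JSON-encoded value. Nothing checks the names or the format.
parameters_line = "DOTSCIENCE_PARAMETERS=" + json.dumps({"batch-size": batch_size})
summary_line = "DOTSCIENCE_SUMMARY=" + json.dumps({"f-score": f_score})

print(parameters_line)
print(summary_line)
```

Every value has to be reachable wherever the print statement sits, and a typo in the prefix or a malformed dict goes unnoticed.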

Requirements

Develop a trivial Python library that solves the above problems and publish it on PyPI. Usage should be like:

import dotscience as ds
# The following two methods throw an exception if agent1, agent2 or model isn't a mountpoint
ds.input("agent1", "agent2")
ds.output("model")

# The following methods can either copy the values at call time, or keep the
# reference for completion - probably a copy is better as the user will probably
# expect the value _right now_ to be captured.
ds.metric("f-score", f_score)
ds.parameter("batch-size", batch_size)
ds.label("frobrinator", "off")

# They also return the result for handy use like this:
tensorflow.setBatchSize(ds.parameter("batch-size", 0.3))

# Multiple metrics, parameters or labels can be passed, as long as the
# positional arguments come in name/value pairs (an even number of arguments)
ds.metric("f-score", f_score, "batch_size", batch_size)

# Alternate calling style with **kwargs
ds.metric(a=1, b=2)
ds.parameter(c=3, d=4)

# Preview the metrics in human-readable form without publishing them to
# dotscience even if the notebook is saved
ds.debug()

# Report the metrics
ds.report()

# Report data changes, but no metrics (summary stats)
ds.report(plot=False)

Calling ds.report() will print:

---
DOTSCIENCE_INPUTS=["agent1", "agent2"]
DOTSCIENCE_OUTPUTS=["model"]
DOTSCIENCE_SUMMARY={"f-score": 0.1, "batch-size": 0.9}
DOTSCIENCE_PARAMETERS={"c": 3, "d": 4}
---
Note to Jupyter users: don't forget to save your notebook in order to publish
these results to dotscience.
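A minimal sketch of how the proposed calls could accumulate values and render the block above. This is not the published library: the `Run` class, its `render` method and the internal attribute names are hypothetical, mountpoint validation and labels are left out, and returning the last value when several pairs are passed is an arbitrary choice for illustration.

```python
import copy
import json


class Run:
    """Hypothetical accumulator for dotscience annotations."""

    def __init__(self):
        self.inputs = []
        self.outputs = []
        self.summary = {}
        self.parameters = {}

    def _record(self, store, args, kwargs):
        # Positional arguments must form name/value pairs.
        if len(args) % 2 != 0:
            raise ValueError("positional arguments must be name/value pairs")
        pairs = list(zip(args[0::2], args[1::2])) + list(kwargs.items())
        if not pairs:
            raise ValueError("at least one name/value pair is required")
        for name, value in pairs:
            # Copy at call time so later mutation does not change the record.
            store[name] = copy.deepcopy(value)
        return pairs[-1][1]  # assumption: hand back the last value given

    def input(self, *names):
        self.inputs.extend(names)

    def output(self, *names):
        self.outputs.extend(names)

    def metric(self, *args, **kwargs):
        return self._record(self.summary, args, kwargs)

    def parameter(self, *args, **kwargs):
        return self._record(self.parameters, args, kwargs)

    def render(self):
        # Build the annotation block in the format shown in this issue.
        return "\n".join([
            "---",
            "DOTSCIENCE_INPUTS=" + json.dumps(self.inputs),
            "DOTSCIENCE_OUTPUTS=" + json.dumps(self.outputs),
            "DOTSCIENCE_SUMMARY=" + json.dumps(self.summary),
            "DOTSCIENCE_PARAMETERS=" + json.dumps(self.parameters),
            "---",
        ])


run = Run()
run.input("agent1", "agent2")
run.output("model")
run.metric("f-score", 0.1, "batch-size", 0.9)
run.parameter(c=3, d=4)
print(run.render())
```

Keeping the state on an object rather than in module globals makes the sketch easy to test; a real `ds` module could wrap a single default instance to get the `ds.metric(...)` calling style.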

Open questions

  • metric or summary? Let's decide
