tomato's Issues

New driver interface for `tomato-1.0`

Motivation

In the tomato-0.2 branch, the driver and job interfaces were merged into one piece of code: each job talked to each physical device separately, which was causing race conditions (#28). With only one supported type of device, i.e. the biologic driver, this made some sense at the time.

In tomato-1.0, we want to pave the road for a "device dashboard", meaning the device statuses have to be accessible from outside of the jobs. As a reminder, the relationship between jobs and devices in tomato-1.0 is shown below:

concepts_flowchart

Basically, each device (a digital twin of a physical device) is managed by a driver, and only one driver process runs per device type, managing all devices of that type. All communication with the physical device is therefore handled by the driver; individual physical devices (and channels within them) can be addressed once the device address and channel are known.

Requirements

  1. We would like the new driver interface to handle a wide range of physical devices, e.g.:
  • thermocouple readers, e.g. PicoLog, which are almost completely read-only
  • volumetric flow meters, e.g. Mesa DryCal, which have a few adjustable parameters and a start/stop function
  • mass flow controllers, e.g. Bronkhorst, which have setpoints
  • temperature controllers, e.g. Jumo, which might have ramps
  • gas chromatographs, e.g. Fusion, where a measurement might take ~ 5 minutes and might have to be scheduled
  • potentiostats, e.g. Biologic, where a single set of instructions can contain a whole cycling protocol
  2. The rest of tomato should be completely driver-agnostic, i.e. everything relevant for the measurement comes from the driver (available parameters, units, adjustable limits, etc.). This means the list of techniques and parameters, i.e. the driver-specific language (DSL), has to be defined and documented in the driver docs.

  3. Some functionality, e.g. task scheduling or conditional interruption, should probably be implemented just once and made available to every driver via specific keywords. Currently I can think of the following example:

    • if I want to measure temperature for 10 minutes every 15 seconds, but only poll for new data every 60 seconds, I need to be able to tell the driver that it is supposed to communicate with the device at a 15-second resolution, caching the data. Then, tomato calls the driver's get_data() every 60 seconds, and sends a stop signal after 10 minutes.

Implementation

  • Each driver can be a separate python package, e.g. tomato-biologic or tomato-bronkhorst, for easier maintenance.
  • Tomato provides a central abstract Driver class, which these packages inherit from and expose. The current model for the class looks like this:
    driver_proto_v01.txt
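For illustration, such an abstract base class might look like the sketch below. The method names and signatures here are assumptions for discussion, not the contents of the attached prototype.

```python
from abc import ABC, abstractmethod
from typing import Any


class Driver(ABC):
    """Sketch of a central abstract Driver class: one process per
    device type, components addressed by (address, channel)."""

    @abstractmethod
    def attrs(self, address: str, channel: int) -> dict:
        """Return the parameters, units, and adjustable limits
        exposed by one component of a physical device."""

    @abstractmethod
    def task_start(self, address: str, channel: int, task: Any) -> None:
        """Submit a task to the addressed component."""

    @abstractmethod
    def get_data(self, address: str, channel: int) -> list:
        """Return (and clear) the data cached for the component."""

    @abstractmethod
    def task_stop(self, address: str, channel: int) -> None:
        """Stop the currently running task on the component."""


class DummyDriver(Driver):
    """Minimal concrete driver, only to show the inheritance pattern
    a tomato-bronkhorst or tomato-biologic package would follow."""

    def attrs(self, address, channel):
        return {"value": {"units": None, "rw": False}}

    def task_start(self, address, channel, task):
        pass

    def get_data(self, address, channel):
        return [0.0]

    def task_stop(self, address, channel):
        pass
```

Because the rest of tomato only sees the abstract interface, everything driver-specific (techniques, parameters, limits) stays inside the concrete package.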

Design questions

  • what can be abstracted safely and be handled tomato-side?
  • are we missing any key functionality?
  • how to indicate Driver features (e.g. long acquisition time in GC requiring scheduling, or batching of requests for multi-channel devices)?

Implement `driver_reset`

A driver_reset function, which resets every Component of every Device in the Pipeline, needs to be re-introduced.

Annotate data from multiple devices

Currently, when the partial Datasets generated by separate devices are concatenated into one using xarray.concat(), we align the dataset on the "uts" coordinate, but the data_vars are unmodified. This means that if multiple devices in a pipeline produce the same column (e.g. "flow" for a flow meter), it's difficult to disambiguate them.

Solution: the role of the device in the pipeline should be prepended to all columns.
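A sketch of that solution, assuming a `role:column` naming convention (the separator and helper name are assumptions):

```python
import xarray as xr


def annotate(ds: xr.Dataset, role: str) -> xr.Dataset:
    """Prepend the pipeline role of the originating device to every
    data variable, so concatenated datasets stay unambiguous."""
    return ds.rename({var: f"{role}:{var}" for var in ds.data_vars})


# two devices in one pipeline, both reporting a "flow" column
mfc = xr.Dataset({"flow": ("uts", [1.0, 2.0])}, coords={"uts": [0, 1]})
fm = xr.Dataset({"flow": ("uts", [0.9, 1.9])}, coords={"uts": [0, 1]})

# after annotation, the columns no longer collide when combined
merged = xr.merge([annotate(mfc, "MFC"), annotate(fm, "flowmeter")])
```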

`Payload-1.0`: New schema for tomato payloads

It's time to update the payload schema. The wishlist currently includes the following bits:

  • pydantic-2.0 compatibility
  • forward compatibility of Payload-0.2 via .update() mechanism
  • separate device-specific parts of the method from (optional) general commands managed by tomato, such as:
    • maximum task duration
    • measurement frequency
    • task start time relative to payload start (e.g. "after 2 hours")
    • trigger propagation between pipeline components (i.e. task completing on one component will trigger next tasks on every component)
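The separation of device-specific method parameters from general, tomato-managed commands could look roughly like this payload sketch. All key names below (`max_duration`, `sampling_interval`, `start_with`) are placeholders for discussion, not a settled schema:

```yaml
version: "1.0"
sample:
  name: fake_sample
method:
  - component: "MFC"
    # tomato-managed, device-agnostic keys (names are assumptions):
    max_duration: 3600        # maximum task duration, in s
    sampling_interval: 15     # measurement frequency, in s
    start_with: "GC"          # trigger propagation between components
    # device-specific part, passed through to the driver:
    technique: "flow"
    flow_setpoint: 5.0
```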

Tagging @edan-bainglass for comments, with respect to the AiiDAlab-Aurora schemas. It might be a good time to make the payload schema here a "subset" of the other schema.

Implement job queue.

Jobs should be ordered in an internal, 3-step queue:

  • queued jobs
  • running jobs
  • finished jobs

In the first instance, the jobs should specify a sample, a payload, and optionally a pipeline.
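A minimal in-memory sketch of such a 3-step queue, with jobs carrying a sample, a payload, and an optional pipeline (class and status names are assumptions):

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class JobStatus(Enum):
    QUEUED = "q"
    RUNNING = "r"
    FINISHED = "c"


@dataclass
class Job:
    jobid: int
    sample: str
    payload: dict
    pipeline: Optional[str] = None   # optional: any matching pipeline may run it
    status: JobStatus = JobStatus.QUEUED


class JobQueue:
    """Sketch of a 3-step queue; a real implementation would persist
    this state rather than keep it in memory."""

    def __init__(self):
        self.jobs: list[Job] = []

    def submit(self, sample, payload, pipeline=None) -> int:
        job = Job(jobid=len(self.jobs) + 1, sample=sample,
                  payload=payload, pipeline=pipeline)
        self.jobs.append(job)
        return job.jobid

    def by_status(self, status: JobStatus) -> list[Job]:
        return [j for j in self.jobs if j.status is status]
```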

Wrap and extend BioLogic's kbio

The Python kbio package provided by BioLogic should be wrapped and extended by tomato to implement the following functionality:

  • OCV
  • CALIMIT / CPLIMIT
  • LOOP

In a second stage the interface should be completed by including the following techniques:

  • VSCANLIMIT / ISCANLIMIT
  • PEIS/GEIS

compatibility: make `tomato` work on both Windows and Linux

Currently, tomato is Windows-only, as the only real device driver that is supported (biologic) requires the Windows DLL interface. However, the dummy driver should be platform-agnostic, and the code should be modified to work on both Windows and Linux.

sample id not in datagrams

On 0.2.x, the sample name is not recorded in the snapshot or final files by default.

It could also be useful to record the channel and address used in the final file metadata.

jobs don't 'complete with error' when start_job fails

When starting a job, if the biologic reports it is in state "RUN", the job never starts; it never switches to 'ce' (completed with error), but stays in a frozen running state and has to be cancelled manually.

The same happens if an error is thrown by drivers.biologic.start_job (e.g. there is a problem with the payload or the firmware isn't loaded): the job stays frozen and running.

Implement data export.

tomato should be able to create a dataschema, call yadg, and place the created datagram in the output folder specified by the user. This is also related to dgbowl/yadg#54.

Implement samples.

The job matching process has to be able to specify a sample which is required for the job to be started. In the first instance, two things have to be implemented:

  • a way for the user to specify which sample is loaded in which pipeline using tomato
  • a way for the scheduler to match a queued job against the pipeline+sample combination

Fix `xfail` tests

Fix the following tests, which are currently flaky (the output file sometimes does not get generated):

  • test_ketchup_cancel
  • test_ketchup_snapshot

Implement device settings.

Device settings should be stored in a persistent location. The settings file should contain the following:

  • information about each connected device
  • organisation of individual devices into addressable pipelines

`ketchup status`: multiple jobids and format

  • When multiple jobs have been submitted, AiiDA may ask for the status of multiple jobs at the same time.
    It would be nice if ketchup status could accept multiple jobids as arguments, e.g.
ketchup status 1 2 3

which is equivalent to

ketchup status 1 && ketchup status 2 && ketchup status 3
  • Furthermore, maybe the output of this command could be rendered in a nicer format (also easier to parse), such as yaml-style:
- jobid: 1
  name: job-1
  status: q
  submitted: '2022-06-28 16:09:31.463749+00:00'
- jobid: 2
  name: job-2
  status: q
  submitted: '2022-06-28 16:09:31.463749+00:00'
- jobid: 3
  name: job-3
  status: q
  submitted: '2022-06-28 16:09:31.463749+00:00'
  • Feel free to add any other scheduler information that could be useful to retrieve for debug purposes (e.g. pipeline used, ...).

  • Finally another useful feature would be to be able to see the list of all jobs (including completed ones). E.g. with a command like:

ketchup status queue -a
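The multiple-jobid part of this request maps naturally onto a variadic positional argument. A sketch of the CLI (flag and argument names are assumptions, not the current ketchup interface):

```python
import argparse


def make_parser() -> argparse.ArgumentParser:
    """Sketch of a `ketchup status` subcommand accepting one or more
    jobids (or the literal "queue") in a single invocation."""
    parser = argparse.ArgumentParser(prog="ketchup")
    sub = parser.add_subparsers(dest="command")
    status = sub.add_parser("status")
    # one or more jobids, or the literal "queue":
    status.add_argument("jobids", nargs="+")
    status.add_argument("-a", "--all", action="store_true",
                        help="include completed jobs in the listing")
    return parser
```

With this, `ketchup status 1 2 3` parses into a single call that can emit one YAML list for all three jobs, rather than requiring three separate invocations.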

Dummy driver: multi-step method results

With the dummy driver, the results returned from a multi-step method are not split into steps but are concatenated into a single step.

Example payload:

version: "0.1"
sample:
    name: fake_sample
    capacity: 1.0
method:
  - device: "worker"
    technique: "random"
    time: 35
    delay: 2
  - device: "worker"
    technique: "random"
    time: 20
    delay: 1

The output json file contains 38 points assigned to a single step, instead of 18 + 20 points split across two steps.

`queue`: avoid skipping jobs

The main loop checks the queue once per iteration, but checks the state of the matched pipelines multiple times per iteration. This may lead to jobs submitted later being executed before jobs submitted earlier.
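One way to avoid the skipping is to always match the oldest queued job first whenever a pipeline becomes ready, regardless of when within the iteration that happens. A sketch under that assumption (the matching rule and data shapes are hypothetical):

```python
def job_matches(job: dict, pipeline: dict) -> bool:
    # hypothetical matching rule: sample names must agree
    return job["sample"] == pipeline["sample"]


def next_job(queue: list, pipeline: dict):
    """Return the OLDEST queued job matching the pipeline, so a
    pipeline freed mid-iteration cannot pick up a later submission
    ahead of an earlier one. `queue` is ordered by submission time."""
    for job in queue:
        if job["status"] == "q" and job_matches(job, pipeline):
            return job
    return None
```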
