dgbowl / tomato

tomato: au-tomation without the pain!

Home Page: https://dgbowl.github.io/tomato
License: GNU General Public License v3.0
In the `tomato-0.2` branch, the driver and job interfaces were basically merged into one piece of code: each job talked to each physical device separately, which was causing some race conditions (#28). With only one supported type of device, i.e. the `biologic` driver, this kind of made sense at the time.
In `tomato-1.0`, we want to pave the road for a "device dashboard", meaning the device statuses have to be accessible from outside of the jobs. As a reminder, the relationship between jobs and devices in `tomato-1.0` is shown below:
Basically, each device (a digital twin of a physical device) is managed by a driver, and there is only one driver process running that manages all devices of that type. All communication with the physical device is therefore handled by the driver; the individual physical devices (and channels within them) can be addressed when one knows the device address and channel.
The rest of tomato should be completely driver-agnostic, i.e. everything relevant for the measurement comes from the driver (available parameters, units, adjustable limits, etc.). This means the list of techniques and parameters, i.e. the driver-specific language (DSL), has to be defined and documented in the driver docs.
Some functionality, e.g. task scheduling or conditional interruption, should probably be implemented just once and made available to every driver via specific keywords. Currently I can think of the following example: `tomato` calls the driver's `get_data()` every 60 seconds, and sends a stop signal after 10 minutes.

The drivers should be split into separate packages, e.g. `tomato-biologic` or `tomato-bronkhorst`, for easier maintenance. They would share a common `Driver` class, which is inherited from and exposed in these packages. The current model for the class looks like this:

Open questions:

- What should remain on the `tomato` side?
- How to handle driver-specific `Driver` features (e.g. long acquisition time in GC requiring scheduling, or batching of requests for multi-channel devices)?

Restoring from crashes of `tomato`, as well as stopping of running jobs with `ketchup`, should be implemented.
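The class model referenced above is not reproduced in this text. As a purely hypothetical illustration of the direction described (one shared base class per driver package, devices addressed by address and channel), such a class could look like:

```python
from abc import ABC, abstractmethod
from typing import Any

# Purely hypothetical sketch -- not the actual tomato Driver model,
# which is not shown in this text. It only illustrates the idea of a
# common class that each driver package inherits from and exposes,
# with devices addressed by (address, channel).
class Driver(ABC):
    @abstractmethod
    def start_task(self, address: str, channel: int, task: dict) -> None:
        """Submit one task to one channel of one physical device."""

    @abstractmethod
    def get_data(self, address: str, channel: int) -> dict:
        """Poll the device for new data points (e.g. every 60 s)."""

    @abstractmethod
    def stop_task(self, address: str, channel: int) -> None:
        """Send a stop signal to the running task."""
```

Under this model, scheduling keywords like the 60-second `get_data()` polling would be driven by `tomato` itself, with the driver only providing the three entry points.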
A `driver_reset` function that resets every Component of every Device in the Pipeline needs to be re-introduced.
Currently, when the partial `Dataset`s generated by separate devices are concatenated into one using `xarray.concat()`, we align the datasets on the `"uts"` coordinate, but the `data_vars` are unmodified. This means that if multiple devices in a pipeline produce the same column (e.g. `"flow"` for a flow meter), it's difficult to disambiguate them.

Solution: the role of the device in the pipeline should be prepended to all columns.
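A minimal sketch of the proposed fix, with hypothetical roles "MFC" and "flowmeter" and made-up values (using `xarray.merge` for the alignment step):

```python
import xarray as xr

# Two partial datasets from two devices in the same pipeline, both
# producing a "flow" column (roles and values are made up).
mfc = xr.Dataset({"flow": ("uts", [1.0, 1.1])}, coords={"uts": [0.0, 1.0]})
fm = xr.Dataset({"flow": ("uts", [0.9, 1.0])}, coords={"uts": [0.5, 1.5]})

# Prepend the role of each device in the pipeline to its data_vars,
# so the columns remain unambiguous after alignment on "uts".
mfc = mfc.rename({k: f"MFC:{k}" for k in mfc.data_vars})
fm = fm.rename({k: f"flowmeter:{k}" for k in fm.data_vars})

merged = xr.merge([mfc, fm])  # outer join on the shared "uts" coordinate
print(sorted(merged.data_vars))  # ['MFC:flow', 'flowmeter:flow']
```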
It's time to update the payload schema. The wishlist currently includes the following bits:

- `pydantic-2.0` compatibility
- `Payload-0.2` support via an `.update()` mechanism

Tagging @edan-bainglass for comments, with respect to the AiiDAlab-Aurora schemas. It might be a good time to make the payload schema here a "subset" of the other schema.
The I and E ranges could be selected automatically based on C/D rates.
Jobs should be ordered in an internal, 3-step queue:
In the first instance, the jobs should specify a `sample`, a `payload`, and optionally a `pipeline`.
The Python `kbio` package provided by BioLogic should be extended and wrapped by tomato to implement the following functionality:
In a second stage the interface should be completed by including the following techniques:
Currently, `tomato` is Windows-only, as the only real device driver that is currently supported (`biologic`) requires the Windows DLL interface. However, the `dummy` driver should be platform-agnostic, and the code should be modified to work on both Windows and Linux.
Being able to cancel a job that is waiting in the queue would be useful functionality.
On 0.2.x, the sample name is not recorded in the snapshot or final files by default. It could also be useful to record the channel and address used in the final file metadata.
When starting a job, if the BioLogic reports that it is in state "RUN", the job never starts, and it never switches to "ce" (completed with error); it stays in a frozen running state and has to be cancelled manually. The same happens if an error is thrown by `drivers.biologic.start_job` (e.g. there is a problem with the payload or the firmware isn't loaded): the job stays frozen and running.
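One way to avoid the frozen state is to guard the driver call and mark the job as errored on any failure. A minimal sketch with hypothetical names (this is not the actual tomato job runner):

```python
# Hypothetical sketch: wrap the start_job call so that an exception
# (bad payload, firmware not loaded, device already in "RUN") marks
# the job "ce" instead of leaving it stuck in the running state.
# Names are illustrative, not tomato's actual internals.
class Job:
    def __init__(self, payload):
        self.payload = payload
        self.status = "q"
        self.error = None

def launch(job, start_job):
    try:
        start_job(job.payload)
    except Exception as exc:
        job.status = "ce"
        job.error = str(exc)
        return False
    job.status = "r"
    return True
```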
During testing with @lorisercole we found that the I range setting of `keep` does not actually keep the previous I range. This needs debugging and fixing.
`tomato` should be able to create a dataschema, call `yadg`, and place the created datagram in the output folder specified by the user. This is also related to dgbowl/yadg#54.
The job matching process has to be able to specify a `sample` which is required for the job to be started. In the first instance, two things have to be implemented:

- recording which `sample` is loaded in which `pipeline` using `tomato`
- matching a submitted job to a `pipeline+sample` combination

Fix the following tests, which are currently flaky: the output file sometimes does not get generated:
- `test_ketchup_cancel`
- `test_ketchup_snapshot`
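The sample-aware matching requirement above could be sketched roughly as follows (the `Pipeline` fields here are hypothetical, not tomato's actual data model):

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical sketch of sample-aware job matching: a queued job may
# only start on a ready pipeline that has the required sample loaded.
@dataclass
class Pipeline:
    name: str
    sample: Optional[str]
    ready: bool

def find_pipeline(required_sample, pipelines):
    for pip in pipelines:
        if pip.ready and pip.sample == required_sample:
            return pip
    return None

pips = [
    Pipeline("pip-1", "counter_1", ready=False),
    Pipeline("pip-2", "counter_2", ready=True),
]
print(find_pipeline("counter_2", pips).name)  # pip-2
```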
Device settings should be stored in a persistent location. The settings file should contain the following:

This should probably be done in 3 steps:

- moving `kbio` to a separate package which will be an optional dependency
- the `biologic` driver, perhaps implementing command batching

Tagging @NukP and @edan-bainglass.
`ketchup status` could accept multiple jobids as arguments, e.g. `ketchup status 1 2 3`, which would be equivalent to `ketchup status 1 && ketchup status 2 && ketchup status 3`:
```yaml
- jobid: 1
  name: job-1
  status: q
  submitted: '2022-06-28 16:09:31.463749+00:00'
- jobid: 2
  name: job-2
  status: q
  submitted: '2022-06-28 16:09:31.463749+00:00'
- jobid: 3
  name: job-3
  status: q
  submitted: '2022-06-28 16:09:31.463749+00:00'
```
Feel free to add any other scheduler information that could be useful to retrieve for debug purposes (e.g. pipeline used, ...).
Finally, another useful feature would be the ability to see the list of all jobs (including completed ones), e.g. with a command like `ketchup status queue -a`.
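A sketch of how the multi-jobid syntax could be parsed, assuming an `argparse`-based CLI (hypothetical; not ketchup's actual parser):

```python
import argparse

# Hypothetical sketch: a "status" subcommand accepting one or more
# jobids via nargs="+", plus an -a/--all flag so the queue view can
# include completed jobs as well.
parser = argparse.ArgumentParser(prog="ketchup")
sub = parser.add_subparsers(dest="command")
status = sub.add_parser("status")
status.add_argument("jobids", nargs="+")
status.add_argument("-a", "--all", action="store_true")

args = parser.parse_args(["status", "1", "2", "3"])
print(args.jobids)  # ['1', '2', '3']
args = parser.parse_args(["status", "queue", "-a"])
print(args.jobids, args.all)  # ['queue'] True
```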
With the `dummy` driver, the results returned from a multi-step method are not split into steps but are concatenated into a single step.
Example payload:

```yaml
version: "0.1"
sample:
  name: fake_sample
  capacity: 1.0
method:
  - device: "worker"
    technique: "random"
    time: 35
    delay: 2
  - device: "worker"
    technique: "random"
    time: 20
    delay: 1
```

The output JSON file contains 38 points assigned to a single step, instead of 18+20.
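The expected per-step counts can be derived from the payload itself, assuming the dummy driver emits one point every `delay` seconds over `time` seconds:

```python
import math

# One point every `delay` seconds over `time` seconds per step
# (an assumption about the dummy driver's sampling, consistent with
# the 18 + 20 = 38 points reported above).
method = [
    {"device": "worker", "technique": "random", "time": 35, "delay": 2},
    {"device": "worker", "technique": "random", "time": 20, "delay": 1},
]
points = [math.ceil(step["time"] / step["delay"]) for step in method]
print(points, sum(points))  # [18, 20] 38
```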
The main loop checks the `queue` once per iteration, but checks the `state` of the matched pipelines multiple times per iteration. This may lead to jobs submitted later being executed before jobs submitted earlier.
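One way to restore FIFO ordering, sketched with hypothetical structures (not tomato's actual main loop): read the queue and pipeline states once per iteration, then attempt matches strictly in submission order:

```python
from dataclasses import dataclass

# Hypothetical sketch of a FIFO-safe scheduling pass: jobs are
# matched strictly in submission order within a single pass, so a
# later job cannot overtake an earlier one that could also run.
@dataclass
class Job:
    jobid: int
    submitted: float
    sample: str

@dataclass
class Pipeline:
    name: str
    sample: str
    ready: bool

def schedule_once(queue, pipelines):
    started = []
    for job in sorted(queue, key=lambda j: j.submitted):
        pip = next(
            (p for p in pipelines if p.ready and p.sample == job.sample),
            None,
        )
        if pip is not None:
            pip.ready = False  # claim the pipeline within this pass
            started.append((job.jobid, pip.name))
    return started

# Job 2 was submitted after job 1, but both match the same pipeline;
# only the earlier job gets started.
queue = [Job(2, 2.0, "s1"), Job(1, 1.0, "s1")]
pips = [Pipeline("pip-1", "s1", ready=True)]
print(schedule_once(queue, pips))  # [(1, 'pip-1')]
```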