downcast's People

Contributors

lucas-mc

downcast's Issues

Finalization crashes if --output-dir is a relative path

If --output-dir is specified as a relative path, e.g.:

downcast.py --init --server demo --output-dir example-output \
            --start '2004-10-31 10:00:00.000 -05:00'
downcast.py --batch --server demo --output-dir example-output \
            --end '2004-10-31 10:05:00.000 -05:00' --terminate

then something in the finalization process crashes:

  File "/home/benjamin/downcast/downcast/subprocess.py", line 283, in _main1
    self.handler.flush()
  File "/home/benjamin/downcast/downcast/dispatcher.py", line 151, in flush
    self._handler_flush(h)
  File "/home/benjamin/downcast/downcast/dispatcher.py", line 313, in _handler_flush
    handler.flush()
  File "/home/benjamin/downcast/downcast/output/waveforms.py", line 156, in flush
    self.archive.flush()
  File "/home/benjamin/downcast/downcast/output/archive.py", line 165, in flush
    rec.flush(self.deterministic_output)
  File "/home/benjamin/downcast/downcast/output/archive.py", line 271, in flush
    deterministic = deterministic)
  File "/home/benjamin/downcast/downcast/output/archive.py", line 297, in _write_state_file
    os.rename(tmpfname, fname)
FileNotFoundError: [Errno 2] No such file or directory: 'example-output/3d/demo_3d97e525-d794-4aa8-82e8-8821b8da12b4_20041031-1500/__phi_properties.tmp' -> 'example-output/3d/demo_3d97e525-d794-4aa8-82e8-8821b8da12b4_20041031-1500/_phi_properties'

It doesn't do this if the output directory is an absolute path.

This is bizarre. Nothing in the entire package calls chdir, so why should it matter if the path is absolute or relative?
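Until the root cause is found, one possible workaround is to resolve the path to an absolute one as soon as the arguments are parsed. The following is only a sketch: the parser below is a stand-in, not downcast's actual argument handling, and absolute_dir is a hypothetical helper. It hides the symptom and does not explain why a relative path fails.

```python
import argparse
import os


def absolute_dir(value):
    # Hypothetical helper: resolve the path against the current working
    # directory once, at startup, so later code never sees a relative path.
    return os.path.abspath(value)


# Stand-in parser (assumption: the real downcast.py parser is not shown here).
parser = argparse.ArgumentParser()
parser.add_argument('--output-dir', type=absolute_dir)

args = parser.parse_args(['--output-dir', 'example-output'])
print(args.output_dir)
```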

Handling of "delayed" numerics

Some numeric values (in particular, NBP) have multiple time values and we need to understand what they mean and how to use them.

  • TimeStamp seems to have one-second resolution.

  • SequenceNumber seems to have 5120-ms resolution.

  • Often the two values are wildly different (TimeStamp can be hours earlier).

  • Often the same measurement appears multiple times with the same TimeStamp and differing SequenceNumber.

I am guessing, actually, that the TimeStamp is pretty meaningless - that it refers to the time when the measurement was first "requested" rather than when it was actually performed. I'm guessing that the SequenceNumber tells us when the measurement was reported, which might be a few seconds after it was measured.

It might be helpful to hear from somebody who is familiar with using these machines:

  • is NBP measured automatically (on a schedule) or does the nurse press a button to initiate the measurement? Or both?

  • how long does it usually take (from inflating the pressure cuff, to deflating it, to when the NBP measurements appear on screen)?

  • how long do the values stay on screen afterwards?

Correct signal file checksums

Currently, WaveSampleHandler will set all signal checksums to zero. The checksums should instead be set to the sum of all samples in the segment.
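A minimal sketch of the calculation, assuming the WFDB header convention that the checksum field is the sum of all samples truncated to a 16-bit signed integer. The function name is made up for illustration and is not part of WaveSampleHandler.

```python
def signal_checksum(samples):
    # Sum all samples, keep the low 16 bits, and reinterpret the result
    # as a signed 16-bit value (assumed WFDB header convention).
    total = sum(samples) & 0xFFFF
    return total - 0x10000 if total >= 0x8000 else total


print(signal_checksum([1, 2, 3]))
```

In practice the sum would be accumulated incrementally as samples are written, rather than by re-reading the whole segment.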

Finalizing record at end of patient stay

When a patient is discharged, we need to mark the record as finalized.

Currently, we finalize a record automatically when there is a gap - i.e., some period of time when no new messages are seen, then a new message appears - but when the patient is discharged, there are no new messages, so this never happens.

(For testing, we can force all records to be finalized by using the --terminate argument, but that's no good for "real" conversion.)

The tricky thing is that since we are processing messages in parallel, it's hard to say which worker process is responsible for finalizing the record.

One way to deal with this would be to periodically check "what is the earliest unprocessed message in any queue"? Call that timestamp T_next. Then, if there are any unfinalized records for which the last processed message is earlier than (T_next - split_interval), those records should be finalized.

I don't think there's a good way to do this without stopping and then restarting all of the worker processes, but we don't need to do so frequently - doing it once for every 3 hours of data should be quite adequate.
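The T_next check could look roughly like the sketch below. The function and argument names are hypothetical, and it assumes that something else has already collected the last processed timestamp for each unfinalized record and the earliest unprocessed timestamp for each queue (None for an empty queue). What to do when every queue is empty is left open; here nothing is finalized in that case.

```python
from datetime import datetime, timedelta


def records_to_finalize(last_processed, queue_heads, split_interval):
    # last_processed: dict mapping record ID -> time of last processed message
    # queue_heads: earliest unprocessed message time per queue, None if empty
    pending = [t for t in queue_heads if t is not None]
    if not pending:
        # T_next is undefined when nothing is queued (assumption: do nothing).
        return []
    t_next = min(pending)
    cutoff = t_next - split_interval
    # A record is finalized only if its last message is strictly earlier
    # than (T_next - split_interval).
    return sorted(rec for rec, t in last_processed.items() if t < cutoff)


last = {'a': datetime(2004, 10, 31, 10, 0), 'b': datetime(2004, 10, 31, 14, 0)}
heads = [datetime(2004, 10, 31, 15, 0), None, datetime(2004, 10, 31, 14, 30)]
print(records_to_finalize(last, heads, timedelta(hours=1)))
```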

Generate unique signal names

In some strange cases we might see two simultaneous waveforms with the same label. WFDB requires that each signal in a multi-segment record has a unique name.
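One possible approach is to append a numeric suffix to repeated labels. This is only a sketch: the "_2", "_3" suffix scheme is an assumption chosen for illustration, not an established convention, and the function name is hypothetical.

```python
def unique_signal_names(labels):
    used = set()
    result = []
    for label in labels:
        name = label
        n = 1
        # Keep bumping the suffix until the name has not been seen,
        # which also avoids colliding with a label that already ends in _N.
        while name in used:
            n += 1
            name = '%s_%d' % (label, n)
        used.add(name)
        result.append(name)
    return result


print(unique_signal_names(['II', 'II', 'V']))
```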

Correct signal file sample range

In some cases, the stated sample range for a signal (scale_lower/scale_upper) is flat-out wrong. ECG signals in particular are often wrong.

WaveSampleHandler should calculate and report an accurate sample range (adcres/adczero) for each segment. This is required in order to correctly convert the record to other formats (e.g., using wfdb2mat.)
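As a rough sketch of one way to derive these values from the data itself: take the minimum and maximum sample actually seen in the segment, pick the smallest number of bits whose range covers them, and centre the range on their midpoint. This assumes adcres is a bit count and adczero is the sample value at the centre of the ADC range; the function name is hypothetical and this is not necessarily how WaveSampleHandler should do it.

```python
def estimate_adc_range(samples):
    lo = min(samples)
    hi = max(samples)
    # Smallest bit count n such that 2**n > (hi - lo); at least 1 bit.
    adcres = max(1, (hi - lo).bit_length())
    # Centre of the observed range, rounded so that
    # [adczero - 2**(adcres-1), adczero + 2**(adcres-1) - 1] covers lo..hi.
    adczero = (lo + hi + 1) // 2
    return adcres, adczero


print(estimate_adc_range([0, 512, 1023]))
```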

Generate multisegment record

When a record is finalized, we need to generate a multi-segment header file for it.

Up until now I've been doing this by hand using a hacked version of 'wfdbjoin', but this should ideally be done by the WaveSampleHandler itself.

This requires re-reading the segment header files, and creating a layout header with the composite signal information, and a master header containing the names and lengths of the segments (and gaps, if any.)
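The master header part might be sketched as follows. This assumes the layout I understand multi-segment WFDB headers to have (a first line of "record/nsegments nsig fs total_samples", then one "name length" line per segment, with the layout segment listed first with length 0 and gaps written as "~ length"). It should be checked against the WFDB header format documentation before relying on it. The function name and arguments are hypothetical, and the composite signal information for the layout header is not handled here.

```python
def master_header(record, nsig, fs, layout_name, segments):
    # segments: list of (name, length) tuples in time order;
    # a name of None marks a gap of the given length.
    lines = ['%s 0' % layout_name]
    total = 0
    for name, length in segments:
        lines.append('%s %d' % ('~' if name is None else name, length))
        total += length
    # Assumption: the segment count includes the layout entry and any gaps.
    first = '%s/%d %d %g %d' % (record, len(lines), nsig, fs, total)
    return '\n'.join([first] + lines) + '\n'


print(master_header('rec', 2, 125, 'rec_layout',
                    [('rec_0001', 1000), (None, 250), ('rec_0002', 500)]))
```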

Negative clock adjustment coincides with DST transition

At fall transition time, when the clock switches from wrongly-labelled winter time to correctly-labelled winter time, there is often a small negative clock adjustment (after fixing the broken timestamps).

For example, the raw data might look like this:

TimeStamp                        SequenceNumber
2020-11-01 01:59:59.123 -05:00   657489599123
2020-11-01 01:00:04.218 -05:00   657489604243

Clearly there's no discontinuity here and the first message is mislabelled as -05:00 when it should be -04:00. But the delta in TimeStamp is only 5095 milliseconds vs. a delta in SequenceNumber of 5120.
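The arithmetic can be checked with a few lines of Python. This sketch relabels the first timestamp as -04:00 (keeping the wall-clock fields) and compares the resulting TimeStamp delta with the SequenceNumber delta, assuming SequenceNumber counts milliseconds. The function name is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

FMT = '%Y-%m-%d %H:%M:%S.%f %z'


def clock_adjustment_ms(ts_a, seq_a, ts_b, seq_b, corrected_offset_a):
    a = datetime.strptime(ts_a, FMT)
    b = datetime.strptime(ts_b, FMT)
    # Keep the wall-clock fields of the first timestamp, but replace its
    # (mislabelled) UTC offset with the corrected one.
    a = a.replace(tzinfo=timezone(corrected_offset_a))
    ts_delta = (b - a) // timedelta(milliseconds=1)
    # Negative result: TimeStamp advanced less than SequenceNumber did.
    return ts_delta - (seq_b - seq_a)


adjustment = clock_adjustment_ms(
    '2020-11-01 01:59:59.123 -05:00', 657489599123,
    '2020-11-01 01:00:04.218 -05:00', 657489604243,
    timedelta(hours=-4))
print(adjustment)  # -25
```

So in this example the apparent clock adjustment is -25 ms (5095 - 5120).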

There even seems to be a clock adjustment in those rare cases that DWC labels the summer timestamps correctly.

Normally a negative clock adjustment creates ambiguity, but in this case it seems it might be possible to disambiguate based on the timezone. Need to investigate further.

(There is usually no adjustment when the clock switches from correctly-labelled summer time to wrongly-labelled winter time. So the one-hour correction is right.)
