Code Monkey home page Code Monkey logo

Comments (8)

jmlaka avatar jmlaka commented on August 26, 2024 1

Thanks for your ideas @Enet4

Considering this solution:

Or, just provide only to_date and let that be the expected path towards making a NaiveDate, which narrows the interface into the bare minimum required to be functional and usable: value.to_date()?.as_naive_date()?

If I undrestand correctly, the path works like this:

PartialValue.to_date() -> Result<DicomDate>
DicomDate.as_naive_date() -> Result<NaiveDate>

(this second method only yields NaiveDate if DicomDate.is_precise() )

And all the rest needs proper documentation.
I think I like this.

I'll be back with a PR

from dicom-rs.

jmlaka avatar jmlaka commented on August 26, 2024

So I looked at the dicom spec once more and
Date has a value length of 8 bytes FIXED (apart from range matching), so parsing of and incomplete Date should maybe fail ?

Only Time and DateTime mention the null components ...

@Enet4 , do you think that this problem could be solved like this :

At the level of deserialize.rs

  • parsing of incomplete DA, DT, TM would fail with something like IncompleteValue
  • that means that in case of DT and TM it would fail for any other case than a full representation (so a lot)
  • new methods .to_partial_date, .to_partial_time, .to_partial_datetime would return a new type like PartialDate ..,
    a tuple of two or a vec of two (DA, DT, TM) values, that would represent a range that takes into account the imprecision caused by null components.

For a Date it should be easy, as "2010" would translate to (2010-01-01,2010-12-31)
A Time of "071010.123" would translate to ( 07:10:10.123000, 07:10:10.123999) ?
A DateTime "2010" would translate to
(2010-01-01 00:00:00.000000 , 2010-12-31 23:59:59.999999 ) ?

At the level of PrimiteValue and higher,

  • introduce 3 new Partial ... values with getters and all (is it a good idea?)
  • calling .to_date on a PartialDate should default to PartialDate.0 ?
  • in case of range matching with the Partial data types one would use 1stPartialDate.0 - 2ndPartialDate.1 as a range

As the original value is already permanently stored as a string, would this be sufficient for practical handling of the
imprecisions ?

from dicom-rs.

Enet4 avatar Enet4 commented on August 26, 2024

Thank you for this investigation, @jmlaka.

So I looked at the dicom spec once more and
Date has a value length of 8 bytes FIXED (apart from range matching), so parsing of and incomplete Date should maybe fail ?

Only Time and DateTime mention the null components ...

@Enet4 , do you think that this problem could be solved like this :

At the level of deserialize.rs

  • parsing of incomplete DA, DT, TM would fail with something like IncompleteValue
  • that means that in case of DT and TM it would fail for any other case than a full representation (so a lot)

This sounds good. It will be a breaking change, but the current state of sticking default component values in can cause greater confusion and damage to a system in the long run. 👍

  • new methods .to_partial_date, .to_partial_time, .to_partial_datetime would return a new type like PartialDate ..,
    a tuple of two or a vec of two (DA, DT, TM) values, that would represent a range that takes into account the imprecision caused by null components.

In this case we should rather represent partial date-times not as ranges, but specifically as a value containing a variable number of components present. As in something like this:

enum PartialDate {
    Year(u16),
    YearMonth(u16, u16),
    Complete(NaiveDate),
}

This one would be loaded with helper methods so that it could be easily converted to a full date, a range, or something else.

For a Date it should be easy, as "2010" would translate to (2010-01-01,2010-12-31)
A Time of "071010.123" would translate to ( 07:10:10.123000, 07:10:10.123999) ?
A DateTime "2010" would translate to
(2010-01-01 00:00:00.000000 , 2010-12-31 23:59:59.999999 ) ?

One or two of these values would also have a clear path to be converted to a range, and the logic would be something like that. Whether to encode this as two exact date-times (begin, end) could be left as an implementation detail, but once we have a proof of concept we can make a better assessment of this.

At the level of PrimiteValue and higher,

  • introduce 3 new Partial ... values with getters and all (is it a good idea?)

For now I say we keep the variants of PrimitiveValues as is, and implement to_partial_* and to_*_range methods which have to perform a deserialization from the string values at that moment.

The original purpose of having so many variants was so that the DICOM object's decoded values could be stored during the object reading process. But this process was eventually changed in 0.3.0 so that their original format is preserved rather than trying to decode everything. That is, all dates and times are fetched and stored in memory as strings now, and so some of these variants do not ever occur when reading DICOM objects. The same thing would happen for partial date-times and ranges. We can adopt a different strategy for PrimitiveValue later.

  • calling .to_date on a PartialDate should default to PartialDate.0 ?
  • in case of range matching with the Partial data types one would use 1stPartialDate.0 - 2ndPartialDate.1 as a range

If we follow the advice above of making PartialDate a sum type, this will no longer apply. In this case we may just require the date to be complete, and for partial dates, we can maybe have helper methods for retrieving the earliest and latest possible date-times. Handling these numbers at the range type itself would work as well.

As the original value is already permanently stored as a string, would this be sufficient for practical handling of the
imprecisions ?

After considering the points above, I believe that we have a good plan here for the time being.

from dicom-rs.

jmlaka avatar jmlaka commented on August 26, 2024

@Enet4 thanks for the comment.

In this case we should rather represent partial date-times not as ranges, but specifically as a value containing a variable number of components present.

See Note 3 in DateTime documentation that says:
A Date Time value of 19530827111300.0 means August 27, 1953, 11;13 a.m. accurate to 1/10th second

In case of an Enum, lets consider a DateTime value of 20210101153322.123
It has a YYYY, MM, DD, hh, mm, ss and even and FF component but this FF component, if not fully represented by 6 bytes , according to documentation is not a fully accurate DateTime , it represents a range between .123000 and .123999/th of a second
So the PartialDateTime Enum would need 6 viariants just for the FFFFFF component.

Am I thinking straight ?

And still, we need the UTC offset to be stored even for the partial enum variants, because 2020-0100 is a valid DateTime.

So in case of a Date, we could just fail the conversion of incomplete values right away, and it would still be according to the Dicom spec.

But with DT and TM it gets more complicated.

What do you suggest ? Thanks

from dicom-rs.

jmlaka avatar jmlaka commented on August 26, 2024

Could a PartialDateTime be represented like:

pub enum DateTimeParts {
    YYYY(u16),
    YYYYMM(u16, u8),
    YYYYMMDD(u16, u8, u8),
    YYYYMMDDhh(u16, u8, u8, u8),
    YYYYMMDDhhmm(u16, u8, u8, u8, u8),
    YYYYMMDDhhmmss(u16, u8, u8, u8, u8, u8),
    YYYYMMDDhhmmssF(u16, u8, u8, u8, u8, u8, u32),
}

pub struct PartialDateTime {
    parts: DateTimeParts,
    offset: FixedOffset
}

and everything except a full 6 digit fraction of a second (FFFFFF) would be considered a PartialDateTime with the
possibility to call .earliest() and .latest() on the struct

from dicom-rs.

Enet4 avatar Enet4 commented on August 26, 2024

and everything except a full 6 digit fraction of a second (FFFFFF) would be considered a PartialDateTime with the
possibility to call .earliest() and .latest() on the struct

This is a good question. If the value does not specify all 6 digits, then according to the standard, the second fraction is only specific up to the given digits. However, I can imagine many time values in this domain to be incomplete, as modalities might not feel the need to be that specific.

As we are dealing with much smaller time measurements, it should be safe to accept that we have all components from the moment we have at least one digit from the fraction of a second. So, we would let them be converted to a NaiveTime or a DateTime with a note in the documentation that the remaining unspecified precision is assumed to be 0.
In either case, earliest and latest may indeed be good methods to have in that struct.

from dicom-rs.

jmlaka avatar jmlaka commented on August 26, 2024

Hello @Enet4 ,

as you suggested, I implemented the Partial[Date,Time,DateTime] structures (parsing, precision, range) and propagated the methods to primitive.rs.

At this stage we end up with 2 separate methods for a PrimitiveValue e.g. :
to_date() - which fails, if value incomplete, returns NaiveDate
to_partial_date() - which accepts incomplete values, returns PartialDate

From a user perspective, one has to call the the first method only to find out that it failed for imprecise values.

I wonder how this could be simplified for the user.

Maybe if we rewrote the PrimitiveValue types:

    /// A sequence of complete dates.
    /// Used for the DA representation.
    Date(C<NaiveDate>),

    /// A sequence of complete date-time values.
    /// Used for the DT representation.
    DateTime(C<DateTime<FixedOffset>>),

    /// A sequence of complete time values.
    /// Used for the TM representation.
    Time(C<NaiveTime>),

... so that they contain a custom struct like the implemented Partials, maybe named Dicom[Date,Time, DateTime] .
The Partial structure can hold a DA, TM, DT with full precision, so they can return NaiveDate, NaiveTime, DateTime...

such structure would then have documented (trait?) methods like

DicomDate.has_all_components() -> bool (or is_precise())
DicomDate.precision() -> DateComponent (year,month, day)
DicomDate.to_precise_value() -> Option<NaiveDate> (none if not precise)
DicomDate.to_range() -> (Option<NaiveDate>,Option<NaiveDate>)
DicomDate.earliest() -> Result<NaiveDate>
DicomDate.latest() -> Result<NaiveDate>

Or maybe, do you see a different place where this incovenience for the user can be sorted out ?
Thanks

from dicom-rs.

Enet4 avatar Enet4 commented on August 26, 2024

Hey @jmlaka, thank you for your continued commitment to this! Let's see...

as you suggested, I implemented the Partial[Date,Time,DateTime] structures (parsing, precision, range) and propagated the methods to primitive.rs.

At this stage we end up with 2 separate methods for a PrimitiveValue e.g. :
to_date() - which fails, if value incomplete, returns NaiveDate
to_partial_date() - which accepts incomplete values, returns PartialDate

From a user perspective, one has to call the the first method only to find out that it failed for imprecise values.

It does not seem too far out or inconvenient. One could maybe do a few other tweaks to the API:

  • Calling the methods to_specific_date(&self) -> NaiveDate and to_date(&self) -> PartialDate would more likely direct users to the latter method first, and through documentation they can understand whether to_specific_date would be more suitable for them. This documentation part can be done regardless of whether the method names are changed.
  • Or, just provide only to_date and let that be the expected path towards making a NaiveDate, which narrows the interface into the bare minimum required to be functional and usable: value.to_date()?.as_naive_date()?

I wonder how this could be simplified for the user.

Maybe if we rewrote the PrimitiveValue types:

    /// A sequence of complete dates.
    /// Used for the DA representation.
    Date(C<NaiveDate>),

    /// A sequence of complete date-time values.
    /// Used for the DT representation.
    DateTime(C<DateTime<FixedOffset>>),

    /// A sequence of complete time values.
    /// Used for the TM representation.
    Time(C<NaiveTime>),

... so that they contain a custom struct like the implemented Partials, maybe named Dicom[Date,Time, DateTime] .
The Partial structure can hold a DA, TM, DT with full precision, so they can return NaiveDate, NaiveTime, DateTime...

There are two matters when working with this primitive value type:

  • The umbrella of conversion to other types, so that users can easily interpret the value as strings, dates, bytes, etc.
  • How they are persisted in memory, through the enum variants, which influences the underlying decoder implementation and how users have to build these values themselves.

In practice, the Date and Time, and DateTime variants do not occur when opening DICOM files, since these values are instead read as strings and converted by the user when such is requested. They still make sense when constructing new DICOM values in memory, so it is fair to make adjustments to these variants if deemed worthwhile.

such structure would then have documented (trait?) methods like

DicomDate.has_all_components() -> bool (or is_precise())
DicomDate.precision() -> DateComponent (year,month, day)
DicomDate.to_precise_value() -> Option<NaiveDate> (none if not precise)
DicomDate.to_range() -> (Option<NaiveDate>,Option<NaiveDate>)
DicomDate.earliest() -> Result<NaiveDate>
DicomDate.latest() -> Result<NaiveDate>

Or maybe, do you see a different place where this incovenience for the user can be sorted out ?

So yes, calling this one just DicomDate is a better name. 👍 It better conveys the idea that the date can be partial, as stated by the standard. Same for DicomTime and DicomDateTime. This part sounds good to me!

Feel free to ask any follow up questions if anything wasn't clear.

from dicom-rs.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.