
Comments (8)

cmeyer commented on August 16, 2024
  1. Do not copy metadata. Copying almost always results in incorrect metadata and makes it impossible to use metadata for anything useful. For instance, let's say we try to detect "data acquired from a camera" using metadata, which we do currently. If all processed data from the original has a copy of the metadata, how are we to distinguish it from the original?
  2. Put source metadata into a sub-tag. This is also not a good idea, as the nested metadata can grow quickly and it still makes the metadata difficult to access.

My plan for this in the future (which may be now) is to provide more metadata functions that allow for tracing the sources of the metadata. An operation can record metadata about its sources, and additional functions can then access a processed item's sources and retrieve metadata from them. The obvious downside is that this fails if the source is missing.
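
A minimal sketch of what such source-tracing helpers could look like (the function names and the "source_uuids" key are hypothetical, not part of niondata):

import typing

def record_sources(result_metadata: dict, source_uuids: typing.Sequence[str]) -> None:
    # store only references to the sources, not copies of their metadata
    result_metadata["source_uuids"] = list(source_uuids)

def resolve_source_metadata(result_metadata: dict,
                            lookup: typing.Callable[[str], typing.Optional[dict]]) -> typing.List[dict]:
    # 'lookup' maps a source uuid to that item's metadata dict, or None if the
    # source no longer exists (the downside mentioned above)
    resolved = []
    for source_uuid in result_metadata.get("source_uuids", []):
        metadata = lookup(source_uuid)
        if metadata is not None:
            resolved.append(metadata)
    return resolved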

Another approach would be to get rid of the concept of unstructured metadata completely and define subsets that are more specific and may be copied. We already do this to some extent with calibrations and data descriptions. In the near future we may add coordinate systems. The first example might be "description of processing". Another new example might be "original data acquisition information." This needs some thinking.
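
As a rough illustration of such a well-defined, copyable subset, a "description of processing" structure might look like the following (the class below is hypothetical, not an existing niondata structure):

import dataclasses
import typing

@dataclasses.dataclass
class ProcessingDescription:
    operation: str                                  # e.g. "gaussian_blur"
    parameters: typing.Dict[str, typing.Any]
    source: typing.Optional["ProcessingDescription"] = None

    def chain(self) -> typing.List[str]:
        # list the operations that produced this item, oldest first
        step, names = self, []
        while step is not None:
            names.append(step.operation)
            step = step.source
        return list(reversed(names))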

cmeyer commented on August 16, 2024

Before I close this issue, I'd like to understand your use case. What are you trying to accomplish by copying metadata?

Brow71189 commented on August 16, 2024

I think there are a lot of processing routines where copying the metadata is valid. Think of things like a Gaussian blur (or any other filter). Even aligning a sequence does not invalidate metadata.
Right now, if you process your data in Swift, you always lose all metadata.
If you then export or even just snapshot the result, you have no idea how this data was acquired because all the information about it is simply gone.
There might be operations that invalidate metadata, but defaulting to dropping everything is the wrong approach in my opinion.

cmeyer commented on August 16, 2024

I've considered the exporting case before, and my conclusion was that export is a special operation that should consolidate processing information and include metadata from sources. See nion-software/nionswift#397 and nion-software/nionswift#398.

If the user has a way to view sources and metadata of those sources directly in an expanded metadata editor and export has an option for including processing info and metadata from sources, would that satisfy your concerns?

Brow71189 commented on August 16, 2024

It is better than nothing, but I would still prefer that functions which do not invalidate metadata simply copy it to their results.

Brow71189 commented on August 16, 2024

New example where this behavior leads to an awkward implementation:

Consider a camera with the new acquire_synchronized capabilities. It returns partial data whose xdata attribute is the SI data, which usually still contains the flyback pixels. So the obvious thing to do is to crop the data with the following code:

from nion.data import xdata_1_0 as xd
partial_data = camera.acquire_synchronized_continue(...)
# crop away the flyback pixels; calibrations etc. are preserved
xdata = xd.data_slice(partial_data.xdata, (slice(None), slice(-2), ...))

This is great because it keeps calibrations etc. and it is an API function.
But now we have to call a "private" method to get our metadata dict back:

xdata._set_metadata(partial_data.xdata.metadata)

Why does data_slice strip the metadata dict from the data? This makes absolutely no sense to me...
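
Until something like this is supported directly, one workaround is a small local wrapper so that the private call lives in a single place (a sketch only; it relies on the same private _set_metadata method shown above):

from nion.data import xdata_1_0 as xd

def data_slice_keep_metadata(source_xdata, slices):
    # slice as usual, then copy the source metadata back onto the result
    sliced = xd.data_slice(source_xdata, slices)
    sliced._set_metadata(source_xdata.metadata)
    return sliced

# usage, cropping the flyback pixels as above:
# xdata = data_slice_keep_metadata(partial_data.xdata, (slice(None), slice(-2), ...))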

cmeyer commented on August 16, 2024

An additional rationale for this issue is described in the similar-to-niondata xarray project, where metadata is copied only when explicitly requested or when doing so is unambiguous.

xarray: What is your approach to metadata?

cmeyer commented on August 16, 2024

Additional notes 2021-07-21:

1:

  1. In general, metadata describes what we are looking at, how it was collected, and its calibrations. Obviously none of these things change when a sequence is aligned. An integration will obviously affect the effective exposure time.
  2. Chris' comment, that because the metadata contains an item specific to "data acquired from a camera" we should discard all information, is highly facetious. The proper way to handle this is a metadata tag "Origin" that, after the metadata is copied, is set to the origin of the data. I would also add a tag "Trail" that appends the current Origin to the Trail of the source, so we would see "Trail"="Camera ELA,Align,Integrate,Crop" for a basic workflow (see the sketch after this list).
  3. Losing calibrations is nuts, and it does not help that the inspector cannot be used to copy them, since it only shows a heavily rounded version of the calibration.
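
A sketch of how an operation could maintain the proposed "Origin" and "Trail" tags after copying the source metadata (the helper itself is hypothetical):

import copy

def propagate_metadata(source_metadata: dict, operation_name: str) -> dict:
    metadata = copy.deepcopy(source_metadata)  # keep everything by default
    source_trail = metadata.get("Trail", "")
    metadata["Origin"] = operation_name
    metadata["Trail"] = f"{source_trail},{operation_name}" if source_trail else operation_name
    return metadata

# Acquire -> Align -> Integrate -> Crop would then give Origin="Crop" and
# Trail="Camera ELA,Align,Integrate,Crop", as described above.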

2:

We would be hugely better off if we simply copied the metadata from the parent to the child. There is indeed a chance that this could lead to some confusion, but it would still be an improvement on the current situation, where confusion is guaranteed.

Generally speaking, a more robust method for handling metadata is needed. We need options for handling complex cases, with sensible defaults for the most common operations; a rough sketch follows the use cases below.

Common use cases include:

  1. multi-acquire NxM, align, integrate -- please do not lose the calibration information (neither scale nor intensity), microscope kV, acquisition detector, etc.
    (this could be 10,000 spectra with a 1 ms acquisition time, or 1,000 images at 1 us/pixel, or other)
  2. acquire a spectrum, subtract a dark image, multiply by a gain image -- please do not lose the calibration information (neither scale nor intensity), microscope kV, acquisition detector, etc.
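
A rough sketch of the kind of per-operation policy, defaulting to copy, suggested above (all names and the policy table are hypothetical, not an existing API):

import copy
import enum
import typing

class MetadataPolicy(enum.Enum):
    COPY = "copy"      # default: pass the source metadata through unchanged
    DROP = "drop"      # the operation genuinely invalidates the metadata
    CUSTOM = "custom"  # the operation supplies corrected metadata itself

DEFAULT_POLICIES = {
    "gaussian_blur": MetadataPolicy.COPY,
    "align": MetadataPolicy.COPY,
    "integrate": MetadataPolicy.CUSTOM,  # e.g. effective exposure time changes
}

def result_metadata(operation: str, source_metadata: dict,
                    custom: typing.Optional[dict] = None) -> dict:
    policy = DEFAULT_POLICIES.get(operation, MetadataPolicy.COPY)
    if policy is MetadataPolicy.DROP:
        return {}
    if policy is MetadataPolicy.CUSTOM and custom is not None:
        return custom
    return copy.deepcopy(source_metadata)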
