Comments (5)

root-11 commented on May 26, 2024

Hello @rhs3i

The HDF5 file is always stored in the system temp directory, at:

H5_STORAGE = pathlib.Path(tempfile.gettempdir()) / "tablite.hdf5"
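
For readers who want to check where that path resolves on their own machine, here is a minimal sketch using only the standard library. The `H5_STORAGE` expression is copied from the line above; the helper `describe_storage` is a hypothetical name introduced for illustration and is not part of tablite.

```python
import pathlib
import tempfile

# Same expression as quoted above; the name H5_STORAGE comes from the comment.
H5_STORAGE = pathlib.Path(tempfile.gettempdir()) / "tablite.hdf5"


def describe_storage(path: pathlib.Path = H5_STORAGE) -> str:
    """Hypothetical helper: report whether the file is present and how big it is."""
    if path.exists():
        return f"{path} exists ({path.stat().st_size} bytes)"
    return f"{path} does not exist yet"


print(describe_storage())
```

The temp directory differs per platform and can be overridden through environment variables such as `TMPDIR`, so the printed path may vary between machines.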

May I ask what you're trying to achieve?

from tablite.

rhs3i commented on May 26, 2024

Certainly. I'm the author of H5s, a scanner for HDF5. The first objective is to verify the scanner renders tablite HDF5 well. HDF5 that is intended to be primarily machine-read can stress a visual model in ways perhaps not considered, but the graphical constructs should still hold up when inspecting these files. The screenshots and links to the visual vocabulary of the scanner can give some illustration.

The second objective is to get a quick bit of insight as to whether H5s can augment usage of tablite in an interesting way, but that would be a future topic.

root-11 commented on May 26, 2024

Hi Robert,
As you can see from the use of the temp directory, the HDF5 files are generally used as a volatile database in which data is stored in a hierarchy best described as:

  1. Tables have columns.
  2. Columns have page handlers.
  3. Page handlers have pages.
  4. Pages are of type: (a) Simple (int, float), (b) String (str, utf-8), (c) Mixed (non-simple datatypes), (d) Sparse (lots of Nones).

In the tablite.hdf5 file you will therefore find that the Pages contain all the data, whilst the entries (HDF5 groups) for Tables and Columns hold no data and only carry metadata in their attrs field.
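
To make the four page types listed above concrete, here is a rough sketch of how a batch of values could be sorted into them. This is not tablite's code: the function name `page_type`, the `SPARSE_THRESHOLD` cut-off, and the handling of empty input are all assumptions made for illustration.

```python
from typing import Any, Sequence

# Assumption: "lots of Nones" is read here as "more than half"; tablite's
# actual rule is not stated in this thread.
SPARSE_THRESHOLD = 0.5


def page_type(values: Sequence[Any]) -> str:
    """Hypothetical classifier mirroring the four page types named above."""
    if not values:
        return "simple"  # arbitrary choice for an empty page
    none_share = sum(v is None for v in values) / len(values)
    if none_share > SPARSE_THRESHOLD:
        return "sparse"
    # type(v) is compared exactly, so bool (a subclass of int) is not "simple".
    if all(type(v) in (int, float) for v in values):
        return "simple"
    if all(type(v) is str for v in values):
        return "string"
    return "mixed"


for sample in ([1, 2.5, 3], ["a", "b"], [1, "a", None], [None, None, None, 1]):
    print(sample, "->", page_type(sample))
```

Only the category names come from the list above; everything else is a guess at one plausible rule.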

The details are explained here in the HDF5 group webinar: https://youtu.be/OoHVIKAD854?t=1415

rhs3i commented on May 26, 2024

Ah. Thank you for the correction, and apologies for wasting your time. I did sit through your HDF5 webinar (thank you), but I misunderstood the design: I thought that once the HDF5 backing store had been created and had stored all the computational deltas, it would persist beyond program execution and be used by a subsequent downstream tablite processor. But your presentation was clear: the re-import/reload example you showed (39:19) was from within a single program session. Scanning a volatile HDF5 datastore might have some utility for debugging, but that's another matter and may not be very useful.

Appreciate your time in getting me straightened out on this.

root-11 commented on May 26, 2024

No problem. Happy I could help.
