At the moment, blocks in the GDF file are returned as a list which mirrors how they ar

Easygdf Dict Interface about easygdf HOT 7 CLOSED

electronsandstuff commented on July 19, 2024

Easygdf Dict Interface

Comments (7)

electronsandstuff commented on July 19, 2024

The current naming of load and save isn't the best, and migrating to the dict interface will require renaming anyway, so this is a good time to think about it.

Other Python libraries seem to use either load/dump or open/save. Since I am moving to a dict interface but want to maintain backwards compatibility, I don't think I can reuse the current name load without causing issues; it may always have to refer to the function that loads blocks of data as a list. Instead, I could switch to the names open/save (as in Pillow, for instance). That way the open function gets a new name that is unique to the dict interface, and the save function can figure out which version to use based on the objects passed to it.
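
A rough sketch of how save could pick a writer from its argument. The helper name pick_writer and the labels it returns are hypothetical and not part of easygdf; the only assumption is that dict-style blocks arrive as a mapping and the existing format arrives as a list of block dicts.

```python
from collections.abc import Mapping, Sequence


def pick_writer(blocks):
    """Decide which writer a unified save() would call (hypothetical helper)."""
    if isinstance(blocks, Mapping):
        return "dict"  # new dict interface
    # str and bytes are sequences too, but never valid block containers
    if isinstance(blocks, Sequence) and not isinstance(blocks, (str, bytes)):
        return "list"  # existing list-of-blocks interface
    raise TypeError(f"cannot save blocks of type {type(blocks).__name__}")
```

Checking for the mapping first keeps the decision unambiguous, since a dict is never a Sequence but both are iterable.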

Since I am also planning for pandas compatibility, I also worry about the naming of those functions. The pandas naming convention is read_gdf and to_gdf. The issue is that there are two different standard files that can be read/written: screens/touts and initial distributions. I am somewhat against using one function to handle both, since getting two completely different types of output depending on the input is confusing and could lead to strange bugs for users (i.e. screens/touts returns something like a dict of lists of dataframes, while initial distribution files are just dataframes). I suppose I should also segregate the naming of the pandas functions to make it clear what they do. I could put them in a submodule named pd or pandas. Maybe it makes sense to keep the same names as the non-pandas versions, but just add the pd or pandas prefix?

Note: one argument against using submodules with the name pd or pandas is that if a user does from easygdf import * then it could cause serious confusion with names.

Right now it's looking like the public interface's names will be:

open(...) - Reads file with dict-like blocks
save(...) - Saves file with dict-like (or array-like) blocks
open_screens_touts(...) - alias for load_screens_touts
open_initial_distribution(...) - alias for load_initial_distribution
pd_open_screens_touts(...) - same as open_screens_touts, but with pandas dataframes
pd_open_initial_distribution(...) - same as open_initial_distribution, but with pandas dataframes

I still think the last few function names are kind of long; I should maybe look into shorter names for them.

electronsandstuff commented on July 19, 2024

Alternative suggestions for long function names:

screens_touts: st, particles, particle_data, pdata, parts
initial_distribution: dist, idist, init_dist

electronsandstuff commented on July 19, 2024

Now I am thinking of the following names:

read(...) - dict reader
write(...) - dict writer
read_particles(...) - alias to open_screens_touts(...)
read_dist(...) - alias to open_initial_distribution(...)
write_particles(...) - alias to save_screens_touts(...)
write_dist(...) - alias to save_initial_distribution(...)

I am still thinking about the names for the pandas functions. I would only need them for the reading functions, since the write methods can tell the type from the objects they are given. I am also now wondering whether this should be a new function or a keyword argument, and I am leaning towards the latter. I will also keep all existing functions for backwards compatibility.
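
A minimal sketch of the keyword-argument option, under loud assumptions: the flag name as_dataframe is made up, and load_initial_distribution here is a stand-in that returns invented columns instead of reading a file, so nothing below reflects the real easygdf signatures.

```python
def load_initial_distribution(path):
    """Stand-in for the existing loader; returns made-up columns, reads nothing."""
    return {"x": [0.0, 1.0e-3], "y": [0.0, -1.0e-3]}


def read_dist(path, as_dataframe=False):
    """New-style reader; as_dataframe is a hypothetical keyword argument."""
    data = load_initial_distribution(path)
    if as_dataframe:
        import pandas as pd  # imported lazily so pandas stays an optional dependency

        return pd.DataFrame(data)
    return data


# The default path returns the same plain data as the old function
columns = read_dist("beam.gdf")
```

With this shape the old name keeps working unchanged, and only users who ask for a dataframe need pandas installed.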

electronsandstuff commented on July 19, 2024

I'm starting to work on these changes in branch dict-pandas-interface

electronsandstuff commented on July 19, 2024

One issue I am immediately running into is that in the GDF format, objects can have both a value and children. This means I can't just interpret it as dicts of data.

That is, you can have blocks that look like:

[
  {'name': 'a', 'value': 0.0, 'children': [...]},
  {'name': 'b', 'value': 1.0, 'children': [...]},
]

Clearly, if there is only a value and no children, I should just return a key-value pair. If there is no value and only children, I could return just a key-list pair. When there are both a value and children, however, I guess the simplest thing to do would be {'value': 0.0, 'child': {...}}. Note: the multiple child blocks get converted to a single dict, so it is child here, not children.
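
The three cases can be sketched as a small recursive converter. The function name blocks_to_dict is hypothetical, a missing value is assumed to be represented as None, and the children-only case is converted to a dict here (an assumption; a list is also an option for that case).

```python
def blocks_to_dict(blocks):
    """Convert a list of {'name', 'value', 'children'} blocks into nested dicts."""
    out = {}
    for block in blocks:
        value = block.get("value")
        children = block.get("children") or []
        if not children:
            # value only: plain key-value pair
            out[block["name"]] = value
        elif value is None:
            # children only: the key maps straight to the converted children
            out[block["name"]] = blocks_to_dict(children)
        else:
            # both: extra layer holding the value and the converted children
            out[block["name"]] = {"value": value, "child": blocks_to_dict(children)}
    return out


example = [
    {"name": "a", "value": 0.0, "children": []},
    {"name": "b", "value": None, "children": [{"name": "c", "value": 1.0, "children": []}]},
    {"name": "d", "value": 2.0, "children": [{"name": "e", "value": 3.0, "children": []}]},
]
converted = blocks_to_dict(example)
```

Note that this sketch silently overwrites entries when two sibling blocks share a name.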

This does add another layer of depth to the tree, which I don't like, but if I don't do this, then the value would have to show up in the parent dict or in child. I guess I could put it in the child with some fixed name like 'parent_val', but that could cause a naming conflict. I could check for that issue, however.
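
A sketch of the fixed-name alternative with the conflict check. The helper name fold_value_into_child is hypothetical, and raising KeyError on a clash is just one possible policy.

```python
def fold_value_into_child(value, child, key="parent_val"):
    """Store the parent's value inside the child dict under a reserved name."""
    if key in child:
        # a real child block already uses the reserved name
        raise KeyError(f"child block already uses the reserved name {key!r}")
    # build a new dict so the caller's child dict is left untouched
    return {key: value, **child}


folded = fold_value_into_child(0.0, {"x": 1.0})
```

This avoids the extra layer of depth, at the cost of one name that child blocks can no longer use.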

electronsandstuff commented on July 19, 2024

Actually, reading more through the standard, repeated block names are common in GDF files. For screen outputs, for instance, the saved blocks all have the same name but different values. This would imply using the key (name, value) to store things in the dict interface. That isn't very appealing, since the dict interface was meant to make things easier to use and now you would have to type out crazy key names to access raw data.
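
A small illustration of the collision, using made-up blocks (the name "position" and the values are only for illustration): keying by name alone keeps one block, while keying by (name, value) keeps both but makes lookups clumsy.

```python
# Two blocks sharing a name but differing in value
blocks = [
    {"name": "position", "value": 0.1, "children": []},
    {"name": "position", "value": 0.2, "children": []},
]

# Keyed by name only: the second block overwrites the first
by_name = {b["name"]: b for b in blocks}

# Keyed by (name, value): both survive, but every access needs the tuple
by_name_value = {(b["name"], b["value"]): b for b in blocks}
second = by_name_value[("position", 0.2)]
```

The tuple keys also depend on exact float equality, which makes them fragile for values that come out of a simulation.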

electronsandstuff commented on July 19, 2024

Sadly, it's starting to seem like a dict interface might not be reasonable for the GDF files. The most common output will be touts / screen outs which can't be represented this way. I might just have to stick to the current format. I suppose the names could still be improved, however, and I can still add the pandas methods.
