tomography / tomopy-cli Goto Github PK

View Code? Open in Web Editor NEW

2.0 2.0 8.0 1.74 MB

A command-line-interface for tomopy

Home Page: https://tomopycli.readthedocs.io

License: Other

Python 100.00%

tomopy-cli's People

Contributors

Stargazers

Watchers

Forkers

aps-7bm nikitinvv canismarko vdeandrade brk4 decarlof cpchuang mittoalb

tomopy-cli's Issues

storing a parameter as True

Need to add to the developer documentation the following:

if in the config.py file, my-parameter

    'my-parameter': {
        'default': False,
        'help': 'Verbose output',
        'action': 'store_true'}

is in the 'general' section then

tomopy recon --my-parameter

set args.my-parameter to True only when --my-parameter is used.
if my-parameter is defined in any other section listed as RECON_PARAMS or
FIND_CENTER_PARAMS then args.my-parameter is set to True the first time it is passed and it stays True till the next init

logger color decoder is not working on windows systems

#71 provided support to windows. All works except the logger is not correctly printing using colored messages.

change of the /process tag hadling

Currently the generation of the raw hdf data file (see 009_test.h5) has 3 actors:

During data collection:

areadetector in /exchange and /measurement following the data-exchange definition
tomo scan in /process/acquisition/

During data analysis:

tomopy-cli in /process/tomopy-cli-0.2/

To standardize the /process structure to better include other scan/recon methods (@MarkRivers and other), I propose tomopy-cli to store the processing information

in:

/process/reconstruction/tomopy-cli-0.2/

instead of:

/process/tomopy-cli-0.2/

This is a trivial change but it will require some coordination to make sure is not breaking any other dependency from the current structure in the find center and beam hardening sections of tomopy-cli.

I will follow with a pull request and ask @nikitinvv and @aps-7bm to review and propagate the change to any other local code the may have developed.

Enhancement request from Vincent

So, here is my first impression.

The help of the command line is very hard to digest: after doing a copy / past of the full help in word, one ends up with 4 pages of command lines (without any line intervals). I feel like it has to be subdivided with the most important info at the top, or at least organize on the readthedoc webpage.
I feel like reading through the config file should be easier:
- with most exotic functions at the very end (like beam hardening)
- more comments explaining what is what (trying to put the comments on the same lines as the commands to minimize the length of the file)
Some info I was trying to get and did not find (I am not saying it is not there, but I did not find it within the time I spent looking at CLI):
- doesn’t explain it can mixed config files and options
- Doesn’t tell where data will be dumped. Personally, I don’t like the fact the recon are not in the same folder as the hdf files. I put experiment in one folder (PI name). If I switch to cli, hdf files should be transferred at the end of the scan into a sub-folder then rather than the GUP folder on txmtwo.
Remark: if data has problem (flat field not at the good location for instance), I usually reshape stacks inside my recon script. I can’t do it here but I think the best way is to apply the fix to the hdf5 directly. I should make a bunch of scripts ready for this.

Chunked phase-retrieval reconstruction incorrectly recover slices on borders

Example:
If chunk size 32 then slices 0..4 and 28..31 shows no effect of the Paganin filtering.

For correct filtering one should use the chunks of sizes at least 48 (40 for ichunk=0 and ichunk=end-1), i.e. 0..40, 24..72, and cut them after filtering to get 0..32, 32..64, etc.

hdf file writer

Currently the config data are saved in the hdf file as string. This should be handled properly by a dedicated hdf writer that handles type and attributes (for units)

read hdf file meta data

I propose to use meta to replace all auto_read_dexchage() functions.

As a start, the option to save the recon data as hdf (--output-format hdf5) now is adding all data collection meta data (see #84 )

calling tomopy without paramenters

throughs an error:

logs_home = args.logs_home
AttributeError: 'Namespace' object has no attribute 'logs_home'

we should capture it

extra_parameters_file

I'm looking for a way to specify per-file recon parameters, and I have a solution to propose. It's not technically breaking backwards compatibility, but we should deprecate the rotation_axis_file to avoid confusion.

Use Case: I have a large number (>100) tomograms to reconstruct from an in-situ experiment. I have a single recon.conf file that is close to the right configuration for all the tomograms, but each tomogram has a few parameters that are different (e.g. rotation_axis, nsino, stripe_removal).

My solution using YAML file: My first thought was to extend the rotation_axis.json approach already present. The problem is that the current schema of {"0": {"my_tomogram001.h5": 1024}} only allows for specifying a rotation axis. I would have to break the schema in order to include other parameters (and then need to include some version info to detect which schema is being used). Since I'm rewriting the schema anyway, why not switch to a more human-readable markup language, like YAML. The same rotation axis info would then be::

my_tomogram001.h5:
  rotation-axis: 1024

Extra parameters can be added as well:

my_tomogram001.h5:
  rotation-axis: 1024
my_tomogram002.h5:
  rotation-axis: 1015
  remove-stripe-method: none

I create new branch yaml-parameter-file that implements this approach. At the start of each reconstruction, a new argparse.Namespace object is created that is a deep copy of the main args. Then if args.extra_parameters_file is given, the YAML file is loaded for the current value of args.file_name and the extra parameters (if any) are swapped out in the local Namespace used for the reconstruction. After each reconstruction, the local Namespace is discarded.

Any argument given directly via the command-line takes precedence over those found in the YAML file, and the YAML file takes precedence over the .conf file.

The same YAML file can also be provided as the argument to args.file_name. In this case, the list of HDF5 files to reconstruct will be taken from the YAML file, but the parameters will be ignored (unless the same file is given for args.extra_parameters_file). This matches the current behavior for if a JSON file is given as args.file_name.

Open Questions:

Is YAML the best approach here? Are there situations where YAML is too constrained? It would be nice not to change the schema again if this is accepted.
After a full reconstruction, a new .conf file is saved alongside the TIFFs for reproducibility. Should this reflect the main args Namespace, or the temporary local Namespace? I think it should be the local one but could see an argument for either one.
How best to deprecate args.rotation_axis_file? One option is to put a check and logging in config.update_config so the parameter is not saved. That way $tomopy init doesn't save that parameter. It's a bit fragmented, though.
What to do with args.rotation_axis_auto? Currently, if args.rotation_axis_auto == "json", then the rotation axis is taken from the JSON file. I've done the same for the YAML file if args.rotation_axis_auto == "yaml", but this seems needlessly complicated since this only applies to rotation_axis and requires someone to be aware of this behavior. If we removed this, though, it would be difficult to specify a manual rotation center in the .conf file and still use args.extra_parameters_file.
Is args.extra_parameters_file the right name for the parameter or is there something more clear?
Does order matter? The current rotation_axis.json file includes an index for each tomogram, but I don't see a reason that this would matter. Even if order was important, it seems that the order in which the files are listed could be respected.

Backwards Compatibility:

I don't think there's any breaking changes here. The $ tomopy find_center command will save rotation centers into the YAML file specified by args.extra_parameters_file. The args.rotation_axis_file parameter is effectively deprecated, though will still be used for rotation centers if given, with the YAML file taking precedence.

Other Strategies: Here are some other things I considered but ultimately didn't like for one reason or another.

I could create a .conf file for each tomogram. The problem here is that if I need to change some common parameter (e.g. number of iterations) I need to update every .conf file.
Shell script or makefile that contains a series of lines to call tomopy recon mytomogramNNN.h5 and then add each overidden parameter as an argument to the relevant script line. This would probably work, but rotation centers would still be stored in the rotation_axis.json file created by tomopy find_center. Would be nice to have everything in one file and all handled by tomopy-cli.

orthorec

Looking for suggestion in keeping orthorec as a stand alone command or include it into tomopy-cli.

Thoughts?

handling multiple reconstructions running at the same time

Currently when running multiple reconstructions each saves its status in the same file (~/tomopy.conf) so unless tomopy recon is called with all params all the time we may end up using a set of incorrect parameter values.

A possible solution is to add a --config-update default False

any comment or concern?

support for different data format

add support tomographic data generated at different beamline by integrating data-exchange readers.

inconsistent rec images saving for different --reconstruction-type options

--reconstruction-type slice saves under the raw data directory while try/full under top_rec/
--reconstruction-type full must have overwrite=True as default

numpy 1.24.X (current latest) depreciates np.int()

Near line ~190 of recon.py, the following lines:

sino_chunk_start = np.int(sino_start + nSino_per_chunkiChunk)
sino_chunk_end = np.int(sino_start + nSino_per_chunk(iChunk+1))

.... np.int() is depreciated in vers 1.24.x of np and replaced with np.int32() and np.int64()

automated center of rotation

The new:

`$ tomopy recon --file-name /local/data/data.h5 --rotation-axis-auto

improves and replaces:

$ tomopy find_center --file-name /local/data/data.h5

in the current version (when passing a file) $ tomopy find_center --file-name /local/data/data.h5 just shows the automatically found center (without saving it). The new --rotation-axis-auto option handles the single file properly => I should remove the handling of the single fine by find_center so from now on this option will be used only for folders:

$ tomopy find_center --file-name /local/data/

in this case a json file is created and all 1-slice recon are stored in the same folder for rapid inspection as a stack.

Installation complains that tomopy-sacred.py does not exist

I'm trying to install tomopy-cli according to the README. When I run python setup.py install, I get an error message that bin/tomopy-sacred.py does not exist. If I change setup.py so that line 23 reads scripts=['bin/tomopy'],, then it installs fine. I'm not sure if this is the correct solution since I don't know what is contained in tomopy-sacred.py.

Use correct package name(s) when reporting version or calling function

The tomopy-cli interface conflates tomopy and tomopy-cli. This is confusing users because they don't know that tomopy-cli=0.2 (the app) could be running atop any version of tomopy=1 (the library) since there is no minimum tomopy version specified by tomopy-cli. See tomopy/tomopy#266 (comment) for example.

Please report both the tomopy and tomopy-cli versions, when calling $ tomopy --version. Possibly consider changing the app call from tomopy to tomopy-cli.

Ring removal error when using 16-bit floats

I just tried to perform a reconstruction of data saved as 16-bit floats. The code was able to run successfully except for the ring removal. Both the fw and ti options gave an error in trying to convert float16 to a ctypes.

Allow different output formats for reconstructed data

Planning to submit a PR for this, but wanted to get input on how best to tie all the pieces together. This also potentially involves a change to dxchange.

Here's the use case: during this COVID-19 business, I've been running tomography reconstructions on APS workstations (eg. Mach and txmtwo). I then copy the tiffs over to my local machine in order to do the analysis/post-processing/visualization. I prefer to do this work directly on HDF5 datasets since it's easier on my computer's memory, especially for something like plotting a single XZ slice where reading the whole tiff stack into memory is not necessary. The first step after copying is to read all the tiffs and then save them into an HDF5 file. This is a waste of time and disk space since there's no reason I couldn't just save the reconstructed slices directly to the HDF5 file and copy that instead.

My proposal: Add a string-type configuration parameter output-format with the default value of "tiff" and choises of ['tiff', 'hdf5']. A value of "tiff" keeps the current behavior. If the value is "hdf5", then the reconstructed slices are saved in a new HDF5 file. If the input tomography data is my_tomo_data.h5 then the output file would be my_tomo_data_recon.h5 or something like that and have a dataset named volume with a element_size_um attribute holding the pixel size for compatibility with the ImageJ HDF5 plugin.

dxchange: I'd like to use dxchange for writing the HDF5 file, however I noticed that the overwrite=True option to dxchange's HDF writer seems to clobber the entire file, which could be dangerous if that file also has processed data in it or something like that. I would look into submitting a PR to dxchange to change that behavior to overwrite just the HDF5 dataset, not the whole HDF5 file.

Let me know what you think and if you would prefer I do it a different way.

global log

address issue reported at #2

replace read_aps_32id with new dx_reader

Once the dxchange pull request is accepted and part of the dxchange distribution replace read_aps_32id with the new read_dx.
read_dx returns all hdf file metadata as a dictionary.

``--fix-nan-and-inf-value`` can only be the max value in the sinogram

The --fix-nan-and-inf option will replace all the nan and inf values in the new value specified in --fix-nan-and-inf-value. It will also replace any values above the new value with the new value itself, effectively chopping off the top end of the histogram (

tomopy-cli/tomopy_cli/prep.py

Line 49 in b49125d

data[data > params.fix_nan_and_inf_value] = params.fix_nan_and_inf_value

). This is not mentioned in the help string for this option, and so if I choose, for example, the median value in the sinogram as my new value, I will not get a sensible reconstruction but with indication why.

I think we should either (in order of preference):

Get rid of this one line. Does anyone still use this feature? It was originally added by Alan during my 7-BM-B beamtime in Feb 2020.
Add a separate option to cap the allowed values in the sinogram.
Indicate in the help string for the --fix-nan-and-inf option that it will cap the values in the sinogram so at least this clear when using that option.

Thoughts?

IndexError: boolean index did not match indexed array along dimension 0; dimension is 10240 but corresponding boolean dimension is 10241

A fresh install of a conda env with tomopy and tomopy-cli result in an indexing error. Error reproduced on multiple systems, following install instructions of
tomopy: https://tomopy.readthedocs.io/en/stable/
and tomopy-cli: https://tomopycli.readthedocs.io/en/latest/source/install.html

Traceback (most recent call last):
  File "/home/tsippel/anaconda3/envs/tomopy/bin/tomopy", line 33, in <module>
    sys.exit(load_entry_point('tomopy-cli==0.3', 'console_scripts', 'tomopy')())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tsippel/anaconda3/envs/tomopy/bin/tomopycli.py", line 181, in main
    args._func(args)
  File "/home/tsippel/anaconda3/envs/tomopy/bin/tomopycli.py", line 109, in run_rec
    recon.rec(args)
  File "/home/tsippel/anaconda3/envs/tomopy/lib/python3.11/site-packages/tomopy_cli/recon.py", line 92, in rec
    proj, flat, dark, theta, rotation_axis = file_io.read_tomo(sino, pproj, params) 
                                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tsippel/anaconda3/envs/tomopy/lib/python3.11/site-packages/tomopy_cli/file_io.py", line 68, in read_tomo
    proj, flat, dark, theta = flip_and_stitch(params, proj360, flat360, dark360, theta360)
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tsippel/anaconda3/envs/tomopy/lib/python3.11/site-packages/tomopy_cli/file_io.py", line 236, in flip_and_stitch
    img[:,:,:img360.shape[2]] = img360[good_180_360,:,::-1] * wedge
                                ~~~~~~^^^^^^^^^^^^^^^^^^^^^
IndexError: boolean index did not match indexed array along dimension 0; dimension is 10240 but corresponding boolean dimension is 10241

Current environment:

# Name                    Version                   Build  Channel
dxchange                  0.1.8              pyhd8ed1ab_0    conda-forge
meta                      0.1                      pypi_0    pypi
tomopy                    1.14.0             pyhd8ed1ab_0    conda-forge
tomopy-cli                0.3                      pypi_0    pypi

0-360 data collection

@canismarko @aps-7bm @nikitinvv:

there are currently two options to reconstruct 0-360 data sets collected with the rotation axis on one side

--file-type flip_and_stich --rotation-axis-flip 10

and

--file-type double_fov --rotation-axis 10

these look very similar. The only difference I can really see is that the first does the flip@stitch at file reading accounting for the overlapping area and generates flip&stitch data with different H size depending from the location of the rotation axis. The second does the flip&stitch with normalized data (just before the reconstruction) and uses a fixed double H size.

The first adds its own find_center, the second does not support find_center yet.

The first adds to --rotation-axis (that is still the location of the rotation axis but now in in the larger array) the --rotation-axis-flip parameter to indicate the location of the rotation axis in the raw data where to make the flip.

Any suggestion on how to consolidate the two so they are compatible with any other option applied after file reading (like the existing find_center used in the 0-180 scan for example)?

handling of --file name /folder

@PallDmitrich reported an inconsistency when handling a folder containing multiple hdf files.

Unsolved conflict during #75 pull request causes read_tomo error

#75 is causing:

proj, flat, dark, theta, params_rotation_axis_ignored = file_io.read_tomo(sino, pproj, params, True) TypeError: read_tomo() takes from 2 to 3 positional arguments but 4 were given

I think this is because of the unsolved conflict at:

is possible that the last changes were made on a local version that was not current.

It should be simple to solve by just replacing:

read_tomo(sino, params, ignore_flip = False):

with the new:

read_tomo(sino, proj, params, ignore_flip = False):

moved tomopy-cli under tomography

@dgursoy @carterbox Please advise if is OK to move here the tomopy-cli. I start to see forks from 7-BM and I expect more from 2-BM and 32-ID => I prefer also to work on a fork under decarlof and keep the master somewhere else

rotation-axis tag

rotation-axis is currently under file-reading. Makes it sense to move it under find-rotation-axis?

Also I would change this to use write_hdf so the rotation axis information is stored under /process/tomopy-cli-0.1/find-rotation-axis/rotation-axis.

Let me know if it is ok and I will make the changes