waveorder's People

Contributors

andrewdchen, bryantchhun, camfoltz, ieivanov, johannarahm, lihaoyeh, mattersoflight, talonchandler, tiffchien, ziw-liu


waveorder's Issues

Focus-finding features

@ieivanov and I just discussed some desired features for the focus-finding algorithms:

  • return a normalized "SNR" or "confidence" metric alongside the in-focus slice. We're concerned that low-SNR images may make the focus-finding fail and cause erratic stage motion. I need to think more and test exactly what metric we'll use, but @ieivanov and I agreed that we want the metric to be invariant to (1) the number of pixels, (2) the overall brightness of the sample, and (3) the size of the sample.
  • a plot=True option to save a small figure of the focus-finding run: likely mid-band power vs. slice, a dot at the peak, and an annotation with the confidence metric (see the sketch after this list).
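A minimal sketch of what this could look like (hypothetical helper, not the waveorder API; the mid-band-power profile input and the exact confidence formula are placeholders until we settle on a metric):

import numpy as np
import matplotlib.pyplot as plt

def find_focus(midband_power, plot=False, path="focus_debug.png"):
    """Return (in-focus slice index, confidence) from a per-slice mid-band power profile.

    Dividing by the mean makes the metric independent of pixel count and
    overall brightness; the peak prominence relative to the spread is a
    crude, size-insensitive "SNR" squashed into [0, 1).
    """
    profile = np.asarray(midband_power, dtype=float)
    profile = profile / profile.mean()
    peak = int(np.argmax(profile))
    snr = (profile[peak] - np.median(profile)) / (profile.std() + 1e-12)
    confidence = float(snr / (1.0 + snr))
    if plot:
        fig, ax = plt.subplots(figsize=(3, 2))
        ax.plot(profile, label="mid-band power")
        ax.plot(peak, profile[peak], "ro")
        ax.annotate(f"confidence = {confidence:.2f}", (peak, profile[peak]))
        ax.set_xlabel("slice index")
        fig.savefig(path, dpi=150)
        plt.close(fig)
    return peak, confidence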

torch meshgrid warning

recOrder tests surfaced the following warning:

recOrder/tests/cli_tests/test_compute_tf.py::test_compute_transfer
  /opt/anaconda3/envs/recorder-dev/lib/python3.9/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/TensorShape.cpp:3484.)
    return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
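A likely fix, assuming the call originates in our own code rather than a dependency, is to pass the indexing argument explicitly; indexing="ij" matches the current default behavior and silences the warning:

import torch

y = torch.arange(3)
x = torch.arange(4)

# Explicit indexing="ij" keeps the current (matrix-indexing) behavior and
# silences the deprecation warning.
yy, xx = torch.meshgrid(y, x, indexing="ij")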

Support GPUs via `torch`

waveorder's simulations and reconstructions are moving to torch following the new models structure, and along the way we decided to temporarily drop GPU support in favor of prioritizing the migration.

We would like to restore GPU support for many of our operations, especially our heaviest reconstructions.

@ziw-liu, can you comment on the easiest path you see to GPU support?
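For reference, a minimal sketch of what opting into GPU compute could look like once the models operate on torch tensors (the variable names and shapes below are placeholders, not the current API):

import torch

# Move the data and the transfer function to the same device; FFT-heavy
# operations then run on the GPU without further changes.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

zyx_data = torch.rand(32, 256, 256, device=device)
transfer_function = torch.complex(
    torch.rand(32, 256, 256, device=device),
    torch.rand(32, 256, 256, device=device),
)

recon = torch.fft.ifftn(
    torch.fft.fftn(zyx_data.to(torch.complex64)) * transfer_function
).real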

acquisitions that are stopped early produce incorrect zarr outputs

problem

A micro-manager acquisition that stops early (either from a crash or manually) does not update the ome.xml metadata, which is determined by the experiment parameters defined at the start of the acquisition. Consequently, tifffile, zarr, and our io.MicromanagerReader expect a larger dataset than actually exists, and the written file ends up with a larger shape than the data that was acquired.

solution

Unclear right now. It's possible the ome.tiff structure has something in place to signify "empty data" or similar; this will need some research.

Zarr Metadata for Data Reading / Writing

As we discussed in the deployment meeting, we are moving towards data storage in zarr or ome-zarr format. As we start to think about optimizing the storage, we should also begin thinking about how to structure subgroups of zarr datastores so that a data reader/writer module will be able to load data into the correct place.

For example, Ivan has different stokes data stored as subgroups in a larger datastore that corresponds to all of the data for a single FOV. These subgroups can have names, such as "Stokes_1_FOV1", and when you need to load this array, you will have to specify the name of this subgroup. For a data reader, we would need a way to standardize the naming format of these subgroups, or assign subgroup attributes, so that we can intuitively scan through a datastore to find the correct subgroups to load.

I suggest that we come up with a standardized way of assigning attributes to these subgroups, so that we do not necessarily need to adhere to strict naming conventions. For example, suppose we are saving computed Stokes channels as different subgroups in a datastore with the structure store/Stokes/Stokes0_zstack, where the array is Stokes0_zstack. We can assign attributes in the following way:

import zarr

store_path = '/home/camfoltz2/Stokes_FOV1.zarr'
store = zarr.open(store_path)

store['Stokes']['Stokes0_zstack'].attrs['Type'] = 'S0'

By calling .attrs['Type'] = 'S0' we create an attribute of the array called 'Type' and set it to 'S0'. You can then search through this dictionary of attributes when loading data. These attributes do not have to be strings; they can also be lists and numbers.

I think a good idea here would be to standardize the attributes of all of our data, and have the data io module assign these standardized attributes to arrays upon saving. @bryantChhun and others what do you think here? I think there will need to be some type of data labeling as we move to the zarr storage format.
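For example, a reader could then locate arrays by attribute rather than by name. A minimal sketch (the 'Type' attribute follows the convention proposed above; the function name is hypothetical):

import zarr

def find_arrays_by_type(store_path, wanted_type):
    """Return paths of all arrays in the store whose 'Type' attribute matches."""
    root = zarr.open(store_path, mode="r")
    matches = []

    def visitor(path, node):
        if isinstance(node, zarr.Array) and node.attrs.get("Type") == wanted_type:
            matches.append(path)

    root.visititems(visitor)
    return matches

# e.g. find_arrays_by_type('/home/camfoltz2/Stokes_FOV1.zarr', 'S0')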

Reader to automatically determine datatype (singlepagetiff vs. multipage tiff)

It would be a nice feature for the user to not have to specify what type of raw data they are trying to read, such as ometiff or singlepagetiff. I think the two are different enough at the metadata level that we should be able to distinguish them and automatically determine the type at init; see the sketch below.

This will help with eventual recOrder pipelines and allow the user to not specify this parameter.
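A rough sketch of the kind of check that could work (the function name and heuristics are hypothetical, not the current reader logic):

from pathlib import Path

def infer_data_type(folder):
    """Guess the raw-data flavor from the files on disk."""
    folder = Path(folder)
    if any(folder.rglob("*.ome.tif")):
        return "ometiff"          # multipage OME-TIFF acquisition
    if any(folder.rglob("*.tif")):
        return "singlepagetiff"   # one plain tiff per image plane
    raise FileNotFoundError(f"No TIFF data found under {folder}")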

WaveorderWriter default array name

Zarr defaults to calling an unnamed array arr_0; maybe we can call the array that the WaveorderWriter writes arr_0 instead of array. This way we can use dot notation to index the array: zs.arr_0 instead of zs['arr_0']. @camFoltz, if you remember, zs.array is a method that we're shadowing by naming the data array array.

`WaveorderWriter` is not compatible with `napari-ome-zarr` reader

I first noticed this issue during a recOrder session when I tried to drag and drop a zarr store saved by a recOrder acquisition into napari. The layers and shape populated correctly, but the actual pixel values were set to zero.

Digging further, I've realized that this is a WaveorderWriter issue. Here's a minimal example:

$ python
>>> import numpy as np
>>> from waveorder.io import WaveorderWriter
>>> writer = WaveorderWriter("./test")
>>> writer.create_zarr_root("test")
>>> writer.init_array(position=0, data_shape=(1,1,2,3,4), chunk_size=(1,1,2,3,4), chan_names=["TEST"])
>>> writer.write(np.random.random((1,1,2,3,4)), p=0)
>>> quit()
$ napari --plugin napari-ome-zarr test/test.zarr

results in this traceback:

18:06:35 WARNING version mismatch: detected: FormatV01, requested: FormatV04
18:06:35 ERROR Failed to load Row_0/Col_0/0/0
Traceback (most recent call last):
  File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\fsspec\mapping.py", line 137, in __getitem__
    result = self.fs.cat(k)
  File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\fsspec\spec.py", line 755, in cat
    return self.cat_file(paths[0], **kwargs)
  File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\fsspec\spec.py", line 665, in cat_file
    with self.open(path, "rb", **kwargs) as f:
  File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\fsspec\spec.py", line 1034, in open
    f = self._open(
  File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\fsspec\implementations\local.py", line 162, in _open
    return LocalFileOpener(path, mode, fs=self, **kwargs)
  File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\fsspec\implementations\local.py", line 260, in __init__
    self._open()
  File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\fsspec\implementations\local.py", line 265, in _open
    self.f = open(self.path, mode=self.mode)
FileNotFoundError: [Errno 2] No such file or directory: 'Q:/Talon/2022_11_23/recOrderPluginSnap_0/test/test.zarr/Row_0/Col_0/0/0/.zarray'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\zarr\storage.py", line 1376, in __getitem__
    return self.map[key]
  File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\fsspec\mapping.py", line 141, in __getitem__
    raise KeyError(key)
KeyError: 'Row_0/Col_0/0/0/.zarray'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\zarr\core.py", line 241, in _load_metadata_nosync
    meta_bytes = self._store[mkey]
  File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\zarr\storage.py", line 1378, in __getitem__
    raise KeyError(key) from e
KeyError: 'Row_0/Col_0/0/0/.zarray'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\ome_zarr\reader.py", line 548, in get_tile
    data = self.zarr.load(path)
  File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\ome_zarr\io.py", line 113, in load
    return da.from_zarr(self.__store, subpath)
  File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\dask\array\core.py", line 3569, in from_zarr
    z = zarr.Array(mapper, read_only=True, path=component, **kwargs)
  File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\zarr\core.py", line 215, in __init__
    self._load_metadata()
  File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\zarr\core.py", line 232, in _load_metadata
    self._load_metadata_nosync()
  File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\zarr\core.py", line 243, in _load_metadata_nosync
    raise ArrayNotFoundError(self._path)
zarr.errors.ArrayNotFoundError: array not found at path %r' 'Row_0/Col_0/0/0'

It seems related to our naming convention of Row_0/Col_0/Pos_000. Do we need to change this convention, or change the metadata so that ome-zarr can read our zarrs?

I think this is reasonably important because there are lots of cases where I'd like to use the ome-zarr reader to split a zarr's channels into layers (its native behaviour) instead of the recOrder-napari reader, which opens the data as a single array (also convenient in some cases). It also causes confusion because the reader seems to work but shows zeros---did the reconstruction get saved? (It did.)

Interface of WaveorderWriter.writer is not "pythonic"

The t, c, z input interface to WaveorderWriter.writer is not very pythonic. It would make more sense for these parameters to be integers or iterables (e.g. c = range(4)) rather than lists of start and end indices, i.e. c = [0, 4]. This should simplify WriterBase.write too; a sketch follows.
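A minimal sketch of the kind of normalization the writer could do internally (function name is hypothetical):

from collections.abc import Iterable

def normalize_index(idx, size):
    """Accept an int, an iterable of ints, or None (meaning all indices)."""
    if idx is None:
        return tuple(range(size))
    if isinstance(idx, int):
        return (idx,)
    if isinstance(idx, Iterable):
        return tuple(int(i) for i in idx)
    raise TypeError(f"Expected int or iterable of ints, got {type(idx)!r}")

# normalize_index(range(4), size=10) -> (0, 1, 2, 3)
# normalize_index(2, size=10)        -> (2,)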

reconstruction with GPU seems to be limited by GPU RAM

I tried to deconvolve a single position and the 40 GB of GPU RAM wasn't sufficient. Perhaps there is a bug in the version of waveOrder used by recOrder.

The path to config file:
comp_micro/projects/HEK/2022_01_20_orgs_BF_GFP_63x_04NA/phase_fluor_decon_sequential.yml

To reproduce, do the following on our HPC nodes:

module load anaconda
module load comp_micro
conda activate recorder
recOrder.reconstruct --config phase_fluor_decon_sequential.yml

I see the following error log

(recorder) [shalin.mehta@gpu-a-002 2022_01_20_orgs_BF_GFP_63x_04NA]$ recOrder.reconstruct --config phase_fluor_decon_sequential.yml 
Reading Data...
Finished Reading Data (0.0 min)
Creating new zarr store at /hpc/projects/comp_micro/projects/HEK/2022_01_20_orgs_BF_GFP_63x_04NA/phase_fluor_decon/phase_fluor_optimization_pos47.zarr
Initializing Reconstructor...
Finished Initializing Reconstructor (5.62 min)
Initializing Reconstructor...
Finished Initializing Reconstructor (3.18 min)
Beginning Reconstruction...
Traceback (most recent call last):
  File "/home/shalin.mehta/.conda/envs/recorder/bin/recOrder.reconstruct", line 33, in <module>
    sys.exit(load_entry_point('recOrder', 'console_scripts', 'recOrder.reconstruct')())
  File "/home/shalin.mehta/code/recOrder/recOrder/scripts/run_pipeline.py", line 48, in main
    manager.run()
  File "/home/shalin.mehta/code/recOrder/recOrder/pipelines/pipeline_manager.py", line 337, in run
    deconvolve2D, deconvolve3D = self.pipeline.deconvolve_volume(stokes)
  File "/home/shalin.mehta/code/recOrder/recOrder/pipelines/phase_from_bf_pipeline.py", line 136, in deconvolve_volume
    lambda_re=self.config.TV_reg_ph_3D, itr=self.config.itr_3D)
  File "/home/shalin.mehta/code/recOrder/recOrder/compute/qlipp_compute.py", line 308, in reconstruct_phase3D
    itr=itr, verbose=False)
  File "/home/shalin.mehta/.conda/envs/recorder/lib/python3.7/site-packages/waveorder/waveorder_reconstructor.py", line 2370, in Phase_recon_3D
    f_real = Single_variable_Tikhonov_deconv_3D(S0_stack, H_eff, reg_re, use_gpu=self.use_gpu, gpu_id=self.gpu_id, autotune=autotune_re, verbose=verbose)
  File "/home/shalin.mehta/.conda/envs/recorder/lib/python3.7/site-packages/waveorder/util.py", line 1069, in Single_variable_Tikhonov_deconv_3D
    f_real_f = compute_f_real_f(xp.log10(reg_re))
  File "/home/shalin.mehta/.conda/envs/recorder/lib/python3.7/site-packages/waveorder/util.py", line 961, in compute_f_real_f
    f_real_f = S0_stack_f * H_eff_conj / (H_eff_abs_square + reg_coeff)
  File "cupy/core/core.pyx", line 1045, in cupy.core.core.ndarray.__truediv__
  File "cupy/core/_kernel.pyx", line 1063, in cupy.core._kernel.ufunc.__call__
  File "cupy/core/_kernel.pyx", line 565, in cupy.core._kernel._get_out_args
  File "cupy/core/core.pyx", line 2380, in cupy.core.core._ndarray_init
  File "cupy/core/core.pyx", line 151, in cupy.core.core.ndarray._init_fast
  File "cupy/cuda/memory.pyx", line 578, in cupy.cuda.memory.alloc
  File "cupy/cuda/memory.pyx", line 1250, in cupy.cuda.memory.MemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 1271, in cupy.cuda.memory.MemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 939, in cupy.cuda.memory.SingleDeviceMemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 959, in cupy.cuda.memory.SingleDeviceMemoryPool._malloc
  File "cupy/cuda/memory.pyx", line 1210, in cupy.cuda.memory.SingleDeviceMemoryPool._try_malloc
cupy.cuda.memory.OutOfMemoryError: Out of memory allocating 7,314,866,176 bytes (allocated so far: 36,674,994,688 bytes).

`WaveorderReader` uses `glob` to read a folder of ome-tiffs

The WaveorderReader scrambles the position order when it reads a folder of ome-tiffs.

For example:

$ python
>>> from waveorder.io import WaveorderReader
>>> reader = WaveorderReader('/hpc/projects/compmicro/raw_data/hummingbird/all_21_3/') # will take ~30 seconds
>>> reader.reader.files[0:3]
['...MMStack_232.ome.tif', '...MMStack_177.ome.tif', '...MMStack_57.ome.tif']

This line seems to be the culprit, since glob docs say that the files will be returned in arbitrary order.

A possible solution is to wrap the glob in a natsort.natsorted call. But maybe there's a better way that uses the MM metadata instead of the file names?
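A sketch of the natsort fix (this is not the actual reader code; the path pattern is a placeholder):

from glob import glob
from natsort import natsorted

# natsorted orders "MMStack_2" before "MMStack_10", unlike plain sorted(),
# and fixes the arbitrary ordering returned by glob.
files = natsorted(glob("/path/to/acquisition/*.ome.tif"))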

Expose step_size from metadata

A key parameter that we use in reconstruction is the z step size. If we can pull this out of the metadata and make it easily accessible, it would be very useful.

Partial FOV Loading

With the introduction of memory mapping, it will be nice to have partial FOV loading, as @lihaoyeh mentioned in #43.

This should be possible to make general for all the readers and I will document it here as an upcoming feature addition.
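As a sketch of the idea, with a zarr- or dask-backed array the partial FOV can simply be a slice taken before compute (the store path and component below are placeholders):

import dask.array as da

# Only the requested region is read from disk when .compute() is called.
arr = da.from_zarr("/path/to/store.zarr", component="array")
crop = arr[0, 0, :, 256:512, 256:512].compute()  # (T, C, Z, Y, X) crop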

`lambda_illu`, `NA_obj`, and `NA_illu` names can be improved

The waveorder_microscopy constructor contains the following lines:

self.lambda_illu = lambda_illu / n_media
self.NA_obj = NA_obj / n_media
self.NA_illu = NA_illu / n_media

which creates confusion when reading the methods and during refactoring.

I am planning to make the following changes:

  1. Rename self.lambda_illu to self.lambda_media because lambda_illu/n_media is the wavelength inside the immersion media. It's particularly important to keep track of what media we're in because we have plans to do reconstructions in layered samples (immersion media, water, etc.).

  2. Do not change the names of self.NA_obj and self.NA_illu, but make them true to their names and don't normalize them by n_media.

Slow birefringence reconstruction for z-stacks

The time to reconstruct z-stacks does not scale linearly and quickly becomes an issue when reconstructing >50-slice z-stacks. These stacks have shape (5, 61, 2048, 2048) for the 5 polarization states, Z, Y, X.

The issue arises mainly in the reconstruct_qlipp_stokes() function, where we get:

1 slice:   Stokes 1.6 s,   Birefringence 0.4 s
5 slices:  Stokes 10.2 s,  Birefringence 2 s
10 slices: Stokes 24.6 s,  Birefringence 4.6 s
61 slices: Stokes 210 s,   Birefringence 327 s

#%%
from waveorder.io.reader import WaveorderReader
from waveorder.io.writer import WaveorderWriter
from recOrder.io.utils import load_bg
from recOrder.compute.reconstructions import (
    initialize_reconstructor,
    reconstruct_qlipp_stokes,
    reconstruct_qlipp_birefringence,
    reconstruct_phase3D,
)
from datetime import datetime
import numpy as np
import os
import sys
import multiprocessing as mp  

#debugging
import time

# %%
#Data paths
data_root = '/hpc/projects/comp_micro/rawdata/falcon/Ed/2022_12_09_zebrafish_GFP_ISG15_dtomat_MPEG'
# data_folder = 'infected_uninfected_fish_5'
data_folder = 'recOrderPluginSnap_0/multiposition_intoto_1'
dataset_folder = os.path.join(data_root,data_folder)
print('data:' + dataset_folder)
# Background folder name
bg_folder = os.path.join(data_root,'BG')
print('bg_folder:' + bg_folder)
# %%
#Setup Readers
reader = WaveorderReader(dataset_folder)
print(reader.shape)

t = 0
pos = 0
pol_data = reader.get_array(0)[t, :5,25:26]
# channels = ['state0','state1','state2','state3','state4','gfp','dtomato']
# pol_data = data[:5,25:30]
C, Z, Y, X  = pol_data.shape

# fluor_data = data[6:]
bg_data = load_bg(bg_folder, height= Y, width= X)

# %%
# Get first position
print("Start Reconstructor")
reconstructor_args = {
    "image_dim": (Y, X),
    "mag": 8.25,  # magnification   is 200/165
    "pixel_size_um": 3.45,  # pixel size in um
    "n_slices": Z,  # number of slices in z-stack
    "z_step_um": 5,  # z-step size in um
    "wavelength_nm": 532,
    "swing": 0.1,
    "calibration_scheme": "5-State",  # "4-State" or "5-State"
    "NA_obj": 0.55,  # numerical aperture of objective
    "NA_illu": 0.2,  # numerical aperture of condenser
    "n_obj_media": 1.0,  # refractive index of objective immersion media
    "pad_z": 5,  # slices to pad for phase reconstruction boundary artifacts
    "bg_correction": "local_fit",  # BG correction method: "None", "local_fit", "global"
    "mode": "3D",  # phase reconstruction mode, "2D" or "3D"
    "use_gpu": False,
    "gpu_id": 0,
}

reconstructor = initialize_reconstructor(
    pipeline="birefringence", **reconstructor_args
)

proj_folder =  '/hpc/projects/comp_micro/projects/zebrafish-infection/2012_12_09_zebrafish_GFPisg15_dtomato_mpeg'
timestamp = datetime.now().strftime("%Y_%m_%d_%H%M%S")
proj_folder = os.path.join(proj_folder, timestamp)
if not os.path.exists(proj_folder):
    os.mkdir(proj_folder)
output_folder = os.path.join(proj_folder, 'multiposition_intoto_1')
print(output_folder)

writer = WaveorderWriter(output_folder)

timestamp_short  = datetime.now().strftime("%Y%m%d")
writer.create_zarr_root(timestamp_short + "_birefringence_pos_" + str(pos))
    
# Reconstruct data Stokes w/ background correction
print("Begin stokes calculation")
start_time = time.time()
bg_stokes = reconstruct_qlipp_stokes(bg_data,reconstructor)
stokes = reconstruct_qlipp_stokes(pol_data, reconstructor,bg_stokes)
print(f'Stokes elapsed time:{time.time() - start_time}')
print(f"Shape of background corrected data Stokes: {np.shape(stokes)}")
# Reconstruct Birefringence from Stokes
# Shape of the output birefringence will be (C, Z, Y, X) where
# Channel order = Retardance [nm], Orientation [rad], Brightfield (S0), Degree of Polarization
print("Begin birefringence calculation")
bire_start_time = time.time()
birefringence = reconstruct_qlipp_birefringence(stokes, reconstructor)
birefringence[0] = (
    birefringence[0] / (2 * np.pi) * reconstructor_args["wavelength_nm"]
)
print(f'Bire elapsed time:{time.time() - bire_start_time}')
print(f'Total elapsed time:{time.time() - start_time}')
# %%

I am documenting this as an issue for further deep debugging. Thank you @talonchandler for testing and confirming similar results.

Divide by zero during phase normalization

recOrder tests surfaced the following warning.

recOrder/tests/cli_tests/test_reconstruct.py::test_reconstruct
  /Users/talon.chandler/waveorder/waveorder/util.py:711: RuntimeWarning: invalid value encountered in divide
    img_norm_stack[i] = img_stack[i] / uniform_filter(

This section is used by the isotropic_thin_3d reconstruction to normalize the data before reconstruction. The division was copied directly from pre-alg-dev code, and since these reconstructions are passing our qualitative tests, I'm not concerned about deep problems here.

This issue is related to #88. Edit: uniform_filter is from scipy while uniform_filter_2D is a custom implementation.
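A possible guard, sketched here with placeholder arguments (waveorder's actual filter size and stack handling may differ):

import numpy as np
from scipy.ndimage import uniform_filter

def normalize_image(img, filter_size=100, eps=1e-12):
    """Divide by the local mean, avoiding invalid values where the mean is ~0
    (e.g. padded or empty regions)."""
    local_mean = uniform_filter(img, size=filter_size)
    return img / np.maximum(local_mean, eps)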

Migrate fluorescence reconstructions to `torch` and `model`-based structure

I'm planning to move the fluorescence reconstructions to torch and to the new model-based structure.

  • the waveorder_reconstructor/fluorescence_microscopy class will eventually be deprecated
  • the current fluorescence_microscopy class supports 2D, 3D, and fluorescence anisotropy reconstructions. Following our <sample_type>_<sample_thickness>_<data_type> model-name schema, I suggest moving and renaming these reconstructions to:
  1. 2D -> isotropic_fluorescent_thin_3d
  2. 3D -> isotropic_fluorescent_thick_3d
  3. fluorescence anisotropy -> inplane_anisotropic_fluorescent_thin_pol3d

@mattersoflight naming discussion is welcome.

@mattersoflight @edyoshikun if our immediate needs require just the isotropic_fluorescent_thick_3d, then I will prioritize that model.

`WaveorderReader` should give more informative error messages

The current WaveorderReader error messages can be slightly confusing. For example:

>>> from waveorder.io import WaveorderReader
>>> WaveorderReader('./2T_3P_16Z_128Y_256X_Kazansky_BF_1')
<waveorder.io.reader.WaveorderReader object at 0x7f7d1807cc70>
>>> WaveorderReader('./2T_3P_16Z_128Y_256X_Kazansky_BF_1/2T_3P_16Z_128Y_256X_Kazansky_BF_1_MMStack_Pos0.ome.tif')
ValueError: Specific input contains no ome.tif files, please specify a valid input directory
>>> WaveorderReader('./nonexistent')
FileNotFoundError: No compatible data found under ./nonexistent, please specify the top level micromanager directory.

I might suggest that:

  • trying to open a single multipage .ome.tif file should give a different error message, like "WaveorderReader expects a directory of .ome.tif files, not a single file." As written, the error makes it seem like the ome.tif is invalid or corrupt.
  • trying to open a nonexistent file/folder should display a message like "This file/folder does not exist on the file system, please check your path and provide a directory." As written, the error makes it seem like the reader only reads micromanager directories and that the files are invalid or corrupt, when in reality it's a path issue. A sketch of both checks follows.
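A sketch of the two checks (hypothetical helper, not the current reader code):

from pathlib import Path

def validate_input_path(path):
    p = Path(path)
    if not p.exists():
        raise FileNotFoundError(
            f"{path} does not exist on the file system; "
            "please check your path and provide a directory."
        )
    if p.is_file():
        raise ValueError(
            "WaveorderReader expects a directory of .ome.tif files, "
            f"not a single file: {path}"
        )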

README figure is low-res and transparent

The latest README figure is lower resolution than the previous one, so the smallest text is barely legible:

(screenshot of the low-resolution README figure, 2023-02-22)

@ziw-liu also pointed out that the background is mostly transparent and inconsistent:

the equation in the figure is on a solid white background, while the rest of the figure is on a transparent background that may have contrast issues in a dark theme

@mattersoflight can you point me to the source file or fix these when you get a chance?

Note: this does not need to be fixed before the release, since I'm linking the github file directly (I learned through recOrder releases that linking the github file is a good strategy since distributing .pngs is suboptimal---no need to pip-install big pngs for everyone).

`opencv-python` is taking >30 minutes to build on M1

opencv-python just released version 4.7.0.72 and it is taking >30 minutes to build on my M1. Installing the same version on Windows is fast, and installing the earlier version 4.7.0.68 is also fast.

@ziw-liu I'm considering pinning opencv-python==4.7.0.68 for this release so that I can move forward. As you mention, we're not depending on opencv very much, and it may be causing a bug, so we're planning to remove this dependency soon anyway.

@ziw-liu are you able to replicate an extremely slow opencv-python build? This problem affects recOrder installations on my M1 too.

Removing I/O module

Moving the conversation over from czbiohub-sf/iohub#55 (comment):

Remove WaveorderReader from waveorder and depend on the tagged version of iohub via github. After testing, release waveorder==1.0.0.

@talonchandler Does waveorder need to depend on iohub? I thought the plan was to turn waveorder into an array-in-array-out library and leave file I/O to downstream applications like recOrder.

More efficient GPU compute

Most of the QLIPP compute functions take the same variables in and out of GPU memory, which is inefficient.

For example, Stokes_transform puts the raw Stokes images into GPU memory, carries out the Stokes normalization, and takes the normalized Stokes parameters out of GPU memory; Polscope_bg_correction then puts the normalized Stokes images back into GPU memory to carry out the background normalization math, and takes the background-corrected Stokes images out of GPU memory at the end.

A more efficient way to process a single z-stack would be something like this:

reconstructor.Stokes_recon(I_meas) ## here I_meas is YXZ stack; raw Stokes images stay in GPU memory

reconstructor.Stokes_transform() ## normalize raw Stokes images which are in GPU memory and keep them there

reconstructor.Polscope_bg_correction(S_bg) ## pass a single argument with background Stokes images; these can stay in GPU memory for reuse or be overwritten if a new S_bg is passed the next time the function is called

phys_props = reconstructor.Polarization_recon() ## recon phys props with bg corrected Stokes in GPU memory

phase = reconstructor.Phase_recon_3D(method='Tikhonov') ## recon phase with S[0] from GPU memory

We'll need to make sure that the reconstruction still works as expected if these steps are not followed in order. For example, calling reconstructor.Stokes_recon should erase old S_norm, S_corr, etc. from GPU memory. Recon functions can still return the expected output variables such that intermediate results can be inspected. The Stokes input parameters to Stokes_transform, Polscope_bg_correction, Polarization_recon, and Phase_recon_3D should be optional, such that, for example, Stokes parameters can be processed (e.g. denoised, filtered, averaged) outside of the reconstructor before plugging them back in the pipeline.

`napari.Viewer()` fails after `import waveorder` in OnDemand session

In an OnDemand session with waveorder installed (in the shared recOrder environment or the current main branch):

>>> import napari
>>> v = napari.Viewer() 

will open a napari window as expected. If instead you try

>>> import waveorder
>>> import napari
>>> v = napari.Viewer()

napari does not open and you'll receive a message:

WARNING: QObject::moveToThread: Current thread (0x557aa28ed750) is not the object's thread (0x557aa47ffdb0).
Cannot move to target thread (0x557aa28ed750)

Warning: Could not load the Qt platform plugin "xcb" in "/home/talon.chandler/.local/lib/python3.8/site-packages/cv2/qt/plugins" even though it was found.
WARNING: This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.

Available platform plugins are: xcb, eglfs, linuxfb, minimal, minimalegl, offscreen, vnc, wayland-egl, wayland, wayland-xcomposite-egl, wayland-xcomposite-glx, webgl.

Changing the order of imports is workable for scripts, but not always convenient. This issue might also be causing problems in recOrder; see mehta-lab/recOrder#109.

Odd-sized dimensions use incorrect frequency coordinates

Spatial frequency coordinates are used by all of waveorder's frequency-space reconstructions (phase, uPTI, in-progress phase from polarization, fluorescence) to calculate the coordinates of OTFs and interpret the results of FTs.

For example, waveorder.util.gen_coordinate is used to calculate all transverse spatial frequency coordinates. If M is the number of pixels along an axis and ps is the pixel size, then the current implementation uses ifftshift((np.r_[:M]-M/2)/M/ps) to calculate the spatial frequencies along that dimension---see full function definition.

np.fft provides a utility function for exactly this purpose: fftfreq. The results of ifftshift((np.r_[:M]-M/2)/M/ps) and np.fft.fftfreq(M, ps) are identical for even M, but they diverge for odd M. The fftfreq function matches the convention of most DFT implementations: whether the signal has an even or odd length, the DC term always appears as the first entry.

For example:

>>> import numpy as np
>>> M = 4
>>> ps = 1
>>> np.fft.ifftshift((np.r_[:M]-M/2)/M/ps)
array([0., 0.25, -0.5, -0.25])
>>> np.fft.fftfreq(M, ps)
array([0., 0.25, -0.5, -0.25]) # identical
>>> M = 5
>>> np.fft.ifftshift((np.r_[:M]-M/2)/M/ps)
array([-0.1,  0.1,  0.3, -0.5, -0.3]) # no DC term
>>> np.fft.fftfreq(M, ps)
array([ 0. ,  0.2,  0.4, -0.4, -0.2]) # different! now includes the DC term

This issue only affects reconstructions with odd dimensions, and the discrepancy shrinks as the dimensions grow. I suspect a fix will only make a noticeable difference for small transverse FOVs, so this fix is currently a low priority since most of our reconstructions are on large FOVs.

I've investigated the axial frequency coordinates further because we frequently use odd and small dimensions there. I could not find any evidence of incorrect axial frequency coordinates for 3D-phase and 3D-fluorescence reconstructions because none of the direct computations happen in the frequency domain. I found potentially concerning lines in gen_dyadic_Greens_tensor, which is used to generate the 3D vector WOTF for PTI, but I'm not extremely concerned because I think most of those reconstructions were run with a large number of z slices.

Context and comments welcome.

Phase transfer function is numerically small

We have been observing inconsistent dynamic range issues with phase reconstructions across different datasets. When I sweep the Tikhonov regularization strength, the same FOV also shows different dynamic ranges:

regularization strength: 1e-4 std: 0.0071132425
regularization strength: 1e-3 std: 0.0012211832
regularization strength: 1e-2 std: 0.00013627978

It seems like the product of the regularization parameter and the width of the histogram stays the same (inverse relation). When inspecting the transfer function, the numerical values appear to be small:

>>> transfer_function = zarr.open("transfer_function.zarr")["real_potential_transfer_function"]
>>> tf = np.abs(transfer_function)
>>> tf.min()
0.0
>>> tf.max()
0.030031698
>>> tf.mean()
0.007650641
>>> tf.std()
0.0053157224

Looking at how the inverse was applied, it seems like the output would be dominated by the regularization parameter we typically use:

waveorder/waveorder/util.py

Lines 972 to 978 in 366ef1b

def compute_f_real_f(reg_x):
    reg_coeff = 10**reg_x
    # FT{f} (f=scattering potential (whose real part is (scaled) phase))
    f_real_f = S0_stack_f * H_eff_conj / (H_eff_abs_square + reg_coeff)
    return f_real_f

Here if H_eff_abs_square is much smaller than reg_coeff, the output will be close to being divided by the reg_coeff alone, resulting in the inverse relation between regularization strength and the dynamic range of the reconstructed image.

@mattersoflight suggests that we rescale $H_{eff}$ to $[0, 1]$, so that $H_{eff} \gg \rho$.
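A sketch of that rescaling (assuming H_eff is the effective transfer function and reg_coeff the regularization strength; the scaling is undone on the estimate so the reconstructed values keep their original units):

import numpy as np

def tikhonov_inverse_rescaled(S0_stack_f, H_eff, reg_coeff):
    scale = np.max(np.abs(H_eff))
    H = H_eff / scale                    # |H| now peaks at 1, so |H| >> reg_coeff in-band
    f_real_f = S0_stack_f * np.conj(H) / (np.abs(H) ** 2 + reg_coeff)
    return f_real_f / scale              # undo the rescaling of the estimate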

The config I used to generate the transfer function:

input_channel_names: ["BF"]
time_indices: all
reconstruction_dimension: 3
phase:
  transfer_function:
    wavelength_illumination: 0.450
    yx_pixel_size: 0.1494
    z_pixel_size: 0.205
    z_padding: 30
    index_of_refraction_media: 1.4
    numerical_aperture_detection: 1.35
    numerical_aperture_illumination: 0.85
    invert_phase_contrast: false
  apply_inverse:
    reconstruction_algorithm: Tikhonov
    regularization_strength: 0.001
    TV_rho_strength: 0.001
    TV_iterations: 1

Dataset: /hpc/projects/comp.micro/mantis/2023_08_18_A549_DRAQ5/0-zarr/a549_draq5_1/a549_draq5_labelfree_1.zarr/0/0/0

Make `torch` optional dependency?

Does it make sense to make torch an optional dependency? How easy is it to write non-torch-accelerated versions of the waveorder algorithms? Are there cases where we'd like to use waveorder but cannot install torch, e.g. on machines with no GPU?

partial loading of large ome.tiff files

problem

For very large ome.tiff files, the data loading time can be very long. This is primarily because we use the tifffile library to parse the ome.xml structure and tiff page locations. Even if we open as zarr, the page locations need to be found, which requires a highly iterative search through the data before any data is loaded (zarr facilitates lazy loading once you want the data).

This issue is running documentation of research into, and attempts at, solving this problem.

research

I created this issue on the tifffile repo: cgohlke/tifffile#83
The answer is that no high-level API exists for this kind of loading and that we'd have to implement it manually.

Here is an example of someone's manual tifffile parsing to extract cropped regions of a large multi-tile image:
https://stackoverflow.com/questions/62747767/how-to-efficiently-read-part-of-a-huge-tiff-image
and the corresponding answer from Christoph
https://gist.github.com/rfezzani/b4b8852c5a48a901c1e94e09feb34743
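One partial workaround, sketched below with a placeholder file name: recent tifffile versions can expose the OME-TIFF as a zarr store, so only the requested crop is read from disk once the (still slow) page index has been parsed.

import tifffile
import zarr

store = tifffile.imread("large_acquisition.ome.tif", aszarr=True)
data = zarr.open(store, mode="r")
crop = data[0, 0, 10, 512:1024, 512:1024]  # (T, C, Z, Y, X) crop
store.close()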

singlepage tiff reader for micro-manager data

The current reader is capable of ome-tiff reading. We should add functionality so that it can parse and assemble sequences of singlepage tiffs. Micro-manager structures these single-page tiffs in one of several ways:

  1. One folder per position
  2. all files in one folder

Some of this code could be taken from the reconstruct-order 1.0 repo. I'll also try to set aside test data that we can continually reference for this.
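A rough sketch of the per-position assembly (the file naming and sorting are assumptions about the Micro-Manager layout, not the final reader code):

from pathlib import Path

import numpy as np
import tifffile
from natsort import natsorted

def load_position(folder):
    """Stack a folder of single-page tiffs (one file per plane) into an array."""
    files = natsorted(Path(folder).glob("*.tif"))
    return np.stack([tifffile.imread(f) for f in files])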

waveorder changes matplotlib backend at import

The waveorder __init__.py file imports all library modules, even if they are not used. This includes from .visual import *, which changes the default matplotlib backend (in my case to QtAgg) even if no functions from the visual module are used, and that can create problems with other libraries. Can we remove the from .visual import * line, or even the entire contents of this file? I think this is not good practice in general.

Bump Python version

Tests fail for #92 because PEP 586 needs Python 3.8.

Given the current state of the environment: #92 (comment), we can bump the supported Python version and tests to 3.8-3.11.

Release 1.0

question

What is the minimal feature set after which we're ready to release a v1.0? This 1.0 release could also coincide with a pypi release, but it doesn't need to. Also, we can choose a 0.0.1 if that feels better.

Some ideas:

  • more documentation?
  • update readme to include reader/writer details.
  • more examples?
  • specific features: parallelized reading/writing?

Torch warning about copying tensors

recOrder tests surfaced the following warnings at optics.py:321, optics.py:370, and stokes.py:291

recOrder/tests/cli_tests/test_reconstruct.py::test_reconstruct
  /Users/talon.chandler/waveorder/waveorder/optics.py:321: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
    * torch.tensor(z_position_list)[:, None, None]

You can generate these warnings with a call like

import torch
torch.tensor(torch.tensor([0,1,2]))[:,None,None]

The core issue seems to be that some optics and stokes functions can take either a list or a tensor.

@ziw-liu do you think we should make sure that only lists are passed? Or can you think of a simpler solution?

This does not seem critical.
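A possible small fix is to use torch.as_tensor, which accepts lists and tensors alike and does not copy-construct from an existing tensor:

import torch

z_position_list = torch.tensor([0, 1, 2])             # may arrive as a list or a tensor
z = torch.as_tensor(z_position_list)[:, None, None]   # no warning in either case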

Tiled deconvolution

I think it is time to implement tiled deconvolution in waveOrder to handle deconvolving large Z-Stacks.

I have run into this problem quite frequently, where I cannot fit the computation on a single GPU node. It would be nice if we could implement something similar to what dexp does with its scatter_gather function.

It would also be nice to have automatic detection of when the GPU is about to fill up, so that we can employ a scatter function to break the compute into smaller chunks; a minimal sketch follows.
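A minimal sketch of the scatter/gather idea (not the dexp API; chunking here is along Z only and `func` stands in for the per-chunk GPU deconvolution):

import numpy as np

def tiled_apply(volume, func, chunk=32, overlap=8):
    """Apply `func` to overlapping Z-chunks and stitch the chunk centers back
    together; the overlap absorbs boundary artifacts from per-chunk FFTs."""
    n_z = volume.shape[0]
    out = np.empty(volume.shape, dtype=np.float32)
    for start in range(0, n_z, chunk):
        stop = min(start + chunk, n_z)
        lo = max(0, start - overlap)
        hi = min(n_z, stop + overlap)
        result = func(volume[lo:hi])
        out[start:stop] = result[start - lo : start - lo + (stop - start)]
    return out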

modifying reader attributes does not modify underlying tiff-class attributes

after building a reader:

mmr = MicromanagerReader(<folder to tiff files>)

modifying an attribute such as frames will not change the underlying tiff-class's attribute

mmr.frames = 10
print(mmr.reader.frames)
>>> not 10

This is a problem in cases where an acquisition quits before it's scheduled to finish, so the metadata reflects a different dimensionality than actually exists. A simple case is when you specify a 100-frame time series at the start but end the collection after 50 frames.

Solution:
change reader attributes to getters/setters that modify and return the underlying tiff-type attribute; see the sketch below.
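A sketch of the getter/setter approach (class and attribute names follow the example above):

class MicromanagerReader:
    def __init__(self, tiff_reader):
        self.reader = tiff_reader   # underlying tiff-class instance

    @property
    def frames(self):
        return self.reader.frames

    @frames.setter
    def frames(self, value):
        self.reader.frames = value  # keeps mmr.frames and mmr.reader.frames in sync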

estimate focus slice during 2D phase from brightfield reconstruction

2D phase reconstruction from a brightfield stack is widely useful for high-throughput screens. The plane in which cells are in optimal focus can drift slightly even in the presence of hardware autofocus. There is a signature in the data to estimate the drift: the optimal plane of focus is where the brightfield image has the lowest contrast (if only phase information is present) or where the 3D phase reconstruction has the highest contrast. In the following data from @tmakushok and reconstruction by @talonchandler, slice 2 or 3 is where the cells are in focus.

(attachments: data.mov, recon.mov)

One approach to dealing with the drift in focus is to use a max projection of the 3D phase. This should work fine for segmentation but is obviously not suitable for analyzing density variations. The proper fix would be to estimate the slice where cells are in focus from the spatial frequency spectrum of the brightfield/3D phase stack.

We compute the spectrum of brightfield data during the reconstruction. This step can be implemented before 3D OTFs are computed and when the focal_slice parameter is set to auto.

WaveorderReader ome-zarr contrast limits

  • Contrast limits provided to init_array should be in the format of either (end, max, min, start) or (start, end).
  • Document how contrast limits should be formatted. It would be more intuitive to format the len=4 tuple as (min, max, start, end).
  • Right now create_channel_dict assumes that the data is float32 and sets max to 65535. It would be good to match that to the data type so that, for example, the default max for uint8 data would be 255.
