mehta-lab / waveorder
A generalist library of wave optical models and inverse algorithms for label-free imaging of density & orientation.
License: BSD 3-Clause "New" or "Revised" License
@ieivanov and I just discussed some desired features for the focus-finding algorithms: a plot=True option to save a small figure of the focus-finding run, likely mid-band power vs. slice, with a dot for the peak and an annotation for the confidence metric.

recOrder tests surfaced the following warning:
recOrder/tests/cli_tests/test_compute_tf.py::test_compute_transfer
/opt/anaconda3/envs/recorder-dev/lib/python3.9/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/TensorShape.cpp:3484.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
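The fix is likely just passing the indexing argument explicitly wherever we call torch.meshgrid; "ij" matches torch's current default, so this should be behavior-preserving. A minimal sketch (function name is hypothetical):

```python
import torch

def make_grid(ny: int, nx: int):
    """Build coordinate grids, passing `indexing` explicitly so that
    torch.meshgrid does not emit the upcoming-release UserWarning."""
    y = torch.arange(ny)
    x = torch.arange(nx)
    # "ij" reproduces torch's current (warned-about) default behavior.
    yy, xx = torch.meshgrid(y, x, indexing="ij")
    return yy, xx
```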
waveorder's simulations and reconstructions are moving to torch following the new models structure, and along the way we decided to temporarily drop GPU support in favor of prioritizing the migration.
We would like to restore GPU support for many of our operations, especially our heaviest reconstructions.
@ziw-liu, can you comment on the easiest path you see to GPU support?
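Not speaking for @ziw-liu, but one common path with torch is to thread a device argument through the heavy array math and let torch dispatch the same code to CPU or GPU. A minimal sketch, with a hypothetical function name and signature:

```python
import torch

def tikhonov_deconvolve(image_f: torch.Tensor, otf: torch.Tensor,
                        reg: float = 1e-3, device="cpu") -> torch.Tensor:
    """Hypothetical sketch: a Tikhonov-regularized frequency-space
    deconvolution that runs on whatever device the caller requests."""
    image_f = image_f.to(device)
    otf = otf.to(device)
    result = image_f * otf.conj() / (otf.abs() ** 2 + reg)
    return result.cpu()  # move back to host for downstream I/O
```

Callers could then pass device="cuda" if torch.cuda.is_available() else "cpu", and the rest of the pipeline stays unchanged.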
A micro-manager acquisition that stops early (either from a crash or manually) does not update the ome.xml metadata, which is determined from the experiment parameters defined at the start of acquisition. Consequently, tifffile, zarr, and our io.MicromanagerReader expect a larger dataset than exists. The result is a written file with a larger dimensionality than the source file.
Unclear right now. It's possible the ome.tiff structure has something in place to signify "empty data" or similar... Will need some research
As we discussed in the deployment meeting, we are moving towards data storage in zarr or ome-zarr format. As we start to think about optimizing the storage, we should also begin thinking about how to structure subgroups of zarr datastores so that a data reader/writer module will be able to load data into the correct place.
For example, Ivan has different stokes data stored as subgroups in a larger datastore that corresponds to all of the data for a single FOV. These subgroups can have names, such as "Stokes_1_FOV1", and when you need to load this array, you will have to specify the name of this subgroup. For a data reader, we would need a way to standardize the naming format of these subgroups, or assign subgroup attributes, so that we can intuitively scan through a datastore to find the correct subgroups to load.
I suggest that we come up with a standardized way of assigning attributes to these subgroups, so that we do not necessarily need to adhere to strict naming conventions. For example, suppose we are saving computed Stokes channels as different subgroups in a datastore with the structure store/Stokes/Stokes0_zstack, where the array is Stokes0_zstack. We can assign attributes in the following way:
store_path = '/home/camfoltz2/Stokes_FOV1.zarr'
store = zarr.open(store_path)
store['Stokes']['Stokes0_zstack'].attrs['Type'] = 'S0'
By calling .attrs['Type'] = 'S0' we can create an attribute of the array called 'Type' and define its value as 'S0'. You can then search through this dictionary of attributes when loading data. These attributes do not have to be strings; they can also be lists and numbers.
I think a good idea here would be to standardize the attributes of all of our data, and have the data io module assign these standardized attributes to arrays upon saving. @bryantChhun and others what do you think here? I think there will need to be some type of data labeling as we move to the zarr storage format.
It would be a nice feature for the user to not have to specify what type of raw data they are trying to read, such as ometiff or singlepagetiff. I think the two are different enough at the metadata level that we should be able to distinguish them and automatically determine the type upon init.
This will help with eventual recOrder pipelines and allow the user to not specify this parameter.
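One possible heuristic is a quick scan of file extensions in the input folder; this is a hypothetical, stdlib-only sketch, and the real detection may need to look at the Micro-Manager metadata instead:

```python
from pathlib import Path

def guess_data_type(folder):
    """Hypothetical heuristic: inspect file extensions to choose a reader type."""
    folder = Path(folder)
    files = [p.name for p in folder.iterdir() if p.is_file()]
    if any(name.endswith(".ome.tif") for name in files):
        return "ometiff"
    if any(name.endswith(".tif") for name in files):
        return "singlepagetiff"
    raise ValueError(f"No compatible data found under {folder}")
```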
The community has been renaming master branches to main. We can refer to this guide for such a transition in waveorder.
Zarr defaults to calling an unnamed array arr_0; maybe we can call the array that the WaveorderWriter writes arr_0 instead of array. This way we can use dot notation to index the array: zs.arr_0 instead of zs['arr_0']. @camFoltz if you remember, zs.array is a method that we're overloading by calling the data array array.
I first noticed this issue during a recOrder session when I tried to drag and drop a zarr store saved by a recOrder acquisition into napari. The layers and shape populated correctly, but the actual pixel values were all zero.
Digging further, I've realized that this is a WaveorderWriter issue. Here's a minimal example:
$ python
>>> import numpy as np
>>> from waveorder.io import WaveorderWriter
>>> writer = WaveorderWriter("./test")
>>> writer.create_zarr_root("test")
>>> writer.init_array(position=0, data_shape=(1,1,2,3,4), chunk_size=(1,1,2,3,4), chan_names=["TEST"])
>>> writer.write(np.random.random((1,1,2,3,4)), p=0)
>>> quit()
$ napari --plugin napari-ome-zarr test/test.zarr
results in this traceback:
18:06:35 WARNING version mismatch: detected: FormatV01, requested: FormatV04
18:06:35 ERROR Failed to load Row_0/Col_0/0/0
Traceback (most recent call last):
File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\fsspec\mapping.py", line 137, in __getitem__
result = self.fs.cat(k)
File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\fsspec\spec.py", line 755, in cat
return self.cat_file(paths[0], **kwargs)
File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\fsspec\spec.py", line 665, in cat_file
with self.open(path, "rb", **kwargs) as f:
File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\fsspec\spec.py", line 1034, in open
f = self._open(
File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\fsspec\implementations\local.py", line 162, in _open
return LocalFileOpener(path, mode, fs=self, **kwargs)
File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\fsspec\implementations\local.py", line 260, in __init__
self._open()
File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\fsspec\implementations\local.py", line 265, in _open
self.f = open(self.path, mode=self.mode)
FileNotFoundError: [Errno 2] No such file or directory: 'Q:/Talon/2022_11_23/recOrderPluginSnap_0/test/test.zarr/Row_0/Col_0/0/0/.zarray'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\zarr\storage.py", line 1376, in __getitem__
return self.map[key]
File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\fsspec\mapping.py", line 141, in __getitem__
raise KeyError(key)
KeyError: 'Row_0/Col_0/0/0/.zarray'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\zarr\core.py", line 241, in _load_metadata_nosync
meta_bytes = self._store[mkey]
File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\zarr\storage.py", line 1378, in __getitem__
raise KeyError(key) from e
KeyError: 'Row_0/Col_0/0/0/.zarray'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\ome_zarr\reader.py", line 548, in get_tile
data = self.zarr.load(path)
File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\ome_zarr\io.py", line 113, in load
return da.from_zarr(self.__store, subpath)
File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\dask\array\core.py", line 3569, in from_zarr
z = zarr.Array(mapper, read_only=True, path=component, **kwargs)
File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\zarr\core.py", line 215, in __init__
self._load_metadata()
File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\zarr\core.py", line 232, in _load_metadata
self._load_metadata_nosync()
File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\zarr\core.py", line 243, in _load_metadata_nosync
raise ArrayNotFoundError(self._path)
zarr.errors.ArrayNotFoundError: array not found at path %r' 'Row_0/Col_0/0/0'
Traceback (most recent call last):
File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "C:\Users\LabelFree\.conda\envs\recorder-dev\Scripts\napari.exe\__main__.py", line 7, in <module>
File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\napari\__main__.py", line 446, in main
_run()
File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\napari\__main__.py", line 335, in _run
run(gui_exceptions=True)
File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\napari\_qt\qt_event_loop.py", line 402, in run
app.exec_()
File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\contextlib.py", line 126, in __exit__
next(self.gen)
File "C:\Users\LabelFree\.conda\envs\recorder-dev\lib\site-packages\napari\_qt\utils.py", line 459, in _maybe_allow_interrupt
old_sigint_handler(*handler_args)
KeyboardInterrupt
(recorder-dev) PS Q:\Talon\2022_11_23\recOrderPluginSnap_0> python
Python 3.9.13 | packaged by conda-forge | (main, May 27 2022, 16:51:29) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> from waveorder.io import WaveorderWriter
>>> writer = WaveorderWriter("./test")
>>> writer.create_zarr_root("tes")
Creating new zarr store at ./test\tes.zarr
>>> writer.init_array(position=0, data_shape=(1,1,2,3,4), chunk_size=(1,1,2,3,4), chan_names=["TEST"])
>>> writer.write(np.random.random((1,1,2,3,4)), p=0)
>>> quit()
(recorder-dev) PS Q:\Talon\2022_11_23\recOrderPluginSnap_0> napari --plugin napari-ome-zarr test/tes.zarr
(same FormatV01 version-mismatch warning and ArrayNotFoundError traceback as above, with test.zarr replaced by tes.zarr)
It seems related to our naming convention of Row_0/Col_0/Pos_000. Do we need to change this convention, or change the metadata so that ome-zarr can read our zarrs?
I think this is reasonably important because there are lots of cases where I'd like to use the ome-zarr reader to split a zarr's channels into layers (its native behaviour) instead of the recOrder-napari reader that opens them as a single array (also convenient in some cases). It also causes confusion because the reader seems to work but returns zeros---did the reconstruction get saved? (It does.)
The t, c, z input interface to WaveorderWriter.write is not very pythonic. It makes sense for these parameters to be integers or iterables (e.g. c = range(4)) as opposed to lists of start and end indices, i.e. c = [0, 4]. This should simplify WriterBase.write too.
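A sketch of what the argument normalization could look like (helper name and None-means-all behavior are hypothetical):

```python
def normalize_range(arg, size):
    """Hypothetical helper: accept an int, an iterable of indices, or None
    (meaning all indices), instead of a [start, end] index pair."""
    if arg is None:
        return list(range(size))
    if isinstance(arg, int):
        return [arg]
    return list(arg)
```

Each of t, c, and z could pass through this once at the top of write(), so the rest of the writer only ever sees an explicit list of indices.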
I tried to deconvolve a single position and 40 GB of GPU RAM wasn't sufficient. Perhaps there is a bug in the version of waveorder used by recOrder.
The path to config file:
comp_micro/projects/HEK/2022_01_20_orgs_BF_GFP_63x_04NA/phase_fluor_decon_sequential.yml
To reproduce, do the following on our HPC nodes:
module load anaconda
module load comp_micro
conda activate recorder
recOrder.reconstruct --config phase_fluor_decon_sequential.yml
I see the following error log
(recorder) [shalin.mehta@gpu-a-002 2022_01_20_orgs_BF_GFP_63x_04NA]$ recOrder.reconstruct --config phase_fluor_decon_sequential.yml
Reading Data...
Finished Reading Data (0.0 min)
Creating new zarr store at /hpc/projects/comp_micro/projects/HEK/2022_01_20_orgs_BF_GFP_63x_04NA/phase_fluor_decon/phase_fluor_optimization_pos47.zarr
Initializing Reconstructor...
Finished Initializing Reconstructor (5.62 min)
Initializing Reconstructor...
Finished Initializing Reconstructor (3.18 min)
Beginning Reconstruction...
Traceback (most recent call last):
File "/home/shalin.mehta/.conda/envs/recorder/bin/recOrder.reconstruct", line 33, in <module>
sys.exit(load_entry_point('recOrder', 'console_scripts', 'recOrder.reconstruct')())
File "/home/shalin.mehta/code/recOrder/recOrder/scripts/run_pipeline.py", line 48, in main
manager.run()
File "/home/shalin.mehta/code/recOrder/recOrder/pipelines/pipeline_manager.py", line 337, in run
deconvolve2D, deconvolve3D = self.pipeline.deconvolve_volume(stokes)
File "/home/shalin.mehta/code/recOrder/recOrder/pipelines/phase_from_bf_pipeline.py", line 136, in deconvolve_volume
lambda_re=self.config.TV_reg_ph_3D, itr=self.config.itr_3D)
File "/home/shalin.mehta/code/recOrder/recOrder/compute/qlipp_compute.py", line 308, in reconstruct_phase3D
itr=itr, verbose=False)
File "/home/shalin.mehta/.conda/envs/recorder/lib/python3.7/site-packages/waveorder/waveorder_reconstructor.py", line 2370, in Phase_recon_3D
f_real = Single_variable_Tikhonov_deconv_3D(S0_stack, H_eff, reg_re, use_gpu=self.use_gpu, gpu_id=self.gpu_id, autotune=autotune_re, verbose=verbose)
File "/home/shalin.mehta/.conda/envs/recorder/lib/python3.7/site-packages/waveorder/util.py", line 1069, in Single_variable_Tikhonov_deconv_3D
f_real_f = compute_f_real_f(xp.log10(reg_re))
File "/home/shalin.mehta/.conda/envs/recorder/lib/python3.7/site-packages/waveorder/util.py", line 961, in compute_f_real_f
f_real_f = S0_stack_f * H_eff_conj / (H_eff_abs_square + reg_coeff)
File "cupy/core/core.pyx", line 1045, in cupy.core.core.ndarray.__truediv__
File "cupy/core/_kernel.pyx", line 1063, in cupy.core._kernel.ufunc.__call__
File "cupy/core/_kernel.pyx", line 565, in cupy.core._kernel._get_out_args
File "cupy/core/core.pyx", line 2380, in cupy.core.core._ndarray_init
File "cupy/core/core.pyx", line 151, in cupy.core.core.ndarray._init_fast
File "cupy/cuda/memory.pyx", line 578, in cupy.cuda.memory.alloc
File "cupy/cuda/memory.pyx", line 1250, in cupy.cuda.memory.MemoryPool.malloc
File "cupy/cuda/memory.pyx", line 1271, in cupy.cuda.memory.MemoryPool.malloc
File "cupy/cuda/memory.pyx", line 939, in cupy.cuda.memory.SingleDeviceMemoryPool.malloc
File "cupy/cuda/memory.pyx", line 959, in cupy.cuda.memory.SingleDeviceMemoryPool._malloc
File "cupy/cuda/memory.pyx", line 1210, in cupy.cuda.memory.SingleDeviceMemoryPool._try_malloc
cupy.cuda.memory.OutOfMemoryError: Out of memory allocating 7,314,866,176 bytes (allocated so far: 36,674,994,688 bytes).
This way, when loaded in napari, other channels are not visible and data are not overlaid. Overlaying usually does not make sense for our data.
The WaveorderReader scrambles the position order when it reads a folder of ome-tiffs.
For example:
$ python
>>> from waveorder.io import WaveorderReader
>>> reader = WaveorderReader('/hpc/projects/compmicro/raw_data/hummingbird/all_21_3/') # will take ~30 seconds
>>> reader.reader.files[0:3]
['...MMStack_232.ome.tif', '...MMStack_177.ome.tif', '...MMStack_57.ome.tif']
This line seems to be the culprit, since glob docs say that the files will be returned in arbitrary order.
A possible solution is to wrap the glob in a natsort.natsorted call. But maybe there's a better way that uses the MM metadata instead of the file names?
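For illustration, a dependency-free sketch of the natural-sort idea (natsort.natsorted would do the same thing without the hand-rolled key):

```python
import re
from pathlib import Path

def natural_key(path):
    """Split a filename into text and integer chunks so that 'MMStack_57'
    sorts before 'MMStack_177' instead of after it."""
    name = Path(path).name
    return [int(tok) if tok.isdigit() else tok
            for tok in re.split(r"(\d+)", name)]

files = ["MMStack_232.ome.tif", "MMStack_177.ome.tif", "MMStack_57.ome.tif"]
ordered = sorted(files, key=natural_key)
```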
A key parameter that we use in reconstruction is the z-step size. If we can pull this out of the metadata and make it easily accessible, it would be very useful.
The waveorder_microscopy constructor contains the following lines:
self.lambda_illu = lambda_illu/n_media
self.NA_obj = NA_obj/n_media
self.NA_illu = NA_illu/n_media
which creates confusion when reading the methods and refactoring.
I am planning to make the following changes:
Rename self.lambda_illu to self.lambda_media, because lambda_illu/n_media is the wavelength inside the immersion media. It's particularly important to keep track of which medium we're in because we have plans to do reconstructions in layered samples (immersion media, water, etc.).
Keep the names of self.NA_obj and self.NA_illu, but make them true to their names and don't normalize by n_media.
The position argument of WaveorderReader.get_array should be optional and should default to position=0, I think. Same goes for get_zarr.
The time to reconstruct z-stacks does not scale linearly and quickly becomes an issue when reconstructing >50-slice z-stacks. These stacks have shapes (5, 61, 2048, 2048) for 5-state polarization, Z, Y, X.
The issue arises mainly in the reconstruct_qlipp_stokes() function, where we get:
1 slice: Stokes 1.6 s, Birefringence 0.4 s
5 slices: Stokes 10.2 s, Birefringence 2 s
10 slices: Stokes 24.6 s, Birefringence 4.6 s
61 slices: Stokes 210 s, Birefringence 327 s
#%%
from waveorder.io.reader import WaveorderReader
from waveorder.io.writer import WaveorderWriter
from recOrder.io.utils import load_bg
from recOrder.compute.reconstructions import (
initialize_reconstructor,
reconstruct_qlipp_stokes,
reconstruct_qlipp_birefringence,
reconstruct_phase3D,
)
from datetime import datetime
import numpy as np
import os
import sys
import multiprocessing as mp
#debugging
import time
# %%
#Data paths
data_root = '/hpc/projects/comp_micro/rawdata/falcon/Ed/2022_12_09_zebrafish_GFP_ISG15_dtomat_MPEG'
# data_folder = 'infected_uninfected_fish_5'
data_folder = 'recOrderPluginSnap_0/multiposition_intoto_1'
dataset_folder = os.path.join(data_root,data_folder)
print('data:' + dataset_folder)
# Background folder name
bg_folder = os.path.join(data_root,'BG')
print('bg_folder:' + bg_folder)
# %%
#Setup Readers
reader = WaveorderReader(dataset_folder)
print(reader.shape)
t = 0
pos = 0
pol_data = reader.get_array(0)[t, :5,25:26]
# channels = ['state0','state1','state2','state3','state4','gfp','dtomato']
# pol_data = data[:5,25:30]
C, Z, Y, X = pol_data.shape
# fluor_data = data[6:]
bg_data = load_bg(bg_folder, height=Y, width=X)
# %%
# Get first position
print("Start Reconstructor")
reconstructor_args = {
"image_dim": (Y, X),
"mag": 8.25, # magnification is 200/165
"pixel_size_um": 3.45, # pixel size in um
"n_slices": Z, # number of slices in z-stack
"z_step_um": 5, # z-step size in um
"wavelength_nm": 532,
"swing": 0.1,
"calibration_scheme": "5-State", # "4-State" or "5-State"
"NA_obj": 0.55, # numerical aperture of objective
"NA_illu": 0.2, # numerical aperture of condenser
"n_obj_media": 1.0, # refractive index of objective immersion media
"pad_z": 5, # slices to pad for phase reconstruction boundary artifacts
"bg_correction": "local_fit", # BG correction method: "None", "local_fit", "global"
"mode": "3D", # phase reconstruction mode, "2D" or "3D"
"use_gpu": False,
"gpu_id": 0,
}
reconstructor = initialize_reconstructor(
pipeline="birefringence", **reconstructor_args
)
proj_folder = '/hpc/projects/comp_micro/projects/zebrafish-infection/2012_12_09_zebrafish_GFPisg15_dtomato_mpeg'
timestamp = datetime.now().strftime("%Y_%m_%d_%H%M%S")
proj_folder = os.path.join(proj_folder, timestamp)
if not os.path.exists(proj_folder):
os.mkdir(proj_folder)
output_folder = os.path.join(proj_folder, 'multiposition_intoto_1')
print(output_folder)
writer = WaveorderWriter(output_folder)
timestamp_short = datetime.now().strftime("%Y%m%d")
writer.create_zarr_root(timestamp_short + "_birefringence_pos_" + str(pos))
# Reconstruct data Stokes w/ background correction
print("Begin stokes calculation")
start_time = time.time()
bg_stokes = reconstruct_qlipp_stokes(bg_data, reconstructor)
stokes = reconstruct_qlipp_stokes(pol_data, reconstructor, bg_stokes)
print(f'Stokes elapsed time:{time.time() - start_time}')
print(f"Shape of background corrected data Stokes: {np.shape(stokes)}")
# Reconstruct Birefringence from Stokes
# Shape of the output birefringence will be (C, Z, Y, X) where
# Channel order = Retardance [nm], Orientation [rad], Brightfield (S0), Degree of Polarization
print("Begin birefringence calculation")
bire_start_time = time.time()
birefringence = reconstruct_qlipp_birefringence(stokes, reconstructor)
birefringence[0] = (
birefringence[0] / (2 * np.pi) * reconstructor_args["wavelength_nm"]
)
print(f'Bire elapsed time:{time.time() - bire_start_time}')
print(f'Total elapsed time:{time.time() - start_time}')
# %%
I am documenting as an issue for further deep debugging. Thank you @talonchandler for testing and confirming similar results.
recOrder tests surfaced the following warning.
recOrder/tests/cli_tests/test_reconstruct.py::test_reconstruct
/Users/talon.chandler/waveorder/waveorder/util.py:711: RuntimeWarning: invalid value encountered in divide
img_norm_stack[i] = img_stack[i] / uniform_filter(
This section is used by the isotropic_thin_3d reconstruction during normalization before reconstruction. This division was copied directly from pre-alg-dev code, these reconstructions are passing our qualitative tests, and I'm not concerned about deep problems here.
This issue is related to #88. Edit: uniform_filter is from scipy, while uniform_filter_2D is a custom implementation.
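If we ever want to silence the warning explicitly rather than just tolerate it, one option is a guarded divide; note that mapping zero-denominator pixels to 0 is an assumption here, not what the current code does:

```python
import numpy as np

def safe_divide(num, den, fill=0.0):
    """Divide elementwise, replacing entries with a zero denominator by `fill`
    instead of emitting 'RuntimeWarning: invalid value encountered in divide'."""
    num = np.asarray(num, dtype=float)
    den = np.asarray(den, dtype=float)
    out = np.full_like(num, fill)
    np.divide(num, den, out=out, where=den != 0)
    return out
```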
I'm planning to move the fluorescence reconstructions to torch and to the new model-based structure.
The waveorder_reconstructor/fluorescence_microscopy class will eventually be deprecated. The fluorescence_microscopy class supports 2D, 3D, and fluorescence anisotropy reconstructions. Following our <sample_type>_<sample_thickness>_<data_type> model-name schema, I suggest moving+renaming these reconstructions to:
2D -> isotropic_fluorescent_thin_3d
3D -> isotropic_fluorescent_thick_3d
fluorescence anisotropy -> inplane_anisotropic_fluorescent_thin_pol3d
@mattersoflight naming discussion is welcome.
@mattersoflight @edyoshikun if our immediate needs require just the isotropic_fluorescent_thick_3d, then I will prioritize that model.
The current WaveorderReader error messages can be slightly confusing. For example:
>>> from waveorder.io import WaveorderReader
>>> WaveorderReader('./2T_3P_16Z_128Y_256X_Kazansky_BF_1')
<waveorder.io.reader.WaveorderReader object at 0x7f7d1807cc70>
>>> WaveorderReader('./2T_3P_16Z_128Y_256X_Kazansky_BF_1/2T_3P_16Z_128Y_256X_Kazansky_BF_1_MMStack_Pos0.ome.tif')
ValueError: Specific input contains no ome.tif files, please specify a valid input directory
>>> WaveorderReader('./nonexistent')
FileNotFoundError: No compatible data found under ./nonexistent, please specify the top level micromanager directory.
I might suggest that passing a single .ome.tif file should give a different error message like "WaveorderReader expects a directory of .ome.tif files, not a single file." As written, the error might make it seem like the ome.tif is invalid/corrupt.

The latest README figure is lower resolution than the previous one, so the smallest text is barely legible:
@ziw-liu also pointed out that the background is mostly transparent and inconsistent: the equation in the figure is on a solid white background, while the rest of the figure is on a transparent background that may have contrast issues in a dark theme.
@mattersoflight can you point me to the source file or fix these when you get a chance?
Note: this does not need to be fixed before the release, since I'm linking the github file directly (I learned through recOrder releases that linking the github file is a good strategy since distributing .pngs is suboptimal---no need to pip-install big pngs for everyone).
opencv-python just released version 4.7.0.72, and it is taking >30 minutes to build on my M1. Installing the same version on Windows is fast, and installing the earlier version 4.7.0.68 is also fast.
@ziw-liu I'm considering pinning opencv-python==4.7.0.68 for this release so that I can move forward. As you mention, we're not depending on opencv very much, and it may be causing a bug, so we're planning to remove this dependency soon anyway.
@ziw-liu are you able to replicate an extremely slow opencv-python build? This problem affects recOrder installations on my M1 too.
waveorder.util.uniform_filter_2D currently uses a custom set of numerical operations to reimplement scipy.ndimage.uniform_filter on GPU. This should be replaced by calling the CuPy API directly.
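cupyx.scipy.ndimage provides a drop-in uniform_filter, so the custom kernel could likely be replaced by dispatching on array type. A sketch that falls back to SciPy on CPU (the dispatch function is hypothetical):

```python
import numpy as np
from scipy.ndimage import uniform_filter as cpu_uniform_filter

try:
    import cupy as cp
    from cupyx.scipy.ndimage import uniform_filter as gpu_uniform_filter
except ImportError:  # no GPU stack installed
    cp = None

def uniform_filter_dispatch(img, size):
    """Apply a mean filter with SciPy on NumPy arrays and CuPy's
    cupyx.scipy.ndimage implementation on GPU arrays."""
    if cp is not None and isinstance(img, cp.ndarray):
        return gpu_uniform_filter(img, size=size)
    return cpu_uniform_filter(img, size=size)
```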
Moving the conversation over from czbiohub-sf/iohub#55 (comment):
Remove WaveorderReader from waveorder and depend on the tagged version of iohub via github. After testing, release waveorder==1.0.0.
@talonchandler Does waveorder need to depend on iohub? I thought the plan was to turn waveorder into an array-in-array-out library and leave file I/O to downstream applications like recOrder.
Most of the QLIPP compute functions take the same variables in and out of GPU memory, which is inefficient.
For example, Stokes_transform puts the raw Stokes images into GPU memory, carries out the Stokes normalization, and takes the normalized Stokes parameters out of GPU memory; Polscope_bg_correction then puts the normalized Stokes images back into GPU memory to carry out the background correction math, and takes the background-corrected Stokes images out of GPU memory at the end.
A more efficient way to process a single z-stack would be something like this:
reconstructor.Stokes_recon(I_meas) ## here I_meas is YXZ stack; raw Stokes images stay in GPU memory
reconstructor.Stokes_transform() ## normalize raw Stokes images which are in GPU memory and keep them there
reconstructor.Polscope_bg_correction(S_bg) ## pass a single argument with background Stokes images; these can stay in GPU memory for reuse or be overwritten if new S_bg are passed next time the function is called
phys_props = reconstructor.Polarization_recon() ## recon phys props with bg corrected Stokes in GPU memory
phase = reconstructor.Phase_recon_3D(method='Tikhonov') ## recon phase with S[0] from GPU memory
We'll need to make sure that the reconstruction still works as expected if these steps are not followed in order. For example, calling reconstructor.Stokes_recon should erase old S_norm, S_corr, etc. from GPU memory. Recon functions can still return the expected output variables so that intermediate results can be inspected. The Stokes input parameters to Stokes_transform, Polscope_bg_correction, Polarization_recon, and Phase_recon_3D should be optional, such that, for example, Stokes parameters can be processed (e.g. denoised, filtered, averaged) outside of the reconstructor before plugging them back into the pipeline.
In an OnDemand session with waveorder installed (in the shared recOrder environment or the current main branch):
>>> import napari
>>> v = napari.Viewer()
will open a napari window as expected. If instead you try
>>> import waveorder
>>> import napari
>>> v = napari.Viewer()
napari does not open and you'll receive a message:
WARNING: QObject::moveToThread: Current thread (0x557aa28ed750) is not the object's thread (0x557aa47ffdb0).
Cannot move to target thread (0x557aa28ed750)
Warning: Could not load the Qt platform plugin "xcb" in "/home/talon.chandler/.local/lib/python3.8/site-packages/cv2/qt/plugins" even though it was found.
WARNING: This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.
Available platform plugins are: xcb, eglfs, linuxfb, minimal, minimalegl, offscreen, vnc, wayland-egl, wayland, wayland-xcomposite-egl, wayland-xcomposite-glx, webgl.
Changing the order of imports is workable for scripts, but not always convenient. This issue also might be causing problems in recOrder; see mehta-lab/recOrder#109.
Spatial frequency coordinates are used by all of waveorder's frequency-space reconstructions (phase, uPTI, in-progress phase from polarization, fluorescence) to calculate the coordinates of OTFs and interpret the results of FTs.
For example, waveorder.util.gen_coordinate is used to calculate all transverse spatial frequency coordinates. If M is the number of pixels along an axis and ps is the pixel size, then the current implementation uses ifftshift((np.r_[:M]-M/2)/M/ps) to calculate the spatial frequencies along that dimension---see the full function definition.
np.fft provides a utility function for exactly this purpose: fftfreq. The results of ifftshift((np.r_[:M]-M/2)/M/ps) and np.fft.fftfreq(M, ps) are identical for even M, but they diverge for odd M. The fftfreq function matches the convention of most DFT functions: whether the signal length is even or odd, the DC term always appears as the first entry.
For example:
```python
>>> import numpy as np
>>> M = 4
>>> ps = 1
>>> np.fft.ifftshift((np.r_[:M]-M/2)/M/ps)
array([ 0.  ,  0.25, -0.5 , -0.25])
>>> np.fft.fftfreq(M, ps)
array([ 0.  ,  0.25, -0.5 , -0.25])  # identical
>>> M = 5
>>> np.fft.ifftshift((np.r_[:M]-M/2)/M/ps)
array([-0.1,  0.1,  0.3, -0.5, -0.3])  # no DC term
>>> np.fft.fftfreq(M, ps)
array([ 0. ,  0.2,  0.4, -0.4, -0.2])  # different! now includes the DC term
```
This issue only affects reconstructions with odd dimensions, and its effect shrinks as the dimensions grow. I suspect a fix will only make a noticeable difference for small transverse FOVs, so this is currently a low priority since most of our reconstructions are on large FOVs.
I investigated the axial frequency coordinates further because we frequently use small, odd dimensions. I could not find any evidence of incorrect axial frequency coordinates for 3D-phase and 3D-fluorescence reconstructions because none of the direct computations happen in the frequency domain. I found potentially concerning lines in `gen_dyadic_Greens_tensor`, which is used to generate the 3D vector WOTF for PTI, but I'm not overly concerned because I think most of those reconstructions were run with a large number of z slices.
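A drop-in fix would be to compute the frequency axis with `fftfreq` directly. This is a sketch with a hypothetical helper name, not `waveorder`'s actual implementation:

```python
import numpy as np

def frequency_axis(M: int, ps: float) -> np.ndarray:
    """Spatial frequency coordinates with the DC term first, for even or odd M."""
    return np.fft.fftfreq(M, d=ps)

# Matches the current expression for even M...
M, ps = 4, 1.0
current = np.fft.ifftshift((np.r_[:M] - M / 2) / M / ps)
assert np.allclose(frequency_axis(M, ps), current)

# ...but differs for odd M, where only fftfreq puts the DC term first.
M = 5
current = np.fft.ifftshift((np.r_[:M] - M / 2) / M / ps)
assert not np.allclose(frequency_axis(M, ps), current)
assert frequency_axis(M, ps)[0] == 0.0
```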
Context and comments welcome.
`FFFFFF` corresponds to the "gray" colormap
We have been observing inconsistent dynamic range issues with phase reconstructions across different datasets. When I sweep the Tikhonov regularization strength, the same FOV also shows different dynamic ranges:

```
regularization strength: 1e-4    std: 0.0071132425
regularization strength: 1e-3    std: 0.0012211832
regularization strength: 1e-2    std: 0.00013627978
```

It seems like the product of the regularization parameter and the width of the histogram stays roughly constant (an inverse relation). Inspecting the transfer function, the numerical values appear to be small:
```python
>>> transfer_function = zarr.open("transfer_function.zarr")["real_potential_transfer_function"]
>>> tf = np.abs(transfer_function)
>>> tf.min()
0.0
>>> tf.max()
0.030031698
>>> tf.mean()
0.007650641
>>> tf.std()
0.0053157224
```
Looking at how the inverse is applied, the output would be dominated by the regularization parameter we typically use:

Lines 972 to 978 in 366ef1b

Here, if `H_eff_abs_square` is much smaller than `reg_coeff`, the output is effectively divided by `reg_coeff` alone, resulting in the inverse relation between regularization strength and the dynamic range of the reconstructed image.
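A minimal numerical sketch of this effect (hypothetical variable names; the transfer function magnitudes mimic those reported above, and `tikhonov_inverse` is my shorthand for the standard Tikhonov inverse filter, not waveorder's exact code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Transfer function with magnitudes ~1e-2, well below typical reg_coeff values.
H = 0.01 * rng.random((64, 64))
g_hat = rng.standard_normal((64, 64))  # spectrum of the measured data

def tikhonov_inverse(H, g_hat, reg_coeff):
    """Tikhonov-regularized inverse filter: f_hat = conj(H) g_hat / (|H|^2 + reg)."""
    return np.conj(H) * g_hat / (np.abs(H) ** 2 + reg_coeff)

stds = [tikhonov_inverse(H, g_hat, reg).std() for reg in (1e-4, 1e-3, 1e-2)]

# When |H|^2 << reg_coeff, the output scales like 1/reg_coeff, so each decade
# of regularization shrinks the reconstruction's dynamic range by about 10x.
assert stds[0] / stds[1] > 5 and stds[1] / stds[2] > 5
```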
@mattersoflight suggests that we rescale
The config I used to generate the transfer function:
```yaml
input_channel_names: ["BF"]
time_indices: all
reconstruction_dimension: 3
phase:
  transfer_function:
    wavelength_illumination: 0.450
    yx_pixel_size: 0.1494
    z_pixel_size: 0.205
    z_padding: 30
    index_of_refraction_media: 1.4
    numerical_aperture_detection: 1.35
    numerical_aperture_illumination: 0.85
    invert_phase_contrast: false
  apply_inverse:
    reconstruction_algorithm: Tikhonov
    regularization_strength: 0.001
    TV_rho_strength: 0.001
    TV_iterations: 1
```
Dataset: /hpc/projects/comp.micro/mantis/2023_08_18_A549_DRAQ5/0-zarr/a549_draq5_1/a549_draq5_labelfree_1.zarr/0/0/0
Does it make sense to make `torch` an optional dependency? How easy is it to write non-`torch` accelerated versions of the `waveorder` algorithms? Are there cases where we'd like to use `waveorder` but cannot install `torch`, e.g. no GPU?
For very large ome.tiff files, the data loading time can be very long. This is primarily because we use the `tifffile` library to parse the ome.xml structure and the tiff page locations. Even if we open as zarr, the page locations need to be found, which requires a highly iterative search through the data before any data is loaded (zarr facilitates the lazy loading once you want the data).
This issue is running documentation of research and attempts at solving this problem.
I created this issue on the `tifffile` repo: cgohlke/tifffile#83
The answer is that no high-level API exists for this kind of loading, and we'd have to implement it manually.
Here is an example of someone's manual tifffile parsing to extract cropped regions of a large multi-tile image:
https://stackoverflow.com/questions/62747767/how-to-efficiently-read-part-of-a-huge-tiff-image
and the corresponding answer from Christoph
https://gist.github.com/rfezzani/b4b8852c5a48a901c1e94e09feb34743
In addition to the remaining fluorescence reconstructions #122, the PTI reconstruction will also need to move.
Enable multithreaded or multiprocess read/write.
For writing, zarr is designed to support this:
https://zarr.readthedocs.io/en/stable/tutorial.html#parallel-computing-and-synchronization
For reading from ome.tiff, this will need some research.
The current reader is capable of ome-tiff reading. We should add functionality so that it can parse and assemble sequences of single-page tiffs, which Micro-Manager structures in several ways.
Some of this code could be taken from the reconstruct-order 1.0 repo. I'll also try to set aside test data that we can continually reference.
The waveorder `__init__.py` file imports all library modules, even if they are not used. This includes `from .visual import *`, which changes the default `matplotlib` backend (in my case to `QtAgg`) even if functions from the `visual` module are never called, and this can create problems with other libraries. Can we remove the `from .visual import *` line, or even the entire contents of this file? I think this is not good practice in general.
Tests fail for #92 because PEP 586 needs Python 3.8.
Given the current state of the environment: #92 (comment), we can bump the supported Python version and tests to 3.8-3.11.
I'll fix this typo, update to include 3.7-3.9 (same as recOrder), test the installation, and merge.
Mentioned here:
conda-forge/staged-recipes#19222
What is the minimal feature set after which we're ready to release a v1.0? This 1.0 release could also coincide with a pypi release, but it doesn't need to. Also, we can choose a 0.0.1 if that feels better.
Some ideas:
recOrder tests surfaced the following warnings at `optics.py:321`, `optics.py:370`, and `stokes.py:291`:

```
recOrder/tests/cli_tests/test_reconstruct.py::test_reconstruct
  /Users/talon.chandler/waveorder/waveorder/optics.py:321: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
    * torch.tensor(z_position_list)[:, None, None]
```
You can generate these warnings with a call like:

```python
import torch

torch.tensor(torch.tensor([0, 1, 2]))[:, None, None]
```
The core issue seems to be that some `optics` and `stokes` functions can take either a list or a tensor.
@ziw-liu do you think we should make sure that only lists are passed? Or can you think of a simpler solution?
This does not seem critical.
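Another option (my suggestion, not a decision from this thread) is to keep accepting both types and normalize at the top of each function with `torch.as_tensor`, which passes an existing tensor through unchanged and avoids the warning:

```python
import torch

def _as_tensor(values):
    # torch.as_tensor leaves a matching existing tensor untouched (no copy,
    # no UserWarning) and converts a plain list into a new tensor.
    return torch.as_tensor(values)

z_list = [0.0, 1.0, 2.0]
z_tensor = torch.tensor(z_list)

assert torch.equal(_as_tensor(z_list), z_tensor)  # list input: converted
assert _as_tensor(z_tensor) is z_tensor           # tensor input: passed through
```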
I think it is time to implement tiled deconvolution in waveorder to handle deconvolving large z-stacks.
I have run into this problem quite frequently, where I cannot fit the computation on a single GPU node. It would be nice if we could implement a method similar to dexp's, which uses a `scatter_gather` function.
It would be nice to have some automatic detection of when the GPU is going to fill up, and then we can employ a scatter function to break up the compute into smaller chunks.
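A scatter-gather over z could look roughly like this (a pure-numpy sketch with a placeholder `process` function; dexp's actual implementation also handles overlap blending and GPU backends):

```python
import numpy as np

def scatter_gather_z(stack, process, chunk=16, overlap=4):
    """Process a (Z, Y, X) stack in overlapping z-chunks and reassemble.

    Each chunk is padded by `overlap` slices on both sides so a z-coupled
    operation (e.g. 3D deconvolution) sees enough context; only the central,
    valid slices of each processed chunk are kept.
    """
    Z = stack.shape[0]
    out = np.empty_like(stack)
    for start in range(0, Z, chunk):
        stop = min(start + chunk, Z)
        lo = max(start - overlap, 0)
        hi = min(stop + overlap, Z)
        result = process(stack[lo:hi])
        out[start:stop] = result[start - lo : start - lo + (stop - start)]
    return out

stack = np.random.default_rng(0).standard_normal((50, 8, 8))
# With an identity `process`, chunked output must match the full stack.
assert np.allclose(scatter_gather_z(stack, lambda x: x), stack)
```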
After building a reader:

```python
mmr = MicromanagerReader(<folder to tiff files>)
```

modifying an attribute such as `frames` will not change the underlying tiff-class's attribute:

```python
mmr.frames = 10
print(mmr.reader.frames)  # not 10
```

This is a problem in cases where an acquisition quits before it's scheduled to finish, so the metadata reflects a different dimensionality than actually exists. A simple case: you specify a 100-frame time series at the start but end the collection after 50 frames.
Solution:
change reader attributes to getters/setters that modify and return the underlying tiff-type attribute
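A sketch of that getter/setter pattern (hypothetical class and attribute names; it assumes `MicromanagerReader` wraps an inner reader as `self.reader`):

```python
class _InnerReader:
    """Stand-in for the underlying tiff reader class."""
    def __init__(self, frames):
        self.frames = frames

class MicromanagerReaderSketch:
    def __init__(self, frames):
        self.reader = _InnerReader(frames)

    @property
    def frames(self):
        # Always read through to the underlying reader...
        return self.reader.frames

    @frames.setter
    def frames(self, value):
        # ...and write through as well, so the two never disagree.
        self.reader.frames = value

mmr = MicromanagerReaderSketch(frames=100)
mmr.frames = 10  # acquisition stopped early: only 10 frames exist
assert mmr.reader.frames == 10  # underlying attribute now agrees
```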
2D phase reconstruction from a brightfield stack is widely useful for high throughput screens. The plane in which cells are in optimal focus can drift slightly even in the presence of hardware autofocus. There is a signature in the data to estimate the drift: the optimal plane of focus is where the brightfield image has the lowest contrast (if only phase information is present) or 3D phase reconstruction has the highest contrast. In the following data from @tmakushok and reconstruction by @talonchandler, slice 2 or 3 is where the cells are in focus.
One approach to dealing with the drift in focus is to use max projection of 3D phase. This should work fine for segmentation but is obviously not suitable for analyzing density variations. The proper fix would be to estimate the slice where cells are in focus from the spatial frequency spectrum of the brightfield/3D phase stack.
We compute the spectrum of the brightfield data during the reconstruction. This step can be implemented before the 3D OTFs are computed, when the `focal_slice` parameter is set to `auto`.
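A sketch of the mid-band-power focus metric described above (the band limits and function name are my assumptions, not waveorder's actual focus-finding code):

```python
import numpy as np

def estimate_focus_slice(stack, band=(0.125, 0.25)):
    """Pick the in-focus slice of a (Z, Y, X) stack by mid-band spectral power.

    For each slice, sum |FFT|^2 over an annulus of normalized spatial
    frequencies in `band`; the in-focus slice maximizes this mid-band power.
    """
    Z, Y, X = stack.shape
    fy = np.fft.fftfreq(Y)[:, None]
    fx = np.fft.fftfreq(X)[None, :]
    rho = np.sqrt(fy**2 + fx**2)
    annulus = (rho >= band[0]) & (rho <= band[1])
    power = [np.sum(np.abs(np.fft.fft2(sl)) ** 2 * annulus) for sl in stack]
    return int(np.argmax(power))

# Synthetic stack: only slice 2 carries mid-frequency structure.
stack = np.zeros((5, 64, 64))
y, x = np.mgrid[:64, :64]
stack[2] = np.sin(2 * np.pi * 0.15 * x)  # normalized frequency 0.15, in band
assert estimate_focus_slice(stack) == 2
```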
`init_array` should be in the format of either `(end, max, min, start)` or `(start, end)`
`(min, max, start, end)`
`create_channel_dict` assumes that the data is float32 and sets `max` to 65535. It would be good to match that to the data type so that, for example, the default value for uint8 data would be 255.
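A sketch of dtype-aware defaults (hypothetical helper name; the float fallback of 65535 mirrors the current behavior rather than a recommendation):

```python
import numpy as np

def default_clim_max(dtype) -> float:
    """Default contrast-limit maximum matched to the array dtype."""
    dtype = np.dtype(dtype)
    if np.issubdtype(dtype, np.integer):
        # e.g. 255 for uint8, 65535 for uint16
        return float(np.iinfo(dtype).max)
    return 65535.0  # float data: keep the current default as a fallback

assert default_clim_max(np.uint8) == 255
assert default_clim_max(np.uint16) == 65535
assert default_clim_max(np.float32) == 65535
```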