
decode's People

Contributors

aspeiser, haydnspass, jries, maxscheurer, olbris, phoess, srinituraga


decode's Issues

Batch Folder processing

Publish a batch fitting script/notebook.

There are some open questions:

  • What do we do if the files were acquired with different EM gains? (--> Write a proper TIFF tag reader)
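A batch script could be as simple as globbing the folder and fitting file by file. A minimal sketch, where `fit_fn` is a hypothetical stand-in for the actual DECODE fitting routine:

```python
from pathlib import Path

def batch_fit(folder, fit_fn, pattern="*.tif"):
    """Run a fitting function over every TIFF in a folder.

    fit_fn: hypothetical callable taking a file path; stands in for the
    DECODE fit routine. Acquisition metadata (e.g. EM gain) would have to
    be read per file, since files in one folder may differ.
    """
    files = sorted(Path(folder).glob(pattern))
    return {f.name: fit_fn(f) for f in files}
```

This keeps per-file metadata handling inside the loop, which is where a proper TIFF tag reader would plug in.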

Modify model save to store also parameters

Bind parameters directly into binary, similar to checkpointing because then we only need to ship one file.
I imagine something like:

model_pkg = {
    'model_state_dict': [...],
    'param_run': [params as dictionary],
    'param_in': [params in as dictionary],
}

This should be loadable by torch.load() and is basic enough to be robust in the future.
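A round-trip of the proposed packaging could look like the following sketch; the tiny linear layer stands in for the DECODE model, and the parameter contents are hypothetical placeholders:

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # stands in for the actual DECODE model

# Bundle weights and both parameter sets into a single object ...
model_pkg = {
    'model_state_dict': model.state_dict(),
    'param_run': {'lr': 1e-4},       # hypothetical run parameters
    'param_in': {'px_size': 100.0},  # hypothetical input parameters
}

# ... so that one file ships everything needed to rebuild the model.
path = os.path.join(tempfile.mkdtemp(), 'model_pkg.pt')
torch.save(model_pkg, path)

pkg = torch.load(path)
model.load_state_dict(pkg['model_state_dict'])
```

Plain dicts of tensors and scalars serialize cleanly this way, which is what keeps the format future-robust.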

Tensorboard from conda not working

From Jonas:

(decode_env) mac-ries20:~ jonasries$ tensorboard --samples_per_plugin images=100 --port=6006 --logdir=runs
TensorFlow installation not found - running with reduced feature set.
Traceback (most recent call last):
 File "/Users/jonasries/anaconda3/envs/decode_env/bin/tensorboard", line 10, in <module>
  sys.exit(run_main())
 File "/Users/jonasries/anaconda3/envs/decode_env/lib/python3.8/site-packages/tensorboard/main.py", line 65, in run_main
  default.get_plugins(),
 File "/Users/jonasries/anaconda3/envs/decode_env/lib/python3.8/site-packages/tensorboard/default.py", line 113, in get_plugins
  return get_static_plugins() + get_dynamic_plugins()
 File "/Users/jonasries/anaconda3/envs/decode_env/lib/python3.8/site-packages/tensorboard/default.py", line 148, in get_dynamic_plugins
  return [
 File "/Users/jonasries/anaconda3/envs/decode_env/lib/python3.8/site-packages/tensorboard/default.py", line 149, in <listcomp>
  entry_point.load()
 File "/Users/jonasries/anaconda3/envs/decode_env/lib/python3.8/site-packages/pkg_resources/__init__.py", line 2471, in load
  self.require(*args, **kwargs)
 File "/Users/jonasries/anaconda3/envs/decode_env/lib/python3.8/site-packages/pkg_resources/__init__.py", line 2494, in require
  items = working_set.resolve(reqs, env, installer, extras=self.extras)
 File "/Users/jonasries/anaconda3/envs/decode_env/lib/python3.8/site-packages/pkg_resources/__init__.py", line 790, in resolve
  raise VersionConflict(dist, req).with_context(dependent_req)
pkg_resources.ContextualVersionConflict: (yarl 1.6.2 (/Users/jonasries/anaconda3/envs/decode_env/lib/python3.8/site-packages), Requirement.parse('yarl<1.6.0,>=1.0'), {'aiohttp'})

XYZ_sigma of EmitterSet does not convert as xyz does

EmitterSet features automatic conversion between px and nm for xyz, which are derived properties of an internal xyz container. They can also be set via a setter method:

em = EmitterSet(xy_unit='px')
em.xyz_nm = ...
--> em.xy_unit : 'nm'

However, the sigma values remain in the original unit. It is not obvious how best to handle this; one option is to just issue a warning.
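One way to keep sigma in sync is to convert it in the same setter that flips the unit. A toy sketch of the idea (not the actual EmitterSet implementation; `px_size` in nm per px is assumed):

```python
import torch

class MiniEmitterSet:
    """Toy model of unit handling, illustrating the proposed behaviour.

    The point: sigma must be converted together with xyz when the
    unit flips, instead of silently staying in the old unit.
    """

    def __init__(self, xy_unit='px', px_size=100.0):
        self.xy_unit = xy_unit
        self.px_size = px_size
        self.xyz = None
        self.xyz_sig = None  # same unit as xyz

    def set_xyz_nm(self, value):
        """Analogue of `em.xyz_nm = ...` that also converts sigma."""
        if self.xy_unit == 'px' and self.xyz_sig is not None:
            # convert sigma to nm as well instead of leaving it stale in px
            self.xyz_sig = self.xyz_sig * self.px_size
        self.xyz = value
        self.xy_unit = 'nm'
```

If in-place conversion is considered too magical, the same hook is the natural place to issue the warning instead.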

Replace paths in tests by pkg resources

Relative paths to files or similar could likely fail in packaged decode installations (because the folders no longer exist there).
Do it similarly to load_reference, i.e.

try:
    import importlib.resources as pkg_resources
except ImportError:
    # Fall back to the backport `importlib_resources` on Python < 3.7.
    import importlib_resources as pkg_resources

from . import reference_files
param_ref = pkg_resources.open_text(reference_files, 'reference.yaml')

Sync CSV i/o signature and return values with the HDF5 / torch ones

import pathlib
from typing import Tuple

import pandas as pd
import torch


def load_csv(file: (str, pathlib.Path), mapping: (None, dict) = None, **pd_csv_args) -> Tuple[dict, dict, dict]:
    """
    Loads a CSV file which provides a header.

    Args:
        file: path to file
        mapping: mapping dictionary with keys ('x', 'y', 'z', 'phot', 'id', 'frame_ix')
        pd_csv_args: additional keyword arguments passed to the pandas csv reader

    Returns:
        dict: dictionary which can readily be converted to an EmitterSet by EmitterSet(**out_dict)
    """
    if mapping is None:
        mapping = {'x': 'x', 'y': 'y', 'z': 'z', 'phot': 'phot', 'frame_ix': 'frame_ix'}

    # read in chunks to keep memory bounded for large files
    chunks = pd.read_csv(file, chunksize=100000, **pd_csv_args)
    data = pd.concat(chunks)

    xyz = torch.stack((torch.from_numpy(data[mapping['x']].to_numpy()).float(),
                       torch.from_numpy(data[mapping['y']].to_numpy()).float(),
                       torch.from_numpy(data[mapping['z']].to_numpy()).float()), 1)

    phot = torch.from_numpy(data[mapping['phot']].to_numpy()).float()
    frame_ix = torch.from_numpy(data[mapping['frame_ix']].to_numpy()).long()

    if 'id' in mapping.keys():
        identifier = torch.from_numpy(data[mapping['id']].to_numpy()).long()
    else:
        identifier = None

    return {'xyz': xyz, 'phot': phot, 'frame_ix': frame_ix, 'id': identifier}, None, None

  • Is it okay to store metadata in a few commented lines in the .csv? Or is it better to save a plain .csv plus a meta.txt with the additional information?
  • change csv signature
  • add default decode mapping
  • change return arguments
  • update emitter.save() if needed
  • check the tests
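The commented-metadata option can be prototyped with a few stdlib lines: prefix metadata with `#` and strip those lines on read (pandas' `read_csv(..., comment='#')` would skip them the same way). A sketch:

```python
import csv
import io

def read_csv_with_meta(text):
    """Split '#'-prefixed metadata lines from the CSV body.

    Returns (meta, rows): meta maps key -> value from the commented
    header lines, rows is a list of dicts keyed by the CSV header.
    """
    meta, body = {}, []
    for line in text.splitlines():
        if line.startswith('#'):
            key, _, value = line.lstrip('# ').partition(':')
            meta[key.strip()] = value.strip()
        else:
            body.append(line)
    rows = list(csv.DictReader(io.StringIO('\n'.join(body))))
    return meta, rows
```

Whether this beats a sidecar meta.txt mostly depends on how tolerant downstream CSV consumers are of comment lines.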

(Soon) Deprecated operation in post-processing

Locate and change.

  0%|          | 0/313 [00:00<?, ?it/s]/home/lucas/RemoteDeploy/DeepSMLM/decode/neuralfitter/post_processing.py:174: UserWarning: This overload of nonzero is deprecated:
	nonzero()
Consider using one of the following signatures instead:
	nonzero(*, bool as_tuple) (Triggered internally at  /opt/conda/conda-bld/pytorch_1595629395347/work/torch/csrc/utils/python_arg_parser.cpp:766.)
  batch_ix = active_px.nonzero()[:, 0]
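The warning comes from calling `.nonzero()` without the `as_tuple` argument; passing it explicitly keeps the same 2D index tensor and silences the deprecation. A minimal sketch with a toy tensor:

```python
import torch

active_px = torch.tensor([[0., 1.],
                          [1., 0.]])

# Deprecated form that triggers the UserWarning:
#   batch_ix = active_px.nonzero()[:, 0]
# Explicit replacement with identical output:
batch_ix = active_px.nonzero(as_tuple=False)[:, 0]
```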

Change parameter file logic because it sometimes breaks

Change the parameter file logic.
Some parameters are mutually exclusive: if you swap LROnPlateau for StepLR, the arguments are different, but the reference parameter file simply adds the StepLR arguments on top of the LROnPlateau ones, which causes it to break.
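One fix is to fill defaults only from the selected option's group instead of merging across groups. A sketch with hypothetical reference defaults (names illustrative, not the actual DECODE parameter file):

```python
# hypothetical reference defaults, keyed by scheduler name
REFERENCE = {
    'StepLR': {'step_size': 10, 'gamma': 0.1},
    'LROnPlateau': {'factor': 0.1, 'patience': 5},
}

def fill_scheduler_params(user):
    """Fill defaults only from the scheduler the user actually selected.

    Merging defaults across schedulers would mix mutually exclusive
    arguments (e.g. step_size with patience) and break construction.
    """
    name = user['scheduler']
    filled = dict(REFERENCE[name])
    filled.update(user.get('scheduler_args', {}))
    return name, filled
```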

Can we make DECODE a conda 'noarch'?

Since DECODE itself is Python-only, that should be possible. It would make deployment of updates much easier:
we would then only need to go to the 3 different machines when SplinePSF needs an update (which happens far less frequently).
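A noarch build would be declared in the conda recipe roughly like this (a sketch of a meta.yaml fragment, not the actual DECODE recipe):

```yaml
build:
  noarch: python          # one package for all platforms and Python versions
  script: python -m pip install . --no-deps -vv

requirements:
  host:
    - python >=3.6
    - pip
  run:
    - python >=3.6
```

Any compiled dependency (like SplinePSF) stays a separate, per-platform package.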

Dummy Issue

asdfjaksdfa decode/test/test_assets_web.py. ...

Filter warnings when no emitter is found

In the first few epochs, when we don't find any emitters above threshold, many numpy warnings are thrown. Filter them out when len(emitter) == 0, but do not just blindly filter all of them.
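The scoped filter could look like this sketch: suppress the expected empty-slice RuntimeWarning only on the empty path, so all other warnings still surface (`safe_stats` and its statistic are illustrative, not the actual DECODE code):

```python
import warnings

import numpy as np

def safe_stats(emitter_phot):
    """Photon statistic that stays quiet only for the empty-emitter case."""
    if len(emitter_phot) == 0:
        with warnings.catch_warnings():
            # Only the empty input hits this branch, so ignoring
            # RuntimeWarning here does not mask real problems.
            warnings.simplefilter('ignore', category=RuntimeWarning)
            return float(np.mean(emitter_phot))  # nan for empty input
    return float(np.mean(emitter_phot))
```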

Add verification to the end of fitting (notebook)

Check (and/or plot) the distributions of the fitted emitter attributes and ensure that they are within the limits of the training.
E.g. if many fitted background values are 10 and the trained bg range was 10-50, the training may simply not have been appropriate for the experimental data.
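A simple numeric check for the notebook: report the fraction of fitted values that sit at the edge of the training range, since values piling up at a boundary suggest the range did not cover the data. A sketch (function name and tolerance are illustrative):

```python
import numpy as np

def frac_at_training_edge(values, low, high, tol=0.05):
    """Fraction of values within tol * range of either training limit.

    A large fraction hugging one edge hints that the training range
    (e.g. bg in [10, 50]) does not match the experimental data.
    """
    values = np.asarray(values, dtype=float)
    margin = tol * (high - low)
    return float(np.mean((values <= low + margin) | (values >= high - margin)))
```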

SMAP prefit .yaml does not conform with DECODE

The SMAP .yaml produces lists of lists and has both density and emitter_av set in the exported .yaml file; exactly one of them must be None.

ToDo:

  • Add current SMAP exported smap_param.yaml to test assets
  • Write a test that makes sure that the types after import of smap_param.yaml are the same as the param_friendly.yaml
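Both problems can be caught in one validation pass right after import; a sketch (function name and the `z_range` key are illustrative, not the actual parameter names):

```python
def validate_smap_param(param):
    """Sanity-check SMAP-exported parameters before handing them to DECODE.

    Enforces that exactly one of 'density' and 'emitter_av' is set, and
    unwraps single-element lists of lists (e.g. [[0, 100]] -> [0, 100]).
    """
    if (param.get('density') is None) == (param.get('emitter_av') is None):
        raise ValueError("Exactly one of 'density' and 'emitter_av' must be set.")
    for key, value in param.items():
        if isinstance(value, list) and len(value) == 1 and isinstance(value[0], list):
            param[key] = value[0]
    return param
```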

On the fly tiff loader

Implement an on-the-fly TIFF loader for very large TIFFs that possibly do not fit into RAM.
Probably:
a dataset which loads the TIFF pages in question in its getitem method, or a TIFF loader that returns a generator object.
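The dataset variant could be as thin as the following sketch; `read_page` is a hypothetical callable (e.g. wrapping a TIFF library's page access) injected from outside, so nothing is held in RAM until a frame is requested:

```python
class LazyFrameDataset:
    """Map-style dataset that loads frames on access instead of up front.

    read_page: hypothetical callable returning one frame by index.
    n_frames: total number of pages in the TIFF.
    """

    def __init__(self, read_page, n_frames):
        self.read_page = read_page
        self.n_frames = n_frames

    def __len__(self):
        return self.n_frames

    def __getitem__(self, ix):
        if not 0 <= ix < self.n_frames:
            raise IndexError(ix)
        return self.read_page(ix)
```

The generator alternative is the same idea with `yield read_page(ix)` in a loop; the dataset form composes better with existing dataloader machinery.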

Replace paths in decode tests

Relative paths to files or similar could likely fail in packaged decode installations (because the folders no longer exist there).
Do it similarly to load_reference, i.e.

try:
    import importlib.resources as pkg_resources
except ImportError:
    # Fall back to the backport `importlib_resources` on Python < 3.7.
    import importlib_resources as pkg_resources

from . import reference_files
param_ref = pkg_resources.open_text(reference_files, 'reference.yaml')

Make Emitter compatible to None attributes

Currently, we fill None attributes with len(em) * nan, which takes up significant resources.
Moreover, we even save them in .csv, hdf5 and .pt.
Change EmitterSet class to allow for None fields.

This requires deep changes and careful checks.
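The core of the change can be sketched independently of EmitterSet: store None and only materialize a nan fill when a consumer explicitly asks for it (class and method names here are illustrative):

```python
class OptionalField:
    """Sketch: keep None instead of a nan-filled container.

    Storage and serialization see None (cheap); consumers that need a
    dense representation call materialize() explicitly.
    """

    def __init__(self, value=None):
        self._value = value

    @property
    def is_set(self):
        return self._value is not None

    def materialize(self, n):
        """Return the stored values, or a nan fill of length n if unset."""
        if self._value is None:
            return [float('nan')] * n
        return self._value
```

Save paths would then skip unset fields entirely instead of writing n nans, which is where the resource savings come from.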

Automatic Batch Size

Implement automatic batch size finder for inference (where we can do it silently) but also for training.
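The usual strategy is to double the batch size until a probe pass fails, then back off. A generic sketch, where `probe` is a hypothetical callable running one forward (and backward, for training) pass at the given size and raising on out-of-memory (PyTorch surfaces CUDA OOM as a RuntimeError):

```python
def find_batch_size(probe, start=1, limit=2 ** 16):
    """Double the batch size until `probe` fails; return the last good size.

    Returns None if even `start` fails. For inference this can run
    silently; for training the probe must include the backward pass,
    since gradients dominate memory use.
    """
    good = None
    bs = start
    while bs <= limit:
        try:
            probe(bs)
            good = bs
            bs *= 2
        except (RuntimeError, MemoryError):
            break
    return good
```

A bisection step between the last good and first failing size would tighten the result if needed.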

Segmentation fault on testing the train implementation

Got segmentation fault on decode/test/test_train_val_impl.py even though the test itself passed.

I can only reproduce this on Ubuntu with not so much RAM,
though I cannot see excessive RAM usage in the monitor. The model is mocked and super small ...

Maybe this has something to do with the fact that multiprocessing fails for some people?

Debugging with gdb gives:

(gdb) r -m pytest decode/test/test_train_val_impl.py
Starting program: /home/lucas/miniconda3/envs/decode_dev_cpu/bin/python -m pytest decode/test/test_train_val_impl.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
====================================================================== test session starts =======================================================================
platform linux -- Python 3.9.1, pytest-6.2.2, py-1.10.0, pluggy-0.13.1
rootdir: /home/lucas/git/DECODE/decode/test, configfile: pytest.ini
collected 2 items                                                                                                                                                

decode/test/test_train_val_impl.py [New Thread 0x7fff1958d700 (LWP 2866)]
[New Thread 0x7fff13fff700 (LWP 2867)]
[New Thread 0x7fff0eee1700 (LWP 2868)]
..                                                                                                                      [100%]

======================================================================== warnings summary ========================================================================
../../miniconda3/envs/decode_dev_cpu/lib/python3.9/site-packages/torch/cuda/__init__.py:52
  /home/lucas/miniconda3/envs/decode_dev_cpu/lib/python3.9/site-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at  /opt/conda/conda-bld/pytorch_1607370151529/work/c10/cuda/CUDAFunctions.cpp:100.)
    return torch._C._cuda_getDeviceCount() > 0

-- Docs: https://docs.pytest.org/en/stable/warnings.html
================================================================= 2 passed, 1 warning in 25.09s ==================================================================
[Thread 0x7fff1958d700 (LWP 2866) exited]

Thread 4 "python" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff0eee1700 (LWP 2868)]
0x000055555576b8ba in PyThreadState_Clear () at /home/conda/feedstock_root/build_artifacts/python-split_1611624120657/work/Python/pystate.c:785
785	/home/conda/feedstock_root/build_artifacts/python-split_1611624120657/work/Python/pystate.c: No such file or directory.
(gdb) where
#0  0x000055555576b8ba in PyThreadState_Clear () at /home/conda/feedstock_root/build_artifacts/python-split_1611624120657/work/Python/pystate.c:785
#1  0x00007fffc64edb1c in pybind11::gil_scoped_acquire::dec_ref() ()
   from /home/lucas/miniconda3/envs/decode_dev_cpu/lib/python3.9/site-packages/torch/lib/libtorch_python.so
#2  0x00007fffc64edb59 in pybind11::gil_scoped_acquire::~gil_scoped_acquire() ()
   from /home/lucas/miniconda3/envs/decode_dev_cpu/lib/python3.9/site-packages/torch/lib/libtorch_python.so
#3  0x00007fffc6815dd9 in torch::autograd::python::PythonEngine::thread_init(int, std::shared_ptr<torch::autograd::ReadyQueue> const&, bool) ()
   from /home/lucas/miniconda3/envs/decode_dev_cpu/lib/python3.9/site-packages/torch/lib/libtorch_python.so
#4  0x00007ffff4431067 in std::execute_native_thread_routine (__p=0x555559bca5f0)
    at /home/conda/feedstock_root/build_artifacts/ctng-compilers_1610729750655/work/.build/x86_64-conda-linux-gnu/src/gcc/libstdc++-v3/src/c++11/thread.cc:80
#5  0x00007ffff7bbd6db in start_thread (arg=0x7fff0eee1700) at pthread_create.c:463
#6  0x00007ffff6f39a3f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) 

Change x/y axis

X and Y are swapped compared to many other software packages.

Normally that's just because we transpose the frame in the plot function. An easy solution is to remove the transpose and swap x and y of the plot coordinates there; however, that might lead to inconsistencies elsewhere.

When changing, possibly the following modules are affected:

  • PSF
  • TargetGenerator
  • PostProcessing
  • Plotting
  • Export
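The two conventions are related by a transpose: swapping the marker coordinates is equivalent to transposing the frame, which is why changing one without the other breaks consistency across the modules above. A small numpy check of that identity:

```python
import numpy as np

frame = np.arange(12).reshape(3, 4)  # toy 3 x 4 frame

# Plotting frame.T with marker (x, y) shows the same pixel as
# plotting frame with the swapped marker (y, x):
x, y = 2, 1
assert frame.T[y, x] == frame[x, y]
```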
