profjsb / deepcr

Deep Learning Based Cosmic Ray Removal for Astronomical Images

License: BSD 3-Clause "New" or "Revised" License

Python 86.28% TeX 13.72%

astronomical-images astronomy convolutional-neural-networks deep-learning python3 pytorch

deepcr's Issues

issue while getting data from flc.fits

While reading data from an flc.fits file, I get "TypeError: buffer is too small for requested array", as shown below. Please suggest a solution.

TypeError Traceback (most recent call last)
in
1 #image = fits.getdata("jdba2sooq_flc.fits")
----> 2 image = fits.getdata("jdba2sooq_flc.fits")[:512,:512]
3
4 #image = fits.getdata("jdba2sooq_flc.fits")
5

~/.local/lib/python3.5/site-packages/astropy/io/fits/convenience.py in getdata(filename, header, lower, upper, view, *args, **kwargs)
194 try:
195 hdu = hdulist[1]
--> 196 data = hdu.data
197 except IndexError:
198 raise IndexError('No data in this HDU.')

~/.local/lib/python3.5/site-packages/astropy/utils/decorators.py in get(self, obj, owner)
742 return val
743 else:
--> 744 val = self.fget(obj)
745 obj.__dict__[self._key] = val
746 return val

~/.local/lib/python3.5/site-packages/astropy/io/fits/hdu/image.py in data(self)
228 return
229
--> 230 data = self._get_scaled_image_data(self._data_offset, self.shape)
231 self._update_header_scale_info(data.dtype)
232

~/.local/lib/python3.5/site-packages/astropy/io/fits/hdu/image.py in _get_scaled_image_data(self, offset, shape)
694 code = BITPIX2DTYPE[self._orig_bitpix]
695
--> 696 raw_data = self._get_raw_data(shape, code, offset)
697 raw_data.dtype = raw_data.dtype.newbyteorder('>')
698

~/.local/lib/python3.5/site-packages/astropy/io/fits/hdu/base.py in _get_raw_data(self, shape, code, offset)
506 offset=offset)
507 elif self._file:
--> 508 return self._file.readarray(offset=offset, dtype=code, shape=shape)
509 else:
510 return None

~/.local/lib/python3.5/site-packages/astropy/io/fits/file.py in readarray(self, size, offset, dtype, shape)
332
333 return np.ndarray(shape=shape, dtype=dtype, offset=offset,
--> 334 buffer=self._mmap)
335 else:
336 count = reduce(operator.mul, shape)

TypeError: buffer is too small for requested array
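
NumPy raises this error when the buffer backing an array (here, the memory-mapped file) holds fewer bytes than the requested shape, dtype, and offset require. One plausible cause, which is an assumption and is not confirmed anywhere in this report, is a truncated or partially downloaded file. The minimal sketch below reproduces the mechanism with NumPy alone; the shape, dtype, and offset are made-up illustrative values, not those of jdba2sooq_flc.fits, and `required_bytes` is a hypothetical helper.

```python
import numpy as np


def required_bytes(shape, dtype, offset=0):
    """Bytes a buffer must hold for an array of this shape/dtype starting at offset."""
    return offset + int(np.prod(shape)) * np.dtype(dtype).itemsize


# Illustrative values only (assumed, not taken from the reporter's file).
shape, dtype, offset = (64, 64), ">f4", 2880
full_size = required_bytes(shape, dtype, offset)

complete = bytes(full_size)         # stands in for a complete file
truncated = bytes(full_size - 100)  # stands in for a file cut short

# A complete buffer maps cleanly onto the requested array.
ok = np.ndarray(shape=shape, dtype=dtype, offset=offset, buffer=complete)

# A short buffer triggers the same TypeError as in the traceback above.
raised = False
try:
    np.ndarray(shape=shape, dtype=dtype, offset=offset, buffer=truncated)
except TypeError as exc:
    raised = True
    print("TypeError:", exc)
```

If the size of the file on disk is smaller than what its headers imply, downloading it again is a reasonable first thing to try.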

clean_large_parallel() doesn't work properly on NERSC

When running model.clean(parallel=True) as a job on the NERSC Cori cluster, the following error occurs:

cori-2151-4: ERROR: step 129 exited with 1
Traceback (most recent call last):
  File "one_hdu.py", line 23, in <module>
    mask = model.clean(im, binary=False, n_jobs=4)
  File "/global/homes/k/kemingz/.conda/envs/deepcr/lib/python3.7/site-packages/deepCR/model.py", line 132, in clean
    n_jobs=n_jobs)
  File "/global/homes/k/kemingz/.conda/envs/deepcr/lib/python3.7/site-packages/deepCR/model.py", line 240, in clean_large_parallel
    shape=im_shape, mode='w+')
Remaining jobs: 210
  File "/global/homes/k/kemingz/.conda/envs/deepcr/lib/python3.7/site-packages/numpy/core/memmap.py", line 264, in __new__
    mm = mmap.mmap(fid.fileno(), bytes, access=acc, offset=start)
OSError: [Errno 38] Function not implemented

Here's the code that was executed (one_hdu.py):

import sys
from deepCR import deepCR
from astropy.io import fits
import numpy as np
import os

file = sys.argv[1]

out_dir = '/global/cscratch1/sd/kemingz/decals_dr9_deepcr/'

model = deepCR('decam')
path, f_nopath = os.path.split(file)
try:
    os.mkdir(out_dir + path[-10:])
except:
    pass
save_path = out_dir + path[-10:] + '/'
hdu = fits.open(file)
for i in range(len(hdu)-1):
    im = hdu[i + 1].data
    mask = model.clean(im, parallel=True, n_jobs=2, binary=False)
    hdu[i + 1].data = (mask * 255).astype(np.uint8)
output_f = save_path + f_nopath[:-8] + '_deepcr_prob.fits.fz'
hdu.writeto(output_f, overwrite=True)

Here's the parallel function that model.clean() calls, model.clean_large_parallel():

    def clean_large_parallel(self, img0, threshold=0.5, inpaint=True, binary=True,
                    patch=256, n_jobs=-1):
        """
            given input image
            return cosmic ray mask and (optionally) clean image
            mask could be binary or probabilistic
        :param img0: (np.ndarray) 2D input image
        :param threshold: for creating binary mask from probabilistic mask
        :param inpaint: return clean image only if True
        :param binary: return binary mask if True. probabilistic mask otherwise.
        :param patch: (int) Use 256 unless otherwise required. patch size to run deepCR on.
        :param n_jobs: (int) number of jobs to run in parallel, passed to `joblib.` Beware of memory overflow for
          larger n_jobs.
        :return: CR mask and (optionally) clean inpainted image
        """
        folder = './joblib_memmap_' + secrets.token_hex(3)
        try:
            mkdir(folder)
        except FileExistsError:
            folder = './joblib_memmap_' + secrets.token_hex(3)
            mkdir(folder)

        im_shape = img0.shape
        img0_dtype = img0.dtype
        hh = int(math.ceil(im_shape[0]/patch))
        ww = int(math.ceil(im_shape[1]/patch))

        img0_filename_memmap = path.join(folder, 'img0_memmap')
        dump(img0, img0_filename_memmap)
        img0 = load(img0_filename_memmap, mmap_mode='r')

        if inpaint:
            img1_filename_memmap = path.join(folder, 'img1_memmap')
            img1 = np.memmap(img1_filename_memmap, dtype=img0.dtype,
                            shape=im_shape, mode='w+')
        else:
            img1 = None

        mask_filename_memmap = path.join(folder, 'mask_memmap')
        mask = np.memmap(mask_filename_memmap, dtype=np.int8 if binary else img0_dtype,
                           shape=im_shape, mode='w+')

        @wrap_non_picklable_objects
        def fill_values(i, j, img0, img1, mask, patch, inpaint, threshold, binary):
            img = img0[i * patch: min((i + 1) * patch, im_shape[0]), j * patch: min((j + 1) * patch, im_shape[1])]
            if inpaint:
                mask_, clean_ = self.clean_(img, threshold=threshold, inpaint=True, binary=binary)
                mask[i * patch: min((i + 1) * patch, im_shape[0]), j * patch: min((j + 1) * patch, im_shape[1])] = mask_
                img1[i * patch: min((i + 1) * patch, im_shape[0]), j * patch: min((j + 1) * patch, im_shape[1])] = clean_
            else:
                mask_ = self.clean_(img, threshold=threshold, inpaint=False, binary=binary)
                mask[i * patch: min((i + 1) * patch, im_shape[0]), j * patch: min((j + 1) * patch, im_shape[1])] = mask_

        results = Parallel(n_jobs=n_jobs, verbose=0)\
                   (delayed(fill_values)(i, j, img0, img1, mask, patch, inpaint, threshold, binary)
                    for i in range(hh) for j in range(ww))

        mask = np.array(mask)
        if inpaint:
            img1 = np.array(img1)
        try:
            shutil.rmtree(folder)
        except:
            print('Could not clean-up automatically.')

        if inpaint:
            return mask, img1
        else:
            return mask

Full code of the deepCR model class is at
https://github.com/profjsb/deepCR/blob/master/deepCR/model.py
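
Errno 38 (ENOSYS) is raised by the mmap.mmap call itself. Since clean_large_parallel() creates its scratch folder under the current working directory ('./joblib_memmap_...'), the memory map is attempted on whatever filesystem the job is launched from. One possible explanation, stated here as an assumption and not verified on Cori, is that this filesystem does not support memory mapping on the compute nodes. The sketch below probes candidate directories for mmap support using only the standard library; `supports_mmap` and `pick_memmap_dir` are hypothetical helpers, not part of deepCR, and the candidate list is illustrative.

```python
import mmap
import os
import tempfile


def supports_mmap(directory, nbytes=4096):
    """Return True if a writable mmap can be created on a file in `directory`."""
    try:
        with tempfile.TemporaryFile(dir=directory) as fh:
            fh.truncate(nbytes)  # give the file a non-zero length before mapping
            with mmap.mmap(fh.fileno(), nbytes, access=mmap.ACCESS_WRITE) as mm:
                mm[:4] = b"test"
        return True
    except (OSError, ValueError):
        # Covers ENOSYS as in the traceback above, as well as missing directories.
        return False


def pick_memmap_dir(candidates):
    """Return the first existing candidate directory that supports mmap, else None."""
    for directory in candidates:
        if directory and os.path.isdir(directory) and supports_mmap(directory):
            return directory
    return None


# Illustrative candidates; which of these exist depends on the system.
candidates = [os.environ.get("TMPDIR"), "/dev/shm", tempfile.gettempdir(), "."]
memmap_dir = pick_memmap_dir(candidates)
print("mmap-capable directory:", memmap_dir)
```

With the function as quoted above, the scratch folder follows the working directory, so one workaround to try is changing into such a directory (for example with os.chdir) before calling model.clean().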

Adding a CONTRIBUTING?

Hi, I have the pleasure of reviewing deepCR for your JOSS submission.

I'll likely have a few smaller comments and questions over the next few days, and will raise them as separate issues to keep your GitHub issues clean.

I really appreciate the paragraph on contributing in your README. Would you consider expanding it into a full CONTRIBUTING.md? I think it would add value, be easier to find, and give you the opportunity to provide more detailed guidelines and information on how to contribute.

UNet maxpool2D causing tensor shape mismatch

Max pooling rounds down when a tensor dimension is odd.
When the downsampled tensor is then upsampled via transposed convolution, its shape no longer matches that of the original tensor, which causes a shape mismatch when the skip connection is concatenated.

e.g. mdl=deepCR(); out=mdl(torch.ones(99,99))
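
The arithmetic behind the mismatch, together with one common workaround (padding the input to a multiple of the total downsampling factor and cropping the output back), can be sketched with NumPy alone. This is not the fix adopted in deepCR; `pad_to_multiple` and `crop_to_shape` are hypothetical helpers, and the factor of 4 assumes two 2x pooling levels, which should be checked against the actual network.

```python
import numpy as np


def pad_to_multiple(img, multiple):
    """Reflect-pad the bottom/right edges so both dimensions divide by `multiple`."""
    height, width = img.shape
    pad_h = (-height) % multiple
    pad_w = (-width) % multiple
    padded = np.pad(img, ((0, pad_h), (0, pad_w)), mode="reflect")
    return padded, (height, width)


def crop_to_shape(arr, shape):
    """Undo the padding by cropping back to the original shape."""
    return arr[: shape[0], : shape[1]]


img = np.ones((99, 99), dtype=np.float32)

# 2x max pooling floors 99 to 49; a 2x transposed convolution gives 98, not 99.
size_after_pool_and_upsample = (img.shape[0] // 2) * 2

# Assumed total downsampling factor of 4 (two pooling levels).
padded, original_shape = pad_to_multiple(img, multiple=4)
restored = crop_to_shape(padded, original_shape)

print(size_after_pool_and_upsample, padded.shape, restored.shape)
```

The network would run on `padded`, and its mask and inpainted outputs would be cropped with `crop_to_shape` before being returned.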

Allow usage of generator-like objects instead of numpy arrays for training

Because training takes a single numpy array, the user is currently limited to datasets that fit in RAM. Instead, we should allow the user to train on a generator (akin to fit_generator in Keras), which can read data from disk as needed. Perhaps the user could pass a torch.utils.data.Dataset instead of a numpy array.
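
A minimal sketch of what such lazy loading could look like is given below. It is an illustration of the idea, not deepCR's API: `LazyPatchDataset` and `iterate_batches` are hypothetical names, and the one-.npy-file-per-sample layout is an assumption. The class implements only `__len__` and `__getitem__`, the map-style protocol that torch.utils.data.Dataset subclasses also follow, and it avoids importing torch so the example runs with NumPy alone.

```python
import os
import tempfile

import numpy as np


class LazyPatchDataset:
    """Map-style dataset that reads one (image, mask) pair from disk per access."""

    def __init__(self, image_paths, mask_paths):
        if len(image_paths) != len(mask_paths):
            raise ValueError("image_paths and mask_paths must have equal length")
        self.image_paths = list(image_paths)
        self.mask_paths = list(mask_paths)

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        # Only the requested sample is loaded into memory.
        image = np.load(self.image_paths[idx]).astype(np.float32)
        mask = np.load(self.mask_paths[idx]).astype(np.float32)
        return image, mask


def iterate_batches(dataset, batch_size):
    """Yield (images, masks) batches without materialising the whole dataset."""
    for start in range(0, len(dataset), batch_size):
        stop = min(start + batch_size, len(dataset))
        images, masks = zip(*(dataset[i] for i in range(start, stop)))
        yield np.stack(images), np.stack(masks)


# Demo with five small synthetic samples written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    rng = np.random.default_rng(0)
    image_paths, mask_paths = [], []
    for k in range(5):
        image_paths.append(os.path.join(tmp, f"image_{k}.npy"))
        mask_paths.append(os.path.join(tmp, f"mask_{k}.npy"))
        np.save(image_paths[-1], rng.random((8, 8)))
        np.save(mask_paths[-1], rng.integers(0, 2, size=(8, 8)))

    dataset = LazyPatchDataset(image_paths, mask_paths)
    batch_shapes = [images.shape for images, _ in iterate_batches(dataset, batch_size=2)]

print(batch_shapes)
```

An object with this interface could in principle be handed to a PyTorch DataLoader, which would then take care of batching and shuffling in place of `iterate_batches`.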

Jupyter Notebook showing end-to-end training example

It would be very useful to build a set of notebooks showing how to train new models and use existing ones (including visualization of the results). My suggestion would be to build a model on ACS 814W data, akin to what is done in the original paper. The resulting models can be added to the repo, with the docs and README updated accordingly.

Instructions to verify installation

Hi, I have the pleasure of reviewing deepCR for your JOSS submission.

I'll likely have a few smaller comments and questions over the next few days, and will raise them as separate issues to keep your GitHub issues clean.

Could you add a short paragraph to the README that lets the user verify the installation, i.e. run the tests locally? Ideally, you could just grab the matching lines from your Travis config (pip install pytest-XYZ; python setup.py test).
