
dicom-cleaner's Introduction


pydicom

pydicom is a pure Python package for working with DICOM files. It lets you read, modify and write DICOM data in an easy "pythonic" way. As a pure Python package, pydicom can run anywhere Python runs without any other requirements, although if you're working with Pixel Data then we recommend you also install NumPy.

Note that pydicom is a general-purpose DICOM framework concerned with reading and writing DICOM datasets. In order to keep the project manageable, it does not handle the specifics of individual SOP classes or other aspects of DICOM. Other libraries both inside and outside the pydicom organization are based on pydicom and provide support for other aspects of DICOM, and for more specific applications.

Examples are pynetdicom, which is a Python library for DICOM networking, and deid, which supports the anonymization of DICOM files.

Installation

Using pip:

pip install pydicom

Using conda:

conda install -c conda-forge pydicom

For more information, including installation instructions for the development version, see the installation guide.

Documentation

The pydicom user guide, tutorials, examples and API reference documentation are available for both the current release and the development version on GitHub Pages.

Pixel Data

Compressed and uncompressed Pixel Data is always available to be read, changed and written as bytes:

>>> from pydicom import dcmread
>>> from pydicom.data import get_testdata_file
>>> path = get_testdata_file("CT_small.dcm")
>>> ds = dcmread(path)
>>> type(ds.PixelData)
<class 'bytes'>
>>> len(ds.PixelData)
32768
>>> ds.PixelData[:2]
b'\xaf\x00'

If NumPy is installed, Pixel Data can be converted to an ndarray using the Dataset.pixel_array property:

>>> arr = ds.pixel_array
>>> arr.shape
(128, 128)
>>> arr
array([[175, 180, 166, ..., 203, 207, 216],
       [186, 183, 157, ..., 181, 190, 239],
       [184, 180, 171, ..., 152, 164, 235],
       ...,
       [906, 910, 923, ..., 922, 929, 927],
       [914, 954, 938, ..., 942, 925, 905],
       [959, 955, 916, ..., 911, 904, 909]], dtype=int16)

Decompressing Pixel Data

JPEG, JPEG-LS and JPEG 2000

Converting JPEG, JPEG-LS or JPEG 2000 compressed Pixel Data to an ndarray requires installing one or more additional Python libraries. For information on which libraries are required, see the pixel data handler documentation.

RLE

Decompressing RLE Pixel Data only requires NumPy, but it can be quite slow. You may want to consider installing one or more additional Python libraries to speed up the process.

Compressing Pixel Data

Information on compressing Pixel Data using one of the formats below can be found in the corresponding encoding guides. These guides cover the specific requirements for each encoding method, and we recommend that you be familiar with them before performing image compression.

JPEG-LS, JPEG 2000

Compressing image data from an ndarray or bytes object to JPEG-LS or JPEG 2000 requires installing the following:

RLE

Compressing using RLE requires no additional packages but can be quite slow. It can be sped up by installing pylibjpeg with the pylibjpeg-rle plugin, or gdcm.

Examples

More examples are available in the documentation.

Change a patient's ID

from pydicom import dcmread

ds = dcmread("/path/to/file.dcm")
# Edit the (0010,0020) 'Patient ID' element
ds.PatientID = "12345678"
ds.save_as("/path/to/file_updated.dcm")

Display the Pixel Data

With NumPy and matplotlib

import matplotlib.pyplot as plt
from pydicom import dcmread
from pydicom.data import get_testdata_file

# The path to a pydicom test dataset
path = get_testdata_file("CT_small.dcm")
ds = dcmread(path)
# `arr` is a numpy.ndarray
arr = ds.pixel_array

plt.imshow(arr, cmap="gray")
plt.show()

Contributing

We are all volunteers working on pydicom in our free time. As our resources are limited, we very much value your contributions, be it bug fixes, new core features, or documentation improvements. For more information, please read our contribution guide.

If you have examples or extensions of pydicom that don't belong with the core software, but that you deem useful to others, you can add them to our contribution repository: contrib-pydicom.

dicom-cleaner's People

Contributors

vsoch


dicom-cleaner's Issues

Add GDCM dependency to Dockerfile

My DICOM images seem to be compressed and ask for GDCM. It would be great to add GDCM to the ocr Dockerfile :-).

This is what I see:

root@38a72d18ad9b:/code# ./entrypoint.sh --input /data
DEBUG Found 1 contender files in data
DEBUG Checking 1 dicom files for validation.
Found 1 valid dicom files
Traceback (most recent call last):
  File "/code/main.py", line 172, in <module>
    main()
  File "/code/main.py", line 108, in main
    verbose=args.verbose)
  File "/code/user/__init__.py", line 31, in __init__
    self.image = dicom._get_pixel_array()
  File "/opt/anaconda2/lib/python2.7/site-packages/pydicom-1.0.0a1-py2.7.egg/pydicom/dataset.py", line 1053, in _get_pixel_array
    self._pixel_array = self._pixel_data_numpy()
  File "/opt/anaconda2/lib/python2.7/site-packages/pydicom-1.0.0a1-py2.7.egg/pydicom/dataset.py", line 690, in _pixel_data_numpy
    raise NotImplementedError("Pixel Data is compressed in a "
NotImplementedError: Pixel Data is compressed in a format pydicom does not yet handle. Cannot return array. Pydicom might be able to convert the pixel data using GDCM if it is installed.
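The error above is raised when the transfer syntax of the file is a compressed one. As a minimal sketch (not part of dicom-cleaner or pydicom), the check below decides up front whether a decoder such as GDCM would be needed, based on the Transfer Syntax UID. The function name `needs_decoder` is hypothetical, and any UID not in the small set of uncompressed transfer syntaxes is assumed to be compressed, which may be wrong for private transfer syntaxes.

```python
# Transfer Syntax UIDs defined by the DICOM standard whose Pixel Data is
# stored without image compression.
UNCOMPRESSED_TRANSFER_SYNTAXES = {
    "1.2.840.10008.1.2",       # Implicit VR Little Endian
    "1.2.840.10008.1.2.1",     # Explicit VR Little Endian
    "1.2.840.10008.1.2.1.99",  # Deflated Explicit VR Little Endian
    "1.2.840.10008.1.2.2",     # Explicit VR Big Endian (retired)
}


def needs_decoder(transfer_syntax_uid):
    """Return True if Pixel Data with this transfer syntax must be decompressed.

    Hypothetical helper: every UID outside the set above is treated as
    compressed (JPEG, JPEG-LS, JPEG 2000, RLE, ...).
    """
    return str(transfer_syntax_uid).strip() not in UNCOMPRESSED_TRANSFER_SYNTAXES


if __name__ == "__main__":
    print(needs_decoder("1.2.840.10008.1.2.1"))     # Explicit VR Little Endian
    print(needs_decoder("1.2.840.10008.1.2.4.90"))  # JPEG 2000 Lossless
```

With pydicom, the UID would typically be read from `ds.file_meta.TransferSyntaxUID` before asking for the pixel array.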

Sidenote: Commenting on line main.py#L157... if you catch all exceptions it would be great to at minimum print the exception instead of hiding it. Not seeing the error is the worst kind of error.
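One way to keep a broad `except` without hiding the error is to log the full traceback before moving on. This is only a sketch of the idea; the wrapper name `run_safely` and the logger name `dicom_cleaner` are made up for illustration and are not taken from main.py.

```python
import logging

# Hypothetical logger name, chosen for this example only.
logger = logging.getLogger("dicom_cleaner")


def run_safely(func, *args, **kwargs):
    """Call func and return its result.

    If func raises, log the exception with its traceback and return None,
    so the failure stays visible instead of being silently swallowed.
    """
    try:
        return func(*args, **kwargs)
    except Exception:
        # logger.exception logs at ERROR level and appends the traceback.
        logger.exception("Processing failed")
        return None


if __name__ == "__main__":
    run_safely(lambda: 1 / 0)  # the ZeroDivisionError traceback is logged
```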

deal with different kinds of image data

It's not clear to me what all the different possibilities are for the pixel data I might find, and the application needs to deal with all of them. In early testing I hit some bugs with dimensions, so I'm guessing there are many.
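As a starting point for the dimension question, the sketch below labels the axes of a pixel array shape. It assumes the layout conventions documented for pydicom's `pixel_array`: (rows, columns) for a single grayscale frame, with a leading frames axis for multi-frame data and a trailing samples axis for color data. The function name `describe_pixel_layout` is hypothetical, and the two keyword arguments are meant to be filled from the Samples per Pixel and Number of Frames elements.

```python
def describe_pixel_layout(shape, samples_per_pixel=1, number_of_frames=1):
    """Map each axis of a pixel array shape to a name.

    Assumed conventions (illustration only):
      single frame, grayscale: (rows, columns)
      single frame, color:     (rows, columns, samples)
      multi-frame, grayscale:  (frames, rows, columns)
      multi-frame, color:      (frames, rows, columns, samples)
    """
    axes = []
    if number_of_frames > 1:
        axes.append("frames")
    axes.extend(["rows", "columns"])
    if samples_per_pixel > 1:
        axes.append("samples")

    if len(shape) != len(axes):
        raise ValueError(
            f"shape {tuple(shape)} does not match expected axes {axes}"
        )
    return dict(zip(axes, shape))


if __name__ == "__main__":
    print(describe_pixel_layout((128, 128)))
    print(describe_pixel_layout((10, 64, 64, 3), samples_per_pixel=3, number_of_frames=10))
```

Raising on a mismatch, rather than guessing, makes unexpected layouts show up early in testing.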

ocr cleaner has bug with gcc library / scikit image version

The entire container libraries / base needs to be debugged, unfortunately.

>>> maybe_text = dicom.select_text_among_candidates(saved_model)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "user/__init__.py", line 122, in select_text_among_candidates
    model = cPickle.load(fin)
  File "data/__init__.py", line 29, in <module>
    from sklearn.svm import LinearSVC
  File "/opt/anaconda2/lib/python2.7/site-packages/sklearn/svm/__init__.py", line 13, in <module>
    from .classes import SVC, NuSVC, SVR, NuSVR, OneClassSVM, LinearSVC
  File "/opt/anaconda2/lib/python2.7/site-packages/sklearn/svm/classes.py", line 1, in <module>
    from .base import BaseLibLinear, BaseSVC, BaseLibSVM
  File "/opt/anaconda2/lib/python2.7/site-packages/sklearn/svm/base.py", line 8, in <module>
    from . import libsvm, liblinear
ImportError: /opt/anaconda2/lib/python2.7/site-packages/sklearn/svm/libsvm.so: undefined symbol: __cxa_throw_bad_array_new_length

See notes in #8

update dependencies

The functions use older versions of scipy and Python 2; it would be good to update to Python 3 and the latest scikit-image.

create version that uses deid "pre-defined" header fields to flag

  • create flow into application that doesn't use OCR, but deid
    • should allow for file or folder input
    • output should be choice between png, dicom, with/without summary PDF report
    • files should NOT be overwritten for any reason

Once we have the above, we can test in a headless environment over a folder of sampled images and then use the reports to assess how well we did.
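Two of the requirements above (file or folder input, and never overwriting files) can be sketched with the standard library alone. The helper names `collect_inputs` and `safe_output_path` and the `-1`, `-2` suffix scheme are assumptions made for this example, not an existing dicom-cleaner or deid API.

```python
from pathlib import Path


def collect_inputs(path):
    """Return a sorted list of files for a single file or a folder.

    Folders are searched recursively; no DICOM validation is done here.
    """
    path = Path(path)
    if path.is_file():
        return [path]
    if path.is_dir():
        return sorted(p for p in path.rglob("*") if p.is_file())
    raise FileNotFoundError(f"No such file or folder: {path}")


def safe_output_path(out_dir, stem, suffix):
    """Return a path inside out_dir that does not exist yet.

    If "<stem><suffix>" is taken, try "<stem>-1<suffix>", "<stem>-2<suffix>",
    and so on, so existing files are never overwritten.
    """
    out_dir = Path(out_dir)
    candidate = out_dir / f"{stem}{suffix}"
    counter = 1
    while candidate.exists():
        candidate = out_dir / f"{stem}-{counter}{suffix}"
        counter += 1
    return candidate
```

The output format choice (png or dicom) would then only change the `suffix` passed to `safe_output_path`.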
