Comments (4)

vsoch commented on August 28, 2024

hey @BrianKolowitz ! The library is agnostic to the names of the filters, actually. The ones we are using (whitelist, blacklist, greylist) correspond with different actions that we implement. For example:

  • whitelist: we pass it through, it likely doesn't have burned in pixels
  • blacklist: definitely has burned in pixels, quarantine
  • greylist: not sure, a human needs to look over and possibly customize filter

And when you run the code above to replace identifiers, the criteria for filtering are represented in that deid recipe, but the replace_identifiers action itself doesn't look at any kind of filtering. You would have already done the filtering and removed the files you didn't want from your list of dicom_files. Actually, I made an object called a DicomCleaner for just this task (as opposed to handing stuff around to different functions).

So what you want to do then is a workflow that looks like:

  1. start with raw dicom files, and design your filters (looks like you are done here!)
  2. Then you will want to create a DicomCleaner and run it with your input list of dicoms, specifically the "detect" function. That looks like:
from deid.dicom import DicomCleaner

# Here is some dummy file you have to test
dicom_file = "example.dcm"

# Create the cleaner
client = DicomCleaner()

# If you intend to run cleaning, you can provide an output folder. Otherwise just skip this
client = DicomCleaner(output_folder='/home/vanessa/Desktop')

Running detect is just handing the file to the cleaner client. This is likely the extent of how you will want to use the cleaner. The output is a data structure with the result; detect means we take your deid recipe and parse the headers looking for matches to the filters.

client.detect(dicom_file)

{'flagged': True,
'results': [{'coordinates': [],
   'group': 'blacklist',
   'reason': ' ImageType missing  or ImageType empty '}]}
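A result shaped like the one above can be sorted into buckets before anything is handed to replace_identifiers. The following is only a minimal sketch: `triage` and `fake_detect` are hypothetical helpers (not part of deid), and the shape of an unflagged result (`'flagged': False` with an empty `'results'` list) is an assumption. In real use you would pass `client.detect` instead of the stand-in.

```python
from typing import Callable, Dict, List


def triage(files: List[str], detect: Callable[[str], dict]) -> Dict[str, List[str]]:
    """Sort files into buckets keyed by filter group, plus 'clean' for unflagged files."""
    buckets: Dict[str, List[str]] = {"clean": []}
    for path in files:
        result = detect(path)
        if not result.get("flagged"):
            buckets["clean"].append(path)
            continue
        # A file may match several filters; record it once per distinct group.
        groups = {hit["group"] for hit in result.get("results", [])}
        for group in sorted(groups):
            buckets.setdefault(group, []).append(path)
    return buckets


def fake_detect(path: str) -> dict:
    """Hypothetical stand-in for client.detect, returning canned results."""
    if path == "bad.dcm":
        return {
            "flagged": True,
            "results": [
                {
                    "coordinates": [],
                    "group": "blacklist",
                    "reason": " ImageType missing  or ImageType empty ",
                }
            ],
        }
    # Assumed shape for a file that matched no filter.
    return {"flagged": False, "results": []}


buckets = triage(["good.dcm", "bad.dcm"], fake_detect)
print(buckets)
```

The files under `buckets["clean"]` would be the ones to continue processing, while anything under `blacklist` or `greylist` gets quarantined or reviewed by a human.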

Then you could parse that data structure and deal with the files appropriately, and the ones that you want to continue processing could go into replace_identifiers. I haven't tested this fully yet, but we also have some (very basic) functions to perform a cleaning, and they depend on having known coordinates for PHI based on modality / image type, etc. That would look like this (after detect):

client.clean()

# And then there are a few saving functions (dcm and png)
client.save_png()
client.save_dicom()

If there are coordinates, they are blanked, otherwise no change. This is again reliant on how good your list is. Much better would be an OCR method, which I started, but it needs more testing and development. If you are interested: https://github.com/pydicom/dicom-scraper

A full example script is here:
https://github.com/pydicom/deid/blob/master/examples/dicom/pixels/run-cleaner-client.py

If you step through this and want to write up some docs for the (web friendly / readable) version it would be greatly appreciated! I wrote them into that script but didn't pass on the knowledge to the docs yet. Let me know if you have other questions.

BrianKolowitz commented on August 28, 2024

Thanks. Is there a way to accomplish this in one line?

FORMAT dicom
%filter whitelist
LABEL Xray
contains Modality CR|DX

%filter blacklist
LABEL Not Xray
equals Modality PR|RG

I'd like to specify something like this

%filter blacklist
LABEL Not Xray
notin Modality [CR,DX]

so I don't have to be exhaustive in the modalities I list

vsoch commented on August 28, 2024

Could you write out (in people terms) what you are trying to do? Basically "not in Modality CR or DX"? There should be a notequals, so like:

%filter blacklist
LABEL Not Xray
notequals Modality CR|DX

The whole list of filters is:

contains
notcontains (looks like there is a bug in filters.py for this, will fix soon!)
equals
notequals
missing
present
empty

The stuff on the right side is all regular expressions, so whatever regular expression string you might use is fair game! If there is a filter that you think would be useful to add, let's add it!
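To get a feel for how a pattern like CR|DX behaves as a regular expression, here is a small sketch using Python's standard `re` module. The `equals` and `contains` helpers are hypothetical and only illustrate whole-value versus anywhere matching. They are not deid's implementation, which may differ in details such as case handling.

```python
import re

pattern = "CR|DX"
modalities = ["CR", "DX", "MR", "PR", "CT"]


def equals(value: str, pattern: str) -> bool:
    """Whole-value match: the pattern must cover the entire string."""
    return re.fullmatch(pattern, value) is not None


def contains(value: str, pattern: str) -> bool:
    """Anywhere match: the pattern may occur at any position in the string."""
    return re.search(pattern, value) is not None


# "Not CR or DX" without listing every other modality explicitly.
not_xray = [m for m in modalities if not equals(m, pattern)]
print(not_xray)

# The two styles disagree when the pattern is only part of the value.
print(contains("XCRX", pattern), equals("XCRX", pattern))
```

The negated whole-value check is what lets a single alternation stand in for an exhaustive list of the modalities you want to exclude.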

BrianKolowitz commented on August 28, 2024

I think it's fine for my current needs, I'm trying to include only CR OR DX.
