nadeemlab / deepliif

Deep Learning Inferred Multiplex ImmunoFluorescence for IHC Image Quantification (https://deepliif.org) [Nature Machine Intelligence'22, CVPR'22, MICCAI'23, Histopathology'23, MICCAI'24]

Home Page: https://deepliif.org

License: Other

Python 90.58% Dockerfile 0.08% Shell 0.38% Java 8.95%

multiplex immunohistochemical-images immunohistochemistry deep-learning multitask-learning pathology pathology-image pathology-gan segmentation cell-segmentation

deepliif's Introduction

Deep-Learning Inferred Multiplex Immunofluorescence for Immunohistochemical Image Quantification

Reporting biomarkers assessed by routine immunohistochemical (IHC) staining of tissue is broadly used in diagnostic pathology laboratories for patient care. To date, clinical reporting is predominantly qualitative or semi-quantitative. By creating a multitask deep learning framework referred to as DeepLIIF, we present a single-step solution to stain deconvolution/separation, cell segmentation, and quantitative single-cell IHC scoring. Leveraging a unique de novo dataset of co-registered IHC and multiplex immunofluorescence (mpIF) staining of the same slides, we segment and translate low-cost and prevalent IHC slides to more expensive-yet-informative mpIF images, while simultaneously providing the essential ground truth for the superimposed brightfield IHC channels. Moreover, a new nuclear-envelope stain, LAP2beta, with high (>95%) cell coverage is introduced to improve cell delineation/segmentation and protein expression quantification on IHC slides. By simultaneously translating input IHC images to clean/separated mpIF channels and performing cell segmentation/classification, we show that our model trained on clean IHC Ki67 data can generalize to more noisy and artifact-ridden images as well as other nuclear and non-nuclear markers such as CD3, CD8, BCL2, BCL6, MYC, MUM1, CD10, and TP53. We thoroughly evaluate our method on publicly available benchmark datasets as well as against pathologists' semi-quantitative scoring. Trained on IHC, DeepLIIF generalizes well to H&E images for out-of-the-box nuclear segmentation.

DeepLIIF is deployed as a free publicly available cloud-native platform (https://deepliif.org) with Bioformats (more than 150 input formats supported) and MLOps pipeline. We also release DeepLIIF implementations for single/multi-GPU training, Torchserve/Dask+Torchscript deployment, and auto-scaling via Pulumi (1000s of concurrent connections supported); details can be found in our documentation. DeepLIIF can be run locally (GPU required) by pip installing the package and using the deepliif CLI command. DeepLIIF can be used remotely (no GPU required) through the https://deepliif.org website, calling the cloud API via Python, or via the ImageJ/Fiji plugin; details for the free cloud-native platform can be found in our CVPR'22 paper.

Overview of DeepLIIF pipeline and sample input IHCs (different brown/DAB markers -- BCL2, BCL6, CD10, CD3/CD8, Ki67) with corresponding DeepLIIF-generated hematoxylin/mpIF modalities and classified (positive (red) and negative (blue) cell) segmentation masks. (a) Overview of DeepLIIF. Given an IHC input, our multitask deep learning framework simultaneously infers corresponding Hematoxylin channel, mpIF DAPI, mpIF protein expression (Ki67, CD3, CD8, etc.), and the positive/negative protein cell segmentation, baking explainability and interpretability into the model itself rather than relying on coarse activation/attention maps. In the segmentation mask, the red cells denote cells with positive protein expression (brown/DAB cells in the input IHC), whereas blue cells represent negative cells (blue cells in the input IHC). (b) Example DeepLIIF-generated hematoxylin/mpIF modalities and segmentation masks for different IHC markers. DeepLIIF, trained on clean IHC Ki67 nuclear marker images, can generalize to noisier as well as other IHC nuclear/cytoplasmic marker images.

Prerequisites

Python 3.8
Docker

Installing `deepliif`

DeepLIIF can be pip installed:

$ conda create --name deepliif_env python=3.8
$ conda activate deepliif_env
(deepliif_env) $ conda install -c conda-forge openjdk
(deepliif_env) $ pip install deepliif

The package is composed of two parts:

A library that implements the core functions used to train and test DeepLIIF models.
A CLI to run common batch operations, including training, batch testing, and TorchScript model serialization.

You can list all available commands:

(deepliif_env) $ deepliif --help
Usage: deepliif [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  prepare-testing-data   Preparing data for testing
  serialize              Serialize DeepLIIF models using Torchscript
  test                   Test trained models
  train                  General-purpose training script for multi-task...

Note: You might need to install a version of PyTorch that is compatible with your CUDA version. Otherwise, only the CPU will be used. Visit the PyTorch website for details. You can confirm if your installation will run on the GPU by checking if the following returns True:

import torch
torch.cuda.is_available()

Training Dataset

For training, all image sets must be 512x512 and combined together in 3072x512 images (six images of size 512x512 stitched together horizontally). The data need to be arranged in the following order:

XXX_Dataset 
    ├── train
    └── val
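
The combined-image layout described above can be sketched as follows. This is a minimal illustration, not DeepLIIF's own data-preparation code: `panel_boxes` and `stitch` are hypothetical helper names, and the left-to-right panel order is assumed to follow the label order given in the next section (IHC, Hematoxylin, DAPI, Lap2, Marker, Seg). `stitch` needs Pillow and shows only one plausible way to build the combined image.

```python
# Assumed left-to-right panel order; taken from the label list in this README.
PANEL_LABELS = ["IHC", "Hematoxylin", "DAPI", "Lap2", "Marker", "Seg"]
PANEL_SIZE = 512


def panel_boxes(panel_size=PANEL_SIZE, labels=PANEL_LABELS):
    """Map each label to its (left, top, right, bottom) box in the combined image."""
    return {
        label: (i * panel_size, 0, (i + 1) * panel_size, panel_size)
        for i, label in enumerate(labels)
    }


def stitch(panels, panel_size=PANEL_SIZE, labels=PANEL_LABELS):
    """Paste six PIL images side by side; `panels` maps label -> PIL.Image."""
    from PIL import Image  # imported lazily so panel_boxes works without Pillow

    combined = Image.new("RGB", (panel_size * len(labels), panel_size))
    for label, box in panel_boxes(panel_size, labels).items():
        panel = panels[label].convert("RGB")
        if panel.size != (panel_size, panel_size):
            raise ValueError(f"{label} panel is {panel.size}, expected a square of {panel_size}")
        combined.paste(panel, box[:2])  # paste at the box's top-left corner
    return combined


boxes = panel_boxes()
combined_width = boxes["Seg"][2]  # right edge of the last panel
print(combined_width, boxes["DAPI"])
```

Keeping the box computation separate from the pasting makes it easy to reuse the same boxes for cropping individual panels back out of a combined image.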

We have provided a simple function in the CLI for preparing data for training.

To prepare data for training, you need to have the image dataset for each image (including IHC, Hematoxylin Channel, mpIF DAPI, mpIF Lap2, mpIF marker, and segmentation mask) in the input directory. Each of the six images for a single image set must have the same naming format, with only the name of the label for the type of image differing between them. The label names must be, respectively: IHC, Hematoxylin, DAPI, Lap2, Marker, Seg. The command takes the address of the directory containing image set data and the address of the output dataset directory. It first creates the train and validation directories inside the given output dataset directory. It then reads all of the images in the input directory and saves the combined image in the train or validation directory, based on the given validation_ratio.

deepliif prepare-training-data --input-dir /path/to/input/images \
                               --output-dir /path/to/output/images \
                               --validation-ratio 0.2
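
The grouping and splitting that this command performs can be pictured with the sketch below. It is an illustration under stated assumptions, not the CLI's actual implementation: the `<set>_<Label>.<ext>` file naming, the per-set random assignment, and the helper names `group_image_sets` and `split_sets` are all hypothetical.

```python
import random
from collections import defaultdict

LABELS = ("IHC", "Hematoxylin", "DAPI", "Lap2", "Marker", "Seg")


def group_image_sets(filenames, labels=LABELS):
    """Group files into image sets keyed by the file name with the label removed."""
    sets = defaultdict(dict)
    for name in filenames:
        stem, _, _ = name.rpartition(".")
        for label in labels:
            if stem.endswith("_" + label):  # assumed `<set>_<Label>.<ext>` naming
                sets[stem[: -len(label) - 1]][label] = name
                break
    # Keep only complete sets that contain all six label images.
    return {key: files for key, files in sets.items() if len(files) == len(labels)}


def split_sets(set_names, validation_ratio, seed=0):
    """Randomly assign each image set to 'train' or 'val'."""
    rng = random.Random(seed)
    return {
        name: ("val" if rng.random() < validation_ratio else "train")
        for name in sorted(set_names)
    }


files = [f"roi{i}_{label}.png" for i in range(10) for label in LABELS]
files.append("roi99_IHC.png")  # incomplete set, dropped by group_image_sets
image_sets = group_image_sets(files)
assignment = split_sets(image_sets, validation_ratio=0.2)
print(len(image_sets), sorted(set(assignment.values())))
```

With a per-set random draw, the validation share only approximates the requested ratio on small datasets.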

Training

To train a model:

deepliif train --dataroot /path/to/input/images \
               --name Model_Name

or

python train.py --dataroot /path/to/input/images \
                --name Model_Name

To view training losses and results, open the URL http://localhost:8097. For cloud servers replace localhost with your IP.
Epoch-wise intermediate training results are in DeepLIIF/checkpoints/Model_Name/web/index.html.
Trained models will by default be saved in DeepLIIF/checkpoints/Model_Name.
Training datasets can be downloaded here.

DP: To train a model you can use DP. DP is single-process. It means that all the GPUs you want to use must be on the same machine so that they can be included in the same process - you cannot distribute the training across multiple GPU machines, unless you write your own code to handle inter-node (node = machine) communication. To split and manage the workload for multiple GPUs within the same process, DP uses multi-threading. You can find more information on DP here.

To train a model with DP (example with 2 GPUs on 1 machine):

deepliif train --dataroot <data_dir> --batch-size 6 --gpu-ids 0 --gpu-ids 1

Note that batch-size is defined per process. Since DP is a single-process method, the batch-size you set is the effective batch size.

DDP: To train a model you can use DDP. DDP usually spawns multiple processes. DeepLIIF's code follows the PyTorch recommendation to spawn 1 process per GPU (doc). If you want to assign multiple GPUs to each process, you will need to make modifications to DeepLIIF's code (see doc). Despite all the benefits of DDP, one drawback is the extra GPU memory needed for dedicated CUDA buffer for communication. See a short discussion here. In the context of DeepLIIF, this means that there might be situations where you could use a bigger batch size with DP as compared to DDP, which may actually train faster than using DDP with a smaller batch size. You can find more information on DDP here.

To launch training using DDP on a local machine, use deepliif trainlaunch. Example with 2 GPUs (on 1 machine):

deepliif trainlaunch --dataroot <data_dir> --batch-size 3 --gpu-ids 0 --gpu-ids 1 --use-torchrun "--nproc_per_node 2"

Note that

batch-size is defined per process. Since DDP is a multi-process method, the batch-size you set is the batch size for each process, and the effective batch size will be batch-size multiplied by the number of processes you started. In the above example, it will be 3 * 2 = 6.
You still need to provide all GPU ids to use to the training command. Internally, in each process DeepLIIF picks the device using gpu_ids[local_rank]. If you provide --gpu-ids 2 --gpu-ids 3, the process with local rank 0 will use gpu id 2 and that with local rank 1 will use gpu id 3.
-t 3 --log_dir <log_dir> is not required, but is a useful setting in torchrun that saves the log from each process to your target log directory. For example:

deepliif trainlaunch --dataroot <data_dir> --batch-size 3 --gpu-ids 0 --gpu-ids 1 --use-torchrun "-t 3 --log_dir <log_dir> --nproc_per_node 2"

If your PyTorch is older than 1.10, DeepLIIF calls torch.distributed.launch in the backend. Otherwise, DeepLIIF calls torchrun.
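
The batch-size and device bookkeeping for DP and DDP can be summarized in a few lines. This is a sketch of the arithmetic only: `effective_batch_size` and `device_for_rank` are hypothetical names that mirror the rules stated above, not functions from the deepliif package.

```python
def effective_batch_size(batch_size, num_processes=1):
    """batch-size is per process, so the effective size scales with the process count."""
    return batch_size * num_processes


def device_for_rank(gpu_ids, local_rank):
    """Mirror the gpu_ids[local_rank] lookup described above."""
    return gpu_ids[local_rank]


dp_effective = effective_batch_size(6)       # DP: one process drives both GPUs
ddp_effective = effective_batch_size(3, 2)   # DDP: one process per GPU
ddp_devices = [device_for_rank([2, 3], rank) for rank in range(2)]
print(dp_effective, ddp_effective, ddp_devices)
```

Both example commands therefore train with the same effective batch size of 6; they differ only in how that batch is distributed across processes.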

Serialize Model

The installed deepliif uses Dask to perform inference on the input IHC images. Before running the test command, the model files must be serialized using Torchscript. To serialize the model files:

deepliif serialize --model-dir /path/to/input/model/files \
                   --output-dir /path/to/output/model/files

By default, the model files are expected to be located in DeepLIIF/model-server/DeepLIIF_Latest_Model.
By default, the serialized files will be saved to the same directory as the input model files.

Testing

To test the model:

deepliif test --input-dir /path/to/input/images \
              --output-dir /path/to/output/images \
              --model-dir /path/to/the/serialized/model \
              --tile-size 512

or

python test.py --dataroot /path/to/input/images \
               --results_dir /path/to/output/images \
               --checkpoints_dir /path/to/model/files \
               --name Model_Name

The latest version of the pretrained models can be downloaded here.
Before running test on images, the model files must be serialized as described above.
The serialized model files are expected to be located in DeepLIIF/model-server/DeepLIIF_Latest_Model.
The test results will be saved to the specified output directory, which defaults to the input directory.
The tile size must be specified and is used to split the image into tiles for processing. The tile size is based on the resolution (scan magnification) of the input image, and the recommended values are a tile size of 512 for 40x images, 256 for 20x, and 128 for 10x. Note that the smaller the tile size, the longer inference will take.
Testing datasets can be downloaded here.
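
The tile-size recommendation above can be captured in a small lookup. This is a convenience sketch: `tile_size_for` and `tile_count` are hypothetical helpers, and the tile count ignores any padding or overlap that DeepLIIF may apply internally; it only illustrates why smaller tiles mean more work per image.

```python
import math

# Recommended tile sizes per scan magnification, as listed above.
RECOMMENDED_TILE_SIZE = {"40x": 512, "20x": 256, "10x": 128}


def tile_size_for(resolution):
    """Return the recommended --tile-size value for a scan magnification."""
    try:
        return RECOMMENDED_TILE_SIZE[resolution.lower()]
    except KeyError:
        raise ValueError(f"unsupported resolution: {resolution!r}") from None


def tile_count(width, height, tile_size):
    """Number of tiles needed to cover an image, counting partial edge tiles."""
    return math.ceil(width / tile_size) * math.ceil(height / tile_size)


tiles_at_512 = tile_count(2048, 2048, tile_size_for("40x"))
tiles_at_128 = tile_count(2048, 2048, tile_size_for("10x"))
print(tiles_at_512, tiles_at_128)
```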

Whole Slide Image (WSI) Inference:
For translation and segmentation of whole slide images, you can use the same test command, giving the path to the directory containing your whole slide images as the input-dir. DeepLIIF automatically reads the WSI region by region, translates and segments each region separately, and stitches the regions to create the translation and segmentation for the whole slide image; it then saves all masks in ome.tiff format in the given output-dir. The region-size can be changed based on the available GPU resources.

deepliif test --input-dir /path/to/input/images \
              --output-dir /path/to/output/images \
              --model-dir /path/to/the/serialized/model \
              --tile-size 512 \
              --region-size 20000
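
To get a feel for how region-size affects the amount of work per slide, the sketch below enumerates non-overlapping regions over a slide of a given size. It is an assumption-laden illustration: `iter_regions` is a hypothetical helper, and the actual region traversal inside DeepLIIF may differ (for example in overlap or ordering).

```python
def iter_regions(width, height, region_size):
    """Yield (x, y, w, h) boxes that cover the slide without overlap."""
    for y in range(0, height, region_size):
        for x in range(0, width, region_size):
            # Edge regions are clipped to the slide boundary.
            yield x, y, min(region_size, width - x), min(region_size, height - y)


regions = list(iter_regions(45000, 30000, 20000))
covered_area = sum(w * h for _, _, w, h in regions)
print(len(regions), regions[-1])
```

A smaller region-size lowers peak memory use per region at the cost of processing more regions.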

If you prefer, it is possible to run the models using Torchserve. Please see the documentation on how to deploy the model with Torchserve and for an example of how to run the inference.

Docker

We provide a Dockerfile that can be used to run the DeepLIIF models inside a container. First, install the Docker Engine. After installing Docker, follow these steps:

Download the pretrained models here and place them in DeepLIIF/model-server/DeepLIIF_Latest_Model.
To create a docker image from the docker file:

docker build -t cuda/deepliif .

The image is then used as a base. You can copy and use it to run an application. The application needs an isolated environment in which to run, referred to as a container.

To create and run a container:

 docker run -it -v `pwd`:`pwd` -w `pwd` cuda/deepliif deepliif test --input-dir Sample_Large_Tissues --tile-size 512

When you run a container from the image, the deepliif CLI will be available. You can easily run any CLI command in the activated environment and copy the results from the docker container to the host.

ImageJ Plugin

If you don't have access to GPU or appropriate hardware and just want to use ImageJ to run inference, we have also created an ImageJ plugin for your convenience.

The plugin also supports submitting multiple ROIs at once.

Cloud Deployment

If you don't have access to GPU or appropriate hardware and don't want to install ImageJ, we have also created a cloud-native DeepLIIF deployment with a user-friendly interface to upload images, visualize, interact, and download the final results.

Cloud API Endpoints

DeepLIIF can also be accessed programmatically through an endpoint by posting a multipart-encoded request containing the original image file, along with optional parameters including postprocessing thresholds:

POST /api/infer

File Parameter:

  img (required)
    Image on which to run DeepLIIF.

Query String Parameters:

  resolution
    Resolution used to scan the slide (10x, 20x, 40x). Default is 40x.

  pil
    If present, use Pillow to load the image instead of Bio-Formats. Pillow is
    faster, but works only on common image types (png, jpeg, etc.).

  slim
    If present, return only the refined segmentation result image.

  nopost
    If present, do not perform postprocessing (returns only inferred images).

  prob_thresh
    Probability threshold used in postprocessing the inferred segmentation map
    image. The segmentation map value must be above this value in order for a
    pixel to be included in the final cell segmentation. Valid values are an
    integer in the range 0-254. Default is 150.

  size_thresh
    Lower threshold for size gating the cells in postprocessing. Segmented
    cells must have more pixels than this value in order to be included in the
    final cell segmentation. Valid values are 0, a positive integer, or 'auto'.
    'Auto' will try to automatically determine this lower bound for size gating
    based on the distribution of detected cell sizes. Default is 'auto'.

  size_thresh_upper
    Upper threshold for size gating the cells in postprocessing.  Segmented
    cells must have fewer pixels than this value in order to be included in the
    final cell segmentation. Valid values are a positive integer or 'none'.
    'None' will use no upper threshold in size gating. Default is 'none'.

  marker_thresh
    Threshold for the effect that the inferred marker image will have on the
    postprocessing classification of cells as positive.  If any corresponding
    pixel in the marker image for a cell is above this threshold, the cell will
    be classified as being positive regardless of the values from the inferred
    segmentation image. Valid values are an integer in the range 0-255, 'none',
    or 'auto'. 'None' will not use the marker image during classification.
    'Auto' will automatically determine a threshold from the marker image.
    Default is 'auto'.

For example, in Python:

import os
import json
import base64
from io import BytesIO

import requests
from PIL import Image

# Use the sample images from the main DeepLIIF repo
images_dir = './Sample_Large_Tissues'
filename = 'ROI_1.png'

root = os.path.splitext(filename)[0]

res = requests.post(
    url='https://deepliif.org/api/infer',
    files={
        'img': open(f'{images_dir}/{filename}', 'rb'),
    },
    params={
        'resolution': '40x',
    },
)

data = res.json()

def b64_to_pil(b):
    return Image.open(BytesIO(base64.b64decode(b.encode())))

for name, img in data['images'].items():
    with open(f'{images_dir}/{root}_{name}.png', 'wb') as f:
        b64_to_pil(img).save(f, format='PNG')

with open(f'{images_dir}/{root}_scoring.json', 'w') as f:
    json.dump(data['scoring'], f, indent=2)
print(json.dumps(data['scoring'], indent=2))
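
Because the endpoint accepts several thresholds with different valid ranges, it can help to check values client-side before posting. The sketch below encodes only the ranges documented above; `check_int` and `build_params` are hypothetical helpers (not part of any DeepLIIF client), and the presence-only flags (pil, slim, nopost) are left out.

```python
def check_int(name, value, low, high=None):
    """Return value if it is an int within [low, high]; high=None means unbounded."""
    in_range = isinstance(value, int) and not isinstance(value, bool) and value >= low
    if in_range and high is not None:
        in_range = value <= high
    if not in_range:
        bound = f"in {low}-{high}" if high is not None else f">= {low}"
        raise ValueError(f"{name} must be an integer {bound}, got {value!r}")
    return value


def build_params(resolution="40x", prob_thresh=150, size_thresh="auto",
                 size_thresh_upper="none", marker_thresh="auto"):
    """Validate values against the documented ranges and return a params dict."""
    if resolution not in ("10x", "20x", "40x"):
        raise ValueError(f"resolution must be 10x, 20x, or 40x, got {resolution!r}")
    check_int("prob_thresh", prob_thresh, 0, 254)
    if size_thresh != "auto":
        check_int("size_thresh", size_thresh, 0)
    if size_thresh_upper != "none":
        check_int("size_thresh_upper", size_thresh_upper, 1)
    if marker_thresh not in ("auto", "none"):
        check_int("marker_thresh", marker_thresh, 0, 255)
    return {
        "resolution": resolution,
        "prob_thresh": prob_thresh,
        "size_thresh": size_thresh,
        "size_thresh_upper": size_thresh_upper,
        "marker_thresh": marker_thresh,
    }


params = build_params(prob_thresh=200, size_thresh=250)
print(params)
```

The resulting dictionary can be passed as the `params` argument of `requests.post` in the example above.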

If you have previously run DeepLIIF on an image and want to postprocess it with different thresholds, the postprocessing routine can be called directly using the previously inferred results:

POST /api/postprocess

File Parameters:

  img (required)
    Image on which DeepLIIF was run.

  seg_img (required)
    Inferred segmentation image previously generated by DeepLIIF.

  marker_img (optional)
    Inferred marker image previously generated by DeepLIIF.  If this is
    omitted, then the marker image will not be used in classification.

Query String Parameters:

  resolution
    Resolution used to scan the slide (10x, 20x, 40x). Default is 40x.

  pil
    If present, use Pillow to load the original image instead of Bio-Formats.
    Pillow is faster, but works only on common image types (png, jpeg, etc.).
    Pillow is always used to open the seg_img and marker_img files.

  prob_thresh
    Probability threshold used in postprocessing the inferred segmentation map
    image. The segmentation map value must be above this value in order for a
    pixel to be included in the final cell segmentation. Valid values are an
    integer in the range 0-254. Default is 150.

  size_thresh
    Lower threshold for size gating the cells in postprocessing. Segmented
    cells must have more pixels than this value in order to be included in the
    final cell segmentation. Valid values are 0, a positive integer, or 'auto'.
    'Auto' will try to automatically determine this lower bound for size gating
    based on the distribution of detected cell sizes. Default is 'auto'.

  size_thresh_upper
    Upper threshold for size gating the cells in postprocessing.  Segmented
    cells must have fewer pixels than this value in order to be included in the
    final cell segmentation. Valid values are a positive integer or 'none'.
    'None' will use no upper threshold in size gating. Default is 'none'.

  marker_thresh
    Threshold for the effect that the inferred marker image will have on the
    postprocessing classification of cells as positive.  If any corresponding
    pixel in the marker image for a cell is above this threshold, the cell will
    be classified as being positive regardless of the values from the inferred
    segmentation image. Valid values are an integer in the range 0-255, 'none',
    or 'auto'. 'None' will not use the marker image during classification.
    'Auto' will automatically determine a threshold from the marker image.
    Default is 'auto'. (If marker_img is not supplied, this has no effect.)

For example, in Python:

import os
import json
import base64
from io import BytesIO

import requests
from PIL import Image

# Use the sample images from the main DeepLIIF repo
images_dir = './Sample_Large_Tissues'
filename = 'ROI_1.png'

root = os.path.splitext(filename)[0]

res = requests.post(
    url='https://deepliif.org/api/postprocess',
    files={
        'img': open(f'{images_dir}/{filename}', 'rb'),
        'seg_img': open(f'{images_dir}/{root}_Seg.png', 'rb'),
        'marker_img': open(f'{images_dir}/{root}_Marker.png', 'rb'),
    },
    params={
        'resolution': '40x',
        'pil': True,
        'size_thresh': 250,
    },
)

data = res.json()

def b64_to_pil(b):
    return Image.open(BytesIO(base64.b64decode(b.encode())))

for name, img in data['images'].items():
    with open(f'{images_dir}/{root}_{name}.png', 'wb') as f:
        b64_to_pil(img).save(f, format='PNG')

with open(f'{images_dir}/{root}_scoring.json', 'w') as f:
    json.dump(data['scoring'], f, indent=2)
print(json.dumps(data['scoring'], indent=2))
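
The size-gating and marker-threshold rules documented for the postprocessing parameters can be read as the following logic. This is a simplified sketch of the documented semantics only, not the server's implementation: `gate_cells` and `is_positive` are hypothetical names, cells are represented by plain pixel counts and marker values, and the 'auto' modes are not modelled.

```python
def gate_cells(cell_sizes, size_thresh=0, size_thresh_upper=None):
    """Keep cells with more pixels than size_thresh and fewer than size_thresh_upper."""
    return {
        cell_id: n_pixels
        for cell_id, n_pixels in cell_sizes.items()
        if n_pixels > size_thresh
        and (size_thresh_upper is None or n_pixels < size_thresh_upper)
    }


def is_positive(seg_says_positive, marker_pixels, marker_thresh=None):
    """A cell is positive if any marker pixel exceeds the threshold,
    otherwise the segmentation result decides."""
    if marker_thresh is not None and any(p > marker_thresh for p in marker_pixels):
        return True
    return seg_says_positive


kept = gate_cells({1: 100, 2: 300, 3: 5000}, size_thresh=250, size_thresh_upper=4000)
overridden = is_positive(False, [10, 200], marker_thresh=180)
unchanged = is_positive(False, [10, 200], marker_thresh=None)
print(kept, overridden, unchanged)
```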

Synthetic Data Generation

The first version of the DeepLIIF model suffered from its inability to separate IHC positive cells in some large clusters, resulting from the absence of clustered positive cells in our training data. To infuse more information about the clustered positive cells into our model, we present a novel approach for the synthetic generation of IHC images using co-registered data. We design a GAN-based model that receives the Hematoxylin channel, the mpIF DAPI image, and the segmentation mask and generates the corresponding IHC image. The model converts the Hematoxylin channel to gray-scale to infer more helpful information such as the texture and discard unnecessary information such as color. The Hematoxylin image guides the network to synthesize the background of the IHC image by preserving the shape and texture of the cells and artifacts in the background. The DAPI image assists the network in identifying the location, shape, and texture of the cells to better isolate the cells from the background. The segmentation mask helps the network specify the color of cells based on the type of the cell (positive cell: a brown hue, negative: a blue hue).

In the next step, we generate synthetic IHC images with more clustered positive cells. To do so, we change the segmentation mask by choosing a percentage of random negative cells in the segmentation mask (called Neg-to-Pos) and converting them into positive cells. Some samples of the synthesized IHC images along with the original IHC image are shown below.

Overview of synthetic IHC image generation. (a) A training sample of the IHC-generator model. (b) Some samples of synthesized IHC images using the trained IHC-Generator model. The Neg-to-Pos shows the percentage of the negative cells in the segmentation mask converted to positive cells.

We created a new dataset using the original IHC images and synthetic IHC images. We synthesize each image in the dataset two times by setting the Neg-to-Pos parameter to 50% and 70%. We re-trained our network with the new dataset. You can find the new trained model here.
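
The Neg-to-Pos step can be sketched at the level of cell labels. This is a toy illustration, assuming each cell in the segmentation mask is represented by an id with a "pos" or "neg" label; the function name `neg_to_pos`, the dictionary representation, and the rounding rule are hypothetical and do not reproduce the code used to build the dataset.

```python
import random


def neg_to_pos(cell_labels, neg_to_pos_percent, seed=0):
    """Flip a percentage of randomly chosen negative cells to positive.

    cell_labels maps a cell id to "pos" or "neg"; a new dict is returned.
    """
    rng = random.Random(seed)
    negatives = sorted(cid for cid, label in cell_labels.items() if label == "neg")
    n_flip = round(len(negatives) * neg_to_pos_percent / 100)
    flipped = set(rng.sample(negatives, n_flip))
    return {cid: ("pos" if cid in flipped else label) for cid, label in cell_labels.items()}


cells = {i: ("pos" if i < 2 else "neg") for i in range(12)}  # 2 positive, 10 negative
positive_counts = {}
for percent in (50, 70):
    synthetic = neg_to_pos(cells, percent)
    positive_counts[percent] = sum(label == "pos" for label in synthetic.values())
print(positive_counts)
```

The modified labels would then be rendered back into a segmentation mask and passed to the IHC generator together with the Hematoxylin and DAPI images.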

Registration

To register the de novo stained mpIF and IHC images, you can use the registration framework in the 'Registration' directory. Please refer to the README file provided in the same directory for more details.

Contributing Training Data

To train DeepLIIF, we used a dataset of lung and bladder tissues containing IHC, hematoxylin, mpIF DAPI, mpIF Lap2, and mpIF Ki67 of the same tissue scanned using ZEISS Axioscan. These images were scaled and co-registered with the fixed IHC images using affine transformations, resulting in 1264 co-registered sets of IHC and corresponding multiplex images of size 512x512. We randomly selected 575 sets for training, 91 sets for validation, and 598 sets for testing the model. We also randomly selected and manually segmented 41 images of size 640x640 from the recently released BCDataset, which contains Ki67 stained sections of breast carcinoma with Ki67+ and Ki67- cell centroid annotations (for cell detection rather than cell instance segmentation task). We split these tiles into 164 images of size 512x512; the test set varies widely in the density of tumor cells and the Ki67 index. You can find this dataset here.
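
The counts in the paragraph above can be checked with a short calculation. The text does not say how each 640x640 image was split into 512x512 images; the sketch assumes four overlapping corner crops per image, which is one layout consistent with 41 images yielding 164 tiles. `corner_crops` is a hypothetical helper.

```python
def corner_crops(size=640, crop=512):
    """Top-left corners of the four overlapping corner crops of a square tile."""
    offset = size - crop
    return [(x, y) for y in (0, offset) for x in (0, offset)]


splits = {"train": 575, "val": 91, "test": 598}
total_sets = sum(splits.values())            # co-registered 512x512 sets
bc_tiles = 41 * len(corner_crops())          # assumed 4 crops per BCDataset image
print(total_sets, bc_tiles, corner_crops())
```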

We are also creating a self-configurable version of DeepLIIF which will take as input any co-registered H&E/IHC and multiplex images and produce the optimal output. If you are generating or have generated H&E/IHC and multiplex staining for the same slide (de novo staining) and would like to contribute that data for DeepLIIF, we can perform co-registration, whole-cell multiplex segmentation via ImPartial, train the DeepLIIF model and release back to the community with full credit to the contributors.

Memorial Sloan Kettering Cancer Center AI-ready immunohistochemistry and multiplex immunofluorescence dataset for breast, lung, and bladder cancers (Nature Machine Intelligence'22)
Moffitt Cancer Center AI-ready multiplex immunofluorescence and multiplex immunohistochemistry dataset for head-and-neck squamous cell carcinoma (MICCAI'23)

Support

Please use the Image.sc Forum for discussion and questions related to DeepLIIF.

Bugs can be reported in the GitHub Issues tab.

License

Acknowledgments

This code is inspired by CycleGAN and pix2pix in PyTorch.

Reference

If you find our work useful in your research or if you use parts of this code or our released dataset, please cite the following papers:

@article{ghahremani2022deep,
  title={Deep learning-inferred multiplex immunofluorescence for immunohistochemical image quantification},
  author={Ghahremani, Parmida and Li, Yanyun and Kaufman, Arie and Vanguri, Rami and Greenwald, Noah and Angelo, Michael and Hollmann, Travis J and Nadeem, Saad},
  journal={Nature Machine Intelligence},
  volume={4},
  number={4},
  pages={401--412},
  year={2022},
  publisher={Nature Publishing Group}
}

@article{ghahremani2022deepliifui,
  title={DeepLIIF: An Online Platform for Quantification of Clinical Pathology Slides},
  author={Ghahremani, Parmida and Marino, Joseph and Dodds, Ricardo and Nadeem, Saad},
  journal={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  pages={21399--21405},
  year={2022}
}

@article{ghahremani2023deepliifdataset,
  title={An AI-Ready Multiplex Staining Dataset for Reproducible and Accurate Characterization of Tumor Immune Microenvironment},
  author={Ghahremani, Parmida and Marino, Joseph and Hernandez-Prera, Juan and V. de la Iglesia, Janis and JC Slebos, Robbert and H. Chung, Christine and Nadeem, Saad},
  journal={International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI)},
  year={2023}
}

@article{nadeem2023ki67validationMTC,
  author = {Nadeem, Saad and Hanna, Matthew G and Viswanathan, Kartik and Marino, Joseph and Ahadi, Mahsa and Alzumaili, Bayan and Bani, Mohamed-Amine and Chiarucci, Federico and Chou, Angela and De Leo, Antonio and Fuchs, Talia L and Lubin, Daniel J and Luxford, Catherine and Magliocca, Kelly and Martinez, Germán and Shi, Qiuying and Sidhu, Stan and Al Ghuzlan, Abir and Gill, Anthony J and Tallini, Giovanni and Ghossein, Ronald and Xu, Bin},
  title = {Ki67 proliferation index in medullary thyroid carcinoma: a comparative study of multiple counting methods and validation of image analysis and deep learning platforms},
  journal = {Histopathology},
  year = {2023},
  doi = {https://doi.org/10.1111/his.15048}
}

@article{zehra2024deepliifstitch,
  author = {Zehra, Talat and Marino, Joseph and Wang, Wendy and Frantsuzov, Grigoriy and Nadeem, Saad},
  title = {Rethinking Histology Slide Digitization Workflows for Low-Resource Settings},
  journal = {International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI)},
  year = {2024}
}

deepliif's Issues

Select evaluation metrics for generated IHC images from H&E images

Dear Authors,

Thank you for sharing your open-source code. I am particularly interested in your paper titled 'Deep Learning-Inferred Multiplex Immunofluorescence for Immunohistochemical Image Quantification.'

However, I noticed the use of evaluation metrics like Intersection over Union (IoU), pixel accuracy, Dice score, and the aggregated Jaccard index (AJI), while other metrics like Structural Similarity Index Measure (SSIM), Multi-Scale SSIM (MS-SSIM), Peak Signal-to-Noise Ratio (PSNR), and error measures (MSE, MAE, PCC) are not included.

In the context of evaluating generated IHC images from H&E images, metrics like SSIM and PSNR seem intuitively relevant. Could you elaborate on the rationale behind using IoU, pixel accuracy, and other segmentation-focused metrics in your work?

Thanks in advance,
Linh

DeepLIIF dataset

Hi,

In https://www.nature.com/articles/s42256-022-00471-x, the dataset is:

The images were scaled and co-registered with the fixed IHC images using affine transformations, resulting in 1,667 registered sets of IHC images and the other modalities of size 512 × 512. We randomly selected 363 sets for training, 53 sets for validation and 600 sets for testing the model. As described in the Synthetic data generation section, we synthetically generated 250 sets using our synthetic data generation model and added 212 to training and 38 to validation.

In https://www.biorxiv.org/content/biorxiv/early/2021/10/08/2021.05.01.442219.full.pdf, the dataset is:

These images were scaled
and co-registered with the fixed IHC images using affine
transformations, resulting in 1667 registered sets of IHC images and the other modalities of size 512×512. We randomly
selected 709 sets for training, 358 sets for validation, and 600
sets for testing the model.

The data splits in the two articles are not the same, but their metric results appear to be the same.

Inconsistency Between Cloud Deployment and API Call Results

Hello,

When the input is the same pathology picture, I am encountering an issue where the results I get from using your Cloud Deployment are significantly better than those from the API calls. I am hoping to get some guidance on how to make the results consistent.

I expected that the results from both the Cloud Deployment and the API calls would be similar or consistent. Are there any parameters that can be adjusted when making API calls?

Best regards,
Haozhong Ma.

Docker Container and Post-Processing

The documentation for creating a Docker Container says to name the image "DeepLIIF_Image", but this name must be lowercase ("deepliif_image").

The example in the documentation uses the directory "Sample_Data", but the directory is actually called "Sample_Large_Tissues".

To run "PostProcessSegmentationMask.py" within a Docker container, "scikit-image" and "numba" must be installed manually.

Running "postprocessing.py" to stitch the result tiles together ignores the "Segmentation_Overlaid" and "Segmentation_Refined" tiles created using "PostProcessSegmentationMask.py".

how to change Segmentation Threshold, size gating and marker threshold in the local CLI version

Hi,
I really like the web interface version.
However, I have a lot of tiles and need to use the local version on my server.
Can I change the thresholds in the CLI version?
I want to set a higher threshold for my case.

Many thanks!
Best,
Wei

Dataset Links Request

Would you be so kind as to provide the links to the datasets used in your paper? We are particularly interested in obtaining the bladder carcinoma, non-small cell lung carcinoma slides, Ki67 stained sections of breast carcinoma, and any other relevant datasets. Thank you in advance!

Possible function name change

In version 1.1.4 released on PyPI, the module cli.py tries to import read_input_image from deepliif.util, but in the source code here on GitHub (though only v1.1.2 is showing as of now), this symbol is replaced with get_information. get_information exists while read_input_image does not, leading to an error in the 1.1.4 release version on entrypoint command deepliif.

Bad Segmentation with Large IHC Images

I want to perform cell segmentation on a 3000x3000 IHC image, with a tile_size of 300. The image is cropped into 100 tiles (10 per side), but some tiles that contain cells are mistakenly identified as background. How can I get an accurate segmentation?
These are some examples:

How can I obtain the coordinates of seg?

Hello, I want to obtain cell coordinates from the model, but when I step into the run_dask function, I find that tensor_to_pil(seg) directly converts seg into a numpy image. I don't know how to convert that image into coordinates. Could you please help me? Thank you very much.
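
One possible way to get coordinates from a segmentation image is to label its connected regions and take the centroid of each one. The sketch below is not taken from the DeepLIIF code base. It assumes the seg image has already been thresholded into a binary mask (a list of rows of 0/1 values); how DeepLIIF encodes classes in seg is not covered here, and component_centroids is a hypothetical helper name. On real images, scipy.ndimage.label or skimage.measure.regionprops would do the same job much faster.

```python
from collections import deque


def component_centroids(mask):
    """Return (row, col) centroids of 4-connected foreground regions.

    `mask` is assumed to be a binary 2D list: non-zero means "cell pixel".
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    centroids = []
    for r in range(height):
        for c in range(width):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill collects every pixel of this region.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # The centroid is the mean pixel position of the region.
            cy = sum(p[0] for p in pixels) / len(pixels)
            cx = sum(p[1] for p in pixels) / len(pixels)
            centroids.append((cy, cx))
    return centroids


# Toy mask with two separate regions (hypothetical data).
toy_mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
print(component_centroids(toy_mask))
```

Touching cells merge into a single region under this approach, so the count of centroids is only a lower bound on the number of cells unless the mask separates them.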

Pre-trained model

There are problems with the pre-trained model URL.

Missing import modules

Hi,

PrepareDataForTesting.py and PrepareDataForTraining.py are both missing some imports:
import numpy as np
from options.processing_options import ProcessingOptions

Apart from that, the scripts work great, thanks a lot!

Best,
Sherif

Can't make it on the open platform: https://deepliif.org/


Inference is extremely slow

After installing deepliif 1.1.11 by default using pip install deepliif in Python 3.8, the deepliif command reports an error. Upgrading numba to 0.58.1 resolves the issue. However, when performing inference with data from the Sample_Large_Tissues folder, the program gets stuck, taking 2 hours to complete just one image.

How to calculate the percentage of IHC+ cells in a WSI

Dear DeepLIIF author,

I am trying to use DeepLIIF to segment IHC+ cells on our WSIs.
In the results output folder, I have multiple TIFFs plus a results.pickle file.
I am wondering if there is any way to calculate the percentage of IHC+ cells from my WSI.
Or should I calculate it from results.pickle?
Many thanks!

Best,
Wei
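
As a rough illustration of the arithmetic being asked about: once positive and negative cell counts are available for each tile, the slide-level percentage is the pooled positive count divided by the pooled total. The sketch below assumes such counts exist; the contents of results.pickle are not described in this thread, so tile_counts and percent_positive are hypothetical names with made-up numbers.

```python
def percent_positive(num_positive, num_negative):
    """Percentage of positive cells among all counted cells."""
    total = num_positive + num_negative
    if total == 0:
        return 0.0  # no cells counted, so avoid dividing by zero
    return 100.0 * num_positive / total


# Hypothetical (positive, negative) cell counts, one pair per tile.
tile_counts = [(12, 48), (0, 30), (8, 2)]

# Pool the counts over all tiles before dividing.
total_pos = sum(pos for pos, _ in tile_counts)
total_neg = sum(neg for _, neg in tile_counts)
print(percent_positive(total_pos, total_neg))
```

Pooling counts before dividing weights every cell equally. Averaging per-tile percentages instead would let a sparse tile with a handful of cells count as much as a dense one.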

Could you provide the code for calculating precision, recall, and F1-score on the cell detection task?

I only found the segmentation metrics calculation code in your repository.
Looking forward to your reply.
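
For readers with the same question, detection metrics are commonly computed by matching predicted cell centres to ground-truth centres within a distance tolerance and counting true positives, false positives, and false negatives. The sketch below uses greedy nearest-neighbour matching, which is one convention among several (optimal one-to-one assignment is another) and is not confirmed to be the procedure used in the paper. The function name detection_scores, the max_dist tolerance, and the sample points are all hypothetical.

```python
import math


def detection_scores(predicted, ground_truth, max_dist):
    """Return (precision, recall, f1) for point detections.

    Each prediction is greedily matched to the nearest still-unmatched
    ground-truth point that lies within `max_dist` pixels.
    """
    unmatched_gt = list(ground_truth)
    true_pos = 0
    for px, py in predicted:
        best_idx, best_dist = None, max_dist
        for i, (gx, gy) in enumerate(unmatched_gt):
            dist = math.hypot(px - gx, py - gy)
            if dist <= best_dist:
                best_idx, best_dist = i, dist
        if best_idx is not None:
            unmatched_gt.pop(best_idx)  # each ground-truth point matches once
            true_pos += 1
    false_pos = len(predicted) - true_pos
    false_neg = len(ground_truth) - true_pos
    precision = true_pos / (true_pos + false_pos) if predicted else 0.0
    recall = true_pos / (true_pos + false_neg) if ground_truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical centres: two predictions land near a ground-truth cell.
pred = [(10, 10), (50, 50), (90, 90)]
truth = [(11, 10), (52, 49), (200, 200), (300, 300)]
print(detection_scores(pred, truth, max_dist=5))
```

The result depends on the tolerance and on the order in which predictions are processed, so both should be reported alongside the scores.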

Using only immunofluorescence images without H&E staining

Is it possible to use this pipeline to segment images that consist only of several fluorescence image layers and DAPI staining, without H&E or other stains?

Thank you

Error when trying to run this line: docker run -it -v `pwd`:`pwd` -w `pwd` cuda/deepliif deepliif test --input-dir Sample_Large_Tissues

Here was my error message:

** DEPRECATION NOTICE! **

THIS IMAGE IS DEPRECATED and is scheduled for DELETION.
https://gitlab.com/nvidia/container-images/cuda/blob/master/doc/support-policy.md

Traceback (most recent call last):
  File "/usr/local/bin/deepliif", line 5, in <module>
    from cli import cli
  File "/usr/local/lib/python3.8/dist-packages/cli.py", line 13, in <module>
    from deepliif.models import inference, postprocess, compute_overlap, init_nets, DeepLIIFModel, infer_modalities, infer_results_for_wsi, create_model
  File "/usr/local/lib/python3.8/dist-packages/deepliif/models/__init__.py", line 33, in <module>
    from deepliif.util import *
  File "/usr/local/lib/python3.8/dist-packages/deepliif/util/__init__.py", line 14, in <module>
    from ..postprocessing import imadjust
  File "/usr/local/lib/python3.8/dist-packages/deepliif/postprocessing.py", line 9, in <module>
    from numba import jit
  File "/usr/local/lib/python3.8/dist-packages/numba/__init__.py", line 43, in <module>
    from numba.np.ufunc import (vectorize, guvectorize, threading_layer,
  File "/usr/local/lib/python3.8/dist-packages/numba/np/ufunc/__init__.py", line 3, in <module>
    from numba.np.ufunc.decorators import Vectorize, GUVectorize, vectorize, guvectorize
  File "/usr/local/lib/python3.8/dist-packages/numba/np/ufunc/decorators.py", line 3, in <module>
    from numba.np.ufunc import _internal
SystemError: initialization of _internal failed without raising an exception

Also, in case it's relevant, I don't have a GPU and I'm running this on an Intel MacBook. Could you let me know what to change to fix the error?

Thank you!

Bad Segmentation for IHC Images with cytoplasm staining

Hi,

I'm working on a research project to segment and classify IHC slides with membrane staining. I used your online tool (deepliif.org) to evaluate your model on my dataset, but it seems to have difficulty segmenting the cells correctly. I have attached a sample image along with the model's output for reference.

Are there any configurations or parameters I can modify to improve the results?

Thank you.

Could you provide the code for generating the pseudo ground truth density map on BCData?
