
endoscopydepthestimation-pytorch's Introduction

Dense Depth Estimation in Monocular Endoscopy with Self-supervised Learning Methods

Depth predictions from our method (2nd column), SfMLearner by Zhou et al. (3rd column), and GeoNet by Yin et al. (4th column). The first row shows results from a testing patient and camera not seen during training; the second row is from a training patient and camera. Depth maps are re-scaled with sparse depth maps generated from SfM results for visualization.

This codebase implements the method described in the paper:

Dense Depth Estimation in Monocular Endoscopy with Self-supervised Learning Methods

Xingtong Liu, Ayushi Sinha, Masaru Ishii, Gregory D. Hager, Austin Reiter, Russell H. Taylor, Mathias Unberath

In IEEE Transactions on Medical Imaging (TMI)

This work also won the Best Paper Award and Best Presentation Award at the International Workshop on Computer-Assisted and Robotic Endoscopy (CARE) 2018.

Please contact Xingtong Liu ([email protected]) or Mathias Unberath ([email protected]) if you have any questions.

We kindly ask you to cite the TMI paper or the CARE Workshop paper if the code is used in your own work.

@ARTICLE{liu2019dense,
  author={X. {Liu} and A. {Sinha} and M. {Ishii} and G. D. {Hager} and A. {Reiter} and R. H. {Taylor} and M. {Unberath}},
  journal={IEEE Transactions on Medical Imaging}, 
  title={Dense Depth Estimation in Monocular Endoscopy With Self-Supervised Learning Methods}, 
  year={2020},
  volume={39},
  number={5},
  pages={1438-1447}}
@incollection{liu2018self,
  title={Self-supervised Learning for Dense Depth Estimation in Monocular Endoscopy},
  author={Liu, Xingtong and Sinha, Ayushi and Unberath, Mathias and Ishii, Masaru and Hager, Gregory D and Taylor, Russell H and Reiter, Austin},
  booktitle={OR 2.0 Context-Aware Operating Theaters, Computer Assisted Robotic Endoscopy, Clinical Image-Based Procedures, and Skin Image Analysis},
  pages={128--138},
  year={2018},
  publisher={Springer}
}

Instructions

  1. Install all necessary Python packages: torch, torchvision, opencv-python, numpy, tqdm, torchsummary, tensorboardX, albumentations, plyfile, pyyaml (< 6), matplotlib, tensorflow-gpu. (pathlib, argparse, pickle, datetime, and shutil are part of the Python standard library and do not need to be installed separately.)
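
For example (standard-library modules excluded; version pins other than pyyaml are not specified by this README and may need adjusting to your Python/CUDA setup):

/path/to/pip install torch torchvision opencv-python numpy tqdm torchsummary tensorboardX albumentations plyfile "pyyaml<6" matplotlib tensorflow-gpu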

  2. Generate training data from training videos using Structure from Motion (SfM) or Simultaneous Localization and Mapping (SLAM); we use SfM in this work. For the format, please refer to the training data example in this repository (a minimal parsing sketch follows this item):

     • Color images named "{:08d}.jpg" are extracted from the video sequence on which SfM is run.
     • camera_intrinsics_per_view stores the estimated camera intrinsic matrices for all registered views. In this example, since all images come from the same video sequence, we assume the intrinsic matrices are identical for all images; the first three rows of this file are the focal length and the x and y coordinates of the principal point of the first image's camera.
     • motion.yaml stores the estimated poses of the world coordinate system w.r.t. the corresponding camera coordinate system.
     • selected_indexes stores all frame indexes of the video sequence.
     • structure.ply stores the sparse 3D reconstruction estimated by SfM.
     • undistorted_mask.bmp is a binary mask used to mask out blank regions of the video frames.
     • view_indexes_per_point stores, for each point in the sparse reconstruction, the indexes of the frames with which that point is triangulated. The views per point are separated by -1, and the point order matches structure.ply. We smooth out this point-visibility information in the script to make the global scale recovery more stable and to obtain more sparse points per frame for training; the smoothing is controlled by the parameter visibility_overlap.
     • visible_view_indexes stores the original frame indexes of the registered views for which SfM successfully estimated valid camera poses.
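
As a quick aid for this step, here is a minimal parsing sketch for two of these files, assuming the plain-text layouts described above (whitespace-separated numbers; -1 as the per-point separator). It is illustrative only and not part of the repository:

from pathlib import Path
import numpy as np

def read_first_intrinsic_matrix(path):
    # First three rows: focal length, principal point x, principal point y
    values = [float(v) for v in Path(path).read_text().split()]
    f, cx, cy = values[:3]
    return np.array([[f, 0.0, cx],
                     [0.0, f, cy],
                     [0.0, 0.0, 1.0]])

def read_view_indexes_per_point(path):
    # One run of frame indexes per sparse point, runs separated by -1;
    # the point order matches structure.ply
    per_point, current = [], []
    for token in Path(path).read_text().split():
        if int(token) == -1:
            per_point.append(current)
            current = []
        else:
            current.append(int(token))
    return per_point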

  3. Run train.py with proper arguments for self-supervised learning. One example is:

/path/to/python /path/to/train.py --id_range 1 2 --input_downsampling 4.0 --network_downsampling 64 --adjacent_range 5 30 --input_size 256 320 --batch_size 8 --num_workers 8 --num_pre_workers 8 --validation_interval 1 --display_interval 50 --dcl_weight 5.0 --sfl_weight 20.0 --max_lr 1.0e-3 --min_lr 1.0e-4 --inlier_percentage 0.99 --visibility_overlap 30 --training_patient_id 1 --testing_patient_id 1 --validation_patient_id 1 --number_epoch 100 --num_iter 2000 --architecture_summary --training_result_root "/path/to/training/directory" --training_data_root "/path/to/training/data"

  4. Run evaluate.py to generate evaluation results. Apply a registration algorithm that can estimate a similarity transformation to register the predicted point clouds to the corresponding CT model and compute residual errors (this step may require manual point cloud initialization). One example is:

/path/to/python /path/to/evaluate.py --id_range 1 2 --input_downsampling 4.0 --network_downsampling 64 --adjacent_range 5 30 --input_size 256 320 --batch_size 1 --num_workers 2 --num_pre_workers 8 --load_all_frames --inlier_percentage 0.99 --visibility_overlap 30 --testing_patient_id 1 --load_intermediate_data --architecture_summary --trained_model_path "/path/to/trained/model" --sequence_root "/path/to/sequence/path" --evaluation_result_root "/path/to/testing/result" --evaluation_data_root "/path/to/testing/data" --phase "test"

  5. The SfM method is implemented based on the work cited below. However, any standard SfM method, such as COLMAP, should also work reasonably well; its results need to be reformatted to be loaded correctly for network training. Please refer to this repo if you want to convert the format of COLMAP results; if the conversion script there is used, the generated folder hierarchy needs to be changed to match example_training_data_root. (A hedged conversion sketch follows the citation below.)
@inproceedings{leonard2016image,
  title={Image-based navigation for functional endoscopic sinus surgery using structure from motion},
  author={Leonard, Simon and Reiter, Austin and Sinha, Ayushi and Ishii, Masaru and Taylor, Russell H and Hager, Gregory D},
  booktitle={Medical Imaging 2016: Image Processing},
  volume={9784},
  pages={97840V},
  year={2016},
  organization={International Society for Optics and Photonics}
}
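
As a rough illustration of that reformatting (a sketch under assumptions, not the referenced conversion script): the snippet below reads a COLMAP cameras.txt with a SIMPLE_PINHOLE or PINHOLE model and writes the focal-length and principal-point rows that camera_intrinsics_per_view expects per step 2; averaging fx and fy for the PINHOLE model is my assumption.

from pathlib import Path

def colmap_cameras_to_intrinsics(cameras_txt, out_path):
    # COLMAP cameras.txt line: CAMERA_ID MODEL WIDTH HEIGHT PARAMS[]
    # SIMPLE_PINHOLE params: f, cx, cy ; PINHOLE params: fx, fy, cx, cy
    rows = []
    for line in Path(cameras_txt).read_text().splitlines():
        if line.startswith("#") or not line.strip():
            continue
        fields = line.split()
        model, params = fields[1], [float(v) for v in fields[4:]]
        if model == "SIMPLE_PINHOLE":
            f, cx, cy = params
        elif model == "PINHOLE":
            f = 0.5 * (params[0] + params[1])  # assumption: average fx and fy
            cx, cy = params[2], params[3]
        else:
            raise ValueError("unsupported camera model: " + model)
        rows += [f, cx, cy]
    Path(out_path).write_text("\n".join(str(v) for v in rows) + "\n")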


endoscopydepthestimation-pytorch's Issues

training data issue

Hi, I prepared my training data using COLMAP and converted it to your format, but the training results are not good, and the data created by COLMAP is not very similar to yours.
May I ask what tool you used to create the data?

Error in DataLoader during training

I am trying to train the model on the provided small dataset with the following command:
[screenshot of the training command, 2021-12-09 16-50-30]

Nevertheless, I get the following error:
[screenshot of the error message, 2021-12-09 16-51-11]

Do you have any ideas about this issue? Do you think there is some error in the input parameters?

Thanks!

Nan Loss

Why is the threshold 0.9 in "_bilinear_interpolate(img_masks, u_2_flat, v_2_flat) * img_masks >= 0.9"? Can it be replaced in order to avoid the NaN loss value, and what should it be replaced with?

When I replace it with 0.24, training runs, but loss_depth_consistency is very small compared to loss_sparse_flow. Is that normal?

Also, in the call torch.nn.functional.grid_sample(input=im.permute(0, 3, 1, 2), grid=grid, mode='bilinear', align_corners=True, padding_mode=padding_mode).permute(0, 2, 3, 1) inside _bilinear_interpolate(), why does the result take values between 0 and 1, given that the values of im are either 0 or 1?
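
For anyone hitting the same confusion: grid_sample with mode='bilinear' blends neighbouring pixel values, so sampling a binary mask between a 0-pixel and a 1-pixel yields a fractional value; the 0.9 threshold presumably keeps only samples that land well inside the mask. A tiny self-contained demo (not from the repository):

import torch
import torch.nn.functional as F

# A 1-channel, 1x4 binary "mask": pixels [0, 0, 1, 1]
mask = torch.tensor([0.0, 0.0, 1.0, 1.0]).view(1, 1, 1, 4)

# Sample halfway between pixel 1 (value 0) and pixel 2 (value 1).
# grid holds normalized (x, y) coordinates in [-1, 1]; with
# align_corners=True, x = 0 lands midway between those two pixels.
grid = torch.tensor([[[[0.0, 0.0]]]])
out = F.grid_sample(mask, grid, mode='bilinear', align_corners=True)
print(out.item())  # 0.5 -- a blend of 0 and 1, even though the mask is binary

Lowering the threshold (e.g. to 0.24) therefore admits samples near the mask boundary, which changes which pixels contribute to each loss term.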

view size is not compatible with input tensor's size and stride

Hi. Thank you very much for releasing the code.

I am running the code with the example data with the batch_size as 4 and other parameters as what is shown in the document. May I ask you if you have any suggestions for the following issue? Many thanks.

Epoch 0, lr [0.0001]:   0%|          | 0/2000 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 307, in <module>
    intrinsics])
  File "/data/yanhzhan/anaconda_y/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/yanhzhan/YhZ_ICRA2021/EndoscopyDepthEstimation-Pytorch-master/models.py", line 463, in forward
    rotation_matrices, intrinsic_matrices, self.epsilon)
  File "/data/yanhzhan/YhZ_ICRA2021/EndoscopyDepthEstimation-Pytorch-master/models.py", line 541, in _depth_warping
    u_2_flat = u_2.view(-1)
RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.
Exception ignored in: <object repr() failed>
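
A likely local fix, following the hint at the end of the error message itself (an edit to the line named in the traceback, not an official patch):

# in models.py, inside _depth_warping (line 541 in the traceback above)
u_2_flat = u_2.reshape(-1)            # copies only when the layout requires it
# or, equivalently:
u_2_flat = u_2.contiguous().view(-1)  # make the tensor contiguous first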

num_samples should be a positive integer value, but got num_samples=0

Hello! Thank you so much for uploading the code.
I am having some issues running the example provided on GitHub, in the folder example_training_data_root/bag_1/_start_004259_end_004629_stride_25_segment_13.

This is the command I used:
python student_training.py --id_range 1 11 --input_downsampling 4.0 --network_downsampling 64 --adjacent_range 5 30 --torchsummary_input_size 256 320 --batch_size 8 --validation_interval 1 --display_interval 50 --dcl_weight 0.1 --sfl_weight 20.0 --max_lr 1.0e-3 --min_lr 1.0e-4 --inlier_percentage 0.99 --testing_patient_id 4 --load_intermediate_data --use_hsv_colorspace --number_epoch 100 --num_iter 1000 --architecture_summary --training_result_root "output" --training_data_root "example_training_data_root/bag_1/_start_004259_end_004629_stride_25_segment_13"

This is the error I got:

Tensorboard visualization at output/depth_estimation_training_run_11_15_9_38_test_id_[4]
Traceback (most recent call last):
  File "student_training.py", line 182, in <module>
    num_workers=num_workers)
  File "/home/marina/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 213, in __init__
    sampler = RandomSampler(dataset)
  File "/home/marina/.local/lib/python3.6/site-packages/torch/utils/data/sampler.py", line 94, in __init__
    "value, but got num_samples={}".format(self.num_samples))
ValueError: num_samples should be a positive integer value, but got num_samples=0
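
This ValueError is raised by PyTorch's RandomSampler when the dataset it is given is empty, which usually means no training frames were found under --training_data_root for the specified ids. A hedged sanity check one could add before constructing the DataLoader (names are illustrative):

from torch.utils.data import DataLoader, Dataset

def make_training_loader(dataset: Dataset, batch_size: int, num_workers: int) -> DataLoader:
    # RandomSampler fails with "num_samples=0" on an empty dataset,
    # so fail early with a clearer message instead
    if len(dataset) == 0:
        raise RuntimeError("Empty dataset: check --training_data_root and the "
                           "patient id arguments; the folder hierarchy must "
                           "match example_training_data_root")
    return DataLoader(dataset, batch_size=batch_size, shuffle=True,
                      num_workers=num_workers)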

Access to test data

Hi, I was wondering if you could share some test data to evaluate. Thanks.

Invalid magic number

Hey, may I ask: when running evaluate.py I get this error:

"Invalid magic number; corrupt file?" when executing the line torch.load(str(trained_model_path))

The trained model path points to the example file you supplied, 'example_training_data_root/precompute_4.0_64_0.99.pkl'.

Any idea what may cause this error? Thanks.

Depth map and 3D reconstruction

Hello!
How can I get the final depth map and 3D reconstruction results from the data you provided?
I have tried adjusting the training phase; will the test phase produce the final result?
Looking forward to your reply! I hope to get your help.

KeyError: 'example_training_data_root/bag_1/_start_004259_end_004629_stride_25_segment_13'

I used your new training script to train the model;
however, I still get this problem:

Exception: KeyError:Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/_utils/worker.py", line 99, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/_utils/worker.py", line 99, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/share/chyh/workspace/EndoscopyDepthEstimation-Pytorch/dataset.py", line 363, in __getitem__
    start_h, end_h, start_w, end_w = self.crop_positions_per_seq[folder]
KeyError: 'example_training_data_root/bag_1/_start_004259_end_004629_stride_25_segment_13'

run train.py

An error message appears at runtime:
AttributeError: 'SfMDataset' object has no attribute 'pre_workers'

Is it because a dataset is missing?
Can you provide the dataset?

I am looking forward to your reply. Wish you a happy life and good luck in your work!

depth_estimation_training_run problem

Hello. I am interested in your project.
I ran student_training.py as described in the README, but I had a problem with training_result_root: initialization reports that the path 'Training\depth_estimation_training_run_1_30_10_50_test_id_[1]' cannot be found. I wonder whether a pre-trained model or some other step is needed to solve this.

Update: it was a path problem and I have solved it.

Dataset access

Hi,
I would like to repeat your training phase. Is there a chance that you can share with me your dataset?

Stuck in validation

Hello!

I am able to run the code for the first training epoch, but it gets stuck in validation: "Validation Epoch 0: 0%| | 0/36 [00:35<?, ?it/s]"

I tried changing the multiprocessing start method to 'spawn' and 'fork', but I still get the same problem.

Any ideas on how to solve this issue?

Thank you very much in advance!

COLMAP settings and sharing of pretrained weights

First of all, congratulations on the amazing papers you have authored on dense depth estimation and feature extraction, producing state-of-the-art results.
I am working on a similar project, estimating depth from endoscopic videos, and would like your help in getting the trained weights for this implementation. I would be much obliged, since I don't have the resources to generate SfM/MVS data and train the model due to the large compute times.

I also have a few questions regarding COLMAP data generation:

  1. Why do you use a pinhole model instead of a radial one, given the nature of endoscopic videos, as seen in the camera intrinsics text files uploaded in example_training_data_root?

  2. COLMAP offers both automatic reconstruction and step-by-step sparse/dense reconstruction. Can you share the exact parameters you used to produce the training data?

Thanks

How do I run evaluate.py?

Hello,
I finished running student_training.py.
I want to see the output through evaluate.py.
What should I do?

evaluate.py

Dear author:
I had some problems running evaluate.py.

  1. I tried --phase "test". The program runs and produces results: 35 *.ply files, 35 pictures, and events.out.tfevents.1595389730.tao-X10SRA. What is the role of the *.ply files? Can they be converted into a point cloud model, and how? Is the depth map the right side of each picture?
  2. I tried to visualize the generated file with TensorBoard, but it didn't work.
  3. I tried --phase "validation" and an error occurred while the program was running:

Restored model, epoch 101, step 50500
  0%|          | 0/35 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "evaluate.py", line 170, in <module>
    for batch, (
ValueError: too many values to unpack (expected 17)

Looking forward to your reply; thank you very much for your help!

Try a new data set

Dear author:
Following the requirements in your readme.md: if I switch to a new dataset, I need to preprocess it and obtain motion.yaml through SfM or SLAM, along with camera_intrinsics_per_view, selected_indexes, and visible_view_indexes.
Do I also need structure.ply before training? Does it represent the reconstruction of one frame or of multiple frames?

Versions of dependencies

Hello, many thanks for sharing your excellent work! Could you kindly list the versions of the dependencies, such as PyTorch and TensorFlow?

problem about _depth_warping

Hi @lppllppl920, from Eq. (7) in the paper, B = -Kt, while in the code it seems to be B = -K R^{-1} t. May I ask if this is because the pose is defined differently from the paper?
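
For readers with the same question, one consistent reading, assuming (as the data description above states) that motion.yaml stores world-to-camera poses, is that the relative motion between two such poses already contains an R^{-1} t product:

% If x_i = R_i x_w + t_i maps world coordinates to camera i, eliminating x_w gives
\[
x_k = R_k R_j^{-1} x_j + \underbrace{\left( t_k - R_k R_j^{-1} t_j \right)}_{t_j^k},
\]
% so B = -K t_j^k expands to terms of the form -K R^{-1} t.

Whether this exactly matches the repository's convention is for the authors to confirm.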

run evaluate.py

I'm sorry to bother you, but I have a problem and need your help.

--sequence_root "/path/to/sequence/path" --evaluation_data_root "/path/to/testing/data" --phase "test"

How do I find the two paths above? I can't find the test data, so I tried to use the training dataset as a test dataset and renamed the folder "example_training_data_root" to "test_path".
The following output is the result of my run.

It is a great honor to communicate with you! Looking forward to your reply!

(my_environment) siat@siat-ThinkStation-P300:~/EndoscopyDepthEstimation-Pytorch-master2$ python evaluate.py --id_range 1 2 --adjacent_range 5 30 --input_size 256 320 --load_all_frames --load_intermediate_data --architecture_summary --sequence_root "/home/siat/EndoscopyDepthEstimation-Pytorch-master2/example_training_data_root/bag_1/_start_004259_end_004629_stride_25_segment_13" --evaluation_result_root "/home/siat/EndoscopyDepthEstimation-Pytorch-master2/test_result" --evaluation_data_root "/home/siat/EndoscopyDepthEstimation-Pytorch-master2/test_path/" --phase "test"
Tensorboard visualization at /home/siat/EndoscopyDepthEstimation-Pytorch-master2/test_result/depth_estimation_evaluation_run_6_24_15_55_test_id_N_o_n_e

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1          [-1, 48, 256, 320]           1,344
       BatchNorm2d-2          [-1, 48, 256, 320]              96
                 ...
          Conv2d-178          [-1, 12, 256, 320]          19,452
      DenseBlock-179         [-1, 192, 256, 320]               0
          Conv2d-180           [-1, 1, 256, 320]             193
      FCDenseNet-181           [-1, 1, 256, 320]               0
================================================================
Total params: 1,374,865
Trainable params: 1,374,865
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.94
Forward/backward pass size (MB): 2579.58
Params size (MB): 5.24
Estimated Total Size (MB): 2585.76
----------------------------------------------------------------

Loading /home/siat/EndoscopyDepthEstimation-Pytorch-master2/depth_estimation_train_run_6_22_20_17_test_id_[1]/checkpoint_model_epoch_100_validation_0.09037257875833246.pt ...
Restored model, epoch 101, step 101000
  0%|          | 0/35 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "evaluate.py", line 323, in <module>
    for batch, (colors_1, boundaries, intrinsics, names) in enumerate(test_loader):
  File "/home/siat/anaconda3/envs/my_environment/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 346, in __next__
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/home/siat/anaconda3/envs/my_environment/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/siat/anaconda3/envs/my_environment/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/siat/EndoscopyDepthEstimation-Pytorch-master2/dataset.py", line 470, in __getitem__
    start_h, end_h, start_w, end_w = self.crop_positions_per_seq[folder]
KeyError: '/home/siat/EndoscopyDepthEstimation-Pytorch-master2/example_training_data_root/bag_1/_start_004259_end_004629_stride_25_segment_13'

nan found when training with example data given

Hi! I tried to run train.py with the example data you provided to get a quick understanding of your code. However, it failed to train: I get a NaN loss in every iteration. Is that normal for the example data? I also trained on my own data (I believe my data processing was correct) and got the same problem.
Do you have any clue? Thanks a lot!
[screenshot 2023-04-06 00-13-33]

about dense depth

Hi @lppllppl920, I am wondering if you could explain a bit more about how to get a dense depth map?

  • Is the result of the monocular Depth Estimation Layer a dense depth map?
  • If not, how do you get the depth of the grid mesh (U, V) for the Flow from Depth Layer?

Many thanks.

sparse soft mask

Hi @lppllppl920. From the paper, I read that a soft mask is applied. May I ask where it is applied in the code? I tested M_j in Eq. (6) and Eq. (10); it seems that the mask is a boolean variable. Many thanks.

Verification set

Hello,
first of all, thank you for sharing the code with the public.

I tried your code and, even though I had to make some modifications, I can run the training phase without any problem. But there is no validation set to validate the training result. Is it possible for you to provide a validation set too? Or did I do something wrong? Can the validation also be done using the data in this repository?

Thank you so much.

Questions about README.me

Hello, dear author, there are two files I don't understand well; I hope to get your help!

camera_intrinsics_per_view
Can you explain in detail what the three rows of numbers in camera_intrinsics_per_view mean?

view_indexes_per_point
How is the view_indexes_per_point file obtained? Does it indicate which frames each 3D point is observed in? If I try a new dataset, how do I generate it, and what is the purpose of this file?

Flow from Depth Layer

Hi authors, thanks for your nice work! In your paper, I wasn't able to understand how you arrived at Eq. (7) for U_k and V_k. In particular, why is A = K R_j^k K^{-1} and B = -K t_j^k? (Similarly for Eq. 9.)
It would be super helpful if you could explain it further or point me towards some resources.
Thanks in advance!
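
For other readers, here is the standard pinhole warping derivation that Eq. (7) appears to follow (a sketch; the sign of B depends on the pose convention, as discussed in the _depth_warping issue above). A pixel (u_j, v_j) with depth z_j back-projects to X_j = z_j K^{-1}(u_j, v_j, 1)^T; mapping it into frame k with the relative pose (R_j^k, t_j^k) and projecting gives:

\[
z_k \begin{pmatrix} u_k \\ v_k \\ 1 \end{pmatrix}
  = K\left( R_j^k X_j + t_j^k \right)
  = z_j \underbrace{K R_j^k K^{-1}}_{A} \begin{pmatrix} u_j \\ v_j \\ 1 \end{pmatrix}
  + \underbrace{K\, t_j^k}_{\pm B} .
\]

Dividing through by z_k yields U_k and V_k; A = K R_j^k K^{-1} arises because the back-projection K^{-1} and the re-projection K sandwich the rotation.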

Training stuck validating

Hiya, I am trying to run train.py.

It gets past the first stage, but then it seems to get stuck validating on line 386 after outputting "Validation Epoch 0: 0%| | 0/36 [00:35<?, ?it/s]"

I am running the script with the args:
"python train.py --id_range 1 2 --input_downsampling 4.0 --network_downsampling 64 --adjacent_range 5 30 --input_size 256 320 --batch_size 2 --num_workers 8 --num_pre_workers 8 --validation_interval 1 --display_interval 50 --dcl_weight 5.0 --sfl_weight 20.0 --max_lr 1.0e-3 --min_lr 1.0e-4 --inlier_percentage 0.99 --visibility_overlap 30 --training_patient_id 1 --testing_patient_id 1 --validation_patient_id 1 --number_epoch 100 --num_iter 1000 --architecture_summary --training_result_root "result/" --training_data_root "example_training_data_root/""

Any ideas what is happening?
Thank you for your help

will grey-scale images also work?

Hi. May I ask if this code also works with grey-scale images? Or could you suggest how to modify the code to work with grey-scale images?

Regarding to the reproducibility

Hi, thanks for your work. I have a question about code reproducibility.
To make results reproducible, two CuDNN options must be set, but there is a discrepancy between your code and the PyTorch docs.
In your code,

# Fix randomness for reproducibility
torch.backends.cudnn.deterministic = False
torch.backends.cudnn.benchmark = True

But in PyTorch docs,

torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

Could you please tell me why?
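
For reference, a common seeding recipe that pairs the docs' CuDNN settings with explicit seeds (a general PyTorch pattern, not code from this repository):

import random
import numpy as np
import torch

def seed_everything(seed: int = 0) -> None:
    # Seed every RNG the training loop may touch
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Trade speed for determinism, per the PyTorch reproducibility notes
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False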
