Medical Image Segmentation using Squeeze-and-Expansion Transformers
Hi! Quick question: why does your repo have logic for doing localization?
For instance:
Line 68 in 47cc13e
segtran/code/dataloaders/datasets3d.py
Line 115 in 47cc13e
I looked at both papers and neither mentioned localization during training. Did you find that localization helped improve model performance?
Hi,
Thank you for the great code.
How can the checkpoints be accessed? I would be grateful if you could provide a Google Drive link.
Your advice is appreciated.
Hi Dr. Lee,
Is test/validation implemented during model training? I want to test/validate the ongoing performance during training but could not figure out how to do that from the code. I did see that there are three data files (all.list, train.list and test.list), and I am wondering how to test and report performance during training.
Best
Wendy
Hi Dr. Lee, do you have the code that you used to compare with various baselines in section 5.2 (list of other models) for 2D dataset? Is there a similar comparison for the 3D dataset?
In the REFUGE dataset, the pixel values in each channel of the mask look like this:
channel 0 ---> {0: 243058, 255: 88718}
channel 1 ---> {0: 294881, 255: 36895}
channel 2 ---> {0: 331776}
This means the mask values can be either 0 or 255. (The key is the pixel value; the value is the number of pixels with that pixel value.)
But in the RIM dataset, the pixel values in each channel of the mask can be:
channel 0 ---> {0: 281893, 138: 27, 158: 84, 98: 20, 222: 84, 250: 13, 254: 480, 242: 37, 221: 31, 3: 13, 31: 72, 114: 17, 253: 66, 255: 48551, 225: 21, 19: 21, 35: 13, 95: 61, 115: 9, 194: 26, 83: 39, 27: 29, 233: 21, 59: 49, 218: 17, 11: 23, 94: 5, 154: 10, 170: 21, 193: 13, 234: 6, 226: 4}
channel 1 ---> {0: 318085, 11: 14, 31: 42, 35: 10, 95: 46, 154: 7, 254: 248, 253: 42, 98: 10, 158: 44, 194: 14, 255: 12965, 193: 9, 222: 29, 242: 18, 221: 16, 27: 17, 59: 27, 170: 15, 233: 13, 138: 15, 218: 9, 19: 12, 83: 21, 250: 9, 225: 7, 94: 3, 114: 7, 3: 8, 234: 2, 226: 6, 115: 6}
channel 2 ---> {0: 331776}
We can see that the mask values in the RIM dataset can be any integer between 0 and 255. My question is, how should this be handled?
For example, can we treat any value > 0 as the target and any value == 0 as the background?
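The thresholding rule proposed in this question can be sketched as follows. This is only an illustration, not the repo's preprocessing: the helper name `binarize_mask` and the alternative threshold of 127 are hypothetical. It is also an assumption that the scattered intermediate values come from lossy compression or interpolation along mask edges.

```python
import numpy as np

def binarize_mask(mask, threshold=0):
    """Return a uint8 mask: 1 where the pixel value exceeds `threshold`, else 0."""
    return (np.asarray(mask) > threshold).astype(np.uint8)

# Toy channel with a few intermediate values, like the RIM counts above.
channel = np.array([[0, 3, 138], [255, 254, 0]], dtype=np.uint8)

# Rule from the question: any non-zero pixel counts as the target.
strict = binarize_mask(channel, threshold=0)
# Hypothetical alternative: only pixels brighter than mid-gray count.
lenient = binarize_mask(channel, threshold=127)
```

The two thresholds differ only on the faint pixels (value 3 here), so comparing both results on a few real masks is a quick way to see how much the choice matters.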
Dear sir, we want to use the nnU-Net code on fundus (2D) images. We have run into some problems; could you give us some guidance? Thank you!
We also noticed that you mention: 'It is primarily designed for 3D tasks, but can also handle 2D images after converting them to pseudo-3D.'
Here are our settings and traceback:
The parameters are: --nproc_per_node=1 --master_port=7152 /data/sementation/code/train2d.py --task fundus --ds train --split train --translayers 3 --layercompress 1,1,2,2 --net nnunet --bb resnet101 --maxiter 5000 --bs 32 --noqkbias
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/root/.pycharm_helpers/pydev/_pydev_bundle/pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars) # execute the script
  File "/root/.pycharm_helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/data/sementation/code/train2d.py", line 1434, in <module>
    outputs = net(image_batch)
  File "/data/hliu/anaconda3/install/envs/segtran-master1/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/data/hliu/anaconda3/install/envs/segtran-master1/lib/python3.8/site-packages/nnunet/network_architecture/generic_UNet.py", line 400, in forward
    x = torch.cat((x, skips[-(u + 1)]), dim=1)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 6 but got size 5 for tensor number 1 in the list.
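One common cause of this kind of `torch.cat` size mismatch in U-Net-style networks is an input side length that becomes odd at some encoder stage, so that the upsampled decoder feature no longer matches its skip connection. The sketch below only models that arithmetic. It assumes each downsampling step maps a size n to ceil(n/2) (as a stride-2, kernel-3, padding-1 convolution does) and each decoder step doubles the size. The function names are hypothetical, and the actual pooling configuration used in this run is not shown in the post.

```python
import math

def find_mismatch(size, num_downsamplings):
    """Return (upsampled, skip) for the first decoder stage that disagrees, else None."""
    sizes = [size]
    for _ in range(num_downsamplings):
        # Assumed encoder step: stride-2 conv with kernel 3 and padding 1.
        sizes.append(math.ceil(sizes[-1] / 2))
    # Decoder: start at the deepest feature, double it, compare with the skip.
    for deeper, skip in zip(reversed(sizes[1:]), reversed(sizes[:-1])):
        if deeper * 2 != skip:
            return deeper * 2, skip
    return None

def padded_size(size, num_downsamplings):
    """Smallest size >= `size` that is divisible by 2**num_downsamplings."""
    factor = 2 ** num_downsamplings
    return ((size + factor - 1) // factor) * factor

# A skip of size 5 is halved to 3, then doubled to 6: the pair in the error above.
odd_case = find_mismatch(5, 1)
# A size divisible by 2**5 survives five downsamplings without a mismatch.
clean_case = find_mismatch(padded_size(300, 5), 5)
```

If this is the cause here, padding or resizing the input so that every side is divisible by two raised to the number of downsampling stages should make the shapes line up.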
How can I train with my own dataset? I have prepared the masks and images folders (2D segmentation) in the same layout as fundus, but I am not sure how to generate the JSON file for my dataset before training and testing.
Thanks for your great project!
I downloaded the REFUGE dataset, but the names and number of image files differ from those in this repo! For example, the training part has 360 files with names 0001.jpg, ...
Hi! Quick question: what is the actual input to the network? Is the input the full 2D/3D image, or is the input smaller (perhaps random) sections of the full image?
I think the answer is the full image, but I want to confirm.
Hi, thanks for your excellent work!
I want to re-implement this model. Will you provide the trained models that achieve the SOTA results reported in your paper? That way we could use your models for inference only.
I am especially interested in the models for REFUGE and BraTS, where Segtran performs extraordinarily well.
Best,
Hi Dr Lee,
I got this error when I tried to run the test command using your recently updated Brats iter_8000 checkpoint, any advice?
python3 test3d.py --task brats --split all --bs 5 --ds 2019valid --net segtran --attractors 1024 --translayers 1 --cpdir ./ --iters 8000
Traceback (most recent call last):
File "test3d.py", line 426, in <module>
allcls_avg_metric = test_calculate_metric(iter_nums)
File "test3d.py", line 350, in test_calculate_metric
load_model(net, args, checkpoint_path)
File "test3d.py", line 311, in load_model
if (k not in ignored_keys) and (args2.__dict__[k] != cp_args[k]):
KeyError: 'qk_have_bias'
The error is raised by "load_model(net, args, checkpoint_path)" in the following:
for iter_num in iter_nums:
    if args.checkpoint_dir:
        checkpoint_path = os.path.join(args.checkpoint_dir, 'iter_' + str(iter_num) + '.pth')
        load_model(net, args, checkpoint_path)
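The `KeyError: 'qk_have_bias'` suggests that one of the two argument dictionaries being compared lacks that key, for example because the checkpoint and the script come from different versions of the code. A defensive way to compare the two sets of settings is sketched below. This is not the repo's `load_model`; the function name `find_arg_conflicts` and the sample values are hypothetical.

```python
def find_arg_conflicts(current_args, checkpoint_args, ignored_keys=()):
    """List (key, current, saved) for settings that differ.

    Only keys present in BOTH dictionaries are compared, so an option that
    exists on one side only cannot raise a KeyError.
    """
    conflicts = []
    shared_keys = current_args.keys() & checkpoint_args.keys()
    for key in sorted(shared_keys):
        if key in ignored_keys:
            continue
        if current_args[key] != checkpoint_args[key]:
            conflicts.append((key, current_args[key], checkpoint_args[key]))
    return conflicts

# Hypothetical example: the checkpoint has no 'qk_have_bias' entry.
current = {"net": "segtran", "num_attractors": 1024, "qk_have_bias": True}
saved = {"net": "segtran", "num_attractors": 256}
conflicts = find_arg_conflicts(current, saved)
```

Skipping one-sided keys hides the crash but not the underlying version difference, so it is still worth checking which side is missing the option.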
Hi,
I downloaded the dataset from the link, but the file names are different from those in the data folder.
Quick question regarding the following line:
https://github.com/askerlee/segtran/blob/master/code/train3d.py#L695
The tensor passed to sigmoid has shape (batch size, num classes, H, W, D). Because the sigmoid is applied element-wise, this suggests to me that the classes are not competitive. By this I mean that multiple classes can exist simultaneously at the same voxel. Can you confirm that this is correct?
Thanks in advance!
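The distinction raised in this question can be illustrated with plain Python. This is a generic sketch of element-wise sigmoid versus softmax on one voxel's class logits, with made-up logit values. It does not confirm how the repo's loss or post-processing treats the classes.

```python
import math

def sigmoid(logits):
    """Element-wise sigmoid: each class gets its own independent probability."""
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]

def softmax(logits):
    """Softmax: classes compete, and the probabilities sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three classes at a single voxel.
voxel_logits = [2.0, 1.5, -3.0]
independent = sigmoid(voxel_logits)
competitive = softmax(voxel_logits)

# With sigmoid, more than one class can exceed 0.5 at the same voxel.
active = [p > 0.5 for p in independent]
```

With these logits the first two classes are both active under sigmoid, while softmax forces the three probabilities to share a total of 1.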
Thanks for your great work!
In Figure 2 of your paper, blue blobs and light-colored dots are used to indicate the gradients. Is there any difference between these dots of different colors?
Thanks for your great project! Sir, I want to train on new 2D data starting from a pretrained model. How can I do that?
Hiiii askerlee! Thanks for your nice work and repo!!
I have a question about how to use brats_processing.py to process my BraTS19 or BraTS20 dataset.
After reading the code, I ran something like this for the train and val datasets respectively:
python3 brats_processing.py h5 brats_trainset_root
python3 brats_processing.py label brats_valset_root
Is this usage right? The label step seems to have no effect on the data, and I'm confused about what python3 brats_processing.py label brats_valset_root really does.
Hello,
I ran this command with batch size as 2 on the sample data provided, and got a "CUDA out of memory" error. Your advice is appreciated.
~/segtran/code$ python3 train3d.py --task brats --split all --bs 1 --maxiter 10000 --randscale 0.1 --net segtran --attractors 1024 --translayers 1
Traceback (most recent call last):
File "train3d.py", line 683, in <module>
outputs = net(volume_batch)
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/ubuntu/segtran/code/networks/segtran3d.py", line 541, in forward
vfeat_fused_fpn = self.out_fpn_forward(batch_base_feats, vfeat_fused)
File "/home/ubuntu/segtran/code/networks/segtran3d.py", line 428, in out_fpn_forward
align_corners=False)
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/functional.py", line 3712, in interpolate
return torch._C._nn.upsample_trilinear3d(input, output_size, align_corners, scale_factors)
RuntimeError: CUDA out of memory. Tried to allocate 1.15 GiB (GPU 0; 15.78 GiB total capacity; 11.45 GiB already allocated; 900.00 MiB free; 13.54 GiB reserved in total by PyTorch)
Dear Shaohua,
While checking the training process, I see the Dice score is around 9% after percent of total epochs.
Total training on 8 NVIDIA RTX 8000 GPUs is estimated to take around 25 hours.
What is the cause of such a low Dice value?
Waiting to hear from you.
Thanks
hi
Thanks for the great project
What is the hardware required to run the code?
And what is the approximate running time?
Hello, thank you for this wonderful project. I tried to run the atria task and got the following error. Any advice?
python3 train3d.py --task atria --split all --bs 2 --maxiter 10000 --randscale 0.1 --net segtran --attractors 1024 --translayers 1
Traceback (most recent call last):
File "train3d.py", line 549, in <module>
xyz_permute=args.xyz_permute)
TypeError: __init__() got an unexpected keyword argument 'mask_num_classes'
I would like to try a smaller patch size (orig_patch_size) such as (80,80,80) so that I could possibly increase the batch_size to 2. It gave this error, any advice?
Traceback (most recent call last):
File "train3d.py", line 781, in <module>
outputs = net(volume_batch)
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/ubuntu/segtran/code/networks/segtran3d.py", line 488, in forward
feats_dict = self.backbone.extract_features(fakeRGB_batch)
File "/home/ubuntu/segtran/code/networks/aj_i3d/aj_i3d.py", line 331, in extract_features
pooled_feat = self.avg_pool(x)
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/torch/nn/modules/pooling.py", line 701, in forward
self.padding, self.ceil_mode, self.count_include_pad, self.divisor_override)
RuntimeError: input image (T: 10 H: 5 W: 5) smaller than kernel size (kT: 2 kH: 7 kW: 7)
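The numbers in this error allow a rough size estimate. The sketch below assumes the backbone reduces each axis by plain integer division. Under that assumption, an 80-voxel patch side shrinking to H: 5 implies a total stride of 16 on that axis, and T: 10 implies a total stride of 8. These strides are inferred from the traceback only, not read from the backbone code, and the helper names are hypothetical.

```python
def feature_size(input_size, total_stride):
    """Feature-map length along one axis, assuming plain floor division."""
    return input_size // total_stride

def min_input_size(kernel_size, total_stride):
    """Smallest input length whose feature map is at least `kernel_size` long."""
    return kernel_size * total_stride

# Strides inferred from the error message for an (80, 80, 80) patch.
h_stride, t_stride = 16, 8
h_feat = feature_size(80, h_stride)   # matches "H: 5" in the error
t_feat = feature_size(80, t_stride)   # matches "T: 10" in the error

# The pooling kernel is (kT: 2, kH: 7, kW: 7).
min_h = min_input_size(7, h_stride)
min_t = min_input_size(2, t_stride)
```

Under these assumptions the in-plane side would need to be at least 112 before the 7x7 average pooling fits, which is why an 80-voxel patch fails. Real padding behaviour in the backbone may shift this bound slightly.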
Hi, sir. I moved the project to a new computer, and now I get ModuleNotFoundError: No module named 'networks.segtran2d'. I would appreciate a hint. Thank you!
Hi dear Askerlee,
First of all, thank you for your valuable code.
Please explain how I can test this code on the BraTS 2020 dataset.
I placed that dataset in the brats path and set the parameters in train3d.py, but the .h5 files do not exist in the case paths.
Waiting to hear from you.
Thanks
Hi. Thanks for the great work. I was trying to reproduce the baseline. I used REFUGE as the source domain and trained on the REFUGE train and valid data. Then I tested this model on RIM-ONE without any adaptation. To simplify the task I only did disc segmentation, i.e., I treated both cup and disc as the disc. In this setting, the upper bound I got on RIM-ONE (i.e., trained and tested on RIM-ONE) is ~0.89 (Dice), and the lower bound (trained on REFUGE and tested on RIM-ONE) is only ~0.55. Compared with the results in the paper, this shows a big gap: the upper bound is lower than the paper's few-shot learning results, and the lower bound is much lower than its zero-shot learning results. I was wondering if you could provide more detail on data preprocessing, training details, etc.
Hello! Sorry to bother you. I've just cloned the project on Colab and uploaded 13 images and masks from the CVC-300 dataset to the project. But when I run main.py, it shows:
Epoch : 1
3/3 [==============================] - 23s 2s/step - loss: 0.6957 - dice_coef: 0.2971 - jacard: 0.1746 - accuracy: 0.5558
1/1 [==============================] - 4s 4s/step
Traceback (most recent call last):
File "main.py", line 234, in <module>
trainStep(model, X_train, Y_train, X_test, Y_test, epochs=150, batchSize=4)
File "main.py", line 216, in trainStep
evaluateModel(model,X_test, Y_test,batchSize)
File "main.py", line 147, in evaluateModel
plt.imshow(X_test[i])
IndexError: index 3 is out of bounds for axis 0 with size 3
I have no idea how to deal with that.
Hope you can help me with that. So sorry to bother you.🙏🙏🙏
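The `IndexError` says that `X_test` holds only 3 samples while the plotting code asked for index 3, i.e. a fourth one. The body of `evaluateModel` is not shown in this post, so the following is only a sketch under the assumption that it previews a fixed number of test images. The helper name `preview_indices` and the count of 10 are hypothetical.

```python
def preview_indices(num_samples, max_previews):
    """Indices that are safe to preview: never more than the samples available."""
    return range(min(num_samples, max_previews))

# With 13 images in total, the test split here ended up with only 3 samples.
num_test_samples = 3
requested_previews = 10  # hypothetical fixed preview count
safe = list(preview_indices(num_test_samples, requested_previews))
```

Looping over `preview_indices(len(X_test), n)` instead of a fixed `range(n)` avoids the crash on small datasets. Using a larger test split would also work.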
Hi
Dear Shaohua,
Is there any part of this code where I can simply set the trained model path and a new sample path from an external .py file, and then get results with these parameters without going through argument parsing?
When I train on the polyp datasets, an error occurs. The logs are as follows:
10 epochs, 1002 itertations each.
0%| | 0/10 [00:00<?, ?it/s]
Image scales: 8x8. Voxels: [1, 1600, 1792]
outputs shape: torch.Size([1, 2, 320, 320])
mask_batch shape: torch.Size([1, 3, 320, 320])
0%| | 0/10 [00:00<?, ?it/s]
Traceback (most recent call last):
File "train2d.py", line 1100, in <module>
mask_batch.permute([0, 2, 3, 1]))
File "/root/anaconda3/envs/python377/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/root/anaconda3/envs/python377/lib/python3.7/site-packages/torch/nn/modules/loss.py", line 617, in forward
reduction=self.reduction)
File "/root/anaconda3/envs/python377/lib/python3.7/site-packages/torch/nn/functional.py", line 2433, in binary_cross_entropy_with_logits
raise ValueError("Target size ({}) must be the same as input size ({})".format(target.size(), input.size()))
ValueError: Target size (torch.Size([1, 320, 320, 3])) must be the same as input size (torch.Size([1, 320, 320, 2]))
Could you tell me how to solve this problem? thank u!
Hi~please help me figure out some questions.
many thanks in advance
Hi Dr Lee,
I tried to run test3d.py on a set of 110 test images a few times, but it always exits unexpectedly after around 70 images without any error message. So far, it has never finished all 110 images.
Any advice?
Best,
Wendy
What's the meaning of BraTS19_CBICA_ASK_1_t1ce.nii?
What's the meaning of BraTS19_CBICA_ASK_1_t2.nii?
and what's the difference between BraTS19_CBICA_ASK_1_t1ce.nii and BraTS19_CBICA_ASK_1_t1.nii
Hi again.
I also tested the multi-GPU version of train.py, and training has worked fine so far.
However, I need to use the trained model for testing on new samples.
Did you check the test.py file?
The error is shown below:
TypeError: __init__() got an unexpected keyword argument 'mask_num_classes'
There are some other errors too, such as undefined names: visualize_model, evalrobustness, testloader, ...
Please check test.py.
Waiting to hear from you.
Thanks
Thanks for sharing this great project. I ran into some problems when using it for LUNA16 dataset segmentation. The training loss is low, but the test results are: dice: 0.000, jc: 0.000, hd: nan, asd: nan. Can you give me some advice? Thanks.
Dear author:
Which pretrained CNN backbone does the REFUGE20 checkpoint use?
When I try test2d.py, the following error occurs:
python3 test2d.py --task fundus --split all --ds valid2 --net segtran --bb resnet101 --translayers 3 --layercompress 1,1,2,2 --cpdir ../model/segtran-fundus-train,valid,test,drishti,rim-05011826 --iters 9500 --outorigsize
'fundus' mean/std loaded from 'fundus-cropped-gray0.5-stats.json'
'all' 400 samples of size 576 chosen (total 800) in '../data/fundus/valid2'
'args' orig in-feat: 2048, in-feat: 2048, out-feat: 512, in-scheme: AN, out-scheme: AN, translayer_dims: [2048, 2048, 1024, 512]
Namespace(ablate_multihead=False, attn_clip=500, backbone_type='resnet101', batch_size=8, bb_feat_upsize=True, binarize=False, calc_flop=False, checkpoint_dir='../model/segtran-fundus-train,valid,test,drishti,rim-05011826', debug=False, device='cuda', do_remove_frag=False, ds_class='SegCrop', ds_name='valid2', ds_split='all', eval_robustness=False, gpu='0', gray_alpha=0.5, has_FFN_in_squeeze=False, in_fpn_layers='34', in_fpn_scheme='AN', in_fpn_use_bn=False, iters='9500', job_name='fundus-valid2', mean=[0.578, 0.429, 0.318], mid_type='shared', mince_channel_props=None, mince_scales=None, net='segtran', num_attractors=256, num_classes=3, num_modalities=0, num_modes=4, num_translayers=3, num_workers=4, orig_input_size=(576, 576), out_fpn_layers='1234', out_fpn_scheme='AN', out_origsize=True, output_upscale=2.0, patch_size=(288, 288), polyformer_mode=None, pos_bias_radius=7, pos_code_type='lsinu', pos_code_weight=1.0, qk_have_bias=True, reload_mask=False, reshape_mask_type=None, robust_aug_degrees=[0.5, 1.5], robust_aug_types=None, robust_ref_cp_path=None, robust_sample_num=120, robustness_augs=None, sample_num=-1, save_ext='png', save_features_img_count=0, save_results=True, std=[0.184, 0.162, 0.144], task_name='fundus', test_interp=None, tie_qk_scheme='none', trans_output_type='private', translayer_compress_ratios=[1.0, 1.0, 2.0, 2.0], use_exclusive_masks=False, use_global_bias=False, use_mince_transformer=False, use_pretrained=True, use_squeezed_transformer=True, verbose_output=False, vis_layers=None, vis_mode=None)
Segtran Fusion Encoder with 3 trans-layers
Learnable Sinusoidal positional encoding
Fusion0-in-squeeze: v_has_bias: False, has_FFN: False, has_input_skip: False
Fusion0-in-squeeze in_feat_dim: 2048, feat_dim: 2048, qk_have_bias: True
Fusion0-squeeze-out: v_has_bias: False, has_FFN: True, has_input_skip: False
Fusion0-squeeze-out in_feat_dim: 2048, feat_dim: 2048, qk_have_bias: True
Fusion1-in-squeeze: v_has_bias: False, has_FFN: False, has_input_skip: False
Fusion1-in-squeeze in_feat_dim: 2048, feat_dim: 2048, qk_have_bias: True
Fusion1-squeeze-out: v_has_bias: False, has_FFN: True, has_input_skip: False
Fusion1-squeeze-out in_feat_dim: 2048, feat_dim: 1024, qk_have_bias: True
Fusion2-in-squeeze: v_has_bias: False, has_FFN: False, has_input_skip: False
Fusion2-in-squeeze in_feat_dim: 1024, feat_dim: 1024, qk_have_bias: True
Fusion2-squeeze-out: v_has_bias: False, has_FFN: True, has_input_skip: False
Fusion2-squeeze-out in_feat_dim: 1024, feat_dim: 512, qk_have_bias: True
Downloading: "https://download.pytorch.org/models/resnet101-5d3b4d8f.pth" to /root/.cache/torch/hub/checkpoints/resnet101-5d3b4d8f.pth
**resnet101 created**
Parameter Count: 172737073
**args[backbone_type]=resnet101, checkpoint args[backbone_type]=eff-b4, inconsistent!**
In test2d.py I found: "parser.add_argument('--bb', dest='backbone_type', type=str, default='eff-b4', help='Segtran backbone')"
Then, in resnet.py, I see that when test2d.py runs, it downloads resnet101 from the web. But it still fails with this error, and I don't know how to solve it. Could you tell me how to solve this problem? Thank you!
Could you provide the training parameters (max-iteration, batch-size, etc.) of 'Extension of nnU-Net' (shown in Table 5.)?
Can you explain your inference process? Is it done patch by patch, with the patches then merged together, or on the full image at once?
Dr Lee, do you have, or have you implemented, metrics/analysis on how sensitivity changes with different fp_per_case? - Thanks
Nice job! I want to use multiple GPUs for training. Can you tell me which parameter to modify to enable multi-GPU training? Thanks!
I am trying to use transunet as the --net argument. However, it throws an error, as depicted in the image. The error reads: "Traceback (most recent call last):
File "/content/segtran/code/train2d.py", line 976, in <module>
transunet_config = TransUNet_CONFIGS[args.backbone_type]
KeyError: 'eff-b4'".
Even if I change the backbone to eff-b1 or resnet101, a new KeyError is raised for those backbones. Please help me out.