simonvandenhende / multi-task-learning-pytorch

PyTorch implementation of multi-task learning architectures, incl. MTI-Net (ECCV2020).

License: Other

Python 98.66% MATLAB 1.34%

multi-task-learning computer-vision pytorch nyud pascal segmentation scene-understanding eccv2020

multi-task-learning-pytorch's Issues

Depth performance using ResNet-50 (Single-task performance)

Thank you very much for sharing the wonderful code!

I ran into a problem when running the code: with ResNet-50 I get a segmentation accuracy close to the reported one (43.5 mIoU), but the depth result is noticeably worse (0.614 RMSE). I have read the related issues (#1) and (#5), but they do not resolve my case. Could you please give me some suggestions for the single-task depth experiment?

Thanks and Regards

Epoch 100/100
----------
Adjusted learning rate to 0.00000
Train ...
Epoch: [99][ 0/99]      Loss depth 1.0003e-01 (1.0003e-01)      Loss Total 1.0003e-01 (1.0003e-01)
Epoch: [99][25/99]      Loss depth 1.4344e-01 (1.2583e-01)      Loss Total 1.4344e-01 (1.2583e-01)
Epoch: [99][50/99]      Loss depth 1.3219e-01 (1.2832e-01)      Loss Total 1.3219e-01 (1.2832e-01)
Epoch: [99][75/99]      Loss depth 1.2006e-01 (1.3160e-01)      Loss Total 1.2006e-01 (1.3160e-01)
Results for depth prediction
rmse           0.2232
log_rmse       0.0887
Evaluate ...
Save model predictions to ./results/NYUD/resnet50/single_task/depth/results
Files already downloaded
Initializing dataloader for NYUD val set
Number of dataset images: 654
Evaluate the saved images (depth)
Evaluating depth: 0 of 654 objects
Evaluating depth: 500 of 654 objects
Results for Depth Estimation
rmse           0.6204
log_rmse       0.2119
No new best depth estimation model 0.614 -> 0.620
Checkpoint ...
Evaluating best model at the end
Save model predictions to ./results/NYUD/resnet50/single_task/depth/results
Files already downloaded
Initializing dataloader for NYUD val set
Number of dataset images: 654
Evaluate the saved images (depth)
Evaluating depth: 0 of 654 objects
Evaluating depth: 500 of 654 objects
Results for Depth Estimation
rmse           0.6204
log_rmse       0.2119

[Question] About dataset condition for MTL

Thank you for your awesome work.
Your work will be very helpful to everyone interested in MTL.

I'm about to study MTL and have one question.

I think the dataset for MTL should be in the form {Input: X(i), GT: Y_task1(i), Y_task2(i), ..., Y_taskT(i)}.

However, I think this condition is difficult to satisfy in a real-world environment.
When we have to train on task-specific datasets D_task1 {Input: X_task1, GT: Y_task1}, D_task2 {Input: X_task2, GT: Y_task2} simultaneously, how do we do MTL?

For example, suppose we aim to set up MTL for both salient object detection and depth estimation.
For the salient object detection task, we use saliency labels from the Pascal VOC dataset.
For the depth estimation task, we use depth-map labels from the NYUD dataset.
(The two datasets consist of entirely different input images; Pascal VOC does not contain depth-map labels and NYUD does not contain saliency labels.)

Under these conditions, how do we set up MTL?
Does anyone know of MTL approaches for task-specific datasets, or related work?
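
One common way to train with disjoint, task-specific datasets is to alternate batches between the datasets and, for each batch, compute only the loss of the task whose labels are available, so that the shared backbone still receives updates from every task. The sketch below shows only the batch scheduling. It illustrates that general idea and is not something this repository is known to implement; `alternate_batches` and the toy loaders are hypothetical names.

```python
def alternate_batches(loaders):
    """Yield (task_name, batch) pairs, taking one batch per task in turn.

    `loaders` maps a task name to any iterable of batches (for example a
    DataLoader). A task is dropped once its iterable is exhausted, and
    iteration ends when every task has been dropped.
    """
    iterators = {task: iter(loader) for task, loader in loaders.items()}
    while iterators:
        # Iterate over a copy of the keys so exhausted tasks can be removed.
        for task in list(iterators):
            try:
                batch = next(iterators[task])
            except StopIteration:
                del iterators[task]
                continue
            yield task, batch


# Toy stand-ins for two task-specific datasets of different sizes.
toy_loaders = {
    "saliency": ["voc_batch_0", "voc_batch_1", "voc_batch_2"],
    "depth": ["nyud_batch_0"],
}
schedule = list(alternate_batches(toy_loaders))
```

In a real training loop, each yielded task name would select the matching head and loss, while the backbone stays shared across tasks.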

Adaptation to classification and detection problem

Hi @SimonVandenhende,

Thank you for sharing your code! I was wondering whether you have also tested the network on other tasks. I am trying to train jointly for 3D object detection and classification plus road lane detection, and would like to know whether you think MTI-Net is suitable for this.

Thank you!

Cityscapes depth estimation

Could you please help with the following questions:

In the survey paper (Revisiting Multi-Task Learning in the Deep Learning Era), it is mentioned that the depth maps of Cityscapes were generated using SGM. Would it be possible to provide the code for this?
Is it the depth map that is generated, or the disparity?
Are the disparity maps made available by Cityscapes the ones used? If so, do the networks predict depth directly, or do they predict disparity, which is then converted to depth?
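
Whichever quantity the networks predict, the conversion between the two is standard stereo geometry: depth = baseline * focal length / disparity. Below is a minimal sketch, assuming the 16-bit PNG encoding described in the Cityscapes dataset documentation (d = (p - 1) / 256 for a pixel value p > 0, with p = 0 marking an invalid measurement). The function names are hypothetical, and the baseline and focal length must be read from the camera calibration file of each image.

```python
def decode_disparity(pixel_value):
    """Decode one raw 16-bit disparity PNG value; None means 'invalid'."""
    if pixel_value == 0:
        return None
    return (pixel_value - 1) / 256.0


def disparity_to_depth(disparity, baseline_m, focal_px):
    """Convert a disparity in pixels to a depth in meters.

    baseline_m: stereo baseline in meters (from the calibration file).
    focal_px:   horizontal focal length in pixels (from the calibration file).
    """
    if disparity is None or disparity <= 0:
        return None  # no valid stereo match at this pixel
    return baseline_m * focal_px / disparity
```

This does not answer which of the two the survey's networks regress; it only shows that either choice can be mapped to the other once the calibration is known.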

Thank you.

Implementation of decoder-focused MTL methods based on ResNet-50

Hi,

It's a great piece of work, and thanks for making the code public!

Do you have any plans to release the implementation, hyperparameters, and pretrained weights for the decoder-focused MTL methods based on ResNet-50, as described in Tab. 5(a) of your survey paper?

Thanks in advance!

MTL+Resnet50+nyud

Thank you for sharing the code. During multi-task learning, I encountered the following error while training ResNet50 on NYUD dataset.

RuntimeError: Given groups=1, weight of size 64 3 7 7, expected input[8, 480, 640, 3] to have 3 channels, but got 480 channels instead

Hyperparams for HRNet-48

Please could you let me know the hyperparams used to train the HRNet-48 model from your paper (both for the 45.7% mIoU and the ~49% mIoU scores)? I have tried really hard to train HRNet-48 on single task in my repository, but it doesn't go beyond 44.8% mIoU.

Thank you.

the trained model

Hi, could you please release the trained model? Thank you!

ResNet-50 single task baseline hyperparameters

Hi there, thank you for open-sourcing your amazing work.

I trained the ResNet-50 single-task baseline for the segmentation task using the config file "configs/nyud/resnet50/semseg.yml". However, I get at best 40.4 mIoU on the NYUD-v2 dataset, which is lower than the 43.9 reported in the survey. Could you please provide the hyperparameters used in the survey, or some hints on how to reach 43.9 mIoU?

Thanks in advance.

Setup issue

I am getting a path error since a part of the path is doubled in the code when running:

python main.py --config_env configs/env.yml --config_exp configs/pascal/hrnet18/mti_net.yml

The path looks like this:

../../../data2/yd/mti/datasets/PASCAL_MT/../../../data2/yd/mti/datasets/PASCAL_MT/human_parts/2008_000008.mat

For us it seems like there is an error in data/pascal_context.py here:
In line 111 the part_gt_dir is defined as an extension to self.root:

part_gt_dir = os.path.join(self.root, 'human_parts')

However, in line 172 self.root is joined with part_gt_dir again, i.e. with an extension of itself:

_human_part = os.path.join(self.root, part_gt_dir, line + ".mat")
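
The doubling described above can be reproduced in isolation with a short sketch. The directory names are placeholders rather than the reporter's actual paths. It also suggests why the problem only shows up with a relative root: `os.path.join` discards all earlier components when a later one is absolute, so an absolute `root` hides the redundant join.

```python
import os

# Placeholder relative dataset root, standing in for self.root.
root = os.path.join("..", "datasets", "PASCAL_MT")
part_gt_dir = os.path.join(root, "human_parts")  # as on line 111
line = "2008_000008"

# As on line 172: root is joined with a path that already starts with root.
doubled = os.path.join(root, part_gt_dir, line + ".mat")

# Joining from part_gt_dir alone gives the intended path.
fixed = os.path.join(part_gt_dir, line + ".mat")

# With an absolute root the redundant join is harmless, because the
# absolute part_gt_dir makes os.path.join drop the leading component.
abs_root = os.path.abspath(root)
abs_part_gt_dir = os.path.join(abs_root, "human_parts")
abs_joined = os.path.join(abs_root, abs_part_gt_dir, line + ".mat")
```

Dropping `self.root` from the second join, or configuring an absolute dataset root, would both avoid the doubled path under these assumptions.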

Backbone

Hey, you reported the best result of NYUD with MTI-Net(HRNet-48), and in your code, the backbone in configs is HRNet-w18. If I want to reproduce the best result, is just changing HRNet-w18 to HRNet-48 enough? Thanks.

hrnet18+padnet

Thank you very much for sharing the wonderful code! Your work is definitely very helpful for the MTL community.

I am contacting you because I am trying to reproduce the hrnet18+padnet result on the NYUD dataset. Specifically, I:

downloaded the pretrained model hrnet_w18_small_model_v2.pth
ran python main.py --config_env configs/env.yml --config_exp configs/nyud/hrnet18/pad_net.yml

However, the performance is only mIoU 33.4665 for semantic segmentation and 0.7267 for depth, which does not match the result in your paper. Do you have any idea why?

Looking forward to hearing from you soon!
Thank you!

How it works with Batch Normalization?

When training dataloader by dataloader, memory is sometimes limited, so each dataloader can only use a small batch. Does this hurt the performance of batch normalization? How can this problem be fixed?

About the pre-trained model and data augmentation

Thank you for sharing your work on multi-task learning, which is very useful!

You mention in Sec. 4.1.4 (Training Setup) that you used pre-trained models, e.g., a pre-trained ResNet-50, and data augmentation. Is it therefore correct that the NYUD-v2 results reported in Table 5(c) of your paper are also based on a pre-trained ResNet-50 model with data augmentation? Also, do the pre-trained models and data augmentation improve the performance much?

Thanks for your patience again!

How to calculate the edges in NYUD?

Hello, thanks for your great work. Could you please tell me how you obtain the labels for edge prediction in datasets like NYUD?

Implementation of task balancing methods

Thank you very much for releasing the code! Is there an implementation of task balancing methods such as Uncertainty, GradNorm, and DWA? Thanks!

About PAP-Net

Hello, thanks for sharing your work. In your paper Multi-Task Learning for Dense Prediction Tasks: A Survey, I found that you've tried PAP-Net, but I can't find the code in your project. How can I get it?

About task balancing

Thank you for your excellent work and open-source code.

Could you please provide some code for task balancing, such as uncertainty weighting, GradNorm, and multi-objective optimization? Your paper has experimental results for these methods. Thank you very much!
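
As a pointer for readers with the same question, one of the balancing schemes mentioned in these issues, Dynamic Weight Averaging (DWA), is simple enough to sketch without any framework. The code below follows the formulation published by Liu et al.: each task weight is a softmax over the ratio of its two most recent epoch losses, scaled so the weights sum to the number of tasks. It is not taken from this repository, and `dwa_weights` is a hypothetical name.

```python
import math


def dwa_weights(prev_losses, prev_prev_losses, temperature=2.0):
    """Compute DWA task weights from the losses of the last two epochs.

    prev_losses / prev_prev_losses map task name -> average epoch loss at
    epochs t-1 and t-2. A task whose loss is decreasing more slowly gets a
    larger weight. For the first two epochs, the usual convention is to use
    a weight of 1 for every task instead of calling this function.
    """
    tasks = list(prev_losses)
    # Relative descending rate of each task's loss.
    ratios = {t: prev_losses[t] / prev_prev_losses[t] for t in tasks}
    exps = {t: math.exp(ratios[t] / temperature) for t in tasks}
    norm = sum(exps.values())
    # Scale so that the weights sum to the number of tasks.
    return {t: len(tasks) * exps[t] / norm for t in tasks}


# Toy example: semseg improved quickly, depth barely improved.
weights = dwa_weights(
    prev_losses={"semseg": 0.5, "depth": 0.9},
    prev_prev_losses={"semseg": 1.0, "depth": 1.0},
)
```

The resulting weights would multiply the per-task losses before summing them into the total loss.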

About fixed weights from a grid search experiments setting

Hi, I am very impressed by your survey on MTL, from which I have learned a lot.
I am currently working on an MTL project, so I am very curious about the grid search experiments for the fixed weights.
I have not found details about this in your paper or in this repo. Could you give me more information on this?
What exactly are those grid search weights? Did you train and evaluate the MTL network with every combination of those weights? If I want to find the best weights for my MTL network, do I need to run the same experiments? Could you give me some suggestions on this?
Thank you so much!
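
For reference, an exhaustive grid search over fixed loss weights can be sketched as follows. The candidate values and the `train_and_evaluate` callback are hypothetical: the grid the survey actually used is not stated in this thread, and in practice the callback would train a full model and return a validation score where higher is better.

```python
from itertools import product


def weight_grid(tasks, candidates):
    """Yield every assignment of one candidate weight to each task."""
    for combo in product(candidates, repeat=len(tasks)):
        yield dict(zip(tasks, combo))


def grid_search(tasks, candidates, train_and_evaluate):
    """Return the weight assignment with the highest score, and that score."""
    best_weights, best_score = None, float("-inf")
    for weights in weight_grid(tasks, candidates):
        score = train_and_evaluate(weights)  # one full training run per call
        if score > best_score:
            best_weights, best_score = weights, score
    return best_weights, best_score
```

The cost is len(candidates) ** len(tasks) training runs, which is why such searches are usually restricted to a few candidate values per task.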

PADNet loss scheme interpolation

The PADNet paper mentions on page 3 that the intermediate loss functions L1 to L4 are computed by re-scaling the ground-truth map to 1/4 resolution.

However, the implementation of the PADNet loss scheme upsamples the initial predictions to the img size before applying the loss. This seems to be in contrast to what the paper reports.

MTI-Net NYUD task, some files missing

Hi, I'm running your code, but I find that some files are not included in your repository:

FileNotFoundError: [Errno 2] No such file or directory: './results/NYUD/hrnet_w18/single_task/semseg/results/NYUD_test_semseg.json'

Could you please provide these files?
Thanks a lot!

MTI-Net + Resnet FPN backbone

Hi,
Thanks for your code!

Could you provide the code for MTI-Net with the ResNet FPN backbone from your paper?
By the way, did you use pretrained weights for resnet18-fpn?
If so, where can I find the weight files?

Thanks!

About dataset training

Is MTL training done sequentially on multiple datasets? Won't this lead to sub-optimal results for some tasks?

ResNet50 and PAD-net

First of all, thank you for opening up the code. I have recently been learning about multi-task learning. From your paper "Multi-Task Learning for Dense Prediction Tasks: A Survey", it appears that on the NYUD-v2 dataset, PAD-Net based on ResNet-50 gets the best results, but your code does not support this configuration. Could you provide it, or tell me whether the current code can be made to do this?

About the code of pix acc & mean acc for segmentation and the code of rel & \delta 1 & \delta 2 & \delta 3 for depth

Hi, I noticed that this repository doesn't include the code for pix acc & mean acc for segmentation, or the code for rel & \delta 1 & \delta 2 & \delta 3 for the depth task. But in your paper you report results for these metrics.

Could you provide this evaluation code?

Thank you so much :) 👍
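
Until an official answer is available, the commonly used definitions of these metrics can be sketched in plain Python over flattened pixel lists. This is an illustration under stated assumptions, not the evaluation code behind the paper: the function names, the `ignore_index` default, and the masking of non-positive depths are choices made here.

```python
def depth_metrics(pred, gt):
    """Absolute relative error and delta < 1.25**k accuracies (k = 1, 2, 3)."""
    # Keep only pixels with a valid (positive) ground truth and prediction.
    valid = [(p, g) for p, g in zip(pred, gt) if g > 0 and p > 0]
    if not valid:
        raise ValueError("no valid depth pixels")
    n = len(valid)
    metrics = {"abs_rel": sum(abs(p - g) / g for p, g in valid) / n}
    for k in (1, 2, 3):
        threshold = 1.25 ** k
        hits = sum(max(p / g, g / p) < threshold for p, g in valid)
        metrics["delta%d" % k] = hits / n
    return metrics


def seg_accuracies(pred, gt, num_classes, ignore_index=255):
    """Pixel accuracy and mean per-class accuracy for label lists."""
    correct = [0] * num_classes
    total = [0] * num_classes
    for p, g in zip(pred, gt):
        if g == ignore_index:
            continue  # unlabeled pixels do not count
        total[g] += 1
        correct[g] += int(p == g)
    pix_acc = sum(correct) / sum(total)
    # Average only over classes that occur in the ground truth.
    per_class = [c / t for c, t in zip(correct, total) if t > 0]
    mean_acc = sum(per_class) / len(per_class)
    return pix_acc, mean_acc
```

For whole datasets, the counts would be accumulated over all images before dividing, rather than averaging per-image values.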

Can we do Multiple Classification task

Hello SimonVandenhende,

Kudos on the great work. I just want to know whether this repo can be used to train multiple classification tasks, for example vehicle make, color, orientation, and model, with each of the four attributes as an individual task.

regards
akirs

Image visualizations

Hi @SimonVandenhende , could you let me know if there's a tool in this repo which would help generate colorful pascal images?

Thanks!

Human Parts

Hello, thank you very much for your outstanding contribution. I easily found the method of coloring the semantic segmentation prediction map on the Internet, but there are few methods for coloring Pascal-context human body part segmentation. I want to know how you color the human body part segmentation prediction map?
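
One option, offered here as a sketch rather than as the author's method, is to reuse the standard PASCAL VOC palette generator, which assigns a distinct color to every integer label and therefore also works for human-part label maps. The function names are hypothetical.

```python
def voc_colormap(num_labels=256):
    """Build the standard PASCAL VOC palette as a list of (r, g, b) tuples."""
    palette = []
    for label in range(num_labels):
        r = g = b = 0
        value = label
        # Spread the bits of the label over the high bits of r, g and b.
        for shift in range(8):
            r |= ((value >> 0) & 1) << (7 - shift)
            g |= ((value >> 1) & 1) << (7 - shift)
            b |= ((value >> 2) & 1) << (7 - shift)
            value >>= 3
        palette.append((r, g, b))
    return palette


def colorize(label_map, palette):
    """Map a 2D list of integer labels to a 2D list of (r, g, b) tuples."""
    return [[palette[label] for label in row] for row in label_map]


palette = voc_colormap()
colored = colorize([[0, 1], [2, 15]], palette)
```

The colored rows can then be written out with any image library; which color ends up on which body part depends on the label ids used by the dataset.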

cannot find HRNet Pre-trained model

Hi, I can't find the pre-trained model hrnet_w18_small_model_v2.pth at https://github.com/HRNet/HRNet-Image-Classification. Could you please upload this pre-trained weight again? Thanks~

[question] where is hrnet_w18_small_model_v2.pth?

Sorry to bother you.
In https://github.com/HRNet/HRNet-Image-Classification/releases/tag/PretrainedWeights, hrnet_w18_small_model_v2.pth doesn't exist.

Only the following files exist:

HRNet_W18_C_cosinelr_cutmix_300epoch.pth.tar
HRNet_W18_C_pretrained.pth
HRNet_W18_C_ssld_pretrained.pth

so which pth is right?

About 'seism'

Hi! Thanks for sharing the code.
I have one question about training the 'PADNet' model: is 'seism' necessary? Can I train without it?
Thank you for your answer.

Question on Multi Task Evaluation Criteria

Your survey on MTL is awesome! Amazing work. I have a quick question: in Section 4.2.1, Evaluation Criterion, you mention:

where li = 1 if a lower value means better performance for metric Mi of task i, and 0 otherwise.

But in the function calculate_multi_task_performance

    for task in tasks:
        mtl = eval_dict[task]
        stl = single_task_dict[task]
        
        if task == 'depth': # rmse lower is better
            mtl_performance -= (mtl['rmse'] - stl['rmse'])/stl['rmse']

        elif task in ['semseg', 'sal', 'human_parts']: # mIoU higher is better
            mtl_performance += (mtl['mIoU'] - stl['mIoU'])/stl['mIoU']

        elif task == 'normals': # mean error lower is better
            mtl_performance -= (mtl['mean'] - stl['mean'])/stl['mean']

        elif task == 'edge': # loss lower is better
            mtl_performance += (mtl['odsF'] - stl['odsF'])/stl['odsF']

        else:
            raise NotImplementedError

The branch task == 'edge' is commented with # loss lower is better, but it is adding instead of subtracting. Why is that?
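
A possible reading of the snippet, offered as an interpretation and not as a statement from the authors: odsF is an F-measure, for which higher is better, so the `+=` agrees with the survey's criterion and the inline comment is the inconsistent part. The sketch below restates the criterion, the average over tasks of (-1)^l_i * (M_m,i - M_b,i) / M_b,i, with the direction of each metric made explicit. The `METRICS` table and the function name are hypothetical.

```python
# Task -> (metric key, whether a lower value is better, i.e. l_i = 1).
METRICS = {
    "depth": ("rmse", True),
    "normals": ("mean", True),
    "semseg": ("mIoU", False),
    "sal": ("mIoU", False),
    "human_parts": ("mIoU", False),
    "edge": ("odsF", False),  # F-measure: higher is better
}


def multi_task_performance(eval_dict, single_task_dict):
    """Average relative gain of the multi-task model over single-task baselines."""
    total = 0.0
    for task, mtl in eval_dict.items():
        if task not in METRICS:
            raise NotImplementedError(task)
        key, lower_is_better = METRICS[task]
        m, b = mtl[key], single_task_dict[task][key]
        sign = -1.0 if lower_is_better else 1.0
        total += sign * (m - b) / b
    return total / len(eval_dict)


# Toy numbers: depth error drops by 10%, edge F-measure rises by 10%.
delta = multi_task_performance(
    {"depth": {"rmse": 0.54}, "edge": {"odsF": 0.77}},
    {"depth": {"rmse": 0.60}, "edge": {"odsF": 0.70}},
)
```

With these toy numbers both tasks improve by 10%, so the average comes out at about 0.1.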

Also, I've been trying to retrain individual tasks like depth or semseg in a multi-GPU setup, but it crashes and reboots! (I'll probably look into this later, but I'm wondering if you ever encountered it.)

Again, awesome work! Thanks for sharing

Dataset URL

It looks like the URL specified in data/google_drive.py is not accessible anymore.

download_file_from_google_drive fails with UnboundLocalError: local variable 'token' referenced before assignment since response.cookies is empty for the NYU dataset.
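
An UnboundLocalError of this kind typically means that `token` is only assigned inside a loop over the cookies, so an empty cookie jar leaves it undefined. A minimal sketch of the defensive pattern follows. It assumes the widely circulated Google Drive download snippet, in which the confirmation token is stored in a cookie whose name starts with `download_warning`; the actual code in data/google_drive.py may differ, and `find_confirm_token` is a hypothetical name.

```python
def find_confirm_token(cookies, prefix="download_warning"):
    """Return the confirmation token from a cookie mapping, or None."""
    token = None  # defined up front, so an empty mapping cannot leave it unbound
    for key, value in cookies.items():
        if key.startswith(prefix):
            token = value
            break
    return token
```

This only turns the crash into an explicit "no token" case that the caller can check. If the file is no longer reachable at the configured URL, the download will still fail and the dataset has to be obtained another way.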