multi-task-learning-pytorch's People

Contributors

simonvandenhende

multi-task-learning-pytorch's Issues

Depth performance using ResNet-50 (Single-task performance)

Thank you very much for sharing the wonderful code!

I ran into a problem when running the code. With ResNet-50 I get a similar accuracy on segmentation (43.5 mIoU), but the depth result is noticeably worse (0.614 RMSE). I have read the related issues (#1) and (#5), but they did not resolve the problem in my case. Could you please give me some suggestions for the single-task depth experiment?

Thanks and Regards

Epoch 100/100
----------
Adjusted learning rate to 0.00000
Train ...
Epoch: [99][ 0/99]      Loss depth 1.0003e-01 (1.0003e-01)      Loss Total 1.0003e-01 (1.0003e-01)
Epoch: [99][25/99]      Loss depth 1.4344e-01 (1.2583e-01)      Loss Total 1.4344e-01 (1.2583e-01)
Epoch: [99][50/99]      Loss depth 1.3219e-01 (1.2832e-01)      Loss Total 1.3219e-01 (1.2832e-01)
Epoch: [99][75/99]      Loss depth 1.2006e-01 (1.3160e-01)      Loss Total 1.2006e-01 (1.3160e-01)
Results for depth prediction
rmse           0.2232
log_rmse       0.0887
Evaluate ...
Save model predictions to ./results/NYUD/resnet50/single_task/depth/results
Files already downloaded
Initializing dataloader for NYUD val set
Number of dataset images: 654
Evaluate the saved images (depth)
Evaluating depth: 0 of 654 objects
Evaluating depth: 500 of 654 objects
Results for Depth Estimation
rmse           0.6204
log_rmse       0.2119
No new best depth estimation model 0.614 -> 0.620
Checkpoint ...
Evaluating best model at the end
Save model predictions to ./results/NYUD/resnet50/single_task/depth/results
Files already downloaded
Initializing dataloader for NYUD val set
Number of dataset images: 654
Evaluate the saved images (depth)
Evaluating depth: 0 of 654 objects
Evaluating depth: 500 of 654 objects
Results for Depth Estimation
rmse           0.6204
log_rmse       0.2119

[Question] About dataset condition for MTL

Thank you for your awesome work.
Your work should be very helpful to everyone who is interested in MTL.

I'm about to study MTL and have one question.

I think a dataset for MTL should have the form {Input: X(i), GT: Y_task1(i), Y_task2(i), ..., Y_taskT(i)}.

However, I think this condition is difficult to satisfy in a real-world environment.
When we have to train on task-specific datasets D_task1 {Input: X_task1, GT: Y_task1} and D_task2 {Input: X_task2, GT: Y_task2} simultaneously, how do we do MTL?

For example, suppose we want to set up MTL for both salient object detection and depth estimation.
For the salient object detection task, we use saliency labels from the Pascal VOC dataset.
For the depth estimation task, we use depth-map labels from the NYUD dataset.
(The two datasets consist of entirely different input images; Pascal VOC does not contain depth-map labels and NYUD does not contain saliency labels.)

Under these conditions, how do we set up MTL?
Does anyone know of MTL methods for task-specific datasets, or of related work?
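
One common way to handle disjoint, task-specific datasets is to alternate batches between them and compute only the loss of the task whose labels are present in the current batch. The sketch below illustrates that scheduling idea only. It is not taken from this repository, and `endless`, `interleave_task_batches`, and the toy loaders are hypothetical names introduced for illustration.

```python
def endless(loader):
    """Re-iterate a loader forever, so a real dataloader reshuffles each pass."""
    while True:
        for batch in loader:
            yield batch


def interleave_task_batches(loaders, steps):
    """Yield (task, batch) pairs, visiting the tasks in round-robin order.

    `loaders` maps a task name to any re-iterable collection of batches.
    A smaller dataset is simply revisited more often.
    """
    iterators = {task: endless(loader) for task, loader in loaders.items()}
    tasks = list(loaders)
    for step in range(steps):
        task = tasks[step % len(tasks)]
        yield task, next(iterators[task])


# Toy stand-ins for two dataloaders with different images and labels.
loaders = {
    "sal": [("voc_img_0", "sal_gt_0"), ("voc_img_1", "sal_gt_1")],
    "depth": [("nyud_img_0", "depth_gt_0")],
}

schedule = list(interleave_task_batches(loaders, steps=4))
for task, (image, target) in schedule:
    # In a real training loop: run the shared encoder on `image`, then
    # back-propagate only the loss of the head that belongs to `task`.
    print(task, image, target)
```

Whether alternating batches or mixing both datasets inside one batch works better is an empirical question that this sketch does not settle.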

Adaptation to classification and detection problem

Hi @SimonVandenhende,

Thank you for sharing your code! I was wondering whether you have also tested the network on other tasks. I am trying to train a model for joint 3D object detection and classification plus road lane detection, and would like to know whether you think MTI-Net is suitable for this.

Thank you!

Cityscapes depth estimation

Could you please help with the following questions?

  1. In the survey paper (Revisiting Multi-Task Learning in the Deep Learning Era), it is mentioned that the Cityscapes depth maps were generated using SGM. Would it be possible to provide the code for this?
  2. Is it the depth map or the disparity that is generated?
  3. Cityscapes makes disparity maps available. Are these used? If so, do the networks predict depth directly, or do they predict disparity, which is then converted to depth?

Thank you.

Implementation for decoder-focused MTL methods based on ResNet-50

Hi,

It's a great piece of work, and thanks for making the code public!

Do you have any plans to release the implementation, hyper-parameters, and pretrained weights for the decoder-focused MTL methods based on ResNet-50, as described in Tab. 5(a) of your survey paper?

Thanks in advance!

MTL+Resnet50+nyud

Thank you for sharing the code. During multi-task learning, I encountered the following error while training ResNet50 on NYUD dataset.

RuntimeError: Given groups=1, weight of size 64 3 7 7, expected input[8, 480, 640, 3] to have 3 channels, but got 480 channels instead

Hyperparams for HRNet-48

Please could you let me know the hyperparams used to train the HRNet-48 model from your paper (both for the 45.7% mIoU and the ~49% mIoU scores)? I have tried really hard to train HRNet-48 on single task in my repository, but it doesn't go beyond 44.8% mIoU.

Thank you.

ResNet-50 single task baseline hyperparameters

Hi there, thank you for open-sourcing your amazing work.

I trained the ResNet-50 single-task baseline for the segmentation task using the config file "configs/nyud/resnet50/semseg.yml". However, I get at best 40.4 mIoU on the NYUD-v2 dataset, which is lower than the 43.9 reported in the survey. Can you please provide the hyperparameters used in the survey, or some hints on how to reach 43.9 mIoU?

Thanks in advance.

Setup issue

I am getting a path error since a part of the path is doubled in the code when running:

python main.py --config_env configs/env.yml --config_exp configs/pascal/hrnet18/mti_net.yml

The path looks like this:

../../../data2/yd/mti/datasets/PASCAL_MT/../../../data2/yd/mti/datasets/PASCAL_MT/human_parts/2008_000008.mat

It seems to us that there is an error in data/pascal_context.py:
In line 111, part_gt_dir is defined as an extension of self.root:

part_gt_dir = os.path.join(self.root, 'human_parts')

However, in line 172, self.root is joined with part_gt_dir again, that is, with an extension of itself:

_human_part = os.path.join(self.root, part_gt_dir, line + ".mat")
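
The behaviour described above can be reproduced in isolation. The following is a minimal sketch using only the two `os.path.join` patterns quoted in this issue, with `posixpath` (the POSIX flavour of `os.path`) so that the result is the same on every platform. The absolute root in the last part is a made-up value for illustration.

```python
import posixpath  # POSIX flavour of os.path, identical output on every OS

root = "../../../data2/yd/mti/datasets/PASCAL_MT"
part_gt_dir = posixpath.join(root, "human_parts")

# Pattern reported in the issue: root is joined in a second time.
doubled = posixpath.join(root, part_gt_dir, "2008_000008" + ".mat")
print(doubled)

# Possible fix: part_gt_dir already starts with root, so join from there.
fixed = posixpath.join(part_gt_dir, "2008_000008" + ".mat")
print(fixed)

# With an absolute root (hypothetical value), join() discards every component
# before the absolute one, so the duplication goes unnoticed.
abs_root = "/data2/yd/mti/datasets/PASCAL_MT"
abs_part = posixpath.join(abs_root, "human_parts")
hidden = posixpath.join(abs_root, abs_part, "2008_000008" + ".mat")
print(hidden)
```

This would explain why the problem only shows up when the dataset root is configured as a relative path.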

Backbone

Hey, you reported the best result of NYUD with MTI-Net(HRNet-48), and in your code, the backbone in configs is HRNet-w18. If I want to reproduce the best result, is just changing HRNet-w18 to HRNet-48 enough? Thanks.

hrnet18+padnet

Thank you very much for sharing the wonderful code! Your work is definitely very helpful for MTL community.

I am contacting you because I am trying to reproduce the hrnet18+padnet result on the NYUD dataset. Concretely, I:

  1. downloaded the pretrained model hrnet_w18_small_model_v2.pth
  2. ran python main.py --config_env configs/env.yml --config_exp configs/nyud/hrnet18/pad_net.yml

But the performance is only Semantic Segmentation mIoU: 33.4665, depth 0.7267, which does not match the result in your paper. Do you have any idea why?

Looking forward to hearing from you soon!
Thank you!

How does it work with Batch Normalization?

When training dataloader by dataloader, memory is sometimes limited, so each dataloader can only use a small batch. Will this hurt the performance of batch normalization? How can this problem be fixed?

About the pre-trained model and data augmentation

Thank you for sharing your work on multi-task learning, which is very useful!

You mention in Sec 4.1.4 Training Setup that you used pre-trained models, e.g., a pre-trained ResNet-50, and data augmentation. Is it therefore correct that the NYUD-v2 results reported in TABLE 5(c) of your paper are also based on a pre-trained ResNet-50 model with data augmentation? Also, do the pre-trained models and data augmentation improve the performance by much?

Thanks for your patience again!

About PAP-Net

Hello, thanks for sharing your work. In your paper Multi-Task Learning for Dense Prediction Tasks: A Survey, I found that you've tried PAP-Net, but I can't find the code in your project. How can I get it?

About task balancing

Thank you for your excellent work and open-source code.

Could you please provide some code for task balancing, such as uncertainty weighting, GradNorm, and multi-objective optimization? Your paper has some experimental results for these methods. Thank you very much!
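
As a point of reference for this request, uncertainty weighting (Kendall et al.) in its regression form combines the task losses as the sum over tasks of 0.5 * exp(-s_i) * L_i + 0.5 * s_i, where s_i = log(sigma_i^2) is one learnable value per task. The sketch below only evaluates that expression in plain Python. It is not the code used for the survey, `uncertainty_weighted_loss` is a hypothetical name, the loss values are made up, and in a real model the `log_vars` would be trainable parameters updated by the optimizer.

```python
import math


def uncertainty_weighted_loss(losses, log_vars):
    """Combine per-task losses with homoscedastic uncertainty weights.

    losses:   dict task -> scalar loss value
    log_vars: dict task -> s = log(sigma^2), learnable in a real model
    """
    total = 0.0
    for task, loss in losses.items():
        s = log_vars[task]
        # exp(-s) down-weights tasks with high uncertainty, while the
        # 0.5 * s term keeps s from growing without bound.
        total += 0.5 * math.exp(-s) * loss + 0.5 * s
    return total


# Made-up loss values, for illustration only.
losses = {"semseg": 1.0, "depth": 0.5}
equal = uncertainty_weighted_loss(losses, {"semseg": 0.0, "depth": 0.0})
print(equal)
```

With all log-variances at zero, every task gets weight 0.5, so the result is half the plain sum of the losses.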

About the experimental setting for fixed weights from a grid search

Hi, I am very impressed by your survey on MTL, from which I have learned a lot.
I am currently working on an MTL project, so I am very curious about the grid search experiments for the fixed weights.
I have not found details about this in your paper or in this repo. Could you give me more information on this?
What exactly are the grid search weights? Did you train and evaluate the MTL network with all combinations of those weights? If I want to find the best weights for my MTL network, do I need to run the same experiments? Could you give me some suggestions on this?
Thank you so much!
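
For readers with the same question, an exhaustive grid search over fixed task weights can be sketched as follows. The candidate values, the task names, and the scoring function are all assumptions made for illustration; the grid actually used in the survey is not stated in this thread. `train_and_evaluate` is a hypothetical callback that would train one model with the given weights and return a single score where higher is better.

```python
from itertools import product


def grid_search_weights(tasks, candidates, train_and_evaluate):
    """Try every combination of candidate weights and keep the best one."""
    best_weights, best_score = None, float("-inf")
    for combo in product(candidates, repeat=len(tasks)):
        weights = dict(zip(tasks, combo))
        score = train_and_evaluate(weights)  # one full training run per combo
        if score > best_score:
            best_weights, best_score = weights, score
    return best_weights, best_score


def toy_score(weights):
    """Toy stand-in for a training run: peaks at semseg=1.0, depth=2.0."""
    return -((weights["semseg"] - 1.0) ** 2 + (weights["depth"] - 2.0) ** 2)


best, score = grid_search_weights(["semseg", "depth"], [0.5, 1.0, 2.0], toy_score)
print(best, score)
```

The number of training runs is `len(candidates) ** len(tasks)`, which is why such a search becomes expensive quickly as tasks are added.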

PADNet loss scheme interpolation

The PADNet paper mentions (page 3) that the intermediate loss functions L1 to L4 are computed after re-scaling the ground-truth maps to 1/4 resolution.

However, the implementation of the PADNet loss scheme upsamples the initial predictions to the image size before applying the loss. This seems to contradict what the paper reports.

MTI-Net NYUD task: some files missing

Hi, I'm running your code, but I find that some files are not included in your repository:

FileNotFoundError: [Errno 2] No such file or directory: './results/NYUD/hrnet_w18/single_task/semseg/results/NYUD_test_semseg.json'

Could you please provide these files?
Thanks a lot!

MTI-Net + Resnet FPN backbone

Hi,
Thanks for your code!

Could you provide the code for MTI-Net with the ResNet FPN backbone from your paper?
BTW, did you use pretrained weights for resnet18-fpn?
If so, where can I find the weight files?

Thanks!

About dataset training

Is MTL training done sequentially on multiple datasets? Won't this lead to suboptimal results for some tasks?

ResNet50 and PAD-net

First of all, thank you for open-sourcing the code. I have recently been learning about multi-task learning. From your paper "Multi-Task Learning for Dense Prediction Tasks: A Survey", it appears that PAD-Net based on ResNet-50 achieves the best results on the NYUD-v2 dataset, but your code does not support this configuration. Could you provide it, or let me know whether the current code can be adapted to do this?

Can we do multiple classification tasks?

Hello SimonVandenhende,

Kudos on the great work. I just want to know whether we can use this repo to train multiple classification tasks, for example vehicle make, color, orientation, and model, with each of the 4 attributes as an individual task.

regards
akirs

Human Parts

Hello, thank you very much for your outstanding contribution. I easily found the method of coloring the semantic segmentation prediction map on the Internet, but there are few methods for coloring Pascal-context human body part segmentation. I want to know how you color the human body part segmentation prediction map?
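
One option, offered here as a sketch and not as the method the author used, is to reuse the bit-shuffling palette that is commonly used for PASCAL VOC segmentation maps and index it with the part labels. The number of classes (7, meaning background plus six parts) is an assumption, and `voc_colormap` and `colorize` are hypothetical helper names.

```python
def voc_colormap(num_classes):
    """Build the PASCAL VOC style palette as a list of (R, G, B) tuples."""
    colors = []
    for index in range(num_classes):
        r = g = b = 0
        c = index
        for j in range(8):
            # Spread the bits of the class index over the three channels,
            # filling each channel from its most significant bit downwards.
            r |= ((c >> 0) & 1) << (7 - j)
            g |= ((c >> 1) & 1) << (7 - j)
            b |= ((c >> 2) & 1) << (7 - j)
            c >>= 3
        colors.append((r, g, b))
    return colors


def colorize(label_map, colors):
    """Map a 2D grid of integer labels to a 2D grid of RGB tuples."""
    return [[colors[label] for label in row] for row in label_map]


palette = voc_colormap(7)       # assumed: background + 6 part classes
pred = [[0, 1], [2, 6]]         # tiny made-up prediction map
colored = colorize(pred, palette)
print(colored)
```

The resulting grid of RGB tuples can be written out with any image library; any other fixed list of distinct colors would work just as well.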

About 'seism'

Hi! Thanks for sharing the code.
I have one question: when training the 'PADNet' model, is 'seism' necessary? Can I train without it?
Thank you for your answer.

Question on Multi Task Evaluation Criteria

Your survey on MTL is awesome! Amazing work. I have a quick question. In section 4.2.1 Evaluation Criterion you mention:

where l_i = 1 if a lower value means better performance for metric M_i of task i, and 0 otherwise.

But in the function calculate_multi_task_performance

    for task in tasks:
        mtl = eval_dict[task]
        stl = single_task_dict[task]
        
        if task == 'depth': # rmse lower is better
            mtl_performance -= (mtl['rmse'] - stl['rmse'])/stl['rmse']

        elif task in ['semseg', 'sal', 'human_parts']: # mIoU higher is better
            mtl_performance += (mtl['mIoU'] - stl['mIoU'])/stl['mIoU']

        elif task == 'normals': # mean error lower is better
            mtl_performance -= (mtl['mean'] - stl['mean'])/stl['mean']

        elif task == 'edge': # loss lower is better
            mtl_performance += (mtl['odsF'] - stl['odsF'])/stl['odsF']

        else:
            raise NotImplementedError

For task == 'edge', the comment says "loss lower is better", but the code adds instead of subtracting. Why is that?

Also, I've been trying to retrain individual tasks like depth or semseg in a multi-GPU setup, but it crashes and reboots! (I'll probably look into this later, but I'm wondering if you ever encountered it.)

Again, awesome work! Thanks for sharing
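
On the sign question in this issue: the criterion quoted above is usually written as Delta_MTL = (1/T) * sum_i (-1)^(l_i) * (M_m,i - M_b,i) / M_b,i, with M_m,i the multi-task and M_b,i the single-task value of the metric. odsF is commonly an F-measure for edge detection, for which higher is better, so the `+=` in the quoted branch looks consistent with the formula and the inline comment appears to be the inaccurate part. The sketch below restates the criterion with an explicit per-task flag. `multi_task_performance` is a hypothetical name, the numbers are made up, and whether the repository's function also divides by the number of tasks is not visible in the quoted excerpt.

```python
def multi_task_performance(mtl, stl, lower_is_better):
    """Average relative change of multi-task over single-task metrics.

    mtl, stl:        dict task -> metric value (multi-task / single-task)
    lower_is_better: dict task -> bool, the l_i flag from the criterion
    """
    total = 0.0
    for task, metric_mtl in mtl.items():
        sign = -1.0 if lower_is_better[task] else 1.0
        total += sign * (metric_mtl - stl[task]) / stl[task]
    return total / len(mtl)


# Made-up numbers, for illustration only.
stl = {"depth": 0.50, "semseg": 40.0, "edge": 0.50}
mtl = {"depth": 0.45, "semseg": 42.0, "edge": 0.50}
flags = {"depth": True, "semseg": False, "edge": False}  # rmse, mIoU, odsF

delta = multi_task_performance(mtl, stl, flags)
print(round(delta, 4))
```

In this toy case depth improves by 10%, segmentation by 5%, and edge detection is unchanged, so the average is a 5% improvement.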

Dataset URL

It looks like the URL specified in data/google_drive.py is not accessible anymore.

download_file_from_google_drive fails with UnboundLocalError: local variable 'token' referenced before assignment since response.cookies is empty for the NYU dataset.
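
The error message is consistent with a variable that is only assigned inside a loop over the cookies. The sketch below reproduces that pattern and one way to guard against it. It is a hypothetical reconstruction, not the code in data/google_drive.py, and the `download_warning` cookie prefix is an assumption. Note that the guard only removes the crash; it does not make an inaccessible URL downloadable.

```python
def get_confirm_token_buggy(cookies):
    """Assigns token only inside the loop, so it may never be bound."""
    for key, value in cookies.items():
        if key.startswith("download_warning"):
            token = value
    return token  # raises UnboundLocalError when no cookie matched


def get_confirm_token(cookies):
    """Return the confirmation token, or None when no cookie matches."""
    for key, value in cookies.items():
        if key.startswith("download_warning"):
            return value
    return None


empty_cookies = {}  # stands in for an empty response.cookies

try:
    get_confirm_token_buggy(empty_cookies)
except UnboundLocalError as err:
    print("buggy version failed:", type(err).__name__)

print(get_confirm_token(empty_cookies))
```

A caller can then check for `None` and report that the file could not be downloaded, which is easier to diagnose than the original traceback.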
