zwwwayne / k-net
[NeurIPS2021] Code Release of K-Net: Towards Unified Image Segmentation
License: Apache License 2.0
Hi, I have some questions about the implementation of label_weights in kernel_update_head.py. In the _get_target_single function, why are the sem_thing_weights over the num_thing_classes set to 0? Wouldn't it be more natural to set them to 1, so that the semantic labels treat the thing classes as negative samples? Likewise, I don't quite understand why label_weights over the num_stuff_classes is also set to 0. Could you explain the benefit of doing it this way?
sem_stuff_weights = torch.eye(
    self.num_stuff_classes, device=pos_mask.device)
sem_thing_weights = pos_mask.new_zeros(
    (self.num_stuff_classes, self.num_thing_classes))
sem_label_weights = torch.cat(
    [sem_thing_weights, sem_stuff_weights], dim=-1)
......
label_weights[:, self.num_thing_classes:] = 0
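For reference, a minimal standalone illustration (not the repository's code) of what the concatenation above produces; the class counts are the COCO panoptic values and are only an assumption for this sketch:

import torch

num_thing_classes, num_stuff_classes = 80, 53  # assumed COCO panoptic setting
sem_stuff_weights = torch.eye(num_stuff_classes)
sem_thing_weights = torch.zeros(num_stuff_classes, num_thing_classes)
sem_label_weights = torch.cat([sem_thing_weights, sem_stuff_weights], dim=-1)
# shape (53, 133): for the i-th stuff kernel only its own stuff class has weight 1,
# so its classification loss ignores the thing classes (and the other stuff classes)
# rather than treating them as negatives
print(sem_label_weights.shape)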
https://download.openmmlab.com/mim-example/knet/
This link, which is used for the MS COCO 3x instance segmentation models, cannot be accessed.
Hi, thanks so much for sharing the great work and congrats on the paper acceptance!
I saw that the arXiv v1 version of the paper reports 52.1 PQ for K-Net with a Swin-L backbone. In the camera-ready version, the score is improved to 55.2 PQ with the same architecture. I looked into the paper but could not find what changed. Besides, other results (e.g., instance segmentation) seem to remain the same. May I ask what the difference is between the arXiv v1 version and the current one?
Thanks!
The links to the training logs on ADE20K are missing. Can you provide your training logs?
Thanks for the great work. I want to ask whether the reported result is the best one among several experiments.
When I tried to reproduce this model, I got the following error after running
PYTHONPATH='./':$PYTHONPATH mim train mmdet ./K-Net/configs/det/knet/knet_s3_r50_fpn_1x_coco.py --work-dir=./K-Net/working_directory
Is there anything wrong with my command or with my version of mmcv?
AssertionError: The num_classes (133) in KernelIterHead of MMDistributedDataParallel does not matches the length of CLASSES (80) in CocoDataset
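For anyone hitting the same assertion, here is the relationship between the two numbers spelled out (a hypothetical reading, not an official fix; the key names are assumed from the released configs):

# the numbers in the assertion, spelled out (not a fix, just the relationship):
num_thing_classes = 80   # classes that CocoDataset (instance annotations) provides
num_stuff_classes = 53   # extra classes only the panoptic annotations provide
num_classes = num_thing_classes + num_stuff_classes
assert num_classes == 133  # the value configured in KernelIterHead
# so either train with the panoptic config/dataset, or make the head's num_classes
# (with num_stuff_classes = 0) match the 80 classes of CocoDataset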
In your code for semantic segmentation, I noticed that you apply a softmax operation to the input mask in the kernel update head. Does this mean the input mask needs at least two channels? If my network only outputs a 1-channel mask, how can I modify it so that your idea still works?
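In case it helps, here is a minimal sketch (not the repository's code) of one way to feed a single-channel mask into a softmax-style update: stack the logit with its negation so the 2-channel softmax yields a sigmoid-shaped foreground probability. Whether this matches the intended behaviour of the kernel update head is an assumption.

import torch
import torch.nn.functional as F

single_logit = torch.randn(2, 1, 64, 64)                       # (N, 1, H, W) mask logits
two_channel = torch.cat([-single_logit, single_logit], dim=1)  # (N, 2, H, W)
fg_prob = F.softmax(two_channel, dim=1)[:, 1:]                 # == torch.sigmoid(2 * single_logit)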
Hi,
I would like to ask which one to follow to prepare the dataset: MMDetection or MMSegmentation? Is there any difference?
Thanks
I directly use your repo as my workspace; the directory tree is shown below (mmdetection has been installed by conda):
Then I run the following command, which works fine with built-in models:
sh ./tools/mim_slurm_test.sh $PARTITION mmdet $CONFIG $CHECKPOINT --eval segm
It shows that KNet is not registered:
KeyError: 'KNet is not in the models registry'
The mim-example repo mentions that we can simply use the config and the build function. However, the following sample code is not used in the repo, which confused me a lot.
module_cfg = dict(type='mmcv.SwinTransformer')
module = build_backbone(module_cfg)
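For what it's worth, a minimal sketch (assumed file layout, run from the repository root) of what the test script effectively does when building the model: if the knet package is never imported, the KNet class is never added to the registry and build_detector raises exactly the KeyError mentioned above.

import knet  # noqa: F401 -- assumed to register the K-Net modules on import; otherwise
             # import the submodules listed in the config's custom_imports
from mmcv import Config
from mmdet.models import build_detector

cfg = Config.fromfile('configs/det/knet/knet_s3_r50_fpn_1x_coco.py')
model = build_detector(cfg.model)  # 'KNet' is now found in the models registry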
For other mmdet models, I usually use tools/analysis_tools/benchmark.py. But how can I calculate FPS for K-Net? Thanks for your great work.
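For reference, a rough sketch (not the official benchmark script) of how FPS could be measured with the standard mmdet 2.x inference APIs; the checkpoint and image paths are placeholders, and the knet package must be importable (e.g. run from the repository root). Note this times end-to-end inference including preprocessing, which differs from benchmark.py.

import time
import torch
from mmdet.apis import init_detector, inference_detector

import knet  # noqa: F401 -- make sure the K-Net modules are registered

config = 'configs/det/knet/knet_s3_r50_fpn_1x_coco.py'
checkpoint = 'checkpoints/knet_s3_r50_fpn_1x_coco.pth'  # placeholder path
model = init_detector(config, checkpoint, device='cuda:0')

img = 'demo.jpg'  # placeholder test image
for _ in range(10):          # warm-up iterations
    inference_detector(model, img)
torch.cuda.synchronize()

n = 100
start = time.time()
for _ in range(n):
    inference_detector(model, img)
torch.cuda.synchronize()
print(f'FPS: {n / (time.time() - start):.2f}')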
Hello,
Sorry to disturb you. I'm trying to visualize the kernels (called object_feats in your code), as illustrated in your paper.
Here is my code, which saves and accumulates them into kernels.npy during the inference phase.
"""kernel_iter_update.py line:296"""
results.append(single_result)
from debugger import save_test_info
save_test_info(img_metas, scores_per_img, masks_per_img, object_feats)
return results
def save_test_info(img_metas:list,
cls_score:torch.Tensor,
scaled_mask_preds:torch.Tensor, obj_feats:torch.Tensor):
...
# kernels
if obj_feats is not None:
kernels_old = np.load("work_dirs/tmp/kernels.npy")
kernels_new = obj_feats.to('cpu').detach().numpy()
kernels = kernels_new+kernels_old
np.save("work_dirs/tmp/kernels.npy", kernels)
"""after inference phrase"""
fig,a = plt.subplots(10,10)
kernels_2dim = kernels.reshape((100,16,16))
for i in range(100):
# a[int(i / 10)][i % 10].set_title(i)
a[int(i / 10)][i % 10].set_xticks([])
a[int(i / 10)][i % 10].set_yticks([])
a[int(i / 10)][i % 10].imshow(kernels_2dim[i], cmap = plt.cm.hot_r)
plt.savefig('work_dirs/tmp/class_80_ins_2/kernel_2dim.png', bbox_inches='tight')
plt.show()
However, the result is completely different from your figures:
Hi, thanks a lot for sharing your fantastic work! However, I am puzzled by some of the implementation details in Section 4. Why is multi-scale training with a longer schedule used for fair comparisons?
Training command is python /home/pai/lib/python3.6/site-packages/mmdet/.mim/tools/train.py ./configs/det/knet/knet_s3_r50_fpn_1x_coco-panoptic.py --gpus 1 --launcher none --work-dir ./tmp.
Traceback (most recent call last):
File "/home/pai/lib/python3.6/site-packages/mmcv/utils/misc.py", line 73, in import_modules_from_strings
imported_tmp = import_module(imp)
File "/home/pai/lib/python3.6/importlib/init.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "", line 994, in _gcd_import
File "", line 971, in _find_and_load
File "", line 941, in _find_and_load_unlocked
File "", line 219, in _call_with_frames_removed
File "", line 994, in _gcd_import
File "", line 971, in _find_and_load
File "", line 941, in _find_and_load_unlocked
File "", line 219, in _call_with_frames_removed
File "", line 994, in _gcd_import
File "", line 971, in _find_and_load
File "", line 953, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'knet'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/pai/lib/python3.6/site-packages/mmdet/.mim/tools/train.py", line 185, in
main()
File "/home/pai/lib/python3.6/site-packages/mmdet/.mim/tools/train.py", line 90, in main
cfg = Config.fromfile(args.config)
File "/home/pai/lib/python3.6/site-packages/mmcv/utils/config.py", line 334, in fromfile
import_modules_from_strings(**cfg_dict['custom_imports'])
File "/home/pai/lib/python3.6/site-packages/mmcv/utils/misc.py", line 80, in import_modules_from_strings
raise ImportError
ImportError
Traceback (most recent call last):
File "/home/pai/bin/mim", line 8, in
sys.exit(cli())
File "/home/pai/lib/python3.6/site-packages/click/core.py", line 829, in call
return self.main(*args, **kwargs)
File "/home/pai/lib/python3.6/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/home/pai/lib/python3.6/site-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/pai/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/pai/lib/python3.6/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/home/pai/lib/python3.6/site-packages/mim/commands/train.py", line 107, in cli
other_args=other_args)
File "/home/pai/lib/python3.6/site-packages/mim/commands/train.py", line 256, in train
cmd, env=dict(os.environ, MASTER_PORT=str(port)))
File "/home/pai/lib/python3.6/subprocess.py", line 311, in check_call
raise CalledProcessError(retcode, cmd)
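For anyone with the same traceback: the ImportError comes from custom_imports in the config trying to import the knet package, so the directory containing knet/ has to be on PYTHONPATH; the commands elsewhere in these issues prepend PYTHONPATH='./' for exactly this reason. A minimal, hypothetical check:

import sys
sys.path.insert(0, '/path/to/K-Net')  # hypothetical repository root containing knet/
import knet  # succeeds once the path is set; run the training command with the same PYTHONPATH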
Hi, I want to train K-Net on Cityscapes for panoptic segmentation with Slurm. I followed coco_panoptic.py to implement a custom cityscapes_panoptic.py. However, after running for several epochs, I always encounter the following error:
slurmstepd-gpu20-15: error: Detected 1 oom-kill event(s) in StepId=9411285.0. Some of your processes may have been killed by the cgroup out-of-memory handler.
srun: error: gpu20-15: task 0: Out Of Memory
srun: launch/slurm: _step_signal: Terminating StepId=9411285.0
slurmstepd-gpu20-15: error: *** STEP 9411285.0 ON gpu20-15 CANCELLED AT 2022-05-17T17:58:10 ***
/home/krumo/work/utils/anaconda3/envs/knet/lib/python3.7/multiprocessing/semaphore_tracker.py:144: UserWarning: semaphore_tracker: There appear to be 56 leaked semaphores to clean up at shutdown
len(cache))
slurmstepd-gpu20-15: error: Detected 1 oom-kill event(s) in StepId=9411285.batch. Some of your processes may have been killed by the cgroup out-of-memory handler.
As our cluster limits the running time of interactive sessions, I have to create an sbatch script to submit my job. The sbatch script I use for training looks like this:
#!/usr/bin/env bash
#SBATCH -p gpu20
#SBATCH --gres gpu:4
#SBATCH -n 4
#SBATCH -t 32:59:58
#SBATCH -c 4
#SBATCH --mem 200G
CONFIG=configs/det/knet/knet_s3_r50_fpn_1x_cs-panoptic.py
PYTHONPATH="$(dirname $0)/..":$PYTHONPATH \
srun --kill-on-bad-exit=1 python -u /BS/da_detection/work/utils/anaconda3/envs/knet/lib/python3.7/site-packages/mmdet/.mim/tools/train.py ${CONFIG} --launcher slurm
I think 200G of memory should be sufficient for K-Net training, so this error seems very strange. I tried searching for solutions, but nothing works. Would you mind sharing some comments? Thanks in advance!
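Not a definitive answer, but since the cgroup OOM refers to host (CPU) memory rather than GPU memory, the usual first step is to reduce dataloader pressure; a minimal sketch, assuming the standard MMDetection 2.x data config layout:

# fewer worker processes and a smaller per-GPU batch lower host-RAM usage;
# the released configs typically use 2 for both values (an assumption here)
data = dict(
    samples_per_gpu=1,
    workers_per_gpu=1)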
Hi, the link to the pre-trained model on ADE20K is broken.
Dear author:
What is the purpose of the work-dir argument?
Did you use eight V100s for training? How long did the training take?
I am Vansin, the technical operator of OpenMMLab. In September of last year, we announced the release of OpenMMLab 2.0 at the World Artificial Intelligence Conference in Shanghai. We invite you to upgrade your algorithm library to OpenMMLab 2.0 using MMEngine, which can be used for both research and commercial purposes. If you have any questions, please feel free to join us on the OpenMMLab Discord at https://discord.gg/amFNsyUBvm or add me on WeChat (van-sin) and I will invite you to the OpenMMLab WeChat group.
Here are the OpenMMLab 2.0 repos branches:
| | OpenMMLab 1.0 branch | OpenMMLab 2.0 branch |
|---|---|---|
| MMEngine | 0.x | |
| MMCV | 1.x | 2.x |
| MMDetection | 0.x, 1.x, 2.x | 3.x |
| MMAction2 | 0.x | 1.x |
| MMClassification | 0.x | 1.x |
| MMSegmentation | 0.x | 1.x |
| MMDetection3D | 0.x | 1.x |
| MMEditing | 0.x | 1.x |
| MMPose | 0.x | 1.x |
| MMDeploy | 0.x | 1.x |
| MMTracking | 0.x | 1.x |
| MMOCR | 0.x | 1.x |
| MMRazor | 0.x | 1.x |
| MMSelfSup | 0.x | 1.x |
| MMRotate | 1.x | 1.x |
| MMYOLO | 0.x | |
Attention: please create a new virtual environment for OpenMMLab 2.0.
How can I visualize the final result? Can I feed in a single image and visualize the output?
Hi, I would like to ask what I need to change in the network files in order to train on a custom COCO-format dataset. I've changed every instance of num_classes, num_thing_classes, and num_stuff_classes, and modified the config accordingly. The training epoch runs correctly, but I get the following error during validation:
File "/K-Net/knet/det/kernel_update_head.py", line 372, in _get_target_single
mask_targets[pos_inds, ...] = pos_mask_targets
RuntimeError: shape mismatch: value tensor of shape [54, 168, 216] cannot be broadcast to indexing result of shape [54, 84, 108]
Thank you for your excellent work; it inspires me a lot. I want to adapt it for another vision task; can you provide the pre-trained models?
Do you have a script to run inference and visualize the results for panoptic segmentation? MMDetection does not work for this.
I used my own dataset to train the instance segmentation task with K-Net and found that the segm mAP was always 0 and did not change no matter how many epochs I trained; my num_classes is 1.
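A common cause of a constant 0 mAP on a custom single-class dataset (besides the num_classes edits) is that the dataset config does not declare the class names, so annotations get filtered or mapped incorrectly. A hedged sketch of the standard MMDetection 2.x way to declare them, with a hypothetical class name:

classes = ('my_class',)  # hypothetical single class name matching the annotations
data = dict(
    train=dict(classes=classes),
    val=dict(classes=classes),
    test=dict(classes=classes))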
@ZwwWayne Hello, thank you very much for providing this framework and code.
I would like to ask how to fine-tune an existing model on my own dataset.
Also, what should be passed in as the partition argument?
Thank you very much!
None of the semantic segmentation models can be downloaded.
The masks do not seem accurate: multiple masks can still be generated for the same object, which is not good. How can I suppress them in post-processing?
Hi, I want to train K-net on a dataset that contains 26 labels, and after running the command below,
PYTHONPATH='.':$PYTHONPATH mim train mmseg $CONFIG $WORK_DIR
I get this error
TypeError: IterativeDecodeHead: KernelUpdateHead: __init__() got an unexpected keyword argument 'mask_upsample_stride'
thanks for your help 🙏🙏
The links for R-50-ms-3x-37.8 and R-101-ms-3x-39.2 are inaccessible.
Hi, I have a small question about training. I use the mmdetection framework to train other models, but I did not use Slurm; I use the following command:
bash ./tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM} [optional arguments]
Could you give one example showing how to train the model with bash ./tools/dist_train.sh? Thank you very much.
if cls_scores is None:
    detached_cls_scores = [None] * 2
else:
    detached_cls_scores = cls_scores.detach()
for i in range(num_imgs):
    assign_result = self.assigner.assign(scaled_mask_preds[i].detach(),
                                          detached_cls_scores[i],
                                          gt_masks[i], gt_labels[i],
                                          img_metas[i])
As can be seen from the above code, len(detached_cls_scores) = 2.