k-net's Issues

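Some questions about the loss

Hi, I have some questions about how label_weights is implemented in kernel_update_head.py. In the _get_target_single function of kernel_update_head.py, why are the weights of sem_thing_weights over num_thing_classes set to 0? Wouldn't it be more natural to set them to 1, so that the semantic labels treat the thing classes as negative samples? Likewise, I don't quite understand why the label_weights entries over num_stuff_classes are also set to 0. Could you explain what benefit this brings?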

            sem_stuff_weights = torch.eye(
                self.num_stuff_classes, device=pos_mask.device)
            sem_thing_weights = pos_mask.new_zeros(
                (self.num_stuff_classes, self.num_thing_classes))
            sem_label_weights = torch.cat(
                [sem_thing_weights, sem_stuff_weights], dim=-1)
......
            label_weights[:, self.num_thing_classes:] = 0
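
To make the layout in question concrete, here is a minimal pure-Python sketch of what the quoted lines build. The class counts are toy values chosen for illustration (the real ones come from the config), and plain lists stand in for the tensors. It only shows which entries end up as 0 or 1; it does not explain why the authors chose this.

```python
# Toy sizes for illustration only; the real values come from the config.
num_thing_classes = 3
num_stuff_classes = 2

# Identity block: stuff kernel i has weight 1 only on its own stuff class i.
sem_stuff_weights = [[1.0 if r == c else 0.0 for c in range(num_stuff_classes)]
                     for r in range(num_stuff_classes)]
# Zero block: thing-class columns carry no weight for the stuff kernels.
sem_thing_weights = [[0.0] * num_thing_classes for _ in range(num_stuff_classes)]
# Concatenate along the class axis, thing columns first (dim=-1 in the snippet).
sem_label_weights = [t + s for t, s in zip(sem_thing_weights, sem_stuff_weights)]

for row in sem_label_weights:
    print(row)
```

Each row corresponds to one stuff kernel, and a zero weight means that class column is ignored in the classification loss for that kernel rather than counted as a negative.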

Difference between arxiv paper v1 and the camera ready version?

Hi, thanks so much for sharing the great work and congrats on the paper acceptance!

I saw that the arXiv v1 version of the paper reports 52.1 PQ for K-Net with a Swin-L backbone. In the camera-ready version, the score improves to 55.2 PQ with the same architecture. I looked through the paper but could not find what changed. The other results (e.g., instance segmentation) seem to remain the same. May I ask what the difference is between the arXiv v1 version and the current one?

Thanks!

Logs for ADE20K

The link to the ADE20K training logs is missing. Could you provide your training logs?

'MaskPseudoSampler is already registered in bbox_sampler'

When I tried to reproduce this model, I got this error after running
PYTHONPATH='./':$PYTHONPATH mim train mmdet ./K-Net/configs/det/knet/knet_s3_r50_fpn_1x_coco.py --work-dir=./K-Net/working_directory

Is there anything wrong with my command or with my mmcv version?
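
A minimal sketch of how an "already registered" error can arise, assuming the cause is that the file defining MaskPseudoSampler gets executed twice (for example through two different import paths). ToyRegistry and define_sampler are hypothetical stand-ins written for this illustration, not mmcv's code.

```python
class ToyRegistry:
    """Hypothetical name -> class registry (not mmcv's implementation)."""

    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register(self, cls):
        # Refuse to overwrite an existing entry, like the error in this issue.
        if cls.__name__ in self._modules:
            raise KeyError(f"{cls.__name__} is already registered in {self.name}")
        self._modules[cls.__name__] = cls
        return cls


BBOX_SAMPLERS = ToyRegistry("bbox_sampler")


def define_sampler():
    # Stands in for the module body, which runs each time the file is executed.
    @BBOX_SAMPLERS.register
    class MaskPseudoSampler:
        pass


define_sampler()  # first execution registers the class
try:
    define_sampler()  # second execution of the same definition
    error_message = None
except KeyError as exc:
    error_message = exc.args[0]

print(error_message)
```

If this is the cause, checking whether both an installed copy and a local copy of the package are visible on PYTHONPATH is a reasonable first step.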

About your semantic segmentation?

In your semantic segmentation code, I noticed that the kernel update head applies a softmax to the input mask. Does this mean the input mask needs at least two channels? If my network outputs only a 1-channel mask, how can I modify it so that your method still works?
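
The concern can be checked with a small worked example. This is a pure-Python sketch, not the repository's code: a softmax over a single channel always returns 1, while a softmax over the pair [0, x] gives the same value as sigmoid(x) for the second channel. The logit value is arbitrary.

```python
import math


def softmax(logits):
    # Subtract the maximum for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


logit = 1.7  # arbitrary single-channel mask logit for one pixel

one_channel = softmax([logit])        # carries no information: always [1.0]
two_channel = softmax([0.0, logit])   # background fixed at 0, foreground = logit

print(one_channel, two_channel[1], sigmoid(logit))
```

So two plausible options for a 1-channel output are to replace the softmax with a sigmoid, or to add a constant zero background channel before the softmax. Which one fits K-Net's head best is not something this sketch can settle.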

About training dataset

Hi,
Which guide should I follow to prepare the dataset: MMDetection's or MMSegmentation's? Is there any difference?
Thanks

KeyError: 'KNet is not in the models registry' when running 'train.py'

Description

I use your repo directly as my workspace. The directory tree is shown below (mmdetection was installed with conda):
image
Then I run the following command, which works fine with built-in models:

sh ./tools/mim_slurm_test.sh $PARTITION mmdet $CONFIG $CHECKPOINT --eval segm

Problem

It reports that KNet is not registered:

KeyError: 'KNet is not in the models registry'

Attempt

The mim-example repo mentions that we can simply use the config and the build function. However, the following sample code is not used in this repo, which confuses me.

module_cfg = dict(type='mmcv.SwinTransformer')
module = build_backbone(module_cfg)

How can I fix this problem? Any help will be appreciated.
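
A sketch of why the lookup can fail, assuming registration happens as a side effect of importing the package that defines KNet. MODELS, register_module, build and import_knet_package are hypothetical names for this toy example and are not the mmcv or mmdet API.

```python
MODELS = {}  # hypothetical name -> class table standing in for a registry


def register_module(cls):
    MODELS[cls.__name__] = cls
    return cls


def build(cfg):
    # Look up the class named by cfg["type"] and pass the rest as kwargs.
    cfg = dict(cfg)
    name = cfg.pop("type")
    if name not in MODELS:
        raise KeyError(f"{name} is not in the models registry")
    return MODELS[name](**cfg)


def import_knet_package():
    # Stands in for importing the package: the decorator runs on import.
    @register_module
    class KNet:
        def __init__(self, num_stages=3):
            self.num_stages = num_stages


try:
    build(dict(type="KNet"))  # nothing has imported the class yet
    lookup_error = None
except KeyError as exc:
    lookup_error = exc.args[0]

import_knet_package()
model = build(dict(type="KNet", num_stages=3))  # succeeds after the import
print(lookup_error, model.num_stages)
```

Under that assumption, the fix is to make sure the repo's package is actually imported before the model is built, for example by having the repo root on PYTHONPATH as the other commands in these issues do.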

How to calculate FPS

For other mmdet models, I usually use tools/analysis_tools/benchmark.py. How can I calculate FPS for K-Net? Thanks for your great work.
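
Independent of any particular tool, FPS can be measured with a generic timing loop. The sketch below assumes a callable that runs one forward pass; fake_inference is a hypothetical placeholder that just sleeps, and measure_fps is a name invented for this example.

```python
import time


def measure_fps(run_once, num_warmup=5, num_iters=50):
    """Return iterations per second of run_once, excluding warm-up calls."""
    for _ in range(num_warmup):
        run_once()  # warm-up: caches, lazy initialisation, etc.
    start = time.perf_counter()
    for _ in range(num_iters):
        run_once()
    elapsed = time.perf_counter() - start
    return num_iters / elapsed


def fake_inference():
    # Placeholder for a single-image forward pass.
    time.sleep(0.002)


fps = measure_fps(fake_inference, num_warmup=2, num_iters=20)
print(f"{fps:.1f} img/s")
```

For a GPU model, the device has to be synchronized before each clock read (torch.cuda.synchronize() in PyTorch), otherwise the loop mostly measures kernel launch time.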

Implementation about kernel activation

Hello,
Sorry to disturb you. I'm trying to visualize the kernels (called object_feats in your code), as illustrated in your paper.
image
Here is my code, which accumulates them into kernels.npy during the inference phase.

"""kernel_iter_update.py line:296"""
                results.append(single_result)
        from debugger import save_test_info
        save_test_info(img_metas, scores_per_img, masks_per_img, object_feats)
        return results


def save_test_info(img_metas:list, 
    cls_score:torch.Tensor, 
    scaled_mask_preds:torch.Tensor, obj_feats:torch.Tensor):
    ...
    # kernels
    if obj_feats is not None:
        kernels_old = np.load("work_dirs/tmp/kernels.npy")
        kernels_new = obj_feats.to('cpu').detach().numpy()
        kernels = kernels_new+kernels_old
        np.save("work_dirs/tmp/kernels.npy", kernels)
"""after inference phrase"""
fig,a =  plt.subplots(10,10)
kernels_2dim = kernels.reshape((100,16,16))
for i in range(100):
    # a[int(i / 10)][i % 10].set_title(i)
    a[int(i / 10)][i % 10].set_xticks([])
    a[int(i / 10)][i % 10].set_yticks([])
    a[int(i / 10)][i % 10].imshow(kernels_2dim[i], cmap = plt.cm.hot_r)
plt.savefig('work_dirs/tmp/class_80_ins_2/kernel_2dim.png', bbox_inches='tight')
plt.show()

However, the result is completely different from your figures:
image

I would appreciate it if anyone could show me how to visualize the kernels correctly.

About experiments setting

Hi, thanks a lot for sharing your fantastic work! However, I am puzzled by one of the implementation details in Section 4. Why is multi-scale training with a longer schedule used for fair comparisons?

ModuleNotFoundError: No module named 'knet'

The training command is:

python /home/pai/lib/python3.6/site-packages/mmdet/.mim/tools/train.py ./configs/det/knet/knet_s3_r50_fpn_1x_coco-panoptic.py --gpus 1 --launcher none --work-dir ./tmp

Traceback (most recent call last):
  File "/home/pai/lib/python3.6/site-packages/mmcv/utils/misc.py", line 73, in import_modules_from_strings
    imported_tmp = import_module(imp)
  File "/home/pai/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 941, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 941, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'knet'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/pai/lib/python3.6/site-packages/mmdet/.mim/tools/train.py", line 185, in <module>
    main()
  File "/home/pai/lib/python3.6/site-packages/mmdet/.mim/tools/train.py", line 90, in main
    cfg = Config.fromfile(args.config)
  File "/home/pai/lib/python3.6/site-packages/mmcv/utils/config.py", line 334, in fromfile
    import_modules_from_strings(**cfg_dict['custom_imports'])
  File "/home/pai/lib/python3.6/site-packages/mmcv/utils/misc.py", line 80, in import_modules_from_strings
    raise ImportError
ImportError
Traceback (most recent call last):
  File "/home/pai/bin/mim", line 8, in <module>
    sys.exit(cli())
  File "/home/pai/lib/python3.6/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/pai/lib/python3.6/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/pai/lib/python3.6/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/pai/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/pai/lib/python3.6/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/pai/lib/python3.6/site-packages/mim/commands/train.py", line 107, in cli
    other_args=other_args)
  File "/home/pai/lib/python3.6/site-packages/mim/commands/train.py", line 256, in train
    cmd, env=dict(os.environ, MASTER_PORT=str(port)))
  File "/home/pai/lib/python3.6/subprocess.py", line 311, in check_call
    raise CalledProcessError(retcode, cmd)
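
The traceback shows the config's custom_imports entry trying to import knet and failing, which suggests the directory containing the knet package is not on the Python path of the training process. A small check, sketched with the standard library only, can confirm that before launching training. The helper names and the example path are hypothetical.

```python
import importlib.util


def is_importable(package):
    """True if a top-level package can be found on the current sys.path."""
    return importlib.util.find_spec(package) is not None


def report(package, repo_root):
    if is_importable(package):
        return f"{package} is importable"
    return (f"{package} not found; add the directory that contains it "
            f"to PYTHONPATH, e.g. {repo_root}")


# "/path/to/K-Net" is a placeholder for wherever the repo was cloned.
print(report("knet", "/path/to/K-Net"))
```

Other commands quoted in these issues set PYTHONPATH='.':$PYTHONPATH from the repo root before calling mim, which would make such a check pass.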

OOM error when training on Cityscapes

Hi, I want to train K-Net on Cityscapes for panoptic segmentation with Slurm. I followed coco_panoptic.py to implement a custom cityscapes_panoptic.py. However, after running for several epochs, I always encounter the following error:

slurmstepd-gpu20-15: error: Detected 1 oom-kill event(s) in StepId=9411285.0. Some of your processes may have been killed by the cgroup out-of-memory handler.
srun: error: gpu20-15: task 0: Out Of Memory
srun: launch/slurm: _step_signal: Terminating StepId=9411285.0
slurmstepd-gpu20-15: error: *** STEP 9411285.0 ON gpu20-15 CANCELLED AT 2022-05-17T17:58:10 ***
/home/krumo/work/utils/anaconda3/envs/knet/lib/python3.7/multiprocessing/semaphore_tracker.py:144: UserWarning: semaphore_tracker: There appear to be 56 leaked semaphores to clean up at shutdown
  len(cache))
slurmstepd-gpu20-15: error: Detected 1 oom-kill event(s) in StepId=9411285.batch. Some of your processes may have been killed by the cgroup out-of-memory handler.

Because our cluster limits the running time of interactive sessions, I have to submit my job with an sbatch script. The script I use for training looks like this:

#!/usr/bin/env bash

#SBATCH -p gpu20
#SBATCH --gres gpu:4
#SBATCH -n 4
#SBATCH -t 32:59:58
#SBATCH -c 4
#SBATCH --mem 200G

CONFIG=configs/det/knet/knet_s3_r50_fpn_1x_cs-panoptic.py

PYTHONPATH="$(dirname $0)/..":$PYTHONPATH \
srun --kill-on-bad-exit=1 python -u /BS/da_detection/work/utils/anaconda3/envs/knet/lib/python3.7/site-packages/mmdet/.mim/tools/train.py ${CONFIG} --launcher slurm

I think 200G of memory should be sufficient for K-Net training, so this error seems very strange. I searched for solutions, but nothing has worked. Would you mind sharing some comments? Thanks in advance!
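
The message comes from the cgroup out-of-memory handler, so it concerns host RAM rather than GPU memory. One way to see whether host memory grows over time is to log the peak resident set size of each process at regular intervals. The sketch below uses only the standard library (Unix only); peak_rss_mib is a name made up for this example, and where to call it in the training loop is left open.

```python
import resource


def peak_rss_mib():
    """Peak resident set size of this process in MiB (Linux units assumed)."""
    # ru_maxrss is reported in KiB on Linux (and in bytes on macOS).
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0


before = peak_rss_mib()
held = [bytes(1024 * 1024) for _ in range(64)]  # hold some memory
after = peak_rss_mib()

print(f"peak RSS before: {before:.1f} MiB, after: {after:.1f} MiB")
```

If the logged value keeps climbing across epochs in the main process or in the data-loading workers, that points at a host-side leak (for example in a custom dataset class) rather than at the model itself.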

work-dir

Dear author:
What is the work-dir argument used for?

About training

Did the authors train with eight V100 GPUs? How long did training take?

Welcome update to OpenMMLab 2.0

I am Vansin, the technical operator of OpenMMLab. In September of last year, we announced the release of OpenMMLab 2.0 at the World Artificial Intelligence Conference in Shanghai. We invite you to upgrade your algorithm library to OpenMMLab 2.0 using MMEngine, which can be used for both research and commercial purposes. If you have any questions, please feel free to join us on the OpenMMLab Discord at https://discord.gg/amFNsyUBvm or add me on WeChat (van-sin) and I will invite you to the OpenMMLab WeChat group.

Here are the branches of the OpenMMLab 2.0 repos:

Repo               OpenMMLab 1.0 branch   OpenMMLab 2.0 branch
MMEngine           -                      0.x
MMCV               1.x                    2.x
MMDetection        0.x, 1.x, 2.x          3.x
MMAction2          0.x                    1.x
MMClassification   0.x                    1.x
MMSegmentation     0.x                    1.x
MMDetection3D      0.x                    1.x
MMEditing          0.x                    1.x
MMPose             0.x                    1.x
MMDeploy           0.x                    1.x
MMTracking         0.x                    1.x
MMOCR              0.x                    1.x
MMRazor            0.x                    1.x
MMSelfSup          0.x                    1.x
MMRotate           1.x                    1.x
MMYOLO             -                      0.x

Attention: please create a new virtual environment for OpenMMLab 2.0.

Any demo script?

How can I visualize the final result? Is there a script that takes a single image and visualizes the prediction?

Training on custom dataset

Hi, I would like to ask what I need to change in the network files in order to train on a custom COCO-format dataset. I've changed every instance of num_classes, num_thing_classes and num_stuff_classes, and modified the config accordingly. The training epoch runs correctly, but I get the following error during validation:

File "/K-Net/knet/det/kernel_update_head.py", line 372, in _get_target_single
    mask_targets[pos_inds, ...] = pos_mask_targets
RuntimeError: shape mismatch: value tensor of shape [54, 168, 216] cannot be broadcast to indexing result of shape [54, 84, 108]
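
A quick arithmetic check on the two shapes in the message, written as a pure-Python sketch. The variable names are illustrative, not taken from the repository.

```python
value_shape = (54, 168, 216)   # "value tensor" from the error message
buffer_shape = (54, 84, 108)   # "indexing result" from the error message

# Compare the spatial dimensions only; the first dimension (54) matches.
ratios = [v / b for v, b in zip(value_shape[1:], buffer_shape[1:])]
uniform_integer_scale = len(set(ratios)) == 1 and ratios[0].is_integer()

print(ratios, uniform_integer_scale)
```

Both spatial dimensions differ by exactly a factor of 2, so the ground-truth masks appear to reach this line at twice the resolution of the buffer they are written into. A plausible place to look, as an assumption, is any setting that controls how ground-truth masks are rescaled relative to the predicted mask resolution in the custom config or pipeline.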

The segm mAP result is zero

I trained K-Net for instance segmentation on my own dataset and found that the segm mAP was always 0 and did not change as training went on for more epochs. My num_classes=1.

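How to train on my own dataset

@ZwwWayne Hello, thank you very much for providing this framework and code.
How can I fine-tune an existing model and train on my own dataset?
Also, what should be passed as partition?
Thank you very much!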

How to do postprocessing

The masks do not seem accurate: multiple masks can still be generated for the same object, which is not good. How can I suppress them in postprocessing?
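
One generic option is greedy suppression by mask IoU. The sketch below is not part of K-Net; it is a pure-Python illustration in which each mask is a set of (row, col) pixels, and the 0.5 threshold and function names are assumptions for the example.

```python
def mask_iou(a, b):
    """IoU of two masks, each given as a set of (row, col) pixels."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def suppress_duplicates(masks, scores, iou_thr=0.5):
    """Keep masks in descending score order, dropping near-duplicates."""
    order = sorted(range(len(masks)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep mask i only if it does not overlap too much with a kept mask.
        if all(mask_iou(masks[i], masks[j]) <= iou_thr for j in keep):
            keep.append(i)
    return keep


masks = [
    {(0, 0), (0, 1), (1, 0), (1, 1)},  # object A
    {(0, 0), (0, 1), (1, 0)},          # near-duplicate of A (IoU 0.75)
    {(5, 5), (5, 6)},                  # separate object B
]
scores = [0.9, 0.8, 0.7]

kept = suppress_duplicates(masks, scores)
print(kept)
```

With real predictions the same logic would run on binarized mask arrays, and raising the score threshold before suppression is another simple filter worth trying.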

An error occurred during training

Hi, I want to train K-Net on a dataset that contains 26 labels. After running the command below,

PYTHONPATH='.':$PYTHONPATH mim train mmseg $CONFIG $WORK_DIR

I get this error

TypeError: IterativeDecodeHead: KernelUpdateHead: __init__() got an unexpected keyword argument 'mask_upsample_stride'

thanks for your help 🙏🙏
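
This kind of TypeError means the config passes a key that the __init__ of the class being built does not accept, which can happen when the config and the installed code come from different versions (an assumption here). A small helper can list such keys before building. ToyHead and unexpected_keys are hypothetical names for this sketch, and the real KernelUpdateHead signature may differ.

```python
import inspect


class ToyHead:
    """Hypothetical head used only to illustrate the check."""

    def __init__(self, num_classes=150, in_channels=256):
        self.num_classes = num_classes
        self.in_channels = in_channels


def unexpected_keys(cls, cfg):
    """Return config keys that cls.__init__ does not list as parameters."""
    accepted = set(inspect.signature(cls.__init__).parameters) - {"self"}
    return sorted(k for k in cfg if k not in accepted)


cfg = dict(num_classes=26, mask_upsample_stride=2)
extra = unexpected_keys(ToyHead, cfg)
print(extra)
```

The check is only meaningful for constructors without a **kwargs catch-all. Running it against the actual head class shows which config keys need to be removed or which code version needs to be updated.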

About training

Hi, I have a small question about training. I use the MMDetection framework to train other models, but I do not use Slurm. Instead I use the following command:

bash ./tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM} [optional arguments]

Could you give an example of how to train the model with bash ./tools/dist_train.sh? Thank you very much.

Must batch_size be set to 2?

if cls_scores is None:
    detached_cls_scores = [None] * 2
else:
    detached_cls_scores = cls_scores.detach()

for i in range(num_imgs):
    assign_result = self.assigner.assign(scaled_mask_preds[i].detach(),
                                         detached_cls_scores[i],
                                         gt_masks[i], gt_labels[i],
                                         img_metas[i])

As can be seen from the code above, len(detached_cls_scores) = 2 when cls_scores is None.
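
The hard-coded length only matters on the branch where cls_scores is None. A pure-Python sketch of what happens on that branch with more than two images per batch, and of a version sized by num_imgs, follows. placeholder_scores is a hypothetical helper, and whether that branch is ever reached in practice depends on the rest of the head.

```python
def placeholder_scores(num_imgs, fixed_len=None):
    """Build the list of None placeholders used when cls_scores is None."""
    length = fixed_len if fixed_len is not None else num_imgs
    return [None] * length


num_imgs = 4

try:
    scores = placeholder_scores(num_imgs, fixed_len=2)  # mirrors [None] * 2
    _ = [scores[i] for i in range(num_imgs)]
    failed = False
except IndexError:
    failed = True

safe = placeholder_scores(num_imgs)  # mirrors [None] * num_imgs
print(failed, len(safe))
```

So with the quoted code, a batch of more than two images per GPU would raise an IndexError on that branch, while a list of length num_imgs would not.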
