
[CVPR 2021] Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion. Semi-supervised VOS as well!

Home Page: https://hkchengrex.com/MiVOS/

License: MIT License

Python 95.13% C++ 1.88% Cuda 2.47% Cython 0.52%
computer-vision segmentation deep-learning pytorch cvpr2021 interactive-segmentation video-object-segmentation video-segmentation

mivos's Introduction

Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion (MiVOS)

Ho Kei Cheng, Yu-Wing Tai, Chi-Keung Tang

CVPR 2021

[arXiv] [Paper PDF] [Project Page] [Demo] [Papers with Code] [Supplementary Material]

Newer: check out our new work Cutie. It also includes an interactive GUI!

New: see the STCN branch for a better and faster version.

demo1 demo2 demo3

Credit (left to right): DAVIS 2017, Academy of Historical Fencing, Modern History TV

We manage the project using three different repositories, corresponding to the three components in the paper title. This is the main repo; see also Mask-Propagation and Scribble-to-Mask.

Overall structure and capabilities

|                                           | MiVOS | Mask-Propagation | Scribble-to-Mask |
| ----------------------------------------- | :---: | :--------------: | :--------------: |
| DAVIS/YouTube semi-supervised evaluation  |       | ✔️               |                  |
| DAVIS interactive evaluation              | ✔️    |                  |                  |
| User interaction GUI tool                 | ✔️    |                  |                  |
| Dense Correspondences                     |       | ✔️               |                  |
| Train propagation module                  |       | ✔️               |                  |
| Train S2M (interaction) module            |       |                  | ✔️               |
| Train fusion module                       | ✔️    |                  |                  |
| Generate more synthetic data              | ✔️    |                  |                  |

Framework

framework

Requirements

We used these packages/versions in the development of this project. Higher versions of the same packages will likely also work. This is not an exhaustive list -- other common Python packages (e.g., Pillow) are expected and not listed.

Refer to the official PyTorch guide for installing PyTorch/torchvision. The rest can be installed by:

pip install PyQt5 davisinteractive progressbar2 opencv-python networkx gitpython gdown Cython
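As a quick, optional sanity check after installation (a minimal sketch, not part of the repo), you can confirm that PyTorch/torchvision import correctly and that a GPU is visible:

# Optional sanity check (not part of the repo): verify the PyTorch install.
import torch
import torchvision

print("torch:", torch.__version__, "| torchvision:", torchvision.__version__)
print("CUDA available:", torch.cuda.is_available())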

Quick start

GUI

  1. python download_model.py to get all the required models.
  2. python interactive_gui.py --video <path to video> or python interactive_gui.py --images <path to a folder of images>. A video has been prepared for you at example/example.mp4.
  3. If you need to label more than one object, additionally specify --num_objects <number_of_objects>. See all the argument options with python interactive_gui.py --help.
  4. There are instructions in the GUI. You can also watch the demo videos for some ideas.

DAVIS Interactive VOS

See eval_interactive_davis.py. If you have downloaded the datasets and pretrained models using our script, you only need to specify the output path, i.e., python eval_interactive_davis.py --output [somewhere].

DAVIS/YouTube Semi-supervised VOS

Go to this repo: Mask-Propagation.

Main Results

DAVIS Interactive Track

All results are generated using the unmodified official DAVIS interactive bot without saving masks (--save_mask not specified) and with an RTX 2080Ti. We follow the official protocol.

Precomputed result, with the json summary: [Google Drive] [OneDrive]

Results from eval_interactive_davis.py:

| Model                                             | AUC-J&F | J&F @ 60s |
| ------------------------------------------------- | :-----: | :-------: |
| Baseline                                           | 86.0    | 86.6      |
| (+) Top-k                                          | 87.2    | 87.8      |
| (+) BL30K pretraining                              | 87.4    | 88.0      |
| (+) Learnable fusion                               | 87.6    | 88.2      |
| (+) Difference-aware fusion (full model)           | 87.9    | 88.5      |
| Full model, without BL30K for propagation/fusion   | 87.4    | 88.0      |
| Full model, STCN backbone                          | 88.4    | 88.8      |

Pretrained models

python download_model.py should get you all the models that you need. (pip install gdown required.)

[OneDrive Mirror]

Training

Data preparation

Datasets should be arranged in the following layout. You can use download_datasets.py (same as the one in Mask-Propagation) to get the DAVIS dataset, and manually download and extract fusion_data ([OneDrive]) and BL30K.

├── BL30K
├── DAVIS
│   └── 2017
│       ├── test-dev
│       │   ├── Annotations
│       │   └── ...
│       └── trainval
│           ├── Annotations
│           └── ...
├── fusion_data
└── MiVOS
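As an optional check of this layout, here is a minimal sketch (the relative paths are assumptions based on the tree above, with the script run from inside the MiVOS folder):

from pathlib import Path

# Assumed layout: the datasets are siblings of the MiVOS folder, as in the tree above.
root = Path("..")
expected = [
    "BL30K",
    "DAVIS/2017/trainval/Annotations",
    "DAVIS/2017/test-dev/Annotations",
    "fusion_data",
]
for rel in expected:
    print(f"{rel}: {'found' if (root / rel).exists() else 'MISSING'}")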

BL30K

BL30K is a synthetic dataset rendered using Blender with ShapeNet's data. We break the dataset into six segments, each with approximately 5K videos. The videos are organized in a similar format to DAVIS and YouTubeVOS, so dataloaders for those datasets can be used directly. Each video is 160 frames long, and each frame has a resolution of 768×512. There are 3-5 objects per video, and each object follows a random smooth trajectory -- we greedily optimized the trajectories to minimize object intersection (not guaranteed), so occlusions are still possible (and in practice happen often). See generation/blender/generate_yaml.py for details.

We found that using about half of the data is likely sufficient to reach full performance (although we still used all of it), while using less than one-sixth (~5K videos) is insufficient.

Download

You can either use the automatic script download_bl30k.py or download it manually below. Note that each segment is about 115 GB -- 700 GB in total. You will need ~1 TB of free disk space to run the script (including extraction buffer).

Google Drive is much faster in my experience. Your mileage might vary.
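Before starting the download, a rough free-space check can save a failed run. A minimal sketch (the ~1 TB threshold follows the note above and is only a rule of thumb):

import shutil

# Check free space on the current drive before running download_bl30k.py.
free_bytes = shutil.disk_usage(".").free
print(f"Free space: {free_bytes / 1e12:.2f} TB")
if free_bytes < 1e12:  # ~1 TB needed, including the extraction buffer
    print("Warning: less than ~1 TB free; the full BL30K download may not fit.")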

Manual download: [Google Drive] [OneDrive]

Note: Google might block the Google Drive link. You can 1) make a shortcut of the folder in your own Google Drive, and 2) use rclone to copy from your own Google Drive (this does not count towards your storage limit).

[UST Mirror] (Reliability not guaranteed, speed throttled, do not use if others are available): ckcpu1.cse.ust.hk:8080/MiVOS/BL30K_{a-f}.tar (Replace {a-f} with the part that you need).

MD5 Checksum:

35312550b9a75467b60e3b2be2ceac81  BL30K_a.tar
269e2f9ad34766b5f73fa117166c1731  BL30K_b.tar
a3f7c2a62028d0cda555f484200127b9  BL30K_c.tar
e659ed7c4e51f4c06326855f4aba8109  BL30K_d.tar
d704e86c5a6a9e920e5e84996c2e0858  BL30K_e.tar
bf73914d2888ad642bc01be60523caf6  BL30K_f.tar
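If you download the archives manually, you can verify them against the checksums above, for example with a short Python script (a sketch; it assumes the tar files sit in the current directory):

import hashlib

# Expected MD5 checksums, copied from the list above.
EXPECTED = {
    "BL30K_a.tar": "35312550b9a75467b60e3b2be2ceac81",
    "BL30K_b.tar": "269e2f9ad34766b5f73fa117166c1731",
    "BL30K_c.tar": "a3f7c2a62028d0cda555f484200127b9",
    "BL30K_d.tar": "e659ed7c4e51f4c06326855f4aba8109",
    "BL30K_e.tar": "d704e86c5a6a9e920e5e84996c2e0858",
    "BL30K_f.tar": "bf73914d2888ad642bc01be60523caf6",
}

def md5sum(path, chunk_size=1 << 20):
    # Stream the file in 1 MB chunks to avoid loading a ~115 GB tar into memory.
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

for name, expected in EXPECTED.items():
    print(name, "OK" if md5sum(name) == expected else "MISMATCH")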

Generation

  1. Download ShapeNet.
  2. Install Blender (we used 2.82).
  3. Download a set of background and texture images. We used this repo (we specified "non-commercial reuse" in the script); the list of keywords is provided in generation/blender/*.json.
  4. Generate a list of configuration files (generation/blender/generate_yaml.py).
  5. Run rendering on the configurations. See here (not documented in detail; ask if you have questions).

Fusion data

We use the propagation module to run through some data and obtain real outputs to train the fusion module. See the script generate_fusion.py.

Or you can download pre-generated fusion data: [Google Drive] [OneDrive]

Training commands

These commands are to train the fusion module only.

CUDA_VISIBLE_DEVICES=[a,b] OMP_NUM_THREADS=4 python -m torch.distributed.launch --master_port [cccc] --nproc_per_node=2 train.py --id [defg] --stage [h]

We implemented training with Distributed Data Parallel (DDP) with two 11GB GPUs. Replace a, b with the GPU ids, cccc with an unused port number, defg with a unique experiment identifier, and h with the training stage (0/1).

The model is trained progressively with different stages (0: BL30K; 1: DAVIS). After each stage finishes, we start the next stage by loading the trained weight. A pretrained propagation model is required to train the fusion module.

One concrete example is:

Pre-training on the BL30K dataset: CUDA_VISIBLE_DEVICES=0,1 OMP_NUM_THREADS=4 python -m torch.distributed.launch --master_port 7550 --nproc_per_node=2 train.py --load_prop saves/propagation_model.pth --stage 0 --id retrain_s0

Main training: CUDA_VISIBLE_DEVICES=0,1 OMP_NUM_THREADS=4 python -m torch.distributed.launch --master_port 7550 --nproc_per_node=2 train.py --load_prop saves/propagation_model.pth --stage 1 --id retrain_s012 --load_network [path_to_trained_s0.pth]

Credit

f-BRS: https://github.com/saic-vul/fbrs_interactive_segmentation

ivs-demo: https://github.com/seoungwugoh/ivs-demo

deeplab: https://github.com/VainF/DeepLabV3Plus-Pytorch

STM: https://github.com/seoungwugoh/STM

BlenderProc: https://github.com/DLR-RM/BlenderProc

Citation

Please cite our paper if you find this repo useful!

@inproceedings{cheng2021mivos,
  title={Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion},
  author={Cheng, Ho Kei and Tai, Yu-Wing and Tang, Chi-Keung},
  booktitle={CVPR},
  year={2021}
}

And if you want to cite the datasets:


@inproceedings{shi2015hierarchicalECSSD,
  title={Hierarchical image saliency detection on extended CSSD},
  author={Shi, Jianping and Yan, Qiong and Xu, Li and Jia, Jiaya},
  booktitle={TPAMI},
  year={2015},
}

@inproceedings{wang2017DUTS,
  title={Learning to Detect Salient Objects with Image-level Supervision},
  author={Wang, Lijun and Lu, Huchuan and Wang, Yifan and Feng, Mengyang and Wang, Dong and Yin, Baocai and Ruan, Xiang},
  booktitle={CVPR},
  year={2017}
}

@inproceedings{FSS1000,
  title = {FSS-1000: A 1000-Class Dataset for Few-Shot Segmentation},
  author = {Li, Xiang and Wei, Tianhan and Chen, Yau Pun and Tai, Yu-Wing and Tang, Chi-Keung},
  booktitle={CVPR},
  year={2020}
}

@inproceedings{zeng2019towardsHRSOD,
  title = {Towards High-Resolution Salient Object Detection},
  author = {Zeng, Yi and Zhang, Pingping and Zhang, Jianming and Lin, Zhe and Lu, Huchuan},
  booktitle = {ICCV},
  year = {2019}
}

@inproceedings{cheng2020cascadepsp,
  title={{CascadePSP}: Toward Class-Agnostic and Very High-Resolution Segmentation via Global and Local Refinement},
  author={Cheng, Ho Kei and Chung, Jihoon and Tai, Yu-Wing and Tang, Chi-Keung},
  booktitle={CVPR},
  year={2020}
}

@inproceedings{xu2018youtubeVOS,
  title={Youtube-vos: A large-scale video object segmentation benchmark},
  author={Xu, Ning and Yang, Linjie and Fan, Yuchen and Yue, Dingcheng and Liang, Yuchen and Yang, Jianchao and Huang, Thomas},
  booktitle = {ECCV},
  year={2018}
}

@inproceedings{perazzi2016benchmark,
  title={A benchmark dataset and evaluation methodology for video object segmentation},
  author={Perazzi, Federico and Pont-Tuset, Jordi and McWilliams, Brian and Van Gool, Luc and Gross, Markus and Sorkine-Hornung, Alexander},
  booktitle={CVPR},
  year={2016}
}

@inproceedings{denninger2019blenderproc,
  title={BlenderProc},
  author={Denninger, Maximilian and Sundermeyer, Martin and Winkelbauer, Dominik and Zidan, Youssef and Olefir, Dmitry and Elbadrawy, Mohamad and Lodhi, Ahsan and Katam, Harinandan},
  booktitle={arXiv:1911.01911},
  year={2019}
}

@inproceedings{shapenet2015,
  title       = {{ShapeNet: An Information-Rich 3D Model Repository}},
  author      = {Chang, Angel Xuan and Funkhouser, Thomas and Guibas, Leonidas and Hanrahan, Pat and Huang, Qixing and Li, Zimo and Savarese, Silvio and Savva, Manolis and Song, Shuran and Su, Hao and Xiao, Jianxiong and Yi, Li and Yu, Fisher},
  booktitle   = {arXiv:1512.03012},
  year        = {2015}
}

Contact: [email protected]


mivos's Issues

How to change label color

Hi, I found that the generated label is a red mask (R:128, G:0, B:0) on a black (R:0, G:0, B:0) background by default.

How do I set the color of the label with custom RGB parameters?

Thank you!

ratio = self.progress_num/self.progress_max ZeroDivisionError: division by zero

Thanks for sharing this great tool/project! Basically everything is working like a charm, but if I make too many propagation steps while running interactive_gui.py I get the following error message and all modifications are gone (on average it happens after 20-30 propagation steps).

[A: 3798.09, U: 3057.62]: Propagation started.
[A: 3799.22, U: 3057.62]: Propagation finished!
[A: 3804.20, U: 3062.58]: Interaction Scribble at frame 579.
[A: 3805.13, U: 3063.38]: Interaction Scribble at frame 579.
[A: 3806.62, U: 3064.74]: Interaction Scribble at frame 579.
[A: 3811.83, U: 3069.87]: Propagation started.
Traceback (most recent call last):
  File "interactive_gui.py", line 550, in on_run
    self.current_mask = self.processor.interact(self.interacted_mask, self.cursur, 
  File "/home/user/projects/MiVOS/inference_core.py", line 252, in interact
    total_cb(total_num)
  File "interactive_gui.py", line 540, in progress_total_cb
    self.progress_step_cb()
  File "interactive_gui.py", line 532, in progress_step_cb
    ratio = self.progress_num/self.progress_max
ZeroDivisionError: division by zero

It looks like, for some reason, self.progress_max (or total_num in inference_core.py) gets the value 0. Can I avoid this error by checking, e.g., if self.progress_max == 0: self.progress_max = 1, or something similar? Do you have a clue what causes the error?
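A minimal sketch of the guard suggested above (for illustration only; the function name is hypothetical and this is not the maintainer's fix):

def safe_ratio(progress_num: int, progress_max: int) -> float:
    # Guard against ZeroDivisionError when the reported total step count is 0.
    if progress_max == 0:
        return 1.0  # treat an empty propagation as already complete
    return progress_num / progress_max

print(safe_ratio(3, 0), safe_ratio(3, 10))  # 1.0 0.3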

Training the propagation module with 1 GPU

Hi.
I read your paper about MiVOS, and I think it is so great.
I am training this model, but I have a problem training on 1 GPU.
Could you please show me the command line to train the propagation model with 1 GPU?
Thank you very much; I look forward to hearing from you.

Processing a long, high-resolution video

Hello!
Thank you for the amazing framework!

I have an issue while processing a long, high-resolution video: I run out of GPU memory.
As I understand it, MiVOS tries to upload all images directly to the GPU, so if the video is too long or too high-resolution it cannot handle it.
Is there a way to fix this issue? Maybe modify the code to work with data chunks?

Thank you in advance!

How to generate "Fusion data"

Hello,

Great work and GitHub repositories!
I was curious about the fusion data generation: could you clarify the arguments (e.g., --separation, --range, ...) you used when generating the data with generate_fusion.py?

MiVOS/generate_fusion.py

Lines 24 to 36 in b1992e6

"""
Arguments loading
"""
parser = ArgumentParser()
parser.add_argument('--model', default='saves/propagation_model.pth')
parser.add_argument('--davis_root', default='../DAVIS/2017')
parser.add_argument('--bl_root', default='../BL30K')
parser.add_argument('--dataset', help='DAVIS/BL')
parser.add_argument('--output')
parser.add_argument('--separation', default=None, type=int)
parser.add_argument('--range', default=None, type=int)
parser.add_argument('--mem_freq', default=None, type=int)
parser.add_argument('--start', default=None, type=int)
parser.add_argument('--end', default=None, type=int)

Also, it seems that a link is missing for downloading the pre-generated fusion data in the README.md

Best!

Download datasets.

Could you provide a new OneDrive link? The previous one has expired. Thanks!

Can I run the GUI in google colab?

I constantly get errors like
error: subprocess-exited-with-error
or
error: metadata-generation-failed
when I try to run the GUI in Google Colab.
Can you provide a notebook that works for running the GUI? I would really appreciate it because I have tried everything to solve the issue and nothing has worked. Are you using Python 3.8?
Thank you!

errors

RuntimeError: "slow_conv_dilated<>" not implemented for 'BFloat16'(example.mp4)

Hello! I followed the Quick start instructions with these settings: python interactive_gui.py --video .\example\example.mp4
As I don't have a GPU, I changed the map location to 'CPU'. When I select the "click" radio button and click on the object to create the mask, a runtime error is thrown.
Could you give me some suggestions? Looking forward to your reply.

Getting "ValueError: Davis root folder must be named "DAVIS" Error when i try run eval_interactive_davis.py

Getting "ValueError: Davis root folder must be named "DAVIS" Error when i try run eval_interactive_davis.py

Traceback (most recent call last):
  File "/home/bereket/Desktop/IRCAD-Data/MiVOS/MiVOS-MiVOS-STCN/eval_interactive_davis.py", line 76, in <module>
    with DavisInteractiveSession(davis_root=davis_path+'/trainval', report_save_dir='../output', max_nb_interactions=8, max_time=8*30) as sess:
  File "/home/bereket/anaconda3/envs/ivos/lib/python3.9/site-packages/davisinteractive/session/session.py", line 89, in __enter__
    samples, max_t, max_i = self.connector.start_session(
  File "/home/bereket/anaconda3/envs/ivos/lib/python3.9/site-packages/davisinteractive/connector/local.py", line 29, in start_session
    self.service = EvaluationService(davis_root=davis_root)
  File "/home/bereket/anaconda3/envs/ivos/lib/python3.9/site-packages/davisinteractive/evaluation/service.py", line 27, in __init__
    self.davis = Davis(davis_root=davis_root)
  File "/home/bereket/anaconda3/envs/ivos/lib/python3.9/site-packages/davisinteractive/dataset/davis.py", line 93, in __init__
    raise ValueError('Davis root folder must be named "DAVIS"')
ValueError: Davis root folder must be named "DAVIS"

Temporal Information

Hi,
I am interested in your project and would like to go into detail on an aspect related to temporal information. Are you training your model on video datasets? Are you getting temporal information from the dataset? Or has your model been trained on single images, considering only spatial information?

Thank you so much.
Best,
Francesca

The segmentation output is not produced after clicking, scribbling, or other interactions in the GUI

I'm trying to create masks of clothes in a video.

After clicking or scribbling, the GUI does not stop, but it does not produce masking results.

Below are my installed modules; I'm using a Conda environment.
Thanks for helping me.

beautifulsoup4      4.12.2
certifi             2023.7.22
charset-normalizer  3.2.0
colorama            0.4.6
contourpy           1.1.0
cycler              0.11.0
Cython              3.0.0
filelock            3.12.2
fonttools           4.42.1
gdown               4.7.1
idna                3.4
importlib-resources 6.0.1
kiwisolver          1.4.4
matplotlib          3.5.0
mkl-fft             1.3.6
mkl-random          1.2.2
mkl-service         2.4.0
numpy               1.19.5
opencv-python       4.2.0.32
packaging           23.1
Pillow              10.0.0
pip                 23.2.1
progressbar2        4.2.0
pyparsing           3.0.9
PyQt5               5.15.9
PyQt5-Qt5           5.15.2
PyQt5-sip           12.12.2
PySocks             1.7.1
python-dateutil     2.8.2
python-utils        3.7.0
requests            2.31.0
scipy               1.10.1
setuptools          68.0.0
setuptools-scm      7.1.0
six                 1.16.0
soupsieve           2.4.1
tomli               2.0.1
torch               1.7.1
torchaudio          0.7.2
torchvision         0.8.2
tqdm                4.66.1
typing_extensions   4.7.1
urllib3             2.0.4
wheel               0.38.4
zipp                3.16.2

The folder name is Example, not Examples

A video has been prepared for you at examples/example.mp4

It's example/example.mp4. A very small problem, but in my case it occurred during the first launch, so it took me a while to figure out the root cause.

Thank you for amazing work!

question

Hello, how do I mark multiple objects?
When I press number keys to change the label on the GUI, there is no response at all. How can I do it?

about the code of top-k filtering

Thanks for sharing your work.
What is the meaning of
x_exp = torch.exp(values - values[:, 0]) ?
Why do we need to do values - values[:, 0]?
Looking forward to your reply.
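A standalone illustration, not the repository's code: subtracting the largest value before exponentiating is the usual numerical-stability trick for a softmax; it leaves the normalized weights unchanged but keeps exp() from overflowing for large scores.

import torch

# Hypothetical top-k scores, sorted in descending order so index 0 holds the maximum.
values = torch.tensor([[20.0, 18.0, 15.0]])
naive = torch.exp(values)                     # can overflow for large scores
stable = torch.exp(values - values[:, 0:1])   # largest entry becomes exp(0) = 1
print(naive / naive.sum())                    # same normalized weights...
print(stable / stable.sum())                  # ...but computed without huge intermediates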

static dataset in download_dataset.py

I noticed that there is a static dataset in download_dataset.py. Where is this static dataset used?

Also, in the README you say you use BL30K to train the fusion model, and BL30K is very large (~600 GB). So you used a ~600 GB dataset to pretrain the fusion model?

Modification on memory

I'm testing and modifying the frequency at which frames are put into memory ("mem_freq" in inference_core.py) and how they are managed.

I was wondering whether any modification to how the frames (keys and values) are arranged in memory is affected by training, and whether I would need to retrain the model with another mem_freq.

Also, I'm experimenting with a sliding-window memory (buffer), and was wondering if it could also be affected by training, and whether only modifying the code and running it on the DAVIS val dataset without retraining is OK.

Train a model to use interactive function to refine the pretrained model.

Hi there, thanks for your great work.

I am wondering: if I want to use your framework to train a model with this "interactive" setup to refine the pretrained model, how can I do that?

For example, I have existing videos and ground truth, and I would like to train a model using this available data. It looks like I can use it to train a propagation model; how about the fusion model and the mask propagation model?

Very appreciated!

About GUI tool.

Hi,

It is awkward, but our server does not support a GUI. Do you know of any solution?

Thanks!

A consultation on your used GPU

Dear author:
Could you please tell me if two 2080Ti GPUs are enough to reproduce your results? Thank you in advance! PS: If the answer is yes, could you please tell me roughly how much time it would take?

Training suddenly became very slow

I am retraining MiVOS, and training suddenly became very slow; I don't know what the reason is. Training was not this slow before.

Fine-tune guidance

Hi, I really love this work. I'm trying to fine-tune the downloaded models (from download_model.py) on another domain. I was wondering if you could help me with where to put the data and which command to run for training.

Thank you

Process killed

I tried MiVOS + STCN on a 1.5-minute 4K video that was downsampled to 480p, and the program crashed.

What are the steps to reformat/resample a 4K video to make it work with this tool?

Also, can this tool run on multiple GPUs?

attention heatmap

I want to generate attention heatmaps like the memory frames shown in the picture I posted. Which part of the code should I use?

image format

Hello! I have downloaded the training datasets, including the static images and DAVIS data, and I noticed that the images are in .jpg format. Also, interactive_gui.py seems to only support images in JPG format. Can we use images in other formats to train the model and test the GUI?

A possible bug?

Hi,

In fusion_dataset.py, there may be a bug.

# We need the second reference frame to be visible from the first
if not path.exists(path.join(video_path, first_ref, tar_obj, r+'.png')):
    continue
# We need the target object to exist
if path.exists(path.join(video_path, r, tar_obj, tar_frame)):
    src2_ref_options.append(r)

Should it be like this instead? Because tar_frame is a number like "00001".

if path.exists(path.join(video_path, r, tar_obj, tar_frame+'.png')):

Conflict between PyQt5 and opencv-python

I encountered a conflict between PyQt5 and opencv-python using anaconda on linux. The error message is:

QObject::moveToThread: Current thread (0x55ad844ec160) is not the object's thread (0x55ade70883e0).
Cannot move to target thread (0x55ad844ec160)

qt.qpa.plugin: Could not load the Qt platform plugin "xcb" in "$ANACONDA_HOME/envs/mivos/lib/python3.7/site-packages/cv2/qt/plugins" even though it was found.
This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.

I solved the issue by uninstalling opencv-python and replacing it with opencv-python-headless, as described in this post on the Qt forums. I could not spot any usage of opencv UI in this project, so this fix should not limit functionality. I hope this helps someone running into the same issue.

Missing dependencies: Cython, torchvision, PyQt5

In addition to the dependencies you mention,

  • installing davisinteractive raised a warning about not having Cython but did succeed; however, the code in fbrs/utils/cython obviously does not run.
  • running the interactive GUI required PyQt5; you may want to add it to the list of requirements (it is not installed with davisinteractive for me).
  • I also got an error at runtime about missing torchvision.

The working setup for a conda environment I used was

conda create -n mivos
conda activate mivos
conda install -y "cudatoolkit<11.2" -c conda-forge # 11.1.1
conda install -y pytorch torchvision -c pytorch # python 3.8.8 torch 1.8.0 torchvision 0.9.0
pip install davisinteractive progressbar2 opencv-python networkx gitpython gdown Cython

With this, the model runs for me (Ubuntu 20.04). Thought I should leave a note here so you can amend the README if you agree or at least help others.

By the way, very nicely packaged, all models should follow your example of download_models.py using gdown 😃

"prob1" bug in interactive_gui.on_undo()

Hi,

I think there is a typo in interactive_gui.py lines 616 and 626.
self.interacted_mask = self.processor.prob1[:, self.cursur].clone()

it should be "self.processor.prob" not "prob1"

When one tries to undo the first interaction on a frame it throws an error:

line 626, in on_undo self.interacted_mask = self.processor.prob1[:, self.main_gui.cursor].clone() AttributeError: 'InferenceCore' object has no attribute 'prob1'

OneDrive Link.

Could you provide a new OneDrive link? The previous one has expired. Thanks!

ModuleNotFoundError: No module named 'pyximport'

Hi @hkchengrex,

I'm trying to run the following command.

python interactive_gui.py --video example/example.mp4

But I get the following error.

codeboy@192 MiVOS % python interactive_gui.py --video example/example.mp4
Traceback (most recent call last):
  File "/Users/codeboy/MiVOS/interactive_gui.py", line 31, in <module>
    from interact.fbrs_controller import FBRSController
  File "/Users/codeboy/MiVOS/interact/fbrs_controller.py", line 2, in <module>
    from fbrs.controller import InteractiveController
  File "/Users/codeboy/MiVOS/fbrs/controller.py", line 6, in <module>
    from fbrs.inference.predictors import get_predictor
  File "/Users/codeboy/MiVOS/fbrs/inference/predictors/__init__.py", line 2, in <module>
    from .brs import InputBRSPredictor, FeatureBRSPredictor, HRNetFeatureBRSPredictor
  File "/Users/codeboy/MiVOS/fbrs/inference/predictors/brs.py", line 7, in <module>
    from fbrs.model.is_hrnet_model import DistMapsHRNetModel
  File "/Users/codeboy/MiVOS/fbrs/model/is_hrnet_model.py", line 4, in <module>
    from fbrs.model.ops import DistMaps
  File "/Users/codeboy/MiVOS/fbrs/model/ops.py", line 6, in <module>
    from fbrs.utils.cython import get_dist_maps
  File "/Users/codeboy/MiVOS/fbrs/utils/cython/__init__.py", line 2, in <module>
    from .dist_maps import get_dist_maps
  File "/Users/codeboy/MiVOS/fbrs/utils/cython/dist_maps.py", line 1, in <module>
    import pyximport; pyximport.install(pyximport=True, language_level=3)
ModuleNotFoundError: No module named 'pyximport'

How can this be fixed?

Regards
Rahul Bhalley

CPU profile 2 process throwing CUDA out of memory for one image with multiple items when propagate button is clicked

@hkchengrex
To replicate:

  • load only one image, of size 3024 by 4032, in the folder ./example/test_folder/
  • run the command: python interactive_gui.py --mem_profile 2 --images ./example/test_folder/ --resolution -1 --num_objects 4
  • click on one object to create an overlay of the first object (red)
  • select num keypad 2 and click a different object (to produce an overlay of a different color)
  • select num keypad 3 and click a different object (to produce an overlay of a different color)
  • select num keypad 3 and click a different object (to produce an overlay of a different color)
  • click "propagate"
    This throws an error; see the picture.

Even though I am only processing one image, clicking "Save" does what it is supposed to do (saves the overlay and mask). But clicking "Propagate" should not throw a CUDA error when --mem_profile is set to 2, right? It should not have used the GPU.

Has anyone met the following problem during the running of "interactive_gui.py"?

Traceback (most recent call last):
  File "interactive_gui.py", line 23, in <module>
    from PyQt5.QtWidgets import (QWidget, QApplication, QMainWindow, QComboBox, QGridLayout,
ImportError: /usr/lib/x86_64-linux-gnu/libQt5Core.so.5: version `Qt_5.15' not found (required by /home/fg/anaconda3/envs/MiVOS/lib/python3.7/site-packages/PyQt5/QtWidgets.abi3.so)

Changing Labels Doesn't Work

Hello, amazing application! The problem I'm having seems to be that changing labels doesn't work. I've tried both the number pad and number keys to no avail. Everything else is working as intended though.

ERROR about interactive_gui.py

Traceback (most recent call last):
  File "interactive_gui.py", line 34, in <module>
    from interact.fbrs_controller import FBRSController
  File "D:\Code\Pytorch\MiVOS-main\interact\fbrs_controller.py", line 2, in <module>
    from fbrs.controller import InteractiveController
  File "D:\Code\Pytorch\MiVOS-main\fbrs\controller.py", line 6, in <module>
    from fbrs.inference.predictors import get_predictor
  File "D:\Code\Pytorch\MiVOS-main\fbrs\inference\predictors\__init__.py", line 2, in <module>
    from .brs import InputBRSPredictor, FeatureBRSPredictor, HRNetFeatureBRSPredictor
  File "D:\Code\Pytorch\MiVOS-main\fbrs\inference\predictors\brs.py", line 7, in <module>
    from fbrs.model.is_hrnet_model import DistMapsHRNetModel
  File "D:\Code\Pytorch\MiVOS-main\fbrs\model\is_hrnet_model.py", line 4, in <module>
    from fbrs.model.ops import DistMaps
  File "D:\Code\Pytorch\MiVOS-main\fbrs\model\ops.py", line 6, in <module>
    from fbrs.utils.cython import get_dist_maps
  File "D:\Code\Pytorch\MiVOS-main\fbrs\utils\cython\__init__.py", line 2, in <module>
    from .dist_maps import get_dist_maps
  File "D:\Code\Pytorch\MiVOS-main\fbrs\utils\cython\dist_maps.py", line 3, in <module>
    from ._get_dist_maps import get_dist_maps
  File "D:\Softwares\Anaconda\envs\pytorch_17\lib\site-packages\pyximport\pyximport.py", line 462, in load_module
    language_level=self.language_level)
  File "D:\Softwares\Anaconda\envs\pytorch_17\lib\site-packages\pyximport\pyximport.py", line 231, in load_module
    raise exc.with_traceback(tb)
  File "D:\Softwares\Anaconda\envs\pytorch_17\lib\site-packages\pyximport\pyximport.py", line 215, in load_module
    inplace=build_inplace, language_level=language_level)
  File "D:\Softwares\Anaconda\envs\pytorch_17\lib\site-packages\pyximport\pyximport.py", line 191, in build_module
    reload_support=pyxargs.reload_support)
  File "D:\Softwares\Anaconda\envs\pytorch_17\lib\site-packages\pyximport\pyxbuild.py", line 102, in pyx_to_dll
    dist.run_commands()
  File "D:\Softwares\Anaconda\envs\pytorch_17\lib\distutils\dist.py", line 955, in run_commands
    self.run_command(cmd)
  File "D:\Softwares\Anaconda\envs\pytorch_17\lib\distutils\dist.py", line 974, in run_command
    cmd_obj.run()
  File "D:\Softwares\Anaconda\envs\pytorch_17\lib\site-packages\Cython\Distutils\old_build_ext.py", line 186, in run
    _build_ext.build_ext.run(self)
  File "D:\Softwares\Anaconda\envs\pytorch_17\lib\distutils\command\build_ext.py", line 339, in run
    self.build_extensions()
  File "D:\Softwares\Anaconda\envs\pytorch_17\lib\site-packages\Cython\Distutils\old_build_ext.py", line 195, in build_extensions
    _build_ext.build_ext.build_extensions(self)
  File "D:\Softwares\Anaconda\envs\pytorch_17\lib\distutils\command\build_ext.py", line 448, in build_extensions
    self._build_extensions_serial()
  File "D:\Softwares\Anaconda\envs\pytorch_17\lib\distutils\command\build_ext.py", line 473, in _build_extensions_serial
    self.build_extension(ext)
  File "D:\Softwares\Anaconda\envs\pytorch_17\lib\distutils\command\build_ext.py", line 533, in build_extension
    depends=ext.depends)
  File "D:\Softwares\Anaconda\envs\pytorch_17\lib\distutils\_msvccompiler.py", line 345, in compile
    self.initialize()
  File "D:\Softwares\Anaconda\envs\pytorch_17\lib\distutils\_msvccompiler.py", line 238, in initialize
    vc_env = _get_vc_env(plat_spec)
  File "D:\Softwares\Anaconda\envs\pytorch_17\lib\distutils\_msvccompiler.py", line 134, in _get_vc_env
    raise DistutilsPlatformError("Unable to find vcvarsall.bat")
ImportError: Building module fbrs.utils.cython._get_dist_maps failed: ['distutils.errors.DistutilsPlatformError: Unable to find vcvarsall.bat\n']

Some problems when training the fusion module

Hello, I encountered some problems when retraining the fusion model. Some key parameter guidelines for training the fusion module are not given in the repository. Can you provide them?
Specifically:
(1) generate_fusion.py: the parameter "separation" is not given

Can you provide the relevant parameter descriptions for fusion training and the commands to run, so that I can reproduce the results in your paper?

Also, when I try to train (python train.py), I run into a code error in fusion_dataset.py:
(1) Is there a mistake when you assign a value to self.vid_to_instance? It returns an error at:
self.videos = [v for v in self.videos if v in self.vid_to_instance] (line 60 in fusion_dataset.py)

Propagation failed

Dear author, this is great work.
However, when I use interactive_gui.py for mask propagation, after I manually scribble masks on some frames and click the propagate button, why does propagation not generate masks for the other frames normally?

Overlay and Mask files not equal to size of original input image.

@hkchengrex
Using one image larger than 1K resolution in one folder, with the command:
python interactive_gui.py --mem_profile 2 --images ./example/test_folder/

  • clicking on an object to produce the mask
  • click "save" to save the overlay and masks
  • Both the overlay and mask files are reduced to a fixed resolution of width 480px, height 640px

Q. Can we keep the size of the output files equal to the size of the original input image?
Q. Can we add a flag to choose between the current behavior and preserving the resolution of the input image?
