Comments (11)
@gauravmunjal13 Hi. Thanks for your interest. (1) The video ids in the evaluation and in the saved results can differ, because the code that saves results was written by me, while the evaluation code is provided by the official dataset. I don't think they need to match, since we only need to visualize the results in order; for evaluation, you should follow the official dataset. (2) With our provided model, we do not get an AP of 0. I am not sure about the reason on your side. (3) Frames with None are skipped. If you want to save results for every image, you can set a lower threshold in the config file.
from sipmask.
Thanks, @JialeCao001 for your response!
It's true that frames whose segmentations are None are ignored when saving. However, even among the saved frames (those with segmentations), not all contain a visible object. When I plotted these segmentations, I found they are small. Are they being ignored because of their small size?
Continuing on that, I would like to know your thoughts on using SipMask for detecting very small objects.
Could you please point out where you are saving the frames/results? I came across the function show_results() in the inference.py code under mmdet/apis, but it doesn't save the results.
Meanwhile, I wrote a code snippet to plot and save results from the results.pkl.json output file so that I can filter results by score. The segmentations displayed by your code are correct; however, the ones plotted by my snippet are drifted (it looks like a size or scale issue). The same snippet displayed correctly with the results.pkl.json file from MaskTrack-RCNN. Does your output results file differ in some way from the one generated by MaskTrack-RCNN?
I am decoding the segmentations and applying them to the image as (with import numpy as np and from pycocotools import mask as maskUtils):
mask = maskUtils.decode(segm)
im_pred = apply_mask(im_pred, mask, (0.0, 1.0, 0.0)).astype(np.uint8)
where apply_mask() is:
def apply_mask(image, mask, color, alpha=0.5):
    # blend the given color into the image wherever mask == 1
    for c in range(3):
        image[:, :, c] = np.where(mask == 1,
                                  image[:, :, c] * (1 - alpha) + alpha * color[c] * 255,
                                  image[:, :, c])
    return image
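For reference, the snippet above can be exercised standalone with a toy image (numpy only; the maskUtils.decode step is replaced by a hand-made binary mask, and the sizes are illustrative):

```python
import numpy as np

def apply_mask(image, mask, color, alpha=0.5):
    # blend the given 0-1 color into the image wherever mask == 1
    for c in range(3):
        image[:, :, c] = np.where(mask == 1,
                                  image[:, :, c] * (1 - alpha) + alpha * color[c] * 255,
                                  image[:, :, c])
    return image

# toy 2x2 "image" and a diagonal binary mask standing in for maskUtils.decode(segm)
img = np.full((2, 2, 3), 100, dtype=np.float32)
mask = np.array([[1, 0], [0, 1]], dtype=np.uint8)
out = apply_mask(img, mask, (0.0, 1.0, 0.0))
```

Masked pixels are blended towards green (channel 1 rises, channels 0 and 2 are halved), while unmasked pixels keep their original value.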
Last, the Readme suggests that there are two versions of SipMask, High-Accuracy and Real-Time Fast. How do I know which one I am using, and how do I switch to the other?
@gauravmunjal13 Hi. I cannot follow everything you say, but let me try to answer your questions.
(1) I am not very sure whether mmdetection filters out small-scale objects. When I am free, I can check this.
(2) From our experiments, SipMask is more useful for large-scale objects.
(3) In the Readme, I describe how to save the results on YouTube-VIS. Please use the following command:
python tools/test_video.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] --eval segm --show --save_path=${SAVE_PATH}
The corresponding show_result function is in base.py:
https://github.com/JialeCao001/SipMask/blob/f7b035232ae3ff8d7171fd98b80efdbac926cbd6/SipMask-VIS/mmdet/models/detectors/base.py#L109
(4) For instance segmentation on images, there are two versions of SipMask. For video instance segmentation, we only provide SipMask-VIS.
Many thanks, @JialeCao001, for your comments!
They are really useful, and I really appreciate your help!
I may have figured out why my code for plotting and saving the results is not working properly, which may point to a discrepancy in the results.pkl.json file.
The input to the model is images and segmentations of size (512, 512). The model produces results.pkl.json, in which the segmentations are of size (512, 512), and the masks obtained after decoding them are also (512, 512). However, these segmentations appear drifted towards the top left and smaller in size, while they are correct in the results your code saves.
My analysis is that you are saving the images at size (360, 360) while the segmentations are of size (512, 512). Does this mean there is some discrepancy in producing the results file, or am I missing something?
Thanks!
@gauravmunjal13 Hi. If you save images with the following code, can you get the right results?
python tools/test_video.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] --eval segm --show --save_path=${SAVE_PATH}
@gauravmunjal13 Hi. If you save images with the following code, can you get the right results?
python tools/test_video.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] --eval segm --show --save_path=${SAVE_PATH}
Yes! But the resulting images are resized to (360, 360), while the input images were of size (512, 512).
So, if I take the results.pkl.json file and apply the segmentations to the input images, the result is not correct.
@gauravmunjal13 Okay, I see what you mean. The saved image is the same as the rescaled input image fed to the network, which may be different from the original image. When writing the json file, the code rescales the bounding boxes back.
Perhaps the rescaling is not happening correctly.
I followed the steps below to apply the segmentations from the results.pkl.json file to input images of size (512, 512); let me know if I am wrong. I used your code (the show_result() method in base.py) as the reference.
Steps:
First, I resized the input image from (512, 512) to (360, 360).
Second, the mask obtained from the results file is of size (512, 512), but I sliced it as mask = mask[:h, :w], where h and w are 360.
Applying this mask to the resized image gives the correct visualization results.
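For concreteness, a minimal numpy-only sketch of these steps (the function name and the toy sizes are illustrative, not from the repo; the mask is sliced rather than resized, exactly as in the steps above):

```python
import numpy as np

def overlay_cropped_mask(resized_img, full_mask, color=(0, 255, 0), alpha=0.5):
    # Workaround from the steps above: the decoded mask is larger than the
    # saved (rescaled) image, so slice it down to the image size.
    h, w = resized_img.shape[:2]
    mask = full_mask[:h, :w]
    out = resized_img.astype(np.float32)
    for c in range(3):
        out[:, :, c] = np.where(mask == 1,
                                out[:, :, c] * (1 - alpha) + alpha * color[c],
                                out[:, :, c])
    return out.astype(np.uint8)

# toy stand-ins: a 4x4 "resized image" and a 6x6 "decoded mask"
img = np.zeros((4, 4, 3), dtype=np.uint8)
mask = np.zeros((6, 6), dtype=np.uint8)
mask[:2, :2] = 1
res = overlay_cropped_mask(img, mask)
```

With real data, resized_img would be the (360, 360) image and full_mask the (512, 512) decoded segmentation.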
However, I still need to solve the evaluation problem, as the AP is 0. In the ytvos_eval() method in coco_utils.py, the detections are loaded from results.pkl.json as predictions, while the ground-truth annotations are loaded as ytvos from the input annotation file, at size (512, 512). Since the predicted segmentations may not be rescaled correctly to the original input size, the AP comes out as 0. What do you think?
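If that diagnosis is right, one possible fix would be to rescale each decoded prediction mask back to the annotation size before re-encoding it for evaluation. A nearest-neighbor sketch (my own helper, not code from the repo; real sizes would be (360, 360) up to (512, 512)):

```python
import numpy as np

def rescale_mask(mask, out_h, out_w):
    # Nearest-neighbor rescale of a binary mask via index mapping:
    # each output pixel copies the nearest source pixel.
    h, w = mask.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return mask[rows][:, cols]

# toy example: upscale a 3x3 mask with one foreground pixel to 6x6
m = np.zeros((3, 3), dtype=np.uint8)
m[0, 0] = 1
big = rescale_mask(m, 6, 6)
```

The single foreground pixel becomes a 2x2 block in the upscaled mask, preserving its relative position and area fraction.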
And many thanks @JialeCao001 for your support!
Hi @JialeCao001 ,
Let me know if I can provide more information or if anything is unclear in my explanation.
Thanks!
@gauravmunjal13 I am not sure about your problem. I do not get an mAP of 0 on the YouTube-VIS test set.
Hi @JialeCao001 ,
I think we can close this issue. As discussed over email, the following command produces the correct output (results.pkl.json):
python tools/test_video.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] --eval segm
But the following command does not produce the correct output file (results.pkl.json), in the sense that the annotations are not resized back to the original size:
python tools/test_video.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] --eval segm --show --save_path=${SAVE_PATH}
Thanks!