dbolya / yolact
A simple, fully convolutional model for real-time instance segmentation.
License: MIT License
Hi,
To handle stacked instances of the same object, I trained on my dataset as instructed, but some masks cover only part of the object instead of all of it. Do you know what causes this? Do you have any suggestions for fixing it?
Looking forward to your reply, thank you.
My GPU utilization is inconsistent, and the ETA is 18 days on one V100 GPU (p3.2xlarge) with a batch size of 12 and num-workers 8. Does this make sense?
Is there any explanation of the timer column, and is there a TensorBoard equivalent for viewing performance over time?
Thank you very much!
https://github.com/dbolya/yolact/blob/master/data/config.py#L30
Is this line wrong? I think it should be BGR.
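For anyone checking this: channel order only matters relative to how the image was loaded (OpenCV's imread returns BGR). A minimal sketch, with illustrative values that are an assumption and not copied from config.py, showing that reversing the tuple converts between the two conventions:

```python
# Illustrative per-channel means; the values are assumptions for this sketch.
MEANS_BGR = (103.94, 116.78, 123.68)


def swap_channel_order(values):
    """Reverse a 3-tuple of per-channel values (BGR <-> RGB)."""
    return tuple(reversed(values))


MEANS_RGB = swap_channel_order(MEANS_BGR)
print(MEANS_RGB)
```

Whichever order the constant uses, it has to match the order of the image array it is subtracted from.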
I think there is something incompatible with Windows in yolact.py.
Running eval.py gives a CUDA unknown error, located at 'torch.set_default_tensor_type('torch.cuda.FloatTensor')'. It looks like CUDA initialization fails.
I tried moving 'torch.set_default_tensor_type('torch.cuda.FloatTensor')' from the top of the file to further down, like this:
#try1
import torch
torch.set_default_tensor_type('torch.cuda.FloatTensor')
from data import COCODetection, get_label_map, MEANS, COLORS
...
#try2
import torch
...
torch.set_default_tensor_type('torch.cuda.FloatTensor')
from yolact import Yolact
...
I found that putting it before 'from yolact import Yolact' works; otherwise it fails.
So now the beginning of yolact.py reads as follows:
import torch
torch.set_default_tensor_type('torch.cuda.FloatTensor')
from data import COCODetection, get_label_map, MEANS, COLORS
...
Hello
I am facing a small issue.
I am trying to retrain the model on the Pascal VOC 2012 dataset.
I took the COCO-style annotations from this source:
https://github.com/facebookresearch/multipathnet
Then I followed the instructions for modifying config.py.
But when I call: python train.py --config=yolact_base_config
I receive the following error:
KeyError: 'Traceback (most recent call last):\n File "/home/smile/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop\n samples = collate_fn([dataset[i] for i in batch_indices])\n File "/home/smile/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 138, in <listcomp>\n samples = collate_fn([dataset[i] for i in batch_indices])\n File "/hdd1/prog/yolact/data/coco.py", line 88, in __getitem__\n im, gt, masks, h, w, num_crowds = self.pull_item(index)\n File "/hdd1/prog/yolact/data/coco.py", line 145, in pull_item\n target = self.target_transform(target, width, height)\n File "/hdd1/prog/yolact/data/coco.py", line 39, in __call__\n label_idx = self.label_map[obj[\'category_id\']] - 1\nKeyError: 12\n'
The error is not clear to me.
Here is what I did. I created a new dataset:
PASCAL_VOC_CLASSES = ("aeroplane", "bicycle", "bird", "boat", "bottle",
"bus", "car", "cat", "chair", "cow", "diningtable",
"dog", "horse", "motorbike", "person", "pottedplant",
"sheep", "sofa", "train", "tvmonitor")
PASCAL_VOC_LABEL_MAP = { 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8,
9: 9, 10: 10, 11: 11, 13: 12, 14: 13, 15: 14, 16: 15, 17: 16,
18: 17, 19: 18, 20: 19, 21: 20}
pascalvoc2012_dataset = dataset_base.copy({
'name': 'PASCAL VOC 2012',
'train_images':'/media/smile/45C142AD782A7053/Datasets/PASCAL_VOC/VOC2012/VOCdevkit/VOC2012/JPEGImages/',
'train_info':'/home/smile/multipathnet/data/annotations/pascal_train2012.json',
'valid_images':'/media/smile/45C142AD782A7053/Datasets/PASCAL_VOC/VOC2012/VOCdevkit/VOC2012/JPEGImages/',
'valid_info':'/home/smile/multipathnet/data/annotations/pascal_val2012.json',
'label_map': PASCAL_VOC_LABEL_MAP
})
I created a new base config that only references the dataset I created above, with the proper number of classes:
pascalvoc_base_config = Config({
'dataset': pascalvoc2012_dataset,
'num_classes': 21, # This should include the background class
...
All the other fields are left untouched.
Finally I adapted yolact_base_config:
#yolact_base_config = coco_base_config.copy({
yolact_base_config = pascalvoc_base_config.copy({
'name': 'yolact_base',
# Dataset stuff
# 'dataset': coco2017_dataset,
# 'num_classes': len(coco2017_dataset.class_names) + 1,
'dataset': pascalvoc2012_dataset,
'num_classes': len(pascalvoc2012_dataset.class_names) + 1,
Here, too, all the other fields are left untouched.
EDIT
After applying the modifications discussed here, the dataset configuration for training on Pascal VOC is:
MEANS_PV = (103.17, 111.70, 116.69)
STD_PV = (61.11, 59.89, 61.00)
PASCAL_VOC_CLASSES = ("aeroplane", "bicycle", "bird", "boat", "bottle",
"bus", "car", "cat", "chair", "cow", "diningtable",
"dog", "horse", "motorbike", "person", "pottedplant",
"sheep", "sofa", "train", "tvmonitor")
PASCAL_VOC_LABEL_MAP = { 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8,
9: 9, 10: 10, 11: 11, 12: 12, 13: 13, 14: 14, 15: 15, 16: 16,
17: 17, 18: 18, 19: 19, 20: 20}
pascalvoc2012_dataset = dataset_base.copy({
'name': 'PASCAL VOC 2012',
'train_images':'/media/smile/45C142AD782A7053/Datasets/PASCAL_VOC/VOC2012/VOCdevkit/VOC2012/JPEGImages/',
'train_info':'/home/smile/multipathnet/data/annotations/pascal_train2012.json',
'valid_images':'/media/smile/45C142AD782A7053/Datasets/PASCAL_VOC/VOC2012/VOCdevkit/VOC2012/JPEGImages/',
'valid_info':'/home/smile/multipathnet/data/annotations/pascal_val2012.json',
'label_map': PASCAL_VOC_LABEL_MAP,
'class_names': PASCAL_VOC_CLASSES,
})
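For anyone hitting the same KeyError: it means a category_id in the annotation file has no entry in label_map. A minimal sketch, assuming a COCO-style JSON with "categories" and "annotations" lists (the miniature dict below is made up for illustration), that derives the map from the file and reports ids a hand-written map misses:

```python
def build_label_map(categories):
    """Map each category id in the file to a contiguous 1-based label."""
    ids = sorted(cat["id"] for cat in categories)
    return {cat_id: idx + 1 for idx, cat_id in enumerate(ids)}


def find_unmapped_ids(annotations, label_map):
    """Return category ids used by annotations but missing from label_map."""
    return sorted({ann["category_id"] for ann in annotations} - set(label_map))


# Hypothetical miniature annotation file, for illustration only.
coco = {
    "categories": [{"id": i, "name": f"class{i}"} for i in range(1, 21)],
    "annotations": [{"category_id": 12}, {"category_id": 3}],
}

# The first attempt above: keys 1..11 and 13..21, so 12 has no entry.
first_attempt = {k: v for v, k in enumerate([*range(1, 12), *range(13, 22)], start=1)}

print(find_unmapped_ids(coco["annotations"], first_attempt))
print(build_label_map(coco["categories"]))
```

With a real file, the dict would come from json.load on the annotation path instead of the inline example.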
Hi, thanks a lot for your fantastic work! In your paper, you produce the 'mask coefficients'
using fc layers, but in your code I found that they are produced by a conv layer. Can you tell me which kind of layer you use for producing the 'mask coefficients'? Thanks for your reply!
I found the fastnms function in the SSD code, so what's the difference between YOLACT and SSD?
First of all, thanks for sharing the amazing work!
Following the instructions, I set up the environment and can run the code successfully. However, when running eval.py, inference is slower than expected.
For the ResNet101-FPN model, testing on the COCO validation set gives about 9 FPS, and testing on my own Kinect images (640*480), with plotting and saving disabled, gives about 14 FPS.
My environment is: GTX 1080, CUDA 8.0, cudatoolkit 8.0. I am using Anaconda, and GPU support is checked via
torch.cuda.is_available()
I am new to PyTorch, so I am wondering whether some configuration or dependency is missing.
Thanks!
hi dbolya,
Could you upload your model to Google Drive or another host? The URL provided by ucdavis is not accessible.
Thanks
Hello
I am trying to retrain YOLACT on Pascal Part, a variation of Pascal VOC where each class has many sub-classes.
To simplify things, I made every sub-class a class in addition to the 20 original ones, which gives me a set of 316 classes.
I generated three JSON files for each case.
When I start training I encounter the following error:
RuntimeError: cannot perform reduction function max on tensor with no elements because the operation does not have an identity
This happens here:
losses = criterion(out, wrapper, wrapper.make_mask())
in train.py around line 262 (I added some prints to my file, so my line number is different).
Here:
eriklindernoren/PyTorch-YOLOv3#110
I read that it might be a path issue; however, I rechecked and the image paths are correct.
Also, I am able to train on Pascal VOC using the same image paths without issues.
I tried to investigate the forward
method of the loss function, looking for an empty tensor, but I did not find any.
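One thing that may be worth ruling out (an assumption, not a confirmed diagnosis): an image whose annotations are missing, or contain only zero-area boxes, could leave the loss with nothing to reduce over. A sketch for scanning a COCO-style file, assuming bboxes are stored as [x, y, width, height]; the sample dict is made up:

```python
from collections import defaultdict


def images_without_valid_boxes(coco):
    """Return ids of images that have no annotation with a positive-area bbox."""
    valid = defaultdict(int)
    for ann in coco["annotations"]:
        _, _, w, h = ann["bbox"]
        if w > 0 and h > 0:
            valid[ann["image_id"]] += 1
    return [img["id"] for img in coco["images"] if valid[img["id"]] == 0]


# Hypothetical sample: image 2 has only a zero-width box, image 3 has none.
sample = {
    "images": [{"id": 1}, {"id": 2}, {"id": 3}],
    "annotations": [
        {"image_id": 1, "bbox": [0, 0, 10, 10]},
        {"image_id": 2, "bbox": [5, 5, 0, 4]},
    ],
}
print(images_without_valid_boxes(sample))
```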
Memory is 12G, but only 8G is used.
python train.py --config=yolact_base_config --batch_size=5
loading annotations into memory...
Done (t=0.06s)
creating index...
index created!
loading annotations into memory...
Done (t=0.02s)
creating index...
index created!
Initializing weights...
Begin training!
[ 0] 0 || B: 8.264 | C: 14.452 | M: 14.870 | S: 3.010 | T: 40.595 || ETA: 0:00:00 || timer: 12.147
[ 0] 10 || B: 9.251 | C: 9.149 | M: 7.010 | S: 2.204 | T: 27.615 || ETA: 0:57:52 || timer: 0.445
[ 0] 20 || B: 8.156 | C: 7.494 | M: 6.613 | S: 1.537 | T: 23.800 || ETA: 1:00:17 || timer: 0.441
[ 0] 30 || B: 8.053 | C: 6.515 | M: 6.317 | S: 1.206 | T: 22.091 || ETA: 1:08:55 || timer: 0.437
[ 0] 40 || B: 7.631 | C: 5.865 | M: 6.203 | S: 0.981 | T: 20.680 || ETA: 1:22:37 || timer: 0.428
[ 0] 50 || B: 7.558 | C: 5.397 | M: 6.149 | S: 0.845 | T: 19.949 || ETA: 1:20:02 || timer: 0.432
Traceback (most recent call last):
File "train.py", line 374, in <module>
train()
File "train.py", line 211, in train
for datum in data_loader:
File "/home/chase/anaconda3/envs/maskrcnn_benchmark1/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 637, in __next__
return self._process_next_batch(batch)
File "/home/chase/anaconda3/envs/maskrcnn_benchmark1/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 658, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
MemoryError: Traceback (most recent call last):
File "/home/chase/anaconda3/envs/maskrcnn_benchmark1/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/chase/anaconda3/envs/maskrcnn_benchmark1/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 138, in <listcomp>
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/chase/yolact/data/coco.py", line 88, in __getitem__
im, gt, masks, h, w, num_crowds = self.pull_item(index)
File "/home/chase/yolact/data/coco.py", line 151, in pull_item
{'num_crowds': num_crowds, 'labels': target[:, 4]})
File "/home/chase/yolact/utils/augmentations.py", line 658, in __call__
return self.augment(img, masks, boxes, labels)
File "/home/chase/yolact/utils/augmentations.py", line 54, in __call__
img, masks, boxes, labels = t(img, masks, boxes, labels)
File "/home/chase/yolact/utils/augmentations.py", line 380, in __call__
current_masks = masks[mask, :, :].copy()
MemoryError
First of all, thank you for your outstanding contribution. Secondly, I would like to ask how your algorithm performs on mobile devices with limited compute and memory. Could you give me some suggestions? Thank you so much!
When I run:
python eval.py --trained_model=weights/yolact_base_54_800000.pth --dataset=coco2017_dataset
It only evaluates 4952 images. Any ideas on why it doesn't go through all 5000 images in ./data/coco/images/ ?
The image folder has 5000 images and the annotations_val2017.json file has annotations for those images.
What do I need to change so that it evaluates the complete set of images? (5k)
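A possible explanation, offered as an assumption and not a confirmed answer: the loader may only iterate over images that appear in at least one annotation, which would be consistent with 48 of the 5000 images having no annotated objects. A sketch for checking that count on a COCO-style file (the sample dict is made up):

```python
def count_annotated_images(coco):
    """Return (images listed, images that appear in at least one annotation)."""
    listed = {img["id"] for img in coco["images"]}
    annotated = {ann["image_id"] for ann in coco["annotations"]} & listed
    return len(listed), len(annotated)


# Hypothetical sample: five images, but only three of them carry annotations.
sample = {
    "images": [{"id": i} for i in range(1, 6)],
    "annotations": [{"image_id": 1}, {"image_id": 1}, {"image_id": 2}, {"image_id": 4}],
}
print(count_annotated_images(sample))
```

If the second number comes out as 4952 on the real file, the evaluation is already covering every image that has ground truth.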
Hello!
I trained this model on my own dataset, but it fails in the mAP evaluation phase. Does anyone have the same problem?
(tensorflow) root@gpuserver:/home/gpuserver/models/yolact# python train.py --config=yolact_base_config
loading annotations into memory...
Done (t=0.03s)
creating index...
index created!
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
Initializing weights...
Begin training!
/root/anaconda3/envs/tensorflow/lib/python3.6/site-packages/torch/nn/parallel/_functions.py:61: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector.
warnings.warn('Was asked to gather along dimension 0, but all '
[ 0] 0 || B: 5.480 | C: 23.075 | M: 5.976 | S: 67.004 | T: 101.536 || ETA: 0:00:00 || timer: 23.377
[ 0] 10 || B: 4.757 | C: 18.774 | M: 5.625 | S: 47.000 | T: 76.155 || ETA: 11 days, 7:25:02 || timer: 1.176
[ 0] 20 || B: 4.587 | C: 15.804 | M: 5.362 | S: 29.147 | T: 54.900 || ETA: 11 days, 7:50:00 || timer: 1.180
[ 0] 30 || B: 4.582 | C: 13.355 | M: 5.309 | S: 19.954 | T: 43.199 || ETA: 11 days, 6:14:29 || timer: 1.272
[ 0] 40 || B: 4.553 | C: 11.175 | M: 5.306 | S: 15.150 | T: 36.183 || ETA: 11 days, 6:35:18 || timer: 1.266
[ 0] 50 || B: 4.497 | C: 9.617 | M: 5.303 | S: 12.227 | T: 31.645 || ETA: 11 days, 6:12:37 || timer: 1.120
[ 0] 60 || B: 4.433 | C: 8.514 | M: 5.290 | S: 10.265 | T: 28.503 || ETA: 11 days, 4:48:22 || timer: 1.166
[ 1] 70 || B: 4.383 | C: 7.700 | M: 5.304 | S: 8.850 | T: 26.237 || ETA: 11 days, 7:53:19 || timer: 1.236
[ 1] 80 || B: 4.339 | C: 7.073 | M: 5.269 | S: 7.781 | T: 24.464 || ETA: 11 days, 6:55:00 || timer: 1.173
[ 1] 90 || B: 4.294 | C: 6.585 | M: 5.250 | S: 6.945 | T: 23.074 || ETA: 11 days, 6:22:39 || timer: 1.217
[ 1] 100 || B: 4.235 | C: 6.015 | M: 5.230 | S: 5.666 | T: 21.147 || ETA: 11 days, 5:30:35 || timer: 1.259
[ 1] 110 || B: 4.131 | C: 4.426 | M: 5.184 | S: 1.178 | T: 14.920 || ETA: 11 days, 4:57:38 || timer: 1.177
[ 1] 120 || B: 4.045 | C: 3.427 | M: 5.202 | S: 0.242 | T: 12.915 || ETA: 11 days, 4:43:12 || timer: 1.214
[ 2] 130 || B: 3.926 | C: 2.860 | M: 5.195 | S: 0.192 | T: 12.174 || ETA: 11 days, 5:57:53 || timer: 2.714
[ 2] 140 || B: 3.817 | C: 2.654 | M: 5.138 | S: 0.180 | T: 11.789 || ETA: 11 days, 5:43:35 || timer: 1.230
[ 2] 150 || B: 3.694 | C: 2.571 | M: 5.045 | S: 0.170 | T: 11.480 || ETA: 11 days, 5:23:29 || timer: 1.217
[ 2] 160 || B: 3.617 | C: 2.516 | M: 4.966 | S: 0.158 | T: 11.256 || ETA: 11 days, 5:37:45 || timer: 1.277
[ 2] 170 || B: 3.540 | C: 2.467 | M: 4.876 | S: 0.149 | T: 11.031 || ETA: 11 days, 5:16:56 || timer: 1.222
[ 2] 180 || B: 3.440 | C: 2.419 | M: 4.831 | S: 0.141 | T: 10.831 || ETA: 11 days, 4:54:42 || timer: 1.176
[ 2] 190 || B: 3.342 | C: 2.364 | M: 4.716 | S: 0.135 | T: 10.558 || ETA: 11 days, 4:41:58 || timer: 1.187
Computing validation mAP (this may take a while)...
Traceback (most recent call last):
File "train.py", line 374, in <module>
train()
File "train.py", line 303, in train
compute_validation_map(yolact_net, val_dataset)
File "train.py", line 367, in compute_validation_map
eval_script.evaluate(yolact_net, dataset, train_mode=True)
File "/home/gpuserver/models/yolact/eval.py", line 791, in evaluate
prep_metrics(ap_data, preds, img, gt, gt_masks, h, w, num_crowd, dataset.ids[image_idx], detections)
File "/home/gpuserver/models/yolact/eval.py", line 401, in prep_metrics
ap_obj = ap_data[iou_type][iouIdx][_class]
IndexError: list index out of range
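An IndexError at this spot suggests a predicted or ground-truth class index larger than the per-class list built for evaluation. That could happen if class_names, label_map and num_classes disagree (an assumption; the traceback alone does not confirm it). A small sketch of a consistency check, where the function name and messages are made up for illustration:

```python
def check_class_config(class_names, label_map, num_classes):
    """Return a list of problems; an empty list means the three settings agree."""
    problems = []
    expected = len(class_names) + 1  # +1 for the background class
    if num_classes != expected:
        problems.append(f"num_classes is {num_classes}, expected {expected}")
    bad = sorted(v for v in label_map.values() if not 1 <= v <= len(class_names))
    if bad:
        problems.append(f"label_map values outside 1..{len(class_names)}: {bad}")
    return problems


# Hypothetical two-class dataset.
print(check_class_config(("cat", "dog"), {1: 1, 2: 2}, 3))
print(check_class_config(("cat", "dog"), {1: 1, 5: 3}, 21))
```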
yolact/layers/functions/detection.py
Line 189 in d8ddaa1
I think there is a mistake in the implementation of traditional_nms. After NMS finishes, don't the boxes need to be rescaled into [0, 1]?
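For reference while reading that function, here is a plain-Python sketch of greedy (traditional) NMS. It is a generic illustration, not the repository's implementation. With the IoU defined below (no +1 pixel offset), the kept indices are the same whether boxes are in pixels or scaled to [0, 1], as long as all boxes share one scale.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes_px = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
boxes_unit = [tuple(v / 100 for v in b) for b in boxes_px]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes_px, scores), greedy_nms(boxes_unit, scores))
```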
Hi, I am trying to run eval.py and I am getting an average FPS of about 8.54. I want to increase the FPS. Is there any way for eval.py to use multiple GPUs?
Thanks.
Hi, thanks for your work. I have been trying to train the net on my custom dataset and hit an issue that I find hard to debug by myself. Here is my problem. Thanks again for your help.
[ 2] 2930 || B: 3.808 | C: 2.416 | M: 4.821 | S: 0.049 | T: 11.094 || ETA: 4 days, 14:47:05 || timer: 0.478
[ 2] 2940 || B: 3.795 | C: 2.418 | M: 4.838 | S: 0.049 | T: 11.101 || ETA: 4 days, 14:47:10 || timer: 0.497
[ 2] 2950 || B: 3.787 | C: 2.421 | M: 4.812 | S: 0.049 | T: 11.069 || ETA: 4 days, 14:49:01 || timer: 0.474
[ 2] 2960 || B: 3.778 | C: 2.422 | M: 4.846 | S: 0.049 | T: 11.095 || ETA: 4 days, 14:49:52 || timer: 0.512
[ 2] 2970 || B: 3.748 | C: 2.419 | M: 4.846 | S: 0.048 | T: 11.061 || ETA: 4 days, 14:49:04 || timer: 0.491
Computing validation mAP (this may take a while)...
Traceback (most recent call last):
File "train.py", line 377, in <module>
train()
File "train.py", line 300, in train
compute_validation_map(yolact_net, val_dataset)
File "train.py", line 370, in compute_validation_map
eval_script.evaluate(yolact_net, dataset, train_mode=True)
File "/data/pancreas/root/yolact-master/eval.py", line 869, in evaluate
prep_metrics(ap_data, preds, img, gt, gt_masks, h, w, num_crowd, dataset.ids[image_idx], detections)
File "/data/pancreas/root/yolact-master/eval.py", line 433, in prep_metrics
ap_obj = ap_data[iou_type][iouIdx][_class]
IndexError: list index out of range
Hi sir.
I want to see the data flow to understand this paper. However, I have never used torch. Could you send me a graph logdir generated by tensorboardX? Thank you in advance.
Hi,
I tried to train a model with a custom dataset and the resnet101 backbone. I noticed that while half of the bounding boxes looked accurate, the masks were completely off. I drew the annotations and verified that they are correct.
It could be due to the size of the dataset: 1357 images and 21 classes. I would like to use yolact_im700_54_80000.pth
and fine-tune it with my custom classes to see if this improves my results. What would be the steps to do this?
Dear Sir:
I have some trouble understanding your cluster_bbox_sizes.py, optimize_bboxes.py and bbox_recall.py. I would really like to use them to set the parameters scales, aspect_ratios and conv_sizes more reasonably.
Could you please explain a little of what these mean? Thanks a lot!
I used the default parameters from yolact_base.cfg and tested the scripts on a dataset:
scales = [ [24],[48],[96],[192],[384] ]
aspect_ratios = [ [[1, 1/sqrt(2), sqrt(2)]] ]*5
conv_sizes = [(69, 69), (35, 35), (18, 18), (9, 9),(5,5)]
here are the results:
from: cluster
`0.062 (18) aspect ratios:
17.71 (8)
5.23 (8)
109.76 (2)
0.146 (70) aspect ratios:
4.39 (34)
2.26 (30)
0.65 (6)
0.241 (125) aspect ratios:
1.12 (103)
0.23 (21)
0.00 (1)
`
from optimize_bbox:
`(Iteration 9) Aspect Ratios: [[[19.03, 0.55, 1.13]], [[13.94, 13.64, 14.24]], [[13.94, 13.64, 14.24]], [[13.94, 13.64, 14.24]], [[13.94, 13.64, 14.24]]]
scales = [[17.53], [60.94], [108.94], [204.94], [396.94]]
aspect_ratios = [[[19.03, 0.55, 1.13]], [[13.94, 13.64, 14.24]], [[13.94, 13.64, 14.24]], [[13.94, 13.64, 14.24]], [[13.94, 13.64, 14.24]]]
`
from bbox_recall:
`Total recall: 33.80
small recall: 0.00
medium recall: 0.00
large recall: 46.75
`
Thanks a lot! It's a bit hard for me >o<
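In case it helps to see what scales and aspect_ratios translate to, here is a sketch of one common anchor convention, w = s * sqrt(ar) and h = s / sqrt(ar), which keeps the anchor area at s squared. Whether the repository applies the square root this way depends on its config, so treat the formula as an assumption; the scales and ratios are the ones quoted above.

```python
from math import sqrt


def anchor_sizes(scale, aspect_ratios):
    """Return (width, height) pairs in pixels for one scale.

    Assumes w = s * sqrt(ar) and h = s / sqrt(ar), so w / h == ar
    and w * h == s ** 2 for every aspect ratio.
    """
    return [(scale * sqrt(ar), scale / sqrt(ar)) for ar in aspect_ratios]


scales = [24, 48, 96, 192, 384]
ratios = [1, 1 / sqrt(2), sqrt(2)]
for s in scales:
    print(s, [(round(w, 1), round(h, 1)) for w, h in anchor_sizes(s, ratios)])
```

Under this convention, a learned aspect ratio around 14 or 19 would describe extremely wide, flat anchors.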
Hello, could you please show me how to use the scripts in the "root/web" subdirectory?
Hi, dbolya.
Thanks for your work. I tried to reproduce the performance with the ResNet50 pre-trained model using the command 'python train.py --config=yolact_resnet50_config'. While training, I found that it needs about 30 days to finish, which is too long. Then I set batch_size = 32 because I have 8 GPUs, but the ETA stayed the same: the total training time was still about 30 days.
Did I do anything wrong, or is the training time really this long? How can I use multiple GPUs to accelerate training?
Thanks!
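A rough way to reason about these ETAs, assuming the log computes remaining iterations times seconds per iteration (an assumption about train.py): a larger batch size alone does not shorten training unless the iteration count is lowered to match. The 800k iterations and per-GPU batch size of 8 are figures quoted elsewhere in this thread; 1.2 s per iteration is illustrative.

```python
from datetime import timedelta


def training_eta(max_iter, current_iter, seconds_per_iter):
    """Remaining wall-clock time, assuming a constant time per iteration."""
    return timedelta(seconds=(max_iter - current_iter) * seconds_per_iter)


def iterations_for_same_samples(max_iter, old_batch, new_batch):
    """Iterations needed to process the same number of samples at a new batch size."""
    return max_iter * old_batch // new_batch


print(training_eta(800_000, 0, 1.2))
fewer = iterations_for_same_samples(800_000, 8, 32)
print(fewer, training_eta(fewer, 0, 1.2))
```

The second ETA only holds if the time per iteration stays the same at the larger batch size, which needs the extra GPUs to actually be used.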
Hello, I'm trying to run eval.py, but got an error.
The error message is:
Traceback (most recent call last):
File "eval.py", line 990, in <module>
torch.set_default_tensor_type('torch.cuda.FloatTensor')
File "/home/administrator/anaconda3/lib/python3.7/site-packages/torch/__init__.py", line 158, in set_default_tensor_type
_C._set_default_tensor_type(t)
File "/home/administrator/anaconda3/lib/python3.7/site-packages/torch/cuda/__init__.py", line 161, in _lazy_init
_check_driver()
File "/home/administrator/anaconda3/lib/python3.7/site-packages/torch/cuda/__init__.py", line 75, in _check_driver
raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
I don't have a GPU on my PC. How can I run eval.py without CUDA? Thanks.
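The traceback shows the script selecting a CUDA tensor type unconditionally. A hedged sketch of choosing the device at runtime instead; the helper name is made up, and whether the rest of eval.py runs on CPU is not something this snippet guarantees.

```python
try:
    import torch
    HAS_CUDA = torch.cuda.is_available()
except ImportError:  # torch is not installed in this environment
    torch = None
    HAS_CUDA = False


def pick_device(has_cuda):
    """Return the device string to use given CUDA availability."""
    return "cuda" if has_cuda else "cpu"


device = pick_device(HAS_CUDA)
print(f"running on {device}")
# With torch available, a model and its inputs could then be moved with
# model.to(device) and tensor.to(device) instead of setting a CUDA
# default tensor type for the whole process.
```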
Hi, thanks for your good work!
I want to train on my dataset using 4 GPUs, but I find it slower than a single GPU (same batch_size). Why?
Hi,
I want to try changing the backbone network to MobileNet-v2 with FPN. Are there any suggestions? Thanks!
Hi, I think you may want to use cfg.max_size here, shouldn't it be?
yolact/layers/functions/detection.py
Line 188 in d04c948
yolact/layers/functions/detection.py
Line 189 in d04c948
Is this line a bug? Why is there a data type in the function definition?
https://github.com/dbolya/yolact/blob/dev/eval.py#L244
Hello, first off, thank you for sharing this amazing work. Much appreciated.
I wanted to report that I also could not get 30+ fps on an Nvidia RTX 2080 GPU with 8GB RAM. I am getting 8-10fps with video; with images, I get ~16fps (0.06sec/image) with the Resnet-101 model, ~20fps (0.05sec/image) with the Resnet-50 model and 17-18fps (0.055sec/image) with the Darknet53 model. This is quite impressive, but it's roughly 1/2 of what is reported in the paper. For images, I used the Python timeit module to wrap the evalimage function to report my numbers. Also, it is odd that the difference in speed between the models is not significant (especially between Resnet-101 and Resnet-50), which suggests to me that something is reducing the processing speed by ~1/2 for all the models.
The command I am using is as below (except I change the model name as needed):
python3 eval.py --trained_model=weights/yolact_resnet50_54_800000.pth --score_threshold=0.4 --top_k=100 --images=./test_images:./test_output_images
I also tried using --benchmark but there is no change in the numbers above.
I was wondering if I could get some help to figure this out.
What is the format of the training data?
What tool can we use to annotate images?
What is the command for training?
I am training on Cityscapes, so I want to preserve the aspect ratio (1024, 2048).
However, after turning on preserve ratio, the loss keeps decreasing but the bounding box positions in the visualization are always wrong.
I also found that this line uses max_size for both width and height.
I think it should be b_w, b_h = (int(cfg.max_size / r_w * w), int(cfg.min_size / r_h * h)),
or directly b_w, b_h = w, h.
I don't understand the comment # A hack to scale the bboxes to the right size
Is this a bug or a trick?
Line 68 in 5dd130d
Thanks
Hi, dbolya,
I did not find DataParallel in your yolact.py, which defines the model. So does the code in your repo not support multi-GPU properly?
I tried simply setting CUDA_VISIBLE_DEVICES to use multiple GPUs, but the performance is not right according to the training log.
Thanks!
Thanks for sharing your great work!
I compared YOLACT's training config with that of RetinaNet, since YOLACT is based on RetinaNet (I think).
I have a few questions about the training config of YOLACT.
(1) The batch size on one GPU is 8, so how many GPUs did you use when training, 4 or 8? That would mean a total batch size of 32 or 64. RetinaNet's batch size is 16.
(2) The number of iterations is 800k, which is almost 10x more than RetinaNet. Why?
(3) The learning rate is 1e-3, which is 10 times smaller than RetinaNet's. Why?
Thanks!
[ 0] 3180 || B: 3.273 | C: 6.118 | M: 5.300 | S: 1.431 | T: 16.121 || ETA: 8 days, 0:27:19 || timer: 0.833
[ 0] 3190 || B: 3.251 | C: 6.134 | M: 5.046 | S: 1.343 | T: 15.774 || ETA: 8 days, 0:20:56 || timer: 0.924
[ 0] 3200 || B: 3.220 | C: 6.074 | M: 5.023 | S: 1.346 | T: 15.663 || ETA: 8 days, 0:14:25 || timer: 0.922
[ 0] 3210 || B: 3.249 | C: 6.012 | M: 4.997 | S: 1.397 | T: 15.655 || ETA: 8 days, 0:03:42 || timer: 0.824
[ 0] 3220 || B: 3.167 | C: 5.980 | M: 4.841 | S: 1.368 | T: 15.355 || ETA: 7 days, 23:56:10 || timer: 0.831
/opt/conda/conda-bld/pytorch_1550813258230/work/aten/src/THCUNN/BCECriterion.cu:57: void bce_updateOutput_no_reduce_functor<Dtype, Acctype>::operator()(const Dtype *, const Dtype *, Dtype *) [with Dtype = float, Acctype = float]: block: [33,0,0], thread: [192,0,0] Assertion `*input >= 0. && *input <= 1.` failed.
/opt/conda/conda-bld/pytorch_1550813258230/work/aten/src/THCUNN/BCECriterion.cu:57: void bce_updateOutput_no_reduce_functor<Dtype, Acctype>::operator()(const Dtype *, const Dtype *, Dtype *) [with Dtype = float, Acctype = float]: block: [33,0,0], thread: [193,0,0] Assertion `*input >= 0. && *input <= 1.` failed.
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1550813258230/work/aten/src/THC/generated/../THCReduceAll.cuh line=317 error=59 : device-side assert triggered
Can you help me solve it? Thanks.
I have an NVIDIA GeForce RTX 2080 Ti.
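For readers of this log: the assertion says a value passed to binary cross-entropy was outside [0, 1], and a NaN fails that check too. A plain-Python sketch of the condition being rejected, meant only as an illustration and not as the repository's loss code:

```python
import math


def bce(p, target, eps=1e-12):
    """Binary cross-entropy for one probability p and a 0/1 target."""
    if not (0.0 <= p <= 1.0):  # NaN fails this comparison as well
        raise ValueError(f"BCE input must be in [0, 1], got {p!r}")
    p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))


def first_invalid(values):
    """Return the index of the first value outside [0, 1] (or NaN), else None."""
    for i, v in enumerate(values):
        if not (0.0 <= v <= 1.0):
            return i
    return None


print(bce(0.5, 1))
print(first_invalid([0.2, float("nan"), 1.5]))
```

A check like first_invalid on the predicted mask values just before the loss can show whether they became NaN (for example after the loss diverged) or simply were not passed through a sigmoid.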
For example, if there are multiple people in a picture, should they be labeled as person1, person2, person3?
Line 275 in cb3857a
That should be division not multiplication oh nooooooo
How could I have missed that ahhhhhh
Time to assess the damages
Dear sir:
I'm really interested in your fantastic work,
Could u please give the training steps on my own datasets?
Thanks a lot!~
Line 481 in d630bf6
That's just gray-scale with noise right?
I know RetinaNet inspired the basic backbone, SSD inspired the loss, and Mask R-CNN inspired the branch,
but I wonder what inspired the protonet?
I am running this on a Linux 18.04 box with python3 and the most recent versions of all the libraries. Any idea why I get this error?
$ python3 eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.3 --top_k=100 --video=/home/vib/Desktop/AndurilSRC/LPR_DATA/lotsofcars_1.mp4:output_video-det.mp4
Config not specified. Parsed yolact_base_config from the file name.
Loading model... Done.
Traceback (most recent call last):
File "eval.py", line 935, in <module>
evaluate(net, dataset)
File "eval.py", line 722, in evaluate
savevideo(net, inp, out)
File "eval.py", line 682, in savevideo
preds = net(batch)
File "/home/vib/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home/vib/Desktop/Personal/yolact/yolact.py", line 612, in forward
return self.detect(pred_outs)
File "/home/vib/Desktop/Personal/yolact/layers/functions/detection.py", line 76, in __call__
result = self.detect(batch_idx, conf_preds, decoded_boxes, mask_data, inst_data)
File "/home/vib/Desktop/Personal/yolact/layers/functions/detection.py", line 103, in detect
boxes, masks, classes, scores = self.fast_nms(boxes, masks, scores, self.nms_thresh, self.top_k)
File "/home/vib/Desktop/Personal/yolact/layers/functions/detection.py", line 148, in fast_nms
iou.triu_(diagonal=1)
RuntimeError: invalid argument 1: expected a matrix at /pytorch/aten/src/THC/generic/THCTensorMathPairwise.cu:203
FAIL
Please do not use the latest PyTorch version, 1.1.0, which may cause CUDA errors.
The default PyTorch install command is
conda install pytorch torchvision cudatoolkit=10.0 -c pytorch
Use the one below instead:
pip3 install https://download.pytorch.org/whl/cu100/torch-1.0.1-cp36-cp36m-win_amd64.whl
One day wasted debugging this [cry]
http://manaai.cn/aicodes_detail3.html?id=32
A group of contemptible thieves has cloned your code, replaced your license with theirs, and is peddling it on their "WEBSITE".
Hi, is there a way to get validation loss during training? I want to monitor it for overfitting cases.
I noticed you had it before (which is giving me errors), but the overhaul has removed it.
Thanks.
Hi, thank you for the awesome work!
For some reason, I have to rewrite your eval.py myself.
However, when I run the code, it takes 2 seconds just for prediction.
Do you have any idea why?
I already checked that the GPU is enabled.
import os
from data import COCODetection, MEANS, COLORS, COCO_CLASSES
from yolact import Yolact
from utils.augmentations import BaseTransform, FastBaseTransform, Resize
from utils.functions import MovingAverage, ProgressBar
from layers.box_utils import jaccard, center_size
from utils import timer
from utils.functions import SavePath
from layers.output_utils import postprocess, undo_image_transformation
import pycocotools
from data import cfg, set_cfg, set_dataset
import numpy as np
import torch
import torch.backends.cudnn as cudnn
from torch.autograd import Variable
import argparse
import time
import random
import cProfile
import pickle
import json
import os
from pathlib import Path
from collections import OrderedDict
from PIL import Image
import matplotlib.pyplot as plt
import time
set_cfg("yolact_resnet50_config")
with torch.no_grad():
    torch.cuda.set_device(1)
    cudnn.benchmark = True
    cudnn.fastest = True
    torch.set_default_tensor_type('torch.cuda.FloatTensor')
    net = Yolact()
    net.load_weights('./weights/yolact_resnet50_54_800000.pth')
    net.eval()
    net = net.cuda()
    print('model loaded...')
#run your code
def execute(rgb_image):
    net.detect.cross_class_nms = True
    net.detect.use_fast_nms = True
    cfg.mask_proto_debug = False
    with torch.no_grad():
        frame = torch.Tensor(rgb_image).cuda().float()
        batch = FastBaseTransform()(frame.unsqueeze(0))
        time_start = time.clock()
        preds = net(batch)
        time_elapsed = (time.clock() - time_start)
        h, w, _ = rgb_image.shape
        t = postprocess(preds, w, h, visualize_lincomb=False, crop_masks=True, score_threshold=0)
        torch.cuda.synchronize()
        classes, scores, boxes, masks = [x[:MAX_MASK_SIZE].cpu().numpy() for x in t]
        print(time_elapsed)
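A few general points may explain a timing like this, independent of YOLACT. The first calls on a GPU include one-off initialisation, and with cudnn.benchmark enabled cuDNN autotunes per input shape. CUDA calls are also asynchronous, so the clock should be read only after a synchronize. Separately, time.clock was removed in Python 3.8 in favour of time.perf_counter. A minimal timing helper as a sketch; the helper name and parameters are made up:

```python
import time


def time_call(fn, *args, warmup=3, repeats=10, sync=None):
    """Average wall-clock seconds per call of fn(*args).

    sync: optional zero-argument callable run before reading the clock,
    e.g. torch.cuda.synchronize when fn launches asynchronous GPU work.
    """
    for _ in range(warmup):  # discard the first calls (initialisation, autotuning)
        fn(*args)
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args)
    if sync is not None:
        sync()
    return (time.perf_counter() - start) / repeats


# Stand-in workload so the sketch runs anywhere; with the snippet above this
# could be time_call(net, batch, sync=torch.cuda.synchronize).
print(time_call(sum, range(100_000)))
```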
The paper says that Box2Pix relies on an extremely light-weight backbone detector.
I think more experiments would be nice, maybe like this:
            KITTI   Cityscapes   COCO
Box2Pix
YOLACT
Also, a yolact-lite might be good, like yolo-lite, using a light-weight backbone (like Xception).
This is YOLACT v1, just like YOLO v1.
I am wondering whether the encoder-decoder architecture or the atrous convolution adopted by DeepLab v3+ may help.
Looking forward to YOLACT v2...