aim-uofa / AdelaiDet
AdelaiDet is an open source toolbox for multiple instance-level detection and recognition tasks.
Home Page: https://git.io/AdelaiDet
License: Other
Thanks for your great work! As shown in Table 3, using only rel. coord. achieves 31.3 AP, which is quite interesting! Have you printed out the masks and controller weights for this setting? Is the region around (x, y) activated in the instance mask?
Hi~ @tianzhi0549
I am trying to implement rel. coord. in CondInst.
For location of interest (x, y) on the input image:
x_range = torch.arange(W_mask)
y_range = torch.arange(H_mask)
y_grid, x_grid = torch.grid(y_range, x_range)
y_rel_coord = (y_grid - y / mask_stride).normalize_to(-1, 1)
x_rel_coord = (x_grid - x / mask_stride).normalize_to(-1, 1)
rel_coord = torch.cat(x_rel_coord, y_rel_coord)
Am I right? Could you provide the official code snippet of rel. coord.? Thanks!
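The pseudocode above can be made concrete with a small pure-Python sketch. This is not the official CondInst snippet. It assumes that each mask-feature cell is mapped back to image coordinates as `index * mask_stride + mask_stride // 2`, that offsets are taken as grid position minus instance location (the sign convention of the pseudocode), and that normalization is a plain division by a hypothetical constant `norm`. The name `rel_coord_map` is illustrative only.

```python
def rel_coord_map(x, y, h_mask, w_mask, mask_stride, norm):
    """Return (x_rel, y_rel), two h_mask x w_mask nested lists.

    Each entry is the offset from the instance location (x, y) to the
    centre of one mask-feature cell, in image pixels, divided by `norm`.
    The centre mapping and `norm` are assumptions of this sketch.
    """
    half = mask_stride // 2
    # Image-space centre of every column and row of the mask feature map.
    xs = [col * mask_stride + half for col in range(w_mask)]
    ys = [row * mask_stride + half for row in range(h_mask)]
    x_rel = [[(cx - x) / norm for cx in xs] for _ in ys]
    y_rel = [[(cy - y) / norm for _ in xs] for cy in ys]
    return x_rel, y_rel


# Instance located at image position (20, 12); mask features have stride 8.
x_rel, y_rel = rel_coord_map(x=20, y=12, h_mask=4, w_mask=6, mask_stride=8, norm=64.0)
```

In PyTorch the two maps would then be stacked into a 2-channel tensor. Note that `torch.cat` takes a list or tuple of tensors, so the last line of the pseudocode would need brackets around its arguments.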
I am training FCOS based on the FCOS_MS_X_101_64x4d_2x model. The estimated training time is approximately 4-5 days. Are there any tricks I can use to speed up training? My training runs on 8 V100 GPUs.
Where can I find '_C'?
Hi~ @tianzhi0549
I trained FCOS-VoVNet39 on the CrowdHuman dataset and got decent results.
Now I want to convert my PyTorch model to an ONNX model or a TensorRT model.
I read the detectron2 documents.
https://detectron2.readthedocs.io/tutorials/deployment.html
However, it seems that detectron2 only provides support for 3 meta architectures (GeneralizedRCNN, PanopticFPN, RetinaNet).
Have you done similar work before? Could you provide some guidance please?
Hi,
the new SOLOv2 is really impressive, especially the ideas of the learnable mask kernel and Matrix NMS. After reading the paper, I'm a little confused about the decay factor in Matrix NMS. Did you compare the experimental results between directly using decay=1-ious (as in Soft-NMS) and decay=(1-ious)/(1-ious_cmax)? Thank you very much!
Best Regards,
notabigfish
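To make the two decay variants in the question concrete, here is a minimal pure-Python sketch of a linear, Matrix-NMS-style decay. It is not the SOLOv2 code. It assumes masks are already sorted by descending score, that `ious[i][j]` for `i < j` holds their pairwise IoU, that `ious_cmax[i]` is the largest IoU between mask `i` and any higher-scoring mask, and that the final decay of a mask is the minimum factor over all higher-scoring masks. The function name and the `compensate` flag are illustrative.

```python
def matrix_nms_decay(ious, compensate=True):
    """Linear Matrix-NMS-style decay for masks sorted by descending score.

    ious[i][j] (i < j) is the IoU between mask i and lower-scoring mask j.
    Assumes every IoU is strictly below 1, so the division is safe.
    """
    n = len(ious)
    # ious_cmax[i]: largest IoU between mask i and any higher-scoring mask.
    ious_cmax = [max((ious[k][i] for k in range(i)), default=0.0) for i in range(n)]
    decay = []
    for j in range(n):
        factors = []
        for i in range(j):
            factor = 1.0 - ious[i][j]
            if compensate:
                # Discount the influence of a suppressor that is itself suppressed.
                factor /= 1.0 - ious_cmax[i]
            factors.append(factor)
        # The top-scoring mask has no suppressor, so it keeps decay 1.
        decay.append(min(factors, default=1.0))
    return decay


ious = [
    [0.0, 0.8, 0.1],
    [0.0, 0.0, 0.7],
    [0.0, 0.0, 0.0],
]
plain = matrix_nms_decay(ious, compensate=False)
compensated = matrix_nms_decay(ious, compensate=True)
```

In this toy case mask 2 overlaps mask 1 heavily, but mask 1 is itself almost a duplicate of mask 0. With decay=1-ious, mask 2 is decayed to about 0.3. With the compensated form, mask 1's influence is discounted and mask 2 keeps a decay of about 0.9.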
Hi there!
The FCOS-RT is really amazing! Thanks for the work!
A quick question regarding its configuration, MS_DLA_34_4x_syncbn.yaml.
The solver is set up like this:
SOLVER:
  STEPS: (300000, 340000)
  MAX_ITER: 360000
I just noticed that while 360K is 4x the 90K of the vanilla setting, 300K and 340K are not 4x of 60K and 80K.
Is there any particular reason for these two lr drop points?
Looking forward to your reply!
Thanks!
Hi, I believe AdelaiDet is an awesome project.
But I wonder how I can train the detection model with my own dataset.
I've converted my dataset to COCO format, but hit an "asserting failed" error while loading the dataset because my object categories are different from those of coco_2017_train.
Is there any way to change the default training dataset? Thank you for your help.
Hi, where can I download the Bezier Curve Synthetic Dataset mentioned in the original ABCNet paper?
While parsing node number 243 [InstanceNormalization -> "615"]:
ERROR: /home/onnx-tensorrt/builtin_op_importers.cpp:1550 In function importInstanceNormalization:
[8] Assertion failed: !isDynamic(tensor_ptr->getDimensions()) && "InstanceNormalization does not support dynamic inputs!
My own dataset has about 8000 training images and 700 validation images across 65 classes, both in COCO style. I found that performance with center sampling is always about 1 point AP lower than without it. After trying different training schedules and reading the code very carefully, I still can't find the reason.
I also tried mmdetection with center sampling enabled, and there center sampling is always better, which makes sense.
I found that without center sampling, the AP of some classes is higher, such as "Pig" or "Racing Cars".
Could you please give me some hints about this behavior, or any suggestions for debugging? Thanks.
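For debugging, it can help to check by hand which locations become positives under each rule. The sketch below is a simplified pure-Python illustration of FCOS-style center sampling, not the AdelaiDet code. It assumes a location is positive if it lies inside the ground-truth box, and that center sampling further restricts positives to a square of half-width `radius * stride` around the box center, clipped to the box. The function name and the default `radius=1.5` are assumptions.

```python
def is_positive_location(px, py, box, stride=None, radius=1.5):
    """Decide whether location (px, py) is a positive sample for `box`.

    box is (x0, y0, x1, y1). With stride=None the whole box is used;
    otherwise only a centre region of half-width radius * stride.
    """
    x0, y0, x1, y1 = box
    if stride is not None:
        cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
        r = radius * stride
        # Shrink the box to the centre region, never growing past the box.
        x0, y0 = max(x0, cx - r), max(y0, cy - r)
        x1, y1 = min(x1, cx + r), min(y1, cy + r)
    return x0 < px < x1 and y0 < py < y1


box = (0, 0, 100, 100)
near_edge_plain = is_positive_location(20, 50, box)
near_edge_center = is_positive_location(20, 50, box, stride=8)
at_center = is_positive_location(50, 50, box, stride=8)
```

Under these assumptions a location near the box edge is positive without center sampling but negative with it, so classes with large or elongated objects lose many positives. Counting positives per class under both rules is one cheap way to see whether that is what differs on a custom dataset.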
candidate_inds = box_cls > self.pre_nms_thresh
pre_nms_top_n = candidate_inds.view(N, -1).sum(1)
pre_nms_top_n = pre_nms_top_n.clamp(max=self.pre_nms_top_n)
# multiply the classification scores with centerness scores
box_cls = box_cls * centerness[:, :, None]
results = []
for i in range(N):
    per_box_cls = box_cls[i]
    per_candidate_inds = candidate_inds[i]
    per_box_cls = per_box_cls[per_candidate_inds]
    per_candidate_nonzeros = per_candidate_inds.nonzero()
    per_box_loc = per_candidate_nonzeros[:, 0]
    per_class = per_candidate_nonzeros[:, 1] + 1
    per_box_regression = box_regression[i]
    per_box_regression = per_box_regression[per_box_loc]
    per_locations = locations[per_box_loc]
Line 1: why not use candidate_inds = box_cls.max(dim=2)[0] > self.pre_nms_thresh, i.e. keep only the class with the maximum probability?
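The difference between the two thresholding rules can be shown on a toy score table. The sketch below is pure Python with hypothetical helper names, and it uses 0-based class indices, unlike the `+ 1` in the quoted code. `scores[loc][cls]` stands in for one image's `box_cls`.

```python
def per_class_candidates(scores, thresh):
    """All (location, class) pairs whose score exceeds the threshold."""
    return [
        (loc, cls)
        for loc, row in enumerate(scores)
        for cls, score in enumerate(row)
        if score > thresh
    ]


def max_class_candidates(scores, thresh):
    """Locations whose best class score exceeds the threshold."""
    return [loc for loc, row in enumerate(scores) if max(row) > thresh]


scores = [
    [0.6, 0.7, 0.1],    # two classes above the threshold at location 0
    [0.2, 0.1, 0.3],    # nothing above the threshold
    [0.05, 0.9, 0.02],  # one class above the threshold
]
pairs = per_class_candidates(scores, thresh=0.5)
locations_only = max_class_candidates(scores, thresh=0.5)
```

With per-class thresholding, location 0 contributes two candidates (classes 0 and 1), which matters when classification uses independent sigmoid scores. The max-based rule would keep at most one class per location.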
When will DirectPose be released?
Are there any experiments and comparisons between MEInst with a ResNet-50 backbone and other models with the same backbone?
Hi,
When will ABCNet be released for detectron2?
When I use BlendMask to train on my COCO-format dataset, the error asks for an npz file; zengimg is the image folder. Why?
File "train_net.py", line 104, in train_loop
self.run_step()
File "d:\cnn\detect2\detectron2\engine\train_loop.py", line 209, in run_step
data = next(self._data_loader_iter)
File "d:\cnn\detect2\detectron2\data\common.py", line 140, in iter
for d in self.dataset:
File "C:\Users\wks.conda\envs\AdelaiDet\lib\site-packages\torch\utils\data\dataloader.py", line 345, in next
data = self._next_data()
File "C:\Users\wks.conda\envs\AdelaiDet\lib\site-packages\torch\utils\data\dataloader.py", line 856, in _next_data
return self._process_data(data)
File "C:\Users\wks.conda\envs\AdelaiDet\lib\site-packages\torch\utils\data\dataloader.py", line 881, in _process_data
data.reraise()
File "C:\Users\wks.conda\envs\AdelaiDet\lib\site-packages\torch_utils.py", line 394, in reraise
raise self.exc_type(msg)
FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "C:\Users\wks.conda\envs\AdelaiDet\lib\site-packages\torch\utils\data_utils\worker.py", line 178, in _worker_loop
data = fetcher.fetch(index)
File "C:\Users\wks.conda\envs\AdelaiDet\lib\site-packages\torch\utils\data_utils\fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "C:\Users\wks.conda\envs\AdelaiDet\lib\site-packages\torch\utils\data_utils\fetch.py", line 44, in
data = [self.dataset[idx] for idx in possibly_batched_index]
File "d:\cnn\detect2\detectron2\data\common.py", line 41, in getitem
data = self._map_func(self._dataset[cur_idx])
File "D:\CNNW\AdelaiDet\adet\data\dataset_mapper.py", line 137, in call
basis_sem_gt = np.load(basis_sem_path)["mask"]
File "C:\Users\wks.conda\envs\AdelaiDet\lib\site-packages\numpy\lib\npyio.py", line 428, in load
fid = open(os_fspath(file), "rb")
FileNotFoundError: [Errno 2] No such file or directory: 'D:\labelme\zengimg\6122.npz'
In this code:
# if self.thresh_with_ctr is True, we multiply the classification
# scores with centerness scores before applying the threshold.
if self.thresh_with_ctr:
    box_cls = box_cls * ctrness[:, :, None]
candidate_inds = box_cls > self.pre_nms_thresh
pre_nms_top_n = candidate_inds.view(N, -1).sum(1)
pre_nms_top_n = pre_nms_top_n.clamp(max=self.pre_nms_top_n)
if not self.thresh_with_ctr:
    box_cls = box_cls * ctrness[:, :, None]
So box_cls = box_cls * ctrness[:, :, None] will be executed either way?
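A small pure-Python imitation of the snippet shows what the flag changes. The helper name and the nested-list layout are hypothetical: `cls_scores[loc][cls]` stands in for `box_cls` and `ctrness[loc]` for the centerness of each location.

```python
def score_and_select(cls_scores, ctrness, thresh, thresh_with_ctr):
    """Return (final_scores, keep_mask) following the order of the quoted code."""
    def times_ctr(scores):
        return [[s * c for s in row] for row, c in zip(scores, ctrness)]

    if thresh_with_ctr:
        cls_scores = times_ctr(cls_scores)  # multiply before thresholding
    keep_mask = [[s > thresh for s in row] for row in cls_scores]
    if not thresh_with_ctr:
        cls_scores = times_ctr(cls_scores)  # multiply after thresholding
    return cls_scores, keep_mask


cls_scores = [[0.6, 0.2]]
ctrness = [0.5]
scores_a, keep_a = score_and_select(cls_scores, ctrness, 0.4, thresh_with_ctr=True)
scores_b, keep_b = score_and_select(cls_scores, ctrness, 0.4, thresh_with_ctr=False)
```

The multiplication runs exactly once in both cases, so the final scores are identical. What differs is the candidate mask: with `thresh_with_ctr=True` the threshold is applied to the product, and with `False` it is applied to the raw classification score.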
Hi, I would like to ask some questions about the Attns in the top layer of BlendMask.
In the paper, Section 3.1 only describes how to infer the Attns along with the top-k proposals. I wonder how the target Attns are added to the N original proposal Attns during training, because there is a function add_gt_proposals(boxlists, targets) that is used in training.
Maybe they are set to the resized target mask? But what about the "K" dimension of the Attns?
I wonder if I misunderstood...
Hello, where is the ABCNet config for training? Thanks.
Using the following config, I ran ABCNet on a random picture for a quick inference test, but nothing changes significantly.
!python demo/demo.py \
--config-file configs/BAText/TotalText/attn_R_50.yaml \
--input /content/sam.png \
--output /content/output/ \
--opts MODEL.WEIGHTS /content/AdelaiDet/attn_tt_6262.pth
Output:
The inference result is quite disappointing. Single words are detected fairly well, but sequences of words are not captured.
Here are the eval and train commands I am using:
python3 tools/train_net.py \
--config-file configs/BlendMask/RT_R_50_4x_bn.yaml \
--num-gpus 3 --eval-only \
MODEL.WEIGHTS output/blendmask/RT_R_50_4x/model_0294999.pth
python3 tools/train_net.py \
--config-file configs/BlendMask/RT_R_50_4x_bn.yaml \
--num-gpus 3
I am using 3 GPUs to train, and I have changed lr along with batch size:
cat configs/BlendMask/Base-BlendMask.yaml
MODEL:
  META_ARCHITECTURE: "BlendMask"
  MASK_ON: True
  BACKBONE:
    NAME: "build_fcos_resnet_fpn_backbone"
  RESNETS:
    OUT_FEATURES: ["res3", "res4", "res5"]
  FPN:
    IN_FEATURES: ["res3", "res4", "res5"]
  PROPOSAL_GENERATOR:
    NAME: "FCOS"
  BASIS_MODULE:
    LOSS_ON: True
  PANOPTIC_FPN:
    COMBINE:
      ENABLED: False
  FCOS:
    THRESH_WITH_CTR: True
    USE_SCALE: False
DATASETS:
  TRAIN: ("coco_2017_train",)
  TEST: ("coco_2017_val",)
SOLVER:
  IMS_PER_BATCH: 6
  BASE_LR: 0.005  # Note that RetinaNet uses a different default learning rate
  STEPS: (60000, 80000)
  MAX_ITER: 90000
INPUT:
  MIN_SIZE_TRAIN: (640, 672, 704, 736, 768, 800)
The model config I changed:
_BASE_: "Base-550.yaml"
INPUT:
  MIN_SIZE_TRAIN: (256, 288, 320, 352, 384, 416, 448, 480, 512, 544, 576, 608)
  MAX_SIZE_TRAIN: 900
  MAX_SIZE_TEST: 736
  MIN_SIZE_TEST: 512
MODEL:
  WEIGHTS: "detectron2://ImageNetPretrained/MSRA/R-50.pkl"
  RESNETS:
    DEPTH: 50
    NORM: "BN"
  BACKBONE:
    FREEZE_AT: -1
SOLVER:
  STEPS: (300000, 340000)
  MAX_ITER: 360000
OUTPUT_DIR: "output/blendmask/RT_R_50_4x"
Does BlendMask RT really work?
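When the batch size changes, one common heuristic is the linear scaling rule: scale the learning rate by the batch-size ratio and stretch the schedule by its inverse. The sketch below applies that rule in plain Python. The reference values (batch size 16, learning rate 0.01, steps at 60000 and 80000, 90000 iterations) are assumptions for illustration, not values confirmed by the configs quoted here, and the function name is hypothetical.

```python
def scale_schedule(ref_batch, ref_lr, ref_steps, ref_max_iter, new_batch):
    """Apply the linear scaling rule to a reference training schedule."""
    factor = new_batch / ref_batch
    lr = ref_lr * factor                      # smaller batch, smaller lr
    steps = tuple(round(s / factor) for s in ref_steps)
    max_iter = round(ref_max_iter / factor)   # smaller batch, more iterations
    return lr, steps, max_iter


# Assumed reference schedule, rescaled for a total batch size of 6.
lr, steps, max_iter = scale_schedule(
    ref_batch=16, ref_lr=0.01, ref_steps=(60000, 80000), ref_max_iter=90000, new_batch=6
)
```

Under these assumed reference values, a batch size of 6 would call for a learning rate of about 0.00375 and a schedule roughly 2.7 times longer. If the reference schedule differs, the same function can be rerun with the actual numbers.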
First, thanks for your great work!
I am using BlendMask for my custom dataset, containing 10 classes. When starting training, it raises the following error:
[05/11 12:50:28 d2.engine.train_loop]: Starting training from iteration 0
ERROR [05/11 12:50:28 d2.engine.train_loop]: Exception during training:
Traceback (most recent call last):
File "/content/detectron2/detectron2/engine/train_loop.py", line 132, in train
self.run_step()
File "/content/detectron2/detectron2/engine/train_loop.py", line 215, in run_step
loss_dict = self.model(data)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "/content/adet/modeling/blendmask/blendmask.py", line 107, in forward
basis_out, basis_losses = self.basis_module(features, basis_sem)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "/content/adet/modeling/blendmask/basis_module.py", line 96, in forward
gt_sem = targets.unsqueeze(1).float()
AttributeError: 'NoneType' object has no attribute 'unsqueeze'
Could you give me some advice on how to fix this? I checked the code, and the comment says this step resizes the target to reduce memory. Since the default value of the target is None, it seems basis_module does not handle None targets, I guess?
[05/16 21:21:18 detectron2]: Arguments: Namespace(confidence_threshold=0.5, config_file='configs/BAText/CTW1500/attn_R_50.yaml', input=['input.jpg'], opts=['MODEL.WEIGHTS', 'ctw1500_attn_R_50.pth'], output=None, video_input=None, webcam=False)
WARNING [05/16 21:21:18 d2.config.compat]: Config 'configs/BAText/CTW1500/attn_R_50.yaml' has no VERSION. Assuming it to be compatible with latest v2.
Traceback (most recent call last):
File "demo/demo.py", line 72, in
cfg = setup_cfg(args)
File "demo/demo.py", line 23, in setup_cfg
cfg.merge_from_file(args.config_file)
File "/usr/local/lib/python3.6/dist-packages/detectron2/config/config.py", line 49, in merge_from_file
self.merge_from_other_cfg(loaded_cfg)
File "/usr/local/lib/python3.6/dist-packages/fvcore/common/config.py", line 118, in merge_from_other_cfg
return super().merge_from_other_cfg(cfg_other)
File "/usr/local/lib/python3.6/dist-packages/yacs/config.py", line 217, in merge_from_other_cfg
_merge_a_into_b(cfg_other, self, self, [])
File "/usr/local/lib/python3.6/dist-packages/yacs/config.py", line 464, in _merge_a_into_b
_merge_a_into_b(v, b[k], root, key_list + [k])
File "/usr/local/lib/python3.6/dist-packages/yacs/config.py", line 477, in _merge_a_into_b
raise KeyError("Non-existent config key: {}".format(full_key))
KeyError: 'Non-existent config key: INPUT.HFLIP'
First, thank you for your work, it is really useful.
I would like to ask when are you planning to release SOLO network with the corresponding model weights?
Hi,
I hit a bug when training BlendMask.
The file path is adet/modeling/blendmask/basis_module.py
seg_loss = F.cross_entropy(
    sem_out, gt_sem.squeeze().long())
The problem is a dimension mismatch between sem_out (torch.Size([1, 81, 88, 132])) and gt_sem (torch.Size([1, 1, 88, 132])).
Could you please help me solve this problem?
Thanks.
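One plausible explanation is the argument-free `squeeze()`: it drops every size-1 dimension, so with a batch of 1 it removes the batch dimension as well as the channel dimension. `F.cross_entropy` with an input of shape (N, C, H, W) expects a target of shape (N, H, W). The pure-Python helper below only mimics the shape rule of `squeeze` (the helper itself is hypothetical) to show the difference between `squeeze()` and `squeeze(1)`.

```python
def squeeze_shape(shape, dim=None):
    """Mimic the output shape of tensor.squeeze() / tensor.squeeze(dim)."""
    if dim is None:
        # No argument: every dimension of size 1 is removed.
        return tuple(s for s in shape if s != 1)
    dim = dim % len(shape)
    if shape[dim] == 1:
        # Only the requested dimension is removed.
        return tuple(s for i, s in enumerate(shape) if i != dim)
    return tuple(shape)


gt_sem_shape = (1, 1, 88, 132)
all_squeezed = squeeze_shape(gt_sem_shape)             # batch dimension is lost
channel_squeezed = squeeze_shape(gt_sem_shape, dim=1)  # batch dimension is kept
```

If this is the cause, squeezing only dimension 1 would give the target the (N, H, W) shape that matches sem_out of shape (N, 81, H, W). Whether that matches the maintainers' fix is not confirmed here.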
Since CondInst is built on object detector FCOS, I am interested in the box AP of CondInst.
ModuleNotFoundError: No module named 'detectron2'
AdelaiDet/adet/modeling/fcos/fcos.py
Lines 175 to 181 in 0b51e7e
Do they require special initialization?
Hi! SOLO is wonderful! I heard you would release it at the end of last month, but that didn't happen, so I'm wondering when you will release the code. Is there any plan?
Hello, I have a question about ABCNet.
Here are the sentences before Section 3, Experiments: "Note that during training, we directly use the generated Bezier curve GT to extract the RoI features. Therefore the detection branch does not affect the recognition branch. In the inference phase, the RoI region is replaced by the detecting Bezier curve described in Section 2.1."
Do you mean that the detection branch and the recognition branch are separated during training? Is ABCNet end-to-end? I don't know the total loss of ABCNet.
I would really appreciate it if you could answer. Thank you!
1. Why can your code change my CUDA environment variable? My company server cannot use tf anymore.
2. Why can't I delete the AdelaiDet files on my company server? It behaves like a computer virus.
3. Please tell me why. I was badly scolded by my leader!!!
Can I send a PR for the VoVNet backbone network, which has better performance and faster speed than ResNet?
VoVNet (vovnet-detectron2) has already been implemented in detectron2 style and has shown better performance and faster speed than ResNet in detectron2.
Does BlendMask support export to ONNX?
Hello!
This is a rather tricky problem, and I am not sure whether anyone else has encountered it, but I hope to get some help.
I made some modifications on top of FCOS, including some CPU data processing. Training runs normally on 2 V100 GPUs. Now I want to use R101, so I migrated to 4 TITAN GPUs, and every time I start training, it throws an error at around 2100 iterations:
Here are some related posts, but none of them solves the problem in my case:
facebookresearch/detectron2#817 (most related)
Hello, the MEInst is a good work, the paper says that the code is available at https://git.io/AdelaiDet, but I don't find it in the link. When will the code be released? Thank you!
Hi~ @tianzhi0549
I want to confirm the shared head architecture of CondInst.
Design A
              --- conv --- conv --- conv --- conv --- cls_pred
              |
              |                                         --- ctr_pred
              |                                         |
FPN features --- --- conv --- conv --- conv --- conv --- --- reg_pred
              |
              |
              |
              --- conv --- conv --- conv --- conv --- controller_pred
Design B
              --- conv --- conv --- conv --- conv --- cls_pred
              |
              |                                         --- ctr_pred
              |                                         |
FPN features --- --- conv --- conv --- conv --- conv --- --- reg_pred
                                                        |
                                                        --- controller_pred
Which one is right?
I found that Design B degrades box AP, and mask AP is also very low.
Here are my results for MS-R-50_1x.
Box AP
AP | AP50 | AP75 |
---|---|---|
38.269 | 57.210 | 55.405 |
Mask AP
AP | AP50 | AP75 |
---|---|---|
27.531 | 51.157 | 47.783 |
The box AP should be higher than 39.5 given MS training (~39.5) and multi-task training (+~1.0), so I think Design B is wrong. It is hard for one branch to handle 3 predictions, and the gradient from controller_pred degrades reg_pred.
or how should I use this setting?
Thanks for your great work! I notice that adding an additional semantic segmentation loss in CondInst could boost the overall performance by about 1 mAP, which is quite a promising result! I want to ask:
Waiting for your reply!
Conditional Convolutions for Instance Segmentation brings a new paradigm to instance segmentation, but I don't quite understand some details in the paper. I have the following three questions. Can you give me some advice?
1. Are all masks calculated from the positive samples generated by FCOS? (Subsection 2.2, Network Outputs and Training Targets)
2. Why are there 169 parameters (a vector) in the controller head, and how are the 169 parameters assigned to the three 1 × 1 convolutions with 8 channels in the mask FCN head? How many parameters are there in each of the three 1 × 1 convolution layers? (Subsection 2.2)
3. How are the three conditional convolution layers and the corresponding parameters (169 parameters in total) calculated in the mask FCN head? (Subsection 2.4)
Thank you very much for your time.
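One way to arrive at 169 is to count weights and biases layer by layer. The sketch below assumes that the first layer takes 10 input channels (8 mask-feature channels plus 2 relative-coordinate channels), that the two hidden layers have 8 channels, and that the last layer outputs a single mask channel. The function name is hypothetical, and the channel split is an assumption to be checked against the paper.

```python
def dynamic_head_params(in_channels, hidden_channels, num_layers, out_channels=1):
    """Return a (weights, biases) pair for each 1x1 conv layer of the mask head."""
    per_layer = []
    c_in = in_channels
    for layer in range(num_layers):
        is_last = layer == num_layers - 1
        c_out = out_channels if is_last else hidden_channels
        # A 1x1 conv has c_in * c_out weights and c_out biases.
        per_layer.append((c_in * c_out, c_out))
        c_in = c_out
    return per_layer


layers = dynamic_head_params(in_channels=8 + 2, hidden_channels=8, num_layers=3)
total = sum(weights + biases for weights, biases in layers)
```

Under these assumptions the three layers need 80 + 8, 64 + 8, and 8 + 1 parameters, which sums to 169. The controller's output vector would then be split into chunks of those sizes and reshaped into the per-instance filters.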
Not mentioned in the paper.
Hi,
I have a question on ML_NMS.
In your ml_nms.py, you used batched_nms from detectron2.layers, where torchvision.ops.nms is used.
I don't know how the ml_nms library is linked to the nms function.
Could you explain this?
Thanks in advance.
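A batched (class-aware) NMS can be built on top of a plain, class-agnostic NMS. The sketch below is pure Python and is not the detectron2 or torchvision code. It uses a coordinate-offset trick: boxes of each class are shifted into their own region so that boxes of different classes never overlap. All function names are illustrative, and whether a given library version uses this exact trick is an assumption.

```python
def iou(a, b):
    """IoU of two boxes given as (x0, y0, x1, y1)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, thresh):
    """Greedy class-agnostic NMS; returns kept indices, best score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= thresh for k in keep):
            keep.append(i)
    return keep


def class_aware_nms(boxes, scores, classes, thresh):
    """Run NMS independently per class by offsetting boxes per class id."""
    if not boxes:
        return []
    max_coord = max(max(b) for b in boxes)
    shifted = [[v + c * (max_coord + 1) for v in b] for b, c in zip(boxes, classes)]
    return nms(shifted, scores, thresh)


boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [0, 0, 10, 10]]
scores = [0.9, 0.8, 0.7]
classes = [0, 0, 1]
agnostic_keep = nms(boxes, scores, 0.5)
per_class_keep = class_aware_nms(boxes, scores, classes, 0.5)
```

Here the third box is identical to the first but belongs to another class. Plain NMS suppresses it, while the class-aware version keeps it. A wrapper such as ml_nms would then only need to pass boxes, scores, and labels to the batched function.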
Thanks for your excellent idea.
I can't understand relative coordinates very well. How are they obtained? Is it related to CoordConv?
Thanks for the great work!
I wonder if there is a performance benchmark of the speed and AP of this detectron2-based implementation compared with the original FCOS implementation. Have the latest improvements also been merged into this version?