fasterrcnn's People

Contributors

trzy

fasterrcnn's Issues

Buggy implementation of smooth L1 loss

The implementation in pytorch/model/detector/regression_loss (line 144) appears to use x (instead of x_abs) in the branch selection.
Simply changing is_negative_branch = (x < (1.0 / sigma_squared)) to is_negative_branch = (x_abs < (1.0 / sigma_squared)) should fix it.
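For reference, here is a minimal scalar sketch of smooth L1 loss with the corrected branch condition (the repo applies the same logic elementwise to tensors; the default sigma value here is an assumption, not taken from the repo):

```python
def smooth_l1(x, sigma=3.0):
    """Scalar sketch of smooth L1 loss; the repo applies this elementwise."""
    sigma_squared = sigma * sigma
    x_abs = abs(x)
    # The fix: select the branch on |x|, not x, so that large *negative*
    # errors do not incorrectly fall into the quadratic branch.
    if x_abs < 1.0 / sigma_squared:
        return 0.5 * sigma_squared * x * x
    return x_abs - 0.5 / sigma_squared
```

With the buggy condition `x < 1.0 / sigma_squared`, an error of x = -2 would be scored quadratically (0.5 * 9 * 4 = 18) instead of linearly (2 - 0.5/9 ≈ 1.94), which inflates the loss for negative errors.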

Thank you for this project!

Step-by-step understanding of the approximate joint training method #192

I don't exactly understand the approximate joint training method.
I know the RPN and detector are merged into one network during training.
The forward pass starts at the pre-trained conv network, passes through the RPN, and finally arrives at the Fast R-CNN layers. The loss is computed as:

RPN classification loss + RPN regression loss + Detection classification loss + Detection bounding-box regression loss.

But what is the backpropagation path? Does it go from the detector through the RPN and finally into the pretrained convnet?
In that case, how is differentiation performed in the decoder section of the RPN? The offsets produced by the 1x1 regression conv layer in the RPN are translated into proposals in the decoder.
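In approximate joint training the decoder is not actually differentiated through: the decoded proposals are treated as constants, so gradients reach the RPN only via its own classification and regression losses. A conceptual illustration (not the repo's actual code; the box transform here is a stand-in):

```python
import torch

def decode_and_detach(anchors, offsets):
    # Stand-in for the real anchor-to-proposal box transform.
    proposals = anchors + offsets
    # Approximate joint training ignores the derivative with respect to
    # the proposal coordinates: the detector sees proposals as fixed
    # inputs, so no gradient flows back through this decoding step.
    return proposals.detach()
```

This is what makes the method "approximate": the derivative of the detector loss with respect to the proposal coordinates is simply dropped.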

What should the structure of the dataset dir be?

The VOC2007 dataset contains the following folders:

Annotations  
ImageSets  
JPEGImages  
SegmentationClass  
SegmentationObject

Are they all necessary, or do only some of them need to be provided?

"Sample larger than population or is negative" when sampling negative_anchor_idxs

I get this error:

Sample larger than population or is negative

in FasterRCNN\models\faster_rcnn.py, in this section:

# Sample, producing indices into the index maps
num_positive_anchors = len(positive_anchors)
num_negative_anchors = len(negative_anchors)
num_positive_samples = min(self._rpn_minibatch_size // 2, num_positive_anchors) # up to half the samples should be positive, if possible
num_negative_samples = self._rpn_minibatch_size - num_positive_samples          # the rest should be negative
positive_anchor_idxs = random.sample(range(num_positive_anchors), num_positive_samples)
# negative_anchor_idxs = random.sample(range(num_negative_anchors), num_negative_samples)  # <-- error is here

# ... fixed by
negative_anchor_idxs = random.sample(range(num_negative_anchors), min(num_negative_samples, num_negative_anchors))

I have a specific dataset with only one class, and there are no negative samples (images that do not contain objects of this class).

Is this the correct fix? Or is it better to add negative examples to the dataset?
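The clamp is a reasonable workaround, since `random.sample` raises `ValueError` whenever the requested sample size exceeds the population. A minimal illustration of the behavior (the helper name is mine, not the repo's):

```python
import random

def sample_up_to(population_size, k):
    # random.sample(range(n), k) raises ValueError when k > n,
    # so clamp k to the population size first.
    return random.sample(range(population_size), min(k, population_size))
```

Note that with this clamp the RPN minibatch simply ends up smaller than `self._rpn_minibatch_size` when there are too few anchors of one kind.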

Validation Losses

Hi there, can somebody please help me calculate the validation losses?

Thank you
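In case it helps, a generic sketch of one approach: reuse the training loss computation over a held-out validation set with gradients disabled and average the result. Here `compute_loss` is a hypothetical callable standing in for whatever produces the scalar training loss in your setup:

```python
import torch

def mean_validation_loss(model, val_batches, compute_loss):
    # compute_loss(model, batch) is assumed to return the same scalar
    # loss used during training; no_grad ensures no weights are updated.
    total, count = 0.0, 0
    with torch.no_grad():
        for batch in val_batches:
            total += float(compute_loss(model, batch))
            count += 1
    return total / max(count, 1)
```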

What happens when I use vgg16 as the backbone?

@trzy
Thank you for your good work.
I've reviewed your PyTorch version of the code. I have some questions.

  1. What happens when I use vgg16 (not models/vgg16_torch.py) as the backbone? In that case, how are the initial weights loaded? I did not provide vgg16_caffe.pth, but I can still train.
  2. If I want to make a model that uses four channels, should I write a new image classification program to train the backbone from scratch? (I will use vgg16, but I can follow your advice if I have to use something else.)

I want to use your programs for my task, so please let me know.
Regards.

Usage versus the Object Detection API

Hi, could you please provide some advice on using this approach versus a Faster R-CNN model with the TensorFlow Object Detection API?

Thanks, you did a great job.

Training times too long

Hello,

first of all, thanks for the implementation. We are running some tests on a computer with two GeForce RTX 3070 GPUs (7982 MiB each), and we have some questions, especially about the duration of each training epoch.

  • We're running with a minimal configuration, disabling data augmentation and enabling image caching:

python -m tf2.FasterRCNN --train --dataset-dir=./own_dataset/ --epochs=1 --learning-rate=1e-3 --save-best-to=fasterrcnn_tf2_tmp.h5 --no-augment --cache-images

each epoch takes almost 2 hours.

  • Launching it with data augmentation and without image caching:

python -m tf2.FasterRCNN --train --dataset-dir=./own_dataset/ --epochs=1 --learning-rate=1e-3 --save-best-to=fasterrcnn_tf2_tmp.h5

we get the same times per epoch.

  • If we include options like:

--debug-dir=/tmp/tf_debugger/

the duration increases to more than 8h per epoch.

Are we misconfiguring something or is it simply due to the dataset used?
Why don't we experience time improvements by removing data augmentation and including image caching?

Thank you very much!

Get predictions as a set of segments

How can I get predictions as a set of coordinates of the detected regions, i.e. not as a segmented PNG image but as, for example, JSON like:

{
  "imageid": "ccc",
  "object": "dog",
  "x1": 33,
  "y1": 44,
  ...
}
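Assuming you can get the final detections as a list of (class name, box, score) tuples after non-maximum suppression, converting them to JSON is straightforward; the field names below are illustrative, not part of the repo:

```python
import json

def detections_to_json(image_id, detections):
    # detections: list of (class_name, x1, y1, x2, y2, score) tuples,
    # e.g. the model's output after non-maximum suppression.
    records = [
        {"imageid": image_id, "object": cls,
         "x1": x1, "y1": y1, "x2": x2, "y2": y2, "score": score}
        for cls, x1, y1, x2, y2, score in detections
    ]
    return json.dumps(records, indent=2)
```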

Add channels from 3 to 4

Thank you for your good work trzy.
I want to use your faster r-cnn programs for my task.
I want to increase the number of channels from 3 (RGB) to 4 (RGB + a new one). Where should I change the program?

Thank you.
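One common approach (a sketch, not the repo's code): widen the backbone's first convolution from 3 to 4 input channels, copying the pretrained RGB filters and initializing the extra channel, for example with the mean of the RGB filters:

```python
import torch
import torch.nn as nn

def expand_first_conv(conv, new_in_channels=4):
    # Hypothetical helper: replace a 3-channel first conv layer with a
    # wider one, copying pretrained RGB weights and initializing the
    # extra channel(s) with the mean of the RGB filters.
    new_conv = nn.Conv2d(new_in_channels, conv.out_channels,
                         kernel_size=conv.kernel_size, stride=conv.stride,
                         padding=conv.padding, bias=conv.bias is not None)
    with torch.no_grad():
        new_conv.weight[:, :3] = conv.weight
        new_conv.weight[:, 3:] = conv.weight.mean(dim=1, keepdim=True)
        if conv.bias is not None:
            new_conv.bias.copy_(conv.bias)
    return new_conv
```

The dataset loading and image preprocessing code would also need to produce 4-channel tensors, since the normalization there assumes 3 channels.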

Please support creating a new backbone based on ViT

Dear @trzy,

Thanks for the great repo. I am trying out a new ViT backbone with your source code. I am using vgg16_torch.py as a template and modifying line 67 from:
vgg16 = torchvision.models.vgg16(weights = torchvision.models.VGG16_Weights.IMAGENET1K_V1, dropout = dropout_probability)
to
vit = torchvision.models.vit_b_16(weights = torchvision.models.ViT_B_16_Weights.DEFAULT)

Based on the ViT concept, the features should be computed like:
# Expand the class token to the full batch
batch_class_token = vit.class_token.expand(img.shape[0], -1, -1)
feats = torch.cat([batch_class_token, feats], dim=1)
feats = vit.encoder(feats)
# We're only interested in the representation of the classifier token that we appended at position 0
feats = feats[:, 0]

I am still lost on how to adapt the FeatureExtractor function to this concept. Please assist if possible.
Many thanks!
