trzy / fasterrcnn
Clean and readable implementations of Faster R-CNN in PyTorch and TensorFlow 2 with Keras.
The implementation in pytorch/model/detector/regression_loss (line 144)
seems to use x (instead of x_abs) in the branch selection.
Simply changing is_negative_branch = (x < (1.0 / sigma_squared))
to is_negative_branch = (x_abs ...)
should fix it.
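To illustrate the issue, here is a minimal standalone sketch of the smooth L1 regression loss being discussed (an assumption about its shape; the names and structure do not match the repo's exact code). The branch must be selected on |x|, because a large negative error would otherwise wrongly fall into the quadratic branch:

```python
import torch

def smooth_l1_loss(predicted, target, sigma=3.0):
    # Sketch of smooth L1 loss with the sigma parameterization.
    sigma_squared = sigma * sigma
    x = predicted - target
    x_abs = torch.abs(x)
    # Select the branch on |x|, not x: using x here is the reported bug,
    # since x < 1/sigma^2 is true for every large negative error.
    is_negative_branch = (x_abs < (1.0 / sigma_squared)).float()
    R_negative = 0.5 * sigma_squared * x * x       # quadratic near zero
    R_positive = x_abs - 0.5 / sigma_squared       # linear for large |x|
    return is_negative_branch * R_negative + (1.0 - is_negative_branch) * R_positive
```

With sigma = 1, an error of -2 correctly takes the linear branch (loss 1.5) instead of the quadratic one (loss 2.0).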
Thank you for this project!
I don't exactly understand the approximate joint training method.
I know the RPN and detector are merged into one network during training.
The forward pass starts at the pre-trained conv network, goes through the RPN, and finally arrives at the Fast R-CNN layers. The loss is computed as:
RPN classification loss + RPN regression loss + detection classification loss + detection bounding-box regression loss.
But what is the backpropagation path? Does it go from the detector, through the RPN, and finally to the pre-trained convnet?
In that case, how is the derivative taken through the decoder section of the RPN? The offsets produced by the 1x1 regression conv layer in the RPN are translated into proposals in the decoder.
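As I understand approximate joint training, the answer is that no gradient flows through the proposal coordinates at all: the decoded boxes are treated as constants when fed to the detector head, and only the four summed losses propagate back through the RPN convs and the shared backbone. A hedged sketch of that idea (hypothetical decoder, not the repo's exact code):

```python
import torch

def decode_proposals(anchors, offsets):
    # Hypothetical decoder: anchors and offsets are (N, 4) tensors in
    # (cx, cy, w, h) form with the usual (tx, ty, tw, th) parameterization.
    cx = anchors[:, 0] + offsets[:, 0] * anchors[:, 2]
    cy = anchors[:, 1] + offsets[:, 1] * anchors[:, 3]
    w = anchors[:, 2] * torch.exp(offsets[:, 2])
    h = anchors[:, 3] * torch.exp(offsets[:, 3])
    return torch.stack([cx, cy, w, h], dim=1)

anchors = torch.tensor([[10.0, 10.0, 4.0, 4.0]])
offsets = torch.zeros((1, 4), requires_grad=True)  # output of the 1x1 reg conv
proposals = decode_proposals(anchors, offsets)

# Approximate joint training: detach before feeding the detector head,
# so backprop ignores the derivative of the box coordinates w.r.t. the
# RPN offsets; the RPN still learns through its own regression loss.
proposals_for_detector = proposals.detach()
```

So the "derivation through the decoder" is simply skipped; this is what makes the scheme approximate rather than exact joint training.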
I noted that the VOC2007 dataset
contains the following folders:
Annotations
ImageSets
JPEGImages
SegmentationClass
SegmentationObject
Are all of them necessary, or is it enough to provide only some of them?
I get the error:
Sample larger than population or is negative
in FasterRCNN\models\faster_rcnn.py, in this section:
# Sample, producing indices into the index maps
num_positive_anchors = len(positive_anchors)
num_negative_anchors = len(negative_anchors)
num_positive_samples = min(self._rpn_minibatch_size // 2, num_positive_anchors) # up to half the samples should be positive, if possible
num_negative_samples = self._rpn_minibatch_size - num_positive_samples # the rest should be negative
positive_anchor_idxs = random.sample(range(num_positive_anchors), num_positive_samples)
# negative_anchor_idxs = random.sample(range(num_negative_anchors), num_negative_samples) # <-- error is here
# ... fixed by
negative_anchor_idxs = random.sample(range(num_negative_anchors), min(num_negative_samples, num_negative_anchors))
I have a specific dataset with only 1 class, and there are no negative samples (images that do not contain objects of this class).
Is this the correct fix, or would it be better to add negative examples to the dataset?
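For context, the crash comes from random.sample, which raises ValueError whenever it is asked for more samples than the population holds. A self-contained sketch of the sampling logic with the clamp applied (an illustration, not the repo's exact code):

```python
import random

def sample_minibatch(num_positive_anchors, num_negative_anchors, rpn_minibatch_size=256):
    # Up to half the samples should be positive, if possible
    num_positive_samples = min(rpn_minibatch_size // 2, num_positive_anchors)
    # The rest should be negative
    num_negative_samples = rpn_minibatch_size - num_positive_samples
    positive_idxs = random.sample(range(num_positive_anchors), num_positive_samples)
    # Clamp to the available population so that images with few (or zero)
    # negative anchors do not raise "Sample larger than population".
    negative_idxs = random.sample(range(num_negative_anchors),
                                  min(num_negative_samples, num_negative_anchors))
    return positive_idxs, negative_idxs

pos, neg = sample_minibatch(num_positive_anchors=10, num_negative_anchors=50)
```

Note the trade-off: with the clamp, an image with few negatives yields an undersized RPN minibatch, whereas adding genuinely object-free images to the dataset keeps the minibatch balance closer to what the paper assumes.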
How do I set the batch size and choose the optimizer?
Hi there, can somebody please help me calculate the validation losses?
Thank you
@trzy
Thank you for your good work.
I've reviewed your PyTorch version of the code, and I have some questions.
I want to use your programs for my task, so please let me know.
Regards.
Hi, could you please provide some advice on using this approach versus using a Faster R-CNN model with the TensorFlow Object Detection API?
Thanks, you did a great job.
Hello,
first of all, thanks for the implementation. We are running some tests on a computer with two "GeForce RTX 3070, 7982 MiB" GPUs, and we have some doubts, especially regarding the duration of each epoch during training.
With
python -m tf2.FasterRCNN --train --dataset-dir=./own_dataset/ --epochs=1 --learning-rate=1e-3 --save-best-to=fasterrcnn_tf2_tmp.h5 --no-augment --cache-images
each epoch takes almost 2 hours.
With
python -m tf2.FasterRCNN --train --dataset-dir=./own_dataset/ --epochs=1 --learning-rate=1e-3 --save-best-to=fasterrcnn_tf2_tmp.h5
we get the same times per epoch.
When we add
--debug-dir=/tmp/tf_debugger/
the duration increases to more than 8 hours per epoch.
Are we misconfiguring something, or is this simply due to the dataset used?
Why don't we see any time improvement from removing data augmentation and enabling image caching?
Thank you very much!
How can I get predictions as a set of coordinates of the detected objects - i.e., not as a segmented PNG image but as, e.g., JSON like:
{
  "image_id": "ccc",
  "object": "dog",
  "x1": 33,
  "y1": 44,
  ...
}
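One way to get this is a small post-processing step. A hedged sketch, assuming (this is not the repo's API) that inference yields, per image, a list of (class_name, score, x1, y1, x2, y2) tuples:

```python
import json

def detections_to_json(image_id, detections):
    # detections: list of (class_name, score, x1, y1, x2, y2) tuples,
    # a hypothetical shape for the model's per-image output.
    records = []
    for cls, score, x1, y1, x2, y2 in detections:
        records.append({"image_id": image_id, "object": cls, "score": score,
                        "x1": x1, "y1": y1, "x2": x2, "y2": y2})
    return json.dumps(records, indent=2)

print(detections_to_json("ccc", [("dog", 0.98, 33, 44, 120, 160)]))
```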
Thank you for your good work trzy.
I want to use your Faster R-CNN programs for my task.
I want to change the number of input channels from 3 (RGB) to 4 (RGB + a new one). Where should I change the program?
Thank you.
Dear @trzy,
Thanks for the great repo. I am trying to add a new ViT backbone to your source code, using a template similar to vgg16_torch.py and modifying line 67 from:
vgg16 = torchvision.models.vgg16(weights = torchvision.models.VGG16_Weights.IMAGENET1K_V1, dropout = dropout_probability)
to
vit = torchvision.models.vit_b_16(weights=torchvision.models.ViT_B_16_Weights.DEFAULT)
Based on the ViT concept, the feature extraction should look like:
# Expand the class token to the full batch
batch_class_token = vit.class_token.expand(img.shape[0], -1, -1)
feats = torch.cat([batch_class_token, feats], dim=1)
feats = vit.encoder(feats)
# We're only interested in the representation of the classifier token that we appended at position 0
feats = feats[:, 0]
I am still lost on how to adapt the FeatureExtractor function to this concept. Please assist if possible.
Many thanks!