Comments (11)
hi, I have the same question!
In the paper, the author said "We used the method described by [2] to select reference bounding box shapes to match the data distribution". I think that means the anchor shapes are selected according to the objects' width and height, so when the training data changes, the anchor shapes may also need to change.
from squeezedet.
@ChaunceyWang Thanks for your question. Responses in-line:
I want to input images at a different resolution, but I have some problems. First, how should I design the anchors?
As @rainsoulsrx explained, you could refer to our previous paper on how to select an optimal set of anchors that best fits the data distribution.
I know that if I use squeezeDet with a 1272x375 input image, the feature map shape after fire11 is [1, 22, 76, 768]. Then how are the numbers [36., 37.], [366., 174.], ... calculated?
Anchor sizes should fit the bounding box shape distribution of your data (e.g., the shape of cars, pedestrians, cyclists, etc.). So the feature map dimension has nothing to do with anchor sizes.
Besides, I noticed your reply in #1, where you mentioned "grid size". Does that mean the (22, 76) above? If I want to input a 600x600 image, do I only need to change (H, W)? Do I need to modify the 9 anchor_shapes?
Yes, the grid size in this case is just (22, 76). Grid size depends only on the original image resolution, and it is independent of the anchor shapes.
In addition, equation (3) in the paper seems to differ from the code...
Yes, you are correct. I will fix this in the paper.
@rainsoulsrx Thanks for the reminder; I had overlooked the citation.
@BichenWuUCB Thanks for your reply. I saw these anchor shapes in resnet50_convDet.py:
H, W, B = 24, 78, 9
anchor_shapes = np.reshape(
    [np.array(
        [[ 94.,  49.], [225., 161.], [170.,  91.],
         [390., 181.], [ 41.,  32.], [128.,  64.],
         [298., 164.], [232.,  99.], [ 65.,  42.]])] * H * W,
    (H, W, B, 2)
)
which is different from the SqueezeDet config earlier. You said:
Anchor sizes should fit the bounding box shape distribution of your data
The inputs are the same (the KITTI dataset), so why are the anchor shapes different?
I think that's because in Res50, the feature map shape is 24*78 in that layer.
@rainsoulsrx However, the feature map shape of VGG16 in that layer is also 24x78. This is the source code in kitti_vgg16_config.py:
H, W, B = 24, 78, 9
anchor_shapes = np.reshape(
    [np.array(
        [[ 36.,  37.], [366., 174.], [115.,  59.],
         [162.,  87.], [ 38.,  90.], [258., 173.],
         [224., 108.], [ 78., 170.], [ 72.,  43.]])] * H * W,
    (H, W, B, 2)
)
But these anchor_shapes differ from the ResNet50 ones above; is something wrong?
And according to the paper, do the anchor shapes depend only on the shape distribution of objects in the input data?
@ChaunceyWang Yes, theoretically res50, vgg16, and squeezeDet can use the same set of anchor shapes, as long as their inputs are the same dataset at the same image resolution. The reason you see two sets of anchor shapes is the following: the KITTI evaluation script ignores objects that are smaller than a certain size (among other criteria). So when selecting anchor shapes, we could either ignore smaller objects following KITTI's standard, or we could keep them. We used the K-Means method described in our previous paper to choose two sets of anchors for those two cases.
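The K-Means selection mentioned here can be sketched roughly as follows. This is only a minimal sketch in plain NumPy; the actual selection in the paper may use a different distance metric, initialization, or filtering of small boxes:

```python
import numpy as np

def kmeans_anchors(boxes, k=9, iters=100, seed=0):
    """Pick k anchor (width, height) shapes by clustering ground-truth boxes.

    boxes: array-like of shape (N, 2) holding box widths and heights in
    input-image pixels. Returns the k cluster centers sorted by area.
    """
    boxes = np.asarray(boxes, dtype=float)
    rng = np.random.default_rng(seed)
    centers = boxes[rng.choice(len(boxes), size=k, replace=False)]
    for _ in range(iters):
        # Assign each box to its nearest center (Euclidean in (w, h) space).
        dists = np.linalg.norm(boxes[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        # Recompute each center as the mean of its assigned boxes; keep the
        # old center if a cluster happens to be empty.
        new_centers = np.array([
            boxes[assign == j].mean(axis=0) if np.any(assign == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers[np.argsort(centers.prod(axis=1))]
```

Running this over all training boxes, either with or without the small objects that KITTI's evaluation ignores, would yield the two kinds of anchor sets discussed above.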
It is a bit confusing that we used a different set of anchor shapes for res50 model, but the reason has nothing to do with the model or input dataset, etc. The res50 model that we released here is one of many that we trained and it happened to use a different set of anchors.
Does it make sense?
@BichenWuUCB Thank you! Roger that!
Hi Bichen, thanks for your patient explanation! I still have a question about anchor size selection.
anchor_shapes = np.reshape(
    [np.array(
        [[ 36.,  37.], [366., 174.], [115.,  59.],
         [162.,  87.], [ 38.,  90.], [258., 173.],
         [224., 108.], [ 78., 170.], [ 72.,  43.]])] * H * W,
    (H, W, B, 2)
)
My question is: are 36, 37, 366, 174, ... the anchors' widths and heights relative to the resized input image (1242x375) rather than to the feature map? In other words, if I change the image size, should I change the anchor sizes in proportion to the input image size, not the feature map? I'm not sure if I understand correctly.
Hi Bichen, I have another question.
My original training dataset has a resolution of 1920x1080, and the information in trainval.txt also corresponds to this size. But this large size makes training very slow, so I want to reduce the size to, for example, 450x300 when training the net. That is, I change mc.IMAGE_WIDTH and mc.IMAGE_HEIGHT to 450 and 300. Do I need to change the information in trainval.txt at the same time, or will the code make the change automatically?
@rainsoulsrx: thanks for your questions.
First question: anchor sizes are relative to the original image size, and the grid size (H, W) is also relative to the image size. Depending on the padding/striding strategies, H and W are roughly 1/16 of the original image height and width. So if you want to down-sample your input image by half (in both width and height), then yes, you need to down-sample your anchor sizes by half, and you need to change the grid size (H, W) to match the spatial dimensions of the last conv layer's output.
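To make that concrete, here is a small sketch (a hypothetical helper, not from the repo) that rescales a set of anchor (w, h) pairs and recomputes the grid for a new input resolution, assuming a total stride of 16:

```python
import numpy as np

def rescale_anchors(anchor_wh, old_size, new_size, stride=16):
    """Rescale anchor shapes and recompute the grid for a new input size.

    anchor_wh: list of (width, height) anchor shapes in pixels.
    old_size, new_size: (width, height) of the old and new input images.
    stride: approximate total down-sampling factor of the backbone; verify
    it against the actual conv output shape, since padding/striding matter.
    Returns (scaled_anchors, (grid_h, grid_w)).
    """
    sx = new_size[0] / old_size[0]
    sy = new_size[1] / old_size[1]
    scaled = np.asarray(anchor_wh, dtype=float) * np.array([sx, sy])
    return scaled, (new_size[1] // stride, new_size[0] // stride)
```

For example, halving a 1200x400 input to 600x200 halves every anchor and gives a grid of (12, 37).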
Second question: in my implementation, trainval.txt only contains indices to images. Image resolution is specified by mc.IMAGE_WIDTH and mc.IMAGE_HEIGHT in the config file. Once you modify those two variables correctly, input images will be resized to the desired resolution. You don't need to (and shouldn't) specify image resolution in trainval.txt.
Hope that helps,
Bichen
Hi Bichen,
I have a training set with an image resolution of 1280x720. When I modified the image height and width in "kitti_model_config.py" and "kitti_squeezeDet_config.py", I got the following error:
ValueError: Cannot reshape a tensor with 648000 elements to shape [10,16848,2] (336960 elements) for 'interpret_output/pred_class_probs' (op: 'Reshape') with input shapes: [324000,2], [3] and with input tensors computed as partial shapes: input[1] = [10,16848,2].
Can you please help me resolve this issue?
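For what it's worth, the numbers in that ValueError look consistent with H and W in the config not having been updated for the 1280x720 input. This is only a guess from the arithmetic, assuming a batch size of 10, 2 classes, 9 anchors per grid cell, and a total stride of 16, all read off the error message and the configs above:

```python
# Target shape [10, 16848, 2] is [batch, ANCHORS, CLASSES], where
# ANCHORS = H * W * 9 = 24 * 78 * 9 = 16848 comes from the old config values.
batch, num_classes, anchors_per_cell = 10, 2, 9

# The tensor actually produced by the network has 648000 elements:
actual_anchors = 648000 // (batch * num_classes)    # 32400
actual_cells = actual_anchors // anchors_per_cell   # 3600

# 3600 grid cells is exactly what a 1280x720 input yields at stride 16:
assert actual_cells == (720 // 16) * (1280 // 16)   # 45 * 80 = 3600
```

If that is the cause, updating H and W (and regenerating the anchor grid) to match the new conv output, presumably 45x80 here, should make the reshape consistent.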