bichenwuucb / squeezedet
A tensorflow implementation for SqueezeDet, a convolutional neural network for object detection.
License: BSD 2-Clause "Simplified" License
I ran the evaluation on the val set (3741 images), following your guidance, with the squeezeDet model (model.ckpt-87000). However, I got a very low mAP:
Average precisions:
car_easy: 0.144
car_medium: 0.122
car_hard: 0.120
pedestrian_easy: 0.299
pedestrian_medium: 0.275
pedestrian_hard: 0.251
cyclist_easy: 0.272
cyclist_medium: 0.200
cyclist_hard: 0.205
Mean average precision: 0.210
I do not doubt the results presented in the paper. There must be something wrong in my experiment. Can you figure it out and give me some help?
When retraining with my own data set of 500x500 images, I get the following error:
ValueError: Cannot reshape a tensor with 466560 elements to shape [20,8100,2] (324000 elements) for 'interpret_output/pred_class_probs' (op: 'Reshape') with input shapes: [233280,2], [3].
What is the best way to modify the net and anchors?
H, W, B = 22, 76, 9
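The numbers in this error message are enough to recover how many anchors the network actually produced. A minimal sketch, assuming the reshape target is [batch, anchors, classes] as the message suggests; the function name is made up for illustration and is not part of the repository:

```python
def anchors_from_reshape_error(total_elements, batch_size, num_classes, anchors_per_grid):
    """Recover the anchor count and grid-cell count implied by the reshape error."""
    anchors, rem = divmod(total_elements, batch_size * num_classes)
    if rem:
        raise ValueError("element count is not a multiple of batch_size * num_classes")
    cells, rem = divmod(anchors, anchors_per_grid)
    if rem:
        raise ValueError("anchor count is not a multiple of anchors_per_grid")
    return anchors, cells


# Values taken from the error above: 466560 elements, batch 20, 2 classes, B = 9.
actual_anchors, actual_cells = anchors_from_reshape_error(466560, 20, 2, 9)
configured_cells = 8100 // 9  # the config asked for 8100 anchors
print(actual_anchors, actual_cells, configured_cells)  # 11664 1296 900
```

The network emits 1296 grid cells (36 x 36 would be one factorization) while the config assumes 900 (30 x 30), so H and W in the config need to change until H * W * B equals the anchor count the network produces.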
Hey, I attempted to pre-train the SqueezeDet model on ImageNet, freeze the first conv layer and then train on KITTI but the performance was significantly poorer than that stated in the paper. For ImageNet, I trained for ~50 epochs and the validation accuracy reached ~40% and subsequently trained on KITTI for 100k iterations. At its highest, the KITTI mAP was approximately 60%. I'm just curious as to the conditions you used for pre-training. Thanks!
Hi, when I retrain the squeezeDet net on the KITTI dataset with batch_size set to 1 or to 10, I find that the GPU usage is the same in both cases. I am confused by this.
Thanks very much for your work. I cannot download your pretrained model because I am in China. Can you offer an alternative download location, such as Baidu net disk?
Recently, I compared the detection pipelines of SqueezeDet and Single Shot Detection (by Wei Liu, https://arxiv.org/abs/1512.02325). I found that squeezeDet is basically equivalent to SSD applied on only one feature layer. There are some minor differences beyond this, but I really found nothing more.
@BichenWuUCB What is your opinion? Can you spot some more difference? Thank you.
Hi @BichenWuUCB,
I am trying to run SqueezeDet with my dataset. My images have a resolution of 720 x 405 pixels and there are 15 classes of objects in the dataset.
I am able to run the demo and train the KITTI dataset with SqueezeDet, but when I use my images I get a problem with the dimension of the tensor in the softmax layer.
This is my setup file:
# Model configuration for pascal dataset
import numpy as np
from config import base_model_config

def my_squeezeDetPlus_config():
    "Specify the parameters to tune below."
    mc = base_model_config('my_model')
    mc.IMAGE_WIDTH = 720  # 720 # 1242
    mc.IMAGE_HEIGHT = 405  # 405 # 375
    mc.BATCH_SIZE = 20
    mc.WEIGHT_DECAY = 0.0001
    mc.LEARNING_RATE = 0.01
    mc.DECAY_STEPS = 10000
    mc.MAX_GRAD_NORM = 1.0
    mc.MOMENTUM = 0.9
    mc.LR_DECAY_FACTOR = 0.5
    mc.LOSS_COEF_BBOX = 5.0
    mc.LOSS_COEF_CONF_POS = 75.0
    mc.LOSS_COEF_CONF_NEG = 100.0
    mc.LOSS_COEF_CLASS = 1.0
    mc.PLOT_PROB_THRESH = 0.4
    mc.NMS_THRESH = 0.4
    mc.PROB_THRESH = 0.005
    mc.TOP_N_DETECTION = 64
    mc.DATA_AUGMENTATION = True
    mc.DRIFT_X = 150
    mc.DRIFT_Y = 100
    mc.EXCLUDE_HARD_EXAMPLES = False
    mc.ANCHOR_BOX = set_anchors(mc)
    mc.ANCHORS = len(mc.ANCHOR_BOX)
    mc.ANCHOR_PER_GRID = 9
    return mc

def set_anchors(mc):
    (H, W, B) = (22, 76, 9)  # (22, 42, 9)
    anchor_shapes = np.reshape([np.array([
        [36., 37.],
        [366., 174.],
        [115., 59.],
        [162., 87.],
        [38., 90.],
        [258., 173.],
        [224., 108.],
        [78., 170.],
        [72., 43.],
        ])] * H * W, (H, W, B, 2))
    center_x = \
        np.reshape(np.transpose(np.reshape(np.array([np.arange(1, W
                   + 1) * float(mc.IMAGE_WIDTH) / (W + 1)] * H * B),
                   (B, H, W)), (1, 2, 0)), (H, W, B, 1))
    center_y = \
        np.reshape(np.transpose(np.reshape(np.array([np.arange(1, H
                   + 1) * float(mc.IMAGE_HEIGHT) / (H + 1)] * W * B),
                   (B, W, H)), (2, 1, 0)), (H, W, B, 1))
    anchors = np.reshape(np.concatenate((center_x, center_y,
                         anchor_shapes), axis=3), (-1, 4))
    return anchors
When I run the scripts I get this error
File "/home/my_dataset/squeezeDet/src/nn_skeleton.py", line 135, in _add_interpretation_graph
ValueError: Cannot reshape a tensor with 2786400 elements to shape [20,15048,15] (4514400 elements) for 'interpret_output/pred_class_probs' (op: 'Reshape') with input shapes: [185760,15], [3].
I assume this is a problem with the number of anchors (15048) that I am using. The default number of anchors does not seem appropriate for my dataset but I cannot understand how to set those values correctly.
The parameters H, W and B are set by default to "(22, 76, 9)" inside the function "set_anchors" (in the setup file above). How were these values selected? Is there a formula to calculate the appropriate number of batch, anchors and classes (so that they match with the tensor dimension) given an image resolution?
Thanks a lot
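The error message itself hints at the grid size: 2786400 elements divided by batch 20, 15 classes and 9 anchors per cell gives 1032 cells, and 24 x 43 is one factorization close to the 405 x 720 aspect ratio. Below is a more readable sketch of the anchor construction from the config above, with H and W passed in rather than hard-coded. The function name and the 24 x 43 choice are assumptions for illustration, not values confirmed by the authors.

```python
import numpy as np


def make_anchors(image_width, image_height, grid_h, grid_w, anchor_shapes):
    """Build an (H*W*B, 4) array of [cx, cy, w, h] anchors on an evenly spaced grid."""
    shapes = np.asarray(anchor_shapes, dtype=np.float64)
    num_shapes = shapes.shape[0]
    # Same spacing as the original config: centers at k * size / (n + 1), k = 1..n.
    cx = np.arange(1, grid_w + 1) * float(image_width) / (grid_w + 1)
    cy = np.arange(1, grid_h + 1) * float(image_height) / (grid_h + 1)
    anchors = np.empty((grid_h, grid_w, num_shapes, 4))
    anchors[..., 0] = cx[None, :, None]          # x varies along the W axis
    anchors[..., 1] = cy[:, None, None]          # y varies along the H axis
    anchors[..., 2:] = shapes[None, None, :, :]  # every cell gets all B shapes
    return anchors.reshape(-1, 4)


kitti_shapes = [[36., 37.], [366., 174.], [115., 59.], [162., 87.], [38., 90.],
                [258., 173.], [224., 108.], [78., 170.], [72., 43.]]
anchors = make_anchors(720, 405, 24, 43, kitti_shapes)
print(anchors.shape)  # (9288, 4)
```

With this grid, len(anchors) is 9288, which matches the 2786400 / (20 * 15) anchors the network reports. The nine shapes are still the KITTI ones and would likely need to be re-estimated for a different dataset.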
How can I convert model.ckpt-152000.meta into a .pb file? We want to use it with graph_transfer.
Thank you!!
While training on the KITTI data set, I seem to run out of memory when I try to run both evaluation scripts simultaneously (scripts/eval_train.sh and scripts/eval_val.sh):
Internal: failed initializing StreamExecutor for CUDA device ordinal 1: Internal: failed call to cuCtxCreate: CUDA_ERROR_OUT_OF_MEMORY;
Is this just a matter of adjusting my batch size so that all 3 processes have enough memory to run simultaneously? I have ~24GB of VRAM across 2 Titan XPs. Since this implementation does not seem to be set up to support multiple GPUs, I've tried using GPU 0 for training and GPU 1 for running the two evaluation scripts, but I still run out of memory.
Thanks!
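One thing that may be relevant here: by default, a TensorFlow 1.x process reserves nearly all memory on every GPU it can see, regardless of batch size, so two evaluation processes on the same GPU can collide even when the model is small. A hedged sketch of a session config that caps the reservation; the helper name is hypothetical and the TensorFlow module is passed in as an argument so the snippet does not depend on a particular install:

```python
def make_session_config(tf_module, memory_fraction=0.3):
    """Build a TF 1.x session config that limits how much GPU memory is reserved."""
    gpu_options = tf_module.GPUOptions(
        # Upper bound on the fraction of each visible GPU this process may claim.
        per_process_gpu_memory_fraction=memory_fraction,
        # Allocate lazily instead of grabbing the whole allowance at start-up.
        allow_growth=True)
    return tf_module.ConfigProto(allow_soft_placement=True, gpu_options=gpu_options)
```

Usage would look like tf.Session(config=make_session_config(tf, 0.3)) in each evaluation script, with the fraction chosen so that the processes sharing a GPU sum to less than 1. Restricting each process to one device with the CUDA_VISIBLE_DEVICES environment variable is a complementary option.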
The function safe_exp:
lin_region = tf.to_float(w > thresh)
lin_out = slope*(w - thresh + 1.)
exp_out = tf.exp(w)
out = lin_region*lin_out + (1.-lin_region)*exp_out
This solves the exploding-exponent problem but is not safe. exp_out is still exp(w), so it can still overflow to inf (or become NaN). The variable out then becomes NaN, because multiplying inf or NaN by 0 returns NaN, and adding NaN also returns NaN.
I propose adding a select so that w is not fed to exp when the function is in the linear region, like this:
lin_region_bool = tf.greater(w , thresh)
lin_region = tf.to_float(lin_region_bool)
lin_out = tf.multiply(lin_region,tf.multiply(slope,tf.add(tf.subtract(w , thresh), float(1.0))))
exp_out_safe = tf.exp(tf.where(lin_region_bool,tf.multiply(tf.ones_like(w),thresh),w))
exp_out = tf.multiply(tf.subtract(float(1.0),lin_region),exp_out_safe)
out = tf.add(lin_out,exp_out)
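The effect can be reproduced without TensorFlow. The following NumPy sketch mirrors both variants; it assumes slope = exp(thresh), which is not shown in the snippet above, and the function names are illustrative only:

```python
import numpy as np


def safe_exp_naive(w, thresh):
    """Mask-and-add version: exp(w) is still evaluated in the linear region."""
    slope = np.exp(thresh)  # assumption: slope matches exp at the threshold
    lin_region = (w > thresh).astype(w.dtype)
    lin_out = slope * (w - thresh + 1.0)
    with np.errstate(over="ignore", invalid="ignore"):
        exp_out = np.exp(w)  # overflows to inf for large w
        return lin_region * lin_out + (1.0 - lin_region) * exp_out  # 0 * inf -> nan


def safe_exp_clamped(w, thresh):
    """Select version: the input to exp is clamped, so it never overflows."""
    slope = np.exp(thresh)
    in_lin = w > thresh
    lin_out = slope * (w - thresh + 1.0)
    exp_out = np.exp(np.where(in_lin, thresh, w))
    return np.where(in_lin, lin_out, exp_out)


w = np.array([0.0, 1.0, 1000.0])
print(safe_exp_naive(w, 1.0))    # last entry is nan
print(safe_exp_clamped(w, 1.0))  # all entries finite
```

The key point is that masking the output is not enough; the argument of exp has to be replaced before exp is evaluated.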
Your work impresses me with its high speed and good performance. Before training on my own data, I ran your squeezeDetPlus model (model.ckpt-95000) on the KITTI test set (7518 images). However, the result is not good.
The pedestrian mAP I see in the paper (Table 2) is:
The result from my run is:
(not identical to the official KITTI server ground truth; refer to my answer to dojoscan below)
To be honest, I trust your result, so there must be something wrong in the code. I only modified demo.py; here is my demo.py. All other code is the same as yours.
I have checked my demo.py over and over again, and I suspect there are bugs in the original demo.py. Could you figure it out? I hope you can help with that!
def image_demo():
    """Detect image."""
    with tf.Graph().as_default():
        # Load model
        mc = kitti_squeezeDetPlus_config()
        mc.BATCH_SIZE = 1
        # model parameters will be restored from checkpoint
        mc.LOAD_PRETRAINED_MODEL = False
        model = SqueezeDetPlus(mc, FLAGS.gpu)
        saver = tf.train.Saver(model.model_params)
        # set model path
        FLAGS.checkpoint = r'/home/project/HumanDetection/squeezeDet_github/models/squeezeDetPlus/model.ckpt-95000'
        # set test data path
        basic_image_path = r'/home/dataSet/kitti/ori_data/left_image/testing/image_2/'
        list_path = r'/home/dataSet/kitti/ori_data/left_image/testing/test_list.txt'
        write_result_path = r'/home/dataSet/kitti/ori_data/left_image/testing/run_out/'
        with open(list_path, 'rt') as F_read_list:
            image_list_name = [x.strip() for x in F_read_list.readlines()]
        print ('image numbers: ', len(image_list_name))
        count_num = 0
        pedestrian_index = int(1)  ## for pedestrian index
        keep_score = 0.05  #
        # write file format
        default_str_1 = 'Pedestrian -1 -1 -10'
        default_str_2 = '-1 -1 -1 -1000 -1000 -1000 -10'
        with tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) as sess:
            saver.restore(sess, FLAGS.checkpoint)
            for file_name in image_list_name:
                read_full_name = basic_image_path + file_name
                im = cv2.imread(read_full_name)
                if im is None:
                    print (file_name, ' is empty!')
                    continue
                im = im.astype(np.float32, copy=False)
                im = cv2.resize(im, (mc.IMAGE_WIDTH, mc.IMAGE_HEIGHT))
                input_image = im - mc.BGR_MEANS
                # Detect
                det_boxes, det_probs, det_class = sess.run(
                    [model.det_boxes, model.det_probs, model.det_class],
                    feed_dict={model.image_input:[input_image], model.keep_prob: 1.0})
                # NMS Filter
                final_boxes, final_probs, final_class = model.filter_prediction(
                    det_boxes[0], det_probs[0], det_class[0])
                ## only keep high probability pedestrian
                keep_idx = [idx for idx in range(len(final_probs)) \
                            if final_probs[idx] > keep_score]
                final_boxes = [final_boxes[idx] for idx in keep_idx]
                final_probs = [final_probs[idx] for idx in keep_idx]
                final_class = [final_class[idx] for idx in keep_idx]
                # -------------- write files -----------------------
                F_w_one_by_one = open(write_result_path + file_name.replace('png', 'txt'), 'wt')
                rect_num = final_class.count(pedestrian_index)
                print ('count: ', count_num)
                count_num += 1
                if rect_num == 0:
                    F_w_one_by_one.close()
                    continue
                goal_index = [idx for idx, value in enumerate(final_class) if value == pedestrian_index]
                for kk in goal_index:
                    box = final_boxes[kk]
                    xmin = box[0] - box[2]/2.0
                    ymin = box[1] - box[3]/2.0
                    xmax = box[0] + box[2]/2.0
                    ymax = box[1] + box[3]/2.0
                    line_2 = default_str_1 + ' ' + str(xmin) + ' ' + str(ymin) + ' ' + \
                        str(xmax) + ' ' + str(ymax) + ' ' + default_str_2 + ' ' + str(final_probs[kk]) + '\n'
                    F_w_one_by_one.write(line_2)
                F_w_one_by_one.close()

def main(argv=None):
    image_demo()

if __name__ == '__main__':
    tf.app.run()
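One detail worth checking in a script like this (not confirmed in the thread as the cause of the low score): the image is resized to mc.IMAGE_WIDTH x mc.IMAGE_HEIGHT before inference, so the predicted boxes live in the resized frame, while the result file is expected in the original image's pixel coordinates. If an image's native size differs from the network input size, the boxes need to be scaled back. A minimal sketch with a hypothetical helper that reuses the placeholder fields from the script above:

```python
def kitti_result_line(class_name, box_cxcywh, score, net_size, orig_size):
    """Format one detection as a KITTI-style result line in original-image pixels."""
    net_w, net_h = net_size
    orig_w, orig_h = orig_size
    sx = float(orig_w) / net_w  # horizontal scale from network input to original
    sy = float(orig_h) / net_h  # vertical scale from network input to original
    cx, cy, w, h = box_cxcywh
    # Convert center format to corners, rescale, then clip to the image.
    xmin = max(0.0, (cx - w / 2.0) * sx)
    ymin = max(0.0, (cy - h / 2.0) * sy)
    xmax = min(orig_w - 1.0, (cx + w / 2.0) * sx)
    ymax = min(orig_h - 1.0, (cy + h / 2.0) * sy)
    return "%s -1 -1 -10 %.2f %.2f %.2f %.2f -1 -1 -1 -1000 -1000 -1000 -10 %.4f" % (
        class_name, xmin, ymin, xmax, ymax, score)


print(kitti_result_line("Pedestrian", (100, 50, 40, 20), 0.9, (200, 100), (100, 50)))
```

In the loop above, orig_size would come from im.shape taken before the call to cv2.resize.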
Hi,
I wonder whether you use an IOU threshold for box matching.
If you do, can you explain what criterion you use to decide the anchor sizes?
Thanks in advance.
Byeonghak
Hi @BichenWuUCB
So I'm looking into using squeezeDet on my own data that's not in KITTI format. Do you have any suggestions on how to go about this?
Hi Bichen, thanks for your impressive work.
I want to use a different input image resolution, but I have some problems. First, how should I design the anchors? I saw this code in kitti_squeezeDet_config.py:
H, W, B = 22, 76, 9
anchor_shapes = np.reshape(
[np.array(
[[ 36., 37.], [ 366., 174.], [ 115., 59.],
[ 162., 87.], [ 38., 90.], [ 258., 173.],
[ 224., 108.], [ 78., 170.], [ 72., 43.]])] * H * W,
(H, W, B, 2)
)
I know that if I use squeezeDet and the input image is 1242x375, the feature map shape after fire11 is [1,22,76,768]. But how are the numbers [ 36., 37.], [ 366., 174.], ... calculated? Could you explain how to choose them if the input image is, for example, 600x600?
Besides, I noticed your reply at https://github.com/BichenWuUCB/squeezeDet/issues/1, where you mentioned "grid size". Does that mean the (22,76) above? If I want to input a 600x600 image, do I only need to change "(H,W)"? How should I modify the 9 anchor_shapes?
In addition, equation (3) in the paper seems to differ from the code.
The code in imdb.py is:
delta[0] = (box_cx - mc.ANCHOR_BOX[aidx][0])/box_w
delta[1] = (box_cy - mc.ANCHOR_BOX[aidx][1])/box_h
delta[2] = np.log(box_w/mc.ANCHOR_BOX[aidx][2])
delta[3] = np.log(box_h/mc.ANCHOR_BOX[aidx][3])
I think the code is right. There are four equations above. For the first and the second equation, should the term shown in the paper [formula image not preserved] be [formula image not preserved]? I am not sure.
Thank you, I hope for your reply!
Apart from config file and anchor size changes, what do I have to keep in mind if I want to implement this for PascalVOC?
Please guide.
Hi, @BichenWuUCB. I'm currently playing with your squeezeDet code, and I found a suspicious part, so I would be grateful if you could check whether it is wrong.
In the read_batch function in /dataset/imdb.py, the code that converts a ground truth bounding box into deltas is
delta[0] = (box_cx - mc.ANCHOR_BOX[aidx][0])/box_w
delta[1] = (box_cy - mc.ANCHOR_BOX[aidx][1])/box_h
delta[2] = np.log(box_w/mc.ANCHOR_BOX[aidx][2])
delta[3] = np.log(box_h/mc.ANCHOR_BOX[aidx][3])
Isn't it correct to divide delta[0] and delta[1] by mc.ANCHOR_BOX[aidx][2] and mc.ANCHOR_BOX[aidx][3] respectively? Equation (3) in the paper divides by the anchor box width/height, and I personally think that dividing by a fixed value (anchor box width/height) is a more stable normalization.
Thanks!
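To make the difference concrete, here is a small sketch of both normalizations, plus the decoding implied by dividing by the anchor size. It is an illustration of the question, not a statement of which variant the authors intended; the function names and the normalize_by switch are hypothetical.

```python
import numpy as np


def encode_delta(box, anchor, normalize_by="box"):
    """Encode a ground-truth box [cx, cy, w, h] relative to an anchor [cx, cy, w, h]."""
    bx, by, bw, bh = [float(v) for v in box]
    ax, ay, aw, ah = [float(v) for v in anchor]
    if normalize_by == "box":       # what the quoted imdb.py code does
        norm_w, norm_h = bw, bh
    elif normalize_by == "anchor":  # what the issue says equation (3) does
        norm_w, norm_h = aw, ah
    else:
        raise ValueError("normalize_by must be 'box' or 'anchor'")
    return np.array([(bx - ax) / norm_w, (by - ay) / norm_h,
                     np.log(bw / aw), np.log(bh / ah)])


def decode_delta_anchor(delta, anchor):
    """Invert the anchor-normalized encoding."""
    ax, ay, aw, ah = [float(v) for v in anchor]
    return np.array([ax + delta[0] * aw, ay + delta[1] * ah,
                     aw * np.exp(delta[2]), ah * np.exp(delta[3])])


anchor = (100, 50, 36, 37)
box = (118, 50, 72, 37)
print(encode_delta(box, anchor, "box"))     # first entry 18 / 72 = 0.25
print(encode_delta(box, anchor, "anchor"))  # first entry 18 / 36 = 0.5
```

Only the anchor-normalized targets round-trip through decode_delta_anchor, so the encoding used for training targets and the decoding used at inference need to agree.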
Hello,
I am attempting to train squeezeDet on my own images. They are size 1600x1200. I set the grid H, W to be roughly 1/16 of the size of the H, W respectively (75 and 100) as recommended. I also adjusted the batch size because batch size of 20 was not fitting in memory.
The training runs smoothly for up to 350 iterations but then produces the following error:
File "/p/home/squeezeDet/src/dataset/imdb.py", line 157, in read_batch
max_drift_x = min(gt_bbox[:, 0] - gt_bbox[:, 2]/2.0+1)
IndexError: too many indices for array
Does anyone have any advice or knowledge of what may be causing this?
The full error output is below.
Traceback (most recent call last):
File "./src/train.py", line 345, in <module>
tf.app.run()
File "/p/work2/projects/ryat/modules/tensorflow/1.0.0/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 44, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "./src/train.py", line 341, in main
train()
File "./src/train.py", line 270, in train
coord.join(threads)
File "/p/work2/projects/ryat/modules/tensorflow/1.0.0/lib/python2.7/site-packages/tensorflow/python/training/coordinator.py", line 386, in join
six.reraise(*self._exc_info_to_raise)
File "./src/train.py", line 229, in _enqueue
feed_dict, _, _, _ = _load_data()
File "./src/train.py", line 166, in _load_data
bbox_per_batch = imdb.read_batch()
File "/p/home/squeezeDet/src/dataset/imdb.py", line 157, in read_batch
max_drift_x = min(gt_bbox[:, 0] - gt_bbox[:, 2]/2.0+1)
IndexError: too many indices for array
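One plausible cause, not confirmed in the thread: an image whose label file contains no boxes. An empty list becomes a 1-D NumPy array, and indexing it with [:, 0] raises exactly this IndexError. A minimal sketch that separates usable images from empty ones before training; the function and the annotations dictionary are hypothetical and not part of imdb.py:

```python
import numpy as np


def split_by_annotation(annotations):
    """Split image ids into those with at least one [cx, cy, w, h] box and the rest."""
    usable, empty = [], []
    for image_id in sorted(annotations):
        arr = np.asarray(annotations[image_id], dtype=np.float64)
        # A valid annotation is a non-empty 2-D array with four columns.
        if arr.ndim == 2 and arr.shape[0] > 0 and arr.shape[1] == 4:
            usable.append(image_id)
        else:
            empty.append(image_id)
    return usable, empty


annotations = {"000001": [[10, 10, 4, 4]], "000002": [], "000003": [[1, 2, 3, 4], [5, 6, 7, 8]]}
usable, empty = split_by_annotation(annotations)
print(usable, empty)
```

Images listed in empty could then be removed from the image set file, or handled explicitly in the data loader.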
Hi, is it possible to train on the KITTI dataset from scratch (without a pre-trained model), for example by setting cfg.LOAD_PRETRAINED_MODEL to False? It seems that the model does not converge easily that way.
Can the code be used with the VOC dataset? I have tried many times, but I get the error "Currently support KITTI dataset". How can I fix it?
Hello,
I have a GTX 1060 6G GPU, so I can only fine-tune the model. I know the images and annotations should be in KITTI format; that is no problem.
Would you please give a short description of the fine-tuning process for your model? I would really appreciate it.
How many classes does the model support?
Hi,
I've done the installation and am trying to run the demo as described in the README. I downloaded model_checkpoints.tgz and untarred it. Now I'm running python ./src/demo.py. It works as expected with the default parameters. But if I try to specify any other checkpoint from model_checkpoints, it doesn't work. E.g. if I do this:
python ./src/demo.py --checkpoint=./data/model_checkpoints/squeezeDetPlus/model.ckpt-95000
I get the following error:
$ python ./src/demo.py --checkpoint=./data/model_checkpoints/squeezeDetPlus/model.ckpt-95000
2017-05-30 18:09:07.901895: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-05-30 18:09:07.901920: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-05-30 18:09:07.901926: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-05-30 18:09:07.901941: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-05-30 18:09:07.901946: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
Traceback (most recent call last):
File "./src/demo.py", line 217, in <module>
tf.app.run()
File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "./src/demo.py", line 212, in main
image_demo()
File "./src/demo.py", line 164, in image_demo
saver.restore(sess, FLAGS.checkpoint)
File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1457, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 778, in run
run_metadata_ptr)
File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 982, in _run
feed_dict_string, options, run_metadata)
File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1032, in _do_run
target_list, options, run_metadata)
File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1052, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [64] rhs shape= [96]
[[Node: save/Assign = Assign[T=DT_FLOAT, _class=["loc:@conv1/biases"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](conv1/biases, save/RestoreV2)]]
Caused by op u'save/Assign', defined at:
File "./src/demo.py", line 217, in <module>
tf.app.run()
File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "./src/demo.py", line 212, in main
image_demo()
File "./src/demo.py", line 161, in image_demo
saver = tf.train.Saver(model.model_params)
File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1056, in __init__
self.build()
File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1086, in build
restore_sequentially=self._restore_sequentially)
File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 691, in build
restore_sequentially, reshape)
File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 419, in _AddRestoreOps
assign_ops.append(saveable.restore(tensors, shapes))
File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 155, in restore
self.op.get_shape().is_fully_defined())
File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/ops/state_ops.py", line 270, in assign
validate_shape=validate_shape)
File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/ops/gen_state_ops.py", line 47, in assign
use_locking=use_locking, name=name)
File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 768, in apply_op
op_def=op_def)
File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2336, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1228, in __init__
self._traceback = _extract_stack()
InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [64] rhs shape= [96]
[[Node: save/Assign = Assign[T=DT_FLOAT, _class=["loc:@conv1/biases"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](conv1/biases, save/RestoreV2)]]
Is it expected?
I am trying to train this model on my own dataset. I created a folder structure similar to KITTI's, and my images are the same size as KITTI's. But my dataset has objects of various sizes, and I think the KITTI anchors don't fit it. For example, there is a special script for YOLO v2, based on k-means, that creates anchors from the training samples. Is there some way to calculate anchor boxes for my own dataset?
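A rough sketch of the k-means idea mentioned above, clustering ground-truth (width, height) pairs with IoU as the similarity measure. This is a generic illustration in the spirit of the YOLO v2 approach, not the squeezeDet authors' procedure; all names are made up for the example.

```python
import numpy as np


def iou_wh(boxes, centroids):
    """IoU between (N, 2) boxes and (K, 2) centroids, all centered at the same point."""
    inter = (np.minimum(boxes[:, None, 0], centroids[None, :, 0]) *
             np.minimum(boxes[:, None, 1], centroids[None, :, 1]))
    area_b = boxes[:, 0] * boxes[:, 1]
    area_c = centroids[:, 0] * centroids[:, 1]
    return inter / (area_b[:, None] + area_c[None, :] - inter)


def kmeans_anchor_shapes(boxes_wh, k, iters=100, seed=0):
    """Cluster box shapes into k anchor shapes, returned sorted by area."""
    boxes_wh = np.asarray(boxes_wh, dtype=np.float64)
    rng = np.random.RandomState(seed)
    centroids = boxes_wh[rng.choice(len(boxes_wh), k, replace=False)]
    for _ in range(iters):
        # Assign each box to the centroid it overlaps most.
        assign = np.argmax(iou_wh(boxes_wh, centroids), axis=1)
        updated = np.array([boxes_wh[assign == j].mean(axis=0) if np.any(assign == j)
                            else centroids[j] for j in range(k)])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return centroids[np.argsort(centroids[:, 0] * centroids[:, 1])]


boxes = [[10, 10], [12, 12], [11, 11], [100, 50], [110, 55], [105, 52]]
print(kmeans_anchor_shapes(boxes, k=2))
```

For a real dataset, boxes would hold the width and height of every ground-truth box after scaling to the network input size, and k would be the number of anchors per grid cell (9 in the KITTI config).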
Hello.
I wonder if you have some experience with the issue of box matching.
Most CNN-based object detectors have a box matching strategy that assigns prior (anchor) boxes to ground truth boxes during training.
Many well-known detection models, such as RCNN and SSD, match a prior (anchor) box to a ground truth box when their IOU is over 0.5.
But in that case, if no prior (anchor) box has an IOU over 0.5 with some ground truth box, that ground truth is always treated as a negative. This is a big problem. To solve it, those models need a lot of prior (anchor) boxes, which reduces processing speed.
I found that your squeezeDet matches each ground truth to the box with the highest IOU, so that every ground truth gets an anchor box and is never assigned as a negative.
My question is whether you have tried the threshold-based method above. If yes, what was the difference between that method and yours?
I have tried your method, but the loss does not go below 3, whereas with the threshold-based method it gets near 1 by the end of training. It also misses a lot of objects (about 40%).
I think the reason is that the localization offset becomes very large when the IOU is too small.
My english is not very good so if you don't understand the question then please tell me.
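The two matching strategies described above can be compared on a toy example. A minimal sketch, assuming boxes and anchors in [cx, cy, w, h] format; the helper names are illustrative and do not come from the repository:

```python
import numpy as np


def iou_cxcywh(box, anchors):
    """IoU between one box and an (N, 4) array of anchors, all in [cx, cy, w, h]."""
    box = np.asarray(box, dtype=np.float64)
    anchors = np.asarray(anchors, dtype=np.float64)
    x1 = np.maximum(box[0] - box[2] / 2, anchors[:, 0] - anchors[:, 2] / 2)
    y1 = np.maximum(box[1] - box[3] / 2, anchors[:, 1] - anchors[:, 3] / 2)
    x2 = np.minimum(box[0] + box[2] / 2, anchors[:, 0] + anchors[:, 2] / 2)
    y2 = np.minimum(box[1] + box[3] / 2, anchors[:, 1] + anchors[:, 3] / 2)
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    union = box[2] * box[3] + anchors[:, 2] * anchors[:, 3] - inter
    return inter / union


def match_best(box, anchors):
    """Highest-IoU matching: every ground truth gets exactly one anchor."""
    return int(np.argmax(iou_cxcywh(box, anchors)))


def match_threshold(box, anchors, thresh=0.5):
    """Threshold matching: all anchors with IoU >= thresh, possibly none."""
    return np.flatnonzero(iou_cxcywh(box, anchors) >= thresh).tolist()


anchors = [[50, 50, 20, 20], [50, 50, 100, 100], [200, 200, 20, 20]]
gt_box = [50, 50, 40, 40]
print(iou_cxcywh(gt_box, anchors))       # [0.25, 0.16, 0.0]
print(match_best(gt_box, anchors))       # 0
print(match_threshold(gt_box, anchors))  # []
```

Here the best anchor has an IoU of only 0.25, so threshold matching leaves the ground truth unmatched, while highest-IoU matching assigns it an anchor that needs a large width and height correction. That is consistent with the trade-off described in the question.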
Hi there,
Your algorithm impresses me with an inference speed faster than that of other algorithms, and I am about to start applying it to my own work. I would like to ask whether you have a KITTI benchmark score on the test set. I noticed that your paper compares other algorithms' benchmark scores with your scores on the validation set.
First of all, thanks a lot for sharing your code. SqueezeDet is really great.
I managed to train it on my own dataset, and everything was fine while I was using it in Python. I am not using a GPU, and detection takes 60-80ms for a 400x225 image on my CPU.
However, when I froze the graph and loaded it in C++ code, session->run began to take 480-500ms on the same data. Any idea why?
I have set the optimization flag --config=opt when building with bazel, and I have tried setting all the flags manually, but it did not help. I am new to tensorflow, and there is not much documentation on C++ API issues, so I thought someone here could tell me the reason for such a slowdown with squeezeDet. Any help would be highly appreciated.
Hello, I see the message 'Currently only supports KITTI dataset' in train.py. But I need to train this model on the VOC dataset, so I need to know whether this code can work on VOC. If it can, how can I do it? Thanks very much, I'm looking forward to your reply.
Hi,
Which version of tensorflow should I use?
Hello,
Is there any chance you could provide your model's structure in Caffe?
many thanks
Hi Bichen, It's me again~~
I want to train squeezeDet to detect more classes, that is to say, not only 'car', 'pedestrian' and 'cyclist'. I need to prepare the dataset and change CLASS_NAMES in config.py. Are any other changes needed? Do I need to change the parameters of the net?
Waiting for your reply~~
Hi @BichenWuUCB, I have a problem loading the pkl file of the caffemodel you provide.
The versions of the related packages on my server are listed below:
easydict==1.6
joblib==0.10.3
tensorflow==0.10.0rc0
And Here's the error log:
Traceback (most recent call last):
File "./src/train.py", line 281, in <module>
tf.app.run()
File "/home/extra/goan15910/.local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv))
File "./src/train.py", line 277, in main
train()
File "./src/train.py", line 114, in train
model = VGG16ConvDet(mc, FLAGS.gpu)
File "/tmp3/jeff/squeezeDet/src/nets/vgg16_convDet.py", line 25, in __init__
self._add_forward_graph()
File "/tmp3/jeff/squeezeDet/src/nets/vgg16_convDet.py", line 39, in _add_forward_graph
self.caffemodel_weight = joblib.load(mc.PRETRAINED_MODEL_PATH)
File "/home/extra/goan15910/.local/lib/python2.7/site-packages/joblib/numpy_pickle.py", line 575, in load
obj = _unpickle(fobj, filename, mmap_mode)
File "/home/extra/goan15910/.local/lib/python2.7/site-packages/joblib/numpy_pickle.py", line 507, in _unpickle
obj = unpickler.load()
File "/usr/lib/python2.7/pickle.py", line 858, in load
dispatch[key](self)
File "/home/extra/goan15910/.local/lib/python2.7/site-packages/joblib/numpy_pickle.py", line 340, in load_build
self.stack.append(array_wrapper.read(self))
File "/home/extra/goan15910/.local/lib/python2.7/site-packages/joblib/numpy_pickle.py", line 183, in read
array = self.read_array(unpickler)
File "/home/extra/goan15910/.local/lib/python2.7/site-packages/joblib/numpy_pickle.py", line 144, in read_array
array.shape = self.shape
ValueError: total size of new array must be unchanged
I have no problem running the demo example.
Is this a package version problem?
I wonder what data you used for the results in your paper. Is it a randomly split validation set from the 7481 training images, or the 7518 images of the test benchmark?
Thanks
Is it possible to interrupt the training and resume it from a specific checkpoint? My train dir and checkpoint dir are the same, but when I restarted the training it started back from step 0. Any suggestion on how to change the script to resume training from a checkpoint / freeze weights and create a .pb file from a checkpoint?
Does anyone know how to use my own dataset to train this network? If possible, could you give me some suggestions? Thank you very much.
@BichenWuUCB KITTI updated their object detection evaluation script on April 25, which I believe corresponds to the file ./src/datasets/kitti_eval/cpp/evaluate_object.cpp.
Maybe you want to update it.
http://www.cvlibs.net/datasets/kitti/eval_object.php
In addition: can you explain why you changed the image resolution and padding strategy?
Hi Bichen, I have trained squeezeDet on my own data, and it turned out to be effective. But I want to train it using Caffe, because I am much more familiar with Caffe. So is a Caffe implementation possible? Without a pretrained caffemodel to fine-tune from, the net might not converge very well.
Waiting for your reply, thanks~~
For one training image, do I need to annotate all objects in the image?
For example, take this image for pedestrian detection:
If I only annotate the left pedestrian (there are other pedestrians), is that good for training the net?
In my opinion, there are no explicit positive and negative samples. Does that mean that, within one image, everything except the positives given by the ground truth is treated as negative? So I need to annotate all targets? I am not sure.
./scripts/train.sh: line 46: 32723 Segmentation fault (core dumped) python ./src/train.py --dataset=KITTI --pretrained_model_path=./data/SqueezeNet/squeezenet_v1.1.pkl --data_path=./data/KITTI --image_set=train --train_dir=/tmp/bichen/logs/SqueezeDet/train --net=squeezeDet --summary_step=100 --checkpoint_step=500 --gpu=$USE_GPU
I read up online and saw that this might happen due to space on the disk. But I have 64 GB RAM and only 3 GB was being used before this code stopped.
What should I do?
When I ran python ./src/demo.py, I got the following error:
Traceback (most recent call last):
File "./src/demo.py", line 217, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 44, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "./src/demo.py", line 212, in main
image_demo()
File "./src/demo.py", line 175, in image_demo
feed_dict={model.image_input:[input_image], model.keep_prob: 1.0})
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 767, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 922, in _run
+ e.args[0])
TypeError: Cannot interpret feed_dict key as Tensor: Can not convert a float into a Tensor.
Any idea why this happens?
Thanks,
Awesome contribution! I've looked at the demo script and saw that there is support for processing videos. I tried it, but didn't succeed in running it properly.
Any tips or how-to? I get the following error. I tried to install gtk2.0 separately.
cv2.destroyAllWindows()
cv2.error: /io/opencv/modules/highgui/src/window.cpp:577: error: (-2) The function is not implemented. Rebuild the library with Windows, GTK+ 2.x or Carbon support. If you are on Ubuntu or Debian, install libgtk2.0-dev and pkg-config, then re-run cmake or configure script in function cvDestroyAllWindows
Hi
I want to train the model on another dataset, whose images have a different size from the KITTI dataset.
I changed mc.IMAGE_WIDTH and mc.IMAGE_HEIGHT to match my dataset, but they don't fit the pretrained model squeezenet_v1.0_SR_0.750.pkl, and I got this error:
"ValueError: Cannot reshape a tensor with 47196 elements to shape [1,15048,3]....".
Thanks for your help!
Hello, I know the speed of your squeezeDet model is 57.2fps. But on my computer, reading one image using OpenCV takes 0.025s. So I want to know whether you include this time when you calculate the speed. Thank you
@BichenWuUCB
Dear all,
I am facing a problem when I try to save a model to disk with batch size = 1 and then freeze it into a .pb file. I am doing this in four steps:
I realize that there is something wrong in the way I am saving the data in step 3: the graph and weights are not saved properly, but I cannot figure out what. Am I missing something very obvious here?
This is the minimal code (adapted from eval.py) that I am using to load the graph and the weights and save them with a batch size of 1 (step 3):
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import cv2
import os.path
import numpy as np
import tensorflow as tf
from config import *
from nets import *
FLAGS = tf.app.flags.FLAGS
tf.app.flags.DEFINE_string('dataset', 'KITTI',
"""Currently support PASCAL_VOC or KITTI dataset.""")
tf.app.flags.DEFINE_string('data_path', '', """Root directory of data""")
tf.app.flags.DEFINE_string('image_set', 'test',
"""Only used for VOC data."""
"""Can be train, trainval, val, or test""")
tf.app.flags.DEFINE_string('year', '2007',
"""VOC challenge year. 2007 or 2012"""
"""Only used for VOC data""")
tf.app.flags.DEFINE_string('eval_dir', '/tmp/bichen/logs/squeezeDet/eval',
"""Directory where to write event logs """)
tf.app.flags.DEFINE_string('checkpoint_path', '/tmp/bichen/logs/squeezeDet/train',
"""Path to the training checkpoint.""")
tf.app.flags.DEFINE_integer('eval_interval_secs', 60 * 1,
"""How often to check if new cpt is saved.""")
tf.app.flags.DEFINE_boolean('run_once', False,
"""Whether to run eval only once.""")
tf.app.flags.DEFINE_string('net', 'squeezeDet',
"""Neural net architecture.""")
tf.app.flags.DEFINE_string('gpu', '0', """gpu id.""")

def main(argv=None):
    """Load weights from a pre-trained squeezeDet network trained with Batch > 1
    and save the model with batch = 1 for production"""
    with tf.Graph().as_default():
        mc = kitti_squeezeDetPlus_config()
        mc.BATCH_SIZE = 1
        mc.LOAD_PRETRAINED_MODEL = False
        model = SqueezeDetPlus(mc, FLAGS.gpu)
        saver = tf.train.Saver(model.model_params)
        with tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) as sess:
            # Restores from checkpoint
            ckpts = set()
            ckpt = tf.train.get_checkpoint_state(FLAGS.checkpoint_path)
            ckpts.add(ckpt.model_checkpoint_path)
            print ('Loading {}...'.format(ckpt.model_checkpoint_path))
            saver.restore(sess, ckpt.model_checkpoint_path)
            sess.run(tf.initialize_all_variables())
            # Run one image to test that it works
            read_full_name = "/data/squeezeDet_TF011/src/test2.jpg"
            im = cv2.imread(read_full_name)
            im = im.astype(np.float32, copy=False)
            im = cv2.resize(im, (mc.IMAGE_WIDTH, mc.IMAGE_HEIGHT))
            input_image = im - mc.BGR_MEANS
            # Detect
            det_boxes, det_probs, det_class = sess.run(
                [model.det_boxes, model.det_probs, model.det_class],
                feed_dict={model.image_input: [input_image], model.keep_prob: 1.0})  # works fine
            # Save to disk
            checkpoint_path = os.path.join("/data/squeezeDet_TF011/logs/test_freeze", 'evalBatch1.ckpt')
            step = 1
            saver.save(sess, checkpoint_path, global_step=step)

if __name__ == '__main__':
    tf.app.run()
Thanks a lot
Cheers
Cannot find fire10/squeeze1x1 in the pretrained model. Use randomly initialized parameters
Cannot find fire10/expand1x1 in the pretrained model. Use randomly initialized parameters
Cannot find fire10/expand3x3 in the pretrained model. Use randomly initialized parameters
Cannot find fire11/squeeze1x1 in the pretrained model. Use randomly initialized parameters
Cannot find fire11/expand1x1 in the pretrained model. Use randomly initialized parameters
Cannot find fire11/expand3x3 in the pretrained model. Use randomly initialized parameters
Cannot find conv12 in the pretrained model. Use randomly initialized parameters
Model statistics saved to data/KITTI/logs/squeezeDet/train/model_metrics.txt.
I am trying to run SqueezeDet on my custom model. However, the kitti.py file looks for a train.txt file. When I try to run without this file, I get an error:
% (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (0,) for Tensor 'image_input:0', which has shape '(20, 1000, 1000, 3)'
Any clue how I can resolve this and make it work? I don't have the train.txt file.
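One possible way to produce the missing train.txt is sketched below. It assumes, as in the KITTI layout, that the image-set file is a plain text file with one image index per line (the file name without its extension). The helper name `write_image_set` and the directory layout in the usage note are illustrative assumptions, not part of the repository. The sketch uses Python 3 (`exist_ok`), whereas the tracebacks on this page come from Python 2.7.

```python
import os


def write_image_set(image_dir, out_file, extensions=(".png", ".jpg", ".jpeg")):
    """Write one image index (file name without extension) per line.

    Hypothetical helper: the one-index-per-line format is an assumption
    based on the KITTI ImageSets convention.
    """
    # Collect the stems of all image files, sorted for a stable order.
    stems = sorted(
        os.path.splitext(name)[0]
        for name in os.listdir(image_dir)
        if name.lower().endswith(extensions)
    )
    # Create the output directory if it does not exist yet.
    out_dir = os.path.dirname(out_file)
    if out_dir:
        os.makedirs(out_dir, exist_ok=True)
    with open(out_file, "w") as f:
        for stem in stems:
            f.write(stem + "\n")
    return stems
```

For example, `write_image_set("data/KITTI/training/image_2", "data/KITTI/ImageSets/train.txt")` would list every image in that folder; both paths are assumptions about where the data lives. An empty or missing list would be consistent with the `(0,)` shape in the error above, since no images would be loaded into the batch.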
So that's all the question ;)
Hi, I ran inference with the pretrained models, as well as with models I trained using your code, on datasets other than KITTI. Cityscapes was one of them, and the performance was severely degraded. Is there a reason for this? It shouldn't normally be that bad; perhaps the model is overfitting? KITTI data has very little variation, but I still expected decent performance on Cityscapes. Any ideas, or is there something I might be doing wrong?
I think there is a problem with downloading the model parameters.
Hi, I came across the following code in kitti_model_config.py:
center_x = np.reshape(
    np.transpose(
        np.reshape(
            np.array([np.arange(1, W+1)*float(mc.IMAGE_WIDTH)/(W+1)]*H*B),
            (B, H, W)
        ),
        (1, 2, 0)
    ),
    (H, W, B, 1)
)
I do not understand how center_x is calculated. For example, if IMAGE_WIDTH is equal to 14 and B is equal to 2, center_x should be 3.5 and 10.5, whereas the code above produces 4.66 and 9.32.
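A small sketch may help compare the two conventions. In the expression above, the x coordinates depend on W (the number of grid columns) rather than on B, so this sketch assumes the 2 in the question refers to W. The formula `arange(1, W+1) * IMAGE_WIDTH / (W+1)` places W centers at the interior boundaries of W+1 equal gaps, while the values 3.5 and 10.5 are the midpoints of W equal-width cells. The function names are hypothetical, and the code uses plain Python lists in place of NumPy arrays.

```python
def spaced_centers(image_width, W):
    """Centers at i * image_width / (W + 1) for i = 1..W.

    Mirrors np.arange(1, W+1) * float(IMAGE_WIDTH) / (W+1) from the
    snippet above, using a plain list instead of a NumPy array.
    """
    return [i * float(image_width) / (W + 1) for i in range(1, W + 1)]


def cell_centers(image_width, W):
    """Midpoints of W equal-width cells that tile the image width."""
    return [(i + 0.5) * float(image_width) / W for i in range(W)]


print(spaced_centers(14, 2))  # roughly 4.67 and 9.33
print(cell_centers(14, 2))    # [3.5, 10.5]
```

The remaining reshape and transpose calls in the original expression only repeat these x values for every row and every anchor, so that the result has shape (H, W, B, 1).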
Hi,
Is it possible to train squeezeDet on custom data without resizing to the 1242*375 KITTI format?
In kitti_squeezedet_config.py the image width and height can be changed, but that seems to conflict with set_anchors.
Thanks!
Hi @BichenWuUCB, everyone,
I am trying to test ConvDet by training with VGG16. However, it seems joblib cannot load the weights from the VGG_ILSVRC_16_layers_weights.pkl file. I got the following error.
I hope you can help with this problem. If possible, please check your program and the .pkl file.
Thank you very much.
Traceback (most recent call last):
  File "./src/train.py", line 286, in <module>
    tf.app.run()
  File "/home/cuong/anaconda/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 44, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "./src/train.py", line 282, in main
    train()
  File "./src/train.py", line 114, in train
    model = VGG16ConvDet(mc, FLAGS.gpu)
  File "/home/cuong/squeezeDet/src/nets/vgg16_convDet.py", line 25, in __init__
    self._add_forward_graph()
  File "/home/cuong/squeezeDet/src/nets/vgg16_convDet.py", line 39, in _add_forward_graph
    self.caffemodel_weight = joblib.load(mc.PRETRAINED_MODEL_PATH)
  File "/home/cuong/anaconda/envs/tensorflow/lib/python2.7/site-packages/joblib/numpy_pickle.py", line 575, in load
    obj = _unpickle(fobj, filename, mmap_mode)
  File "/home/cuong/anaconda/envs/tensorflow/lib/python2.7/site-packages/joblib/numpy_pickle.py", line 507, in _unpickle
    obj = unpickler.load()
  File "/home/cuong/anaconda/envs/tensorflow/lib/python2.7/pickle.py", line 864, in load
    dispatch[key](self)
  File "/home/cuong/anaconda/envs/tensorflow/lib/python2.7/site-packages/joblib/numpy_pickle.py", line 340, in load_build
    self.stack.append(array_wrapper.read(self))
  File "/home/cuong/anaconda/envs/tensorflow/lib/python2.7/site-packages/joblib/numpy_pickle.py", line 183, in read
    array = self.read_array(unpickler)
  File "/home/cuong/anaconda/envs/tensorflow/lib/python2.7/site-packages/joblib/numpy_pickle.py", line 144, in read_array
    array.shape = self.shape
ValueError: total size of new array must be unchanged