squeezedet's People

Contributors

alvinwan, bichenwuucb

squeezedet's Issues

Very low mAP

I ran the evaluation on the val set (3741 images) following your guide, using the squeezeDet model (model.ckpt-87000). However, I got a very low mAP:
Average precisions:
car_easy: 0.144
car_medium: 0.122
car_hard: 0.120
pedestrian_easy: 0.299
pedestrian_medium: 0.275
pedestrian_hard: 0.251
cyclist_easy: 0.272
cyclist_medium: 0.200
cyclist_hard: 0.205
Mean average precision: 0.210
I do not doubt the results presented in the paper, so there must be something wrong in my experiment. Can you help me figure out what it is?

Retraining with image size of 500x500

When retraining on my own data set of 500x500 images, I get the following error:

ValueError: Cannot reshape a tensor with 466560 elements to shape [20,8100,2] (324000 elements) for 'interpret_output/pred_class_probs' (op: 'Reshape') with input shapes: [233280,2], [3].

What is the best way to modify the net and anchors?

H, W, B = 22, 76, 9
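
A note on reading this error: the numbers in the message are enough to recover the grid size the network actually produced. The following is a minimal sketch, assuming the prediction tensor is flattened to one row per anchor (batch x H x W x anchors-per-cell rows), with the batch size of 20 taken from the target shape and 9 anchors per cell taken from the configuration line above. `grid_cells_from_error` is a hypothetical helper, not part of squeezeDet.

```python
def grid_cells_from_error(rows, batch_size, anchors_per_cell):
    """Return H*W implied by the first dimension of the flattened prediction tensor."""
    per_image, rem = divmod(rows, batch_size)
    if rem:
        raise ValueError("rows is not a multiple of the batch size")
    cells, rem = divmod(per_image, anchors_per_cell)
    if rem:
        raise ValueError("per-image rows is not a multiple of anchors per cell")
    return cells


# Numbers from the error message: input shape [233280, 2], target shape [20, 8100, 2].
produced = grid_cells_from_error(233280, batch_size=20, anchors_per_cell=9)
configured = 8100 // 9
print(produced, configured)  # 1296 900
```

Under these assumptions the anchors were configured for 900 cells (30 x 30) while the network emitted 1296 cells (36 x 36), so H and W in the anchor setup would need to match the feature map the network really outputs for the new image size.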

Pre-training on ImageNet

Hey, I attempted to pre-train the SqueezeDet model on ImageNet, freeze the first conv layer and then train on KITTI but the performance was significantly poorer than that stated in the paper. For ImageNet, I trained for ~50 epochs and the validation accuracy reached ~40% and subsequently trained on KITTI for 100k iterations. At its highest, the KITTI mAP was approximately 60%. I'm just curious as to the conditions you used for pre-training. Thanks!

The memory of the runtime model?

Hi, I am retraining the squeezeDet net on the KITTI dataset. Whether I set batch_size to 1 or to 10, the GPU memory usage is the same. I am confused by this.

Cannot download the pretrained model

Thanks very much for your work. I cannot download your pretrained model because I am in China. Can you offer an alternative download location, such as "baidu net disk"?

SqueezeDet is SSD on one layer

Recently, I compared the detection pipelines of SqueezeDet and SSD, the Single Shot MultiBox Detector (by Wei Liu et al., https://arxiv.org/abs/1512.02325). I found that squeezeDet is basically equivalent to SSD applied to only one feature layer. There are some minor differences besides this, but I really found nothing more.

@BichenWuUCB What is your opinion? Can you spot any other differences? Thank you.

How to select an appropriate number of anchors

Hi @BichenWuUCB,

I am trying to run SqueezeDet with my dataset. My images have a resolution of 720 x 405 pixels and there are 15 classes of objects in the dataset.

I am able to run the demo and train the KITTI dataset with SqueezeDet, but when I use my images I get a problem with the dimension of the tensor in the softmax layer.

This is my setup file:

# Model configuration for pascal dataset

import numpy as np

from config import base_model_config


def my_squeezeDetPlus_config():
    "Specify the parameters to tune below."

    mc = base_model_config('my_model')

    mc.IMAGE_WIDTH = 720  # 720  # 1242
    mc.IMAGE_HEIGHT = 405  # 405  # 375
    mc.BATCH_SIZE = 20

    mc.WEIGHT_DECAY = 0.0001
    mc.LEARNING_RATE = 0.01
    mc.DECAY_STEPS = 10000
    mc.MAX_GRAD_NORM = 1.0
    mc.MOMENTUM = 0.9
    mc.LR_DECAY_FACTOR = 0.5

    mc.LOSS_COEF_BBOX = 5.0
    mc.LOSS_COEF_CONF_POS = 75.0
    mc.LOSS_COEF_CONF_NEG = 100.0
    mc.LOSS_COEF_CLASS = 1.0

    mc.PLOT_PROB_THRESH = 0.4
    mc.NMS_THRESH = 0.4
    mc.PROB_THRESH = 0.005
    mc.TOP_N_DETECTION = 64

    mc.DATA_AUGMENTATION = True
    mc.DRIFT_X = 150
    mc.DRIFT_Y = 100
    mc.EXCLUDE_HARD_EXAMPLES = False

    mc.ANCHOR_BOX = set_anchors(mc)
    mc.ANCHORS = len(mc.ANCHOR_BOX)
    mc.ANCHOR_PER_GRID = 9

    return mc


def set_anchors(mc):
    (H, W, B) = (22, 76, 9)  # (22, 42, 9)
    anchor_shapes = np.reshape([np.array([
        [36., 37.],
        [366., 174.],
        [115., 59.],
        [162., 87.],
        [38., 90.],
        [258., 173.],
        [224., 108.],
        [78., 170.],
        [72., 43.],
        ])] * H * W, (H, W, B, 2))
    center_x = \
        np.reshape(np.transpose(np.reshape(np.array([np.arange(1, W
                   + 1) * float(mc.IMAGE_WIDTH) / (W + 1)] * H * B),
                   (B, H, W)), (1, 2, 0)), (H, W, B, 1))
    center_y = \
        np.reshape(np.transpose(np.reshape(np.array([np.arange(1, H
                   + 1) * float(mc.IMAGE_HEIGHT) / (H + 1)] * W * B),
                   (B, W, H)), (2, 1, 0)), (H, W, B, 1))
    anchors = np.reshape(np.concatenate((center_x, center_y,
                         anchor_shapes), axis=3), (-1, 4))

    return anchors

When I run the scripts I get this error

File "/home/my_dataset/squeezeDet/src/nn_skeleton.py", line 135, in _add_interpretation_graph

ValueError: Cannot reshape a tensor with 2786400 elements to shape [20,15048,15] (4514400 elements) for 'interpret_output/pred_class_probs' (op: 'Reshape') with input shapes: [185760,15], [3].

I assume this is a problem with the number of anchors (15048) that I am using. The default number of anchors does not seem appropriate for my dataset but I cannot understand how to set those values correctly.

The parameters H, W and B are set by default to "(22, 76, 9)" inside the function "set_anchors" (in the setup file above). How were these values selected? Is there a formula to calculate the appropriate number of batch, anchors and classes (so that they match with the tensor dimension) given an image resolution?

Thanks a lot
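
One way to reason about H and W is to compute the spatial size of the last feature map from the image size. The sketch below assumes a stride-2 first convolution (3x3 for squeezeDet, 7x7 for squeezeDetPlus) followed by three 3x3 stride-2 pooling layers, all without padding. That layer stack is an assumption and is not confirmed in this thread, but it reproduces the default 22 x 76 grid for 1242x375 images and it accounts for the 185760 rows in the error above (20 x 24 x 43 x 9). `out_size` and `grid_size` are hypothetical helper names.

```python
def out_size(n, kernel, stride=2):
    """Output length of an unpadded ('VALID') convolution or pooling layer."""
    return (n - kernel) // stride + 1


def grid_size(image_width, image_height, conv1_kernel):
    """Feature-map (H, W) after conv1 and three 3x3 stride-2 pools, all unpadded."""
    h, w = image_height, image_width
    for kernel in (conv1_kernel, 3, 3, 3):
        h, w = out_size(h, kernel), out_size(w, kernel)
    return h, w


print(grid_size(1242, 375, conv1_kernel=3))  # (22, 76)
print(grid_size(720, 405, conv1_kernel=7))   # (24, 43)
```

If that assumption holds, the setup file above would need (H, W, B) = (24, 43, 9), which gives 24 x 43 x 9 = 9288 anchors per image instead of the 15048 that come from the KITTI default of 22 x 76 x 9. The batch size and the number of classes do not affect H and W.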

transfer problem

How can I convert model.ckpt-152000.meta into a .pb file? We want to use it with graph_transfer.
Thank you!

Running out of memory during simultaneous training and evaluation

While training on the KITTI data set, I seem to run out of memory when I try to run both evaluation scripts simultaneously (scripts/eval_train.sh and scripts/eval_val.sh):

Internal: failed initializing StreamExecutor for CUDA device ordinal 1: Internal: failed call to cuCtxCreate: CUDA_ERROR_OUT_OF_MEMORY;

Is this just a matter of adjusting my batch size so that all 3 processes have enough memory to run simultaneously? I have ~24GB of VRAM across 2 Titan XPs. Since this implementation does not seem to be set up to support multiple GPUs, I've tried using GPU 0 for training and GPU 1 for running the two evaluation scripts, but I still run out of memory.

Thanks!

Unsafe, safe_exp function

The function safe_exp

    lin_region = tf.to_float(w > thresh)

    lin_out = slope*(w - thresh + 1.)

    exp_out = tf.exp(w)

    out = lin_region*lin_out + (1.-lin_region)*exp_out

This solves the exploding-exponent problem, but it is not safe. exp_out is still exp(w), so it can still overflow to inf. Multiplying inf by 0 gives NaN, and adding NaN to anything gives NaN, so the variable out becomes NaN.

I propose adding a select so that w is not fed to exp when the function is in the linear region, like this:

lin_region_bool = tf.greater(w , thresh)
lin_region = tf.to_float(lin_region_bool)
lin_out = tf.multiply(lin_region,tf.multiply(slope,tf.add(tf.subtract(w , thresh), float(1.0))))

exp_out_safe = tf.exp(tf.where(lin_region_bool,tf.multiply(tf.ones_like(w),thresh),w))

exp_out = tf.multiply(tf.subtract(float(1.0),lin_region),exp_out_safe)

out = tf.add(lin_out,exp_out)
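
The effect can be reproduced without TensorFlow. The sketch below uses plain Python floats. `exp_or_inf` mimics the way a float tensor overflows to inf (Python's own math.exp raises instead), and slope = exp(thresh) is an assumption, chosen so that the two branches meet at w == thresh. All function names are hypothetical.

```python
import math


def exp_or_inf(x):
    """exp(x) that overflows to inf, mimicking float tensor behaviour."""
    try:
        return math.exp(x)
    except OverflowError:
        return math.inf


def blend_naive(w, thresh):
    slope = math.exp(thresh)  # assumed value of slope
    lin_region = 1.0 if w > thresh else 0.0
    lin_out = slope * (w - thresh + 1.0)
    exp_out = exp_or_inf(w)  # evaluated even when the linear branch is selected
    return lin_region * lin_out + (1.0 - lin_region) * exp_out


def blend_clamped(w, thresh):
    slope = math.exp(thresh)
    lin_region = 1.0 if w > thresh else 0.0
    lin_out = slope * (w - thresh + 1.0)
    exp_out = math.exp(min(w, thresh))  # never exponentiate beyond thresh
    return lin_region * lin_out + (1.0 - lin_region) * exp_out


print(blend_naive(1000.0, 1.0))  # nan
print(blend_clamped(1000.0, 1.0))
```

Clamping the input with min(w, thresh) plays the same role as the tf.where call in the proposal: the exponential only ever sees values at or below the threshold, so the masked-out branch stays finite.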

Evaluate original squeezeDetPlus model on KITTI Benchmark

Your work impresses me with its high speed and good performance. Before training on my own data, I ran your squeezeDetPlus model (model.ckpt-95000) on the KITTI test set (7518 images). However, the result is not good.

The pedestrian mAP reported in the paper (Table 2) is:

  • 81.4% on Easy, 68.5% on Hard

The result I get is:

(not evaluated against the official KITTI server ground truth; see my answer to dojoscan below)

  • 45.69% on Easy, 38.39% on Hard.

To be honest, I trust your result, so there must be something wrong in the code. I only modified demo.py, and my version is below. All other code is the same as yours.
I have checked my demo.py over and over again, and I suspect there are bugs in the original demo.py. Could you help me figure it out?

def image_demo():
    """Detect pedestrians in the KITTI test images and write KITTI-format results."""

    with tf.Graph().as_default():
        # Load model
        mc = kitti_squeezeDetPlus_config()
        mc.BATCH_SIZE = 1
        # Model parameters will be restored from the checkpoint
        mc.LOAD_PRETRAINED_MODEL = False
        model = SqueezeDetPlus(mc, FLAGS.gpu)
        saver = tf.train.Saver(model.model_params)

        # Set model path
        FLAGS.checkpoint = r'/home/project/HumanDetection/squeezeDet_github/models/squeezeDetPlus/model.ckpt-95000'
        # Set test data paths
        basic_image_path = r'/home/dataSet/kitti/ori_data/left_image/testing/image_2/'
        list_path = r'/home/dataSet/kitti/ori_data/left_image/testing/test_list.txt'
        write_result_path = r'/home/dataSet/kitti/ori_data/left_image/testing/run_out/'

        with open(list_path, 'rt') as f_list:
            image_list_name = [x.strip() for x in f_list.readlines()]

        print ('image numbers:  ', len(image_list_name))

        count_num = 0
        pedestrian_index = 1  # class index of pedestrians
        keep_score = 0.05     # minimum score for a detection to be kept

        # Fixed fields of a KITTI result line
        default_str_1 = 'Pedestrian -1 -1 -10'
        default_str_2 = '-1 -1 -1 -1000 -1000 -1000 -10'

        with tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) as sess:
            saver.restore(sess, FLAGS.checkpoint)

            for file_name in image_list_name:
                im = cv2.imread(basic_image_path + file_name)
                if im is None:
                    print (file_name, ' is empty!')
                    continue
                im = im.astype(np.float32, copy=False)
                im = cv2.resize(im, (mc.IMAGE_WIDTH, mc.IMAGE_HEIGHT))
                input_image = im - mc.BGR_MEANS

                # Detect
                det_boxes, det_probs, det_class = sess.run(
                    [model.det_boxes, model.det_probs, model.det_class],
                    feed_dict={model.image_input: [input_image],
                               model.keep_prob: 1.0})

                # NMS filter
                final_boxes, final_probs, final_class = model.filter_prediction(
                    det_boxes[0], det_probs[0], det_class[0])

                # Keep only detections above the score threshold
                keep_idx = [idx for idx in range(len(final_probs))
                            if final_probs[idx] > keep_score]
                final_boxes = [final_boxes[idx] for idx in keep_idx]
                final_probs = [final_probs[idx] for idx in keep_idx]
                final_class = [final_class[idx] for idx in keep_idx]

                print ('count: ', count_num)
                count_num += 1

                # Write one result file per image (left empty if no pedestrian is found)
                out_name = write_result_path + file_name.replace('png', 'txt')
                with open(out_name, 'wt') as f_out:
                    for kk, value in enumerate(final_class):
                        if value != pedestrian_index:
                            continue
                        box = final_boxes[kk]  # (center x, center y, width, height)
                        xmin = box[0] - box[2] / 2.0
                        ymin = box[1] - box[3] / 2.0
                        xmax = box[0] + box[2] / 2.0
                        ymax = box[1] + box[3] / 2.0

                        line_2 = default_str_1 + ' ' + str(xmin) + ' ' + str(ymin) + ' ' + \
                            str(xmax) + ' ' + str(ymax) + ' ' + default_str_2 + ' ' + \
                            str(final_probs[kk]) + '\n'
                        f_out.write(line_2)


def main(argv=None):
    image_demo()


if __name__ == '__main__':
    tf.app.run()
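
One thing that may be worth checking in the script above: the image is resized to mc.IMAGE_WIDTH x mc.IMAGE_HEIGHT before detection, so the boxes it writes are in resized-image coordinates, while KITTI images are not all the same size. It is only a guess that this affects the reported scores. The sketch below shows how a box could be mapped back to the original image before it is written. `to_kitti_line` and its scale parameters are hypothetical, and the fixed fields are copied from default_str_1 and default_str_2 above.

```python
def to_kitti_line(box, score, scale_x=1.0, scale_y=1.0, label='Pedestrian'):
    """Format one detection (cx, cy, w, h) as a KITTI 2D result line."""
    cx, cy, w, h = box
    # Convert centre format to corners and map back to the original image size.
    xmin = (cx - w / 2.0) * scale_x
    ymin = (cy - h / 2.0) * scale_y
    xmax = (cx + w / 2.0) * scale_x
    ymax = (cy + h / 2.0) * scale_y
    fields = [label, '-1', '-1', '-10',
              '%.2f' % xmin, '%.2f' % ymin, '%.2f' % xmax, '%.2f' % ymax,
              '-1', '-1', '-1', '-1000', '-1000', '-1000', '-10',
              '%.4f' % score]
    return ' '.join(fields)


# Example: an original 1224x370 image that was resized to 1242x375 for the network.
line = to_kitti_line((621.0, 187.5, 100.0, 50.0), 0.9,
                     scale_x=1224.0 / 1242.0, scale_y=370.0 / 375.0)
print(line)
```

The scale factors are original size divided by network input size, so they are 1.0 whenever the image already has the network's input resolution.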

IOU threshold

Hi,

I wonder whether you use an IOU threshold for box matching.

If you do, can you explain what criterion you use to decide the anchor sizes?

Thanks in advance.
Byeonghak

about anchors design

Hi Bichen, thanks for your impressive work.
I want to use a different input image resolution, but I have some problems. First, how should I design the anchors? I saw code like this in kitti_squeezeDet_config.py:

  H, W, B = 22, 76, 9
  anchor_shapes = np.reshape(
      [np.array(
          [[  36.,  37.], [ 366., 174.], [ 115.,  59.],
           [ 162.,  87.], [  38.,  90.], [ 258., 173.],
           [ 224., 108.], [  78., 170.], [  72.,  43.]])] * H * W,
      (H, W, B, 2)
  )

I know that with squeezeDet and a 1242x375 input image, the feature map shape after fire11 is [1,22,76,768]. But how were the numbers [ 36., 37.], [ 366., 174.], ... calculated? Could you explain how to do this if the input image is 600x600, for example?

Besides, I noticed your reply at https://github.com/BichenWuUCB/squeezeDet/issues/1, where you mentioned "grid size". Does it mean the (22,76) above? If I want to use 600x600 input images, do I only need to change (H,W)? How should I modify the 9 anchor_shapes?

In addition, equation (3) in the paper seems to be different from the code:
[image: equation (3) from the paper]
The code in imdb.py is

        delta[0] = (box_cx - mc.ANCHOR_BOX[aidx][0])/box_w
        delta[1] = (box_cy - mc.ANCHOR_BOX[aidx][1])/box_h
        delta[2] = np.log(box_w/mc.ANCHOR_BOX[aidx][2])
        delta[3] = np.log(box_h/mc.ANCHOR_BOX[aidx][3])

I think the code is right. Of the four equations above, in the first and the second, the term
[image]
should be
[image]?
I am not sure.

thank you, hope for your reply!

PascalVOC implementation

Apart from config file and anchor size changes, what do I have to keep in mind if I want to implement this for PascalVOC?
Please guide.

Different equation compared to paper?

Hi, @BichenWuUCB. I'm currently playing with your squeezeDet code and found a part that looks suspicious, so I would be grateful if you could check whether it is wrong.

In read_batch function in /dataset/imdb.py, conversion code of ground truth bounding box into delta is

delta[0] = (box_cx - mc.ANCHOR_BOX[aidx][0])/box_w
delta[1] = (box_cy - mc.ANCHOR_BOX[aidx][1])/box_h
delta[2] = np.log(box_w/mc.ANCHOR_BOX[aidx][2])
delta[3] = np.log(box_h/mc.ANCHOR_BOX[aidx][3])

Shouldn't delta[0] and delta[1] be divided by mc.ANCHOR_BOX[aidx][2] and mc.ANCHOR_BOX[aidx][3] respectively? Equation (3) in the paper divides by the anchor box width/height, and I personally think that dividing by a fixed value (the anchor box width/height) is a more stable normalization.

Thanks!
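
For reference, the anchor-normalized form described in this issue can be written as an encode/decode pair. This is a minimal sketch of that form only, with boxes and anchors given as (center x, center y, width, height). `encode` and `decode` are hypothetical names and are not copied from the squeezeDet source.

```python
import math


def encode(box, anchor):
    """Ground-truth box -> regression target, normalised by the anchor size."""
    bx, by, bw, bh = box
    ax, ay, aw, ah = anchor
    return ((bx - ax) / aw, (by - ay) / ah, math.log(bw / aw), math.log(bh / ah))


def decode(delta, anchor):
    """Inverse of encode: regression output -> box."""
    dx, dy, dw, dh = delta
    ax, ay, aw, ah = anchor
    return (ax + dx * aw, ay + dy * ah, aw * math.exp(dw), ah * math.exp(dh))


anchor = (100.0, 80.0, 36.0, 37.0)
box = (110.0, 70.0, 50.0, 40.0)
restored = decode(encode(box, anchor), anchor)
print(restored)
```

The round trip returns the original box (up to floating-point error) because both directions use the anchor width and height. If a decoder multiplies the offsets by the anchor size, an encoder that divides by box_w and box_h is not its inverse. Whether squeezeDet's decoder works that way is not shown in this thread.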

Training on own dataset: imdb.py IndexError: too many indices for array

Hello,
I am attempting to train squeezeDet on my own images, which are 1600x1200. I set the grid H and W to roughly 1/16 of the image height and width respectively (75 and 100), as recommended. I also reduced the batch size because a batch size of 20 did not fit in memory.
Training runs smoothly for up to 350 iterations but then produces the following error:

File "/p/home/squeezeDet/src/dataset/imdb.py", line 157, in read_batch
max_drift_x = min(gt_bbox[:, 0] - gt_bbox[:, 2]/2.0+1)
IndexError: too many indices for array

Does anyone have any advice or knowledge of what may be causing this?
The full error output is below.

Traceback (most recent call last):
File "./src/train.py", line 345, in <module>
tf.app.run()
File "/p/work2/projects/ryat/modules/tensorflow/1.0.0/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 44, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "./src/train.py", line 341, in main
train()
File "./src/train.py", line 270, in train
coord.join(threads)
File "/p/work2/projects/ryat/modules/tensorflow/1.0.0/lib/python2.7/site-packages/tensorflow/python/training/coordinator.py", line 386, in join
six.reraise(*self._exc_info_to_raise)
File "./src/train.py", line 229, in _enqueue
feed_dict, _, _, _ = _load_data()
File "./src/train.py", line 166, in _load_data
bbox_per_batch = imdb.read_batch()
File "/p/home/squeezeDet/src/dataset/imdb.py", line 157, in read_batch
max_drift_x = min(gt_bbox[:, 0] - gt_bbox[:, 2]/2.0+1)
IndexError: too many indices for array
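
One plausible cause, offered as a guess rather than a confirmed diagnosis: a training image whose label file yields no boxes. In that case gt_bbox would be an empty one-dimensional array, and indexing it with [:, 0] raises exactly this IndexError. An error that appears only after a few hundred iterations is consistent with a small number of such samples. The sketch below scans KITTI-style label files for that situation. `files_without_boxes` is a hypothetical helper that assumes one object per line with the class name as the first field.

```python
import os


def files_without_boxes(label_dir, class_names):
    """Return label files that contain no object of the requested classes."""
    wanted = {name.lower() for name in class_names}
    empty = []
    for fname in sorted(os.listdir(label_dir)):
        if not fname.endswith('.txt'):
            continue
        with open(os.path.join(label_dir, fname)) as f:
            # In KITTI label files the first field of each line is the class name.
            classes = [line.split()[0].lower() for line in f if line.strip()]
        if not any(c in wanted for c in classes):
            empty.append(fname)
    return empty
```

Calling it with the label directory and the class names from the config (for example ['car', 'pedestrian', 'cyclist']) lists the files to inspect. Those samples can then be removed from the image set or handled explicitly in the data loader.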

Train from scratch

Hi, is it possible to train on the KITTI dataset from scratch (without a pre-trained model), for example by setting cfg.LOAD_PRETRAINED_MODEL to False? It seems hard to get it to converge.

VOC datasets

Can the code be used with the VOC dataset? I have tried many times, but I get the error "Currently support KITTI dataset". How can I fix it?

Training on a custom dataset (Fine tuning)

Hello,

I have a GTX 1060 6G GPU, so I can only fine-tune the model. I know the images and annotations should be in KITTI format, and that is no problem.

But would you please give a short description of the fine-tuning process for your model? I would really appreciate it.

How many classes does the model support?

Checkpoints other than squeezeDet don't work for the demo

Hi,

I've completed the installation and am trying to run the demo as described in the README. I downloaded model_checkpoints.tgz and untarred it. Running python ./src/demo.py works as expected with the default parameters. But if I try to specify any other checkpoint from model_checkpoints, it doesn't work. E.g. if I do this:

python ./src/demo.py --checkpoint=./data/model_checkpoints/squeezeDetPlus/model.ckpt-95000

I get the following error:

$ python ./src/demo.py --checkpoint=./data/model_checkpoints/squeezeDetPlus/model.ckpt-95000
2017-05-30 18:09:07.901895: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-05-30 18:09:07.901920: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-05-30 18:09:07.901926: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-05-30 18:09:07.901941: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-05-30 18:09:07.901946: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
Traceback (most recent call last):
  File "./src/demo.py", line 217, in <module>
    tf.app.run()
  File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "./src/demo.py", line 212, in main
    image_demo()
  File "./src/demo.py", line 164, in image_demo
    saver.restore(sess, FLAGS.checkpoint)
  File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1457, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 778, in run
    run_metadata_ptr)
  File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 982, in _run
    feed_dict_string, options, run_metadata)
  File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1032, in _do_run
    target_list, options, run_metadata)
  File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1052, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [64] rhs shape= [96]
	 [[Node: save/Assign = Assign[T=DT_FLOAT, _class=["loc:@conv1/biases"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](conv1/biases, save/RestoreV2)]]

Caused by op u'save/Assign', defined at:
  File "./src/demo.py", line 217, in <module>
    tf.app.run()
  File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "./src/demo.py", line 212, in main
    image_demo()
  File "./src/demo.py", line 161, in image_demo
    saver = tf.train.Saver(model.model_params)
  File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1056, in __init__
    self.build()
  File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1086, in build
    restore_sequentially=self._restore_sequentially)
  File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 691, in build
    restore_sequentially, reshape)
  File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 419, in _AddRestoreOps
    assign_ops.append(saveable.restore(tensors, shapes))
  File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 155, in restore
    self.op.get_shape().is_fully_defined())
  File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/ops/state_ops.py", line 270, in assign
    validate_shape=validate_shape)
  File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/ops/gen_state_ops.py", line 47, in assign
    use_locking=use_locking, name=name)
  File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 768, in apply_op
    op_def=op_def)
  File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2336, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/dsavenko/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1228, in __init__
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [64] rhs shape= [96]
	 [[Node: save/Assign = Assign[T=DT_FLOAT, _class=["loc:@conv1/biases"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](conv1/biases, save/RestoreV2)]]

Is this expected?

Is there some way to calculate anchor boxes for my own dataset?

I am trying to train this model on my own dataset. I created a folder structure similar to KITTI's, and my images are the same size as KITTI's. But my dataset has objects of various sizes, and I think the KITTI anchors do not fit it. For example, YOLO v2 has a special script based on k-means that creates anchors from the training samples. Is there some way to calculate anchor boxes for my own dataset?
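
A k-means approach in the style the question mentions can be sketched in a few lines. This is an illustration only: it clusters ground-truth (width, height) pairs using 1 - IoU as the distance, with boxes aligned at a common centre, and it uses a simple deterministic initialisation. Nothing in this thread says the KITTI anchors were derived this way, and `iou_wh` and `kmeans_anchors` are hypothetical names.

```python
def iou_wh(a, b):
    """IoU of two boxes given as (w, h) and aligned at a common centre."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / float(a[0] * a[1] + b[0] * b[1] - inter)


def kmeans_anchors(boxes, k, iterations=100):
    """Cluster (w, h) pairs with distance 1 - IoU; returns k anchor shapes."""
    boxes = sorted(boxes, key=lambda b: b[0] * b[1])
    # Deterministic start: k boxes spread evenly over the area-sorted list.
    centres = [boxes[(2 * i + 1) * len(boxes) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for b in boxes:
            best = max(range(k), key=lambda i: iou_wh(b, centres[i]))
            groups[best].append(b)
        # New centre = mean width and height of the group (kept if the group is empty).
        updated = [
            (sum(b[0] for b in g) / float(len(g)),
             sum(b[1] for b in g) / float(len(g))) if g else c
            for g, c in zip(groups, centres)
        ]
        if updated == centres:
            break
        centres = updated
    return centres


sizes = [(30, 32), (34, 36), (32, 34), (200, 100), (210, 110), (190, 90)]
print(kmeans_anchors(sizes, k=2))
```

With real data, `sizes` would hold the width and height of every ground-truth box in the training set, measured at the network's input resolution, and k would be the number of anchors per grid cell (9 in the KITTI configuration).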

About box matching

Hello.

I wonder if you have any experience with box matching.

Most CNN-based object detectors use a box matching strategy to assign prior (anchor) boxes to ground truths during training.

Many well-known detection models, such as RCNN and SSD, match a prior (anchor) box to a ground-truth box when their IOU is over 0.5.
But in this case, if no prior (anchor) box has an IOU over 0.5 with a given ground truth, that ground truth is always treated as a negative. This is a big problem. To solve it, these models need a lot of prior (anchor) boxes, which reduces processing speed.

I found that your squeezeDet matches each ground truth to the box with the highest IOU, so that every ground truth gets an anchor box and is never assigned as a negative.

My question is whether you have tried the former method. If so, what was the difference between that method and yours?

I have tried your method, but the loss does not converge below 3, whereas the former method gets near 1 by the end of training. It also misses a lot of objects (about 40%).
I think the reason is that the localization offset becomes very large when the IOU is small.

My English is not very good, so if you don't understand the question, please tell me.
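
The two strategies compared in this question can be put side by side in a small sketch. It is a simplified illustration with boxes as (xmin, ymin, xmax, ymax). It ignores the case where two ground truths pick the same anchor and is not taken from either code base. The function names are hypothetical.

```python
def iou(a, b):
    """IoU of two boxes given as (xmin, ymin, xmax, ymax)."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter)


def match_by_threshold(gt_boxes, anchors, thresh=0.5):
    """For each ground truth, every anchor whose IoU exceeds thresh (may be none)."""
    return [[i for i, a in enumerate(anchors) if iou(g, a) > thresh] for g in gt_boxes]


def match_by_best(gt_boxes, anchors):
    """For each ground truth, the single anchor with the highest IoU (always one)."""
    return [max(range(len(anchors)), key=lambda i: iou(g, anchors[i])) for g in gt_boxes]


anchors = [(0, 0, 10, 10), (20, 20, 40, 40)]
gt = [(0, 0, 10, 10), (22, 22, 30, 30)]
print(match_by_threshold(gt, anchors))  # [[0], []]
print(match_by_best(gt, anchors))       # [0, 1]
```

The second ground truth overlaps its nearest anchor with an IoU of only 0.16, so threshold matching leaves it unmatched while best-IoU matching still assigns it. That assignment comes with a large offset to regress, which is the trade-off the question describes.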

KITTI benchmark score

Hi there,
Your algorithm impresses me with an inference speed faster than other algorithms. I am about to start applying it to my own work. My question is whether you have a KITTI benchmark score on the test set. I noticed that your paper compares other algorithms' benchmark scores with your scores on the validation set.

Using SqueezeDet with Tensorflow C++ API

First of all, thanks a lot for sharing your code. SqueezeDet is really great.

I managed to train it on my own dataset, and everything was fine while I was using it from Python. I am not using a GPU, and detection takes 60-80ms for a 400x225 image on my CPU.
However, when I froze the graph and loaded it in C++ code, session->run began to take 480-500ms on the same data. Any idea why?
I set the optimization flag --config=opt when building with bazel, and I also tried setting all the flags manually, but it did not help. I am new to TensorFlow, and there is not much documentation on C++ API issues, so I thought someone here could tell me the reason for such a slowdown with squeezeDet. Any help would be highly appreciated.

Can this model work on other dataset?

Hello, looking at your code, I see 'Currently only supports KITTI dataset' in train.py. But I need to train this model on the VOC dataset, so I need to know whether this code can work on VOC. If it can, how do I do it? Thanks very much, I'm looking forward to your reply.

Caffe model

Hello,

Is there any chance you could provide your model's structure in Caffe?

many thanks

About more classes

Hi Bichen, it's me again.
I want to train squeezeDet to detect more classes, that is, not only 'car', 'pedestrian' and 'cyclist'. I need to prepare the datasets and change CLASS_NAMES in config.py. Are any other changes needed? Do I need to change the parameters of the net?
Waiting for your reply.

Train on KITTI dataset

Hi @BichenWuUCB, I have a problem loading the pkl file of the caffemodel you provide.
The versions of the related packages on my server are listed below:
easydict==1.6
joblib==0.10.3
tensorflow==0.10.0rc0

And here's the error log:
Traceback (most recent call last):
File "./src/train.py", line 281, in <module>
tf.app.run()
File "/home/extra/goan15910/.local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv))
File "./src/train.py", line 277, in main
train()
File "./src/train.py", line 114, in train
model = VGG16ConvDet(mc, FLAGS.gpu)
File "/tmp3/jeff/squeezeDet/src/nets/vgg16_convDet.py", line 25, in __init__
self._add_forward_graph()
File "/tmp3/jeff/squeezeDet/src/nets/vgg16_convDet.py", line 39, in _add_forward_graph
self.caffemodel_weight = joblib.load(mc.PRETRAINED_MODEL_PATH)
File "/home/extra/goan15910/.local/lib/python2.7/site-packages/joblib/numpy_pickle.py", line 575, in load
obj = _unpickle(fobj, filename, mmap_mode)
File "/home/extra/goan15910/.local/lib/python2.7/site-packages/joblib/numpy_pickle.py", line 507, in _unpickle
obj = unpickler.load()
File "/usr/lib/python2.7/pickle.py", line 858, in load
dispatch[key](self)
File "/home/extra/goan15910/.local/lib/python2.7/site-packages/joblib/numpy_pickle.py", line 340, in load_build
self.stack.append(array_wrapper.read(self))
File "/home/extra/goan15910/.local/lib/python2.7/site-packages/joblib/numpy_pickle.py", line 183, in read
array = self.read_array(unpickler)
File "/home/extra/goan15910/.local/lib/python2.7/site-packages/joblib/numpy_pickle.py", line 144, in read_array
array.shape = self.shape
ValueError: total size of new array must be unchanged

I have no problem running the demo example.
Is this a package version problem?

dataset for the result on paper

I wonder what data you used for the results in your paper. Is it a random validation split of the 7481 training images, or the 7518 images of the test benchmark?
Thanks

Interrupt and resume training from checkpoint

Is it possible to interrupt the training and resume it from a specific checkpoint? My train dir and checkpoint dir are the same, but when I restarted the training it started back from step 0. Any suggestion on how to change the script to resume training from a checkpoint / freeze weights and create a .pb file from a checkpoint?

Train on my own datasets

Does anyone know how to train this network on my own dataset? If possible, could you give me some suggestions? Thank you very much.

Is there a caffe implementation plan?

Hi Bichen, I have trained squeezeDet on my own data, and it turned out to be effective. But I would like to train it with Caffe, because I am much more familiar with Caffe. Is a Caffe implementation possible? Without a pretrained caffemodel to fine-tune from, the net may not converge very well.
Waiting for your reply, thanks.

On train data annotation

For a training image, do I need to annotate all the objects in the image?
For example, here is an image for pedestrian detection:
[image: qq 20170507184124]
If I annotate only the left pedestrian (there are other pedestrians), is that good for training the net?

In my opinion, there are no explicit positive and negative samples. Does that mean that everything in an image other than the positives given by the ground truth is treated as negative? If so, do I need to annotate all targets? I am not sure.

Segmentation error on running train file on GPU cluster

./scripts/train.sh: line 46: 32723 Segmentation fault (core dumped) python ./src/train.py --dataset=KITTI --pretrained_model_path=./data/SqueezeNet/squeezenet_v1.1.pkl --data_path=./data/KITTI --image_set=train --train_dir=/tmp/bichen/logs/SqueezeDet/train --net=squeezeDet --summary_step=100 --checkpoint_step=500 --gpu=$USE_GPU

I read up online and saw that this might happen due to space on the disk. But I have 64 GB RAM and only 3 GB was being used before this code stopped.

What should I do?

error in running demo.py

When I ran python ./src/demo.py, I got the following error:

Traceback (most recent call last):
File "./src/demo.py", line 217, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 44, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "./src/demo.py", line 212, in main
image_demo()
File "./src/demo.py", line 175, in image_demo
feed_dict={model.image_input:[input_image], model.keep_prob: 1.0})
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 767, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 922, in _run
+ e.args[0])
TypeError: Cannot interpret feed_dict key as Tensor: Can not convert a float into a Tensor.

Any idea why this happens?

Thanks,

How to run on a mp4 video?

Awesome contribution! I looked at the demo script and saw that there is support for processing videos. However, when I tried it, I did not succeed in running it properly.

Any tips or a how-to? I get the following error. I tried installing gtk2.0 separately.

cv2.destroyAllWindows()
cv2.error: /io/opencv/modules/highgui/src/window.cpp:577: error: (-2) The function is not implemented. Rebuild the library with Windows, GTK+ 2.x or Carbon support. If you are on Ubuntu or Debian, install libgtk2.0-dev and pkg-config, then re-run cmake or configure script in function cvDestroyAllWindows

How to train the model on other datasets with any size?

Hi,
I want to train the model on another dataset whose image size differs from that of the KITTI dataset.
I changed mc.IMAGE_WIDTH and mc.IMAGE_HEIGHT to match my dataset, but they do not fit the pretrained model squeezenet_v1.0_SR_0.750.pkl, and I got this error:
"ValueError: Cannot reshape a tensor with 47196 elements to shape [1,15048,3]....".

Thanks for your help!
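
The numbers in the error message can be unpacked to see which part of the configuration disagrees with the network output. The following is a minimal sketch, not code from the repository: it assumes the target shape is [batch, anchors, classes] and that the anchor grid uses 9 anchors per cell. The helper names and the value 9 are assumptions for illustration; substitute the B used in your own anchor setup.

```python
def anchors_from_reshape_error(actual_elements, target_shape):
    """Return (expected_anchors, actual_anchors) for a failed reshape.

    target_shape is assumed to be [batch, anchors, classes].
    """
    batch, expected_anchors, classes = target_shape
    per_anchor = batch * classes
    if actual_elements % per_anchor != 0:
        raise ValueError("element count is not a multiple of batch * classes")
    return expected_anchors, actual_elements // per_anchor


def grid_cells(anchors, anchors_per_cell):
    """Number of H * W grid cells implied by an anchor count."""
    if anchors % anchors_per_cell != 0:
        raise ValueError("anchor count is not a multiple of anchors per cell")
    return anchors // anchors_per_cell


ANCHORS_PER_CELL = 9  # assumption, not stated in the error message

# Numbers copied from the error message above.
expected, actual = anchors_from_reshape_error(47196, [1, 15048, 3])
print("anchors expected by the config:", expected)
print("anchors produced by the network:", actual)
print("grid cells expected:", grid_cells(expected, ANCHORS_PER_CELL))
print("grid cells produced:", grid_cells(actual, ANCHORS_PER_CELL))
```

Under these assumptions the configuration describes 1672 grid cells while the network produces 1748, which suggests that the grid height and width used to build the anchors also need to match the feature map for the new image size, not only mc.IMAGE_WIDTH and mc.IMAGE_HEIGHT.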

Speed

Hello, I know the speed of your squeezeDet model is 57.2 fps. But on my computer, reading one image using OpenCV takes 0.025 s. So I want to know whether you include this time when you calculate the speed. Thank you.
@BichenWuUCB
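
Whether image loading is counted changes the reported figure considerably. As a rough sketch (the function name is illustrative, and it assumes loading and inference run one after the other rather than overlapping), the two numbers in the question combine as follows:

```python
def fps_including_overhead(model_fps, overhead_seconds):
    """Frames per second when a fixed per-frame overhead is added.

    model_fps: throughput of the network alone.
    overhead_seconds: extra time per frame, e.g. reading the image from disk.
    """
    seconds_per_frame = 1.0 / model_fps + overhead_seconds
    return 1.0 / seconds_per_frame


# 57.2 fps for the model alone, plus 0.025 s to read each image.
print(round(fps_including_overhead(57.2, 0.025), 1))  # 23.5
```

So if the 57.2 fps figure excluded image reading, a pipeline that reads each image sequentially at 0.025 s would run at roughly 23.5 fps end to end.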

Save Model with Batch Size = 1 for production

Dear all,

I am facing a problem when I try to save a model to disk with batch size = 1 and then freeze it into a .pb file. I am doing this in four steps:

  1. I am able to train squeezeDet Plus with my custom dataset. For training I use a batch size of 20. The resulting trained model.ckpt is 60 MB and the model.meta is 19.9 MB in size.
  2. When I evaluate this model with eval.py, I re-create the graph with batch size = 1 (as in the demo), and the Python script works fine. I correctly get the bounding boxes for each image.
  3. Now I would like to save this graph with batch size = 1 to disk. However, when I save it as a checkpoint, the size of model.ckpt drops to 30 MB and the size of model.meta to 233 KB.
  4. Then, when I try to freeze the files from step 3, I get the error "tensorflow.python.framework.errors.FailedPreconditionError: Attempting to use uninitialized value" (the script for freezing the graph works fine with the .ckpt and .meta files from step 1).

I realize that something is wrong in the way I save the data in step 3: the graph and weights are not saved properly, but I cannot figure out why. Am I missing something very obvious here?

This is the minimal code (adapted from eval.py) that I am using to load the graph and the weights and save them with a batch size of 1 (step 3):

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import cv2
import os.path
import numpy as np
import tensorflow as tf
from config import *
from nets import *

FLAGS = tf.app.flags.FLAGS

tf.app.flags.DEFINE_string('dataset', 'KITTI',
                           """Currently support PASCAL_VOC or KITTI dataset.""")
tf.app.flags.DEFINE_string('data_path', '', """Root directory of data""")
tf.app.flags.DEFINE_string('image_set', 'test',
                           """Only used for VOC data."""
                           """Can be train, trainval, val, or test""")
tf.app.flags.DEFINE_string('year', '2007',
                            """VOC challenge year. 2007 or 2012"""
                            """Only used for VOC data""")
tf.app.flags.DEFINE_string('eval_dir', '/tmp/bichen/logs/squeezeDet/eval',
                            """Directory where to write event logs """)
tf.app.flags.DEFINE_string('checkpoint_path', '/tmp/bichen/logs/squeezeDet/train',
                            """Path to the training checkpoint.""")
tf.app.flags.DEFINE_integer('eval_interval_secs', 60 * 1,
                             """How often to check if new cpt is saved.""")
tf.app.flags.DEFINE_boolean('run_once', False,
                             """Whether to run eval only once.""")
tf.app.flags.DEFINE_string('net', 'squeezeDet',
                           """Neural net architecture.""")
tf.app.flags.DEFINE_string('gpu', '0', """gpu id.""")


def main(argv=None):

  """Load weights from a pre-trained squeezeDet network trained with Batch > 1
  and save the model with batch = 1 for production"""

  with tf.Graph().as_default():

    mc = kitti_squeezeDetPlus_config()
    mc.BATCH_SIZE = 1
    mc.LOAD_PRETRAINED_MODEL = False
    model = SqueezeDetPlus(mc, FLAGS.gpu)

    saver = tf.train.Saver(model.model_params)

    with tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) as sess:

        # Restores from checkpoint
        ckpts = set()
        ckpt = tf.train.get_checkpoint_state(FLAGS.checkpoint_path)
        ckpts.add(ckpt.model_checkpoint_path)
        print ('Loading {}...'.format(ckpt.model_checkpoint_path))
        saver.restore(sess, ckpt.model_checkpoint_path)

        sess.run(tf.initialize_all_variables())

        # Run one image to test that it works
        read_full_name = "/data/squeezeDet_TF011/src/test2.jpg"
        im = cv2.imread(read_full_name)
        im = im.astype(np.float32, copy=False)
        im = cv2.resize(im, (mc.IMAGE_WIDTH, mc.IMAGE_HEIGHT))
        input_image = im - mc.BGR_MEANS

        # Detect
        det_boxes, det_probs, det_class = sess.run(
            [model.det_boxes, model.det_probs, model.det_class],
            feed_dict={model.image_input: [input_image], model.keep_prob: 1.0})  # works fine

        # Save to disk
        checkpoint_path = os.path.join("/data/squeezeDet_TF011/logs/test_freeze", 'evalBatch1.ckpt')
        step = 1
        saver.save(sess, checkpoint_path, global_step=step)


if __name__ == '__main__':
    tf.app.run()

Thanks a lot

Cheers

squeezedet train using KITTI

Cannot find fire10/squeeze1x1 in the pretrained model. Use randomly initialized parameters
Cannot find fire10/expand1x1 in the pretrained model. Use randomly initialized parameters
Cannot find fire10/expand3x3 in the pretrained model. Use randomly initialized parameters
Cannot find fire11/squeeze1x1 in the pretrained model. Use randomly initialized parameters
Cannot find fire11/expand1x1 in the pretrained model. Use randomly initialized parameters
Cannot find fire11/expand3x3 in the pretrained model. Use randomly initialized parameters
Cannot find conv12 in the pretrained model. Use randomly initialized parameters
Model statistics saved to data/KITTI/logs/squeezeDet/train/model_metrics.txt.

questions on train.txt for custom model

I am trying to run SqueezeDet on my custom model. However, the kitti.py file looks for a train.txt file. I tried to run without this file and got an error:

% (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (0,) for Tensor 'image_input:0', which has shape '(20, 1000, 1000, 3)'

Any clue how I can resolve this and make it work? I don't have the train.txt file.
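
The shape (0,) in the error suggests that the loader found no images to feed, which is consistent with a missing index file. Below is a minimal sketch for generating one. It assumes the loader expects a KITTI-style image-set file with one image index (the file name without its extension) per line; the function name, the extensions, and the throwaway demo directory are illustrative and not taken from the repository.

```python
import os
import tempfile

IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg")  # assumption: adjust to your data


def write_image_set(image_dir, output_path, extensions=IMAGE_EXTENSIONS):
    """Write one image index (file name without extension) per line."""
    indices = sorted(
        os.path.splitext(name)[0]
        for name in os.listdir(image_dir)
        if name.lower().endswith(extensions)
    )
    with open(output_path, "w") as f:
        for idx in indices:
            f.write(idx + "\n")
    return indices


# Demo on a throwaway directory with empty placeholder files.
demo_dir = tempfile.mkdtemp()
for name in ["000002.png", "000001.png", "notes.txt"]:
    open(os.path.join(demo_dir, name), "w").close()

demo_output = os.path.join(demo_dir, "train.txt")
demo_indices = write_image_set(demo_dir, demo_output)
print(demo_indices)
```

For real data, point image_dir at the folder holding the training images and output_path at the location where the dataset loader looks for train.txt.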

Transfer learning performance

Hi, I ran inference with the pretrained models, as well as with models I trained using your code, on datasets other than KITTI. Cityscapes was one of them, and the performance was severely degraded. Is there any reason for this? Typically it shouldn't be that bad; perhaps it is overfitting? KITTI data contains very little variation, but I still expected decent performance on Cityscapes. Any ideas, or is there something I might be doing wrong?

the way to calculate center_x and center_y

Hi, I came across the following code in kitti_model_config.py:

 center_x = np.reshape(
      np.transpose(
          np.reshape(
              np.array([np.arange(1, W+1)*float(mc.IMAGE_WIDTH)/(W+1)]*H*B), 
              (B, H, W)
          ),
          (1, 2, 0)
      ),
      (H, W, B, 1)
  )

I do not understand how center_x is calculated. For example, if IMAGE_WIDTH is equal to 14 and B is equal to 2, the center_x values should be 3.5 and 10.5, whereas the above code gives 4.66 and 9.32.
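
The two spacings can be compared directly. This sketch reproduces only the x values along one row, since the reshape and transpose in the snippet above just tile them over H rows and B anchors; the function names are illustrative. Note that the quoted values correspond to a grid width W of 2: B, the number of anchors per cell, does not affect the positions.

```python
def centers_config_style(image_width, grid_w):
    """x centers as in the config: arange(1, W+1) * IMAGE_WIDTH / (W+1)."""
    return [i * float(image_width) / (grid_w + 1) for i in range(1, grid_w + 1)]


def centers_cell_midpoints(image_width, grid_w):
    """x centers at the midpoints of W equal-width cells."""
    cell = float(image_width) / grid_w
    return [(i + 0.5) * cell for i in range(grid_w)]


print([round(x, 2) for x in centers_config_style(14, 2)])  # [4.67, 9.33]
print(centers_cell_midpoints(14, 2))                       # [3.5, 10.5]
```

The expression in the config places W points evenly between the image borders, dividing the width into W + 1 equal gaps, rather than at the midpoints of W equal cells.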

Training with custom data shape

Hi,

Is it possible to train squeezeDet on custom data without resizing to the 1242*375 KITTI format?
In kitti_squeezedet_config.py the image width and height can be changed, but that seems to conflict with set_anchors.

Thanks!

Training with VGG16 joblib ValueError

Hi @BichenWuUCB and everyone,

I am trying to test ConvDet by training with VGG16. However, it seems joblib cannot load the weights from the VGG_ILSVRC_16_layers_weights.pkl file. I got the following error.

I hope you can help with this problem. If possible, please check it against your program and .pkl file.

Thank you very much.

Traceback (most recent call last):
  File "./src/train.py", line 286, in <module>
    tf.app.run()
  File "/home/cuong/anaconda/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 44, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "./src/train.py", line 282, in main
    train()
  File "./src/train.py", line 114, in train
    model = VGG16ConvDet(mc, FLAGS.gpu)
  File "/home/cuong/squeezeDet/src/nets/vgg16_convDet.py", line 25, in __init__
    self._add_forward_graph()
  File "/home/cuong/squeezeDet/src/nets/vgg16_convDet.py", line 39, in _add_forward_graph
    self.caffemodel_weight = joblib.load(mc.PRETRAINED_MODEL_PATH)
  File "/home/cuong/anaconda/envs/tensorflow/lib/python2.7/site-packages/joblib/numpy_pickle.py", line 575, in load
    obj = _unpickle(fobj, filename, mmap_mode)
  File "/home/cuong/anaconda/envs/tensorflow/lib/python2.7/site-packages/joblib/numpy_pickle.py", line 507, in _unpickle
    obj = unpickler.load()
  File "/home/cuong/anaconda/envs/tensorflow/lib/python2.7/pickle.py", line 864, in load
    dispatch[key](self)
  File "/home/cuong/anaconda/envs/tensorflow/lib/python2.7/site-packages/joblib/numpy_pickle.py", line 340, in load_build
    self.stack.append(array_wrapper.read(self))
  File "/home/cuong/anaconda/envs/tensorflow/lib/python2.7/site-packages/joblib/numpy_pickle.py", line 183, in read
    array = self.read_array(unpickler)
  File "/home/cuong/anaconda/envs/tensorflow/lib/python2.7/site-packages/joblib/numpy_pickle.py", line 144, in read_array
    array.shape = self.shape
ValueError: total size of new array must be unchanged
