
deep_sort's Introduction

Deep SORT

Introduction

This repository contains code for Simple Online and Realtime Tracking with a Deep Association Metric (Deep SORT). We extend the original SORT algorithm to integrate appearance information based on a deep appearance descriptor. See the arXiv preprint for more information.

Dependencies

The code is compatible with Python 2.7 and 3. The following dependencies are needed to run the tracker:

  • NumPy
  • sklearn
  • OpenCV

Additionally, feature generation requires TensorFlow (>= 1.0).
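
Assuming the standard PyPI package names (scikit-learn provides the sklearn module), the dependencies can be installed with pip; note that the checkpoint was exported with TensorFlow 1.x, so a 1.x build (e.g. 'tensorflow<2') may be needed for feature generation:

pip install numpy scikit-learn opencv-python "tensorflow<2"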

Installation

First, clone the repository:

git clone https://github.com/nwojke/deep_sort.git

Then, download pre-generated detections and the CNN checkpoint file from here.

NOTE: The candidate object locations of our pre-generated detections are taken from the following paper:

F. Yu, W. Li, Q. Li, Y. Liu, X. Shi, J. Yan. POI: Multiple Object Tracking with
High Performance Detection and Appearance Feature. In BMTT, SenseTime Group
Limited, 2016.

We have replaced the appearance descriptor with a custom deep convolutional neural network (see below).

Running the tracker

The following example starts the tracker on one of the MOT16 benchmark sequences. We assume resources have been extracted to the repository root directory and the MOT16 benchmark data is in ./MOT16:

python deep_sort_app.py \
    --sequence_dir=./MOT16/test/MOT16-06 \
    --detection_file=./resources/detections/MOT16_POI_test/MOT16-06.npy \
    --min_confidence=0.3 \
    --nn_budget=100 \
    --display=True

Check python deep_sort_app.py -h for an overview of available options. There are also scripts in the repository to visualize results, generate videos, and evaluate the MOT challenge benchmark.

Generating detections

Besides the main tracking application, this repository contains a script to generate features for person re-identification, suitable for comparing the visual appearance of pedestrian bounding boxes using cosine similarity. The following example generates these features from standard MOT challenge detections. Again, we assume resources have been extracted to the repository root directory and MOT16 data is in ./MOT16:

python tools/generate_detections.py \
    --model=resources/networks/mars-small128.pb \
    --mot_dir=./MOT16/train \
    --output_dir=./resources/detections/MOT16_train

The model has been generated with TensorFlow 1.5. If you run into incompatibility, re-export the frozen inference graph to obtain a new mars-small128.pb that is compatible with your version:

python tools/freeze_model.py

For each sequence of the MOT16 dataset, generate_detections.py stores a separate binary file in NumPy native format. Each file contains an array of shape Nx138, where N is the number of detections in the corresponding MOT sequence. The first 10 columns of this array contain the raw MOT detection copied over from the input file; the remaining 128 columns store the appearance descriptor. The files generated by this command can be used as input to deep_sort_app.py.
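
For illustration, a minimal sketch of how one of these files can be inspected (the path is an example; the column split follows the description above):

import numpy as np

# Load one pre-generated detection file (example path).
data = np.load("./resources/detections/MOT16_train/MOT16-02.npy")
print(data.shape)         # (N, 138)

raw_mot = data[:, :10]    # raw MOT detection rows, copied from the input file
features = data[:, 10:]   # 128-dimensional appearance descriptors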

NOTE: If python tools/generate_detections.py raises a TensorFlow error, try passing an absolute path to the --model argument. This might help in some cases.

Training the model

To train the deep association metric model, we used a novel cosine metric learning approach, which is provided as a separate repository.

High-level overview of source files

In the top-level directory are executable scripts to run, evaluate, and visualize the tracker. The main entry point is deep_sort_app.py. This file runs the tracker on a MOTChallenge sequence.

In package deep_sort is the main tracking code:

  • detection.py: Detection base class.
  • kalman_filter.py: A Kalman filter implementation and concrete parametrization for image space filtering.
  • linear_assignment.py: This module contains code for min cost matching and the matching cascade.
  • iou_matching.py: This module contains the IOU matching metric.
  • nn_matching.py: A module for a nearest neighbor matching metric.
  • track.py: The track class contains single-target track data such as Kalman state, number of hits, misses, hit streak, associated feature vectors, etc.
  • tracker.py: This is the multi-target tracker class.

The deep_sort_app.py expects detections in a custom format, stored in .npy files. These can be computed from MOTChallenge detections using generate_detections.py. We also provide pre-generated detections.

Citing DeepSORT

If you find this repo useful in your research, please consider citing the following papers:

@inproceedings{Wojke2017simple,
  title={Simple Online and Realtime Tracking with a Deep Association Metric},
  author={Wojke, Nicolai and Bewley, Alex and Paulus, Dietrich},
  booktitle={2017 IEEE International Conference on Image Processing (ICIP)},
  year={2017},
  pages={3645--3649},
  organization={IEEE},
  doi={10.1109/ICIP.2017.8296962}
}

@inproceedings{Wojke2018deep,
  title={Deep Cosine Metric Learning for Person Re-identification},
  author={Wojke, Nicolai and Bewley, Alex},
  booktitle={2018 IEEE Winter Conference on Applications of Computer Vision (WACV)},
  year={2018},
  pages={748--756},
  organization={IEEE},
  doi={10.1109/WACV.2018.00087}
}

deep_sort's People

Contributors

abewley, iory, kinnzo, linuxandchill, nwojke


deep_sort's Issues

Tracker follows the wrong person

Hello,

Situation:
Person A is tracked and is about to leave the field of view.
Person B is not tracked yet and will cross paths with Person A (at the border of the field of view).

I observed that the tracker often gets this wrong and reports that Person B is Person A (as if Person A had turned around and walked backward).

Is there a way to avoid this kind of error? I thought about coding an algorithm that prevents the tracker from going backward, but if you have any trick, that would be great!

Thanks in advance ;)

error when running generate_detections.py

I tried to run this script to generate features; after the first frames are processed, the program is killed with the following error:

InvalidArgumentError (see above for traceback): NodeDef mentions attr 'identical_element_shapes' not in Op<name=TensorArrayV3; signature=size:int32 -> handle:resource, flow:float; attr=dtype:type; attr=element_shape:shape,default=; attr=dynamic_size:bool,default=false; attr=clear_after_read:bool,default=true; attr=tensor_array_name:string,default=""; is_stateful=true>; NodeDef: net/map/TensorArray = TensorArrayV3clear_after_read=true, dtype=DT_FLOAT, dynamic_size=false, element_shape=, identical_element_shapes=true, tensor_array_name="", _device="/job:localhost/replica:0/task:0/gpu:0". (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
[[Node: net/map/TensorArray = TensorArrayV3clear_after_read=true, dtype=DT_FLOAT, dynamic_size=false, element_shape=, identical_element_shapes=true, tensor_array_name="", _device="/job:localhost/replica:0/task:0/gpu:0"]]

Would you kindly show me how this can be fixed? Thank you!

Best regards

maxpool size is 63x31

Hi nwojke,

I find that after max pooling the feature map size becomes 63x31 instead of 64x32. I guess that after the stride-2 conv2d the size becomes 32x16, as expected. Is it correct for the network to run with this unusual size?

What parameters should I update for low frame rate videos?

Hi nwojke,

I ran into issues where the model cannot track objects in low frame rate videos (~5 fps). It works fine when I test on higher frame rate videos (~15-25 fps).

For example, object 1 is at the top left at frame i; at frame i + 1, it has moved to the middle. Is there any parameter I should consider updating (in the Kalman filter?) to accommodate this?

Thanks!

P.S.: great work!

Why is the MOTA so low on the MOT16 train dataset?

Thanks for your wonderful code! But when I run it on the 7 training sequences from MOT16, I get a MOTA of 17.9%; the detailed results are below. On the MOT16 results page, however, deep_sort reports a MOTA of 35.3%, and I don't understand such a big difference. The detections we used are the public detections. I tried to upload results for the MOT16 test sequences to the server, but have not managed to. Thanks for your reply.

********************* Your Benchmark Results (2D) ***********************
IDF1  IDP  IDR | Rcll Prcn  FAR |  GT  MT  PT  ML |    FP    FN IDs   FM | MOTA MOTP MOTAL
17.8 32.2 12.3 | 43.9 63.4 5.26 | 517  65 232 220 | 27964 61901 729 1680 | 17.9 76.9  18.6

can this be realtime?

From what I see and read, I have to train the model, etc. Would this method work on real-time video?

For example, I analyze a frame, detect the bounding boxes, and pass them directly to the feature network. Can it work like that?

Running on ROS

Hi,
I want to run the algorithm on ROS.
I have built the methods that I will need in getting images and their corresponding detections at the same time in the correct format.
I am not sure where I should change deep_sort_app.py to take detections and image messages from ROS as input. Any insights?

Thanks!

Tracking for Street Lights

Hi,

I wanted to ask how I can use Deep SORT to track street lights. Which model do I need to change? And can I retrain this model to detect street lights?

Your help would be really appreciated. Thanks

error

OSError: [Errno 2] No such file or directory: './MOT16/test/MOT16-06/img1'

'--display=False' doesn't work in argparse

When I run deep_sort_app.py with --display=False or --display=false, the tracking result is still displayed. After searching on Google, it seems that bool-typed arguments don't work in argparse; maybe it's better to delete the type=bool and add action='store_true' for the display argument.
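
For illustration, a minimal argparse sketch along the lines the reporter suggests (not the repository's current code):

import argparse

parser = argparse.ArgumentParser()
# type=bool is misleading here: bool("False") is True, because any
# non-empty string is truthy. A store_true flag avoids the problem.
parser.add_argument("--display", action="store_true", default=False)
args = parser.parse_args()
print(args.display)  # False unless --display is passed on the command line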

Did the project use the Mahalanobis distance?

Hi, thanks for your work!
The paper (Simple Online and Realtime Tracking with a Deep Association Metric) mentions a method that uses the Mahalanobis distance to incorporate motion information; however, I only saw the IOU method. Does the project use the Mahalanobis distance? If so, which file contains it?
Thanks a lot
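
A hedged pointer (module and function names taken from the repository layout described above; verify against your checkout): the Mahalanobis term lives in kalman_filter.py and is used for gating candidate matches rather than as the matching cost itself. A minimal sketch:

import numpy as np
from deep_sort import kalman_filter

kf = kalman_filter.KalmanFilter()
# Start a track from one measurement in (center x, center y, aspect, height).
mean, covariance = kf.initiate(np.array([320., 240., 0.5, 100.]))
mean, covariance = kf.predict(mean, covariance)
# Squared Mahalanobis distance between the predicted state distribution and
# candidate measurements; linear_assignment.gate_cost_matrix applies it to
# invalidate unlikely matches before the min-cost assignment is solved.
d2 = kf.gating_distance(mean, covariance,
                        np.array([[322., 241., 0.5, 101.]]))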

Is there any reference for computing covariance of motion and observation variables?

Hi nwojke, although it is said to be a bit hacky to set the motion and observation uncertainty, there seems to be a pattern in kalman_filter.py: you initialize the covariance in the initiate phase, compute motion_cov in the predict phase, and innovation_cov in the project phase. Can you point me to a reference or give a suggestion for how to set these variables? Thank you.

Why default --nms_max_overlap=1?

Thanks for the amazing work and open source!
Could you help me with the following questions please:

  1. Why is the default --nms_max_overlap=1? It means no suppression will be performed, right?
  2. Was it experimentally the best threshold for MOT16?

Thanks.

print <object at 0x7f87fe29c710>

Line 172: detections = [d for d in detections if d.confidence >= min_confidence]

I want to print the line-172 detections in deep_sort_app.py, but it shows
[<deep_sort.deep_sort.detection.Detection object at 0x7f87fe29c710>]

Is there any method to print the values?
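
For illustration, a short sketch: the Detection objects expose their values as attributes (tlwh and confidence are used elsewhere in this repository), so print those instead of the object itself:

for d in detections:
    # Bounding box as (top-left x, top-left y, width, height), plus score.
    print(d.tlwh, d.confidence)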

Usage on RADAR detections

Hi,

I wanted to know if Deep SORT can be used to track objects from RADAR data. I have used SORT for images and it works great. But will it work for RADAR+camera fusion, or for radar-only detections?

Thanks!

tracking feature

Hi,
if at t = 1 Deep SORT detects person A and saves person A's feature,
and at t = 2 it detects an unknown person, it will use the CNN to match person A's feature from t = 1 against the unknown person's feature from t = 2. If the result is true (same person), Deep SORT will save person A's feature at t = 2, and the feature from t = 1 will be forgotten, right?

thanks

Which detection method is best for tracking?

There are many detection methods, like Faster R-CNN, YOLO, etc. So which method is best suited for tracking? Currently I am using YOLOv2, but the tracking results are not that good; occlusion is still a problem. Can you give me any suggestions to improve tracking performance?

The shape of the extracted feature map is not as expected

Hi, thanks for the wonderful work. I have only one question. I printed out the detections .npy file, and the shape of MOT16_POI_train/0004 is (40709, 138); 138 is the feature dimension, no doubt.
But 40709 doesn't make sense, because MOT16_POI_train/0004 has 1050 images.
If every image had a single 1-d feature, the shape should be (1050, 138). So, basically, what is the shape of the extracted feature map?

unmatched_tracks in tracker.py

In tracker.py, line 127:
unmatched_tracks = list(set(unmatched_tracks_a + unmatched_tracks_b))

What about the situation where a track is included in both unmatched_tracks_a and iou_track_candidates, but excluded from unmatched_tracks_b? Such a track should be matched.

How is the CNN trained?

Just want to know: is the CNN trained as a weight-sharing Siamese network with a triplet loss head, or with a softmax loss where the number of classes equals the number of person IDs?

Code for training CNN model

Hi Nicolai,

I am working on an MOT project for tracking several different object classes (cars, pedestrians, cyclists, etc.). I understand that the model provided in your GitHub repository for Deep SORT is trained on the MARS dataset, which is specific to person re-identification, and was wondering if you could provide the code for training your CNN architecture so that I can train it on my own dataset.

Alternatively, it would be good if you could share some pre-trained models for object classes other than pedestrians.

Thanks for your work on Deep SORT. Cheers!

min cost matching

The paper's equation (5) defines the combined cost as c(i,j) = λ · d^(1)(i,j) + (1 − λ) · d^(2)(i,j).

Where is the code that implements (5)?

training new tracking model

Hi,
are any training/testing examples for new tracking models, trained from scratch on non-pedestrian classes, coming soon?

Unable to download pre-generated detections and the CNN checkpoint file

Hello,
I am unable to download the pre-generated detections and the CNN checkpoint file from the network drive. I keep getting "File is currently busy, please try again later."

I am just checking whether the server is busy or the files have been removed. I can see the files, but clicking on a file does not start the download.

Thanks,
Fayzur

Can we run generate_detections side by side while the object detector is running?

I want to know if there is any way to optimize the runtime by running generate_detections side by side while the object detector is producing the bounding boxes. I referred to issue #4. I want a flow like:

  1. Detect objects in a frame -> after detecting objects, run generate_detections to get encodings for the .npy file -> after the video finishes, store all the encodings in the .npy file -> run deep_sort_app

freeze

I want to make a .pb file from your model, but I couldn't identify the output node name. Any suggestions?

Testing with my data

Hello. I want to track people in my own dataset, but I have a problem adapting it to the project. In this case, do I have to provide a .det file?

Attempted to use a closed Session

Hello,
When I use Deep SORT with py-faster-rcnn, there are some errors because py-faster-rcnn is Caffe and Deep SORT is TensorFlow, so I changed the detection generation code. But when I use generate_detections.create_box_encoder to get an encoder, I get the error: RuntimeError: Attempted to use a closed Session. I don't know how to solve this problem!
thanks~
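
A hedged sketch of one common cause (integration details assumed, not taken from the report): the encoder returned by create_box_encoder holds a TensorFlow session internally, so it must be created once and kept alive for the whole run; if the session behind it is closed, for example by an enclosing with-block that exits, every later call fails with exactly this error.

from tools import generate_detections

# Build the encoder once at start-up and reuse the returned callable;
# do not create it inside a `with tf.Session()` block that exits before
# the per-frame loop runs.
encoder = generate_detections.create_box_encoder(
    "resources/networks/mars-small128.pb")

# ... later, inside the per-frame loop:
# features = encoder(frame_bgr, boxes_tlwh)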

Incorrect tracking on video file

Hello, I've managed to run the tracker with video as input, using YOLOv2 to generate box detections, but I got really bad tracking results. Here is a snippet of the code I'm using:

metric = nn_matching.NearestNeighborDistanceMetric("cosine", 0.2, 100)
tracker = Tracker(metric)
encoder = generate_detections.create_box_encoder(
    "resources/networks/mars-small128.ckpt-68577")

camera = cv2.VideoCapture(file)
while camera.isOpened():
    _, frame = camera.read()
    if frame is None:
        print('\nEnd of Video')
        break
    h, w, _ = frame.shape
    thick = int((h + w) // 300)

    # ---------- YOLO part to generate detections ----------
    detections = []
    scores = []
    boxes = self.sess.run(self.out, feed_dict)
    for b in boxes:
        left, right, top, bot, mess, max_indx, confidence = boxResults(b)
        detections.append(np.array([left, top, right - left, bot - top]))
        scores.append(float('%.2f' % confidence))
    detections = np.array(detections)
    # -------------------------------------------------------

    features = encoder(frame, detections)
    detections = [Detection(bbox, score, feature)
                  for bbox, score, feature in zip(detections, scores, features)]

    # Run non-maxima suppression.
    boxes = np.array([d.tlwh for d in detections])
    scores = np.array([d.confidence for d in detections])
    indices = prep.non_max_suppression(boxes, nms_max_overlap, scores)
    detections = [detections[i] for i in indices]

    tracker.predict()
    tracker.update(detections)

    for track, det in zip(tracker.tracks, detections):
        bbox = track.to_tlbr()
        cv2.rectangle(frame, (int(bbox[0]), int(bbox[1])),
                      (int(bbox[2]), int(bbox[3])), (255, 255, 255), thick)

        bbox = det.to_tlbr()
        cv2.rectangle(frame, (int(bbox[0]), int(bbox[1])),
                      (int(bbox[2]), int(bbox[3])), (255, 0, 0), thick)
        cv2.putText(frame, str(track.track_id),
                    (int(bbox[0]), int(bbox[1]) - 12),
                    0, 1e-3 * h, (255, 0, 0), thick // 3)
    cv2.imshow('', frame)

Here are the results I got after testing: the IDs were not stable, and the white boxes (generated by the tracker) were always thin (width = 0).
Video on Google Drive (in blue are the YOLO detections, in white the output of the tracker).
Can someone please help me find the problem and a possible solution?
Thank you!

Run Deep SORT with a video file

Hi !

Do you know if there is a simple way to run Deep SORT using a detection file plus a video file with OpenCV, instead of a folder of images?

Thanks!

Is there a mistake in your kalman_filter.py code?

Hi,

Thanks for your code.
I tried to get a better understanding of it using some Kalman filter references.
On line 184 of kalman_filter.py:

new_covariance = covariance - np.linalg.multi_dot((
    kalman_gain, projected_cov, kalman_gain.T))

I can't understand it well, because it seems to conflict with my Kalman filter reference.

Can you give some explanation? Thanks!
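
For reference, a short derivation (in standard Kalman filter notation, not the repository's variable names) showing that the posted form is algebraically equivalent to the textbook covariance update:

K = P H^\top S^{-1}, \qquad S = H P H^\top + R
K S K^\top = P H^\top S^{-1} \, S \, S^{-1} H P = K H P \qquad (\text{since } S = S^\top)
P' = P - K S K^\top = P - K H P = (I - K H)\, P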

track_id increases by 1 every frame

Hello, I have a problem where the track_id increases by 1 every frame. I detect with Faster R-CNN and track with Deep SORT, but the results are wrong. Can you tell me how to solve this problem?

A stupid question about "real time"

How can this code be used for "real time" video from a web camera? As far as I can see from the instructions, in order to track people in a video sequence, an .npy file generated in advance needs to be provided, and that .npy file only works for the designated video sequence, so it is not general. Please correct me if I am wrong; any hint on how to adapt the code for a real-time webcam video stream would be greatly appreciated. Thanks.

Train car model

Hello, I trained a car model with triplet Re-ID, and I got files with .data, .index, and .meta extensions. Deep SORT uses a .ckpt file for the model, but my training did not produce that file. What can I do? Can you give me some suggestions? Thank you!
