
ndvr-dml's Introduction

Near-Duplicate Video Retrieval
with Deep Metric Learning

This repository contains the TensorFlow implementation of the paper Near-Duplicate Video Retrieval with Deep Metric Learning. It provides code for training and evaluation of a Deep Metric Learning (DML) network on the problem of Near-Duplicate Video Retrieval (NDVR). During training, the DML network is fed with video triplets produced by a triplet generator, and it is trained with the triplet loss function. The architecture of the network is displayed in the figure below. For evaluation, the mean Average Precision (mAP) and the Precision-Recall curve (PR-curve) are calculated. Two publicly available datasets are supported, namely VCDB and CC_WEB_VIDEO.
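
As a rough illustration of the triplet loss mentioned above, the following plain-Python sketch computes a hinge-style triplet loss for one (anchor, positive, negative) triplet. The function names, the use of squared Euclidean distance, the margin value and the toy vectors are all assumptions made for this example; the loss actually used for training is the one defined in the repository's TensorFlow code.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: zero once the negative is at least
    `margin` farther from the anchor than the positive is."""
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D embeddings (illustrative values only).
anchor = [0.0, 0.0]
positive = [0.0, 1.0]  # stands for a near-duplicate of the anchor
negative = [3.0, 0.0]  # stands for an unrelated video

# Well-separated triplet: the hinge clamps the loss to zero.
easy = triplet_loss(anchor, positive, negative)
# Swapping the roles violates the margin, so the loss is positive.
hard = triplet_loss(anchor, negative, positive)
print(easy, hard)
```

A triplet contributes to training only while its loss is positive, which is why the choice of triplets matters for how quickly the network learns.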

Prerequisites

  • Python
  • Tensorflow 1.xx

Getting started

Installation

  • Clone this repo:
git clone https://github.com/MKLab-ITI/ndvr-dml
cd ndvr-dml
  • Install all the dependencies with
pip install -r requirements.txt

or

conda install --file requirements.txt

Triplet generation

Run the triplet generation process for each dataset, VCDB and CC_WEB_VIDEO. This process will generate two files for each dataset:

  1. the global feature vectors for each video in the dataset:
    <output_dir>/<dataset>_features.npy
  2. the generated triplets:
    <output_dir>/<dataset>_triplets.npy

To execute the triplet generation process, do as follows:

  • The code does not extract features from videos. Instead, the .npy files of the already extracted features have to be provided. You may use the Intermediate-CNN-Features tool (listed under Related Projects) to extract them.

  • Create a file that contains the video id and the path of the feature file for each video in the dataset being processed. Each line of the file has to contain the video id (basename of the video file) and the full path to the corresponding .npy file of its features, separated by a tab character (\t). Example:

      23254771545e5d278548ba02d25d32add952b2a4	features/23254771545e5d278548ba02d25d32add952b2a4.npy
      468410600142c136d707b4cbc3ff0703c112575d	features/468410600142c136d707b4cbc3ff0703c112575d.npy
      67f1feff7f624cf0b9ac2ebaf49f547a922b4971	features/67f1feff7f624cf0b9ac2ebaf49f547a922b4971.npy
                                               ...	
    
  • Run the triplet generator and provide the generated file from the previous step, the name of the processed dataset, and the output directory.

python triplet_generator.py --dataset vcdb --feature_files vcdb_feature_files.txt --output_dir output_data/
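
The feature-file list described above can be produced with a few lines of Python. This is a minimal sketch under stated assumptions: the `features/` directory, the helper name `write_feature_list` and the idea that every .npy file is named after its video id are hypothetical and not part of this repository.

```python
import os


def write_feature_list(feature_dir, list_path):
    """Write one '<video id>\t<full path to .npy>' line per feature file.

    Returns the number of lines written.
    """
    count = 0
    with open(list_path, "w") as out:
        for name in sorted(os.listdir(feature_dir)):
            video_id, ext = os.path.splitext(name)
            if ext != ".npy":
                continue  # skip anything that is not a feature file
            full_path = os.path.abspath(os.path.join(feature_dir, name))
            out.write("{}\t{}\n".format(video_id, full_path))
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical locations; adjust them to your own setup.
    if os.path.isdir("features"):
        n = write_feature_list("features", "vcdb_feature_files.txt")
        print("listed {} feature files".format(n))
```

The resulting file can then be passed to the `--feature_files` argument shown in the command above.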

DML training

  • Train the DML network by providing the global features and triplets of VCDB, and a directory to save the trained model.
python train_dml.py --train_set output_data/vcdb_features.npy --triplets output_data/vcdb_triplets.npy --model_path model/ 
  • Triplets from CC_WEB_VIDEO can be injected into training if the global features and triplets of the evaluation set are provided.
python train_dml.py --evaluation_set output_data/cc_web_video_features.npy --evaluation_triplets output_data/cc_web_video_triplets.npy --train_set output_data/vcdb_features.npy --triplets output_data/vcdb_triplets.npy --model_path model/

Evaluation

  • Evaluate the performance of the system by providing the trained model path and the global features of CC_WEB_VIDEO.
python evaluation.py --fusion Early --evaluation_set output_data/cc_vgg_features.npy --model_path model/

OR

python evaluation.py --fusion Late --evaluation_features cc_web_video_feature_files.txt --evaluation_set output_data/cc_vgg_features.npy --model_path model/
  • The mAP and the PR-curve are returned.
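
To make the reported metric concrete, here is a minimal sketch of how average precision is commonly computed for one ranked result list, and how mAP averages it over queries. It is not taken from evaluation.py; the function names and the toy rankings are assumptions for illustration only.

```python
def average_precision(ranked_ids, relevant_ids):
    """AP of one query: mean of precision@k over the ranks k that hold a relevant item."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, video_id in enumerate(ranked_ids, start=1):
        if video_id in relevant:
            hits += 1
            precision_sum += hits / float(rank)
    return precision_sum / len(relevant)


def mean_average_precision(rankings, ground_truth):
    """mAP: the average AP over all queries present in `rankings`."""
    aps = [average_precision(rankings[q], ground_truth[q]) for q in rankings]
    return sum(aps) / len(aps)


# Toy data: two queries with their ranked results and relevant video ids.
rankings = {"q1": ["a", "x", "b"], "q2": ["y", "c"]}
ground_truth = {"q1": ["a", "b"], "q2": ["c"]}
score = mean_average_precision(rankings, ground_truth)
print(round(score, 4))
```

In this toy case the first query finds its relevant videos at ranks 1 and 3 and the second at rank 2, so a ranking that moves relevant videos closer to the top raises the score.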

Citation

If you use this code for your research, please cite our paper.

@inproceedings{kordopatis2017dml,
  title={Near-Duplicate Video Retrieval with Deep Metric Learning},
  author={Kordopatis-Zilos, Giorgos and Papadopoulos, Symeon and Patras, Ioannis and Kompatsiaris, Yiannis},
  booktitle={2017 IEEE International Conference on Computer Vision Workshop (ICCVW)},
  year={2017},
}

Related Projects

Intermediate-CNN-Features - this repo was used to extract our features

ViSiL - video similarity learning for fine-grained similarity calculation

FIVR-200K - download our FIVR-200K dataset

License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.

Contact for further details about the project

Giorgos Kordopatis-Zilos ([email protected])
Symeon Papadopoulos ([email protected])

ndvr-dml's People

Contributors

gkordo, kleinmind

ndvr-dml's Issues

Availability of Testing code?

Thank you for sharing such great information on the training and evaluation of your model.
I was wondering: if we need to visualize which videos are near-duplicates of our input test video, how can this be done using your code?

Request FIVR-200K dataset

@gkordo Hi Giorgos,

I have read your paper "FIVR: Fine-grained Incident Video Retrieval" and I'm interested in the FIVR-200K dataset, which you will release at ndd.iti.gr/fivr.html. Could you tell me when FIVR-200K will be released, so that I can use it for some experiments?

Thanks.

VCDB Triplet Generation

The following error occurs when I run triplet_generator.py:

    CC_WEB_VIDEO Triplet Generation

    Query 0: 0%| | 0/341 [00:00<?, ?it/s]
    Traceback (most recent call last):
      File "triplet_generator.py", line 186, in <module>
        triplets = triplet_generator_cc(dataset, features)
      File "triplet_generator.py", line 141, in triplet_generator_cc
        pair_distance = euclidean(video1, video2)
      File "/home/chq/anaconda3/envs/python2/lib/python2.7/site-packages/scipy/spatial/distance.py", line 602, in euclidean
        return minkowski(u, v, p=2, w=w)
      File "/home/chq/anaconda3/envs/python2/lib/python2.7/site-packages/scipy/spatial/distance.py", line 505, in minkowski
        dist = norm(u_v, ord=p)
      File "/home/chq/anaconda3/envs/python2/lib/python2.7/site-packages/scipy/linalg/misc.py", line 145, in norm
        return nrm2(a)
    _fblas.error: (offx>=0 && offx<len(x)) failed for 2nd keyword offx: dnrm2:offx=0

Do you know why? Only cc_web_video_features.npy is generated, without cc_web_video_triplets.npy.
Thanks!

Triplet Generation - _fblas.error:

#17
It seems like it's an issue of scikit-learn and blas. Try to update/reinstall these two packages:

I have the same issue as above and am trying to update the packages.
However, installing blas gives an error:

pip3 install blas
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://pypi.org/simple/blas/

Is there any other way?

Where is the function for feature vectors fusion?

I'm trying to calculate the distance between two individual videos. According to the paper, after extracting the feature vectors I can choose between early fusion and late fusion to turn the feature vectors into a single vector before feeding it to the embedding function, but I don't see where this step happens in the code.
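
For orientation, the two fusion modes can be sketched as follows, based on a reading of the paper rather than on this repository's code. Early fusion averages the frame descriptors into one video descriptor before the embedding is applied; late fusion embeds every frame descriptor and averages the resulting embeddings. The `embed` function below is a hypothetical stand-in for the trained DML network, and the helper names and normalisation steps are assumptions.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = float(len(vectors))
    return [sum(col) / n for col in zip(*vectors)]


def embed(v):
    """Hypothetical stand-in for the trained DML network (NOT the real model)."""
    return [max(0.0, x - 0.5) for x in v]


def early_fusion(frame_features):
    """Average the frame descriptors first, then embed the single video descriptor."""
    video_descriptor = l2_normalize(mean_vector(frame_features))
    return l2_normalize(embed(video_descriptor))


def late_fusion(frame_features):
    """Embed every frame descriptor, then average the frame embeddings."""
    frame_embeddings = [l2_normalize(embed(f)) for f in frame_features]
    return l2_normalize(mean_vector(frame_embeddings))


# Two toy frame descriptors for one video (illustrative values only).
frames = [[1.0, 0.0], [0.6, 0.6]]
print(early_fusion(frames))
print(late_fusion(frames))
```

Because the stand-in embedding is non-linear, the two modes give different video vectors for the same frames. This matches the evaluation commands in the README, where the Late option additionally requires the per-video feature files.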

generating training files according to our own data?

Respected Sir,
I know that you are using the vcdb.pickle file for the training of the DML model. I also know that the pickle file is a dictionary containing video pairs and an index.
I was wondering how I can generate the training files for my own dataset?

Error when evaluating

First, thank you for your amazing work.
When running the evaluation part:
python evaluation.py --evaluation_set /data/p01/NDVR/CC_WEB_triplet/cc_web_video_features.npy --model_path /data/p01/NDVR/model/, I got the error below:

Evaluation Results
==================
evaluation.py:43: RuntimeWarning: invalid value encountered in divide
sim = np.round(1 - dist[i] / dist.max(), decimals=6)
Traceback (most recent call last):
  File "evaluation.py", line 78, in <module>
    positive_labels=args['positive_labels'], all_videos=False)
  File "/home/p01/projects/NDVR/ndvr-dml-master/utils.py", line 157, in evaluate
    precision, recall, thresholds = precision_recall_curve(y_target, y_score)
  File "/home/p01/.conda/envs/pc2/lib/python2.7/site-packages/sklearn/metrics/ranking.py", line 522, in precision_recall_curve
    sample_weight=sample_weight)
  File "/home/p01/.conda/envs/pc2/lib/python2.7/site-packages/sklearn/metrics/ranking.py", line 416, in _binary_clf_curve
    raise ValueError("Data is not binary and pos_label is not specified")
ValueError: Data is not binary and pos_label is not specified

I checked the README.md and found that the command should be python evaluation.py --evaluation_set output_data/cc_vgg_features.npy --model_path model/, so I think I may have misunderstood what cc_vgg_features.npy is. Can you give me instructions on how to build this .npy file, or is something wrong elsewhere?
