nlvl-generalization's Introduction

Generalization Capacity of Natural Language Video Localisation (NLVL) Models

This repository contains the datasets I created during my Final Year Project at Nanyang Technological University. If you find this work useful, please cite it using the citation below. The project can be accessed at: https://hdl.handle.net/10356/175072

Dhanyamraju, H. R. (2024). Generalization capacity of natural language video localization (NLVL) models. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/175072

Charades-Ego-STA Dataset

This dataset contains 302 query-video pairs labelled with the start and end timestamps of the video segment that best fits the given query. All the videos in this dataset are first-person videos, i.e., each video is recorded from the perspective of the person performing the action.

The labeled video-query pairs can be accessed in the Charades-Ego-STA directory of this repo. The csv and txt files have the same content, but the txt file is formatted to be consistent with the format of the Charades-STA dataset. You can download the videos of the Charades-Ego dataset from Ai2: Charades-Ego.
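As a rough illustration of reading the txt annotations, the sketch below assumes each line follows the layout used by the original Charades-STA annotation files, `VIDEO_ID START END##QUERY`. The `Annotation` type, the `parse_sta_line` helper, and the sample line are hypothetical and not taken from this repo, so check the actual file before relying on them.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    video_id: str
    start: float  # segment start, in seconds
    end: float    # segment end, in seconds
    query: str


def parse_sta_line(line: str) -> Annotation:
    """Parse one line of the assumed form 'VIDEO_ID START END##QUERY'."""
    # Split only on the first '##' so a query containing '##' stays intact.
    meta, query = line.strip().split("##", 1)
    video_id, start, end = meta.split()
    return Annotation(video_id, float(start), float(end), query)


if __name__ == "__main__":
    # Made-up sample line, for illustration only.
    print(parse_sta_line("VID01 2.5 10.0##person opens a door"))
```

The same helper should apply to any file that keeps the Charades-STA layout, since only the line format is assumed.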

Please note that the Charades-Ego dataset is subject to its own license, whose terms may differ from those of this repo.

Charades-STA-Merged Dataset

This dataset is built upon the Charades-STA dataset by merging various videos together in order to reduce the distributional bias in the timestamps.
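To see why merging changes where annotated segments fall, consider simple concatenation: a segment in a later clip is shifted by the total duration of the clips placed before it. The sketch below shows only that arithmetic. It is an assumption about how merged timestamps could be derived, not the logic of generate_charades_merged.py, and `shift_segment` is a hypothetical helper.

```python
from typing import Sequence, Tuple


def shift_segment(
    start: float, end: float, preceding_durations: Sequence[float]
) -> Tuple[float, float]:
    """Map a segment's timestamps into the timeline of a concatenated video.

    preceding_durations holds the lengths (in seconds) of the clips that
    come before the clip containing the segment.
    """
    offset = sum(preceding_durations)
    return start + offset, end + offset


if __name__ == "__main__":
    # A segment at 2.0-8.0 s in a clip placed after clips of 30.0 s and 12.5 s.
    print(shift_segment(2.0, 8.0, [30.0, 12.5]))
```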

The labeled video-query pairs can be accessed in the Charades-STA-Merged directory of this repo. The csv and txt files have the same content, but the txt file is formatted to be consistent with the format of the Charades-STA dataset. You can generate the merged videos by following these steps:

  1. Download the videos of the Charades Dataset from: Ai2: Charades
  2. Download the charades_sta_merged_train.csv and charades_sta_merged_test.csv files from the Charades-STA-Merged directory of this repo.
  3. Install FFmpeg from https://ffmpeg.org/download.html
  4. Download requirements.txt from this repo and run:
     pip install -r requirements.txt
  5. Download generate_charades_merged.py, replace the text in the <> with the relevant paths, and run:
     python generate_charades_merged.py --input_video_dir "<path-to-charades-videos>" --output_dir "<path-to-dir-to-store-merged-videos>" --train_csv_path "<path-to-charades_sta_merged_train.csv>" --test_csv_path "<path-to-charades_sta_merged_test.csv>"

You will need at least 12 GB of available disk space for the Charades-STA-Merged Dataset. You may also require additional space to store the Charades Dataset.
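Before starting a long generation run, it may help to confirm that FFmpeg is on the PATH and that the output location has enough free space. The following is a minimal sketch using only the Python standard library. The `preflight` helper is hypothetical and not part of this repo, and the 12 GB default simply mirrors the figure above.

```python
import shutil
from typing import List

REQUIRED_GB = 12  # disk space quoted above for the merged dataset


def preflight(output_dir: str, required_gb: float = REQUIRED_GB) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    # disk_usage reports bytes for the filesystem containing output_dir.
    free_gb = shutil.disk_usage(output_dir).free / 1024**3
    if free_gb < required_gb:
        problems.append(
            f"only {free_gb:.1f} GB free in {output_dir}, need about {required_gb} GB"
        )
    return problems


if __name__ == "__main__":
    for problem in preflight("."):
        print("WARNING:", problem)
```

Pass the directory you intend to use for `--output_dir` (it must already exist) so the free-space check looks at the right filesystem.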
