

AICITY2022_Track3

This repository contains the solution for the AI City Challenge 2022, Track 3: Naturalistic Driving Action Recognition.

framework

Installation

Installation instructions for PyTorch and PySlowFast can be found here.

Data Preparation For Training

Download the training data (A1), provided by the 2022 AI City Challenge, and put the downloaded files into ./X3D_training/

Note: The file names in the .csv files do not match the file names of the videos (.mp4). For example, the .csv file lists Rear_view_User_id_49381_1, while the video file is named Rearview_mirror_user_id_49381_NoAudio_1. Change the file names in the csv files (user_id_*.csv) so that they match the names of the corresponding videos.
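A minimal sketch of one way to match a name from the csv file to a video file name. It is not part of this repo: the helper names (parse_name, match_video_name) are hypothetical, and the naming pattern is inferred only from the single example above (camera view, then user_id_<number>, an optional NoAudio part, and a trailing index). Views are compared by prefix, so check the result by hand for other naming variants.

```python
import re

# Assumed pattern, inferred from the one example in the note above:
# <view>_user_id_<uid>[_NoAudio]_<index>, case-insensitive.
_NAME_RE = re.compile(
    r"^(?P<view>.+?)_user_id_(?P<uid>\d+)(?:_noaudio)?_(?P<idx>\d+)$",
    re.IGNORECASE,
)


def parse_name(name):
    """Return (view_key, user_id, index), or None if the name does not fit."""
    m = _NAME_RE.match(name.strip())
    if m is None:
        return None
    # "Rear_view" -> "rearview", "Rearview_mirror" -> "rearviewmirror"
    view_key = m.group("view").replace("_", "").lower()
    return view_key, m.group("uid"), m.group("idx")


def match_video_name(csv_name, video_stems):
    """Find the video stem that corresponds to a name taken from the csv."""
    target = parse_name(csv_name)
    if target is None:
        return None
    for stem in video_stems:
        cand = parse_name(stem)
        if cand is None:
            continue
        # Views match when one normalized key is a prefix of the other.
        same_view = cand[0].startswith(target[0]) or target[0].startswith(cand[0])
        if same_view and cand[1:] == target[1:]:
            return stem
    return None
```

Calling match_video_name("Rear_view_User_id_49381_1", stems) with the video stems of one user folder returns the stem to write back into the csv, or None when nothing matches.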

Furthermore, to keep the names consistent, rename the csv file in the user_id_49381 folder of the downloaded data from User_id_49381.csv to user_id_49381.csv
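The rename can also be scripted. This is a small sketch, assuming a case-sensitive file system; normalize_csv_name is a hypothetical helper, not something shipped with the repo.

```python
from pathlib import Path


def normalize_csv_name(csv_path):
    """Rename e.g. 'User_id_49381.csv' to 'user_id_49381.csv'; return the new path."""
    csv_path = Path(csv_path)
    target = csv_path.with_name(csv_path.name.lower())
    if target.name != csv_path.name:
        csv_path.rename(target)
    return target
```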

cd X3D_training

Split the training data into multiple video segments using the following command (this takes around 4 hours; the split data can also be downloaded here, for authorized users only):

python cut_video.py

After executing the above command, the output data is located in ./X3D_training/data/

The dataset is split into video segments, which are placed into one folder per label based on the ground truth (user_id_*.csv). The split files are organized as follows:

  • data
    • 0
      • VIDEO1.MP4
      • VIDEO2.MP4
      • VIDEO3.MP4
      • ... ...
    • 17
      • VIDEO1.MP4
      • VIDEO2.MP4
      • VIDEO3.MP4
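To illustrate the layout above, here is a hedged sketch of how a single labeled segment could be cut into data/<label>/ with ffmpeg. It is not the code in cut_video.py: the function names, the output file naming, and the use of stream copy are assumptions made for the example.

```python
from pathlib import Path


def segment_output_path(data_dir, label, video_stem, index):
    """Build data/<label>/<video_stem>_<index>.MP4 (hypothetical naming scheme)."""
    return Path(data_dir) / str(label) / f"{video_stem}_{index}.MP4"


def ffmpeg_cut_command(src, dst, start_sec, end_sec):
    """Return an ffmpeg argument list that copies [start_sec, end_sec) of src to dst."""
    if end_sec <= start_sec:
        raise ValueError("end_sec must be greater than start_sec")
    return [
        "ffmpeg", "-y",
        "-ss", str(start_sec),           # seek to the segment start
        "-i", str(src),
        "-t", str(end_sec - start_sec),  # segment duration in seconds
        "-c", "copy",                    # no re-encoding; cuts snap to keyframes
        str(dst),
    ]
```

The list can be passed to subprocess.run(cmd, check=True) after creating the label folder. Stream copy is fast but not frame-accurate; dropping "-c", "copy" re-encodes the segment instead.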

The next step is to create the ground truth (.csv files) for the split data, categorized into training (train_cameraview_id.csv), validation (val_cameraview_id.csv), and testing (test_cameraview_id.csv), using the following command:

python create_csv.py

After executing the above command, the output data is located in ./X3D_training/data/. The ground truth files are organized as follows:

  • data
    • train_cameraview_id.csv ...
    • val_cameraview_id.csv ...
    • test_cameraview_id.csv ...

The content of each *.csv file is formatted as follows:

  * path_to_video_1 label_1
  * path_to_video_2 label_2
  * path_to_video_3 label_3
  ...
  * path_to_video_N label_N
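The "path label" format can be produced from the label folders with a few lines of Python. The sketch below is an illustration only, not the logic of create_csv.py: the function names, the split ratios, and the random split are assumptions, and the per-camera-view and per-user grouping of the real files is not reproduced.

```python
import random
from pathlib import Path


def collect_rows(data_dir):
    """Return (video_path, label) pairs from data_dir/<label>/*.MP4 folders."""
    rows = []
    for label_dir in sorted(Path(data_dir).iterdir()):
        if not (label_dir.is_dir() and label_dir.name.isdigit()):
            continue  # skip csv files and anything that is not a label folder
        for video in sorted(label_dir.glob("*.MP4")):
            rows.append((video.as_posix(), int(label_dir.name)))
    return rows


def split_rows(rows, val_ratio=0.1, test_ratio=0.1, seed=0):
    """Shuffle deterministically and split into train/val/test (assumed ratios)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_ratio)
    n_test = int(len(shuffled) * test_ratio)
    return {
        "val": shuffled[:n_val],
        "test": shuffled[n_val:n_val + n_test],
        "train": shuffled[n_val + n_test:],
    }


def write_csv(rows, csv_path):
    """Write one 'path label' line per video, separated by a single space."""
    with open(csv_path, "w") as f:
        for path, label in rows:
            f.write(f"{path} {label}\n")
```

Because the separator is a space, video paths must not contain spaces.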

Reproduce the result on AICity 2021 Challenge

Train

The pretrained X3D-L model can be downloaded here. After downloading the pretrained model, put the file into ./X3D_training/

cd X3D_training
python tools/run_net.py --cfg configs/Kinetics/X3D_L.yaml NUM_GPUS 1 DATA.PATH_TO_DATA_DIR data

Outputs of the training process (i.e., checkpoints) are saved in the main folder, named in the format checkpoint_cameraview_id

Note: We ran the training on an A100 GPU. For other GPUs, please change the batch size in ./Training/configs/Kinetics/X3D_L.yaml
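As an alternative to editing the yaml file, PySlowFast-style configs accept KEY VALUE overrides on the command line, as the training command already does for NUM_GPUS. The sketch below assembles such a command with a TRAIN.BATCH_SIZE override. It assumes this repo keeps the upstream PySlowFast key name; build_train_command is a hypothetical helper and the batch size shown in the usage note is only an example.

```python
def build_train_command(batch_size, num_gpus=1, data_dir="data",
                        cfg="configs/Kinetics/X3D_L.yaml"):
    """Return the run_net.py argument list with a batch-size override."""
    if batch_size < 1 or num_gpus < 1:
        raise ValueError("batch_size and num_gpus must be positive")
    if batch_size % num_gpus != 0:
        # The batch is split across GPUs, so it should divide evenly.
        raise ValueError("batch_size should be divisible by num_gpus")
    return [
        "python", "tools/run_net.py",
        "--cfg", cfg,
        "NUM_GPUS", str(num_gpus),
        "TRAIN.BATCH_SIZE", str(batch_size),  # assumed upstream PySlowFast key
        "DATA.PATH_TO_DATA_DIR", data_dir,
    ]
```

For example, " ".join(build_train_command(8)) gives a command line that can be run from inside X3D_training on a GPU with less memory.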

Inference

The input for inference should follow the same format as the A2 dataset, which is provided by the 2022 AI City Challenge. The A2 dataset is organized as follows:

  • A2
    • user_id_*
      • CAMERAVIEW_user_id_*.MP4
      • CAMERAVIEW_user_id_*.MP4
      • CAMERAVIEW_user_id_*.MP4
      • ...
    • video_ids.csv
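Before running inference, it can help to confirm that a test folder follows this layout. The following is a minimal sketch with hypothetical helper names; it only checks the folder structure shown above and does not read video_ids.csv.

```python
from pathlib import Path


def list_a2_videos(a2_dir):
    """Map each user_id_* folder name to the sorted .MP4 file names inside it."""
    groups = {}
    for user_dir in sorted(Path(a2_dir).glob("user_id_*")):
        if user_dir.is_dir():
            groups[user_dir.name] = sorted(p.name for p in user_dir.glob("*.MP4"))
    return groups


def has_video_ids(a2_dir):
    """Check that video_ids.csv sits at the top level of the dataset folder."""
    return (Path(a2_dir) / "video_ids.csv").is_file()
```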

The checkpoints from the training process can be downloaded here; they include all checkpoints for the different camera views and user ids. After downloading the checkpoints, put all files into ./X3D_inference/checkpoint_submit/

cd X3D_inference
python inference_ensemble_3_view.py --cfg configs/Kinetics/X3D_L.yaml NUM_GPUS 1 TRAIN.ENABLE False DATA.PATH_TO_DATA_DIR A2

DATA.PATH_TO_DATA_DIR: path to the test dataset (e.g., A2, B). The submission file appears in ./output
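The script name suggests that predictions from three camera views are combined. How inference_ensemble_3_view.py does this is not described here, so the sketch below shows a generic approach instead: averaging the per-class probabilities of the views and taking the highest-scoring class. Function names and the example numbers are hypothetical.

```python
def average_views(view_probs):
    """Average per-class probabilities over views (one list of floats per view)."""
    if not view_probs:
        raise ValueError("at least one view is required")
    n_classes = len(view_probs[0])
    if any(len(p) != n_classes for p in view_probs):
        raise ValueError("all views must score the same number of classes")
    return [sum(p[i] for p in view_probs) / len(view_probs)
            for i in range(n_classes)]


def predict_label(view_probs):
    """Return the index of the class with the highest averaged probability."""
    avg = average_views(view_probs)
    return max(range(len(avg)), key=avg.__getitem__)
```

With three views scoring two classes as [0.6, 0.4], [0.2, 0.8], and [0.3, 0.7], the second class wins even though the first view alone prefers the first class.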

Public Leaderboard

TeamName F1-Score Link
VTCC_uTVM 0.3492

Citation

If you find our work useful, please cite the following:

@inreview{Tran2022,  
    author={Manh Tung Tran and Minh Quan Vu and Ngoc Duong Hoang and Khac-Hoai Nam Bui},  
    title={An Effective Temporal Localization Method with Multi-View 3D Action Recognition for Untrimmed Naturalistic Driving Videos},  
    booktitle={CVPR Workshop},
    year={2022}  
}

Contact

Khac-Hoai Nam Bui ([email protected])

Manh Tung Tran ([email protected])

Minh Quan Vu ([email protected])
