
FAST-VQA


The official open-source training and inference code for our paper "FAST-VQA: Efficient End-to-end Video Quality Assessment with Fragment Sampling", to appear in ECCV 2022.

arXiv Edition

Pretrained weights:

Supported in the master branch:

  • Training on large datasets (train.py)
  • Finetuning on smaller datasets (finetune.py)
  • Evaluation (inference.py)
  • Direct API import (from fastvqa import deep_end_to_end_vqa)
  • Package installation (pip install .)

The Dev_Branch contains several new features that are more suitable for developing your own deep end-to-end VQA models.

Intro

Examples of fragments from LIVE-VQC:

Fragment demo (from LIVE-VQC, 720p, original score 38.24)

Fragment demo (from LIVE-VQC, 1080p, original score 74.54)
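
For intuition, here is a minimal sketch of grid-style fragment sampling, assuming a 7x7 spatial grid of 32x32 mini-patches with one shared random offset per grid cell (the function name and sizes are illustrative, not the repository's exact implementation):

import torch

def sample_fragments(video, grid=7, patch=32):
    # video: (C, T, H, W). Divide each frame into a grid x grid uniform grid,
    # crop one patch x patch mini-patch per cell (with the same offset for all
    # frames, so temporal variation inside each mini-patch is preserved), then
    # splice the mini-patches back together into a small "fragment" frame.
    c, t, h, w = video.shape
    cell_h, cell_w = h // grid, w // grid
    out = torch.zeros(c, t, grid * patch, grid * patch)
    for i in range(grid):
        for j in range(grid):
            y = i * cell_h + torch.randint(0, cell_h - patch + 1, (1,)).item()
            x = j * cell_w + torch.randint(0, cell_w - patch + 1, (1,)).item()
            out[:, :, i * patch:(i + 1) * patch, j * patch:(j + 1) * patch] = \
                video[:, :, y:y + patch, x:x + patch]
    return out

fragments = sample_fragments(torch.randn(3, 32, 720, 1280))
print(fragments.shape)  # torch.Size([3, 32, 224, 224])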

Results

We reach SOTA performance with 210x fewer FLOPs.

We also improve on the previous SOTA across multiple databases by a large margin.

(Figure: GFLOPs vs. performance comparison.)

Our sparse and efficient sub-sampling also retains at least 99.5% of the relative accuracy of extremely dense sampling.


See the quality map demos for examples of local quality maps.

Installation

Requirements

The original method is built with

  • python=3.8.8
  • torch=1.10.2
  • torchvision=0.11.3

while using the decord module to read the original videos (so you do not need to apply any transforms to your original .mp4 inputs).
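
For example, a raw .mp4 can be loaded into a (C, T, H, W) tensor with decord in a few lines (a sketch; the file name is a placeholder):

import torch
from decord import VideoReader

vr = VideoReader("your_video.mp4")             # decodes frames on demand
frames = vr.get_batch(list(range(len(vr))))    # (T, H, W, C) uint8 array of all frames
video = torch.from_numpy(frames.asnumpy())     # convert to a torch tensor
video = video.permute(3, 0, 1, 2).float()      # (C, T, H, W), matching the API example below
print(video.shape)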

To get all the requirements, please run

pip install -r requirements.txt

Direct install

You can run

pip install .

or

python setup.py install

to install the full FAST-VQA with its requirements.

Usage

Visualize fragments

If you would like to visualize the proposed fragments, you can generate the demo visualizations yourself via the following script:

python visualize.py -d $DATASET$ 

You can also visualize the patch-wise local quality maps rendered on fragments, via

python visualize.py -d $DATASET$  -nm

Inference on Scripts

You can install this package by running

pip install .

Then you can embed these lines into your Python scripts:

import torch
from fastvqa import deep_end_to_end_vqa

dum_video = torch.randn((3, 240, 720, 1080))  # a sample 720p, 240-frame video (C, T, H, W)
vqa = deep_end_to_end_vqa(True, model_type="fast")  # assumption: "fast"/"fast-m", mirroring the -m flag
score = vqa(dum_video)
print(score)

This script will automatically download the model weights pretrained on LSVQ.
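
Combined with the decord loading sketch from the Requirements section, a real video file can be scored end to end (a sketch; the path is a placeholder and the model_type string is an assumption mirroring the -m flag):

import torch
from decord import VideoReader
from fastvqa import deep_end_to_end_vqa

vr = VideoReader("your_video.mp4")
video = torch.from_numpy(vr.get_batch(list(range(len(vr)))).asnumpy())
video = video.permute(3, 0, 1, 2).float()           # (C, T, H, W)
vqa = deep_end_to_end_vqa(True, model_type="fast")  # assumption: "fast" selects FAST-VQA
print(vqa(video))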

Benchmarking FAST-VQA

You can directly benchmark the model with mainstream benchmark VQA datasets.

python inference.py -d $DATASET$

Available datasets are LIVE_VQC, KoNViD, LSVQ, and (experimentally) CVD2014 and YouTubeUGC; pass 'all' to infer on all of them.

Train FAST-VQA

Train from scratch

You might need to download the original Swin-T weights to initialize the model.

Intra Dataset Training

This training will split the dataset into 10 random train/test splits (with random seed 42) and report the best result on the random test splits.

python train.py -d $DATASET$ --from_ar

Supported datasets are KoNViD-1k, LIVE_VQC, CVD2014, and YouTube-UGC.
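
The split protocol amounts to something like the following sketch (illustrative only; the train ratio and shuffling details are assumptions, not the repository's exact code):

import random

def train_test_splits(n_videos, n_splits=10, seed=42, train_ratio=0.8):
    # yield n_splits random train/test index splits, seeded for reproducibility
    rng = random.Random(seed)
    for _ in range(n_splits):
        indices = list(range(n_videos))
        rng.shuffle(indices)
        cut = int(n_videos * train_ratio)
        yield indices[:cut], indices[cut:]

for train_idx, test_idx in train_test_splits(1200):  # e.g. KoNViD-1k has 1200 videos
    print(len(train_idx), len(test_idx))  # 960 240 per split (under the assumed 80/20 ratio)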

Cross Dataset Training

This training will not split the data and will directly report the best result on the provided validation dataset.

python train.py -d $TRAINSET$-$VALSET$ --from_ar -lep 0 -ep 30

The supported TRAINSET is LSVQ, and VALSET can be LSVQ (LSVQ-test + LSVQ-1080p), KoNViD, or LIVE_VQC.

Finetune with provided weights

Intra Dataset Training

This training will split the dataset into 10 random train/test splits (with random seed 42) and report the best result on the random test splits.

python finetune.py -d $DATASET$

Supported datasets are KoNViD-1k, LIVE_VQC, CVD2014, and YouTube-UGC.

Switching to FAST-VQA-M

You can add the argument -m fast-m to any of the scripts above (finetune.py, inference.py, visualize.py) to switch to FAST-VQA-M instead of FAST-VQA.
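
For example, to benchmark FAST-VQA-M on KoNViD (the dataset choice is illustrative):

python inference.py -d KoNViD -m fast-m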

Citation

Please cite the following paper when using this repo.

@article{wu2022fastquality,
  title={FAST-VQA: Efficient End-to-end Video Quality Assessment with Fragment Sampling},
  author={Wu, Haoning and Chen, Chaofeng and Hou, Jingwen and Wang, Annan and Sun, Wenxiu and Yan, Qiong and Lin, Weisi},
  journal={Proceedings of the European Conference on Computer Vision (ECCV)},
  year={2022}
}
