sanaznami / mtl_jnd

Supplementary material for the paper "Lightweight Multitask Learning for Robust JND Prediction using Latent Space and Reconstructed Frames", IEEE TCSVT, 2024.

License: Apache License 2.0

Lightweight Multitask Learning for Robust JND Prediction using Latent Space and Reconstructed Frames

Introduction

This repository contains the TensorFlow implementation of the paper "Lightweight Multitask Learning for Robust JND Prediction using Latent Space and Reconstructed Frames".

Abstract

The Just Noticeable Difference (JND) refers to the smallest distortion in an image or video that can be perceived by the Human Visual System (HVS), and is widely used in optimizing image/video compression. However, accurate JND modeling is very challenging due to its content dependence and the complex nature of the HVS. Recent solutions train deep-learning-based JND prediction models, mainly based on a Quantization Parameter (QP) value, representing a single JND level, and train separate models to predict each JND level. We point out that a single QP-distance is insufficient to properly train a network with millions of parameters, for a complex content-dependent task. Inspired by recent advances in learned compression and multitask learning, we propose to address this problem by (1) learning to reconstruct the JND-quality frames, jointly with the QP prediction, and (2) jointly learning several JND levels to augment the learning performance. We propose a novel solution where first, an effective feature backbone is trained by learning to reconstruct JND-quality frames from the raw frames. Second, JND prediction models are trained based on features extracted from latent space (i.e., compressed domain), or reconstructed JND-quality frames. Third, a multi-JND model is designed, which jointly learns three JND levels, further reducing the prediction error. Extensive experimental results demonstrate that our multi-JND method outperforms the state-of-the-art and achieves an average JND1 prediction error of only 1.57 in QP, and 0.72 dB in PSNR. Moreover, the multitask learning approach and compressed domain prediction facilitate lightweight inference by significantly reducing the complexity and the number of parameters.

The proposed framework

Schematic illustration of previous and proposed approaches: (a) existing approach, (b) the proposed Latent-based JND prediction methods, LAT and E2E-LAT, (c) the proposed Reconstructed-based JND prediction methods, REC and E2E-REC, (d) the proposed Multi-JND (MJ) learning using Latent space, MJ-LAT, and (e) the proposed MJ learning using reconstructed JND-quality frames, MJ-REC.

Requirements

  • TensorFlow
  • FFmpeg

Dataset

Our evaluation is conducted on the VideoSet and MCL-JCI datasets.

Pre-trained Models

Our pre-trained models can be downloaded from the Zenodo repository.

Usage

The pre-trained models can predict JND values directly, and they can also be used for training on a custom dataset.

Note: The dataset used for training and testing should have the following structure.
- rootdir/
     - train/
         - img#1
         - ...
         - JND-Levels.txt (a file containing the 3 JND levels per image: first column for the first JND, second column for the second JND, and third column for the third JND level)
     - valid/
         - img#1
         - ...
         - JND-Levels.txt (a file containing the 3 JND levels per image: first column for the first JND, second column for the second JND, and third column for the third JND level)
     - test/
         - img#1
         - ...
     - jnd1train/
         - img#1
         - ...
     - jnd1valid/
         - img#1
         - ...
     - jnd2train/
         - img#1
         - ...
     - jnd2valid/
         - img#1
         - ...
     - jnd3train/
         - img#1
         - ...
     - jnd3valid/
         - img#1
         - ...
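
Before launching a run, it can help to confirm that a dataset follows this layout. The snippet below is a minimal sketch and not part of the repository: the helper names `missing_entries` and `read_jnd_levels` are hypothetical, and the parser assumes that JND-Levels.txt holds three whitespace- or comma-separated numeric columns per line with no header, which the description above does not specify.

```python
from pathlib import Path

# Sub-directories named in the layout above.
EXPECTED_DIRS = [
    "train", "valid", "test",
    "jnd1train", "jnd1valid",
    "jnd2train", "jnd2valid",
    "jnd3train", "jnd3valid",
]
# Splits that are described as containing a JND-Levels.txt file.
LABELLED_SPLITS = ["train", "valid"]


def missing_entries(rootdir):
    """Return the expected directories and label files absent under rootdir."""
    root = Path(rootdir)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [
        f"{split}/JND-Levels.txt"
        for split in LABELLED_SPLITS
        if not (root / split / "JND-Levels.txt").is_file()
    ]
    return missing


def read_jnd_levels(path):
    """Parse JND-Levels.txt into (jnd1, jnd2, jnd3) tuples, one per image.

    Assumption: three numeric columns per line, separated by whitespace
    or commas, without a header row.
    """
    rows = []
    for line_no, line in enumerate(Path(path).read_text().splitlines(), start=1):
        fields = line.replace(",", " ").split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 3:
            raise ValueError(f"line {line_no}: expected 3 columns, got {len(fields)}")
        rows.append(tuple(float(f) for f in fields))
    return rows
```

Calling `missing_entries("Path-to-the-rootdir/")` returns an empty list when every expected directory and label file is present.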

Testing

To run prediction with the LAT or REC model, use the following command.

python3 [LAT.py or REC.py] test --jnd_value [JND1 or JND2 or JND3] --data_dir "Path-to-the-rootdir/" --model_weights_path "Path-to-the-pretrained-model/" --result_path "Path-to-save-test-results/" --JND_Recon_Models_Path "Path-to-the-pretrained-JND-Reconstruction-models/"

To run prediction with the E2E-LAT or E2E-REC model, use the following command.

python3 [E2ELAT.py or E2EREC.py] test --jnd_value [JND1 or JND2 or JND3] --data_dir "Path-to-the-rootdir/" --model_weights_path "Path-to-the-pretrained-model/" --result_path "Path-to-save-test-results/" --ImgReconstrution_Model_Path "Path-to-the-pretrained-Img-Reconstruction-models/"

To run prediction with the MJ-LAT or MJ-REC model, use the following command.

python3 [MJLAT.py or MJREC.py] test --data_dir "Path-to-the-rootdir/" --model_weights_path "Path-to-the-pretrained-model/" --result_path "Path-to-save-test-results/" --JND_Recon_Models_Path "Path-to-the-pretrained-JND-Reconstruction-models/"
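
The three test commands differ only in the script name, in whether `--jnd_value` is passed, and in which reconstruction-model flag is used. The following sketch assembles the argument list from Python; `build_test_command` is a hypothetical helper, not something shipped with the repository, and the flag names (including the spelling `--ImgReconstrution_Model_Path`) are copied from the commands above.

```python
SINGLE_JND_SCRIPTS = {"LAT.py", "REC.py"}
E2E_SCRIPTS = {"E2ELAT.py", "E2EREC.py"}
MULTI_JND_SCRIPTS = {"MJLAT.py", "MJREC.py"}
JND_VALUES = ("JND1", "JND2", "JND3")


def build_test_command(script, data_dir, model_weights_path, result_path,
                       recon_models_path, jnd_value=None):
    """Build the argument list for one of the test commands shown above."""
    cmd = ["python3", script, "test"]
    if script in MULTI_JND_SCRIPTS:
        # The MJ test command above takes no --jnd_value.
        if jnd_value is not None:
            raise ValueError("MJ models predict all JND levels; omit jnd_value")
    elif script in SINGLE_JND_SCRIPTS | E2E_SCRIPTS:
        if jnd_value not in JND_VALUES:
            raise ValueError(f"jnd_value must be one of {JND_VALUES}")
        cmd += ["--jnd_value", jnd_value]
    else:
        raise ValueError(f"unknown script: {script}")
    cmd += [
        "--data_dir", data_dir,
        "--model_weights_path", model_weights_path,
        "--result_path", result_path,
    ]
    # E2E scripts take the image-reconstruction model path; the others take
    # the JND-reconstruction model path. Flag spelling follows the README.
    if script in E2E_SCRIPTS:
        cmd += ["--ImgReconstrution_Model_Path", recon_models_path]
    else:
        cmd += ["--JND_Recon_Models_Path", recon_models_path]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the directory that contains the scripts.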

Training

To train the LAT or REC model, use the following command.

python3 [LAT.py or REC.py] train --jnd_value [JND1 or JND2 or JND3] --data_dir "Path-to-the-rootdir/" --checkpoint_path "Path-to-save-checkpoints-during-training/" --csv_log_path "Path-to-save-CSV-logs-during-training/" --JND_Recon_Models_Path "Path-to-the-pretrained-JND-Reconstruction-models/" --epochs Number-of-training-epochs --batch_size Batch-size-for-training --learning_rate Learning-rate-for-optimizer

To train the E2E-LAT or E2E-REC model, use the following command.

python3 [E2ELAT.py or E2EREC.py] train --jnd_value [JND1 or JND2 or JND3] --data_dir "Path-to-the-rootdir/" --checkpoint_path "Path-to-save-checkpoints-during-training/" --csv_log_path "Path-to-save-CSV-logs-during-training/" --ImgReconstrution_Model_Path "Path-to-the-pretrained-Img-Reconstruction-models/" --epochs Number-of-training-epochs --batch_size Batch-size-for-training --learning_rate Learning-rate-for-optimizer

To train the MJ-LAT or MJ-REC model, use the following command.

python3 [MJLAT.py or MJREC.py] train --jnd_value [JND1 or JND2 or JND3] --data_dir "Path-to-the-rootdir/" --checkpoint_path "Path-to-save-checkpoints-during-training/" --csv_log_path "Path-to-save-CSV-logs-during-training/" --JND_Recon_Models_Path "Path-to-the-pretrained-JND-Reconstruction-models/" --epochs Number-of-training-epochs --batch_size Batch-size-for-training --learning_rate Learning-rate-for-optimizer
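
Because the single-JND models are trained once per JND level, the same training command is typically issued three times with different `--jnd_value` settings. The sketch below generates one argument list per level. It is illustrative only: `build_train_commands` is a hypothetical helper, and the per-level `checkpoints/` and `logs/` sub-directories under `out_dir` are an assumed layout, not one prescribed by the repository. Trailing slashes are kept on paths to match the placeholders in the commands above.

```python
JND_LEVELS = ("JND1", "JND2", "JND3")


def build_train_commands(script, data_dir, out_dir, recon_flag, recon_path,
                         epochs, batch_size, learning_rate):
    """Return one training argument list per JND level.

    recon_flag is the reconstruction-model flag used by the chosen script,
    i.e. "--JND_Recon_Models_Path" or "--ImgReconstrution_Model_Path".
    """
    base = out_dir.rstrip("/")
    commands = []
    for level in JND_LEVELS:
        # Assumed layout: <out_dir>/<level>/checkpoints/ and <out_dir>/<level>/logs/
        commands.append([
            "python3", script, "train",
            "--jnd_value", level,
            "--data_dir", data_dir,
            "--checkpoint_path", f"{base}/{level}/checkpoints/",
            "--csv_log_path", f"{base}/{level}/logs/",
            recon_flag, recon_path,
            "--epochs", str(epochs),
            "--batch_size", str(batch_size),
            "--learning_rate", str(learning_rate),
        ])
    return commands
```

Each list can then be run in turn, for example with `subprocess.run(cmd, check=True)`. The hyperparameter values are left to the caller, since the README does not state defaults.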

Citation

The attention layer used in LAT-based models is derived from the Squeeze-and-Excitation Networks paper.

@inproceedings{hu2018squeeze,
  title={Squeeze-and-excitation networks},
  author={Hu, Jie and Shen, Li and Sun, Gang},
  booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
  year={2018}
}

If our work is useful for your research, please cite our paper:

@article{nami2024lightweight,
  title={Lightweight Multitask Learning for Robust JND Prediction using Latent Space and Reconstructed Frames},
  author={Nami, Sanaz and Pakdaman, Farhad and Hashemi, Mahmoud Reza and Shirmohammadi, Shervin and Gabbouj, Moncef},
  journal={IEEE Transactions on Circuits and Systems for Video Technology},
  year={2024},
  publisher={IEEE}
}

Project information

This repository is associated with the project FALCON, under Work Package 3 (WP3). This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 101022466.

Contact

If you have any questions, leave a message here or contact Sanaz Nami ([email protected], [email protected]).
