
mrvc's Introduction

Multi-Resolution Video Coding (MRVC)

Welcome to this project!

This is an implementation of a novel video transform based on Motion Compensation (MC) and the Discrete Wavelet Transform (DWT).

Contents of this repo:

  1. demos: some OpenCV demos.
  2. distribute: use of Distutils for distributing this software (unmaintained).
  3. docs: white paper.
  4. images: some test images.
  5. src: source files.
  6. tests: test scripts.
  7. tools: tools.

mrvc's People

Contributors

abordes96, afr997, davidbejarcaceres, dingwenguo, jalbladewing, jmmateo14, josefreak95, josmartor, juanrdzbaeza, normanditirado, vicente-gonzalez-ruiz, victorconka


mrvc's Issues

(Week 6) Prioritization of the visual content in the motion-compensated (H) subbands

Objectives

  1. Understand how to prioritize information in the MCDWT domain.

Methodology

In groups, implement and test the encoder described below (the decoder should be easily derivable from the encoder). After incorporating it into MCDWT, determine its performance as a compressor for different numbers of sent coefficients. Write a report of your work and present it in class.

Contents

  1. MCDWT white paper.
  2. MCDWT project.
  3. Introduction to Data Compression.
  4. Text compression.

Activities

  1. Modify MCDWT.py to implement the coder in the forward butterfly and the decoder in the backward butterfly. You will need to implement the algorithm described here.
  2. Compress the previous (truncated) sequence for different numbers of sent coefficients.
  3. Compute the compression ratio.
  4. Write a report (use preferably Markdown) describing your experiments and your conclusions.

Determining a good quantizer

In digital signal coding, a quantizer maps the M possible input values of a signal (samples, in the case of a digital signal) into N possible output values of the quantized signal (again, samples), with N<M. More information here.
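For instance, the midtread, midrise and deadzone uniform scalar quantizers requested later in this issue can be sketched as follows (an illustrative sketch only; the actual quantize.py may organize this differently):

```python
import numpy as np

def midtread(x, step):
    """Midtread uniform quantizer: reconstruction levels include 0."""
    k = np.round(x / step)      # quantization index
    return k * step             # dequantized value

def midrise(x, step):
    """Midrise uniform quantizer: 0 is a decision boundary, not a level."""
    k = np.floor(x / step)
    return (k + 0.5) * step

def deadzone(x, step):
    """Deadzone quantizer: a zero bin twice as wide, common in wavelet coding."""
    k = np.sign(x) * np.floor(np.abs(x) / step)
    return k * step
```

Note that for the same step, the deadzone quantizer sends more small-amplitude samples to zero than the midtread one, which usually helps when entropy-coding wavelet subbands.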

Objectives:

  1. Understand the quantization of signals and digital signals.
  2. Implement different quantizers for MCDWT and evaluate them.

Procedure:

Implement a set of uniform scalar quantizers (midtread, midrise and deadzone) in the script quantize.py. Create a Jupyter notebook documenting the file size and the distortion obtained when the subbands generated by MCDWT are quantized and compressed with the PNG image compression format, for the different quantizers implemented. Evaluate them using the same parameters (videos, number of spatial levels, number of temporal levels, etc.) as in the previous issue. Notice that, by modifying the quantization step, different points of the following R/D curve should be obtained:

Distortion
     |*
     | *
     |  *
     |    *
     |        *
     |                 *
     |                                 *
     |                                                                                           *
     +--------------------------------------------------------- Rate
     <- larger quantization step   smaller quantization step ->

(Week 7) Prioritization of the visual content in the I-type H subbands

Objectives

  1. Understand how to prioritize information in the 2D wavelet domain.

Methodology

In groups, implement and test the encoder described below (the decoder should be easily derivable from the encoder). After incorporating it into MCDWT, determine its performance as a compressor for different values of the quantization parameter (lambda). Write a report of your work and present it in class.

Contents

  1. MCDWT white paper.
  2. MCDWT project.
  3. Introduction to Data Compression.
  4. Text compression.

Activities

  1. Modify MCDWT.py to implement the coder in the forward butterfly and the decoder in the backward butterfly. You will need to implement the algorithm described here.
  2. Implement a uniform scalar quantizer and apply it to an MCDWT output.
  3. Compress the previous (quantized) sequence.
  4. Compute the compression ratio.
  5. Write a report (use preferably Markdown) describing your experiments and your conclusions.

(Week 8) Block-based bidirectional motion estimation

Objectives

  1. Understand the basics of the Motion Estimation (ME) and Motion Compensation (MC) processes.
  2. Compare the performance of a block-based ME system to that of the dense optical-flow estimator already implemented.

Methodology

In groups, research the links provided below (and, in general, use all the information that you can find on the Internet) to understand how to implement a bidirectional block-based ME/MC system for video. Develop an implementation (as efficient as possible) and compare it to the current estimator of the MCDWT project. Write a report and present it in class.

Contents

  1. MCDWT white paper.
  2. MCDWT project.
  3. MCTF_video_coding project.
  4. OpenCV.
  5. Introduction to Data Compression.
  6. Video compression.

Activities

  1. Research on the Internet for packages/libraries that provide ME (a good candidate could be the one used in this repo; see src/motion_estimate.cpp).
  2. Modify MCDWT.py to incorporate the block-based ME, coding it as an option to the current estimator (dense optical-flow).
  3. Experimentally, compare both estimators (block-based and optical-flow-based). Use the entropy of the output of the transform as a figure of merit.
  4. Write a report (use preferably Markdown) describing your experiments and your conclusions.
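The entropy figure of merit in activity 3 can be computed, for instance, as the first-order Shannon entropy of the transformed frames (an illustrative helper; the name and interface are my own, not part of MCDWT):

```python
import numpy as np

def entropy(img):
    """First-order Shannon entropy (bits/sample) of an integer-valued image."""
    values, counts = np.unique(img, return_counts=True)
    p = counts / counts.sum()           # empirical symbol probabilities
    return float(-np.sum(p * np.log2(p)))
```

A better motion estimator should yield H subbands with lower entropy, i.e. residuals that are cheaper to entropy-code.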

Adapting the `src/IO` stuff for handling multiple spatial resolutions

In the current implementation, the four subbands (LL, LH, HL, and HH) are named `<prefix>_LL`, `<prefix>_LH`, `<prefix>_HL` and `<prefix>_HH`. This notation hinders the implementation of MDWT+MCDWT, forcing the LL subbands to be renamed from `???_LL` to `???` (? = decimal digit) in each iteration.

To avoid this, we propose to use a different notation for the decompositions. An example:

Current implementation:

      DWT           iDWT
000 ------> 000_LL ------> 000
            000_LH
            000_HL
            000_HH

Proposed implementation:

      DWT           iDWT
000 ------> LL/000 ------> 000
            LH/000
            HL/000
            HH/000

These modifications should be "hardcoded" in the src/IO sources.

Bit allocation in 2D decompositions

Remember:

+-----------+   DWT   +----+----+
|           | ------> | LL | LH |
|   image   |         +----+----+
|           | <------ | HL | HH |
+-----------+  iDWT   +----+----+
                     decomposition

Bit allocation (see Section 7.1.2 (Quantization)) defines, given a bit budget R, the quantization step that must be used in each subband (LL, LH, HL and HH) to minimize the distortion (MSE, for example).

If the transform (DWT) is orthogonal, this optimization problem can be solved by exploiting the fact that the distortion generated by the quantization of the subbands is additive: for example, the quantization error generated by quantizing subband LL, "Q(LL)", plus the error generated by quantizing subband LH, "Q(LH)", equals the error generated by quantizing both subbands at the same time, "Q(LL,LH)". Thus, if we know the RD (Rate/Distortion) curve of each subband (that is, the contribution of each subband to the quality of the reconstructed (quantized) image), the optimal quantization step for each subband is the q_step that produces at most R bits in total (for the 4 subbands), where none of these bits decreases the distortion by an amount corresponding to a slope of the curve smaller than \lambda. See this notebook.
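The \lambda-based rule above can be sketched as a per-subband search for the last RD point whose local slope still exceeds \lambda (a toy illustration over hypothetical RD curves, not the referenced notebook's actual code):

```python
def allocate(rd_curves, lam):
    """For each subband, pick the RD point reached just before the local
    slope (distortion saved per extra bit) drops below lam.
    rd_curves: list of [(rate, distortion), ...] per subband, sorted by
    increasing rate / decreasing distortion (hypothetical input)."""
    choice = []
    for curve in rd_curves:
        best = curve[0]
        for (r0, d0), (r1, d1) in zip(curve, curve[1:]):
            slope = (d0 - d1) / (r1 - r0)   # distortion saved per extra bit
            if slope < lam:
                break                        # further bits are not worth it
            best = (r1, d1)
        choice.append(best)
    return choice
```

Sweeping lam from large to small then traces the optimal R/D frontier of the whole decomposition, since every subband stops spending bits at the same marginal return.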

(Week 4) Computation of the energy of the wavelet coefficients

Objectives

  1. Understand the importance of the wavelet coefficients in the quality of the reconstructions after the temporal synthesis.
  2. Know how to incorporate this information into a video codec.

Methodology

In groups, research the links provided below (and, in general, use all the information that you can find on the Internet) to understand how to determine the contribution of every wavelet coefficient to the energy of the reconstruction. For this, design an experiment where, for every coefficient X, X is set to 255 and the rest of the coefficients are set to 0, an inverse MCDWT transform is performed, and the energy of the reconstruction is quantified. Generate a sequence of decompositions where each "coefficient" represents such energy.
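The experiment above can be prototyped on a plain 2D transform before moving to MCDWT. The sketch below uses a one-level orthonormal Haar inverse as a stand-in (my own simplification: with an orthonormal transform every coefficient contributes the same energy, whereas the biorthogonal filters used in practice make the contributions differ, which is precisely what the experiment is meant to measure):

```python
import numpy as np

def haar_idwt2(LL, LH, HL, HH):
    """One-level inverse 2D Haar transform (orthonormal), as a stand-in
    for the inverse MCDWT step."""
    h, w = LL.shape
    out = np.zeros((2 * h, 2 * w))
    out[0::2, 0::2] = (LL + LH + HL + HH) / 2
    out[0::2, 1::2] = (LL - LH + HL - HH) / 2
    out[1::2, 0::2] = (LL + LH - HL - HH) / 2
    out[1::2, 1::2] = (LL - LH - HL + HH) / 2
    return out

def energy_map(shape):
    """Set a single coefficient to 255 in each subband, reconstruct,
    and measure the energy (sum of squares) of the reconstruction."""
    h, w = shape
    E = {}
    for name in ("LL", "LH", "HL", "HH"):
        subbands = {s: np.zeros((h, w)) for s in ("LL", "LH", "HL", "HH")}
        subbands[name][0, 0] = 255.0        # the isolated coefficient X
        rec = haar_idwt2(**subbands)
        E[name] = float(np.sum(rec ** 2))   # energy of the reconstruction
    return E
```

With Haar all four energies equal 255^2 (energy preservation); replacing the inverse transform with MCDWT's backward butterfly yields the per-coefficient weights the codec should exploit.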

Contents

  1. MCDWT white paper.
  2. MCDWT.
  3. MCTF-video-coding.
  4. J.C. Maturana-Espinosa, V. González-Ruiz, J.P. García-Ortiz, and D. Müller. Rate Allocation for Motion Compensated JPEG2000.
  5. Introduction to Data Compression.
  6. Video compression.
  7. SNR.

Activities

  1. Research on the Internet looking for orthogonality and orthonormality concepts in the Discrete Wavelet Transform. Research also about orthogonal and biorthogonal transforms and image compression.
  2. Implement a script to compute the contribution of each wavelet coefficient in the 2D-DWT using the method described before, depending on: (1) the number of spatial resolution levels and (2) the selected wavelet. Both should be provided as input parameters.
  3. Experimentally, compare the performance of two MCDWT decoders, one using the coefficient contributions and the other ignoring such information. To do that, quantize the coefficients (providing the quantization step as an input parameter), perform the corresponding reconstructions, and measure the differences with respect to the original video sequence (use SNR).
  4. Write a report (use preferably Markdown) describing your experiments and your conclusions.

(Week 2) The MCDWT video codec and its use

Objectives

  1. Include in the documentation of MCDWT (docs directory) a document explaining what MCDWT is and how it should be used to transform a video from scratch.
  2. Create a shell script (preferably for Bash) with the commands that do that.

Methodology

Revise the information provided below and write in Markdown a user manual explaining how to transform a video using MCDWT.

Contents

  1. Markdown manual.
  2. MCDWT white paper.
  3. MCDWT project.

Activities

  1. Download a video from here.
  2. Convert the video into a sequence of 16-bit PNG images using the notation videoname_???.png, where ??? is a 3-digit number starting at 000. Use the scripts extract_images.sh and add_offset.py as reference. See also test_sum_sub.sh. Notice that the pixel value 0 should be stored in the PNG files as the value 32768-128. Develop also the script(s) for reconstructing the original image sequence from the transformed one. Use only parameters in your script(s) and provide a helpful description of them. Parameters should control both (1) the number of spatial resolutions (levels in the Laplacian Pyramid) and (2) the number of temporal resolutions (which defines the GOP size). The number of transformed images should also be an input parameter.
  3. Using an ASCII editor, write your user manual (in Markdown) and your shell script(s), and perform a PR including all files.
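The 8-bit-to-16-bit offset mapping described in activity 2 might look like this in Python (a sketch; the actual add_offset.py may differ):

```python
import numpy as np

OFFSET = 32768 - 128   # pixel value 0 maps to 32640, per the issue text

def add_offset(frame_8bit):
    """Shift an 8-bit frame into the 16-bit range used by the PNG files."""
    return frame_8bit.astype(np.uint16) + OFFSET

def remove_offset(frame_16bit):
    """Inverse mapping, back to 8-bit pixel values."""
    return (frame_16bit - OFFSET).astype(np.uint8)
```

The offset keeps the (possibly negative) wavelet coefficients representable inside an unsigned 16-bit PNG sample.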

(Week 3) Adaptive motion compensation based on the distortion of the prediction error

Objectives

  1. Understand the importance of the generation of predictions in bidirectional motion compensation.
  2. Determine the increase in performance of the improved predictor.

Methodology

In groups, implement an image predictor based on the prediction error (see theory below). After incorporating it into MCDWT, determine its ability to decrease the energy of the residuals (prediction errors). Write a report of your work and present it in class.

Contents

  1. MCDWT white paper.
  2. MCDWT project.
  3. Introduction to Data Compression.
  4. Video compression.

Activities

  1. Modify MCDWT.py to incorporate the generation of predictions based on the distortion of the predictions (currently based on a simple average). You will need to implement Eq. 4 described here.
  2. Measure the variation of the entropy in the output of the transform (compare it to the current implementation, where the prediction is a simple average of the forward and backward predictions). In this case, it is preferable to have both predictors implemented (the simple average and the weighted average) and to choose the desired one through a (command-line) parameter.
  3. Write a report (use preferably Markdown) describing your experiments and your conclusions.
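Since the referenced Eq. 4 is not reproduced here, the following is only a plausible guess at a distortion-weighted predictor, where each prediction is weighted inversely to its error (the function name, weighting formula and parameters are assumptions, not the actual equation):

```python
import numpy as np

def weighted_prediction(fwd, bwd, err_fwd, err_bwd, eps=1e-3):
    """Blend forward and backward predictions, weighting each one
    inversely to its (local) prediction error.  Illustrative guess
    at the idea, not necessarily Eq. 4 of the referenced text."""
    w_fwd = 1.0 / (err_fwd + eps)
    w_bwd = 1.0 / (err_bwd + eps)
    return (w_fwd * fwd + w_bwd * bwd) / (w_fwd + w_bwd)
```

When both errors are equal this degenerates to the simple average currently implemented in MCDWT.py, which makes an A/B comparison via a command-line switch straightforward.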

(TFM) Subband prediction using TensorFlow

TensorFlow makes it possible to implement efficient artificial neural networks that find approximate solutions to complex computational problems, such as those related to machine learning. Recently, as can be found here, TensorFlow has been used to compute the Dense Optical Flow.

This issue proposes implementing a motion estimator for MCDWT based on the developments described above. The main goal is to minimize the running time of the estimation by leveraging the availability of GPU resources.

Evaluation of the JPEG image compressor

Objectives:

  1. Understand the operation of the JPEG image compressor.
  2. Implement a rate-control algorithm considering the contribution of the MCDWT subbands.
  3. Analyze the performance of the MCDWT+JPEG video compressor.

Procedure:

Create a Jupyter notebook to document the implemented rate-control algorithm for MCDWT+JPEG, and to describe the experiments performed to evaluate it. As in the previous issue, span your study over different:

  1. GOP sizes, and
  2. number of generated spatial resolutions,

starting both at 1. The notebook should be self-contained (all the information needed to repeat the experiments should be in it) and should run without any (human) interaction, except for the initial launch. Select the testing videos from Xiph.org Test Media with two different criteria:

  1. The amount of motion in the video.
  2. The spatial resolution of the video.

Considering "intra" pixels

When the prediction using the neighbor images is bad (something that is already known), a prediction using the low-resolution version of the predicted image should be taken into consideration.

(Week 1) Use of the Fork and Branch Git Workflow at GitHub

Objectives

  1. Learn how to use GitHub and the fork and branch Workflow.

Methodology

Head over to the links provided below.

Contents

  1. The Fork and Branch Git Workflow.
  2. git.
  3. git - the simple guide.

Activities

  1. Create a personal GitHub account.
  2. Fork the repository https://github.com/vicente-gonzalez-ruiz/fork_and_branch_git_workflow.
  3. Fork the repository https://github.com/Sistemas-Multimedia/MCDWT.
  4. Fork the repository https://github.com/Sistemas-Multimedia/Sistemas-Multimedia.github.io.
  5. Improve some aspects of at least one of the previous repositories (maybe in the available documentation) and perform a Pull Request (PR).

(Week 5) Visual evaluation of the effects of quantization

Objectives

  1. Understand the importance of quantization in image coding.
  2. Measure the visual impact of the quantization error in MCDWT.

Methodology

In groups, research the links provided below (and, in general, use all the information that you can find on the Internet) to see how quantization affects both the quality of the reconstructions and the compression ratio. Next, implement a subband quantizer that inputs an image and outputs a quantized image, given the quantization step as a parameter. Use this quantizer to degrade the quality of the output of MCDWT applied to a sequence of images, in two cases: (1) when only the H subbands are quantized, and (2) when all the subbands (L included) are quantized. Visually determine the impact of quantization in both cases.
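The "quantize only the H subbands vs. all subbands" experiment can be skeletonized as follows (an illustrative sketch; the subband names and the dict layout are assumptions, not the project's actual data structures):

```python
import numpy as np

def quantize(band, step):
    """Uniform scalar quantizer followed by dequantization (degradation)."""
    return np.round(band / step) * step

def degrade(subbands, step, include_L=False):
    """Quantize the H subbands of a decomposition; optionally the L one too.
    `subbands` is a dict like {"L": ..., "LH": ..., "HL": ..., "HH": ...}."""
    out = {}
    for name, band in subbands.items():
        if include_L or name != "L":
            out[name] = quantize(band, step)
        else:
            out[name] = band      # leave the L subband untouched
    return out
```

Running the inverse MCDWT on both degraded decompositions shows visually how much more sensitive the reconstruction is to quantization of the L subband.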

Contents

  1. MCDWT white paper.
  2. MCDWT.
  3. Introduction to Data Compression.
  4. Video compression.
  5. Quantization.

Activities

  1. Research on the Internet looking for quantization in image compression.
  2. Implement a script to quantize (PNG) images (implement a uniform scalar quantizer). The name of the file with the image and the quantization step should be provided as input parameters. The name of the quantized image is the only output.
  3. Compute an MCDWT of T temporal levels and K spatial levels.
  4. Quantize the H subbands.
  5. Compute the inverse MCDWT and visualize the reconstruction.
  6. Compare (visually) the reconstruction and the original sequence.
  7. Repeat this procedure for several T and K values.
  8. Repeat the procedure when all the subbands (L and H) are quantized.
  9. Write a report (use preferably Markdown) describing your experiments and your conclusions.

Evaluation of each coding path in MCDWT

The different alternatives at each stage of MCDWT (ME, entropy coding) should be tested, in any combination. Implement a set of experiments to evaluate the different combinations.

Evaluation of the PNG image compressor

Objectives:

  1. Understand the operation of the MCDWT video codec.
  2. Understand the operation of the PNG image compressor.
  3. Analyze the performance of the MCDWT+PNG video compressor.

Procedure:

Create a Jupyter notebook, documenting the compression ratio obtained when the subbands generated by MCDWT are compressed with the PNG image compression format (notice that this is the default image compressor already used by MCDWT), for different:

  1. GOP sizes, and
  2. number of generated spatial resolutions,

starting both at 1. The notebook should be self-contained (all the information needed to repeat the experiments should be in it) and should run without any (human) interaction, except for the initial launch. Select the testing videos from Xiph.org Test Media with two different criteria:

  1. The amount of motion in the video.
  2. The spatial resolution of the video.
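The per-frame compression ratio can be approximated without writing files by running PNG's entropy coder (zlib) over the raw samples (an approximation only; the figures reported in the notebook should come from the actual .png files):

```python
import zlib
import numpy as np

def compression_ratio(frame):
    """Approximate the PNG compression ratio of a frame by compressing
    its raw bytes with zlib (the DEFLATE coder PNG uses internally)."""
    raw = frame.tobytes()
    compressed = zlib.compress(raw, level=9)
    return len(raw) / len(compressed)
```

Real PNG files add row filtering and chunk overhead, so this slightly over-estimates the ratio for small or highly structured frames.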

Bit Allocation in MCDWT

The procedure for determining the optimal quantization steps for a single decomposition can be extrapolated to the quantization of a sequence of images transformed by the MDWT and the MCDWT. In this case (always supposing that only 2 spatial resolution levels are available, which is the same as saying that the MDWT has been applied only once), instead of considering 4 subbands in the optimization, you must consider all the subbands of the optimized GOP, which are (except for the first GOP) 4*2^T, where T is the number of iterations of the MCDWT.

For example, in the figure:

 GOP 0       GOP 1
------- ---------------
 img 0           img 2
+--+--+         +--+--+
|LL|LH|         |LL|LH|
+--+--+         +--+--+
|HL|HH|         |HL|HH|
+--+--+         +--+--+
   a               c
         img 1
        +--+--+
        |LL|LH|
        +--+--+
        |HL|HH|
        +--+--+
           b

there are 3 images, to which the MDWT and the one-iteration (T=1) MCDWT have been applied (a, b and c are the aliases of the images in the forward MCDWT butterfly). As can be seen, there are 4*2^1 = 8 subbands in GOP 1 (the first GOP always has only one image). So, the quantization step optimizer should consider 8 RD curves (instead of only 4) in GOP 1.
