
COIN++

Official implementation of COIN++: Neural Compression Across Modalities.

Requirements

The requirements can be found in requirements.txt. While it is possible to run most of the code without wandb, we strongly recommend using it for experiment logging and storage, as it is tightly integrated with the codebase.

Data

Before running experiments, make sure to set the data paths in data/dataset_paths.yml. Most datasets can be downloaded automatically, except for FastMRI, which requires an application form, and ERA5, which can be downloaded here. For the FastMRI dataset, we use the brain_multicoil_val.zip file and split it into train and test sets using the ids in data/fastmri_split_ids.py.
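The paths file is presumably a simple mapping from dataset names to local directories. As a hypothetical sketch of how it might be read (the actual key names used by the codebase may differ):

import yaml

# Hypothetical sketch: load dataset root directories from the YAML config.
# Key names are illustrative, e.g. {"cifar10": "/data/cifar10", "era5": "/data/era5"}.
with open("data/dataset_paths.yml") as f:
    dataset_paths = yaml.safe_load(f)
print(dataset_paths)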

Training

To train a model, run

python main.py @config.txt

See config.txt and main.py for setting various arguments. Note that if using wandb, you need to change the wandb entity and project name to your own.

A few example configs used to train the models in the paper can be found in the configs folder.
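The @config.txt syntax matches argparse's standard file-prefix mechanism: each line of the file is read as if it had been typed on the command line. A minimal sketch of how that works (the argument names below are hypothetical, not the ones defined in main.py):

import argparse

# Minimal sketch of argparse's fromfile support, which is what makes
# "python main.py @config.txt" work. Argument names are hypothetical.
parser = argparse.ArgumentParser(fromfile_prefix_chars="@")
parser.add_argument("--num_epochs", type=int, default=1)
parser.add_argument("--learning_rate", type=float, default=3e-4)
# config.txt would then contain one token per line, e.g. "--num_epochs" followed by "50".
args = parser.parse_args()
print(args)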

Storing modulations

Given the wandb_run_path from a trained model, store modulations using

python store_modulations.py --wandb_run_path <wandb_run_path>
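Conceptually, each modulation is a small vector fitted so that the shared, meta-learned network reconstructs one data point; the script above handles this using the trained run pulled from wandb. Purely as a hypothetical illustration of that inner-loop fitting (model, coords and features are stand-ins, not the repository's classes):

import torch

# Hypothetical sketch: fit a modulation vector for a single data point by
# gradient descent on the reconstruction error of a modulated network.
def fit_modulation(model, coords, features, latent_dim, steps=3, lr=1e-2):
    modulation = torch.zeros(latent_dim, requires_grad=True)
    for _ in range(steps):
        loss = torch.mean((model(coords, modulation) - features) ** 2)
        (grad,) = torch.autograd.grad(loss, modulation)
        modulation = (modulation - lr * grad).detach().requires_grad_(True)
    return modulation.detach()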

Evaluation

To evaluate the performance of a given modulation dataset (in terms of PSNR), run

python evaluate.py --wandb_run_path <wandb_run_path> --modulation_dataset <modulation_dataset>
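The reported PSNR is the standard peak signal-to-noise ratio; for data scaled to [0, 1] it reduces to -10 * log10(MSE). A self-contained sketch of the metric (not evaluate.py itself):

import torch

def psnr(reconstruction, target):
    # Peak signal-to-noise ratio for tensors scaled to [0, 1]: -10 * log10(MSE).
    mse = torch.mean((reconstruction - target) ** 2)
    return -10.0 * torch.log10(mse)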

Quantization

To quantize a modulation dataset to a given bitwidth, run

python quantization.py --wandb_run_path <wandb_run_path> --train_mod_dataset <train_mod_dataset> --test_mod_dataset <test_mod_dataset> --num_bits 5
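As a rough illustration of the idea (not necessarily the exact scheme in quantization.py), uniform quantization maps each value to one of 2**num_bits evenly spaced levels between the minimum and maximum of the data:

import torch

def uniform_quantize(x, num_bits=5):
    # Hypothetical sketch: map values to 2**num_bits evenly spaced levels and
    # return integer codes plus the information needed to dequantize.
    levels = 2 ** num_bits - 1
    x_min, x_max = x.min(), x.max()
    scale = torch.clamp((x_max - x_min) / levels, min=1e-12)
    codes = torch.round((x - x_min) / scale).clamp(0, levels).long()
    return codes, x_min, scale

def dequantize(codes, x_min, scale):
    return codes.float() * scale + x_min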

Entropy coding

To entropy code a quantized modulation dataset, run

python entropy_coding.py --wandb_run_path <wandb_run_path> --train_mod_dataset <train_mod_dataset> --test_mod_dataset <test_mod_dataset>
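Entropy coding compresses the quantized integer codes towards their empirical entropy. As a sketch, the achievable rate in bits per modulation value can be estimated from the symbol frequencies (this is only a bound, not necessarily what entropy_coding.py computes):

import torch

def estimated_bits_per_symbol(codes):
    # Empirical entropy of the quantized codes: a lower bound on the rate
    # achievable by a lossless entropy coder.
    _, counts = torch.unique(codes, return_counts=True)
    probs = counts.float() / counts.sum()
    return -(probs * torch.log2(probs)).sum()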

Saving reconstructions

To save reconstructions for a specific set of data points, run

python reconstruction.py --wandb_run_path <wandb_run_path> --modulation_dataset <modulation_dataset> --data_indices 0 1 2 3

Trained models and modulations [Not yet public ⚠️]

The trained models, runs and modulations are not yet public as we need to share wandb runs from a private project (see this github issue). We hope to make this public soon!

All models and modulations are stored on wandb. To find the link for a given model or run, see the wandb_ids.json files in the appropriate folder in the results directory. The model and run information can then be found at wandb.ai/<wandb_id>.

Results and plots

To recreate all the plots in the paper, run

python plots.py

See plots.py for plotting options. All results and ablations can be found in the results folder.

Baselines

Running the baselines requires that all codecs are installed on your machine. The baseline scripts also require tqdm and PIL.

Image baselines

The image baselines used for CIFAR10, Kodak, FastMRI and ERA5 are:

  • JPEG: We use the JPEG implementation from PIL version 8.1.0 (see the sketch after this list).
  • JPEG2000: We use the JPEG2000 implementation from OpenJPEG version 2.4.0.
  • BPG: We use BPG version 0.9.8.
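As a minimal illustration of how such a codec baseline can be evaluated with PIL (the quality value and helper name here are illustrative, not the repository's baseline script):

import io
from PIL import Image

def jpeg_bits_per_pixel(image_path, quality=50):
    # Encode an image as JPEG in memory and report the resulting bits per pixel.
    image = Image.open(image_path).convert("RGB")
    buffer = io.BytesIO()
    image.save(buffer, format="JPEG", quality=quality)
    return 8 * buffer.getbuffer().nbytes / (image.width * image.height)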

Audio baselines

The audio baseline used for LibriSpeech is:

  • MP3: We use the MP3 implementation from LAME version 3.100.

License

MIT
