Memorization-Dilation: Modeling Neural Collapse Under Noise

This repository contains an implementation of "Memorization-Dilation: Modeling Neural Collapse Under Noise", published at ICLR 2023. Cite this work as follows:

@inproceedings{DBLP:conf/iclr/NguyenLLHK23,
  author       = {Duc Anh Nguyen and
                  Ron Levie and
                  Julian Lienen and
                  Eyke H{\"{u}}llermeier and
                  Gitta Kutyniok},
  title        = {Memorization-Dilation: Modeling Neural Collapse Under Noise},
  booktitle    = {The Eleventh International Conference on Learning Representations,
                  {ICLR} 2023, Kigali, Rwanda, May 1-5, 2023},
  publisher    = {OpenReview.net},
  year         = {2023}
}

Foreword

As mentioned in the paper, our implementation builds on top of the official implementation of [1], namely this repository. We therefore re-used large parts of their code and gratefully thank the authors for their work.

Requirements

To install all required packages, you need to run

pip install -r requirements.txt

The code has been tested using Python 3.6, 3.8 and 3.9 on Ubuntu 18.* and Ubuntu 20.* systems. We trained our models on machines with Nvidia GPUs (we tested CUDA 10.1, 11.1 and 11.6). We recommend using Python virtual environments to get a clean Python environment for the execution, free of dependency problems.
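
For example, a clean environment can be created with the standard venv module before installing the requirements (the directory name .venv is only a placeholder):

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt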

Configuration

In the file config/config.ini, one can find parameters that are used throughout the project. These include path-related specifications (e.g., where to store results), logging properties, and resource constraints for hyperparameter optimization runs.
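
As a purely illustrative sketch (the section and key names below are hypothetical; the actual schema is defined in config/config.ini), such a file might be structured as follows:

[PATHS]
; hypothetical key: directory in which results are stored
result_dir = ./results

[LOGGING]
; hypothetical key: logging verbosity
log_level = INFO

[RESOURCES]
; hypothetical keys: resource limits for hyperparameter optimization runs
cpus_per_trial = 4
gpus_per_trial = 1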

Datasets

All datasets are downloaded automatically; there is no need to store them explicitly. All datasets are publicly available and will be stored in ~/data/ by default. This default location can be changed in args.py through the argument --data_dir.
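
For instance, to store the datasets in a custom location, the flag can be appended to any training command (the path below is only an example):

python train_simple_model.py --dataset svhn --data_dir ~/my_datasets ...

where the remaining arguments follow the training commands below.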

Training and Evaluation

Training (Conventional Noise Experiments)

To train simple models for the conventional noise setting as presented in the paper, one can simply call:

python train_simple_model.py --loss LS --model MLP --dataset svhn --batch_size 16 --weight_decay 0.001 --depth 9 --width 2048 --label_noise 0.2 --seed 42 --classes 2 --epochs 200 --use_bn --act_fn relu

Note that passing --help as an argument provides a comprehensive overview of all hyperparameters.
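
For example:

python train_simple_model.py --help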

For hyperparameter optimization, train_simple_model_ho.py provides the implementation of a Skopt hyperparameter optimization (based on Ray) as described in the paper.
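
A hedged sketch of such a call (assuming the script shares the dataset- and model-related arguments of train_simple_model.py, which is an assumption rather than a verified interface):

python train_simple_model_ho.py --loss LS --model MLP --dataset svhn --classes 2 --seed 42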

Training (Latent Noise Class Experiments)

For the latent noise class experiments as shown in the appendix, one has to add the parameter --fourclass_problem to the previously introduced command. By default, this works for binary classification problems. Moreover, --fc_noise_degree allows for specifying the noise degree, i.e., the fraction of class instances assigned to the new latent class.
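
For example, building on the training command above (the value passed to --fc_noise_degree is only illustrative):

python train_simple_model.py --loss LS --model MLP --dataset svhn --batch_size 16 --weight_decay 0.001 --depth 9 --width 2048 --label_noise 0.2 --seed 42 --classes 2 --epochs 200 --use_bn --act_fn relu --fourclass_problem --fc_noise_degree 0.2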

Training (Large-Scale Experiments)

To execute the large-scale experiments as included in the appendix, one can call the model training routine as described in the original repository, which is provided in train_model.py. To conduct the hyperparameter optimization runs, one has to call train_model_ho.py; the search space parameters can be specified therein.
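
A hedged sketch of such a run (the flags below mirror those of the evaluation command further down and are assumptions about train_model.py, not a verified interface; the dataset identifier is only an example):

python train_model.py --gpu_id 0 --dataset cifar10 --batch_size 256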

Training (Misc)

Furthermore, we support training models as in the original repository.

Evaluation (Conventional and Latent Noise Class Experiments)

To evaluate simple models as used in the memorization-dilation experiments, one can call, for example:

python validate_simple_model.py --loss LS --model MLP --dataset svhn --batch_size 16 --weight_decay 0.001 --depth 9 --width 2048 --label_noise 0.2 --seed 42 --classes 2 --epochs 200 --use_bn --act_fn relu

While the conventional label noise setting requires no specific parameter to be set, the latent noise class experiments again require the parameter --fourclass_problem, as in the training script.

The resulting neural collapse (NC) metrics and further information are then stored in the file info.pkl in the run directory. As before, we also provide all implementations as in the original repository.
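
As a minimal sketch for inspecting these results in Python (the run directory path is hypothetical, and the exact contents of info.pkl depend on the evaluation script):

import pickle

# Hypothetical run directory; replace with the directory of your evaluation run.
with open("runs/example_run/info.pkl", "rb") as f:
    info = pickle.load(f)

# Inspect the stored NC metrics and further information
# (assuming a dictionary-like object; adjust otherwise).
if isinstance(info, dict):
    for key, value in info.items():
        print(key, value)
else:
    print(info)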

Evaluation (Large-Scale Experiments)

The evaluation of the large-scale experiments can be carried out via

python validate_NC.py --gpu_id 0 --dataset <identifier> --batch_size 256 --load_path <path to the uid name>

with the corresponding parameters of the training run.

References

[1]: Zhu et al. A Geometric Analysis of Neural Collapse with Unconstrained Features. arXiv:2105.02375, 2021.
