DUCK: Distance-based Unlearning via Centroid Kinematics

Marco Cotogni, Jacopo Bonato, Luigi Sabetta, Francesco Pelosin and Alessandro Nicolosi

The paper is available on arXiv (arXiv:2312.02052).

Overview

DUCK is a machine unlearning algorithm designed to enhance privacy in modern artificial intelligence models. Using metric learning, DUCK efficiently removes the residual influence of specific data subsets from the knowledge a neural model acquired during training.

Features

  • Distance-based Unlearning: DUCK employs distance metrics to guide the removal of samples matching the nearest incorrect centroid in the embedding space.

  • Versatile Performance: Evaluated across various benchmark datasets, DUCK demonstrates exceptional performance in class removal and homogeneous sampling removal scenarios.

  • Adaptive Unlearning Score (AUS): A novel metric that measures the efficacy of unlearning and also quantifies the performance loss relative to the original model.

  • Membership Inference Attack: DUCK includes a novel membership inference attack to assess its capacity to erase previously acquired knowledge, adaptable to future methodologies.
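
To make the distance-based idea concrete, the toy sketch below computes one centroid per class and, for a sample to be forgotten, selects the closest centroid belonging to a different class. It is an illustration in plain Python, not the repository's implementation: the function names and data are hypothetical, and the use of Euclidean distance is an assumption (the paper defines the actual metric and loss).

```python
import math
from collections import defaultdict


def class_centroids(embeddings, labels):
    """Return {label: mean embedding} computed over the given samples."""
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(embeddings, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in total] for lab, total in sums.items()}


def nearest_incorrect_centroid(vec, true_label, centroids):
    """Return the label of the closest centroid whose class differs from true_label."""
    candidates = {lab: c for lab, c in centroids.items() if lab != true_label}
    # Euclidean distance is an assumption made for this sketch.
    return min(candidates, key=lambda lab: math.dist(vec, candidates[lab]))


# Toy 2-D embeddings for three classes (illustrative values only).
embeddings = [[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.2, 1.0], [5.0, 5.0], [5.2, 5.0]]
labels = [0, 0, 1, 1, 2, 2]
centroids = class_centroids(embeddings, labels)

# A forget-set sample of class 0 is matched to the closest centroid of another class.
target = nearest_incorrect_centroid([0.1, 0.0], 0, centroids)
print("nearest incorrect class:", target)
```

In an unlearning loop, the selected centroid would serve as the target that the forget sample's embedding is pushed towards; how that is done in DUCK is described in the paper.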

Getting Started

Installation

# Clone the repository
git clone https://github.com/OcraM17/DUCK

# Navigate to the project directory
cd DUCK

# Installation WITH DOCKER

#Step 1:

#Build the docker image from the Dockerfile : 
docker build -f Dockerfile -t duck:1.0 . 

#Step 2:

#Run your image : 
docker run -it --gpus all -v "/path_to_dataset_folder":/root/data -v "/path_to_duck_folder":/duck duck:1.0 /bin/bash

# Install LOCALLY 
pip install -r requirements.txt

Code Execution

  • Step 1: Run 'python3 training_original.py' to train the original model. Choose the dataset in opts.
  • Step 2: Run 'python3 sample_fgt_samples.py' to sample the forget samples for the HR scenario.
  • Step 3: Run 'python3 training_oracle.py' to train the retrained model. Choose the dataset in opts. Use --mode CR for class removal and --mode HR for homogeneous removal.

If you already have the original and retrained models, you can skip the steps above.

If you plan to run the HR scenario, you must execute the script sample_fgt_samples.py before launching any unlearning method, so that the indices of the samples to be forgotten are generated.
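
The kind of index generation this step performs can be sketched as follows. This is not the content of sample_fgt_samples.py: the function name, the 10% forget fraction, the dataset size and the seed are all assumptions chosen for illustration, and the sketch simply draws indices uniformly at random from the whole dataset.

```python
import random


def sample_forget_indices(num_samples, fraction, seed=0):
    """Pick a random subset of dataset indices to forget; the rest are retained."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    k = int(num_samples * fraction)
    forget = sorted(rng.sample(range(num_samples), k))
    forget_set = set(forget)
    retain = [i for i in range(num_samples) if i not in forget_set]
    return forget, retain


# Hypothetical values: 50,000 training samples, 10% of them to be forgotten.
forget_idx, retain_idx = sample_forget_indices(num_samples=50_000, fraction=0.1, seed=42)
print(len(forget_idx), len(retain_idx))
```

Fixing the seed matters because the unlearned model and the retrained model must be evaluated against the same forget set.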

The project should be organized as follows:

/project-root
  ├── src
  │   ├── models
  │   ├── forget_id_files
  │   ├── MIA_CODE
  │   ├── out
  │   │   ├── CR
  │   │   └── HR
  │   ├── weights
  │   │   ├── chks_cifar10
  │   │   ├── chks_cifar100
  │   │   ├── chks_tiny
  │   │   └── chks_vgg
  │   └── *files*.py
  ├── Dockerfile
  ├── requirements.txt  
  ├── README.md
  ├── LICENSE
  └── .gitignore
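
A small helper along these lines could check a checkout against the layout above and create whatever folders are missing. It is a convenience sketch only: the function names are hypothetical and nothing like it is known to ship with the repository. The demonstration runs in a temporary directory so that no real checkout is modified.

```python
from pathlib import Path
import tempfile

# Sub-folders listed in the tree above, relative to the project root.
EXPECTED_DIRS = [
    "src/models",
    "src/forget_id_files",
    "src/MIA_CODE",
    "src/out/CR",
    "src/out/HR",
    "src/weights/chks_cifar10",
    "src/weights/chks_cifar100",
    "src/weights/chks_tiny",
    "src/weights/chks_vgg",
]


def missing_dirs(project_root):
    """Return the expected sub-folders that do not exist under project_root."""
    root = Path(project_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


def create_missing_dirs(project_root):
    """Create any expected sub-folder that is missing and return what was created."""
    created = missing_dirs(project_root)
    for d in created:
        (Path(project_root) / d).mkdir(parents=True, exist_ok=True)
    return created


# Demonstration in a throwaway directory, so nothing in a real checkout is touched.
with tempfile.TemporaryDirectory() as tmp:
    created = create_missing_dirs(tmp)
    still_missing = missing_dirs(tmp)

print(f"created {len(created)} folders, {len(still_missing)} still missing")
```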

For reproducing the experiments:

cd src

python main_def.py --run_name <run_name> --dataset <dataset> --mode <mode> --cuda <cuda> --load_unlearned_model --save_model --save_df --push_results --run_original --run_unlearn --run_rt_model --num_workers <num_workers> --method <method> --model <model> --bsize <bsize> --wd <wd> --momentum <momentum> --lr <lr> --epochs <epochs> --scheduler <scheduler> --temperature <temperature> --lambda_1 <lambda_1> --lambda_2 <lambda_2>

Configuration Options:

--run_name: Name of the run (default: "test").
--dataset: Dataset for the experiment (default: "cifar100").
--mode: Mode for the experiment (default: "CR").
--cuda: Select zero-indexed CUDA device. Use -1 to run on CPU (default: 0).
--load_unlearned_model: Load a pre-trained unlearned model.
--save_model: Save the trained model.
--save_df: Save the experiment results as a DataFrame.
--push_results: Push the results to Google Docs.
--run_original: Run the original model.
--run_unlearn: Run the unlearned model.
--run_rt_model: Run the retrained model.
--num_workers: Number of workers for data loading (default: 4).
--method: Method for unlearning (default: "DUCK").
--model: Model architecture (default: 'resnet18').
--bsize: Batch size (default: 256).
--wd: Weight decay (default: 0.0).
--momentum: Momentum for SGD optimizer (default: 0.9).
--lr: Learning rate (default: 0.0004).
--epochs: Number of epochs (default: 200).
--scheduler: Learning rate scheduler milestones (default: [25, 40]).
--temperature: Temperature for unlearning algorithm (default: 2).
--lambda_1: Lambda 1 hyperparameter (default: 1).
--lambda_2: Lambda 2 hyperparameter (default: 1.4).

All the hyperparameters are reported in the Supplementary Material of the paper.
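
For readers who want to script around these options, the documented flags and defaults can be mirrored in an argparse parser as sketched below. This is a reconstruction from the list above, not the repository's own option-parsing code: the argument types, and the treatment of the value-less flags as boolean switches, are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options and their defaults."""
    p = argparse.ArgumentParser(description="Sketch of the documented DUCK options")
    p.add_argument("--run_name", type=str, default="test")
    p.add_argument("--dataset", type=str, default="cifar100")
    p.add_argument("--mode", type=str, default="CR")
    p.add_argument("--cuda", type=int, default=0, help="-1 runs on CPU")
    # Flags documented without a value are assumed to be boolean switches.
    for flag in ("load_unlearned_model", "save_model", "save_df", "push_results",
                 "run_original", "run_unlearn", "run_rt_model"):
        p.add_argument(f"--{flag}", action="store_true")
    p.add_argument("--num_workers", type=int, default=4)
    p.add_argument("--method", type=str, default="DUCK")
    p.add_argument("--model", type=str, default="resnet18")
    p.add_argument("--bsize", type=int, default=256)
    p.add_argument("--wd", type=float, default=0.0)
    p.add_argument("--momentum", type=float, default=0.9)
    p.add_argument("--lr", type=float, default=0.0004)
    p.add_argument("--epochs", type=int, default=200)
    p.add_argument("--scheduler", type=int, nargs="+", default=[25, 40])
    p.add_argument("--temperature", type=float, default=2)
    p.add_argument("--lambda_1", type=float, default=1)
    p.add_argument("--lambda_2", type=float, default=1.4)
    return p


# Parse a sample command line; unspecified options fall back to the defaults.
args = build_parser().parse_args(["--dataset", "cifar10", "--mode", "HR", "--run_unlearn"])
print(args.dataset, args.mode, args.lr, args.scheduler, args.run_unlearn)
```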

Example: CIFAR-10 in the HR scenario

python3 main_def.py --run_name cifar10_HR --dataset cifar10 --mode HR --cuda 0 --save_model --save_df --run_unlearn  --num_workers 4 --method DUCK --model resnet18 --bsize 1024 --lr 0.001 --epochs 10  --temperature 2 --lambda_1 1 --lambda_2 1.4

Results

[Figure: Time]

Citation

If you find our paper or our code useful for your research, please cite:

@misc{cotogni2023duck,
      title={DUCK: Distance-based Unlearning via Centroid Kinematics}, 
      author={Marco Cotogni and Jacopo Bonato and Luigi Sabetta and Francesco Pelosin and Alessandro Nicolosi},
      year={2023},
      eprint={2312.02052},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
