Hello,
I've been working on replicating some results from your paper, using the commands provided in the README together with a few code modifications. However, I am seeing discrepancies in the results, particularly in the MIA-Efficacy values. Below I detail the steps taken and the issues encountered.
Steps and Code Used:
- Initial pruning of the model was done with:
```shell
python -u main_imp.py --data ./data --dataset cifar10 --arch resnet18 --prune_type rewind_lt --rewind_epoch 8 --save_dir omp --rate 0.95 --pruning_times 2 --num_workers 8
```
- I modified `arg_parser.py` with the following additions:
```python
parser.add_argument(
    "--num_indexes_to_replace",
    type=int,
    default=None,
    help="Number of data to forget",
)
parser.add_argument(
    "--class_to_replace", type=int, default=None, help="Specific class to forget"
)
parser.add_argument(
    "--indexes_to_replace",
    type=list,
    default=None,
    help="Specific index data to forget",
)
```
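One thing I noticed while adding these arguments: `type=list` makes argparse call `list()` on the raw string, so a value like `123` becomes a list of single characters rather than a list of indexes. If `--indexes_to_replace` is ever meant to be passed on the command line, `nargs="+"` with `type=int` may be the intended pattern (a sketch of my own, not code from the repo):

```python
import argparse

parser = argparse.ArgumentParser()
# type=list would split the raw string into characters ("123" -> ['1','2','3']);
# nargs="+" with type=int collects whitespace-separated integers instead.
parser.add_argument(
    "--indexes_to_replace",
    type=int,
    nargs="+",
    default=None,
    help="Specific index data to forget",
)

args = parser.parse_args(["--indexes_to_replace", "10", "20", "30"])
print(args.indexes_to_replace)  # [10, 20, 30]
```

In my runs this did not matter, since the indexes are sampled inside the script rather than passed on the command line.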
When `class_to_replace` is set to `None`, a random selection of indexes, equal in number to `num_indexes_to_replace`, is chosen for the unlearning process:
```python
args = arg_parser.parse_args()
if args.seed:
    utils.setup_seed(args.seed)
if args.class_to_replace is None:
    if args.dataset == "cifar10":
        num_indexes_to_replace = args.num_indexes_to_replace
        if args.indexes_to_replace is None:
            args.indexes_to_replace = np.random.choice(
                45000, num_indexes_to_replace, replace=False
            )
```
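One detail that may matter for reproducibility: `setup_seed` is only called when `args.seed` is truthy, and the forget indexes are drawn from the global NumPy state, so runs without a fixed non-zero seed will sample a different forget set each time. A minimal sketch of the behavior I am assuming (using `np.random.default_rng` in place of the repo's `utils.setup_seed`):

```python
import numpy as np

def pick_forget_indexes(seed, num_indexes_to_replace, train_size=45000):
    # Mirrors the selection above: fix the seed, then sample without replacement.
    rng = np.random.default_rng(seed)
    return rng.choice(train_size, num_indexes_to_replace, replace=False)

a = pick_forget_indexes(seed=2, num_indexes_to_replace=4500)
b = pick_forget_indexes(seed=2, num_indexes_to_replace=4500)
c = pick_forget_indexes(seed=3, num_indexes_to_replace=4500)
print((a == b).all())   # same seed -> identical forget set
print((a == c).all())   # different seed -> (almost surely) a different set
```

If the paper's numbers were averaged over several seeds, a single run with an arbitrary forget set could plausibly land outside the reported mean ± std.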
- I used the following command to run unlearning via retraining:
```shell
python -u main_forget.py --save_dir omp --mask omp/1model_SA_best.pth.tar --unlearn retrain --num_indexes_to_replace 4500 --unlearn_epochs 160 --unlearn_lr 0.1
```
As shown in Figure 1, the results obtained under the 95%-sparse model are: UA = 6.78, RA = 99.99, TA = 92.77.
![image](https://private-user-images.githubusercontent.com/78678361/293517205-49944f66-0735-44c4-8be2-4c9d47dcb7fe.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTYwNDMyMzIsIm5iZiI6MTcxNjA0MjkzMiwicGF0aCI6Ii83ODY3ODM2MS8yOTM1MTcyMDUtNDk5NDRmNjYtMDczNS00NGM0LThiZTItNGM5ZDQ3ZGNiN2ZlLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA1MTglMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNTE4VDE0MzUzMlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWM2MmQ2ZjFkOTEyZjNlNzcyNTVkMTRjMGZhOTBkOWJiMmJkZDYyZjRlZTAwMDA3MDNlNDBkYjZjYmZiODJhMDMmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.JQaPUfETtl9R_9Z1LRH92BFJiBFGNAFG0gVt1R7wDoQ)
Figure 1
I would like to ask which value in `SVC_MIA_forget_efficacy` represents MIA-Efficacy. Is it the `confidence` value, which closely matches the one reported in the paper, or the average of these values?
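For context on how I am interpreting the metric: my assumption is that MIA-Efficacy is the fraction of forget-set samples that a membership attacker, trained to separate training (member) from test (non-member) confidences, classifies as non-members. A toy sketch with a simple threshold attacker standing in for the SVC (entirely my own construction, not the repo's code):

```python
import numpy as np

def mia_efficacy(member_conf, nonmember_conf, forget_conf):
    """Fraction of forget samples the attacker labels as NON-members.

    A midpoint threshold stands in for the SVC: confidences below the
    threshold "look like" held-out data. Higher efficacy suggests the
    forgotten samples no longer resemble training data to the attacker.
    """
    threshold = (member_conf.mean() + nonmember_conf.mean()) / 2.0
    return float((forget_conf < threshold).mean())

rng = np.random.default_rng(0)
member = rng.normal(0.95, 0.02, 1000)     # high confidence on retained data
nonmember = rng.normal(0.70, 0.10, 1000)  # lower confidence on test data
forgotten = rng.normal(0.72, 0.10, 1000)  # forgotten data now resembles test data
print(mia_efficacy(member, nonmember, forgotten))
```

If this reading is wrong (e.g. if the reported number is instead the `confidence` entry alone, or a mean over correctness/confidence/entropy), that alone could explain part of my discrepancy.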
- I used the following command to get the unlearning result for the Dense model:
```shell
python -u main_forget.py --save_dir omp_dense --mask omp_dense/0model_SA_best.pth.tar --unlearn retrain --num_indexes_to_replace 4500 --unlearn_epochs 160 --unlearn_lr 0.1
```
As shown in Figure 2, the results under the Dense model are: UA = 4.9, RA = 99.52, TA = 94.62.
![image](https://private-user-images.githubusercontent.com/78678361/293517338-6ea3ebd8-4513-41a5-abbd-399a389cf7f4.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTYwNDMyMzIsIm5iZiI6MTcxNjA0MjkzMiwicGF0aCI6Ii83ODY3ODM2MS8yOTM1MTczMzgtNmVhM2ViZDgtNDUxMy00MWE1LWFiYmQtMzk5YTM4OWNmN2Y0LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA1MTglMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNTE4VDE0MzUzMlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWVjN2Q3OGVlNzA4NTIxZTg5YmQ0MmRiMDBmMzY0NzY5N2Q5NTk5Y2U2MTM3MzQ2NTIwMzRhMDZmMTBkMzI4NjQmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.cTQFepo9e9zXfP8ld9kKv4zYSkd7sD2ratp78BBGkmg)
Figure 2
- The above results are reasonably close to those reported in the paper. However, I also ran GA separately on both the 95%-sparse model and the Dense model, using the following commands:

Sparse model:
```shell
python -u main_forget.py --save_dir omp --mask omp/1model_SA_best.pth.tar --unlearn GA --num_indexes_to_replace 4500 --unlearn_lr 0.0001 --unlearn_epochs 5
```
Dense model:
```shell
python -u main_forget.py --save_dir omp_dense --mask omp_dense/0model_SA_best.pth.tar --unlearn GA --num_indexes_to_replace 4500 --unlearn_lr 0.0001 --unlearn_epochs 5
```
As shown in Figure 3, the results for the 95%-sparse model are: UA = 0.62, RA = 99.39, TA = 94.23. The UA value differs significantly from the 5.62±0.46 reported in the paper. Additionally, the MIA-Efficacy, whether taken as the average or as any individual value, shows a considerable discrepancy from the reported 11.76±0.52.
![image](https://private-user-images.githubusercontent.com/78678361/293517430-18b70837-5341-4405-90d0-cb7a9501d5d9.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTYwNDMyMzIsIm5iZiI6MTcxNjA0MjkzMiwicGF0aCI6Ii83ODY3ODM2MS8yOTM1MTc0MzAtMThiNzA4MzctNTM0MS00NDA1LTkwZDAtY2I3YTk1MDFkNWQ5LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA1MTglMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNTE4VDE0MzUzMlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWI5MmE5N2Y2YWIwODFlZWMxMDM1ZjVjNDA0ZTQ2MTVjZjg2ZTFkMDkzZGY3ZmVlNmYxY2Q1NzVhNmY3NDdmMGYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.U_lrb3js1meSr3UU6pTe6H_r5wY1rJM5Smws3veaXdg)
Figure 3 95%-sparse model result
As illustrated in Figure 4, the results for the Dense model are: UA = 0.78, RA = 99.52, TA = 94.52. The UA value differs significantly from the 7.54±0.29 reported in the paper. Moreover, the average MIA-Efficacy is 8.5, which deviates slightly from the 10.04±0.31 reported in the paper.
![image](https://private-user-images.githubusercontent.com/78678361/293517578-70323db9-deac-4505-ac0e-52ae95ee9496.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTYwNDMyMzIsIm5iZiI6MTcxNjA0MjkzMiwicGF0aCI6Ii83ODY3ODM2MS8yOTM1MTc1NzgtNzAzMjNkYjktZGVhYy00NTA1LWFjMGUtNTJhZTk1ZWU5NDk2LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA1MTglMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNTE4VDE0MzUzMlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWY3Y2RmM2RlODAzNzlkMTUyOThiOTQ0OTU2ODA0NDNkNzNmOGZjZDNjNDc4N2U0MWQ5NWRjM2VmNGVhYmNhY2EmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.m6CKQVGNDLWl29QKV8b25XX_YYnwHXi84LlFu3OVX9Q)
Figure 4 Dense Model result
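One hypothesis for the GA gap: gradient ascent is very sensitive to the learning rate and number of epochs, since it simply steps along the gradient of the forget-set loss instead of against it. A toy sketch of that update on a linear model (my own illustration of the principle, not the repo's GA implementation):

```python
import numpy as np

def ga_step(w, x_forget, y_forget, lr):
    """One gradient-ascent unlearning step on a linear least-squares model.

    GA maximizes the forget-set loss, so it moves ALONG the gradient:
    w <- w + lr * dL/dw, where L = 0.5 * mean squared error.
    """
    grad = x_forget.T @ (x_forget @ w - y_forget) / len(y_forget)
    return w + lr * grad

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 5))
w_true = rng.normal(size=5)
y = X @ w_true
w = w_true + 0.01 * rng.normal(size=5)  # a model that fits the forget set well
loss_before = 0.5 * np.mean((X @ w - y) ** 2)
for _ in range(5):
    w = ga_step(w, X, y, lr=0.1)
loss_after = 0.5 * np.mean((X @ w - y) ** 2)
print(loss_before, loss_after)  # the forget-set loss grows with each step
```

With a larger `lr` or more steps the forget-set loss (and hence UA) climbs much faster, so small changes to `--unlearn_lr` / `--unlearn_epochs` relative to the paper's settings could plausibly move UA from near 0 to several percent.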
- While running FF on the 95%-sparse model, since the specific value of alpha was not given, we set it to 10^(-8) based on the description in the 'Additional training details of MU' section of the paper:
```shell
python -u main_forget.py --save_dir omp --mask omp/1model_SA_best.pth.tar --unlearn fisher_new --num_indexes_to_replace 4500 --alpha 0.00000001
```
The result is shown in Figure 5.
![image](https://private-user-images.githubusercontent.com/78678361/293517634-2b20b0a0-1dea-49b7-8883-487a338dea1c.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTYwNDMyMzIsIm5iZiI6MTcxNjA0MjkzMiwicGF0aCI6Ii83ODY3ODM2MS8yOTM1MTc2MzQtMmIyMGIwYTAtMWRlYS00OWI3LTg4ODMtNDg3YTMzOGRlYTFjLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA1MTglMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNTE4VDE0MzUzMlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTc4YmJiNWYyNTg5M2JlNWIwNTVmODc1MmUyY2ZlNjc4OGQ1MzJkZWJlMDhhMDllOTJmNDUyMTkxYmI4NTM2NjUmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.R6aV2MqLz8VB0C9QN5uSFgsFi-25yLylWWko-svxAsE)
Figure 5
The results in Figure 5 exhibit some discrepancies compared to those in Figure 6 from the paper.
![image](https://private-user-images.githubusercontent.com/78678361/293517698-186ddb2d-23f6-49b5-a7b8-43bc4ac338b8.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTYwNDMyMzIsIm5iZiI6MTcxNjA0MjkzMiwicGF0aCI6Ii83ODY3ODM2MS8yOTM1MTc2OTgtMTg2ZGRiMmQtMjNmNi00OWI1LWE3YjgtNDNiYzRhYzMzOGI4LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA1MTglMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNTE4VDE0MzUzMlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTM0NzQyYjFhNWE4NGE5OWM5ODU4ODFlNmQ0NmJmYmU5Nzc2ZWViODk4NjJiOWI4MzE2ZTRlMWMwOTdlMjI3MzQmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.yibwo7u1hNo9q29u5mTERqg9BhUvjG9e5b9e9Q_ZV40)
Figure 6
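Regarding alpha: if FF follows the usual Fisher-forgetting recipe (noise added to the weights, scaled by alpha and the inverse diagonal Fisher information; this is my assumption about `fisher_new`, not a reading of its code), then the results are extremely sensitive to alpha, which could explain part of the gap. A toy sketch:

```python
import numpy as np

def fisher_forget(w, fisher_diag, alpha, rng):
    """Toy Fisher forgetting: perturb weights with Fisher-scaled noise.

    Parameters with large Fisher information (important to retained data)
    receive little noise; unimportant ones are scrubbed harder. `alpha`
    sets the overall noise scale, so all metrics move with it.
    """
    std = np.sqrt(alpha / (fisher_diag + 1e-12))
    return w + rng.normal(size=w.shape) * std

rng = np.random.default_rng(0)
w = rng.normal(size=1000)
fisher = rng.uniform(1e-4, 1.0, size=1000)
w_small = fisher_forget(w, fisher, alpha=1e-8, rng=rng)
w_large = fisher_forget(w, fisher, alpha=1e-4, rng=rng)
print(np.abs(w_small - w).mean(), np.abs(w_large - w).mean())
```

This is why I would appreciate knowing the exact alpha used for each sparsity level.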
Questions:
- What are the specific parameter settings for each unlearning method?
- How is MIA-Efficacy calculated in SVC_MIA_forget_efficacy?
- What could be the reasons for the discrepancies in replicating the results?
Thank you for your time and help.
Best regards,
David