
unlearn-sparse's Issues

Reproducing the result of Dense Network with FT

First of all, I like your work. It is impressive.

I'm trying to reproduce your results for the Dense Network with the FT unlearning method, but I wasn't able to do so with the default batch size. I could reproduce them with a batch size of 64 for random data forgetting and a batch size of 512 for class-wise forgetting. Are these the batch sizes you used, or did I do something wrong?

Thank You.

Reproducing the result of cifar100

Hello, first and foremost, I would like to express my gratitude for the work you have done. Thank you.
I was trying to reproduce the result of cifar100 in the appendix using retrain and FT. But for some reason, I couldn't reproduce the result of unlearning by FT.

I tested cifar100 over 6 independent trials. As you can see, FT has quite a low forget-efficacy score for class-wise forgetting on both the Sparse (pruned) and Dense (unpruned) models, and for random data forgetting on the Sparse model.
image

The hyperparameters of these experiments are the same as for cifar10.

for seed in 1 2 3 4 5 6
do
    python -u main_forget.py --save_dir ${save_dir}/random/${seed}_retrain --mask ${base_dir}/${seed}/base/0model_SA_best.pth.tar --unlearn retrain --unlearn_epochs 160 --unlearn_lr 0.1 --dataset $data --class_to_replace -1 --num_indexes_to_replace 4500 --seed $seed
    python -u main_forget.py --save_dir ${save_dir}/random/${seed}_FT_only --mask ${base_dir}/${seed}/base/0model_SA_best.pth.tar --unlearn FT --unlearn_lr 0.04 --unlearn_epochs 10 --dataset $data --class_to_replace -1 --num_indexes_to_replace 4500 --seed $seed
    
    python -u main_forget.py --save_dir ${save_dir}/class/${seed}_retrain --mask ${base_dir}/${seed}/base/0model_SA_best.pth.tar --unlearn retrain --unlearn_epochs 160 --unlearn_lr 0.1 --dataset $data --seed $seed
    python -u main_forget.py --save_dir ${save_dir}/class/${seed}_FT_only --mask ${base_dir}/${seed}/base/0model_SA_best.pth.tar --unlearn FT --unlearn_lr 0.01 --unlearn_epochs 10 --dataset $data --seed $seed

    python -u main_forget.py --save_dir ${save_dir2}/random/${seed}_retrain --mask ${base_dir}/${seed}/base/1model_SA_best.pth.tar --unlearn retrain --unlearn_epochs 160 --unlearn_lr 0.1 --dataset $data --class_to_replace -1 --num_indexes_to_replace 4500 --seed $seed
    python -u main_forget.py --save_dir ${save_dir2}/random/${seed}_FT_only --mask ${base_dir}/${seed}/base/1model_SA_best.pth.tar --unlearn FT --unlearn_lr 0.04 --unlearn_epochs 10 --dataset $data --class_to_replace -1 --num_indexes_to_replace 4500 --seed $seed
    
    python -u main_forget.py --save_dir ${save_dir2}/class/${seed}_retrain --mask ${base_dir}/${seed}/base/1model_SA_best.pth.tar --unlearn retrain --unlearn_epochs 160 --unlearn_lr 0.1 --dataset $data --seed $seed
    python -u main_forget.py --save_dir ${save_dir2}/class/${seed}_FT_only --mask ${base_dir}/${seed}/base/1model_SA_best.pth.tar --unlearn FT --unlearn_lr 0.01 --unlearn_epochs 10 --dataset $data --seed $seed
done

#4
I'm aware that for cifar10 you used different learning rates for class-wise forgetting (0.01) and random data forgetting (0.04). Are the learning rates for cifar100 different?

Thank you.
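When comparing six independent trials against a table that reports values as mean±std, it can help to summarize the per-seed numbers the same way. The sketch below is only an illustration: the scores are placeholder values rather than real results, `summarize` is a hypothetical helper, and using the sample standard deviation (rather than the population one) is an assumption about how the reported numbers were aggregated.

```python
import statistics


def summarize(values):
    """Return (mean, sample standard deviation) of per-seed metric values."""
    return statistics.mean(values), statistics.stdev(values)


# Placeholder forget-efficacy scores for seeds 1..6 (NOT real results).
per_seed_scores = [10.0, 12.0, 11.0, 13.0, 9.0, 11.0]

mean, std = summarize(per_seed_scores)
print(f"{mean:.2f}±{std:.2f}")
```

If the spread across seeds is much larger than the reported std, that points to a hyperparameter mismatch rather than ordinary seed noise.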

Challenges in Replicating Model Pruning and Unlearning Results

Hello,

I've been working on replicating some results from your paper using the provided commands and code modifications in the README. However, I am encountering some discrepancies in the results, particularly with the MIA-Efficacy values. Below, I detail the steps taken and the issues encountered.

Steps and Code Used:

  1. Initial pruning of the model was done using the command:
python -u main_imp.py --data ./data --dataset cifar10 --arch resnet18 --prune_type rewind_lt --rewind_epoch 8 --save_dir omp --rate 0.95 --pruning_times 2 --num_workers 8
  2. I modified arg_parser.py with the following additions:
parser.add_argument(
    "--num_indexes_to_replace",
    type=int,
    default=None,
    help="Number of data to forget",
)
parser.add_argument(
    "--class_to_replace", type=int, default=None, help="Specific class to forget"
)
parser.add_argument(
    "--indexes_to_replace",
    type=list,
    default=None,
    help="Specific index data to forget",
)

When class_to_replace is None, num_indexes_to_replace indexes are chosen at random (without replacement) as the data to unlearn:

args = arg_parser.parse_args()
if args.seed:
    utils.setup_seed(args.seed)
if args.class_to_replace is None:
    if args.dataset == "cifar10":
        num_indexes_to_replace = args.num_indexes_to_replace
        if args.indexes_to_replace is None:
            args.indexes_to_replace = np.random.choice(
                45000, num_indexes_to_replace, replace=False
            )
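The selection step above can be sketched in isolation as follows. This is a minimal stand-in, not the repository's code: `pick_forget_indexes` is a hypothetical helper, and it uses the standard-library `random` module in place of `np.random.choice`. It also checks `seed is not None`-style semantics through `random.Random(seed)`, whereas a truthiness test such as `if args.seed:` would skip seeding when the seed is 0.

```python
import random


def pick_forget_indexes(num_train, num_to_forget, seed=None):
    """Draw num_to_forget distinct training indexes without replacement."""
    if num_to_forget > num_train:
        raise ValueError("cannot forget more samples than the training set holds")
    # A dedicated generator keeps the draw reproducible, including for seed=0.
    rng = random.Random(seed)
    return sorted(rng.sample(range(num_train), num_to_forget))


# 45000 training samples and 4500 to forget, as in the commands in this issue.
forget = pick_forget_indexes(45000, 4500, seed=2)
retain_count = 45000 - len(forget)
print(len(forget), retain_count)
```

Fixing the forget set per seed matters for reproduction, because UA and MIA-Efficacy are both measured on exactly these samples.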
  3. I used the following command for unlearning:
python -u main_forget.py --save_dir omp --mask omp/1model_SA_best.pth.tar --unlearn retrain --num_indexes_to_replace 4500 --unlearn_epochs 160 --unlearn_lr 0.1

As shown in Figure 1, the results obtained under the 95%-sparse model are calculated as follows: UA=6.78, RA=99.99, TA=92.77.
image
Figure 1

I would like to inquire which value in SVC_MIA_forget_efficacy represents MIA-Efficacy. Is it the confidence value that closely matches the one mentioned in the paper, or is it the average of these values?

  4. I used the following command to get the unlearning result of the Dense model.
python -u main_forget.py --save_dir omp_dense --mask omp_dense/0model_SA_best.pth.tar --unlearn retrain --num_indexes_to_replace 4500 --unlearn_epochs 160 --unlearn_lr 0.1

As shown in Figure 2, the results under the Dense model are calculated as follows: UA=4.9, RA=99.52, TA=94.62.
image
Figure 2

  5. The above results are relatively close to those reported in the paper. However, I also tested GA separately on the 95%-sparse model and the Dense Model using the following commands:
    Sparse model command:
python -u main_forget.py --save_dir omp --mask omp/1model_SA_best.pth.tar --unlearn GA --num_indexes_to_replace 4500 --unlearn_lr 0.0001 --unlearn_epochs 5

Dense Model command:

python -u main_forget.py --save_dir omp_dense --mask omp_dense/0model_SA_best.pth.tar --unlearn GA --num_indexes_to_replace 4500 --unlearn_lr 0.0001 --unlearn_epochs 5

As shown in Figure 3, the results for the 95%-sparse model are calculated as follows: UA=0.62, RA=99.39, TA=94.23. The UA value differs significantly from the 5.62±0.46 reported in the paper. Additionally, the MIA-Efficacy, whether it's the average or a specific value, shows a considerable discrepancy from the reported 11.76±0.52.
image
Figure 3 95%-sparse model result

As illustrated in Figure 4, the results for the Dense Model are calculated as follows: UA=0.78, RA=99.52, TA=94.52. The UA value shows a significant difference from the 7.54±0.29 mentioned in the paper. Moreover, the average MIA-Efficacy is 8.5, which slightly deviates from the 10.04±0.31 reported in the paper.
image
Figure 4 Dense Model result

  6. When running FF on the 95%-sparse model, the specific value for alpha was not known, so we set it to 10^(-8) based on the description in the 'Additional training details of MU' section of the paper. The result is shown in Figure 5.
python -u main_forget.py --save_dir omp --mask omp/1model_SA_best.pth.tar --unlearn fisher_new --num_indexes_to_replace 4500 --alpha 0.00000001

image
Figure 5
As shown in Figure 5, the results exhibit some discrepancies compared to the results shown in Figure 6 from the paper.

image
Figure 6

Questions:

  1. What are the specific parameter settings for each unlearning method?
  2. How is MIA-Efficacy calculated in SVC_MIA_forget_efficacy?
  3. What could be the reasons for the discrepancies in replicating the results?

Thank you for your time and help.

Best regards,
David
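For readers with the same question about MIA-Efficacy, here is a rough sketch of one way such a score can be computed. It assumes that MIA-Efficacy means the percentage of forget samples that a membership-inference attack classifies as non-members; that reading, and everything in the code, is an assumption and not taken from SVC_MIA_forget_efficacy. The attack here is a simple confidence threshold (swapped in for a trained SVC classifier), the function names are hypothetical, and the confidence values are made up.

```python
def fit_threshold(member_conf, nonmember_conf):
    """Pick the confidence threshold that best separates members from non-members."""
    candidates = sorted(set(member_conf) | set(nonmember_conf))
    total = len(member_conf) + len(nonmember_conf)
    best_t, best_acc = candidates[0], -1.0
    for t in candidates:
        # Attack rule: predict "member" when confidence >= t.
        correct = sum(c >= t for c in member_conf) + sum(c < t for c in nonmember_conf)
        acc = correct / total
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t


def mia_efficacy(forget_conf, threshold):
    """Percentage of forget samples predicted as non-members."""
    non_members = sum(c < threshold for c in forget_conf)
    return 100.0 * non_members / len(forget_conf)


# Made-up model confidences on the true label (NOT real results).
retain_conf = [0.99, 0.98, 0.97, 0.95]   # known training members
test_conf = [0.60, 0.70, 0.96, 0.55]     # known non-members
forget_conf = [0.99, 0.65, 0.80, 0.97]   # samples that should be forgotten

threshold = fit_threshold(retain_conf, test_conf)
score = mia_efficacy(forget_conf, threshold)
print(threshold, score)
```

Under this reading, a higher score means more forget samples look like unseen data to the attacker. Swapping the confidence feature for entropy or correctness would give a different number, which is one possible source of the several values printed for one run.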

Difference in commands for running OMP and IMP Pruning

There seems to be no difference between the commands for OMP and IMP pruning given in the README, except that the OMP command takes rate as a variable argument and the IMP command takes pruning_times as a variable argument:

OMP

python -u main_imp.py --data ./data --dataset $data --arch $arch --prune_type rewind_lt --rewind_epoch 8 --save_dir ${save_dir} --rate ${rate} --pruning_times 2 --num_workers 8

IMP

python -u main_imp.py --data ./data --dataset $data --arch $arch --prune_type rewind_lt --rewind_epoch 8 --save_dir ${save_dir} --rate 0.2 --pruning_times ${pruning_times} --num_workers 8

Is this a mistake? Which method does the command run by default? I am not able to determine that by looking at the code. I would like to run OMP pruning. What would be the correct command to do so?
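One way to see how the same script could cover both schemes is to look at the sparsity each setting would produce. The sketch below assumes that every pruning round removes a fraction `rate` of the weights still remaining, so the remaining ratio after k rounds is (1 - rate)^k. That is the usual iterative magnitude pruning arithmetic, not something confirmed from main_imp.py, and both function names are hypothetical.

```python
import math


def remaining_ratio(rate, rounds):
    """Fraction of weights left after `rounds` prunings at per-round `rate`."""
    return (1.0 - rate) ** rounds


def rounds_needed(rate, target_remaining):
    """Smallest number of rounds that leaves at most `target_remaining` weights."""
    return math.ceil(math.log(target_remaining) / math.log(1.0 - rate))


# One-shot style: a single round at rate 0.95 leaves about 5% of the weights.
print(round(remaining_ratio(0.95, 1), 4))

# Iterative style: rounds at rate 0.2 needed to reach the same 5% budget.
print(rounds_needed(0.2, 0.05))
```

Under that assumption, one round at a high rate behaves like one-shot pruning, while many rounds at rate 0.2 behave like iterative pruning, which would explain why only `--rate` and `--pruning_times` differ between the two commands.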

cannot find the mask model

Hello @ljcc0930, I'm very happy to read your excellent work. Could you tell me which file is the mask model?
When I run retrain after pruning with

python -u main_forget.py --save_dir "./cifar10_results" \
    --mask "./cifar10_results/1model_SA_best.pth.tar" \
    --unlearn retrain --num_indexes_to_replace 4500 --unlearn_epochs 160 \
    --unlearn_lr 0.1 2>&1 | tee -a ./logs/Retrain.log

I get this error:

setup random seed = 2
4500
setup random seed = 2
40500
Pruning with custom mask (all conv layers)
* remain weight ratio =  50.0 %
Traceback (most recent call last):
  File "/home/rram/anaconda3/lib/python3.9/site-packages/torch/serialization.py", line 348, in _check_seekable
    f.seek(f.tell())
AttributeError: 'NoneType' object has no attribute 'seek'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/machine_unlearning/examples/Unlearn-Sparse/main_forget.py", line 248, in <module>
    main()
  File "/machine_unlearning/examples/Unlearn-Sparse/main_forget.py", line 151, in main
    unlearn_method(unlearn_data_loaders, model, criterion, args)
  File "/machine_unlearning/examples/Unlearn-Sparse/unlearn/impl.py", line 63, in _wrapped
    initialization = torch.load(
  File "/anaconda3/lib/python3.9/site-packages/torch/serialization.py", line 771, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/anaconda3/lib/python3.9/site-packages/torch/serialization.py", line 275, in _open_file_like
    return _open_buffer_reader(name_or_buffer)
  File "/anaconda3/lib/python3.9/site-packages/torch/serialization.py", line 260, in __init__
    _check_seekable(buffer)
  File "/anaconda3/lib/python3.9/site-packages/torch/serialization.py", line 351, in _check_seekable
    raise_err_msg(["seek", "tell"], e)
  File "/anaconda3/lib/python3.9/site-packages/torch/serialization.py", line 344, in raise_err_msg
    raise type(e)(msg)
AttributeError: 'NoneType' object has no attribute 'seek'. You can only torch.load from a file that is seekable. Please pre-load the data into a buffer like io.BytesIO and try to load from it instead.

I have the trained files from running your pruning code:
image

Can you help me? I'm looking forward to your reply~
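The final message in the traceback says that torch.load was handed None instead of a file path or file object; the traceback alone does not show which command-line argument was left unset. A small guard like the one below turns that into a clearer error before loading is attempted. It is only a sketch: `require_checkpoint` and the flag name passed to it are hypothetical and not part of the repository.

```python
import os


def require_checkpoint(path, flag_name):
    """Return `path` if it names an existing file, else raise a descriptive error."""
    if path is None:
        raise ValueError(f"{flag_name} was not set, so there is no checkpoint to load")
    if not os.path.isfile(path):
        raise FileNotFoundError(f"{flag_name} points to a missing file: {path}")
    return path


# Example: an unset path is reported by name instead of failing inside the loader.
try:
    require_checkpoint(None, "--example_checkpoint_flag")
except ValueError as err:
    print(err)
```

Calling such a check on every path argument that is later passed to torch.load would show directly which one is missing.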

Mistake in code?

Hello, is there a problem in line 259 of dataset.py? It looks as if indexes_to_replace is treated as an int there, although it is supposed to be a list. Was the or clause meant to be added to line 256 with num_indexes_to_replace instead? Thanks!
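A related point on the argument side: the `--indexes_to_replace` definition quoted in an earlier issue on this page uses `type=list`, and argparse applies `type` to the raw string, so a value such as "42" becomes a list of characters. The sketch below shows that behavior next to one possible alternative using `nargs="+"` with `type=int`. It is an illustration with standalone parsers, not a patch to the repository's arg_parser.py, and it says nothing about what line 259 of dataset.py actually does.

```python
import argparse

# Behavior of type=list: the string is split into single characters.
as_list = argparse.ArgumentParser()
as_list.add_argument("--indexes_to_replace", type=list, default=None)
chars = as_list.parse_args(["--indexes_to_replace", "42"]).indexes_to_replace
print(chars)  # ['4', '2']

# One alternative: accept one or more integers and collect them in a list.
as_ints = argparse.ArgumentParser()
as_ints.add_argument(
    "--indexes_to_replace",
    type=int,
    nargs="+",
    default=None,
    help="Specific index data to forget",
)
indexes = as_ints.parse_args(["--indexes_to_replace", "3", "17", "42"]).indexes_to_replace
print(indexes)  # [3, 17, 42]
```

With the second form, downstream code can rely on receiving either None or a list of ints, which removes the int-versus-list ambiguity at the source.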
