Hello,
I've been working on replicating some results from your paper, using the commands provided in the README together with a few code modifications. However, I am seeing discrepancies in the results, particularly in the MIA-Efficacy values. Below I detail the steps taken and the issues encountered.
Steps and Code Used:
- Initial pruning of the model was done with:
```shell
python -u main_imp.py --data ./data --dataset cifar10 --arch resnet18 --prune_type rewind_lt --rewind_epoch 8 --save_dir omp --rate 0.95 --pruning_times 2 --num_workers 8
```
- I modified `arg_parser.py` with the following additions:
```python
parser.add_argument(
    "--num_indexes_to_replace",
    type=int,
    default=None,
    help="Number of data to forget",
)
parser.add_argument(
    "--class_to_replace", type=int, default=None, help="Specific class to forget"
)
parser.add_argument(
    "--indexes_to_replace",
    type=list,
    default=None,
    help="Specific index data to forget",
)
```
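One thing I noticed while adding these arguments: `type=list` makes argparse call `list()` on the raw string, so a value like `123` becomes a list of single characters rather than a list of indexes. If `--indexes_to_replace` is ever meant to be passed on the command line, `nargs="+"` with `type=int` may be the intended pattern (a sketch of my own, not code from the repo):

```python
import argparse

parser = argparse.ArgumentParser()
# type=list would split the raw string into characters ("123" -> ['1','2','3']);
# nargs="+" with type=int collects whitespace-separated integers instead.
parser.add_argument(
    "--indexes_to_replace",
    type=int,
    nargs="+",
    default=None,
    help="Specific index data to forget",
)

args = parser.parse_args(["--indexes_to_replace", "10", "20", "30"])
print(args.indexes_to_replace)  # [10, 20, 30]
```

In my runs this did not matter, since the indexes are sampled inside the script rather than passed on the command line.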
When `class_to_replace` is set to `None`, a random selection of indexes, equal in number to `num_indexes_to_replace`, is chosen for the unlearning process:
```python
args = arg_parser.parse_args()
if args.seed:
    utils.setup_seed(args.seed)
if args.class_to_replace is None:
    if args.dataset == "cifar10":
        num_indexes_to_replace = args.num_indexes_to_replace
        if args.indexes_to_replace is None:
            args.indexes_to_replace = np.random.choice(
                45000, num_indexes_to_replace, replace=False
            )
```
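One detail that may matter for reproducibility: `setup_seed` is only called when `args.seed` is truthy, and the forget indexes are drawn from the global NumPy state, so runs without a fixed non-zero seed will sample a different forget set each time. A minimal sketch of the behavior I am assuming (using `np.random.default_rng` in place of the repo's `utils.setup_seed`):

```python
import numpy as np

def pick_forget_indexes(seed, num_indexes_to_replace, train_size=45000):
    # Mirrors the selection above: fix the seed, then sample without replacement.
    rng = np.random.default_rng(seed)
    return rng.choice(train_size, num_indexes_to_replace, replace=False)

a = pick_forget_indexes(seed=2, num_indexes_to_replace=4500)
b = pick_forget_indexes(seed=2, num_indexes_to_replace=4500)
c = pick_forget_indexes(seed=3, num_indexes_to_replace=4500)
print((a == b).all())   # same seed -> identical forget set
print((a == c).all())   # different seed -> (almost surely) a different set
```

If the paper's numbers were averaged over several seeds, a single run with an arbitrary forget set could plausibly land outside the reported mean ± std.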
- I used the following command to run unlearning via retraining:
```shell
python -u main_forget.py --save_dir omp --mask omp/1model_SA_best.pth.tar --unlearn retrain --num_indexes_to_replace 4500 --unlearn_epochs 160 --unlearn_lr 0.1
```
As shown in Figure 1, the results obtained under the 95%-sparse model are: UA = 6.78, RA = 99.99, TA = 92.77.
![image](https://private-user-images.githubusercontent.com/78678361/293517205-49944f66-0735-44c4-8be2-4c9d47dcb7fe.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTYwNDMyMzIsIm5iZiI6MTcxNjA0MjkzMiwicGF0aCI6Ii83ODY3ODM2MS8yOTM1MTcyMDUtNDk5NDRmNjYtMDczNS00NGM0LThiZTItNGM5ZDQ3ZGNiN2ZlLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA1MTglMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNTE4VDE0MzUzMlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWM2MmQ2ZjFkOTEyZjNlNzcyNTVkMTRjMGZhOTBkOWJiMmJkZDYyZjRlZTAwMDA3MDNlNDBkYjZjYmZiODJhMDMmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.JQaPUfETtl9R_9Z1LRH92BFJiBFGNAFG0gVt1R7wDoQ)
Figure 1
I would like to ask which value in `SVC_MIA_forget_efficacy` represents MIA-Efficacy. Is it the `confidence` value, which closely matches the one reported in the paper, or the average of these values?
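For context on how I am interpreting the metric: my assumption is that MIA-Efficacy is the fraction of forget-set samples that a membership attacker, trained to separate training (member) from test (non-member) confidences, classifies as non-members. A toy sketch with a simple threshold attacker standing in for the SVC (entirely my own construction, not the repo's code):

```python
import numpy as np

def mia_efficacy(member_conf, nonmember_conf, forget_conf):
    """Fraction of forget samples the attacker labels as NON-members.

    A midpoint threshold stands in for the SVC: confidences below the
    threshold "look like" held-out data. Higher efficacy suggests the
    forgotten samples no longer resemble training data to the attacker.
    """
    threshold = (member_conf.mean() + nonmember_conf.mean()) / 2.0
    return float((forget_conf < threshold).mean())

rng = np.random.default_rng(0)
member = rng.normal(0.95, 0.02, 1000)     # high confidence on retained data
nonmember = rng.normal(0.70, 0.10, 1000)  # lower confidence on test data
forgotten = rng.normal(0.72, 0.10, 1000)  # forgotten data now resembles test data
print(mia_efficacy(member, nonmember, forgotten))
```

If this reading is wrong (e.g. if the reported number is instead the `confidence` entry alone, or a mean over correctness/confidence/entropy), that alone could explain part of my discrepancy.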
- I used the following command to get the unlearning result for the Dense model:
```shell
python -u main_forget.py --save_dir omp_dense --mask omp_dense/0model_SA_best.pth.tar --unlearn retrain --num_indexes_to_replace 4500 --unlearn_epochs 160 --unlearn_lr 0.1
```
As shown in Figure 2, the results under the Dense model are: UA = 4.9, RA = 99.52, TA = 94.62.
![image](https://private-user-images.githubusercontent.com/78678361/293517338-6ea3ebd8-4513-41a5-abbd-399a389cf7f4.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTYwNDMyMzIsIm5iZiI6MTcxNjA0MjkzMiwicGF0aCI6Ii83ODY3ODM2MS8yOTM1MTczMzgtNmVhM2ViZDgtNDUxMy00MWE1LWFiYmQtMzk5YTM4OWNmN2Y0LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA1MTglMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNTE4VDE0MzUzMlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWVjN2Q3OGVlNzA4NTIxZTg5YmQ0MmRiMDBmMzY0NzY5N2Q5NTk5Y2U2MTM3MzQ2NTIwMzRhMDZmMTBkMzI4NjQmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.cTQFepo9e9zXfP8ld9kKv4zYSkd7sD2ratp78BBGkmg)
Figure 2
- The above results are reasonably close to those reported in the paper. However, I also ran GA separately on both the 95%-sparse model and the Dense model, using the following commands:

Sparse model:
```shell
python -u main_forget.py --save_dir omp --mask omp/1model_SA_best.pth.tar --unlearn GA --num_indexes_to_replace 4500 --unlearn_lr 0.0001 --unlearn_epochs 5
```
Dense model:
```shell
python -u main_forget.py --save_dir omp_dense --mask omp_dense/0model_SA_best.pth.tar --unlearn GA --num_indexes_to_replace 4500 --unlearn_lr 0.0001 --unlearn_epochs 5
```
As shown in Figure 3, the results for the 95%-sparse model are: UA = 0.62, RA = 99.39, TA = 94.23. The UA value differs significantly from the 5.62±0.46 reported in the paper. Additionally, the MIA-Efficacy, whether taken as the average or as any individual value, shows a considerable discrepancy from the reported 11.76±0.52.
![image](https://private-user-images.githubusercontent.com/78678361/293517430-18b70837-5341-4405-90d0-cb7a9501d5d9.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTYwNDMyMzIsIm5iZiI6MTcxNjA0MjkzMiwicGF0aCI6Ii83ODY3ODM2MS8yOTM1MTc0MzAtMThiNzA4MzctNTM0MS00NDA1LTkwZDAtY2I3YTk1MDFkNWQ5LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA1MTglMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNTE4VDE0MzUzMlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWI5MmE5N2Y2YWIwODFlZWMxMDM1ZjVjNDA0ZTQ2MTVjZjg2ZTFkMDkzZGY3ZmVlNmYxY2Q1NzVhNmY3NDdmMGYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.U_lrb3js1meSr3UU6pTe6H_r5wY1rJM5Smws3veaXdg)
Figure 3 95%-sparse model result
As illustrated in Figure 4, the results for the Dense model are: UA = 0.78, RA = 99.52, TA = 94.52. The UA value differs significantly from the 7.54±0.29 reported in the paper. Moreover, the average MIA-Efficacy is 8.5, which deviates slightly from the 10.04±0.31 reported in the paper.
![image](https://private-user-images.githubusercontent.com/78678361/293517578-70323db9-deac-4505-ac0e-52ae95ee9496.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTYwNDMyMzIsIm5iZiI6MTcxNjA0MjkzMiwicGF0aCI6Ii83ODY3ODM2MS8yOTM1MTc1NzgtNzAzMjNkYjktZGVhYy00NTA1LWFjMGUtNTJhZTk1ZWU5NDk2LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA1MTglMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNTE4VDE0MzUzMlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWY3Y2RmM2RlODAzNzlkMTUyOThiOTQ0OTU2ODA0NDNkNzNmOGZjZDNjNDc4N2U0MWQ5NWRjM2VmNGVhYmNhY2EmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.m6CKQVGNDLWl29QKV8b25XX_YYnwHXi84LlFu3OVX9Q)
Figure 4 Dense Model result
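One hypothesis for the GA gap: gradient ascent is very sensitive to the learning rate and number of epochs, since it simply steps along the gradient of the forget-set loss instead of against it. A toy sketch of that update on a linear model (my own illustration of the principle, not the repo's GA implementation):

```python
import numpy as np

def ga_step(w, x_forget, y_forget, lr):
    """One gradient-ascent unlearning step on a linear least-squares model.

    GA maximizes the forget-set loss, so it moves ALONG the gradient:
    w <- w + lr * dL/dw, where L = 0.5 * mean squared error.
    """
    grad = x_forget.T @ (x_forget @ w - y_forget) / len(y_forget)
    return w + lr * grad

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 5))
w_true = rng.normal(size=5)
y = X @ w_true
w = w_true + 0.01 * rng.normal(size=5)  # a model that fits the forget set well
loss_before = 0.5 * np.mean((X @ w - y) ** 2)
for _ in range(5):
    w = ga_step(w, X, y, lr=0.1)
loss_after = 0.5 * np.mean((X @ w - y) ** 2)
print(loss_before, loss_after)  # the forget-set loss grows with each step
```

With a larger `lr` or more steps the forget-set loss (and hence UA) climbs much faster, so small changes to `--unlearn_lr` / `--unlearn_epochs` relative to the paper's settings could plausibly move UA from near 0 to several percent.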
- While running FF on the 95%-sparse model, since the specific value of alpha was not given, we set it to 10^(-8) based on the description in the 'Additional training details of MU' section of the paper:
```shell
python -u main_forget.py --save_dir omp --mask omp/1model_SA_best.pth.tar --unlearn fisher_new --num_indexes_to_replace 4500 --alpha 0.00000001
```
The result is shown in Figure 5.
![image](https://private-user-images.githubusercontent.com/78678361/293517634-2b20b0a0-1dea-49b7-8883-487a338dea1c.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTYwNDMyMzIsIm5iZiI6MTcxNjA0MjkzMiwicGF0aCI6Ii83ODY3ODM2MS8yOTM1MTc2MzQtMmIyMGIwYTAtMWRlYS00OWI3LTg4ODMtNDg3YTMzOGRlYTFjLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA1MTglMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNTE4VDE0MzUzMlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTc4YmJiNWYyNTg5M2JlNWIwNTVmODc1MmUyY2ZlNjc4OGQ1MzJkZWJlMDhhMDllOTJmNDUyMTkxYmI4NTM2NjUmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.R6aV2MqLz8VB0C9QN5uSFgsFi-25yLylWWko-svxAsE)
Figure 5
The results in Figure 5 exhibit some discrepancies compared to those in Figure 6 from the paper.
![image](https://private-user-images.githubusercontent.com/78678361/293517698-186ddb2d-23f6-49b5-a7b8-43bc4ac338b8.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTYwNDMyMzIsIm5iZiI6MTcxNjA0MjkzMiwicGF0aCI6Ii83ODY3ODM2MS8yOTM1MTc2OTgtMTg2ZGRiMmQtMjNmNi00OWI1LWE3YjgtNDNiYzRhYzMzOGI4LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA1MTglMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNTE4VDE0MzUzMlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTM0NzQyYjFhNWE4NGE5OWM5ODU4ODFlNmQ0NmJmYmU5Nzc2ZWViODk4NjJiOWI4MzE2ZTRlMWMwOTdlMjI3MzQmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.yibwo7u1hNo9q29u5mTERqg9BhUvjG9e5b9e9Q_ZV40)
Figure 6
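Regarding alpha: if FF follows the usual Fisher-forgetting recipe (noise added to the weights, scaled by alpha and the inverse diagonal Fisher information; this is my assumption about `fisher_new`, not a reading of its code), then the results are extremely sensitive to alpha, which could explain part of the gap. A toy sketch:

```python
import numpy as np

def fisher_forget(w, fisher_diag, alpha, rng):
    """Toy Fisher forgetting: perturb weights with Fisher-scaled noise.

    Parameters with large Fisher information (important to retained data)
    receive little noise; unimportant ones are scrubbed harder. `alpha`
    sets the overall noise scale, so all metrics move with it.
    """
    std = np.sqrt(alpha / (fisher_diag + 1e-12))
    return w + rng.normal(size=w.shape) * std

rng = np.random.default_rng(0)
w = rng.normal(size=1000)
fisher = rng.uniform(1e-4, 1.0, size=1000)
w_small = fisher_forget(w, fisher, alpha=1e-8, rng=rng)
w_large = fisher_forget(w, fisher, alpha=1e-4, rng=rng)
print(np.abs(w_small - w).mean(), np.abs(w_large - w).mean())
```

This is why I would appreciate knowing the exact alpha used for each sparsity level.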
Questions:
- What are the specific parameter settings for each unlearning method?
- How is MIA-Efficacy calculated in SVC_MIA_forget_efficacy?
- What could be the reasons for the discrepancies in replicating the results?
Thank you for your time and help.
Best regards,
David