spring-epfl / mia
A library for running membership inference attacks against ML models
License: MIT License
Following the TF demo, something goes wrong when attacking a PyTorch model.
Could you please share a demo with PyTorch?
Hi,
In the BaseModelSerializer
definition, the Keras model is passed before the model ID:
Line 29 in d389d30
But in the ShadowModelBundle
class the model ID is passed before the model object:
Line 118 in d389d30
I think the BaseModelSerializer abstract class definition should be corrected by swapping the order of the two arguments.
Best
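A minimal sketch of the suggested fix, with the model ID first to match ShadowModelBundle. The method names (save/load) are assumptions for illustration, not necessarily the library's exact API:

```python
from abc import ABC, abstractmethod

class BaseModelSerializer(ABC):
    """Sketch of the abstract serializer with the argument order swapped so
    that the model ID comes first, matching how ShadowModelBundle calls it."""

    @abstractmethod
    def save(self, model_id, model):
        """Persist `model` under the identifier `model_id`."""

    @abstractmethod
    def load(self, model_id):
        """Restore the model stored under `model_id`."""
```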
Hi! I've been playing a bit with the library. Great job, I really liked it! But while running the CIFAR-10 example on my machine, I found that I get a fairly high false positive rate. Any suggestions on how to reduce it? Am I doing something wrong here? Thanks for the help!
The example code I'm running; the confusion matrix is at the end of the notebook.
In the README.md, the documentation link is broken:
A library for running membership inference attacks (MIA) against machine learning models. Check out the documentation.
Hi! I'm trying to conduct the membership attack on the Fashion MNIST dataset by slightly changing the provided example. Since Fashion MNIST is 70k images (60k for training and 10k for validation), each image is 28x28x1, and there are 10 classes, I only had to adapt the target model to 28x28 images.
When training the attack model I get this error:
ValueError: Empty training data.
I was wondering where this problem comes from; I left every other parameter untouched, like SHADOW_DATASET_ATTACK or ATTACK_TEST_DATASET_SIZE.
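In case it helps others debugging the same error: one plausible cause (my assumption, not confirmed by the maintainers) is that split sizes tuned for the CIFAR-10 example no longer fit the new dataset, leaving one of the in/out splits empty. A quick, hypothetical sanity check before training:

```python
def check_split_sizes(n_available, shadow_dataset_size, attack_test_dataset_size):
    """Hypothetical sanity check: verify that the shadow training data (in/out
    halves) and the attack test set (in/out halves) can both be carved out of
    the available records. The 2x factors reflect the in/out split assumption."""
    needed = shadow_dataset_size * 2 + attack_test_dataset_size * 2
    if needed > n_available:
        raise ValueError(
            f"Need at least {needed} records but only {n_available} are available; "
            "reduce SHADOW_DATASET_SIZE or ATTACK_TEST_DATASET_SIZE."
        )
    return True
```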
I ran the CIFAR10 example and I would like to save the attack model so I can run some tests without having to retrain it every time, and also be able to use it elsewhere, but I can't do it conveniently. Is there a way to do so?
I have tried pickle.dump(), as well as using _get_model() and then saving the result.
Thanks
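For what it's worth, plain pickling works for attack models that are ordinary Python objects (e.g. scikit-learn classifiers); Keras-backed models generally are not picklable and need Keras's own model.save()/load_model() instead. A generic sketch of the pickle route, with hypothetical helper names:

```python
import os
import pickle
import tempfile

def save_attack_model(model, path):
    """Persist a picklable attack model to disk. Keras models should be
    saved with model.save() instead, since they usually do not pickle."""
    with open(path, "wb") as f:
        pickle.dump(model, f)

def load_attack_model(path):
    """Reload a previously saved attack model."""
    with open(path, "rb") as f:
        return pickle.load(f)
```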
I have installed all the dependencies necessary for the application to run. However, when I run the tests located in /mia/tests:
:~/mia/tests# python conftest.py
I receive only
Using TensorFlow backend.
and that is all, nothing else. Do you have any idea why this could be happening?
Hi, you mention in the readme that the package supports PyTorch models, but in ShadowModelBundle._fit you assume the model has a fit method (line 116).
How exactly have you tested the PyTorch models? I was thinking of maybe using pytorch-fitmodule or SuperModule, but if there's a way you already recommend, that would be great. Also, it would be nice to include an example of how to load PyTorch modules in the package! (Maybe I can do a PR once I'm able to do it myself. :-)
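My reading of _fit is that any object exposing a Keras/sklearn-style fit(X, y) plus a way to get prediction vectors (e.g. predict_proba) would work; that interface is an assumption, not documented behavior. A sketch of such a wrapper, with the torch training loop replaced by a trivial class-prior estimator so it runs standalone:

```python
import numpy as np

class TorchLikeWrapper:
    """Sketch of a wrapper giving a PyTorch model the Keras/sklearn-like
    interface that ShadowModelBundle appears to expect. The 'training' here
    is a stand-in class-prior estimator so the sketch runs without torch;
    swap in your optimizer loop and module for real use."""

    def __init__(self, num_classes):
        self.num_classes = num_classes
        self.class_priors_ = None

    def fit(self, X, y, epochs=1, verbose=0):
        # Real version: run the torch training loop over (X, y) here.
        counts = np.bincount(np.asarray(y, dtype=int), minlength=self.num_classes)
        self.class_priors_ = counts / counts.sum()
        return self

    def predict_proba(self, X):
        # Real version: softmax of the torch module's logits on X.
        return np.tile(self.class_priors_, (len(X), 1))
```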
Hey Bogdan, I read your project called "spring-epfl/mia" and tried to reproduce the MIA model you shared on Github.
I found that the attack accuracy is around 55% on the CIFAR-10 dataset, which is not as high as we expected. Could you give me some advice on improving the accuracy?
Here are the Github link and the result of the code: https://github.com/spring-epfl/mia/tree/master/examples
Thank you in advance. Take care:)
This one is hopefully pretty straightforward: the wrapper param enable_cuda
is spelled enable_cude
in the docs. Not a huge deal, but did have me scratching my head for a minute when I saw it first.
Line 109 in d389d30
Also see section 3.3 here: https://buildmedia.readthedocs.org/media/pdf/mia-lib/latest/mia-lib.pdf
Right now it only supports attacks on classification models. Any plans for extending to regression?
Firstly, thanks for your contribution of mia, which is a very well-structured and concise implementation of the membership inference attack.
However, one thing that confuses me: in your cifar10 example, lines 150-152, you include (x_test, y_test), which is used for training the shadow models, in the dataset for testing the attacker. From my perspective, this is inappropriate, since the attack model has already seen these data indirectly through the shadow models, which breaches the assumption that the test data is unseen.
In the official implementation https://github.com/csong27/membership-inference, they seem to avoid this by separating out one more test set.
I do not know whether my understanding is correct.
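The separation suggested above can be sketched as a three-way split: disjoint target data, shadow data, and a held-out set that neither the target nor the shadow models ever see, so the attack model is evaluated on genuinely unseen records. The function name and split sizes below are illustrative:

```python
import numpy as np

def three_way_split(X, y, n_target, n_shadow, seed=0):
    """Split (X, y) into disjoint target-training, shadow-training, and
    held-out partitions. The held-out partition is reserved for evaluating
    the attack model on records no model has seen during training."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    target_idx = idx[:n_target]
    shadow_idx = idx[n_target:n_target + n_shadow]
    holdout_idx = idx[n_target + n_shadow:]
    return ((X[target_idx], y[target_idx]),
            (X[shadow_idx], y[shadow_idx]),
            (X[holdout_idx], y[holdout_idx]))
```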
Hi,
Thank you for implementing the Shokri et al. attack. I have been reading the paper and repeating the experiment described in it. However, I found that the training datasets for the shadow models just use data records disjoint from the target training dataset of the specific dataset (like CIFAR-10), replacing k features or replacing nothing in the code. This could be a little different from the original algorithm in the paper.
I wrote Algorithm 1 (data synthesis using the target model) myself in PyTorch. I generated a random tensor X_tensor of size (1, 3, 32, 32) for the CIFAR-10 dataset and used the two phases, search and sample, from Algorithm 1 in the paper. The code is as follows:
def data_synthesize(net, trainset_size, fix_class, initial_record, k_max,
                    in_channels, img_size, batch_size, num_workers, device):
    """Synthesize a record for fix_class using the target model (Algorithm 1)."""
    # Initialize X_tensor with an initial record of size (1, in_channels, img_size, img_size)
    X_tensor = initial_record
    X_new_tensor = initial_record  # last accepted record (so the first rejection is well-defined)
    # Generate y_tensor with a size matching X_tensor's
    y_tensor = gen_class_tensor(trainset_size, fix_class)
    y_c_current = 0  # target model's probability for the fixed class
    j = 0            # consecutive-rejections counter
    k = k_max        # search radius
    max_iter = 100   # maximum number of iterations
    conf_min = 0.1   # minimum probability cutoff to consider a record a member of the class
    rej_max = 5      # maximum number of consecutive rejections
    k_min = 1        # minimum radius of feature perturbation
    for _ in range(max_iter):
        dataset = TensorDataset(X_tensor, y_tensor)
        dataloader = DataLoader(dataset=dataset, batch_size=batch_size,
                                num_workers=num_workers, shuffle=True)
        y_c = nn_predict_proba(net, dataloader, device, fix_class)
        # Phase 1: search -- accept the proposal if confidence did not decrease
        if y_c >= y_c_current:
            # Phase 2: sample -- keep the record if it is confidently classified as fix_class
            if y_c > conf_min and fix_class == torch.argmax(nn_predict(net, dataloader, device), dim=1):
                return X_tensor
            X_new_tensor = X_tensor  # accept the proposal and renew variables
            y_c_current = y_c
            j = 0
        else:
            j += 1
            if j > rej_max:  # too many consecutive rejections: shrink the search radius
                k = max(k_min, int(np.ceil(k / 2)))
                j = 0
        # Propose a new record by perturbing up to k features of the last accepted one
        X_tensor = rand_tensor(X_new_tensor, k, in_channels, img_size, trainset_size)
    return X_tensor, y_c
However, the prediction probability it generates is very low, around 0.1. Could you please give me some guidance on the data synthesis algorithm, or update the uploaded code? Thanks in advance for your patience!
Best wishes!
Yantong