
mia's Issues

Wrong parameter order in BaseModelSerializer

Hi,

In the BaseModelSerializer definition, the Keras model is passed before the model ID:

def save(self, model, model_id):

But in the ShadowModelBundle class the model ID is passed before the model object:

self.serializer.save(ShadowModelBundle.MODEL_ID_FMT % i, shadow_model)

I think the BaseModelSerializer abstract class definition should be corrected by swapping the order of the two arguments.

Best
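A minimal sketch of the mismatch described above, and one way to guard against it. The classes here are hypothetical stand-ins written for illustration, not mia's actual BaseModelSerializer; the format string is also an assumption. Declaring the ID first matches the call site in ShadowModelBundle, and passing keyword arguments keeps the call correct even if the positional order changes later.

```python
from abc import ABC, abstractmethod


class BaseSerializer(ABC):
    """Hypothetical abstract serializer (stand-in, not mia's real class)."""

    @abstractmethod
    def save(self, model_id, model):
        """Store `model` under `model_id` (ID first, matching the call site)."""

    @abstractmethod
    def load(self, model_id):
        """Return the model stored under `model_id`."""


class InMemorySerializer(BaseSerializer):
    """Toy implementation that keeps models in a dict."""

    def __init__(self):
        self._store = {}

    def save(self, model_id, model):
        self._store[model_id] = model

    def load(self, model_id):
        return self._store[model_id]


MODEL_ID_FMT = "shadow_%d"  # assumed format string, for illustration only

serializer = InMemorySerializer()
for i, shadow_model in enumerate(["model-a", "model-b"]):
    # Keyword arguments make the call independent of positional order.
    serializer.save(model_id=MODEL_ID_FMT % i, model=shadow_model)
```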

High false positive rate

Hi! I've been playing with the library and I really like it, great job! While running the CIFAR-10 example on my machine, I get a fairly high false positive rate. Any suggestions on how to reduce it? Am I doing something wrong? Thanks for the help!

The example code I'm running -- The confusion matrix at the end of the notebook
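For reference, the false positive rate is FP / (FP + TN), computed over the true non-members only. The sketch below uses made-up labels and scores (not output from the notebook) to show how the rate is computed and how a stricter decision threshold lowers it at the cost of missing some members. Whether the attack model exposes per-record scores is an assumption here.

```python
def false_positive_rate(y_true, y_pred):
    """FPR = FP / (FP + TN); label 1 means "member", 0 means "non-member"."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    if fp + tn == 0:
        raise ValueError("y_true contains no non-members")
    return fp / (fp + tn)


def predict_with_threshold(scores, threshold=0.5):
    """Label a record as a member when its score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


# Made-up example: 3 members followed by 4 non-members.
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.55, 0.7, 0.6, 0.4, 0.2]

fpr_default = false_positive_rate(y_true, predict_with_threshold(scores, 0.5))
fpr_strict = false_positive_rate(y_true, predict_with_threshold(scores, 0.75))
print(fpr_default, fpr_strict)  # 0.5 0.0
```

With the stricter threshold the third member (score 0.55) is no longer detected, so the lower false positive rate comes with lower recall.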

Link to documentation fails

In the README.md, the documentation link fails:

A library for running membership inference attacks (MIA) against machine learning models. Check out the documentation.

Error: Empty training data

Hi! I'm trying to run the membership inference attack on the Fashion-MNIST dataset by slightly changing the provided example. Fashion-MNIST has 70k images (60k for training and 10k for validation), each image is 28x28x1, and there are 10 classes, so I only had to adapt the target model to 28x28 images.
When training the attack model I get this error:
ValueError: Empty training data.

I was wondering where this problem comes from. I left every other parameter untouched, such as SHADOW_DATASET_ATTACK and ATTACK_TEST_DATASET_SIZE.
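The cause is not visible from the post, but one thing that can be checked is whether any slice of the data ends up empty: NumPy slicing past the end of an array silently returns a shorter or empty array rather than raising. The sketch below is a sanity check under that assumption; the helper names and the sizes are hypothetical and not part of mia.

```python
import numpy as np


def prepare_images(x):
    """Scale to [0, 1] and add a channel axis to (N, 28, 28) grayscale images."""
    x = np.asarray(x, dtype="float32") / 255.0
    if x.ndim == 3:
        x = x[..., np.newaxis]  # (N, 28, 28) -> (N, 28, 28, 1)
    return x


def take_split(x, y, start, size, name):
    """Return records [start, start + size) and fail loudly if the slice is short."""
    x_part, y_part = x[start:start + size], y[start:start + size]
    if len(x_part) < size:
        raise ValueError(
            f"{name}: wanted {size} records from index {start}, "
            f"got {len(x_part)} (dataset has {len(x)})"
        )
    return x_part, y_part


# Stand-in data with the Fashion-MNIST image shape (100 records instead of 60k).
x_demo = prepare_images(np.zeros((100, 28, 28), dtype="uint8"))
y_demo = np.zeros(100, dtype="int64")

x_target, y_target = take_split(x_demo, y_demo, start=0, size=60, name="target_train")
print(x_target.shape)  # (60, 28, 28, 1)
```

Requesting, for example, 60 records starting at index 80 from the same array would raise a ValueError with the sizes involved instead of passing an undersized array further down the pipeline.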

Saving and serializing AttackModelBundle

I ran the CIFAR10 example and I would like to save the trained attack model, so that I can run tests without retraining it every time and also use it elsewhere. I haven't found a convenient way to do this. Is there one?

I have tried pickle.dump(), as well as calling _get_model() and then saving the result.

Thanks

tests not running completely

I have installed all the dependencies needed to run the application. However, when I run the tests located in /mia/tests

:~/mia/tests# python conftest.py

I receive

:~/mia/test# python conftest.py
Using TensorFlow backend.

and that is all, nothing else. Do you have any idea why this could be happening?
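A likely explanation, assuming the project uses pytest (the presence of conftest.py suggests it): conftest.py is a pytest configuration file that holds fixtures and hooks, not a test runner. Executing it directly only runs its imports, which is why the TensorFlow banner is the only output. The tests are run by invoking pytest on the directory instead. The sketch below builds and runs that command from Python; the helper names are illustrative.

```python
import subprocess
import sys


def pytest_command(test_dir="tests", verbose=True):
    """Build the command that runs pytest with the current interpreter."""
    cmd = [sys.executable, "-m", "pytest", test_dir]
    if verbose:
        cmd.append("-v")
    return cmd


def run_tests(test_dir="tests"):
    """Run the test suite and return pytest's exit code (0 means all passed)."""
    return subprocess.run(pytest_command(test_dir)).returncode


# Usage, from the repository root (requires pytest to be installed):
#     exit_code = run_tests("tests")
```

This is equivalent to running `python -m pytest tests -v` in a shell from the repository root.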

PyTorch support documentation and examples

Hi, you mention in the README that the package supports PyTorch models, but ShadowModelBundle._fit assumes that the model has a fit method (line 116).
How exactly have you tested PyTorch models? I was thinking of using pytorch-fitmodule or SuperModule, but if there is an approach you already recommend, that would be great. It would also be nice to include an example of how to load PyTorch modules in the package! (Maybe I can open a PR once I get it working myself :-)

Testing Data for the attack model

Firstly, thanks for your contribution of mia, which is a very well-structured and concise implementation of the membership inference attack.

However, one thing that confuses me is that in your cifar10 example, lines 150-152, you include (x_test, y_test), which is used for training the shadow model, in the dataset for testing the attacker. From my perspective, this is inappropriate, since the attack model has already seen this data indirectly through the shadow models, which breaches the assumption that the test data is unseen.

In the official implementation https://github.com/csong27/membership-inference, they seem to avoid this by holding out one more test set.

I do not know whether my understanding is correct.
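The separation described above can be sketched as follows: the record indices are partitioned once into non-overlapping pools, so that the non-members used to evaluate the attacker never reach the shadow models. This is an illustrative sketch, not the code of either implementation; the pool names and sizes are assumptions.

```python
import random


def disjoint_splits(n_records, sizes, seed=0):
    """Partition range(n_records) into named, non-overlapping index lists."""
    total = sum(sizes.values())
    if total > n_records:
        raise ValueError(f"requested {total} records, only {n_records} available")
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    splits, start = {}, 0
    for name, size in sizes.items():
        splits[name] = indices[start:start + size]
        start += size
    return splits


# Hypothetical sizes for a 60,000-record dataset.
splits = disjoint_splits(
    60000,
    {
        "target_train": 10000,            # members, from the attacker's point of view
        "attack_test_nonmembers": 10000,  # held out from target AND shadow models
        "shadow_pool": 40000,             # only data the shadow models may use
    },
)

# Attack test set: members (label 1) plus held-out non-members (label 0).
attack_test_indices = splits["target_train"] + splits["attack_test_nonmembers"]
attack_test_labels = [1] * 10000 + [0] * 10000
```

Because each pool is a distinct slice of one shuffled index list, no record in the attack test set can appear in the shadow pool.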

About Algorithm 1 Data Synthesis Using the Target Model in Shokri et al. Membership Inference Attack

Hi,
Thank you for implementing the Shokri et al. attack. I have been reading the paper and reproducing its experiments. However, I found that in the code the training data for the shadow models consists only of records from the same dataset (e.g. CIFAR-10) that are disjoint from the target's training data, with either k features replaced or nothing replaced. This may differ a little from the original algorithm in the paper.

I implemented Algorithm 1 (data synthesis using the target model) myself in PyTorch. I generate a random tensor X_tensor of size (1, 3, 32, 32) for the CIFAR-10 dataset and use the two phases of Algorithm 1 in the paper, search and sample. The code is below:

import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

# gen_class_tensor, nn_predict_proba, nn_predict and rand_tensor are helper
# functions defined elsewhere (not shown here).


def data_synthesize(net, trainset_size, fix_class, initial_record, k_max,
                    in_channels, img_size, batch_size, num_workers, device):
    """
    Synthesize a record for class fix_class by querying the target model net.
    Returns the record and the target model's probability for fix_class.
    """
    # Initialize X_tensor with an initial_record, with size of (1, in_channels, img_size, img_size)
    X_tensor = initial_record
    X_new_tensor = X_tensor  # most recently accepted record
    # Generate y_tensor with the size equivalent to X_tensor's
    y_tensor = gen_class_tensor(trainset_size, fix_class)
    y_c_current = 0         # target model's probability of the fixed class for the accepted record
    j = 0                   # consecutive rejections counter
    k = k_max               # search radius
    max_iter = 100          # max number of iterations
    conf_min = 0.1          # min probability cutoff to consider a record member of the class
    rej_max = 5             # max number of consecutive rejections
    k_min = 1               # min radius of feature perturbation
    for _ in range(max_iter):
        dataset = TensorDataset(X_tensor, y_tensor)
        dataloader = DataLoader(dataset=dataset, batch_size=batch_size, num_workers=num_workers, shuffle=True)
        y_c = nn_predict_proba(net, dataloader, device, fix_class)
        # Phase 1: Search
        if y_c >= y_c_current:
            # Phase 2: Sample
            if y_c > conf_min and fix_class == torch.argmax(nn_predict(net, dataloader, device), dim=1):
                return X_tensor, y_c
            X_new_tensor = X_tensor  # accept the proposal
            y_c_current = y_c
            j = 0
        else:
            j += 1
            if j > rej_max:  # too many consecutive rejections: shrink the search radius
                k = max(k_min, int(np.ceil(k / 2)))
                j = 0
        # Propose a new record from the accepted one (rand_tensor is expected to perturb k features)
        X_tensor = rand_tensor(X_new_tensor, k, in_channels, img_size, trainset_size)
    # No record met the criteria within max_iter: return the last accepted record
    return X_new_tensor, y_c_current

However, the prediction probability of the records it generates is very low, around 0.1. Could you please give me some guidance on the data synthesis algorithm, or update the uploaded code? Thanks in advance for your patience!

Best wishes!
Yantong
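One thing worth checking in the code above: with 10 classes, the largest class probability is always at least 0.1, so a cutoff of conf_min = 0.1 accepts essentially the first record whose predicted class is fix_class, which would explain probabilities near 0.1. The sketch below is a framework-free NumPy version of the search and sample loop, assuming a higher cutoff and including the probabilistic acceptance step (return the record with probability y_c) from Algorithm 1 in the paper. The toy target model, feature count and parameter values are assumptions made for illustration only.

```python
import numpy as np


def synthesize(predict_proba, target_class, n_features, rng,
               k_max=8, k_min=1, conf_min=0.8, rej_max=5, max_iter=1000):
    """Hill-climbing synthesis; returns a record, or None if none is accepted."""

    def rand_record(base, k):
        # Re-draw k randomly chosen features of the accepted record.
        new = base.copy()
        idx = rng.choice(n_features, size=k, replace=False)
        new[idx] = rng.random(k)
        return new

    x = rng.random(n_features)
    x_best, y_best = x, 0.0
    j, k = 0, k_max
    for _ in range(max_iter):
        y = predict_proba(x)
        y_c = y[target_class]
        if y_c >= y_best:  # search phase: the proposal is at least as good
            if y_c > conf_min and int(np.argmax(y)) == target_class:
                if rng.random() < y_c:  # sample phase: accept with probability y_c
                    return x
            x_best, y_best, j = x, y_c, 0
        else:
            j += 1
            if j > rej_max:  # too many rejections: shrink the search radius
                k = max(k_min, int(np.ceil(k / 2)))
                j = 0
        x = rand_record(x_best, k)
    return None


def toy_predict(x):
    """Toy two-class target: class 0 is favored when the first half of x is large."""
    z = np.sum(x[:8]) - np.sum(x[8:])
    p0 = 1.0 / (1.0 + np.exp(-z))
    return np.array([p0, 1.0 - p0])


rng = np.random.default_rng(0)
record = synthesize(toy_predict, target_class=0, n_features=16, rng=rng)
```

Any returned record satisfies toy_predict(record)[0] > conf_min by construction, since the target model is only queried, never modified.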
