
PFLlib's Introduction

PFLlib: Personalized Federated Learning Algorithm Library

License: GPL v2

Figure 1: An example of FedAvg. You can create a scenario using generate_DATA.py and run an algorithm using main.py, clientNAME.py, and serverNAME.py.

We've created a user-friendly algorithm library and evaluation platform for those new to federated learning. Join us in expanding the FL community by contributing your algorithms, datasets, and metrics to this project.

  • 36 traditional FL (tFL) or personalized FL (pFL) algorithms, 3 scenarios, and 20 datasets.

  • Some experimental results are available here.

  • Refer to this guide to learn how to use it.

  • This library can simulate scenarios using the 4-layer CNN on Cifar100 for 500 clients on one NVIDIA GeForce RTX 3090 GPU card, using only 5.08 GB of GPU memory.

  • PFLlib primarily focuses on data (statistical) heterogeneity. For algorithms and an evaluation platform that address both data and model heterogeneity, please refer to our extended project Heterogeneous Federated Learning (HtFL).

  • As we strive to meet diverse user demands, frequent updates to the project may alter default settings and scenario-creation code, affecting experimental results.

  • Closed issues may help you a lot.

  • When submitting pull requests, please provide sufficient instructions and examples in the comment box.

Statistical heterogeneity originates from the personalization of users, who generate non-IID (not independent and identically distributed) and unbalanced data. Since statistical heterogeneity pervades FL scenarios, a myriad of approaches have been proposed to tackle it. In contrast, personalized FL (pFL) can take advantage of the statistically heterogeneous data to learn a personalized model for each user.

Thanks to @Stonesjtu, this library can also record the GPU memory usage for the model. Following FedCG, we also introduce the DLG (Deep Leakage from Gradients) attack and PSNR (Peak Signal-to-Noise Ratio) metric to evaluate the privacy-preserving ability of tFL/pFL algorithms (please refer to ./system/flcore/servers/serveravg.py for example). Now we can train on some clients and evaluate performance on other new clients by setting args.num_new_clients in ./system/main.py. Note that not all the tFL/pFL algorithms support this feature.
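
As an illustration, here is a minimal sketch of the PSNR computation used to score DLG reconstructions; psnr is a hypothetical helper for intuition, not the function shipped in ./system/utils:

    import numpy as np

    def psnr(original: np.ndarray, reconstructed: np.ndarray, max_value: float = 1.0) -> float:
        # PSNR between an original image and its DLG reconstruction,
        # both scaled to [0, max_value]. A higher PSNR means the
        # reconstruction is closer to the original, i.e., weaker privacy.
        mse = np.mean((original - reconstructed) ** 2)
        if mse == 0:
            return float("inf")  # identical images
        return float(10 * np.log10(max_value ** 2 / mse))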

Citation

@article{zhang2023pfllib,
  title={PFLlib: Personalized Federated Learning Algorithm Library},
  author={Zhang, Jianqing and Liu, Yang and Hua, Yang and Wang, Hao and Song, Tao and Xue, Zhengui and Ma, Ruhui and Cao, Jian},
  journal={arXiv preprint arXiv:2312.04992},
  year={2023}
}

Algorithms with code (updating)

Traditional FL (tFL)

Personalized FL (pFL)

Datasets and scenarios (updating)

For the label skew scenario, we introduce 14 famous datasets: MNIST, EMNIST, Fashion-MNIST, Cifar10, Cifar100, AG News, Sogou News, Tiny-ImageNet, Country211, Flowers102, GTSRB, Shakespeare, and Stanford Cars; they can easily be split into IID and non-IID versions. Since some of the dataset-generating code (e.g., splitting) is the same for all datasets, we moved it into ./dataset/utils/dataset_utils.py. In the non-IID scenario, two situations exist: the pathological non-IID scenario and the practical non-IID scenario. In the pathological non-IID scenario, the data on each client contains only a specific number of labels (perhaps only 2), even though the data across all clients covers all labels (e.g., 10 labels for the MNIST dataset). In the practical non-IID scenario, a Dirichlet distribution is utilized (please refer to this paper for details). We can input balance for the IID scenario, where the data are uniformly distributed.
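
For intuition, the practical non-IID split can be sketched as follows; this is a simplified stand-in for the code in ./dataset/utils/dataset_utils.py, not the exact implementation:

    import numpy as np

    def dirichlet_split(labels: np.ndarray, num_clients: int, alpha: float = 0.1, seed: int = 0):
        # Assign sample indices to clients with Dirichlet-distributed label proportions.
        # Smaller alpha -> more skewed (more non-IID) label distributions per client.
        rng = np.random.default_rng(seed)
        client_idxs = [[] for _ in range(num_clients)]
        for k in np.unique(labels):
            idx_k = np.where(labels == k)[0]
            rng.shuffle(idx_k)
            proportions = rng.dirichlet(np.repeat(alpha, num_clients))  # shares of class k
            cuts = (np.cumsum(proportions) * len(idx_k)).astype(int)[:-1]
            for client, shard in zip(client_idxs, np.split(idx_k, cuts)):
                client.extend(shard.tolist())
        return client_idxs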

For the feature shift scenario, we use 3 datasets that are widely used in Domain Adaptation: Amazon Review (fetch raw data from this site), Digit5 (fetch raw data from this site), and DomainNet.

For the real-world (or IoT) scenario, we also introduce 3 naturally separated datasets: Omniglot (20 clients, 50 labels), HAR (Human Activity Recognition) (30 clients, 6 labels), and PAMAP2 (9 clients, 12 labels). For details of the datasets and FL algorithms in IoT, please refer to my FL-IoT repo.

If you need another dataset, just write the code to download it and then reuse the utils; a hypothetical template is sketched below.
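
The sketch below is modeled on ./dataset/generate_MNIST.py, using Fashion-MNIST as a stand-in for the new dataset; the helper signatures (separate_data, split_data) and the per-client .npz layout are assumptions about the current utils, so check ./dataset/utils/dataset_utils.py before copying:

    import os
    import numpy as np
    import torchvision
    import torchvision.transforms as transforms
    from utils.dataset_utils import separate_data, split_data  # assumed signatures

    def generate_dataset(dir_path, num_clients, num_classes, niid, balance, partition):
        transform = transforms.Compose([transforms.ToTensor()])
        # 1. The download step is the only dataset-specific code you need to write
        trainset = torchvision.datasets.FashionMNIST(
            root=os.path.join(dir_path, "rawdata"), train=True, download=True, transform=transform)
        testset = torchvision.datasets.FashionMNIST(
            root=os.path.join(dir_path, "rawdata"), train=False, download=True, transform=transform)

        # 2. Merge the raw train/test splits, then reuse the shared partitioning utils
        images = np.concatenate([trainset.data.numpy(), testset.data.numpy()])
        labels = np.concatenate([np.array(trainset.targets), np.array(testset.targets)])
        X, y, statistic = separate_data((images, labels), num_clients, num_classes,
                                        niid, balance, partition)
        train_data, test_data = split_data(X, y)

        # 3. Save one .npz per client, matching the layout read by ./system/main.py
        for split, data in (("train", train_data), ("test", test_data)):
            os.makedirs(os.path.join(dir_path, split), exist_ok=True)
            for i, client_data in enumerate(data):
                np.savez_compressed(os.path.join(dir_path, split, f"{i}.npz"), data=client_data)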

Examples for MNIST

  • MNIST
    cd ./dataset
    # python generate_MNIST.py iid - - # for iid and unbalanced scenario
    # python generate_MNIST.py iid balance - # for iid and balanced scenario
    # python generate_MNIST.py noniid - pat # for pathological noniid and unbalanced scenario
    python generate_MNIST.py noniid - dir # for practical noniid and unbalanced scenario
    # python generate_MNIST.py noniid - exdir # for Extended Dirichlet strategy 
    

The output of python generate_MNIST.py noniid - dir

Number of classes: 10
Client 0         Size of data: 2630      Labels:  [0 1 4 5 7 8 9]
                 Samples of labels:  [(0, 140), (1, 890), (4, 1), (5, 319), (7, 29), (8, 1067), (9, 184)]
--------------------------------------------------
Client 1         Size of data: 499       Labels:  [0 2 5 6 8 9]
                 Samples of labels:  [(0, 5), (2, 27), (5, 19), (6, 335), (8, 6), (9, 107)]
--------------------------------------------------
Client 2         Size of data: 1630      Labels:  [0 3 6 9]
                 Samples of labels:  [(0, 3), (3, 143), (6, 1461), (9, 23)]
--------------------------------------------------
Client 3         Size of data: 2541      Labels:  [0 4 7 8]
                 Samples of labels:  [(0, 155), (4, 1), (7, 2381), (8, 4)]
--------------------------------------------------
Client 4         Size of data: 1917      Labels:  [0 1 3 5 6 8 9]
                 Samples of labels:  [(0, 71), (1, 13), (3, 207), (5, 1129), (6, 6), (8, 40), (9, 451)]
--------------------------------------------------
Client 5         Size of data: 6189      Labels:  [1 3 4 8 9]
                 Samples of labels:  [(1, 38), (3, 1), (4, 39), (8, 25), (9, 6086)]
--------------------------------------------------
Client 6         Size of data: 1256      Labels:  [1 2 3 6 8 9]
                 Samples of labels:  [(1, 873), (2, 176), (3, 46), (6, 42), (8, 13), (9, 106)]
--------------------------------------------------
Client 7         Size of data: 1269      Labels:  [1 2 3 5 7 8]
                 Samples of labels:  [(1, 21), (2, 5), (3, 11), (5, 787), (7, 4), (8, 441)]
--------------------------------------------------
Client 8         Size of data: 3600      Labels:  [0 1]
                 Samples of labels:  [(0, 1), (1, 3599)]
--------------------------------------------------
Client 9         Size of data: 4006      Labels:  [0 1 2 4 6]
                 Samples of labels:  [(0, 633), (1, 1997), (2, 89), (4, 519), (6, 768)]
--------------------------------------------------
Client 10        Size of data: 3116      Labels:  [0 1 2 3 4 5]
                 Samples of labels:  [(0, 920), (1, 2), (2, 1450), (3, 513), (4, 134), (5, 97)]
--------------------------------------------------
Client 11        Size of data: 3772      Labels:  [2 3 5]
                 Samples of labels:  [(2, 159), (3, 3055), (5, 558)]
--------------------------------------------------
Client 12        Size of data: 3613      Labels:  [0 1 2 5]
                 Samples of labels:  [(0, 8), (1, 180), (2, 3277), (5, 148)]
--------------------------------------------------
Client 13        Size of data: 2134      Labels:  [1 2 4 5 7]
                 Samples of labels:  [(1, 237), (2, 343), (4, 6), (5, 453), (7, 1095)]
--------------------------------------------------
Client 14        Size of data: 5730      Labels:  [5 7]
                 Samples of labels:  [(5, 2719), (7, 3011)]
--------------------------------------------------
Client 15        Size of data: 5448      Labels:  [0 3 5 6 7 8]
                 Samples of labels:  [(0, 31), (3, 1785), (5, 16), (6, 4), (7, 756), (8, 2856)]
--------------------------------------------------
Client 16        Size of data: 3628      Labels:  [0]
                 Samples of labels:  [(0, 3628)]
--------------------------------------------------
Client 17        Size of data: 5653      Labels:  [1 2 3 4 5 7 8]
                 Samples of labels:  [(1, 26), (2, 1463), (3, 1379), (4, 335), (5, 60), (7, 17), (8, 2373)]
--------------------------------------------------
Client 18        Size of data: 5266      Labels:  [0 5 6]
                 Samples of labels:  [(0, 998), (5, 8), (6, 4260)]
--------------------------------------------------
Client 19        Size of data: 6103      Labels:  [0 1 2 3 4 9]
                 Samples of labels:  [(0, 310), (1, 1), (2, 1), (3, 1), (4, 5789), (9, 1)]
--------------------------------------------------
Total number of samples: 70000
The number of train samples: [1972, 374, 1222, 1905, 1437, 4641, 942, 951, 2700, 3004, 2337, 2829, 2709, 1600, 4297, 4086, 2721, 4239, 3949, 4577]
The number of test samples: [658, 125, 408, 636, 480, 1548, 314, 318, 900, 1002, 779, 943, 904, 534, 1433, 1362, 907, 1414, 1317, 1526]

Saving to disk.

Finish generating dataset.

Models

Environments

Install CUDA.

Install conda and activate it.

conda env create -f env_cuda_latest.yaml # You may need to downgrade torch via pip to match your CUDA version

How to start simulating (examples for FedAvg)

  • Create proper environments (see Environments).

  • Download this project to an appropriate location using git.

    git clone https://github.com/TsingZ0/PFLlib.git
  • Build evaluation scenarios (see Datasets and scenarios (updating)).

  • Run evaluation:

    cd ./system
    python main.py -data MNIST -m cnn -algo FedAvg -gr 2000 -did 0 # using the MNIST dataset, the FedAvg algorithm, and the 4-layer CNN model

Note: It is preferable to tune algorithm-specific hyper-parameters before using any algorithm on a new machine.

Practical situations

If you need to simulate FL under practical situations, including client dropout, slow trainers, slow senders, and network TTL, you can set the following parameters (see the example command after this list).

  • -cdr: The dropout rate among all clients. The selected clients will randomly drop out at each training round.
  • -tsr and -ssr: The rates of slow trainers and slow senders among all clients. Once a client is selected as a "slow trainer"/"slow sender", it will always train/send more slowly than a normal client.
  • -tth: The threshold for network TTL (ms).
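
For example, a hypothetical run combining these flags (the values below are arbitrary illustrations, not recommended defaults):

    cd ./system
    python main.py -data MNIST -m cnn -algo FedAvg -gr 2000 -did 0 \
        -cdr 0.1 -tsr 0.2 -ssr 0.2 -tth 10000
    # 10% client dropout, 20% slow trainers, 20% slow senders, 10000 ms TTL threshold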

Easy to extend

It is easy to add new algorithms and datasets to this library.

  • To add a new dataset to this library, all you need to do is write the download code and reuse the utils, treating ./dataset/generate_MNIST.py as the template.

  • To add a new algorithm, you can extend class Server and class Client, which are defined in ./system/flcore/servers/serverbase.py and ./system/flcore/clients/clientbase.py, respectively (see the sketch after this list).

  • To add a new model, just add it into ./system/flcore/trainmodel/models.py.

  • If you need a new optimizer for training, please add it to ./system/flcore/optimizers/fedoptimizer.py.

  • The evaluation platform is also convenient for users to build a new platform for specific applications, such as our FL-IoT and HtFL.
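
For instance, a minimal sketch of a new algorithm built on the base classes. Method names such as set_clients, select_clients, send_models, receive_models, and aggregate_parameters follow serverbase.py, but treat the exact signatures and attribute names as assumptions for your version of the library:

    from flcore.clients.clientbase import Client
    from flcore.servers.serverbase import Server

    class clientNew(Client):
        def train(self):
            trainloader = self.load_train_data()
            self.model.train()
            for epoch in range(self.local_epochs):  # 'local_steps' in older versions
                for x, y in trainloader:
                    x, y = x.to(self.device), y.to(self.device)
                    self.optimizer.zero_grad()
                    loss = self.loss(self.model(x), y)
                    loss.backward()
                    self.optimizer.step()

    class FedNew(Server):
        def __init__(self, args, times):
            super().__init__(args, times)
            self.set_clients(clientNew)  # older versions: self.set_clients(args, clientNew)

        def train(self):
            for round in range(self.global_rounds + 1):
                self.selected_clients = self.select_clients()
                self.send_models()             # push the global model to clients
                for client in self.selected_clients:
                    client.train()             # local updates
                self.receive_models()          # collect client models and weights
                self.aggregate_parameters()    # weighted average into the global model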

Experimental results

If you are interested in the experimental results (e.g., the accuracy) of the above algorithms, you can find some results in our accepted FL papers (i.e., FedALA, FedCP, GPFL, and DBE) listed below, which also use this library. Please note that this developing project may not reproduce the results in those papers, since some basic settings may change at the request of the community. For example, we previously set shuffle=False in clientbase.py.

@inproceedings{zhang2023fedala,
  title={Fedala: Adaptive local aggregation for personalized federated learning},
  author={Zhang, Jianqing and Hua, Yang and Wang, Hao and Song, Tao and Xue, Zhengui and Ma, Ruhui and Guan, Haibing},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={37},
  number={9},
  pages={11237--11244},
  year={2023}
}

@inproceedings{Zhang2023fedcp,
  author = {Zhang, Jianqing and Hua, Yang and Wang, Hao and Song, Tao and Xue, Zhengui and Ma, Ruhui and Guan, Haibing},
  title = {FedCP: Separating Feature Information for Personalized Federated Learning via Conditional Policy},
  year = {2023},
  booktitle = {Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining}
}

@inproceedings{zhang2023gpfl,
  title={GPFL: Simultaneously Learning Global and Personalized Feature Information for Personalized Federated Learning},
  author={Zhang, Jianqing and Hua, Yang and Wang, Hao and Song, Tao and Xue, Zhengui and Ma, Ruhui and Cao, Jian and Guan, Haibing},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={5041--5051},
  year={2023}
}

@inproceedings{zhang2023eliminating,
  title={Eliminating Domain Bias for Federated Learning in Representation Space},
  author={Zhang, Jianqing and Hua, Yang and Cao, Jian and Wang, Hao and Song, Tao and Xue, Zhengui and Ma, Ruhui and Guan, Haibing},
  booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
  year={2023},
  url={https://openreview.net/forum?id=nO5i1XdUS0}
}

PFLlib's People

Contributors

brighthaozi, isaacveg, newalexandria, pengchengup, rachel618, tsingz0, youngfish42, youpengl


PFLlib's Issues

No module named 'opacus.dp_model_inspector'

Hi, I have installed opacus, but I still get this error when running the code:

(fl_torch) zrh@zrh-server:~/zhuyonghui/Project/FL/PFL-Non-IID-master/system$ python main.py -data mnist -m cnn -algo FedAvg -gr 2500 -did 0 -go cnn

Traceback (most recent call last):
File "main.py", line 12, in
from flcore.servers.serveravg import FedAvg
File "/home/zrh/zhuyonghui/Project/FL/PFL-Non-IID-master/system/flcore/servers/serveravg.py", line 1, in
from flcore.clients.clientavg import clientAVG
File "/home/zrh/zhuyonghui/Project/FL/PFL-Non-IID-master/system/flcore/clients/clientavg.py", line 6, in
from utils.privacy import *
File "/home/zrh/zhuyonghui/Project/FL/PFL-Non-IID-master/system/utils/privacy.py", line 2, in
from opacus.dp_model_inspector import DPModelInspector
ModuleNotFoundError: No module named 'opacus.dp_model_inspector'

Can you help me solve this problem? Thank you.

Other values that can be passed for argument goal.

parser.add_argument('-go', "--goal", type=str, default="test",
help="The goal for this experiment")

Hi, I wanted to know what values can be passed for the goal argument other than the default test.

Questions regarding the test accuracy on algorithms.

First and foremost, thanks for the nice work!

I have the following questions.

Dataset split

When checking the dataset split method, it seems that separate_data does not split the dataset correctly. To verify this, run the commands below.

# generate the dataset
python generate_cifar100.py noniid -- dir

nohup python -u main.py -lbs 64 -nc 20 -jr 0.4 -nb 100 -data Cifar100 -m cnn -algo FedAvg -gr 100 -did 0 -go reproduce --local_steps 5 --local_learning_rate 0.01> Cifar100_FedAvg.out 2>&1 &

Then set a breakpoint on this line of serverbase.py and check the total number of test samples using sum(stats[1]). It returns 15008 on my side, which means some training data also ends up as test data, since Cifar100 only has 10000 test samples.

Reproduce results

I failed to reproduce the results reported in the FedROD paper (https://arxiv.org/pdf/2107.00778.pdf) with your code. I only tested FedAvg and pFedMe.

I followed the settings used in the FedROD paper.
Data partition: Dir(0.1), which is the default setting in generate_cifar100.py when dir is specified.
Other configurations:
0.4 join ratio / 20 clients / 100 rounds / 5 local epochs / 0.01 local learning rate.

The commands I used are given below.

# generate the dataset
python generate_cifar100.py noniid -- dir

nohup python -u main.py -lbs 64 -nc 20 -jr 0.4 -nb 100 -data Cifar100 -m cnn -algo FedAvg -gr 100 -did 0 -go reproduce --local_steps 5 --local_learning_rate 0.01> Cifar100_FedAvg.out 2>&1 &

nohup python -u main.py -lbs 64 -nc 20 -jr 0.4 -nb 100 -data Cifar100 -m cnn -algo pFedMe -gr 100 -did 1 -go reproduce --local_steps 5 --local_learning_rate 0.01> Cifar100_pFedMe.out 2>&1 &

To make a fair comparison, I also modified your data transforms to perform extra data augmentation following the FedROD paper.

    transform_train = transforms.Compose([transforms.RandomCrop(32, padding=4),
                                          transforms.RandomHorizontalFlip(),
                                          transforms.ToTensor(),
                                          transforms.Normalize(mean=[0.507, 0.487, 0.441],
                                                               std=[0.267, 0.256, 0.276])])
    transform_test = transforms.Compose([transforms.ToTensor(),
                                         transforms.Normalize(mean=[0.507, 0.487, 0.441],
                                                              std=[0.267, 0.256, 0.276])])

    trainset = torchvision.datasets.CIFAR100(root='~/data', train=True,
                                             download=True, transform=transform_train)
    testset = torchvision.datasets.CIFAR100(root='~/data', train=False,
                                            download=True, transform=transform_test)

What I got from your code was:

For FedAvg

Best global accuracy.
0.2305243759423689

Average time cost per round.
7.7570583581924435
File path: ../results/Cifar100_FedAvg_reproduce_0.h5

Average time cost: 792.34s.
Length:  101
std for best accurancy: 0.0
mean for best accurancy: 0.2305243759423689
All done!

whereas the paper reports 41.8%. There is a 20% gap.

For pFedMe

Evaluate personalized model
Average Personalized Test Accurancy: 0.2894

Best personalized results.
0.30951199338296115

Average time cost: 2207.25s.
Length:  101
std for best accurancy: 0.0
mean for best accurancy: 0.30951199338296115
All done!

whereas the paper reports 38.6%. There is an 8% gap.

I may have missed something. Could you please comment on this? Many thanks.

FedAMP client loss

Hi, thanks for your code; it has inspired me a lot. But while reading the FedAMP section, I noticed what may be a mistake in the loss function here.

According to the existing code, the variable sub will always be a zero vector.

https://github.com/TsingZ0/PFL-Non-IID/blob/a478269e52dc25488dc0c83ce5917b229697f881/system/flcore/clients/clientamp.py#L31-L45

I guess the correct code should be (with the usual backward/step at the end of the loop):

    for step in range(max_local_steps):
        # self.old_model: the model from the last round
        params = copy.deepcopy(weight_flatten(self.old_model))

        if self.train_slow:
            time.sleep(0.1 * np.abs(np.random.rand()))
        x, y = self.get_next_train_batch()
        self.optimizer.zero_grad()
        output = self.model(x)
        loss = self.loss(output, y)

        # proximal term pulling the local model toward the last-round model
        params_ = weight_flatten(self.model)
        sub = params - params_
        loss += self.lamda / self.alphaK / 2 * torch.dot(sub, sub)
        loss.backward()
        self.optimizer.step()

I hope you have time to reply, I appreciate it!

Question about balanced Dirichlet distribution

    for k in range(K):
        idx_k = np.where(dataset_label == k)[0]
        np.random.shuffle(idx_k)
        proportions = np.random.dirichlet(np.repeat(alpha, num_clients))
        ## Balance
        proportions = np.array([p * (len(idx_j) < N / num_clients) for p, idx_j in zip(proportions, idx_batch)])
        proportions = proportions / proportions.sum()
        proportions = (np.cumsum(proportions) * len(idx_k)).astype(int)[:-1]
        idx_batch = [idx_j + idx.tolist() for idx_j, idx in zip(idx_batch, np.split(idx_k, proportions))]
        min_size = min([len(idx_j) for idx_j in idx_batch])

Hi,
how can I make the number of samples in each client balanced? Thank you for your reply.

model has no attribute children

It seems that you have changed the parameters when initializing the server, as in the screenshot below.

[screenshot]

This causes a problem when initializing clients, since args.model is a str.

[screenshot]

Why the mean test accuracy increases slow using Fedavg

Hi TsingZ0, I forked your code and ran the default main.py with the MNIST dataset under the IID setting, but I am confused about why the "Averaged Test Accuracy" increases so slowly. My previous experiments with other code libraries showed accuracy usually above 90% after around 20 rounds. Can you explain this?

model aggregate question

def aggregate_parameters(self):
    assert (len(self.uploaded_models) > 0)

    # Start from a zeroed copy of a client model; the weighted sum below
    # then becomes the new global model used in the next round.
    self.global_model = copy.deepcopy(self.uploaded_models[0])
    for param in self.global_model.parameters():
        param.data.zero_()

    for w, client_model in zip(self.uploaded_weights, self.uploaded_models):
        self.add_parameters(w, client_model)

def add_parameters(self, w, client_model):
    # Accumulate each client's parameters scaled by its aggregation weight w
    for server_param, client_param in zip(self.global_model.parameters(), client_model.parameters()):
        server_param.data += client_param.data.clone() * w

===========
In this code, I don't understand how the global model from the previous round is passed to the next round.
My point of confusion is that I see the accumulation of weighted parameters, but not where the final model parameters are assigned to the global model.
Can you help me with my questions?

Plots

Hello, I am a bit confused about how to get the training plots (loss/epoch) for the global model as well as the clients. Can somebody help me with that? I am using SGD with clientavg.

Question about Fedfomo

Hi, when I run the "fedfomo" algorithm, I found an error. I don't know if I'm the only one with this problem: the current code seems to call the load_train_data method of the clientBase class when self.evaluate() is executed on the server, instead of the load_train_data method of the clientfomo class, which results in an error. So I suggest the authors consider rewriting the train_metrics method in the clientfomo class.

Where is the code that prints the results?

-------------Round number: 3-------------

Evaluate global model
Averaged Train Loss: 0.6725
Averaged Test Accurancy: 0.8453
Averaged Test AUC: 0.9514
Std Test Accurancy: 0.1707
Std Test AUC: 0.0432

I couldn't find where these lines are printed, e.g., the train loss and test accuracy.

Questions about BatchNorm layer in ResNet in average stage.

Hello, thanks for your code.
In my experiment, I found that the set_parameter function in ClientBase.py is implemented as:

        for new_param, old_param in zip(model.parameters(), self.model.parameters()):
            old_param.data = new_param.data.clone()

However, I found that this does not transfer the server's running_mean and running_var in the batchnorm layers to the corresponding client batchnorm layers. So I modified it to:

state_dict = model.state_dict()
self.model.load_state_dict(state_dict)

I want to know: is this wrong? How should we handle batchnorm layers in federated learning? Should we transfer the batchnorm running_mean and running_var to the client? If not, the zero values of these two parameters seem harmful to client model evaluation.

from opacus.dp_model_inspector import DPModelInspector

I wonder what version of opacus you guys use, because when I use code like this -> from opacus.dp_model_inspector import DPModelInspector, it raises the problem -> ModuleNotFoundError: No module named 'opacus.dp_model_inspector'

FedAMP code inconsistent with the paper's formula

The function A needs to satisfy A(0)=0. In the code, the function e is math.exp(-x/self.sigma)/self.sigma, with self.sigma defaulting to 1, so e(0)=1, which does not match the paper. It should be adjusted to 1-math.exp(-x/self.sigma)/self.sigma.

IndexError: list index out of range

When I run "python main.py -data mnist -m cnn -algo FedAvg -gr 2500 -did 0 -go cnn", it raises the following error:
============= Running time: 0th =============
Creating server and clients ...
Traceback (most recent call last):
File "main.py", line 183, in
run(goal=config.goal,
File "main.py", line 61, in run
server = FedAvg(device, dataset, algorithm, model, local_batch_size, local_learning_rate, global_rounds,
File "/home/aa/federated-learning-benchmark/Personalized-federated-learning-simulation-platform-master/system/flcore/servers/serveravg.py", line 14, in init
data = read_data(dataset)
File "/home/aa/federated-learning-benchmark/Personalized-federated-learning-simulation-platform-master/system/utils/data_utils.py", line 80, in read_data
train_file = os.listdir(train_data_dir)[0]
IndexError: list index out of range

This seems to be a data formatting error.

Facing issues in using the code on google colab.

Hi, I was trying to use this library on Google Colab but am facing issues. Please add some instructions on how to use it on Colab. Thanks.

On running this line, I get the error shown below. Please guide me on how to resolve it.

!python main.py -data mnist -m cnn -algo FedAvg -gr 2500 -did 0 -go cnn

==================================================
Algorithm: FedAvg
Local batch size: 10
Local steps: 1
Local learing rate: 0.005
Total number of clients: 20
Clients join in each round: 1.0
Client drop rate: 0.0
Time select: False
Time threthold: 10000
Global rounds: 2500
Running times: 1
Dataset: mnist
Local model: cnn
Using device: cpu

============= Running time: 0th =============
Creating server and clients ...
Traceback (most recent call last):
File "main.py", line 307, in
run(args)
File "main.py", line 106, in run
server = FedAvg(args, i)
File "/content/PFL-Non-IID/system/flcore/servers/serveravg.py", line 13, in init
self.set_clients(args, clientAVG)
File "/content/PFL-Non-IID/system/flcore/servers/serverbase.py", line 53, in set_clients
train_data = read_client_data(self.dataset, i, is_train=True)
File "/content/PFL-Non-IID/system/utils/data_utils.py", line 88, in read_client_data
train_data = read_data(dataset, idx, is_train)
File "/content/PFL-Non-IID/system/utils/data_utils.py", line 68, in read_data
with open(train_file, 'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: '../dataset/mnist/train/0.npz'

CUDA problem

Hi, when I try to train a CNN model, I have enabled CUDA, but it doesn't seem to use full power, as shown in the screenshots below.

[screenshots]

Also, the terminal output hangs at "Creating server and clients ..." for a long time. Is this normal?

[screenshot]

Waiting for your reply, thanks.

The evaluation about APFL

Thanks for the implementation of these methods! I noticed that test_metrics is not overridden in the client class, so evaluation will be performed on model instead of the personalized model model_per. Maybe you could add the commented-out load_model back to load the correct personalized model?

In the pathological non-IID setting, the sample distribution across clients may be unbalanced even when `balance` is True.

First, thanks for the code, which has helped me a lot. But when reading the function separate_data in ./dataset/utils/dataset_utils.py, I found the if balance branch in lines 67-68. Although the README does not claim the code provides a pathological non-IID and balanced setting, this is ambiguous, so I raise the issue here. Next, I will explain how the sample distribution across clients is affected by the class distribution of the initial dataset, which means the sample distribution may be unbalanced even when balance is True (balance being the variable in the function).

Our problem concerns the relationship between the initial dataset and the sample distribution across clients, so all other variables are fixed. From selected_clients = selected_clients[:int(num_clients/num_classes*class_per_client)] in line 62 and num_per = num_all_samples / num_selected_clients in line 66, we know that num_per is only affected by num_all_samples (the others are fixed by our assumption). Then, by num_samples = [int(num_per) for _ in range(num_selected_clients-1)] in line 68, num_samples is affected by num_all_samples. Considering num_all_samples = len(idx_for_each_class[i]) in line 64 (the initial distribution), we conclude that each client gets a different number of samples at the same cost (one chance), which leads to the phenomenon that the sample distribution across clients is driven by the class distribution of the initial dataset.

A simple example to verify my understanding:
four labels [number of samples]: 0[20], 1[20], 2[40], 3[20]
num_clients: 10
class_per_client: 2
Then, running the code with balance==True gives (client [labels | number of samples]):
0 [0,1|8]; 1 [0,1|8]; 2 [0,1|8]; 3 [0,1|8]; 4 [0,1|8];
5 [2,3|12]; 6 [2,3|12]; 7 [2,3|12]; 8 [2,3|12]; 9 [2,3|12].
This is obviously unbalanced. By the way, [6332, 6333, 6044, 6045, 5631, 5632, 6091, 6092, 5899, 5901] is obtained when running on the MNIST dataset (num_clients: 10, class_per_client: 2), which is also unbalanced but less obviously so, since MNIST's initial class distribution is itself only mildly unbalanced ([label | number of samples of this label]: 0 5923, 1 6742, 2 5958, 3 6131, 4 5842, 5 5421, 6 5918, 7 6265, 8 5851, 9 5949).

Finally, I think the method of partitioning the samples into shards from the paper Communication-Efficient Learning of Deep Networks from Decentralized Data may be inspiring. Thanks for your time.
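
A quick numeric check of the example above (a standalone simplification of the balance branch, not the library code):

    # With balance==True, each class is divided evenly among the clients selected
    # for it, so per-client shares track the initial class sizes.
    num_clients, class_per_client, num_classes = 10, 2, 4
    class_sizes = {0: 20, 1: 20, 2: 40, 3: 20}
    clients_per_class = num_clients * class_per_client // num_classes  # 5
    per_client_share = {k: n // clients_per_class for k, n in class_sizes.items()}
    print(per_client_share)  # {0: 4, 1: 4, 2: 8, 3: 4}
    # A client holding classes {0,1} gets 4+4=8 samples; one holding {2,3} gets 8+4=12.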

local variable 'server' referenced before assignment

!python main.py -data mnist -m cnn -algo fedfomo -gr 200 -lr 0.05 -jr 5.0 -tth 500

On running the above line on Google Colab, I get the error shown below:

==================================================
Algorithm: fedfomo
Local batch size: 10
Local steps: 1
Local learing rate: 0.05
Total number of clients: 20
Clients join in each round: 5.0
Client drop rate: 0.0
Time select: False
Time threthold: 500.0
Global rounds: 200
Running times: 1
Dataset: mnist
Local model: cnn
Using device: cuda
Cuda device id: 0

============= Running time: 0th =============
Creating server and clients ...
Traceback (most recent call last):
File "main.py", line 307, in
run(args)
File "main.py", line 171, in run
server.train()
UnboundLocalError: local variable 'server' referenced before assignment

Questions about the test_metrics() functions and the partition between train and test.

Thanks for your awesome libs, including so many algorithms.

I have a question about the evaluate function in the serverbase.py: https://github.com/TsingZ0/PFL-Non-IID/blob/fd23a2124265fac69c137b313e66e45863487bd5/system/flcore/servers/serverbase.py#L165

It seems that in each round, testing is conducted only on the selected clients. Is this a standard setting in pFL? Shouldn't we evaluate on all clients' test data? Testing only on the selected clients may introduce bias.

I also noticed that you merge the original train and test datasets together, partition them to clients, and finally split each client's sub-dataset into train and test sets. My question is: could this lead to some clients' test data coming from the original train set instead of the original test set? https://github.com/TsingZ0/PFL-Non-IID/blob/fd23a2124265fac69c137b313e66e45863487bd5/dataset/generate_cifar10.py#L56

About the FedBABU

May I ask a question about the code?
You freeze the gradients of the model's predictor.

[screenshot]

I wonder whether the following code is necessary, since the predictor of all clients will not be changed.

[screenshot]
Thanks for your help

How to allocate balanced samples in Dirichlet distribution

    for k in range(K):
        idx_k = np.where(dataset_label == k)[0]
        np.random.shuffle(idx_k)
        proportions = np.random.dirichlet(np.repeat(alpha, num_clients))
        ## Balance
        proportions = np.array([p * (len(idx_j) < N / num_clients) for p, idx_j in zip(proportions, idx_batch)])
        proportions = proportions / proportions.sum()
        proportions = (np.cumsum(proportions) * len(idx_k)).astype(int)[:-1]
        idx_batch = [idx_j + idx.tolist() for idx_j, idx in zip(idx_batch, np.split(idx_k, proportions))]
        min_size = min([len(idx_j) for idx_j in idx_batch])

How can we make the number of samples on each client balanced?

about the FedDyn

Hello, @TsingZ0.
Thank you very much for your nice code. When reading it, I noticed that the FedDyn implementation is not the same as the pseudo-code in the paper; see the screenshot below. Interestingly, I reproduced the pseudo-code from the paper and found the performance to be poor, but with your approach the performance is good. I wonder if the author's pseudo-code is wrong. Thank you again for your selfless and dedicated work.

[screenshot]

FedROD doesn't work

Hi, my friend, I found that your FedROD code cannot be run.

Could you please check it?

Thank you very much.

Question about the accuracy of FedFomo

Hi, TsingZ0.
I'm sorry to disturb you again. This time, I would like to ask whether you can reproduce the experimental results for 15 clients and 100 clients on the CIFAR10 dataset from the FedFomo paper. I ran your code with the same hyperparameters as in FedFomo's original paper, using a CNN model similar to FedAvg's, but it seems the mean accuracy reported in the FedFomo paper cannot be achieved.

IID MNIST dataset accuracy does not increase

Hello, in the latest code, the average test accuracy still increases extremely slowly (with the required learning rate of 0.2), while the AUC increases rapidly. This happens on IID MNIST. I can't understand why; can you suggest a solution? The experimental parameters are taken directly from your README, with only the learning rate changed to 0.2.
The final accuracy was 23%, and the AUC was 65%.

About the random seed setting

Hello @TsingZ0 ,
Thank you very much for the nice work. I have a small problem with the code: when I run it multiple times with the same parameters, I get variable results. I think a random seed may have been forgotten somewhere.

About the FedRod

My friend, could you please tell me whether this algorithm works in your experiments compared with the baselines (FedAvg and Local), and how much it improves? Best regards.

May I ask one question about FL?

After running the code, I find that just training a local model is much better than FedAvg. So I wonder whether it is better to use FL at all, instead of just training local models?
Thanks for your excellent work! Your code really helps me a lot.

About the results...

I have tried some algorithms, but found the results are not much better than FedAvg. Have you run this code, and what are your results like? Could you provide them and the hyper-parameters?
Best regards!

An error occurred when running generate_cifar10.py

When I ran generate_cifar10.py, the following error appeared
Traceback (most recent call last):
File "generate_cifar10_1.py", line 74, in
generate_cifar10(dir_path=dir_path, num_clients=num_clients, num_labels=num_labels, niid=niid, real=real)
File "generate_cifar10_1.py", line 65, in generate_cifar10
X, y, statistic = seperete_data(dataset, num_clients, num_labels, niid, real)
File "/home/liubochao/code/PFL-Non-IID/PFL-Non-IID-c7dff7fb0575840727baa746eb08d9bd4c8ede22/dataset/utils/dataset_utils.py", line 42, in seperete_data
dataset_content, dataset_label = data
ValueError: too many values to unpack (expected 2)

Test accuracies are very different

For the cifar10 dataset, some methods such as FedProto, pFedMe, and FedPer can achieve nearly 90%, but FedAvg, FedProx, and FedBN only achieve about 35%. Are they using different test datasets? For personalized methods you use local test data (with a similar data distribution), and for non-personalized methods you use global test data.

FedPer client

Hi, I recently finished reading FedPer and FedRep. When reading the code, I found that FedPer's client uploads the whole model, not just the base model, which differs from FedRep's code. Is it designed this way?

FedAMP

Hi, I'd like to ask about your FedAMP implementation: what final accuracy can you reach on the MNIST data? My test accuracy after 200 epochs only reaches about 20%. I'd appreciate your advice on this, thanks.

Question about algorithm accuracy under the IID setting on the MNIST dataset

Hi, @TsingZ0
Thank you very much for your contribution to the open-source community. I used "python generate_mnist.py iid - -" to generate the IID MNIST dataset, and then used the statements in "auto_train.sh" to run the corresponding algorithms, but the accuracy of several algorithms seems problematic, as shown in the screenshots below.

[screenshots]

Thank you very much for your answer!
