
gfl's Introduction

GFL Framework



Galaxy Federated Learning Framework (GFL) is a decentralized federated learning framework based on blockchain. GFL builds a decentralized communication network on Ethereum and uses smart contracts to execute the key federated learning (FL) operations that require credibility.

Quick Start

1. System Envs Required

a) GFL supports only Python 3; make sure your Python version is 3.4 or later.

b) GFL is built on PyTorch, so torch>=1.4.0 and torchvision>=0.5.0 must be installed before using GFL. See the PyTorch installation tutorial.
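
The requirements above can be checked before installing GFL. The following is a minimal sketch, not part of GFL: the helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical, and the only facts taken from this README are the minimum versions (Python 3.4, torch 1.4.0, torchvision 0.5.0).

```python
import re
import sys


def parse_version(text):
    """Turn '1.4.0' or '1.13.1+cu117' into a 3-tuple of ints, e.g. (1, 13, 1)."""
    parts = []
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    # Pad short versions such as '1.4' so tuple comparison behaves as expected.
    parts = (parts + [0, 0, 0])[:3]
    return tuple(parts)


def meets_minimum(installed, required):
    """Return True if the installed version string is at least the required one."""
    return parse_version(installed) >= parse_version(required)


def check_environment():
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < (3, 4):
        problems.append("Python 3.4 or later is required")
    for module_name, minimum in (("torch", "1.4.0"), ("torchvision", "0.5.0")):
        try:
            module = __import__(module_name)
        except ImportError:
            problems.append("{} is not installed".format(module_name))
            continue
        if not meets_minimum(module.__version__, minimum):
            problems.append("{} {} is older than {}".format(
                module_name, module.__version__, minimum))
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

Running the script prints one line per unmet requirement and nothing when the environment looks suitable.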

2. Install

pip install gfl_p

3. Usage

GFL provides the following commands:

usage: GFL [-h] {init,app,attach} ...

optional arguments:
  -h, --help         show this help message and exit

actions:
  {init,app,attach}
    init             init gfl env
    run              startup gfl
    attach           connect to gfl node

Initialize a GFL node in the datadir directory.

python -m gfl_p init --home datadir

Start the GFL node (standalone mode by default). To open a console when the node starts, use the `--console` argument.

python -m gfl_p run --home datadir

Open a console to operate the GFL node. Any of the following three commands connects to the node started in the previous step.

python -m gfl attach                          # connects to http://localhost:9434 by default
python -m gfl attach -H 127.0.0.1 -P 9434
python -m gfl attach --home datadir
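
If `attach` cannot connect, it can help to first confirm that something is listening on the node's address. The sketch below is an assumption-laden helper, not a GFL API: `node_is_reachable` is a hypothetical name, and the defaults come from the `localhost:9434` address shown above. It checks only TCP reachability, not that the listener is actually a GFL node.

```python
import socket


def node_is_reachable(host="localhost", port=9434, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    print(node_is_reachable())
```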

GFL base design


The GFL framework is divided into two parts:

Job Generator

Used to create a job that can be executed in the GFL network. Developers use the interfaces provided by GFL to generate a job with its configuration parameters and distribute it to the network for training.

Run-Time Network

The running nodes together form GFL's decentralized training network, and each GFL node is also a node in the blockchain. These nodes continuously process the jobs awaiting training in the network according to user commands.

GFL core arch


  • Manager Layer

    • Start, stop, and report the status of a node
    • Provide the communication interface between nodes
    • Synchronize jobs
  • Scheduler Layer

    • Manage the execution process of each job
    • Synchronize parameter files among nodes
    • Schedule the execution order of multiple jobs on the node
  • FL Layer

    • Configure the running environment of the job
    • Perform training/aggregation tasks
    • Provide interfaces for user-defined actions
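
The aggregation task listed under the FL Layer is not specified further here. As an illustration only, the sketch below shows federated averaging (FedAvg), a common aggregation rule, in plain Python. It is not taken from GFL's code: the function name `federated_average`, the dict-of-lists parameter layout, and the sample counts are all hypothetical.

```python
def federated_average(client_params, sample_counts):
    """Average per-client parameters, weighting each client by its sample count.

    client_params: list of dicts mapping a parameter name to a list of floats.
    sample_counts: number of training samples held by each client, in the same order.
    """
    if not client_params or len(client_params) != len(sample_counts):
        raise ValueError("need one sample count per client")
    total = sum(sample_counts)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged = {}
    for name in client_params[0]:
        length = len(client_params[0][name])
        averaged[name] = [
            sum(params[name][i] * count
                for params, count in zip(client_params, sample_counts)) / total
            for i in range(length)
        ]
    return averaged


clients = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}]
print(federated_average(clients, sample_counts=[1, 3]))  # {'w': [2.5, 3.5]}
```

The second client holds three times as many samples, so the result sits three quarters of the way toward its parameters.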

gfl's People

Contributors

huyifan233, malanore-z, zhengqi6


gfl's Issues

setup.py error

When I run "python setup.py install" in the terminal, it returns an error:
error in 'egg_base' option: 'src' does not exist or is not a directory

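Error during a Cluster run: one extra argument is passed

Background: running on two machines, one as the Server and one as the Client.

The Client reports the following error:
[image]

I tried to check the number of arguments inside the function core.trainerTrainMPCDistillationStrategy, but execution stalled there and I got no further than print('a').
[image]

So I checked the number of arguments in the parent class and found no problem.

I then went back to core.trainer_controller.py and found that the DistillationStrategy strategy was being run. Forcing NormalStrategy to run instead made everything work.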

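Missing loss functions

GalaxyLearning is a nice lightweight framework.
I ran into some problems while using it.
The code defines these loss functions:
[image]
But in the source code, only these two are actually implemented:
[image]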

Missing files

Hello, how do I configure multiple clients to work at the same time?

How to use this framework?

Hello, I'm a student working on a course project on decentralized FL. I found this repo through your paper, but I don't know how to get started. Is there a quick guide or demo? Thank you.

QR Code

Can you send QR code again?

It expired.

Thank you

Is the RDFL code open source?

Thank you and your team for sharing! May I ask whether the RDFL code is open source? Thanks!

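Difficulty using the framework

Hello, I found your team's code framework through an introduction by a Zhejiang University team, but I am having some difficulty using it:
1. pip install gfl_p does not work.
2. After downloading the source code, I found that datadir (mentioned in the README) does not exist in it.
3. After watching the PFL tutorial on Bilibili, I found that the tutorial seems to differ considerably from the code your team has committed.
I hope you can reply and offer some support. Thank you!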

Paper

Which paper does this project correspond to?

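Distillation loss function

Hello, while reading the code I came across the knowledge distillation function, and there is one part I do not quite understand. Here a computation is performed on the model's output.

  1. For the L2 or MSE loss, the output here is 0.
  2. For the KL divergence, the output here has a concrete value (<0).

Is there some theoretical justification for handling it this way? Why not assign 0 directly? Is it to make it easier to add the distillation losses from other models later?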

GFL/gfl/core/trainer.py

Lines 404 to 408 in a49850b

if job_l2_dist:
    loss_distillation = self._compute_l2_dist(kl_pred, kl_pred)
else:
    loss_distillation = self._compute_loss(LossStrategy.KLDIV_LOSS, F.softmax(kl_pred, dim=1),
                                           F.softmax(kl_pred, dim=1))

I hope you can reply when you have time. In any case, thank you very much for open-sourcing the code!
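
One possible reading of the negative KL value, offered as an assumption rather than a confirmed explanation: if `_compute_loss` with `LossStrategy.KLDIV_LOSS` forwards its arguments to `torch.nn.functional.kl_div` (the excerpt does not show this), then the first argument is expected to hold log-probabilities, and the pointwise term is `target * (log(target) - input)`. Passing plain softmax probabilities for both arguments therefore gives a nonzero, negative sum even when the two distributions are identical. The sketch below reproduces that arithmetic in plain Python; `softmax`, `kl_div_sum`, and `l2_dist` are hypothetical stand-ins, not GFL functions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_div_sum(inp, target):
    """Sum of target * (log(target) - inp), where `inp` is meant to be log-probabilities."""
    return sum(t * (math.log(t) - x) for x, t in zip(inp, target) if t > 0)


def l2_dist(a, b):
    """Euclidean distance between two equal-length lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


logits = [2.0, 0.5, -1.0]
p = softmax(logits)
log_p = [math.log(x) for x in p]

print(l2_dist(logits, logits))  # 0.0: identical inputs have zero distance
print(kl_div_sum(log_p, p))     # 0.0: log-probabilities as input, identical distributions
print(kl_div_sum(p, p))         # negative: probabilities passed where log-probabilities are expected
```

Under this assumption, the last call equals the sum of `p * (log(p) - p)`, which is always below zero because `log(p)` is negative and `p` is positive.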

QR Code

Could you please update the QR code? Thank you!
