
pydynet's Introduction

PyDyNet: Neural Network (DNN, CNN, RNN, etc.) Implementation Using NumPy, Based on Autodiff

Predecessor: PyNet: Use NumPy to build neural network. There we implemented a fully connected network based on hand-written derivative rules. Here we follow the lead of today's deep learning frameworks and build a DL framework of our own.

PyDyNet has been shared by several tech blogs and communities under the headline “居然用Numpy实现了一个深度学习框架” (roughly, "A deep learning framework implemented in NumPy, of all things").


Changelog

  • 5.10: ver 0.0.1 Reworked how loss functions are defined, adding a reduction mechanism; added Embedding;
  • 5.15: ver 0.0.2 Refactored RNN, LSTM and GRU, with bidirectional support;
  • 5.16: ver 0.0.2 PyDyNet can now be installed as a third-party library; started writing the manual (based on Sphinx);
  • 5.29: ver 0.0.3 Added Dataset and DataLoader, so datasets can be defined and split as in PyTorch; see the train_loader function in data.py;
  • 5.30: ver 0.0.3 Downgraded the 1-D convolution to loop-based im2col, since recent NumPy versions no longer seem to tolerate our strided-array tricks;
  • 7.22: ver 0.0.4/0.0.5 Added the Module and Parameter classes, reorganized the modules, and added several initialization schemes supported by PyTorch; a new manual is in progress;
  • 7.28: ver 0.0.6 Added no_grad, which disables automatic differentiation as in PyTorch, e.g. @no_grad() or with no_grad(); see autograd.py;
  • 8.09: ver 0.0.7 Based on CuPy, PyDyNet can now train on the GPU, with the same usage as PyTorch; see the cu*.py scripts in tests;
  • 8.18: ver 0.0.8 Added learning-rate scheduling, so the learning rate adjusts automatically during training;
  • 10.21: ver 0.0.9 Added the tensor split method and used it to improve the RNN;
  • 10.23: ver 0.0.10 Rewrote RNN, LSTM and GRU, supporting multiple layers and bidirectionality;
  • ...

Overview

PyDyNet is likewise a neural network framework implemented in pure NumPy (CuPy, whose usage matches NumPy's, was added in version 0.0.7), with syntax inspired by PyTorch. The rough structure:

graph BT
   N(numpy.ndarray/cupy.ndarray) ----> ds(Dataset) ----> Data(DataLoader) --> Mission
   N --> A(Tensor) --Eager execution--> B(Basic operators: add, exp, etc)
   B -.Autograd.-> A
   B --> CO(Complex operators: softmax, etc) --> f(Function: linear, conv2d, etc) --> M(Basic Module: Linear, Conv2d, etc) --> CM(Advanced Module: CNN, RNN, etc) --> Mission(PyDyNet)
   N --> GD(Optimizer: SGD, Adam, etc) ----> LS(lr_scheduler: StepLR, etc) --> Mission

The dashed line indicates that automatic differentiation can be switched off by the user via no_grad. A more complete architecture diagram (thanks to duma-repo):

[figure: overall architecture diagram]
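For example, gradient tracking can be switched off with either the decorator or the context-manager form mentioned in the changelog (a minimal sketch; that no_grad is importable from the package root is an assumption, the changelog only places it in autograd.py):

import pydynet as pdn
from pydynet import Tensor

x = Tensor([1., 2., 3.], requires_grad=True)

with pdn.no_grad():    # context-manager form: no graph is recorded here
    y = pdn.exp(x) + x

@pdn.no_grad()         # decorator form: the whole function runs without autograd
def evaluate(t):
    return pdn.exp(t) + t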

We have implemented:

  1. Wrapping NumPy arrays into tensors (Tensor) that carry gradients and related bookkeeping:

    Example

    from pydynet import Tensor
    
    x = Tensor(1., requires_grad=True)
    print(x.data) # 1.
    print(x.ndim, x.shape, x.is_leaf) # 0, (), True

  2. Abstracting computations on NumPy arrays (arithmetic, slicing, reshaping, and so on) into basic operators, with operator overloading for some of them:

    Example

    import pydynet as pdn
    from pydynet import Tensor
    
    x = Tensor([1, 2, 3])
    y = pdn.exp(x) + x
    z = pdn.sum(y)
    print(z.data) # 36.192...

  3. Hand-writing the gradients of the basic operators, reproducing PyTorch's dynamic-graph automatic differentiation mechanism (Autograd) and hence backpropagation:

    Example

    import pydynet as pdn
    from pydynet import Tensor
    
    x = Tensor([1., 2., 3.], requires_grad=True)
    y = pdn.log(x) + x
    z = pdn.sum(y)
    
    z.backward()
    print(x.grad) # [2., 1.5, 1.33333333]

  4. Building higher-level operators (Complex operators) on top of the basic ones; their derivatives no longer need to be written by hand:

    Example

    import pydynet as pdn
    
    def simple_sigmoid(x: pdn.Tensor):
        return 1 / (1 + pdn.exp(-x))

  5. A Module system, including activation functions, loss functions, and so on, so that networks and loss terms can be defined as follows:

    Example

    import pydynet.nn as nn
    import pydynet.nn.functional as F
    
    n_input = 64
    n_hidden = 128
    n_output = 10
    
    class Net(nn.Module):
        def __init__(self) -> None:
            super().__init__()
            self.fc1 = nn.Linear(n_input, n_hidden)
            self.fc2 = nn.Linear(n_hidden, n_output)
    
        def forward(self, x):
            x = self.fc1(x)
            x = F.sigmoid(x)
            return self.fc2(x)
    
    net = Net()
    loss = nn.CrossEntropyLoss()
    l = loss(net(X), y)
    l.backward()

  6. A variety of optimizers and learning-rate decay schedules to drive training; as in PyTorch, the optimizers support weight decay, i.e. regularization:

    Example

    from pydynet.optim import Adam, StepLR
    
    ...
    net = Net()
    optimizer = Adam(net.parameters(), lr=0.01)
    lr_scheduler = StepLR(optimizer, step_size=10)
    
    for epoch in range(EPOCHES):
        for data in data_loader:
            train(...)
            optimizer.step()
        lr_scheduler.step()

  7. Dataset and DataLoader for loading and splitting datasets:

    Example

    from pydynet.data import Dataset, DataLoader
    
    class TrainSet(Dataset):
        def __init__(self, X, y) -> None:
            self.data = X
            self.target = y
    
        def __getitem__(self, index):
            return self.data[index], self.target[index]
    
        def __len__(self):
            return len(self.data)
    
    data_loader = DataLoader(TrainSet(X, y), batch_size, shuffle)

  8. Dropout and Batch Normalization, with the network split into a training phase and an evaluation phase:
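    For example (net.train() and net.eval() appear in the test scripts; the module name nn.Dropout is an assumption, mirroring PyTorch):

    Example

    import pydynet.nn as nn
    import pydynet.nn.functional as F

    class DropNet(nn.Module):
        def __init__(self) -> None:
            super().__init__()
            self.fc1 = nn.Linear(64, 64)
            self.drop = nn.Dropout(0.05)  # assumed PyTorch-style module name
            self.fc2 = nn.Linear(64, 10)

        def forward(self, x):
            return self.fc2(F.sigmoid(self.drop(self.fc1(x))))

    net = DropNet()
    net.train()  # dropout is active while training
    # ... training loop ...
    net.eval()   # dropout is disabled for evaluation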

  9. Efficient im2col-based Conv1d, Conv2d, max_pool1d and max_pool2d, enabling CNNs:
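    To illustrate the im2col idea (a plain-NumPy sketch of the technique, not PyDyNet's actual code): unfold every receptive field into a row, and convolution becomes a single matrix multiplication.

    Example

    import numpy as np

    def im2col2d(x, kh, kw, stride=1):
        # x: (N, C, H, W) -> patches of shape (N, out_h, out_w, C*kh*kw)
        win = np.lib.stride_tricks.sliding_window_view(x, (kh, kw), axis=(2, 3))
        win = win[:, :, ::stride, ::stride]  # (N, C, out_h, out_w, kh, kw)
        return win.transpose(0, 2, 3, 1, 4, 5).reshape(
            x.shape[0], win.shape[2], win.shape[3], -1)

    x = np.random.randn(2, 3, 8, 8)   # batch of 2, 3 channels, 8x8 images
    w = np.random.randn(4, 3, 3, 3)   # 4 output channels, 3x3 kernels
    cols = im2col2d(x, 3, 3)          # (2, 6, 6, 27)
    out = cols @ w.reshape(4, -1).T   # one matmul performs the convolution
    out = out.transpose(0, 3, 1, 2)   # (2, 4, 6, 6), i.e. (N, C_out, H', W')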

  10. Multi-layer bidirectional RNN, LSTM and GRU:
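    A sketch of what using it might look like (the constructor and keyword names below are assumptions borrowed from PyTorch, not PyDyNet's confirmed signature):

    Example

    import numpy as np
    import pydynet.nn as nn
    from pydynet import Tensor

    # hypothetical PyTorch-style arguments: (input_size, hidden_size, ...)
    lstm = nn.LSTM(8, 16, num_layers=2, bidirectional=True)
    x = Tensor(np.random.randn(5, 3, 8))  # (seq_len, batch, input); layout assumed
    output, state = lstm(x)               # return convention assumed as well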

  11. Multiple initialization schemes, including Kaiming and Xavier:
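    PyDyNet's own init API is not shown here; for reference, these are the standard formulas behind the two schemes, in plain NumPy:

    Example

    import numpy as np

    fan_in, fan_out = 64, 128

    # Xavier (Glorot) uniform: U(-a, a) with a = sqrt(6 / (fan_in + fan_out))
    a = np.sqrt(6 / (fan_in + fan_out))
    w_xavier = np.random.uniform(-a, a, size=(fan_in, fan_out))

    # Kaiming (He) normal for ReLU: zero mean, std = sqrt(2 / fan_in)
    w_kaiming = np.random.randn(fan_in, fan_out) * np.sqrt(2 / fan_in)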

  12. GPU computation and training based on CuPy:

    Example

    from pydynet import Tensor
       
    x = Tensor([1., 2., 3.], device='cuda')
    y = Tensor([1., 2., 3.], device='cuda')
    z = (x * y).sum()
    
    w = Tensor([1., 2., 3.]) # a Tensor on the CPU
    x * w # raises an error: the operands live on different devices
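    Modules can be moved to the GPU the same way; this mirrors the usage in the cu*.py test scripts described below:

    Example

    ...
    net = Net().to('cuda')        # move all parameters to the GPU
    x = Tensor(X, device='cuda')  # inputs must live on the same device
    output = net(x)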

Install

pip install pydynet

Or install locally:

git clone https://github.com/Kaslanarian/PyDyNet
cd PyDyNet
python setup.py install

Once installed, you can run the examples below.

Example

The tests directory contains a few examples.

AutoDiff

autodiff.py uses automatic differentiation to run gradient descent on a convex function:

[figure: gradient descent trajectory on a convex function]
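The core loop is roughly the following (a minimal sketch, not the verbatim contents of autodiff.py; it uses only the Tensor API shown above):

import pydynet as pdn
from pydynet import Tensor

x = Tensor([2.0, -1.5], requires_grad=True)
lr = 0.1
for step in range(100):
    y = pdn.sum(x * x)  # convex objective f(x) = ||x||^2
    y.backward()
    # rebuild the leaf tensor after each step, using only the documented API
    x = Tensor(x.data - lr * x.grad, requires_grad=True)
print(x.data)  # approaches [0., 0.]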

DNN

DNN.py classifies the digits dataset provided by sklearn with a fully connected network. Training setup (a corresponding sketch follows the list):

  • Architecture: Linear(64->64) + Sigmoid + Linear(64->10);
  • Loss function: Cross Entropy Loss;
  • Optimizer: Adam(lr=0.01);
  • Epochs: 50;
  • Batch size: 32.
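Combined with the API shown above, this setup corresponds to a sketch like the following (not the verbatim contents of DNN.py):

import pydynet.nn as nn
import pydynet.nn.functional as F
from pydynet.optim import Adam

class MLP(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.fc1 = nn.Linear(64, 64)
        self.fc2 = nn.Linear(64, 10)

    def forward(self, x):
        return self.fc2(F.sigmoid(self.fc1(x)))

net = MLP()
optimizer = Adam(net.parameters(), lr=0.01)
loss = nn.CrossEntropyLoss()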

Training loss, training accuracy and test accuracy:

[figure: training loss, training accuracy and test accuracy]

CNN

CNN.py classifies the fetch_olivetti_faces face dataset (64×64) with three networks and compares their performance:

  1. Linear + Sigmoid + Linear;
  2. Conv1d + MaxPool1d + Linear + ReLU + Linear;
  3. Conv2d + MaxPool2d + Linear + ReLU + Linear.

All other settings are identical:

  • Loss function: Cross Entropy Loss;
  • Optimizer: Adam(lr=0.01);
  • Epochs: 50;
  • Batch size: 32.

Learning-curve comparison:

[figure: learning-curve comparison of the three networks]

Dropout & BN

dropout_BN.py classifies the fetch_olivetti_faces face dataset (64×64) with three networks and compares their performance:

  1. Linear + Sigmoid + Linear;
  2. Linear + Dropout(0.05) + Sigmoid + Linear;
  3. Linear + BN + Sigmoid + Linear.

All other settings are identical:

  • Loss function: Cross Entropy Loss;
  • Optimizer: Adam(lr=0.01);
  • Epochs: 50;
  • Batch size: 32.

Learning-curve comparison:

[figure: learning-curve comparison of the three networks]

RNN

RNN.py classifies sklearn's digit-image dataset with a single-layer bidirectional GRU:

[figure: RNN training results]

CUDA

cuDNN.py, cuCNN.py, cuDropoutBN.py and cuRNN.py are CUDA versions of the four networks above, with corresponding modifications. They mainly demonstrate how to use PyDyNet's GPU functionality, and they have been tested in environments both with and without a GPU.

Net     Dataset                    Parameters                  CPU time       GPU time
FC      Digits (1970×64)           batch_size=128, epoch=50    30.8s±392ms    22.4s±298ms
CNN1d   OlivettiFaces (400×4096)   batch_size=64, epoch=50     8.76s±68.7ms   4.49s±16.3ms
CNN2d   OlivettiFaces (400×4096)   batch_size=64, epoch=50     14.1s±285ms    4.54s±49ms

In fact, the larger the network (wider, deeper, more convolutional layers), the better the GPU speedup.

pydynet's People

Contributors: kaslanarian

pydynet's Issues

Some bugs with GPU acceleration

Hi, I've recently been studying how dynamic-graph mechanisms are implemented. When running your code on the GPU, I found that in the cuCNN example the accuracy does not change across iterations (the CPU version is fine); on inspection, the network parameters are in fact not being updated, probably because they were never moved to the GPU. After I added .to('cuda') to every layer, the parameters did get updated and the accuracy looked normal, but as training proceeds GPU memory keeps growing (without the change, memory usage stays constant). I don't know the exact cause; could you take a look?

import numpy as np
from pydynet.tensor import Tensor
import pydynet.nn.functional as F
import pydynet.nn as nn
from pydynet.optim import Adam, SGD
from pydynet.data import DataLoader, Dataset
from tqdm import tqdm


dev = ['cpu', 'cuda'][1]
np.random.seed(42)



from scipy.io import loadmat
data = loadmat('../mnist_uint8.mat')
train_x = np.reshape(data['train_x'], (60000, 1, 28, 28)) / 255.0
train_y = data['train_y']
test_x = np.reshape(data['test_x'], (10000, 1, 28, 28)) / 255.0
test_y = data['test_y']



class mnist_dataset(Dataset):
    def __init__(self, X, y) -> None:
        super().__init__()
        self.data = X
        self.label = y

    def __getitem__(self, index):
        return self.data[index], self.label[index]

    def __len__(self):
        return len(self.data)

train_loader = DataLoader(mnist_dataset(train_x, train_y), 32, True)
test_loader = DataLoader(mnist_dataset(test_x, test_y), 32, False)




class CNN2d(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        # adding .to to every layer makes the parameters update normally, but GPU memory blows up
        self.conv1 = nn.Conv2d(1, 1, 3, padding=1).to(dev)
        self.fc1 = nn.Linear(49, 128).to(dev)
        self.fc2 = nn.Linear(128, 10).to(dev)

    def forward(self, x):
        x = self.conv1(x)
        x = F.max_pool2d(x, 4, 4)
        x = x.reshape(x.shape[0], -1)
        x = self.fc1(x)
        x = F.leaky_relu(x, 0.1)
        return self.fc2(x)


net3 = CNN2d().to(dev)

optim3 = Adam(net3.parameters(), lr=0.01)
loss = nn.CrossEntropyLoss().to(dev)
EPOCHES = 50
BATCH_SIZE = 32


from time import time


t = time()
for epoch in range(EPOCHES):

    net3.train()
    train_out = []
    for batch_X, batch_y in tqdm(train_loader):
        batch_X, batch_y = Tensor(batch_X).to('cuda'), Tensor(batch_y).to('cuda')
        # print(data)
        output3 = net3(batch_X)
        l3 = loss(output3, batch_y)
        optim3.zero_grad()
        l3.backward()
        optim3.step()

        acc = np.argmax(output3.numpy(), axis=1) == np.argmax(batch_y.numpy(), axis=1)
        train_out.append(acc)
        # mp.free_all_blocks()
        # pmp.free_all_blocks()
    train_out = np.concatenate(train_out)
    train_out = np.mean(train_out)

    net3.eval()
    test_out = []
    # test_label
    for batch_X, batch_y in tqdm(test_loader):
        node_y = Tensor(batch_y).to(dev)

        data = Tensor(batch_X).to(dev)
        # print(data)
        output3 = net3(data)
        l3 = loss(output3, node_y)
        pred = list(output3.numpy())  # renamed from `t`, which shadowed the timer started above
        acc = np.argmax(pred, axis=1) == np.argmax(batch_y, axis=1)
        test_out.append(acc)
        # del data
    test_out = np.concatenate(test_out)
    test_out = np.mean(test_out)


    print("Epoch {:2d}:".format(epoch + 1))

    print('train acc: {}, test acc: {}'.format(train_out, test_out))
print(time() - t)

