
Fully-Connected Model from Scratch

Outline:

  • Goal
  • Testing Cases
  • Model (NN, weight init, activation func, loss func, bp)
  • Testing Results
  • Other Experiments

Goal

Classify input data by doing backpropagation and renewing model parameters. Models are built from scratch, without using deep learning framework such as Tensorflow and PyTorch.

Testing Cases

There are 2 test cases; each contains 2 classes.

1. Linear *(dataset plot)*

2. XOR *(dataset plot)*

Model

Neural Network

2 hidden layers & 1 output layer. *(network structure diagram)*

Weight Initialization

Drawn from: *(weight-initialization formula image)*
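The formula image is missing, so the exact distribution is unknown. As a purely illustrative placeholder, a standard-normal draw with the shapes implied by the 2-3-3-1 structure could look like:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical initialization: the repo's actual distribution appears only in
# the missing image, so standard normal N(0, 1) is used here for illustration.
layer_sizes = [2, 3, 3, 1]  # 2-D input, two hidden layers of 3 neurons, 1 output
weights = [rng.standard_normal((m, n)) for m, n in zip(layer_sizes, layer_sizes[1:])]
```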

Activation Function

Sigmoid: σ(x) = 1 / (1 + e^(−x))

Derivative of sigmoid: σ'(x) = σ(x)(1 − σ(x))
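The two formulas above translate directly to NumPy (a sketch; the function names are my own, not the repo's):

```python
import numpy as np

def sigmoid(x):
    # sigma(x) = 1 / (1 + e^(-x))
    return 1.0 / (1.0 + np.exp(-x))

def d_sigmoid(y):
    # Derivative written in terms of the sigmoid output y = sigma(x):
    # sigma'(x) = sigma(x) * (1 - sigma(x))
    return y * (1.0 - y)
```

Expressing the derivative in terms of the forward-pass output avoids recomputing the exponential during backpropagation.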

Loss Function

MSE loss: L = (1/N) Σᵢ (yᵢ − ŷᵢ)²
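The loss and the gradient that backpropagation starts from can be sketched as (function names are mine):

```python
import numpy as np

def mse_loss(y_pred, y_true):
    # L = (1/N) * sum((y_i - y_hat_i)^2)
    return float(np.mean((y_pred - y_true) ** 2))

def mse_grad(y_pred, y_true):
    # dL/dy_hat = (2/N) * (y_hat - y)
    return 2.0 * (y_pred - y_true) / y_true.size
```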

Backpropagation

With the sigmoid activation and MSE loss above, backpropagation follows the chain rule (for layer l with pre-activation z_l and activation a_l):

    δ_out = (2/N)(ŷ − y) ⊙ σ'(z_out)
    δ_l = (W_{l+1}ᵀ δ_{l+1}) ⊙ σ'(z_l)
    ∂L/∂W_l = a_{l−1}ᵀ δ_l
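Putting the pieces together, a minimal full-batch training loop for a 2-3-3-1 sigmoid network with MSE loss might look like the following. This is a sketch, not the repo's code: biases are omitted, and a toy linearly separable dataset stands in for the repo's data generator.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# Toy linearly separable data: label 1 when x1 > x2 (illustrative only).
X = rng.random((100, 2))
y = (X[:, 0] > X[:, 1]).astype(float).reshape(-1, 1)

# Weights for two hidden layers (3 neurons each) and one output neuron.
W1 = rng.standard_normal((2, 3))
W2 = rng.standard_normal((3, 3))
W3 = rng.standard_normal((3, 1))

lr, losses = 1.0, []
for epoch in range(2000):
    # Forward pass.
    a1 = sigmoid(X @ W1)
    a2 = sigmoid(a1 @ W2)
    y_hat = sigmoid(a2 @ W3)
    losses.append(float(np.mean((y_hat - y) ** 2)))

    # Backward pass: each delta multiplies by the sigmoid derivative a*(1-a).
    d3 = 2.0 * (y_hat - y) / len(X) * y_hat * (1.0 - y_hat)
    d2 = (d3 @ W3.T) * a2 * (1.0 - a2)
    d1 = (d2 @ W2.T) * a1 * (1.0 - a1)

    # Gradient-descent updates.
    W3 -= lr * a2.T @ d3
    W2 -= lr * a1.T @ d2
    W1 -= lr * X.T @ d1

accuracy = float(np.mean((y_hat > 0.5) == y))
```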

Testing Results

Linear Case

Settings:

  • First & second hidden layer neurons: 3 & 3
  • Learning rate: 1
  • Batch size: 1/10 of dataset size

Results:

  • Testing loss: 0.021464
  • Testing accuracy: 97%

Weights before training:
Weights after training:

XOR Case

Weights before training:
Weights after training:

Other Experiments

In this section, the linear dataset is used for the following experiments. Each experiment lists its settings below.

Different Learning Rates

Settings:

  • First & second hidden layer neurons: 3 & 3
  • Batch size ratio: 1 (full-batch)

The model learns faster with a higher learning rate, but training becomes unstable when the rate is too high: the blue curve drops early, then oscillates for a period of time.

Different Batch Size

Settings:

  • First & Second hidden layer neurons: 3 & 3
  • Learning rate: 0.1

Note: batch size = "batch size ratio" × total number of samples (e.g., a ratio of 0.1 on 100 samples gives a batch size of 10).

Training with a smaller batch size performs better after the same number of epochs, since a smaller batch size means more parameter updates per epoch. In the graph, models with lower batch size ratios drop earlier but oscillate more.

Number of Neurons in Hidden Layers

Settings:

  • First hidden layer neurons: varied (2, 3, 4, 5)
  • Second hidden layer neurons: 3
  • Batch size ratio: 1 (full-batch)

The loss of the 5x3 model drops fastest and that of the 2x3 model slowest. However, 3x3 drops faster than 4x3, so it is hard to conclude that the neuron count alone is the key factor.

From the next two graphs, more complex models appear to fit this linear dataset better.
