
anasbrital98 / cnn-from-scratch


In this repository you will find everything you need to know about Convolutional Neural Networks, and how to implement the most famous CNN architectures in both Keras and PyTorch. (I'm working on implementing these architectures using MXNet and Caffe.)

deep-learning convolutional-neural-networks cnn keras pytorch python lenet-5 inception-v1 inception-v2 inception-v3


Convolutional Neural Network From Scratch

This repository contains the explanation and the implementation of Convolutional Neural Networks using Keras and PyTorch.

In this repository you'll see :

  • Introduction to CNN .

  • Convolutional Neural Network vs Multilayer Perceptron .

  • Convolutional Neural Network Layers .

    • Kernels or Filters .

    • Convolutional layer .

    • Activation Layer .

    • Pooling Layer .

    • Fully Connected Layer .

  • Different Layers in Keras and PyTorch .

  • Most Common Architectures of CNN and their Implementation .

  • References .


Introduction :

The Convolutional Neural Network (CNN) is a deep learning algorithm developed from the Multilayer Perceptron (MLP) and designed to process data in the form of a matrix (images, sound, ...).

Convolutional Neural Networks are used in many fields, but we will just be interested in the application of CNNs to Images.

The question now is, what is an Image?

An image is just a matrix of pixels.

Coding Modes of an Image:
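
As a rough illustration of the "matrix of pixels" idea, here is a minimal sketch in plain Python. It assumes the two coding modes meant here are grayscale (one value per pixel) and RGB (three values per pixel); the pixel values and the `shape` helper are made up for the example.

```python
# A tiny 2x3 grayscale image: one intensity value (0-255) per pixel.
gray = [
    [0, 128, 255],
    [64, 192, 32],
]

# A 2x3 RGB image: each pixel is a [red, green, blue] triple.
rgb = [
    [[255, 0, 0], [0, 255, 0], [0, 0, 255]],
    [[0, 0, 0], [128, 128, 128], [255, 255, 255]],
]


def shape(image):
    """Return the nested-list dimensions: (height, width) or (height, width, channels)."""
    dims = []
    while isinstance(image, list):
        dims.append(len(image))
        image = image[0]
    return tuple(dims)


print(shape(gray))  # (2, 3)
print(shape(rgb))   # (2, 3, 3)
```

The grayscale image is a plain matrix, while the RGB image is a matrix whose entries are themselves 3-element vectors, one per color channel.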


Convolutional Neural Network vs Multilayer Perceptron :

Imagine we have an image classification problem to solve and our only choice is a Multilayer Perceptron (a neural network). The images are 240 pixels high and 240 pixels wide, and we are using RGB.

That means we need to build a neural network with 240 * 240 * 3 = 172 800 inputs, which is a very big network, and it would be very hard for us to train.
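
To get a feel for the numbers, the arithmetic can be checked in a few lines. The hidden layer of 1000 neurons below is a hypothetical size chosen only for illustration; it does not come from this repository.

```python
height, width, channels = 240, 240, 3

# Every pixel value of every channel becomes one input of the MLP.
n_inputs = height * width * channels

# Hypothetical first hidden layer of 1000 neurons (assumed size, for illustration only).
hidden_units = 1000
# One weight per (input, neuron) pair, plus one bias per neuron.
dense_params = n_inputs * hidden_units + hidden_units

print(n_inputs)      # 172800
print(dense_params)  # 172801000
```

Under that assumption, the first layer alone would already hold more than 172 million trainable parameters.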

Can we find a solution that reduces the size of the images while preserving their characteristics?

This is exactly what a CNN can do.

In General :

CNN = Convolutional Layers + Activation Layers + Pooling Layers + Fully Connected Layers .


Convolutional Neural Network Layers :

Kernels or Filters in The Convolutional layer :

In a convolutional neural network, a kernel is nothing more than a filter used to extract features from images. The kernel is a small matrix that slides over the input data, computes the dot product with each subregion of the input, and collects those dot products into an output matrix. The stride value determines how far the kernel moves at each step.

There are many kernels, each one responsible for extracting a specific feature.
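
The sliding dot product described above can be sketched in plain Python. This is only an illustrative implementation, not the code used by Keras or PyTorch; the function name `apply_kernel`, the sample image, and the `vertical_edge` kernel are assumptions made for the example. Like most deep learning libraries, it does not flip the kernel.

```python
def apply_kernel(image, kernel, stride=1):
    """Slide `kernel` over `image` (both 2D lists) and return the feature map."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = (len(image) - k_h) // stride + 1
    out_w = (len(image[0]) - k_w) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            top, left = i * stride, j * stride
            # Dot product between the kernel and the image patch under it.
            total = sum(
                image[top + a][left + b] * kernel[a][b]
                for a in range(k_h)
                for b in range(k_w)
            )
            row.append(total)
        output.append(row)
    return output


# Illustrative 4x4 image: dark left half, bright right half.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
# Illustrative 2x2 kernel that responds to a left-to-right jump in intensity.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = apply_kernel(image, vertical_edge, stride=1)
print(feature_map)  # [[0, 18, 0], [0, 18, 0], [0, 18, 0]]
```

The feature map is non-zero only in the middle column, which is where the dark-to-bright edge sits: this kernel has extracted one specific feature.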

Convolutional Layers :

The convolutional layer extracts the characteristics of the image by performing this operation on the input image :

The convolutional layer produces an output image whose size follows the formula given with the examples below.

The convolutional layer needs two parameters to work :

  • Padding : the number of pixels added around an image when it is processed by the kernel of a CNN.
  • Stride : the number of pixels the kernel shifts over the input matrix at each step.
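
A small sketch may make these two parameters more concrete. It assumes the added pixels are zeros (the usual default, but still an assumption here), and the helper names `zero_pad` and `window_starts` are invented for this example.

```python
def zero_pad(image, padding):
    """Surround a 2D image (list of lists) with `padding` rows and columns of zeros."""
    width = len(image[0]) + 2 * padding
    top = [[0] * width for _ in range(padding)]
    middle = [[0] * padding + list(row) + [0] * padding for row in image]
    bottom = [[0] * width for _ in range(padding)]
    return top + middle + bottom


def window_starts(size, kernel_size, stride):
    """Positions along one axis where the kernel's top-left corner is placed."""
    return list(range(0, size - kernel_size + 1, stride))


padded = zero_pad([[1, 2], [3, 4]], padding=1)
for row in padded:
    print(row)
# [0, 0, 0, 0]
# [0, 1, 2, 0]
# [0, 3, 4, 0]
# [0, 0, 0, 0]

print(window_starts(6, 2, stride=1))  # [0, 1, 2, 3, 4]
print(window_starts(6, 2, stride=2))  # [0, 2, 4]
```

Padding makes the input larger before the kernel is applied, while a larger stride makes the kernel visit fewer positions, so the output gets smaller.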

Example 1 : Stride = 1 , Padding = 0 :

If we apply our formula (in the picture above) we get the same result.

output width = (input_width - kernel_width + 2 * padding) / stride_width + 1

output height = (input_height - kernel_height + 2 * padding) / stride_height + 1

input Image : 6*6
Kernel Size : 2*2

output width = (6 - 2 + 2 * 0) / 1 + 1 = 5
output height = (6 - 2 + 2 * 0) / 1 + 1 = 5

Example 2 : Stride = 2 , Padding = 0 :

input Image : 6*6
Kernel Size : 2*2

output width = (6 - 2 + 2 * 0) / 2 + 1 = 3
output height = (6 - 2 + 2 * 0) / 2 + 1 = 3

Example 3 : Stride = 2 , Padding = 1 :

input Image : 6*6
Kernel Size : 2*2

output width = (6 - 2 + 2 * 1) / 2 + 1 = 4
output height = (6 - 2 + 2 * 1) / 2 + 1 = 4
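
The three examples can be checked with a few lines of Python. The function name `conv_output_size` is just a label chosen here; integer (floor) division is used on the assumption that positions where the kernel would stick out past the border are skipped.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Output size along one axis: (input - kernel + 2 * padding) / stride + 1."""
    return (input_size - kernel_size + 2 * padding) // stride + 1


# Example 1: stride = 1, padding = 0
print(conv_output_size(6, 2, stride=1, padding=0))  # 5
# Example 2: stride = 2, padding = 0
print(conv_output_size(6, 2, stride=2, padding=0))  # 3
# Example 3: stride = 2, padding = 1
print(conv_output_size(6, 2, stride=2, padding=1))  # 4
```

Since the image and kernel are square in these examples, the same call gives both the output width and the output height.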

In all the examples above we were talking about 2D convolution. Now let's see the general case, which is convolution over a 3D input volume :

Input Image : W1×H1×D1 .
Number of filters : K (with size F×F).
Stride : S .
Padding : P .
Output :
    W2 = (W1−F+2P)/S+1 .
    H2 = (H1−F+2P)/S+1 .
    D2 = K .
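
As a quick sanity check of the general case, the formulas can be wrapped in a small helper. The function names and the sample values (16 filters of size 3×3 on a 240×240×3 image) are assumptions for illustration. The parameter count is not stated above; it uses the standard count of F×F×D1 weights plus one bias per filter.

```python
def conv_layer_output_shape(w1, h1, d1, k, f, s=1, p=0):
    """Return (W2, H2, D2) for K filters of size F x F, stride S, padding P."""
    w2 = (w1 - f + 2 * p) // s + 1
    h2 = (h1 - f + 2 * p) // s + 1
    d2 = k  # one output channel per filter, whatever the input depth d1 is
    return w2, h2, d2


def conv_layer_param_count(d1, k, f):
    """Each filter spans the full input depth: F * F * D1 weights plus 1 bias."""
    return (f * f * d1 + 1) * k


# Hypothetical layer: 16 filters of size 3x3, stride 1, padding 1, on a 240x240 RGB image.
print(conv_layer_output_shape(240, 240, 3, k=16, f=3, s=1, p=1))  # (240, 240, 16)
print(conv_layer_param_count(d1=3, k=16, f=3))                    # 448
```

Note that the input depth D1 affects the number of weights but not the output shape: the output depth depends only on the number of filters K.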


Activation Function in The Convolutional layer :

The activation function most commonly used in CNNs is ReLU, and it is defined as follows:

ReLU(z) = max(0, z)
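
A minimal sketch of this definition, applied element by element to a small feature map with made-up values, could look like this:

```python
def relu(z):
    """ReLU(z) = max(0, z): negative values become 0, others pass through."""
    return max(0, z)


def relu_map(feature_map):
    """Apply ReLU to every value of a 2D feature map (list of lists)."""
    return [[relu(value) for value in row] for row in feature_map]


print(relu_map([[-3, 5], [0, -1]]))  # [[0, 5], [0, 0]]
```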

Pooling Layer :

The pooling layer reduces the size of the image. There are two types of pooling :

  • Max Pooling .
  • AVG Pooling .

The output of the pooling layer can be calculated using this formula :

Max Pooling :

AVG Pooling :
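
Both pooling types can be sketched with one small function. This is an illustrative plain-Python version; the name `pool2d` and the sample image are made up. The output size used below, (input − window) / stride + 1, is the usual convention and is assumed here, since the original formula is not reproduced.

```python
def pool2d(image, size=2, stride=2, mode="max"):
    """Max or average pooling over a 2D image (list of lists), without padding."""
    out_h = (len(image) - size) // stride + 1
    out_w = (len(image[0]) - size) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Collect the values inside the current pooling window.
            window = [
                image[i * stride + a][j * stride + b]
                for a in range(size)
                for b in range(size)
            ]
            if mode == "max":
                row.append(max(window))
            else:
                row.append(sum(window) / len(window))
        output.append(row)
    return output


image = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [4, 2, 0, 1],
    [3, 1, 2, 5],
]

print(pool2d(image, mode="max"))  # [[7, 8], [4, 5]]
print(pool2d(image, mode="avg"))  # [[4.0, 5.0], [2.5, 2.0]]
```

With a 2×2 window and a stride of 2, the 4×4 image is reduced to 2×2: max pooling keeps the largest value of each window, and average pooling keeps the mean.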


Fully Connected Layer :

A fully connected layer can be seen as one layer of a simple neural network: every input is connected to every output neuron.
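
As a rough sketch, a fully connected layer flattens its input and computes one weighted sum plus bias per output neuron. The helper names, weights, and biases below are hypothetical values chosen for illustration.

```python
def flatten(values):
    """Flatten nested lists of any depth into a single flat list."""
    flat = []
    for item in values:
        if isinstance(item, list):
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat


def dense(inputs, weights, biases):
    """One fully connected layer: output[i] = sum(weights[i][j] * inputs[j]) + biases[i]."""
    return [
        sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        for neuron_weights, bias in zip(weights, biases)
    ]


# A 2x2 feature map is flattened into 4 inputs, then mapped to 2 outputs.
x = flatten([[1, 2], [3, 4]])
weights = [
    [1, 0, 0, 1],          # neuron 0 (hypothetical weights)
    [0.5, 0.5, 0.5, 0.5],  # neuron 1 (hypothetical weights)
]
biases = [0, 1]

print(x)                          # [1, 2, 3, 4]
print(dense(x, weights, biases))  # [5, 6.0]
```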


Different Layers in Keras and PyTorch :

Keras :

Keras is an open-source software library that provides a Python interface for artificial neural networks. Keras acts as an interface for the TensorFlow library.

  • Convolution Layer :
tf.keras.layers.Conv2D(
    filters,
    kernel_size,
    strides=(1, 1),
    padding="valid",
    data_format=None,
    dilation_rate=(1, 1),
    groups=1,
    activation=None,
    use_bias=True,
    kernel_initializer="glorot_uniform",
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    **kwargs
)
  • Activation Layer :
tf.keras.activations.relu(x, alpha=0.0, max_value=None, threshold=0)
  • Pooling Layer :

    • Max-Pooling :
    tf.keras.layers.MaxPooling2D(
    pool_size=(2, 2), strides=None, padding="valid", data_format=None, **kwargs
    )
    • Avg-Pooling :
    tf.keras.layers.AveragePooling2D(
    pool_size=(2, 2), strides=None, padding="valid", data_format=None, **kwargs
    )
  • Dropout Layer :

tf.keras.layers.Dropout(rate, noise_shape=None, seed=None, **kwargs)
  • Dense Layer or Fully Connected Layer :
tf.keras.layers.Dense(
    units,
    activation=None,
    use_bias=True,
    kernel_initializer="glorot_uniform",
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    **kwargs
)
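
To show how these Keras layers might fit together, here is a minimal sketch of a small model. The architecture, the 28×28 grayscale input shape, and the 10 output classes are assumptions chosen for illustration, not one of the architectures implemented in this repository. `Flatten` is an extra Keras layer, not listed above, that reshapes the feature maps into a vector before the dense layer.

```python
import tensorflow as tf

# Hypothetical toy model: 28x28 grayscale images, 10 output classes.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),
    # 6 filters of size 5x5, stride 1, no padding: 28x28x1 -> 24x24x6.
    tf.keras.layers.Conv2D(filters=6, kernel_size=5, activation="relu"),
    # 2x2 max pooling: 24x24x6 -> 12x12x6.
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
    # Flatten the feature maps into a vector of 12 * 12 * 6 = 864 values.
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(10, activation="softmax"),
])

model.summary()
```

The shapes in the comments follow the output-size formula from the convolutional layer section.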

PyTorch :

PyTorch is an open source machine learning library based on the Torch library, used for applications such as computer vision and natural language processing, primarily developed by Facebook's AI Research lab. It is free and open-source software released under the Modified BSD license.

  • Convolution Layer :
torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', device=None, dtype=None)
  • Activation Layer :
torch.nn.ReLU(inplace=False)
  • Pooling Layer :

    • Max-Pooling :
    torch.nn.MaxPool2d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)
    • Avg-Pooling :
    torch.nn.AvgPool2d(kernel_size, stride=None, padding=0, ceil_mode=False, count_include_pad=True, divisor_override=None)
  • Dropout Layer :

torch.nn.Dropout(p=0.5, inplace=False)
  • Dense Layer or Fully Connected Layer :
torch.nn.Linear(in_features, out_features, bias=True, device=None, dtype=None)
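
The same kind of toy model can be sketched with the PyTorch layers listed above. Again, the architecture, the 28×28 single-channel input, and the 10 outputs are assumptions for illustration only. `nn.Flatten` is an extra PyTorch module, not listed above, used to reshape the feature maps before the linear layer.

```python
import torch
from torch import nn

# Hypothetical toy model: 1x28x28 inputs, 10 output scores.
model = nn.Sequential(
    # 6 filters of size 5x5, stride 1, no padding: 1x28x28 -> 6x24x24.
    nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5),
    nn.ReLU(),
    # 2x2 max pooling: 6x24x24 -> 6x12x12.
    nn.MaxPool2d(kernel_size=2),
    # Flatten each sample into a vector of 6 * 12 * 12 = 864 values.
    nn.Flatten(),
    nn.Dropout(p=0.5),
    nn.Linear(in_features=6 * 12 * 12, out_features=10),
)

# A dummy batch of 4 images, in PyTorch's (batch, channels, height, width) layout.
batch = torch.zeros(4, 1, 28, 28)
logits = model(batch)
print(logits.shape)  # torch.Size([4, 10])
```

Unlike Keras, PyTorch expects the channel dimension before the height and width, and `nn.Linear` needs the flattened input size to be given explicitly.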

Most Common Architectures of CNN and their Implementation :


References :
