This repository contains PyTorch projects.
Project 1: Image Classification Using a CNN
Dataset: CIFAR-10
Model: Convolutional Neural Network (CNN)
CNNs are a popular choice for image classification because they automatically learn spatial hierarchies of features, such as:
- Edges
- Textures
- Shapes
How does a CNN work?
- The first hidden layer after the input layer in a CNN is usually a convolutional layer.
- This layer applies filters to the input data to detect specific patterns such as edges, textures, or shapes.
- A filter (also called a kernel) is a small weight matrix, often 3x3 or 5x5 in size.
- Each filter is sensitive to a particular feature of the input data.
- The filter weights are learned during training, so each filter comes to detect a feature that is useful for the training data.
- The filter slides across the width and height of the input one step (stride) at a time. At each position it covers a small patch of the input and computes the sum of the element-wise products between its weights and that patch.
- The resulting values form a matrix called a feature map, which highlights where and how strongly the filter's feature (an edge, texture, or shape) appears in the input.
- The feature maps are then passed through a non-linear activation function.
- The non-linearity lets the network model more complex patterns than a stack of purely linear operations could.
- ReLU and its variants are common activation functions in CNNs.
- After the first hidden layer, the network can stack additional convolutional, pooling, and fully connected layers.
- Pooling layers typically follow a convolutional layer and its activation.
- They reduce the spatial dimensions (width and height) of the feature maps, which lowers the computation in later layers and the number of parameters in the fully connected layers that follow, making the network more efficient.
- The fully connected (dense) layers come after the convolutional and pooling layers.
- Each neuron in a fully connected layer is connected to every neuron in the previous layer, which lets it learn global patterns in the input data.
- The fully connected layers combine this global representation to perform the required task, such as classifying images into predefined classes.
- The output layer (or softmax layer) is a fully connected layer that uses the softmax activation function to compute the probability that an input belongs to each of the predefined classes (10 for CIFAR-10).
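
To make the steps above concrete, here is a minimal pure-Python sketch of one pass through convolution, ReLU, max pooling, a dense layer, and softmax. It is an illustration only, not the code used in this repository: the function names, the 6x6 toy image, the hand-written vertical-edge kernel, and the dense weights are all assumptions chosen for readability. In a real CNN the kernel and dense weights are learned during training, and in PyTorch these operations would normally come from `torch.nn.Conv2d`, `torch.nn.ReLU`, `torch.nn.MaxPool2d`, and `torch.nn.Linear`.

```python
import math


def conv2d(image, kernel, stride=1):
    """Slide a square kernel over a 2D image (no padding) and return the feature map."""
    k = len(kernel)
    out_h = (len(image) - k) // stride + 1
    out_w = (len(image[0]) - k) // stride + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Sum of element-wise products between the kernel and the covered patch.
            total = 0.0
            for di in range(k):
                for dj in range(k):
                    total += image[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(total)
        feature_map.append(row)
    return feature_map


def relu(feature_map):
    """Replace negative activations with zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[i + di][j + dj]
                      for di in range(size) for dj in range(size)]
            row.append(max(window))
        pooled.append(row)
    return pooled


def dense(inputs, weights, biases):
    """Fully connected layer: every output sees every input."""
    return [sum(w * x for w, x in zip(ws, inputs)) + b
            for ws, b in zip(weights, biases)]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy 6x6 image: dark left half (0), bright right half (1), so it has one vertical edge.
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]

# Hand-written vertical-edge kernel (assumed for illustration; normally learned).
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

feature_map = conv2d(image, kernel)   # 4x4; each row is [0.0, 3.0, 3.0, 0.0]
activated = relu(feature_map)         # unchanged here, since no value is negative
pooled = max_pool2d(activated)        # 2x2; halves width and height
flat = [v for row in pooled for v in row]

# Made-up weights for a two-class output layer (assumed; normally learned).
weights = [[0.1, 0.1, 0.1, 0.1],
           [-0.1, -0.1, -0.1, -0.1]]
biases = [0.0, 0.0]
probs = softmax(dense(flat, weights, biases))
print(feature_map[0], pooled, probs)
```

The feature map is non-zero only in the columns where the kernel straddles the dark-to-bright boundary, which is what "detecting an edge" means here. Pooling shrinks the 4x4 map to 2x2, so the dense layer needs 4 weights per output instead of 16.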