VOC-Semantic-Segmentation

In this project, we experimented with different models for semantic segmentation on the VOC2012 dataset. We started with a naive auto-encoder architecture and moved up from there to a pretrained backbone. We also compared these models across different loss functions.

To see our models, go to the models directory. It contains the model code, all the models we have trained, and their training logs.

To train the models, see the train.ipynb notebook.

To look at our visualizations, see the results.ipynb notebook.

A brief description of the train function:

def train(model_name, optimizer=None, start_epoch=0, criterionType="ce", weighted=False, ignore=False, augumented=False, num_epochs=5, batch_size=64, learning_rate=1e-3, weight_decay=1e-5):

model_name: Name of the model to train. Allowed values are: ["naive", "skip", "fcn", "fcn8", "fcn_2", "fcn_resnet_bn", "fcn_resnet_bn_skp"]
optimizer: Currently unused
start_epoch: Currently unused
criterionType: The loss function to use. Supported options are: ['ce', 'dice', 'focal', 'lovasz'] (default "ce")
weighted: For cross-entropy loss, whether to weight each class by 1 - label_frequency (default False)
ignore: For cross-entropy loss, whether to exclude label 0 from the loss calculation (default False)
augumented: Whether to augment the training data with a center crop and scale, applied with 50% probability (default False)
num_epochs: Number of epochs (default 5)
batch_size: Batch size (default 64)
learning_rate: Learning rate (default 1e-3)
weight_decay: Weight decay (default 1e-5)
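
The weighted option above refers to class weights of 1 - label_frequency. As a rough illustration only, and not the code used in train.ipynb, the weights could be computed as in the sketch below. The function name class_weights and the nested-list mask format are assumptions made for this example. Pixels whose label falls outside 0..num_classes-1 (for example the 255 border label in VOC masks) are left out of the frequency count here, which is also an assumption.

```python
from collections import Counter


def class_weights(label_maps, num_classes):
    """Return one weight per class: 1 - (pixel frequency of that class).

    label_maps is an iterable of 2-D masks given as nested lists of ints.
    This helper is hypothetical and only illustrates the weighting rule.
    """
    counts = Counter()
    for label_map in label_maps:
        for row in label_map:
            counts.update(row)  # count every pixel label in this row

    # Only labels in 0..num_classes-1 contribute to the total.
    total = sum(counts[c] for c in range(num_classes))
    if total == 0:
        raise ValueError("no labelled pixels found")

    return [1.0 - counts[c] / total for c in range(num_classes)]


# Two tiny 2x2 masks with three classes (0 = background).
masks = [
    [[0, 0], [0, 1]],
    [[0, 0], [2, 2]],
]
weights = class_weights(masks, num_classes=3)
```

With this rule, frequent classes such as the background receive a small weight and rare classes a weight close to 1, so rare classes contribute more per pixel to the cross-entropy loss.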

If the training data is not already downloaded, the train function first downloads and unzips it, converts it to h5 files for the train and validation splits (with augmentation applied to the train split if enabled), and then deletes the downloaded zip file. You will need about 10 GB of free space for this initial load. Afterwards, the train function loads the h5 files directly, which take significantly less space.
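
The prepare-once flow described above can be sketched as follows. This is a minimal outline under stated assumptions, not the repository's implementation: ensure_dataset, download and convert are hypothetical names, and the real steps (fetching the zip file, unzipping it, writing h5 files) are passed in as callables so that only the control flow is shown.

```python
from pathlib import Path


def ensure_dataset(cache_files, download, convert):
    """Run the download and convert steps only if a cached file is missing.

    cache_files: paths of the converted files (e.g. train and validation h5).
    download:    hypothetical callable that fetches the archive and returns its path.
    convert:     hypothetical callable (archive_path, cache_files) that writes the files.
    Returns True if the data was prepared, False if the cache was already complete.
    """
    cache_files = [Path(p) for p in cache_files]
    if all(p.exists() for p in cache_files):
        return False  # cache hit: load the converted files directly

    archive = Path(download())
    try:
        convert(archive, cache_files)
    finally:
        # Remove the large archive whether or not conversion succeeded
        # (missing_ok requires Python 3.8+).
        archive.unlink(missing_ok=True)
    return True
```

Checking for the converted files first is what lets later runs skip the download entirely and need far less disk space than the initial load.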

The data loader code is based on REFunction/VOC2012-Segmentation, which we modified to support augmentation.

Training Performance

Validation Performance

Model List: "/checkpoint/fcn_resnet_bn_skp_ce_augumented_e90.pt", "/checkpoint/fcn_resnet_bn_skp_ce_e90.pt", "/checkpoint/fcn_resnet_bn_ce_e120.pt", "/fcn_2_ce.pt", "/fcn8_ce.pt", "/fcn_ce.pt", "/skip_ce.pt", "/naive_ce.pt"

Accuracy across models and loss functions
