
Comments (13)

alexlopezcifuentes commented on June 14, 2024

Training of the RGB branch should be pretty straightforward; we were not doing anything special there... Have you followed the hyperparameters described in the paper? Are you using the same data augmentation? And finally, are you evaluating with 10-crop evaluation?


DranZohn commented on June 14, 2024

Thanks for your reply. I'll check my training script and run it again.
I did apply the hyperparameters mentioned in the paper, including the DFW optimizer with an initial LR of 0.1, momentum of 0.9, and weight decay of 1e-4.
Secondly, my training and validation data come directly from ADE20KDataset.py, so the same data augmentation should be in use.
Finally, I evaluate the model with the TEN_CROPS flag set to FALSE.


alexlopezcifuentes commented on June 14, 2024

Ok, so based on your message I think that:

  • First, you should change your evaluation to use ten crops. This is quite important, as it improves the results (more crops to evaluate) and is the metric the research community usually uses for scene recognition; see the sketch after this list.
  • Second, you have to use pre-trained weights. Even though ADE20K has 20K training images, that might not be sufficient to train a good model. I suggest using ImageNet pre-trained weights for ResNet-18 (if that's the backbone you are using), or even Places365 weights, which will be better still.
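A minimal sketch of both points, assuming a 224x224 input with standard torchvision transforms (the Places365 checkpoint filename and key layout follow the common CSAILVision release and are assumptions to adapt):

import torch
import torchvision.transforms as T
import torchvision.models as models

# 10-crop evaluation: TenCrop yields the 4 corner crops plus the center
# crop, and the horizontal flips of all five (10 crops per image).
ten_crop = T.Compose([
    T.Resize(256),
    T.TenCrop(224),
    T.Lambda(lambda crops: torch.stack([T.ToTensor()(c) for c in crops])),
])
# A batch is then [bs, 10, C, H, W]: fold the crops into the batch
# dimension, forward once, and average the 10 logit vectors per image:
#   logits = model(x.view(-1, C, H, W)).view(bs, 10, -1).mean(1)

# Pre-trained weights: Places365 checkpoints are typically saved from a
# DataParallel model, so the 'module.' prefix has to be stripped.
model = models.resnet18(num_classes=365)
ckpt = torch.load('resnet18_places365.pth.tar', map_location='cpu')
state = {k.replace('module.', ''): v for k, v in ckpt['state_dict'].items()}
model.load_state_dict(state)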


DranZohn commented on June 14, 2024

Hi, thanks for your advice; following it, I have trained the model again. However, I still get the same result on the RGB branch: nearly 11% lower accuracy. In addition, I tried the MITIndoor dataset and similarly lost nearly 25% accuracy (only 55% top-1). These results really confuse me.

Here are the records of training on the MITIndoor dataset:
[screenshots: training loss, validation loss, and top-1 accuracy curves]

And here is the training script for MITIndoor, which can be run directly (in case you need it):

"""
Training file
Usage:
    --config [PATH to configuration file for desired dataset]
"""
from Libs.Utils.dfw import dfw
from RGBBranch import RGBBranch
from Libs.Datasets.MITIndoor67Dataset import MITIndoor67Dataset
from Libs.Utils import utils

import torch.backends.cudnn as cudnn
import argparse
import yaml
import torch
import os
import time

parser = argparse.ArgumentParser(description='Semantic-Aware Scene Recognition Training')
parser.add_argument('--ConfigPath', metavar='DIR', help='Configuration file path')

def evaluationDataLoader(dataloader, model, set):
    batch_time = utils.AverageMeter()
    losses = utils.AverageMeter()
    top1 = utils.AverageMeter()
    top2 = utils.AverageMeter()
    top5 = utils.AverageMeter()

    # Extract batch size
    batch_size = CONFIG['VALIDATION']['BATCH_SIZE']['TEST']

    # Start data time
    data_time_start = time.time()
    model.eval()
    with torch.no_grad():
        for i, (mini_batch) in enumerate(dataloader):
            start_time = time.time()
            if USE_CUDA:
                RGB_image = mini_batch['Image'].cuda()
                sceneLabelGT = mini_batch['Scene Index'].cuda()

            if set == 'Validation' and CONFIG['VALIDATION']['TEN_CROPS']:
                # Fuse batch size and ncrops to set the input for the network
                bs, ncrops, c_img, h, w = RGB_image.size()
                RGB_image = RGB_image.view(-1, c_img, h, w)

            semanticTensor = None
            # Model Forward
            _, _, outputSceneLabelRGB, _ = model(RGB_image, semanticTensor)

            if set == 'Validation' and CONFIG['VALIDATION']['TEN_CROPS']:
                # Average results over the 10 crops
                outputSceneLabelRGB = outputSceneLabelRGB.view(bs, ncrops, -1).mean(1)

            if batch_size == 1 and set == 'Validation' and CONFIG['VALIDATION']['TEN_CROPS']:
                # Keep only the central crop (index 4 in TenCrop order)
                RGB_image = torch.unsqueeze(RGB_image[4, :, :, :], 0)
            
            # Compute Loss
            loss = model.loss(outputSceneLabelRGB, sceneLabelGT)

            # Measure Top1, Top2 and Top5 accuracy
            prec1, prec2, prec5 = utils.accuracy(outputSceneLabelRGB.data, sceneLabelGT, topk=(1, 2, 5))

            # Update values
            losses.update(loss.item(), batch_size)
            top1.update(prec1.item(), batch_size)
            top2.update(prec2.item(), batch_size)
            top5.update(prec5.item(), batch_size)

            # Measure batch elapsed time
            batch_time.update(time.time() - start_time)

            # Print information
            if i % CONFIG['VALIDATION']['PRINT_FREQ'] == 0:
                print('Testing {} set batch: [{}/{}]\t'
                            'Batch Time {batch_time.val:.3f} (avg: {batch_time.avg:.3f})\t'
                            'Loss {loss.val:.3f} (avg: {loss.avg:.3f})\t'
                            'Prec@1 {top1.val:.3f} (avg: {top1.avg:.3f})\t'
                            'Prec@2 {top2.val:.3f} (avg: {top2.avg:.3f})\t'
                            'Prec@5 {top5.val:.3f} (avg: {top5.avg:.3f})'.
                            format(set, i, len(dataloader), batch_time=batch_time, loss=losses,
                                    top1=top1, top2=top2, top5=top5))

        print('Elapsed time for {} set evaluation {time:.3f} seconds'.format(set, time=time.time() - data_time_start))
        print("")

        return top1.avg, top2.avg, top5.avg, losses.avg

def train_model(model, train_loader, val_loader, optimizer, scheduler=None):
    best_top1 = 0.0
    best_epoch = 0
    # Extract batch size
    batch_size = CONFIG['TRAINING']['BATCH_SIZE']['TRAIN']
    # Start data time
    for epoch in range(100):
        model.train()
        losses = utils.AverageMeter()
        print("-" * 65)
        print(f'Training for epoch {epoch}')
        for i, mini_batch in enumerate(train_loader):
            RGB_image = mini_batch['Image']
            sceneLabelGT = mini_batch['Scene Index']
            if USE_CUDA:
                RGB_image = RGB_image.cuda()
                sceneLabelGT = sceneLabelGT.cuda()
            # Create tensor of probabilities from semantic_mask
            semanticTensor = None
            # Model Forward
            optimizer.zero_grad()
            _, _, outputSceneLabelRGB, _ = model(RGB_image, semanticTensor)

            # Compute Loss
            loss = model.loss(outputSceneLabelRGB, sceneLabelGT)
            loss.backward()
            # DFW requires a closure returning the current loss value
            optimizer.step(lambda: float(loss))

            # Update values
            losses.update(loss.item(), batch_size)

            if i % CONFIG['TRAINING']['PRINT_FREQ'] == 0:
                print('Training Epoch {} \t'
                            'Batch {}/{}\t'
                            'Loss {loss.val:.3f} (avg: {loss.avg:.3f})'.format(epoch, i, len(train_loader), loss=losses))
        if scheduler is not None:
            scheduler.step()

        # eval and save model
        if epoch % 5 == 0:
            val_top1, val_top2, val_top5, val_loss = evaluationDataLoader(val_loader, model, set='Validation')
            print(' Validation results: Loss {val_loss:.3f}, Prec@1 {top1:.3f}, Prec@2 {top2:.3f}, Prec@5 {top5:.3f}'
                        .format(val_loss=val_loss, top1=val_top1, top2=val_top2, top5=val_top5))

            # save the best model (only when top-1 improves by more than 0.5)
            if val_top1 - best_top1 > 0.5:
                # remove the old model
                ckpt_name = os.path.join(CONFIG["MODEL"]["PATH"], "model-best-epoch-{}.ckpt".format(best_epoch))
                if os.path.isfile(ckpt_name):
                    os.remove(ckpt_name)
                # save the new model
                ckpt_name = os.path.join(CONFIG["MODEL"]["PATH"], "model-best-epoch-{}.ckpt".format(epoch))
                torch.save(model.state_dict(), ckpt_name)
                print("update the model with {} top1 value in {} epoch.".format(val_top1, epoch))

                best_top1 = val_top1
                best_epoch = epoch

if __name__ == '__main__':

    global USE_CUDA, classes, CONFIG

    # Decode CONFIG file information
    args = parser.parse_args()
    CONFIG = yaml.safe_load(open(args.ConfigPath, 'r'))
    USE_CUDA = torch.cuda.is_available()

    print('-' * 65)
    print("Training started starting...")
    print('-' * 65)
    # Instantiate network
    print('Training ONLY RGB branch')
    print('Selected RGB backbone architecture: ' + CONFIG['MODEL']['ARCH'])
    model = RGBBranch(arch=CONFIG['MODEL']['ARCH'], scene_classes=CONFIG['DATASET']['N_CLASSES_SCENE'])
    # Move model to GPU and set it to train mode
    if USE_CUDA:
        model.cuda()
    cudnn.benchmark = USE_CUDA
    model.train()

    print('-' * 65)
    print('Loading dataset {}...'.format(CONFIG['DATASET']['NAME']))

    traindir = os.path.join(CONFIG['DATASET']['ROOT'], CONFIG['DATASET']['NAME'])
    valdir = os.path.join(CONFIG['DATASET']['ROOT'], CONFIG['DATASET']['NAME'])

    train_dataset = MITIndoor67Dataset(traindir, "train")
    train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=CONFIG['TRAINING']['BATCH_SIZE']['TRAIN'],
                                               shuffle=True, num_workers=CONFIG['DATALOADER']['NUM_WORKERS'], pin_memory=False)

    val_dataset = MITIndoor67Dataset(valdir, "val", tencrops=CONFIG['VALIDATION']['TEN_CROPS'], SemRGB=True)
    val_loader = torch.utils.data.DataLoader(val_dataset, batch_size=CONFIG['VALIDATION']['BATCH_SIZE']['TEST'],
                                                shuffle=False, num_workers=CONFIG['DATALOADER']['NUM_WORKERS'], pin_memory=False)
    classes = train_dataset.classes
    # Print dataset information
    print('Train set. Size {}. Batch size {}. Nbatches {}'
          .format(len(train_loader) * CONFIG['TRAINING']['BATCH_SIZE']['TRAIN'], CONFIG['TRAINING']['BATCH_SIZE']['TRAIN'], len(train_loader)))
    print('Train set number of scenes: {}'.format(len(classes)))
    optimizer = dfw.DFW(model.parameters(), eta=CONFIG['TRAINING']['LR'], momentum=CONFIG['TRAINING']['MOMENTUM'], weight_decay=CONFIG['TRAINING']['WEIGHT_DECAY'])
    train_model(model, train_loader, val_loader, optimizer)
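For reference, here is a sketch of a config with just the keys this script reads; the values are illustrative, and the repo's own Config/*.yaml files are the authoritative source:

MODEL:
  ARCH: ResNet-18
  PATH: ./Model_Zoo/
DATASET:
  NAME: MITIndoor67
  ROOT: ./Data/Datasets/
  N_CLASSES_SCENE: 67
DATALOADER:
  NUM_WORKERS: 4
TRAINING:
  BATCH_SIZE:
    TRAIN: 64
  LR: 0.1
  MOMENTUM: 0.9
  WEIGHT_DECAY: 0.0001
  PRINT_FREQ: 10
VALIDATION:
  BATCH_SIZE:
    TEST: 1
  TEN_CROPS: True
  PRINT_FREQ: 10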

Looking forward to your reply!


alexlopezcifuentes commented on June 14, 2024

But what did you change in this particular training? Did you use pre-trained weights, or is it trained from scratch (using random weights)?


Heither666 commented on June 14, 2024

Hello "LearnerZHENG", may I ask where you got the trainning code, as I didn't find any code provided by the author uses "dfw.py" which is used to train the model, but the author did offered "dfw.py" in the lib folder.


DranZohn commented on June 14, 2024

Hello "LearnerZHENG", may I ask where you got the trainning code, as I didn't find any code provided by the author uses "dfw.py" which is used to train the model, but the author did offered "dfw.py" in the lib folder.

@Heither666 Hey, I wrote the training code myself, using "dfw.py" as the optimizer. This great project provides most of the key code; you just need to combine the *Dataset.py, dfw.py, and SASceneNet.py files, then use a *config.yaml to configure the model params, just like evaluation.py does.


Heither666 commented on June 14, 2024

Hello "LearnerZHENG", may I ask where you got the trainning code, as I didn't find any code provided by the author uses "dfw.py" which is used to train the model, but the author did offered "dfw.py" in the lib folder.

@Heither666 Hei, I wrote the training code using "dfw.py" as the optimizer by myself. This greate project has provided most key code, you just need to combine the *Dataset.py, dfw.py and SASNet.py, and then use *config.yaml to configure the model params, just like the evaluation.py.

WOW, thank you for your generous reply! I'll try to write it myself, and may ask some more questions if you don't mind :)


Heither666 commented on June 14, 2024

Hello "LearnerZHENG", may I ask where you got the trainning code, as I didn't find any code provided by the author uses "dfw.py" which is used to train the model, but the author did offered "dfw.py" in the lib folder.

@Heither666 Hei, I wrote the training code using "dfw.py" as the optimizer by myself. This greate project has provided most key code, you just need to combine the *Dataset.py, dfw.py and SASNet.py, and then use *config.yaml to configure the model params, just like the evaluation.py.

Sorry to bother you again. I tried to write the training code for SASceneNet but failed; could you show me your training script? It's too hard for me right now 😢


DranZohn commented on June 14, 2024

Hello "LearnerZHENG", may I ask where you got the trainning code, as I didn't find any code provided by the author uses "dfw.py" which is used to train the model, but the author did offered "dfw.py" in the lib folder.

@Heither666 Hei, I wrote the training code using "dfw.py" as the optimizer by myself. This greate project has provided most key code, you just need to combine the *Dataset.py, dfw.py and SASNet.py, and then use *config.yaml to configure the model params, just like the evaluation.py.

Sorry for bother you again, I tried to write the training code to train SASceneNet but failed, could you show me your training script? It's too hard for me now😢

The training code for the MITIndoor67 dataset is right above in this issue; it's similar for other datasets.
The file should sit at the same directory level as evaluation.py; run it with:

python train.py --ConfigPath ./Config/config_MITIndoor.yaml

The training code is the same script as in my earlier comment above.

Do I understand you correctly?


Heither666 commented on June 14, 2024

Hello "LearnerZHENG", may I ask where you got the trainning code, as I didn't find any code provided by the author uses "dfw.py" which is used to train the model, but the author did offered "dfw.py" in the lib folder.

@Heither666 Hei, I wrote the training code using "dfw.py" as the optimizer by myself. This greate project has provided most key code, you just need to combine the *Dataset.py, dfw.py and SASNet.py, and then use *config.yaml to configure the model params, just like the evaluation.py.

Sorry for bother you again, I tried to write the training code to train SASceneNet but failed, could you show me your training script? It's too hard for me now😢

The training code for MITIndoor67 dataset is above this issue, it's similar for other dataset.
This code file should be at the same directory level as evaluation.py, and run it use:

python train.py --ConfigPath ./Config/config_MITIndoor.yaml

The training code:

"""
Training file
Usage:
    --config [PATH to configuration file for desired dataset]
"""
from Libs.Utils.dfw import dfw
from RGBBranch import RGBBranch
from SemBranch import SemBranch
from SASceneNet import SASceneNet
from Libs.Datasets.MITIndoor67Dataset import MITIndoor67Dataset
from Libs.Utils import utils

import torch.backends.cudnn as cudnn
import numpy as np
import argparse
import yaml
import torch
import os
import time

parser = argparse.ArgumentParser(description='Semantic-Aware Scene Recognition Evaluation')
parser.add_argument('--ConfigPath', metavar='DIR', help='Configuration file path')

def evaluationDataLoader(dataloader, model, set):
    batch_time = utils.AverageMeter()
    losses = utils.AverageMeter()
    top1 = utils.AverageMeter()
    top2 = utils.AverageMeter()
    top5 = utils.AverageMeter()
    # Extract batch size
    batch_size = CONFIG['VALIDATION']['BATCH_SIZE']['TEST']
    # Start data time
    data_time_start = time.time()
    model.eval()
    with torch.no_grad():
        for i, (mini_batch) in enumerate(dataloader):
            start_time = time.time()
            if USE_CUDA:
                RGB_image = mini_batch['Image'].cuda()
                sceneLabelGT = mini_batch['Scene Index'].cuda()

            if set is 'Validation' and CONFIG['VALIDATION']['TEN_CROPS']:
                # Fuse batch size and ncrops to set the input for the network
                bs, ncrops, c_img, h, w = RGB_image.size()
                RGB_image = RGB_image.view(-1, c_img, h, w)

            semanticTensor = None
            # Model Forward
            _, _, outputSceneLabelRGB, _ = model(RGB_image, semanticTensor)

            if set is 'Validation' and CONFIG['VALIDATION']['TEN_CROPS']:
                # Average results over the 10 crops
                outputSceneLabelRGB = outputSceneLabelRGB.view(bs, ncrops, -1).mean(1)

            if batch_size is 1:
                if set is 'Validation' and CONFIG['VALIDATION']['TEN_CROPS']:
                    RGB_image = torch.unsqueeze(RGB_image[4, :, :, :], 0)
            
            # Compute Loss
            loss = model.loss(outputSceneLabelRGB, sceneLabelGT)

            # Measure Top1, Top2 and Top5 accuracy
            prec1, prec2, prec5 = utils.accuracy(outputSceneLabelRGB.data, sceneLabelGT, topk=(1, 2, 5))

            # Update values
            losses.update(loss.item(), batch_size)
            top1.update(prec1.item(), batch_size)
            top2.update(prec2.item(), batch_size)
            top5.update(prec5.item(), batch_size)

            # Measure batch elapsed time
            batch_time.update(time.time() - start_time)

            # Print information
            if i % CONFIG['VALIDATION']['PRINT_FREQ'] == 0:
                print('Testing {} set batch: [{}/{}]\t'
                            'Batch Time {batch_time.val:.3f} (avg: {batch_time.avg:.3f})\t'
                            'Loss {loss.val:.3f} (avg: {loss.avg:.3f})\t'
                            'Prec@1 {top1.val:.3f} (avg: {top1.avg:.3f})\t'
                            'Prec@2 {top2.val:.3f} (avg: {top2.avg:.3f})\t'
                            'Prec@5 {top5.val:.3f} (avg: {top5.avg:.3f})'.
                            format(set, i, len(dataloader), set, batch_time=batch_time, loss=losses,
                                    top1=top1, top2=top2, top5=top5))

        print('Elapsed time for {} set evaluation {time:.3f} seconds'.format(set, time=time.time() - data_time_start))
        print("")

        return top1.avg, top2.avg, top5.avg, losses.avg

def train_model(model, train_loader, val_loader, optimizer, scheduler=None):
    best_top1 = 0.0
    best_epoch = 0
    # Extract batch size
    batch_size = CONFIG['TRAINING']['BATCH_SIZE']['TRAIN']
    # Start data time
    for epoch in range(100):
        model.train()
        losses = utils.AverageMeter()
        print("-" * 65)
        print(f'Training for epoch {epoch}')
        for i, mini_batch in enumerate(train_loader):
            RGB_image = mini_batch['Image'].cuda()
            sceneLabelGT = mini_batch['Scene Index'].cuda()
            if USE_CUDA:
                RGB_image = RGB_image.cuda()
                sceneLabelGT = sceneLabelGT.cuda()
            # Create tensor of probabilities from semantic_mask
            semanticTensor = None
            # Model Forward
            optimizer.zero_grad()
            _, _, outputSceneLabelRGB, _ = model(RGB_image, semanticTensor)

            # Compute Loss
            loss = model.loss(outputSceneLabelRGB, sceneLabelGT)
            loss.backward()
            optimizer.step(lambda: float(loss))

            # Update values
            losses.update(loss.item(), batch_size)

            if i % CONFIG['TRAINING']['PRINT_FREQ'] == 0:
                print('Training Epoch {} \t'
                            'Batch {}/{}\t'
                            'Loss {loss.val:.3f} (avg: {loss.avg:.3f})'.format(epoch, i, len(train_loader), loss=losses))
        if scheduler is not None:
            scheduler.step()

        # eval and save model
        if epoch % 5 == 0:
            val_top1, val_top2, val_top5, val_loss = evaluationDataLoader(val_loader, model, set='Validation')
            print(' Validation results: Loss {val_loss:.3f}, Prec@1 {top1:.3f}, Prec@2 {top2:.3f}, Prec@5 {top5:.3f}'
                        .format(val_loss=val_loss, top1=val_top1, top2=val_top2, top5=val_top5))

            # save the best model
            if val_top1 - best_top1 > 0.5:
                # remove the old model
                ckpt_name = os.path.join(CONFIG["MODEL"]["PATH"], "model-best-epoch-{}.ckpt".format(best_epoch))
                if os.path.isfile(ckpt_name):
                    os.remove(ckpt_name)
                # save the new model
                ckpt_name = os.path.join(CONFIG["MODEL"]["PATH"], "model-best-epoch-{}.ckpt".format(epoch))
                torch.save(model.state_dict(), ckpt_name)
                print("update the model with {} top1 value in {} epoch.".format(val_top1, epoch))

                best_top1 = val_top1
                best_epoch = epoch

if __name__ == '__main__':

    global USE_CUDA, classes, CONFIG

    # Decode CONFIG file information
    args = parser.parse_args()
    CONFIG = yaml.safe_load(open(args.ConfigPath, 'r'))
    USE_CUDA = torch.cuda.is_available()

    print('-' * 65)
    print("Training started starting...")
    print('-' * 65)
    # Instantiate network
    print('Training ONLY RGB branch')
    print('Selected RGB backbone architecture: ' + CONFIG['MODEL']['ARCH'])
    model = RGBBranch(arch=CONFIG['MODEL']['ARCH'], scene_classes=CONFIG['DATASET']['N_CLASSES_SCENE'])
    # Move Model to GPU an set it to evaluation mode
    if USE_CUDA:
        model.cuda()
    cudnn.benchmark = USE_CUDA
    model.train()

    print('-' * 65)
    print('Loading dataset {}...'.format(CONFIG['DATASET']['NAME']))

    traindir = os.path.join(CONFIG['DATASET']['ROOT'], CONFIG['DATASET']['NAME'])
    valdir = os.path.join(CONFIG['DATASET']['ROOT'], CONFIG['DATASET']['NAME'])

    train_dataset = MITIndoor67Dataset(traindir, "train")
    train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=CONFIG['TRAINING']['BATCH_SIZE']['TRAIN'],
                                               shuffle=True, num_workers=CONFIG['DATALOADER']['NUM_WORKERS'], pin_memory=False)

    val_dataset = MITIndoor67Dataset(valdir, "val", tencrops=CONFIG['VALIDATION']['TEN_CROPS'], SemRGB=True)
    val_loader = torch.utils.data.DataLoader(val_dataset, batch_size=CONFIG['TRAINING']['BATCH_SIZE']['TEST'], 
                                                shuffle=False,num_workers=CONFIG['DATALOADER']['NUM_WORKERS'], pin_memory=False)
    classes = train_dataset.classes
    # Print dataset information
    print('Train set. Size {}. Batch size {}. Nbatches {}'
          .format(len(train_loader) * CONFIG['VALIDATION']['BATCH_SIZE']['TRAIN'], CONFIG['VALIDATION']['BATCH_SIZE']['TRAIN'], len(train_loader)))
    print('Train set number of scenes: {}' .format(len(classes)))
    optimizer = dfw.DFW(model.parameters(), eta= CONFIG['TRAINING']['LR'], momentum=CONFIG['TRAINING']['MOMENTUM'], weight_decay=CONFIG['TRAINING']['WEIGHT_DECAY'])
    train_model(model, train_loader, val_loader, optimizer)

Do I understand you correctly?

Well, actually I have already run this script (RGB branch only) and got about 70% accuracy.
What I need is the code for training SASceneNet (the combination of the two branches), not the RGB branch or the Sem branch alone.
I have tried to adjust some parts based on the code you offered, but it didn't work. I sincerely need your help 🙏


DranZohn commented on June 14, 2024

@Heither666 Here is the code for training SASceneNet:

from Libs.Datasets.MITIndoor67Dataset import MITIndoor67Dataset
from Libs.Utils import utils
from SASceneNet import SASceneNet

import torch.backends.cudnn as cudnn
import torch.optim
import torch.utils.data
import argparse
import yaml
import torch
import os

parser = argparse.ArgumentParser(description='Semantic-Aware Scene Recognition Training')
parser.add_argument('--ConfigPath', metavar='DIR', help='Configuration file path')

class SASceneManager(object):
    def __init__(self, model, classes, val_dataloader, train_dataloader=None):
        self.val_dataloader = val_dataloader
        self.train_dataloader = train_dataloader
        self.model = model
        self.classes = classes  #classes name list

    def train(self, train_bs, val_bs, momentum=0.9, lr=0.1, lr_decay=10, weight_decay=1e-4, 
                    use_cuda=True, use_tencrop=False, 
                    sem_class_num=151):
        """
        param:
            train_bs: train batch size
            val_bs: validation batch size
            use_cuda:
            sem_class_num: semantic brach class num
            train_stage:
                BOTH: train both of two stage
                RGB: only train rgb branch
                SEMANTIC: only train sem branch
                ATTENTION: only train attention module
        """
        self.train_bs = train_bs
        self.val_bs = val_bs
        self.momentum = momentum
        self.lr = lr
        self.lr_decay = lr_decay
        self.weight_decay = weight_decay
        self.use_cuda = use_cuda
        self.sem_class_num = sem_class_num
        self.use_tencrop = use_tencrop

        assert self.train_dataloader is not None
        
        self.model.freeze()
        print("Training Attention module start...")
        optimizer = torch.optim.SGD(filter(lambda p: p.requires_grad, self.model.parameters()),
                                    lr=self.lr, weight_decay=self.weight_decay, momentum=self.momentum)
        optimizer_stepLR = torch.optim.lr_scheduler.StepLR(optimizer, step_size=self.lr_decay, gamma=0.9)

        for epoch in range(200):
            self.model.train()
            train_loss = utils.AverageMeter()
            print("-" * 65)
            print(f'Training for epoch {epoch}')
            for i, (mini_batch) in enumerate(self.train_dataloader):
                # upload data to gpu
                if self.use_cuda:
                    RGB_image = mini_batch['Image'].cuda()
                    semantic_mask = mini_batch['Semantic'].cuda()
                    semantic_scores = mini_batch['Semantic Scores'].cuda()
                    sceneLabelGT = mini_batch['Scene Index'].cuda()

                # Create tensor of probabilities from semantic_mask
                semanticTensor = utils.make_one_hot(semantic_mask, semantic_scores, self.sem_class_num)
                # Model Forward
                outputSceneLabel, feature_conv, outputSceneLabelRGB, outputSceneLabelSEM = self.model(RGB_image, semanticTensor)

                loss = self.model.loss(outputSceneLabel, sceneLabelGT)

                optimizer.zero_grad()
                loss.backward()
                # plain SGD does not need a loss closure, unlike DFW
                optimizer.step()
                # Update values
                train_loss.update(loss.item(), self.train_bs)

                # Print information
                if i % 10 == 0:
                    print('training result:'
                        'Training set batch: [{}/{}] of epoch-{}/200\t'
                        'Loss {loss.val:.3f} (avg: {loss.avg:.3f})\n'.
                        format(i, len(self.train_dataloader), epoch, loss=train_loss))
            # update lr
            if optimizer_stepLR is not None:
                optimizer_stepLR.step()
        return 

########################### Decode CONFIG file information######################################
args = parser.parse_args()
CONFIG = yaml.safe_load(open(args.ConfigPath, 'r'))
USE_CUDA = torch.cuda.is_available()

########################### Instantiate network######################################
print('Training complete model')
print('Selected RGB backbone architecture: ' + CONFIG['MODEL']['ARCH'])
model = SASceneNet(arch=CONFIG['MODEL']['ARCH'], scene_classes=CONFIG['DATASET']['N_CLASSES_SCENE'], semantic_classes=CONFIG['DATASET']['N_CLASSES_SEM'])

########################## Load the trained model######################################
completePath = CONFIG['MODEL']['PATH'] + CONFIG['MODEL']['NAME'] + '.pth.tar'
if os.path.isfile(completePath):
    print("Loading model {} from path {}...".format(CONFIG['MODEL']['NAME'], completePath))
    checkpoint = torch.load(completePath)
    best_prec1 = checkpoint['best_prec1']
    model.load_state_dict(checkpoint['state_dict'])
    print("Loaded model {} from path {}.".format(CONFIG['MODEL']['NAME'], completePath))
    print("     Epochs {}".format(checkpoint['epoch']))
    print("     Single crop reported precision {}".format(best_prec1))
else:
    print("No checkpoint found at '{}'. Check configuration file MODEL field".format(completePath))
    quit()
    
########################### Move Model to GPU and set train mode######################################
if USE_CUDA:
    model.cuda()
cudnn.benchmark = USE_CUDA

########################### Load train dataset and validation dataset######################################
print('-' * 65)
print('Loading dataset {}...'.format(CONFIG['DATASET']['NAME']))

traindir = os.path.join(CONFIG['DATASET']['ROOT'], CONFIG['DATASET']['NAME'])
valdir = os.path.join(CONFIG['DATASET']['ROOT'], CONFIG['DATASET']['NAME'])

train_dataset = MITIndoor67Dataset(traindir, "train")
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=CONFIG['TRAINING']['BATCH_SIZE']['TRAIN'],
                                            shuffle=True, num_workers=CONFIG['DATALOADER']['NUM_WORKERS'], pin_memory=True)

val_dataset = MITIndoor67Dataset(valdir, "val", tencrops=CONFIG['VALIDATION']['TEN_CROPS'], SemRGB=True)
val_loader = torch.utils.data.DataLoader(val_dataset, batch_size=CONFIG['VALIDATION']['BATCH_SIZE']['TEST'],
                                            shuffle=False, num_workers=CONFIG['DATALOADER']['NUM_WORKERS'], pin_memory=True)

classes = train_dataset.classes       # class names list of dataset

########################### train model######################################
print('-' * 65)
print("Train starting...")
print('-' * 65)
model.train()
trainManager = SASceneManager(model, classes, val_loader, train_loader)
trainManager.train(train_bs=CONFIG['TRAINING']['BATCH_SIZE']['TRAIN'], 
                                        val_bs=CONFIG['TRAINING']['BATCH_SIZE']['TEST'], 
                                        momentum=CONFIG['TRAINING']['MOMENTUM'], 
                                        lr=CONFIG['TRAINING']['LR'], 
                                        lr_decay=CONFIG['TRAINING']['LR_DECAY'], 
                                        weight_decay=CONFIG['TRAINING']['WEIGHT_DECAY'], 
                                        use_cuda=USE_CUDA, 
                                        use_tencrop = CONFIG['VALIDATION']['TEN_CROPS'], 
                                        sem_class_num=CONFIG['DATASET']['N_CLASSES_SEM'])

It runs well with a correct config file. You need to add a freeze method to SASceneNet.py that freezes the parameters before the attention module; a sketch follows below. Apart from these two points it's essentially the same as training the RGB branch or the Sem branch.
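A minimal sketch of such a freeze method, assuming the attention/fusion submodules can be recognized by their parameter names (adapt the name check to the actual attribute names in SASceneNet.py):

# Goes inside the SASceneNet class; freezes both branches so that only
# the attention module keeps requires_grad=True.
def freeze(self):
    for name, param in self.named_parameters():
        # assumption: attention-module parameter names contain 'attention'
        if 'attention' not in name:
            param.requires_grad = False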


Heither666 commented on June 14, 2024

@LearnerZHENG
Thanks for your help!

But I didn't find the model-saving and evaluation functions in this script.

I tried to add these two parts, but eventually got the error "RuntimeError: CUDA out of memory. Tried to allocate 2.84 GiB (GPU 0; 8.00 GiB total capacity; 1.45 GiB already allocated; 2.80 GiB free; 3.78 GiB reserved in total by PyTorch)"

I made some changes:
########################### Evaluation function######################################
my change
########################### Evaluation function end###################################
and
########################### Evaluation and save model ######################################
my change
########################### Evaluation and save model end###################################

in the following script:
from Libs.Datasets.MITIndoor67Dataset import MITIndoor67Dataset
from Libs.Utils import utils
from SASceneNet import SASceneNet

import torch.backends.cudnn as cudnn
import torch.optim
import torch.utils.data
import numpy as np
import argparse
import yaml
import torch
import os
import time

parser = argparse.ArgumentParser(description='Semantic-Aware Scene Recognition Training')
parser.add_argument('--ConfigPath', metavar='DIR', help='Configuration file path')

########################### Evaluation function######################################
def evaluationDataLoader(dataloader, model, set):
    batch_time = utils.AverageMeter()
    losses = utils.AverageMeter()
    top1 = utils.AverageMeter()
    top2 = utils.AverageMeter()
    top5 = utils.AverageMeter()

    ClassTPs_Top1 = torch.zeros(1, len(classes), dtype=torch.uint8).cuda()
    ClassTPs_Top2 = torch.zeros(1, len(classes), dtype=torch.uint8).cuda()
    ClassTPs_Top5 = torch.zeros(1, len(classes), dtype=torch.uint8).cuda()
    Predictions = np.zeros(len(dataloader))
    SceneGTLabels = np.zeros(len(dataloader))

    # Extract batch size
    batch_size = CONFIG['VALIDATION']['BATCH_SIZE']['TEST']

    # Start data time
    data_time_start = time.time()
    model.eval()
    with torch.no_grad():
        for i, (mini_batch) in enumerate(dataloader):
            start_time = time.time()
            if USE_CUDA:
                RGB_image = mini_batch['Image'].cuda()
                semantic_mask = mini_batch['Semantic'].cuda()
                semantic_scores = mini_batch['Semantic Scores'].cuda()
                sceneLabelGT = mini_batch['Scene Index'].cuda()

            if set == 'Validation' and CONFIG['VALIDATION']['TEN_CROPS']:
                # Fuse batch size and ncrops to set the input for the network
                bs, ncrops, c_img, h, w = RGB_image.size()
                RGB_image = RGB_image.view(-1, c_img, h, w)

                bs, ncrops, c_sem, h, w = semantic_mask.size()
                semantic_mask = semantic_mask.view(-1, c_sem, h, w)

                bs, ncrops, c_sem, h, w = semantic_scores.size()
                semantic_scores = semantic_scores.view(-1, c_sem, h, w)

            semanticTensor = utils.make_one_hot(semantic_mask, semantic_scores, C=CONFIG['DATASET']['N_CLASSES_SEM'])
            # Model Forward
            outputSceneLabel, feature_conv, outputSceneLabelRGB, outputSceneLabelSEM = model(RGB_image, semanticTensor)

            if set == 'Validation' and CONFIG['VALIDATION']['TEN_CROPS']:
                # Average results over the 10 crops
                outputSceneLabel = outputSceneLabel.view(bs, ncrops, -1).mean(1)
                outputSceneLabelRGB = outputSceneLabelRGB.view(bs, ncrops, -1).mean(1)
                outputSceneLabelSEM = outputSceneLabelSEM.view(bs, ncrops, -1).mean(1)

            if batch_size == 1:
                if set == 'Validation' and CONFIG['VALIDATION']['TEN_CROPS']:
                    # Keep only the central crop (index 4 in TenCrop order)
                    feature_conv = torch.unsqueeze(feature_conv[4, :, :, :], 0)
                    RGB_image = torch.unsqueeze(RGB_image[4, :, :, :], 0)

                # Obtain the 10 most scored predicted scene indices
                Ten_Predictions = utils.obtainPredictedClasses(outputSceneLabel)

                # Save predicted label and ground-truth label
                Predictions[i] = Ten_Predictions[0]
                SceneGTLabels[i] = sceneLabelGT.item()

                # Compute activation maps
                # utils.saveActivationMap(model, feature_conv, Ten_Predictions, sceneLabelGT,
                #                         RGB_image, classes, i, set, save=True)

            # Compute class accuracy
            ClassTPs = utils.getclassAccuracy(outputSceneLabel, sceneLabelGT, len(classes), topk=(1, 2, 5))
            ClassTPs_Top1 += ClassTPs[0]
            ClassTPs_Top2 += ClassTPs[1]
            ClassTPs_Top5 += ClassTPs[2]

            # Compute Loss
            loss = model.loss(outputSceneLabel, sceneLabelGT)

            # Measure Top1, Top2 and Top5 accuracy
            prec1, prec2, prec5 = utils.accuracy(outputSceneLabel.data, sceneLabelGT, topk=(1, 2, 5))

            # Update values
            losses.update(loss.item(), batch_size)
            top1.update(prec1.item(), batch_size)
            top2.update(prec2.item(), batch_size)
            top5.update(prec5.item(), batch_size)

            # Measure batch elapsed time
            batch_time.update(time.time() - start_time)

            # Print information
            if i % CONFIG['VALIDATION']['PRINT_FREQ'] == 0:
                print('Testing {} set batch: [{}/{}]\t'
                      'Batch Time {batch_time.val:.3f} (avg: {batch_time.avg:.3f})\t'
                      'Loss {loss.val:.3f} (avg: {loss.avg:.3f})\t'
                      'Prec@1 {top1.val:.3f} (avg: {top1.avg:.3f})\t'
                      'Prec@2 {top2.val:.3f} (avg: {top2.avg:.3f})\t'
                      'Prec@5 {top5.val:.3f} (avg: {top5.avg:.3f})'.
                      format(set, i, len(dataloader), batch_time=batch_time, loss=losses,
                             top1=top1, top2=top2, top5=top5))

        print('Elapsed time for {} set evaluation {time:.3f} seconds'.format(set, time=time.time() - data_time_start))
        print("")

        return top1.avg, top2.avg, top5.avg, losses.avg

########################### Evaluation function end###################################

class SASceneManager(object):
    def __init__(self, model, classes, val_dataloader, train_dataloader=None):
        self.val_dataloader = val_dataloader
        self.train_dataloader = train_dataloader
        self.model = model
        self.classes = classes  # class name list

    def train(self, train_bs, val_bs, momentum=0.9, lr=0.1, lr_decay=10, weight_decay=1e-4,
              use_cuda=True, use_tencrop=False,
              sem_class_num=151):
        """
        Params:
            train_bs: training batch size
            val_bs: validation batch size
            use_cuda: move tensors to GPU if True
            use_tencrop: whether validation uses 10-crop evaluation
            sem_class_num: number of semantic-branch classes
        """
        self.train_bs = train_bs
        self.val_bs = val_bs
        self.momentum = momentum
        self.lr = lr
        self.lr_decay = lr_decay
        self.weight_decay = weight_decay
        self.use_cuda = use_cuda
        self.sem_class_num = sem_class_num
        self.use_tencrop = use_tencrop

        assert self.train_dataloader is not None

        self.model.freeze()
        print("Training Attention module start...")
        optimizer = torch.optim.SGD(filter(lambda p: p.requires_grad, self.model.parameters()),
                                    lr=self.lr, weight_decay=self.weight_decay, momentum=self.momentum)
        optimizer_stepLR = torch.optim.lr_scheduler.StepLR(optimizer, step_size=self.lr_decay, gamma=0.9)

        # Best-model tracking (missing in the original snippet; without it
        # the save block below raises a NameError)
        best_top1 = 0.0
        best_epoch = 0
        for epoch in range(100):
            self.model.train()
            train_loss = utils.AverageMeter()
            print("-" * 65)
            print(f'Training for epoch {epoch}')
            for i, (mini_batch) in enumerate(self.train_dataloader):
                # upload data to gpu
                if self.use_cuda:
                    RGB_image = mini_batch['Image'].cuda()
                    semantic_mask = mini_batch['Semantic'].cuda()
                    semantic_scores = mini_batch['Semantic Scores'].cuda()
                    sceneLabelGT = mini_batch['Scene Index'].cuda()

                # Create tensor of probabilities from semantic_mask
                semanticTensor = utils.make_one_hot(semantic_mask, semantic_scores, self.sem_class_num)
                # Model Forward
                outputSceneLabel, feature_conv, outputSceneLabelRGB, outputSceneLabelSEM = self.model(RGB_image, semanticTensor)

                loss = self.model.loss(outputSceneLabel, sceneLabelGT)

                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
                # Update values
                train_loss.update(loss.item(), self.train_bs)

                # Print information
                if i % 10 == 0:
                    print('training result: '
                          'Training set batch: [{}/{}] of epoch-{}/100\t'
                          'Loss {loss.val:.3f} (avg: {loss.avg:.3f})\n'.
                          format(i, len(self.train_dataloader), epoch, loss=train_loss))
            # update lr
            if optimizer_stepLR is not None:
                optimizer_stepLR.step()

            ########################### Evaluation and save model ######################################
            # eval and save model
            if epoch % 5 == 0:
                val_top1, val_top2, val_top5, val_loss = evaluationDataLoader(self.val_dataloader, self.model, set='Validation')
                print(' Validation results: Loss {val_loss:.3f}, Prec@1 {top1:.3f}, Prec@2 {top2:.3f}, Prec@5 {top5:.3f}'
                      .format(val_loss=val_loss, top1=val_top1, top2=val_top2, top5=val_top5))

                # save the best model (only when top-1 improves by more than 0.5)
                if val_top1 - best_top1 > 0.5:
                    # remove the old model
                    ckpt_name = os.path.join(CONFIG["MODEL"]["PATH"], "model-best-epoch-{}.ckpt".format(best_epoch))
                    if os.path.isfile(ckpt_name):
                        os.remove(ckpt_name)
                    # save the new model
                    ckpt_name = os.path.join(CONFIG["MODEL"]["PATH"], "model-best-epoch-{}.ckpt".format(epoch))
                    torch.save(self.model.state_dict(), ckpt_name)
                    print("Updated best model: top-1 {:.3f} at epoch {}.".format(val_top1, epoch))

                    best_top1 = val_top1
                    best_epoch = epoch
            ########################### Evaluation and save model end###################################

        return

########################### Decode CONFIG file information######################################
args = parser.parse_args()
CONFIG = yaml.safe_load(open(args.ConfigPath, 'r'))
USE_CUDA = torch.cuda.is_available()

########################### Instantiate network######################################
print('Training complete model')
print('Selected RGB backbone architecture: ' + CONFIG['MODEL']['ARCH'])
model = SASceneNet(arch=CONFIG['MODEL']['ARCH'], scene_classes=CONFIG['DATASET']['N_CLASSES_SCENE'], semantic_classes=CONFIG['DATASET']['N_CLASSES_SEM'])

########################## Load the trained model######################################
completePath = CONFIG['MODEL']['PATH'] + CONFIG['MODEL']['NAME'] + '.pth.tar'
if os.path.isfile(completePath):
    print("Loading model {} from path {}...".format(CONFIG['MODEL']['NAME'], completePath))
    checkpoint = torch.load(completePath)
    best_prec1 = checkpoint['best_prec1']
    model.load_state_dict(checkpoint['state_dict'])
    print("Loaded model {} from path {}.".format(CONFIG['MODEL']['NAME'], completePath))
    print("     Epochs {}".format(checkpoint['epoch']))
    print("     Single crop reported precision {}".format(best_prec1))
else:
    print("No checkpoint found at '{}'. Check configuration file MODEL field".format(completePath))
    quit()

########################### Move Model to GPU and set train mode######################################
if USE_CUDA:
    model.cuda()
cudnn.benchmark = USE_CUDA

########################### Load train dataset and validation dataset######################################
print('-' * 65)
print('Loading dataset {}...'.format(CONFIG['DATASET']['NAME']))

traindir = os.path.join(CONFIG['DATASET']['ROOT'], CONFIG['DATASET']['NAME'])
valdir = os.path.join(CONFIG['DATASET']['ROOT'], CONFIG['DATASET']['NAME'])

train_dataset = MITIndoor67Dataset(traindir, "train")
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=CONFIG['TRAINING']['BATCH_SIZE']['TRAIN'],
                                           shuffle=True, num_workers=CONFIG['DATALOADER']['NUM_WORKERS'], pin_memory=True)

val_dataset = MITIndoor67Dataset(valdir, "val", tencrops=CONFIG['VALIDATION']['TEN_CROPS'], SemRGB=True)
val_loader = torch.utils.data.DataLoader(val_dataset, batch_size=CONFIG['VALIDATION']['BATCH_SIZE']['TEST'],
                                         shuffle=False, num_workers=CONFIG['DATALOADER']['NUM_WORKERS'], pin_memory=True)

classes = train_dataset.classes  # class names list of dataset

########################### train model######################################
print('-' * 65)
print("Train starting...")
print('-' * 65)
model.train()
trainManager = SASceneManager(model, classes, val_loader, train_loader)
trainManager.train(train_bs=CONFIG['TRAINING']['BATCH_SIZE']['TRAIN'],
                   val_bs=CONFIG['TRAINING']['BATCH_SIZE']['TEST'],
                   momentum=CONFIG['TRAINING']['MOMENTUM'],
                   lr=CONFIG['TRAINING']['LR'],
                   lr_decay=CONFIG['TRAINING']['LR_DECAY'],
                   weight_decay=CONFIG['TRAINING']['WEIGHT_DECAY'],
                   use_cuda=USE_CUDA,
                   use_tencrop=CONFIG['VALIDATION']['TEN_CROPS'],
                   sem_class_num=CONFIG['DATASET']['N_CLASSES_SEM'])

