HyCTAS: Real-Time Image Segmentation via Hybrid Convolutional-Transformer Architecture Search

Real-Time Image Segmentation via Hybrid Convolutional-Transformer Architecture Search

Hongyuan Yu, Cheng Wan, Mengchen Liu, Dongdong Chen, Bin Xiao and Xiyang Dai

Overview

We develop a multi-target, multi-branch supernet method that not only retains the multi-branch structure of HRNet but also finds the proper locations for placing multi-head self-attention modules. Our search algorithm is optimized towards multiple objectives (e.g., latency and mIoU) and is capable of finding architectures on the Pareto frontier with an arbitrary number of branches in a single search. We further present a series of HyCTAS models, obtained by searching for the best hybrid combination of light-weight convolution layers and memory-efficient self-attention layers between branches of different resolutions, which are fused to high resolution for both efficiency and effectiveness.
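To make the Pareto-frontier idea concrete, here is a minimal sketch of filtering candidate architectures by two objectives (lower latency, higher mIoU). It is an illustration only and not the search code in this repository: the `Candidate` record, the function names, and all numbers are made up for the example.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """A searched architecture with its two objectives (hypothetical record)."""
    name: str
    latency_ms: float  # lower is better
    miou: float        # higher is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if `a` is no worse than `b` on both objectives and better on one."""
    no_worse = a.latency_ms <= b.latency_ms and a.miou >= b.miou
    strictly_better = a.latency_ms < b.latency_ms or a.miou > b.miou
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Made-up numbers for illustration only (not results from the paper).
population = [
    Candidate("arch_a", latency_ms=10.0, miou=70.0),
    Candidate("arch_b", latency_ms=15.0, miou=75.0),
    Candidate("arch_c", latency_ms=20.0, miou=74.0),  # slower and less accurate than arch_b
    Candidate("arch_d", latency_ms=30.0, miou=80.0),
]
front = pareto_front(population)
print([c.name for c in front])
```

Every architecture left in `front` represents a different latency/accuracy trade-off, which is why a single search can return several models rather than one.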

HyCTAS search space

HyCTAS searchable modules

Highlights:

1: We design a novel search framework that combines a multi-branch search space for high-resolution representation with genetic-based multi-objective optimization.
2: We present a series of HyCTAS models that combine a light-weight convolution module, which reduces computation cost while preserving high-resolution information, with a memory-efficient self-attention module, which attends to long-range dependencies.
3: HyCTAS achieves extremely fast inference with low FLOPs and a low parameter count while maintaining competitive accuracy.
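As a rough illustration of what "genetic-based" search means in practice, the sketch below shows mutation and single-point crossover on a toy architecture encoding. The encoding (one operation name per cell), the `OPS` vocabulary, and all function names are assumptions made for this example; the actual search space and operators are defined in the paper and the repository code.

```python
import random
from typing import List

# Hypothetical operation vocabulary; the real search space is richer.
OPS = ["conv", "attention"]


def random_genome(num_cells: int, rng: random.Random) -> List[str]:
    """Sample one operation per cell uniformly at random."""
    return [rng.choice(OPS) for _ in range(num_cells)]


def mutate(genome: List[str], rate: float, rng: random.Random) -> List[str]:
    """Resample each gene independently with probability `rate`."""
    return [rng.choice(OPS) if rng.random() < rate else gene for gene in genome]


def crossover(a: List[str], b: List[str], rng: random.Random) -> List[str]:
    """Single-point crossover: a prefix of `a` followed by a suffix of `b`."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("parents must have equal length of at least 2")
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


rng = random.Random(0)  # fixed seed so the example is repeatable
parent_a = random_genome(6, rng)
parent_b = random_genome(6, rng)
child = mutate(crossover(parent_a, parent_b, rng), rate=0.1, rng=rng)
print(parent_a, parent_b, child)
```

In a multi-objective setting, children produced this way would be evaluated for latency and accuracy, and the non-dominated ones kept for the next generation.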

Results

HyCTAS models

Prerequisites

Ubuntu 16.04
Python 3.7
CUDA 10.2 (lower versions may work but were not tested)
NVIDIA GPU (>= 11 GB of GPU memory) + cuDNN v7.3

This repository has been tested on an RTX 2080 Ti. Configurations (e.g., batch size, image patch size) may need to be changed on other platforms.
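A quick way to check a machine against these prerequisites is sketched below. It is not part of this repository. It uses only the Python standard library and the `nvidia-smi` query flags `--query-gpu=name,memory.total --format=csv,noheader,nounits`; the 11000 MiB threshold is an assumption chosen to approximate the "11 GB" requirement.

```python
import shutil
import subprocess
import sys
from typing import List, Tuple

# Assumed threshold approximating the 11 GB requirement (cards report slightly less).
MIN_GPU_MEMORY_MIB = 11_000


def parse_gpu_query(output: str) -> List[Tuple[str, int]]:
    """Parse `name, memory.total` CSV rows (no header, no units) from nvidia-smi."""
    gpus = []
    for line in output.strip().splitlines():
        name, _, memory_mib = line.rpartition(",")
        try:
            gpus.append((name.strip(), int(memory_mib)))
        except ValueError:
            continue  # skip rows whose memory field is not a number
    return gpus


def query_gpus() -> List[Tuple[str, int]]:
    """Return (name, total MiB) per GPU, or an empty list if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    return parse_gpu_query(result.stdout) if result.returncode == 0 else []


if __name__ == "__main__":
    print("Python >= 3.7:", sys.version_info >= (3, 7))
    for name, memory_mib in query_gpus():
        verdict = "ok" if memory_mib >= MIN_GPU_MEMORY_MIB else "may need a smaller batch size"
        print(f"{name}: {memory_mib} MiB ({verdict})")
```

If a GPU falls below the threshold, reducing the batch size or image patch size in the config is the first thing to try.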

Installation

Clone this repo:

git clone https://github.com/HyCTAS/HyCTAS.git
cd HyCTAS

Install dependencies:

bash install.sh

Usage

0. Prepare the dataset

Download leftImg8bit_trainvaltest.zip and gtFine_trainvaltest.zip from the Cityscapes website.
Prepare the annotations with createTrainIdLabelImgs.py from the cityscapesScripts package.
Put the image list files in the directory where you saved the dataset.
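If you need to build an image list yourself, the sketch below pairs each Cityscapes image with the `*_gtFine_labelTrainIds.png` file that createTrainIdLabelImgs.py generates. The directory layout and file suffixes follow the standard Cityscapes release; the tab-separated output format and the function names are assumptions, so check the data loader in this repository for the format it actually expects.

```python
from pathlib import Path
from typing import List, Tuple

IMAGE_SUFFIX = "_leftImg8bit.png"
LABEL_SUFFIX = "_gtFine_labelTrainIds.png"


def collect_pairs(root: Path, split: str) -> List[Tuple[str, str]]:
    """Pair each image with its trainId label map, as paths relative to `root`."""
    pairs = []
    image_dir = root / "leftImg8bit" / split
    for image in sorted(image_dir.glob(f"*/*{IMAGE_SUFFIX}")):
        city = image.parent.name
        stem = image.name[: -len(IMAGE_SUFFIX)]
        label = root / "gtFine" / split / city / f"{stem}{LABEL_SUFFIX}"
        if label.is_file():  # skip images that have no generated label map
            pairs.append((
                image.relative_to(root).as_posix(),
                label.relative_to(root).as_posix(),
            ))
    return pairs


def write_list(pairs: List[Tuple[str, str]], list_path: Path) -> None:
    """Write one `image<TAB>label` pair per line (assumed list format)."""
    list_path.write_text("".join(f"{img}\t{lbl}\n" for img, lbl in pairs))
```

For example, `write_list(collect_pairs(Path("../DATASET"), "train"), Path("../DATASET/train.lst"))` would write a training list next to the data; the `train.lst` file name is hypothetical.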

1. Train from scratch

cd HyCTAS/train
Set the dataset path via ln -s $YOUR_DATA_PATH ../DATASET
Set the output path via mkdir ../OUTPUT
Start training:

export DETECTRON2_DATASETS="$YOUR_DATA_PATH"
NGPUS=8
python -m torch.distributed.launch --nproc_per_node=$NGPUS train.py --world_size $NGPUS --seed 12367 --config ../configs/cityscapes/semantic.yaml

2. Evaluation

We provide trained models and logs, which can be downloaded from Google Drive.

cd train

Download the pretrained weights from Google Drive.
Set config.model_path = $YOUR_MODEL_PATH in semantic.yaml.
Set config.json_file = $HyCTAS_MODEL in semantic.yaml.
Start the evaluation process:

CUDA_VISIBLE_DEVICES=0 python test.py
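The evaluation reports mIoU. For readers unfamiliar with the metric, here is a minimal pure-Python sketch of how mean intersection-over-union is commonly computed from a confusion matrix. It is not the implementation in test.py: the function names are made up, and skipping classes that never occur is one common convention, assumed here. The ignore value 255 is the Cityscapes trainId for pixels excluded from evaluation.

```python
from typing import List, Sequence

IGNORE_LABEL = 255  # Cityscapes trainId for pixels excluded from evaluation


def confusion_matrix(pred: Sequence[int], target: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels per (target class, predicted class), skipping ignored pixels."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t != IGNORE_LABEL:
            matrix[t][p] += 1
    return matrix


def mean_iou(matrix: List[List[int]]) -> float:
    """Average TP / (TP + FP + FN) over classes that appear in target or prediction."""
    ious = []
    for c in range(len(matrix)):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp
        fp = sum(row[c] for row in matrix) - tp
        denom = tp + fp + fn
        if denom > 0:  # assumed convention: skip classes that never occur
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy flattened label maps with two classes and one ignored pixel.
target = [0, 0, 1, 1, IGNORE_LABEL]
pred = [0, 1, 1, 1, 0]
score = mean_iou(confusion_matrix(pred, target, num_classes=2))
print(f"mIoU = {score:.4f}")
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.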

Cite

If you find this repository useful, please use the following BibTeX for citation.

@misc{yu2024realtime,
      title={Real-Time Image Segmentation via Hybrid Convolutional-Transformer Architecture Search}, 
      author={Hongyuan Yu and Cheng Wan and Mengchen Liu and Dongdong Chen and Bin Xiao and Xiyang Dai},
      year={2024},
      eprint={2403.10413},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
