CS 766 Computer Vision Project

Introduction

This project aims to adapt the YOLOv7 model to detect arbitrarily oriented objects in remote-sensing images. To achieve this, modifying the original loss function of the model is required. We obtained a successful result by increasing the number of anchor boxes with different rotating angles and combining the smooth-L1-IoU loss function proposed by R3Det: Refined Single-Stage Detector with Feature Refinement for Rotating Object into the original loss for generating bounding boxes.

Features

Loss Function (only for x, y, w, h, theta)

Setup

Clone repository

$ git clone https://github.com/catchashu10/RotatedObjectDetection.git
$ cd R-YOLOv7/

Create Environment

Conda

Create virual environment

$ conda create -n ryolo python=3.8
$ conda activate ryolo

Install PyTorch and torchvision following the official instructions, e.g.,

If you are using CUDA 11.8 version
$ conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia

Install required libraries
```
$ pip install -r requirements.txt
```
Install detectron2 for calculating SkewIoU on GPU following the official instructions, e.g.,
```
python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'
```

Docker

$ docker build -t ryolo docker/
$ sudo docker run -it --rm --gpus all --ipc=host --ulimit memlock=-1 -v ${your_path}:/workspace ryolo

Download pretrained weights
weights

Make sure your files arrangment looks like the following
Note that each of your dataset folder in data should split into three files, namely train, test, and detect.

R-YOLOv7/
├── train.py
├── test.py
├── detect.py
├── xml2txt.py
├── environment.xml
├── requirements.txt
├── model/
├── datasets/
├── lib/
├── outputs/
├── weights/
    ├── pretrained/ (for training)
    └── UCAS-AOD/ (for testing and detection)
└── data/
    └── UCAS-AOD/
        ├── class.names
        ├── train/
            ├── ...png
            └── ...txt
        ├── test/
            ├── ...png
            └── ...txt
        └── detect/
            └── ...png

Train

The methods to load and train different datasets, such as UCAS-AOD and DOTA can be found at /datasets. The angle of each bounding box is limited in (- pi/2, pi/2], and the height of each bounding box is always longer than its width.

Check all the settings in the .yaml files that you will use in the /data folder.

$ python train.py --model_name DOTA_yolov7_csl_800 --config data/hyp.yaml --img_size 800 --data data/DOTA.yaml --epochs 100 --mode csl --ver yolov7

You can run display_inputs.py to visualize whether your data is loaded successfully.

UCAS-AOD dataset

Please refer to this repository to rearrange files so that the dataset can be loaded and trained by the model.

DOTA dataset

Download the official dataset from here. The original files should be able to be loaded and trained by the model.

Test

$ python test.py --data data/DOTA.yaml --hyp data/hyp.yaml --weight_path weights/DOTA_yolov7_csl_800/best.pth --batch_size 8 --img_size 800 --mode csl --ver yolov7

Detect

$ python detect.py --data data/UCAS_AOD.yaml --hyp data/hyp.yaml --weight_path weights/DOTA_yolov7_csl_800/best.pth --batch_size 8 --img_size 800 --conf_thres 0.3 --mode csl --ver yolov7

Tensorboard

If you would like to use tensorboard for tracking training process.

Open an additional terminal in the same folder where you are running the program.
Run command $ tensorboard --logdir=weights --port=6006
Go to http://localhost:6006/

Results

UCAS_AOD

Method	Plane	Car	mAP
YOLOv7 (smoothL1-iou)	98.70	91.40	95.00

DOTA

Class	Images	Labels	Precision	Recall	mAP
all	1017	21425	0.490	0.531	0.478
plane	1017	655	0.663	0.940	0.785
ship	1017	12745	0.801	0.851	0.858
storage tank	1017	136	0.662	0.485	0.484
baseball diamond	1017	47	0.151	0.362	0.131
tennis court	1017	646	0.784	0.972	0.878
basketball court	1017	70	0.546	0.957	0.719
ground track field	1017	4	0.268	0.500	0.495
harbor	1017	1786	0.564	0.155	0.276
bridge	1017	10	0.000	0.000	0.003
large vehicle	1017	1778	0.633	0.685	0.684
small vehicle	1017	3067	0.720	0.529	0.601
helicopter	1017	29	0.353	0.207	0.084
roundabout	1017	14	0.608	0.555	0.526
soccer ball field	1017	13	0.076	0.154	0.148
swimming pool	1017	425	0.522	0.619	0.493

References

yingkunwu/R-YOLOv4

YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors

@inproceedings{wang2023yolov7,
  title={YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors},
  author={Wang, Chien-Yao and Bochkovskiy, Alexey and Liao, Hong-Yuan Mark},
  booktitle={Proceedings of the IEEE/CVF conference on computer vision and pattern recognition},
  pages={7464--7475},
  year={2023}
}

R3Det: Refined Single-Stage Detector with Feature Refinement for Rotating Object

@article{r3det,
  title={R3Det: Refined Single-Stage Detector with Feature Refinement for Rotating Object},
  author={Xue Yang, Junchi Yan, Ziming Feng, Tao He},
  journal = {arXiv},
  year={2019}
}

catchashu10 / rotatedobjectdetection Goto Github PK

rotatedobjectdetection's Introduction