Read this in other languages: English, 中文
This implementation has been verified on several custom datasets, where it achieved good speed and accuracy. Quantitative results on standard datasets such as PASCAL VOC and COCO will be released soon.
Some work remains to be done:
- Support batch sizes >= 2.
- Add a COCO dataset training example and pre-trained weights.
- Fix the performance problem when using FPN.
- Replace the third-party NMS and roi_align libraries with pure PyTorch. NMS in torchvision is still under development, so this has to wait for that release.
- Keep up with PyTorch 0.4 and the upcoming version 1.0.
PyTorch 0.4 is not supported yet, and versions below 0.3.1 are not guaranteed to work.
Tested versions: python == 3.5.2, torch == 0.3.1, torchvision == 0.2.0
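The version constraint above can be checked before building anything. The following is a minimal sketch, not part of this project: the helper names `parse_version` and `is_supported_torch` are hypothetical, and the accepted range (at least 0.3.1, below 0.4) is simply read off the note above.

```python
def parse_version(text):
    """Turn '0.3.1' or '0.4.0a0+abc' into a tuple of leading integers."""
    parts = []
    # Drop any local build suffix after '+', then read each dotted piece.
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if digits == "":
            break  # stop at the first piece that has no leading digits
        parts.append(int(digits))
    return tuple(parts)


def is_supported_torch(version_text):
    """True if the version is >= 0.3.1 and < 0.4 (range assumed from the note above)."""
    version = parse_version(version_text)
    return (0, 3, 1) <= version < (0, 4)
```

With PyTorch installed, the check could be called as `is_supported_torch(torch.__version__)`.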
git clone [email protected]:GeeshangXu/mask-rcnn-pytorch.git
pip install cffi pillow easydict
Choose your GPU architecture (e.g. sm_61 for a Titan Xp, which has compute capability 6.1), then run
python .\libs\build_libs.py sm_61
architectures | capabilities | example GPU |
---|---|---|
sm_30, sm_32 | Basic features + Kepler support + Unified memory programming | |
sm_35 | + Dynamic parallelism support | |
sm_50, sm_52, sm_53 | + Maxwell support | M40 |
sm_60, sm_61, sm_62 | + Pascal support | Titan XP, 1080(Ti), 1070 |
sm_70 | + Volta support | V100 |
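If the compute capability of the GPU is known as a (major, minor) pair, the architecture string for the build script can be derived rather than looked up by hand. This is a small sketch under assumptions: `ARCH_FAMILIES` and `arch_from_capability` are hypothetical names, and the dictionary only mirrors the rows of the table above.

```python
# Architecture strings from the table above, mapped to their GPU family.
ARCH_FAMILIES = {
    "sm_30": "Kepler", "sm_32": "Kepler", "sm_35": "Kepler",
    "sm_50": "Maxwell", "sm_52": "Maxwell", "sm_53": "Maxwell",
    "sm_60": "Pascal", "sm_61": "Pascal", "sm_62": "Pascal",
    "sm_70": "Volta",
}


def arch_from_capability(major, minor):
    """Build an 'sm_XY' string from a compute capability (major, minor)."""
    arch = "sm_{}{}".format(major, minor)
    if arch not in ARCH_FAMILIES:
        # Only architectures listed in the table are accepted in this sketch.
        raise ValueError("architecture not in the table: " + arch)
    return arch
```

For example, a V100 has compute capability 7.0, so `arch_from_capability(7, 0)` yields `sm_70`. On PyTorch builds that provide `torch.cuda.get_device_capability`, its return value could supply the pair.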
# Review config.ini first and adjust the hyper-parameters as needed.
import sys
# Add this project's root directory to Python's module search path (sys.path).
sys.path.append("/ANY_DIR_YOU_CLONE_AT/mask-rcnn-pytorch/")
from maskrcnn import MaskRCNN
mask_rcnn = MaskRCNN(num_classes=81, pretrained="imagenet")
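To review the hyper-parameters in config.ini before constructing the model, the file can be dumped with the standard-library `configparser`. This is only a sketch: `load_config` is a hypothetical helper, and how the project itself parses config.ini is not shown here, so treat the approach as an assumption.

```python
import configparser


def load_config(path):
    """Read an INI file into a {section: {option: value}} dictionary.

    Note: configparser lower-cases option names and returns values as strings.
    """
    parser = configparser.ConfigParser()
    files_read = parser.read(path)
    if not files_read:
        # parser.read() silently skips missing files, so fail loudly instead.
        raise FileNotFoundError(path)
    return {section: dict(parser[section]) for section in parser.sections()}
```

Printing the result of `load_config("config.ini")` gives a quick overview of every section and value before training.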
- Download the tiny (25 MB) CST-Dataset. Download link: CST-Dataset
- Replace config.ini with examples/cst-dataset/config.ini.
- See the Jupyter notebook example-cst-dataset.ipynb.
release later
(release later)
dataset | train memory (GB) | train time (hr/epoch) | inference time (s/img) | box AP | mask AP |
---|---|---|---|---|---|
PASCAL VOC 2012 | | | | | |
COCO 2017 | | | | | |
The source directories are organized around the internal models and execution stages of Mask R-CNN, decoupling them so that experimental variants are easy to add.
Feature-map extractor backbones for Mask R-CNN, such as ResNet-101-FPN.
RoI (Region of Interest) proposal models, such as RPN and its variants.
Pooling to a fixed-size representation (e.g. 14x14 pixels), such as RoIAlign and its variants.
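To illustrate how pooling produces a fixed-size output from an arbitrary region, here is a simplified, pure-Python sketch of the RoIAlign idea. It is not this project's implementation: it handles a single-channel feature map stored as a list of lists, takes one bilinear sample at the center of each output bin (implementations commonly average several samples per bin), and assumes boxes are given as (x1, y1, x2, y2) in feature-map coordinates.

```python
import math


def bilinear(feature, y, x):
    """Bilinearly interpolate a 2-D list `feature` at continuous (y, x)."""
    h, w = len(feature), len(feature[0])
    # Clamp the sample point to the valid range of the feature map.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def roi_align(feature, box, out_size):
    """Pool `box` to an out_size x out_size grid, one sample per bin center."""
    x1, y1, x2, y2 = box
    bin_w = (x2 - x1) / out_size
    bin_h = (y2 - y1) / out_size
    return [
        [
            bilinear(feature, y1 + (i + 0.5) * bin_h, x1 + (j + 0.5) * bin_w)
            for j in range(out_size)
        ]
        for i in range(out_size)
    ]
```

Whatever the size of the box, the output is always `out_size` by `out_size`; for example, `roi_align(feature, box, 14)` gives a 14x14 grid, and no coordinate is rounded to an integer along the way.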
Prediction heads, including the classification head, bounding-box head, mask head, and their variants.
Utilities, such as a function to calculate IoU, and visualization tools.
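As a reference for what an IoU utility computes, here is a minimal sketch for two axis-aligned boxes. The function name `box_iou` and the (x1, y1, x2, y2) box format are assumptions made for illustration; the project's own utility may use a different signature or operate on batches of boxes.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

Two 10x10 boxes offset by 5 pixels in each direction overlap in a 5x5 square, so their IoU is 25 / 175.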
Unit tests and sanity checks.
Third-party libraries that this project is based on.