Modified from official RetinaFace
This repository contains testing (inference) code only; training is not supported.
The official code spends too much CPU time generating anchors (in generate_anchor:anchors_plane -> cython:anchors_cython). With 4 processes running, every core went up to 100%. It also supports only single-image prediction.
There is no need to generate all anchors before selecting proposals, so I changed the pipeline to:
- Select proposal indices where (conf > threshold) to produce (image_idx, row_idx, col_idx)
- Generate anchors for the selected indices only (which is much more lightweight)
- Generate landmarks for the selected indices only
- Run batched_nms (from torchvision)
I moved all operations to the GPU, which should be much faster.
See test_all.py
pip install torchsul
pip install opencv-python
All other packages are already included in Anaconda.
If you want to use RetinaFace-R50, download from here