mtcnn-light's Introduction

MTCNN-light

Introduction

This repository implements MTCNN without any deep learning framework; it only needs OpenCV and OpenBLAS.
It follows "Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Neural Networks" and is written in C++.
It is easy to use.
It can be embedded in your project without a framework such as Caffe or MXNet.
It runs in real time on VGA input, and you can improve its runtime further.

Time Cost

The average time cost is about 68 ms per frame (640x480). The result was measured on live camera input with mini_size set to 40.
CPU: i5-4590
OS: Windows 10 64-bit

Dependencies

opencv 2.0+
openblas

Ubuntu

opencv

You can find many installation tutorials online.

openblas

It is easy to install:
1. Download the source code from https://github.com/xianyi/OpenBLAS
2. Extract it and type "cd xxx", where xxx is the extracted directory
3. Type "make"
4. Type "make install PREFIX=your_installation_directory"

If you don't have cmake:

apt-get install cmake

usage

cd root_directory
vim CMakeLists.txt
change include_directories(the_directory_of_openblas_include_of_yours)
change link_directories(the_directory_of_openblas_lib_of_yours)
save and exit

cmake .
make
./main

for windows

opencv and openblas

There are binary packages of OpenBLAS for Windows; you just need to download one.
Be careful: if you download the 64-bit package, you must also configure
OpenCV and the Visual Studio project for 64-bit. Do not choose x86.

mtcnn-light's People

Contributors

alphaqi


mtcnn-light's Issues

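Timing is computed incorrectly; face detection on a VGA image actually takes about 400 ms

The code reads:
clock_t start; start = clock(); find.findFace(image); imshow("result", image); imwrite("result.jpg",image); start = clock() -start; cout<<"time is "<<start/10e3<<endl;
This is wrong.

start should first be converted to seconds by dividing by CLOCKS_PER_SEC, and then to milliseconds.
Since 1 s = 1000 ms, converting from seconds to milliseconds means multiplying by 1000.
The correct code should be:
cout<<"time is "<<(double)start/CLOCKS_PER_SEC*1000<<" ms "<<endl;
With this, the face detection time comes out as 400 ms, i.e. 0.4 s.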
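
The conversion discussed in this issue can be sketched as follows. This is a minimal illustration, not code from the repository: `ticksToMs` and `wallClockMs` are hypothetical helper names. Note that `10e3` is 10000, and that `clock()` reports processor time on POSIX systems, so with a multi-threaded BLAS it can differ from elapsed time; the second helper uses `std::chrono::steady_clock` for wall-clock timing instead.

```cpp
#include <chrono>
#include <ctime>

// Convert a clock() tick difference to milliseconds.
// Dividing by CLOCKS_PER_SEC gives seconds; multiplying by 1000 gives ms.
double ticksToMs(std::clock_t ticks) {
    return static_cast<double>(ticks) / CLOCKS_PER_SEC * 1000.0;
}

// Measure the elapsed (wall-clock) time of any callable, in milliseconds.
// (hypothetical helper; the callable would wrap e.g. find.findFace(image))
template <typename F>
double wallClockMs(F&& work) {
    const auto begin = std::chrono::steady_clock::now();
    work();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - begin).count();
}
```

A call such as `wallClockMs([&] { find.findFace(image); })` would time only the detection step, leaving `imshow` and `imwrite` out of the measurement.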

OpenCV Error: Assertion failed

Hi,

After a while, around a few thousand frames, it gives the error below. I couldn't figure it out.

Has anybody faced the same problem?

Thanks

OpenCV Error: Assertion failed (0 <= roi.x && 0 <= roi.width && roi.x + roi.width <= m.cols && 0 <= roi.y && 0 <= roi.height && roi.y + roi.height <= m.rows) in Mat, file /opt/concourse/worker/volumes/live/d8bcd4d1-79b2-4aa5-797a-b95097f1118f/volume/opencv_1512680501887/work/modules/core/src/matrix.cpp, line 538
/opt/concourse/worker/volumes/live/d8bcd4d1-79b2-4aa5-797a-b95097f1118f/volume/opencv_1512680501887/work/modules/core/src/matrix.cpp:538: error: (-215) 0 <= roi.x && 0 <= roi.width && roi.x + roi.width <= m.cols && 0 <= roi.y && 0 <= roi.height && roi.y + roi.height <= m.rows in function Mat
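
The assertion above fires when a region of interest extends outside the image. One common guard is to clamp each box to the image bounds before cropping. The sketch below uses a plain `Box` struct and a hypothetical `clampToImage` helper so that it stands alone; it is an assumption about where the fix would go, not the repository's code. With OpenCV types, intersecting a `cv::Rect` with `cv::Rect(0, 0, m.cols, m.rows)` through `operator&` has the same effect.

```cpp
#include <algorithm>

// Plain stand-in for a rectangle (x, y, width, height), for illustration only.
struct Box {
    int x, y, width, height;
};

// Return the part of `b` that lies inside a cols x rows image.
// The result always satisfies 0 <= x, x + width <= cols (same for y/rows);
// width or height is 0 when the box lies entirely outside the image.
Box clampToImage(const Box& b, int cols, int rows) {
    const int x1 = std::min(std::max(b.x, 0), cols);
    const int y1 = std::min(std::max(b.y, 0), rows);
    const int x2 = std::min(std::max(b.x + b.width, 0), cols);
    const int y2 = std::min(std::max(b.y + b.height, 0), rows);
    return Box{x1, y1, std::max(x2 - x1, 0), std::max(y2 - y1, 0)};
}
```

Boxes that come back with zero width or height should be skipped rather than cropped.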

Thanks to the author; I found some bugs

First, most of the area calculations are missing the +1.
When I feed in video, there are many false positives. The problem is here:
firstBbox_.clear();
firstOrderScore_.clear();
secondBbox_.clear();
secondBboxScore_.clear();
thirdBbox_.clear();
thirdBboxScore_.clear();
A correct version, for example:
//second stage
...
firstBbox_.clear();
firstOrderScore_.clear();
if(count<1) return faces;
...
//third stage
secondBbox_.clear();
secondBboxScore_.clear();
if(count<1) return faces;
//after third stage
thirdBbox_.clear();
thirdBboxScore_.clear();
Also:
mtcnn.cpp
bbox.x1 = round((stride*row+0.5)/scale);
bbox.y1 = round((stride*col+0.5)/scale);
bbox.x2 = round((stride*row+cellsize)/scale);
bbox.y2 = round((stride*col+cellsize)/scale);
network.cpp
if((*it).x2>height-1)(*it).x2 = height - 1;
if((*it).y2>width-1)(*it).y2 = width - 1;
Might this be better?

Why `INTER_LINEAR` instead of `INTER_AREA` for resizing

What

Since you use INTER_LINEAR for resizing images in mtcnn,
I'd be happy if I could hear why not INTER_AREA from you.

If you have no particular policy for this interpolation, that would also be enough of an answer for me!

Thanks in advance!!

Would you add LNet support later?

Is there any plan to add the MTCNNv2 LNet, which uses an image patch around each landmark output by stage 3 to make a more precise regression?

OpenBLAS thread number issue

Today I accidentally changed:

export OPENBLAS_NUM_THREADS=4

to

export OPENBLAS_NUM_THREADS=1
Then the detection timing got better?!

It went from 70 ms to 20 ms, and the CPU usage decreased from 99% to 30%.

I am really confused.

Any idea why?

The OS is the latest Raspbian, on an RPi 3B+.

How do you run the compiled binaries?

Im using FreeBSD 13.1
openCV 4.5

I was able to compile the project.
Clueless on how to run the compiled binaries...

I get this error:

$ ./main /dev/video4

[ WARN:[email protected]] global /usr/ports/graphics/opencv/work/opencv-4.6.0/modules/imgcodecs/src/loadsave.cpp (239) findDecoder imread_('4.jpg'): can't open/read file: check file path/integrity

Abort trap (core dumped)

Thanks.

Adding optimizations reduces the run time to 10% of what it's getting now

the things I changed were:

  1. Compiling OpenCV from source reduced the runtime considerably, from 50 ms to 10 ms on my system. I used the master branch. OpenCL was giving me problems with the OpenCV version installed from the repositories.
  2. Adding an optimization level of -O3 to cmake further reduced the runtime from 10 ms to 5 ms.

This was run under Fedora 27 on an i7-2600K processor. I do have a GTX 980 Ti installed, but I don't think OpenCL used it. OpenCV did not have CUDA support in either version.

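Could this be implemented without OpenBLAS?

I have recently been trying to implement this project without OpenBLAS, but the results are always wrong. I don't know where the discrepancy is; I may be misunderstanding the OpenBLAS matrix multiplication step.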

the error "the feature2MatrixInit failed!!"

The problem still exists.
First of all, I want to test a 1920*1080 video.
When I set "minsize" to 60 (the default), the application prints "the minsize is too small,please change it". When I set "minsize" to 80, it prints "the feature2MatrixInit failed!!" followed by a segmentation fault.
Then I changed "mtcnn find(image.rows, image.cols)" to "mtcnn find(640,480)" in pikaqiu.cpp, and the program finally ran. However, I am not sure about the accuracy, and I do not understand why this works.
BTW, I think the timing is not correct. I am on Ubuntu. Your test code uses clock()/10e3; maybe it should use clock()/CLOCKS_PER_SEC.
Thanks
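
For readers following this report, the way the minimum face size and the frame size interact can be sketched with the scale pyramid used by the reference MTCNN implementation. This is an assumption-laden illustration: `pyramidScales` is a hypothetical name, the 12-pixel P-Net input and the 0.709 factor come from the reference implementation, and this repository may compute its pyramid differently (another issue on this page reports that it takes the longer side rather than the shorter one).

```cpp
#include <algorithm>
#include <vector>

// Scale factors for the image pyramid, reference-MTCNN style.
// Each scale s is kept only while min(rows, cols) * s is still >= 12,
// the P-Net input size, so no pyramid level is smaller than the network input.
std::vector<float> pyramidScales(int rows, int cols, int minsize,
                                 float factor = 0.709f) {
    std::vector<float> scales;
    // Guard against inputs that would never terminate or make no sense.
    if (rows <= 0 || cols <= 0 || minsize <= 0 || factor <= 0.0f || factor >= 1.0f) {
        return scales;
    }
    const float netInput = 12.0f;
    float scale = netInput / minsize;               // maps a minsize face to 12 px
    float minSide = std::min(rows, cols) * scale;   // shorter side at this scale
    while (minSide >= netInput) {
        scales.push_back(scale);
        scale *= factor;
        minSide *= factor;
    }
    return scales;
}
```

If the longer side is used in place of `std::min`, the last levels can end up with a shorter side below 12 pixels, which would leave later layers with non-positive dimensions.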

segmentation fault in Raspberry

Hi,

Same code :

`while(true){
         start = clock();
         cap>>image;
         cv::resize(image, image, cv::Size(), 1.0 * FACE_DOWNSAMPLE_RATIO, 1.0 * FACE_DOWNSAMPLE_RATIO);

         find.findFace(image);

         imshow("result", image);
         if( waitKey(1)>=0 ) break;
         start = clock() -start;
         cout<<"time is  "<<start/10e3<<endl;
     }`

gives a segmentation fault on a Raspberry Pi 3, at

find.findFace(image)

Any clue?

Why the time is "start/10e3" but not "start/1e3"?

Your code is:
clock_t start; start = clock(); find.findFace(image); imshow("result", image); imwrite("result.jpg",image); start = clock() -start; cout<<"time is "<<start/10e3<<endl;
Why do you compute the time in ms with "start/10e3"? That is, how should the time in ms be computed in C++? I think "start/1e3" is right.

Looking forward to your reply.

best.

Hello, these two places may cause errors

1. mtcnn::mtcnn(int row, int col)
float minl = row>col?row:col;
This can make the width or height of the image at the last scale smaller than 12, which in turn makes the height or width of this->conv3_matrix negative and triggers the error in feature2MatrixInit.
Did you mean row<col?row:col; instead?

2. maxPooling
if ((pbox->width - kernelSize) % stride == 0)
This sometimes fails: the pointer goes out of bounds at if(maxNum<*(ptemp+i+kernelRow*pbox->width)). I found that in this case (pbox->width - kernelSize) % stride == 0 but (pbox->height - kernelSize) % stride != 0.
Should a height condition be added as well, i.e.
if ((pbox->width - kernelSize) % stride == 0 && (pbox->height - kernelSize) % stride == 0)
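
The second point comes down to handling width and height independently. A minimal sketch of that idea, assuming ceil-rounded pooling as in Caffe (the names `pooledLength` and `pooledShape` are hypothetical and not taken from the repository):

```cpp
// Output length along one axis for max pooling with ceil rounding:
// ceil((inputLen - kernel) / stride) + 1.
// Returns 0 for inputs smaller than the kernel or a non-positive stride.
int pooledLength(int inputLen, int kernel, int stride) {
    if (stride <= 0 || kernel <= 0 || inputLen < kernel) return 0;
    const int span = inputLen - kernel;
    // An exact fit adds one window; a remainder adds one extra, partial window.
    return span / stride + ((span % stride == 0) ? 1 : 2);
}

struct PoolShape {
    int width;
    int height;
};

// Each axis is evaluated on its own, so an exact fit in width
// says nothing about whether the last row window is partial.
PoolShape pooledShape(int width, int height, int kernel, int stride) {
    return PoolShape{pooledLength(width, kernel, stride),
                     pooledLength(height, kernel, stride)};
}
```

When an axis has a partial last window, the pooling loop also has to stop reading at the input edge on that axis, which is where the out-of-bounds read described above would come from.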

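Thank you for open-sourcing this!
With the originally trained model, detection on near-infrared images is not satisfactory. Should the model be retrained on near-infrared data?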

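In what order are the txt files written?

Hello! How are the txt files written? For example, I am using TensorFlow parameter files, where a convolution layer has the layout (kernel, kernel, inputmap, outputmap). In what order should these four dimensions be written to the txt file? I saw the earlier suggestion to use reshape, but that does not work.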
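
The source does not state the order used by the .txt files, so the following is only a sketch under an assumption: that the weights are expected in Caffe order (outputmap, inputmap, kernel row, kernel column). Under that assumption, TensorFlow weights in (kernel, kernel, inputmap, outputmap) order need a transpose, which moves elements in memory; a reshape alone only changes the shape labels, which would match the observation that reshape does not work. `hwioToOihw` is a hypothetical helper name.

```cpp
#include <cstddef>
#include <vector>

// Reorder a flat convolution kernel from TensorFlow layout
// (kh, kw, inC, outC) to Caffe layout (outC, inC, kh, kw).
// Returns an empty vector if the size does not match the dimensions.
std::vector<float> hwioToOihw(const std::vector<float>& src,
                              int kh, int kw, int inC, int outC) {
    if (kh <= 0 || kw <= 0 || inC <= 0 || outC <= 0) return {};
    const std::size_t total = static_cast<std::size_t>(kh) * kw * inC * outC;
    if (src.size() != total) return {};
    std::vector<float> dst(total);
    for (int o = 0; o < outC; ++o)
        for (int i = 0; i < inC; ++i)
            for (int h = 0; h < kh; ++h)
                for (int w = 0; w < kw; ++w) {
                    // Index in the source (H, W, I, O) layout.
                    const std::size_t s =
                        ((static_cast<std::size_t>(h) * kw + w) * inC + i) * outC + o;
                    // Index in the destination (O, I, H, W) layout.
                    const std::size_t d =
                        ((static_cast<std::size_t>(o) * inC + i) * kh + h) * kw + w;
                    dst[d] = src[s];
                }
    return dst;
}
```

The reordered values would then be written to the file one after another in `dst` order, followed by the biases if the loader expects them there; that last detail is also an assumption to check against the loading code.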

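The code comments in network are garbled

Could you tell me the file encoding (presumably a Chinese one)? I have tried both utf8 and gb2312, and the comments still do not display correctly (e.g. line117 in network.cpp).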

Maybe your time is not correct

Hi, I found that your test code uses clock()/10e3. I think this is a bug; it should perhaps be clock()/1e3, so the time may actually be 260 ms?

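.TXT parameter files

Hello, how is the data organized in your .txt parameter files? For example, for the first convolution layer of PNet the weights are 3*3*1*10 with a single-channel input image; the code uses matrix multiplication and converts the input feature planes into row vectors. In what order are the weights written to the .txt file? Many thanks!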

How to use padding based on this framework?

If I set pad to 1 when initializing conv1_wb, like this: initConvAndFc(this->conv1_wb, 20, 3, 3, 1, 1),
there is an error: free(): invalid pointer: 0x00007fab8d8297d8
I also found that in the function convolutionInit(), the padding size is not considered when calculating the size of outpBox.
Can I use padding with this framework, and if so, how?
Thank you very much
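
For reference, the usual output-size formula for a convolution with padding is (input + 2*pad - kernel) / stride + 1. The sketch below shows it with a hypothetical `convOutputLength` helper; how convolutionInit() should be changed is not shown in the source, so this only illustrates the arithmetic that an output buffer size would need to follow. If the output buffer is sized without the padding term while the convolution writes the padded result, a heap error like the one reported may follow.

```cpp
// Output length along one axis for a convolution with symmetric zero padding.
// Returns 0 when the padded input is smaller than the kernel
// or when the arguments are not usable.
int convOutputLength(int inputLen, int kernel, int stride, int pad) {
    if (stride <= 0 || kernel <= 0 || pad < 0 || inputLen <= 0) return 0;
    const int padded = inputLen + 2 * pad;
    if (padded < kernel) return 0;
    return (padded - kernel) / stride + 1;
}
```

With a 3x3 kernel, stride 1 and pad 1, the output keeps the input size; with pad 0 it shrinks by 2 in each dimension.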
