A skeleton-based real-time online action recognition project that classifies and recognizes actions based on framewise joints; it can be used for safety monitoring. (The code comments are partly written in Chinese.)
The pipeline of this work is:
- Real-time pose estimation by OpenPose;
- Online human tracking for multi-person scenarios by the DeepSort algorithm;
- Action recognition with a DNN for each person, based on single-frame joints detected by OpenPose.
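Below is a minimal sketch of this per-frame flow. The three stage functions are illustrative placeholders, not the repo's actual API:

```python
# Hypothetical stubs for the three pipeline stages; the real implementations
# live in the corresponding modules of this repo.
import cv2

def estimate_pose(frame):          # OpenPose: frame -> list of per-person joints
    raise NotImplementedError

def track_people(frame, poses):    # DeepSort: poses -> [(track_id, joints), ...]
    raise NotImplementedError

def classify_action(joints):       # DNN: single-frame joints -> action label
    raise NotImplementedError

cap = cv2.VideoCapture(0)          # webcam
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    for track_id, joints in track_people(frame, estimate_pose(frame)):
        print(track_id, classify_action(joints))
cap.release()
```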
- python >= 3.5
- OpenCV >= 3.4.1
- sklearn
- tensorflow & keras
- numpy & scipy
- pathlib
- NVIDIA graphics card driver
- CUDA toolkit (must match the driver version)
- cuDNN (must match the CUDA version)
$ pip3 install tensorflow-gpu
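After installing, you can sanity-check that TensorFlow sees the GPU (using the TF 1.x-era API that `tensorflow-gpu` provides):

```python
import tensorflow as tf
# Prints True only if the driver / CUDA / cuDNN stack is set up correctly.
print(tf.test.is_gpu_available())
```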
$ git clone https://github.com/zonghan0904/Online-Realtime-Action-Recognition-based-on-OpenPose.git
- Follow the installation instructions in the README
- Download the OpenPose VGG tf-model from the command line:
./download.sh
(`/Pose/graph_models/VGG_origin`) or fork it here, and place it under the corresponding folder. `VGG_origin`: trained with the VGG net, the same as the Caffe model provided by CMU; more accurate but slower. `mobilenet_thin`: trained with MobileNet, much smaller than the original VGG; faster but less accurate. Note, however, that the action dataset in this repo was collected while running the VGG model.
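For reference, a frozen TF graph like the downloaded model can be loaded with the standard TF 1.x API. This is only a sketch, and the `.pb` filename below is an assumption (use whatever `download.sh` actually fetched):

```python
import tensorflow as tf

# Read the frozen graph definition from disk (path/filename assumed).
graph_def = tf.GraphDef()
with tf.gfile.GFile('Pose/graph_models/VGG_origin/graph_opt.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

# Import it into a fresh graph, ready for session.run on its tensors.
graph = tf.Graph()
with graph.as_default():
    tf.import_graph_def(graph_def, name='')
```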
- Run `python save_video.py`: it starts the webcam and saves the video.
- Run `python collect_data.py`: it starts the webcam and generates the joint data (training data) per frame as a `.txt` file. (You can test on a video with `python collect_data.py --video=test.mp4`.)
- Run `python test_webcam.py`: it starts the webcam and classifies actions. (You can test on a video with `python test_webcam.py --video=test.mp4`.)
- Run `python test_AE450.py`: it starts the AE450 and classifies actions.
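A sketch of the `--video` fallback pattern used by `collect_data.py` and `test_webcam.py` (illustrative, not the scripts' exact code): read from the given file if provided, otherwise fall back to the webcam.

```python
import argparse
import cv2

parser = argparse.ArgumentParser()
parser.add_argument('--video', default=None,
                    help='path to a test video; webcam if omitted')
args = parser.parse_args()

# VideoCapture accepts either a file path or a device index (0 = webcam).
cap = cv2.VideoCapture(args.video if args.video else 0)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    cv2.imshow('frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
```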
- Prepare the data (actions) by running `collect_data.py`; the raw data is saved as a `.txt` file.
- Convert the `.txt` to `.csv`; you can use Excel for this (or the script sketched after this list).
- Do the training with `train.py` in `Action/training/`; remember to change the action enum and the output layer of the model.
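If you prefer not to use Excel, the `.txt`-to-`.csv` step can be scripted; a minimal sketch, assuming each line of the `.txt` is one frame of comma-separated joint values (the filename is illustrative):

```python
import csv

# Copy non-blank lines of the joint dump into a CSV file.
with open('sit.txt') as src, open('sit.csv', 'w', newline='') as dst:
    writer = csv.writer(dst)
    for line in src:
        values = line.strip().split(',')
        if values and values[0]:   # skip blank lines from undetected frames
            writer.writerow(values)
```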
※ Since the Human3.6M dataset can only be downloaded after registering on the Human3.6M site and passing review, and our application has still not been approved, both training and validation use a self-made dataset, as mentioned in a previous progress report.
- Output classes include (a matching enum sketch follows this list):
  - Sitting
  - Waving
  - Falling
  - Other (unclassified)
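A sketch of a matching `Actions` enum (the member names are illustrative; the numeric labels follow the dataset convention described below):

```python
from enum import Enum

class Actions(Enum):
    sit = 0        # Sitting
    wave = 1       # Waving
    fall_down = 2  # Falling
    other = 3      # Other (unclassified)
```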
- Acceptance test: static pose recognition over 100 frames
- ROS Topic of raw image: /camera0/color/image_raw
- ROS Topic of depth image: /camera0/aligned_depth_to_color/image_raw
(the default height and width of the depth image are not the same as the raw image's; they ==need to be aligned== with `roslaunch realsense2_camera rs_camera.launch align_depth:=true`)
- ROS Topic of results image: /ae450/image/color
- ROS Topic of ROI: /ae450/image/roi (set x_offset=pixel_x, set y_offset=pixel_y, set height=depth; a subscriber sketch follows this list)
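A minimal `rospy` subscriber for the ROI topic, following the field convention above (`x_offset`/`y_offset` carry the pixel coordinates, `height` carries the depth):

```python
import rospy
from sensor_msgs.msg import RegionOfInterest

def on_roi(msg):
    # x_offset/y_offset = pixel coordinates, height = depth (per the convention above)
    print('pixel: (%d, %d), depth: %d' % (msg.x_offset, msg.y_offset, msg.height))

rospy.init_node('roi_listener')
rospy.Subscriber('/ae450/image/roi', RegionOfInterest, on_roi)
rospy.spin()
```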
- Record videos for each action class you want to recognize, as shown in the figure below. For example, to recognize 4 classes, you need videos of at least those 4 classes. Each action-class video should ideally meet the following conditions:
  - A single subject; multiple subjects in frame will interfere with collecting the training dataset later.
  - A background as plain as possible, so that OpenPose does not falsely detect joints and pollute the training dataset.
  - The subject should appear as evenly as possible in different positions and at different angles in the frame.
- Run `python collect_data.py --video={action class}.mp4` to generate the joint data for all frames in which the subject performs that class of action; the joint data is produced as a `.txt` file with the same name as the video.
- Using `empty_data.csv` as a template, make a copy named `ncrl_data_reduced.csv`. Rename every `.txt` file to `.csv` and paste their contents, in order, into columns A through AJ of `ncrl_data_reduced.csv`; in column AK, fill in the numeric label of that action class. The labels are self-defined (sitting = 0, waving = 1, falling = 2, other = 3). In addition, delete the blank rows between data rows; they appear when OpenPose detected no joint data in that frame, and can be removed with the spreadsheet's filter feature.
- If values appear in columns AL and beyond in any `.csv`, it is because OpenPose detected (or falsely detected) more than one person in that frame and therefore produced more than one set of joint data; do not copy these into `ncrl_data_reduced.csv` (only columns A through AK of `ncrl_data_reduced.csv` may contain values; the other columns must be empty).
- Sort the dataset by class from low to high. Also note that if OpenPose did not detect a given joint in a frame, that field is filled with 0; to avoid too much meaningless training data, filter out rows with too many zeros. For our training data we filtered out rows with more than 12 zeros (a script automating these cleanup steps is sketched below). Finally, save the file and copy `ncrl_data_reduced.csv` to `Online-Realtime-Action-Recognition-based-on-OpenPose/Action/training`.
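The spreadsheet steps above can also be automated; a sketch assuming four per-class dumps already renamed to `.csv` (the filenames are illustrative), 36 joint values per row (columns A-AJ), and the zero-row filter described above:

```python
import csv

CLASS_FILES = {'sit.csv': 0, 'wave.csv': 1, 'fall.csv': 2, 'other.csv': 3}

rows = []
for path, label in CLASS_FILES.items():
    with open(path) as f:
        for line in f:
            values = [v for v in line.strip().split(',') if v]
            if len(values) < 36:         # blank row: OpenPose detected nothing
                continue
            joints = [float(v) for v in values[:36]]  # keep columns A-AJ only
            if joints.count(0.0) > 12:   # too many undetected joints
                continue
            rows.append(joints + [label])             # class label in column AK

rows.sort(key=lambda r: r[-1])           # sort by class, low to high
with open('ncrl_data_reduced.csv', 'w', newline='') as out:
    csv.writer(out).writerows(rows)
```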
- Change the `Actions` class in `Action/action_enum.py` and `Action/training/train.py` to the actions to be recognized and their numeric labels. You can also refer to the figure below and paste the per-class data counts from `ncrl_data_reduced.csv` into the corresponding part of the code.
- Run `python train.py` to start training. The generated model is saved under `Action/training/`; there is no need to move it, since `python test_AE450.py` and `python test_webcam.py` look for the model in that path (a minimal training sketch follows this list).
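A minimal training sketch: a small dense network whose output layer has one unit per action class (4 here). The layer sizes and model filename are illustrative, not necessarily the repo's exact architecture:

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import to_categorical

# Columns 0-35 are the joint coordinates (A-AJ), column 36 is the class (AK).
data = np.loadtxt('ncrl_data_reduced.csv', delimiter=',')
X, y = data[:, :36], to_categorical(data[:, 36], num_classes=4)

model = Sequential([
    Dense(128, activation='relu', input_shape=(36,)),
    Dense(64, activation='relu'),
    Dense(4, activation='softmax'),   # output layer: one unit per action class
])
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(X, y, epochs=100, batch_size=32, validation_split=0.1)
model.save('action_model.h5')         # keep the model under Action/training/
```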
- terminal 1
$ roscore
- terminal 2 (execute the program)
$ python3 test_AE450.py
- terminal 3 (play the recorded data)
$ rosbag play {the bag file in bag/}
# the raw image topic should be named: /camera0/color/image_raw
# the depth image topic should be named: /camera0/aligned_depth_to_color/image_raw
- terminal 4 (results visualization)
$ rviz # subscribe /ae450/image/color and show the image
$ rostopic echo /ae450/image/roi
- terminal 1 (execute the program)
$ python3 test_webcam.py --video {path to the video}.mp4
# The results will be displayed in an OpenCV window
Thanks to the following awesome works: