
isbfsar's Introduction

Interactive Open-Set Skeleton-Based One-Shot Action-Recognition


The aim of this project is to provide an efficient pipeline for Action Recognition in Human-Robot Interaction.

The full 3D human pose is estimated and used to recognize which action in the support set the human is performing. Actions can be added to or removed from the support set at any moment. An Open-Set score confirms or rejects the Few-Shot prediction to avoid false positives, and a Mutual Gaze Constraint can be attached to an action as an additional filter.
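The support-set matching and open-set rejection described above can be sketched as follows. The embedding dimension, the prototype-per-action layout, and the cosine-similarity threshold are illustrative assumptions, not the repository's actual model:

```python
import numpy as np

def predict_action(query_emb, support_set, os_threshold=0.5):
    """Few-shot prediction with an open-set rejection score.

    query_emb:   (D,) embedding of the observed action (hypothetical shape)
    support_set: dict mapping action name -> (D,) prototype embedding
    Returns the predicted action name, or None if rejected as unknown.
    """
    names = list(support_set)
    protos = np.stack([support_set[n] for n in names])
    # Cosine similarity between the query and every support prototype
    sims = protos @ query_emb / (
        np.linalg.norm(protos, axis=1) * np.linalg.norm(query_emb) + 1e-8)
    best = int(np.argmax(sims))
    # Open-set score: confirm the few-shot prediction only if it is
    # similar enough, otherwise reject it to avoid a false positive.
    if sims[best] < os_threshold:
        return None
    return names[best]

# Adding/removing an action is just editing this dict at runtime.
support = {"wave": np.array([1.0, 0.0]), "point": np.array([0.0, 1.0])}
print(predict_action(np.array([0.9, 0.1]), support))    # close to "wave"
print(predict_action(np.array([-1.0, -1.0]), support))  # dissimilar to all
```

This also illustrates why the support set can change "at any moment": prediction is nearest-prototype matching, so no retraining is needed when an entry is added or removed.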

Modules

This repository contains different modules:

  • hpe — Human Pose Estimation
  • ar — Action Recognition
  • focus — Mutual Gaze detection

Installation

The program is divided into two parts:

  • source.py runs on the host machine: it connects to the RealSense (or a webcam), provides frames to main.py, and visualizes the results with the VISPYVisualizer.
  • main.py runs either in a Conda environment or in a Docker container and is responsible for all the computation.
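The two-process split above implies that frames and results cross a process boundary between the host and the container. A minimal sketch of one way such a handoff can work, using a length-prefixed pickle over a local socket (the repository's actual transport may differ; the socket pair here merely stands in for the host-to-container link):

```python
import pickle
import socket
import struct
import threading

import numpy as np

def send_frame(sock, frame):
    # Length-prefixed pickle: the receiver learns how many bytes to expect
    payload = pickle.dumps(frame)
    sock.sendall(struct.pack(">I", len(payload)) + payload)

def recv_frame(sock):
    (length,) = struct.unpack(">I", sock.recv(4))
    buf = b""
    while len(buf) < length:  # recv may return partial chunks
        buf += sock.recv(length - len(buf))
    return pickle.loads(buf)

# Demo over a local socket pair standing in for source.py -> main.py
a, b = socket.socketpair()
frame = np.zeros((480, 640, 3), dtype=np.uint8)  # a webcam-sized RGB frame
threading.Thread(target=send_frame, args=(a, frame)).start()
received = recv_frame(b)
print(received.shape)
```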

Since the hpe module is accelerated with TensorRT engines, which must be built on the target machine, the provided Dockerfile builds the engines during the image build, allowing for a fast installation. Check here the instructions to install the Human Pose Estimation module.

Run with Docker

Follow the instructions inside the README.md of every module: hpe, ar, and focus. Install Vispy and pyrealsense2, then build the Docker image with:

docker build -t ecub .

To run, start two separate processes:

python manager.py
python source.py

Launch the main script with the following command (replace PATH with %cd% on Windows or $(pwd) on Ubuntu):

docker run -it --rm --gpus=all -v "PATH":/home/ecub ecub:latest python main.py

isbfsar's People

Contributors

steb6

isbfsar's Issues

Getting error in loading the engine

@StefanoBerti @andrearosasco I've used this command:

docker run -it --rm --gpus=all -v "D:/amnt/Quidich_pose_estimation_poc/ISBFSAR":/home/ecub ecub:latest python modules/hpe/hpe.py

and I am getting the error below. Please help me with this.

==========
== CUDA ==
==========

CUDA Version 11.3.1

Container image Copyright (c) 2016-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

Ubuntu: True
2023-03-27 07:23:06.763 | INFO     | utils.tensorrt_runner:__init__:22 - Loading yolo engine...
[03/27/2023-07:23:07] [TRT] [E] 1: [stdArchiveReader.cpp::StdArchiveReader::42] Error Code 1: Serialization (Serialization assertion stdVersionRead == serializationVersion failed.Version tag does not match. Note: Current Version: 232, Serialized Engine Version: 213)
[03/27/2023-07:23:07] [TRT] [E] 4: [runtime.cpp::deserializeCudaEngine::66] Error Code 4: Internal Error (Engine deserialization failed.)
Traceback (most recent call last):
  File "modules/hpe/hpe.py", line 181, in <module>
    h = HumanPoseEstimator(MetrabsTRTConfig(), RealSenseIntrinsics())
  File "modules/hpe/hpe.py", line 42, in __init__
    self.yolo = Runner(model_config.yolo_engine_path)  # model_config.yolo_engine_path
  File "/home/ecub/utils/tensorrt_runner.py", line 36, in __init__
    for binding in engine:
TypeError: 'NoneType' object is not iterable

I noticed that your repo was updated

Hi @StefanoBerti, I noticed that your repo was updated. I tested it, but the following errors occurred:

  1. Running "modules/hpe/metrabs_trt/utils/extract_onnxs_from_metrabs.py":
    File "D:\python\scripts\AR-main3\modules\hpe\metrabs_trt\utils\extract_onnxs_from_metrabs.py", line 122, in metr_head  *
        pred2d, pred3d = model.crop_model.heatmap_heads.conv_final(tf.cast(my_prediction_inputs, tf.float16), training=False)
    ValueError: Found zero restored functions for caller function.
  2. Running "modules/hpe/metrabs_trt/utils/from_pytorch_to_onnx.py":
    File "D:\python\scripts\AR-main3\modules\hpe\metrabs_trt\utils\from_pytorch_to_onnx.py", line 5, in <module>
        torch_out = model(x)
    TypeError: 'collections.OrderedDict' object is not callable
  3. If I create engine files from old onnx files and run "main.py", it reports that these files cannot be found:
    'modules/hpe/metrabs_trt/models/numpy/heads_weight.npy'
    'modules/hpe/metrabs_trt/models/numpy/heads_bias.npy'
    'modules/ar/trx/checkpoints/debug.pth'

    How do I get or create these files? Can you send them to me for testing?
  4. What are the functions of the new TRX and LSTM?
  5. Do you have a GTX 1060 or GTX 1080 graphics card? Can you test its compatibility with your program?

smpl pose?

I have run only hpe.py, and I noticed the SMPL references in the code, so:

Q1: Are the results of hpe.py the 3D keypoints, the skeleton lengths, and the human bounding box?
Q2: I want to get the SMPL pose from the results; how can I do that? I see that the metrabs project shows SMPL images. I'm checking the metrabs code; can you give me some advice or point me to a Python module for this?

different image shape

When I change the test type from the RealSense pipeline (selected by device name) to an mp4 file read with OpenCV, the input frame size passed to h.estimate() is 640×480, but when I run hpe.py it shows:

    Traceback (most recent call last):
      File "D:/ISBFSAR/modules/hpe/demo.py", line 259, in <module>
        res = h.estimate(img)
      File "D:/ISBFSAR/modules/hpe/demo.py", line 109, in estimate
        bbone_in = self.image_transformation([frame.astype(int), H.astype(np.float32)])
      File "D:\ISBFSAR\utils\tensorrt_runner.py", line 60, in __call__
        np.copyto(inp.host, elem)
      File "<__array_function__ internals>", line 180, in copyto
    ValueError: could not broadcast input array from shape (9,) into shape (45,)

Why does this happen?
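The traceback above can be reproduced outside TensorRT. The runner preallocates a flat host buffer sized for the engine's build-time input shape, and np.copyto cannot broadcast a smaller array into it. The concrete sizes below (a batch of five 3×3 homographies, 5 × 9 = 45 floats) are an assumption that merely matches the shapes in the error message:

```python
import numpy as np

# Host buffer sized for the engine's build-time input: assume 5 crops,
# each with a 3x3 homography -> 45 floats (matches the traceback shapes).
host_buffer = np.empty(45, dtype=np.float32)

H = np.eye(3, dtype=np.float32)  # a single 3x3 homography -> 9 floats

try:
    np.copyto(host_buffer, H.ravel())  # (9,) into (45,): broadcast fails
except ValueError as e:
    print(e)

# Feeding a batch that matches the build-time shape works:
batch = np.tile(H[None], (5, 1, 1))  # (5, 3, 3) -> 45 floats when flattened
np.copyto(host_buffer, batch.ravel())
print(host_buffer.size)
```

In other words, the engine's input shape is fixed when it is built, so inputs prepared for a different source (RealSense vs. an mp4 file) must be batched or resized to match it before being copied into the host buffer.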
