
actiondetectionforsignlanguage's People

Contributors

nicknochnack


actiondetectionforsignlanguage's Issues

Model details question

Hi, why do you use 3 LSTM layers for action detection? Could you please give some paper references for the model architecture?
Thanks!

list index out of range while running the last section of realtime detection

IndexError Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_9912\2541305508.py in
43
44 # Viz probabilities
---> 45 image = prob_viz(res, actions, image, colors)
46
47 cv2.rectangle(image, (0,0), (640, 40), (245, 117, 16), -1)

~\AppData\Local\Temp\ipykernel_9912\2568779431.py in prob_viz(res, actions, input_frame, colors)
3 output_frame = input_frame.copy()
4 for num, prob in enumerate(res):
----> 5 cv2.rectangle(output_frame, (0, 60+num*40), (int(prob*100), 90+num*40), colors[num], -1)
      6 cv2.putText(output_frame, actions[num], (0, 85+num*40), cv2.FONT_HERSHEY_SIMPLEX, 1, (255,255,255), 2, cv2.LINE_AA)
7

IndexError: list index out of range
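This usually means `colors` has fewer entries than `actions`: `prob_viz` indexes `colors[num]` once per action, so the colour list has to cover every action. A minimal sketch of the usual fix, with placeholder action names:

    import numpy as np

    # Placeholder action names -- substitute your own list.
    actions = np.array(['hello', 'thanks', 'iloveyou', 'please', 'sorry'])

    # One BGR colour tuple per action; add or remove tuples so that
    # len(colors) >= len(actions), otherwise colors[num] raises IndexError.
    colors = [(245, 117, 16), (117, 245, 16), (16, 117, 245),
              (200, 50, 200), (50, 200, 200)]

    assert len(colors) >= len(actions), "colors must cover every action"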

Unable to import to_categorical

AlreadyExistsError Traceback (most recent call last)
in
1 from sklearn.model_selection import train_test_split
----> 2 from tensorflow.keras.utils import to_categorical

~\anaconda3\lib\site-packages\tensorflow\__init__.py in
39 import sys as _sys
40
---> 41 from tensorflow.python.tools import module_util as _module_util
42 from tensorflow.python.util.lazy_loader import LazyLoader as _LazyLoader
43

~\anaconda3\lib\site-packages\tensorflow\python\__init__.py in
46 from tensorflow.python import data
47 from tensorflow.python import distribute
---> 48 from tensorflow.python import keras
49 from tensorflow.python.feature_column import feature_column_lib as feature_column
50 from tensorflow.python.layers import layers

~\anaconda3\lib\site-packages\tensorflow\python\keras\__init__.py in
25
26 # See b/110718070#comment18 for more details about this import.
---> 27 from tensorflow.python.keras import models
28
29 from tensorflow.python.keras.engine.input_layer import Input

~\anaconda3\lib\site-packages\tensorflow\python\keras\models.py in
24 from tensorflow.python.keras import metrics as metrics_module
25 from tensorflow.python.keras import optimizer_v1
---> 26 from tensorflow.python.keras.engine import functional
27 from tensorflow.python.keras.engine import sequential
28 from tensorflow.python.keras.engine import training

~\anaconda3\lib\site-packages\tensorflow\python\keras\engine\functional.py in
36 from tensorflow.python.keras.engine import keras_tensor
37 from tensorflow.python.keras.engine import node as node_module
---> 38 from tensorflow.python.keras.engine import training as training_lib
39 from tensorflow.python.keras.engine import training_utils
40 from tensorflow.python.keras.saving.saved_model import network_serialization

~\anaconda3\lib\site-packages\tensorflow\python\keras\engine\training.py in
50 from tensorflow.python.keras.engine import base_layer_utils
51 from tensorflow.python.keras.engine import compile_utils
---> 52 from tensorflow.python.keras.engine import data_adapter
53 from tensorflow.python.keras.engine import training_utils
54 from tensorflow.python.keras.mixed_precision import loss_scale_optimizer as lso

~\anaconda3\lib\site-packages\tensorflow\python\keras\engine\data_adapter.py in
56
57 keras_data_adapter_gauge = monitoring.BoolGauge(
---> 58 "/tensorflow/api/keras/data_adapters", "keras data adapter usage", "method")
59
60 try:

~\anaconda3\lib\site-packages\tensorflow\python\eager\monitoring.py in __init__(self, name, description, *labels)
349 """
350 super(BoolGauge, self).__init__('BoolGauge', _bool_gauge_methods,
--> 351 len(labels), name, description, *labels)
352
353 def get_cell(self, *labels):

~\anaconda3\lib\site-packages\tensorflow\python\eager\monitoring.py in __init__(self, metric_name, metric_methods, label_length, *args)
124 self._metric_name, len(self._metric_methods)))
125
--> 126 self._metric = self._metric_methods[self._label_length].create(*args)
127
128 def __del__(self):

AlreadyExistsError: Another metric with the same name already exists.
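This AlreadyExistsError is commonly reported when Keras gets initialised twice in the same session, for example when a separately installed `keras` package does not match the installed `tensorflow` version, or when imports are mixed between `keras.*` and `tensorflow.keras.*`. A hedged sketch of the usual workaround, assuming TensorFlow 2.x with its bundled Keras:

    # A minimal sketch, assuming TensorFlow 2.x with bundled Keras: import Keras
    # utilities only through the tensorflow.keras namespace so the framework is
    # not initialised twice in one interpreter session.
    from sklearn.model_selection import train_test_split
    from tensorflow.keras.utils import to_categorical   # rather than `from keras.utils import to_categorical`

    # If the error persists, restarting the kernel and checking that the
    # standalone keras package version matches the installed tensorflow version
    # (e.g. `pip show tensorflow keras`) is a common next step.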

versions for all python modules??

Hello, I am getting various errors and conflicts installing the Python modules. Can someone list the version numbers of all the packages used in the 'Action Detection Refined.ipynb' notebook?

Thank you!

Training sequences with different number of frames

I have data that contains sequences with different numbers of frames, i.e. one sequence with 58 frames and another with 63, etc.
How can I use this data in the following part?
[screenshot: building the sequences array with np.array]
Here, when I use np.array, it flattens the array because the elements have different shapes.
[screenshot: defining the model input shape]
And here I cannot specify the input shape, because I couldn't complete the previous part.
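One common workaround (not from the original notebook) is to pad or truncate every clip to a fixed length before stacking, so np.array produces a regular 3-D array and the LSTM input shape can be fixed. A minimal sketch, assuming each element of `sequences` is a `(frames, 1662)` keypoint array:

    import numpy as np

    max_len = 63          # target length (assumption: the longest clip in the dataset)
    feature_dim = 1662    # per-frame keypoint vector from extract_keypoints

    padded = []
    for seq in sequences:                                   # seq: (frames, 1662), frames varies
        seq = np.asarray(seq, dtype='float32')[:max_len]    # truncate long clips
        pad = np.zeros((max_len - len(seq), feature_dim), dtype='float32')
        padded.append(np.concatenate([seq, pad], axis=0))   # zero-pad short clips

    X = np.stack(padded)   # shape: (num_sequences, max_len, 1662)
    # input_shape=(max_len, 1662) then works for the first LSTM layer; a Keras
    # Masking layer can be added so the padded zero frames are ignored.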

newwwww

import cv2
import mediapipe as mp

mp_holistic = mp.solutions.holistic
mp_drawing = mp.solutions.drawing_utils

def mp_detection(image, model):
    # Convert BGR -> RGB for MediaPipe, run the model, then convert back for OpenCV.
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image.flags.writeable = False
    results = model.process(image)
    image.flags.writeable = True
    image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
    return image, results

def draw_landmarks(image, results):
    # Note: newer MediaPipe versions renamed FACE_CONNECTIONS to FACEMESH_CONTOURS / FACEMESH_TESSELATION.
    mp_drawing.draw_landmarks(image, results.face_landmarks, mp_holistic.FACE_CONNECTIONS)
    mp_drawing.draw_landmarks(image, results.pose_landmarks, mp_holistic.POSE_CONNECTIONS)
    mp_drawing.draw_landmarks(image, results.left_hand_landmarks, mp_holistic.HAND_CONNECTIONS)
    mp_drawing.draw_landmarks(image, results.right_hand_landmarks, mp_holistic.HAND_CONNECTIONS)

cap = cv2.VideoCapture(0)
with mp_holistic.Holistic(min_detection_confidence=0.5, min_tracking_confidence=0.5) as holistic:
    while cap.isOpened():
        ret, frame = cap.read()
        image, results = mp_detection(frame, holistic)
        print(results)
        draw_landmarks(image, results)
        cv2.imshow("live feed", image)

        if cv2.waitKey(10) & 0xFF == ord('q'):
            break

cap.release()
cv2.destroyAllWindows()

Drum action detection

Hi,
I am trying to design a pose-estimation-based drum player, but the accuracy is not good. I tried to include only the hand and pose keypoints:

def extract_keypoints(results):
    pose = np.array([[res.x, res.y, res.z, res.visibility] for res in results.pose_landmarks.landmark]).flatten() if results.pose_landmarks else np.zeros(33*4)
    lh = np.array([[res.x, res.y, res.z] for res in results.left_hand_landmarks.landmark]).flatten() if results.left_hand_landmarks else np.zeros(21*3)
    rh = np.array([[res.x, res.y, res.z] for res in results.right_hand_landmarks.landmark]).flatten() if results.right_hand_landmarks else np.zeros(21*3)
    return np.concatenate([pose,lh, rh])

Also, I tried to reduce the sequence length as drum motions are fast -

DATA_PATH = os.path.join('MP_Data') 
actions = np.array(['Kick', 'Snare', 'Cymbal','Normal','hihat'])
no_sequences = 30
sequence_length = 15
start_folder = 0

and I take 15 frames during the live demo, but it is still not accurate. Can you please tell me how to improve it, or make a video on this?
Cheers,
Suti
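For what it's worth, when both the sequence length and the per-frame keypoint vector change, the model's input shape and the live prediction window have to change with them. A minimal sketch, assuming the tutorial's stacked-LSTM architecture and the values above (pose plus both hands gives 33*4 + 21*3 + 21*3 = 258 features per frame):

    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import LSTM, Dense

    # Input shape must match sequence_length=15 and the 258-value keypoint vector.
    model = Sequential([
        LSTM(64, return_sequences=True, activation='relu', input_shape=(15, 258)),
        LSTM(128, return_sequences=True, activation='relu'),
        LSTM(64, return_sequences=False, activation='relu'),
        Dense(64, activation='relu'),
        Dense(32, activation='relu'),
        Dense(5, activation='softmax'),   # 5 drum actions: Kick, Snare, Cymbal, Normal, hihat
    ])

    # The live loop's rolling window must also keep 15 frames:
    #     sequence = sequence[-15:]
    #     if len(sequence) == 15:
    #         res = model.predict(np.expand_dims(sequence, axis=0))[0]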

i'm getting an error while running the data collection feed

code :

cap = cv2.VideoCapture(0)
with mp_holistic.Holistic(min_detection_confidence=0.5, min_tracking_confidence=0.5) as holistic:
    
    # NEW LOOP
    # Loop through actions
    for action in actions:
        # Loop through sequences aka videos
        for sequence in range(start_folder, start_folder+no_sequences):
            # Loop through video length aka sequence length
            for frame_num in range(sequence_length):

                # Read feed
                ret, frame = cap.read()
                cv2.startWindowThread()

                # Make detections
                image, results = mediapipe_detection(frame, holistic)

                # Draw landmarks
                draw_styled_landmarks(image, results)
                
                # NEW Apply wait logic
                if frame_num == 0: 
                    cv2.putText(image, 'STARTING COLLECTION', (120,200), 
                               cv2.FONT_HERSHEY_SIMPLEX, 1, (0,255, 0), 4, cv2.LINE_AA)
                    cv2.putText(image, 'Collecting frames for {} Video Number {}'.format(action, sequence), (15,12), 
                               cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 1, cv2.LINE_AA)
                    # Show to screen
                    cv2.imshow('OpenCV data collection Feed', image)
                    cv2.waitKey(1)

                else: 
                    cv2.putText(image, 'Collecting frames for {} Video Number {}'.format(action, sequence), (15,12), 
                               cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 1, cv2.LINE_AA)
                    # Show to screen
                    cv2.imshow('OpenCV data collection Feed', image)
                
                # NEW Export keypoints
                keypoints = extract_keypoints(results)
                npy_path = os.path.join(DATA_PATH, action, str(sequence), str(frame_num))
                np.save(npy_path, keypoints)

                # break
                if cv2.waitKey(10) & 0xFF == ord('q'):
                    break
                    
    cap.release()
    cv2.destroyAllWindows()
    cv2.waitKey(1)

error:

FileNotFoundError                         Traceback (most recent call last)
Cell In[58], line 43
     41 keypoints = extract_keypoints(results)
     42 npy_path = os.path.join(DATA_PATH, action, str(sequence), str(frame_num))
---> 43 np.save(npy_path, keypoints)
     45 # break
     46 if cv2.waitKey(10) & 0xFF == ord('q'):

File <__array_function__ internals>:180, in save(*args, **kwargs)

File ~/miniconda3/envs/tensorflow/lib/python3.10/site-packages/numpy/lib/npyio.py:518, in save(file, arr, allow_pickle, fix_imports)
    516     if not file.endswith('.npy'):
    517         file = file + '.npy'
--> 518     file_ctx = open(file, "wb")
    520 with file_ctx as fid:
    521     arr = np.asanyarray(arr)

FileNotFoundError: [Errno 2] No such file or directory: '/Users/utkx2/Desktop/python/MLDL/Projects/Human-Action-Recognition/MP_Data/hello/31/0.npy'


What should I do? My MP_Data folder has been created.
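The traceback shows np.save failing because `MP_Data/hello/31/` does not exist, and np.save will not create intermediate directories. A minimal sketch of the folder setup, assuming the `DATA_PATH`, `actions`, `start_folder` and `no_sequences` variables from the collection code above:

    import os

    # Every MP_Data/<action>/<sequence>/ folder the collection loop writes into
    # must exist before np.save is called.
    for action in actions:
        for sequence in range(start_folder, start_folder + no_sequences):
            os.makedirs(os.path.join(DATA_PATH, action, str(sequence)), exist_ok=True)

The failing path ends in `/hello/31/`, so the loop is writing to sequence numbers that were never created; making the folder range match `range(start_folder, start_folder + no_sequences)`, or recreating the folders with the snippet above, should resolve it.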

Incorrect predictions

import cv2
import numpy as np
import os
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.callbacks import TensorBoard
import mediapipe as mp
from sklearn.model_selection import train_test_split
from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import LSTM, Dense

from sklearn.metrics import multilabel_confusion_matrix, accuracy_score

mp_holistic = mp.solutions.holistic

mp_drawing = mp.solutions.drawing_utils

DATA_PATH = os.path.join("C:/Users/96654/PycharmProjects/mrdas/newmydata")

# Actions that we try to detect
actions = np.array(['hello', 'thanks', 'cool'])

# Forty videos worth of data
no_sequences = 40

# Videos are going to be 40 frames in length
sequence_length = 40

# Folder start
start_folder = 0

for action in actions:
    for sequence in range(no_sequences):
        try:
            os.makedirs(os.path.join(DATA_PATH, action, str(sequence)))
        except:
            pass

def mediapipe_detection(image, model):
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)   # BGR -> RGB for MediaPipe
    image.flags.writeable = False
    results = model.process(image)
    image.flags.writeable = True
    image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)   # back to BGR for OpenCV (the pasted code converted BGR2RGB twice)
    return image, results

def draw_landmarks(image, results):
    mp_drawing.draw_landmarks(image, results.face_landmarks, mp_holistic.FACEMESH_TESSELATION)
    mp_drawing.draw_landmarks(image, results.pose_landmarks, mp_holistic.POSE_CONNECTIONS)
    mp_drawing.draw_landmarks(image, results.left_hand_landmarks, mp_holistic.HAND_CONNECTIONS)
    mp_drawing.draw_landmarks(image, results.right_hand_landmarks, mp_holistic.HAND_CONNECTIONS)

def draw_styled_landmarks(image, results):
    # Draw face connections
    mp_drawing.draw_landmarks(image, results.face_landmarks, mp_holistic.FACEMESH_TESSELATION,
                              mp_drawing.DrawingSpec(color=(80,110,10), thickness=1, circle_radius=1),
                              mp_drawing.DrawingSpec(color=(80,256,121), thickness=1, circle_radius=1))
    # Draw pose connections
    mp_drawing.draw_landmarks(image, results.pose_landmarks, mp_holistic.POSE_CONNECTIONS,
                              mp_drawing.DrawingSpec(color=(80,22,10), thickness=2, circle_radius=4),
                              mp_drawing.DrawingSpec(color=(80,44,121), thickness=2, circle_radius=2))
    # Draw left hand connections
    mp_drawing.draw_landmarks(image, results.left_hand_landmarks, mp_holistic.HAND_CONNECTIONS,
                              mp_drawing.DrawingSpec(color=(121,22,76), thickness=2, circle_radius=4),
                              mp_drawing.DrawingSpec(color=(121,44,250), thickness=2, circle_radius=2))
    # Draw right hand connections
    mp_drawing.draw_landmarks(image, results.right_hand_landmarks, mp_holistic.HAND_CONNECTIONS,
                              mp_drawing.DrawingSpec(color=(245,117,66), thickness=2, circle_radius=4),
                              mp_drawing.DrawingSpec(color=(245,66,230), thickness=2, circle_radius=2))

def extract_keypoints(results):
    pose = np.array([[res.x, res.y, res.z, res.visibility] for res in results.pose_landmarks.landmark]).flatten() if results.pose_landmarks else np.zeros(33*4)
    face = np.array([[res.x, res.y, res.z] for res in results.face_landmarks.landmark]).flatten() if results.face_landmarks else np.zeros(468*3)
    lh = np.array([[res.x, res.y, res.z] for res in results.left_hand_landmarks.landmark]).flatten() if results.left_hand_landmarks else np.zeros(21*3)
    rh = np.array([[res.x, res.y, res.z] for res in results.right_hand_landmarks.landmark]).flatten() if results.right_hand_landmarks else np.zeros(21*3)
    return np.concatenate([pose, face, lh, rh])

#sys.tracebacklimit = 0
label_map = {label: num for num, label in enumerate(actions)}

sequences, labels = [], []
for action in actions:
    for sequence in np.array(os.listdir(os.path.join(DATA_PATH, action))).astype(int):
        window = []
        for frame_num in range(sequence_length):
            res = np.load(os.path.join(DATA_PATH, action, str(sequence), "{}.npy".format(frame_num)))
            window.append(res)
        sequences.append(window)
        labels.append(label_map[action])

np.array(sequences).shape
np.array(labels).shape

X = np.array(sequences)
X.shape
y = to_categorical(labels).astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.05)
y_test.shape

colors = [(245, 117, 16), (117, 245, 16), (16, 117, 245)]

#log_dir = os.path.join('Logs2')
#tb_callback = TensorBoard(log_dir=log_dir)
model = Sequential()
model.add(LSTM(64, return_sequences=True, activation='relu', input_shape=(40,1662)))
model.add(LSTM(128, return_sequences=True, activation='relu'))
model.add(LSTM(64, return_sequences=False, activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(actions.shape[0], activation='softmax'))
res = [0.7,0.2,0.1]
actions[np.argmax(res)]
#model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['categorical_accuracy'])
#model.fit(X_train, y_train, epochs=2000, callbacks=[tb_callback])
model.summary()
res = model.predict(X_test)
actions[np.argmax(res[4])]
model.save('action1.h5')
yhat = model.predict(X_test)
ytrue = np.argmax(y_test, axis=1).tolist()
yhat = np.argmax(yhat, axis=1).tolist()
multilabel_confusion_matrix(ytrue, yhat)
accuracy_score(ytrue, yhat)

def prob_viz(res, actions, input_frame, colors):
    output_frame = input_frame.copy()
    for num, prob in enumerate(res):
        cv2.rectangle(output_frame, (0, 60 + num * 40), (int(prob * 100), 90 + num * 40), colors[num], -1)
        cv2.putText(output_frame, actions[num], (0, 85 + num * 40), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2, cv2.LINE_AA)
    return output_frame

#sequence.reverse()
#len(sequence)

#sequence.append('def')
#sequence.reverse()
#sequence[-40:]

sequence = []
sentence = []
predictions = []
threshold = 0.5

cap = cv2.VideoCapture(0)

# Set mediapipe model

with mp_holistic.Holistic(min_detection_confidence=0.5, min_tracking_confidence=0.5) as holistic:
    while cap.isOpened():

        # Read feed
        ret, frame = cap.read()

        # Make detections
        image, results = mediapipe_detection(frame, holistic)

        #print(results)

        # Draw landmarks
        draw_styled_landmarks(image, results)
        image = cv2.flip(image, 1)

        # 2. Prediction logic
        keypoints = extract_keypoints(results)
        sequence.append(keypoints)
        sequence = sequence[-40:]

        if len(sequence) == 40:
            res = model.predict(np.expand_dims(sequence, axis=0))[0]
            print(actions[np.argmax(res)])
            predictions.append(np.argmax(res))

            # 3. Viz logic
            if np.unique(predictions[-10:])[0] == np.argmax(res):
                if res[np.argmax(res)] > threshold:

                    if len(sentence) > 0:
                        if actions[np.argmax(res)] != sentence[-1]:
                            sentence.append(actions[np.argmax(res)])
                    else:
                        sentence.append(actions[np.argmax(res)])

            if len(sentence) > 5:
                sentence = sentence[-5:]

            # Viz probabilities
            image = prob_viz(res, actions, image, colors)

        cv2.rectangle(image, (0, 0), (640, 40), (245, 117, 16), -1)
        cv2.putText(image, ' '.join(sentence), (3, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2, cv2.LINE_AA)

        # Show to screen
        cv2.imshow('OpenCV Feed', image)

        # Break gracefully
        if cv2.waitKey(10) & 0xFF == ord('q'):
            break

cap.release()
cv2.destroyAllWindows()

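One thing worth noting in the script as pasted: model.compile(...) and model.fit(...) are commented out and no saved weights are loaded before model.predict(...) is called, so the network is untrained and its outputs are essentially random. A minimal sketch of the two usual options (the filename comes from the pasted code):

    # Option 1: train before predicting.
    model.compile(optimizer='Adam', loss='categorical_crossentropy',
                  metrics=['categorical_accuracy'])
    model.fit(X_train, y_train, epochs=2000)

    # Option 2: if action1.h5 already holds weights from a previous training
    # run, load them instead of re-training.
    # model.load_weights('action1.h5')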

And also this

colors = [(245,117,16), (117,245,16), (16,117,245)]
def prob_viz(res, actions, input_frame, colors):
    output_frame = input_frame.copy()
    for num, prob in enumerate(res):
        cv2.rectangle(output_frame, (0, 60+num*40), (int(prob*100), 90+num*40), colors[num], -1)
        cv2.putText(output_frame, actions[num], (0, 85+num*40), cv2.FONT_HERSHEY_SIMPLEX, 1, (255,255,255), 2, cv2.LINE_AA)
    return output_frame

plt.figure(figsize=(18,18))
plt.imshow(prob_viz(res, actions, image, colors))


IndexError Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_9912\914121024.py in
1 plt.figure(figsize=(18,18))
----> 2 plt.imshow(prob_viz(res, actions, image, colors))

~\AppData\Local\Temp\ipykernel_9912\2568779431.py in prob_viz(res, actions, input_frame, colors)
3 output_frame = input_frame.copy()
4 for num, prob in enumerate(res):
----> 5 cv2.rectangle(output_frame, (0, 60+num*40), (int(prob*100), 90+num*40), colors[num], -1)
      6 cv2.putText(output_frame, actions[num], (0, 85+num*40), cv2.FONT_HERSHEY_SIMPLEX, 1, (255,255,255), 2, cv2.LINE_AA)
7

IndexError: list index out of range

start_folder value in step 5

Shouldn't start_folder be initialized to 0 rather than 30? Since we add no_sequences to it, the loop ends up accessing sequence folders with numbers greater than 30.
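For context, a short illustration, assuming the tutorial's collection loop: with start_folder = 30 and no_sequences = 30 the loop visits folders 30..59, which only exist if they were created first. start_folder = 30 is typically used when appending a second batch of data after folders 0..29 are already full, whereas collecting from scratch wants start_folder = 0:

    start_folder = 0      # collecting from scratch
    no_sequences = 30

    for sequence in range(start_folder, start_folder + no_sequences):
        print(sequence)   # 0 .. 29 -- matches the folders created in the setup step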

video frame length

Since the length is different for each movement, the frames do not fit a single fixed sequence length.
Is there any way to train on sequences with different numbers of frames?
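One common workaround, not from the original notebook, is to sample a fixed number of frames uniformly from each clip so every training sequence ends up with the same shape. A minimal sketch (the helper name is hypothetical):

    import numpy as np

    def sample_frames(frames, target_len=30):
        # frames: (num_frames, 1662) keypoint array with a varying frame count.
        frames = np.asarray(frames)
        idx = np.linspace(0, len(frames) - 1, target_len).astype(int)
        return frames[idx]                       # (target_len, 1662)

    # e.g. window = sample_frames(window, target_len=30) before sequences.append(window)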

No text recognizing my actions or sign language is showing on the OpenCv feed. Need help ASAP!

I did this project and collected frames to recognise 5 actions, but by the end no action was being recognized. I cannot find any error: the code runs fine and the OpenCV feed shows, but no text recognizing the actions appears. Please help me ASAP, as I need this for a school project in two days. I will attach my code here.
Action Detection.zip

Detect multiple actions

Thank you for the video and code! I have a question though: I have trained the model and I am trying to run detection on prerecorded videos. Each video contains multiple actions (performed sequentially). However, the model only gives one detection (one action) per video. Is it possible to solve this issue?
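The tutorial's real-time loop already behaves like a sliding window; for pre-recorded videos, one option (not from the original repo) is to slide a fixed-length window over the extracted keypoints and record every confident prediction, rather than one prediction for the whole clip. A minimal sketch, assuming the trained model and `actions` array from the tutorial and pre-extracted keypoints for the video:

    import numpy as np

    def detect_actions(keypoint_frames, model, actions,
                       window=30, stride=10, threshold=0.7):
        detections = []
        for start in range(0, len(keypoint_frames) - window + 1, stride):
            clip = np.expand_dims(keypoint_frames[start:start + window], axis=0)
            res = model.predict(clip)[0]
            if res[np.argmax(res)] > threshold:
                label = actions[np.argmax(res)]
                # Only record a new detection when the label changes.
                if not detections or detections[-1][1] != label:
                    detections.append((start, label))     # (frame index, action)
        return detections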

setup folder for collection

For me the folders aren't getting created, but the code doesn't show any errors. I have tried two sets of code: the first is Nick's, which doesn't show any errors (I think that is because of the try block); the second is from a pull request, and it does show an error.
[screenshot attached]
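One likely reason nothing seems to go wrong is the bare `except: pass` in the original setup cell, which silently swallows any failure (bad path, permissions, and so on). A minimal sketch that lets real errors surface, assuming the tutorial's `DATA_PATH`, `actions` and `no_sequences` variables:

    import os

    # os.makedirs with exist_ok=True does not need a try/except, so genuine
    # failures (e.g. an invalid DATA_PATH) raise instead of being hidden.
    for action in actions:
        for sequence in range(no_sequences):
            os.makedirs(os.path.join(DATA_PATH, action, str(sequence)), exist_ok=True)

    print(os.path.abspath(DATA_PATH))   # check where the folders actually went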
