nicknochnack / actiondetectionforsignlanguage
A practical implementation of sign language estimation using an LSTM NN built on TF Keras.
Failed to install sklearn and matplotlib. Does anyone know what the problem is?
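A hedged guess, since the post doesn't include the actual error message: on PyPI the real package name is scikit-learn (the old `sklearn` name is a deprecated placeholder), so installing under the correct names from a notebook cell may resolve it:

```python
# Assumption: the failure comes from the deprecated 'sklearn' PyPI name.
# Install the real package names from a notebook cell instead.
!pip install scikit-learn matplotlib
```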
Hi, why do you use 3 stacked LSTM layers for action detection? Could you please give some paper references on the model architecture?
Thanks!
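For reference, the architecture in question (reproduced here as a sketch from the tutorial code quoted later in this thread; the original uses 30-frame sequences, though one poster below uses 40) stacks three LSTM layers, with return_sequences=True on all but the last so the time axis is preserved between them:

```python
# Sketch of the tutorial's stacked-LSTM model: 30 frames per sequence,
# 1662 keypoint values per frame, 3 output actions.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

model = Sequential([
    LSTM(64, return_sequences=True, activation='relu', input_shape=(30, 1662)),
    LSTM(128, return_sequences=True, activation='relu'),  # keeps the time axis
    LSTM(64, return_sequences=False, activation='relu'),  # emits a single vector
    Dense(64, activation='relu'),
    Dense(32, activation='relu'),
    Dense(3, activation='softmax'),                       # one unit per action
])
```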
IndexError Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_9912\2541305508.py in
43
44 # Viz probabilities
---> 45 image = prob_viz(res, actions, image, colors)
46
47 cv2.rectangle(image, (0,0), (640, 40), (245, 117, 16), -1)
~\AppData\Local\Temp\ipykernel_9912\2568779431.py in prob_viz(res, actions, input_frame, colors)
3 output_frame = input_frame.copy()
4 for num, prob in enumerate(res):
----> 5 cv2.rectangle(output_frame, (0,60+num*40), (int(prob*100), 90+num*40), colors[num], -1)
6 cv2.putText(output_frame, actions[num], (0, 85+num*40), cv2.FONT_HERSHEY_SIMPLEX, 1, (255,255,255), 2, cv2.LINE_AA)
7
IndexError: list index out of range
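This IndexError typically means `colors` has fewer entries than there are actions: prob_viz indexes `colors[num]` once per action, and the tutorial hard-codes only three colours. A minimal fix sketch, assuming five actions as elsewhere in this thread:

```python
# Assumption: the IndexError comes from colors[num] running past the
# hard-coded three-colour list; define at least one colour per action.
colors = [(245, 117, 16), (117, 245, 16), (16, 117, 245), (245, 16, 117), (16, 245, 117)]
assert len(colors) >= len(actions)   # prob_viz indexes colors by action number
```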
AlreadyExistsError Traceback (most recent call last)
in
1 from sklearn.model_selection import train_test_split
----> 2 from tensorflow.keras.utils import to_categorical
~\anaconda3\lib\site-packages\tensorflow\__init__.py in
39 import sys as _sys
40
---> 41 from tensorflow.python.tools import module_util as _module_util
42 from tensorflow.python.util.lazy_loader import LazyLoader as _LazyLoader
43
~\anaconda3\lib\site-packages\tensorflow\python\__init__.py in
46 from tensorflow.python import data
47 from tensorflow.python import distribute
---> 48 from tensorflow.python import keras
49 from tensorflow.python.feature_column import feature_column_lib as feature_column
50 from tensorflow.python.layers import layers
~\anaconda3\lib\site-packages\tensorflow\python\keras\__init__.py in
25
26 # See b/110718070#comment18 for more details about this import.
---> 27 from tensorflow.python.keras import models
28
29 from tensorflow.python.keras.engine.input_layer import Input
~\anaconda3\lib\site-packages\tensorflow\python\keras\models.py in
24 from tensorflow.python.keras import metrics as metrics_module
25 from tensorflow.python.keras import optimizer_v1
---> 26 from tensorflow.python.keras.engine import functional
27 from tensorflow.python.keras.engine import sequential
28 from tensorflow.python.keras.engine import training
~\anaconda3\lib\site-packages\tensorflow\python\keras\engine\functional.py in
36 from tensorflow.python.keras.engine import keras_tensor
37 from tensorflow.python.keras.engine import node as node_module
---> 38 from tensorflow.python.keras.engine import training as training_lib
39 from tensorflow.python.keras.engine import training_utils
40 from tensorflow.python.keras.saving.saved_model import network_serialization
~\anaconda3\lib\site-packages\tensorflow\python\keras\engine\training.py in
50 from tensorflow.python.keras.engine import base_layer_utils
51 from tensorflow.python.keras.engine import compile_utils
---> 52 from tensorflow.python.keras.engine import data_adapter
53 from tensorflow.python.keras.engine import training_utils
54 from tensorflow.python.keras.mixed_precision import loss_scale_optimizer as lso
~\anaconda3\lib\site-packages\tensorflow\python\keras\engine\data_adapter.py in
56
57 keras_data_adapter_gauge = monitoring.BoolGauge(
---> 58 "/tensorflow/api/keras/data_adapters", "keras data adapter usage", "method")
59
60 try:
~\anaconda3\lib\site-packages\tensorflow\python\eager\monitoring.py in __init__(self, name, description, *labels)
349 """
350 super(BoolGauge, self).__init__('BoolGauge', _bool_gauge_methods,
--> 351 len(labels), name, description, *labels)
352
353 def get_cell(self, *labels):
~\anaconda3\lib\site-packages\tensorflow\python\eager\monitoring.py in __init__(self, metric_name, metric_methods, label_length, *args)
124 self._metric_name, len(self._metric_methods)))
125
--> 126 self._metric = self._metric_methods[self._label_length].create(*args)
127
128 def __del__(self):
AlreadyExistsError: Another metric with the same name already exists.
Hello, I am getting various errors and conflicts installing the Python modules. Can someone list the version numbers of all the packages used in the 'Action Detection Refined.ipynb' notebook?
Thank you!
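The AlreadyExistsError above is commonly caused by a standalone `keras` install that doesn't match the `keras` bundled inside `tensorflow`. Since the notebook's exact pins aren't listed here, a small diagnostic sketch that prints what is actually installed, so versions can be compared against a fresh environment:

```python
# Print the versions actually being imported (assumption: the conflict is a
# keras/tensorflow version mismatch; compare these against a clean env).
import tensorflow, cv2, mediapipe, sklearn, matplotlib, numpy
for mod in (tensorflow, cv2, mediapipe, sklearn, matplotlib, numpy):
    print(mod.__name__, mod.__version__)
```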
I have data containing sequences with different numbers of frames, e.g. one sequence with 58 frames and another with 63, etc.
How can I use this data for the following part?
Here, when I use np.array it flattens the array because the elements have different shapes,
and here I cannot specify the input shape, since I couldn't complete the previous part.
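One common way to handle this (a sketch, not from the tutorial) is to pad every sequence to the longest length and let a Masking layer skip the padded frames. It assumes each element of `sequences` is a list of per-frame keypoint vectors (frames × 1662) with varying frame counts:

```python
# Pad variable-length sequences to a common length, then mask the padding.
import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Masking, LSTM, Dense

max_len = max(len(seq) for seq in sequences)          # e.g. 63
X = pad_sequences(sequences, maxlen=max_len, dtype='float32',
                  padding='post', value=0.0)          # shape: (n, max_len, 1662)

model = Sequential([
    Masking(mask_value=0.0, input_shape=(max_len, 1662)),  # skip padded frames
    LSTM(64, return_sequences=False, activation='tanh'),
    Dense(3, activation='softmax'),                   # 3 = number of actions
])
```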
File "", line 2
if results.face_landmarks
^
IndentationError: unexpected indent
import cv2
import mediapipe as mp

mp_holistic = mp.solutions.holistic
mp_drawing = mp.solutions.drawing_utils

def mp_detection(image, model):
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)   # BGR (OpenCV) -> RGB (MediaPipe)
    image.flags.writeable = False                    # the attribute is 'writeable', not 'writable'
    results = model.process(image)
    image.flags.writeable = True
    image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)   # back to BGR for display
    return image, results

def draw_landmarks(image, results):
    # FACE_CONNECTIONS was renamed FACEMESH_TESSELATION in newer mediapipe releases
    mp_drawing.draw_landmarks(image, results.face_landmarks, mp_holistic.FACEMESH_TESSELATION)
    mp_drawing.draw_landmarks(image, results.pose_landmarks, mp_holistic.POSE_CONNECTIONS)
    mp_drawing.draw_landmarks(image, results.left_hand_landmarks, mp_holistic.HAND_CONNECTIONS)
    mp_drawing.draw_landmarks(image, results.right_hand_landmarks, mp_holistic.HAND_CONNECTIONS)

cap = cv2.VideoCapture(0)
with mp_holistic.Holistic(min_detection_confidence=0.5, min_tracking_confidence=0.5) as holistic:
    while cap.isOpened():
        ret, frame = cap.read()
        image, results = mp_detection(frame, holistic)
        print(results)
        draw_landmarks(image, results)
        cv2.imshow("live feed", image)   # show the annotated image, not the raw frame
        if cv2.waitKey(10) & 0xFF == ord('q'):
            break
cap.release()
cv2.destroyAllWindows()
Hi,
I am trying to design a pose-estimation-based drum player, but the accuracy is not good. I tried including only the hand and pose keypoints.
def extract_keypoints(results):
    pose = np.array([[res.x, res.y, res.z, res.visibility] for res in results.pose_landmarks.landmark]).flatten() if results.pose_landmarks else np.zeros(33*4)
    lh = np.array([[res.x, res.y, res.z] for res in results.left_hand_landmarks.landmark]).flatten() if results.left_hand_landmarks else np.zeros(21*3)
    rh = np.array([[res.x, res.y, res.z] for res in results.right_hand_landmarks.landmark]).flatten() if results.right_hand_landmarks else np.zeros(21*3)
    return np.concatenate([pose, lh, rh])
Also, I tried reducing the sequence length, since drum motions are fast:
DATA_PATH = os.path.join('MP_Data')
actions = np.array(['Kick', 'Snare', 'Cymbal','Normal','hihat'])
no_sequences = 30
sequence_length = 15
start_folder = 0
and taking 15 frames during the live demo. But it's not accurate. Can you please suggest how to make it more accurate? Or make a video on this?
Cheers,
Suti
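One detail worth double-checking in a setup like this (an observation on the post above, not from the tutorial): with the face keypoints dropped, each frame vector has 33*4 + 21*3 + 21*3 = 258 values, so the model's input shape has to match both the reduced feature count and the shorter sequence length:

```python
# Sketch of a matching input shape, assuming 15-frame sequences and no face keypoints.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

n_features = 33 * 4 + 21 * 3 + 21 * 3          # pose + left hand + right hand = 258
model = Sequential([
    LSTM(64, return_sequences=True, activation='relu', input_shape=(15, n_features)),
    LSTM(64, return_sequences=False, activation='relu'),
    Dense(5, activation='softmax'),             # 5 drum actions
])
```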
Code:
cap = cv2.VideoCapture(0)
with mp_holistic.Holistic(min_detection_confidence=0.5, min_tracking_confidence=0.5) as holistic:
    # NEW LOOP
    # Loop through actions
    for action in actions:
        # Loop through sequences aka videos
        for sequence in range(start_folder, start_folder+no_sequences):
            # Loop through video length aka sequence length
            for frame_num in range(sequence_length):

                # Read feed
                ret, frame = cap.read()
                cv2.startWindowThread()

                # Make detections
                image, results = mediapipe_detection(frame, holistic)

                # Draw landmarks
                draw_styled_landmarks(image, results)

                # NEW Apply wait logic
                if frame_num == 0:
                    cv2.putText(image, 'STARTING COLLECTION', (120,200),
                                cv2.FONT_HERSHEY_SIMPLEX, 1, (0,255,0), 4, cv2.LINE_AA)
                    cv2.putText(image, 'Collecting frames for {} Video Number {}'.format(action, sequence), (15,12),
                                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0,0,255), 1, cv2.LINE_AA)
                    # Show to screen
                    cv2.imshow('OpenCV data collection Feed', image)
                    cv2.waitKey(1)
                else:
                    cv2.putText(image, 'Collecting frames for {} Video Number {}'.format(action, sequence), (15,12),
                                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0,0,255), 1, cv2.LINE_AA)
                    # Show to screen
                    cv2.imshow('OpenCV data collection Feed', image)

                # NEW Export keypoints
                keypoints = extract_keypoints(results)
                npy_path = os.path.join(DATA_PATH, action, str(sequence), str(frame_num))
                np.save(npy_path, keypoints)

                # break
                if cv2.waitKey(10) & 0xFF == ord('q'):
                    break

    cap.release()
    cv2.destroyAllWindows()
    cv2.waitKey(1)
error:
FileNotFoundError Traceback (most recent call last)
Cell In[58], line 43
41 keypoints = extract_keypoints(results)
42 npy_path = os.path.join(DATA_PATH, action, str(sequence), str(frame_num))
---> 43 np.save(npy_path, keypoints)
45 # break
46 if cv2.waitKey(10) & 0xFF == ord('q'):
File <__array_function__ internals>:180, in save(*args, **kwargs)
File ~/miniconda3/envs/tensorflow/lib/python3.10/site-packages/numpy/lib/npyio.py:518, in save(file, arr, allow_pickle, fix_imports)
516 if not file.endswith('.npy'):
517 file = file + '.npy'
--> 518 file_ctx = open(file, "wb")
520 with file_ctx as fid:
521 arr = np.asanyarray(arr)
FileNotFoundError: [Errno 2] No such file or directory: '/Users/utkx2/Desktop/python/MLDL/Projects/Human-Action-Recognition/MP_Data/hello/31/0.npy'
What should I do? Even my MP_Data folder has been created.
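The path in the error ends in hello/31, which suggests the per-sequence folders were only created for an earlier index range (the setup cell makes folders for range(no_sequences), while collection iterates from start_folder upward). A minimal fix sketch: create the folder right before saving.

```python
# Assumption: the FileNotFoundError is just a missing directory for this
# sequence index; create it idempotently before saving the keypoints.
import os
os.makedirs(os.path.join(DATA_PATH, action, str(sequence)), exist_ok=True)
np.save(os.path.join(DATA_PATH, action, str(sequence), str(frame_num)), keypoints)
```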
So basically, when I close the Anaconda server and run it again, it says
"action" or "actions" is not defined.
import cv2
import numpy as np
import os
import mediapipe as mp
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.callbacks import TensorBoard
from tensorflow.keras.utils import to_categorical   # use tensorflow.keras throughout; mixing with standalone keras causes conflicts
from sklearn.model_selection import train_test_split
from sklearn.metrics import multilabel_confusion_matrix, accuracy_score
mp_holistic = mp.solutions.holistic
mp_drawing = mp.solutions.drawing_utils
DATA_PATH = os.path.join("C:/Users/96654/PycharmProjects/mrdas/newmydata")
actions = np.array(['hello', 'thanks', 'cool'])
no_sequences = 40
sequence_length = 40
start_folder = 0
for action in actions:
    for sequence in range(no_sequences):
        try:
            os.makedirs(os.path.join(DATA_PATH, action, str(sequence)))
        except:
            pass
def mediapipe_detection(image, model):
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)   # BGR (OpenCV) -> RGB (MediaPipe)
    image.flags.writeable = False
    results = model.process(image)
    image.flags.writeable = True
    image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)   # convert back (RGB2BGR, not BGR2RGB again)
    return image, results
def draw_landmarks(image, results):
    mp_drawing.draw_landmarks(image, results.face_landmarks, mp_holistic.FACEMESH_TESSELATION)
    mp_drawing.draw_landmarks(image, results.pose_landmarks, mp_holistic.POSE_CONNECTIONS)
    mp_drawing.draw_landmarks(image, results.left_hand_landmarks, mp_holistic.HAND_CONNECTIONS)
    mp_drawing.draw_landmarks(image, results.right_hand_landmarks, mp_holistic.HAND_CONNECTIONS)
def draw_styled_landmarks(image, results):
    # Draw face connections
    mp_drawing.draw_landmarks(image, results.face_landmarks, mp_holistic.FACEMESH_TESSELATION,
                              mp_drawing.DrawingSpec(color=(80,110,10), thickness=1, circle_radius=1),
                              mp_drawing.DrawingSpec(color=(80,256,121), thickness=1, circle_radius=1)
                              )
    # Draw pose connections
    mp_drawing.draw_landmarks(image, results.pose_landmarks, mp_holistic.POSE_CONNECTIONS,
                              mp_drawing.DrawingSpec(color=(80,22,10), thickness=2, circle_radius=4),
                              mp_drawing.DrawingSpec(color=(80,44,121), thickness=2, circle_radius=2)
                              )
    # Draw left hand connections
    mp_drawing.draw_landmarks(image, results.left_hand_landmarks, mp_holistic.HAND_CONNECTIONS,
                              mp_drawing.DrawingSpec(color=(121,22,76), thickness=2, circle_radius=4),
                              mp_drawing.DrawingSpec(color=(121,44,250), thickness=2, circle_radius=2)
                              )
    # Draw right hand connections
    mp_drawing.draw_landmarks(image, results.right_hand_landmarks, mp_holistic.HAND_CONNECTIONS,
                              mp_drawing.DrawingSpec(color=(245,117,66), thickness=2, circle_radius=4),
                              mp_drawing.DrawingSpec(color=(245,66,230), thickness=2, circle_radius=2)
                              )
def extract_keypoints(results):
    pose = np.array([[res.x, res.y, res.z, res.visibility] for res in results.pose_landmarks.landmark]).flatten() if results.pose_landmarks else np.zeros(33*4)
    face = np.array([[res.x, res.y, res.z] for res in results.face_landmarks.landmark]).flatten() if results.face_landmarks else np.zeros(468*3)
    lh = np.array([[res.x, res.y, res.z] for res in results.left_hand_landmarks.landmark]).flatten() if results.left_hand_landmarks else np.zeros(21*3)
    rh = np.array([[res.x, res.y, res.z] for res in results.right_hand_landmarks.landmark]).flatten() if results.right_hand_landmarks else np.zeros(21*3)
    return np.concatenate([pose, face, lh, rh])
#sys.tracebacklimit = 0
label_map = {label:num for num, label in enumerate(actions)}
sequences, labels = [], []
for action in actions:
    for sequence in np.array(os.listdir(os.path.join(DATA_PATH, action))).astype(int):
        window = []
        for frame_num in range(sequence_length):
            res = np.load(os.path.join(DATA_PATH, action, str(sequence), "{}.npy".format(frame_num)))
            window.append(res)
        sequences.append(window)
        labels.append(label_map[action])
np.array(sequences).shape
np.array(labels).shape
X = np.array(sequences)
X.shape
y = to_categorical(labels).astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.05)
y_test.shape
colors = [(245, 117, 16), (117, 245, 16), (16, 117, 245)]
#log_dir = os.path.join('Logs2')
#tb_callback = TensorBoard(log_dir=log_dir)
model = Sequential()
model.add(LSTM(64, return_sequences=True, activation='relu', input_shape=(40,1662)))
model.add(LSTM(128, return_sequences=True, activation='relu'))
model.add(LSTM(64, return_sequences=False, activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(actions.shape[0], activation='softmax'))
res = [0.7,0.2,0.1]
actions[np.argmax(res)]
#model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['categorical_accuracy'])
#model.fit(X_train, y_train, epochs=2000, callbacks=[tb_callback])
model.summary()
res = model.predict(X_test)
actions[np.argmax(res[4])]
model.save('action1.h5')
yhat = model.predict(X_test)
ytrue = np.argmax(y_test, axis=1).tolist()
yhat = np.argmax(yhat, axis=1).tolist()
multilabel_confusion_matrix(ytrue, yhat)
accuracy_score(ytrue, yhat)
def prob_viz(res, actions, input_frame, colors):
    output_frame = input_frame.copy()
    for num, prob in enumerate(res):
        cv2.rectangle(output_frame, (0, 60 + num * 40), (int(prob * 100), 90 + num * 40), colors[num], -1)
        cv2.putText(output_frame, actions[num], (0, 85 + num * 40), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2, cv2.LINE_AA)
    return output_frame
#sequence.reverse()
#len(sequence)
#sequence.append('def')
#sequence.reverse()
#sequence[-40:]
sequence = []
sentence = []
predictions = []
threshold = 0.5
cap = cv2.VideoCapture(0)
with mp_holistic.Holistic(min_detection_confidence=0.5, min_tracking_confidence=0.5) as holistic:
    while cap.isOpened():
        # Read feed
        ret, frame = cap.read()

        # Make detections
        image, results = mediapipe_detection(frame, holistic)
        #print(results)

        # Draw landmarks
        draw_styled_landmarks(image, results)
        image = cv2.flip(image, 1)

        # 2. Prediction logic
        keypoints = extract_keypoints(results)
        sequence.append(keypoints)
        sequence = sequence[-40:]

        if len(sequence) == 40:
            res = model.predict(np.expand_dims(sequence, axis=0))[0]
            print(actions[np.argmax(res)])
            predictions.append(np.argmax(res))

            # 3. Viz logic
            if np.unique(predictions[-10:])[0] == np.argmax(res):
                if res[np.argmax(res)] > threshold:
                    if len(sentence) > 0:
                        if actions[np.argmax(res)] != sentence[-1]:
                            sentence.append(actions[np.argmax(res)])
                    else:
                        sentence.append(actions[np.argmax(res)])

            if len(sentence) > 5:
                sentence = sentence[-5:]

            # Viz probabilities
            image = prob_viz(res, actions, image, colors)

        cv2.rectangle(image, (0, 0), (640, 40), (245, 117, 16), -1)
        cv2.putText(image, ' '.join(sentence), (3, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2, cv2.LINE_AA)

        # Show to screen
        cv2.imshow('OpenCV Feed', image)

        # Break gracefully
        if cv2.waitKey(10) & 0xFF == ord('q'):
            break

    cap.release()
    cv2.destroyAllWindows()
import matplotlib.pyplot as plt

colors = [(245,117,16), (117,245,16), (16,117,245)]
def prob_viz(res, actions, input_frame, colors):
    output_frame = input_frame.copy()
    for num, prob in enumerate(res):
        cv2.rectangle(output_frame, (0, 60+num*40), (int(prob*100), 90+num*40), colors[num], -1)
        cv2.putText(output_frame, actions[num], (0, 85+num*40), cv2.FONT_HERSHEY_SIMPLEX, 1, (255,255,255), 2, cv2.LINE_AA)
    return output_frame

plt.figure(figsize=(18,18))
plt.imshow(prob_viz(res, actions, image, colors))
IndexError Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_9912\914121024.py in
1 plt.figure(figsize=(18,18))
----> 2 plt.imshow(prob_viz(res, actions, image, colors))
~\AppData\Local\Temp\ipykernel_9912\2568779431.py in prob_viz(res, actions, input_frame, colors)
3 output_frame = input_frame.copy()
4 for num, prob in enumerate(res):
----> 5 cv2.rectangle(output_frame, (0,60+num*40), (int(prob*100), 90+num*40), colors[num], -1)
6 cv2.putText(output_frame, actions[num], (0, 85+num*40), cv2.FONT_HERSHEY_SIMPLEX, 1, (255,255,255), 2, cv2.LINE_AA)
7
IndexError: list index out of range
Shouldn't start_folder be initialized to 0 rather than 30? Since no_sequences is added to it, the collection loop ends up accessing folder indices of 30 and above, which were never created.
In the function draw_landmarks, replace FACE_CONNECTIONS with FACEMESH_TESSELATION.
Similarly, in the function draw_styled_landmarks, replace FACE_CONNECTIONS with FACEMESH_TESSELATION.
Since the length differs for each movement, the frames don't fit a single fixed sequence length.
Is there any way to train on frame sequences of different lengths?
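Besides padding with a Masking layer (see the sketch earlier in this thread), one simple alternative (a sketch, not from the tutorial) is to resample every recording to a fixed frame count by taking evenly spaced frame indices:

```python
# Resample a variable-length recording to a fixed number of frames by
# picking evenly spaced indices, so every sequence gets the same shape.
import numpy as np

def resample(frames, target_len=30):
    idx = np.linspace(0, len(frames) - 1, num=target_len).astype(int)
    return np.array(frames)[idx]
```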
I am getting this error on model.load_weights('action.h5').
I did this project and added frames to recognize 5 actions, but by the end no action was being recognized. I cannot find any error, since the code runs fine and the OpenCV feed shows, just with no text recognizing the actions. Someone please help me ASAP, as I need it for a school project in two days. I'm so helpless. Please, somebody. I will attach my code here.
Action Detection.zip
Thank you for the video and code! I have a question, though: I have trained the model and I am trying to run detection on prerecorded videos. Each video contains multiple actions (performed sequentially). However, the model only gives one detection (one action) per video. Is it possible to solve this issue?
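One approach, sketched here under the assumption that per-frame keypoints have already been extracted with extract_keypoints (the name all_frame_keypoints and the 30-frame window size are hypothetical choices): slide a fixed-size window over the video and predict once per window, just as the live-feed loop above does, which yields one detection per window rather than per video:

```python
# Slide a 30-frame window over a prerecorded video's keypoints and record
# one prediction per window, giving multiple detections per video.
import numpy as np

window, detections = [], []
for keypoints in all_frame_keypoints:       # hypothetical: one vector per frame
    window.append(keypoints)
    window = window[-30:]                   # keep only the latest 30 frames
    if len(window) == 30:
        res = model.predict(np.expand_dims(window, axis=0))[0]
        detections.append(actions[np.argmax(res)])
```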