nicknochnack / actiondetectionforsignlanguage
A practical implementation of sign language estimation using an LSTM NN built on TF Keras.
Failed to install sklearn and matplotlib. Does anyone know what the problem is?
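A hedged guess, since the post doesn't include the actual error message: on PyPI the real package name is scikit-learn (the old `sklearn` name is a deprecated placeholder), so installing under the correct names from a notebook cell may resolve it:

```python
# Assumption: the failure comes from the deprecated 'sklearn' PyPI name.
# Install the real package names from a notebook cell instead.
!pip install scikit-learn matplotlib
```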
Hi, why do you use 3 stacked LSTM layers for action detection? Could you please give some paper references on the model architecture?
Thanks!
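For reference, the architecture in question (reproduced here as a sketch from the tutorial code quoted later in this thread; the original uses 30-frame sequences, though one poster below uses 40) stacks three LSTM layers, with return_sequences=True on all but the last so the time axis is preserved between them:

```python
# Sketch of the tutorial's stacked-LSTM model: 30 frames per sequence,
# 1662 keypoint values per frame, 3 output actions.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

model = Sequential([
    LSTM(64, return_sequences=True, activation='relu', input_shape=(30, 1662)),
    LSTM(128, return_sequences=True, activation='relu'),  # keeps the time axis
    LSTM(64, return_sequences=False, activation='relu'),  # emits a single vector
    Dense(64, activation='relu'),
    Dense(32, activation='relu'),
    Dense(3, activation='softmax'),                       # one unit per action
])
```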
IndexError Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_9912\2541305508.py in
43
44 # Viz probabilities
---> 45 image = prob_viz(res, actions, image, colors)
46
47 cv2.rectangle(image, (0,0), (640, 40), (245, 117, 16), -1)
~\AppData\Local\Temp\ipykernel_9912\2568779431.py in prob_viz(res, actions, input_frame, colors)
3 output_frame = input_frame.copy()
4 for num, prob in enumerate(res):
----> 5 cv2.rectangle(output_frame, (0,60+num*40), (int(prob*100), 90+num*40), colors[num], -1)
6 cv2.putText(output_frame, actions[num], (0, 85+num*40), cv2.FONT_HERSHEY_SIMPLEX, 1, (255,255,255), 2, cv2.LINE_AA)
7
IndexError: list index out of range
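This IndexError typically means `colors` has fewer entries than there are actions: prob_viz indexes `colors[num]` once per action, and the tutorial hard-codes only three colours. A minimal fix sketch, assuming five actions as elsewhere in this thread:

```python
# Assumption: the IndexError comes from colors[num] running past the
# hard-coded three-colour list; define at least one colour per action.
colors = [(245, 117, 16), (117, 245, 16), (16, 117, 245), (245, 16, 117), (16, 245, 117)]
assert len(colors) >= len(actions)   # prob_viz indexes colors by action number
```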
AlreadyExistsError Traceback (most recent call last)
in
1 from sklearn.model_selection import train_test_split
----> 2 from tensorflow.keras.utils import to_categorical
~\anaconda3\lib\site-packages\tensorflow\__init__.py in
39 import sys as _sys
40
---> 41 from tensorflow.python.tools import module_util as _module_util
42 from tensorflow.python.util.lazy_loader import LazyLoader as _LazyLoader
43
~\anaconda3\lib\site-packages\tensorflow\python\__init__.py in
46 from tensorflow.python import data
47 from tensorflow.python import distribute
---> 48 from tensorflow.python import keras
49 from tensorflow.python.feature_column import feature_column_lib as feature_column
50 from tensorflow.python.layers import layers
~\anaconda3\lib\site-packages\tensorflow\python\keras\__init__.py in
25
26 # See b/110718070#comment18 for more details about this import.
---> 27 from tensorflow.python.keras import models
28
29 from tensorflow.python.keras.engine.input_layer import Input
~\anaconda3\lib\site-packages\tensorflow\python\keras\models.py in
24 from tensorflow.python.keras import metrics as metrics_module
25 from tensorflow.python.keras import optimizer_v1
---> 26 from tensorflow.python.keras.engine import functional
27 from tensorflow.python.keras.engine import sequential
28 from tensorflow.python.keras.engine import training
~\anaconda3\lib\site-packages\tensorflow\python\keras\engine\functional.py in
36 from tensorflow.python.keras.engine import keras_tensor
37 from tensorflow.python.keras.engine import node as node_module
---> 38 from tensorflow.python.keras.engine import training as training_lib
39 from tensorflow.python.keras.engine import training_utils
40 from tensorflow.python.keras.saving.saved_model import network_serialization
~\anaconda3\lib\site-packages\tensorflow\python\keras\engine\training.py in
50 from tensorflow.python.keras.engine import base_layer_utils
51 from tensorflow.python.keras.engine import compile_utils
---> 52 from tensorflow.python.keras.engine import data_adapter
53 from tensorflow.python.keras.engine import training_utils
54 from tensorflow.python.keras.mixed_precision import loss_scale_optimizer as lso
~\anaconda3\lib\site-packages\tensorflow\python\keras\engine\data_adapter.py in
56
57 keras_data_adapter_gauge = monitoring.BoolGauge(
---> 58 "/tensorflow/api/keras/data_adapters", "keras data adapter usage", "method")
59
60 try:
~\anaconda3\lib\site-packages\tensorflow\python\eager\monitoring.py in __init__(self, name, description, *labels)
349 """
350 super(BoolGauge, self).__init__('BoolGauge', _bool_gauge_methods,
--> 351 len(labels), name, description, *labels)
352
353 def get_cell(self, *labels):
~\anaconda3\lib\site-packages\tensorflow\python\eager\monitoring.py in __init__(self, metric_name, metric_methods, label_length, *args)
124 self._metric_name, len(self._metric_methods)))
125
--> 126 self._metric = self._metric_methods[self._label_length].create(*args)
127
128 def __del__(self):
AlreadyExistsError: Another metric with the same name already exists.
Hello, I am getting various errors and conflicts installing the Python modules. Can someone list the version numbers of all the packages used in the 'Action Detection Refined.ipynb' notebook?
Thank you!
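The AlreadyExistsError above is commonly caused by a standalone `keras` install that doesn't match the `keras` bundled inside `tensorflow`. Since the notebook's exact pins aren't listed here, a small diagnostic sketch that prints what is actually installed, so versions can be compared against a fresh environment:

```python
# Print the versions actually being imported (assumption: the conflict is a
# keras/tensorflow version mismatch; compare these against a clean env).
import tensorflow, cv2, mediapipe, sklearn, matplotlib, numpy
for mod in (tensorflow, cv2, mediapipe, sklearn, matplotlib, numpy):
    print(mod.__name__, mod.__version__)
```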
I have data containing sequences with different numbers of frames, e.g. one sequence with 58 frames and another with 63, etc.
How can I use this data for the following part?
Here, when I use np.array it flattens the array because the elements have different shapes,
and here I cannot specify the input shape, since I couldn't complete the previous part.
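One common way to handle this (a sketch, not from the tutorial) is to pad every sequence to the longest length and let a Masking layer skip the padded frames. It assumes each element of `sequences` is a list of per-frame keypoint vectors (frames × 1662) with varying frame counts:

```python
# Pad variable-length sequences to a common length, then mask the padding.
import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Masking, LSTM, Dense

max_len = max(len(seq) for seq in sequences)          # e.g. 63
X = pad_sequences(sequences, maxlen=max_len, dtype='float32',
                  padding='post', value=0.0)          # shape: (n, max_len, 1662)

model = Sequential([
    Masking(mask_value=0.0, input_shape=(max_len, 1662)),  # skip padded frames
    LSTM(64, return_sequences=False, activation='tanh'),
    Dense(3, activation='softmax'),                   # 3 = number of actions
])
```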
File "", line 2
if results.face_landmarks
^
IndentationError: unexpected indent
import cv2
import mediapipe as mp

mp_holistic = mp.solutions.holistic
mp_drawing = mp.solutions.drawing_utils

def mp_detection(image, model):
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)   # BGR (OpenCV) -> RGB (MediaPipe)
    image.flags.writeable = False                    # the attribute is 'writeable', not 'writable'
    results = model.process(image)
    image.flags.writeable = True
    image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)   # back to BGR for display
    return image, results

def draw_landmarks(image, results):
    # FACE_CONNECTIONS was renamed FACEMESH_TESSELATION in newer mediapipe releases
    mp_drawing.draw_landmarks(image, results.face_landmarks, mp_holistic.FACEMESH_TESSELATION)
    mp_drawing.draw_landmarks(image, results.pose_landmarks, mp_holistic.POSE_CONNECTIONS)
    mp_drawing.draw_landmarks(image, results.left_hand_landmarks, mp_holistic.HAND_CONNECTIONS)
    mp_drawing.draw_landmarks(image, results.right_hand_landmarks, mp_holistic.HAND_CONNECTIONS)

cap = cv2.VideoCapture(0)
with mp_holistic.Holistic(min_detection_confidence=0.5, min_tracking_confidence=0.5) as holistic:
    while cap.isOpened():
        ret, frame = cap.read()
        image, results = mp_detection(frame, holistic)
        print(results)
        draw_landmarks(image, results)
        cv2.imshow("live feed", image)   # show the annotated image, not the raw frame
        if cv2.waitKey(10) & 0xFF == ord('q'):
            break
cap.release()
cv2.destroyAllWindows()
Hi,
I am trying to design a pose-estimation-based drum player, but the accuracy is not good. I tried including only the hand and pose keypoints.
def extract_keypoints(results):
    pose = np.array([[res.x, res.y, res.z, res.visibility] for res in results.pose_landmarks.landmark]).flatten() if results.pose_landmarks else np.zeros(33*4)
    lh = np.array([[res.x, res.y, res.z] for res in results.left_hand_landmarks.landmark]).flatten() if results.left_hand_landmarks else np.zeros(21*3)
    rh = np.array([[res.x, res.y, res.z] for res in results.right_hand_landmarks.landmark]).flatten() if results.right_hand_landmarks else np.zeros(21*3)
    return np.concatenate([pose, lh, rh])
Also, I tried reducing the sequence length, since drum motions are fast:
DATA_PATH = os.path.join('MP_Data')
actions = np.array(['Kick', 'Snare', 'Cymbal','Normal','hihat'])
no_sequences = 30
sequence_length = 15
start_folder = 0
and taking 15 frames during the live demo. But it's not accurate. Can you please suggest how to make it more accurate? Or make a video on this?
Cheers,
Suti
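One detail worth double-checking in a setup like this (an observation on the post above, not from the tutorial): with the face keypoints dropped, each frame vector has 33*4 + 21*3 + 21*3 = 258 values, so the model's input shape has to match both the reduced feature count and the shorter sequence length:

```python
# Sketch of a matching input shape, assuming 15-frame sequences and no face keypoints.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

n_features = 33 * 4 + 21 * 3 + 21 * 3          # pose + left hand + right hand = 258
model = Sequential([
    LSTM(64, return_sequences=True, activation='relu', input_shape=(15, n_features)),
    LSTM(64, return_sequences=False, activation='relu'),
    Dense(5, activation='softmax'),             # 5 drum actions
])
```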
Code:
cap = cv2.VideoCapture(0)
with mp_holistic.Holistic(min_detection_confidence=0.5, min_tracking_confidence=0.5) as holistic:
    # NEW LOOP
    # Loop through actions
    for action in actions:
        # Loop through sequences aka videos
        for sequence in range(start_folder, start_folder+no_sequences):
            # Loop through video length aka sequence length
            for frame_num in range(sequence_length):

                # Read feed
                ret, frame = cap.read()
                cv2.startWindowThread()

                # Make detections
                image, results = mediapipe_detection(frame, holistic)

                # Draw landmarks
                draw_styled_landmarks(image, results)

                # NEW Apply wait logic
                if frame_num == 0:
                    cv2.putText(image, 'STARTING COLLECTION', (120,200),
                                cv2.FONT_HERSHEY_SIMPLEX, 1, (0,255,0), 4, cv2.LINE_AA)
                    cv2.putText(image, 'Collecting frames for {} Video Number {}'.format(action, sequence), (15,12),
                                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0,0,255), 1, cv2.LINE_AA)
                    # Show to screen
                    cv2.imshow('OpenCV data collection Feed', image)
                    cv2.waitKey(1)
                else:
                    cv2.putText(image, 'Collecting frames for {} Video Number {}'.format(action, sequence), (15,12),
                                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0,0,255), 1, cv2.LINE_AA)
                    # Show to screen
                    cv2.imshow('OpenCV data collection Feed', image)

                # NEW Export keypoints
                keypoints = extract_keypoints(results)
                npy_path = os.path.join(DATA_PATH, action, str(sequence), str(frame_num))
                np.save(npy_path, keypoints)

                # break
                if cv2.waitKey(10) & 0xFF == ord('q'):
                    break

    cap.release()
    cv2.destroyAllWindows()
    cv2.waitKey(1)
error:
FileNotFoundError Traceback (most recent call last)
Cell In[58], line 43
41 keypoints = extract_keypoints(results)
42 npy_path = os.path.join(DATA_PATH, action, str(sequence), str(frame_num))
---> 43 np.save(npy_path, keypoints)
45 # break
46 if cv2.waitKey(10) & 0xFF == ord('q'):
File <__array_function__ internals>:180, in save(*args, **kwargs)
File ~/miniconda3/envs/tensorflow/lib/python3.10/site-packages/numpy/lib/npyio.py:518, in save(file, arr, allow_pickle, fix_imports)
516 if not file.endswith('.npy'):
517 file = file + '.npy'
--> 518 file_ctx = open(file, "wb")
520 with file_ctx as fid:
521 arr = np.asanyarray(arr)
FileNotFoundError: [Errno 2] No such file or directory: '/Users/utkx2/Desktop/python/MLDL/Projects/Human-Action-Recognition/MP_Data/hello/31/0.npy'
What should I do? Even my MP_Data folder has been created.
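The path in the error ends in hello/31, which suggests the per-sequence folders were only created for an earlier index range (the setup cell makes folders for range(no_sequences), while collection iterates from start_folder upward). A minimal fix sketch: create the folder right before saving.

```python
# Assumption: the FileNotFoundError is just a missing directory for this
# sequence index; create it idempotently before saving the keypoints.
import os
os.makedirs(os.path.join(DATA_PATH, action, str(sequence)), exist_ok=True)
np.save(os.path.join(DATA_PATH, action, str(sequence), str(frame_num)), keypoints)
```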
So basically, when I close the Anaconda server and run it again, it says
"action" or "actions" is not defined.
import cv2
import numpy as np
import os
import mediapipe as mp
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.callbacks import TensorBoard
from tensorflow.keras.utils import to_categorical   # use tensorflow.keras throughout; mixing with standalone keras causes conflicts
from sklearn.model_selection import train_test_split
from sklearn.metrics import multilabel_confusion_matrix, accuracy_score
mp_holistic = mp.solutions.holistic
mp_drawing = mp.solutions.drawing_utils
DATA_PATH = os.path.join("C:/Users/96654/PycharmProjects/mrdas/newmydata")
actions = np.array(['hello', 'thanks', 'cool'])
no_sequences = 40
sequence_length = 40
start_folder = 0
for action in actions:
    for sequence in range(no_sequences):
        try:
            os.makedirs(os.path.join(DATA_PATH, action, str(sequence)))
        except:
            pass
def mediapipe_detection(image, model):
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)   # BGR (OpenCV) -> RGB (MediaPipe)
    image.flags.writeable = False
    results = model.process(image)
    image.flags.writeable = True
    image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)   # convert back (RGB2BGR, not BGR2RGB again)
    return image, results
def draw_landmarks(image, results):
    mp_drawing.draw_landmarks(image, results.face_landmarks, mp_holistic.FACEMESH_TESSELATION)
    mp_drawing.draw_landmarks(image, results.pose_landmarks, mp_holistic.POSE_CONNECTIONS)
    mp_drawing.draw_landmarks(image, results.left_hand_landmarks, mp_holistic.HAND_CONNECTIONS)
    mp_drawing.draw_landmarks(image, results.right_hand_landmarks, mp_holistic.HAND_CONNECTIONS)
def draw_styled_landmarks(image, results):
    # Draw face connections
    mp_drawing.draw_landmarks(image, results.face_landmarks, mp_holistic.FACEMESH_TESSELATION,
                              mp_drawing.DrawingSpec(color=(80,110,10), thickness=1, circle_radius=1),
                              mp_drawing.DrawingSpec(color=(80,256,121), thickness=1, circle_radius=1)
                              )
    # Draw pose connections
    mp_drawing.draw_landmarks(image, results.pose_landmarks, mp_holistic.POSE_CONNECTIONS,
                              mp_drawing.DrawingSpec(color=(80,22,10), thickness=2, circle_radius=4),
                              mp_drawing.DrawingSpec(color=(80,44,121), thickness=2, circle_radius=2)
                              )
    # Draw left hand connections
    mp_drawing.draw_landmarks(image, results.left_hand_landmarks, mp_holistic.HAND_CONNECTIONS,
                              mp_drawing.DrawingSpec(color=(121,22,76), thickness=2, circle_radius=4),
                              mp_drawing.DrawingSpec(color=(121,44,250), thickness=2, circle_radius=2)
                              )
    # Draw right hand connections
    mp_drawing.draw_landmarks(image, results.right_hand_landmarks, mp_holistic.HAND_CONNECTIONS,
                              mp_drawing.DrawingSpec(color=(245,117,66), thickness=2, circle_radius=4),
                              mp_drawing.DrawingSpec(color=(245,66,230), thickness=2, circle_radius=2)
                              )
def extract_keypoints(results):
    pose = np.array([[res.x, res.y, res.z, res.visibility] for res in results.pose_landmarks.landmark]).flatten() if results.pose_landmarks else np.zeros(33*4)
    face = np.array([[res.x, res.y, res.z] for res in results.face_landmarks.landmark]).flatten() if results.face_landmarks else np.zeros(468*3)
    lh = np.array([[res.x, res.y, res.z] for res in results.left_hand_landmarks.landmark]).flatten() if results.left_hand_landmarks else np.zeros(21*3)
    rh = np.array([[res.x, res.y, res.z] for res in results.right_hand_landmarks.landmark]).flatten() if results.right_hand_landmarks else np.zeros(21*3)
    return np.concatenate([pose, face, lh, rh])
#sys.tracebacklimit = 0
label_map = {label:num for num, label in enumerate(actions)}
sequences, labels = [], []
for action in actions:
    for sequence in np.array(os.listdir(os.path.join(DATA_PATH, action))).astype(int):
        window = []
        for frame_num in range(sequence_length):
            res = np.load(os.path.join(DATA_PATH, action, str(sequence), "{}.npy".format(frame_num)))
            window.append(res)
        sequences.append(window)
        labels.append(label_map[action])
np.array(sequences).shape
np.array(labels).shape
X = np.array(sequences)
X.shape
y = to_categorical(labels).astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.05)
y_test.shape
colors = [(245, 117, 16), (117, 245, 16), (16, 117, 245)]
#log_dir = os.path.join('Logs2')
#tb_callback = TensorBoard(log_dir=log_dir)
model = Sequential()
model.add(LSTM(64, return_sequences=True, activation='relu', input_shape=(40,1662)))
model.add(LSTM(128, return_sequences=True, activation='relu'))
model.add(LSTM(64, return_sequences=False, activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(actions.shape[0], activation='softmax'))
res = [0.7,0.2,0.1]
actions[np.argmax(res)]
#model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['categorical_accuracy'])
#model.fit(X_train, y_train, epochs=2000, callbacks=[tb_callback])
model.summary()
res = model.predict(X_test)
actions[np.argmax(res[4])]
model.save('action1.h5')
yhat = model.predict(X_test)
ytrue = np.argmax(y_test, axis=1).tolist()
yhat = np.argmax(yhat, axis=1).tolist()
multilabel_confusion_matrix(ytrue, yhat)
accuracy_score(ytrue, yhat)
def prob_viz(res, actions, input_frame, colors):
    output_frame = input_frame.copy()
    for num, prob in enumerate(res):
        cv2.rectangle(output_frame, (0, 60 + num * 40), (int(prob * 100), 90 + num * 40), colors[num], -1)
        cv2.putText(output_frame, actions[num], (0, 85 + num * 40), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2, cv2.LINE_AA)
    return output_frame
#sequence.reverse()
#len(sequence)
#sequence.append('def')
#sequence.reverse()
#sequence[-40:]
sequence = []
sentence = []
predictions = []
threshold = 0.5
cap = cv2.VideoCapture(0)
with mp_holistic.Holistic(min_detection_confidence=0.5, min_tracking_confidence=0.5) as holistic:
    while cap.isOpened():
        # Read feed
        ret, frame = cap.read()

        # Make detections
        image, results = mediapipe_detection(frame, holistic)
        #print(results)

        # Draw landmarks
        draw_styled_landmarks(image, results)
        image = cv2.flip(image, 1)

        # 2. Prediction logic
        keypoints = extract_keypoints(results)
        sequence.append(keypoints)
        sequence = sequence[-40:]

        if len(sequence) == 40:
            res = model.predict(np.expand_dims(sequence, axis=0))[0]
            print(actions[np.argmax(res)])
            predictions.append(np.argmax(res))

            # 3. Viz logic
            if np.unique(predictions[-10:])[0] == np.argmax(res):
                if res[np.argmax(res)] > threshold:
                    if len(sentence) > 0:
                        if actions[np.argmax(res)] != sentence[-1]:
                            sentence.append(actions[np.argmax(res)])
                    else:
                        sentence.append(actions[np.argmax(res)])

            if len(sentence) > 5:
                sentence = sentence[-5:]

            # Viz probabilities
            image = prob_viz(res, actions, image, colors)

        cv2.rectangle(image, (0, 0), (640, 40), (245, 117, 16), -1)
        cv2.putText(image, ' '.join(sentence), (3, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2, cv2.LINE_AA)

        # Show to screen
        cv2.imshow('OpenCV Feed', image)

        # Break gracefully
        if cv2.waitKey(10) & 0xFF == ord('q'):
            break

    cap.release()
    cv2.destroyAllWindows()
import matplotlib.pyplot as plt

colors = [(245,117,16), (117,245,16), (16,117,245)]
def prob_viz(res, actions, input_frame, colors):
    output_frame = input_frame.copy()
    for num, prob in enumerate(res):
        cv2.rectangle(output_frame, (0, 60+num*40), (int(prob*100), 90+num*40), colors[num], -1)
        cv2.putText(output_frame, actions[num], (0, 85+num*40), cv2.FONT_HERSHEY_SIMPLEX, 1, (255,255,255), 2, cv2.LINE_AA)
    return output_frame

plt.figure(figsize=(18,18))
plt.imshow(prob_viz(res, actions, image, colors))
IndexError Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_9912\914121024.py in
1 plt.figure(figsize=(18,18))
----> 2 plt.imshow(prob_viz(res, actions, image, colors))
~\AppData\Local\Temp\ipykernel_9912\2568779431.py in prob_viz(res, actions, input_frame, colors)
3 output_frame = input_frame.copy()
4 for num, prob in enumerate(res):
----> 5 cv2.rectangle(output_frame, (0,60+num*40), (int(prob*100), 90+num*40), colors[num], -1)
6 cv2.putText(output_frame, actions[num], (0, 85+num*40), cv2.FONT_HERSHEY_SIMPLEX, 1, (255,255,255), 2, cv2.LINE_AA)
7
IndexError: list index out of range
Shouldn't start_folder be initialized to 0 rather than 30? Since no_sequences is added to it, the collection loop ends up accessing folder indices of 30 and above, which were never created.
In the function draw_landmarks, replace FACE_CONNECTIONS with FACEMESH_TESSELATION.
Similarly, in the function draw_styled_landmarks, replace FACE_CONNECTIONS with FACEMESH_TESSELATION.
Since the length differs for each movement, the frames don't fit a single fixed sequence length.
Is there any way to train on frame sequences of different lengths?
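Besides padding with a Masking layer (see the sketch earlier in this thread), one simple alternative (a sketch, not from the tutorial) is to resample every recording to a fixed frame count by taking evenly spaced frame indices:

```python
# Resample a variable-length recording to a fixed number of frames by
# picking evenly spaced indices, so every sequence gets the same shape.
import numpy as np

def resample(frames, target_len=30):
    idx = np.linspace(0, len(frames) - 1, num=target_len).astype(int)
    return np.array(frames)[idx]
```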
I am getting this error on model.load_weights('action.h5').
I did this project and added frames to recognize 5 actions, but by the end no action was being recognized. I cannot find any error, since the code runs fine and the OpenCV feed shows, just with no text recognizing the actions. Someone please help me ASAP, as I need it for a school project in two days. I'm so helpless. Please, somebody. I will attach my code here.
Action Detection.zip
Thank you for the video and code! I have a question, though: I have trained the model and I am trying to run detection on prerecorded videos. Each video contains multiple actions (performed sequentially). However, the model only gives one detection (one action) per video. Is it possible to solve this issue?
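One approach, sketched here under the assumption that per-frame keypoints have already been extracted with extract_keypoints (the name all_frame_keypoints and the 30-frame window size are hypothetical choices): slide a fixed-size window over the video and predict once per window, just as the live-feed loop above does, which yields one detection per window rather than per video:

```python
# Slide a 30-frame window over a prerecorded video's keypoints and record
# one prediction per window, giving multiple detections per video.
import numpy as np

window, detections = [], []
for keypoints in all_frame_keypoints:       # hypothetical: one vector per frame
    window.append(keypoints)
    window = window[-30:]                   # keep only the latest 30 frames
    if len(window) == 30:
        res = model.predict(np.expand_dims(window, axis=0))[0]
        detections.append(actions[np.argmax(res)])
```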