
The "American Sign Language to Text Conversion" project employs computer vision and machine learning to translate ASL gestures into written text. Through real-time gesture recognition and translation algorithms, it bridges communication gaps between ASL users and non-users, fostering inclusivity and understanding across linguistic communities.

License: MIT License

Languages: Python 64.03%, Jupyter Notebook 35.97%


Sign Language to Text Conversion

Abstract

This project presents a real-time method for recognizing fingerspelling-based American Sign Language (ASL) using neural networks, addressing the communication barriers faced by individuals who are unfamiliar with sign language or lack access to interpreters. Leveraging convolutional neural networks (CNNs), our approach achieves 98.00% accuracy in recognizing the 26 letters of the alphabet. The hand region is first filtered to enhance feature extraction, then classified to predict the gesture class. By incorporating hand position and orientation, we generate training and testing data for the CNNs, enabling efficient recognition of human hand gestures from camera images. This method offers a promising way to bridge communication gaps, enabling individuals to engage in ASL-based communication seamlessly.

Introduction

American Sign Language (ASL) is a vital mode of communication for individuals who are Deaf and Mute (D&M), constituting their primary means of expressing thoughts and interacting with others. Because the challenge D&M individuals face is one of communication, spoken languages are impractical for them, making sign language indispensable for meaningful exchange. Sign language, which relies on visual expression conveyed through hand gestures, serves as a cornerstone of communication within the D&M community. Within this context, gestures act as nonverbal conduits for conveying ideas, emotions, and messages, all comprehended through visual perception.

The essence of sign language lies in its nonverbal communication components, which are understood exclusively through vision. ASL comprises three fundamental components: handshapes, movements, and locations, each contributing to the rich tapestry of expression within the language. Handshapes denote the formation of specific hand configurations, movements encompass the dynamic changes in hand positions and motions, while locations signify the spatial positioning of gestures within the signing space. Together, these components form the intricate grammar and vocabulary of ASL, enabling D&M individuals to articulate nuanced concepts and engage in diverse forms of interaction.


Central to the project's objective is the development of a robust model capable of recognizing Fingerspelling-based hand gestures, a crucial aspect of ASL utilized for spelling out individual letters and forming complete words. By focusing on Fingerspelling, the model aims to address a fundamental aspect of ASL communication, offering a means for D&M individuals to convey precise lexical information efficiently. Through training and implementation, this endeavor seeks to empower D&M individuals by enhancing their ability to express themselves and engage in seamless communication with others, thereby fostering inclusivity and bridging the gap between the D&M community and the broader society.


Problem Statement

Addressing the communication gap for non-sign language users, we aim to develop a real-time system using neural networks to accurately recognize fingerspelling-based American Sign Language (ASL) gestures. The challenge lies in achieving high accuracy in interpreting the diverse hand configurations representing the 26 letters of the alphabet. Our focus is on developing a reliable solution to bridge communication barriers and enhance accessibility for diverse user groups.

Objectives

To integrate machine learning models to recognize and interpret variations in sign language expressions accurately.

To implement real-time processing capabilities in the system to ensure prompt and accurate interpretation of ASL fingerspelling gestures for seamless communication.

To compile a diverse database of sign language gestures to cover cultural nuances and regional differences.

Methodology

The system takes a vision-based approach: all signs are produced with bare hands, which eliminates the need for artificial devices (such as data gloves) for interaction.
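As a rough illustration of the classifier such a vision-based pipeline relies on, here is a minimal Keras sketch of a CNN for 27 classes (26 letters plus a blank symbol). The input size, layer sizes, and class layout are assumptions for illustration, not the repository's exact architecture:

    # Minimal sketch of a CNN gesture classifier; the architecture and
    # 128x128 grayscale input are assumptions, not the repository's model.
    from tensorflow.keras import layers, models

    NUM_CLASSES = 27  # 26 letters + 1 blank symbol (assumed layout)

    def build_model(input_shape=(128, 128, 1)):
        # Stacked conv/pool blocks extract hand-shape features from the
        # thresholded grayscale frame; dense layers do the classification.
        model = models.Sequential([
            layers.Input(shape=input_shape),
            layers.Conv2D(32, 3, activation="relu"),
            layers.MaxPooling2D(),
            layers.Conv2D(64, 3, activation="relu"),
            layers.MaxPooling2D(),
            layers.Flatten(),
            layers.Dense(128, activation="relu"),
            layers.Dropout(0.5),
            layers.Dense(NUM_CLASSES, activation="softmax"),
        ])
        model.compile(optimizer="adam",
                      loss="categorical_crossentropy",
                      metrics=["accuracy"])
        return model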

Gesture Classification:

Algorithm Layer 1:

  1. A Gaussian blur filter and a threshold are applied to the frame captured with OpenCV to obtain the processed image after feature extraction (see the sketch after this list).

  2. The processed image is passed to the CNN model for prediction; if a letter is detected for more than 50 frames, it is printed and used to form the current word.

  3. Spaces between words are inserted using the blank symbol.
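A minimal sketch of this preprocessing-and-prediction step, assuming OpenCV and a trained Keras model; the model filename, blur/threshold parameters, and 128x128 input size are illustrative assumptions:

    import cv2
    import numpy as np
    from tensorflow.keras.models import load_model

    model = load_model("model.h5")  # assumed filename for the trained CNN

    def preprocess(frame):
        # Convert to grayscale, smooth with a Gaussian blur, then apply an
        # adaptive threshold to isolate the hand silhouette.
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        blur = cv2.GaussianBlur(gray, (5, 5), 2)
        thresh = cv2.adaptiveThreshold(
            blur, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
            cv2.THRESH_BINARY_INV, 11, 2)
        resized = cv2.resize(thresh, (128, 128))  # match model input (assumed)
        return resized.reshape(1, 128, 128, 1) / 255.0

    def predict_letter(frame):
        # Returns the index of the most probable class plus the full
        # probability vector, so the caller can apply frame-count logic.
        probs = model.predict(preprocess(frame), verbose=0)[0]
        return int(np.argmax(probs)), probs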

Algorithm Layer 2:

  1. We detect the sets of symbols that produce similar results when classified.

  2. We then distinguish between the symbols within each such set using classifiers trained on that set alone (a routing sketch follows this list).
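A hedged sketch of how such set-specific classifiers might be wired in; the confusable groups and sub-model filenames below are hypothetical examples, not the repository's actual sets:

    # Sketch of Layer 2: route easily-confused letters to dedicated
    # sub-classifiers. The groups and model files below are hypothetical.
    from tensorflow.keras.models import load_model

    CONFUSABLE_GROUPS = {
        frozenset("DRU"): load_model("model_dru.h5"),
        frozenset("SMN"): load_model("model_smn.h5"),
    }

    def refine(letter, image):
        # If the Layer-1 prediction falls in a known confusable group,
        # re-classify the same image with that group's specialist model.
        # Assumes each sub-model's outputs follow the group's sorted order.
        for group, sub_model in CONFUSABLE_GROUPS.items():
            if letter in group:
                members = sorted(group)
                probs = sub_model.predict(image, verbose=0)[0]
                return members[int(probs.argmax())]
        return letter  # unambiguous letters pass through unchanged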

Fingerspelling Sentence Formation Implementation:

Whenever the count of a detected letter exceeds a specific value and no other letter is within a threshold of it, we print the letter and append it to the current word (in our code the count value is 50 and the difference threshold is 20). Otherwise, we clear the dictionary holding the detection counts for the present symbol, to reduce the chance of a wrong letter being printed. Whenever the count of detected blanks (plain background) exceeds a specific value and the current buffer is empty, no space is added; otherwise the end of a word is predicted, a space is printed, and the current word is appended to the sentence below.
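A minimal sketch of this counting logic, using the values quoted above (count 50, difference threshold 20); the function and variable names, and the "blank" label, are illustrative:

    # Sketch of the fingerspelling buffer described above. The thresholds
    # (50 and 20) come from the text; names and the "blank" label are
    # illustrative.
    from collections import Counter

    COMMIT_COUNT = 50    # detections needed before a symbol is considered
    DIFF_THRESHOLD = 20  # required lead over the runner-up symbol

    counts = Counter()
    word, sentence = "", ""

    def on_prediction(symbol):
        """Feed one per-frame prediction; commit letters/spaces by threshold."""
        global word, sentence
        counts[symbol] += 1
        ranked = counts.most_common(2)
        best, best_n = ranked[0]
        runner_up_n = ranked[1][1] if len(ranked) > 1 else 0
        if best_n <= COMMIT_COUNT:
            return
        if best_n - runner_up_n > DIFF_THRESHOLD:
            if best == "blank":
                if word:  # a sustained blank ends the current word
                    sentence += word + " "
                    word = ""
            else:
                word += best
        # In either case, reset the counts: close calls are discarded
        # to avoid committing a wrong letter.
        counts.clear()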

AutoCorrect Feature:

The Python library Hunspell_suggest is used to suggest correct alternatives for each (incorrect) input word, and we display a set of words matching the current word from which the user can select one to append to the current sentence. This helps reduce spelling mistakes and assists in entering complex words.
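A minimal sketch of the suggestion step, assuming the pyhunspell package and typical Linux dictionary paths (both assumptions; the repository may wire this differently):

    # Sketch of the autocorrect step using pyhunspell (dictionary paths
    # are typical Linux locations and may differ on other systems).
    import hunspell

    checker = hunspell.HunSpell("/usr/share/hunspell/en_US.dic",
                                "/usr/share/hunspell/en_US.aff")

    def suggestions(word, limit=5):
        # If the word is already valid, offer it as-is; otherwise surface
        # Hunspell's top alternatives for the user to pick from.
        if checker.spell(word):
            return [word]
        return checker.suggest(word)[:limit]

    print(suggestions("helo"))  # e.g. ['hello', 'help', ...]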

Conclusion

  • A real-time, vision-based American Sign Language (ASL) recognition system was developed for Deaf and Mute (D&M) individuals, focusing on the ASL alphabet.

  • The system reaches a final accuracy of 98.0% on the dataset, with significant improvement achieved through the two layers of algorithms.

  • The enhanced prediction process differentiates between highly similar symbols, improving the system's accuracy and robustness.

  • The system detects nearly all ASL symbols under optimal conditions: clear presentation, minimal background noise, and adequate lighting.

The developed ASL recognition system showcases promising accuracy and functionality, offering potential to significantly enhance communication opportunities for the D&M community. Continued advancements in technology and algorithm refinement hold promise for further improving system accuracy and usability, ultimately fostering greater inclusivity and communication access for D&M individuals.
