albanjoseph / signapse
Signapse is an open source software tool for helping everyday people learn sign language for free!
License: GNU General Public License v3.0
License: GNU General Public License v3.0
I've just realised that shooting a "launch video" for the product is actually in the mark scheme and seems to be worth a reasonable number of marks. We should definitely do this, since the product works and we can easily put something together quickly.
There are example videos on 4 github projects from last year:
Documentation has again fallen behind the developed source code. Another documentation pass is required to bring everything up to date.
Static analysers provide an easy way to detect memory leaks, lint code and find other issues in the codebase. Tools like clang-tidy and cppcheck can easily be added to the code-base and run as a GitHub Action. This would be a worthwhile addition.
It might be good to have a tests badge in the readme; we sometimes see them on other GitHub projects. A badge gives an immediate indication of whether the latest integration passed or failed our unit tests, and shows we are using a continuous integration process, which builds confidence in our code-base.
At the moment the hand signs are mirrored (flipped about the vertical axis) to allow the user to see themselves as they would in a mirror. This actually teaches users to make the WRONG SIGNS. Remove this feature.
We should build integration tests covering linting, compiling and unit tests. An integration machine could be set up to continuously evaluate our software and prepare for subsequent releases.
While we are confident the program handles memory leakage and multithreading responsibly, some further checks should be performed to ensure security.
Another check through the OOP design should be performed to make sure classes are SOLID and adhere to the Liskov substitution principle.
The GUI could be improved a bit more, some ideas:
To allow an object classifier to be used in place of a detection model, the localisation of the sign in the camera frame must be pre-determined. For this, an area within the frame should be selected, with the bounding box drawn on-screen. This is a precursor step to populating the CNNProcessor class with a classification network for sign identification.
A localisation bounding box should be selected and shown as feedback to the user (at present using OpenCV's imshow() function). CNNProcessor requires functionality to crop the frame around this bbox for further processing. These features should be built on the hand-crop branch and merged to dev before work on training the CNN may begin.
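A minimal sketch of the clamping step behind the crop, using a stand-in Rect type so it is self-contained (in the real code OpenCV's cv::Rect operator& performs the same clamping, and frame(bbox) then takes the region of interest; the clampToFrame name is illustrative, not from the codebase):

```cpp
#include <algorithm>

// Stand-in for cv::Rect; Signapse would use the real OpenCV type.
struct Rect {
    int x, y, width, height;
};

// Clamp a user-selected bounding box to the frame bounds so the crop
// never reads outside the image. With OpenCV this is
//   cv::Rect clamped = bbox & cv::Rect(0, 0, frame.cols, frame.rows);
//   cv::Mat cropped = frame(clamped);
Rect clampToFrame(const Rect& bbox, int frameWidth, int frameHeight) {
    int x0 = std::max(bbox.x, 0);
    int y0 = std::max(bbox.y, 0);
    int x1 = std::min(bbox.x + bbox.width, frameWidth);
    int y1 = std::min(bbox.y + bbox.height, frameHeight);
    return {x0, y0, std::max(x1 - x0, 0), std::max(y1 - y0, 0)};
}
```

The clamping matters because a user can drag a selection partly off-screen; cropping with an unclamped rectangle would read out of bounds.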
To perform sign detection, we require a CNN classifier network trained on images of various hand signs. This task comprises training a CNN and interfacing with the network via an executable runtime.
Data:
A diverse source of data is required. Sign Language MNIST on Kaggle seems like a good place to start. Rather than the full alphabet, it would perhaps be safer to begin by down-sampling the number of classes so that only 4 or 5 signs are distinguished.
https://www.kaggle.com/datamunge/sign-language-mnist
Network Architecture:
Some work could be carried out in selecting a suitable network architecture for our task. The main trade-off when selecting a classifier is execution latency (on RPi and x86 machines) versus accuracy at the output (top-1, top-5 or another metric). Possible architectures include MobileNet (v1, v2 or v3) or a larger network like InceptionV3.
Runtime/interface:
Various neural-network runtimes could facilitate execution within the app. The first port of call would be to try out OpenCV's DNN module, as OpenCV packages are already installed by our app. TensorFlow also offers the TFLite runtime, which may accelerate execution. Invariably, network execution should be built into the CNNProcessor object. Results from processed frames should be added to the Scene struct and passed along to output. Work may also be required to analyse the latency of this stage and scale down the number of network executions to achieve appropriate throughput through CNNProcessor.
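One simple way to scale down the number of network executions is a frame-skip throttle: run the CNN only on every Nth frame and reuse the last result in between. A minimal sketch (the InferenceThrottle name and the interval of 3 are illustrative assumptions, not taken from the codebase):

```cpp
// Decides which frames go through the (expensive) CNN so the rest of
// the pipeline keeps its frame rate; intermediate frames would reuse
// the last Scene result.
class InferenceThrottle {
public:
    explicit InferenceThrottle(int interval) : interval_(interval) {}

    // Returns true when the current frame should be run through the network.
    bool shouldRun() { return frameCount_++ % interval_ == 0; }

private:
    int interval_;
    int frameCount_ = 0;
};
```

The interval could later be tuned from measured per-frame latency rather than fixed at compile time.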
At present the app is accessible from the terminal only; in the future we would like a GUI to help users interact with the system. An initial task should be to make a wireframe of the GUI (on paper or otherwise) and then build this wireframe using QtDesigner.
Guides on how to use and install QtDesigner are available:
https://doc.qt.io/qt-5/qtdesigner-manual.html
Currently, the Git repo has no description, which looks a bit off. It would be good to add a description there and some tags for search engine optimisation.
As the code-base has matured, it's now time to refactor and clean up some of the mess we've left. This includes: gui/, images/ and models/ could all go into an assets/ directory, as is standard (I think).

We must get some wiki entries in to ensure we get the marks we deserve for the course! From reading the mark scheme, here are some suggested wiki entries:
At present, the Python code used to generate our deep learning model is kept offline on my machine. This code is still a part of Signapse and should be included in our application repo. Some effort is required to package up the code, including the training and evaluation scripts used, and add it to Signapse. The goal is that the network may be generated "from scratch" using only the materials provided in the repo.
Numerical graphs and results can also be added to our Wiki, we want to show off our work on network training!
To synchronise between threads, modifications to the underlying queue data structure are required. A blocking queue should be created which suspends the current thread while there is no element available to deliver. This can be implemented as shown in:
https://stackoverflow.com/questions/12805041/c-equivalent-to-javas-blockingqueue
The new-queue branch should implement our blocking queue and integrate it with the Reel object.
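A minimal sketch of such a blocking queue using a mutex and condition variable, along the lines of the Stack Overflow answer linked above (a sketch, not the final Signapse implementation):

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>

// A queue whose pop() blocks the calling thread until an element is
// available, so a consumer thread can simply loop on pop().
template <typename T>
class BlockingQueue {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(value));
        }
        cond_.notify_one();  // wake one waiting consumer
    }

    T pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        // Wait releases the lock while sleeping; the predicate guards
        // against spurious wake-ups.
        cond_.wait(lock, [this] { return !queue_.empty(); });
        T value = std::move(queue_.front());
        queue_.pop();
        return value;
    }

private:
    std::queue<T> queue_;
    std::mutex mutex_;
    std::condition_variable cond_;
};
```

The Reel object could then hold a BlockingQueue of Scene elements, with the producer thread pushing frames and the consumer popping them without busy-waiting.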
Building OpenCV from source is the only known way to access an up-to-date version of the library from Signapse. Unfortunately, this library is pretty huge, resulting in lengthy compilation times. This is a direct obstruction to getting users up and running with the application and should be remedied.
One idea is to pre-build binaries for realistic deployment scenarios and host them online. Deb packages could be used for this purpose, installed similarly to apt packages. To start, we should have x86 and ARM deb package variants for OpenCV 4.5.X+.
Advances in the code-base are currently undocumented. We should add Doxygen documentation to all created classes and functionality.
Documentation will lag advances in the code-base. For this reason, a docs branch should be made from dev at commit 74631ccd8b93da8b4ca26b5673dfbe0d41a393d5. Continuous review will see documentation built for subsequent advances in the code.
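A sketch of the Doxygen comment style we could adopt; the Scene fields and the hasDetection helper shown here are illustrative assumptions, not taken from the codebase:

```cpp
/// @brief Holds a single captured frame plus any results attached to it
///        as it moves through the processing pipeline.
///
/// Illustrative example of the proposed Doxygen style; the fields shown
/// here are hypothetical.
struct Scene {
    int frameId;   ///< Monotonic index of the captured frame.
    bool handSeen; ///< True when the detector located a hand sign.
};

/// @brief Returns true when the scene contains a usable detection.
/// @param scene The scene to inspect.
/// @return Whether a hand sign was found in this frame.
bool hasDetection(const Scene& scene) { return scene.handSeen; }
```

Running `doxygen` over headers commented like this would generate browsable class and function docs with no extra effort per release.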
Reel has a protected member variable frameQueue. This should be renamed sceneQueue for clarity, as it is actually a queue of scenes.
After some discussion with the course convenors, the existing video processing pipeline has been shown not to fit the assessment requirements for the course, as asynchronous callback routines have been neglected in favour of an execution-blocking pipeline.
The architecture can be made more elegant, and objects better encapsulated, if callback routines are devised for the camera and CNN processing elements. This should be completed for the next release of Signapse.
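A minimal sketch of the callback style, assuming a hypothetical onFrame registration on the camera element (the names Camera, onFrame and emitFrame are illustrative, not from the codebase):

```cpp
#include <functional>
#include <utility>
#include <vector>

// Camera element that pushes frames to subscribers instead of blocking
// the pipeline; the CNN processing element would register a callback here.
class Camera {
public:
    using FrameCallback = std::function<void(int /*frameId*/)>;

    // Subscribe to new frames; called once per interested component.
    void onFrame(FrameCallback cb) { callbacks_.push_back(std::move(cb)); }

    // In the real app this would be driven by the capture loop;
    // here it is invoked manually for illustration.
    void emitFrame(int frameId) {
        for (auto& cb : callbacks_) cb(frameId);
    }

private:
    std::vector<FrameCallback> callbacks_;
};
```

With this shape, the camera no longer needs to know about downstream consumers, which is the encapsulation gain the issue describes.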
A GUI for Signapse has now been wireframed and designed in QtDesigner. Remaining effort is required to integrate our GUI into Signapse, add the required functionality, test the user experience and evaluate the "real time responsiveness of the application" as per point 3 on the RTEP mark scheme.
Assignees are currently @albanjoseph and @charger4241, but all hands are likely to be required to complete this before milestone v2 (6th April).
The first release of Signapse is soon. Let's make some buzz about it on social media. The social channels linked from the readme can be updated with hype material before and after our release. Maybe some screencaps and a demo of our system would help drive home the message that we have a really great product so far!