Comments (4)
Hi @leckijakub @marekhering!
My initial proposal is to separate the work into three major modules:
Mark Generator
- Generates randomized shapes. As parameters it receives the mark name (str), a tuple with the shape of the expected output (Tuple[int]), and the odds of an answer being "negated" (float). An example of calling the generator with parameters could be:
mg = MarkGenerator(config=config) # TBD on config
cross_marks = mg.generate(mark='cross', shape=(1, 10, 15), negated=0.2)
This means that we want 10 sets of random marks (each set containing marks for 15 questions) with a negation rate of 20%.
Personal Data Generator
- This module will receive a list of names (or rules to generate values) and translate them into handwritten text. In a similar manner to Mark Generator, it can receive a config file, and then a method can be called to generate a sequence of values. As an example, let's say the personal data will be the student's ID (index), name, and surname.
pdg = PersonalDataGenerator(config=config) # TBD on config
personal_data = pdg.generate([
    {'type': int, 'shape': (1, 10, 6)},  # student index
    {'type': str, 'shape': (1, 10, 12), 'min_length': 5, 'capitalize': True},  # name
    {'type': str, 'shape': (1, 10, 12), 'min_length': 3, 'capitalize': True},  # surname
])
print([data.shape for data in personal_data]) # [(1, 10, 6), (1, 10, 12), (1, 10, 12)]
This means that personal_data holds 10 sets of indexes (6 handwritten digits each), names (strings of 5-12 letters with the first letter capitalized), and surnames (strings of 3-12 letters with the first letter capitalized).
Exams Combiner
- Receives an exam template with the positions of the answers that can be marked and of the personal data fields. It places the generated marks (as a mask) and the personal data on top of the template, with a random position offset. Using the previously defined datasets, getting a set ready for training could look similar to:
ec = ExamsCombiner(config=config) # TBD on config
exams = ec.extract(templates=[{'path': path, 'positions': positions}])
marked_exams = exams.mask(marks=cross_marks, data=personal_data)
print(marked_exams.shape) # (10, height_of_exam, width_of_exam, 3)
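To make the meaning of shape and negated a bit more concrete, here is a minimal sketch of how the negation odds could be turned into one flag per mark. The helper name sample_negation_flags, the seed parameter, and the choice to ignore the leading axis of shape are assumptions for illustration only; the real MarkGenerator is still TBD.

```python
import random
from typing import List, Tuple


def sample_negation_flags(
    shape: Tuple[int, int, int], negated: float, seed: int = 0
) -> List[List[bool]]:
    """Return one boolean per mark; True means "draw this mark as negated".

    Hypothetical helper: the leading axis of `shape` is ignored here and only
    (number of sets, marks per set) is used.
    """
    _, n_sets, n_questions = shape
    rng = random.Random(seed)  # seeded so that a dataset can be regenerated
    return [
        [rng.random() < negated for _ in range(n_questions)]
        for _ in range(n_sets)
    ]


flags = sample_negation_flags((1, 10, 15), negated=0.2)
print(len(flags), len(flags[0]))  # 10 15
```

With negated=0.2 roughly one mark in five ends up flagged on average; the exact count per set varies with the seed.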
from splinter.
- Mark Generator
I'd extract negation from shape generation and create it as a separate mark (circle or so)
If an answer is withdrawn, we expect that another one is given. In such a case we should generate two cross marks and one circle mark.
- Personal Data Generator
agree
- Exams Combiner
To be consistent, let us use Exam Generator, with the possibility to generate one as well as multiple exams.
I like the overall architecture of the modules 👍
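A rough sketch of the withdrawn-answer case described above, as a per-question layout. The function name marks_for_question, the label strings, and the assumption that the circle is drawn around the withdrawn cross are all hypothetical; this only shows the "two crosses, one circle" bookkeeping, not any drawing.

```python
import random
from typing import List


def marks_for_question(
    n_options: int, withdrawn: bool, rng: random.Random
) -> List[str]:
    """Return one label per answer option: '', 'cross' or 'cross+circle'."""
    labels = [''] * n_options
    if withdrawn:
        # Two distinct options: the final answer and the one that was withdrawn.
        final, retracted = rng.sample(range(n_options), 2)
        labels[retracted] = 'cross+circle'  # assumed: circle marks the withdrawal
    else:
        final = rng.randrange(n_options)
    labels[final] = 'cross'
    return labels


rng = random.Random(0)
layout = marks_for_question(n_options=4, withdrawn=True, rng=rng)
print(sum('cross' in label for label in layout))   # 2
print(sum('circle' in label for label in layout))  # 1
```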
from splinter.
I'd extract negation from shape generation and create it as a separate mark (circle or so)
If an answer is withdrawn, we expect that another one is given. In such a case we should generate two cross marks and one circle mark.
That's a good point. How about doing it this way?
mg = MarkGenerator(config=config) # TBD on config
cross_marks = mg.generate([{'mark': 'cross', 'shape': (1, 10, 15)}, {'mark': 'negated_cross', 'shape': (1, 10, 3)}])
To be consistent let us use Exam Generator with the possibility to generate one as well as multiple exams.
Agree
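As a quick sanity check on the spec-list form proposed above, a small sketch that counts how many marks of each kind such a list asks for. The helper plan_marks is hypothetical and not part of any agreed API; it only multiplies out each shape.

```python
from typing import Dict, List


def plan_marks(specs: List[dict]) -> Dict[str, int]:
    """Count how many marks of each kind a list of spec dicts requests."""
    totals: Dict[str, int] = {}
    for spec in specs:
        count = 1
        for dim in spec['shape']:
            count *= dim  # total marks = product of all shape dimensions
        totals[spec['mark']] = totals.get(spec['mark'], 0) + count
    return totals


specs = [
    {'mark': 'cross', 'shape': (1, 10, 15)},
    {'mark': 'negated_cross', 'shape': (1, 10, 3)},
]
print(plan_marks(specs))  # {'cross': 150, 'negated_cross': 30}
```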
from splinter.
That's a good point. How about doing it this way?
mg = MarkGenerator(config=config) # TBD on config
cross_marks = mg.generate([{'mark': 'cross', 'shape': (1, 10, 15)}, {'mark': 'negated_cross', 'shape': (1, 10, 3)}])
Yes, something like this 👍
from splinter.