Comments (21)
With respect to the existing scripts, there are some optimisations that can be done.
Let's try to write programs using the Unix philosophy. Basically these 2 points for now -
- Write programs that do one thing and do it well.
- Write programs to work together.
Applying this to your sort and gen-md scripts, we can do the following -
- sort -
  - Make the sorting function universal/generic instead of specific to the input csv format.
    - For example, determine the number of columns in the csv from the number of elements in the top row instead of using hardcoded values.
  - When sorting, we can pass both the column index to be used for sorting and the order of sort as arguments.
  - Let the sort function return the output csv file (or its instance) instead of saving the file in some location.
    - That way the calling function can decide what to do with the returned file.
- generate-markdown
  - Let's make one function which outputs exactly one block (the structure in the word template.md file).
    - This function will work on only 1 row of the csv text stream and output the text stream for 1 block of the output.
  - Once again, let the caller function pass the csv stream as input and catch the output stream as a return value.
    - The idea here is that we can reuse this function in multiple places, where we need to generate an md file for the entire library, for topic-based words, or for md files split as per word initials etc.
And then write a parent function which calls these and does the needed thing as per the type of output md file needed.
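A minimal sketch of how these pieces could fit together, assuming the csv has a header row with the English word in column 0 and the Marathi word in column 1. The names `sort_csv`, `gen_block` and `gen_md`, and the block layout itself, are hypothetical and are not taken from the repo; the real block structure is whatever the template file defines.

```python
import csv
import io


def sort_csv(csv_text, column=0, descending=False, has_header=True):
    """Return csv_text sorted by one column, as a new CSV string."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ""
    # The column count comes from the top row, not from a hardcoded value.
    n_columns = len(rows[0])
    if not 0 <= column < n_columns:
        raise IndexError(f"column {column} out of range for {n_columns} columns")
    header, body = (rows[0], rows[1:]) if has_header else (None, rows)
    body.sort(key=lambda row: row[column].casefold(), reverse=descending)
    # Write to an in-memory stream so the caller decides where the result goes.
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    if header is not None:
        writer.writerow(header)
    writer.writerows(body)
    return out.getvalue()


def gen_block(row, en_col=0, mr_col=1):
    """Return the markdown text for exactly one word block."""
    # Hypothetical layout; replace with the structure from the template file.
    return f"### {row[en_col]}\n{row[mr_col]}\n\n"


def gen_md(csv_text):
    """Parent function: sort the csv, then emit one block per data row."""
    rows = list(csv.reader(io.StringIO(sort_csv(csv_text))))
    return "".join(gen_block(row) for row in rows[1:])  # rows[0] is the header
```

Because each function takes text in and returns text out, the same `gen_block` could be reused for the entire-library file, the topic files and the per-letter files; only the parent function would change.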
I am still working on this, that is, the types of files we need to create. But they will be something like -
- entire library in 1 file
- 1 file for each letter of the alphabet (like A.md will contain all words starting with the letter 'a', B.md for 'b' and so on)
- 1 file for each topic
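The per-letter and per-topic splits could be sketched roughly as below. This assumes the rows are already parsed into lists, that the English word is in column 0, and that topics sit in a single column separated by `;`. The column positions, the separator and the function names are all assumptions for illustration, not the actual db.csv layout.

```python
from collections import defaultdict


def group_by_initial(rows, en_col=0):
    """Map an output file name such as 'A.md' to the rows that belong in it."""
    groups = defaultdict(list)
    for row in rows:
        word = row[en_col].strip()
        if not word:
            continue  # skip rows that have no English word
        groups[f"{word[0].upper()}.md"].append(row)
    return dict(groups)


def group_by_topic(rows, topic_col=2, separator=";"):
    """Map '<topic>.md' to its rows; a row with several topics lands in each."""
    groups = defaultdict(list)
    for row in rows:
        for topic in row[topic_col].split(separator):
            topic = topic.strip()
            if topic:
                groups[f"{topic}.md"].append(row)
    return dict(groups)
```

Both functions only decide which rows go into which file; turning the rows into markdown and writing the files is left to the caller.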
Thanks! I can make the optimizations for the generate-markdown script fairly quickly so I'll do those first.
from marathi-shabd.
I can start working on those in the afternoon.
from marathi-shabd.
- input file will be the db.csv
- output will be a markdown file which will be used on the GitHub Pages website. For now it will be the home page of the site.
- for now, a user will have to manually search for a word of interest (or can also use the browser's search function).
from marathi-shabd.
prerequisites -
- have a csv file with content in en and mr columns, at least.
- have a template markdown file for the output
steps -
- read the csv file
- create a new markdown file from the template
- extract the en and mr words from a row of the csv
- fill the extracted words in the markdown file
- repeat the previous two steps (extract and fill) till all rows of the csv are done
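The steps above can be sketched as follows, assuming the csv has a header row naming the `en` and `mr` columns. The `BLOCK_TEMPLATE` string and the function name `csv_to_markdown` are hypothetical; the real template is the one in the project's template folder.

```python
import csv
import io
from string import Template

# Hypothetical one-line block; swap in the contents of the real template file.
BLOCK_TEMPLATE = Template("**$en** : $mr\n")


def csv_to_markdown(csv_file, template=BLOCK_TEMPLATE):
    """Read rows from an open csv file and return the filled-in markdown text."""
    reader = csv.DictReader(csv_file)  # uses the header row for column names
    blocks = [template.substitute(en=row["en"], mr=row["mr"]) for row in reader]
    return "".join(blocks)


# Usage with an in-memory csv; a real run would open db.csv instead.
sample = io.StringIO("en,mr\nwater,पाणी\n")
markdown = csv_to_markdown(sample)
```

Extra columns in the csv are ignored, so the same function keeps working if more columns are added later.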
from marathi-shabd.
Do you have the markdown file template ready?
from marathi-shabd.
Do you have the markdown file template ready?
Yes. It's there in the template folder. Not in a template shape right now but more like an example.
If you want the exact template with placeholders, then I will prepare it later today. But it won't be much different from the example file that is present there currently. If it suits you, you can begin with that and later update your script once the final template is ready.
The example file shows 3 different ways to arrange the output. Please use the 1st option for now.
from marathi-shabd.
template is added. pls check the explanation in the readme file in the template folder.
from marathi-shabd.
I merged part of the PR #13 into main branch. tested ok at my end.
I will close the PR.
There are some enhancements that can be done. I will think and let you know.
from marathi-shabd.
Hey, do you have anything for me? I have time to work on the project.
from marathi-shabd.
i have hosted the website with some dummy links under the "browse" section. pls have a look at it. you'll get an idea of the type of outputs we need to generate.
from marathi-shabd.
I have merged pr #18. Thanks!
I will now create a .py file with pseudo code for the parent function to make output md file for entire library and other types of output files (topic, alphabetical etc.). You can then use that to write your code.
from marathi-shabd.
@zarbod hi, did you see the .py files I added in the src folder and the pseudo code added in those? I have updated part of the db.csv file and would like to create at least the md output file for all words (the entire library link on the website). Pls let me know when you are planning to implement those scripts. In case any part is not clear, let me know.
from marathi-shabd.
thanks. please make sure to pull the latest repo first since I made some updates.
also on the website as of now 3 links point to dummy files (entire lib, topics and "a" initial words).
Once your scripts are ready, we can run those on the db.csv file and put the md files containing the actual words from the database onto these links :)
from marathi-shabd.
@zarbod now that the filter script is done, we could continue with the gen-out and gen-block files so that we can use them together to generate the specific MD files.
Let me know if you can start on these.
from marathi-shabd.
@zarbod
Have you started working on this? If it is going to take you some time, let me know. In that case, I will also try writing the program on my side in the meantime, because I want to upload the browse site pages as soon as possible.
from marathi-shabd.
That works. 👍🏼
from marathi-shabd.
Closing this since the basic operation is working fine. Will open separate issues for specific enhancements.
from marathi-shabd.