
Comments (10)

sanketgarade avatar sanketgarade commented on July 17, 2024

purpose

to filter the input csv data as per the provided arguments

filter types

  • invalid data
  • all words
  • specific topic
  • specific alphabet (the initial letter of the english word)
  • TBD (more can be added)

about invalid data filter

the invalid data filter must always run before any other filter, because it removes the data elements that have insufficient or invalid data

  • insufficient data - either the english or the marathi word is missing

  • invalid data - the english word contains non-english characters, and likewise for the marathi word (this can be thought of later, and is low priority right now)
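A minimal sketch of the "insufficient data" check described above, assuming the csv rows are loaded as dicts (for example with csv.DictReader). The column names `en` and `mr` and the function name `filter_invalid` are hypothetical; the real headers in the project may differ. The character-level "invalid data" check is left out since it is low priority.

```python
# Hypothetical column names; adjust to the actual csv headers.
EN_COL = "en"
MR_COL = "mr"


def filter_invalid(rows):
    """Keep only rows where both the english and the marathi word are present."""
    kept = []
    for row in rows:
        # Treat a missing key, None, or whitespace-only text as "missing".
        en = (row.get(EN_COL) or "").strip()
        mr = (row.get(MR_COL) or "").strip()
        if en and mr:
            kept.append(row)
    return kept
```

Examples, tags, and any other columns are deliberately ignored here, so a row with both words but no example is still kept.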

IMP

  • make separate functions for each filter type
  • call the relevant function according to the "filter type" argument that was passed

steps

  1. take passed csv data and filter type as argument
  2. generate a truncated csv data structure as per the target filter
     • call the specific filter function here
  3. return this truncated data to the calling function
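The steps above could be sketched roughly like this. It is only an illustration, not the code in gen-out.py: the names `apply_filter` and `FILTERS`, the filter type strings, and the `en`/`mr` column names are all assumptions. The invalid data filter is run first, then the function matching the filter type is looked up and called.

```python
def filter_invalid(rows):
    """Drop rows where the english or marathi word is missing (assumed columns)."""
    return [r for r in rows
            if (r.get("en") or "").strip() and (r.get("mr") or "").strip()]


def filter_all(rows):
    """'all words': no filtering beyond the invalid data filter."""
    return list(rows)


def filter_alphabet(rows, letter):
    """Keep rows whose english word starts with the given letter."""
    letter = letter.lower()
    return [r for r in rows if r["en"].strip().lower().startswith(letter)]


# Hypothetical mapping from filter type to its function.
FILTERS = {"all": filter_all, "alphabet": filter_alphabet}


def apply_filter(rows, filter_type, **options):
    """Run the invalid data filter, then the filter selected by filter_type."""
    if filter_type not in FILTERS:
        raise ValueError(f"unknown filter type: {filter_type!r}")
    valid_rows = filter_invalid(rows)
    # Extra keyword arguments (e.g. letter="a") go to the selected filter.
    return FILTERS[filter_type](valid_rows, **options)
```

With a lookup table like this, adding a new filter type means writing one function and adding one entry, without touching `apply_filter`.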

from marathi-shabd.

zarbod avatar zarbod commented on July 17, 2024

Hey so when you say "insufficient data" do you mean missing English or Marathi words exclusively, or does it also include missing examples and tags?


sanketgarade avatar sanketgarade commented on July 17, 2024

only the main 2 words. 1 en and 1 mr.


zarbod avatar zarbod commented on July 17, 2024

Thanks. Also could you explain what the "all words" filter is supposed to do?


sanketgarade avatar sanketgarade commented on July 17, 2024

> Thanks. Also could you explain what the "all words" filter is supposed to do?

"All words" basically means no filtering (other than the invalid/insufficient data, of course).


zarbod avatar zarbod commented on July 17, 2024

So I would just call the invalid/insufficient data scripts when the filter type is "All words"?


sanketgarade avatar sanketgarade commented on July 17, 2024

Yes. Pretty much.


zarbod avatar zarbod commented on July 17, 2024

Also, the filter by topic function will require the topic as an argument. Do you want me to add an optional topic argument to the main filter function?


sanketgarade avatar sanketgarade commented on July 17, 2024

Yes, you can do it in whichever way makes the functions easy to use and reusable.

What I've written in the gen-out.py file is just a basic example.
You can fill in the details and the gaps.
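One possible shape for the optional topic argument discussed above, purely as a sketch. The `topic` column name, the function names, and the choice to raise an error when the topic is missing are assumptions, not something decided in this thread.

```python
def filter_by_topic(rows, topic):
    """Keep rows whose topic matches, ignoring case and surrounding spaces."""
    wanted = topic.strip().lower()
    return [r for r in rows
            if (r.get("topic") or "").strip().lower() == wanted]


def filter_data(rows, filter_type, topic=None):
    """Main entry point; topic is only needed when filter_type is 'topic'."""
    if filter_type == "topic":
        if topic is None:
            # Assumed behaviour: fail loudly rather than silently return everything.
            raise ValueError("filter type 'topic' requires a topic argument")
        return filter_by_topic(rows, topic)
    if filter_type == "all":
        return list(rows)
    raise ValueError(f"unsupported filter type: {filter_type!r}")
```

Keeping `filter_by_topic` as its own function means it can be reused directly, while callers that only know the filter type go through `filter_data`.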


sanketgarade avatar sanketgarade commented on July 17, 2024

pending issues from PR #32

among these, the priority ones are # 1 and # 3

