Code Monkey home page Code Monkey logo

kathabhidhana's Introduction

Kathabhidhana: Audio recording for Odia Wiktionary

What comes to your mind when you think of a dictionary? A huge boring book that you never wanted to open? Or may be a mobile app that you open while struggling with understanding a few new words in any write-up? But think of a dictionary that also pronounces the words rather than just showing them in [International Phonetic Alphabet] (https://en.wikipedia.org/wiki/International_Phonetic_Alphabet) (IPA).

Wikipedia has a sister project called Wiktionary. And it's multilingual. Though it's easier to search any word in Google with a suffix "meaning" to hear the pronunciation of the word, there are not many [open-licensed] (https://en.wikipedia.org/wiki/Open_Content_License) audio recordings that you can hear, download, and even use for your own work. Kathabhidhana is a community project led by [Subhashish Panigrahi] (http://meta.wikimedia.org/wiki/User:Psubhashish/) to create an open source solution for recording large chunks of words and then uploading them under open licenses so that they can useful for projects like Wiktionary. The project draws its inspiration largely from another open source [software] (https://github.com/tshrinivasan/voice-recorder-for-tawictionary) created by by [Shrinivasan T] (https://github.com/tshrinivasan).

Currently several Odia-language words are being [recorded] (https://commons.wikimedia.org/wiki/Category:Odia_pronunciation), uploaded on [Wikimedia Commons] (https://commons.wikimedia.org), and are being used in [Odia Wiktionary] (https://or.wiktionary.org), the Odia-language version of Wiktionary. The purpose of creating this audio library is multi-folded—apart from using them on Wiktionary, we also aim at using them for any [Natural Language Processing] (http://en.wikipedia.org/wiki/Natural Language Processing) (NLP) project (and you are free to use [with attribution] (https://github.com/OdiaWikimedia/Kathabhidhana/blob/master/README.md#attribution) any resource available in this page).

alt tag

An Odia version of the resources and tutorial is available here. We are currently working on building more tutorials so that you can learn more about bettering your home studio setup—assuming you don't have access to a fancy recording studio but if you have please do leverage that, tips and tricks about batch renaming files, cleaning up using open source tools like [Audacity] (http://www.audacityteam.org/download/), setting up files for batch upload on Wikimedia Commons, etc. So stay tuned.

Prerequisites

  • Linux or macOS
  • Linux running in a virtual machine

or

For Kathabhidhana for iOS

  • iOS
  • An app called Workflow

How to execute?

(you need to run the command in Linux or Mac, or Linux in a [virtual machine] (https://en.wikipedia.org/wiki/Virtual_machine) if you're on Windows) [Read in Odia] (https://goo.gl/hqXeG3)

  1. Fill the words you want to recoed in a textfile named "file"
  2. run the below command

python voice-record.py 2> err

this will record the sounds in ogg and wav formats.

  1. To upload all the ogg files to Wikimedia Commons

  2. a) Edit the file mediawiki-uploader.py Fill the commons api url, username and password

  3. b) run the below command python mediawiki-uploader.py

Attribution

Other resources

Shoutouts

kathabhidhana's People

Contributors

pattaprateek avatar psubhashish avatar tshrinivasan avatar gyan111 avatar

Stargazers

Tcheneghon Motcheyo Herman avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.