common-voice-tool

Common Voice Tool - CLI Tool
GUI - Coming soon!

[Click here for the Italian Version]

Preliminary operations

NOTE: THIS IS A BETA VERSION!
LAST BETA: v0.4.1

You need to clone this repo:

git clone https://github.com/dag7dev/common-voice-tool

Beta branch:

git clone --single-branch -b beta https://github.com/dag7dev/common-voice-tool

BASH VERSION

cd common-voice-tool
cd BASH
sudo chmod 755 common-voice-tool.sh

You can finally run this script ;)

Python version

This version can come in handy if you're on a system that can't run bash scripts (e.g. Windows, unless you use WSL or similar tools). It was made by jotaro-sama to automate checking whether the sentences to be submitted to Mozilla's Common Voice are properly formatted. Any version of Python 3.x should be fine to run it. Just pass the file with the sentences to the script like this:

python3 common-voice-tool.py sentences.txt      #Most GNU/Linux distros, macOS
py -3 common-voice-tool.py sentences.txt        #Windows Command Prompt

The exact commands may vary depending on how you configured your system.

At the moment, the Python version works a bit differently from the bash one, but it's fully functional. It automatically formats the file (putting the output in out.txt) so that:

  • There are no empty lines.
  • There are no double spaces, spaces before the final dot or spaces at the end of the lines.
  • All lines are capitalized.
  • All lines end with a dot.

It also notifies you whether any sentences exceed the 125-character length (dot included), which was the maximum length for sentences to be passed to Common Voice.
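
The formatting rules above can be sketched in a few lines of Python. This is a minimal illustration, not the code in common-voice-tool.py: the names `format_sentence`, `format_lines` and `MAX_LENGTH` are assumptions made for this example, and so are the order in which the rules are applied and the choice to handle only "." as end punctuation.

```python
import re

MAX_LENGTH = 125  # maximum sentence length stated above, dot included


def format_sentence(line):
    """Apply the formatting rules to one line; return None for empty lines."""
    # Collapse double spaces and drop spaces at the start and end of the line.
    text = re.sub(r"\s+", " ", line).strip()
    if not text:
        return None
    if text.endswith("."):
        # Remove any space left before the final dot.
        text = text[:-1].rstrip() + "."
    else:
        # Assumption: only "." counts as a final dot in this sketch.
        text = text + "."
    # Capitalize the first character and leave the rest untouched.
    return text[:1].upper() + text[1:]


def format_lines(lines):
    """Return (formatted sentences, sentences longer than MAX_LENGTH)."""
    formatted = []
    too_long = []
    for line in lines:
        sentence = format_sentence(line)
        if sentence is None:
            continue  # empty lines are dropped
        formatted.append(sentence)
        if len(sentence) > MAX_LENGTH:
            too_long.append(sentence)
    return formatted, too_long
```

With this sketch, `format_lines(["hello  world .", ""])` would keep a single sentence, "Hello world.", and report nothing as too long.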

CLI USAGE

If you run the bash script without parameters you will get this (after language selection):

./common-voice-tool
usage: ./common-voice-tool <options>
  -h or -help
      Show this message
  -range or -chkLen
      Check if there are lines in the file which exceed a maximum length.
  -trim
      Trim whitespace at the end of the lines.
  -chkPoint
      Check if all rows in the file end with a dot (doesn't replace it, just checks).
  -ac
      Add a dot to the rows not ending with one.
  -noEmpty
      Remove all the empty lines.
  -capitalize
      Capitalize the first character of every sentence.

To run this script you need to include at least one parameter.
For example:

./common-voice-tool -range

will check if there are lines in the file which exceed a maximum length.

You don't need to pass the filename as a parameter, as the script will prompt you to choose a file when launched.

You can run this script with multiple parameters.
As an example:

./common-voice-tool -range -noEmpty

will check if there are lines exceeding a maximum length and remove all the empty lines.

LIST OF PARAMETERS

Parameter     What it does
-h            Show help (same as running without parameters).
-range        Check if there are lines exceeding a maximum length.
-chkLen       Same as -range.
-trim         Trim whitespace at the end of every row.
-chkPoint     Check if all rows in the file end with a dot (check only).
-ac           Same as -chkPoint, but adds the dot if it's missing.
-noEmpty      Remove all the empty lines.
-capitalize   Capitalize the first character of every sentence.
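
Since several parameters can be combined in one run, one way to picture the list above is as a mapping from each flag to an independent operation on the lines of the file. The Python sketch below is purely illustrative: the bash script's internals are not shown here, so the `OPERATIONS` table, the `run` function, the `MAX_LEN` default of 125 (borrowed from the Python version's limit), the handling of empty lines, and the choice to apply operations in the order the flags are given are all assumptions.

```python
MAX_LEN = 125  # assumed default, borrowed from the Python version's limit


def check_length(lines):
    # -range / -chkLen: report only, the lines are not modified.
    report = [f"line {i} exceeds {MAX_LEN} characters"
              for i, line in enumerate(lines, start=1) if len(line) > MAX_LEN]
    return lines, report


def trim(lines):
    # -trim: strip whitespace at the end of every line.
    return [line.rstrip() for line in lines], []


def check_dot(lines):
    # -chkPoint: report non-empty lines that do not end with a dot.
    report = [f"line {i} does not end with a dot"
              for i, line in enumerate(lines, start=1)
              if line and not line.endswith(".")]
    return lines, report


def add_dot(lines):
    # -ac: append a dot where it is missing (empty lines are left alone).
    return [line if not line or line.endswith(".") else line + "."
            for line in lines], []


def no_empty(lines):
    # -noEmpty: drop empty or whitespace-only lines.
    return [line for line in lines if line.strip()], []


def capitalize(lines):
    # -capitalize: uppercase the first character of every line.
    return [line[:1].upper() + line[1:] for line in lines], []


# Hypothetical flag table; both spellings of the length check share one function.
OPERATIONS = {
    "-range": check_length, "-chkLen": check_length, "-trim": trim,
    "-chkPoint": check_dot, "-ac": add_dot,
    "-noEmpty": no_empty, "-capitalize": capitalize,
}


def run(flags, lines):
    """Apply each flag's operation in order; collect the check messages."""
    messages = []
    for flag in flags:
        lines, report = OPERATIONS[flag](lines)
        messages.extend(report)
    return lines, messages
```

Under these assumptions, `run(["-range", "-noEmpty"], lines)` mirrors the two-parameter example: it reports over-long lines first and then returns the lines with the empty ones removed.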

NOTES

This script MUST be used ONLY with plain TEXT files.
You can select your language by choosing the right country code when you run the script. The 'lang' folder must exist and contain at least one file.

What branch should I use?

The beta branch is experimental, and it is the one I update regularly. The master branch is unstable, even though I considered using it exclusively. Until the beta becomes stable, I won't push to master and there won't be a release.

WIP

Todo:

  • Split lines automatically
  • Localization DONE
  • Capitalize the first letter at the beginning of each row. DONE
  • Remove empty lines. DONE
  • Check each row's length while adding a dot at the end of each row. DONE

Can I contact you?

Sure! You can find my email address inside the source code!
You can also contact me here, on GitHub!
Let me know if you love this software or if it has something which needs a fix!

Why this tool?

This tool was meant to help people prepare strings to be used with Mozilla's Common Voice project (check it out, it's really cool!).
It can help you check line length, add a full stop (when needed), and do several other useful things.

How can I help you?

Submit issues and give me more ideas about implementing new features! :)
