readalongs / studio
Audiobook alignment for Indigenous languages
Home Page: https://readalongs.github.io/Studio/
License: Other
Currently coveralls makes you go way deep down into the directory structure just to see the relevant files:
There's a way to set the root directory so that it does away with all those files and also lets you see each file's coverage. Right now it just says:
The file "/home/travis/build/dhdaines/ReadAlong-Studio/readalongs/align.py" isn't available on github. Either it's been removed, or the repo root directory needs to be updated.
I think the solution is here and that if you're logged in as the owner (@dhdaines) you should see somewhere to change the root.
The TokenizerLibrary class (readalongs/text/tokenize_xml.py) is currently initializing with every single available mapping. See its constructor:
def __init__(self):
    self.tokenizers = {None: DefaultTokenizer()}
    for x in MAPPINGS_AVAILABLE:
        mapping = Mapping(in_lang=x["in_lang"], out_lang=x["out_lang"])
        tokenizer_key = self.get_tokenizer_key(x["in_lang"], x["out_lang"])
        self.tokenizers[tokenizer_key] = Tokenizer(mapping)
We should instead load only the mappings needed on the path from the input language to eng-arpabet.
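One way to avoid loading every mapping up front is to build each tokenizer the first time it is requested. The sketch below is only an illustration: `LazyTokenizerLibrary` and the `make_tokenizer` factory are hypothetical names, and the stand-in factory in the usage note replaces the real `Tokenizer(Mapping(...))` construction.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical key type: one entry per (in_lang, out_lang) pair.
TokenizerKey = Tuple[Optional[str], Optional[str]]


class LazyTokenizerLibrary:
    """Builds tokenizers on first use instead of in the constructor."""

    def __init__(self, make_tokenizer: Callable[[str, str], object], default: object = None):
        # make_tokenizer is an assumed factory taking (in_lang, out_lang).
        self._make_tokenizer = make_tokenizer
        self._tokenizers: Dict[TokenizerKey, object] = {(None, None): default}

    def get(self, in_lang: Optional[str], out_lang: Optional[str]) -> object:
        key = (in_lang, out_lang)
        if key not in self._tokenizers:
            # Only now is the (possibly expensive) mapping loaded.
            self._tokenizers[key] = self._make_tokenizer(in_lang, out_lang)
        return self._tokenizers[key]
```

In Studio the factory could plausibly be `lambda i, o: Tokenizer(Mapping(in_lang=i, out_lang=o))`, called only for the steps on the path to eng-arpabet.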
Marc found a situation where silence splitting made the first word span half the DNA range at the beginning of a file. Notice in the data below that [0,4576] is DNA audio, but the first word in the SMIL file starts exactly at 4576/2=2.288s.
Config:
{
    "do-not-align": {
        "method": "remove",
        "segments": [
            {
                "begin": 0,
                "end": 4576
            },
            {
                "begin": 11726,
                "end": 26267
            }
        ]
    }
}
Command:
readalongs align --config ./s0387_intro2.json --debug --save-temps --force-overwrite --language iku s0387_intro2.xml s0387_intro2.mp3 s0387_intro2.config 2> s0387_intro2.out.config
Data files: Eric received the data files to reproduce this by e-mail from Marc on 2021-04-19.
Output:
<smil xmlns="http://www.w3.org/ns/SMIL" version="3.0">
<body>
<par id="par-t0b0d0p0s0w0">
<text src="s0387_intro2.config.xml#t0b0d0p0s0w0"/>
<audio src="s0387_intro2.config.mp3" clipBegin="2.288" clipEnd="5.016"/>
</par>
<par id="par-t0b0d0p0s0w1">
<text src="s0387_intro2.config.xml#t0b0d0p0s0w1"/>
<audio src="s0387_intro2.config.mp3" clipBegin="5.016" clipEnd="5.526"/>
</par>
<par id="par-t0b0d0p0s0w2">
<text src="s0387_intro2.config.xml#t0b0d0p0s0w2"/>
<audio src="s0387_intro2.config.mp3" clipBegin="5.526" clipEnd="6.806"/>
</par>
<par id="par-t0b0d0p0s0w3">
<text src="s0387_intro2.config.xml#t0b0d0p0s0w3"/>
<audio src="s0387_intro2.config.mp3" clipBegin="6.806" clipEnd="8.201"/>
</par>
<par id="par-t0b0d0p0s0w4">
<text src="s0387_intro2.config.xml#t0b0d0p0s0w4"/>
<audio src="s0387_intro2.config.mp3" clipBegin="8.201" clipEnd="9.301"/>
</par>
<par id="par-t0b0d0p0s0w5">
<text src="s0387_intro2.config.xml#t0b0d0p0s0w5"/>
<audio src="s0387_intro2.config.mp3" clipBegin="9.301" clipEnd="10.126"/>
</par>
<par id="par-t0b0d0p0s0w6">
<text src="s0387_intro2.config.xml#t0b0d0p0s0w6"/>
<audio src="s0387_intro2.config.mp3" clipBegin="10.126" clipEnd="10.606"/>
</par>
<par id="par-t0b0d0p0s0w7">
<text src="s0387_intro2.config.xml#t0b0d0p0s0w7"/>
<audio src="s0387_intro2.config.mp3" clipBegin="10.606" clipEnd="11.716"/>
</par>
</body>
</smil>
Some issues are coming up in code reviews around stylistic choices (e.g., single quotes vs double quotes). We should use a consistent formatter to reduce style noise in commits.
We should add a citation for the readalongs paper.
Currently, in readalongs/align.py we use dicts to store words, typically with a start, end and then id or text. This is confusing and difficult to use and document correctly. See discussion starting with "Orthogonal to this PR" in #119.
Implementation to consider: reimplement a Word class using @dataclass and have the .text attribute calculated (and cached) on use, so that we just have one Word object type with the different query attributes we need, thus making it a lot more intuitive to both document and use.
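A minimal sketch of what such a class could look like, assuming Python 3.8+ for `functools.cached_property`. The `text_lookup` callable is a hypothetical stand-in for however the word's text would actually be retrieved from the XML; it is not part of the existing code.

```python
from dataclasses import dataclass, field
from functools import cached_property
from typing import Callable, Optional


@dataclass
class Word:
    """One aligned word; times are in seconds."""

    id: str
    start: float
    end: float
    # Hypothetical hook: maps a word id to its text.
    text_lookup: Optional[Callable[[str], str]] = field(
        default=None, repr=False, compare=False
    )

    @cached_property
    def text(self) -> str:
        # Computed on first access, then stored on the instance.
        if self.text_lookup is None:
            return ""
        return self.text_lookup(self.id)
```

Note that `cached_property` stores its result in the instance `__dict__`, so this approach would not combine with `slots=True`.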
The last release was a 50 MB bundle because it now includes the cmusphinx model as part of the release.
This is OK, but we cannot let the bundle get much bigger, especially since PyPI has a size limit of 60 MB by default.
See https://www.dampfkraft.com/code/distributing-large-files-with-pypi.html for a potential solution, like having the model as a downloadable file on the GitHub release instead of inside the PyPI bundle.
See #36
If a mapping doesn't use NFD normalization, then alignment fails for some languages.
We want to generate readalongs that can be used entirely standalone, without requiring a stable internet connection.
We want both access to the latest features and a stable component, so we have to figure out how to give creators of readalongs the choice between a stable version and the latest version.
Why stable, and offline-friendly?
Currently, ReadAlong-Studio requires NFD. This creates problems with the g2p module because mappings in that module can use NFC, NFD, NFKC or NFKD. ReadAlong-Studio should just look at the mapping of any given language it's working with and use its declared standard.
This was brought up in PR https://github.com/dhdaines/ReadAlong-Studio/pull/8 - we might want to replace the English g2p system we use. g2p_en was proposed, others?
This is an interim solution, but we should have some way of dealing with English input.
When a plain text file starts with one or more blank lines, blank pages are inserted before the first page of the RA.
When there are 3+ consecutive blank lines mid-text, blank pages are inserted in the middle.
When there are 2+ blank lines at the end of the plain text, blank pages are inserted after the RA.
While we can ask the user to remove such extraneous blank lines, it would be better UX with fewer gotchas to ignore them with these rules:
In short, never create an empty <div type="page"></div> element in the output.
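The idea can be sketched as follows. This is not the Studio implementation: it assumes the plain-text convention that one blank line separates paragraphs and two or more blank lines separate pages, and `split_pages` is a hypothetical helper name.

```python
import re
from typing import List

# A newline followed by two or more blank lines ends a page (assumed convention).
PAGE_BREAK = re.compile(r"\n(?:[ \t]*\n){2,}")
# A newline followed by one or more blank lines ends a paragraph.
PARA_BREAK = re.compile(r"\n(?:[ \t]*\n)+")


def split_pages(text: str) -> List[List[str]]:
    """Return pages as lists of paragraphs, never yielding an empty page."""
    pages = []
    # strip() discards leading and trailing blank lines up front.
    for chunk in PAGE_BREAK.split(text.strip()):
        paragraphs = [p.strip() for p in PARA_BREAK.split(chunk) if p.strip()]
        if paragraphs:  # extra blank lines never become a page
            pages.append(paragraphs)
    return pages
```

Because a run of blank lines of any length is consumed as a single page break, extraneous blank lines at the start, middle, or end of the file cannot produce an empty page.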
With the change of output for readalongs align from a bunch of files to a directory containing them, the -f option no longer works:
$ readalongs align -l fra -s -f -i in.txt in.mp3 out
INFO - Server initialized for eventlet.
Usage: readalongs align [OPTIONS] INPUTFILE WAVFILE OUTPUT_BASE
Error: Output folder 'out' already exists
Expected behaviour: with -f, the output folder is allowed to exist already, and files therein should get overwritten.
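The expected behaviour could be expressed roughly like this. It is a sketch under assumptions: `prepare_output_dir` and its `force_overwrite` parameter are hypothetical names, not the actual code in readalongs/cli.py.

```python
import os


def prepare_output_dir(output_dir: str, force_overwrite: bool) -> str:
    """Create output_dir, tolerating an existing folder only when forced."""
    if os.path.exists(output_dir):
        if not os.path.isdir(output_dir):
            raise FileExistsError(f"'{output_dir}' exists but is not a folder")
        if not force_overwrite:
            raise FileExistsError(f"Output folder '{output_dir}' already exists")
        # With force_overwrite, reuse the folder; files in it get overwritten later.
    else:
        os.makedirs(output_dir)
    return output_dir
```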
This issue is two-fold. One, the testAlignText test is not passing. Two, Travis is giving a false positive.
This issue is specific to Windows; the problem does not occur on *nix systems, because of how files are opened. On Windows, a write file handle has an exclusive lock, preventing a read file handle from being opened on the same file before the write handle is closed. As a consequence, the way we use NamedTemporaryFile in the readalongs/align.py methods create_input_xml() and create_input_tei() is broken on Windows.
For cross-platform compatibility, this is the required workflow, which I know is a bit of a pain:
Open tempfile=NamedTemporaryFile(..., delete=False)
Write to tempfile
Close tempfile (which would cause the file to be deleted if we had used delete=True)
Open tempfile.name for reading
Keep tempfile.name wherever we still need it, to delete it once we're actually done with it.
Yuck, I know.
Fixing this should solve issue #20, which I believe is another symptom of the same problem.
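The workflow above can be sketched as follows. `write_then_read` is a hypothetical helper used only to show the order of operations; the real create_input_xml() and create_input_tei() would return the file name and defer the deletion to their caller.

```python
import os
from tempfile import NamedTemporaryFile


def write_then_read(content: str) -> str:
    """Write content to a temp file, close it, then reopen it for reading."""
    tmp = NamedTemporaryFile(
        mode="w", suffix=".xml", delete=False, encoding="utf-8"
    )
    try:
        tmp.write(content)
        tmp.close()  # release the write handle before reopening (needed on Windows)
        with open(tmp.name, encoding="utf-8") as reader:
            return reader.read()
    finally:
        tmp.close()  # harmless if already closed
        os.remove(tmp.name)  # delete=False means we clean up ourselves
```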
To reproduce, on Windows only:
cd test
python test_force_align.py
Observed behaviour:
We should be able to specify that g2p for certain languages will come from particular RESTful endpoints. The endpoints will be documented in an OAS 3.0 spec.
Maybe something like readalongs align sample.xml sample.wav output -g2p eng=https://www.sample.com/api/v1/g2p
Related Issue: roedoejet/g2p#11
Related PR: https://github.com/dhdaines/ReadAlong-Studio/pull/8
Given a single-file HTML bundle, we need to be able to extract the XML, SMIL, audio and image files.
Use case: you receive a readalong by e-mail, and you want to inspect it, and maybe apply manual corrections to the alignment, or add pictures, or do any other changes to it.
I created a file with silences and some DNA text, and I get a KeyError with a stack trace trying to align it.
To reproduce: readalongs align data/ej-fra-dna-silence.xml data/ej-fra.m4a sil-dna
data/ej-fra-dna-silence.xml:
<?xml version='1.0' encoding='utf-8'?>
<TEI>
<!-- To exclude any element from alignment, add the do-not-align="true" attribute to
it, e.g., <p do-not-align="true">...</p>, or
<s>Some text <foo do-not-align="true">do not align this</foo> more text</s> -->
<text xml:lang="fra">
<body>
<div type="page">
<p>
<s><silence dur="1"/>Bonjour.</s>
<s>Je m'appelle Éric Joanis.</s>
<s>Je suis <silence dur="1.382s"></silence> programmeur au sein <silence dur="500ms"></silence> de l'équipe des technologies pour les langues autochtones au CNRC.</s>
</p>
</div>
<div type="page">
<anchor time="28.6s"/>
<p do-not-align="true">
<s>J'ai fait une bonne partie de ma carrière en traduction automatique statistique, mais maintenant cette approche est déclassée par l'apprentissage profond.</s>
<s>En ce moment je travaille à l'alignement du hansard du Nunavut pour produire un corpus bilingue anglais-inuktitut.</s>
<s>Ce corpus permettra d'entraîner la TA, neuronale ou statistique, ainsi que d'autres applications de traitement du langage naturel.</s>
</p>
<anchor time="50.2s"/>
<p>
<s>En parallèle, j'aide à écrire des tests pour rendre le ReadAlong-Studio plus robuste.</s>
</p>
</div>
</body>
</text>
</TEI>
Traceback:
Traceback (most recent call last):
File "C:\Users\joanise\RAS\ras-env\Scripts\readalongs-script.py", line 11, in <module>
load_entry_point('readalongs', 'console_scripts', 'readalongs')()
File "c:\users\joanise\ras\ras-env\lib\site-packages\click\core.py", line 1137, in __call__
return self.main(*args, **kwargs)
File "c:\users\joanise\ras\ras-env\lib\site-packages\flask\cli.py", line 596, in main
return super().main(*args, **kwargs)
File "c:\users\joanise\ras\ras-env\lib\site-packages\click\core.py", line 1062, in main
rv = self.invoke(ctx)
File "c:\users\joanise\ras\ras-env\lib\site-packages\click\core.py", line 1668, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "c:\users\joanise\ras\ras-env\lib\site-packages\click\core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "c:\users\joanise\ras\ras-env\lib\site-packages\click\core.py", line 763, in invoke
return __callback(*args, **kwargs)
File "c:\users\joanise\ras\ras-env\lib\site-packages\click\decorators.py", line 26, in new_func
return f(get_current_context(), *args, **kwargs)
File "c:\users\joanise\ras\ras-env\lib\site-packages\flask\cli.py", line 440, in decorator
return __ctx.invoke(f, *args, **kwargs)
File "c:\users\joanise\ras\ras-env\lib\site-packages\click\core.py", line 763, in invoke
return __callback(*args, **kwargs)
File "c:\users\joanise\ras\studio\readalongs\cli.py", line 267, in align
verbose_g2p_warnings=kwargs["g2p_verbose"],
File "c:\users\joanise\ras\studio\readalongs\align.py", line 452, in align_audio
words_dict[el.attrib["id"]]["end"] * 1000
KeyError: 't0b0d1p0s0w0'
BUG reproduce:
If you follow the Docker instructions under the Usage guidance:
git clone ...
cd ...
docker build . --tag=readalong-studio
You will see an error as:
ImportError: cannot import name 'ContextVar' from 'werkzeug.local' (/usr/local/lib/python3.7/dist-packages/werkzeug/local.py)
Potential Cause:
This is due to the Werkzeug==0.16.0 requirement in the Dockerfile. That version of Werkzeug will not work with the latest Flask framework.
Potential Solution:
One workaround is to comment out the related part in the Dockerfile
#RUN python3 -m pip uninstall -y Werkzeug
#RUN python3 -m pip install Werkzeug==0.16.0
Issues:
It is unclear whether the Werkzeug package is even necessary, as I don't find it in requirements.txt and the pip readalongs version is working fine.
Hi,
I am currently looking for a way/solution to be able to read a-long my ebooks.
I have an audiobook for each of my ebooks.
My question now is, (since I am not a programmer myself and I am not sure if I understand this correctly) can this software be used to basically create a "whisper sync" like feature, to be able to read a-long in the same way?
best,
gelsas
The TokenizerLibrary class in readalongs/text/tokenize_xml.py overwrites tokenizers with the same input language. So, for example, if there is a mapping from git to git-ipa and then a mapping from git to git-apa, the git->git-ipa one is overwritten by the git-apa one. I put a bandaid on this here: a3c414d but this won't work if we want to tokenize/align through anything other than an ipa mapping. It also requires out_lang to end in -ipa. This should be more robust.
After aligning and then running readalongs epub output.smil output.epub, I try to read the epub file using calibre (v3.39.1) and get the following error:
calibre, version 3.39.1
ERROR: Could not open e-book: Failed to read book, /Users/pinea/Calibre Library/Unknown/output (5)/output - Unknown.epub click "Show Details" for more information
Traceback (most recent call last):
File "site-packages/calibre/utils/ipc/simple_worker.py", line 289, in main
File "site-packages/calibre/ebooks/oeb/iterator/book.py", line 65, in extract_book
File "site-packages/calibre/customize/conversion.py", line 244, in __call__
File "site-packages/calibre/ebooks/conversion/plugins/epub_input.py", line 344, in convert
ValueError: No valid entries in the spine of this EPUB
Reproduce:
$ readalongs align -i data/ej-fra.txt data/ej-fra.m4a delme -f
...
File "c:\users\joanise\ras\studio\readalongs\align.py", line 387, in align_audio
final_end = end
UnboundLocalError: local variable 'end' referenced before assignment
Should output CLI error: "missing -l option".
Seems related:
$ readalongs align data/ej-fra.xml data/noise_at_1500.mp3 delme -f
...
File "c:\users\joanise\ras\studio\readalongs\align.py", line 387, in align_audio
final_end = end
UnboundLocalError: local variable 'end' referenced before assignment
Should output: "no non-noise segments found" or something like it.
Dealing with filename conflicts is tougher if the files are just exported to the current directory. output-base should be a valid path to a place where a directory with that name could be created. So instead of files being created like output-base.wav and output-base.smil, we should create output-base/output-base.wav and output-base/output-base.smil.
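As a rough illustration of the proposed layout, the file paths could be derived from output-base like this. `output_paths` is a hypothetical helper, and the extension list is just the two examples given above.

```python
import os
from typing import Dict, Iterable


def output_paths(output_base: str, extensions: Iterable[str] = ("wav", "smil")) -> Dict[str, str]:
    """Map each extension to output-base/<name>.<ext>, where <name> is the last path component."""
    # normpath drops any trailing separator so basename is never empty.
    name = os.path.basename(os.path.normpath(output_base))
    return {ext: os.path.join(output_base, f"{name}.{ext}") for ext in extensions}
```

For example, `output_paths("out/story")["smil"]` gives the path of `story.smil` inside the `out/story` directory.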
In requirements.txt, we just say g2p>=0.5.*, which has been convenient and useful, but now we have a breaking change that renders this inadequate. In branch dev.g2p-cascade, I now depend on is_arpabet() and other code introduced in g2p v0.5.20210514. How can we declare this requirement without breaking our way of doing locally installed sandboxes?
Can I just say g2p>=0.5.20210514, or do we need to bump g2p to 0.6?
Thanks to Marc Tessier for noticing this issue.
From Del: "not sure .. if I am misreading the code but the web interface of read-aslong-studio does not actually call the align method in step 3 after https://github.com/ReadAlongs/Studio/blob/master/readalongs/views.py#L175 I do not see the args being used .."
When running readalongs -h, the output combines and conflates CLI elements added by readalongs itself with those added by flask and, I believe, angular.
The --version option appears twice - the first comes from flask, the second from readalongs:
--version Show the flask version
--version Show the version and exit.
As for the five commands shown:
align and epub are defined in readalongs/cli.py.
routes appears to come from angular - is it relevant? Is the output readalongs routes produces meaningful?
run: I'm not sure where this comes from, since I cannot find the string 'Runs a development server.' anywhere in the code base, but we use it, so it's clearly relevant. But where does it come from?
shell: again, not sure where this is defined. I've never used it, what is it for exactly?
Currently, readalongs align fails when a word is converted to an empty string by the g2p module.
Error message:
ERROR - Alignment produced a different number of segments and tokens, please examine dictionary and input audio and text.
To reproduce this error, check out 76faf18 in g2p or any commit before the problem with "s" disappearing is fixed in French g2p, go to OpenSamples, and run:
readalongs align -i -s -f -l fra UDHR-Librivox/human_rights_un_frn-preamble.txt UDHR-Librivox/human_rights_un_frn_ezwa_64kb-preamble.mp3 output/UDHR-fra-preamble
The error in this specific example is due to word <w>s</w> (the 330th token in UDHR-fra-preamble.tokenized.xml, on line 37) turning into an empty string because of my g2p rule erasing word-final "s", including in this case where the whole word is "s". As a consequence, file UDHR-fra-preamble.dict skips from token t0b0d0p10s0w42 to t0b0d0p10s0w44, bypassing empty token t0b0d0p10s0w43, causing a mismatch between the number of tokens and dictionary entries.
Eventually, I'll fix the French g2p to not swallow "s", but Studio needs to handle this case gracefully. Options:
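One possible option, sketched under assumptions: detect tokens whose g2p output is empty before the dictionary is written, so that Studio can report them by id or exclude them consistently from both the dictionary and the token list. `split_empty_pronunciations` is a hypothetical helper, and the pronunciation strings in the usage note are made up for illustration.

```python
from typing import Dict, List, Tuple


def split_empty_pronunciations(
    pronunciations: Dict[str, str]
) -> Tuple[Dict[str, str], List[str]]:
    """Separate usable dictionary entries from token ids whose g2p output is empty."""
    kept: Dict[str, str] = {}
    empty: List[str] = []
    for token_id, pron in pronunciations.items():
        if pron.strip():
            kept[token_id] = pron
        else:
            empty.append(token_id)
    return kept, empty
```

Given entries for t0b0d0p10s0w42, t0b0d0p10s0w43 (empty) and t0b0d0p10s0w44, the second return value would name t0b0d0p10s0w43, which is enough to print a warning pointing at the exact word rather than the generic segment-count error.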
When I run this command in test/:
readalongs align -l fra -s -f -i data/ej-fra.txt data/ej-fra.m4a delme -t -C
The .xml file is fine, but the .eaf, .TextGrid, and _sentences.* files generated consider each sentence on the first page as sentences, but each word on the second page as an individual sentence.
The correct output should logically consider the same units as sentences in the .xml file and the TextGrid and cc/subtitle files.
When creating a ReadAlong using a config.json file with images, the image(s) specified should be copied automatically into the OUTPUT_BASE folder (or inside the assets folder?).
example config.json:
{ "images": { "0": "0.jpg" } }
Very minor issue to follow the Canadian Translation Bureau guidelines for capitalizing the word 'Indigenous' in the repo description. Currently: "Audiobook alignment for North American indigenous languages"
Capitalize the singular and plural forms of the nouns Status Indian, Registered Indian, Non-Status Indian and Treaty Indian, as well as the adjectives Indigenous and Aboriginal, when they refer to Indigenous people in Canada.
We should record public data from @littell @roedoejet @joanise @finguist.
This is because currently our e2e tests require data that is private and belongs to communities.
Afterwards, we should add the e2e test suite to be run by Travis.
Studio should output and read .ras files rather than just .xml.
The file format needs an actual DTD defining it.
This was brought up in PR https://github.com/dhdaines/ReadAlong-Studio/pull/8 - I think we should adopt some kind of standard for docstrings, for the sake of documentation and integration with Sphinx as well as just general consistency. The numpy standard was proposed.
I got this error when trying to align https://creeliteracy.org/wp-content/uploads/2020/07/Cover.m4a (from https://creeliteracy.org/2020/07/31/covid-safety-reminder-solomon-ratt-y-dialect/) in the studio.
The studio UI needs some work... it said this completed successfully and gave me a blank readalong widget :/
Here's the temp data:
CompletedProcess(args=['readalongs', 'align', '--force-overwrite', '--save-temps', '--text-grid', '--text-input', '--language', 'crk', '/var/folders/s1/y4p2fc9d1c9bv3nfjhgpvwch0000gq/T/tmp1u_lhndi/text.txt', '/var/folders/s1/y4p2fc9d1c9bv3nfjhgpvwch0000gq/T/tmp1u_lhndi/sol.wav', '/var/folders/s1/y4p2fc9d1c9bv3nfjhgpvwch0000gq/T/tmp1u_lhndi/aligned1596638807'], returncode=1, stdout=b'', stderr=b'INFO - Server initialized for eventlet.\nINFO - Words (<w>) not present; tokenizing\nTraceback (most recent call last):\n File "/Users/santoseadmin/Work/Studio/venv/bin/readalongs", line 11, in <module>\n load_entry_point(\'readalongs\', \'console_scripts\', \'readalongs\')()\n File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/click/core.py", line 829, in __call__\n return self.main(*args, **kwargs)\n File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/flask/cli.py", line 557, in main\n return super(FlaskGroup, self).main(*args, **kwargs)\n File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/click/core.py", line 782, in main\n rv = self.invoke(ctx)\n File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/click/core.py", line 1259, in invoke\n return _process_result(sub_ctx.command.invoke(sub_ctx))\n File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/click/core.py", line 1066, in invoke\n return ctx.invoke(self.callback, **ctx.params)\n File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/click/core.py", line 610, in invoke\n return callback(*args, **kwargs)\n File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/click/decorators.py", line 21, in new_func\n return f(get_current_context(), *args, **kwargs)\n File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/flask/cli.py", line 412, in decorator\n return __ctx.invoke(f, *args, **kwargs)\n File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/click/core.py", line 
610, in invoke\n return callback(*args, **kwargs)\n File "/Users/santoseadmin/Work/Studio/readalongs/cli.py", line 217, in align\n results = align_audio(\n File "/Users/santoseadmin/Work/Studio/readalongs/align.py", line 123, in align_audio\n xml = convert_xml(xml)\n File "/Users/santoseadmin/Work/Studio/readalongs/text/convert_xml.py", line 208, in convert_xml\n convert_words(xml_copy, word_unit, output_orthography)\n File "/Users/santoseadmin/Work/Studio/readalongs/text/convert_xml.py", line 157, in convert_words\n all_indices = compose_tiers(indices)\n File "/Users/santoseadmin/Work/Studio/readalongs/text/util.py", line 290, in compose_tiers\n reduced_indices = compose_indices(tiers[0], tiers[1])\n File "/Users/santoseadmin/Work/Studio/readalongs/text/util.py", line 278, in compose_indices\n if i2_idx in i2_dict and i2_dict[i2_idx] > highest_i2_found:\nTypeError: \'>\' not supported between instances of \'NoneType\' and \'int\'\n')
Here's that traceback from readalongs align
INFO - Server initialized for eventlet.
INFO - Words (<w>) not present; tokenizing
Traceback (most recent call last):
File "/Users/santoseadmin/Work/Studio/venv/bin/readalongs", line 11, in <module>
load_entry_point('readalongs', 'console_scripts', 'readalongs')()
File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/flask/cli.py", line 557, in main
return super(FlaskGroup, self).main(*args, **kwargs)
File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/click/decorators.py", line 21, in new_func
return f(get_current_context(), *args, **kwargs)
File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/flask/cli.py", line 412, in decorator
return __ctx.invoke(f, *args, **kwargs)
File "/Users/santoseadmin/Work/Studio/venv/lib/python3.8/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/Users/santoseadmin/Work/Studio/readalongs/cli.py", line 217, in align
results = align_audio(
File "/Users/santoseadmin/Work/Studio/readalongs/align.py", line 123, in align_audio
xml = convert_xml(xml)
File "/Users/santoseadmin/Work/Studio/readalongs/text/convert_xml.py", line 208, in convert_xml
convert_words(xml_copy, word_unit, output_orthography)
File "/Users/santoseadmin/Work/Studio/readalongs/text/convert_xml.py", line 157, in convert_words
all_indices = compose_tiers(indices)
File "/Users/santoseadmin/Work/Studio/readalongs/text/util.py", line 290, in compose_tiers
reduced_indices = compose_indices(tiers[0], tiers[1])
File "/Users/santoseadmin/Work/Studio/readalongs/text/util.py", line 278, in compose_indices
if i2_idx in i2_dict and i2_dict[i2_idx] > highest_i2_found:
TypeError: '>' not supported between instances of 'NoneType' and 'int'
ReadTheDocs cannot install our package, which causes the docs build to fail because the autodocumentation tools depend on it. The root cause is that PocketSphinx fails to pip install.
Basically there are no great options as far as I'm concerned, but we should try and figure out a solution one way or another.
Command:
readalongs align -l alq -i Adjidamo-no-intro.txt Adjidamo-no-intro.mp3 delme4
Trace:
Traceback (most recent call last):
File "C:\Users\joanise\RAS\ras-env\Scripts\readalongs-script.py", line 11, in <module>
load_entry_point('readalongs', 'console_scripts', 'readalongs')()
File "c:\users\joanise\ras\ras-env\lib\site-packages\click\core.py", line 764, in __call__
return self.main(*args, **kwargs)
File "c:\users\joanise\ras\ras-env\lib\site-packages\flask\cli.py", line 557, in main
return super(FlaskGroup, self).main(*args, **kwargs)
File "c:\users\joanise\ras\ras-env\lib\site-packages\click\core.py", line 717, in main
rv = self.invoke(ctx)
File "c:\users\joanise\ras\ras-env\lib\site-packages\click\core.py", line 1137, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "c:\users\joanise\ras\ras-env\lib\site-packages\click\core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "c:\users\joanise\ras\ras-env\lib\site-packages\click\core.py", line 555, in invoke
return callback(*args, **kwargs)
File "c:\users\joanise\ras\ras-env\lib\site-packages\click\decorators.py", line 17, in new_func
return f(get_current_context(), *args, **kwargs)
File "c:\users\joanise\ras\ras-env\lib\site-packages\flask\cli.py", line 412, in decorator
return __ctx.invoke(f, *args, **kwargs)
File "c:\users\joanise\ras\ras-env\lib\site-packages\click\core.py", line 555, in invoke
return callback(*args, **kwargs)
File "c:\users\joanise\ras\readalong-studio\readalongs\cli.py", line 108, in align
if kwargs['save_temps'] else None))
File "c:\users\joanise\ras\readalong-studio\readalongs\align.py", line 98, in align_audio
xml = etree.parse(xml_path).getroot()
File "src\lxml\etree.pyx", line 3424, in lxml.etree.parse
File "src\lxml\parser.pxi", line 1840, in lxml.etree._parseDocument
File "src\lxml\parser.pxi", line 1866, in lxml.etree._parseDocumentFromURL
File "src\lxml\parser.pxi", line 1770, in lxml.etree._parseDocFromFile
File "src\lxml\parser.pxi", line 1163, in lxml.etree._BaseParser._parseDocFromFile
File "src\lxml\parser.pxi", line 601, in lxml.etree._ParserContext._handleParseResultDoc
File "src\lxml\parser.pxi", line 711, in lxml.etree._handleParseResult
File "src\lxml\parser.pxi", line 638, in lxml.etree._raiseParseError
OSError: Error reading file 'C:\Users\joanise\AppData\Local\Temp\readalongs_xml_ypb25lr6.xml': failed to load external entity "C:\Users\joanise\AppData\Local\Temp\readalongs_xml_ypb25lr6.xml"
These issues noticed in g2p will affect Studio in the same way:
The set-output command is deprecated and will be disabled
Given the default multi-file output from readalongs align, possibly with manual corrections to the alignments in the SMIL or other manual changes, we want to be able to create the single-file HTML bundle.
Use case: -o html does not allow the user to do any corrections to their readalong. If they need to fix something but still want the results as a single-file HTML, we need to provide a tool to create the bundle post-hoc.
Currently the ePubs need a bit of work to function properly on iOS, which probably is because we are not quite producing valid EPUB 2.0 or 3.0 yet (I think 3.0 is supported so we should target that)
I'll try to get this going in the next few days (making an issue here to track the work, etc)
Trying to add an image to a RA using a URL, configured in a JSON config file.
example json file used:
{
"images":
{ "0": "https://www.btb.termiumplus.gc.ca/images/termium-wet.png"}
}
I ran it like this:
readalongs align --config 1.json --text-input 1.Welcome.txt 1.Welcome.mp3 1
NOTE: I did get this good warning message when producing the RA at the end.
WARNING - Please make sure https://www.btb.termiumplus.gc.ca/images/termium-wet.png is accessible to clients using your read-along.
When we look at my local http server access logs, we can see that the RA tried to hit the server with a GET and got a 404. Technically, we should not be seeing that hit: my browser should have hit the site "www.btb.termiumplus.gc.ca" instead. Also notice how it tried to use the "/assets" folder as well. The warning message was correct in saying "Please make sure XXXX is accessible to clients using your read-along".
10.0.2.2 - - [19/Aug/2021 10:46:29] code 404, message File not found
10.0.2.2 - - [19/Aug/2021 10:46:29] "GET /assets/https://www.btb.termiumplus.gc.ca/images/termium-wet.png HTTP/1.1" 404 -
( Also technically this should be a ReadAlong-Web-Component bug I think)
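The intended behaviour can be illustrated with a small sketch: only prefix the assets folder when the configured image is not already an absolute http(s) URL. It is written in Python purely for illustration (the actual fix may belong in the web component), and `resolve_image_src` is a hypothetical name.

```python
from urllib.parse import urlparse


def resolve_image_src(image: str, assets_dir: str = "assets") -> str:
    """Return remote URLs unchanged; treat anything else as a file under assets_dir."""
    if urlparse(image).scheme in ("http", "https"):
        return image
    return f"{assets_dir}/{image}"
```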
Currently there is at least one method (maybe more, this needs checking) which doesn't apply lower-casing to the mapping inventory.
Take the following at the beginning of is_word_character in readalongs/text/tokenize_xml.py:
def is_word_character(self, c):
    if not self.case_sensitive:
        c = c.lower()
    if c in self.inventory:
        return True
The inventory hasn't been lower-cased yet, so this method doesn't do what's expected (i.e., return True when the inventory contains the upper-case form of the character).
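A minimal sketch of one possible fix, assuming the inventory is a plain collection of strings: lower-case it once at construction time whenever matching is case-insensitive. `CaseAwareInventory` is a hypothetical class, not the existing tokenizer.

```python
from typing import Iterable


class CaseAwareInventory:
    """Holds a character inventory and answers membership queries."""

    def __init__(self, inventory: Iterable[str], case_sensitive: bool = False):
        self.case_sensitive = case_sensitive
        if case_sensitive:
            self.inventory = set(inventory)
        else:
            # Lower-case once here so lookups compare like with like.
            self.inventory = {c.lower() for c in inventory}

    def is_word_character(self, c: str) -> bool:
        if not self.case_sensitive:
            c = c.lower()
        return c in self.inventory
```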
It looks like the release is being created but the version bump is not updating to the master branch. See https://github.com/ReadAlongs/Studio/runs/3885903392
When there are multiple DNA segments in the config.json file, and the method is remove, the correction after the first DNA segment is incorrect.
to reproduce:
dna-config.json:
{
"do-not-align":
{
"method": "remove",
"segments":
[
{ "begin": 1000, "end": 1100 },
{ "begin": 3700, "end": 3900 }
]
}
}
The command
readalongs align -c dna-config.json data/ej-fra.xml data/ej-fra.m4a delme
outputs an alignment in the .smil file at (3.040 : 3.770) and one at (3.770 : 4.000). 3.770 s is inside [3700ms, 3900ms) where it should not have been allowed to be.
The problem is that calculate_adjustment() should shift the timestamp it's adjusting for every previous DNA segment it has tallied.
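The intended logic could look like the sketch below. It is an assumption about the fix, not the actual function: the signature is hypothetical, timestamps are in milliseconds measured in the audio with DNA segments removed, and segments use the same begin/end keys as the config.

```python
from typing import Dict, List


def calculate_adjustment(timestamp_ms: int, dna_segments: List[Dict[str, int]]) -> int:
    """Return how many ms to add to a timestamp from the trimmed audio
    to map it back onto the original audio."""
    adjustment = 0
    for seg in sorted(dna_segments, key=lambda s: s["begin"]):
        # Compare in original-audio time: the timestamp already shifted by
        # every earlier DNA segment that has been tallied.
        if timestamp_ms + adjustment >= seg["begin"]:
            adjustment += seg["end"] - seg["begin"]
        else:
            break
    return adjustment
```

With the two segments from dna-config.json, a trimmed-audio timestamp of 3670 ms gets an adjustment of 300 ms and lands at 3970 ms, outside [3700ms, 3900ms). A timestamp that falls exactly on a cut point maps to the end of the removed segment here; word end times might need the opposite convention.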
These are a few ideas following a meeting between myself, @joanise and @littell.
The problem is that, currently, when alignment fails, it's not clear exactly why the alignment failed. It's often due to errors in transcription, or false starts from the speaker. This issue is to improve ReadAlong-Studio's ability to handle these occurrences, but also to provide helpful insight as to where in the document the alignment is least certain.
Some possible things to add
We should also be thinking about how to interact with the user when these issues come up. There should be a view in the ReadAlong-Studio web app for debugging that highlights areas where the beam was adjusted or where sudden changes of log perplexity occurred.
For example, Danish, whose norm_form value is 'none', throws a ValueError:
File "/Users/pinea/ReadAlong-Studio/readalongs/text/convert_xml.py", line 176, in convert_xml
convert_words(xml_copy, word_unit, output_orthography)
File "/Users/pinea/ReadAlong-Studio/readalongs/text/convert_xml.py", line 132, in convert_words
word.text = ud.normalize(norm_form, word.text)
ValueError: invalid normalization form
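A small sketch of how the mapping's declared form could be honoured, treating 'none' as "leave the text alone" instead of passing it to `unicodedata.normalize`. `normalize_text` is a hypothetical helper; only the four standard Unicode forms are forwarded.

```python
import unicodedata as ud
from typing import Optional

VALID_FORMS = {"NFC", "NFD", "NFKC", "NFKD"}


def normalize_text(text: str, norm_form: Optional[str]) -> str:
    """Apply the mapping's declared normalization form, if it declares one."""
    form = (norm_form or "none").upper()
    if form == "NONE":
        return text  # the mapping asks for no normalization
    if form not in VALID_FORMS:
        raise ValueError(f"unsupported normalization form: {norm_form!r}")
    return ud.normalize(form, text)
```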
On branches Studio: dev.g2p, g2p: master, OpenSamples: master, all up to date as of now:
cd OpenSamples
readalongs align -i -s -f -l fra UDHR-Librivox/human_rights_un_frn-preamble.txt UDHR-Librivox/human_rights_un_frn_ezwa_64kb-preamble.mp3 output/UDHR-fra-preamble
outputs:
INFO - Server initialized for eventlet.
INFO - Words (<w>) not present; tokenizing
Traceback (most recent call last):
File "C:\Users\joanise\RAS\ras-env\Scripts\readalongs-script.py", line 11, in <module>
load_entry_point('readalongs', 'console_scripts', 'readalongs')()
File "c:\users\joanise\ras\ras-env\lib\site-packages\click\core.py", line 764, in __call__
return self.main(*args, **kwargs)
File "c:\users\joanise\ras\ras-env\lib\site-packages\flask\cli.py", line 557, in main
return super(FlaskGroup, self).main(*args, **kwargs)
File "c:\users\joanise\ras\ras-env\lib\site-packages\click\core.py", line 717, in main
rv = self.invoke(ctx)
File "c:\users\joanise\ras\ras-env\lib\site-packages\click\core.py", line 1137, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "c:\users\joanise\ras\ras-env\lib\site-packages\click\core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "c:\users\joanise\ras\ras-env\lib\site-packages\click\core.py", line 555, in invoke
return callback(*args, **kwargs)
File "c:\users\joanise\ras\ras-env\lib\site-packages\click\decorators.py", line 17, in new_func
return f(get_current_context(), *args, **kwargs)
File "c:\users\joanise\ras\ras-env\lib\site-packages\flask\cli.py", line 412, in decorator
return __ctx.invoke(f, *args, **kwargs)
File "c:\users\joanise\ras\ras-env\lib\site-packages\click\core.py", line 555, in invoke
return callback(*args, **kwargs)
File "c:\users\joanise\ras\studio\readalongs\cli.py", line 115, in align
if kwargs['save_temps'] else None))
File "c:\users\joanise\ras\studio\readalongs\align.py", line 109, in align_audio
xml = convert_xml(xml)
File "c:\users\joanise\ras\studio\readalongs\text\convert_xml.py", line 194, in convert_xml
convert_words(xml_copy, word_unit, output_orthography)
File "c:\users\joanise\ras\studio\readalongs\text\convert_xml.py", line 143, in convert_words
all_indices = compose_tiers(indices)
File "c:\users\joanise\ras\studio\readalongs\text\util.py", line 271, in compose_tiers
reduced_indices = compose_indices(tiers[0], tiers[1])
File "c:\users\joanise\ras\studio\readalongs\text\util.py", line 256, in compose_indices
results.append((i1_in, highest_i2_found))
UnboundLocalError: local variable 'highest_i2_found' referenced before assignment
When a language is mapped via a hop, like tce -> tce-equiv -> tce-ipa, the tokenizer fails to find the right mapping and uses the DefaultTokenizer instead.
echo "ts'e ch’e ghw'nj sih" > tce1.txt
readalongs prepare -l tce tce1.txt tce1.xml
readalongs tokenize tce1.xml tce1.tok.xml
grep '<w>' tce1.tok.xml
<s><w>ts</w>'<w>e</w> <w>ch</w>’<w>e</w> <w>ghw</w>'<w>nj</w> <w>sih</w></s>
The correct output should have been:
<s><w>ts'e</w> <w>ch’e</w> <w>ghw'nj</w> <w>sih</w></s>
Contrast with win, which has just a win -> win-ipa mapping and also uses the apostrophe as a letter:
readalongs prepare -l win tce1.txt win1.xml
readalongs tokenize win1.xml win1.tok.xml
grep '<w>' win1.tok.xml
<s><w>ts'e</w> <w>ch</w>’<w>e</w> <w>ghw'nj</w> <w>sih</w></s>
which handles ' as a letter, as it should, though not ’ since win does not map it as an equiv.
As I am moving the tokenizer into g2p, this problem remains unsolved. I will not fix it right away. I will add unit test cases to validate it, but comment them out since they will fail for now.
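As a rough illustration of what following a hop could look like, here is a sketch that searches a list of (in_lang, out_lang) pairs for a chain ending in an "-ipa" mapping. The function `find_mapping_path` and the hard-coded pair list are assumptions for illustration only; they are not part of the g2p or ReadAlong-Studio API:

```python
from collections import deque

# Hypothetical list of available mappings, using the pairs named above.
AVAILABLE = [("tce", "tce-equiv"), ("tce-equiv", "tce-ipa"), ("win", "win-ipa")]


def find_mapping_path(edges, source, is_target):
    """Breadth-first search for the shortest chain of mappings from source."""
    graph = {}
    for in_lang, out_lang in edges:
        graph.setdefault(in_lang, []).append(out_lang)
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if len(path) > 1 and is_target(path[-1]):
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no chain found; a caller might fall back to a default


tce_path = find_mapping_path(AVAILABLE, "tce", lambda lang: lang.endswith("-ipa"))
```

A tokenizer built this way could then take its letter inventory from the first mapping on the path (tce -> tce-equiv here) instead of requiring a direct tce -> tce-ipa mapping.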
If you accidentally provide an audio file as the first argument to readalongs align and a text file after, it tries to read the mp3 file as a text file and gives the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 45: invalid start byte
We should handle this differently.
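One way this might be handled, sketched with a hypothetical helper `read_text_input` (not an existing function in the code base): catch the decoding failure and re-raise it with a message that points at the likely cause.

```python
def read_text_input(path):
    """Read a UTF-8 text file, failing with a readable message otherwise."""
    try:
        with open(path, encoding="utf-8") as f:
            return f.read()
    except UnicodeDecodeError as e:
        # Assumption: a decoding failure usually means a binary file (for
        # example the audio file) was passed where the text was expected.
        raise ValueError(
            f"Cannot read {path} as UTF-8 text. "
            "Were the text and audio arguments given in the wrong order?"
        ) from e
```

The command-line layer could catch this ValueError and print the message instead of a full traceback.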