osk / node-webvtt
Parse WebVTT files, segments and generates HLS playlists for them
License: MIT License
The regular expression supplied for matching a WebVTT timestamp does not allow for a timestamp that denotes the hour number with a single digit:
Regular Expression
const TIMESTAMP_REGEXP = /([0-9]{2})?:?([0-9]{2}):([0-9]{2}\.[0-9]{3})/;
Example timestamps
59:58.000 --> 59:59.000
<b>(GASPS) Hunter?</b>
1:00:00.000 --> 1:00:01.000
<b>You're fired!</b>
However, according to the WebVTT spec and its description of how to parse a timestamp, a parser should allow for this case:
If string is not exactly two characters in length, or if value1 is greater than 59, let most significant units be hours.
The solution would be to update the regular expression to allow for 1 or 2 digits in the hours slot:
const TIMESTAMP_REGEXP = /([0-9]{1,2})?:?([0-9]{2}):([0-9]{2}\.[0-9]{3})/;
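To make the difference concrete, here is a small sketch that runs both patterns quoted above against the example timestamps. The helper `capturedHours` and the constant names are hypothetical, introduced only for this illustration; they are not part of node-webvtt.

```javascript
// Original pattern from the issue: the hours group must be exactly two digits.
const STRICT_HOURS = /([0-9]{2})?:?([0-9]{2}):([0-9]{2}\.[0-9]{3})/;
// Proposed pattern from the issue: the hours group may be one or two digits.
const LOOSE_HOURS = /([0-9]{1,2})?:?([0-9]{2}):([0-9]{2}\.[0-9]{3})/;

// Hypothetical helper: returns the captured hours group
// (undefined when the group did not take part in the match, null on no match).
function capturedHours(regexp, timestamp) {
  const matches = timestamp.match(regexp);
  return matches ? matches[1] : null;
}

console.log(capturedHours(STRICT_HOURS, '1:00:00.000')); // undefined: the hour is lost
console.log(capturedHours(LOOSE_HOURS, '1:00:00.000'));  // '1'
console.log(capturedHours(LOOSE_HOURS, '59:58.000'));    // undefined: no hours present
```

Note that both patterns are unanchored, so the original one still matches `1:00:00.000`, just starting one character in, which is why the hour silently disappears instead of causing a parse error.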
I have some thoughts about the parser. Can I share them with you?
Here is my file:
WEBVTT - this is coolest format for subtitles ever!!
Thank you for download this subtitles.
author: aparus
title: my movie title
language: en
NOTE
Scene 1. Actors:
Fred: https://avatars.com/fred-avatar.png
John: https://avatars.com/john-avatar.png
00:00:00.500 --> 00:00:02.000
<v.loud Fred> The Web is always changing.
00:00:04.500 --> 00:00:05.678
<v.silent John> Yes. And it is good.
00:00:07.323 --> 00:00:08.437
<v.loud Fred> and the way we access it is changing
The free text in the header appears in meta as a property: {'Thank you for download this subtitles': 'Thank you for download this subtitles. '}. The key is not predefined, but variable. Maybe it would be better to use something like {plainText: 'Thank you for download this subtitles. '} and join all text without keys there?
{voice: 'Fred', style: 'loud'}
If you like some of them, I'll open a pull request once I have implemented them. But I need your opinion first.
Thank you for your attention. Best regards.
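One possible reading of the `{voice: 'Fred', style: 'loud'}` shape mentioned in this post is that a leading voice tag such as `<v.loud Fred>` gets split into a voice name and a style class. The sketch below illustrates that reading only; `parseVoice` and `VOICE_TAG` are hypothetical names, the function is not part of node-webvtt, and it assumes the voice tag is the first thing in the cue text.

```javascript
// Matches a leading voice tag: <v, optional .class parts, whitespace, a name, >.
const VOICE_TAG = /^<v((?:\.[^\s.>]+)*)\s+([^>]+)>\s*([\s\S]*)$/;

// Hypothetical helper: splits a cue's text into voice, style and remaining text.
function parseVoice(cueText) {
  const matches = VOICE_TAG.exec(cueText);
  if (!matches) {
    // No leading voice tag: return the text unchanged.
    return { voice: null, style: '', text: cueText };
  }
  // ".loud.fast" becomes "loud fast"; no classes gives an empty string.
  const style = matches[1].split('.').filter(Boolean).join(' ');
  return { voice: matches[2].trim(), style, text: matches[3] };
}

console.log(parseVoice('<v.loud Fred> The Web is always changing.'));
// { voice: 'Fred', style: 'loud', text: 'The Web is always changing.' }
```

A closing `</v>` tag, if present, is left in `text`; handling it would need one more replacement.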
I have extracted a subtitle track as WebVTT from an MPEG4 file (from a Contour+2 camera) using ffmpeg. Empty lines in the text of the cue block cause the following exception:
.../node_modules/node-webvtt/lib/parser.js:65
throw new ParserError(msg);
^
Error: Cue identifier needs to be followed by timestamp (cue #1)
With sample data from the WebVTT file being:
WEBVTT
00:00.000 --> 00:01.000
FW version:1700 V2.1.29
FW name: ContourPlus2
UPDATE:N
UPDATE_FW:N
SWITCH_1
1RES:D
1BR:H
1EV:0
1SHRP:3
1AE:C
1CTST:62
1MIC:45
1EXT_MIC:30
1SILENT:1
1LSR:1
1LED:0
1GPS_PWR:0
1GPS_REC:1
1AWB:0
SWITCH_2
2RES:A
2BR:H
2EV:0
2SHRP:3
2AE:C
2CTST:62
2MIC:45
2EXT_MIC:30
2SILENT:0
2LSR:0
2LED:0
2GPS_PWR:0
2GPS_REC:1
2AWB:0
Is there any way we can get node-webvtt to deal with this? The current thought is to require that every cue block starts with an empty line plus a properly formatted start/end time line, and then to treat the rest as cue content. I haven't read enough of the WebVTT spec to know whether this would cause issues.
Because I didn't carry the rounding, numbers such as 1.99999996 will compile to 00:00:01.1. This is extremely problematic.
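A common way to avoid this kind of carry bug is to round once, to whole milliseconds, and then derive every field from that integer. The following is a minimal sketch of that idea, not the library's actual compiler code; `formatTimestamp` is a hypothetical name and non-negative input is assumed.

```javascript
// Hypothetical helper: formats a non-negative number of seconds as HH:MM:SS.mmm.
function formatTimestamp(totalSeconds) {
  // Round a single time at millisecond precision so a carry propagates
  // into seconds, minutes and hours instead of overflowing the fraction.
  const totalMillis = Math.round(totalSeconds * 1000);
  const millis = totalMillis % 1000;
  const seconds = Math.floor(totalMillis / 1000) % 60;
  const minutes = Math.floor(totalMillis / 60000) % 60;
  const hours = Math.floor(totalMillis / 3600000);
  const pad = (value, width) => String(value).padStart(width, '0');
  return `${pad(hours, 2)}:${pad(minutes, 2)}:${pad(seconds, 2)}.${pad(millis, 3)}`;
}

console.log(formatTimestamp(1.99999996)); // 00:00:02.000
console.log(formatTimestamp(59.9996));    // 00:01:00.000
```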
I'm open to contributing to this project to add an exporter. My idea is that users can parse the WebVTT file, modify it, then export it back to WebVTT format.
Content:
WEBVTT
X-TIMESTAMP-MAP=LOCAL:00:00:00.000,MPEGTS:0
1
00:00.300 --> 00:01.889
This is from a recruit, and I've
2
00:01.890 --> 00:03.329
asked them if I can have one of the
3
00:03.330 --> 00:04.849
six consulting conversations, I
4
00:05.010 --> 00:06.509
said, can I do a tech assessment
5
00:06.510 --> 00:08.189
with you? And then they said, no,
6
00:08.340 --> 00:10.230
I do not need a tech assessment.
Error:
[Error: Missing blank line after signature] { error: undefined }
Waiting for the debugger to disconnect...
Error: Error: Missing blank line after signature
Not sure why X-TIMESTAMP-MAP is not supported.
I parsed my VTT file into an array. Now, can we reverse this process? Can we convert the cues array back to WebVTT?
Just tried running against a file with the following entries (truncated):
51:13.387 --> 53:20.177
Chapter 14
53:20.180 --> 56:02.180
Chapter 15
56:02.175 --> 59:16.395
Chapter 16
59:16.403 --> 1:04:13.283
Chapter 17
1:04:13.283 --> 1:06:03.283
Chapter 18
1:06:03.276 --> 1:09:25.646
Chapter 19
1:09:25.645 --> 1:11:04.235
Chapter 20
This causes a failure with the error 'Error: Start timestamp greater than end (cue #15)'. The offending time entry appears to be: 59:16.403 --> 1:04:13.283
A quick inspection suggests that this is something to do with the regex, since adding the following line to parseTimestamp():
console.log(matches[1], matches[2], matches[3])
gives 'undefined' in all cases for matches[1].
Sometimes VTT has voice tags (<v> ... </v>), which are helpful for stylizing the expression of a sound. When you don't want them, however, they're very difficult to remove without adding some other parsing library for XML-like objects. Can node-webvtt offer an option where v-tags are ignored?
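Until such an option exists, a regular expression may be enough for the simple case of dropping the tags and keeping the spoken text. This is only a sketch under that assumption; `stripVoiceTags` is a hypothetical helper applied to a cue's text after parsing, not a node-webvtt feature.

```javascript
// Matches <v>, <v Name>, <v.class Name> and the closing </v>,
// but not other tags such as <b> or <i>.
const VOICE_TAGS = /<\/?v(?:[.\s][^>]*)?>/g;

// Hypothetical helper: removes voice tags and trims the remaining text.
function stripVoiceTags(cueText) {
  return cueText.replace(VOICE_TAGS, '').trim();
}

console.log(stripVoiceTags('<v.loud Fred> The Web is always changing.'));
// The Web is always changing.
console.log(stripVoiceTags('<v Fred>Hello</v> <b>there</b>'));
// Hello <b>there</b>
```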
It seems that the library doesn't currently parse the styling blocks.
See the section titled "Within the WebVTT file itself": https://developer.mozilla.org/en-US/docs/Web/API/WebVTT_API
This example with regions gives the following error:
ParserError: No blank line after signature
@osk I have this problem with node-webvtt
I first parse the English VTT file, translate the text and replace it within each cue object, and then compile the VTT file back.
It works very well, but I have noticed that in the resulting file the start times of the paragraphs are not exactly the same. At the beginning of the file the difference is small, but it accumulates, and at the end of the file there are differences of up to 1 minute.
I put the original and translated files on a shared drive:
The cue update is done here: https://github.com/bySabi/subs-translate/blob/master/bin/subs-translate#L299
Thanks for this wonderful module.
I had a section in this file as such:
1096
01:45:13.056 --> 01:45:14.390
...mission.
And this returned an error with:
TypeError: Cannot read property 'split' of undefined
at parseCue (node_modules/node-webvtt/lib/parser.js:142:26)
at cues.map (node_modules/node-webvtt/lib/parser.js:91:16)
at Array.map (<anonymous>)
at parseCues (node_modules/node-webvtt/lib/parser.js:89:6)
at Object.parse (node_modules/node-webvtt/lib/parser.js:57:28)
at main (anthony/parse.js:8122:27)
at Object.<anonymous> (anthony/parse.js:8134:1)
at Module._compile (module.js:653:30)
at Object.Module._extensions..js (module.js:664:10)
at Module.load (module.js:566:32)
I couldn't track down where in the WebVTT file the problem was either, which made this difficult.
The desired effect is that, even given this malformed WebVTT, the parser either parses it or returns an error in the expected format.
The WebVTT spec says that timestamps should be:
A WebVTT timestamp consists of the following components, in the given order:
Optionally (required if hours is non-zero):
Two or more ASCII digits, representing the hours as a base ten integer.
Right now, any hours value longer than 2 digits is truncated to its final two digits.
If the .vtt file has more than one trailing line, node-webvtt fails with the following output:
<path to project>/node_modules/node-webvtt/lib/parser.js:75
if (lines[0].includes('-->')) {
Sample WebVTT file:
WEBVTT
00:00.000 --> 00:01.000
Hello there
how are you
00:01.000 --> 00:02.000
Hello there
how are you
00:02.000 --> 00:03.000
Hello there
how are you
00:03.000 --> 00:04.000
Hello there
how are you
When I examined the values of 'lines' in the parseCue() function, the first empty line produced an empty array, which causes the if (lines[0].includes('-->')) line to fail.
A workaround for now is to trim the text prior to passing it to the parse() function.
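Besides trimming, the splitting step itself could skip empty blocks so that extra blank lines anywhere in the file are harmless. The following is a sketch of that idea with a hypothetical `splitBlocks` helper; it does not reproduce how node-webvtt actually splits its input.

```javascript
// Hypothetical helper: splits WebVTT text into blocks separated by one or
// more blank lines, ignoring blocks that contain only whitespace.
function splitBlocks(input) {
  return input
    .replace(/\r\n?/g, '\n')   // normalize line endings
    .split(/\n\s*\n/)          // a run of blank lines is a single separator
    .filter((block) => block.trim() !== '');
}

const sample = 'WEBVTT\n\n00:00.000 --> 00:01.000\nHello there\nhow are you\n\n\n\n';
console.log(splitBlocks(sample).length); // 2
```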
When parsing inputs with empty cues, e.g. WEBVTT↵↵, the parser creates a cue with the following structure:
{
end: 0
identifier: ""
start: 0
styles: ""
text: ""
}
When trying to compile from the parsed input, I get "Error: Cue malformed: start timestamp greater than end", which is set to go off when end <= start. The default value of end triggers it.
cues to [] instead of not giving the parsed content the attribute at all.
The spec allows rendering files with some errors. A situation in the spec where it happens: "[...] This is clearly a mistake, so a conformance checker will flag it as an error, but it is still useful to render the cues to the user."
There's an example file with some errors that are ignored by renderers: https://github.com/cgiffard/Captionator/blob/master/video/acid.vtt
Trying to parse the above file returns ParserError: Invalid cue timestamp (cue #14) without returning anything else. It would be better if it returned an object with {valid: false, errors: [ParserError('Invalid cue timestamp (cue #14)')], cues: [...]} and only threw an error when the file signature is invalid.
This is important because there are WebVTT files that are authored with errors, and since renderers ignore some errors, those don't get noticed until someone tries to open the files in a strict parser like node-webvtt. Sadly it wasn't noticed sooner: of the files I'm working with, 20% are affected by this issue.
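The requested behaviour can be sketched as a loop that collects per-cue errors instead of aborting on the first one. This is an illustration of the proposal only: `parseLeniently` is a hypothetical function, and the `parseCue` callback stands in for whatever per-cue parser is used, assumed here to throw on bad input.

```javascript
// Hypothetical helper: parses each block, keeps the cues that succeed and
// records the errors for the ones that fail.
function parseLeniently(blocks, parseCue) {
  const cues = [];
  const errors = [];
  blocks.forEach((block, index) => {
    try {
      cues.push(parseCue(block, index));
    } catch (error) {
      errors.push(error); // remember the problem, keep going
    }
  });
  return { valid: errors.length === 0, errors, cues };
}

// Usage with a stand-in cue parser that rejects blocks without an arrow.
const result = parseLeniently(
  ['00:00.000 --> 00:01.000\nok', 'broken block'],
  (block, index) => {
    if (!block.includes('-->')) {
      throw new Error(`Invalid cue timestamp (cue #${index})`);
    }
    return { text: block.split('\n').slice(1).join('\n') };
  }
);
console.log(result.valid, result.cues.length, result.errors.length); // false 1 1
```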
I don't know if this qualifies as an issue, but I'm reporting it in case it is of any use to you.
It is possible to compile a VTT with subtitles in non-chronological order. To reproduce this:
let webvtt = require("node-webvtt");
console.log(webvtt.compile({
  "valid": true,
  "cues": [
    {
      "identifier": "",
      "start": 30,
      "end": 31,
      "text": "This is a subtitle",
      "styles": "align:start line:0%"
    },
    {
      "identifier": "",
      "start": 0,
      "end": 1,
      "text": "Hello world!",
      "styles": ""
    },
    {
      "identifier": "",
      "start": 60,
      "end": 61,
      "text": "Foo",
      "styles": ""
    },
    {
      "identifier": "",
      "start": 110,
      "end": 111,
      "text": "Bar",
      "styles": ""
    }
  ]
}));
Whether this is a bug or not depends on whether a non-chronological VTT file is admissible.
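Whichever way that question is answered, a caller can check or enforce ordering before compiling. The two helpers below are hypothetical and work on the cue shape shown in the example (`start` and `end` in seconds); they are a sketch, not something node-webvtt provides.

```javascript
// Hypothetical helper: true when every cue starts no earlier than the previous one.
function isChronological(cues) {
  return cues.every((cue, i) => i === 0 || cues[i - 1].start <= cue.start);
}

// Hypothetical helper: returns a sorted copy, ordered by start and then end.
function sortByStart(cues) {
  return cues.slice().sort((a, b) => a.start - b.start || a.end - b.end);
}

const cues = [
  { start: 30, end: 31, text: 'This is a subtitle' },
  { start: 0, end: 1, text: 'Hello world!' },
  { start: 60, end: 61, text: 'Foo' },
];
console.log(isChronological(cues));              // false
console.log(isChronological(sortByStart(cues))); // true
```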
It seems that we can parse() out the metadata when going from string --> json. But json --> string is missing the metadata block at the start of the file content when using the compile() method.
I like this parser and as a TypeScript user am interested in a typed version of this package. Wanted to start a discussion about that process if you are interested, and how I can help in the event it moves forward.
Currently we are using video.js and exoplayer for some devices to play our content and are using node-webvtt version "^1.9.4"
Our current .vtt files have a meta header that looks like this:
WEBVTT
X-TIMESTAMP-MAP=MPEGTS:183750,LOCAL:00:00:00.000
However, after running our .vtt file through the compile() method, we are noticing that a space is being added after the colon in the meta header.
WEBVTT
X-TIMESTAMP-MAP=MPEGTS: 183750,LOCAL:00:00:00.000
This is causing video.js and exoplayer to have issues loading captions due to an error with X-TIMESTAMP-MAP and the caption files are not being displayed.
Looking in the compiler.js, I noticed this section of code which deliberately adds a space to the output:
if (input.meta) {
  if (typeof input.meta !== 'object' || Array.isArray(input.meta)) {
    throw new CompilerError('Metadata must be an object');
  }
  Object.entries(input.meta).forEach((i) => {
    if (typeof i[1] !== 'string') {
      throw new CompilerError(`Metadata value for "${i[0]}" must be string`);
    }
    output += `${i[0]}: ${i[1]}\n`;
  });
}
(https://github.com/osk/node-webvtt/blob/master/lib/compiler.js#L44)
And I am wondering what the reasoning is behind adding that space there in the output section?
Some players handle it correctly but not every player does and by removing that space both video.js and exoplayer were able to render the caption correctly and have it displayed.
Would it be possible to get this space removed in the meta if block?
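For illustration, the change being asked for could look like the sketch below, which writes `key:value` without the space and leaves the separator configurable for headers where `key: value` is preferred. `serializeMeta` is a hypothetical standalone function, not the code in compiler.js, and the example key assumes the parser split the header line at its first colon.

```javascript
// Hypothetical helper: serializes a metadata object into header lines.
function serializeMeta(meta, separator = ':') {
  if (typeof meta !== 'object' || meta === null || Array.isArray(meta)) {
    throw new TypeError('Metadata must be an object');
  }
  return Object.entries(meta)
    .map(([key, value]) => {
      if (typeof value !== 'string') {
        throw new TypeError(`Metadata value for "${key}" must be string`);
      }
      return `${key}${separator}${value}\n`;
    })
    .join('');
}

console.log(serializeMeta({ 'X-TIMESTAMP-MAP=MPEGTS': '183750,LOCAL:00:00:00.000' }));
// X-TIMESTAMP-MAP=MPEGTS:183750,LOCAL:00:00:00.000
```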
I see that a meta option has been added to the parse function to support multi line headers.
The segmenter does not have that option and calls parse without passing any option.
I'm using this module to ingest data from Google's YouTube Captions API. Unfortunately, the content it generates has extra lines after the opening WEBVTT line, for example:
WEBVTT
Kind: captions
Language: en
00:00:00.000 --> 00:00:00.960
[Happy music]
According to MDN, this is not allowed; nonetheless, it appears there. At the moment I'm solving this with a workaround that alters the string before passing it to parse():
const adjustedCaption = caption.replace(/^WEBVTT[\s\S]*?\n\n/, "WEBVTT\n\n");
Without this workaround, I receive an error: Missing blank line after signature. It would be preferable if this module could instead accept an option to ignore trailing signature lines. Looking at the code, this wouldn't have adverse effects on the parsing. Alternatively, these lines could be parsed and added as metadata to the parsed output.
I'd be happy to issue a PR for this if you're comfortable with the approach, or if you have a better suggestion I can look at implementing that too.
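The second alternative, keeping the header lines as metadata, might look roughly like this. It is a sketch of the idea only: `splitHeader` is a hypothetical function, it treats everything before the first blank line as the header, and it splits each header line at its first colon.

```javascript
// Hypothetical helper: separates the signature, header key/value lines
// and the remaining body of a WebVTT string.
function splitHeader(input) {
  const normalized = input.replace(/\r\n?/g, '\n');
  const headerEnd = normalized.indexOf('\n\n');
  const header = headerEnd === -1 ? normalized : normalized.slice(0, headerEnd);
  const body = headerEnd === -1 ? '' : normalized.slice(headerEnd + 2);
  const [signature, ...lines] = header.split('\n');
  const meta = {};
  for (const line of lines) {
    const colon = line.indexOf(':');
    if (colon === -1) continue; // not a key/value line, skip it
    meta[line.slice(0, colon).trim()] = line.slice(colon + 1).trim();
  }
  return { signature, meta, body };
}

const caption = 'WEBVTT\nKind: captions\nLanguage: en\n\n00:00:00.000 --> 00:00:00.960\n[Happy music]\n';
console.log(splitHeader(caption).meta); // { Kind: 'captions', Language: 'en' }
```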
Using one of the examples from MDN:
WEBVTT - Translation of that film I like
NOTE
This translation was done by Kyle so that
some friends can watch it with their parents.
1
00:02:15.000 --> 00:02:20.000
- Ta en kopp varmt te.
- Det är inte varmt.
2
00:02:20.000 --> 00:02:25.000
- Har en kopp te.
- Det smakar som te.
NOTE This last line may not translate well.
3
00:02:25.000 --> 00:02:30.000
- Ta en kopp
That's the parser output:
ParserError: "Cue identifier needs to be followed by timestamp (cue #0)"
It would be awesome if this module could support parsing and writing this metadata from a WebVTT file. Large streaming companies like Brightcove leverage that metadata field, and interfacing with their VTT files requires being able to parse/write X-TIMESTAMP-MAP.
Sample explanation:
https://bitmovin.com/docs/player/faqs/why-are-my-webvtt-subtitle-tracks-not-in-sync-with-the-video
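As a rough idea of what parsing this header involves, the sketch below reads the two components seen in the examples in this thread, `LOCAL` and `MPEGTS`, in either order. `parseTimestampMap` and the returned field names are hypothetical; this is not an existing node-webvtt function.

```javascript
// Hypothetical helper: parses an X-TIMESTAMP-MAP header line.
// Returns null when the line is not an X-TIMESTAMP-MAP header.
function parseTimestampMap(line) {
  const prefix = 'X-TIMESTAMP-MAP=';
  if (!line.startsWith(prefix)) {
    return null;
  }
  const result = {};
  for (const part of line.slice(prefix.length).split(',')) {
    const colon = part.indexOf(':');
    if (colon === -1) continue; // ignore malformed components
    const key = part.slice(0, colon).trim();
    const value = part.slice(colon + 1).trim();
    if (key === 'MPEGTS') {
      result.mpegts = Number(value); // numeric MPEG-TS timestamp
    } else if (key === 'LOCAL') {
      result.local = value; // cue timestamp, kept as a string
    }
  }
  return result;
}

console.log(parseTimestampMap('X-TIMESTAMP-MAP=MPEGTS:183750,LOCAL:00:00:00.000'));
// { mpegts: 183750, local: '00:00:00.000' }
```

Only the first colon of each component is used as the separator, so the colons inside the `LOCAL` timestamp are preserved.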