novacer / subtite-add-subtitles-to-videos
A fast video editor for adding subtitles! Supports Windows and Linux.
License: Other
Currently, SubprocessExecutor only returns stdout or stderr when the client exits. For longer running tasks, we may instead want to allow stdout to be received in a streaming fashion. Most likely, the API will be a callback which is called when new data is read from the stdout pipe.
For example, ffmpeg will intermittently print something like
frame=7149
fps=48.44
stream_0_0_q=0.0
bitrate=N/A
total_size=762
out_time_us=0
out_time_ms=0
out_time=00:00:00.000000
dup_frames=0
drop_frames=0
speed=0x
progress=continue
So SubprocessExecutor calls the FFMPEG wrapper with whatever data it just read. This data may be "incomplete", so the FFMPEG wrapper needs to buffer it separately. Note that each section ends with the line progress=continue or progress=end. So the wrapper buffers until it sees progress=, parses the current progress, and returns that to the application.
Note the actual ffmpeg subprocess will be blocked once the pipe fills up until we read from it, so we have to do this quickly. Alternatively, we can use the ffmpeg option -stats_period 5 so progress is only written every 5 seconds (this reduces the number of blocking IO operations and gives us more time to process each update).
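The buffering described above could look roughly like the following. This is a minimal sketch, not the project's implementation: the class name ProgressBuffer and its Feed() method are hypothetical, and it assumes ffmpeg writes one key=value pair per line as in the sample output.

```cpp
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical helper (not part of the project): accumulates raw stdout
// chunks and emits one parsed key=value section each time a complete
// "progress=..." line arrives.
class ProgressBuffer {
public:
    using Block = std::map<std::string, std::string>;

    // Feed whatever bytes were just read from the pipe. Returns the most
    // recent complete section if this chunk finished one, else nullopt.
    std::optional<Block> Feed(const std::string& chunk) {
        pending_ += chunk;
        std::optional<Block> latest;
        std::size_t newline;
        while ((newline = pending_.find('\n')) != std::string::npos) {
            std::string line = pending_.substr(0, newline);
            pending_.erase(0, newline + 1);
            if (!line.empty() && line.back() == '\r') line.pop_back();
            const std::size_t eq = line.find('=');
            if (eq == std::string::npos) continue;  // Not a key=value line.
            const std::string key = line.substr(0, eq);
            current_[key] = line.substr(eq + 1);
            if (key == "progress") {  // Last line of a section.
                latest = std::move(current_);
                current_.clear();
            }
        }
        return latest;
    }

private:
    std::string pending_;  // Bytes received after the last complete line.
    Block current_;        // Fields of the section being assembled.
};
```

The streaming callback would call Feed() with each chunk and forward any returned section to the application. Partial lines simply stay in the buffer until the rest arrives.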
In ffplay.cpp, the video path should always be wrapped in quotes, instead of having the client make sure it is. This way, even if the path has a space in it, it will not affect the running of the command.
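A minimal sketch of such wrapping, assuming a hypothetical helper named QuotePath (the name is not from the project). It does not escape quote characters inside the path, which a real implementation would probably need to handle.

```cpp
#include <string>

// Hypothetical helper: wraps a path in double quotes unless it is already
// wrapped. Quote characters inside the path are not escaped in this sketch.
std::string QuotePath(const std::string& path) {
    const bool already_quoted =
        path.size() >= 2 && path.front() == '"' && path.back() == '"';
    return already_quoted ? path : '"' + path + '"';
}
```

Calling this inside the ffplay wrapper means the client can pass the path either way.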
The readme.md right now is a bit disorganized since there are build instructions for windows and linux. The instructions aren't cleanly separated. We should separate the build instructions for each environment to avoid confusion.
For the seconds parsing, at most 5 digits are accepted, e.g. 12345, 12.345, or 123.45.
Any more will not be picked up. We should return a parsing error if more than 5 digits are entered. Discovered in #21
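One way to reject over-long input could be sketched as follows. The function name ParseSeconds, the constant kMaxSecondsDigits, and the use of std::optional to signal a parsing error are all assumptions for illustration; the project's actual parser may be structured differently.

```cpp
#include <cctype>
#include <optional>
#include <string>

// Assumed limit, taken from the "at most 5 digits" rule above.
constexpr int kMaxSecondsDigits = 5;

// Hypothetical parser: returns the seconds value, or std::nullopt as a
// parsing error when the text is malformed or has more than 5 digits.
std::optional<double> ParseSeconds(const std::string& text) {
    int digits = 0;
    int dots = 0;
    for (unsigned char c : text) {
        if (std::isdigit(c)) {
            ++digits;
        } else if (c == '.') {
            ++dots;
        } else {
            return std::nullopt;  // Unexpected character.
        }
    }
    if (digits == 0 || digits > kMaxSecondsDigits || dots > 1) {
        return std::nullopt;
    }
    return std::stod(text);
}
```

With this shape, the extra digits produce an explicit error instead of being silently dropped.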
The current implementation of SubprocessExecutor uses Windows system calls. Since we want the app to support Unix-y systems, we will need to write another implementation.
Namely, instead of CreateProcess() we will use the POSIX "sort-of-equivalent" posix_spawnp.
I suspect we will also need multithreading to be able to read from stdout and stderr separately.
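A minimal sketch of the POSIX side, assuming a hypothetical free function RunAndCaptureStdout rather than the real SubprocessExecutor interface. It only captures stdout (stderr is inherited from the parent), so the multithreaded reading of both pipes is left out.

```cpp
#include <spawn.h>
#include <sys/wait.h>
#include <unistd.h>

#include <stdexcept>
#include <string>
#include <vector>

extern char** environ;

// Hypothetical helper: runs args[0] (looked up on PATH) with the given
// arguments and returns everything the child wrote to stdout.
std::string RunAndCaptureStdout(const std::vector<std::string>& args) {
    if (args.empty()) throw std::invalid_argument("no command given");
    int fds[2];
    if (pipe(fds) != 0) throw std::runtime_error("pipe failed");

    // In the child: stdout becomes the write end, then both ends are closed.
    posix_spawn_file_actions_t actions;
    posix_spawn_file_actions_init(&actions);
    posix_spawn_file_actions_adddup2(&actions, fds[1], STDOUT_FILENO);
    posix_spawn_file_actions_addclose(&actions, fds[0]);
    posix_spawn_file_actions_addclose(&actions, fds[1]);

    std::vector<char*> argv;
    for (const auto& arg : args) argv.push_back(const_cast<char*>(arg.c_str()));
    argv.push_back(nullptr);

    pid_t pid;
    const int rc =
        posix_spawnp(&pid, argv[0], &actions, nullptr, argv.data(), environ);
    posix_spawn_file_actions_destroy(&actions);
    close(fds[1]);  // The parent only reads.
    if (rc != 0) {
        close(fds[0]);
        throw std::runtime_error("posix_spawnp failed");
    }

    // Read until the child closes its end of the pipe.
    std::string output;
    char buffer[4096];
    ssize_t n;
    while ((n = read(fds[0], buffer, sizeof(buffer))) > 0) {
        output.append(buffer, static_cast<std::size_t>(n));
    }
    close(fds[0]);
    int status = 0;
    waitpid(pid, &status, 0);
    return output;
}
```

For example, RunAndCaptureStdout({"echo", "hello"}) should return the text that echo printed. Capturing stderr as well would need a second pipe, and reading both without deadlock is where a second thread (or poll()) would come in.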
If the input is mp4, it may contain a mov_text subtitle track.
If the input is mkv, it may contain srt, ssa, ass sub tracks etc.
So we need to decide what to do with those.
A UTF-8 BOM (byte order mark) inserts a few extra bytes at the beginning of the file which indicate the encoding type. The issue is that this BOM interferes with the parsing of the file (parsing doesn't work).
We can either
We want to be able to use ASAN on the Windows build for debugging. The release build will not use ASAN.
We can take inspiration from these steps to enable ASAN for msvc with bazel: https://github.com/bazelbuild/bazel/issues/12955.
Further, from here it looks like we have to add some dlls to our system path if we want to have dynamic linking.
I've already tried adding this locally, but with ASAN enabled GetOpenFileName will hang forever and not allow the user to select their file. On Stack Overflow it's said this is a Windows bug which doesn't look to be fixed yet: https://stackoverflow.com/a/69718929/17786559. But apparently the following is a workaround.
#include <windows.h>
int main() {
    // Workaround: restrict the process to the first logical processor.
    SetProcessAffinityMask(GetCurrentProcess(), 1);
    // ...
}
Read: https://devblogs.microsoft.com/oldnewthing/20110707-00/?p=10223
The subprocess launched by SubprocessExecutor will hang if the client does not call
executor.Start();
executor.WaitUntilFinished();
immediately after each other. This is because the output pipe of the child process is directed to the subprocess executor. If the subprocess executor is not reading the output pipe, the child process gets stuck writing to the pipe once it fills up. Note that WaitUntilFinished() performs the reading of the pipe.
This is problematic since we will not be able to perform async operations while the child process is running. For example:
SubprocessExecutor executor("ffplay /path/to/movie");
executor.Start(); // start video player
cin >> subtitle_line; // allow user to input subtitle for that section
executor.WaitUntilFinished();
will cause the video player to hang forever.
The solution is either:
The user-provided ffplay, ffprobe, and ffmpeg paths should be verified in main() before proceeding so we can correctly advise the user to install them.
On Linux, video files do not necessarily have an extension, and ffmpeg will then fail to process them. We should use https://doc.qt.io/qt-5/qfiledialog.html#defaultSuffix-prop to set a default suffix for output files (such as .mp4 or .mkv). In ffmpeg itself, it is also possible to tell it what the input/output type is. https://stackoverflow.com/questions/9869120/ffmpeg-output-file-format-with-no-extension
Some ideas on how to make using the CLI faster:
/play so the user can replay the video while subbing. PR #29
/cancel while in add sub mode to exit without adding a subtitle. PR #29
Should require automated bazel build and test on both windows and linux before merging into master.
as title
The contents of buffer after running ReadFile() may not be null-terminated.
Thus the returned string of the stdout may contain arbitrary data at the end of the buffer.
ReadFile() returns the number of bytes read. Thus, buffer[amount_read] = '\0' will correctly terminate the buffer, provided the buffer has room for one extra byte.
There is one remaining issue, which is when the underlying bytes contain a null char in between the data. Then any buffered bytes after the null char will not be read by ostringstream. However, in this context we are reading from the stdout of specific programs like ffmpeg, ffprobe etc., so it shouldn't be an issue.
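An alternative that sidesteps both points is to append by explicit length instead of relying on a terminator. The sketch below uses a hypothetical helper AppendChunk on a std::string; ostringstream::write(buffer, amount_read) would behave the same way. It is an illustration, not the fix that was applied in the project.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: appends exactly the bytes that were read, so no null
// terminator is needed and embedded null chars are preserved.
void AppendChunk(std::string& output, const char* buffer,
                 std::size_t amount_read) {
    output.append(buffer, amount_read);
}
```

Bytes in the buffer beyond amount_read are never touched, so stale data from a previous read cannot leak into the result.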
Possibly we could just store keys on disk in plaintext, but since the APIs are paid it's better to be careful with how the keys are stored. We could use something like libsodium to encrypt the files, or just not store any keys (the user has to re-enter them every time).
When the user selects an existing .srt file for the output path, we should attempt to load that file internally. Then, users can edit their existing subtitles instead of creating new ones each time. This also has the added benefit that users can save their work, quit, and then come back later to complete it.
There's a bug in cli/main.cpp where if the subtitle file path contains quotes (as entered directly by the user via cin), then we cannot open the file using ofstream. Thus we need to differentiate between when file paths actually need quotes and when they do not.
SubTite-add-subtitles-to-videos/subtitler/cli/main.cpp
Lines 82 to 99 in 1de5c1f
There already is a helper function for adding quotes to the input video_path, since it is passed as a command line argument to ffplay.
SubTite-add-subtitles-to-videos/subtitler/cli/main.cpp
Lines 42 to 49 in 1de5c1f
For paths passed to ofstream, we basically need the opposite.
Particularly:
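A minimal sketch of the "opposite" helper, assuming the hypothetical name StripQuotes and assuming only a single pair of surrounding double quotes needs to be removed:

```cpp
#include <string>

// Hypothetical helper: removes one pair of surrounding double quotes, if
// present, so the path can be passed directly to ofstream.
std::string StripQuotes(const std::string& path) {
    if (path.size() >= 2 && path.front() == '"' && path.back() == '"') {
        return path.substr(1, path.size() - 2);
    }
    return path;
}
```

Paths without surrounding quotes are returned unchanged, so the helper can be applied to every user-entered path before opening it.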
As title, autogen_timeline.cpp, which was generated using rcc on Windows, unsurprisingly only works on Windows.
We need to run rcc on Linux to get the version which works on Linux. Options:
Using subprocess executor fixed in #10, we should add another class which wraps this to open a video player.
We can choose ffplay as the video player.
It is important that the video player be running asynchronously from the subtitler. Since subtitler can become blocked waiting for user input, this must not allow the video player to block.
FFProbe can be used to detect the duration of the video, whether there are audio/video/subtitle streams, and a lot of detailed information that we need.
Currently export dialog allows subtitles to be burned into the mp4. This is fairly slow since it has to decode, add subtitles, and then re-encode the video frames. We can add an additional option to remux the subtitles with the video as mkv. This way, we copy the video and audio streams from the input mp4, and copy the srt to combine everything in an mkv. This should give O(1) processing time (not accounting for time spent copying data).
Benefits:
Drawbacks:
Thus we should make sure to communicate these benefits/drawbacks to the user in the gui.
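The remux described above could be expressed as an ffmpeg argument list like the following. The function name BuildRemuxArgs is hypothetical, and the sketch assumes ffmpeg's default stream selection is acceptable. The flags themselves (-c:v copy, -c:a copy, -c:s srt) are standard ffmpeg options for copying the video and audio streams and writing the subtitles as an srt track.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: builds the argument list for remuxing a video and an
// .srt file into an mkv without re-encoding the video or audio streams.
std::vector<std::string> BuildRemuxArgs(const std::string& video_path,
                                        const std::string& srt_path,
                                        const std::string& output_mkv_path) {
    return {"ffmpeg",
            "-i", video_path,   // Input 0: the original video.
            "-i", srt_path,     // Input 1: the subtitle file.
            "-c:v", "copy",     // Copy video frames as-is.
            "-c:a", "copy",     // Copy audio as-is.
            "-c:s", "srt",      // Store subtitles as an srt track.
            output_mkv_path};
}
```

Passing the arguments as a list rather than one command string also avoids the path quoting problems when the subprocess API accepts an argument vector.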
Currently using software decoding as it is more stable.
On Windows, it is possible to use d3d11, dxva2, or cuda for GPU accelerated decoding instead. In theory, this should give better performance, especially for 4k videos, but my simple benchmarks so far show that the HW decoding implementation in https://github.com/Novacer/QtAVPlayer performs worse on any video > 1080p60fps.
Thus we should investigate why the performance is worse (does threading help, or increasing frame buffers etc?) Then hopefully integrate it with the player.
As title, using CLI to play twice should close the current player.
A few ideas:
"play" then close before playing. This is preferred since we will need a similar pattern for end. If the last command was play and the user ends, then close the player. If not, then just move on.