fengalin / media-toc
Build a table of contents from a media file or split a media file into chapters
License: MIT License
The following sequence exhibits incorrect waveform rendering:
Remains from previous waveform images appear. After a window resize, gaps also appear on the waveform image.
On a 64-bit Windows box, the application crashes upon media selection, in Context on line 177:
let audio_sink = gst::ElementFactory::make("autoaudiosink", "audio_playback_sink").unwrap();
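In this gstreamer-rs version, `ElementFactory::make` returns an `Option`, so the `unwrap` panics when the element cannot be created (e.g. a missing plugin on the Windows box). A minimal sketch of surfacing the failure instead of panicking, using a hypothetical `make_element` stand-in for the real factory call:

```rust
// Stand-in for gst::ElementFactory::make, which returns None when the
// requested element is not available in the GStreamer registry,
// e.g. when the autoaudiosink plugin is missing on the Windows box.
fn make_element(factory_name: &str) -> Option<String> {
    match factory_name {
        "autoaudiosink" => None, // simulate the missing plugin
        _ => Some(format!("element: {}", factory_name)),
    }
}

// Surface the failure as an error instead of unwrap-panicking.
fn build_audio_sink() -> Result<String, String> {
    make_element("autoaudiosink")
        .ok_or_else(|| String::from("could not create autoaudiosink"))
}

fn main() {
    match build_audio_sink() {
        Ok(sink) => println!("created {}", sink),
        Err(msg) => eprintln!("media selection failed: {}", msg),
    }
}
```

The same pattern would let the UI report a missing audio sink to the user rather than crashing the whole application.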
In paused mode, it makes sense to zoom down to the sample level. However, in playing mode, the waveform scrolls too fast for a human observer and for the stability of the double-buffering mechanism.
Waveform rendering sometimes exhibits garbage when zooming out or seeking. Sometimes the end of the waveform doesn't seem to correspond to the actual samples at these positions, sometimes there are gaps between sample chunks.
The following sequence produces a panic:
Message:
thread '<unnamed>' panicked at 'Cairo error "invalid matrix (not invertible)"', /home/francois/.cargo/git/checkouts/cairo-e6055a6d391c3fc9/911c1fe/cairo-sys-rs/src/enums.rs:75:12
The application raised:
panicked at 'assertion failed: first >= self.samples_offset', src/media/audio_buffer.rs:158:8
After enlarging / reducing the application window while playing a music file. I think the problem might have appeared after a buffer drain.
I was not able to reproduce it afterwards. However, the condition should be easy to handle.
Timestamp needs rework in order to use NaiveDateTime properly. NaiveTime is already included in NaiveDateTime, so there should be no reason to handle 2 attributes.
The tracker and listener continue executing after eos, which causes unnecessary CPU usage. Note that they should be reactivated in case of a seek.
Big frames should be resized during conversion to RGB. This would save memory and avoid resizing every time the drawing areas are redrawn.
Current implementation defines the sample step from a target duration for the full range of the widget (its width). This presents the following drawbacks:
It would be better to define zoom steps as say samples / 1000 px and compute the sample window accordingly.
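A minimal sketch of that alternative, assuming a hypothetical zoom model where each zoom level fixes a sample density in samples per 1000 px and the visible sample window follows from the widget's current width:

```rust
// Hypothetical zoom model: each zoom level fixes a sample density
// expressed in samples per 1000 px; the visible sample window is then
// derived from the widget's current width instead of a target duration.
fn sample_window(samples_per_1000px: u64, width_px: u64) -> u64 {
    samples_per_1000px * width_px / 1000
}

fn main() {
    // At 8000 samples / 1000 px, a 500 px wide widget shows 4000 samples.
    println!("{}", sample_window(8000, 500));
}
```

With this model, resizing the widget changes how many samples are visible but not the per-pixel density, so the rendering scale stays stable across resizes.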
How to reproduce:
After a while, the waveform rendering will start skipping chunks and high CPU usage is observed.
Resizing the application in paused mode makes the waveform shift in an unexpected way while the cursor remains centred.
Stream position should be visible on the following widgets:
Allow seeking in media from the time scale and waveform.
Caps may change during playback. See this discussion.
This should reduce computation needs (especially during waveform drawing)
The table widget should display the table of contents of media that support this feature.
The table could use a separate window in order to leave real estate for the video and audio visualizations.
With some files (e.g. sample.mk), the waveform flickers when the mouse hovers over the play/pause button.
There are several possible enhancements:
Build a plugin to handle double buffering and waveform rendering. This should make it easier to control the position using a dedicated clock.
The drawing area could be added to the UI the same way the video sink widget is added to the UI.
The current implementation of audio sample dropping induces discontinuities when drawing waveforms. It also takes time and tends to lag the process. One lead to follow would be to drop samples when no buffer is received, if that happens...
The waveform drawing area uses colours that may not be suitable for certain themes. At the very least, the background should be forced to a very dark colour for the waveform and position cursor to be readable in the regular theme.
Cursor position is erratic when zoomed in in paused mode:
The cursor will be rendered at different positions when it should remain at the same position.
Thanks to the double buffering and incremental update mechanisms, we can now spend more time rendering the waveform. This could be an occasion to revert to displaying the channels separately, possibly using transparency, as one of the issues was that channels overlapped.
With some files (e.g. sample.mkv - Opus audio codec and some Ogg audio files), the audio buffers are received after the playback position. The waveform presentation is thus not capable of rendering the actual samples being played.
Try adding a tee with a dedicated queue for samples buffers.
If that's not enough, try adjusting the pipeline buffering.
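A sketch of the tee idea in gst-launch notation (the element layout is illustrative; the actual pipeline is built programmatically, and the appsink branch stands for the waveform sample consumer):

```
uridecodebin uri=... name=dec
  dec. ! tee name=t
    t. ! queue ! audioconvert ! autoaudiosink
    t. ! queue ! audioconvert ! appsink name=waveform_sink
```

The dedicated queue on the sample branch decouples waveform extraction from the playback branch, so slow sample processing does not hold back the audio sink.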
There must be a crate for this.
When previewing the media, it is sometimes necessary to read multiple frames to use the waveform display efficiently.
Not only is it necessary to drive the media::Context in order to stop the preview, frames' data must also be kept in the controller for drawing / redrawing.
The requirements for each media controller are different, so a controller might need to keep frames' data even though its requirements are already fulfilled. For instance, the VideoController may already have received enough data to display, but the AudioController might need more samples to show a significant amount of data. Since the packets will be parsed by the media::Context, we should get and keep video frames in the process.
This processing lays the basis for next step: buffering frames for playback / seek.
Displaying the waveforms on a single graph results in a clumsy representation. The first channels rendered are hidden by the last ones if they have larger amplitude.
I don't feel that the channel information is pertinent to the definition of a chapter start or end position. Drawing a mono waveform would reduce the sample buffer, which in turn would reduce the computation needed to draw the waveform and the time spent draining the buffer.
For aesthetic reasons, we could keep the information of which channel has the highest amplitude before mixing down, and draw the sample using a colour matching this channel. It might also be too clumsy though as that could result in a sort of colour patchwork.
During playback, the listener in the GTK thread is in charge of checking for incoming audio buffers and triggering a waveform redraw when the pipeline position has changed.
The timing for both features might be different. For example, we probably want to drain the buffers as early as possible to avoid pulling many of them at once, while redrawing the waveform could occur at a rate of 24 times per second or less.
Some videos start with a black frame, which is not particularly suitable for a preview. In the case of videos, seek within the media file (e.g. 10%) to get the preview frame.
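A minimal sketch of the suggested seek target, using a hypothetical helper (the 10% figure comes from the issue text; the real code would pass the result to a pipeline seek in nanoseconds):

```rust
// Hypothetical helper: take the preview frame at 10% into the media so
// that videos opening on a black frame still get a representative image.
fn preview_position_ns(duration_ns: u64) -> u64 {
    duration_ns / 10
}

fn main() {
    // For a 60 s video, seek to 6 s before grabbing the preview frame.
    println!("{}", preview_position_ns(60_000_000_000));
}
```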
Integrate GStreamer instead of ffmpeg and see which one is more appropriate. GStreamer comes with a ready-to-use way (plugin?) to draw to a GtkDrawingArea, as shown here. There are also dedicated structs for the Table Of Contents.
There are two Rust bindings: I think sdroege/gstreamer-rs is the way to go, even though it may be less mature than arturoc/gstreamer1.0-rs. Anyway, the current requirements for Media-TOC are pretty mainstream.
In WaveformBuffer, the sample range calculation tries to guess the best range for the target image. This is inherited from the time when only playback/pause was possible and the AudioBuffer would receive samples to be appended at the end until eos.
Now that seeking and zooming are implemented, a lot of complex situations are handled, and a distinction was introduced between an image extension following the arrival of new samples and a refresh in paused mode after a seek or a zoom. Most of the complexity about positioning the first visible sample and the cursor needs to take place when the image is drawn in the drawing area.
It should be possible to simplify the sample range calculation by selecting the range around the current sample, with a margin to handle window-centred seeks. This is the approach used in AudioBuffer::refresh, and it could probably be generalized to extract_samples too.
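A sketch of that simplification, with illustrative names rather than the project's actual API: the range is a window centred on the current sample, widened by a margin for in-window seeks and clamped to the samples actually held.

```rust
// Illustrative sketch: centre the extraction range on the current sample,
// widen it by a margin to absorb window-centred seeks, and clamp it to
// the samples actually available in the buffer.
fn range_around(
    current: usize,
    half_window: usize,
    margin: usize,
    buffer_first: usize,
    buffer_last: usize,
) -> (usize, usize) {
    let first = current
        .saturating_sub(half_window + margin)
        .max(buffer_first);
    let last = (current + half_window + margin).min(buffer_last);
    (first, last)
}

fn main() {
    // Centred case: 10_000 ± (2_000 + 500) within [0, 100_000].
    println!("{:?}", range_around(10_000, 2_000, 500, 0, 100_000));
}
```

The clamping makes the edge cases (start and end of the buffer) fall out of the same code path instead of needing the current special-case handling.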
Allow user to click on the +/- button to zoom in the time window around current position.
Assemble the main technology bricks:
While seeking in MVI_1790.AVI, the WaveformImage sample_step is sometimes 0:
thread 'main' panicked at 'attempt to divide by zero', src/ui/waveform_buffer.rs:171:35
It can also appear at line 265 in WaveformImage.
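The panic comes from dividing by a sample_step that has rounded down to 0. One hedged fix, assuming the step is derived as range / width, is to clamp the computed step to at least 1:

```rust
// Minimal guard: when the visible range holds fewer samples than the
// widget has pixels, integer division rounds the step down to 0;
// clamping to 1 keeps later divisions by the step safe.
// Names are illustrative, not the project's actual API.
fn sample_step(range_samples: usize, width_px: usize) -> usize {
    (range_samples / width_px).max(1)
}

fn main() {
    // 100 samples across 500 px would yield step 0 without the clamp.
    println!("{}", sample_step(100, 500));
}
```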
Add hidden columns to chapters list with start and end as nano seconds in order to ease searching and selecting in a single iteration.
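A sketch of the values those hidden columns would hold, with an illustrative signature: storing start/end as nanoseconds (GStreamer's native unit) lets rows be compared and searched numerically in a single iteration.

```rust
// Illustrative conversion for the hidden columns: a chapter boundary
// expressed as h/m/s/ms becomes a single nanosecond value, which sorts
// and compares directly.
fn to_nanoseconds(hours: u64, minutes: u64, seconds: u64, millis: u64) -> u64 {
    ((hours * 3_600 + minutes * 60 + seconds) * 1_000 + millis) * 1_000_000
}

fn main() {
    // 00:01:30.250 as nanoseconds
    println!("{}", to_nanoseconds(0, 1, 30, 250));
}
```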
When the application is launched, the video drawing area is painted with the application background. On a regular light theme, this is not consistent with the audio drawing area, which is currently filled with the dark theme background.
Use black as a background as it is the colour that will be used by the video widget.
In paused mode, seeking doesn't always show samples even though they are available in the AudioBuffer.
Media can be discovered before building the pipeline using GstDiscoverer as suggested by @sdroege. See this tutorial.
This could allow displaying (some) media info before the actual pipeline is built and choose the audio stream to connect when multiple audio streams are present.
For media with multiple audio or video streams, it should be possible to select the stream to be used.
I suspect the tracker (based on gtk::timeout_add) is not being triggered as steadily as one could wish. gtk_widget_add_tick_callback might help to get smoother waveform rendering.
Enhance waveform rendering by drawing marks at meaningful time positions and a line at amplitude 0.
In undetermined conditions, resuming playback is sometimes not possible after pausing the media.
It is straightforward for audio and video playback.
Waveform display needs more attention:
When media is paused and the widget is enlarged, the waveform is drawn but the amplitude doesn't follow the window's height.
Position (time ATM) is queried by the UI on a regular basis. It seems like the query hangs for a variable time, probably waiting to acquire a lock on the pipeline.
This enhancement is an attempt to avoid hanging the UI by updating the position / time in a separate thread.
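A sketch of the proposed decoupling under stated assumptions: a worker thread performs the possibly slow query and publishes the latest value, so the UI thread only takes a short lock to read it and never blocks on the pipeline itself. `query_position` is a stand-in for the real pipeline query.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Stand-in for the pipeline position query, which may block for a
// variable time while waiting for a lock on the pipeline.
fn query_position(tick: u64) -> u64 {
    tick * 40_000_000 // pretend 40 ms of media elapses per tick
}

fn main() {
    let position = Arc::new(Mutex::new(0u64));

    // Worker thread: performs the slow query and publishes the result.
    let writer = Arc::clone(&position);
    let poller = thread::spawn(move || {
        for tick in 1..=5u64 {
            let new_pos = query_position(tick);
            *writer.lock().unwrap() = new_pos;
        }
    });

    poller.join().unwrap();
    // The UI reads the latest published position without waiting:
    println!("last position: {} ns", *position.lock().unwrap());
}
```

In the real application the worker would loop for the lifetime of the pipeline and the GTK thread would read the shared value from its redraw callback.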