
Comments (7)

selsamman commented on June 27, 2024

I like the idea of the fluent interface though one could still experience growing pains as more features evolve. Maybe it would be worth designing a blue print for the user interface that includes things this library might do in the future. One could use, for example, FFmpeg as an inventory of hypothetical features.

One other thread that probably should be opened is around building a test suite unless that already exists.

from android-transcoder.

ypresto commented on June 27, 2024

Here's my idea:

AndroidTranscoder.for("/path/to/video_file.mp4")
    .geometry(1280, 800) // or enum for presets..?
    .videoBitrate(8000 * 1000) // or Kbps..?
    .audioBitrate(128 * 1000) // default: as-is
    .audioChannels(1) // default: as-is
    .faststart(false) // default: true
    .trim(start, end) // default: none
    .listener(listener)
    .start() // returns Future for cancel

Or the interface could be split into video and audio sections to cover the many options:

AndroidTranscoder.for("/path/to/video_file.mp4")
    .videoOptions(new VideoOptions.Builder()
        .geometry(...)
        .bitrate(...)
        .build()
    )
    .audioOptions(new AudioOptions.Builder()
        .bitrate(...)
        .channels(...)
        .build()
    )
    .trim(...)
    .faststart(...)
    .start()

When complicated options like FFmpeg's video filters are added, they should be specified through an independent parameter object:

VideoFilterOptions filterOptions = new VideoFilterOptions.Builder()
    ...
    .build()

...

AndroidTranscoder.for("/path/to/video_file.mp4")
    .videoFilter(filterOptions)

This helps keep the main interface simple while staying open to new features :)


selsamman commented on June 27, 2024

So I like this direction. I do think there is one additional abstraction needed to map multiple inputs (channels) to the output, and this intersection between input and output is also probably a good place to apply filters. I propose calling it a segment. There would always be a default segment mapping the default input, so the simple case is preserved more or less as described above, with the exception that I would propose including trimming as part of the segment.

Here is an example where we stitch two clips with a fade transition and add background music:

AndroidTranscoder
    .input("/path/to/first.mp4", "first")  // 2nd optional parameter is the channel name
    .input("/path/to/second.mp4", "second")
    .input("/path/to/music.mp3", "audio")
    .segment(new VideoSegment.Builder()
        .channel("first")
        .outputTime(0)  // default to zero anyways
        .filter(new VideoFilterFadeOut.Builder()  // Apply a transition filter
            .duration(5) // fades for 5 seconds at end of segment
            .build()
        )
        .build()
    )
    .segment(new VideoSegment.Builder()
        .channel("second")
        .outputTime(-5) // second clip overlaps first by 5 seconds to allow for fade
        .filter(new VideoFilterFadeIn.Builder()
            .duration(5)
            .build()
        )
        .build()
    )
    .segment(new AudioSegment.Builder()
        .channel("audio")
        .outputTime(0) // Needed here to output segment at beginning rather than end
        .loop() // Loop indefinitely
        .filter(new AudioFilterFadeOut.Builder()
           .duration(10) // fade out audio for last 10 seconds
           .build()
         )
        .build() 
    )
    .output("/path/to/myoutput.mp4")
    .videoOptions(new VideoOptions.Builder()
        .geometry(...)
        .bitrate(...)
        .build()
    )
    .listener(progress)
    .start()

A note on times. My thought is that you have both an input time and an output time, which allows for overlaps, splices, trims, etc. Segments can overlap, but there can be no gaps in the output. By default, both input and output times advance by the duration of the segment, and so need not be specified in the next segment if that segment simply follows sequentially. As a special case, a negative value takes the default time and subtracts an interval, as illustrated in the last example. Segments also have a duration, which defaults to the remaining content in the input; it too can be adjusted (trimmed) with a negative value.

Multiple segments may refer to the same channel. So say you wanted to cut out the 10 seconds starting at the 15 second mark in a video and also trim the last 5 seconds:

AndroidTranscoder
    .input("/path/to/my_input.mp4") // No channel name so it defaults to "input"
    .segment(new VideoSegment.Builder()
        .duration(15)  // segment to keep
        .build()  // Note there is no channel since it defaults to "input"
    )
    .segment(new VideoSegment.Builder() // Another segment on the same input
        .inputTime(25)  // skips over the 10 seconds
        .duration(-5)  // Is 5 seconds shorter than the full content, for a trim
        .build()
    )
    .output("/path/to/my_output.mp4")
    .start()
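The time-defaulting rules described above can be sketched in plain Java. This is only an illustration of the proposal, not existing android-transcoder code, and every name here (`SegmentTimeline`, `resolve`, `apply`) is hypothetical:

```java
// Hypothetical sketch of the proposed segment time-defaulting rules.
// Times are in seconds; null means "use the default", and a negative
// value subtracts from the default (as in outputTime(-5) above).
class SegmentTimeline {
    private long nextInputTime = 0;   // where the next segment reads from by default
    private long nextOutputTime = 0;  // where the next segment writes to by default

    // Resolves a segment's effective {inputTime, outputTime} and advances
    // both defaults by the segment's duration.
    long[] resolve(Long inputTime, Long outputTime, long duration) {
        long in = apply(nextInputTime, inputTime);
        long out = apply(nextOutputTime, outputTime);
        nextInputTime = in + duration;
        nextOutputTime = out + duration;
        return new long[] { in, out };
    }

    private static long apply(long deflt, Long override) {
        if (override == null) return deflt;          // sequential follow-on
        if (override < 0) return deflt + override;   // overlap/trim relative to default
        return override;                             // absolute time
    }
}
```

Under these rules the second segment of the splice example above (inputTime 25, no outputTime) would read from 25 seconds while writing immediately after the first segment's 15 seconds of output.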

I have need for all of these features and more for a project I am working on and would love to contribute to the interface and the implementation in the months to come if we can come to a consensus on how to go about it.


ypresto commented on June 27, 2024

I'm worried about the simplicity of this library; its implementation and interface could become more complicated than just transcoding. It might be better as a new library.

To realize ffmpeg-grade flexibility, we should define an intermediate layer that wraps extractor + decoder, encoder + muxer, and filters.
The challenge is how to wrap and simplify the OpenGL interface, not only for transcoding but also for manipulation.

Example interfaces:

// extracter + decoder
VideoStream videoStream = new RawVideoStream("hoge.mp4").startTime(1000).maxDuration(5000);
// and shorthand
VideoStream.from("hoge.mp4").startTime(1000).duration(5000);

// filter
VideoStream filteredStream = new FilteredVideoStream(videoStream, filterObject); // interface for filterObject is very difficult...
// and shorthand
videoStream.filter(filterObject)

// concat
VideoStream concatenatedStream = videoStream.concat(stream2)
// or
VideoStream combinedStream = VideoStream.combine()
    .video(stream, 0)
    .video(stream2, 4500) // offset time. Combine is necessary because just concat does not work well with crossfading.

// encoder + muxer
new Mp4VideoStreamWriter(combinedStream)
    .videoBitrate(8000*1000)
    .output("output.mp4")
    .start()
// and shorthand
combinedStream.writeTo("output.mp4").videoBitrate(8000*1000).start()

(of course, audio streams should be considered as well...)
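The combine-with-offset step above is what makes crossfading possible: the offset creates an overlap window in which both streams are live, whereas a plain concat leaves no such window. A tiny sketch of that arithmetic (hypothetical helper, not part of any existing API):

```java
// Sketch: combining stream2 at an offset into stream1 yields an overlap
// window [offset, stream1Duration) in which a crossfade can mix both inputs.
// Plain concat (offset == stream1Duration) leaves no window at all.
class CombineWindow {
    // Returns {overlapStartMs, overlapEndMs}, or null if the streams merely
    // abut and no crossfade is possible.
    static long[] overlap(long stream1DurationMs, long stream2OffsetMs) {
        if (stream2OffsetMs >= stream1DurationMs) return null; // no overlap
        return new long[] { stream2OffsetMs, stream1DurationMs };
    }
}
```

With the numbers from the example (a 5000 ms first stream and stream2 at offset 4500), this gives a 500 ms window for the crossfade.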

There is already a filter implementation here:
https://github.com/CyberAgent/android-gpuimage


selsamman commented on June 27, 2024

I like this better than the segment idea. Breaking it up along these lines from an interface point of view is both simple and powerful. You could have chains of filters and then combine them together. You could apply filters before or after combining and yet still run the whole thing as one giant transcode. Internally you might well use segments to represent the unique "wiring" for a given segment of the video but you would determine this from these user interface objects rather than the consumer having to define it that way.

Some filters may take multiple streams. Examples include chroma-key and cross-fades. In my segment example I showed cross-fades as individual filters (one fading in and the other fading out), but after researching this a bit it seems you may need to combine multiple textures in a single shader program to do a cross-fade.

So it might be better for the streams to be passed directly to a filter so we can enforce the presence of both streams:

// Produce a stream that injects a foreground into a background where green exists
VideoStream stream1 = ChromaKey.from(foregroundStream, backgroundStream).key("green");

And for cross-fades

// Produce a stream that combines stream1 and stream 2 with a 2 second cross-fade
VideoStream stream3 = DissolveCrossFade.from(stream1, stream2).offset(2000);

And then probably the general single stream case

VideoStream stream4 = SepiaFilter.from(stream3);

As you alluded to, the real challenge is how to do the rendering with multiple filters. Do you have each filter "contribute" source code to an aggregated fragment shader so you can run a single program for the entire chain, or do you somehow render each link in the chain back to a texture and have each filter be independent? The former sounds daunting.
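For the two-texture case at least, a single-program dissolve is straightforward: one fragment shader samples both textures and mixes them by a fade factor, with no intermediate FBO. A sketch, where the GLSL string, the class, and both helpers are all hypothetical illustrations rather than any real API:

```java
// Sketch of a single-pass dissolve cross-fade: one fragment shader samples
// two textures and linearly mixes them. All names here are hypothetical.
class DissolveCrossFade {
    // Fragment shader combining two textures in one pass (GLSL ES 1.00).
    static final String FRAGMENT_SHADER =
        "precision mediump float;\n" +
        "varying vec2 vTextureCoord;\n" +
        "uniform sampler2D sTexture1;\n" + // outgoing clip
        "uniform sampler2D sTexture2;\n" + // incoming clip
        "uniform float uMix;\n" +          // 0.0 = all clip 1, 1.0 = all clip 2
        "void main() {\n" +
        "    vec4 a = texture2D(sTexture1, vTextureCoord);\n" +
        "    vec4 b = texture2D(sTexture2, vTextureCoord);\n" +
        "    gl_FragColor = mix(a, b, uMix);\n" +
        "}\n";

    // CPU-side equivalent of GLSL mix() for one channel, for illustration.
    static float mix(float a, float b, float t) {
        return a * (1.0f - t) + b * t;
    }

    // Fade factor for a presentation time inside the overlap window,
    // clamped to [0, 1] outside it.
    static float fadeFactor(long ptsMs, long overlapStartMs, long overlapEndMs) {
        float t = (float) (ptsMs - overlapStartMs) / (overlapEndMs - overlapStartMs);
        return Math.max(0f, Math.min(1f, t));
    }
}
```

Per frame, the renderer would bind both decoder output textures, set `uMix` from `fadeFactor`, and draw once; chaining further filters after this, however, would require either the aggregated-shader approach or a render-to-texture pass.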


aftabsikander commented on June 27, 2024

When are you guys planning to implement this? I liked the first idea @ypresto proposed, using a builder-like interface to make things more seamless and straightforward.


selsamman commented on June 27, 2024

I have been working on the core functionality of having multiple streams. This required a substantial refactor of the code, though it follows the basic structure that is in place. As far as filters are concerned, the only one I implemented was cross-fade, and internally that was done using a single shader that processes multiple textures. That technique is not terribly conducive to chaining arbitrary filters together from, say, the GPUImage samples, but it is theoretically the fastest since you don't have to render into an FBO and then render again.

Internally the code needs to deal with segments much like I proposed but there is nothing stopping the building of the stream-based builder interface. So my work is closer to the original interface I proposed and looks like this:

https://github.com/selsamman/android-transcoder/blob/WorkInProgress/example/src/androidTest/java/net/ypresto/androidtranscoder/tests/SingleFileTranscoderTest.java

My priority has been getting a react-native library together with the equivalent functionality using AVFoundation. I have done this in a preliminary fashion and am working on integrating it into a real app. Then I can double back to make more robust tests, after which we can discuss whether it makes sense to pull my work back into this project or keep it as a separate fork. Finally, we can see about layering a stream-centric builder interface on top of the segment interface if that still makes sense. My particular use case is more suitable for the segment-based interface since it divides a composition into time segments.

So I think we are a ways away unless others are working towards this.

