babitmf / bmf

Cross-platform, customizable multimedia/video processing framework. With strong GPU acceleration, a heterogeneous design, multi-language support, ease of use, multi-framework compatibility, and high performance, the framework is ideal for transcoding, AI inference, algorithm integration, live video streaming, and more.

Home Page: https://babitmf.github.io/

License: Apache License 2.0

CMake 5.16% Python 24.84% C++ 56.73% C 2.92% Makefile 0.07% Java 1.63% Shell 1.83% Objective-C 1.13% Objective-C++ 3.25% Cuda 1.27% Go 1.13% Dockerfile 0.04%
ai arm bmf bytedance cpp cross-platform cuda ffmpeg gpu heterogeneous live-video mediacodec multimedia numpy nvidia opencv python tensorrt transcode x86-64

bmf's People

Contributors

chutiantian0923, frankfengw519, huheng, jie-fang, mmdzzh, mpr0xy, sbraveyoung, sfeiwong, taoboyang, tongyuantongyu, xiaoweiw-nv, zhitianwu


bmf's Issues

When both decoding and encoding are hardware-accelerated on the GPU, does BMF copy the GPU data back to host memory?

case:
ffmpeg -vsync 0 -hwaccel cuda -hwaccel_output_format cuda -hwaccel_device 0 -c:v h264_cuvid -i input.h264 -c:v nvenc_h264 output.h264

Q:
Both decoding and encoding are hardware-accelerated on the GPU: decoding completes on the GPU and encoding completes on the GPU, with no memory copy from GPU to host memory.
In this case, how does BMF's handling of the decode process differ from CPU-mode decoding? Do both use an AVFrame to receive the decoded result, with the buffer address in the AVFrame being on the GPU in one case and on the CPU in the other?
Also, in this scenario, does BMF copy the GPU data back to host memory, wrap it in a Task, and put it in the scheduling queue? Wouldn't that defeat the original goal of reducing memory copies?
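For context, the BMF equivalent of the ffmpeg command above can be assembled from options that appear in other issues on this page (hwaccel cuda on the decoder, h264_nvenc plus pix_fmt cuda on the encoder); a minimal sketch, with placeholder paths:

    import bmf

    # Decode on the GPU and hand CUDA frames straight to NVENC, mirroring the
    # ffmpeg command above. Whether frames stay on the GPU end to end is
    # exactly what this issue is asking about.
    video = bmf.graph().decode({
        "input_path": "./input.h264",
        "video_params": {"hwaccel": "cuda"},
    })["video"]

    bmf.encode(video, None, {
        "output_path": "./output.h264",
        "video_params": {"codec": "h264_nvenc", "pix_fmt": "cuda"},
    }).run()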

fill_task_input doesn't handle incomplete data; must the module itself cache and splice partial inputs when handling a Task?

If a module (e.g. OVERLAY) requires multiple input streams to function normally, and only some of those input streams have data, is the implication that the module itself must account for the incomplete data and do local caching and splicing when handling the Task? Otherwise the input data will be incomplete and processing will fail.

bool ImmediateInputStreamManager::fill_task_input(Task &task) {
    bool task_filled = false;
    for (auto &input_stream : input_streams_) {
        if (input_stream.second->is_empty()) {
            continue;
        }
        // One task can contain multiple packets. Do we NEED to add a max-packet control?
        while (not input_stream.second->is_empty()) {
            Packet pkt = input_stream.second->pop_next_packet(false);
            if (pkt.timestamp() == BMF_EOF) {
                if (input_stream.second->probed_) {
                    BMFLOG(BMF_INFO) << "immediate sync got EOF from dynamical update";
                    pkt.set_timestamp(DYN_EOS);
                    input_stream.second->probed_ = false;
                } else {
                    stream_done_[input_stream.first] = 1;
                }
            }
            // READ: move the packet into the task's corresponding input queue
            task.fill_input_packet(input_stream.second->get_id(), pkt);
            task_filled = true;
        }
    }
    return task_filled;
}
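Not an official answer, but to make the implied contract concrete: since fill_task_input forwards whatever packets currently exist, a multi-input module has to buffer per-stream until every input has data. Below is a minimal caching sketch using only the Python module API shown in other issues on this page; the one-packet-per-stream pairing policy and the merge() step are assumptions:

    from bmf import Module, Packet, ProcessResult, Timestamp

    class cached_multi_input(Module):
        def __init__(self, node, option=None):
            self.node_ = node
            self.cache_ = {}      # per-input-stream packet buffer
            self.eof_ids_ = set()

        def process(self, task):
            inputs = task.get_inputs()
            # Drain whatever arrived on each input stream into the cache;
            # a given Task may carry packets for only SOME of the streams.
            for in_id in inputs:
                buf = self.cache_.setdefault(in_id, [])
                while not inputs[in_id].empty():
                    pkt = inputs[in_id].get()
                    if pkt.timestamp == Timestamp.EOF:
                        self.eof_ids_.add(in_id)
                    else:
                        buf.append(pkt)

            out_queue = task.get_outputs()[0]
            # Emit only when every input stream has at least one packet cached.
            while inputs and all(self.cache_.get(in_id) for in_id in inputs):
                group = [self.cache_[in_id].pop(0) for in_id in inputs]
                out_queue.put(self.merge(group))  # merge() is a hypothetical per-module step

            if inputs and len(self.eof_ids_) == len(inputs):
                out_queue.put(Packet.generate_eof_packet())
                task.timestamp = Timestamp.DONE
            return ProcessResult.OK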

AttributeError: 'bmf.lib._bmf.sdk.Packet' object has no attribute 'get_data'

/usr/lib/python3.7/site-packages/bmf/modules/null_sink.py in process(self, task)
     21         elif pkt.get_timestamp() != Timestamp.UNSET:
     22             Log.log_node(LogLevel.DEBUG, task.get_node(),
---> 23                          "process data", pkt.get_data(), 'time',
     24                          pkt.get_timestamp())
     25         return ProcessResult.OK

AttributeError: 'bmf.lib._bmf.sdk.Packet' object has no attribute 'get_data'

In the demo, the file on Google Drive cannot be downloaded

!gdown --fuzzy https://drive.google.com/file/d/1l8bDSrWn6643aDhyaocVStXdoUbVC3o2/view?usp=sharing -O big_bunny_10s_30fps.mp4
Traceback (most recent call last):
  File "/usr/local/bin/gdown", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/gdown/cli.py", line 151, in main
    filename = download(
  File "/usr/local/lib/python3.10/dist-packages/gdown/download.py", line 203, in download
    filename_from_url = m.groups()[0]
AttributeError: 'NoneType' object has no attribute 'groups'

How do I set up two outputs within one graph?

Running the face detection demo (trt_face_detect.py) with the guard statement if (output_queue_size >= 2): commented out, an error is reported:
[2024-05-11 04:09:39.784] [info] node:c_ffmpeg_encoder 3 scheduler 1
[2024-05-11 04:09:39.807] [error] node id:1 catch exception: KeyError: (1,)
At:
/root/bmf/lvyantest/tensorrt_post/trt_face_detect.py(226): process
[2024-05-11 04:09:39.807] [error] node id:1 Process node failed, will exit.
[2024-05-11 04:09:39.808] [info] node 1 got exception, close directly
[2024-05-11 04:09:39.808] [info] schedule queue 0 start to join thread
[2024-05-11 04:09:39.808] [error] node id:1 catch exception: KeyError: (1,)

At:
/root/bmf/lvyantest/tensorrt_post/trt_face_detect_ok.py(226): process

[2024-05-11 04:09:39.808] [error] node id:1 Process node failed, will exit.
Traceback (most recent call last):
File "testtrt1.py", line 54, in
main()
File "testtrt1.py", line 50, in main
video.run()
File "/root/bmf/output/bmf/builder/bmf_stream.py", line 82, in run
return self.node_.get_graph().run(self)
File "/root/bmf/output/bmf/builder/bmf_graph.py", line 747, in run
self.exec_graph_.close()
File "/root/bmf/lvyantest/tensorrt_post/trt_face_detect_ok.py", line 226, in process
output_queue_1 = task.get_outputs()[1]
KeyError: 1
Is it because only one output is configured in the graph, while the logic in trt_face_detect.py also emits a text output? How should the graph be set up so that both results are output correctly?

My graph code is as follows:
import time
import tensorrt as trt
import bmf.hml.hmp as mp
from nms import NMS
import PIL
from PIL import Image
from bmf import graph

def main():
    graph1 = graph({'dump_graph': 1})

    video = graph1.decode({
        "input_path": "./face.mp4",
        "video_params": {
            "hwaccel": "cuda",
        }
    })["video"]

    video = video.module("trt_face_detect", {
        "model_path": "./version-RFB-640.engine",
        "input_shapes": {
            "input": [1, 3, 480, 640]
        }}, entry="trt_face_detect_ok.trt_face_detect")

    video = video.module("face_postprocess",
                         entry="face_postprocess.face_postprocess")

    video = video.encode(
        None, {
            "output_path": "./trt_out.mp4",
            "video_params": {
                "codec": "h264_nvenc",
                "bit_rate": 5000000,
                "max_fr": 30
            }
        })

    video.run()

if __name__ == "__main__":
    main()
Thanks!
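Not an authoritative answer, but a sketch of how a second output is usually wired up: the module only sees len(task.get_outputs()) >= 2 when the graph gives its second output stream a downstream consumer. This assumes the builder lets you pick a node's extra output streams by integer index, which should be verified against the demos:

    import bmf

    def main():
        g = bmf.graph({'dump_graph': 1})
        video = g.decode({"input_path": "./face.mp4"})["video"]

        detect = video.module("trt_face_detect", {
            "model_path": "./version-RFB-640.engine",
            "input_shapes": {"input": [1, 3, 480, 640]},
        }, entry="trt_face_detect.trt_face_detect")

        # Stream 0 carries the frames, stream 1 the detection results.
        frames = detect[0].encode(None, {"output_path": "./trt_out.mp4"})
        # Hypothetical sink module: any consumer of stream 1 makes
        # output_queue_size >= 2 inside trt_face_detect.
        detect[1].module("result_sink", entry="result_sink.result_sink")
        frames.run()

    if __name__ == "__main__":
        main()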

Pass non-image data between modules using Packet

When building the controlnet demo, I am trying to build a pipeline that looks like this:

image decoder ---> controlnet inference module ---> image encoder
                              ^
                              |
prompt reader ----------------+

The prompt is read from a file and passed to the controlnet inference module through bmf.Packet. The prompt is a Python dict, and bmf/python/py_module_sdk.cpp shows that bmf.Packet supports all Python types. But when I run the pipeline, I get the following error:

[2023-10-11 02:09:40.995] [error] node id:2 catch exception: BMF(0.0.8) /home/scratch.xiaoweiw_sw/bytedance/babitmf/bmf/engine/c_engine/src/node.cpp:352: error: (-5:Bad argument) [Node_2_c_ffmpeg_filter] Process result != 0.
 in function 'process_node'

[2023-10-11 02:09:40.996] [error] node id:2 Process node failed, will exit.
[2023-10-11 02:09:40.996] [info] node 2 got exception, close directly
[2023-10-11 02:09:40.996] [info] node id:2 process eof, add node to scheduler
[2023-10-11 02:09:40.996] [info] schedule queue 0 start to join thread
[ipp1-2035:1225 :0:1315] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x1e0)
==== backtrace (tid:   1315) ====
 0 0x0000000000042520 __sigaction()  ???:0
 1 0x000000000015856d CFFFilter::init_filtergraph()  /home/scratch.xiaoweiw_sw/bytedance/babitmf/bmf/c_modules/src/ffmpeg_filter.cpp:242
 2 0x0000000000159b9c CFFFilter::process_filter_graph()  /home/scratch.xiaoweiw_sw/bytedance/babitmf/bmf/c_modules/src/ffmpeg_filter.cpp:393
 3 0x000000000015ae0e CFFFilter::process()  /home/scratch.xiaoweiw_sw/bytedance/babitmf/bmf/c_modules/src/ffmpeg_filter.cpp:578
 4 0x0000000000372a9e bmf_engine::Node::process_node()  /home/scratch.xiaoweiw_sw/bytedance/babitmf/bmf/engine/c_engine/src/node.cpp:348
 5 0x00000000003a5a7a bmf_engine::SchedulerQueue::exec()  /home/scratch.xiaoweiw_sw/bytedance/babitmf/bmf/engine/c_engine/src/scheduler_queue.cpp:153
 6 0x00000000003a5678 bmf_engine::SchedulerQueue::exec_loop()  /home/scratch.xiaoweiw_sw/bytedance/babitmf/bmf/engine/c_engine/src/scheduler_queue.cpp:111
 7 0x00000000003a8b1e std::__invoke_impl<int, int (bmf_engine::SchedulerQueue::*)(), bmf_engine::SchedulerQueue*>()  /usr/include/c++/11/bits/invoke.h:74
 8 0x00000000003a8a72 std::__invoke<int (bmf_engine::SchedulerQueue::*)(), bmf_engine::SchedulerQueue*>()  /usr/include/c++/11/bits/invoke.h:96
 9 0x00000000003a89d3 std::thread::_Invoker<std::tuple<int (bmf_engine::SchedulerQueue::*)(), bmf_engine::SchedulerQueue*> >::_M_invoke<0ul, 1ul>()  /usr/include/c++/11/bits/std_thread.h:259
10 0x00000000003a898a std::thread::_Invoker<std::tuple<int (bmf_engine::SchedulerQueue::*)(), bmf_engine::SchedulerQueue*> >::operator()()  /usr/include/c++/11/bits/std_thread.h:266
11 0x00000000003a896a std::thread::_State_impl<std::thread::_Invoker<std::tuple<int (bmf_engine::SchedulerQueue::*)(), bmf_engine::SchedulerQueue*> > >::_M_run()  /usr/include/c++/11/bits/std_thread.h:211
12 0x00000000000dc253 std::error_code::default_error_condition()  ???:0
13 0x0000000000094b43 pthread_condattr_setpshared()  ???:0
14 0x0000000000126a00 __xmknodat()  ???:0
=================================
Segmentation fault (core dumped)

I haven't used the ffmpeg filter module in the graph myself, yet the backtrace shows that bmf inserted ffmpeg filter modules into the graph. Test code as follows.

test_controlnet.py:

import sys

sys.path.append("../../")
import bmf

sys.path.pop()

def test():
    input_video_path = "./ControlNet/test_imgs/bird.png"
    input_prompt_path = "./prompt.txt"
    output_path = "./output.jpg"

    graph = bmf.graph()

    video = graph.decode({'input_path': input_video_path})
    prompt = graph.module('text_module', {'path': input_prompt_path})
    concat = bmf.concat(video['video'], prompt)
    concat.module('controlnet_module', {}).run()

if __name__ == '__main__':
    test()

text_module.py:

import sys
import random
from typing import List, Optional
import pdb

from bmf import *
import bmf.hml.hmp as mp

class text_module(Module):
    def __init__(self, node, option=None):
        self.node_ = node
        self.eof_received_ = False
        self.prompt_path = './prompt.txt'
        if 'path' in option.keys():
            self.prompt_path = option['path']

    def process(self, task):
        pdb.set_trace()
        output_queue = task.get_outputs()[0]

        if self.eof_received_:
            output_queue.put(Packet.generate_eof_packet())
            Log.log_node(LogLevel.DEBUG, self.node_, 'output text stream', 'done')
            task.set_timestamp(Timestamp.DONE)
            return ProcessResult.OK

        prompt_dict = dict()
        with open(self.prompt_path) as f:
            for line in f:
                pk, pt = line.partition(":")[::2]
                prompt_dict[pk] = pt

        out_pkt = Packet(prompt_dict)
        out_pkt.timestamp = 0
        output_queue.put(out_pkt)
        self.eof_received_ = True

        return ProcessResult.OK

def register_inpaint_module_info(info):
    info.module_description = "Text file IO module"
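A workaround worth trying: skip concat entirely (the backtrace above shows the crash inside the CFFFilter node it introduces) and wire the two streams into the controlnet module as separate inputs. The multi-input bmf.module(...) helper used below is an assumption to verify against the builder API:

    import sys
    sys.path.append("../../")
    import bmf

    def test():
        graph = bmf.graph()
        video = graph.decode({'input_path': "./ControlNet/test_imgs/bird.png"})
        prompt = graph.module('text_module', {'path': "./prompt.txt"})

        # Assumed multi-input form: the first list element becomes input 0
        # (frames) and the second becomes input 1 (the prompt dict packets).
        bmf.module([video['video'], prompt], 'controlnet_module', {}).run()

    if __name__ == '__main__':
        test()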

Usage of the output_queue_size logic in the face detection demo trt_face_detect.py

Code snippet from trt_face_detect.py:

def process(self, task):
    input_queue = task.get_inputs()[0]
    output_queue_0 = task.get_outputs()[0]
    output_queue_size = len(task.get_outputs())
    if output_queue_size >= 2:
        output_queue_1 = task.get_outputs()[1]

    while not input_queue.empty():
        pkt = input_queue.get()
        if pkt.timestamp == Timestamp.EOF:
            self.eof_received_ = True
        if pkt.is_(VideoFrame):
            self.frame_cache_.put(pkt.get(VideoFrame))

    while self.frame_cache_.qsize(
    ) >= self.in_frame_num_ or self.eof_received_:
        out_frames, detect_result_list = self.inference()
        for idx, frame in enumerate(out_frames):
            pkt = Packet(frame)
            pkt.timestamp = frame.pts
            output_queue_0.put(pkt)

            if (output_queue_size >= 2):
                pkt = Packet(detect_result_list[idx])
                pkt.timestamp = frame.pts
                output_queue_1.put(pkt)

        if self.frame_cache_.empty():
            break

    if self.eof_received_:
        for key in task.get_outputs():
            task.get_outputs()[key].put(Packet.generate_eof_packet())
            Log.log_node(LogLevel.DEBUG, self.node_, "output stream",
                         "done")
        task.timestamp = Timestamp.DONE

    return ProcessResult.OK

The code contains the check if (output_queue_size >= 2):
When that branch is taken, the detection results (detect_result_list) are emitted on an extra output.
But when I run the demo I can never trigger this branch. What do I need to do so that output_queue_size >= 2 and the extra detection-result output is produced?
Thanks

require sm at /home/dan/zs/cuda118/bmf/bmf/hml/src/core/stream.cpp:130, Stream on device type 1 is not supported

Python Stack ignored

Stack trace (most recent call last):
#5 Object "/usr/bin/python3.8", at 0x5d6065, in _PyObject_MakeTpCall
#4 Object "/usr/bin/python3.8", at 0x5d5498, in PyCFunction_Call
#3 Object "/home/dan/zs/cuda118/bmf/output/bmf/lib/_hmp.cpython-38-x86_64-linux-gnu.so", at 0x7fe59311d0f4, in PyInit__hmp
#2 Object "/home/dan/zs/cuda118/bmf/output/bmf/lib/hmp.cpython-38-x86_64-linux-gnu.so", at 0x7fe593113266, in
#1 Object "/home/dan/zs/cuda118/bmf/output/bmf/lib/libhmp.so.1", at 0x7fe592ef2948, in hmp::current_stream(hmp::Device::Type)
#0 Object "/home/dan/zs/cuda118/bmf/output/bmf/lib/libhmp.so.1", at 0x7fe592eec1b9, in hmp::logging::dump_stack_trace(int)
Traceback (most recent call last):
  File "detect_trt_sample.py", line 41, in <module>
    main()
  File "detect_trt_sample.py", line 13, in main
    trt_face_detect = bmf.create_module(
  File "/home/dan/zs/cuda118/bmf/output/bmf/builder/bmf.py", line 28, in create_module
    return engine.Module(module_info, json.dumps(option), "", "", "")
  File "/home/dan/zs/cuda118/bmf/output/demo/face_detect/trt_face_detect.py", line 90, in __init__
    self.stream_ = mp.current_stream(mp.kCUDA)
RuntimeError: require sm at /home/dan/zs/cuda118/bmf/bmf/hml/src/core/stream.cpp:130, Stream on device type 1 is not supported

ModuleNotFoundError: No module named 'bmf.lib._hmp'

Following the README.md instructions, I set up a conda virtual environment; after installing the dependencies and running the demo, the hmp library cannot be found.

(deoldify_py39) root@bd912f7bf229:~/bmf/bmf/demo/colorization_python# python3.9 deoldify_demo.py 
Traceback (most recent call last):
  File "/root/bmf/bmf/demo/colorization_python/deoldify_demo.py", line 1, in <module>
    import bmf
  File "/root/bmf/output/bmf/__init__.py", line 3, in <module>
    from bmf.python_sdk.module_functor import make_sync_func
  File "/root/bmf/output/bmf/python_sdk/__init__.py", line 1, in <module>
    from .module_functor import make_sync_func, ProcessDone
  File "/root/bmf/output/bmf/python_sdk/module_functor.py", line 1, in <module>
    import bmf.lib._hmp
ModuleNotFoundError: No module named 'bmf.lib._hmp'

The dependencies were installed correctly:

(deoldify_py39) root@bd912f7bf229:~/bmf/bmf/demo/colorization_python# pip3 list | grep Babit
BabitMF                  0.0.9
BabitMF-GPU              0.0.9

The library file _hmp.cpython-39-x86_64-linux-gnu.so can also be found on disk:

(deoldify_py39) root@bd912f7bf229:~/bmf/bmf/demo/colorization_python# ls /root/miniconda3/envs/deoldify_py39/lib/python3.9/site-packages/bmf/lib/
_bmf.cpython-39-x86_64-linux-gnu.so  libbenchmark.a       libbmf_module_sdk.so        libbmf_py_loader.so      libbuiltin_modules.so.0.0.9  libengine.so.0.0.9  libhmp.so.1
_hmp.cpython-39-x86_64-linux-gnu.so  libbenchmark_main.a  libbmf_module_sdk.so.0      libbuiltin_modules.so    libengine.so                 libfmt.a            libhmp.so.1.2.0
libbackward.a                        libbmf_go_loader.so  libbmf_module_sdk.so.0.0.9  libbuiltin_modules.so.0  libengine.so.0               libhmp.so           libspdlog.a

RuntimeError: [json.exception.type_error.302] type must be string, but is array

import bmf
import bmf.hml.hmp as mp

# 'stream' holds the live-stream URL (defined elsewhere)
pkts = (
    bmf.graph().decode({
        'input_path': stream,
        "loglevel": "quiet",
    })['video']
    .start()  # this will return a packet generator
)

for i, pkt in enumerate(pkts):
    # convert frame to a nd array
    if pkt.is_(bmf.VideoFrame):
        vf = pkt.get(bmf.VideoFrame)
        rgb = mp.PixelInfo(mp.kPF_RGB24)
        np_vf = vf.reformat(rgb).frame().plane(0).numpy()
        # we can add some more processing here, e.g. predicting
        print("frame", i, "shape", np_vf.shape)
    else:
        break

When I used the above code to read a stream, an error occurred. When I switched the input to a local video file, the error disappeared. I don't know where the problem is; the video stream itself is fine, since I can use ffmpeg to read it and save it as an mp4.
The following is the error message:
[screenshot of the error message]

PyCUDA ERROR: The context stack was not empty upon module cleanup

When the input_path passed to graph.decode is a live stream and the stream suddenly disconnects, bmf core-dumps:

[2024-03-23 04:12:10.668] [info] node:c_ffmpeg_encoder 2 scheduler 1
[2024-03-23 04:14:31.322] [info] node id:0 decode flushing
[2024-03-23 04:14:31.322] [info] node id:0 Process node end
[2024-03-23 04:14:31.364] [info] node id:0 close node
[2024-03-23 04:14:31.364] [info] node 0 close report, closed count: 1
[2024-03-23 04:14:31.364] [info] node id:1 eof received
[2024-03-23 04:14:31.364] [info] node id:1 eof processed, remove node from scheduler
[2024-03-23 04:14:31.365] [info] node id:1 process eof, add node to scheduler
[2024-03-23 04:14:31.373] [info] node id:1 Process node end
[2024-03-23 04:14:31.373] [info] node id:1 close node
[2024-03-23 04:14:31.373] [info] node 1 close report, closed count: 2
[2024-03-23 04:14:31.373] [info] node id:2 eof received
[2024-03-23 04:14:31.373] [info] node id:2 eof processed, remove node from scheduler
[2024-03-23 04:14:31.374] [info] node id:2 process eof, add node to scheduler
[2024-03-23 04:14:31.374] [info] node id:2 Process node end
[2024-03-23 04:14:31.374] [info] node id:2 close node
[2024-03-23 04:14:31.374] [info] node 2 close report, closed count:3
[2024-03-23 04:14:31.374] [info] schedule queue 0 start to join thread
[2024-03-23 04:14:31.374] [info] schedule queue 0 thread quit
[2024-03-23 04:14:31.375] [info] schedule queue 0 closed
[2024-03-23 04:14:31.375] [info] schedule queue 1 start to join thread
[2024-03-23 04:14:31.375] [info] schedule queue 1 thread quit
[2024-03-23 04:14:31.375] [info] schedule queue 1 closed
[2024-03-23 04:14:31.375] [info] all scheduling threads were joint

PyCUDA ERROR: The context stack was not empty upon module cleanup.

A context was still active when the context stack was being
cleaned up. At this point in our execution, CUDA may already
have been deinitialized, so there is no way we can finish
cleanly. The program will be aborted now.
Use Context.pop() to avoid this problem.

core dumped in frame extract

Hi, when I run the code below on a 4-CPU machine, Aborted (core dumped) happens. The error rate is 6/10 (run 10 times, the error occurs 6 times). On a 16-CPU machine it doesn't happen. I observed that on the 4-CPU machine CPU usage is almost 100%, which may be the reason. Apart from adding more CPUs, is there any way to avoid this problem?

import bmf
import time
from multiprocessing.pool import ThreadPool
import glob
import numpy as np

def generator_mode(input_list):
    input_path,threads = input_list
    start = time.time()
    graph = bmf.graph()
    video =  graph.decode({
                    'input_path': input_path,
                    "log_level":"quiet",
                    "dec_params": {"threads": threads},
                })['video'].start() # this will return a packet generator
    for pkt in video:
        # convert frame to a nd array
        if pkt.is_(bmf.VideoFrame):
            vf = pkt.get(bmf.VideoFrame)
            v_frame = vf.frame().plane(2).numpy()
        else:
            break
    use = time.time() - start
    return use



if __name__ == '__main__':
    # serial run
    # print(time.time())
    test_threads = [0,2,4,6,8]
    video_paths = glob.glob("/root/ori/*.mp4")

    for threads in test_threads:
        for infilename in video_paths:
            extract_u_frame_time = []
            run_path = []
            for i in range(20):
                run_path.append([infilename, str(threads)])
            with ThreadPool(2) as p:
                extract_u_frame_time.extend(p.map(generator_mode, run_path))

the environment version is below:

python=3.7.12
ffmpeg version 4.1.11-0+deb10u1
numpy==1.21.6
BabitMF==0.0.8

stdout of error:

terminate called without an active exception
Aborted (core dumped)

Running command

nohup python3 generator_mode.py 

When running the face detection demo it keeps printing: [info] *** dropping frame 7 at ts 3584. What does "dropping frame" mean? Is it sampling frames for detection? Thanks.

My code is as follows:
import torch
import torch.nn.functional as F
import numpy as np
import sys
import time
import tensorrt as trt
import bmf.hml.hmp as mp
from nms import NMS
import PIL
from PIL import Image
from bmf import graph

def main():
    graph1 = graph({'dump_graph': 1})

    video = graph1.decode({
        "input_path": "./face.mp4",
        #"video_params": {
        #    "hwaccel": "cuda",
        #}
    })["video"]

    video = video.module("trt_face_detect", {
        "model_path": "./version-RFB-640.engine",
        "label_to_frame": 1,
        "input_shapes": {
            "input": [1, 3, 480, 640]
        }}, entry="trt_face_detect.trt_face_detect")

    video = video.encode(
        None, {
            "output_path": "./trt_out.mp4",
            "video_params": {
                "codec": "h264_nvenc",
                "bit_rate": 5000000,
            }
        })

    video.run()

if __name__ == "__main__":
    main()

[rtsp @ 0x7f7d7a7f8980] max delay reached. need to consume packet [rtsp @ 0x7f7d7a7f8980] RTP: missed 6 packets

[rtsp @ 0x7f7d7a7f8980] max delay reached. need to consume packet
[rtsp @ 0x7f7d7a7f8980] RTP: missed 2 packets
[rtsp @ 0x7f7d7a7f8980] max delay reached. need to consume packet
[rtsp @ 0x7f7d7a7f8980] RTP: missed 2 packets
[rtsp @ 0x7f7d7a7f8980] max delay reached. need to consume packet
[rtsp @ 0x7f7d7a7f8980] RTP: missed 6 packets
[rtsp @ 0x7f7d7a7f8980] max delay reached. need to consume packet
[rtsp @ 0x7f7d7a7f8980] RTP: missed 2 packets

The input_path is an RTSP stream from a camera and the output_path is an RTMP stream. During the run there are many packet-loss warnings like the above, and the resulting RTMP stream shows heavy mosaic artifacts and stuttering.
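One mitigation worth testing is forcing the RTSP input onto TCP so RTP packets are not lost on UDP. The sketch below assumes the decoder's dec_params block (used for decoder threads in another issue on this page) is also forwarded to the demuxer as AVOptions, which needs verifying:

    import bmf

    graph = bmf.graph()
    video = graph.decode({
        "input_path": "rtsp://camera.example/stream",   # placeholder URL
        # Assumption: dec_params entries reach avformat_open_input, so the
        # RTSP demuxer sees rtsp_transport=tcp instead of the default UDP.
        "dec_params": {"rtsp_transport": "tcp"},
    })["video"]

    bmf.encode(video, None, {
        "output_path": "rtmp://example.com/live/out",   # placeholder URL
        "format": "flv",
        "video_params": {"codec": "h264"},
    }).run()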

Running the demo fails with an error on macOS 13.4.1 (22F82)

Hi, I ran the demo on my Mac but got the error below. How can I fix it?

demo % python broadcaster/broadcaster.py
Traceback (most recent call last):
  File "/Users/weiliang/Develop/bmf/bmf/demo/broadcaster/broadcaster.py", line 7, in <module>
    import bmf
  File "/Users/weiliang/.pyenv/versions/3.9.18/lib/python3.9/site-packages/bmf/__init__.py", line 3, in <module>
    from bmf.python_sdk.module_functor import make_sync_func
  File "/Users/weiliang/.pyenv/versions/3.9.18/lib/python3.9/site-packages/bmf/python_sdk/__init__.py", line 1, in <module>
    from .module_functor import make_sync_func, ProcessDone
  File "/Users/weiliang/.pyenv/versions/3.9.18/lib/python3.9/site-packages/bmf/python_sdk/module_functor.py", line 1, in <module>
    import bmf.lib._hmp
ImportError: dlopen(/Users/weiliang/.pyenv/versions/3.9.18/lib/python3.9/site-packages/bmf/lib/_hmp.cpython-39-darwin.so, 0x0002): Library not loaded: @executable_path/../../../../Python
  Referenced from: /Users/weiliang/.pyenv/versions/3.9.18/lib/python3.9/site-packages/bmf/lib/_hmp.cpython-39-darwin.so
  Reason: tried: '/Users/weiliang/Python' (no such file), '/usr/local/lib/Python' (no such file), '/usr/lib/Python' (no such file, not in dyld cache)

How can the BMF framework support PCM audio-stream input?

Specifically: the audio data does not come from a media source; it is streaming packets continuously received over the network, and I want to encode/process them in real time (without saving to a local file first and then reading the file back). How can this be implemented?
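A sketch adapted from the raw-frame push example further down this page, feeding PCM through BMFAVPacket in PUSHDATA mode. The raw-audio decoder options (codec name, sample-rate and channel fields) and the audio-only encode call are assumptions that must be checked against the decoder's push_raw_stream support:

    import numpy as np
    import bmf
    from bmf import GraphMode, Packet, BMFAVPacket

    def receive_pcm_chunks():
        # Hypothetical network source: yields raw s16le stereo PCM chunks.
        for _ in range(100):
            yield bytes(4096)

    graph = bmf.graph()
    audio = graph.input_stream("audio_stream").decode({
        "push_raw_stream": 1,
        # Assumed raw-PCM options; verify the exact field names.
        "audio_codec": "pcm_s16le",
        "sample_rate": 44100,
        "channels": 2,
    })
    bmf.encode(None, audio, {
        "output_path": "./out.mp4",
        "audio_params": {"codec": "aac", "sample_rate": 44100},
    })
    graph.run_wo_block(mode=GraphMode.PUSHDATA)

    pts = 0
    for chunk in receive_pcm_chunks():
        pkt = BMFAVPacket(len(chunk))
        pkt.data.numpy()[:] = np.frombuffer(chunk, dtype=np.uint8)
        pkt.pts = pts
        packet = Packet(pkt)
        packet.timestamp = pts
        pts += len(chunk) // 4          # s16 stereo: 4 bytes per sample frame
        graph.fill_packet("audio_stream", packet)

    graph.fill_packet("audio_stream", Packet.generate_eof_packet())
    graph.close()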

Error when running the blur_gpu module in Docker

1.docker pull babitmf/bmf_runtime:latest;

2.nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.84 Driver Version: 460.84 CUDA Version: 11.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla P40 Off | 00000000:88:00.0 Off | 0 |
| N/A 35C P0 50W / 250W | 16435MiB / 22919MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 Tesla P40 Off | 00000000:8D:00.0 Off | 0 |
| N/A 41C P0 51W / 250W | 18213MiB / 22919MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 2 Tesla P40 Off | 00000000:B3:00.0 Off | 0 |
| N/A 32C P0 49W / 250W | 15643MiB / 22919MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 3 Tesla P40 Off | 00000000:B6:00.0 Off | 0 |
| N/A 34C P0 50W / 250W | 11013MiB / 22919MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
+-----------------------------------------------------------------------------+

3.nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

4. cvcuda.gaussian_into reports an error:
Line 563: '' failed: no kernel image is available for execution on the device

Does the Docker environment need any additional configuration?

The internally registered SIGTERM and SIGINT signal handlers prevent the application from exiting normally

Hi, bmf is very nice to use,
but I found this in the Graph::Graph constructor in bmf/engine/c_engine/src/graph.cpp:

Graph::Graph(
    GraphConfig graph_config,
    std::map<int, std::shared_ptr<Module>> pre_modules,
    std::map<int, std::shared_ptr<ModuleCallbackLayer>> callback_bindings) {
    std::signal(SIGTERM, terminate);
    std::signal(SIGINT, interrupted);
    ...
}

These two registered handlers always prevent my application from exiting normally, because they take over handling of both signals, like this:

^Cinterrupted, ending bmf gracefully...
^Cinterrupted, ending bmf gracefully...
^Cinterrupted, ending bmf gracefully...

But as an SDK / dependency, it should not rely on signal handling to clean up resources; it should use RAII or other mechanisms instead.

Alternatively, is there some other way to let my application exit normally when it receives SIGTERM or SIGINT?
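One possible workaround, sketched under two assumptions: run_wo_block() can be called without arguments for a normal graph, and Python's signal.signal() (which replaces the process-wide C handler) overrides the handlers std::signal installed in Graph::Graph:

    import signal
    import bmf

    graph = bmf.graph()
    video = graph.decode({"input_path": "./input.mp4"})["video"]
    bmf.encode(video, None, {"output_path": "./output.mp4"})

    # Graph construction inside run_wo_block() is what installs bmf's
    # SIGTERM/SIGINT handlers; re-register ours right after it returns.
    graph.run_wo_block()
    signal.signal(signal.SIGINT, signal.default_int_handler)
    signal.signal(signal.SIGTERM, signal.SIG_DFL)

    graph.close()   # wait for the pipeline to finish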

Can ffmpeg encoder parameters be passed through?

int main(int argc, char** argv) {
	std::cout << "hello world!" << std::endl;
	std::string output_file = "rtsp://172.31.60.105/live/test";
	// BMF_CPP_FILE_REMOVE(output_file);

	nlohmann::json graph_para = {{"dump_graph", 0}};
	auto graph = bmf::builder::Graph(bmf::builder::NormalMode, bmf_sdk::JsonParam(graph_para));

	nlohmann::json decode_para = {{"input_path", "/d1/video/renshu.mp4"}};
	auto video = graph.Decode(bmf_sdk::JsonParam(decode_para));

	nlohmann::json logoPara = {{"input_path", "/d1/video/Snipaste_2024-04-27_20-10-56.png"}};
	auto logo = graph.Decode(bmf_sdk::JsonParam(logoPara));

	auto output_stream =
		video["video"].Scale("1280:720").Trim("start=0:duration=7").Setpts("PTS-STARTPTS");

	auto overlay = logo["video"].Scale("300:200").Loop("loop=0:size=10000").Setpts("PTS+0/TB");

	nlohmann::json encode_para = {{"output_path", output_file},
								{"format", "rtsp"},
								{"video_params",
								 {
									 {"rtsp_transport", "tcp"},
									 {"width", 640},
									 {"height", 480},
									 {"codec", "h264"},
								 }}};

	output_stream[0]
		.Overlay({overlay}, "x=if(between(t,0,7),0,NAN):y=if(between(t,0,7),0,NAN):repeatlast=1")
		.EncodeAsVideo(bmf_sdk::JsonParam(encode_para));

	graph.Run();

	return 0;
}

I'm trying to push the stream to an RTSP address over TCP, but the rtsp_transport parameter doesn't seem to take effect; it still uses the default UDP.
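A guess worth testing: rtsp_transport is a muxer/protocol option rather than a codec option, so it may be ignored inside video_params. The Python sketch below moves it into a mux_params block, assuming the encoder forwards that block to the muxer (the key name needs verifying against the encoder's documented options):

    import bmf

    graph = bmf.graph({"dump_graph": 0})
    video = graph.decode({"input_path": "/d1/video/renshu.mp4"})

    bmf.encode(video["video"], None, {
        "output_path": "rtsp://172.31.60.105/live/test",
        "format": "rtsp",
        # Assumption: muxer/protocol AVOptions belong here, not in video_params.
        "mux_params": {"rtsp_transport": "tcp"},
        "video_params": {"width": 640, "height": 480, "codec": "h264"},
    }).run()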

RuntimeError: BMF(0.0.7) /root/bmf/bmf/c_modules/src/ffmpeg_decoder.cpp:736: error: (-224:BMF Transcode Error) avformat_open_input failed: Protocol not found in function 'init_input'

When I use BMF to process RTSP video, the following problem occurs:

RuntimeError: BMF(0.0.7) /root/bmf/bmf/c_modules/src/ffmpeg_decoder.cpp:736: error: (-224:BMF Transcode Error) avformat_open_input failed: Protocol not found in function 'init_input'

Based on the example test_generator.py, I replaced 'input_path': "../../files/big_bunny_10s_30fps.mp4" in the code with the following:

frames = (
    bmf.graph()
    .decode({'input_path': "https://*****:1101/rtp/0615746E.live.flv"})['video']
    .fps(1)
    # .ff_filter('scale', 299, 299)  # or you can use '.scale(299, 299)'
    .start()  # this will return a packet generator
)

docker images : babitmf/bmf_runtime:latest

Running the demo reports: [swscaler @ 0x7f5cbb7f1200] No accelerated colorspace conversion found from yuv420p to rgb24.

Running the official image:
docker pull babitmf/bmf_runtime:latest
docker run --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all -it babitmf/bmf_runtime:latest bash
export CMAKE_ARGS="-DBMF_ENABLE_CUDA=ON"
./build.sh
After building, run the demo:
python3 ~/bmf/output/demo/video_enhance/enhance_demo.py
The code does not error out and a video is produced, but during the run it keeps reporting: No accelerated colorspace conversion found from yuv420p to rgb24.
Problem:
The generated output.mp4 shows nothing recognizable during playback; the picture is completely garbled.

[swscaler @ 0x7fc7d6f80fc0] No accelerated colorspace conversion found from yuv420p to rgb24.

When I try to run enhance_demo I hit this bug. I know it comes from ffmpeg, but my machine has a CUDA environment and the GPU was in use by Python while running the demo. I tested on two machines with the same result.

I used the Docker image you provided: docker pull babitmf/bmf_runtime:latest

1080Ti CUDA 12.2

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.08             Driver Version: 535.161.08   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX 1080 Ti     Off | 00000000:03:00.0 Off |                  N/A |
| 31%   53C    P2             221W / 250W |    496MiB / 11264MiB |     26%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1872      G   /usr/libexec/Xorg                            56MiB |
|    0   N/A  N/A      2189      G   /usr/bin/gnome-shell                          7MiB |
|    0   N/A  N/A      3434      C   python3.8                                   428MiB |
+---------------------------------------------------------------------------------------+

V100 CUDA 11.4

[root@node02 ~]# nvidia-smi 
Fri May 24 10:59:34 2024       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-PCIE...  Off  | 00000000:18:00.0 Off |                    0 |
| N/A   44C    P0    37W / 250W |   4846MiB / 16160MiB |     17%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Tesla V100-PCIE...  Off  | 00000000:3B:00.0 Off |                    0 |
| N/A   34C    P0    26W / 250W |      4MiB / 16160MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     51555      C   python3                          3363MiB |
|    0   N/A  N/A     76909      C   python3.8                        1479MiB |
+-----------------------------------------------------------------------------+

Maybe the problem is caused by the CUDA version; I see the project uses CUDA 11.8, but the version on my machines is not compatible at the moment.

The output video also ends up with wrong pixel data because the colorspace conversion does not work:

[screenshots of the corrupted output]

I think the program treats the video data as RGB24 when it is actually YUV420P, so the UV color data is wrong, and the Y data is also wrong for every single pixel, because Y is 1 byte per pixel whereas RGB24 is 3 bytes per pixel.

CFFFilter Demo

The parameters for CFFFilter seem quite complex; could you provide a Python example using CFFFilter?
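For what it's worth, the builder's ff_filter helper (used elsewhere on this page as .ff_filter('scale', 299, 299)) is the usual Python entry point to CFFFilter; a minimal sketch, with placeholder paths and an assumed zero-argument filter call:

    import bmf

    graph = bmf.graph()
    video = graph.decode({"input_path": "./input.mp4"})["video"]

    # Each ff_filter call becomes a CFFFilter node wrapping the named ffmpeg
    # filter; '.scale(299, 299)' is sugar for ff_filter('scale', 299, 299).
    processed = video.ff_filter("scale", 640, 360).ff_filter("hflip")

    processed.encode(None, {"output_path": "./output.mp4"}).run()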

cpp copy module can't work with gpu transcoding

Module: test/c_module

def test():
    input_video_path = xxx
    output_path = xxxx
    video = bmf.graph().decode({
            "input_path": input_video_path,
            "video_params": {
                "hwaccel": "cuda",
            }
        })["video"]

    video2 = video.c_module('cpp_copy_module',
                            "../../test/c_module/libcopy_module.so", # use your path
                            "copy_module:CopyModule")
        
    (bmf.encode(
        video2,
        video["audio"],
        {
            "output_path": output_path,
            "video_params": {
                "codec": "h264_nvenc",
                "pix_fmt": "cuda",
            }
        }).run())

The output video isn't encoded properly: there are green and red areas in the picture.

But with CPU decoding and GPU encoding, the results are good.

def test():
    input_video_path = xxx
    output_path = xxxx
    video = bmf.graph().decode({
            "input_path": input_video_path,
        })["video"]

    video2 = video.c_module('cpp_copy_module',
                            "../../test/c_module/libcopy_module.so", # use your path
                            "copy_module:CopyModule")
        
    (bmf.encode(
        video2,
        video["audio"],
        {
            "output_path": output_path,
            "video_params": {
                "codec": "h264_nvenc",
            }
        }).run())

RuntimeError: require false at /root/bmf/bmf/hml/src/imgproc/imgproc.cpp:154, Unsupport PixelInfo

There was a problem when I used bmf/test/generator/test_generator.py for stream-reading tests.

# bmf/test/generator/test_generator.py
for i, pkt in enumerate(pkts):
    # convert frame to a nd array
    if pkt.is_(bmf.VideoFrame):
        vf = pkt.get(bmf.VideoFrame)
        rgb = mp.PixelInfo(mp.kPF_RGB24)
        np_vf = vf.reformat(rgb).frame().plane(0).numpy()  # <------ RuntimeError: require false at /root/bmf/bmf/hml/src/imgproc/imgproc.cpp:154, Unsupport PixelInfo
        # we can add some more processing here, e.g. predicting
        print("frame", i, "shape", np_vf.shape)
    else:
        break

I also tried the method from the documentation, but it doesn't seem to work either.

# https://babitmf.github.io/docs/bmf/multiple_features/graph_mode/generatemode/
for i, frame in enumerate(frames):
     # convert frame to a nd array
     if frame is not None:
         np_frame = frame.to_ndarray(format='rgb24')    # <------ AttributeError: 'bmf.lib._bmf.sdk.Packet' object has no attribute 'to_ndarray'

         # we can add some more processing here, e.g. predicting
         print('frame', i, 'shape', np_frame.shape)
     else:
         break

What does "Unsupport PixelInfo" mean?
And is there another way to process the stream into video frames that can be read iteratively, the way OpenCV does?
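Not sure about the root cause, but one fallback that stays close to the APIs used elsewhere on this page is to skip the RGB reformat and read the raw planes, optionally after moving the frame to host memory; the .cpu() call is an assumption to verify against the VideoFrame API:

    import bmf

    pkts = bmf.graph().decode({'input_path': "./input.mp4"})['video'].start()
    for i, pkt in enumerate(pkts):
        if pkt.is_(bmf.VideoFrame):
            vf = pkt.get(bmf.VideoFrame)
            frame = vf.cpu().frame()           # assumed: copy to host before reading
            y_plane = frame.plane(0).numpy()   # raw Y plane as an ndarray
            print("frame", i, "Y shape", y_plane.shape)
        else:
            break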

test push data with raw frame got an error

I use h264 as the encoding codec and got the error below; when I changed it to mpeg4, it works well.
My OS is Ubuntu 20.04, Python is 3.9, installed via pip install BabitMF.

[2023-09-25 03:08:33.274] [error] node id:1 Codec 'libx264' not found
[2023-09-25 03:08:33.274] [error] node id:1 init codec error
[2023-09-25 03:08:33.274] [error] node id:1 catch exception: BMF(0.0.8) /project/bmf/engine/c_engine/src/node.cpp:352: error: (-5:Bad argument) [Node_1_c_ffmpeg_encoder] Process result != 0.
in function 'process_node'

[2023-09-25 03:08:33.274] [error] node id:1 Process node failed, will exit.

import io

import numpy as np
import bmf
from bmf import GraphMode, Module, Log, LogLevel, InputType, ProcessResult, Packet, Timestamp, scale_av_pts, av_time_base, BmfCallBackType, VideoFrame, AudioFrame, BMFAVPacket
from PIL import Image

def init_push_graph(output):
    graph = bmf.graph({"dump_graph": 1, "loglevel": "debug"})
    video_stream = graph.input_stream("video_stream")
    # audio_stream = graph.input_stream("wav_stream")
    decode_stream = video_stream.decode({
        "loglevel": "trace",
        's': '720:1280',
        'pix_fmt': 'rgb24',
        "push_raw_stream": 1,
        "video_codec": "bmp",
        "video_time_base": "1,30000"
        })

    bmf.encode(
            decode_stream,
            None,
            {
                "video_params": {
                    "codec": "h264",
                    "width": 720,
                    "height": 1280,
                    "max_fr": 30,
                    "crf": "23",
                    "preset": "veryfast"
                },
                # "audio_params": {"sample_rate": 44100, "codec": "aac"},
                "loglevel": "trace",
                "output_path": output
            },
        )
    graph.run_wo_block(mode=GraphMode.PUSHDATA)
    return graph

graph = init_push_graph('./test1.mp4')

pts = 0
timestamp = 0
for _ in range(100):
    frame = np.zeros((1280, 720, 3), dtype=np.uint8)
    image = Image.fromarray(frame, mode="RGB")
    byte_stream = io.BytesIO()
    image.save(byte_stream, format='BMP')
    image_bytes = byte_stream.getvalue()
    pkt = BMFAVPacket(len(image_bytes))
    memview = pkt.data.numpy()
    memview[:] = np.frombuffer(image_bytes, dtype=np.uint8)
    pkt.pts = pts
    packet = Packet(pkt)
    packet.timestamp = timestamp
    pts += 1001
    timestamp += 1
    graph.fill_packet("video_stream", packet)

graph.fill_packet("video_stream", Packet.generate_eof_packet())
graph.close()

Built-in resources and reusable Modules

1. Reading through some of the test code, I found that some resource files cannot be found. Where can these resources be obtained?
For example, "../files/dynamic_add.json" in the dynamic_add function in test_graph.cpp:

TEST(graph, dynamic_add) {
    BMFLOG_SET_LEVEL(BMF_INFO);

    time_t time1 = clock();
    std::string config_file = "../files/graph_dyn.json";
    std::string dyn_config_file = "../files/dynamic_add.json";
    GraphConfig graph_config(config_file);
    GraphConfig dyn_config(dyn_config_file);
    std::map<int, std::shared_ptr<Module>> pre_modules;
    std::map<int, std::shared_ptr<ModuleCallbackLayer>> callback_bindings;
    std::shared_ptr<Graph> graph =
        std::make_shared<Graph>(graph_config, pre_modules, callback_bindings);
    std::cout << "init graph success" << std::endl;

    graph->start();
    usleep(400000);

    std::cout << "graph dynamic add nodes" << std::endl;
    graph->update(dyn_config);

    graph->close();
    time_t time2 = clock();
    std::cout << "time:" << time2 - time1 << std::endl;
}

2. The number of built-in Modules is currently small. Are there reusable Modules available? If so, where can they be obtained?
