chenyme / chenyme-aavt Goto Github PK
View Code? Open in Web Editor NEW这是一个全自动(音频)视频翻译项目。利用Whisper识别声音,AI大模型翻译字幕,最后合并字幕视频,生成翻译后的视频。
License: MIT License
这是一个全自动(音频)视频翻译项目。利用Whisper识别声音,AI大模型翻译字幕,最后合并字幕视频,生成翻译后的视频。
License: MIT License
非常好的自动化项目,希望可以增加docker的安装方式,这样可以在服务器运行
根据install.bat安装了
pip install streamlit -i https://pypi.tuna.tsinghua.edu.cn/simple some-package
pip install -U openai-whisper -i https://pypi.tuna.tsinghua.edu.cn/simple some-package
pip install openai -i https://pypi.tuna.tsinghua.edu.cn/simple some-package
pip install langchain -i https://pypi.tuna.tsinghua.edu.cn/simple some-package
pip install torch torchvision torchaudio -i https://pypi.tuna.tsinghua.edu.cn/simple some-package
pip install faster-whisper -i https://pypi.tuna.tsinghua.edu.cn/simple some-package
启动后gpu加速无法运行,/workspace/venv/lib/python3.10/site-packages/nvidia/cudnn/lib/libcudnn_ops_infer.so.8有这个
Could not load library libcudnn_ops_infer.so.8. Error: libcudnn_ops_infer.so.8: cannot open shared object file: No such file or directory
Please make sure libcudnn_ops_infer.so.8 is in your library path!
root@ae950ec2447b:/workspace# find / -type f -name libcudnn_ops_infer.so.8
/opt/conda/lib/python3.10/site-packages/torch/lib/libcudnn_ops_infer.so.8
/opt/conda/pkgs/pytorch-2.1.2-py3.10_cuda11.8_cudnn8.7.0_0/lib/python3.10/site-packages/torch/lib/libcudnn_ops_infer.so.8
find: '/proc/17/task/17/net': Invalid argument
find: '/proc/17/net': Invalid argument
find: '/proc/18/task/18/net': Invalid argument
find: '/proc/18/net': Invalid argument
find: '/proc/19/task/19/net': Invalid argument
find: '/proc/19/net': Invalid argument
find: '/sys/kernel/slab': Input/output error
/workspace/venv/lib/python3.10/site-packages/nvidia/cudnn/lib/libcudnn_ops_infer.so.8
我使用开源的whisper 音频转文字软件可以用amd转换,这个软件支持amd显卡吗
LocalEntryNotFoundError: Cannot find an appropriate cached snapshot folder for the specified revision on the local disk and outgoing traffic has been disabled. To enable repo look-ups and downloads online, pass 'local_files_only=False' as input.
假如再大胆一点
这个是不是就是heygen video translation
的大致实现思路,当然我是一个rookie,真的过程想必远比这个复杂,这里最大的难点是,如何识别出不同的声音的前后时间轴,中间还有相关的去背景音,识别误差校准等很多问题
Traceback (most recent call last):
File "C:\Users\Administrator\pinokio\bin\miniconda\lib\site-packages\streamlit\runtime\state\session_state_proxy.py", line 119, in getattr
return self[key]
File "C:\Users\Administrator\pinokio\bin\miniconda\lib\site-packages\streamlit\runtime\state\session_state_proxy.py", line 90, in getitem
return get_session_state()[key]
File "C:\Users\Administrator\pinokio\bin\miniconda\lib\site-packages\streamlit\runtime\state\safe_session_state.py", line 91, in getitem
return self._state[key]
File "C:\Users\Administrator\pinokio\bin\miniconda\lib\site-packages\streamlit\runtime\state\session_state.py", line 400, in getitem
raise KeyError(_missing_key_error_message(key))
KeyError: 'st.session_state has no key "w_model_option". Did you forget to initialize it? More info: https://docs.streamlit.io/library/advanced-features/session-state#initialization'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\Administrator\pinokio\bin\miniconda\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 535, in _run_script
exec(code, module.dict)
File "E:\software\Chenyme_AAVT_0.5.1\Chenyme_AAVT_0.5.1\pages\📽️视频(Video).py", line 59, in
result = get_whisper_result(uploaded_file, output_file, device, st.session_state.w_model_option,
File "C:\Users\Administrator\pinokio\bin\miniconda\lib\site-packages\streamlit\runtime\state\session_state_proxy.py", line 121, in getattr
raise AttributeError(_missing_attr_error_message(key))
AttributeError: st.session_state has no attribute "w_model_option". Did you forget to initialize it? More info: https://docs.streamlit.io/library/advanced-features/session-state#initialization
若使用kimi进行翻译,如果字幕超过一定数量之后,就无法进行翻译了,例如,一个视频超过10分钟,之后,就没办法进行翻译了,所提供的字幕就是还是英文的
python 3.12版本会在安装faster-whisper失败,3.11版本可以
建议写到readme 里面
kimi 是哪个平台 链接是什么?
挂着程序在后台跑了两个多小时顺便看了部电影, 然后edge直接把挂在后台的标签页杀了, 命令行窗口还能看到网页上显示的东西全都跟被重置了一样. 程序跑完的结果根本看不到.
部署在本地,似乎是跨域问题?你们没遇到这个问题吗?
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python39\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 542, in _run_script
exec(code, module.dict)
File "C:\Chenyme_AAVT_0.6.1\pages\🎙️音频(Audio).py", line 46, in
result = get_whisper_result(uploaded_file, cache_dir, device, w_model_option, w_version, vad)
TypeError: get_whisper_result() missing 3 required positional arguments: 'lang', 'beam_size', and 'min_vad'
2024-03-12 07:30:23.183 Uncaught app exception
Traceback (most recent call last):
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python39\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 542, in _run_script
exec(code, module.dict)
File "C:\Chenyme_AAVT_0.6.1\pages\🎙️音频(Audio).py", line 46, in
result = get_whisper_result(uploaded_file, cache_dir, device, w_model_option, w_version, vad)
TypeError: get_whisper_result() missing 3 required positional arguments: 'lang', 'beam_size', and 'min_vad'
上传视频时出现AxiosError: Request failed with status code 403
执行安装文件的时候创建一个专用的虚拟环境,在里面安装依赖包,而不是在全局Python环境中安装
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.