Comments (8)
todo:
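[x] Add a policy option to disable downscaling during screen recording, so that users with high-DPI displays can record video files at full resolution;
[ ] Expose an OCR interface, add more OCR engine choices, let users plug in their own OCR engines, and add and refine a benchmark comparison tool;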
from windrecorder.
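A friend in the industry (who works on OCR) recommended PaddleOCR and EasyOCR.
In my own tests, chineseOCRlite is far more accurate than the Windows OCR.
It may be worth considering GPU inference as well.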
from windrecorder.
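I tried calling WeChat's built-in OCR in place of Windows.Media.Ocr.Cli. Accuracy improved a great deal. I have not tested performance yet, and I have not added a button to the front end. The update has just been pushed to a branch of my fork (https://github.com/B1lli/Windrecorder/tree/dev ). If @Antonoko is willing, it could be added to the front-end page as an optional replacement OCR.
Great! Configuration for a custom OCR interface will be added around version 0.2.0, and this OCR method can be included as one of the alternatives~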
from windrecorder.
https://sspai.com/prime/story/rewind-diy
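This guy is also reproducing Rewind. The OCR approach he uses is:
Recognizing text and compressing screenshot size: using OCRmyPDF
SSPAI (少数派) previously published an article on how to use OCRmyPDF to search for text in scanned PDFs. This article follows the usage described there; the only additional option is --optimize 3. According to the documentation, this applies fairly aggressive lossy compression to images, which suits screenshot archiving well, where "legible is good enough".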
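For anyone who wants to try this from Python, here is a minimal sketch of the invocation described above. It assumes the `ocrmypdf` command-line tool is installed and on PATH; the helper names and the file names in the usage note are made up for illustration and are not taken from the linked article.

```python
import subprocess
from typing import List


def build_ocrmypdf_command(src: str, dst: str, optimize: int = 3) -> List[str]:
    """Build the argument list for an OCRmyPDF run (hypothetical helper).

    --optimize accepts levels 0 to 3; level 3 is the aggressive lossy
    setting mentioned in the article.
    """
    if optimize not in (0, 1, 2, 3):
        raise ValueError("optimize must be 0, 1, 2 or 3")
    return ["ocrmypdf", "--optimize", str(optimize), src, dst]


def run_ocrmypdf(src: str, dst: str, optimize: int = 3) -> None:
    """Run OCRmyPDF; raises CalledProcessError if the tool exits with an error."""
    subprocess.run(build_ocrmypdf_command(src, dst, optimize), check=True)
```

Usage would look like `run_ocrmypdf("capture.pdf", "capture_ocr.pdf")`, where both paths are placeholders.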
from windrecorder.
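I modified my local code to call chineseOCRlite and deleted all existing OCR results from the database. The results are much better! Worth considering for small or blurry text.
When using chineseOCRlite, adding the following at line 25 of crnn.py avoids a flood of ONNX Runtime warnings (severity 3 means only errors and above are logged):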
rt.set_default_logger_severity(3)
from windrecorder.
https://cnocr.readthedocs.io/zh/latest/models/
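I took a look at cnocr. It is very flexible: CPU, GPU, and models are all configurable, and the results are good. But setting up the environment is a hassle.
Exposing an interface would still be the best option.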
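To make the "expose an interface" idea concrete, here is one possible shape for it: a small registry that maps an engine name to a function taking an image path and returning text. Every name here (`register_engine`, `run_ocr`, the `dummy` engine) is hypothetical and is not Windrecorder's actual API; a real engine such as cnocr or chineseOCRlite would be wrapped in the same way.

```python
from typing import Callable, Dict

# An OCR engine is any callable that takes an image path and returns text.
OcrEngine = Callable[[str], str]

# Hypothetical registry of available engines, keyed by name.
_ENGINES: Dict[str, OcrEngine] = {}


def register_engine(name: str) -> Callable[[OcrEngine], OcrEngine]:
    """Decorator that registers an OCR function under the given name."""
    def decorator(func: OcrEngine) -> OcrEngine:
        _ENGINES[name] = func
        return func
    return decorator


def run_ocr(image_path: str, engine: str = "dummy") -> str:
    """Dispatch to the engine selected by name (e.g. read from user config)."""
    if engine not in _ENGINES:
        raise ValueError(
            f"unknown OCR engine {engine!r}; available: {sorted(_ENGINES)}"
        )
    return _ENGINES[engine](image_path)


@register_engine("dummy")
def dummy_engine(image_path: str) -> str:
    # Placeholder that performs no recognition; a real engine goes here.
    return f"<no text recognized in {image_path}>"
```

With something like this, adding an engine is just one decorated function, and the heavy dependencies stay optional for users who do not pick that engine.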
from windrecorder.
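The main reason chineseOCRlite is currently disabled is its poor efficiency (it consumes more compute and is also somewhat slower), while its accuracy on the same input images is close to that of the system's built-in OCR. On the same 15-minute video segment, chineseOCRlite takes about 8 minutes to OCR, while Windows.Media.Ocr.Cli takes a little under 3 minutes.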
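To put those two timings in perspective, here is a quick back-of-the-envelope sketch. It assumes OCR time scales linearly with recording length, which is my assumption and was not measured; the function name is made up, and 3 minutes is used as an upper bound for Windows.Media.Ocr.Cli.

```python
def ocr_minutes_for(recorded_minutes: float,
                    ocr_minutes_per_segment: float,
                    segment_minutes: float = 15.0) -> float:
    """Estimate total OCR time, assuming cost scales linearly with footage."""
    return recorded_minutes * ocr_minutes_per_segment / segment_minutes


eight_hours = 8 * 60  # minutes of recording in a working day

# chineseOCRlite: about 8 minutes of OCR per 15-minute segment.
lite_total = ocr_minutes_for(eight_hours, 8)

# Windows.Media.Ocr.Cli: a little under 3 minutes per segment (upper bound).
windows_total = ocr_minutes_for(eight_hours, 3)

print(f"chineseOCRlite: about {lite_total:.0f} min, Windows OCR: under {windows_total:.0f} min")
```

Under that assumption, a full 8-hour day of recording would need about 256 minutes of OCR with chineseOCRlite against under 96 minutes with the built-in engine.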
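The lower accuracy is probably caused by the relatively low recording resolution, which in turn lowers the accuracy of OCR run on those frames. See this discussion: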
#9 (comment)
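(My screen scaling is set fairly high, so I had not really noticed the accuracy problem... In the next version we will add an option to turn off the resolution-downscaling policy 🤯. Recording at the original resolution should noticeably improve OCR accuracy.)
We will take a look at OCRmyPDF too! In the future we may also add options such as paddleOCR and benchmark them so users can choose 🤔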
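As a rough idea of what such a benchmark could look like, here is a minimal sketch that times each engine and scores its output against known ground-truth text. The `benchmark` function and its result keys are hypothetical; the similarity score uses `difflib.SequenceMatcher` from the standard library as a simple stand-in for a proper character error rate.

```python
import time
from difflib import SequenceMatcher
from typing import Callable, Dict, List, Tuple


def benchmark(
    engines: Dict[str, Callable[[str], str]],
    samples: List[Tuple[str, str]],
) -> Dict[str, Dict[str, float]]:
    """Compare OCR engines on (image_path, ground_truth_text) samples.

    Returns, per engine, the mean similarity to the ground truth
    (0.0 to 1.0) and the total wall-clock seconds spent.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    results: Dict[str, Dict[str, float]] = {}
    for name, ocr in engines.items():
        start = time.perf_counter()
        scores = [
            SequenceMatcher(None, ocr(path), truth).ratio()
            for path, truth in samples
        ]
        elapsed = time.perf_counter() - start
        results[name] = {
            "similarity": sum(scores) / len(scores),
            "seconds": elapsed,
        }
    return results
```

Each engine here is just a callable from image path to text, so the built-in OCR, chineseOCRlite, or paddleOCR could all be passed in behind thin wrappers and compared on the same frames.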
from windrecorder.