documentsearch's Introduction

Hi there 👋

  • 🎯 Interested in Python, transformers, NLP, and PyTorch

documentsearch's People

Contributors

yuanzhoulvpi2017

documentsearch's Issues

RuntimeError: CUDA error: invalid device ordinal

Hello, when I run the project on Colab I get the error below. The code sets the CUDA device to 0 and does not specify multiple GPUs, so I don't know what is causing it.
RuntimeError: CUDA error: invalid device ordinal
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
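
"invalid device ordinal" generally means that the code asked for a CUDA device index that the current runtime does not expose, for example a Colab session without a GPU attached, or a CUDA_VISIBLE_DEVICES setting that hides it. The following is a minimal diagnostic sketch, not the project's own code: `resolve_device` and `visible_cuda_count` are hypothetical helper names, and the CPU fallback is an assumption about what the caller wants.

```python
import os


def resolve_device(requested_index: int, available_count: int) -> str:
    """Return a device string that is valid on the current machine.

    Hypothetical helper: falls back to CPU when the requested CUDA
    index is outside the range [0, available_count).
    """
    if 0 <= requested_index < available_count:
        return f"cuda:{requested_index}"
    return "cpu"


def visible_cuda_count() -> int:
    """Count CUDA devices via PyTorch, or report 0 if PyTorch is missing."""
    try:
        import torch
    except ImportError:
        return 0
    return torch.cuda.device_count() if torch.cuda.is_available() else 0


if __name__ == "__main__":
    count = visible_cuda_count()
    print("CUDA_VISIBLE_DEVICES =", os.environ.get("CUDA_VISIBLE_DEVICES"))
    print("visible CUDA devices:", count)
    print("using device:", resolve_device(0, count))
```

If the count printed on Colab is 0, the runtime type is probably not set to GPU, and any hard-coded `cuda:0` would fail regardless of how many GPUs the code names.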

Supported document formats for search

Hello, are formats other than pdf and docx currently supported?
For example pandas.DataFrame, JSON, or plain text?

Thank you 🙏

Error when running web_ui

Could someone take a look? I ran the command streamlit run web_ui.py --server.fileWatcherType none --server.port 8080. The project starts, but asking a question raises this error:

2023-04-27 11:40:16.159 Uncaught app exception
Traceback (most recent call last):
  File "/root/anaconda3/envs/ds/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 565, in _run_script
    exec(code, module.__dict__)
  File "/root/DocumentSearch/web_ui.py", line 36, in <module>
    output_str, output_df = kl.search_result(input_str)
  File "/root/DocumentSearch/demo.py", line 252, in search_result
    search_table_info = pd.concat(
  File "/root/anaconda3/envs/ds/lib/python3.9/site-packages/pandas/util/_decorators.py", line 331, in wrapper
    return func(*args, **kwargs)
  File "/root/anaconda3/envs/ds/lib/python3.9/site-packages/pandas/core/reshape/concat.py", line 368, in concat
    op = _Concatenator(
  File "/root/anaconda3/envs/ds/lib/python3.9/site-packages/pandas/core/reshape/concat.py", line 425, in __init__
    raise ValueError("No objects to concatenate")
ValueError: No objects to concatenate
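
The traceback ends in `pd.concat`, which raises this ValueError when it is handed an empty list. One plausible reading, which the traceback alone cannot confirm, is that the search produced no result tables, for instance because no documents were indexed. Below is a minimal sketch of guarding that case; `concat_or_empty` and the column names are hypothetical and are not taken from demo.py.

```python
import pandas as pd


def concat_or_empty(frames, columns):
    """Concatenate DataFrames, returning an empty frame when there are none.

    pd.concat raises ValueError("No objects to concatenate") on an empty
    list, so the empty case is handled explicitly here.
    """
    frames = [f for f in frames if f is not None]
    if not frames:
        return pd.DataFrame(columns=columns)
    return pd.concat(frames, ignore_index=True)


hits = []  # stands in for a search that returned nothing
result = concat_or_empty(hits, columns=["text", "score"])
if result.empty:
    print("No matching passages; check that documents were loaded and indexed.")
```

Returning an empty frame lets the UI show a "no results" message instead of crashing, but it does not explain why the list was empty in the first place.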

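Exploring effectiveness

How does this approach differ in quality from retrieving the local knowledge documents directly with BM25? It feels like this just adds a little generation on top.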
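
For readers who want to run the comparison themselves, here is a minimal sketch of an Okapi BM25 baseline. It is not part of the project; whitespace tokenization is an assumption that suits the English toy data, and Chinese text would need a word segmenter instead. The values `k1=1.5` and `b=0.75` are commonly used defaults.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    if not docs_tokens:
        return []
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs_tokens for term in set(doc))
    scores = []
    for doc in docs_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "the cat sat on the mat".split(),
    "a feline rested on the rug".split(),
    "stock prices fell sharply today".split(),
]
scores = bm25_scores("cat mat".split(), docs)
best = max(range(len(docs)), key=scores.__getitem__)
print("best document index:", best)
```

BM25 only rewards exact token overlap, so the paraphrase in the second document gets a score of zero for this query. Embedding-based retrieval can, in principle, rank such paraphrases higher, which is the main difference to look for when comparing the two on real data.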

embedding

Why are the results much worse after I switch to sber for the embeddings?

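Knowledge base and questions are vectorized differently

The embeddings for the file contents in the knowledge base are produced with the chinese-roberta-wwm-ext embedding model,
while the embedding for the input question is produced with the THUDM/chatglm-6b chat model.
Is it reasonable to compute similarity between the two? Why not use a single model for both?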
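
The concern can be made concrete: cosine similarity is only meaningful when both vectors come from the same embedding space, and vectors from two unrelated models may not even share a dimension. The sketch below is a toy illustration under that assumption, not the project's implementation; `VectorIndex` is a hypothetical class and the three-dimensional vectors are made up.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors have different dimensions")
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


class VectorIndex:
    """Toy index that remembers which encoder produced its vectors."""

    def __init__(self, encoder_name):
        self.encoder_name = encoder_name
        self.items = []  # list of (text, vector) pairs

    def add(self, text, vector):
        self.items.append((text, vector))

    def search(self, query_vector, query_encoder_name, top_k=1):
        # Refuse queries embedded by a different model: their coordinates
        # live in a different space, so the scores would be meaningless.
        if query_encoder_name != self.encoder_name:
            raise ValueError(
                f"index built with {self.encoder_name!r}, "
                f"query embedded with {query_encoder_name!r}"
            )
        ranked = sorted(
            self.items,
            key=lambda item: cosine(query_vector, item[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:top_k]]


index = VectorIndex("chinese-roberta-wwm-ext")
index.add("passage about cats", [1.0, 0.0, 0.0])
index.add("passage about stocks", [0.0, 1.0, 0.0])
top = index.search([0.9, 0.1, 0.0], "chinese-roberta-wwm-ext")
print(top)
```

Tagging the index with the encoder name turns a silent quality problem into an explicit error, so a mismatch between document and query encoders is caught before any similarity is computed.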
