irlab-sdu / fuzi.mingcha

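Fuzi·Mingcha (夫子•明察) is a Chinese judicial large language model jointly developed by Shandong University, Inspur Cloud, and ** University of Political Science and Law. It uses ChatGLM as its base model and is trained on a large corpus of unsupervised Chinese judicial text together with supervised judicial fine-tuning data. The model supports statute retrieval, case analysis, syllogistic judgment reasoning, and judicial dialogue, and aims to provide comprehensive, high-precision legal consultation and question answering.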

License: Apache License 2.0

Python 100.00%
chatglm-6b judicial large-language-models legal legal-ai legalai llms nlp pretrained-models

fuzi.mingcha's People

Contributors

furyton, nancheng58, yi-bai-bai, zt-yao, zwh-sdu

fuzi.mingcha's Issues

ModuleNotFoundError: No module named 'transformers_modules.fuzi'

Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> from transformers import AutoTokenizer, AutoModel
>>> tokenizer = AutoTokenizer.from_pretrained("/content/drive/MyDrive/fuzi.mingcha/fuzi.mingcha-v1.0", trust_remote_code=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/tokenization_auto.py", line 719, in from_pretrained
    tokenizer_class = get_class_from_dynamic_module(class_ref, pretrained_model_name_or_path, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/dynamic_module_utils.py", line 497, in get_class_from_dynamic_module
    return get_class_in_module(class_name, final_module.replace(".py", ""))
  File "/usr/local/lib/python3.10/dist-packages/transformers/dynamic_module_utils.py", line 199, in get_class_in_module
    module = importlib.import_module(module_path)
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 992, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 992, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 992, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1004, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'transformers_modules.fuzi'
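
A likely cause, offered as an assumption rather than a confirmed diagnosis: with trust_remote_code=True, transformers derives a Python module name from the last component of the checkpoint path, so the dots in fuzi.mingcha-v1.0 are read as package separators and the import stops at transformers_modules.fuzi. The sketch below shows one possible workaround, pointing a dot-free symlink at the checkpoint directory. The helper names dot_free_name and link_without_dots are hypothetical.

```python
import os
from pathlib import Path


def dot_free_name(dirname: str) -> str:
    """Return the directory name with every dot replaced by an underscore."""
    return dirname.replace(".", "_")


def link_without_dots(model_dir: str) -> Path:
    """Create (if needed) a sibling symlink whose name has no dots and return its path."""
    src = Path(model_dir).resolve()
    alias = src.with_name(dot_free_name(src.name))
    # Only create the link when the name actually changed and nothing is there yet.
    if alias != src and not alias.exists():
        os.symlink(src, alias, target_is_directory=True)
    return alias
```

The returned path would then be passed to AutoTokenizer.from_pretrained(..., trust_remote_code=True) in place of the original one. Renaming the directory so that it contains no dots should have the same effect.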

Model loading fails: ValueError: Unrecognized configuration class <class 'transformers_modules.fuzi-mingcha-v1_0.configuration_chatglm.ChatGLMConfig'>

Hello, loading the model the way you describe fails. @Furyton @zwh-sdu The error is as follows:

>>> model = AutoModelForCausalLM.from_pretrained(p, trust_remote_code=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/data/zhaoyuhang/anaconda3/envs/fuzimingcha/lib/python3.11/site-packages/transformers/models/auto/auto_factory.py", line 487, in from_pretrained
    raise ValueError(
ValueError: Unrecognized configuration class <class 'transformers_modules.fuzi-mingcha-v1_0.configuration_chatglm.ChatGLMConfig'> for this kind of AutoModel: AutoModelForCausalLM.
Model type should be one of BartConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BlenderbotConfig, BlenderbotSmallConfig, BloomConfig, CamembertConfig, CodeGenConfig, CpmAntConfig, CTRLConfig, Data2VecTextConfig, ElectraConfig, ErnieConfig, GitConfig, GPT2Config, GPT2Config, GPTBigCodeConfig, GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig, GPTJConfig, LlamaConfig, MarianConfig, MBartConfig, MegaConfig, MegatronBertConfig, MvpConfig, OpenLlamaConfig, OpenAIGPTConfig, OPTConfig, PegasusConfig, PLBartConfig, ProphetNetConfig, QDQBertConfig, ReformerConfig, RemBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, RwkvConfig, Speech2Text2Config, TransfoXLConfig, TrOCRConfig, XGLMConfig, XLMConfig, XLMProphetNetConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, XmodConfig.
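
One possible explanation, stated as an assumption: for checkpoints loaded with trust_remote_code=True, the Auto classes that can be used are those listed under auto_map in the checkpoint's config.json, and ChatGLM-6B style checkpoints are normally loaded with AutoModel (the class imported in the first issue above) rather than AutoModelForCausalLM. The sketch below only inspects config.json with the standard library. supported_auto_classes is a hypothetical helper, and the actual auto_map contents of this checkpoint are not confirmed here.

```python
import json
from pathlib import Path


def supported_auto_classes(model_dir: str) -> list[str]:
    """List the Auto* class names declared in config.json's auto_map (empty if absent)."""
    config_path = Path(model_dir) / "config.json"
    with open(config_path, "r", encoding="utf-8") as f:
        config = json.load(f)
    # auto_map maps Auto class names to the custom classes shipped with the checkpoint.
    return sorted(config.get("auto_map", {}))
```

If AutoModelForCausalLM is missing from the returned list but AutoModel is present, loading with AutoModel.from_pretrained(p, trust_remote_code=True) is worth trying.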

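Statute retrieval is incorrect

Q: What is Article 30 of the General Provisions of the Civil Law?
A: Article 30 of the General Provisions of the Civil Law provides as follows: a civil juristic act arising from the right of revocation has no legal binding force from the beginning.
Q: Is this answer accurate?
A: This answer is not very accurate. Article 30 of the General Provisions of the Civil Law provides that a civil juristic act arising from the right of revocation has no legal binding force from the beginning; that is, a declaration of intent to revoke a contract can only be carried out under specific legal procedures and cannot be revoked at will. Therefore, in practice, whether the exercise of the right of revocation is reasonable and lawful must be judged according to the specific legal provisions and evidence.
Q: "A civil juristic act arising from the right of revocation has no legal binding force from the beginning." Which law is this provision from?
A: This provision is Article 30 of the General Provisions of the Civil Law. The General Provisions of the Civil Law were enacted to regulate civil conduct, protect the lawful rights and interests of civil subjects, and promote social harmony and stability. This provision sets out the nature and effect of the right of revocation and stipulates that the right must be exercised under specific legal procedures and cannot be exercised at will.

Actual statute: Article 30. Persons legally qualified to serve as guardians may determine the guardian by agreement among themselves. An agreement determining the guardian shall respect the true wishes of the ward.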

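Running on multiple GPUs

Hello, I tried running on multiple GPUs but have not been able to get it to work. Do you have a recommended approach?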

UnicodeDecodeError: 'gbk' codec can't decode byte 0xad in position 7: illegal multibyte sequence

Hello, I ran into the following error when starting the Singularity image:
Traceback (most recent call last):
  File "/mnt/task1/api.py", line 132, in <module>
    create_index()
  File "/mnt/task1/api.py", line 65, in create_index
    headers = next(reader)
UnicodeDecodeError: 'gbk' codec can't decode byte 0xad in position 7: illegal multibyte sequence

The file appears to be parsed with the gbk encoding, which fails. After I changed it to utf-8, it runs. That is, I changed line 63 of api.py to:
with open(path, "r", encoding="utf-8") as csvfile:

System environment:
OS: Ubuntu 22.04.3 LTS
singularity-ce version 3.8.0

What is the cause of this?
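
The traceback suggests the CSV file is not valid GBK, so the header row cannot be decoded. Whether gbk came from an explicit encoding argument or from the environment's default is not shown in the report. A minimal sketch of reading the header with an explicit encoding, in line with the reporter's fix, follows. read_csv_header is a hypothetical helper, not the project's create_index.

```python
import csv


def read_csv_header(path: str, encoding: str = "utf-8") -> list[str]:
    """Return the first row of a CSV file, decoding it with an explicit encoding."""
    # newline="" is what the csv module documentation recommends for file objects.
    with open(path, "r", encoding=encoding, newline="") as csvfile:
        reader = csv.reader(csvfile)
        return next(reader)
```

Passing the encoding explicitly makes the result independent of the locale inside the container.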

AttributeError: 'ChatGLMTokenizer' object has no attribute 'sp_tokenizer'

When I run cli_demo.py, it fails with an attribute-not-found error:
(base) root@hzhb:/data/fuzi.mingcha-main/src# python3 cli_demo.py --url_lucene_task1 "pylucene address deployed for statute retrieval" --url_lucene_task2 "pylucene address deployed for similar-case retrieval"
Loading model
Traceback (most recent call last):
  File "cli_demo.py", line 17, in <module>
    tokenizer = AutoTokenizer.from_pretrained("/data/fuzi-mingcha-v1_0", trust_remote_code=True)
  File "/usr/local/lib/python3.8/dist-packages/transformers/models/auto/tokenization_auto.py", line 755, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/transformers/tokenization_utils_base.py", line 2024, in from_pretrained
    return cls._from_pretrained(
  File "/usr/local/lib/python3.8/dist-packages/transformers/tokenization_utils_base.py", line 2256, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/fuzi-mingcha-v1_0/tokenization_chatglm.py", line 196, in __init__
    super().__init__(
  File "/usr/local/lib/python3.8/dist-packages/transformers/tokenization_utils.py", line 367, in __init__
    self._add_tokens(
  File "/usr/local/lib/python3.8/dist-packages/transformers/tokenization_utils.py", line 467, in _add_tokens
    current_vocab = self.get_vocab().copy()
  File "/root/.cache/huggingface/modules/transformers_modules/fuzi-mingcha-v1_0/tokenization_chatglm.py", line 248, in get_vocab
    vocab = {self._convert_id_to_token(i): i for i in range(self.vocab_size)}
  File "/root/.cache/huggingface/modules/transformers_modules/fuzi-mingcha-v1_0/tokenization_chatglm.py", line 244, in vocab_size
    return self.sp_tokenizer.num_tokens
AttributeError: 'ChatGLMTokenizer' object has no attribute 'sp_tokenizer'
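
The traceback shows vocab_size being read, via get_vocab, from inside the base class constructor, before sp_tokenizer exists on the object. The toy classes below reproduce that ordering problem in plain Python and show that assigning the attribute before calling super().__init__() avoids it. All class names are hypothetical stand-ins, not the real transformers or ChatGLM code. Whether this checkpoint's tokenization_chatglm.py assigns sp_tokenizer after super().__init__(), and which transformers releases call get_vocab during construction, are assumptions.

```python
class BaseTokenizer:
    """Stand-in for a base class that reads the vocabulary during construction."""

    def __init__(self):
        self.vocab_snapshot = self.get_vocab()

    def get_vocab(self):
        raise NotImplementedError


class LateInitTokenizer(BaseTokenizer):
    """Assigns sp_tokenizer after super().__init__(), so get_vocab runs too early."""

    def __init__(self, tokens):
        super().__init__()
        self.sp_tokenizer = list(tokens)

    def get_vocab(self):
        return {tok: i for i, tok in enumerate(self.sp_tokenizer)}


class EarlyInitTokenizer(BaseTokenizer):
    """Assigns sp_tokenizer first, so the base constructor can already use it."""

    def __init__(self, tokens):
        self.sp_tokenizer = list(tokens)
        super().__init__()

    def get_vocab(self):
        return {tok: i for i, tok in enumerate(self.sp_tokenizer)}


def fails_with_attribute_error(cls, tokens):
    """Return True if constructing cls(tokens) raises AttributeError."""
    try:
        cls(tokens)
    except AttributeError:
        return True
    return False
```

Under these assumptions, two things are worth trying: installing the transformers version the model was released against, or moving the sp_tokenizer assignment ahead of the super().__init__() call in the tokenizer file.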

TypeError: slice indices must be integers or None or have an __index__ method

Hello, I ran into the following error when using cli_demo.py for task 2, case-based retrieval:
Traceback (most recent call last):
  File "/home/eagle/fuzi.mingcha/src/cli_demo.py", line 133, in <module>
    main()
  File "/home/eagle/fuzi.mingcha/src/cli_demo.py", line 118, in main
    retrieval_law = retrieval_law + f"第{i + 1}条:\n{doc[-max_len / len(docs):]}\n"
TypeError: slice indices must be integers or None or have an __index__ method

Following ChatGPT's suggestion, I changed this line to:
retrieval_law = retrieval_law + f"第{i + 1}条:\n{doc[-int(max_len / len(docs)):]}\n"

After converting the value to an integer with int(), it runs. Is this a bug?

System environment:
OS: Ubuntu 22.04.3 LTS
Python 3.10.12
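
In Python 3 the / operator always returns a float, even when both operands are integers, which is why the slice in the reported line raises TypeError. A minimal sketch using floor division // instead is shown below. format_retrieved_docs is a hypothetical helper that mirrors the reported line, not the project's actual function. A budget of 0 would make doc[-0:] return the whole document, so the sketch guards that case.

```python
def format_retrieved_docs(docs: list[str], max_len: int) -> str:
    """Concatenate docs, keeping at most max_len // len(docs) trailing characters of each."""
    if not docs:
        return ""
    per_doc = max_len // len(docs)  # floor division yields an int usable as a slice index
    parts = []
    for i, doc in enumerate(docs):
        # doc[-0:] would return the whole string, so handle a zero budget explicitly.
        tail = doc[-per_doc:] if per_doc > 0 else ""
        parts.append(f"第{i + 1}条:\n{tail}\n")
    return "".join(parts)
```

For positive values, max_len // len(docs) gives the same result as the int(max_len / len(docs)) workaround in the report.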

Image file format is not recognized

After installing Singularity, running the image reports the following error:
ERROR : Unknown image format/type: /home/user/mount1/lsh/pylucene_singularity.sif
ABORT : Retval = 255
