cogltx's People

Contributors

dm-thu, hyandell, sleepychord

cogltx's Issues

Sentence-pair classification

May I ask whether this code can be used for sentence-pair classification?

transformers version incompatible

After installing transformers==2.4.1 with pip, I get the following error:
ImportError: cannot import name 'AutoModelForMaskedLM' from 'transformers'
(C:**\Anaconda3\envs\py_light\lib\site-packages\transformers\__init__.py)
So I upgraded transformers to 3.0.2, but now ROBERTA_PRETRAINED_MODEL_ARCHIVE_MAP cannot be found. The failing import is:
from transformers import BertPreTrainedModel, RobertaConfig, RobertaModel, ROBERTA_PRETRAINED_MODEL_ARCHIVE_MAP, RobertaForSequenceClassification
Does anyone know how to fix it? Thank you!
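
One possible workaround, sketched here under assumptions, is to guard the import so the code still loads on transformers versions that no longer export the constant. The helper name load_roberta_archive_map is hypothetical, and the sketch assumes the map is only needed as an optional lookup table; whether CogLTX relies on it more deeply is not confirmed by this issue.

```python
def load_roberta_archive_map():
    """Return the archive map if this transformers version exports it, else {}.

    Hypothetical helper: it also returns {} when transformers is not installed.
    """
    try:
        from transformers import ROBERTA_PRETRAINED_MODEL_ARCHIVE_MAP
    except ImportError:
        # The constant (or the whole package) is unavailable in this environment.
        return {}
    return dict(ROBERTA_PRETRAINED_MODEL_ARCHIVE_MAP)


archive_map = load_roberta_archive_map()
print(f"{len(archive_map)} archive entries available")
```

The other names in the failing import (BertPreTrainedModel, RobertaConfig, RobertaModel, RobertaForSequenceClassification) can then be imported on their own line.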

Checkpoint contains hyperparameters but IntrospectorModule's __init__ is missing the argument 'hparams'.

/home/anaconda3/envs/cogltx/lib/python3.7/site-packages/pytorch_lightning/utilities/warnings.py:18: UserWarning: The dataloader, train dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument in the `DataLoader` init to improve performance.
  warnings.warn(*args, **kwargs)
Traceback (most recent call last):
  File "run_20news.py", line 45, in <module>
    main_loop(config)
  File "/data/CogLTX-main/main_loop.py", line 57, in main_loop
    trainer.fit(introspector)
  File "/home/anaconda3/envs/cogltx/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 695, in fit
    self.load_spawn_weights(model)
  File "/home/anaconda3/envs/cogltx/lib/python3.7/site-packages/pytorch_lightning/trainer/distrib_data_parallel.py", line 373, in load_spawn_weights
    loaded_model = original_model.__class__.load_from_checkpoint(path)
  File "/home/anaconda3/envs/cogltx/lib/python3.7/site-packages/pytorch_lightning/core/lightning.py", line 1509, in load_from_checkpoint
    model = cls._load_model_state(checkpoint, *args, **kwargs)
  File "/home/anaconda3/envs/cogltx/lib/python3.7/site-packages/pytorch_lightning/core/lightning.py", line 1533, in _load_model_state
    f"Checkpoint contains hyperparameters but {cls.__name__}'s __init__ "
pytorch_lightning.utilities.exceptions.MisconfigurationException: Checkpoint contains hyperparameters but IntrospectorModule's __init__ is missing the argument 'hparams'. Are you loading the correct checkpoint?
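
The exception says that the checkpoint carries hyperparameters but the constructor has no parameter named hparams to receive them. A minimal stand-alone sketch of that kind of check follows. It uses plain Python classes instead of pytorch_lightning, and every name in it (WithoutHparams, WithHparams, accepts_hparams, restore) is hypothetical, so it illustrates the failure mode rather than the library's actual implementation.

```python
import inspect
from argparse import Namespace


class WithoutHparams:
    """Hypothetical module whose constructor cannot receive saved hyperparameters."""

    def __init__(self, config):
        self.config = config


class WithHparams:
    """Hypothetical module that accepts the saved hyperparameters as `hparams`."""

    def __init__(self, hparams):
        self.hparams = hparams


def accepts_hparams(cls):
    """True if cls.__init__ declares a parameter literally named 'hparams'."""
    return "hparams" in inspect.signature(cls.__init__).parameters


def restore(cls, saved_hparams):
    """Toy stand-in for checkpoint loading: rebuild cls from saved hyperparameters."""
    if not accepts_hparams(cls):
        raise TypeError(f"{cls.__name__}'s __init__ is missing the argument 'hparams'")
    return cls(hparams=Namespace(**saved_hparams))


model = restore(WithHparams, {"lr": 1e-4})
print(model.hparams.lr)
```

Under this reading, renaming the constructor argument of the module to hparams (or saving a checkpoint without hyperparameters) would be the two directions to try; which one fits CogLTX depends on the installed Lightning version.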

20news task: model cannot be found after running main_loop

After running main_loop there is no save_dir, but log_dir contains a checkpoints folder with .ckpt files in it.
Traceback (most recent call last):
  File "/Users/liyong/openSource/CogLTX/run_20news.py", line 58, in <module>
    for qbuf, dbuf, buf, relevance_score, ids, output in prediction(config):
  File "/Users/liyong/openSource/CogLTX/main_loop.py", line 74, in prediction
    intro_model = IntrospectorModule.load_from_checkpoint(find_lastest_checkpoint(os.path.join(config.save_dir, 'introspector', f'version_{config.version}', 'checkpoints'))).to(device).eval()
  File "/Users/liyong/opt/anaconda3/envs/cogLTX/lib/python3.6/site-packages/pytorch_lightning/core/lightning.py", line 1501, in load_from_checkpoint
    checkpoint = torch.load(checkpoint_path, map_location=lambda storage, loc: storage)
  File "/Users/liyong/opt/anaconda3/envs/cogLTX/lib/python3.6/site-packages/torch/serialization.py", line 419, in load
    f = open(f, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: '/Users/liyong/openSource/CogLTX/save_dir/introspector/version_0/checkpoints/'
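
Since the checkpoints ended up under log_dir rather than save_dir, one way to work around the error is to search several candidate directories for the newest .ckpt file. The sketch below is a hypothetical replacement for the lookup, not the repository's find_lastest_checkpoint; the function name find_latest_ckpt and the demo directory layout are assumptions made for illustration.

```python
import glob
import os
import tempfile


def find_latest_ckpt(candidate_dirs):
    """Return the most recently modified .ckpt file in the first directory that has one."""
    for directory in candidate_dirs:
        # glob returns an empty list for a directory that does not exist.
        ckpts = glob.glob(os.path.join(directory, "*.ckpt"))
        if ckpts:
            return max(ckpts, key=os.path.getmtime)
    raise FileNotFoundError(f"no .ckpt file found in any of: {candidate_dirs}")


# Demo with a throwaway layout: save_dir is missing, log_dir holds the checkpoint.
root = tempfile.mkdtemp()
missing_save_dir = os.path.join(root, "save_dir", "checkpoints")
log_ckpt_dir = os.path.join(root, "log_dir", "checkpoints")
os.makedirs(log_ckpt_dir)
with open(os.path.join(log_ckpt_dir, "epoch=1.ckpt"), "wb") as f:
    f.write(b"")

latest = find_latest_ckpt([missing_save_dir, log_ckpt_dir])
print(os.path.basename(latest))
```

In the real script the two candidates would be built from config.save_dir and the log directory, with the same introspector/version_N/checkpoints suffix shown in the traceback.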

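Multi-label classification

In the multi-label classification task, the input format is [[CLS] label [SEP] doc]. Doesn't this leak the label? Also, new data has no label at input time. Looking forward to your reply, thank you!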

run_newsQA error

This error occurs when running run_newsQA:
ipykernel_launcher.py: error: the following arguments are required: --gpus
An exception has occurred, use %tb to see the full traceback.

SystemExit: 2

Can you help me?
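
The message comes from argparse: inside a notebook, sys.argv belongs to ipykernel_launcher.py, so the required --gpus flag is never supplied. A minimal sketch of passing the arguments explicitly is shown below. The parser is a stand-in, not the one defined in run_newsQA, and treating --gpus as an integer is an assumption.

```python
import argparse

parser = argparse.ArgumentParser()
# Assumption: --gpus is an integer flag; the real script may define it differently.
parser.add_argument("--gpus", type=int, required=True)

# In a notebook, hand argparse an explicit list instead of letting it read sys.argv.
args = parser.parse_args(["--gpus", "1"])
print(args.gpus)
```

Running the script from a shell with the flag on the command line avoids the problem entirely.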

ValueError: value should be one of int, float, str, bool, or torch.Tensor

......
INFO:summarizer.preprocessing.cleaner:'pattern' package not found; tag filters are not available for English
Traceback (most recent call last):
  File "run_20news.py", line 45, in <module>
    main_loop(config)
  File "/content/main_loop.py", line 57, in main_loop
    trainer.fit(introspector)
  File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 687, in fit
    mp.spawn(self.ddp_train, nprocs=self.num_gpus, args=(model,))
  File "/usr/local/lib/python3.7/dist-packages/torch/multiprocessing/spawn.py", line 171, in spawn
    while not spawn_context.join():
  File "/usr/local/lib/python3.7/dist-packages/torch/multiprocessing/spawn.py", line 118, in join
    raise Exception(msg)
Exception:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
    fn(i, *args)
  File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/distrib_data_parallel.py", line 331, in ddp_train
    self.run_pretrain_routine(model)
  File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 757, in run_pretrain_routine
    self.logger.log_hyperparams(ref_model.hparams)
  File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/logging/base.py", line 14, in wrapped_fn
    fn(self, *args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/logging/tensorboard.py", line 88, in log_hyperparams
    self.experiment.add_hparams(hparam_dict=params, metric_dict={})
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/tensorboard/writer.py", line 292, in add_hparams
    exp, ssi, sei = hparams(hparam_dict, metric_dict)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/tensorboard/summary.py", line 156, in hparams
    raise ValueError('value should be one of int, float, str, bool, or torch.Tensor')
ValueError: value should be one of int, float, str, bool, or torch.Tensor
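
The traceback shows the TensorBoard logger rejecting a hyperparameter whose value is not an int, float, str, bool, or torch.Tensor (for example a list or None). A possible workaround is to convert such values to strings before they reach the logger. The helper below is a hypothetical sketch; it leaves torch.Tensor out of the allowed types only to stay dependency-free.

```python
# Types the error message lists as acceptable, minus torch.Tensor (omitted on purpose).
ALLOWED_TYPES = (int, float, str, bool)


def sanitize_hparams(hparams):
    """Return a copy of hparams in which unsupported values are replaced by str(value)."""
    clean = {}
    for key, value in hparams.items():
        clean[key] = value if isinstance(value, ALLOWED_TYPES) else str(value)
    return clean


print(sanitize_hparams({"lr": 0.1, "gpus": [0, 1], "path": None}))
```

Where to apply it depends on how the config object is turned into hparams in the script, which this issue does not show.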

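Does using the labels (qbuf) in prediction risk label leakage?

When metrics are evaluated on the test set, mem_replay (main_loop.py line 75) is called with qbuf (the test-set labels) to select the key blocks, which are then fed to the reasoner for task-specific inference. Isn't this cheating? Isn't there a risk of label leakage? Since the judge and the reasoner are jointly trained, why does the judge still need the labels to initialize the Z matrix at prediction time? If I apply this in an industrial setting where my real data has no labels, how do I run mem_replay?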

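How should the batch size be set?

Hello! I have some questions about the batch size.
Given the buffer data structure specific to this paper, the Trainer is initialized without a dataloader for the QA task, so where is the batch size set?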

Multilabel classification

Hi,

I would like to ask whether it is possible to use your code for multilabel classification. If so, how does the unsupervised learning of which sentences in a text are relevant to the labels work in this case?

I'm looking forward to your reply.
Many thanks,

Caching the score of Judge during training Judge?

If I understand the code correctly, the end of training_step in the introspector code updates the Judge's scores, as suggested by the algorithm in the paper: "can be replaced by cached scores during training judge". However, training_step() processes one batch, not one epoch, and the Judge's weights keep changing from batch to batch. Shouldn't the Judge's scores be computed after the Judge has finished training for the whole epoch?
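
To make the concern concrete, here is a toy sketch, not the CogLTX code: the "judge" is a single scalar weight, a block's score is weight times feature, and the update rule is a stand-in for a gradient step. All names are hypothetical. It contrasts scores cached right after each batch with scores recomputed once the epoch is over.

```python
def train_epoch(features, batches, lr=0.1, refresh_at_end=False):
    """Toy epoch: update a scalar weight per batch and cache each block's score."""
    weight = 0.0
    cache = {}
    for batch in batches:
        for block_id in batch:
            weight += lr * features[block_id]  # stand-in for a gradient step
        for block_id in batch:
            # Score cached with the weight as it is right after this batch.
            cache[block_id] = weight * features[block_id]
    if refresh_at_end:
        # Recompute every score with the final weight of the epoch.
        cache = {block_id: weight * f for block_id, f in features.items()}
    return weight, cache


features = {"a": 1.0, "b": 2.0}
batches = [["a"], ["b"]]
_, per_batch = train_epoch(features, batches)
_, per_epoch = train_epoch(features, batches, refresh_at_end=True)
print(per_batch["a"], per_epoch["a"])
```

In this toy setting the score cached for block "a" is stale by the end of the epoch, while the block from the last batch matches the refreshed value. Whether that staleness matters in practice is exactly what the question asks the authors.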

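run_20news accuracy is only 0.81

The code provided by the authors would not run as published. After repeated debugging and fixing the problems in the code, I finally got it to run, but the 20news accuracy is only 0.81, short of the 0.87 reported in the paper.
for long text: accuray 0.7934272300469484, total 1065

Could the authors fix the code, or provide a Docker image that reproduces the 0.87 result?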

pickle error

When I run run_20news.py, I get this error:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python3.6/multiprocessing/spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "/usr/lib/python3.6/multiprocessing/spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
AttributeError: Can't get attribute 'conditional_trans_classification' on <module '__mp_main__' from '/data/CogLTX/run_20news.py'>

Could someone explain why this happens?
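
Functions are pickled by name rather than by value, so a spawned child process must be able to find the function again when it re-imports the script as __mp_main__. One common cause of this error, stated here as an assumption because the script's source is not shown, is that the function is defined inside the `if __name__ == "__main__":` block, which the child never executes. The sketch below shows the usual layout (definitions at module top level, only the entry point guarded) and uses a nested function to demonstrate the related by-name limitation. All names are hypothetical.

```python
import pickle


def conditional_transform(x):
    """Top-level function: a spawned child can look it up by name after re-import."""
    return x * 2


def make_local_transform():
    """Return a nested function, which pickle cannot locate by name."""
    def local_transform(x):
        return x * 2
    return local_transform


def can_pickle(obj):
    """Report whether pickle.dumps accepts obj."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError):
        return False


if __name__ == "__main__":
    # Only the entry point is guarded; the definitions above stay importable.
    print(can_pickle(conditional_transform), can_pickle(make_local_transform()))
```

Moving conditional_trans_classification above the guard in run_20news.py would be the corresponding change to try.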

run_20news.py run problem

Hello,
I ran the code with run_20news.py --gpus 0.
Has anyone managed to run the authors' code? I checked 17 papers that cite this one, but they only cite it for reference; none of them actually used the code. I also looked at several papers on long-document classification, but the approaches built on a pre-trained model performed poorly, so I adopted the method from this paper. Unfortunately, I can't get the program to run properly. Can someone help me?
