esca_code's Issues
code
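Hello, I am very interested in your research and would like to reproduce the model from the paper. Could you release the complete code? Thank you very much!
Question about an error when pretraining the extractor, and what a correct run should output
Hello! I am trying to reproduce the model with your code and have run into this bug: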
RuntimeError: MAGMA getrf : U(4,4) is 0, U is singular at /opt/conda/conda-bld/pytorch_1556653099582/work/aten/src/THC/generic/THCTensorMathMagma.cu:363
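I found a similar problem in NVIDIA/waveglow#63, where it is suggested that switching to a different cuDNN version may resolve it. I am running on a Linux cloud server with cuDNN 7.6.5, and the system CUDA version is CUDA Version 10.0.130 (as reported by running cat /usr/local/cuda/version.txt on the command line).
Do you know how to resolve this kind of problem? If not, could you tell me which environment you used? I would like to check whether the environment is the cause.
The command I ran: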
nohup python -u mypath/Esca_Code/BERT/src/train.py --pairwise -task ext -mode train -bert_data_path mypath/bert_data_cnndm_final/cnndm -ext_dropout 0.1 -model_path mypath/test_models/ -lr 0.01 -visible_gpus 1,2,3 -report_every 10 -save_checkpoint_steps 100 -batch_size 300 -train_steps 1000 -accum_count 2 -log_file mypath/logs/bert_extractor.log -use_interval true -warmup_steps 1000 -max_pos 512 -temp_dir mypath/bertcache/ -result_path mypath/test_results/ >> mypath/logs/nohup3.log 2>&1 &
I set up the environment following your README. The most relevant package versions:
Python 3.6.13
pip 21.3.1
wheel 0.37.0
torch 1.1.0
torchvision 0.3.0
pytorch-transformers 1.2.0
numpy 1.16.4
multiprocess 0.70.12.2
pyrouge 0.1.3
tensorboard 2.7.0
tensorboard-data-server 0.6.1
tensorboard-plugin-wit 1.8.0
tensorboardX 2.4.1
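The system CUDA version read from /usr/local/cuda/version.txt can differ from the CUDA and cuDNN versions that the installed PyTorch build actually uses, so when comparing environments it may help to ask PyTorch directly. The following is a minimal sketch; the function name describe_environment is hypothetical and not part of Esca_Code.

```python
import sys

import torch


def describe_environment():
    """Collect the versions that PyTorch itself reports, for comparison across machines."""
    return {
        "python": sys.version.split()[0],
        "torch": torch.__version__,
        # CUDA version this PyTorch build was compiled against (None on CPU-only builds).
        "torch_cuda": torch.version.cuda,
        # cuDNN version PyTorch loads (None if cuDNN is unavailable).
        "cudnn": torch.backends.cudnn.version(),
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for name, value in describe_environment().items():
        print("{}: {}".format(name, value))
```

Posting this output next to the package list makes it easier to tell whether a version mismatch is a plausible cause.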
The full error output (I am not sure whether the log output before this point is also needed):
Traceback (most recent call last):
File "mypath/Esca_Code/BERT/src/train.py", line 182, in <module>
train_ext(args, device_id)
File "mypath/Esca_Code/BERT/src/train_extractive.py", line 206, in train_ext
train_multi_ext(args)
File "mypath/Esca_Code/BERT/src/train_extractive.py", line 49, in train_multi_ext
p.join()
File "mypath/anaconda3/envs/envesca/lib/python3.6/multiprocessing/process.py", line 124, in join
res = self._popen.wait(timeout)
File "mypath/anaconda3/envs/envesca/lib/python3.6/multiprocessing/popen_fork.py", line 50, in wait
return self.poll(os.WNOHANG if timeout == 0.0 else 0)
File "mypath/anaconda3/envs/envesca/lib/python3.6/multiprocessing/popen_fork.py", line 28, in poll
pid, sts = os.waitpid(self.pid, flag)
File "mypath/Esca_Code/BERT/src/train_extractive.py", line 106, in signal_handler
raise Exception(msg)
Exception:
-- Tracebacks above this line can probably
be ignored --
Traceback (most recent call last):
File "mypath/Esca_Code/BERT/src/train_extractive.py", line 64, in run
train_single_ext(args, device_id)
File "mypath/Esca_Code/BERT/src/train_extractive.py", line 264, in train_single_ext
trainer.train(train_iter_fct, args.train_steps)
File "mypath/Esca_Code/BERT/src/models/trainer_ext.py", line 148, in train
report_stats)
File "mypath/Esca_Code/BERT/src/models/trainer_ext.py", line 313, in _gradient_accumulation
sent_scores, mask = self.model(src, segs, clss, mask, mask_cls)
File "mypath/anaconda3/envs/envesca/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "mypath/Esca_Code/BERT/src/models/model_builder.py", line 266, in forward
sent_scores = self.cal_matrix0(sents_vec, mask_cls)
File "mypath/Esca_Code/BERT/src/models/model_builder.py", line 241, in cal_matrix0
D_ = torch.inverse(tmp_D)
RuntimeError: MAGMA getrf : U(4,4) is 0, U is singular at /opt/conda/conda-bld/pytorch_1556653099582/work/aten/src/THC/generic/THCTensorMathMagma.cu:363
mypath/anaconda3/envs/envesca/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 3 leaked semaphores to clean up at shutdown
len(cache))
mypath/anaconda3/envs/envesca/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 3 leaked semaphores to clean up at shutdown
len(cache))
mypath/anaconda3/envs/envesca/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 3 leaked semaphores to clean up at shutdown
len(cache))
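The final RuntimeError is raised by torch.inverse(tmp_D) in cal_matrix0, which indicates that tmp_D was numerically singular for at least one input, so the cause may lie in the matrix itself rather than in the cuDNN version. One possible workaround, sketched below, is to retry the inversion with a small multiple of the identity added to the diagonal. The helper name safe_inverse and the value of eps are assumptions for illustration and are not part of Esca_Code. The regularization slightly changes the numerical result, so this is a diagnostic sketch, not the author's fix.

```python
import torch


def safe_inverse(mat, eps=1e-6):
    """Invert a square matrix (or a batch of them), retrying with eps * I on failure.

    Hypothetical helper: if torch.inverse raises because the input is singular,
    eps * I is added to every matrix in the batch before inverting again.
    """
    try:
        return torch.inverse(mat)
    except RuntimeError:
        # Identity with the same size, dtype and device as the input;
        # broadcasting applies it to each matrix of a batched input.
        eye = torch.eye(mat.size(-1), dtype=mat.dtype, device=mat.device)
        return torch.inverse(mat + eps * eye)


if __name__ == "__main__":
    singular = torch.zeros(3, 3, dtype=torch.float64)
    print(safe_inverse(singular))
```

In model_builder.py the failing call would then read D_ = safe_inverse(tmp_D). Whether that is appropriate depends on how tmp_D is constructed, which this thread does not show. If the matrix only becomes singular after some training steps, it may also be worth checking the inputs for NaN values before the inversion.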