Comments (13)
It means that you need a GPU with more memory.
Multi-GPU does not work this way. Please check the relevant docs (fairseq/PyTorch).
Multi-GPU training lets you increase the batch size, but it does not help with max_seq_len in generation tasks. Basically, it assigns different training examples in a batch to different GPUs and then aggregates the results to achieve a larger effective batch size.
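To make the point above concrete, here is a rough sketch in plain Python. It assumes ordinary data parallelism, where every GPU keeps a full copy of the model and only the batch is divided. The function names and all memory figures are hypothetical and chosen only for illustration; they are not measurements from this project.

```python
def split_batch(batch, num_gpus):
    """Deal samples out to GPUs round-robin, standing in for data-parallel sharding."""
    return [batch[i::num_gpus] for i in range(num_gpus)]


def per_gpu_memory_gib(model_gib, per_sample_gib, batch_size, num_gpus):
    """Rough per-GPU memory estimate under data parallelism (illustrative only)."""
    # Each GPU only processes its slice of the batch (ceiling division)...
    shard_size = -(-batch_size // num_gpus)
    # ...but the model replica is a fixed cost paid on every GPU.
    return model_gib + per_sample_gib * shard_size


# Hypothetical numbers: an 8 GiB model replica and 1 GiB per sample.
print(split_batch(["s0", "s1", "s2", "s3"], 2))
print(per_gpu_memory_gib(8.0, 1.0, batch_size=4, num_gpus=1))
print(per_gpu_memory_gib(8.0, 1.0, batch_size=4, num_gpus=2))
# With a single sample, a second GPU changes nothing on the first one.
print(per_gpu_memory_gib(8.0, 1.0, batch_size=1, num_gpus=2))
```

Under these assumptions, adding GPUs lowers the per-GPU cost only by shrinking each shard. The memory of two cards is not pooled, so one long sequence plus the model still has to fit on a single card.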
from multi-view-seq2seq.
Please replace the path with the actual path to your pre-trained BART model.
Thank you, it works. But when I train the Multi-View model, I get this error:
OOM: Ran out of memory with exception: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 10.76 GiB total capacity; 9.78 GiB already allocated; 21.75 MiB free; 9.97 GiB reserved in total by PyTorch)
The GPUs are 2x GeForce RTX 2080 Ti with 11 GB each, and I set CUDA_VISIBLE_DEVICES=0,1,
but it didn't work.
Maybe I need a GPU with more than 16 GB? But I don't understand why two GPUs totaling 22 GB did not work.
Thanks for everything.
By the way, can this be used on a Chinese dialogue dataset? Emmm (oh, so you're from Zhejiang University).
You could do that, but you need a model pre-trained on Chinese. Maybe mBART could work.
I changed BART_PATH= PATH-TO-BART-MODEL (./bart.large/model.pt) to BART_PATH= (./bart.large/model.pt). Why does bash train_single_view.sh report train.py: error: argument --restore-file: expected one argument? How should I fix it?
That's not the right change; it should be BART_PATH=”./bart.large/model.pt”
train_single_view.sh: line 6: /content/drive/MyDrive/Multi-View-Seq2Seq/train_sh/bart.large/model.pt: Permission denied
usage: train.py [-h] [--no-progress-bar] [--log-interval N]
[--log-format {json,none,simple,tqdm}]
[--tensorboard-logdir DIR] [--seed N] [--cpu] [--fp16]
[--memory-efficient-fp16] [--fp16-no-flatten-grads]
[--fp16-init-scale FP16_INIT_SCALE]
[--fp16-scale-window FP16_SCALE_WINDOW]
[--fp16-scale-tolerance FP16_SCALE_TOLERANCE]
[--min-loss-scale D]
[--threshold-loss-scale THRESHOLD_LOSS_SCALE]
[--user-dir USER_DIR] [--empty-cache-freq EMPTY_CACHE_FREQ]
[--all-gather-list-size ALL_GATHER_LIST_SIZE] [--multi-views]
[--balance] [--lr-weight LR_WEIGHT] [--T T]
[--criterion {adaptive_loss,binary_cross_entropy,composite_loss,cross_entropy,label_smoothed_cross_entropy,label_smoothed_cross_entropy_with_alignment,legacy_masked_lm_loss,masked_lm,nat_loss,sentence_prediction,sentence_ranking}]
[--tokenizer {moses,nltk,space}]
[--bpe {fastbpe,gpt2,bert,sentencepiece,subword_nmt}]
[--optimizer {adadelta,adafactor,adagrad,adam,adamax,lamb,nag,sgd}]
[--lr-scheduler {cosine,fixed,inverse_sqrt,polynomial_decay,reduce_lr_on_plateau,tri_stage,triangular}]
[--task TASK] [--num-workers N]
[--skip-invalid-size-inputs-valid-test] [--max-tokens N]
[--max-sentences N] [--required-batch-size-multiple N]
[--dataset-impl FORMAT] [--train-subset SPLIT]
[--valid-subset SPLIT] [--validate-interval N]
[--fixed-validation-seed N] [--disable-validation]
[--max-tokens-valid N] [--max-sentences-valid N]
[--curriculum N] [--distributed-world-size N]
[--distributed-rank DISTRIBUTED_RANK]
[--distributed-backend DISTRIBUTED_BACKEND]
[--distributed-init-method DISTRIBUTED_INIT_METHOD]
[--distributed-port DISTRIBUTED_PORT] [--device-id DEVICE_ID]
[--distributed-no-spawn] [--ddp-backend {c10d,no_c10d}]
[--bucket-cap-mb MB] [--fix-batches-to-gpus]
[--find-unused-parameters] [--fast-stat-sync]
[--broadcast-buffers] [--arch ARCH] [--max-epoch N]
[--max-update N] [--clip-norm NORM] [--sentence-avg]
[--update-freq N1,N2,...,N_K] [--lr LR_1,LR_2,...,LR_N]
[--min-lr LR] [--use-bmuf] [--save-dir DIR]
[--restore-file RESTORE_FILE] [--reset-dataloader]
[--reset-lr-scheduler] [--reset-meters] [--reset-optimizer]
[--optimizer-overrides DICT] [--save-interval N]
[--save-interval-updates N] [--keep-interval-updates N]
[--keep-last-epochs N] [--keep-best-checkpoints N] [--no-save]
[--no-epoch-checkpoints] [--no-last-checkpoints]
[--no-save-optimizer-state]
[--best-checkpoint-metric BEST_CHECKPOINT_METRIC]
[--maximize-best-checkpoint-metric] [--patience N]
train.py: error: argument --restore-file: expected one argument
Still the same problem. Is there supposed to be another file path after BART_PATH=”./bart.large/model.pt”?
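One plausible reading of this error, assuming the training script passes the variable unquoted as `--restore-file $BART_PATH` (the script itself is not shown in this thread): if the assignment did not take effect, the variable expands to nothing and the flag is left without a value. The sketch below reproduces that situation with Python's standard `argparse` and `shlex`; the helper name `restore_file_from` is hypothetical.

```python
import argparse
import shlex


def restore_file_from(cmdline):
    """Parse a command line with a --restore-file flag; return its value or None on error."""
    parser = argparse.ArgumentParser(prog="train.py")
    parser.add_argument("--restore-file")  # requires exactly one value
    try:
        # shlex.split mimics shell word splitting: an empty expansion yields no token
        args = parser.parse_args(shlex.split(cmdline))
    except SystemExit:
        # argparse prints "expected one argument" to stderr and exits
        return None
    return args.restore_file


bart_path = ""  # what the flag receives if the shell assignment failed
print(restore_file_from(f"--restore-file {bart_path}"))
print(restore_file_from("--restore-file ./bart.large/model.pt"))
```

The `Permission denied` line in the output above would be consistent with this reading, since a space after `=` makes the shell treat the path as a command to run rather than as the variable's value.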
Have you downloaded the BART model? Make sure it is in the right location.
Thanks, it runs now.
| INFO | fairseq.trainer | no existing checkpoint found ”./bart.large/model.pt”
I put the downloaded model.pt file at /content/drive/MyDrive/Multi-View-Seq2Seq/train_sh/bart.large/model.pt, so why does it say it can't be found?
And it still can't be found when I switch to an absolute path...
Emmm, I had typed the file path with Chinese quotation marks...
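The log line above shows the path wrapped in curly quotation marks, which the shell treats as ordinary characters in the file name rather than as quoting. A small check along these lines can catch this before training starts. This is a sketch only: the helper names are hypothetical and the set of characters covers just the common curly quotes.

```python
# Curly quotation marks that input methods and word processors often substitute
TYPOGRAPHIC_QUOTES = "\u201c\u201d\u2018\u2019"


def has_typographic_quotes(path):
    """Return True if the path contains any curly quotation mark."""
    return any(ch in TYPOGRAPHIC_QUOTES for ch in path)


def clean_path(path):
    """Strip surrounding curly quotes, ASCII quotes, and spaces from a path string."""
    return path.strip(TYPOGRAPHIC_QUOTES + "\"' ")


# The path as it appears in the fairseq log line, curly quotes included
logged = "\u201d./bart.large/model.pt\u201d"
print(has_typographic_quotes(logged))
print(clean_path(logged))
```

In the script itself, the fix is simply to retype the quotes as plain ASCII `"` characters.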