
Comments (5)

raulpuric commented on July 21, 2024

Hmmm that doesn't seem right.

I just ran this and got the correct result:
python3 evaluate_gpt2.py --model-parallel-size 1 --num-layers 12 --hidden-size 768 --log-interval 100 --load anything --eval-batch-size 16 --num-attention-heads 12 --seq-length 1024 --max-position-embeddings 1024 --tokenizer-type GPT2BPETokenizer --text-key text --distributed-backend nccl --hidden-dropout 0.1 --attention-dropout 0.1 --fp16 --overlapping-eval 32 --cache-dir cache --load-openai --valid-data ../gpt2_staging/eval_datasets/wikitext-103/wiki.test.tokens

This gives me

 validation results on wiki | avg loss: 3.1057E+00 | ppl: 2.2326E+01 | adjusted ppl: 3.0537E+01 | token ratio: 1.1008449901248143 |
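
(For reference, the adjusted perplexity is just the per-BPE-token loss rescaled by the token ratio, i.e. tokenized tokens over original whitespace tokens, so you can recompute both numbers from the avg loss. A minimal sketch, assuming the loss is in nats:)

import math

avg_loss = 3.1057                 # avg cross-entropy per BPE token, from the log line above
token_ratio = 1.1008449901248143  # num_tokenized_tokens / num_original_tokens

ppl = math.exp(avg_loss)                         # ~22.33, matches 2.2326E+01
adjusted_ppl = math.exp(avg_loss * token_ratio)  # ~30.54, matches 3.0537E+01
print(ppl, adjusted_ppl)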

Also, for the future, I think you meant to run scripts/run_gpt2_eval.py. We just realized there's a line missing from it, but it should work after I patch it in a sec.


raulpuric commented on July 21, 2024

I think the arguments you're missing are --num-layers 12 --num-attention-heads 12
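
(Without those two flags the script presumably falls back to its default model shape, which doesn't describe the 124M GPT-2 checkpoint. As a back-of-the-envelope check, not code from the repo: the layer count and hidden size pin down the parameter count that evaluate_gpt2.py prints at build time, 124439808 for GPT-2 small as in the log further down; the head count only changes how the hidden size is split across heads, not the total.)

# Back-of-the-envelope sketch: parameter count of a GPT-2-style model.
def gpt2_param_count(n_layer=12, d_model=768, n_vocab=50257, n_ctx=1024):
    embed = n_vocab * d_model + n_ctx * d_model   # token + position embeddings
    attn = (d_model * 3 * d_model + 3 * d_model   # fused QKV weight + bias
            + d_model * d_model + d_model)        # attention output projection
    mlp = (d_model * 4 * d_model + 4 * d_model    # fc weight + bias
           + 4 * d_model * d_model + d_model)     # projection weight + bias
    layer_norms = 2 * 2 * d_model                 # two LayerNorms per block
    final_ln = 2 * d_model
    return embed + n_layer * (attn + mlp + layer_norms) + final_ln

print(gpt2_param_count())  # 124439808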


raulpuric commented on July 21, 2024

Fixed the script, in case you're interested.


mschrimpf commented on July 21, 2024

Indeed, thank you for clarifying. I was copying the command that run_gpt2_eval.py was executing, which was missing those arguments, as you said.


CatFootPrint commented on July 21, 2024

Whew! I evaluated GPT-2 with:

python evaluate_gpt2.py --model-parallel-size=1 --num-layers=12 --hidden-size=768 --vocab-size=50257 --log-interval=1000 --load=anything --eval-batch-size=16 --num-attention-heads=12 --seq-length=1024 --max-position-embeddings=1024 --tokenizer-type=GPT2BPETokenizer --text-key=text --distributed-backend=nccl --hidden-dropout=0.1 --attention-dropout=0.1 --fp16 --overlapping-eval=32 --cache-dir=cache --load-openai --valid-data=/data2/z00487393/Documents/Datasets/Wikipedia/wikitext-2-v1/wikitext-2/wiki.test.tokens

The result is as follows:

Evaluate GPT2 model
WARNING: No training data specified
using world size: 1 and model-parallel size: 1 
 > using dynamic loss scaling
> initializing model parallel with size 1
> initializing model parallel cuda seeds on global rank 0, model parallel rank 0, and data parallel rank 0 with model parallel seed: 3952 and data parallel seed: 1234
wikitext
Original Tokens: 528983, Detokenized tokens: 245566
> padded vocab (size: 50257) with 0 dummy tokens (new size: 50257)
global rank: 0 | vocab size: 50257 | eod token: 50256 | num_examples: 16531 | num_original_tokens: 245566 | num_tokenized_tokens: 528983
building GPT2 model ...
 > number of parameters: 124439808
loading openai weights
model.cpu()
Traceback (most recent call last):
  File "/data2/z00487393/Applications/Anaconda/envs/ML_pytorch/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3326, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-6f4706f7ea66>", line 4, in <module>
    runfile('/data2/z00487393/Documents/Scripts/PyTorch/Megatron/Megatron-LM-master/evaluate_gpt2.py', args=['--model-parallel-size=1', '--num-layers=12', '--hidden-size=768', '--vocab-size', '50257', '--log-interval=1000', '--load=anything', '--eval-batch-size=16', '--num-attention-heads=12', '--seq-length=1024', '--max-position-embeddings=1024', '--tokenizer-type=GPT2BPETokenizer', '--text-key=text', '--distributed-backend=nccl', '--hidden-dropout=0.1', '--attention-dropout=0.1', '--fp16', '--overlapping-eval=32', '--cache-dir=cache', '--load-openai', '--valid-data=/data2/z00487393/Documents/Datasets/Wikipedia/wikitext-2-v1/wikitext-2/wiki.test.tokens'], wdir='/data2/z00487393/Documents/Scripts/PyTorch/Megatron/Megatron-LM-master')
  File "/data2/z00487393/.pycharm_helpers/pydev/_pydev_bundle/pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "/data2/z00487393/.pycharm_helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/data2/z00487393/Documents/Scripts/PyTorch/Megatron/Megatron-LM-master/evaluate_gpt2.py", line 574, in <module>
    main()
  File "/data2/z00487393/Documents/Scripts/PyTorch/Megatron/Megatron-LM-master/evaluate_gpt2.py", line 557, in main
    gpt2model = GPT2LMHeadModel.from_pretrained(model_path, cache_dir='gpt2_weights')
  File "/data2/z00487393/Applications/Anaconda/envs/ML_pytorch/lib/python3.6/site-packages/pytorch_pretrained_bert/modeling_gpt2.py", line 423, in from_pretrained
    state_dict = torch.load(resolved_archive_file, map_location='cpu')
  File "/data2/z00487393/Applications/Anaconda/envs/ML_pytorch/lib/python3.6/site-packages/torch/serialization.py", line 386, in load
    return _load(f, map_location, pickle_module, **pickle_load_args)
  File "/data2/z00487393/Applications/Anaconda/envs/ML_pytorch/lib/python3.6/site-packages/torch/serialization.py", line 573, in _load
    result = unpickler.load()
_pickle.UnpicklingError: invalid load key, '5'.

Thank you very much for your guidance.

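(A note on the UnpicklingError above: "invalid load key" from torch.load generally means the file reaching the unpickler isn't a valid PyTorch checkpoint, most often a truncated or failed download sitting in the weights cache, here the gpt2_weights cache dir. A minimal diagnostic sketch follows; the path is a placeholder, since the real cached filename is hash-derived by the library's download cache:)

# Diagnostic sketch with a placeholder path. A legacy torch.save checkpoint
# starts with a pickle protocol-2 header (b"\x80\x02"), and the newer zip
# format starts with b"PK"; an HTML error page or a truncated download starts
# with something else, which is what produces errors like
# "invalid load key, '5'." above.
cached_file = "gpt2_weights/cached_model_file.bin"  # placeholder path

with open(cached_file, "rb") as f:
    head = f.read(4)

if head.startswith(b"PK") or head[:1] == b"\x80":
    print("Header looks like a checkpoint; the file may still be truncated.")
else:
    print(f"Unexpected header {head!r}: delete the cache and re-download.")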
