Comments (7)
dacorvo Sorry you are having an issue with your batch size > 1 test runs. While we look into the problem, could you try compiling with batch sizes of 4 and 8 and post the results to this ticket?
from transformers-neuronx.
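One way to script the requested runs is sketched below. It only reuses the `gptj_demo run --batch_size ... --n_positions 20 gpt-j-6B` invocation that appears in this thread; the helper names `build_demo_command` and `run_sweep` are hypothetical, and the sweep defaults to a dry run that prints the commands instead of launching a compile.

```python
import shlex
import subprocess


def build_demo_command(batch_size, n_positions=20, model_dir="gpt-j-6B"):
    """Build the demo invocation used in this thread for one batch size."""
    return [
        "gptj_demo", "run",
        "--batch_size", str(batch_size),
        "--n_positions", str(n_positions),
        model_dir,
    ]


def run_sweep(batch_sizes, dry_run=True):
    """Return {batch_size: exit code}; the exit code is None in dry-run mode."""
    results = {}
    for batch_size in batch_sizes:
        cmd = build_demo_command(batch_size)
        if dry_run:
            # Only show what would be executed.
            print(shlex.join(cmd))
            results[batch_size] = None
            continue
        # Assumes gptj_demo is on PATH; a non-zero code means the run failed.
        completed = subprocess.run(cmd, capture_output=True, text=True)
        results[batch_size] = completed.returncode
    return results


if __name__ == "__main__":
    run_sweep([4, 8])
```

Passing `dry_run=False` would actually launch each run, which can take a long time for a model of this size.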
Here is the failure with batch size 4:
$ gptj_demo run --batch_size 4 --n_positions 20 gpt-j-6B
running GPTJForSampling.from_pretrained
running model.to_neuron
...2023-06-27T07:21:05Z ERROR 2286 [WalrusDriver]: An exception was thrown:
--------------------------------------------------------------------------------
0# __cxa_throw in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/jobs/support/../../../starfish/lib/libwalrus.so
1# 0x00007F52C6A91E96 in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/jobs/support/../../../starfish/lib/libBIRVerifier.so
2# birverifier::checkInputMemType(bir::Instruction const&, unsigned int, llvm::SmallVector<bir::MemoryType, 3u> const&) in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/jobs/support/../../../starfish/lib/libBIRVerifier.so
3# birverifier::InstVisitor::visitInstIndirectSave(bir::InstIndirectSave&) in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/jobs/support/../../../starfish/lib/libBIRVerifier.so
4# neuronxcc::walrus::Verifier::run(bir::Module&) in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/jobs/support/../../../starfish/lib/libwalrus.so
5# neuronxcc::walrus::WalrusPass::run(std::vector<std::unique_ptr<bir::Module, std::default_delete<bir::Module> >, std::allocator<std::unique_ptr<bir::Module, std::default_delete<bir::Module> > > >&) in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/jobs/support/../../../starfish/lib/libwalrus.so
6# 0x00007F527859C3FE in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/jobs/support/../../../starfish/lib/libwalrus.so
7# run_walrus_driver(int, char**) in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/jobs/support/../../../starfish/lib/libwalrus.so
8# 0x00007F52C6AD3130 in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/jobs/support/EmbeddedWalrusDriver.cpython-38-x86_64-linux-gnu.so
9# 0x00007F527C894820 in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/jobs/WalrusDriver.cpython-38-x86_64-linux-gnu.so
10# 0x00007F527C89F35E in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/jobs/WalrusDriver.cpython-38-x86_64-linux-gnu.so
11# _PyObject_MakeTpCall in /usr/bin/python3
12# _PyObject_FastCallDict in /usr/bin/python3
13# _PyObject_Call_Prepend in /usr/bin/python3
14# 0x00007F527C8929EC in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/jobs/WalrusDriver.cpython-38-x86_64-linux-gnu.so
15# 0x00007F527C8B471E in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/jobs/WalrusDriver.cpython-38-x86_64-linux-gnu.so
16# _PyObject_MakeTpCall in /usr/bin/python3
17# _PyObject_FastCallDict in /usr/bin/python3
18# _PyObject_Call_Prepend in /usr/bin/python3
19# 0x00007F5311A50C3C in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/Job.cpython-38-x86_64-linux-gnu.so
20# 0x00007F5311A652D6 in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/Job.cpython-38-x86_64-linux-gnu.so
21# _PyObject_MakeTpCall in /usr/bin/python3
22# _PyObject_FastCallDict in /usr/bin/python3
23# _PyObject_Call_Prepend in /usr/bin/python3
24# 0x00007F5311A50C3C in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/Job.cpython-38-x86_64-linux-gnu.so
25# 0x00007F5311A60AC8 in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/Job.cpython-38-x86_64-linux-gnu.so
26# 0x00007F527C8A8BE2 in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/jobs/WalrusDriver.cpython-38-x86_64-linux-gnu.so
27# _PyObject_MakeTpCall in /usr/bin/python3
28# _PyObject_FastCallDict in /usr/bin/python3
29# _PyObject_Call_Prepend in /usr/bin/python3
30# 0x00007F5311A7DC6B in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/Pipeline.cpython-38-x86_64-linux-gnu.so
31# 0x00007F5311A80082 in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/Pipeline.cpython-38-x86_64-linux-gnu.so
32# _PyObject_MakeTpCall in /usr/bin/python3
33# _PyObject_FastCallDict in /usr/bin/python3
34# _PyObject_Call_Prepend in /usr/bin/python3
35# 0x00007F5311A50C3C in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/Job.cpython-38-x86_64-linux-gnu.so
36# 0x00007F5311A652D6 in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/Job.cpython-38-x86_64-linux-gnu.so
37# _PyObject_MakeTpCall in /usr/bin/python3
38# _PyObject_FastCallDict in /usr/bin/python3
39# _PyObject_Call_Prepend in /usr/bin/python3
40# 0x00007F5311A50C3C in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/Job.cpython-38-x86_64-linux-gnu.so
41# 0x00007F5311A60AC8 in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/Job.cpython-38-x86_64-linux-gnu.so
42# _PyObject_MakeTpCall in /usr/bin/python3
43# _PyObject_FastCallDict in /usr/bin/python3
44# _PyObject_Call_Prepend in /usr/bin/python3
45# 0x00007F5311504ECC in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/commands/CompileCommand.cpython-38-x86_64-linux-gnu.so
46# 0x00007F531153CBA9 in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/commands/CompileCommand.cpython-38-x86_64-linux-gnu.so
47# _PyObject_MakeTpCall in /usr/bin/python3
48# _PyObject_FastCallDict in /usr/bin/python3
49# _PyObject_Call_Prepend in /usr/bin/python3
50# 0x00007F531150ACD1 in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/commands/CompileCommand.cpython-38-x86_64-linux-gnu.so
51# _PyObject_MakeTpCall in /usr/bin/python3
52# _PyObject_FastCallDict in /usr/bin/python3
53# _PyObject_Call_Prepend in /usr/bin/python3
54# 0x00007F5311B5179C in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/CommandDriver.cpython-38-x86_64-linux-gnu.so
55# 0x00007F5311B5D9AA in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/CommandDriver.cpython-38-x86_64-linux-gnu.so
56# _PyObject_MakeTpCall in /usr/bin/python3
57# _PyObject_FastCallDict in /usr/bin/python3
58# _PyObject_Call_Prepend in /usr/bin/python3
59# 0x00007F5311B53CED in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/CommandDriver.cpython-38-x86_64-linux-gnu.so
60# 0x00007F5311B53EC2 in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/CommandDriver.cpython-38-x86_64-linux-gnu.so
61# 0x00007F5311B66DA2 in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/CommandDriver.cpython-38-x86_64-linux-gnu.so
62# _PyObject_MakeTpCall in /usr/bin/python3
63# _PyEval_EvalFrameDefault in /usr/bin/python3
64# _PyEval_EvalCodeWithName in /usr/bin/python3
65# PyEval_EvalCode in /usr/bin/python3
66# 0x000000000067DBF1 in /usr/bin/python3
67# 0x000000000067DC6F in /usr/bin/python3
68# 0x000000000067DD11 in /usr/bin/python3
69# PyRun_SimpleFileExFlags in /usr/bin/python3
70# Py_RunMain in /usr/bin/python3
71# Py_BytesMain in /usr/bin/python3
72# __libc_start_main in /lib/x86_64-linux-gnu/libc.so.6
73# _start in /usr/bin/python3
--------------------------------------------------------------------------------
2023-06-27T07:21:05Z ERROR 2286 [WalrusDriver]: Walrus pass: birverifier failed!
2023-06-27T07:21:05Z ERROR 2286 [WalrusDriver]: Failure Reason: === BIR verification failed ===
Reason: Expect memory location to be of type SB
Instruction: I-25457
Opcode: IndirectSave
Input index: 1
Argument AP:
Access Pattern: [[512,4],[512,1],[1,512]]
SymbolicAP
Memory Location: {_reshape_382_hlo_id_3499__mhlo.reshape_22_pftranspose_10864_set}@PSUM
...
subprocess.CalledProcessError: Command '['neuronx-cc', 'compile', '--framework=XLA', '--target=trn1', '/tmp/tmp6stffmrw/Scribable.3484.1.pb', '--output=/tmp/tmp6stffmrw/Scribable.3484.1.pb.neff', '--verbose=35']' returned non-zero exit status 1.
And for batch size 8:
$ gptj_demo run --batch_size 8 --n_positions 20 gpt-j-6B
running GPTJForSampling.from_pretrained
running model.to_neuron
...2023-06-27T07:29:21Z ERROR 2586 [WalrusDriver]: An exception was thrown:
--------------------------------------------------------------------------------
0# __cxa_throw in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/jobs/support/../../../starfish/lib/libwalrus.so
1# 0x00007F5892B11E96 in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/jobs/support/../../../starfish/lib/libBIRVerifier.so
2# birverifier::checkInputMemType(bir::Instruction const&, unsigned int, llvm::SmallVector<bir::MemoryType, 3u> const&) in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/jobs/support/../../../starfish/lib/libBIRVerifier.so
3# birverifier::InstVisitor::visitInstIndirectSave(bir::InstIndirectSave&) in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/jobs/support/../../../starfish/lib/libBIRVerifier.so
4# neuronxcc::walrus::Verifier::run(bir::Module&) in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/jobs/support/../../../starfish/lib/libwalrus.so
5# neuronxcc::walrus::WalrusPass::run(std::vector<std::unique_ptr<bir::Module, std::default_delete<bir::Module> >, std::allocator<std::unique_ptr<bir::Module, std::default_delete<bir::Module> > > >&) in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/jobs/support/../../../starfish/lib/libwalrus.so
6# 0x00007F58465233FE in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/jobs/support/../../../starfish/lib/libwalrus.so
7# run_walrus_driver(int, char**) in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/jobs/support/../../../starfish/lib/libwalrus.so
8# 0x00007F5892B53130 in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/jobs/support/EmbeddedWalrusDriver.cpython-38-x86_64-linux-gnu.so
9# 0x00007F584A99B820 in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/jobs/WalrusDriver.cpython-38-x86_64-linux-gnu.so
10# 0x00007F584A9A635E in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/jobs/WalrusDriver.cpython-38-x86_64-linux-gnu.so
11# _PyObject_MakeTpCall in /usr/bin/python3
12# _PyObject_FastCallDict in /usr/bin/python3
13# _PyObject_Call_Prepend in /usr/bin/python3
14# 0x00007F584A9999EC in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/jobs/WalrusDriver.cpython-38-x86_64-linux-gnu.so
15# 0x00007F584A9BB71E in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/jobs/WalrusDriver.cpython-38-x86_64-linux-gnu.so
16# _PyObject_MakeTpCall in /usr/bin/python3
17# _PyObject_FastCallDict in /usr/bin/python3
18# _PyObject_Call_Prepend in /usr/bin/python3
19# 0x00007F58DFB59C3C in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/Job.cpython-38-x86_64-linux-gnu.so
20# 0x00007F58DFB6E2D6 in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/Job.cpython-38-x86_64-linux-gnu.so
21# _PyObject_MakeTpCall in /usr/bin/python3
22# _PyObject_FastCallDict in /usr/bin/python3
23# _PyObject_Call_Prepend in /usr/bin/python3
24# 0x00007F58DFB59C3C in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/Job.cpython-38-x86_64-linux-gnu.so
25# 0x00007F58DFB69AC8 in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/Job.cpython-38-x86_64-linux-gnu.so
26# 0x00007F584A9AFBE2 in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/jobs/WalrusDriver.cpython-38-x86_64-linux-gnu.so
27# _PyObject_MakeTpCall in /usr/bin/python3
28# _PyObject_FastCallDict in /usr/bin/python3
29# _PyObject_Call_Prepend in /usr/bin/python3
30# 0x00007F58DFB86C6B in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/Pipeline.cpython-38-x86_64-linux-gnu.so
31# 0x00007F58DFB89082 in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/Pipeline.cpython-38-x86_64-linux-gnu.so
32# _PyObject_MakeTpCall in /usr/bin/python3
33# _PyObject_FastCallDict in /usr/bin/python3
34# _PyObject_Call_Prepend in /usr/bin/python3
35# 0x00007F58DFB59C3C in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/Job.cpython-38-x86_64-linux-gnu.so
36# 0x00007F58DFB6E2D6 in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/Job.cpython-38-x86_64-linux-gnu.so
37# _PyObject_MakeTpCall in /usr/bin/python3
38# _PyObject_FastCallDict in /usr/bin/python3
39# _PyObject_Call_Prepend in /usr/bin/python3
40# 0x00007F58DFB59C3C in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/Job.cpython-38-x86_64-linux-gnu.so
41# 0x00007F58DFB69AC8 in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/Job.cpython-38-x86_64-linux-gnu.so
42# _PyObject_MakeTpCall in /usr/bin/python3
43# _PyObject_FastCallDict in /usr/bin/python3
44# _PyObject_Call_Prepend in /usr/bin/python3
45# 0x00007F58DF60DECC in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/commands/CompileCommand.cpython-38-x86_64-linux-gnu.so
46# 0x00007F58DF645BA9 in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/commands/CompileCommand.cpython-38-x86_64-linux-gnu.so
47# _PyObject_MakeTpCall in /usr/bin/python3
48# _PyObject_FastCallDict in /usr/bin/python3
49# _PyObject_Call_Prepend in /usr/bin/python3
50# 0x00007F58DF613CD1 in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/commands/CompileCommand.cpython-38-x86_64-linux-gnu.so
51# _PyObject_MakeTpCall in /usr/bin/python3
52# _PyObject_FastCallDict in /usr/bin/python3
53# _PyObject_Call_Prepend in /usr/bin/python3
54# 0x00007F58DFC5A79C in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/CommandDriver.cpython-38-x86_64-linux-gnu.so
55# 0x00007F58DFC669AA in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/CommandDriver.cpython-38-x86_64-linux-gnu.so
56# _PyObject_MakeTpCall in /usr/bin/python3
57# _PyObject_FastCallDict in /usr/bin/python3
58# _PyObject_Call_Prepend in /usr/bin/python3
59# 0x00007F58DFC5CCED in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/CommandDriver.cpython-38-x86_64-linux-gnu.so
60# 0x00007F58DFC5CEC2 in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/CommandDriver.cpython-38-x86_64-linux-gnu.so
61# 0x00007F58DFC6FDA2 in /usr/local/lib/python3.8/dist-packages/neuronxcc/driver/CommandDriver.cpython-38-x86_64-linux-gnu.so
62# _PyObject_MakeTpCall in /usr/bin/python3
63# _PyEval_EvalFrameDefault in /usr/bin/python3
64# _PyEval_EvalCodeWithName in /usr/bin/python3
65# PyEval_EvalCode in /usr/bin/python3
66# 0x000000000067DBF1 in /usr/bin/python3
67# 0x000000000067DC6F in /usr/bin/python3
68# 0x000000000067DD11 in /usr/bin/python3
69# PyRun_SimpleFileExFlags in /usr/bin/python3
70# Py_RunMain in /usr/bin/python3
71# Py_BytesMain in /usr/bin/python3
72# __libc_start_main in /lib/x86_64-linux-gnu/libc.so.6
73# _start in /usr/bin/python3
--------------------------------------------------------------------------------
2023-06-27T07:29:21Z ERROR 2586 [WalrusDriver]: Walrus pass: birverifier failed!
2023-06-27T07:29:21Z ERROR 2586 [WalrusDriver]: Failure Reason: === BIR verification failed ===
Reason: Expect memory location to be of type SB
Instruction: I-25625
Opcode: IndirectSave
Input index: 1
Argument AP:
Access Pattern: [[512,8],[512,1],[1,512]]
SymbolicAP
Memory Location: {_reshape_382_hlo_id_3499__mhlo.reshape_22_pftranspose_10864_set}@PSUM
...
subprocess.CalledProcessError: Command '['neuronx-cc', 'compile', '--framework=XLA', '--target=trn1', '/tmp/tmp4w_o2yf2/Scribable.3484.1.pb', '--output=/tmp/tmp4w_o2yf2/Scribable.3484.1.pb.neff', '--verbose=35']' returned non-zero exit status 1.
dacorvo Thanks for posting. The error appears to be the same regardless of batch size. We are still investigating why it occurs.
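To make the comparison concrete, here is a minimal sketch that parses the two "BIR verification failed" blocks quoted in this thread and lists the fields that differ. The field text is copied from the logs above; the helper names `parse_failure`, `differing_fields`, and `memory_type` are hypothetical and not part of any Neuron tooling.

```python
FAILURE_BS4 = """\
Reason: Expect memory location to be of type SB
Instruction: I-25457
Opcode: IndirectSave
Input index: 1
Access Pattern: [[512,4],[512,1],[1,512]]
Memory Location: {_reshape_382_hlo_id_3499__mhlo.reshape_22_pftranspose_10864_set}@PSUM
"""

FAILURE_BS8 = """\
Reason: Expect memory location to be of type SB
Instruction: I-25625
Opcode: IndirectSave
Input index: 1
Access Pattern: [[512,8],[512,1],[1,512]]
Memory Location: {_reshape_382_hlo_id_3499__mhlo.reshape_22_pftranspose_10864_set}@PSUM
"""


def parse_failure(text):
    """Split 'Key: value' lines of a failure block into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def differing_fields(a, b):
    """Return the sorted keys whose values differ between two parsed blocks."""
    return sorted(k for k in a.keys() | b.keys() if a.get(k) != b.get(k))


def memory_type(fields):
    """Return the suffix after '@' in the Memory Location field."""
    return fields["Memory Location"].rsplit("@", 1)[-1]


if __name__ == "__main__":
    bs4, bs8 = parse_failure(FAILURE_BS4), parse_failure(FAILURE_BS8)
    print(differing_fields(bs4, bs8))
    print(memory_type(bs4), memory_type(bs8))
```

On the quoted logs, only the instruction ID and the access pattern differ, and the access pattern's first entry carries the batch size (4 versus 8). In both runs the verifier expects a location of type SB while the reported location is tagged `@PSUM`.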
Can you confirm this is fixed with the latest release?
dacorvo We have not made an explicit fix for this in the latest release; however, you are welcome to try it to see whether other compiler changes have had an impact.
It seems to be fixed in 0.5.58.
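A quick way to check whether an environment is at or past that version is sketched below. It assumes that "0.5.58" refers to the `transformers-neuronx` distribution, which the comment does not state explicitly, and the helper names are hypothetical. Since no explicit fix was made, passing this check is a hint rather than a guarantee.

```python
import re
from importlib import metadata

# Version reported in this thread; assumed to be a transformers-neuronx release.
FIXED_IN = "0.5.58"


def release_tuple(version):
    """Turn the leading dotted-number part of a version string into a tuple."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def has_reported_fix(installed, fixed_in=FIXED_IN):
    """True if the installed version is at or past the reported version."""
    return release_tuple(installed) >= release_tuple(fixed_in)


def installed_version(dist_name="transformers-neuronx"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("transformers-neuronx is not installed")
    else:
        print(version, "ok" if has_reported_fix(version) else "older than " + FIXED_IN)
```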