Comments (5)
This bug is caused by the wrong version of libnccl.
I solved it by reinstalling the correct version of libnccl and creating a new Python environment based on it.
from alpa.
May I ask which exact versions of Python and libnccl you used? Thanks.
yeah
python == 3.8.13
gcc == 7.5.0
nccl == libnccl.so.2.8.4
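
To compare a machine against the versions listed above, the following is a minimal sketch that asks the NCCL shared library for its own version through `ncclGetVersion`. The helper names `decode_nccl_version` and `loaded_nccl_version` are hypothetical, and the sketch assumes the library is reachable as `libnccl.so.2` on the dynamic loader path. It reports the library the loader finds, which is not necessarily the one a given build was compiled against.

```python
import ctypes


def decode_nccl_version(code: int) -> str:
    """Turn NCCL's integer version code into 'major.minor.patch'."""
    # NCCL encodes versions up to 2.8.x as major*1000 + minor*100 + patch,
    # and 2.9 onward as major*10000 + minor*100 + patch.
    if code < 10000:
        major, rest = divmod(code, 1000)
    else:
        major, rest = divmod(code, 10000)
    minor, patch = divmod(rest, 100)
    return f"{major}.{minor}.{patch}"


def loaded_nccl_version(lib_name: str = "libnccl.so.2"):
    """Return the version of the NCCL library the loader finds, or None."""
    try:
        lib = ctypes.CDLL(lib_name)
        get_version = lib.ncclGetVersion
    except (OSError, AttributeError):
        # Library not found, or it does not export ncclGetVersion.
        return None
    version = ctypes.c_int()
    # ncclGetVersion(int*) returns ncclSuccess (0) on success.
    if get_version(ctypes.byref(version)) != 0:
        return None
    return decode_nccl_version(version.value)


if __name__ == "__main__":
    print(loaded_nccl_version() or "libnccl.so.2 not found on the loader path")
```

A reported version that differs from the one the environment was set up with would be consistent with the mismatch described in this thread.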
Hi, I am running into the same issue when building from source. I don't understand how the libnccl version affects the FileNotFoundError. Is there any other solution to this?
The mirror URL is written in one of the workspace files. The file-not-found
problem does not seem to be the real cause of the error; the incorrect libnccl version is the main cause.