Comments (24)
Would you mind sharing a little more of the error log, and also your environment setup (OS version, compiler version, CUDA, etc.)?
from byteps.
CUDA Version 9.0.176
Linux version 3.10.0-862.el7.x86_64 (gcc version 4.8.5 20150623 (Red Hat 4.8.5-28) (GCC)
CentOS 7
Python3.5/TF1.9/keras
The error log is as follows:
warnings.warn(msg)
running install
running bdist_egg
running egg_info
writing byteps.egg-info/PKG-INFO
writing dependency_links to byteps.egg-info/dependency_links.txt
writing top-level names to byteps.egg-info/top_level.txt
reading manifest file 'byteps.egg-info/SOURCES.txt'
writing manifest file 'byteps.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
running build_ext
g++ -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -std=c++0x -MM -MT build/customer.o src/customer.cc >build/customer.d
g++ -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -std=c++0x -MM -MT build/postoffice.o src/postoffice.cc >build/postoffice.d
g++ -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -std=c++0x -MM -MT build/van.o src/van.cc >build/van.d
g++ -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -std=c++0x -MM -MT build/meta.pb.o src/meta.pb.cc >build/meta.pb.d
g++ -std=c++14 -msse2 -fPIC -O3 -ggdb -Wall -finline-functions -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -pthread -c src/postoffice.cc -o build/postoffice.o
g++ -std=c++14 -msse2 -fPIC -O3 -ggdb -Wall -finline-functions -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -pthread -c src/customer.cc -o build/customer.o
g++: error: unrecognized command line option ‘-std=c++14’
g++: error: unrecognized command line option ‘-std=c++14’
make: *** [build/postoffice.o] Error 1
make: *** Waiting for unfinished jobs....
make: *** [build/customer.o] Error 1
g++ -std=c++14 -msse2 -fPIC -O3 -ggdb -Wall -finline-functions -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -pthread -c src/meta.pb.cc -o build/meta.pb.o
g++: error: unrecognized command line option ‘-std=c++14’
make: *** [build/meta.pb.o] Error 1
g++ -std=c++14 -msse2 -fPIC -O3 -ggdb -Wall -finline-functions -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -pthread -c src/van.cc -o build/van.o
g++: error: unrecognized command line option ‘-std=c++14’
make: *** [build/van.o] Error 1
error: An ERROR occured while running the Makefile for the ps-lite library. Exit code: 2
===
Could this be caused by the g++ version?
from byteps.
@SCismycat Could you provide the commands that you use?
Besides, if you do the following:
cd byteps/3rdparty/ps-lite
make clean && make -j
Does it report the same error?
from byteps.
My commands were as follows:
git clone --recurse-submodules https://github.com/bytedance/byteps
ls
cd byteps/
ls
python3 setup.py install
I ran the make command; the output is as follows:
make clean && make -j
rm -rf build tests/test_connection tests/test_kv_app_multi_servers tests/test_simple_app tests/test_kv_app_multi_workers tests/test_kv_app_benchmark tests/test_kv_app tests/*.d tests/*.dSYM
find src -name "*.pb.[ch]*" -delete
/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/bin/protoc --cpp_out=./src --proto_path=./src src/meta.proto
g++ -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -std=c++0x -MM -MT build/customer.o src/customer.cc >build/customer.d
g++ -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -std=c++0x -MM -MT build/postoffice.o src/postoffice.cc >build/postoffice.d
g++ -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -std=c++0x -MM -MT build/van.o src/van.cc >build/van.d
g++ -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -std=c++0x -MM -MT build/meta.pb.o src/meta.pb.cc >build/meta.pb.d
g++ -std=c++14 -msse2 -fPIC -O3 -ggdb -Wall -finline-functions -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -pthread -c src/postoffice.cc -o build/postoffice.o
g++ -std=c++14 -msse2 -fPIC -O3 -ggdb -Wall -finline-functions -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -pthread -c src/customer.cc -o build/customer.o
g++: error: unrecognized command line option ‘-std=c++14’
make: *** [build/postoffice.o] Error 1
make: *** Waiting for unfinished jobs....
g++: error: unrecognized command line option ‘-std=c++14’
make: *** [build/customer.o] Error 1
g++ -std=c++14 -msse2 -fPIC -O3 -ggdb -Wall -finline-functions -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -pthread -c src/meta.pb.cc -o build/meta.pb.o
g++: error: unrecognized command line option ‘-std=c++14’
make: *** [build/meta.pb.o] Error 1
g++ -std=c++14 -msse2 -fPIC -O3 -ggdb -Wall -finline-functions -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -pthread -c src/van.cc -o build/van.o
g++: error: unrecognized command line option ‘-std=c++14’
make: *** [build/van.o] Error 1
rm src/meta.pb.h
from byteps.
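You could try yum install devtoolset-7. On CentOS 7 that package comes from the Software Collections repository (yum install centos-release-scl first), and the newer g++ is only active in a shell started with scl enable devtoolset-7 bash.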
from byteps.
Yes, this is gcc version problem. BytePS right now requires gcc 4.9 or above.
You can try the suggestion from @byronyi, or use our Dockerfile.
from byteps.
Marking this as "enhancement". Perhaps we could check gcc version explicitly during installation, until we support gcc 4.8.
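A check of this kind could run before the ps-lite Makefile is invoked. The following is only a sketch, not what BytePS's setup.py actually does: the helper names `parse_gcc_version`, `gcc_is_supported` and `detect_gcc_version` are hypothetical, and querying the compiler with `g++ -dumpversion` is an assumption. The 4.9 minimum is the figure stated in this thread.

```python
import re
import subprocess

MIN_GCC = (4, 9)  # minimum quoted in this thread, not read from BytePS itself


def parse_gcc_version(text):
    """Turn a version string such as '4.8.5' or '7' into a tuple of ints."""
    match = re.match(r"\s*(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError("cannot parse compiler version: %r" % text)
    return tuple(int(part) for part in match.group(1).split("."))


def gcc_is_supported(version, minimum=MIN_GCC):
    """Compare version tuples; a bare major version like (7,) counts as 7.0."""
    padded = version + (0,) * (len(minimum) - len(version))
    return padded >= minimum


def detect_gcc_version(compiler="g++"):
    """Ask the compiler for its version (hypothetical helper, not BytePS API)."""
    output = subprocess.check_output([compiler, "-dumpversion"])
    return parse_gcc_version(output.decode())
```

An installer could call `gcc_is_supported(detect_gcc_version())` and stop with a readable message instead of letting make fail on `-std=c++14`. Some GCC 7+ builds print only the major version for `-dumpversion`, which is why short tuples are padded before comparing.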
from byteps.
@byronyi @bobzhuyb @changlan I ran into a problem. I don't have sudo rights on the server, so I have to use an Anaconda virtual environment.
I installed the following gcc 7 packages with conda:
conda install -n torch1.1 -c omgarcia isl
conda install -n torch1.1 -c quantstack gcc-7
But I still get exactly the same error message.
Is there a solution for a non-root user with anaconda3?
from byteps.
@luzhilin19951120 When you say exactly the same error message, do you mean this line?
g++: error: unrecognized command line option ‘-std=c++14’
If so, can you try again with the latest master branch? Make sure that 3rdparty/ps-lite is updated as well. We recently removed the dependency on C++14.
from byteps.
@bobzhuyb Sorry, but could you explain more clearly how to make sure that 3rdparty/ps-lite is updated?
I pulled again. With the latest master, the following error message occurred:
make: *** [/home/luzhilin/software/byteps/3rdparty/ps-lite/deps/include/google/protobuf/message.h] Error 2
**error: An ERROR occured while running the Makefile for the ps-lite library. Exit code: 2**
from byteps.
To make sure your ps-lite is the latest, cd into byteps/3rdparty/ps-lite and run git log to see whether the latest commit is 52f042b.
If it is not, then you are not using the latest ps-lite.
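The same check can be scripted. This is a minimal sketch under stated assumptions: `commit_matches` and `current_commit` are hypothetical helper names, `git rev-parse HEAD` stands in for reading `git log` by eye, and 52f042b is simply the commit quoted above as the latest at the time of this comment.

```python
import subprocess


def commit_matches(full_hash, expected_prefix):
    """True if a full commit hash starts with the abbreviated hash given."""
    prefix = expected_prefix.strip().lower()
    if not prefix:
        return False  # an empty prefix would otherwise match everything
    return full_hash.strip().lower().startswith(prefix)


def current_commit(repo_dir):
    """Return the HEAD commit hash of the git checkout at repo_dir."""
    output = subprocess.check_output(["git", "rev-parse", "HEAD"], cwd=repo_dir)
    return output.decode().strip()
```

For example, `commit_matches(current_commit("byteps/3rdparty/ps-lite"), "52f042b")` would report whether the submodule is at the commit named here.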
from byteps.
@ymjiang Well, then I am on the latest commit of ps-lite.
But I still get the error above when building the ps-lite library...
from byteps.
@luzhilin19951120 Can you please show more of the error log? The information so far is rather limited.
Besides, would you mind trying gcc-4.9?
from byteps.
@ymjiang This is all of the terminal's standard output: make.log
All of the error messages have already been given above.
from byteps.
@luzhilin19951120 Then I would suggest compiling with gcc-4.9, as we suggest in the README.
from byteps.
@luzhilin19951120 I don't see any error in your make.log... Am I missing something? Can you also redirect stderr to the file?
I suggest you do a fresh clone if you don't know how to start over:
git clone --recurse-submodules https://github.com/bytedance/byteps
Also, your problem is that you can't build protobuf. That is different from the first post, and I am not sure whether it is really BytePS's own problem.
from byteps.
@ymjiang @bobzhuyb
Well, I cloned again with the submodules and changed gcc to 4.9.
When I reinstall BytePS with the following command
BYTEPS_NCCL_HOME=/usr/local/nccl_2.3.7 BYTEPS_CUDA_HOME=/usr/local/cuda-9.0/ BYTEPS_USE_RDMA=1 python setup.py install > make.log 2>&1
I still get the following error message (redirected, including both stdout and stderr):
make.log
Yes, the problem seems to be protobuf. But that is a dependency of BytePS, and there should be a reason why the build failed.
from byteps.
tensorflow/tensorflow#5017 (comment)
Have a look at this thread and its upvoted answers.
If you search the error message in your make.log, you can see a lot of related issues.
/usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found
/usr/lib64/libstdc++.so.6: version `CXXABI_1.3.8' not found
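These two messages mean that the libstdc++.so.6 found at run time is older than the one the binary was built against. As far as I know, GLIBCXX_3.4.20 and CXXABI_1.3.8 first shipped with GCC 4.9, so a system library from GCC 4.8.5 would lack them. One way to see which version tags a given library carries is to scan its bytes, much like `strings libstdc++.so.6 | grep GLIBCXX`. The sketch below is illustrative only; `symbol_versions`, `provides` and `read_library` are hypothetical names, and the default path is the one from the messages above.

```python
import re


def symbol_versions(data, prefix):
    """Collect version tags such as GLIBCXX_3.4.20 from raw library bytes."""
    pattern = re.compile(re.escape(prefix.encode()) + rb"_(\d+(?:\.\d+)*)")
    found = set()
    for match in pattern.finditer(data):
        parts = match.group(1).decode().split(".")
        found.add(tuple(int(part) for part in parts))
    return sorted(found)


def provides(data, tag):
    """True if a tag like 'GLIBCXX_3.4.20' appears in the library bytes."""
    prefix, _, number = tag.partition("_")
    wanted = tuple(int(part) for part in number.split("."))
    return wanted in symbol_versions(data, prefix)


def read_library(path="/usr/lib64/libstdc++.so.6"):
    """Read the shared library as bytes; the path is taken from the log above."""
    with open(path, "rb") as handle:
        return handle.read()
```

Calling `provides(read_library(), "GLIBCXX_3.4.20")` on the affected machine would be expected to return False. Pointing the same call at the libstdc++.so.6 bundled with a newer compiler shows whether putting that directory first on the library search path could help.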
from byteps.
@bobzhuyb Yes, it turns out to be a gcc runtime library problem. The dynamic library (libstdc++) was not updated, since I had only installed gcc 4.9 inside Anaconda.
I managed to set up a gcc 5.4 environment and the protobuf error is gone.
from byteps.
@bobzhuyb However, another problem appears when building the PyTorch plugin. I failed to locate the bug, which seems to be inside build_ext.build_extension(pytorch_lib).
The detailed log is as follows:
make.log
P.S. I installed PyTorch 1.1.0 using
conda install pytorch torchvision cudatoolkit=9.0 -c pytorch
from byteps.
byteps/common/global.cc:19:18: fatal error: numa.h: No such file or directory
Install libnuma-dev (on CentOS/RHEL the package is called numactl-devel).
Read your own log, check the error message, and install libraries based on the error message...
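For long build logs, the missing headers can be pulled out mechanically. This is a rough sketch, assuming the compiler prints the `fatal error: <header>:` prefix in English as it does in the log line quoted here. `missing_headers` and `PACKAGE_HINTS` are hypothetical names, and the only hint included is the one given in this thread.

```python
import re

# GCC reports a missing include as "fatal error: <header>: <reason>".
# The reason text may be localized, so only the header name is captured.
MISSING_HEADER = re.compile(r"fatal error: ([^\s:]+\.h(?:pp)?):")

# Header-to-package hints; only the entry mentioned in this thread is listed.
PACKAGE_HINTS = {"numa.h": "libnuma-dev"}


def missing_headers(log_text):
    """Return the missing header names in order of first appearance."""
    seen = []
    for name in MISSING_HEADER.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen
```

Running `missing_headers(open("make.log").read())` on the log in question should list `numa.h`. Standard headers without a `.h` or `.hpp` suffix are not matched by this pattern.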
from byteps.
@bobzhuyb Sorry for the trouble; I am not very familiar with C++-based packages and libraries.
Setting up the environment is a bit inconvenient for a non-root user. I may try Docker later on. I think it would be better if you could release a PyPI package, as Horovod does 😄.
from byteps.
@luzhilin19951120 We already release some pip libraries. See https://github.com/bytedance/byteps/blob/master/docs/pip-list.md
from byteps.
I believe we have addressed all the issues here, including fallback to c++11 from c++14, and providing pip packages. Closing this. Feel free to reopen.
from byteps.