Comments (24)

byronyi commented on May 13, 2024

Would you mind sharing a bit more of the error log, along with your environment setup (OS version, compiler version, CUDA, etc.)?

from byteps.

SCismycat commented on May 13, 2024

CUDA Version 9.0.176
Linux version 3.10.0-862.el7.x86_64 (gcc version 4.8.5 20150623 (Red Hat 4.8.5-28) (GCC)
CentOS 7
Python 3.5 / TensorFlow 1.9 / Keras

The error log is as follows:

 warnings.warn(msg)
running install
running bdist_egg
running egg_info
writing byteps.egg-info/PKG-INFO
writing dependency_links to byteps.egg-info/dependency_links.txt
writing top-level names to byteps.egg-info/top_level.txt
reading manifest file 'byteps.egg-info/SOURCES.txt'
writing manifest file 'byteps.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
running build_ext
g++ -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -std=c++0x -MM -MT build/customer.o src/customer.cc >build/customer.d
g++ -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -std=c++0x -MM -MT build/postoffice.o src/postoffice.cc >build/postoffice.d
g++ -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -std=c++0x -MM -MT build/van.o src/van.cc >build/van.d
g++ -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -std=c++0x -MM -MT build/meta.pb.o src/meta.pb.cc >build/meta.pb.d
g++ -std=c++14 -msse2 -fPIC -O3 -ggdb -Wall -finline-functions -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include  -pthread -c src/postoffice.cc -o build/postoffice.o
g++ -std=c++14 -msse2 -fPIC -O3 -ggdb -Wall -finline-functions -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include  -pthread -c src/customer.cc -o build/customer.o
g++: error: unrecognized command line option ‘-std=c++14’
g++: error: unrecognized command line option ‘-std=c++14’
make: *** [build/postoffice.o] Error 1
make: *** Waiting for unfinished jobs....
make: *** [build/customer.o] Error 1
g++ -std=c++14 -msse2 -fPIC -O3 -ggdb -Wall -finline-functions -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include  -pthread -c src/meta.pb.cc -o build/meta.pb.o
g++: error: unrecognized command line option ‘-std=c++14’
make: *** [build/meta.pb.o] Error 1
g++ -std=c++14 -msse2 -fPIC -O3 -ggdb -Wall -finline-functions -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include  -pthread -c src/van.cc -o build/van.o
g++: error: unrecognized command line option ‘-std=c++14’
make: *** [build/van.o] Error 1
error: An ERROR occured while running the Makefile for the ps-lite library. Exit code: 2

===
Could this be caused by the g++ version?

ymjiang commented on May 13, 2024

@SCismycat Could you provide the commands that you used?

Also, if you run the following:

cd byteps/3rdparty/ps-lite
make clean && make -j 

Does it report the same error?

SCismycat commented on May 13, 2024

My commands were as follows:

git clone --recurse-submodules https://github.com/bytedance/byteps
ls
cd byteps/
ls
python3 setup.py install

I ran the make command; the output is as follows:

 make clean && make -j
rm -rf build  tests/test_connection  tests/test_kv_app_multi_servers  tests/test_simple_app  tests/test_kv_app_multi_workers  tests/test_kv_app_benchmark  tests/test_kv_app tests/*.d tests/*.dSYM
find src -name "*.pb.[ch]*" -delete
/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/bin/protoc --cpp_out=./src --proto_path=./src src/meta.proto
g++ -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -std=c++0x -MM -MT build/customer.o src/customer.cc >build/customer.d
g++ -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -std=c++0x -MM -MT build/postoffice.o src/postoffice.cc >build/postoffice.d
g++ -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -std=c++0x -MM -MT build/van.o src/van.cc >build/van.d
g++ -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include -std=c++0x -MM -MT build/meta.pb.o src/meta.pb.cc >build/meta.pb.d
g++ -std=c++14 -msse2 -fPIC -O3 -ggdb -Wall -finline-functions -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include  -pthread -c src/postoffice.cc -o build/postoffice.o
g++ -std=c++14 -msse2 -fPIC -O3 -ggdb -Wall -finline-functions -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include  -pthread -c src/customer.cc -o build/customer.o
g++: error: unrecognized command line option ‘-std=c++14’
make: *** [build/postoffice.o] Error 1
make: *** Waiting for unfinished jobs....
g++: error: unrecognized command line option ‘-std=c++14’
make: *** [build/customer.o] Error 1
g++ -std=c++14 -msse2 -fPIC -O3 -ggdb -Wall -finline-functions -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include  -pthread -c src/meta.pb.cc -o build/meta.pb.o
g++: error: unrecognized command line option ‘-std=c++14’
make: *** [build/meta.pb.o] Error 1
g++ -std=c++14 -msse2 -fPIC -O3 -ggdb -Wall -finline-functions -I./src -I./include -I/work/leslee_work/bytedanceDL/byteps/3rdparty/ps-lite/deps/include  -pthread -c src/van.cc -o build/van.o
g++: error: unrecognized command line option ‘-std=c++14’
make: *** [build/van.o] Error 1
rm src/meta.pb.h

byronyi commented on May 13, 2024

You could try yum install devtoolset-7.

bobzhuyb commented on May 13, 2024

Yes, this is a gcc version problem. BytePS currently requires gcc 4.9 or above.

You can try the suggestion from @byronyi, or use our Dockerfile.

changlan commented on May 13, 2024

Marking this as "enhancement". Perhaps we could check the gcc version explicitly during installation, until we support gcc 4.8.
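
A check along those lines could look roughly like the sketch below. It is only an illustration, not BytePS code: the helper names (parse_gcc_version, gcc_is_new_enough, check_compiler) are hypothetical, the 4.9 minimum is taken from the comment above, and it assumes the compiler is reachable as g++ and supports -dumpversion.

```python
import re
import subprocess

# Assumption: minimum version quoted earlier in this thread (gcc 4.9).
MIN_GCC = (4, 9)


def parse_gcc_version(text):
    """Turn output such as '4.8.5' or '7' into a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    return tuple(int(part) for part in match.group(0).split("."))


def gcc_is_new_enough(version, minimum=MIN_GCC):
    """Tuple comparison: (4, 8, 5) is too old, (4, 9) and (7,) are fine."""
    return version >= minimum


def check_compiler(compiler="g++"):
    """Hypothetical pre-build hook: stop early with a readable message."""
    output = subprocess.check_output([compiler, "-dumpversion"]).decode()
    version = parse_gcc_version(output)
    if not gcc_is_new_enough(version):
        raise RuntimeError(
            "%s %s is too old; need %s or newer"
            % (compiler,
               ".".join(map(str, version)),
               ".".join(map(str, MIN_GCC)))
        )
    return version
```

Calling check_compiler() before the ps-lite Makefile runs would turn the "unrecognized command line option" failure into a single explicit message about the compiler version.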

Kylin9511 commented on May 13, 2024

@byronyi @bobzhuyb @changlan I ran into a problem. I don't have sudo rights on the server, so I have to use an Anaconda virtual environment.

I installed the following gcc 7 packages in Anaconda:

conda install -n torch1.1 -c omgarcia isl
conda install -n torch1.1 -c quantstack gcc-7

But I still get exactly the same error message.

Is there a solution for a non-root user with Anaconda3?

bobzhuyb commented on May 13, 2024

@luzhilin19951120 When you say exactly the same error message, do you mean this line?

g++: error: unrecognized command line option ‘-std=c++14’

If so, can you try again with the latest master branch? Make sure 3rdparty/ps-lite is updated as well. We recently removed the dependency on C++14.

Kylin9511 commented on May 13, 2024

@bobzhuyb Sorry, but could you clarify how to make sure 3rdparty/ps-lite is updated?

I pulled again. With the latest master, the following error occurred:

make: *** [/home/luzhilin/software/byteps/3rdparty/ps-lite/deps/include/google/protobuf/message.h] Error 2
error: An ERROR occured while running the Makefile for the ps-lite library. Exit code: 2

ymjiang commented on May 13, 2024

To make sure your ps-lite is the latest, cd into byteps/3rdparty/ps-lite and run git log to see whether the latest commit is 52f042b.

If not, then you are not using the latest ps-lite.
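
The same check can be scripted. The sketch below is a hypothetical helper, not part of BytePS; it assumes git is on the PATH and that the ps-lite submodule sits at byteps/3rdparty/ps-lite, as in the path above.

```python
import subprocess

# Abbreviated commit hash quoted in this comment.
EXPECTED_PREFIX = "52f042b"


def matches_commit(full_hash, prefix=EXPECTED_PREFIX):
    """True if a full commit hash starts with the abbreviated one."""
    return full_hash.strip().lower().startswith(prefix.lower())


def current_commit(repo_dir):
    """Return the full hash of HEAD in repo_dir via `git rev-parse HEAD`."""
    output = subprocess.check_output(
        ["git", "rev-parse", "HEAD"], cwd=repo_dir
    )
    return output.decode().strip()
```

For example, matches_commit(current_commit("byteps/3rdparty/ps-lite")) returns False when the submodule is checked out at a different commit, which usually means the submodule was not updated after pulling.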

Kylin9511 commented on May 13, 2024

@ymjiang Well, then I am on the latest commit of ps-lite.
[image]

But I still get the aforementioned error when building the ps-lite library.

ymjiang commented on May 13, 2024

@luzhilin19951120 Can you please show more of the error log? The information so far is rather limited.

Also, would you mind trying gcc-4.9?

Kylin9511 commented on May 13, 2024

@ymjiang This is all of the terminal stdout output: make.log

And all of the error messages have already been given.

ymjiang commented on May 13, 2024

@luzhilin19951120 Then I would suggest compiling with gcc-4.9, as suggested in the README.

bobzhuyb commented on May 13, 2024

@luzhilin19951120 I don't see an error in your make.log. Am I missing something? Can you also redirect stderr to the file?

I suggest you do a fresh clone if you don't know how to start over:

git clone --recurse-submodules https://github.com/bytedance/byteps

Also, your problem is that you can't build protobuf. It's different from the first post, and I am not sure whether this is really BytePS's own problem.

Kylin9511 commented on May 13, 2024

@ymjiang @bobzhuyb
Well, I cloned again with the submodules and switched gcc to 4.9.
[image]

When I reinstalled BytePS with the following command:
BYTEPS_NCCL_HOME=/usr/local/nccl_2.3.7 BYTEPS_CUDA_HOME=/usr/local/cuda-9.0/ BYTEPS_USE_RDMA=1 python setup.py install > make.log 2>&1

I still got the following error messages (redirected, including both stdout and stderr):
make.log

Yep, the problem seems to be protobuf. But it is a dependency of BytePS, and there must be a reason why the build failed.

bobzhuyb commented on May 13, 2024

tensorflow/tensorflow#5017 (comment)
Have a look at this thread and its upvoted answers.

If you search for the error messages from your make.log, you will find a lot of related issues:

/usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found
/usr/lib64/libstdc++.so.6: version `CXXABI_1.3.8' not found
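
To see which of these version tags a given libstdc++ actually provides, one option is to scan the library file for them, much like running strings on it and filtering for GLIBCXX. The following is a minimal sketch with hypothetical function names; the scan is a plain byte search rather than a real ELF parser, so treat the result as a quick hint only.

```python
import re


def symbol_versions(data, prefix="GLIBCXX"):
    """Return the sorted version tuples tagged with prefix in raw bytes."""
    pattern = re.compile(re.escape(prefix.encode()) + rb"_(\d+(?:\.\d+)+)")
    found = set()
    for match in pattern.finditer(data):
        numbers = match.group(1).decode().split(".")
        found.add(tuple(int(n) for n in numbers))
    return sorted(found)


def provides(data, wanted):
    """wanted is a tag such as 'GLIBCXX_3.4.20' or 'CXXABI_1.3.8'."""
    prefix, _, number = wanted.partition("_")
    version = tuple(int(n) for n in number.split("."))
    return version in symbol_versions(data, prefix)


def library_provides(path, wanted):
    """Read a shared library from disk and look for the wanted tag."""
    with open(path, "rb") as handle:
        return provides(handle.read(), wanted)
```

With the path from the error lines above, library_provides("/usr/lib64/libstdc++.so.6", "GLIBCXX_3.4.20") returning False would be consistent with the system libstdc++ being older than the compiler that built the binaries.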

Kylin9511 commented on May 13, 2024

@bobzhuyb Yes, it turned out to be a gcc library problem. The gcc dynamic library (libstdc++) was not updated, since I had merely installed gcc 4.9 in Anaconda.

I managed to deploy a gcc 5.4 environment and the protobuf bug is gone.

Kylin9511 commented on May 13, 2024

@bobzhuyb However, another problem appears when building the PyTorch plugin. I failed to locate the bug, which seems to be inside build_ext.build_extension(pytorch_lib).

The detailed log is as follows.
make.log

P.S. I installed PyTorch 1.1.0 using:

conda install pytorch torchvision cudatoolkit=9.0 -c pytorch

bobzhuyb commented on May 13, 2024

byteps/common/global.cc:19:18: fatal error: numa.h: No such file or directory

Install libnuma-dev.

Read your own log, check the error message, and install libraries based on the error message...
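
As a rough illustration of that workflow, the sketch below pulls the missing file names out of a build log. The function names and the PACKAGE_HINTS table are made up for this example; its only entry is the numa.h and libnuma-dev pairing mentioned above, and the pattern assumes gcc-style "fatal error: <file>:" lines.

```python
import re

# gcc prints "<source>:<line>:<col>: fatal error: <file>: <reason>".
# Only the file name is captured, so localized reasons also match.
MISSING_FILE = re.compile(r"fatal error: ([^\s:]+):")

# Hypothetical lookup table; the single entry comes from this thread.
PACKAGE_HINTS = {"numa.h": "libnuma-dev"}


def missing_headers(log_text):
    """Return the files named in fatal errors, in order, without repeats."""
    seen = []
    for match in MISSING_FILE.finditer(log_text):
        name = match.group(1)
        if name not in seen:
            seen.append(name)
    return seen


def suggest_packages(log_text, hints=PACKAGE_HINTS):
    """Map each missing file to a package name, or None if unknown."""
    return {name: hints.get(name) for name in missing_headers(log_text)}
```

Running suggest_packages(open("make.log").read()) on a redirected build log lists each missing header next to a candidate package, with None for headers the table does not know about.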

Kylin9511 commented on May 13, 2024

@bobzhuyb Sorry for the trouble; I am not very familiar with C++-based packages and libraries.

The environment setup seems a little inconvenient for a non-root user. I may try Docker later on. I think it would be better if you could release a PyPI package, like Horovod does 😄.

ymjiang commented on May 13, 2024

@luzhilin19951120 We already release some pip packages. See https://github.com/bytedance/byteps/blob/master/docs/pip-list.md

bobzhuyb commented on May 13, 2024

I believe we have addressed all the issues here, including falling back from C++14 to C++11 and providing pip packages. Closing this. Feel free to reopen.
