dicecco1 / fpga_caffe
License: Other
Hi,
Does this OpenCL version of Caffe work with FPGAs?
ERROR: No program executable for device
*** Aborted at 1521319025 (unix time) try "date -d @1521319025" if you are using GNU date ***
PC: @ 0x7fd621207954 xocl::detail::event::validOrError()
*** SIGSEGV (@0x20) received by PID 21575 (TID 0x7fd62487db00) from PID 32; stack trace: ***
@ 0x7fd622e5c4b0 (unknown)
@ 0x7fd621207954 xocl::detail::event::validOrError()
@ 0x7fd62120717c clWaitForEvents
@ 0x7fd624335256 caffe::OCLPoolingHWCNLayer<>::launchKernel()
@ 0x7fd62433545e caffe::OCLPoolingHWCNLayer<>::Forward_ocl()
@ 0x7fd6243dd35c caffe::Net<>::ForwardFromTo()
@ 0x7fd6243dd707 caffe::Net<>::Forward()
@ 0x7fd6244345b2 caffe::Solver<>::Test()
@ 0x7fd624434f7e caffe::Solver<>::TestAll()
@ 0x7fd624437607 caffe::Solver<>::Step()
@ 0x7fd62443784a caffe::Solver<>::Solve()
@ 0x40ada9 train()
@ 0x40719e main
@ 0x7fd622e47830 __libc_start_main
@ 0x407a29 _start
@ 0x0 (unknown)
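For what it's worth, "ERROR: No program executable for device" usually means the OpenCL runtime had no .xclbin to load, so the event later handed to clWaitForEvents is invalid and the wait segfaults. A minimal pre-flight check, assuming the xclbin path that other issues in this thread use (treat it as an assumption for your own build layout):

```python
import os

# Hypothetical path -- the one mentioned elsewhere in these issues; adjust it
# to wherever your build places the compiled kernel binary.
xclbin = ".build_release/opencl/src/caffe/layers/crp_layer_hwcn_cpfp.xclbin"
# If this prints False, the runtime has no binary to load and any kernel
# launch will fail before a valid event is ever created.
print(os.path.isfile(xclbin))
```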
Please use the caffe-users list for usage, installation, or modeling questions, or other requests for help.
Do not post such requests to Issues. Doing so interferes with the development of Caffe.
Please read the guidelines for contributing before submitting this issue.
If you are having difficulty building Caffe or training a model, please ask the caffe-users mailing list. If you are reporting a build error that seems to be due to a bug in Caffe, please attach your build configuration (either Makefile.config or CMakeCache.txt) and the output of the make (or cmake) command.
Operating system:
Compiler:
CUDA version (if applicable):
CUDNN version (if applicable):
BLAS:
Python or MATLAB version (for pycaffe and matcaffe respectively):
hi, @dicecco1,
If I use the kernel crp_layer_hwcn_cpfp and set OCFACT to 4, how should I change the prototxt accordingly? Everything works fine if I keep the original parameters, such as pad_to=4, num_cu=16, and num_pe=4. I wonder whether we could improve performance by changing these parameters.
Thanks!
In the paper related to this work you mention a Winograd algorithm, but I can't find any information about it in this project, except for one file, which isn't used and is noted as supporting only the forward pass.
So I was wondering about the performance difference between Winograd and direct convolution. Did you remove the Winograd code because it didn't perform better than direct convolution? And could you share more details of the performance comparison?
Hi, thanks for open-sourcing. I tried to compile and run this; after "make all" I got the error below:
LD -o .build_release/lib/libcaffe.so.1.0.0
/usr/bin/ld: cannot find -lxilinxopencl
/usr/bin/ld: cannot find -llmx6.0
collect2: error: ld returned 1 exit status
Makefile:608: recipe for target '.build_release/lib/libcaffe.so.1.0.0' failed
make: *** [.build_release/lib/libcaffe.so.1.0.0] Error 1
According to your reply in issue #5, I sourced the settings64.sh file, but the error still exists.
By the way, I'm using SDAccel 2017.1 on Ubuntu 16.04.5 LTS.
Any suggestions?
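A quick sanity check for this kind of linker error, assuming the SDAccel 2017.x conventions (XILINX_SDX as the install variable and xocc as the kernel compiler): if either line below prints None, settings64.sh was not sourced in the same shell that ran make, which is the usual cause of "cannot find -lxilinxopencl".

```python
import os
import shutil

# Variable and tool names are assumptions based on SDAccel 2017.x; older
# releases may use different names.
print(os.environ.get("XILINX_SDX"))  # None -> settings64.sh not sourced here
print(shutil.which("xocc"))          # None -> SDAccel tools not on PATH
```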
I want to use the kernel crp_layer_hwcn_cpfp_16pegrp to predict the output. I have already changed the deploy file, but the output is the same for all 256 images. What could be wrong?
The output:
ILSVRC2012_val_00000001,847,76,103,50,51,65
ILSVRC2012_val_00000002,847,76,103,50,51,970
ILSVRC2012_val_00000003,847,76,103,50,51,230
ILSVRC2012_val_00000004,847,76,103,50,51,809
ILSVRC2012_val_00000005,847,76,103,50,51,516
ILSVRC2012_val_00000006,847,76,103,50,51,57
ILSVRC2012_val_00000007,847,76,103,50,51,334
ILSVRC2012_val_00000008,847,76,103,50,51,415
ILSVRC2012_val_00000009,847,76,103,50,51,674
ILSVRC2012_val_00000010,847,76,103,50,51,332
ILSVRC2012_val_00000011,847,76,103,50,51,109
ILSVRC2012_val_00000012,847,76,103,50,51,286
ILSVRC2012_val_00000013,847,76,103,50,51,370
ILSVRC2012_val_00000014,847,76,103,50,51,757
ILSVRC2012_val_00000015,847,76,103,50,51,595
ILSVRC2012_val_00000016,847,76,103,50,51,147
ILSVRC2012_val_00000017,847,76,103,50,51,108
ILSVRC2012_val_00000018,847,76,103,50,51,23
ILSVRC2012_val_00000019,847,76,103,50,51,478
ILSVRC2012_val_00000020,847,76,103,50,51,517
ILSVRC2012_val_00000021,847,76,103,50,51,334
ILSVRC2012_val_00000022,847,76,103,50,51,173
ILSVRC2012_val_00000023,847,76,103,50,51,948
The last column is the label data.
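The symptom above can be confirmed mechanically: each row is the filename, five predicted class ids, and the ground-truth label, so counting the distinct top-5 tuples shows whether the network is producing degenerate output. A minimal sketch over a few of the rows:

```python
# Rows copied from the output above: filename, 5 predicted class ids, label.
rows = [
    "ILSVRC2012_val_00000001,847,76,103,50,51,65",
    "ILSVRC2012_val_00000002,847,76,103,50,51,970",
    "ILSVRC2012_val_00000003,847,76,103,50,51,230",
]
# Collect the distinct top-5 prediction tuples (fields 1..5 of each row).
top5 = {tuple(r.split(",")[1:6]) for r in rows}
print(len(top5))  # 1 -> every image received identical predictions
```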
Hi @dicecco1 ,
I came across this work on the SDAccel forum and read your paper, thanks for open-sourcing!
After tweaking the Makefile I was able to finish building the code.
make runtest
passed the tests below in SDAccel's sw_emu mode, using the sw xclbins:
oclConvolutionLayerTest/1
oclConvolutionLayerTest/3
OCLPoolingLayerTest/1
OCLPoolingLayerTest/3
but it also failed some tests, like:
[----------] 6 tests from BlobSimpleTest/1, where TypeParam = double
[ RUN ] BlobSimpleTest/1.TestInitialization
[ OK ] BlobSimpleTest/1.TestInitialization (0 ms)
[ RUN ] BlobSimpleTest/1.TestPointersCPUOCL
src/caffe/test/test_blob.cpp:47: Failure
Value of: this->blob_preshaped_->ocl_data()
Actual: false
Expected: true
src/caffe/test/test_blob.cpp:49: Failure
Value of: this->blob_preshaped_->mutable_ocl_data()
Actual: false
Expected: true
[ FAILED ] BlobSimpleTest/1.TestPointersCPUOCL, where TypeParam = double (0 ms)
another one, with the data mismatch lines repeating many times:
[----------] 3 tests from OCLLRNLayerTest/2, where TypeParam = caffe::GPUDevice<float>
[ RUN ] OCLLRNLayerTest/2.TestForwardAcrossChannelsLRN2
src/caffe/test/test_lrn_layer.cpp:614: Failure
The difference between this->blob_top_->cpu_data()[i] and top_reference.cpu_data()[i] is 0.10926561057567596, which exceeds this->epsilon_, where
this->blob_top_->cpu_data()[i] evaluates to 0,
top_reference.cpu_data()[i] evaluates to -0.10926561057567596, and
this->epsilon_ evaluates to 9.9999997473787516e-06.
...
[ FAILED ] OCLLRNLayerTest/2.TestForwardAcrossChannelsLRN2, where TypeParam = caffe::GPUDevice<float> (11173 ms)
[ RUN ] OCLLRNLayerTest/2.TestForwardAcrossChannelsLRN1
*** Aborted at 1501926253 (unix time) try "date -d @1501926253" if you are using GNU date ***
PC: @ 0x7f762f241f54 clWaitForEvents
*** SIGSEGV (@0x0) received by PID 31881 (TID 0x7f7621f11720) from PID 0; stack trace: ***
@ 0x3d0dc0f7e0 (unknown)
@ 0x7f762f241f54 clWaitForEvents
@ 0x7f762e1e23d1 caffe::OCLLRNLayer<>::Call_ocl()
...
and then the test run ended with the segmentation fault.
Any ideas? The SDAccel version was 2016.1.
I'm using CentOS 7 and SDAccel 2017.1.
I tried to compile following your instructions in #3 (comment).
After the command "make all", I got a problem like this:
...
CXX src/gtest/gtest-all.cpp
LD -o .build_release/lib/libcaffe.so.1.0.0
/bin/ld: cannot find -lxilinxopencl
/bin/ld: cannot find -llmx6.0
/bin/ld: cannot find -lcblas
/bin/ld: cannot find -latlas
collect2: error: ld returned 1 exit status
make: *** [.build_release/lib/libcaffe.so.1.0.0] Error 1
So, where can I find libxilinxopencl?
Or could you please give a more detailed installation tutorial, for example how you modified your Makefile.config?
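A rough way to probe for the libraries the linker reports missing (the names are taken from the error lines; find_library only searches the standard system paths, so it approximates what ld sees and a miss is a hint, not proof):

```python
import ctypes.util

# xilinxopencl and lmx6.0 ship with the SDAccel runtime; cblas/atlas come from
# the system BLAS dev packages (on CentOS 7 that is typically atlas-devel --
# an assumption about your setup).
for name in ("xilinxopencl", "lmx6.0", "cblas", "atlas"):
    print(name, ctypes.util.find_library(name))
```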
Do you have any plans to accelerate Caffe with an Intel Arria 10 FPGA?
I couldn't get the build working in the last issue, but after the following changes I compiled the fpga_caffe project successfully.
Then I ran "make testfpga", which was OK.
Next I tried to check whether testfpga works, so I went to ./build_release/testfpga and ran the command below. It seems that only the PCIeBandwidthTest passes; the others all fail with the error "ERROR: entry in lengths[i] is zero" and end with "Segmentation fault (core dumped)".
[root@localhost testfpga]# XCL_EMULATION_MODE=true ./test_all.bin
[==========] Running 25 tests from 9 test cases.
[----------] Global test environment set-up.
[----------] 6 tests from PCIeBandwidthTest/0, where TypeParam = OCLDevice<float>
[ RUN ] PCIeBandwidthTest/0.TestSetup
[ OK ] PCIeBandwidthTest/0.TestSetup (5 ms)
[ RUN ] PCIeBandwidthTest/0.TestBurst
[ OK ] PCIeBandwidthTest/0.TestBurst (60 ms)
[ RUN ] PCIeBandwidthTest/0.TestBurst2
[ OK ] PCIeBandwidthTest/0.TestBurst2 (71 ms)
[ RUN ] PCIeBandwidthTest/0.TestPadBurst
[ OK ] PCIeBandwidthTest/0.TestPadBurst (138 ms)
[ RUN ] PCIeBandwidthTest/0.TestByChannel
[ OK ] PCIeBandwidthTest/0.TestByChannel (85 ms)
[ RUN ] PCIeBandwidthTest/0.TestLocalBandwidth
ERROR: entry in lengths[i] is zero
Segmentation fault (core dumped)
After that, I ran "make test" to generate the common tests, and then ran "make runtest".
Everything seemed all right until it reached "BlobSimpleTest":
[----------] 5 tests from BlobSimpleTest/0, where TypeParam = float
[ RUN ] BlobSimpleTest/0.TestPointersCPUOCL
ERROR: clCreateBuffer
*** Aborted at 1511440095 (unix time) try "date -d @1511440095" if you are using GNU date ***
PC: @ 0x7fd979c913ef clEnqueueReadWriteBuffer()
*** SIGSEGV (@0x20) received by PID 26980 (TID 0x7fd97c893940) from PID 32; stack trace: ***
@ 0x7fd9784497e0 (unknown)
@ 0x7fd979c913ef clEnqueueReadWriteBuffer()
@ 0x7fd979c9397f clEnqueueWriteBuffer
@ 0x7fd978cb7eaa caffe::SyncedMemory::ocl_data()
@ 0x7fd978ca1c51 caffe::Blob<>::ocl_data()
@ 0x5ea1c5 caffe::BlobSimpleTest_TestPointersCPUOCL_Test<>::TestBody()
@ 0x7dc1ed testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x7d4791 testing::Test::Run()
@ 0x7d4876 testing::TestInfo::Run()
@ 0x7d49b7 testing::TestCase::Run()
@ 0x7d4d1e testing::internal::UnitTestImpl::RunAllTests()
@ 0x7dbd6d testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x7d3dee testing::UnitTest::Run()
@ 0x4d5cc5 main
@ 0x7fd9780c4d1d __libc_start_main
@ 0x4d59bd (unknown)
make: *** [runtest] Segmentation fault (core dumped)
So, can you provide some suggestions? :-)
hi, @dicecco1
Recently, while studying your kernel code crp_layer_hwcn_cpfp.cpp, I have been puzzled by a few things.
First, how do you set the buffer sizes, and based on what? For example, in
Second, what does the *Fact variable mean? Like
Hi @dicecco1 ,
I want to rebuild winograd_pe.xclbin, but I can't find winograd_pe.cl in src/caffe/ocl_caffe/convolution/winograd/. Can you help update it?
Hello, I'm trying to use Python to run the AlexNet model. My code looks like this:
import caffe
MODEL_FILE = 'alexnet_four_channel_model.caffemodel'
DEPLOY_FILE = 'config/deploy.prototxt'
TEST_ROOT = 'datas/'
caffe.set_mode_cpu()
net = caffe.Net(DEPLOY_FILE, MODEL_FILE, caffe.TEST)
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))
transformer.set_raw_scale('data', 255)
transformer.set_channel_swap('data', (2, 1, 0))
net.blobs['data'].reshape(1, 3, 227, 227)
img = caffe.io.load_image('temp.jpg')
net.blobs['data'].data[...] = transformer.preprocess('data', img)
out = net.forward()
predicts = out['prob']
predict = predicts.argmax()
print(predict)
Then I found that the deploy.prototxt doesn't have a softmax layer, and that the first dim of input_shape must be 256. Does that mean the number of pictures? I tried changing it to 1, but that triggers the error
CHECK(num_ % 16 == 0);
I want to know what I need to do to run the fpga_alexnet model. So far I have compiled a crp_layer_hwcn_cpfp.xclbin file and put it into the folder .build_release/opencl/src/caffe/layers/.
What else should I do?
Thanks~
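The CHECK(num_ % 16 == 0) failure means the batch dimension must stay a multiple of 16 (the deploy file here uses 256). One workaround, which is my assumption rather than something documented in the repo, is to keep the 256-image batch and tile a single preprocessed image into it, then read back only the first result:

```python
import numpy as np

# BATCH matches the first dim of input_shape in the deploy prototxt.
BATCH = 256
# Stand-in for transformer.preprocess('data', img) -- a (C, H, W) float array.
single = np.random.rand(3, 227, 227).astype(np.float32)
# Replicate the single image across the whole batch.
batch = np.broadcast_to(single, (BATCH,) + single.shape).copy()
print(batch.shape)  # (256, 3, 227, 227)
# Then feed it to the net and keep only the first output:
# net.blobs['data'].data[...] = batch
# out = net.forward()
# prediction = out['prob'][0].argmax()
```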