Comments (12)

rgiduthuri avatar rgiduthuri commented on July 26, 2024 1

I wonder whether there is residue in ~/.cache from previous hardware or software versions.
Try removing the ~/.cache folder and repeating the same test.
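
A less destructive variant of this suggestion is to move the cache aside instead of deleting it, so it can be restored if a stale cache turns out not to be the cause. The sketch below assumes a POSIX shell; move_cache_aside is a hypothetical helper name, and only the ~/.cache path comes from the comment above.

```shell
# Hypothetical helper: rename a cache directory instead of deleting it.
# Prints the backup location on success, returns 1 if the directory is absent.
move_cache_aside() {
    dir="$1"
    if [ -d "$dir" ]; then
        backup="${dir}.bak.$$"        # suffix with the shell PID to avoid clashes
        mv "$dir" "$backup" && echo "$backup"
    else
        echo "no cache at $dir" >&2
        return 1
    fi
}
```

Calling move_cache_aside "$HOME/.cache" before re-running the test would then leave a renamed copy of the old cache next to the fresh one.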

from miopen.

reger-men avatar reger-men commented on July 26, 2024

Thank you @rgiduthuri for your answer.
Now I get 92% tests passed, 1 tests failed out of 12
The error still persists in test_bn_spatial_test

Start  4: test_bn_spatial_test
 4/12 Test  #4: test_bn_spatial_test .............***Exception: Other  0.41 sec
FAILED: /ROCm/MIOpen/src/ocl/clhelper.cpp:101: Error Building OpenCL Program in BuildProgram()
2 errors generated.

error: Clang front-end compilation failed!
Frontend phase failed compilation.
Error: Compiling CL to IR
 Build Program Failure

daniellowell avatar daniellowell commented on July 26, 2024

reger-men avatar reger-men commented on July 26, 2024

This is the output of: ./build/bin/test_bn_spatial_test

FAILED:/ROCm/MIOpen-master/src/ocl/clhelper.cpp:101: Error Building OpenCL Program in BuildProgram()
/tmp/OCL17328T3.cl:160:24: error: use of unknown builtin '__builtin_amdgcn_mov_dpp'
    *value += as_float(__builtin_amdgcn_mov_dpp(as_int(*value), 0x111, 0xF, 0xF, 0));
                       ^
/tmp/OCL17328T3.cl:168:23: error: use of unknown builtin '__builtin_amdgcn_readlane'
    *value = as_float(__builtin_amdgcn_readlane(as_int(*value), 63));
                      ^
/tmp/OCL17328T3.cl:168:23: note: did you mean '__builtin_amdgcn_mov_dpp'?
/tmp/OCL17328T3.cl:160:24: note: '__builtin_amdgcn_mov_dpp' declared here
    *value += as_float(__builtin_amdgcn_mov_dpp(as_int(*value), 0x111, 0xF, 0xF, 0));
                       ^
/home/foreman/sources/stream/opencl/compiler/clc2/ocl-headers/build/lnx64a/B_rel/opencl12_builtins.h:5511:39: note: expanded from macro 'as_float'
#define as_float(x) __builtin_astype((x), float)
                                      ^
2 errors generated.

error: Clang front-end compilation failed!
Frontend phase failed compilation.
Error: Compiling CL to IR
 Build Program Failure
Forward Train Spatial Batch Normalization: 
Input tensor: 4, 64, 28, 28
terminate called after throwing an instance of 'miopen::Exception'
  what():  /ROCm/MIOpen-master/src/ocl/clhelper.cpp:101: Error Building OpenCL Program in BuildProgram()
/tmp/OCL17328T3.cl:160:24: error: use of unknown builtin '__builtin_amdgcn_mov_dpp'
    *value += as_float(__builtin_amdgcn_mov_dpp(as_int(*value), 0x111, 0xF, 0xF, 0));
                       ^
/tmp/OCL17328T3.cl:168:23: error: use of unknown builtin '__builtin_amdgcn_readlane'
    *value = as_float(__builtin_amdgcn_readlane(as_int(*value), 63));
                      ^
/tmp/OCL17328T3.cl:168:23: note: did you mean '__builtin_amdgcn_mov_dpp'?
/tmp/OCL17328T3.cl:160:24: note: '__builtin_amdgcn_mov_dpp' declared here
    *value += as_float(__builtin_amdgcn_mov_dpp(as_int(*value), 0x111, 0xF, 0xF, 0));
                       ^
/home/foreman/sources/stream/opencl/compiler/clc2/ocl-headers/build/lnx64a/B_rel/opencl12_builtins.h:5511:39: note: expanded from macro 'as_float'
#define as_float(x) __builtin_astype((x), float)
                                      ^
2 errors generated.

error: Clang front-end compilation failed!
Frontend phase failed compilation.
Error: Compiling CL to IR
 Build Program Failure
Aborted (core dumped)

daniellowell avatar daniellowell commented on July 26, 2024

reger-men avatar reger-men commented on July 26, 2024

@daniellowell, unfortunately the error could not be bypassed by uncommenting:

//#ifdef __AMDGCN__ 
//#undef __AMDGCN__ 
//#endif

The output of clinfo is:

Number of platforms:				 1
  Platform Profile:				 FULL_PROFILE
  Platform Version:				 OpenCL 2.0 AMD-APP (2442.7)
  Platform Name:				 AMD Accelerated Parallel Processing
  Platform Vendor:				 Advanced Micro Devices, Inc.
  Platform Extensions:				 cl_khr_icd cl_amd_event_callback cl_amd_offline_devices 


  Platform Name:				 AMD Accelerated Parallel Processing
Number of devices:				 1
  Device Type:					 CL_DEVICE_TYPE_GPU
  Vendor ID:					 1002h
  Board name:					 AMD Radeon (TM) Pro WX 7100 Graphics
  Device Topology:				 PCI[ B#2, D#0, F#0 ]
  Max compute units:				 36
  Max work items dimensions:			 3
    Max work items[0]:				 256
    Max work items[1]:				 256
    Max work items[2]:				 256
  Max work group size:				 256
  Preferred vector width char:			 4
  Preferred vector width short:			 2
  Preferred vector width int:			 1
  Preferred vector width long:			 1
  Preferred vector width float:			 1
  Preferred vector width double:		 1
  Native vector width char:			 4
  Native vector width short:			 2
  Native vector width int:			 1
  Native vector width long:			 1
  Native vector width float:			 1
  Native vector width double:			 1
  Max clock frequency:				 1243Mhz
  Address bits:					 64
  Max memory allocation:			 4244635648
  Image support:				 Yes
  Max number of images read arguments:		 128
  Max number of images write arguments:		 8
  Max image 2D width:				 16384
  Max image 2D height:				 16384
  Max image 3D width:				 2048
  Max image 3D height:				 2048
  Max image 3D depth:				 2048
  Max samplers within kernel:			 16
  Max size of kernel argument:			 1024
  Alignment (bits) of base address:		 2048
  Minimum alignment (bytes) for any datatype:	 128
  Single precision floating point capability
    Denorms:					 No
    Quiet NaNs:					 Yes
    Round to nearest even:			 Yes
    Round to zero:				 Yes
    Round to +ve and infinity:			 Yes
    IEEE754-2008 fused multiply-add:		 Yes
  Cache type:					 Read/Write
  Cache line size:				 64
  Cache size:					 16384
  Global memory size:				 8349130752
  Constant buffer size:				 4244635648
  Max number of constant args:			 8
  Local memory type:				 Scratchpad
  Local memory size:				 32768
  Max pipe arguments:				 0
  Max pipe active reservations:			 0
  Max pipe packet size:				 0
  Max global variable size:			 0
  Max global variable preferred total size:	 0
  Max read/write image args:			 0
  Max on device events:				 0
  Queue on device max size:			 0
  Max on device queues:				 0
  Queue on device preferred size:		 0
  SVM capabilities:				 
    Coarse grain buffer:			 No
    Fine grain buffer:				 No
    Fine grain system:				 No
    Atomics:					 No
  Preferred platform atomic alignment:		 0
  Preferred global atomic alignment:		 0
  Preferred local atomic alignment:		 0
  Kernel Preferred work group size multiple:	 64
  Error correction support:			 0
  Unified memory for Host and Device:		 0
  Profiling timer resolution:			 1
  Device endianess:				 Little
  Available:					 Yes
  Compiler available:				 Yes
  Execution capabilities:				 
    Execute OpenCL kernels:			 Yes
    Execute native function:			 No
  Queue on Host properties:				 
    Out-of-Order:				 No
    Profiling :					 Yes
  Queue on Device properties:				 
    Out-of-Order:				 No
    Profiling :					 No
  Platform ID:					 0x7f384e1cf478
  Name:						 Ellesmere
  Vendor:					 Advanced Micro Devices, Inc.
  Device OpenCL C version:			 OpenCL C 1.2 
  Driver version:				 2442.7
  Profile:					 FULL_PROFILE
  Version:					 OpenCL 1.2 AMD-APP (2442.7)
  Extensions:					 cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_image2d_from_buffer cl_khr_spir cl_khr_gl_event 

daniellowell avatar daniellowell commented on July 26, 2024

Ellesmere is gfx803, so that should be OK. Can you run:

rocm_agent_enumerator -t all

It will give you output like:

Intel(R) Xeon(R) CPU E3-1245 v5 @ 3.50GHz
gfx000
gfx900
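
To pull just the GPU targets out of that listing, a small filter can help. This is a sketch that assumes, as the sample output suggests, that gfx000 stands for the CPU agent and that each target is printed on its own line; gpu_targets is a hypothetical name.

```shell
# Hypothetical filter: read enumerator output on stdin and keep only
# gfx target lines, dropping the gfx000 entry assumed to be the CPU agent.
gpu_targets() {
    grep -E '^gfx[0-9a-f]+$' | grep -v '^gfx000$'
}
```

It would be used as rocm_agent_enumerator -t all | gpu_targets, which for the sample above leaves only the gfx900 line.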

For some reason your front end compiler is not picking up the supported hardware instructions. I don't know if you have another OpenCL device that is being targeted by the compiler.

As for the workaround, there is actually a bug there. It looks like the guard

#ifdef __AMDGCN__
...
#endif

needs to be added in three other places in those files:

MIOpenBatchNormFwdTrainSpatial.cl
Start Line 157:

#ifdef __AMDGCN__
static inline void dppRegReduce64(_FLOAT* value, _FLOAT scale)
...
#endif

End line 216

MIOpenBatchNormFwdTrainSpatial.cl
Start Line 776:

#ifdef __AMDGCN__
    unsigned int segment = MIO_BN_GRP1 >> 6;
#endif

End line 778

MIOpenBatchNormBwdSpatial.cl
Start line 151:

#ifdef __AMDGCN__
static inline void dppRegReduce64(_FLOAT* value, _FLOAT scale)
...
#endif

You can add those manually and rebuild, or wait for my fix on the github MIOpen later today.

Daniel Lowell
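
After adding the guards by hand, a quick check like the following can confirm that every dppRegReduce64 definition sits directly under an #ifdef __AMDGCN__ line. This is only a sketch: check_amdgcn_guard is a hypothetical helper, not part of MIOpen, and it assumes the guard is on the line immediately before the function signature, as in the snippets above.

```shell
# Hypothetical check: report each dppRegReduce64 definition whose preceding
# line is not "#ifdef __AMDGCN__". Exit status is 1 if any is unguarded.
check_amdgcn_guard() {
    awk '
        BEGIN    { bad = 0 }
        FNR == 1 { prev = "" }                      # reset at each new file
        /static inline void dppRegReduce64/ {
            if (prev !~ /^#ifdef __AMDGCN__/) {
                printf "%s:%d: unguarded dppRegReduce64\n", FILENAME, FNR
                bad = 1
            }
        }
        { prev = $0 }                               # remember the previous line
        END { exit bad }
    ' "$@"
}
```

Pass it the two kernel files named above, MIOpenBatchNormFwdTrainSpatial.cl and MIOpenBatchNormBwdSpatial.cl, from wherever they live in your checkout. It does not cover the unsigned int segment line, which would need a separate check.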

reger-men avatar reger-men commented on July 26, 2024

The output of:
rocm_agent_enumerator -t all

Intel(R) Xeon(R) CPU E5-1620 v4 @ 3.50GHz
gfx000
gfx803

daniellowell avatar daniellowell commented on July 26, 2024

Hey @reger-men where did you get your OpenCL?
You should be using rocm-opencl and rocm-opencl-dev with MIOpen.
I noticed you have:

-- Found OPENCL: /usr/lib/libOpenCL.so  

in your cmake output. The install directory for rocm-opencl is /opt/rocm/opencl.
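
Since the library cmake found may itself be a symlink, it can help to resolve it and see whether it ends up under the ROCm prefix. The sketch below assumes GNU readlink (for the -f option); check_lib_origin is a hypothetical helper name.

```shell
# Hypothetical helper: resolve a library path through any symlinks and
# report whether the final target lies under the expected install prefix.
check_lib_origin() {
    lib="$1"
    prefix="$2"
    real=$(readlink -f "$lib") || return 2
    case "$real" in
        "$prefix"/*) echo "ok: $real" ;;
        *)           echo "outside $prefix: $real"; return 1 ;;
    esac
}
```

For the paths in this thread the call would be check_lib_origin /usr/lib/libOpenCL.so /opt/rocm/opencl.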

reger-men avatar reger-men commented on July 26, 2024

When I compile with: cmake -DMIOPEN_BACKEND=OpenCL -DOPENCL_LIBRARIES=/opt/rocm/opencl/lib/x86_64/libOpenCL.so -DOPENCL_INCLUDE_DIRS=/opt/rocm/opencl/include ..

I get:

FAILED: /home/lashab/Desktop/OpenVX/WorkPlace/ROCm/MIOpen-master/src/ocl/handleocl.cpp:274: clGetPlatformIDs failed. 0
...
The following tests FAILED:
	  1 - test_activation (SEGFAULT)
	  3 - test_bn_peract_test (SEGFAULT)
	  4 - test_bn_spatial_test (SEGFAULT)
	  5 - test_check_numerics_test (SEGFAULT)
	  6 - test_conv (SEGFAULT)
	  7 - test_custom_allocator (SEGFAULT)
	  8 - test_main (SEGFAULT)
	  9 - test_pooling_test (SEGFAULT)
	 10 - test_soft_max (SEGFAULT)
	 11 - test_tensor_ops (SEGFAULT)

The OpenCL lib in /usr/lib is a link to /opt/rocm/opencl/lib/x86_64/libOpenCL.so

daniellowell avatar daniellowell commented on July 26, 2024

$ cat /etc/OpenCL/vendors/amdocl64.icd 
libamdocl64.so

What do you get when you run the above command?

I would delete the symlink, remove OpenCL from your system, either manually, or using

sudo apt-get remove rocm-opencl*

Then add it back in. Your platform is not being detected correctly using the libraries on your system.
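
To see every OpenCL runtime registered with the ICD loader, not just the AMD one, the vendor directory can be listed in one go. This is a sketch: list_icds is a hypothetical helper, and it assumes each .icd file names its library on the first line, as amdocl64.icd does above.

```shell
# Hypothetical helper: print "<icd file>: <library it points to>" for every
# .icd file in the vendor directory (default: /etc/OpenCL/vendors).
list_icds() {
    dir="${1:-/etc/OpenCL/vendors}"
    for icd in "$dir"/*.icd; do
        [ -f "$icd" ] || continue     # skip the unexpanded glob if none exist
        printf '%s: %s\n' "$(basename "$icd")" "$(head -n 1 "$icd")"
    done
}
```

More than one entry, or a library you do not expect, may point to a second OpenCL installation competing with rocm-opencl.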

reger-men avatar reger-men commented on July 26, 2024

Thanks @daniellowell for your support. There were many conflicting dependencies on my operating system.
I reinstalled Ubuntu 16.04 and then installed ROCm. Now it works.

I think my mistake was that I had installed AMDAPPSDK alongside ROCm.
