Oclgrind is awesome! https://github.com/jrprice/Oclgrind Although I do
not have AMDGPU target real device llvm/lib/Target/AMDGPU but the
simulator helps me being familiar with Host to communicate
(clCreateBuffer) with Device :)
在 2017年06月28日 11:40, Leslie Zhai 写道:> Hi Tom,
>
> I found it https://clang.llvm.org/docs/FAQ.html for
> clang/include/clang/Driver/CC1Options.td
>
> 在 2017年06月27日 11:44, Leslie Zhai 写道:
>> Hi Tom,
>>
>> I found the correct mailing list finally :)
>>
>>
>> 在 2017年06月26日 19:21, Tom Stellard 写道:
>>> On 06/26/2017 04:20 AM, Leslie Zhai wrote:
>>>> Hi Tom,
>>>>
>>>> Thanks for your kind response!
>>>>
>>>>> clang -include /path/to/libclc/headers/clc.h -I
>>>>> /path/to/libclc/headers -Dcl_clang_storage_class_specifiers
>>>>> -target amdgcn--amdhsa -mcpu=carrizo $INPUT_FILE -o
$OUTPUT_FILE
>>>> When I tried to build cos.cl testcase
>>>> https://github.com/llvm-mirror/libclc/blob/master/test/cos.cl
>>>>
>>>> $ clang -include clc/clc.h -Dcl_clang_storage_class_specifiers
>>>> -target amdgcn--amdhsa -mcpu=carrizo cos.cl -o cos.out
>>>>
>>>> cos.cl:1:27: error: pointer arguments to kernel functions must
>>>> reside in '__global', '__constant' or
>>>> '__local' address space
>>>> __kernel void foo(float4 *f) {
>>>> ^
>>>> 1 error generated.
>>>>
>>>> Then I added __local before float4 *f like this:
>>>>
>>>> Index: cos.cl
>>>>
==================================================================>>>>
--- cos.cl (revision 306265)
>>>> +++ cos.cl (working copy)
>>>> @@ -1,3 +1,3 @@
>>>> -__kernel void foo(float4 *f) {
>>>> +__kernel void foo(__local float4 *f) {
>>>> *f = cos(*f);
>>>> }
>>>>
>>>> because [OpenCL] Improve address space diagnostics
>>>> https://reviews.llvm.org/D27671
>>>>
>>>> then rebuilt again,
>>>>
>>>> cos.cl:1:15: error: unsupported call to function _Z3cosDv4_f
>>>> __kernel void foo(__local float4 *f) {
>>>> ^
>>>> 1 error generated.
>>>>
>>>> why not found *cos* function? please give me some hint, thanks
a lot!
>>>>
>>> Hi,
>>>
>>> This is becuase you aren't linking the kernel with the libclc
>>> bitcode library,
>>> try passing the path to the libclc library to clang using this
option:
>>> -mlink-bitcode-file
>>>
>>> -Tom
>> I should at first read
>> http://lists.llvm.org/pipermail/libclc-dev/2016-May/002203.html
>>
>> $ clang -x cl -Dcl_clang_storage_class_specifiers -target
>> amdgcn--amdhsa -mcpu=carrizo -B -Xclang -mlink-bitcode-file -Xclang
>> clc/amdgcn--amdhsa.bc -include clc/clc.h -o cos.co cos.cl
>>
>> but,
>>
>> clang-5.0: error: unknown argument: '-mlink-bitcode-file'
>>
>> Allow linking multiple bitcode files https://reviews.llvm.org/D13913
>> so buildbot and other developers had already checked the
>> CC1Options.td for 'mlink-bitcode-file' argument, but how to use
it
>> correctly, please give me some hint, thanks a lot!
> $ clang -x cl -Dcl_clang_storage_class_specifiers -target
> amdgcn--amdhsa -mcpu=carrizo -Xclang -mlink-bitcode-file
> /usr/lib/clc/amdgcn--amdhsa.bc -include clc/clc.h -o cos.co cos.cl
>
> error: error reading '/tmp/amdgcn--amdhsa-ab2e8b.o'
> error: unable to open output file '': 'Permission denied'
> 2 errors generated.
>
> it failed to generate amdgcn--amdhsa-ab2e8b.o for
> /usr/lib/clc/amdgcn--amdhsa.bc?
>
>
>>
>>
>> PS: it works as Ricardo mentioned!
>>
>> $ clang -Dcl_clang_storage_class_specifiers -isystem /usr/include
>> -include clc/clc.h -target amdgcn--amdhsa -mcpu=carrizo -S -emit-llvm
>> -xcl -o cos.ll cos.cl
>> $ llvm-link cos.ll /usr/lib/clc/amdgcn--amdhsa.bc -o cos.linked.bc
>> $ clang -target amdgcn--amdhsa -mcpu=carrizo cos.linked.bc -S -o
>> cos.amdhsa.s
> works!
>
> $ clang -x assembler -target amdgcn--amdhsa -mcpu=carrizo -c -o cos.o
> cos.amdhsa.s
> $ clang -target amdgcn--amdhsa cos.o -o cos.co
> $ llvm-readobj -sections -symbols -program-headers cos.co
I will test clCreateProgramWithBinary(... "cos.co" ...)!
>
>
> File: cos.co
> Format: ELF64-amdgpu-hsacobj
> Arch: amdgcn
> AddressSize: 64bit
> LoadName:
> Sections [
> Section {
> Index: 0
> Name: (0)
> Type: SHT_NULL (0x0)
> Flags [ (0x0)
> ]
> Address: 0x0
> Offset: 0x0
> Size: 0
> Link: 0
> Info: 0
> AddressAlignment: 0
> EntrySize: 0
> }
> Section {
> Index: 1
> Name: .note (1)
> Type: SHT_NOTE (0x7)
> Flags [ (0x2)
> SHF_ALLOC (0x2)
> ]
> Address: 0x200
> Offset: 0x200
> Size: 1136
> Link: 0
> Info: 0
> AddressAlignment: 4
> EntrySize: 0
> }
> Section {
> Index: 2
> Name: .dynsym (7)
> Type: SHT_DYNSYM (0xB)
> Flags [ (0x2)
> SHF_ALLOC (0x2)
> ]
> Address: 0x670
> Offset: 0x670
> Size: 48
> Link: 4
> Info: 1
> AddressAlignment: 8
> EntrySize: 24
> }
> Section {
> Index: 3
> Name: .hash (15)
> Type: SHT_HASH (0x5)
> Flags [ (0x2)
> SHF_ALLOC (0x2)
> ]
> Address: 0x6A0
> Offset: 0x6A0
> Size: 24
> Link: 2
> Info: 0
> AddressAlignment: 4
> EntrySize: 4
> }
> Section {
> Index: 4
> Name: .dynstr (21)
> Type: SHT_STRTAB (0x3)
> Flags [ (0x2)
> SHF_ALLOC (0x2)
> ]
> Address: 0x6B8
> Offset: 0x6B8
> Size: 5
> Link: 0
> Info: 0
> AddressAlignment: 1
> EntrySize: 0
> }
> Section {
> Index: 5
> Name: .text (29)
> Type: SHT_PROGBITS (0x1)
> Flags [ (0x6)
> SHF_ALLOC (0x2)
> SHF_EXECINSTR (0x4)
> ]
> Address: 0x1000
> Offset: 0x1000
> Size: 8132
> Link: 0
> Info: 0
> AddressAlignment: 256
> EntrySize: 0
> }
> Section {
> Index: 6
> Name: .dynamic (35)
> Type: SHT_DYNAMIC (0x6)
> Flags [ (0x3)
> SHF_ALLOC (0x2)
> SHF_WRITE (0x1)
> ]
> Address: 0x3000
> Offset: 0x3000
> Size: 96
> Link: 4
> Info: 0
> AddressAlignment: 8
> EntrySize: 16
> }
> Section {
> Index: 7
> Name: .AMDGPU.csdata (44)
> Type: SHT_PROGBITS (0x1)
> Flags [ (0x0)
> ]
> Address: 0x0
> Offset: 0x3060
> Size: 0
> Link: 0
> Info: 0
> AddressAlignment: 1
> EntrySize: 0
> }
> Section {
> Index: 8
> Name: .comment (59)
> Type: SHT_PROGBITS (0x1)
> Flags [ (0x30)
> SHF_MERGE (0x10)
> SHF_STRINGS (0x20)
> ]
> Address: 0x0
> Offset: 0x3060
> Size: 186
> Link: 0
> Info: 0
> AddressAlignment: 1
> EntrySize: 0
> }
> Section {
> Index: 9
> Name: .symtab (68)
> Type: SHT_SYMTAB (0x2)
> Flags [ (0x0)
> ]
> Address: 0x0
> Offset: 0x3120
> Size: 264
> Link: 11
> Info: 10
> AddressAlignment: 8
> EntrySize: 24
> }
> Section {
> Index: 10
> Name: .shstrtab (76)
> Type: SHT_STRTAB (0x3)
> Flags [ (0x0)
> ]
> Address: 0x0
> Offset: 0x3228
> Size: 94
> Link: 0
> Info: 0
> AddressAlignment: 1
> EntrySize: 0
> }
> Section {
> Index: 11
> Name: .strtab (86)
> Type: SHT_STRTAB (0x3)
> Flags [ (0x0)
> ]
> Address: 0x0
> Offset: 0x3286
> Size: 64
> Link: 0
> Info: 0
> AddressAlignment: 1
> EntrySize: 0
> }
> ]
> Symbols [
> Symbol {
> Name: (0)
> Value: 0x0
> Size: 0
> Binding: Local (0x0)
> Type: None (0x0)
> Other: 0
> Section: Undefined (0x0)
> }
> Symbol {
> Name: BB0_11 (1)
> Value: 0x25D0
> Size: 0
> Binding: Local (0x0)
> Type: None (0x0)
> Other: 0
> Section: .text (0x5)
> }
> Symbol {
> Name: BB0_12 (8)
> Value: 0x2A74
> Size: 0
> Binding: Local (0x0)
> Type: None (0x0)
> Other: 0
> Section: .text (0x5)
> }
> Symbol {
> Name: BB0_2 (15)
> Value: 0x12E8
> Size: 0
> Binding: Local (0x0)
> Type: None (0x0)
> Other: 0
> Section: .text (0x5)
> }
> Symbol {
> Name: BB0_3 (21)
> Value: 0x177C
> Size: 0
> Binding: Local (0x0)
> Type: None (0x0)
> Other: 0
> Section: .text (0x5)
> }
> Symbol {
> Name: BB0_5 (27)
> Value: 0x191C
> Size: 0
> Binding: Local (0x0)
> Type: None (0x0)
> Other: 0
> Section: .text (0x5)
> }
> Symbol {
> Name: BB0_6 (33)
> Value: 0x1DC0
> Size: 0
> Binding: Local (0x0)
> Type: None (0x0)
> Other: 0
> Section: .text (0x5)
> }
> Symbol {
> Name: BB0_8 (39)
> Value: 0x1F94
> Size: 0
> Binding: Local (0x0)
> Type: None (0x0)
> Other: 0
> Section: .text (0x5)
> }
> Symbol {
> Name: BB0_9 (45)
> Value: 0x2430
> Size: 0
> Binding: Local (0x0)
> Type: None (0x0)
> Other: 0
> Section: .text (0x5)
> }
> Symbol {
> Name: _DYNAMIC (55)
> Value: 0x3000
> Size: 0
> Binding: Local (0x0)
> Type: None (0x0)
> Other [ (0x2)
> STV_HIDDEN (0x2)
> ]
> Section: .dynamic (0x6)
> }
> Symbol {
> Name: foo (51)
> Value: 0x1000
> Size: 8132
> Binding: Global (0x1)
> Type: AMDGPU_HSA_KERNEL (0xA)
> Other: 0
> Section: .text (0x5)
> }
> ]
> ProgramHeaders [
> ProgramHeader {
> Type: PT_PHDR (0x6)
> Offset: 0x40
> VirtualAddress: 0x40
> PhysicalAddress: 0x40
> FileSize: 448
> MemSize: 448
> Flags [ (0x4)
> PF_R (0x4)
> ]
> Alignment: 8
> }
> ProgramHeader {
> Type: PT_LOAD (0x1)
> Offset: 0x0
> VirtualAddress: 0x0
> PhysicalAddress: 0x0
> FileSize: 1725
> MemSize: 1725
> Flags [ (0x4)
> PF_R (0x4)
> ]
> Alignment: 4096
> }
> ProgramHeader {
> Type: PT_LOAD (0x1)
> Offset: 0x1000
> VirtualAddress: 0x1000
> PhysicalAddress: 0x1000
> FileSize: 8132
> MemSize: 8132
> Flags [ (0x5)
> PF_R (0x4)
> PF_X (0x1)
> ]
> Alignment: 4096
> }
> ProgramHeader {
> Type: PT_LOAD (0x1)
> Offset: 0x3000
> VirtualAddress: 0x3000
> PhysicalAddress: 0x3000
> FileSize: 96
> MemSize: 96
> Flags [ (0x6)
> PF_R (0x4)
> PF_W (0x2)
> ]
> Alignment: 4096
> }
> ProgramHeader {
> Type: PT_DYNAMIC (0x2)
> Offset: 0x3000
> VirtualAddress: 0x3000
> PhysicalAddress: 0x3000
> FileSize: 96
> MemSize: 96
> Flags [ (0x6)
> PF_R (0x4)
> PF_W (0x2)
> ]
> Alignment: 8
> }
> ProgramHeader {
> Type: PT_GNU_RELRO (0x6474E552)
> Offset: 0x3000
> VirtualAddress: 0x3000
> PhysicalAddress: 0x3000
> FileSize: 96
> MemSize: 4096
> Flags [ (0x4)
> PF_R (0x4)
> ]
> Alignment: 1
> }
> ProgramHeader {
> Type: PT_GNU_STACK (0x6474E551)
> Offset: 0x0
> VirtualAddress: 0x0
> PhysicalAddress: 0x0
> FileSize: 0
> MemSize: 0
> Flags [ (0x6)
> PF_R (0x4)
> PF_W (0x2)
> ]
> Alignment: 0
> }
> ProgramHeader {
> Type: PT_NOTE (0x4)
> Offset: 0x200
> VirtualAddress: 0x200
> PhysicalAddress: 0x200
> FileSize: 1136
> MemSize: 1136
> Flags [ (0x4)
> PF_R (0x4)
> ]
> Alignment: 4
> }
> ]
>
>
>>
>>
>> $ clang --version
>> iSoft clang version 5.0.0 (trunk 305877) (based on LLVM 5.0.0svn)
>> Target: x86_64-isoft-linux
>> Thread model: posix
>> InstalledDir: /bin
>>
>> Registered Targets:
>> ...
>> amdgcn - AMD GCN GPUs
>> ...
>> avr - Atmel AVR Microcontroller
>> ...
>> mips64el - Mips64el [experimental]
>> ...
>> nvptx64 - NVIDIA PTX 64-bit
>> ...
>> r600 - AMD GPUs HD2XXX-HD6XXX
>> ...
>>
>
--
Regards,
Leslie Zhai https://reviews.llvm.org/p/xiangzhai/