Displaying 5 results from an estimated 5 matches for "nvptxbaseinfo".
2016 Dec 21
2
llvm/cuda: Indentify kernel functions and optimizations
...ied for NVPTX usage. I can parse the whole IR for this
kernel metadata and then proceed, but this is very clumsy.
The other way is to work with CUDA device-only IR. But then I am not sure how
to run this device-only IR, as it contains CUDA device-specific code and
not the rest of the host-device code.
In NVPTXBaseInfo.h, the function attributes are defined, along
with PROPERTY_ISKERNEL_FUNCTION. How can this be used to identify the
kernel function?
Thanks.
-Gurunath
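
A rough sketch of one way to approach the question above, under stated assumptions: NVPTX kernels are conventionally marked by an entry in the module's !nvvm.annotations named metadata of the form !{<function>, !"kernel", i32 1}. The helper below scans textual IR for such entries. findKernelNames is a hypothetical name used only for illustration, not an LLVM API; inside a pass one would more likely walk Module::getNamedMetadata("nvvm.annotations") rather than parse text. It does not settle whether PROPERTY_ISKERNEL_FUNCTION is intended for use outside the backend.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Collect the names of functions tagged as kernels in textual LLVM IR.
// Assumption: each kernel has an !nvvm.annotations operand of the form
//   !{<fn type>* @name, !"kernel", i32 1}
// Quoted names such as @"my kernel" are not handled by this sketch.
std::vector<std::string> findKernelNames(const std::string &irText) {
  std::vector<std::string> kernels;
  std::istringstream in(irText);
  std::string line;
  while (std::getline(in, line)) {
    // Only metadata nodes carrying the "kernel" tag are of interest.
    std::size_t tag = line.find("!\"kernel\", i32 1");
    if (tag == std::string::npos) continue;
    // The function reference (@name) precedes the tag on the same line.
    std::size_t at = line.rfind('@', tag);
    if (at == std::string::npos) continue;
    std::size_t end = line.find_first_of(", )", at);
    kernels.push_back(line.substr(at + 1, end - at - 1));
  }
  return kernels;
}
```

Feeding it the output of llvm-dis for the device module should list only the annotated entry points; functions without a "kernel" annotation are skipped.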
2016 Dec 21
0
llvm/cuda: Indentify kernel functions and optimizations
...the whole IR for this
> kernel metadata and then proceed, but this is very clumsy.
>
> The other way is to work with CUDA device-only IR. But then I am not sure how
> to run this device-only IR, as it contains CUDA device-specific code and
> not the rest of the host-device code.
>
> In NVPTXBaseInfo.h, the function attributes are defined, along
> with PROPERTY_ISKERNEL_FUNCTION. How can this be used to identify the
> kernel function?
>
> Thanks.
>
> -Gurunath
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llv...
2016 Dec 23
0
Assign different RegClasses to a virtual register based on 'uniform' attribute?
...> kernel metadata and then proceed, but this is very clumsy.
>>
>> The other way is to work with CUDA device-only IR. But then I am not sure how
>> to run this device-only IR, as it contains CUDA device-specific code and
>> not the rest of the host-device code.
>>
>> In NVPTXBaseInfo.h, the function attributes are defined, along
>> with PROPERTY_ISKERNEL_FUNCTION. How can this be used to identify the
>> kernel function?
>>
>> Thanks.
>>
>> -Gurunath
2013 Mar 20
0
[LLVMdev] UNREACHABLE executed! error while trying to generate PTX
...or now.
__shared__ would be address space 3, so:
#define __shared__ __attribute__((address_space(3)))
Using either [] or * should work. Just be aware that you will need to
specify the shared memory size when you launch the kernel. You can get the
address space mapping from lib/Target/NVPTX/MCTargetDesc/NVPTXBaseInfo.h.
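
For orientation, here is a small sketch of the address-space numbering referred to above. The values mirror the NVPTX numbering as far as I know (generic 0, global 1, shared 3, constant 4, local 5, param 101); treat them as an assumption and check the header in your own checkout, since older trees may list additional entries. The enum and addressSpaceAttribute are hypothetical names for illustration, not part of LLVM.

```cpp
#include <string>

// Assumed mirror of the NVPTX address-space numbering; verify against
// NVPTXBaseInfo.h in the LLVM tree you are building with.
enum NvptxAddressSpace : unsigned {
  kGeneric = 0,
  kGlobal = 1,
  kShared = 3,
  kConst = 4,
  kLocal = 5,
  kParam = 101
};

// Build the Clang attribute spelling for an address space, in the same
// form as the __shared__ macro quoted above.
std::string addressSpaceAttribute(NvptxAddressSpace as) {
  return "__attribute__((address_space(" +
         std::to_string(static_cast<unsigned>(as)) + ")))";
}
```

For kShared this yields the same attribute text as the macro above. The shared memory size mentioned in the message is supplied at launch time; with the CUDA driver API that is the sharedMemBytes argument of cuLaunchKernel.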
2013 Mar 20
2
[LLVMdev] UNREACHABLE executed! error while trying to generate PTX
OK. That helps.
It does emit a warning, though:
[DEVICE-C++] nbody.kernel.cpp
nbody.kernel.cpp:29:9: warning: '__constant__' macro redefined
#define __constant__ __attribute__((address_space(2)))
^
/opt/cuda/include/host_defines.h:183:9: note: previous definition is here
#define __constant__ \
^
1 warning generated.
Another question:
What about extern __shared__?
I