search for: nvptxbaseinfo

Displaying 5 results from an estimated 5 matches for "nvptxbaseinfo".

2016 Dec 21
2
llvm/cuda: Indentify kernel functions and optimizations
...ied for NVPTX usage. I can parse the whole IR for this kernel metadata and then proceed, but this is very clumsy. Other way is to work with cuda-device-only IR. But then I am not sure how to run this cuda-only-IR, as this contains cuda-device specific code and not rest of the host-device code. In NVPTXBaseInfo.h, the function attributes are defined, along with PROPERTY_ISKERNEL_FUNCTION. How can this be used to identify the kernel function? Thanks. -Gurunath
2016 Dec 21
0
llvm/cuda: Indentify kernel functions and optimizations
...the whole IR for this
> kernel metadata and then proceed, but this is very clumsy.
>
> Other way is to work with cuda-device-only IR. But then I am not sure how
> to run this cuda-only-IR, as this contains cuda-device specific code and
> not rest of the host-device code.
>
> In NVPTXBaseInfo.h, the function attributes are defined, along
> with PROPERTY_ISKERNEL_FUNCTION. How can this be used to identify the
> kernel function?
>
> Thanks.
>
> -Gurunath
2016 Dec 23
0
Assign different RegClasses to a virtual, register based on 'uniform' attribute?
...> kernel metadata and then proceed, but this is very clumsy.
>>
>> Other way is to work with cuda-device-only IR. But then I am not sure how
>> to run this cuda-only-IR, as this contains cuda-device specific code and
>> not rest of the host-device code.
>>
>> In NVPTXBaseInfo.h, the function attributes are defined, along
>> with PROPERTY_ISKERNEL_FUNCTION. How can this be used to identify the
>> kernel function?
>>
>> Thanks.
>>
>> -Gurunath
2013 Mar 20
0
[LLVMdev] UNREACHABLE executed! error while trying to generate PTX
...or now. __shared__ would be address space 3, so:
    #define __shared__ __attribute__((address_space(3)))
Either using [] or * should work. Just be aware that you will need to specify a shared size when you launch the kernel. You can get the address space mapping from lib/Target/NVPTX/MCTargetDesc/NVPTXBaseInfo.h.
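
The address-space mapping mentioned in this reply can be sketched as a small lookup table. This is only an illustration: the enum and the helper `addressSpaceFor` are hypothetical names, not the declarations in NVPTXBaseInfo.h. Only the value 3 for `__shared__` comes from the thread; the other numbers are assumed from the NVPTX backend's documented numbering and may differ between LLVM versions (the follow-up message in this thread uses 2 for `__constant__`), so check the header in your own tree.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the address-space enum in NVPTXBaseInfo.h.
// Only AS_SHARED = 3 is confirmed by the thread; the rest are assumptions.
enum NvptxAddressSpace {
  AS_GENERIC = 0,
  AS_GLOBAL = 1,
  AS_SHARED = 3,
  AS_CONSTANT = 4, // assumption; older trees may use a different number
  AS_LOCAL = 5
};

// Hypothetical helper: map a CUDA memory qualifier spelling to the number
// you would pass to __attribute__((address_space(N))).
// Returns -1 for qualifiers that do not select a memory space.
inline int addressSpaceFor(const char *qualifier) {
  if (std::strcmp(qualifier, "__device__") == 0)
    return AS_GLOBAL;
  if (std::strcmp(qualifier, "__shared__") == 0)
    return AS_SHARED;
  if (std::strcmp(qualifier, "__constant__") == 0)
    return AS_CONSTANT;
  return -1;
}
```

With such a table in hand, the macro quoted above is just `address_space(addressSpaceFor("__shared__"))` written out by hand; keeping the numbers in one place makes it easier to update them if the header changes.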
2013 Mar 20
2
[LLVMdev] UNREACHABLE executed! error while trying to generate PTX
OK. That helps. It does flash a warning though:
[DEVICE-C++] nbody.kernel.cpp
nbody.kernel.cpp:29:9: warning: '__constant__' macro redefined
#define __constant__ __attribute__((address_space(2)))
        ^
/opt/cuda/include/host_defines.h:183:9: note: previous definition is here
#define __constant__ \
        ^
1 warning generated.
Another question: what about extern __shared__? I