Displaying 16 results from an estimated 16 matches for "yuanfeng".

2016 Mar 15
2
instrumenting device code with gpucc
Hi Jingyue, Sorry to ask again, but how exactly could I glue the fatbin with the instrumented host code? Or does it mean we actually cannot instrument both the host & device code at the same time? Thanks! yuanfeng On Tue, Mar 15, 2016 at 10:09 AM, Jingyue Wu <jingyue at google.com> wrote: > Including fatbin into host code should be done in frontend. > > On Mon, Mar 14, 2016 at 12:13 AM, Yuanfeng Peng < > yuanfeng.jack.peng at gmail.com> wrote: > >> Hey Jingyue, >> >...
2016 Mar 12
2
instrumenting device code with gpucc
Hey Jingyue, Though I tried `opt -nvvm-reflect` on both bc files, the nvvm reflect anchor didn't go away; ptxas is still complaining about the duplicate definition of function '_ZL21__nvvm_reflect_anchorv'. Did I misuse the nvvm-reflect pass? Thanks! yuanfeng On Fri, Mar 11, 2016 at 10:10 AM, Jingyue Wu <jingyue at google.com> wrote: > According to the examples you sent, I believe the linking issue was caused > by nvvm reflection anchors. I haven't played with that, but I guess running > nvvm-reflect on an IR removes the nvvm reflec...
2016 Mar 13
2
instrumenting device code with gpucc
...nd link it with axpy-sm_30.fatbin. However, now that I instrumented the IR of the host code (axpy.bc) and did `llc axpy.bc -o axpy.s`, which cmd should I use to link axpy.s with axpy-sm_30.fatbin? I tried to use -cc1as, but the flag '-fcuda-include-gpubinary' was not recognized. Thanks! yuanfeng On Sat, Mar 12, 2016 at 12:05 AM, Jingyue Wu <jingyue at google.com> wrote: > I've no idea. Without instrumentation, nvvm_reflect_anchor doesn't appear > in the final PTX, right? If that's the case, some pass in llc must have > deleted the anchor and you should be able...
2016 Mar 10
4
instrumenting device code with gpucc
...device function. AFAIK, PTX assembly cannot be linked. So, if you want that hook function called from your device code, you should merge the IR of the hook function and the IR of your device code into one IR (via linking or direct IR emitting) before lowering the IR to PTX. On Wed, Mar 9, 2016 at 4:31 PM, Yuanfeng Peng <yuanfeng.jack.peng at gmail.com> wrote: > Hi Jingyue, > > Thanks for the instructions! I instrumented the device code and got a > binary of axpy.cu; however, the resulting executable always fails on the > first cudaMalloc call in host code (the kernel had not even been l...
2017 Jun 14
2
Separate compilation of CUDA code?
Hi, I wonder whether the current version of LLVM supports separate compilation and linking of device code, i.e., is there a flag analogous to nvcc's --relocatable-device-code flag? If not, is there any plan to support this? Thanks! Yuanfeng Peng
2016 Apr 09
2
[GPUCC] how to remove _ZL21__nvvm_reflect_anchorv() automatically?
...erated by llvm still needs to be fixed. On Apr 9, 2016 8:32 AM, "Jingyue Wu" <jingyue at google.com> wrote: > Artem, > > With David's http://reviews.llvm.org/rL265060, do you think > __nvvm_reflect_anchor is still necessary? > > On Fri, Apr 8, 2016 at 9:37 AM, Yuanfeng Peng via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> Yeah, '.' is the direct reason for the ptxas failure here. I'm curious, >> however, about what the purpose of nvvm_reflect_anchorv() is here, and why >> does the front-end always generate this f...
2016 Apr 08
2
[GPUCC] how to remove _ZL21__nvvm_reflect_anchorv() automatically?
...v() is here, and why does the front-end always generate this function? Since the current PTX emission doesn't mangle dots, it would be a reasonable workaround for me to prevent the front-end from generating this function in the first place. Is there any magic option available to do so? Thanks! yuanfeng On Thu, Apr 7, 2016 at 5:19 PM, Reid Kleckner <rnk at google.com> wrote: > The actual problem here is that PTX appears to not allow '.' in symbol > names. We should probably just change our PTX emission to mangle dots > somehow. > > On Thu, Apr 7, 2016 at 4:24 PM, Yua...
2016 Aug 01
3
[GPUCC] link against libdevice
...goes, I am curious. :) Thank you very much for the bug report! If you like I'll cc you on any relevant changes, just create an account at https://reviews.llvm.org (if necessary; I can't seem to find you) and let me know your username. Regards, -Justin On Sun, Jul 31, 2016 at 10:59 PM, Yuanfeng Peng <yuanfeng at cis.upenn.edu> wrote: > Hi Justin, > > Thanks for your response! The clang & llvm I'm using was built from source. > > Below is the output of compiling with -v. Any suggestions would be > appreciated! > > clang version 3.9.0 (trunk 270145) (...
2016 Aug 01
0
[GPUCC] link against libdevice
...ist. "/usr/local/cuda/bin/ptxas" -m64 -O0 --gpu-name sm_50 --output-file /tmp/scalarProd-181f7e.o /tmp/scalarProd-32a530.s ptxas fatal : Unresolved extern function '__nv_mul24' clang-3.9: error: ptxas command failed with exit code 255 (use -v to see invocation) Thanks! Yuanfeng On Mon, Aug 1, 2016 at 1:04 AM, Justin Lebar <jlebar at google.com> wrote: > Hi, Yuanfeng. > > What version of clang are you using? CUDA is only known to work at > tip of head, so you must build clang yourself from source. > > I suspect that's your problem, but if bui...
2017 Jun 14
4
[CUDA] Lost debug information when compiling CUDA code
...st object file ab.o, with ab-device.o & ab-device.fatbin embedded; 7) Link against libraries and get the final binary: a.out. The binary a.out fails with an exception when I run it; but when I try to debug it with cuda-gdb or cuda-memcheck, no source information is available. Why? Thanks! Yuanfeng Peng
2016 Jul 29
2
[GPUCC] link against libdevice
...--cuda-gpu-arch=sm_50 scalarProd.cu but ended up with the following error: ptxas fatal : Unresolved extern function '__nv_mul24' Seems to me that libdevice was not automatically linked. I wonder what flags I need to pass to clang to have the code linked against libdevice? Thanks! Yuanfeng Peng
2016 Aug 01
2
[GPUCC] link against libdevice
Hi, Yuanfeng. What version of clang are you using? CUDA is only known to work at tip of head, so you must build clang yourself from source. I suspect that's your problem, but if building from source doesn't fix it, please attach the output of compiling with -v. Regards, -Justin On Sun, Jul 31, 2016...
2016 Aug 01
0
[GPUCC] link against libdevice
Directly CC-ing some folks who may be able to help. On Fri, Jul 29, 2016 at 6:27 AM Yuanfeng Peng via llvm-dev < llvm-dev at lists.llvm.org> wrote: > Hi, > > I was trying to compile scalarProd.cu (from CUDA SDK) with the following > command: > > clang++ -I../ -I/usr/local/cuda-7.0/samples/common/inc > --cuda-gpu-arch=sm_50 scalarProd.cu > > but ended u...
2016 Apr 07
2
[GPUCC] how to remove _ZL21__nvvm_reflect_anchorv() automatically?
...21__nvvm_reflect_anchorv.2()`? Or is it possible to prevent _ZL21__nvvm_reflect_anchorv() from being generated into a.bc & b.bc in the first place? Or is it possible to ask llvm-link to NOT rename _ZL21__nvvm_reflect_anchorv() into ZL21__nvvm_reflect_anchorv.2()? Thanks! Yuanfeng Peng
2016 Mar 05
2
instrumenting device code with gpucc
On Fri, Mar 4, 2016 at 5:50 PM, Yuanfeng Peng <yuanfeng.jack.peng at gmail.com> wrote: > Hi Jingyue, > > My name is Yuanfeng Peng, I'm a PhD student at UPenn. I'm sorry to bother > you, but I'm having trouble with gpucc in my project, and I would be really > grateful for your help! > > Currently we...
2017 Jun 17
2
Separate compilation of CUDA code?
Hi, I wonder whether the current version of LLVM supports separate compilation and linking of device code, i.e., is there a flag analogous to nvcc's --relocatable-device-code flag? If not, is there any plan to support this? Thanks! Yuanfeng Peng