thr3ads.net - similar to: "[web] sub-domain"

Displaying 20 results from an estimated 6000 matches similar to: "[web] sub-domain"

2015 Nov 10

[web] sub-domain

So that people have a shorter link to go to that entry page directly. It's especially useful for non-LLVM folks who want to try out LLVM's CUDA support. Many researchers fall into this category btw because LLVM used to support very little CUDA. They don't like to search llvm.org for what they want. On Tue, Nov 10, 2015 at 2:59 PM, C Bergström <cbergstrom at pathscale.com> wrote:

2016 Jun 02

PTX generation from CUDA file for compute capability 1.0 (sm_10)

Hello Bergström/Eric, Thanks for the reply. The G80 (sm_10) architecture was ported to an FPGA by a group of researchers (http://www.ecs.umass.edu/ece/tessier/andryc-fpt13.pdf). Our group has some further research interest in this work. I have been modifying Clang-LLVM for a couple of months and have made the required changes. But Clang-LLVM is only allowing me to generate PTX for sm_20,

2015 Jul 08

CUDA fixed VA allocations and sparse mappings

On Wed, Jul 8, 2015 at 7:08 AM, Ilia Mirkin <imirkin at alum.mit.edu> wrote: > On Tue, Jul 7, 2015 at 8:07 PM, C Bergström <cbergstrom at pathscale.com> wrote: >> On Wed, Jul 8, 2015 at 6:58 AM, Ben Skeggs <skeggsb at gmail.com> wrote: >>> On 8 July 2015 at 09:53, C Bergström <cbergstrom at pathscale.com> wrote: >>>> regarding >>>>

2015 Jul 08

CUDA fixed VA allocations and sparse mappings

On Wed, Jul 8, 2015 at 6:58 AM, Ben Skeggs <skeggsb at gmail.com> wrote: > On 8 July 2015 at 09:53, C Bergström <cbergstrom at pathscale.com> wrote: >> regarding >> -------- >> Fixed address allocations weren't going to be part of that, but I see >> that it makes sense for a variety of use cases. One question I have >> here is how this is intended

2015 Jul 08

CUDA fixed VA allocations and sparse mappings

On Tue, Jul 07, 2015 at 08:13:28PM -0400, Ilia Mirkin wrote: > On Tue, Jul 7, 2015 at 8:11 PM, C Bergström <cbergstrom at pathscale.com> wrote: > > On Wed, Jul 8, 2015 at 7:08 AM, Ilia Mirkin <imirkin at alum.mit.edu> wrote: > >> On Tue, Jul 7, 2015 at 8:07 PM, C Bergström <cbergstrom at pathscale.com> wrote: > >>> On Wed, Jul 8, 2015 at 6:58 AM, Ben

2015 Jul 07

CUDA fixed VA allocations and sparse mappings

regarding -------- Fixed address allocations weren't going to be part of that, but I see that it makes sense for a variety of use cases. One question I have here is how this is intended to work where the RM needs to make some of these allocations itself (for graphics context mapping, etc), how should potential conflicts with user mappings be handled? -------- As an initial implementation you

2016 Mar 10

instrumenting device code with gpucc

It's hard to tell what is wrong without a concrete example. E.g., what is the program you are instrumenting? What is the definition of the hook function? How did you link that definition with the binary? One thing suspicious to me is that you may have linked the definition of _Cool_MemRead_Hook as a host function instead of a device function. AFAIK, PTX assembly cannot be linked. So, if you

2016 Mar 05

instrumenting device code with gpucc

On Fri, Mar 4, 2016 at 5:50 PM, Yuanfeng Peng <yuanfeng.jack.peng at gmail.com> wrote: > Hi Jingyue, > > My name is Yuanfeng Peng, I'm a PhD student at UPenn. I'm sorry to bother > you, but I'm having trouble with gpucc in my project, and I would be really > grateful for your help! > > Currently we're trying to instrument CUDA code using LLVM 3.9, and

2017 Aug 02

CUDA compilation "No available targets are compatible with this triple." problem

Yes, I followed the guide. The same error showed up: >clang++ axpy.cu -o axpy --cuda-gpu-arch=sm_35 -L/usr/local/cuda/lib64 -I/usr/local/cuda/include -lcudart_static -ldl -lrt -pthread error: unable to create target: 'No available targets are compatible with this triple.' ________________________________ From: Kevin Choi <code.kchoi at gmail.com> Sent: Wednesday, August 2,

2016 Mar 15

instrumenting device code with gpucc

Hi Jingyue, Sorry to ask again, but how exactly could I glue the fatbin with the instrumented host code? Or does it mean we actually cannot instrument both the host & device code at the same time? Thanks! yuanfeng On Tue, Mar 15, 2016 at 10:09 AM, Jingyue Wu <jingyue at google.com> wrote: > Including fatbin into host code should be done in frontend. > > On Mon, Mar 14, 2016

2016 Mar 13

instrumenting device code with gpucc

Hey Jingyue, Thanks for being so responsive! I finally figured out a way to resolve the issue: all I have to do is to use `-only-needed` when merging the device bitcodes with llvm-link. However, since we actually need to instrument the host code as well, I encountered another issue when I tried to glue the instrumented host code and fatbin together. When I only instrumented the device code, I

2016 Mar 14

[cfe-dev] RFC: Proposing an LLVM subproject for parallelism runtime and support libraries

> I'd support some of James's comments if liboffload wasn't glued to OMP as it is now. I certainly have no objection to moving liboffload elsewhere if that makes it more useful to people. There is no real "glue" holding it there; it simply ended up in the OpenMP directory structure because that was an easy place to put it, not because that's the optimal place for it.

2014 Jun 17

[LLVMdev] Attaching range metadata to IntrinsicInst

Eh? How do you envision this? -eric On Tue, Jun 17, 2014 at 2:09 PM, Jingyue Wu <jingyue at google.com> wrote: > Hi Nick, > > That makes sense. I think a main issue here is that the ranges of these PTX > special registers (e.g., threadIdx.x) depend on -target-cpu which is only > visible to clang and llc. Would you mind if we specify "target cpu" in the IR > similar

2016 Mar 12

instrumenting device code with gpucc

Hey Jingyue, Though I tried `opt -nvvm-reflect` on both bc files, the nvvm reflect anchor didn't go away; ptxas is still complaining about the duplicate definition of function '_ZL21__nvvm_reflect_anchorv'. Did I misuse the nvvm-reflect pass? Thanks! yuanfeng On Fri, Mar 11, 2016 at 10:10 AM, Jingyue Wu <jingyue at google.com> wrote: > According to the examples you

2014 Jun 17

[LLVMdev] Attaching range metadata to IntrinsicInst

On Tue, Jun 17, 2014 at 2:33 PM, Jingyue Wu <jingyue at google.com> wrote: > Hi Eric, > > In the IR, besides "target datalayout" and "target triple", we have a > special "target cpu" string which is set by the Clang front-end according to > its -target-cpu flag. We also write a Module::getTargetCPU() method to > retrieve this string from the

2016 Oct 27

problem on compiling cuda program with clang++

Hi all, I compiled the *llvm3.9* source code on the *Nvidia TX1* board. Now I am following the document in docs/CompileCudaWithLLVM.rst to compile a CUDA program with clang++. However, when I compile `axpy.cu` using `nvcc`, *nvcc* generates the correct binary, while compiling `axpy.cu` using clang++ does not; the detailed command is `clang++ axpy.cu -o axpy --cuda-gpu-arch=sm_53

2016 Dec 23

Assign different RegClasses to a virtual register based on 'uniform' attribute?

On December 22, 2016 at 15:37, via llvm-dev wrote: > Send llvm-dev mailing list submissions to > llvm-dev at lists.llvm.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > or, via email, send a message with subject or body 'help' to > llvm-dev-request at lists.llvm.org > > You can reach the

2015 Jun 06

[LLVMdev] Supporting heterogeneous computing in llvm.

On Sun, Jun 7, 2015 at 2:52 AM, Eric Christopher <echristo at gmail.com> wrote: > > > On Sat, Jun 6, 2015 at 12:43 PM C Bergström <cbergstrom at pathscale.com> > wrote: >> >> On Sun, Jun 7, 2015 at 2:34 AM, Eric Christopher <echristo at gmail.com> >> wrote: >> > >> > >> > On Sat, Jun 6, 2015 at 12:31 PM C Bergström

2015 Jan 24

[LLVMdev] [cfe-dev] Proposal: pragma for branch divergence

In our experience, as Owen also suggests, a pragma or a language extension can be avoided by a combination of static and dynamic analysis. We prefer this approach in our compiler ;) Regards, Vinod On Sat, Jan 24, 2015 at 12:09 AM, Owen Anderson <resistor at mac.com> wrote: > Hi Jingyue, > > Have you considered using dynamic uniformity checks? In my experience you > can

2016 Oct 27

problem on compiling cuda program with clang++

(+llvm-dev) My question was whether your host machine, the one which is running the compiler, is ARM (as opposed to x86 or POWER). The header you pointed to was in "aarch64-linux-gnu", which made me think you might be on an ARM system. If you are not running Linux x86, it is not likely to work. If you are running Linux x86, we will need many more details about your system in order to
