thr3ads.net - similar to: "Assign different RegClasses to a virtual, register based on 'uniform' attribute?"

Displaying 20 results from an estimated 2000 matches similar to: "Assign different RegClasses to a virtual, register based on 'uniform' attribute?"

llvm/cuda: Indentify kernel functions and optimizations

2016 Dec 21

llvm/cuda: Indentify kernel functions and optimizations

https://github.com/llvm-mirror/llvm/blob/652375a8cc49615de31fd9d424753795059185b6/lib/Target/NVPTX/NVPTXUtilities.h#L58 Does this solve your problem? On Wed, Dec 21, 2016 at 2:29 PM, Gurunath Kadam via llvm-dev < llvm-dev at lists.llvm.org> wrote: > Hi, > > I am trying to instrument CUDA kernel functions only (llvm-3.9.0). > > Is there a way to identify cuda kernel

llvm/cuda: Indentify kernel functions and optimizations

2016 Dec 21

llvm/cuda: Indentify kernel functions and optimizations

Hi, I am trying to instrument CUDA kernel functions only (llvm-3.9.0). Is there a way to identify cuda kernel functions? I see that in llvm IR for CUDA has nvvm annotations section, where kernel functions are identified for NVPTX usage. I can parse the whole IR for this kernel metadata and then proceed, but this is very clumsy. Other way is to work with cuda-device-only IR. But then I am not

Assign different RegClasses to a virtual register based on 'uniform' attribute?

2016 Dec 21

Assign different RegClasses to a virtual register based on 'uniform' attribute?

2016-12-20 22:14 GMT+08:00 Tom Stellard <tom at stellard.net>: > > On Tue, Dec 20, 2016 at 11:00:09AM +0800, Ruiling Song wrote: > > Hi, > > > > I am working on a new LLVM target for Intel GPU, which also has same kind > > of scalar/vector register classes used in AMDGPU target. Like for a i32 > > virtual register, it will be held in scalar register if its

Assign different RegClasses to a virtual register based on 'uniform' attribute?

2016 Dec 21

Assign different RegClasses to a virtual register based on 'uniform' attribute?

On Wed, Dec 21, 2016 at 10:31:57AM -0500, Matt Arsenault wrote: > > > On Dec 21, 2016, at 10:26, Ruiling Song <ruiling.song83 at gmail.com> wrote: > > > > > > > > 2016-12-20 22:14 GMT+08:00 Tom Stellard <tom at stellard.net <mailto:tom at stellard.net>>: > > > > > > On Tue, Dec 20, 2016 at 11:00:09AM +0800, Ruiling Song

Assign different RegClasses to a virtual register based on 'uniform' attribute?

2016 Dec 20

Assign different RegClasses to a virtual register based on 'uniform' attribute?

On Tue, Dec 20, 2016 at 11:00:09AM +0800, Ruiling Song wrote: > Hi, > > I am working on a new LLVM target for Intel GPU, which also has same kind > of scalar/vector register classes used in AMDGPU target. Like for a i32 > virtual register, it will be held in scalar register if its value is > uniform across a wavefront/warp, otherwise it will be in a vector register. > Does

Assign different RegClasses to a virtual register based on 'uniform' attribute?

2016 Dec 21

Assign different RegClasses to a virtual register based on 'uniform' attribute?

> On Dec 21, 2016, at 10:26, Ruiling Song <ruiling.song83 at gmail.com> wrote: > > > > 2016-12-20 22:14 GMT+08:00 Tom Stellard <tom at stellard.net <mailto:tom at stellard.net>>: > > > > On Tue, Dec 20, 2016 at 11:00:09AM +0800, Ruiling Song wrote: > > > Hi, > > > > > > I am working on a new LLVM target for Intel GPU, which

Assign different RegClasses to a virtual register based on 'uniform' attribute?

2016 Dec 20

Assign different RegClasses to a virtual register based on 'uniform' attribute?

Hi, I am working on a new LLVM target for Intel GPU, which also has same kind of scalar/vector register classes used in AMDGPU target. Like for a i32 virtual register, it will be held in scalar register if its value is uniform across a wavefront/warp, otherwise it will be in a vector register. Does AMDGPU already done this? I read the code, but I didn't figure out how to do this. Anybody has

LLVM Pass for Instructions in Function (error

2016 Nov 28

LLVM Pass for Instructions in Function (error

> On Nov 27, 2016, at 6:40 PM, Gurunath Kadam via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > Hi Sandeep, > > Thanks. > > One question about: > > Value* AddrPointer = Inst->getIperand(0); > > So this works for LVALUE(S) i.e. in my case pointer on LHS of '='. I cannot find anything online about getloperand online. > > For reference

LLVM Pass for Instructions in Function (error

2016 Nov 28

LLVM Pass for Instructions in Function (error

Hi, Sent via the Samsung Galaxy Note® 3, an AT&T 4G LTE smartphone -------- Original message -------- From: Gurunath Kadam via llvm-dev <llvm-dev at lists.llvm.org> Date: 11/27/2016 7:49 PM (GMT-06:00) To: llvm-dev at lists.llvm.org Subject: [llvm-dev] LLVM Pass for Instructions in Function (error Hi, Please find the embedded code. Also you may follow

LLVM/CLANG: CUDA compilation fail for inline assembly code

2016 Oct 14

LLVM/CLANG: CUDA compilation fail for inline assembly code

Hi, I am sorry for sending this query again here, but maybe I sent it to wrong list yesterday. I am trying to compile LonestarGPU-rev2.0 <http://iss.ices.utexas.edu/?p=projects/galois/lonestargpu/download> benchmark suite with LLVM/CLANG. This suite has a following piece of code (more info here

instrumenting device code with gpucc

2016 Mar 10

instrumenting device code with gpucc

It's hard to tell what is wrong without a concrete example. E.g., what is the program you are instrumenting? What is the definition of the hook function? How did you link that definition with the binary? One thing suspicious to me is that you may have linked the definition of _Cool_MemRead_Hook as a host function instead of a device function. AFAIK, PTX assembly cannot be linked. So, if you

instrumenting device code with gpucc

2016 Mar 15

instrumenting device code with gpucc

Hi Jingyue, Sorry to ask again, but how exactly could I glue the fatbin with the instrumented host code? Or does it mean we actually cannot instrument both the host & device code at the same time? Thanks! yuanfeng On Tue, Mar 15, 2016 at 10:09 AM, Jingyue Wu <jingyue at google.com> wrote: > Including fatbin into host code should be done in frontend. > > On Mon, Mar 14, 2016

instrumenting device code with gpucc

2016 Mar 05

instrumenting device code with gpucc

On Fri, Mar 4, 2016 at 5:50 PM, Yuanfeng Peng <yuanfeng.jack.peng at gmail.com> wrote: > Hi Jingyue, > > My name is Yuanfeng Peng, I'm a PhD student at UPenn. I'm sorry to bother > you, but I'm having trouble with gpucc in my project, and I would be really > grateful for your help! > > Currently we're trying to instrument CUDA code using LLVM 3.9, and

instrumenting device code with gpucc

2016 Mar 13

instrumenting device code with gpucc

Hey Jingyue, Thanks for being so responsive! I finally figured out a way to resolve the issue: all I have to do is to use `-only-needed` when merging the device bitcodes with llvm-link. However, since we actually need to instrument the host code as well, I encountered another issue when I tried to glue the instrumented host code and fatbin together. When I only instrumented the device code, I

[LLVMdev] [cfe-dev] Proposal: pragma for branch divergence

2015 Jan 25

[LLVMdev] [cfe-dev] Proposal: pragma for branch divergence

Hi Owen and Vinod, Thanks for sharing the paper! I like the idea a lot. Regarding the paper itself, Vinod, are the consensual branches (e.g., cbranch.ifnone) you mentioned in the paper publicly available in PTX ISA? Owen, could you explain more on the approach of using branch-if-none instructions in your mind? I believe you have lots of great insights, but I don't see how cbranch.ifnone

[LLVMdev] Attaching range metadata to IntrinsicInst

2014 Jun 17

[LLVMdev] Attaching range metadata to IntrinsicInst

Eh? How do you envision this? -eric On Tue, Jun 17, 2014 at 2:09 PM, Jingyue Wu <jingyue at google.com> wrote: > Hi Nick, > > That makes sense. I think a main issue here is that the ranges of these PTX > special registers (e.g., threadIdx.x) depend on -target-cpu which is only > visible to clang and llc. Would you mind we specify "target cpu" in the IR > similar

instrumenting device code with gpucc

2016 Mar 12

instrumenting device code with gpucc

Hey Jingyue, Though I tried `opt -nvvm-reflect` on both bc files, the nvvm reflect anchor didn't go away; ptxas is still complaining about the duplicate definition of of function '_ZL21__nvvm_reflect_anchorv' . Did I misused the nvvm-reflect pass? Thanks! yuanfeng On Fri, Mar 11, 2016 at 10:10 AM, Jingyue Wu <jingyue at google.com> wrote: > According to the examples you

[LLVMdev] Attaching range metadata to IntrinsicInst

2014 Jun 17

[LLVMdev] Attaching range metadata to IntrinsicInst

On Tue, Jun 17, 2014 at 2:33 PM, Jingyue Wu <jingyue at google.com> wrote: > Hi Eric, > > In the IR, besides "target datalayout" and "target triple", we have a > special "target cpu" string which is set by the Clang front-end according to > its -target-cpu flag. We also write a Module::getTargetCPU() method to > retrieve this string from the

[RFC] BasicAA considers address spaces?

2015 Aug 08

[RFC] BasicAA considers address spaces?

On Aug 7, 2015, at 8:28 PM, Hal Finkel <hfinkel at anl.gov<mailto:hfinkel at anl.gov>> wrote: ________________________________ From: "Jingyue Wu" <jingyue at google.com<mailto:jingyue at google.com>> To: "Matt Arsenault" <Matthew.Arsenault at amd.com<mailto:Matthew.Arsenault at amd.com>> Cc: llvm-dev at lists.llvm.org<mailto:llvm-dev at

[LLVMdev] [cfe-dev] Proposal: pragma for branch divergence

2015 Jan 24

[LLVMdev] [cfe-dev] Proposal: pragma for branch divergence

In our experience, as Owen also suggests, a pragma or a language extension can be avoided by a combination of static and dynamic analysis. We prefer this approach in our compiler ;) Regards, Vinod On Sat, Jan 24, 2015 at 12:09 AM, Owen Anderson <resistor at mac.com> wrote: > Hi Jingyue, > > Have you considered using dynamic uniformity checks? In my experience you > can

similar to: Assign different RegClasses to a virtual, register based on 'uniform' attribute?