Displaying 20 results from an estimated 2000 matches similar to: "Assign different RegClasses to a virtual register based on 'uniform' attribute?"
2016 Dec 21
0
llvm/cuda: Identify kernel functions and optimizations
https://github.com/llvm-mirror/llvm/blob/652375a8cc49615de31fd9d424753795059185b6/lib/Target/NVPTX/NVPTXUtilities.h#L58
Does this solve your problem?
On Wed, Dec 21, 2016 at 2:29 PM, Gurunath Kadam via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Hi,
>
> I am trying to instrument CUDA kernel functions only (llvm-3.9.0).
>
> Is there a way to identify cuda kernel
2016 Dec 21
2
llvm/cuda: Identify kernel functions and optimizations
Hi,
I am trying to instrument CUDA kernel functions only (llvm-3.9.0).
Is there a way to identify cuda kernel functions?
I see that the LLVM IR for CUDA has an nvvm annotations section, where kernel
functions are identified for NVPTX usage. I can parse the whole IR for this
kernel metadata and then proceed, but this is very clumsy.
Other way is to work with cuda-device-only IR. But then I am not
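For illustration, the "parse the IR for kernel metadata" approach described above could look roughly like the sketch below. It is a text-based scan, not the NVPTXUtilities.h helper linked in the reply, and it assumes the LLVM 3.9-era textual form of an annotation node, e.g. `!0 = !{void (float*)* @saxpy, !"kernel", i32 1}`. The name `kernelNamesFromIR` is hypothetical and not part of LLVM; quoted function names such as `@"my kernel"` are not handled.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not an LLVM API): collect the names of functions that
// are tagged as kernels in textual NVPTX IR. Assumes annotation nodes of the
// form:  !0 = !{void (float*)* @saxpy, !"kernel", i32 1}
std::vector<std::string> kernelNamesFromIR(const std::string &irText) {
  std::vector<std::string> kernels;
  std::istringstream lines(irText);
  std::string line;
  while (std::getline(lines, line)) {
    // Only metadata nodes carrying the "kernel" tag are of interest.
    if (line.find("!\"kernel\"") == std::string::npos)
      continue;
    // The annotated function is referenced as @name inside the node.
    const std::size_t at = line.find('@');
    if (at == std::string::npos)
      continue;
    // The name runs from just after '@' up to the next ',' or space.
    const std::size_t end = line.find_first_of(", ", at);
    if (end == std::string::npos)
      kernels.push_back(line.substr(at + 1));
    else
      kernels.push_back(line.substr(at + 1, end - at - 1));
  }
  return kernels;
}
```

Inside a real pass, walking the module's metadata through the LLVM API would avoid string matching entirely; the scan above only shows what the annotation carries.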
2016 Dec 21
3
Assign different RegClasses to a virtual register based on 'uniform' attribute?
2016-12-20 22:14 GMT+08:00 Tom Stellard <tom at stellard.net>:
>
> On Tue, Dec 20, 2016 at 11:00:09AM +0800, Ruiling Song wrote:
> > Hi,
> >
> > I am working on a new LLVM target for Intel GPU, which also has same kind
> > of scalar/vector register classes used in AMDGPU target. Like for a i32
> > virtual register, it will be held in scalar register if its
2016 Dec 21
1
Assign different RegClasses to a virtual register based on 'uniform' attribute?
On Wed, Dec 21, 2016 at 10:31:57AM -0500, Matt Arsenault wrote:
>
> > On Dec 21, 2016, at 10:26, Ruiling Song <ruiling.song83 at gmail.com> wrote:
> >
> >
> >
> > 2016-12-20 22:14 GMT+08:00 Tom Stellard <tom at stellard.net <mailto:tom at stellard.net>>:
> > >
> > > On Tue, Dec 20, 2016 at 11:00:09AM +0800, Ruiling Song
2016 Dec 20
0
Assign different RegClasses to a virtual register based on 'uniform' attribute?
On Tue, Dec 20, 2016 at 11:00:09AM +0800, Ruiling Song wrote:
> Hi,
>
> I am working on a new LLVM target for Intel GPU, which also has same kind
> of scalar/vector register classes used in AMDGPU target. Like for a i32
> virtual register, it will be held in scalar register if its value is
> uniform across a wavefront/warp, otherwise it will be in a vector register.
> Does
2016 Dec 21
0
Assign different RegClasses to a virtual register based on 'uniform' attribute?
> On Dec 21, 2016, at 10:26, Ruiling Song <ruiling.song83 at gmail.com> wrote:
>
>
>
> 2016-12-20 22:14 GMT+08:00 Tom Stellard <tom at stellard.net <mailto:tom at stellard.net>>:
> >
> > On Tue, Dec 20, 2016 at 11:00:09AM +0800, Ruiling Song wrote:
> > > Hi,
> > >
> > > I am working on a new LLVM target for Intel GPU, which
2016 Dec 20
2
Assign different RegClasses to a virtual register based on 'uniform' attribute?
Hi,
I am working on a new LLVM target for Intel GPU, which also has same kind
of scalar/vector register classes used in AMDGPU target. Like for a i32
virtual register, it will be held in scalar register if its value is
uniform across a wavefront/warp, otherwise it will be in a vector register.
Has AMDGPU already done this? I read the code, but I didn't figure out how
to do this. Anybody has
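To make the question concrete, here is a toy sketch of the idea: a value goes to a scalar register class when it is uniform across the wavefront and to a vector register class otherwise. This is not AMDGPU's implementation and uses no LLVM API; `ToyInst`, `RegClass`, and `assignRegClasses` are hypothetical names. It only propagates divergence through data dependences in straight-line code and ignores divergence caused by control flow.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical register classes: Scalar for uniform values, Vector otherwise.
enum class RegClass { Scalar, Vector };

// A toy SSA instruction that defines `def` from `uses`. The flag marks values
// that differ per lane by construction, e.g. a read of the work-item id.
struct ToyInst {
  std::string def;
  std::vector<std::string> uses;
  bool sourceOfDivergence;
};

// Forward propagation in definition order: a value is divergent if it is a
// divergence source or if any of its operands is divergent.
std::unordered_map<std::string, RegClass>
assignRegClasses(const std::vector<ToyInst> &insts) {
  std::unordered_map<std::string, RegClass> result;
  for (const ToyInst &inst : insts) {
    bool divergent = inst.sourceOfDivergence;
    for (const std::string &use : inst.uses) {
      auto it = result.find(use);
      // Operands not defined here (arguments, constants) count as uniform.
      if (it != result.end() && it->second == RegClass::Vector)
        divergent = true;
    }
    result[inst.def] = divergent ? RegClass::Vector : RegClass::Scalar;
  }
  return result;
}
```

With a work-item id `tid` marked as a divergence source and a uniform `base`, a value computed from both ends up in the vector class, while a value computed from `base` alone stays scalar.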
2016 Nov 28
2
LLVM Pass for Instructions in Function (error
> On Nov 27, 2016, at 6:40 PM, Gurunath Kadam via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Hi Sandeep,
>
> Thanks.
>
> One question about:
>
> Value* AddrPointer = Inst->getIperand(0);
>
> So this works for LVALUE(S) i.e. in my case pointer on LHS of '='. I cannot find anything online about getloperand.
>
> For reference
2016 Nov 28
2
LLVM Pass for Instructions in Function (error
Hi,
Sent via the Samsung Galaxy Note® 3, an AT&T 4G LTE smartphone
-------- Original message --------
From: Gurunath Kadam via llvm-dev <llvm-dev at lists.llvm.org>
Date: 11/27/2016 7:49 PM (GMT-06:00)
To: llvm-dev at lists.llvm.org
Subject: [llvm-dev] LLVM Pass for Instructions in Function (error
Hi,
Please find the embedded code. Also you may follow
2016 Oct 14
2
LLVM/CLANG: CUDA compilation fail for inline assembly code
Hi,
I am sorry for sending this query again here, but maybe I sent it to wrong
list yesterday.
I am trying to compile LonestarGPU-rev2.0
<http://iss.ices.utexas.edu/?p=projects/galois/lonestargpu/download>
benchmark suite with LLVM/CLANG.
This suite has a following piece of code (more info here
2016 Mar 10
4
instrumenting device code with gpucc
It's hard to tell what is wrong without a concrete example. E.g., what is
the program you are instrumenting? What is the definition of the hook
function? How did you link that definition with the binary?
One thing suspicious to me is that you may have linked the definition of
_Cool_MemRead_Hook as a host function instead of a device function. AFAIK,
PTX assembly cannot be linked. So, if you
2016 Mar 15
2
instrumenting device code with gpucc
Hi Jingyue,
Sorry to ask again, but how exactly could I glue the fatbin with the
instrumented host code? Or does it mean we actually cannot instrument both
the host & device code at the same time?
Thanks!
yuanfeng
On Tue, Mar 15, 2016 at 10:09 AM, Jingyue Wu <jingyue at google.com> wrote:
> Including fatbin into host code should be done in frontend.
>
> On Mon, Mar 14, 2016
2016 Mar 05
2
instrumenting device code with gpucc
On Fri, Mar 4, 2016 at 5:50 PM, Yuanfeng Peng <yuanfeng.jack.peng at gmail.com>
wrote:
> Hi Jingyue,
>
> My name is Yuanfeng Peng, I'm a PhD student at UPenn. I'm sorry to bother
> you, but I'm having trouble with gpucc in my project, and I would be really
> grateful for your help!
>
> Currently we're trying to instrument CUDA code using LLVM 3.9, and
2016 Mar 13
2
instrumenting device code with gpucc
Hey Jingyue,
Thanks for being so responsive! I finally figured out a way to resolve the
issue: all I have to do is to use `-only-needed` when merging the device
bitcodes with llvm-link.
However, since we actually need to instrument the host code as well, I
encountered another issue when I tried to glue the instrumented host code
and fatbin together. When I only instrumented the device code, I
2015 Jan 25
2
[LLVMdev] [cfe-dev] Proposal: pragma for branch divergence
Hi Owen and Vinod,
Thanks for sharing the paper! I like the idea a lot. Regarding the paper
itself, Vinod, are the consensual branches (e.g., cbranch.ifnone) you
mentioned in the paper publicly available in PTX ISA?
Owen, could you explain more on the approach of using branch-if-none
instructions in your mind? I believe you have lots of great insights, but I
don't see how cbranch.ifnone
2014 Jun 17
2
[LLVMdev] Attaching range metadata to IntrinsicInst
Eh? How do you envision this?
-eric
On Tue, Jun 17, 2014 at 2:09 PM, Jingyue Wu <jingyue at google.com> wrote:
> Hi Nick,
>
> That makes sense. I think a main issue here is that the ranges of these PTX
> special registers (e.g., threadIdx.x) depend on -target-cpu which is only
> visible to clang and llc. Would you mind we specify "target cpu" in the IR
> similar
2016 Mar 12
2
instrumenting device code with gpucc
Hey Jingyue,
Though I tried `opt -nvvm-reflect` on both bc files, the nvvm reflect
anchor didn't go away; ptxas is still complaining about the duplicate
definition of function '_ZL21__nvvm_reflect_anchorv'. Did I misuse
the nvvm-reflect pass?
Thanks!
yuanfeng
On Fri, Mar 11, 2016 at 10:10 AM, Jingyue Wu <jingyue at google.com> wrote:
> According to the examples you
2014 Jun 17
3
[LLVMdev] Attaching range metadata to IntrinsicInst
On Tue, Jun 17, 2014 at 2:33 PM, Jingyue Wu <jingyue at google.com> wrote:
> Hi Eric,
>
> In the IR, besides "target datalayout" and "target triple", we have a
> special "target cpu" string which is set by the Clang front-end according to
> its -target-cpu flag. We also write a Module::getTargetCPU() method to
> retrieve this string from the
2015 Aug 08
2
[RFC] BasicAA considers address spaces?
On Aug 7, 2015, at 8:28 PM, Hal Finkel <hfinkel at anl.gov<mailto:hfinkel at anl.gov>> wrote:
________________________________
From: "Jingyue Wu" <jingyue at google.com<mailto:jingyue at google.com>>
To: "Matt Arsenault" <Matthew.Arsenault at amd.com<mailto:Matthew.Arsenault at amd.com>>
Cc: llvm-dev at lists.llvm.org<mailto:llvm-dev at
2015 Jan 24
2
[LLVMdev] [cfe-dev] Proposal: pragma for branch divergence
In our experience, as Owen also suggests, a pragma or a language extension
can be avoided by a combination of static and dynamic analysis. We prefer
this approach in our compiler ;)
Regards,
Vinod
On Sat, Jan 24, 2015 at 12:09 AM, Owen Anderson <resistor at mac.com> wrote:
> Hi Jingyue,
>
> Have you considered using dynamic uniformity checks? In my experience you
> can