Displaying 14 results from an estimated 14 matches for "__syncthreads".
2015 Aug 21
3
[CUDA/NVPTX] is inlining __syncthreads allowed?
Hi Justin,
Is a compiler allowed to inline a function that calls __syncthreads? I saw
nvcc does that, but not sure it's valid though. For example,
void foo() {
  __syncthreads();
}
if (threadIdx.x % 2 == 0) {
  ...
  foo();
} else {
  ...
  foo();
}
Before inlining, all threads meet at one __syncthreads(). After inlining
if (threadIdx.x % 2 == 0) {
  ...
  __syncthrea...
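The pattern in this question can be written out as a complete program. What follows is a minimal sketch, not code from the thread: the kernel names, the block size `kThreads`, and the shared-memory exchange are all assumptions made for illustration, and `foo` is the hypothetical helper from the message. The first kernel has the shape described above and is included for comparison only. The CUDA programming guide permits `__syncthreads()` in conditional code only when the condition evaluates identically across the whole thread block, so `main()` launches just the second kernel, where the barrier is reached after the branch rejoins.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

constexpr int kThreads = 64;  // assumed block size for this sketch

// Hypothetical helper, mirroring foo() in the question.
__device__ void foo() { __syncthreads(); }

// Shape of the code in the question: foo() is called on both sides of a
// thread-dependent branch. Shown for comparison only; main() does not launch it.
__global__ void barrier_in_branches(int *out) {
  __shared__ int buf[kThreads];
  int tid = threadIdx.x;
  if (tid % 2 == 0) {
    buf[tid] = 2 * tid;
    foo();
  } else {
    buf[tid] = 3 * tid;
    foo();
  }
  out[tid] = buf[(tid + 1) % kThreads];
}

// Same work with the barrier moved after the branch, so every thread
// reaches a single barrier in uniform control flow.
__global__ void barrier_after_branch(int *out) {
  __shared__ int buf[kThreads];
  int tid = threadIdx.x;
  if (tid % 2 == 0) {
    buf[tid] = 2 * tid;
  } else {
    buf[tid] = 3 * tid;
  }
  foo();
  out[tid] = buf[(tid + 1) % kThreads];
}

int main() {
  int *d_out = nullptr;
  int h_out[kThreads];
  if (cudaMalloc(&d_out, sizeof(h_out)) != cudaSuccess) return 1;
  barrier_after_branch<<<1, kThreads>>>(d_out);
  if (cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost) !=
      cudaSuccess) {
    return 1;
  }
  cudaFree(d_out);

  // Each thread should see the value written by its right-hand neighbour.
  int errors = 0;
  for (int i = 0; i < kThreads; ++i) {
    int j = (i + 1) % kThreads;
    int expected = (j % 2 == 0) ? 2 * j : 3 * j;
    if (h_out[i] != expected) ++errors;
  }
  printf("%s\n", errors == 0 ? "ok" : "mismatch");
  return errors != 0;
}
```

Whether a compiler may turn the second form into the first, or inline `foo` into each branch of the first, is the question the thread raises; the sketch takes no position on it.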
2015 Aug 21
2
[CUDA/NVPTX] is inlining __syncthreads allowed?
...; duplicate the barrier. Which nvcc shows this behavior?
>
> Adding Vinod and Yuan for comment.
>
> > On Aug 21, 2015, at 4:24 PM, Jingyue Wu <jingyue at google.com> wrote:
> >
> > Hi Justin,
> >
> > Is a compiler allowed to inline a function that calls __syncthreads? I
> > saw nvcc does that, but not sure it's valid though. For example,
> >
> > void foo() {
> > __syncthreads();
> > }
> >
> > if (threadIdx.x % 2 == 0) {
> > ...
> > foo();
> > } else {
> > ...
> > foo();
> > }...
2020 May 22
2
__syncthreads() intrinsic seems to be missing from __clang_cuda_device_functions.h
Hi!
I see the `__syncthreads_{and|or|count}()` CUDA intrinsics listed in
https://github.com/llvm/llvm-project/blob/master/clang/lib/Headers/__clang_cuda_device_functions.h
but
the vanilla `__syncthreads()` intrinsic seems to be missing. Is this an
oversight, or is it omitted for a specific reason?
Thank you!
2015 Apr 08
5
[LLVMdev] CUDA front-end (CUDA to LLVM IR)
Hi,
I wanted to ask whether there is an ongoing effort (or an already established
tool) to convert CUDA kernels (that use CUDA-specific
intrinsics, e.g., threadIdx.x, __syncthreads(), ...) to LLVM IR. I am aware
that I can do this for OpenCL with the help of libclc, but I cannot find
anything similar for CUDA.
Thanks
2015 Apr 08
2
[LLVMdev] CUDA front-end (CUDA to LLVM IR)
...2015-04-08 19:01 GMT+02:00 Ahmed ElTantawy <ahmede at ece.ubc.ca>:
>
>> Hi,
>>
>> I wanted to ask whether there is an ongoing effort (or an already
>> established tool) to convert CUDA kernels (that use CUDA-specific
>> intrinsics, e.g., threadIdx.x, __syncthreads(), ...) to LLVM IR. I
>> am aware that I can do this for OpenCL with the help of libclc, but I
>> cannot find anything similar for CUDA.
>>
>> Thanks
2016 Jul 01
2
Missing TargetPrefix for NVVM intrinsics
...There seem to be three prefixes that are used in this file. About 900
are int_nvvm_*, 30 are int_ptx_*, and 1 is int_cuda. It isn't clear to
me if this inconsistency is intentional or warranted - should these all
be named int_nvvm_*? Is there a good reason to differentiate int_ptx_*?
Why does __syncthreads map to int_cuda_syncthreads, rather than
int_nvvm_syncthreads?
I'm probably going to go ahead and add the TargetPrefix to the nvvm
intrinsics, but I'm not familiar enough with NVPTX to know what to do
with the others.
Thanks,
-- Justin
2015 Aug 14
2
[LLVMdev] RFC: Convergent attribute
...xample, if a function containing a convergent instruction is called at two call sites, inlining the function produces two convergent instructions. Neither of the two is control equivalent to the original, but combined they are, in some sense.
>
> I came across this while thinking about whether __syncthreads in CUDA should be tagged "convergent". Right now, it's tagged as noduplicate, so inlining and loop unrolling are disallowed. But I think noduplicate is too strong for the semantics of convergent.
>
> Jingyue
>
> On Wed, May 13, 2015 at 1:17 PM, Owen Anderson <resistor...
2012 May 01
2
[LLVMdev] [llvm-commits] [PATCH][RFC] NVPTX Backend
...t seems to only be possible to
> use the bar.sync 0 instruction. Unless this is being removed for PTX 3.0, the
> spec (and the PTX backend) support using bar.sync {0..15}. The old PTX
> intrinsic also supports a non-zero integer operand.
The NVVM intrinsic is there to implement CUDA's __syncthreads(). The old intrinsic is still exposed. I'll see about adding a proper NVVM intrinsic.
>
> * I guess this raises the question of whether or not it's actually worthwhile
> retaining compatibility with the old backend. I converted my Jet compiler to
> use NVVM intrinsics and st...
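For context on the `bar.sync {0..15}` remark, the sketch below shows one way a non-zero barrier can be reached from CUDA source through inline PTX. It is an illustration under stated assumptions, not code from the thread or from the NVPTX backend: `named_barrier_sync`, `two_group_exchange`, `kGroup`, and `kThreads` are hypothetical names, and the thread count passed to `bar.sync` is assumed to be a multiple of the warp size.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

constexpr int kGroup = 64;            // threads per group (multiple of warp size)
constexpr int kThreads = 2 * kGroup;  // assumed block size for this sketch

// Hypothetical wrapper around the PTX instruction "bar.sync a, b":
// barrier number a, with b threads participating.
__device__ void named_barrier_sync(int id, int nthreads) {
  asm volatile("bar.sync %0, %1;" : : "r"(id), "r"(nthreads) : "memory");
}

// Two groups of 64 threads each synchronize on their own barrier.
__global__ void two_group_exchange(int *out) {
  __shared__ int buf[kThreads];
  int tid = threadIdx.x;
  int group = tid / kGroup;
  int local = tid % kGroup;
  buf[tid] = 10 * tid;
  // Barrier 0 is the one __syncthreads() uses, so the groups take 1 and 2.
  named_barrier_sync(1 + group, kGroup);
  // Read the value written by the next thread within the same group.
  out[tid] = buf[group * kGroup + (local + 1) % kGroup];
}

int main() {
  int *d_out = nullptr;
  int h_out[kThreads];
  if (cudaMalloc(&d_out, sizeof(h_out)) != cudaSuccess) return 1;
  two_group_exchange<<<1, kThreads>>>(d_out);
  if (cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost) !=
      cudaSuccess) {
    return 1;
  }
  cudaFree(d_out);

  int errors = 0;
  for (int i = 0; i < kThreads; ++i) {
    int group = i / kGroup;
    int local = i % kGroup;
    int expected = 10 * (group * kGroup + (local + 1) % kGroup);
    if (h_out[i] != expected) ++errors;
  }
  printf("%s\n", errors == 0 ? "ok" : "mismatch");
  return errors != 0;
}
```

Each group only waits for its own 64 threads, which is the kind of use a barrier operand other than 0 makes possible.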
2015 Aug 14
2
[LLVMdev] RFC: Convergent attribute
...function
> containing a convergent instruction is called at two call sites, inlining
> the function produces two convergent instructions. Neither of the two is
> control equivalent to the original, but combined they are, in some sense.
>
> I came across this while thinking about whether __syncthreads in CUDA should
> be tagged "convergent". Right now, it's tagged as noduplicate, so inlining
> and loop unrolling are disallowed. But I think noduplicate is too strong
> for the semantics of convergent.
>
> Jingyue
>
> On Wed, May 13, 2015 at 1:17 PM, Owen Anderson &...
2012 May 02
0
[LLVMdev] [llvm-commits] [PATCH][RFC] NVPTX Backend
> ...s this is being removed for PTX 3.0, the
> spec (and the PTX backend) support using bar.sync {0..15}. The old PTX
> intrinsic also supports a non-zero integer operand.
The NVVM intrinsic is there to implement CUDA's __syncthreads(). The old intrinsic is still exposed. I'll see about adding a proper NVVM intrinsic.
>
> * I guess this raises the question of whether or not it's actually worthwhile
> retaining compatibility with the...
2014 Sep 30
2
[LLVMdev] Behaviour of NVPTX intrinsic
I wrote test.ll as shown below and ran 'opt' on it as
"opt -std-compile-opts test.ll -S -o -". But the output shows that there
is code motion around the barrier intrinsics.
test.ll
-------
; ModuleID = 'test.bc'
define void @test(i16* %I_0, i16* %I_1, i16* %I_2, i16* %I_3, i16* %O_0) {
entry:
%T_0 = load volatile i16* %I_0
%T_1 = load volatile i16* %I_1
%T_2 =
2015 May 13
8
[LLVMdev] RFC: Convergent attribute
Below is a proposal for a new "convergent" intrinsic attribute and MachineInstr property, needed for correctly modeling many SPMD/SIMT programming models in LLVM. Comments and feedback welcome.
—Owen
In order to make LLVM more suitable for programming models variously called SPMD
and SIMT, we would like to propose a new intrinsic and MachineInstr annotation
called
2012 Apr 29
0
[LLVMdev] [llvm-commits] [PATCH][RFC] NVPTX Backend
Justin,
Firstly, this is great! It seems to be so much further forward in terms
of features
2012 Apr 27
2
[LLVMdev] [llvm-commits] [PATCH][RFC] NVPTX Backend
Thanks for the feedback!
The attached patch addresses the style issues that have been found.
From: Jim Grosbach [mailto:grosbach at apple.com]
Sent: Wednesday, April 25, 2012 2:22 PM
To: Justin Holewinski
Cc: llvm-commits at cs.uiuc.edu; llvmdev at cs.uiuc.edu; Vinod Grover
Subject: Re: [llvm-commits] [PATCH][RFC] NVPTX Backend
Hi Justin,
Cool stuff, to be sure. Excited to see this.
As a