Displaying 5 results from an estimated 5 matches for "subgroupadd".
2020 Aug 09
2
[RFC] Introducing convergence control bundles and intrinsics
...hreads should communicate. This is not the case in an
unstructured CFG. Here's an example of a non-trivial loop where a
reduction is used (the result of the reduction is the sum of the input
values of all participating threads):
A:
  br label %B
B:
  ...
  %sum.b = call i32 @subgroupAdd(i32 %v) ; convergent
  ...
  br i1 %cc1, label %B, label %C
C:
  ...
  br i1 %cc2, label %B, label %D
D:
  ; loop exit
Suppose this code is executed by two threads grouped in a (very short)
vector, and the threads execute the following sequences of basic blocks:
> Th...
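The ambiguity the snippet describes can be sketched with a small simulation. This is only an illustration under stated assumptions: the two block sequences below are hypothetical (the snippet's own sequences are truncated), each thread's input %v is held constant across iterations, and the two pairing rules are simplified models, not the semantics proposed in the RFC.

```python
from collections import defaultdict

def subgroup_add_by_visit(traces, values, block="B"):
    """Model 1: the n-th visit to `block` of every thread forms one group."""
    groups = defaultdict(list)  # visit index -> ids of participating threads
    for tid, trace in enumerate(traces):
        visit = 0
        for name in trace:
            if name == block:
                groups[visit].append(tid)
                visit += 1
    return {k: sum(values[t] for t in tids) for k, tids in groups.items()}

def subgroup_add_by_step(traces, values, block="B"):
    """Model 2: threads communicate only if they are in `block` at the same
    position of their block sequence (a naive lockstep view)."""
    groups = defaultdict(list)  # step index -> ids of participating threads
    for tid, trace in enumerate(traces):
        for step, name in enumerate(trace):
            if name == block:
                groups[step].append(tid)
    return {k: sum(values[t] for t in tids) for k, tids in groups.items()}

# Hypothetical block sequences for two threads (an assumption for illustration).
traces = [list("ABCBCD"), list("ABBCD")]
# Hypothetical per-thread inputs to subgroupAdd.
values = [1, 10]

print(subgroup_add_by_visit(traces, values))
print(subgroup_add_by_step(traces, values))
```

Under the first model both executions of B sum over both threads; under the second, only the first execution does, and the later ones each see a single thread. The unstructured CFG alone does not say which grouping is intended.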
2020 Aug 17
2
[RFC] Introducing convergence control bundles and intrinsics
...'s an example of a non-trivial loop where a
> > reduction is used (the result of the reduction is the sum of the input
> > values of all participating threads):
> >
> > A:
> >   br label %B
> >
> > B:
> >   ...
> >   %sum.b = call i32 @subgroupAdd(i32 %v) ; convergent
> >   ...
> >   br i1 %cc1, label %B, label %C
> >
> > C:
> >   ...
> >   br i1 %cc2, label %B, label %D
> >
> > D:
> >   ; loop exit
> >
> > Suppose this code is executed by two threads grouped in...
2020 Aug 17
2
[RFC] Introducing convergence control bundles and intrinsics
...ed (the result of the reduction is the sum of the input
> >>> values of all participating threads):
> >>>
> >>> A:
> >>>   br label %B
> >>>
> >>> B:
> >>>   ...
> >>>   %sum.b = call i32 @subgroupAdd(i32 %v) ; convergent
> >>>   ...
> >>>   br i1 %cc1, label %B, label %C
> >>>
> >>> C:
> >>>   ...
> >>>   br i1 %cc2, label %B, label %D
> >>>
> >>> D:
> >>>   ; loop e...
2020 Aug 09
2
_mm_lfence in both pathes of an if/else are hoisted by SimplfyCFG potentially breaking use as a speculation barrier
...il.com> wrote:
>
> Hi Craig,
>
> that's an interesting problem.
>
> We have a superficially similar problem in GPU programming models
> where there are cross-thread communication operations that are
> sensitive to control flow, as in:
>
>   if (c) {
>     b = subgroupAdd(a);
>     bar(b);
>   } else {
>     b = subgroupAdd(a);
>     baz(b);
>   }
>
> LLVM will merge those, even though it changes the behavior
> (potentially summing over a larger set of threads than in the original
> program). Merging them is inherently correct for LLVM'...
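The effect of merging the two calls can be sketched as follows. This is a minimal simulation, assuming a hypothetical four-thread subgroup and modelling subgroupAdd as a sum over the threads that reach the call together; the function names are made up for illustration and do not correspond to any LLVM or GPU API.

```python
def subgroup_add(values, active):
    """Sum the inputs of the threads that execute the call together."""
    return sum(values[t] for t in active)

def run_original(values, cond):
    """One subgroupAdd per branch: each sums only over the threads in that branch."""
    threads = range(len(values))
    then_threads = [t for t in threads if cond[t]]
    else_threads = [t for t in threads if not cond[t]]
    result = {}
    for t in then_threads:
        result[t] = subgroup_add(values, then_threads)
    for t in else_threads:
        result[t] = subgroup_add(values, else_threads)
    return result

def run_hoisted(values, cond):
    """A single subgroupAdd hoisted above the branch: sums over all threads."""
    threads = range(len(values))
    total = subgroup_add(values, threads)
    return {t: total for t in threads}

# Hypothetical inputs: four threads, two taking each side of the branch.
inputs = [1, 2, 3, 4]
branch = [True, True, False, False]

print(run_original(inputs, branch))  # per-branch sums
print(run_hoisted(inputs, branch))   # one sum over the whole subgroup
```

In this model the threads in the `if` branch see 3 and those in the `else` branch see 7 before the transformation, while every thread sees 10 afterwards, which is the "larger set of threads" the message refers to.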
2020 Jul 28
2
_mm_lfence in both pathes of an if/else are hoisted by SimplfyCFG potentially breaking use as a speculation barrier
_mm_lfence was originally documented as a load fence. But in light of
speculative execution vulnerabilities it has started being advertised as a
way to prevent speculative execution. The current Intel Software Development
Manual documents it as follows: "Specifically, LFENCE does not execute until all
prior instructions have completed locally, and no later instruction begins
execution until LFENCE