thr3ads.net - llvm dev - [llvm-dev] Instruction itineraries and fence/barrier instructions [Aug 2016]

Phil Tomson via llvm-dev

2016-Aug-22 18:20 UTC

[llvm-dev] Instruction itineraries and fence/barrier instructions

We improved our instruction itineraries and now we're seeing our testcases
for fence instructions break.

For example, we have this testcase:

@write_me = external global i32
@read_me = external global i32

; Function Attrs: nounwind
define i32 @xstg_intrinsic(i32 %foo) #0 {
entry:
; CHECK: store        r0, r1, 0, 32
; CHECK-NEXT: fence 2
  %foo.addr = alloca i32, align 4
  store i32 %foo, i32* %foo.addr, align 4
  %0 = load i32* %foo.addr, align 4
  store volatile i32 %0, i32* @write_me, align 4
  call void @llvm.xstg.memory.barrier(i32 2, i8 0)
  %1 = load volatile i32* @read_me, align 4
  ret i32 %1
}

Prior to adding our instruction itineraries the code generated was:

xstg_intrinsic:                         # @xstg_intrinsic
# BB#0:                                 # %entry
    subI    r509, r509, 16, 64
    store        r510, r509, 0, 64
    bitop1        r510, r509, 0, OR, 64
    store        r0, r510, 12, 32
    movimm        r1, %hi(write_me), 64
    movimmshf32    r1, r1, %lo(write_me)
    store        r0, r1, 0, 32
    fence 2
    movimm        r0, %hi(read_me), 64
    movimmshf32    r0, r0, %lo(read_me)
    load        r1, r0, 0, 32
    bitop1        r509, r510, 0, OR, 64
    load        r510, r509, 0, 64
    addI    r509, r509, 16, 64
    jabs        r511

Note the separation between the store prior to the fence and the code that
comes after.

Now that we've got itineraries in place we see:

    subI    r509, r509, 16, 64
    store        r510, r509, 0, 64
    bitop1        r510, r509, 0, OR, 64
    movimm        r1, %hi(write_me), 64
    store        r0, r510, 12, 32
    movimmshf32    r1, r1, %lo(write_me)
    movimm        r2, %hi(read_me), 64
    store        r0, r1, 0, 32
    movimmshf32    r2, r2, %lo(read_me)
    fence 2
    load        r1, r2, 0, 32
    bitop1        r509, r510, 0, OR, 64
    load        r510, r509, 0, 64
    addI    r509, r509, 16, 64
    jabs        r511

The movimm which sets up the address for the load has been moved up prior
to the fence.

Is there a way to indicate in the itinerary that the position of the fence
should be fixed, i.e. no instruction reordering "through" the fence/barrier?

Phil

Matt Arsenault via llvm-dev

2016-Aug-22 18:40 UTC

> On Aug 22, 2016, at 11:20, Phil Tomson via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
> Is there a way to indicate in the itinerary that position of the fence
> should be fixed - no instruction reordering "through" the fence/barrier?
> 
> Phil

I don’t see a change relative to the memory instructions. Do you mean you want
this to avoid scheduling of any instruction around any other? Does the
instruction have isSideEffects set on it? I think the fallback if that isn’t
enough is to override TargetInstrInfo::isSchedulingBoundary

-Matt
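
To make the fallback concrete, here is a minimal sketch of what a scheduling boundary means. It is a toy model, not LLVM code: `ToyInstr`, `schedulingRegions`, and the free function `isSchedulingBoundary` are hypothetical stand-ins for illustration, and only the hook name `TargetInstrInfo::isSchedulingBoundary` comes from the thread. The idea is that the scheduler reorders instructions only within a region, and a boundary instruction ends the region, so nothing moves across it.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a machine instruction; this is not an LLVM type.
struct ToyInstr {
  std::string Text;
  bool IsFence;
};

// Hypothetical analogue of the target hook: returning true tells the
// scheduler that nothing may be moved across this instruction.
bool isSchedulingBoundary(const ToyInstr &I) { return I.IsFence; }

// Split a block into regions that a scheduler may reorder internally.
// A boundary instruction closes the current region and belongs to none,
// so it keeps its position relative to everything around it.
std::vector<std::vector<ToyInstr>>
schedulingRegions(const std::vector<ToyInstr> &Block) {
  std::vector<std::vector<ToyInstr>> Regions;
  std::vector<ToyInstr> Current;
  for (const ToyInstr &I : Block) {
    if (isSchedulingBoundary(I)) {
      if (!Current.empty())
        Regions.push_back(Current);
      Current.clear();
    } else {
      Current.push_back(I);
    }
  }
  if (!Current.empty())
    Regions.push_back(Current);
  return Regions;
}

// Usage: the fence separates the store from the address setup and load.
void demoFenceSplitsBlock() {
  std::vector<ToyInstr> Block = {{"store r0, r1, 0, 32", false},
                                 {"fence 2", true},
                                 {"movimm r0, %hi(read_me), 64", false},
                                 {"load r1, r0, 0, 32", false}};
  std::vector<std::vector<ToyInstr>> Regions = schedulingRegions(Block);
  assert(Regions.size() == 2);
}
```

In this model the movimm that sets up the load address sits in the second region, so it could not be hoisted above the fence. A real target would express the same decision by overriding the hook in its TargetInstrInfo subclass.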

Phil Tomson via llvm-dev

2016-Aug-22 18:54 UTC

On Mon, Aug 22, 2016 at 11:40 AM, Matt Arsenault <arsenm2 at gmail.com> wrote:
> I don’t see a change relative to the memory instructions.

True, we may be a bit too skittish about this... perhaps the solution
is to change the testcase so that it only ensures that the relative order
between the store and the fence has been preserved.
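
As a rough illustration of that testcase change, the sketch below contrasts an adjacency check (in the spirit of CHECK followed by CHECK-NEXT) with a relative-order check (in the spirit of consecutive CHECK lines). It is a simplified, line-granular toy, not FileCheck itself: `matchesAdjacent`, `matchesInOrder`, and `newSchedule` are hypothetical helpers, and the assembly lines are the relevant tail of the new schedule with whitespace collapsed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Lines = std::vector<std::string>;

// True if each pattern occurs as a substring of some line, in order, with
// any number of other lines allowed in between.
bool matchesInOrder(const Lines &Output, const Lines &Patterns) {
  std::size_t Next = 0;
  for (const std::string &Line : Output)
    if (Next < Patterns.size() &&
        Line.find(Patterns[Next]) != std::string::npos)
      ++Next;
  return Next == Patterns.size();
}

// True if a line containing First is immediately followed by a line
// containing Second.
bool matchesAdjacent(const Lines &Output, const std::string &First,
                     const std::string &Second) {
  for (std::size_t I = 0; I + 1 < Output.size(); ++I)
    if (Output[I].find(First) != std::string::npos &&
        Output[I + 1].find(Second) != std::string::npos)
      return true;
  return false;
}

// Tail of the schedule produced once itineraries were in place.
Lines newSchedule() {
  return {"store r0, r1, 0, 32", "movimmshf32 r2, r2, %lo(read_me)",
          "fence 2", "load r1, r2, 0, 32"};
}

// Usage: the strict check fails on the new schedule, the relaxed one passes.
void demoRelaxedCheck() {
  const Lines Out = newSchedule();
  assert(!matchesAdjacent(Out, "store r0, r1, 0, 32", "fence 2"));
  assert(matchesInOrder(
      Out, {"store r0, r1, 0, 32", "fence 2", "load r1, r2, 0, 32"}));
}
```

The relaxed form still fails if the store sinks below the fence or the load rises above it, which is the property the test is meant to protect.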

> Do you mean you want this to avoid scheduling of any instruction around
> any other? Does the instruction have isSideEffects set on it?

Where can I find information about isSideEffects? Googling "LLVM
isSideEffects" didn't reveal anything that looked relevant.

> I think the fallback if that isn’t enough is to override
> TargetInstrInfo::isSchedulingBoundary

Thanks, I'll look at that in other targets.

Phil

Ryan Taylor via llvm-dev

2016-Aug-22 19:18 UTC

[llvm-dev] Fwd: Instruction itineraries and fence/barrier instructions

Forgot to add list.

---------- Forwarded message ----------
From: Ryan Taylor <ryta1203 at gmail.com>
Date: Mon, Aug 22, 2016 at 3:17 PM
Subject: Re: [llvm-dev] Instruction itineraries and fence/barrier instructions
To: Phil Tomson <phil.a.tomson at gmail.com>


hasSideEffects is a TableGen property on instruction definitions:
https://llvm.org/svn/llvm-project/llvm/trunk/include/llvm/Target/Target.td

Optimization passes check this flag to decide whether it's OK to transform
or remove an instruction; dead code elimination (DCE) is one example. The
instruction might set some control register flag that isn't modeled by the
compiler, for example, so it needs hasSideEffects set so that DCE does not
remove it.
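
A toy sketch of that DCE behaviour follows. It is not LLVM's implementation: `FlaggedInstr` and `eliminateDeadCode` are hypothetical names, and the `HasSideEffects` field merely mimics the role of the TableGen flag. An instruction whose result is unused is dropped unless it is marked as having side effects.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy instruction record; not an LLVM type.
struct FlaggedInstr {
  std::string Name;
  bool ResultUsed;     // does any later instruction read its result?
  bool HasSideEffects; // toy analogue of the hasSideEffects flag
};

// Keep an instruction if its result is used or if it has side effects
// that the toy "compiler" cannot see.
std::vector<FlaggedInstr>
eliminateDeadCode(const std::vector<FlaggedInstr> &Block) {
  std::vector<FlaggedInstr> Kept;
  for (const FlaggedInstr &I : Block)
    if (I.ResultUsed || I.HasSideEffects)
      Kept.push_back(I);
  return Kept;
}

// Usage: the fence produces no value, yet it survives because of the flag.
void demoFenceSurvivesDCE() {
  std::vector<FlaggedInstr> Block = {{"addI", false, false},
                                     {"fence", false, true},
                                     {"load", true, false}};
  std::vector<FlaggedInstr> Kept = eliminateDeadCode(Block);
  assert(Kept.size() == 2);
  assert(Kept[0].Name == "fence");
}
```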

I hope I understood your question correctly. I would definitely try setting
hasSideEffects on the fence instruction; I hope that works.

It might also be worth looking at how other targets implement their
barriers/fences.

-Ryan

