thr3ads.net - llvm dev - [llvm-dev] [GlobalISel] A Proposal for global instruction selection [Jan 2016]

James Molloy via llvm-dev

2016-Jan-13 20:20 UTC

[llvm-dev] [GlobalISel] A Proposal for global instruction selection

>  (Right?)
Uh no, the register content explicitly does change :( We insert REV
instructions (byteswap) on each bitcast. Bitcasts can be merged and elided
etc, but conceptually there's a register content change on every bitcast.

James

On Wed, 13 Jan 2016 at 18:09 Philip Reames <listmail at philipreames.com>
wrote:
>
>
> On 01/13/2016 08:01 AM, Hal Finkel via llvm-dev wrote:
> > ----- Original Message -----
> >> From: "James Molloy" <james at jamesmolloy.co.uk>
> >> To: "Hal Finkel" <hfinkel at anl.gov>
> >> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>, "Quentin Colombet" <qcolombet at apple.com>
> >> Sent: Wednesday, January 13, 2016 9:54:26 AM
> >> Subject: Re: [llvm-dev] [GlobalISel] A Proposal for global instruction selection
> >>
> >>
> >>> I think that teaching the optimizer about big-Endian lane ordering
> >>> would have been better.
> >>
> >> It's certainly arguable. Even in hindsight I'm glad we didn't -
> >> that's the approach GCC took and they've been fixing subtle bugs in
> >> their vectorizer ever since.
> >>
> >>
> >>> Inserting the REV after every LDR
> >>
> >> We only do this conceptually. In most cases REVs cancel out, and we
> >> have the LD1 instruction which is LDR+REV. With enough peepholes
> >> there's really no need for code to run slower.
> >>
> >>
> >>> Given what's been done, should we update the LangRef.
> >>
> >> Potentially, yes. I hadn't realised quite how strongly worded it was
> >> with respect to this.
> >>
> > Please do ;)
> I'm not sure changing bitcast is the right place.  Since the bitcast is
> representing the in-register value (which doesn't change), maybe we
> should define it as part of the load/store instead?  That's essentially
> what's going on; we're converting from a canonical register form to a
> variety of memory forms.  (Right?)
> >
> >   -Hal
> >
> >> James
> >>
> >>
> >> On Wed, 13 Jan 2016 at 14:39 Hal Finkel <hfinkel at anl.gov> wrote:
> >>
> >> [resending so the message is smaller]
> >>
> >> From: "James Molloy via llvm-dev" <llvm-dev at lists.llvm.org>
> >> To: "Quentin Colombet" <qcolombet at apple.com>
> >> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>
> >> Sent: Wednesday, January 13, 2016 2:35:32 AM
> >> Subject: Re: [llvm-dev] [GlobalISel] A Proposal for global
> >> instruction selection
> >>
> >> Hi Philip,
> >>
> >> store <2 x i64> %1, <2 x i64>* %y
> >>
> >> Yes. The memory pattern differs. This is the first diagram on the
> >> right at: http://llvm.org/docs/BigEndianNEON.html#bitconverts )
> >>
> >>
> >> I think that teaching the optimizer about big-Endian lane ordering
> >> would have been better. Inserting the REV after every LDR sounds
> >> very similar to what we do for VSX on little-Endian PowerPC systems
> >> (PowerPC may have a slight advantage here in that we don't need to
> >> do insertelement / extractelement / shufflevector through memory on
> >> systems where little-Endian mode is relevant, see
> >> http://llvm.org/devmtg/2014-10/Slides/Schmidt-SupportingVectorProgramming.pdf
> >> ).
> >>
> >> Given what's been done, should we update the LangRef. It currently
> >> reads, "The ‘bitcast’ instruction converts value to type ty2. It
> >> is always a no-op cast because no bits change with this conversion.
> >> The conversion is done as if the value had been stored to memory and
> >> read back as type ty2." But this is now, at the least, misleading,
> >> because this process of storing the value as one type and reading it
> >> back in as another does, in fact, change the bits. We need to make
> >> clear that this might change the bits (perhaps specifically by
> >> calling out this case of vector bitcasts on big-Endian systems?).
> >>
> >>
> >> Also, regarding this, "Most operating systems however do not run
> >> with alignment faults enabled, so this is often not an issue." Are
> >> you saying that the processor does the correct thing in this case
> >> (if alignment faults are not enabled, then it performs a proper
> >> unaligned load), or that the operating-system trap handler emulates
> >> the unaligned load should one occur?
> >>
> >> Thanks again,
> >> Hal
> >>
> >>
> >> --
> >> Hal Finkel
> >> Assistant Computational Scientist
> >> Leadership Computing Facility
> >> Argonne National Laboratory
> >>
>

Philip Reames via llvm-dev

2016-Jan-13 20:30 UTC

[llvm-dev] [GlobalISel] A Proposal for global instruction selection

On 01/13/2016 12:20 PM, James Molloy wrote:
> >  (Right?)
>
> Uh no, the register content explicitly does change :( We insert REV
> instructions (byteswap) on each bitcast. Bitcasts can be merged and
> elided etc, but conceptually there's a register content change on
> every bitcast.

Ok.  Then we need to change the LangRef as suggested.  Given this is a
rather important semantic change, I think you need to send a top level
RFC to the list.

A couple of points that will need clarified:
- Does this only apply to vector types?  It definitely doesn't apply
  between pointer types today.  What about integer, floating point, and FCAs?
- Is combining two casts into one a legal operation?  I think it is so
  far, but we need to explicitly state that.
- Do we have a predicate for identifying no-op casts that can be freely
  removed/combined?
- Is coercing a load to the type it's immediately bitcast to legal under
  this model?

Daniel Sanders via llvm-dev

2016-Jan-14 13:17 UTC

[llvm-dev] [GlobalISel] A Proposal for global instruction selection

> Ok.  Then we need to change the LangRef as suggested.  Given this is a
> rather important semantic change, I think you need to send a top level
> RFC to the list.

FWIW, I don't think this is a semantic change to LLVM-IR itself. I think
it's more clearing up the misconception that LLVM-IR semantics also apply to
SelectionDAG's operations. That said, I do think it's important to
mention this in LangRef since it's very easy to make this mistake and very
few targets need to worry about the distinction.

To explain why I don't think this is a semantic change to LLVM-IR, let's
consider this example from earlier:
    %0 = load <4 x i32>, <4 x i32>* %x
    %1 = bitcast <4 x i32> %0 to <2 x i64>
    store <2 x i64> %1, <2 x i64>* %y

In LLVM-IR terms, if the value of %0 is:
    %0 = 0x00112233_44556677_8899aabb_ccddeeff
then the value of %1 is:
    %1 = 0x0011223344556677_8899aabbccddeeff
which agrees with the store/load and the 'no bits change' statements in
LangRef.
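
The "as if stored to memory and read back" reading can be checked with a small model. The Python sketch below is only an illustration: it assumes a big-endian layout with lane 0 at the lowest address, and the helper names (`store_v4i32`, `load_v2i64`) are made up for this example rather than taken from LLVM.

```python
import struct

def store_v4i32(lanes):
    """Store four i32 lanes big-endian, lane 0 at the lowest address (assumed layout)."""
    return struct.pack(">4I", *lanes)

def load_v2i64(mem):
    """Reload the same 16 bytes as two big-endian i64 lanes."""
    return list(struct.unpack(">2Q", mem))

v4i32 = [0x00112233, 0x44556677, 0x8899AABB, 0xCCDDEEFF]  # plays the role of %0
v2i64 = load_v2i64(store_v4i32(v4i32))                     # plays the role of %1

# The abstract bit string is unchanged; only the lane boundaries moved.
print([f"{lane:#018x}" for lane in v2i64])
```

In this model the 16 bytes in memory are the same for both types, which is the sense in which the LangRef wording holds at the IR level.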

However, the mapping of these bits to physical register bits is not consistent
between types:
    Physreg(%0) = 0xccddeeff_8899aabb_44556677_00112233
    Physreg(%1) = 0x8899aabbccddeeff_0011223344556677
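
The two Physreg values above can be reproduced with a toy model in which the physical register holds the same lanes in reversed order. This is a sketch under that assumption only; `reverse_lanes`, `to_physreg` and `physreg_bitcast` are hypothetical names for illustration, not backend functions.

```python
WIDTH = 128  # bits in one vector register (assumed)

def reverse_lanes(value, lane_bits, width=WIDTH):
    """Reverse the order of lane_bits-wide lanes inside a width-bit integer."""
    mask = (1 << lane_bits) - 1
    out = 0
    for _ in range(width // lane_bits):
        out = (out << lane_bits) | (value & mask)
        value >>= lane_bits
    return out

def to_physreg(value, lane_bits):
    """Map an abstract IR value to its register image in this toy model."""
    return reverse_lanes(value, lane_bits)

def physreg_bitcast(reg, src_lane_bits, dst_lane_bits):
    """Bitcast at register level: undo the source lane order, apply the destination one."""
    abstract = reverse_lanes(reg, src_lane_bits)
    return reverse_lanes(abstract, dst_lane_bits)

value = 0x00112233_44556677_8899aabb_ccddeeff
reg0 = to_physreg(value, 32)          # register image of %0 as <4 x i32>
reg1 = physreg_bitcast(reg0, 32, 64)  # register image of %1 as <2 x i64>

print(f"Physreg(%0) = {reg0:#034x}")
print(f"Physreg(%1) = {reg1:#034x}")
```

The abstract value is identical for `%0` and `%1`, yet `reg0` and `reg1` differ, which is the register content change the byteswap accounts for.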

Essentially, I'm saying that BitCastInst and ISD::BITCAST have slightly
different semantics because of their different domains. The former is working on
an abstract representation of the values where both statements in LangRef are
true, but the latter is closer to the target where the 'no bits change'
statement ceases to be true in some cases.

> A couple of points that will need clarified:
> - Does this only apply to vector types?  It definitely doesn't apply
> between pointer types today.  What about integer, floating point, and FCAs?

I've only seen it for vector types so far but in theory it could happen for
other types. I'd expect FCAs to encounter it since the physical registers
may contain padding that isn't present in the LLVM-IR representation and the
placement and amount of padding will depend on the exact FCA.
I can think of cases where address space casts can encounter the same problem
but that's already been covered in LangRef ("It can be a no-op cast or
a complex value modification, depending on the target and the address space
pair.").

Does anyone use FCAs directly? Most targets seem to convert them to same-sized
integers or bitcast an FCA* to i8*.

> - Is combining two casts into one a legal operation?  I think it is so far,
> but we need to explicitly state that.

Yes, A->B->C and A->C are equivalent.
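
As a sanity check on that equivalence, here is a sketch that uses a toy lane-reversal model of the register layout (an assumption for illustration, with hypothetical helper names). It compares a two-step register-level bitcast against the direct one for random register contents.

```python
import random

WIDTH = 128  # bits in one vector register (assumed)

def reverse_lanes(value, lane_bits, width=WIDTH):
    """Reverse the order of lane_bits-wide lanes inside a width-bit integer."""
    mask = (1 << lane_bits) - 1
    out = 0
    for _ in range(width // lane_bits):
        out = (out << lane_bits) | (value & mask)
        value >>= lane_bits
    return out

def physreg_bitcast(reg, src_lane_bits, dst_lane_bits):
    """Register-level bitcast in the toy model: two lane reversals."""
    return reverse_lanes(reverse_lanes(reg, src_lane_bits), dst_lane_bits)

def casts_compose(trials=1000, seed=0):
    """Check that A->B->C gives the same register image as A->C."""
    rng = random.Random(seed)
    for _ in range(trials):
        reg = rng.getrandbits(WIDTH)
        a, b, c = (rng.choice([8, 16, 32, 64]) for _ in range(3))
        two_step = physreg_bitcast(physreg_bitcast(reg, a, b), b, c)
        one_step = physreg_bitcast(reg, a, c)
        if two_step != one_step:
            return False
    return True

print(casts_compose())
```

The check passes because each lane reversal is its own inverse, so the intermediate type's reversal cancels out.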

> - Do we have a predicate for identifying no-op casts that can be freely
> removed/combined?

James mentioned one in CGP but I haven't been able to find it. I don't
think it's necessary to have one at the LLVM-IR level but we do need one in
the backends. I remember adding one to the backend but I can't find that
either so I think I'm remembering one of my patches from before I split
MSA's registers into type-specific classes.
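
For illustration only, a backend-side predicate of this kind might look like the sketch below. It is not the predicate from CGP or from any LLVM backend; it assumes a toy model where a big-endian target reverses lane order in registers, and the name `is_noop_physreg_bitcast` is hypothetical.

```python
WIDTH = 128  # bits in one vector register (assumed)

def reverse_lanes(value, lane_bits, width=WIDTH):
    """Reverse the order of lane_bits-wide lanes inside a width-bit integer."""
    mask = (1 << lane_bits) - 1
    out = 0
    for _ in range(width // lane_bits):
        out = (out << lane_bits) | (value & mask)
        value >>= lane_bits
    return out

def physreg_bitcast(reg, src_lane_bits, dst_lane_bits):
    """Register-level bitcast in the toy model: two lane reversals."""
    return reverse_lanes(reverse_lanes(reg, src_lane_bits), dst_lane_bits)

def is_noop_physreg_bitcast(src_lane_bits, dst_lane_bits, big_endian=True):
    """Hypothetical predicate: the cast leaves register bits alone when the
    lane width is unchanged, or when the target does not reorder lanes."""
    return (not big_endian) or src_lane_bits == dst_lane_bits

# Compare the predicate with the model on a register that has only bit 0 set.
for src in (8, 16, 32, 64):
    for dst in (8, 16, 32, 64):
        unchanged = physreg_bitcast(1, src, dst) == 1
        print(src, dst, is_noop_physreg_bitcast(src, dst), unchanged)
```

Under these assumptions a cast such as `<4 x i32>` to `<4 x float>` would be free, while `<4 x i32>` to `<2 x i64>` would not.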

> - Is coercing a load to the type it's immediately bitcast to legal
> under this model?

Yes.
