thr3ads.net - llvm dev - [llvm-dev] [GlobalISel] A Proposal for global instruction selection [Jan 2016]

Hal Finkel via llvm-dev

2016-Jan-13 16:01 UTC

[llvm-dev] [GlobalISel] A Proposal for global instruction selection

----- Original Message -----
> From: "James Molloy" <james at jamesmolloy.co.uk>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>, "Quentin Colombet" <qcolombet at apple.com>
> Sent: Wednesday, January 13, 2016 9:54:26 AM
> Subject: Re: [llvm-dev] [GlobalISel] A Proposal for global instruction selection
> 
> 
> > I think that teaching the optimizer about big-Endian lane ordering
> > would have been better.
> 
> 
> It's certainly arguable. Even in hindsight I'm glad we didn't -
> that's the approach GCC took and they've been fixing subtle bugs in
> their vectorizer ever since.
> 
> 
> > Inserting the REV after every LDR
> 
> 
> We only do this conceptually. In most cases REVs cancel out, and we
> have the LD1 instruction which is LDR+REV. With enough peepholes
> there's really no need for code to run slower.
> 
> 
> > Given what's been done, should we update the LangRef?
> 
> 
> Potentially, yes. I hadn't realised quite how strongly worded it was
> with respect to this.
> 
Please do ;)

 -Hal
> 
> James
> 
> 
> On Wed, 13 Jan 2016 at 14:39 Hal Finkel <hfinkel at anl.gov> wrote:
>
> [resending so the message is smaller]
>
> From: "James Molloy via llvm-dev" <llvm-dev at lists.llvm.org>
> To: "Quentin Colombet" <qcolombet at apple.com>
> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>
> Sent: Wednesday, January 13, 2016 2:35:32 AM
> Subject: Re: [llvm-dev] [GlobalISel] A Proposal for global instruction selection
> 
> Hi Philip,
> 
> 
> 
> 
> 
> store <2 x i64> %1, <2 x i64>* %y
> 
> Yes. The memory pattern differs. This is the first diagram on the
> right at: http://llvm.org/docs/BigEndianNEON.html#bitconverts
> 
> 
> I think that teaching the optimizer about big-Endian lane ordering
> would have been better. Inserting the REV after every LDR sounds
> very similar to what we do for VSX on little-Endian PowerPC systems
> (PowerPC may have a slight advantage here in that we don't need to
> do insertelement / extractelement / shufflevector through memory on
> systems where little-Endian mode is relevant, see
> http://llvm.org/devmtg/2014-10/Slides/Schmidt-SupportingVectorProgramming.pdf
> ).
> 
> Given what's been done, should we update the LangRef? It currently
> reads, "The ‘bitcast’ instruction converts value to type ty2. It
> is always a no-op cast because no bits change with this conversion.
> The conversion is done as if the value had been stored to memory and
> read back as type ty2." But this is now, at the least, misleading,
> because this process of storing the value as one type and reading it
> back in as another does, in fact, change the bits. We need to make
> clear that this might change the bits (perhaps specifically by
> calling out this case of vector bitcasts on big-Endian systems?).
> 
> 
> 
> Also, regarding this, "Most operating systems however do not run
> with alignment faults enabled, so this is often not an issue." Are
> you saying that the processor does the correct thing in this case
> (if alignment faults are not enabled, then it performs a proper
> unaligned load), or that the operating-system trap handler emulates
> the unaligned load should one occur?
> 
> Thanks again,
> Hal
> 
> 
> _______________________________________________
> 
> 
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> 
> 
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
> 
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
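
A minimal sketch (not from the thread) of the LangRef wording quoted above, where a bitcast behaves as if the value had been stored to memory and read back as type ty2. It assumes lanes are stored element by element with lane 0 at the lowest address, and that a register holds lane 0 in its least significant bits. All function names are hypothetical; the code only models byte ordering and is not LLVM's implementation.

```python
def store_lanes(lanes, lane_bytes, byteorder):
    """Write each lane to memory in order, lane 0 at the lowest address."""
    return b"".join(lane.to_bytes(lane_bytes, byteorder) for lane in lanes)


def load_lanes(memory, lane_bytes, byteorder):
    """Read consecutive lanes of lane_bytes each back out of memory."""
    return [
        int.from_bytes(memory[i:i + lane_bytes], byteorder)
        for i in range(0, len(memory), lane_bytes)
    ]


def bitcast_via_memory(lanes, src_bytes, dst_bytes, byteorder):
    """Model 'store as the source type, reload as the destination type'."""
    return load_lanes(store_lanes(lanes, src_bytes, byteorder), dst_bytes, byteorder)


def register_image(lanes, lane_bytes):
    """Pack lanes into one integer, lane 0 in the low bits (an assumption)."""
    image = 0
    for index, lane in enumerate(lanes):
        image |= lane << (8 * lane_bytes * index)
    return image


value = [0x0001020304050607, 0x08090A0B0C0D0E0F]  # stands in for a <2 x i64>
for byteorder in ("little", "big"):
    result = bitcast_via_memory(value, 8, 4, byteorder)  # viewed as <4 x i32>
    unchanged = register_image(value, 8) == register_image(result, 4)
    print(byteorder, [hex(lane) for lane in result], "register bits unchanged:", unchanged)
```

Under these assumptions the little-endian round trip leaves the packed register value intact, while the big-endian round trip swaps the 32-bit halves of each 64-bit lane, which is the sense in which the "no bits change" wording can mislead.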

Philip Reames via llvm-dev

2016-Jan-13 18:08 UTC


[llvm-dev] [GlobalISel] A Proposal for global instruction selection

On 01/13/2016 08:01 AM, Hal Finkel via llvm-dev wrote:
> ----- Original Message -----
>> From: "James Molloy" <james at jamesmolloy.co.uk>
>> To: "Hal Finkel" <hfinkel at anl.gov>
>> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>, "Quentin Colombet" <qcolombet at apple.com>
>> Sent: Wednesday, January 13, 2016 9:54:26 AM
>> Subject: Re: [llvm-dev] [GlobalISel] A Proposal for global instruction selection
>>
>>
>>> I think that teaching the optimizer about big-Endian lane ordering
>>> would have been better.
>>
>> It's certainly arguable. Even in hindsight I'm glad we didn't -
>> that's the approach GCC took and they've been fixing subtle bugs in
>> their vectorizer ever since.
>>
>>
>>> Inserting the REV after every LDR
>>
>> We only do this conceptually. In most cases REVs cancel out, and we
>> have the LD1 instruction which is LDR+REV. With enough peepholes
>> there's really no need for code to run slower.
>>
>>
>>> Given what's been done, should we update the LangRef?
>>
>> Potentially, yes. I hadn't realised quite how strongly worded it was
>> with respect to this.
>>
> Please do ;)

I'm not sure changing bitcast is the right place. Since the bitcast is
representing the in-register value (which doesn't change), maybe we
should define it as part of the load/store instead? That's essentially
what's going on; we're converting from a canonical register form to a
variety of memory forms. (Right?)


James Molloy via llvm-dev

2016-Jan-13 20:20 UTC


[llvm-dev] [GlobalISel] A Proposal for global instruction selection

>  (Right?)
Uh no, the register content explicitly does change :( We insert REV
instructions (byteswap) on each bitcast. Bitcasts can be merged and elided
etc, but conceptually there's a register content change on every bitcast.

James
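
A minimal sketch (not from the thread) of the point above: on a big-endian target a vector bitcast conceptually changes the register contents, and two such reversals cancel. It assumes a 128-bit register that keeps lane 0 in its least significant bits and element-wise big-endian stores and loads. The helper names are hypothetical, and modeling the change as a REV64-style swap of elements within each 64-bit group is an illustrative assumption rather than a description of the backend's code.

```python
def pack(lanes, lane_bytes):
    """Pack lanes into one integer, lane 0 in the low bits (an assumption)."""
    image = 0
    for index, lane in enumerate(lanes):
        image |= lane << (8 * lane_bytes * index)
    return image


def unpack(image, lane_bytes, count):
    """Split a packed register image back into count lanes."""
    mask = (1 << (8 * lane_bytes)) - 1
    return [(image >> (8 * lane_bytes * i)) & mask for i in range(count)]


def bitcast_big_endian(image, src_bytes, dst_bytes, total_bytes=16):
    """Store lanes big-endian element by element, then reload at the new width."""
    lanes = unpack(image, src_bytes, total_bytes // src_bytes)
    memory = b"".join(lane.to_bytes(src_bytes, "big") for lane in lanes)
    reloaded = [
        int.from_bytes(memory[i:i + dst_bytes], "big")
        for i in range(0, total_bytes, dst_bytes)
    ]
    return pack(reloaded, dst_bytes)


def rev_elements(image, group_bytes, elem_bytes, total_bytes=16):
    """Reverse elem_bytes-wide elements inside each group_bytes-wide group."""
    elems = unpack(image, elem_bytes, total_bytes // elem_bytes)
    per_group = group_bytes // elem_bytes
    swapped = []
    for start in range(0, len(elems), per_group):
        swapped.extend(reversed(elems[start:start + per_group]))
    return pack(swapped, elem_bytes)


before = pack([0x0001020304050607, 0x08090A0B0C0D0E0F], 8)  # <2 x i64>
after = bitcast_big_endian(before, 8, 4)                    # <4 x i32>

print("register changed by bitcast:", after != before)
print("matches element reversal:   ", after == rev_elements(before, 8, 4))
print("second reversal cancels:    ", rev_elements(after, 8, 4) == before)
```

In this model, casting back with `bitcast_big_endian(after, 4, 8)` also restores the original image, which is consistent with the remark that bitcasts can be merged and elided.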

