thr3ads.net - llvm dev - [llvm-dev] Problem ScheduleDAG on PowerPC, X86 works fine. [Feb 2017]

If this information is useful, please help other people find it:
Share via:

James Y Knight via llvm-dev

2017-Feb-09 04:48 UTC

[llvm-dev] Problem ScheduleDAG on PowerPC, X86 works fine.

I'd think i1 would be the proper and correct choice for a carry flag for
the generic instruction. I expect that would also make UADDO/USUBO
redundant with ADDC/SUBC (which would seem a good outcome).

You'd need to make sure the right thing happened when converting from
ADDC's 1-bit carry in/out to X86ISD::AD[DC]'s EFLAGS i/o. Right now the
conversion can get away with assuming that the only thing which can be used
as input to ADDC is the glue-output of ADD/ADDC. That would no longer be
the case once they're using a normal value.

(Actually, I wonder if it would be possible to represent x86's EFLAGS.CARRY
as a 1-bit subreg of EFLAGS, and use that on input to X86ISD::ADC; that may
allow cheaper spills/reloads if it only has to save the one bit?).

On Wed, Feb 8, 2017 at 7:28 PM, Amaury SECHET <deadalnix at gmail.com>
wrote:
> Well I don't think this would break most backend. The opcode is
generate
> only if the backend allows so to boot, so many backend actually do not use
> it at all.
>
> Anyway, changing these opcode is kind of much broader scope than what I
> anticipated, but I maybe can make it work. What do you think would be an
> appropriate type for the carry instead of glue ?
>
> 2017-02-08 21:19 GMT+01:00 James Y Knight <jyknight at google.com>:
>
>> I don't think that'd work, because it leaves all other backends
broken.
>> AFAICT, your transform is simply not a legal transform, with the way
the
>> ADDC/ADDE opcodes are currently defined, and to do it you really need
to
>> fix the opcode definitions to not involve glue, first.
>>
>> I also note that your transform doesn't actually trigger at all on
this
>> particular test case on x86, because the dag ends up looking like
(uaddo X,
>> (adde Y, 0, Carry)), which the transform doesn't match. That is not
really
>> relevant to the issue here, because the x86 backend does work even if
the
>> transform gets triggered (replacing the final store with "store
i64 %add37,
>> i64* %r, align 8" will make this happen)
>>
>>
>> On Tue, Feb 7, 2017 at 5:44 PM, Amaury SECHET <deadalnix at
gmail.com>
>> wrote:
>>
>>> Making this a value instead of a glue looks like a good longer term
>>> solution, but it doesn't quite cut it as a short term one. It
would require
>>> to change a fair amount of code in the DAG stack, plus in each
backend.
>>>
>>> @James,
>>> do you think doing something similar to what the X86 backend does
in the
>>> PowerPC one would cut it ?
>>>
>>>
>>> 2017-02-07 22:51 GMT+01:00 Nemanja Ivanovic <nemanja.i.ibm at
gmail.com>:
>>>
>>>> Would it not make sense to refactor the code so those don't
use glue
>>>> rather than emitting them with glue and then getting rid of it.
There are
>>>> times when we would like to emit these in separate blocks but
can't
>>>> (presumably because of the glue).
>>>>
>>>> On Tue, Feb 7, 2017 at 9:15 PM, James Y Knight via llvm-dev
<
>>>> llvm-dev at lists.llvm.org> wrote:
>>>>
>>>>> That's seems really odd that ADDC/ADDE uses glue there,
instead of a
>>>>> plain value.
>>>>>
>>>>> The x86 backend has code that converts the glue into a
value, which is
>>>>> why it wasn't affected.... (LowerADDC_ADDE_SUBC_SUBE).
>>>>>
>>>>> On Tue, Feb 7, 2017 at 2:48 PM, Amaury SECHET via llvm-dev
<
>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>
>>>>>> Long story short:
https://llvm.org/bugs/show_bug.cgi?id=31890
>>>>>>
>>>>>> The backend fails to schedule a given DAG, the reason
being that
>>>>>> there is an instruction and it glue that needs to be
broken apart as they
>>>>>> can't be scheduled consecutively. See attached file
for a picture of the
>>>>>> DAG.
>>>>>>
>>>>>> Not sure what's the best course of action is, and
not sure why this
>>>>>> isn't a problem for the X86 backend either. I'm
looking for advice on the
>>>>>> best course of actions. As I see it, the option are:
>>>>>>
>>>>>> 1/ add extra logic in the DAGCombiner to make sure this
doesn't
>>>>>> happen. I don't see a way this could be done
cheaply and overall I don't
>>>>>> think this is the best option/
>>>>>> 2/ Have the ScheduleDAG machinery detect this case and
break up the
>>>>>> glue, for instance via breaking up (adde X, Y, Carry)
into (add (add X, Y)n
>>>>>> (adde 0, 0, Carry)) or something alike when the
situation present itself.
>>>>>> 3/ Do whatever the X86 backend does, which I'm not
sure what it is.
>>>>>>
>>>>>> Advice ?
>>>>>>
>>>>>> Thanks in advance,
>>>>>>
>>>>>> Amaury SECHET
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> LLVM Developers mailing list
>>>>>> llvm-dev at lists.llvm.org
>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>
>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> llvm-dev at lists.llvm.org
>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>
>>>>>
>>>>
>>>
>>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170208/a10fb276/attachment.html>

Amaury SECHET via llvm-dev

2017-Feb-09 10:21 UTC

head link

[llvm-dev] Problem ScheduleDAG on PowerPC, X86 works fine.

So the plan would be:

1/ Use UADDO instead of ADDC.
2/ Deprecate and remove ADDC.
3/ Have ADDE/SUBE accept a boolean as carry input instead of a glue.

There are still one thing I don't really understand: how come the X86
backend end up avoiding turnning into the problem altogether to begin with.
I think that's an important piece of the puzzle.


2017-02-09 5:48 GMT+01:00 James Y Knight <jyknight at google.com>:
> I'd think i1 would be the proper and correct choice for a carry flag
for
> the generic instruction. I expect that would also make UADDO/USUBO
> redundant with ADDC/SUBC (which would seem a good outcome).
>
> You'd need to make sure the right thing happened when converting from
> ADDC's 1-bit carry in/out to X86ISD::AD[DC]'s EFLAGS i/o. Right now
the
> conversion can get away with assuming that the only thing which can be used
> as input to ADDC is the glue-output of ADD/ADDC. That would no longer be
> the case once they're using a normal value.
>
> (Actually, I wonder if it would be possible to represent x86's
> EFLAGS.CARRY as a 1-bit subreg of EFLAGS, and use that on input to
> X86ISD::ADC; that may allow cheaper spills/reloads if it only has to save
> the one bit?).
>
> On Wed, Feb 8, 2017 at 7:28 PM, Amaury SECHET <deadalnix at
gmail.com> wrote:
>
>> Well I don't think this would break most backend. The opcode is
generate
>> only if the backend allows so to boot, so many backend actually do not
use
>> it at all.
>>
>> Anyway, changing these opcode is kind of much broader scope than what I
>> anticipated, but I maybe can make it work. What do you think would be
an
>> appropriate type for the carry instead of glue ?
>>
>> 2017-02-08 21:19 GMT+01:00 James Y Knight <jyknight at
google.com>:
>>
>>> I don't think that'd work, because it leaves all other
backends broken.
>>> AFAICT, your transform is simply not a legal transform, with the
way the
>>> ADDC/ADDE opcodes are currently defined, and to do it you really
need to
>>> fix the opcode definitions to not involve glue, first.
>>>
>>> I also note that your transform doesn't actually trigger at all
on this
>>> particular test case on x86, because the dag ends up looking like
(uaddo X,
>>> (adde Y, 0, Carry)), which the transform doesn't match. That is
not really
>>> relevant to the issue here, because the x86 backend does work even
if the
>>> transform gets triggered (replacing the final store with
"store i64 %add37,
>>> i64* %r, align 8" will make this happen)
>>>
>>>
>>> On Tue, Feb 7, 2017 at 5:44 PM, Amaury SECHET <deadalnix at
gmail.com>
>>> wrote:
>>>
>>>> Making this a value instead of a glue looks like a good longer
term
>>>> solution, but it doesn't quite cut it as a short term one.
It would require
>>>> to change a fair amount of code in the DAG stack, plus in each
backend.
>>>>
>>>> @James,
>>>> do you think doing something similar to what the X86 backend
does in
>>>> the PowerPC one would cut it ?
>>>>
>>>>
>>>> 2017-02-07 22:51 GMT+01:00 Nemanja Ivanovic <nemanja.i.ibm
at gmail.com>:
>>>>
>>>>> Would it not make sense to refactor the code so those
don't use glue
>>>>> rather than emitting them with glue and then getting rid of
it. There are
>>>>> times when we would like to emit these in separate blocks
but can't
>>>>> (presumably because of the glue).
>>>>>
>>>>> On Tue, Feb 7, 2017 at 9:15 PM, James Y Knight via llvm-dev
<
>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>
>>>>>> That's seems really odd that ADDC/ADDE uses glue
there, instead of a
>>>>>> plain value.
>>>>>>
>>>>>> The x86 backend has code that converts the glue into a
value, which
>>>>>> is why it wasn't affected....
(LowerADDC_ADDE_SUBC_SUBE).
>>>>>>
>>>>>> On Tue, Feb 7, 2017 at 2:48 PM, Amaury SECHET via
llvm-dev <
>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>
>>>>>>> Long story short:
https://llvm.org/bugs/show_bug.cgi?id=31890
>>>>>>>
>>>>>>> The backend fails to schedule a given DAG, the
reason being that
>>>>>>> there is an instruction and it glue that needs to
be broken apart as they
>>>>>>> can't be scheduled consecutively. See attached
file for a picture of the
>>>>>>> DAG.
>>>>>>>
>>>>>>> Not sure what's the best course of action is,
and not sure why this
>>>>>>> isn't a problem for the X86 backend either.
I'm looking for advice on the
>>>>>>> best course of actions. As I see it, the option
are:
>>>>>>>
>>>>>>> 1/ add extra logic in the DAGCombiner to make sure
this doesn't
>>>>>>> happen. I don't see a way this could be done
cheaply and overall I don't
>>>>>>> think this is the best option/
>>>>>>> 2/ Have the ScheduleDAG machinery detect this case
and break up the
>>>>>>> glue, for instance via breaking up (adde X, Y,
Carry) into (add (add X, Y)n
>>>>>>> (adde 0, 0, Carry)) or something alike when the
situation present itself.
>>>>>>> 3/ Do whatever the X86 backend does, which I'm
not sure what it is.
>>>>>>>
>>>>>>> Advice ?
>>>>>>>
>>>>>>> Thanks in advance,
>>>>>>>
>>>>>>> Amaury SECHET
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> LLVM Developers mailing list
>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> LLVM Developers mailing list
>>>>>> llvm-dev at lists.llvm.org
>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170209/3134b20b/attachment-0001.html>

Amaury SECHET via llvm-dev

2017-Feb-10 13:00 UTC

head link

[llvm-dev] Problem ScheduleDAG on PowerPC, X86 works fine.

More on that front. I commented out the part of
DAGTypeLegalizer::ExpandIntRes_ADDSUB that legalize using ADDC/ADDE so it
falls back on using UADDO. The generated DAG on X86 looks legit, but I end
up with 66 test failures, most of them seems to be due to trunc i8 X to i32
ending up being generated. Help !

2017-02-09 11:21 GMT+01:00 Amaury SECHET <deadalnix at gmail.com>:
> So the plan would be:
>
> 1/ Use UADDO instead of ADDC.
> 2/ Deprecate and remove ADDC.
> 3/ Have ADDE/SUBE accept a boolean as carry input instead of a glue.
>
> There are still one thing I don't really understand: how come the X86
> backend end up avoiding turnning into the problem altogether to begin with.
> I think that's an important piece of the puzzle.
>
>
> 2017-02-09 5:48 GMT+01:00 James Y Knight <jyknight at google.com>:
>
>> I'd think i1 would be the proper and correct choice for a carry
flag for
>> the generic instruction. I expect that would also make UADDO/USUBO
>> redundant with ADDC/SUBC (which would seem a good outcome).
>>
>> You'd need to make sure the right thing happened when converting
from
>> ADDC's 1-bit carry in/out to X86ISD::AD[DC]'s EFLAGS i/o. Right
now the
>> conversion can get away with assuming that the only thing which can be
used
>> as input to ADDC is the glue-output of ADD/ADDC. That would no longer
be
>> the case once they're using a normal value.
>>
>> (Actually, I wonder if it would be possible to represent x86's
>> EFLAGS.CARRY as a 1-bit subreg of EFLAGS, and use that on input to
>> X86ISD::ADC; that may allow cheaper spills/reloads if it only has to
save
>> the one bit?).
>>
>> On Wed, Feb 8, 2017 at 7:28 PM, Amaury SECHET <deadalnix at
gmail.com>
>> wrote:
>>
>>> Well I don't think this would break most backend. The opcode is
generate
>>> only if the backend allows so to boot, so many backend actually do
not use
>>> it at all.
>>>
>>> Anyway, changing these opcode is kind of much broader scope than
what I
>>> anticipated, but I maybe can make it work. What do you think would
be an
>>> appropriate type for the carry instead of glue ?
>>>
>>> 2017-02-08 21:19 GMT+01:00 James Y Knight <jyknight at
google.com>:
>>>
>>>> I don't think that'd work, because it leaves all other
backends broken.
>>>> AFAICT, your transform is simply not a legal transform, with
the way the
>>>> ADDC/ADDE opcodes are currently defined, and to do it you
really need to
>>>> fix the opcode definitions to not involve glue, first.
>>>>
>>>> I also note that your transform doesn't actually trigger at
all on this
>>>> particular test case on x86, because the dag ends up looking
like (uaddo X,
>>>> (adde Y, 0, Carry)), which the transform doesn't match.
That is not really
>>>> relevant to the issue here, because the x86 backend does work
even if the
>>>> transform gets triggered (replacing the final store with
"store i64 %add37,
>>>> i64* %r, align 8" will make this happen)
>>>>
>>>>
>>>> On Tue, Feb 7, 2017 at 5:44 PM, Amaury SECHET <deadalnix at
gmail.com>
>>>> wrote:
>>>>
>>>>> Making this a value instead of a glue looks like a good
longer term
>>>>> solution, but it doesn't quite cut it as a short term
one. It would require
>>>>> to change a fair amount of code in the DAG stack, plus in
each backend.
>>>>>
>>>>> @James,
>>>>> do you think doing something similar to what the X86
backend does in
>>>>> the PowerPC one would cut it ?
>>>>>
>>>>>
>>>>> 2017-02-07 22:51 GMT+01:00 Nemanja Ivanovic
<nemanja.i.ibm at gmail.com>:
>>>>>
>>>>>> Would it not make sense to refactor the code so those
don't use glue
>>>>>> rather than emitting them with glue and then getting
rid of it. There are
>>>>>> times when we would like to emit these in separate
blocks but can't
>>>>>> (presumably because of the glue).
>>>>>>
>>>>>> On Tue, Feb 7, 2017 at 9:15 PM, James Y Knight via
llvm-dev <
>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>
>>>>>>> That's seems really odd that ADDC/ADDE uses
glue there, instead of a
>>>>>>> plain value.
>>>>>>>
>>>>>>> The x86 backend has code that converts the glue
into a value, which
>>>>>>> is why it wasn't affected....
(LowerADDC_ADDE_SUBC_SUBE).
>>>>>>>
>>>>>>> On Tue, Feb 7, 2017 at 2:48 PM, Amaury SECHET via
llvm-dev <
>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>
>>>>>>>> Long story short:
https://llvm.org/bugs/show_bug.cgi?id=31890
>>>>>>>>
>>>>>>>> The backend fails to schedule a given DAG, the
reason being that
>>>>>>>> there is an instruction and it glue that needs
to be broken apart as they
>>>>>>>> can't be scheduled consecutively. See
attached file for a picture of the
>>>>>>>> DAG.
>>>>>>>>
>>>>>>>> Not sure what's the best course of action
is, and not sure why this
>>>>>>>> isn't a problem for the X86 backend either.
I'm looking for advice on the
>>>>>>>> best course of actions. As I see it, the option
are:
>>>>>>>>
>>>>>>>> 1/ add extra logic in the DAGCombiner to make
sure this doesn't
>>>>>>>> happen. I don't see a way this could be
done cheaply and overall I don't
>>>>>>>> think this is the best option/
>>>>>>>> 2/ Have the ScheduleDAG machinery detect this
case and break up the
>>>>>>>> glue, for instance via breaking up (adde X, Y,
Carry) into (add (add X, Y)n
>>>>>>>> (adde 0, 0, Carry)) or something alike when the
situation present itself.
>>>>>>>> 3/ Do whatever the X86 backend does, which
I'm not sure what it is.
>>>>>>>>
>>>>>>>> Advice ?
>>>>>>>>
>>>>>>>> Thanks in advance,
>>>>>>>>
>>>>>>>> Amaury SECHET
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> LLVM Developers mailing list
>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> LLVM Developers mailing list
>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170210/f4df3bac/attachment.html>

llvm dev - Feb 2017 - Problem ScheduleDAG on PowerPC, X86 works fine.

[llvm-dev] Problem ScheduleDAG on PowerPC, X86 works fine.

[llvm-dev] Problem ScheduleDAG on PowerPC, X86 works fine.

[llvm-dev] Problem ScheduleDAG on PowerPC, X86 works fine.