thr3ads.net - llvm dev - [llvm-dev] [RFC] Turn the MachineOutliner on by default in AArch64 under -Oz [Apr 2018]

Jessica Paquette via llvm-dev

2018-Apr-23 20:41 UTC

[llvm-dev] [RFC] Turn the MachineOutliner on by default in AArch64 under -Oz

Hi Eli,

> I just tried some tests, and I'm seeing a bunch of failures on SPEC at
> -O3; looks like mostly crashes at runtime. I can try to reduce a testcase if
> you need it.

If you could do that, that would be great. Our testing has been primarily for
-Oz and -O2, so I haven’t looked at -O3 at all.

> I don't think this is really the right approach. With LTO, you can
> have a mix of functions, some of which are minsize, and some of which are not.
> Or with profile info, we might want to outline only cold code (I guess this
> isn't implemented yet, but potentially future work). Tying whether we run
> the outliner to a command-line flag restricts the possible uses; either the
> entire module gets outlining, or none of it does.

I’m worried that walking the entire list of functions in the module when nothing
has the minsize attribute would incur unnecessary compile-time overhead. If
that’s a reasonable thing to do though, I’m fine with that approach. It’d be a
less invasive change, and would give us the desired LTO behaviour for free.

- Jessica

> On Apr 23, 2018, at 1:24 PM, Friedman, Eli <efriedma at codeaurora.org> wrote:
>
> On 4/20/2018 7:06 PM, Jessica Paquette via llvm-dev wrote:
>> We perform regular testing to ensure the outliner produces correct
>> AArch64 code at -Oz. Tests include the LLVM test suite and standard external
>> test suites such as SPEC. All tests compile and execute. We've also been
>> making sure that the outliner produces debuggable code. Users are still
>> guaranteed to have sane backtraces in the presence of outlined functions.
>>
>> Added exposure to various programs would help the outlining algorithm
>> mature further. This, in turn, will help the overall outlining project. For
>> example, there have been a few discussions on implementing an IR-level
>> outlining pass [3, 4]. Ultimately, the goal is to create a shared outlining
>> interface. This interface would allow the outliner to exist at any level of
>> representation [4]. The general outlining algorithm will be part of the
>> shared interface. Thus, in the spirit of incremental improvement, it makes
>> sense to begin "stress-testing" it sooner than later.
>
> I just tried some tests, and I'm seeing a bunch of failures on SPEC at
> -O3; looks like mostly crashes at runtime. I can try to reduce a testcase if
> you need it.
>
>> There are a few patches necessary to facilitate this. They are
>> available in the patches section of this email. I’ll summarize what they do
>> here for the sake of discussion though.
>>
>> The first patch is one that teaches the backend about size optimization
>> levels. This is comparable to what's done in the inliner. Today, the only
>> way to tell if something is optimizing for size is by looking at function
>> attributes. This is fine for function passes, but insufficient for module
>> passes like the MachineOutliner. The function attribute approach forces the
>> outliner to iterate over every function in the module before deciding to take
>> action. If -Oz isn't passed in, then the outliner will not find any
>> functions worth outlining from. This would incur unnecessary compile-time
>> overhead. Thus, we decided the best course of action is to teach the backend
>> about size options.
>
> I don't think this is really the right approach. With LTO, you can
> have a mix of functions, some of which are minsize, and some of which are not.
> Or with profile info, we might want to outline only cold code (I guess this
> isn't implemented yet, but potentially future work). Tying whether we run
> the outliner to a command-line flag restricts the possible uses; either the
> entire module gets outlining, or none of it does.
>
> In general, we've been moving away from global settings so we can
> optimize more effectively in this sort of scenario.
>
> -Eli
> --
> Employee of Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux
> Foundation Collaborative Project

Friedman, Eli via llvm-dev

2018-Apr-23 21:35 UTC

On 4/23/2018 1:41 PM, Jessica Paquette wrote:
> Hi Eli,
>
>> I just tried some tests, and I'm seeing a bunch of failures on SPEC
>> at -O3; looks like mostly crashes at runtime. I can try to reduce a
>> testcase if you need it.
> If you could do that, that would be great. Our testing has been
> primarily for -Oz and -O2, so I haven’t looked at -O3 at all.

Okay, I'll try to come up with something soon.

>> I don't think this is really the right approach. With LTO, you can
>> have a mix of functions, some of which are minsize, and some of which
>> are not. Or with profile info, we might want to outline only cold
>> code (I guess this isn't implemented yet, but potentially future
>> work). Tying whether we run the outliner to a command-line flag
>> restricts the possible uses; either the entire module gets outlining,
>> or none of it does.
> I’m worried that walking the entire list of functions in the module
> when nothing has the minsize attribute would incur unnecessary
> compile-time overhead. If that’s a reasonable thing to do though, I’m
> fine with that approach. It’d be a less invasive change, and would
> give us the desired LTO behaviour for free.

Walking the list of functions is very cheap, relatively speaking; I'm
not concerned about the cost of that. The cost I'd be concerned about
is the cost of running a ModulePass at that point in the pipeline; IIRC
the last time someone tried it, there were bug reports about memory
usage (see https://bugs.llvm.org/show_bug.cgi?id=36123).
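
As a rough illustration of the per-function approach discussed above, the sketch below models a module as a plain array and makes one linear pass over it, counting the functions marked minsize. `ModelFunction` and `count_outlining_candidates` are hypothetical names invented for this example, not LLVM APIs; in LLVM the equivalent check would query each function's attributes instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a function in a module; not an LLVM type. */
struct ModelFunction {
    const char *name;
    bool minsize; /* models the minsize function attribute */
};

/* One linear pass over the module: count the functions that opt in to
 * size optimization. A result of 0 means the outliner has nothing to do
 * for this module and can return early. */
static size_t count_outlining_candidates(const struct ModelFunction *fns,
                                         size_t n) {
    assert(fns != NULL || n == 0);
    size_t count = 0;
    for (size_t i = 0; i < n; ++i) {
        if (fns[i].minsize)
            ++count;
    }
    return count;
}
```

With a mixed module of the kind LTO can produce (say, one function without minsize and two with it), the count is 2, so only those two functions would be considered, rather than all or none.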

-Eli

-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux
Foundation Collaborative Project

Jessica Paquette via llvm-dev

2018-Apr-23 21:37 UTC

I just ran SPEC at -O3 with the outliner enabled for AArch64 and didn’t get any
failures on my end. Which flags did you use? I’m curious about what’s going on
here...

I used -O3 -mllvm -enable-machine-outliner -arch arm64.

- Jessica

Friedman, Eli via llvm-dev

2018-Apr-23 22:55 UTC

Sorry, I was using a modified compiler, which by coincidence made the 
bug much easier to reproduce.

In some rare cases, the compiler will use x30 as a general-purpose 
register; in that case, outlining breaks because the "ret" branches to
the wrong address.  Testcase (reproduce with "clang -O3 
--target=aarch64-pc-linux-gnu -mllvm -enable-machine-outliner"):

/* Reduced testcase: three identical functions give the outliner a
   repeated instruction sequence to extract. */
extern long g1;
extern long g2;
void foo() {
   register long *x asm("x27") = &g1;
   register long *y asm("x29") = &g1;
   /* x30 is the AArch64 link register; here it holds an ordinary value. */
   register long *z asm("x30") = &g2;
   asm(""::"r"(x),"r"(y),"r"(z));
}
void foo2() {
   register long *x asm("x27") = &g1;
   register long *y asm("x29") = &g1;
   register long *z asm("x30") = &g2;
   asm(""::"r"(x),"r"(y),"r"(z));
}
void foo3() {
   register long *x asm("x27") = &g1;
   register long *y asm("x29") = &g1;
   register long *z asm("x30") = &g2;
   asm(""::"r"(x),"r"(y),"r"(z));
}
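
The failure mode can be modeled in a few lines. On AArch64 a call to an outlined function is a `bl`, which overwrites x30 with the return address, and the outlined function returns with `ret`, which branches to whatever x30 holds. The sketch below is a hypothetical legality check, not the MachineOutliner's actual code: `ModelInsn`, `LR_BIT`, and `sequence_is_safe_to_outline` are names invented for this example. It rejects any candidate sequence that reads or writes x30.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit i of a mask stands for register xi; x30 is the AArch64 link register. */
#define LR_BIT (UINT32_C(1) << 30)

/* Hypothetical per-instruction summary; not an LLVM type. */
struct ModelInsn {
    uint32_t reads;  /* registers the instruction reads */
    uint32_t writes; /* registers the instruction writes */
};

/* A sequence that touches x30 cannot simply be moved behind a bl/ret pair:
 * the bl would overwrite a live value in x30, or a write inside the
 * sequence would make the final ret branch to the wrong address. */
static bool sequence_is_safe_to_outline(const struct ModelInsn *seq, size_t n) {
    assert(seq != NULL || n == 0);
    for (size_t i = 0; i < n; ++i) {
        if ((seq[i].reads | seq[i].writes) & LR_BIT)
            return false;
    }
    return true;
}
```

In the testcase above, the instruction that places the address of `g2` in x30 would have `LR_BIT` set in `writes`, so a check of this shape would refuse to outline the sequence. This is one conservative option under the stated assumptions; a real fix could instead save and restore x30 around the call.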

-Eli

On 4/23/2018 2:37 PM, Jessica Paquette wrote:
> I just ran SPEC at -O3 with the outliner enabled for AArch64 and
> didn’t get any failures on my end. Which flags did you use? I’m 
> curious about what’s going on here...
>
> I used -O3 -mllvm -enable-machine-outliner -arch arm64.
>
> - Jessica