thr3ads.net - llvm dev - [llvm-dev] FireFox-46.0.1 build with interprocedural register allocation enabled [Jun 2016]


vivek pandya via llvm-dev

2016-Jun-20 19:22 UTC

[llvm-dev] FireFox-46.0.1 build with interprocedural register allocation enabled

On Mon, Jun 20, 2016 at 10:06 PM, Davide Italiano <davide at freebsd.org>
wrote:
> On Sun, Jun 19, 2016 at 11:41 AM, vivek pandya via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> > Hello,
> >
> > I built the FireFox-46.0.1 source with LLVM to test interprocedural
> > register allocation.
> > The build succeeded without any runtime failures. Here are a few stats:
>
> This is very good, thanks for working on this.
>
> > Measure                                            W/O IPRA   WITH IPRA  Change
> > =================================================  =========  =========  ==============
> > Total Build Time                                   76 mins    82.3 mins  8% increase
> > Octane v2.0 JS Benchmark Score (higher is better)  18675.69   19665.16   5% improvement
>
> This speedup is kind of amazing, enough to make me a little bit suspicious.
> From what I can see, Octane is not exactly a microbenchmark but tries
> to model complex/real-world web applications, so, I think you might
> want to analyze where this speedup is coming from?

Hi Davide,

I don't understand much about browser benchmarks, but IPRA tries to reduce
spill code and keep values in registers wherever it can, so the speedup is
coming from improved code quality. With the current infrastructure, however,
it is hard to tell which particular functions in the browser code benefit.

> Also, "score" might
> be a misleading metric, can you actually elaborate what that means?
> How does that relate to, let's say, runtime performance improvement?

Octane score considers two things: execution speed and latency (pauses during
execution). You can find more information here:
https://developers.google.com/octane/faq

-Vivek
>
> Thanks!
>
> --
> Davide
>
> "There are no solved problems; there are only problems that are more
> or less solved" -- Henri Poincare
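
As a quick sanity check of the percentages quoted in the table above, the relative changes can be recomputed from the raw figures. This is only a sketch: `percent_change` is a helper name introduced here for illustration, and the inputs are the four numbers reported in the thread.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0

# Figures quoted in the thread (W/O IPRA, WITH IPRA).
build_minutes = (76.0, 82.3)
octane_score = (18675.69, 19665.16)

build_delta = percent_change(*build_minutes)
octane_delta = percent_change(*octane_score)

print(f"build time:   {build_delta:+.1f}%")
print(f"Octane score: {octane_delta:+.1f}%")
```

Rounded to one decimal place, this gives about 8.3% for build time and 5.3% for the Octane score, consistent with the rounded "8%" and "5%" in the table.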

Davide Italiano via llvm-dev

2016-Jun-21 00:28 UTC

On Mon, Jun 20, 2016 at 12:22 PM, vivek pandya <vivekvpandya at gmail.com>
wrote:
>
> On Mon, Jun 20, 2016 at 10:06 PM, Davide Italiano <davide at freebsd.org>
> wrote:
>>
>> On Sun, Jun 19, 2016 at 11:41 AM, vivek pandya via llvm-dev
>> <llvm-dev at lists.llvm.org> wrote:
>> > Hello,
>> >
>> > I built the FireFox-46.0.1 source with LLVM to test interprocedural
>> > register allocation.
>> > The build succeeded without any runtime failures. Here are a few stats:
>>
>> This is very good, thanks for working on this.
>>
>> > Measure                                            W/O IPRA   WITH IPRA  Change
>> > =================================================  =========  =========  ==============
>> > Total Build Time                                   76 mins    82.3 mins  8% increase
>> > Octane v2.0 JS Benchmark Score (higher is better)  18675.69   19665.16   5% improvement
>>
>> This speedup is kind of amazing, enough to make me a little bit
>> suspicious.
>> From what I can see, Octane is not exactly a microbenchmark but tries
>> to model complex/real-world web applications, so, I think you might
>> want to analyze where this speedup is coming from?
>
> Hi Davide,
>
> I don't understand much about browser benchmarks, but IPRA tries to reduce
> spill code and keep values in registers wherever it can, so the speedup is
> coming from improved code quality. With the current infrastructure, however,
> it is hard to tell which particular functions in the browser code benefit.

It sounds a little bit weird that you see such a big improvement for a
benchmark that's supposed to exercise JS (which is very likely handled
by a JIT inside FF), that's why I asked. My point (and worry) is that
benchmarks are very hard to get right, and from time to time you might
end up getting better numbers because of noise and not for 'improved
code quality'.
In other words, as you're presenting numbers, you should be able to
defend those numbers with an analysis which explains why your pass
makes the code better. Hope this makes sense.

-- 
Davide

"There are no solved problems; there are only problems that are more
or less solved" -- Henri Poincare
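
One common way to check whether a difference between two builds stands out from run-to-run noise is to repeat the benchmark several times per build and compare the mean shift with the spread. The sketch below computes Welch's t statistic for that purpose. The scores in it are placeholders chosen for illustration, not measurements from this thread, and `welch_t` is a helper name introduced here.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two independent samples with unequal variances."""
    return (mean(b) - mean(a)) / sqrt(variance(a) / len(a) + variance(b) / len(b))

# Placeholder scores from repeated runs; NOT data from the thread.
baseline = [100.0, 102.0, 98.0, 101.0, 99.0]
candidate = [105.0, 107.0, 103.0, 106.0, 104.0]

t = welch_t(baseline, candidate)
print(f"mean shift: {mean(candidate) - mean(baseline):.2f}, t = {t:.2f}")
```

A large absolute t value suggests the shift is unlikely to be random noise, but it says nothing about the cause: a systematic side effect such as a change in code layout would pass this test just as well as a genuine code-quality improvement.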

vivek pandya via llvm-dev

2016-Jun-21 02:36 UTC

On Tue, Jun 21, 2016 at 5:58 AM, Davide Italiano <davide at freebsd.org>
wrote:
> On Mon, Jun 20, 2016 at 12:22 PM, vivek pandya <vivekvpandya at gmail.com>
> wrote:
> >
> > On Mon, Jun 20, 2016 at 10:06 PM, Davide Italiano <davide at freebsd.org>
> > wrote:
> >>
> >> On Sun, Jun 19, 2016 at 11:41 AM, vivek pandya via llvm-dev
> >> <llvm-dev at lists.llvm.org> wrote:
> >> > Hello,
> >> >
> >> > I built the FireFox-46.0.1 source with LLVM to test interprocedural
> >> > register allocation.
> >> > The build succeeded without any runtime failures. Here are a few stats:
> >>
> >> This is very good, thanks for working on this.
> >>
> >> > Measure                                            W/O IPRA   WITH IPRA  Change
> >> > =================================================  =========  =========  ==============
> >> > Total Build Time                                   76 mins    82.3 mins  8% increase
> >> > Octane v2.0 JS Benchmark Score (higher is better)  18675.69   19665.16   5% improvement
> >>
> >> This speedup is kind of amazing, enough to make me a little bit
> >> suspicious.
> >> From what I can see, Octane is not exactly a microbenchmark but tries
> >> to model complex/real-world web applications, so, I think you might
> >> want to analyze where this speedup is coming from?
> >
> > Hi Davide,
> >
> > I don't understand much about browser benchmarks, but IPRA tries to reduce
> > spill code and keep values in registers wherever it can, so the speedup is
> > coming from improved code quality. With the current infrastructure,
> > however, it is hard to tell which particular functions in the browser code
> > benefit.
>
> It sounds a little bit weird that you see such a big improvement for a
> benchmark that's supposed to exercise JS (which is very likely handled
> by a JIT inside FF), that's why I asked. My point (and worry) is that
> benchmarks are very hard to get right, and from time to time you might
> end up getting better numbers because of noise and not for 'improved
> code quality'.

Yes Davide, I have heard the same concern from other LLVM developers:
benchmarks are very hard to get right.

Does Firefox ship its JIT-related code within Firefox itself, or does it use a
library for that?

Also, the IPRA work in LLVM is still in progress, and this was my very first
experiment building a large piece of software with it. My focus was to check
that there are no compile-time or runtime failures while building such a big
project. I asked on the #firefox IRC channel how to measure browser
performance and got these suggestions. During that chat someone from the
Firefox community asked me to report back the results, which is why I mailed
this to ff-dev.

> In other words, as you're presenting numbers, you should be able to
> defend those numbers with an analysis which explains why your pass
> makes the code better. Hope this makes sense.

I have noted the points you mentioned, and as the work progresses I will try
to improve the benchmarking too. This is not the final conclusion.

-Vivek
>
> --
> Davide
>
> "There are no solved problems; there are only problems that are more
> or less solved" -- Henri Poincare

Mehdi Amini via llvm-dev

2016-Jun-21 03:28 UTC

> On Jun 20, 2016, at 3:22 PM, vivek pandya via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>
> On Mon, Jun 20, 2016 at 10:06 PM, Davide Italiano <davide at freebsd.org>
> wrote:
> > On Sun, Jun 19, 2016 at 11:41 AM, vivek pandya via llvm-dev
> > <llvm-dev at lists.llvm.org> wrote:
> > > Hello,
> > >
> > > I built the FireFox-46.0.1 source with LLVM to test interprocedural
> > > register allocation.
> > > The build succeeded without any runtime failures. Here are a few stats:
> >
> > This is very good, thanks for working on this.
> >
> > > Measure                                            W/O IPRA   WITH IPRA  Change
> > > =================================================  =========  =========  ==============
> > > Total Build Time                                   76 mins    82.3 mins  8% increase
> > > Octane v2.0 JS Benchmark Score (higher is better)  18675.69   19665.16   5% improvement
> >
> > This speedup is kind of amazing, enough to make me a little bit suspicious.
> > From what I can see, Octane is not exactly a microbenchmark but tries
> > to model complex/real-world web applications, so, I think you might
> > want to analyze where this speedup is coming from?
>
> Hi Davide,
>
> I don't understand much about browser benchmarks, but IPRA tries to reduce
> spill code and keep values in registers wherever it can, so the speedup is
> coming from improved code quality. With the current infrastructure, however,
> it is hard to tell which particular functions in the browser code benefit.

You need to confirm that the speedups are not “luck” (for instance, removing a
spill may have changed the code alignment, or simply laying the code out in
CGSCC order may make a difference).
Here is a possible way:

1) Find one or two benchmarks that show the most improvement.
2) Run with and without IPRA in a profiler (Instruments or other).
3) Disassemble the hot path and try to figure out why. To confirm your findings
(i.e. if you find a set of functions / call-sites that you think are
responsible for the speedup), you can try to bisect by forcing IPRA to run only
on selected functions.


— 
Mehdi
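
The bisection idea in step 3 can be sketched as a halving search over function names. This is an illustration under simplifying assumptions, not a description of any existing LLVM facility: it assumes exactly one function is responsible for the speedup, and `shows_speedup` stands in for a hypothetical "rebuild with IPRA limited to this subset and rerun the benchmark" step that the thread does not specify. All function names below are placeholders.

```python
from typing import Callable, Sequence

def bisect_functions(funcs: Sequence[str],
                     shows_speedup: Callable[[Sequence[str]], bool]) -> str:
    """Narrow `funcs` down to the one function that accounts for the speedup.

    Assumes a single responsible function, so at every step the speedup
    follows either the first half of the candidates or the second half.
    """
    candidates = list(funcs)
    while len(candidates) > 1:
        half = len(candidates) // 2
        first = candidates[:half]
        # Keep the half in which the speedup is still observed.
        candidates = first if shows_speedup(first) else candidates[half:]
    return candidates[0]

# Hypothetical stand-in for "build with IPRA on `subset` only, then measure".
def fake_measurement(subset: Sequence[str]) -> bool:
    return "hot_helper" in subset

found = bisect_functions(["main", "hot_helper", "parse", "emit"], fake_measurement)
print(found)
```

With several contributing functions the search would have to track sets rather than a single name, so treat this as a starting point only.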

> > Also, "score" might
> > be a misleading metric, can you actually elaborate what that means?
> > How does that relate to, let's say, runtime performance improvement?
>
> Octane score considers two things: execution speed and latency (pauses during
> execution). You can find more information here:
> https://developers.google.com/octane/faq
>
> -Vivek
>
> > Thanks!
> >
> > --
> > Davide
> >
> > "There are no solved problems; there are only problems that are more
> > or less solved" -- Henri Poincare
> 

vivek pandya via llvm-dev

2016-Jul-01 14:56 UTC

On Tue, Jun 21, 2016 at 8:58 AM, Mehdi Amini <mehdi.amini at apple.com>
wrote:
>
> On Jun 20, 2016, at 3:22 PM, vivek pandya via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>
> > On Mon, Jun 20, 2016 at 10:06 PM, Davide Italiano <davide at freebsd.org>
> > wrote:
> >
> > > On Sun, Jun 19, 2016 at 11:41 AM, vivek pandya via llvm-dev
> > > <llvm-dev at lists.llvm.org> wrote:
> > > > Hello,
> > > >
> > > > I built the FireFox-46.0.1 source with LLVM to test interprocedural
> > > > register allocation.
> > > > The build succeeded without any runtime failures. Here are a few stats:
> > >
> > > This is very good, thanks for working on this.
> > >
> > > > Measure                                            W/O IPRA   WITH IPRA  Change
> > > > =================================================  =========  =========  ==============
> > > > Total Build Time                                   76 mins    82.3 mins  8% increase
> > > > Octane v2.0 JS Benchmark Score (higher is better)  18675.69   19665.16   5% improvement
> > >
> > > This speedup is kind of amazing, enough to make me a little bit
> > > suspicious.
> > > From what I can see, Octane is not exactly a microbenchmark but tries
> > > to model complex/real-world web applications, so, I think you might
> > > want to analyze where this speedup is coming from?
> >
> > Hi Davide,
> >
> > I don't understand much about browser benchmarks, but IPRA tries to reduce
> > spill code and keep values in registers wherever it can, so the speedup is
> > coming from improved code quality. With the current infrastructure,
> > however, it is hard to tell which particular functions in the browser code
> > benefit.
>
> You need to confirm that the speedups are not “luck” (for instance, removing
> a spill may have changed the code alignment, or simply laying the code out
> in CGSCC order may make a difference).
> Here is a possible way:
>
> 1) Find one or two benchmarks that show the most improvement.
> 2) Run with and without IPRA in a profiler (Instruments or other).
> 3) Disassemble the hot path and try to figure out why. To confirm your
> findings (i.e. if you find a set of functions / call-sites that you think
> are responsible for the speedup), you can try to bisect by forcing IPRA to
> run only on selected functions.
>

I have tried some test cases that show runtime improvements. The top-most such
cases were not helped by IPRA itself but by the change of code generation
order to follow the call graph. I was able to verify this because most of
those test cases have very few functions, and inside those functions there are
library calls like printf, so IPRA cannot help much there. However, I also
tried test-suite/MultiSource/Benchmarks/FreeBench/pifft, which shows around 4%
performance improvement. I generated assembly files for the IPRA and NO_IPRA
runs, and comparing those files shows that IPRA improves code quality mainly
by reducing the number of spills/restores.
Please check the generated assembly here:
https://gist.github.com/vivekvpandya/081baba01196c705f8b9baf420d960a1/revisions

However, these functions are not really on the hot path (according to the
Instruments app), but functions like mp_add and mp_sub have about 9 call
sites, so I think this justifies the improvement due to IPRA. I have also
compared the assembly for some large functions (~2000 lines of assembly
code) from the sqlite3 source code and observed that in such large functions
IPRA is able to save a good number of spills.

Sincerely,
Vivek
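
The manual comparison of assembly files described above could be partly automated by counting spill and reload annotations. The sketch below assumes the `.s` files were generated with LLVM's verbose assembly comments, which tag such instructions with text like `8-byte Spill` or `8-byte Folded Reload`. The sample listing is made up for illustration and is not taken from the gist, and `count_spills` is a helper name introduced here.

```python
import re
from typing import Tuple

# Comment text attached to spill and reload instructions in verbose assembly.
SPILL_RE = re.compile(r"\d+-byte (?:Folded )?Spill")
RELOAD_RE = re.compile(r"\d+-byte (?:Folded )?Reload")

def count_spills(asm_text: str) -> Tuple[int, int]:
    """Return (spills, reloads) found in the comments of an assembly listing."""
    return len(SPILL_RE.findall(asm_text)), len(RELOAD_RE.findall(asm_text))

# Tiny made-up listing; NOT output from the thread or the gist.
sample = """\
    movq  %rdi, -8(%rbp)    ## 8-byte Spill
    callq _helper
    movq  -8(%rbp), %rdi    ## 8-byte Reload
    addq  -16(%rbp), %rax   ## 8-byte Folded Reload
"""

spills, reloads = count_spills(sample)
print(spills, reloads)
```

Running the same function over the IPRA and NO_IPRA listings of one source file and comparing the two pairs gives a rough per-file measure of how many spills were avoided. It is a static count, so it does not show whether the saved spills sit on a hot path.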

>
> --
> Mehdi
>
> > > Also, "score" might
> > > be a misleading metric, can you actually elaborate what that means?
> > > How does that relate to, let's say, runtime performance improvement?
> >
> > Octane score considers two things: execution speed and latency (pauses
> > during execution). You can find more information here:
> > https://developers.google.com/octane/faq
> >
> > -Vivek
> >
> > > Thanks!
> > >
> > > --
> > > Davide
> > >
> > > "There are no solved problems; there are only problems that are more
> > > or less solved" -- Henri Poincare
>
