Displaying 3 results from an estimated 3 matches for "benchmark1".
2017 Feb 17
2
(RFC) Adjusting default loop fully unroll threshold
...it happens only for the double-variants, not the float-variants.
>
> +1
>
> The second chart shows relative code size increase (vertical axis) vs relative performance improvement (horizontal axis):
> I manually checked the cause of the 3 biggest performance regressions (proprietary benchmark1: -13.70%; MultiSource/Applications/hexxagon/hexxagon: -10.10%; MultiSource/Benchmarks/FreeBench/fourinarow/fourinarow
> -5.23%).
> For the proprietary benchmark and hexxagon, the code generation didn't change for the hottest parts, so the regression is probably caused by micro-architectural effects of c...
2017 Feb 16
4
(RFC) Adjusting default loop fully unroll threshold
...it happens only for the double-variants, not the
> float-variants.
>
+1
> The second chart shows relative code size increase (vertical axis) vs
> relative performance improvement (horizontal axis):
> I manually checked the cause of the 3 biggest performance regressions
> (proprietary benchmark1: -13.70%;
> MultiSource/Applications/hexxagon/hexxagon: -10.10%;
> MultiSource/Benchmarks/FreeBench/fourinarow/fourinarow -5.23%).
> For the proprietary benchmark and hexxagon, the code generation didn't
> change for the hottest parts, so the regression is probably caused by micro-architectural
> ...
2017 Feb 15
2
(RFC) Adjusting default loop fully unroll threshold
Thanks for running these, Kristof!
I'd still like to hear from Apple, and if we can get a few more x86
micro-architectures covered that'd be great, but it looks like -O3 is
uncontroversial, and the question is whether this makes sense at -O2...
To me, it would help a lot to know the actual breakdown of benchmarks such
as yours, Kristof (as they seem to have more codesize impact than others