Displaying 20 results from an estimated 900 matches similar to: "Lowering llvm.memset for ARM target"
2017 Aug 16
2
[cfe-dev] Disable memset synthesis
On Tue, Aug 15, 2017 at 9:37 PM, Tim Northover via cfe-dev <
cfe-dev at lists.llvm.org> wrote:
> On 15 August 2017 at 19:38, bharathi seshadri via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> > I find that GCC has an option -fno-tree-loop-distribute-patterns that
> > can be used to disable memcpy/memset synthesis. I wonder if there is
> > something similar
2017 Nov 10
5
[RFC] Enable Partial Inliner by default
Hi Graham,
Thank you for offering to help. I am trying to create a reproducer. The problem is that the crashes happen when LTO is used. One thing I am sure about: the IR is broken at compile time.
Thanks,
Evgeny
From: Graham Yiu <gyiu at ca.ibm.com>
Date: Friday, 10 November 2017 at 16:09
To: Evgeny Astigeevich <Evgeny.Astigeevich at arm.com>
Cc: "junbuml at codeaurora.org"
2017 Jan 27
2
Reversion of rL292621 caused about 7% performance regressions on Cortex-M
Hi Wei,
Thank you for the information.
Please let me know about any progress in fixing the failures.
I can help with checking that the final patch gives the same level of performance improvements.
Kind regards,
Evgeny Astigeevich
Senior Compiler Engineer
Compilation Tools
ARM
> -----Original Message-----
> From: Wei Mi [mailto:wmi at google.com]
> Sent: Friday, January 27, 2017 6:20 PM
2017 Jan 27
2
Reversion of rL292621 caused about 7% performance regressions on Cortex-M
Hi Evgeny,
Quentin and Matthias found it was a problem with the subreg live range
update and will push a fix soon
(http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170123/424126.html).
Thanks,
Wei.
On Fri, Jan 27, 2017 at 10:35 AM, Wei Mi <wmi at google.com> wrote:
> Sure. Will keep you posted.
>
> Thanks,
> Wei.
>
> On Fri, Jan 27, 2017 at 10:31 AM, Evgeny
2017 Nov 10
2
[RFC] Making .eh_frame more linker-friendly
> But if we still need to deal with CIEs and generate .eh_frame_hdr in a special way,
> does it make sense to make this change to simplify only a small part of a linker?
For huge C++ projects this could improve link time if GC is a bottleneck. It will also improve .eh_frame_hdr build time because you don’t spend time parsing garbage. However, a linker will have to have two versions of GC:
2017 Jan 23
2
[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines
I confirm there is no change in the IR if the hack is disabled in the sources.
David wrote that these instructions are created by SCEV.
Are other targets affected by the changes, e.g. X86?
Kind regards,
Evgeny Astigeevich
Senior Compiler Engineer
Compilation Tools
ARM
From: Sanjay Patel [mailto:spatel at rotateright.com]
Sent: Sunday, January 22, 2017 10:45 PM
To: Evgeny Astigeevich
Cc: llvm-dev; nd
2017 Nov 10
2
[RFC] Making .eh_frame more linker-friendly
Hi Igor,
> It sounds like the linker has to be aware of the .eh_frame section details to be able to generate .eh_frame_hdr and eliminate duplicate CIEs, right?
Yes, a linker needs some details but not all of them. It needs to know the sizes of records and the initial locations (PC Begin) to find out which functions the FDEs belong to.
> So, is there any difference whether it knows that in one place
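As a rough illustration of the "sizes of records" point in the reply above, here is a minimal sketch of walking an .eh_frame blob and classifying each record as a CIE or an FDE. It assumes little-endian data and the record layout described in the Linux Standard Base .eh_frame documentation (4-byte length, optional 8-byte extended length, then a 4-byte ID that is 0 for a CIE). The function name `split_eh_frame` and the sample bytes are hypothetical, and decoding PC Begin is deliberately left out because its encoding depends on the CIE augmentation data.

```python
import struct

def split_eh_frame(data: bytes):
    """Return (offset, kind, total_size) for each record in a
    little-endian .eh_frame blob. Hypothetical helper, sketch only."""
    records = []
    pos = 0
    while pos + 4 <= len(data):
        (length,) = struct.unpack_from("<I", data, pos)
        if length == 0:
            # A zero-length record terminates the section.
            break
        header = 4
        if length == 0xFFFFFFFF:
            # The real length is in the 8-byte extended length field.
            (length,) = struct.unpack_from("<Q", data, pos + 4)
            header = 12
        # The 4-byte ID follows the length: 0 marks a CIE, any other
        # value is an FDE's offset back to its CIE.
        (cie_id,) = struct.unpack_from("<I", data, pos + header)
        kind = "CIE" if cie_id == 0 else "FDE"
        records.append((pos, kind, header + length))
        pos += header + length
    return records

# Made-up sample: one CIE, one FDE pointing back at it, a terminator.
blob = (
    struct.pack("<II", 8, 0) + bytes(4)      # CIE: length 8, ID 0
    + struct.pack("<II", 12, 16) + bytes(8)  # FDE: length 12, CIE pointer 16
    + struct.pack("<I", 0)                   # terminator
)
print(split_eh_frame(blob))  # [(0, 'CIE', 12), (12, 'FDE', 16)]
```

Knowing only the record boundaries and kinds is enough to skip or drop whole records; mapping an FDE to its function would additionally require reading PC Begin using the pointer encoding declared by the CIE.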
2017 Jan 24
3
[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines
> On Jan 24, 2017, at 7:18 AM, Sanjay Patel <spatel at rotateright.com> wrote:
>
>
>
> On Mon, Jan 23, 2017 at 10:53 PM, Mehdi Amini <mehdi.amini at apple.com <mailto:mehdi.amini at apple.com>> wrote:
>
>> On Jan 23, 2017, at 3:48 PM, Sanjay Patel via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>>
2017 Jan 27
2
Reversion of rL292621 caused about 7% performance regressions on Cortex-M
Hi Wei,
Your reversion of rL292621 caused a performance regression of about 7% in our benchmark on Cortex-M7/M4.
In your commit comment I see it causes build bot failures.
What kind of failures are they: compiler crashes or incorrect code generation? Will you fix them?
We are interested in the changes because of the performance improvements they give.
Kind regards,
Evgeny Astigeevich
Senior Compiler
2017 Jan 22
2
[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines
Thank you for the information.
I’ll build clang without the hack and re-run the benchmark tomorrow.
-Evgeny
From: Sanjay Patel [mailto:spatel at rotateright.com]
Sent: Sunday, January 22, 2017 8:00 PM
To: Evgeny Astigeevich
Cc: llvm-dev; nd
Subject: Re: [InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines
> Do you mean to
2017 Nov 10
0
[RFC] Enable Partial Inliner by default
Hi Evgeny,
I just realized that if these are compile-time errors I can help
investigate on my end. Do you have something I can use to reproduce?
Cheers,
Graham Yiu
LLVM Compiler Development
IBM Toronto Software Lab
Office: (905) 413-4077 C2-707/8200/Markham
Email: gyiu at ca.ibm.com
From: Graham Yiu/Toronto/IBM
To: Evgeny Astigeevich <Evgeny.Astigeevich at arm.com>
Cc:
2017 Jan 24
2
[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines
> On Jan 23, 2017, at 3:48 PM, Sanjay Patel via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> All targets are likely affected in some way by the icmp+shl fold introduced with r292492. It's a basic pattern that occurs in lots of code. Did you see any perf wins on your targets with this commit?
>
> Sadly, it is also likely that many (all?) targets are negatively
2017 Nov 02
13
[RFC] Enable Partial Inliner by default
Forgot to add that all experiments were done with '-O3 -m64
-fexperimental-new-pass-manager'.
Graham Yiu
LLVM Compiler Development
IBM Toronto Software Lab
Office: (905) 413-4077 C2-707/8200/Markham
Email: gyiu at ca.ibm.com
From: Graham Yiu/Toronto/IBM
To: llvm-dev at lists.llvm.org
Cc: junbuml at codeaurora.org, xinliangli at gmail.com
Date: 11/02/2017 05:26 PM
Subject: [RFC]
2018 Jan 29
2
[RFC] Enable Partial Inliner by default
Hello All,
This conversation seems to have fizzled out and I would like to try to
revive it. My intention is to pick up where Graham left off with enabling
partial-inlining by default.
On Sat, Dec 9, 2017 at 7:47 AM, Florian Hahn via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Hi,
>
> On 13/11/2017 14:47, Evgeny Astigeevich via llvm-dev wrote:
>
>> Hi Graham,
2017 Oct 25
5
RFC: Switching to the new pass manager by default
On 10/25/2017 12:32 PM, Evgeny Astigeevich wrote:
>
> Hi Hal,
>
> I quickly checked the execution profile. It is real. The code changed
> significantly. A number of the hottest regions changed. I’ll compare IRs.
>
Thanks. Obviously a 1000% execution performance regression seems
problematic.
-Hal
> JFYI FreeBench/fourinarow time graph:
>
2017 Aug 16
3
Disable memset synthesis
Our application is 32-bit big-endian ARM and we use -O3 with LTO.
clang turns certain zero-initializations of structures into
calls to memset, which are not further lowered to move instructions.
Investigating perf reports, it looks like it may be beneficial to
disable this optimization that introduces a function call to memset in
certain hot paths.
I tried passing -fno-builtin, but that
2015 Jul 17
2
[LLVMdev] GlobalsModRef (and thus LTO) is completely broken
Before the fix, the compiler might simply return 'noalias' for cases it could
not really prove to be noalias but that happened to be correct by luck (or that
were wrongly noalias without resulting in a miscompile). It would be useful to
find the set of noalias queries that GlobalsModRef now misses with your
benchmark and examine whether some improvement can be made.
David
On Fri, Jul 17, 2015 at 6:32
2017 Jan 22
2
[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines
Hi Sanjay,
The benchmark source file: http://www.llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Benchmarks/Shootout/sieve.c?view=markup
Clang options used to produce the initial IR: clang -DNDEBUG -O3 -DNDEBUG -mcpu=cortex-a53 -fomit-frame-pointer -O3 -DNDEBUG -w -Werror=date-time -c sieve.c -S -emit-llvm -mllvm -disable-llvm-optzns --target=aarch64-arm-linux
Opt options: opt -O3
2018 Jan 29
0
[RFC] Enable Partial Inliner by default
Hi Sean,
Thank you for reminding me.
It looks like it got lost among tons of emails and other tasks.
I’ll check if the code size issues still exist.
Thanks,
Evgeny Astigeevich
From: Sean Fertile <sd.fertile at gmail.com>
Date: Monday, 29 January 2018 at 19:52
To: Florian Hahn <Florian.Hahn at arm.com>
Cc: Evgeny Astigeevich <Evgeny.Astigeevich at arm.com>, Graham Yiu <gyiu
2015 Jul 17
2
[LLVMdev] GlobalsModRef (and thus LTO) is completely broken
Can you say which benchmark it is, or give a test case, so we understand the
nature of the regression? As Gerolf said, that will be important for
understanding what is best to do.
On Fri, Jul 17, 2015, 06:43 Evgeny Astigeevich <Evgeny.Astigeevich at arm.com>
wrote:
> Yes, the regression is stable. I double checked this. A full benchmark
> run consists of at least 10 sub-runs to validate the