Displaying 20 results from an estimated 8000 matches similar to: "Passing literal -cpu model string to qemu"
2017 Jun 01
3
[RFC] Making -mcpu=generic the default for ARM armv7a and arm8a rather than -mcpu=cortex-a8 or -mcpu=cortex-a53
Thanks for everyone giving their feedback!
I saw pretty unanimous support for making -mcpu=generic the default and making -mcpu=generic schedule for an in-order CPU (Cortex-A8 in this case).
I'll be making those changes shortly.
I think the comments also make clear that it's less obvious whether we'd want -mcpu=native to become a default. It's probably good for some use cases, but
2017 May 31
6
[RFC] Making -mcpu=generic the default for ARM armv7a and arm8a rather than -mcpu=cortex-a8 or -mcpu=cortex-a53
Motivation
At the moment, when targeting armv7a, clang defaults to generating code as if -mcpu=cortex-a8 was specified.
When targeting armv8a, it defaults to generating code as if -mcpu=cortex-a53 was specified.
This leads to surprising code generation, with the compiler optimizing for a specific micro-architecture, whereas the intent from the user was probably to generate code that is
2015 Feb 09
3
[LLVMdev] aarch64 status for generating SIMD instructions
So far, all I have tried is -O3 and with & without "-mcpu=cortex-a57".
I'm new to LLVM so I'm not familiar with what optimization flags are available.
I tried poking around in the LLVM documentation but haven't found a definitive list.
The clang man page is skimpy on details.
From: Arnaud A. de Grandmaison [mailto:arnaud.degrandmaison at arm.com]
Sent: Monday, February
2014 Jun 26
2
[LLVMdev] Contributing the Apple ARM64 compiler backend
Hi James,
Thanks for your reply and hints on what can be done for AArch64 backend
optimization in LLVM.
We have a SPEC license and v8 hardware, so I will start looking into it.
Warm regards,
Manjunath
On Wed, Jun 25, 2014 at 8:42 PM, James Molloy <james.molloy at arm.com> wrote:
> Hi Manjunath,
>
> At the time of writing that status we had only done our initial analysis.
>
2015 Feb 09
2
[LLVMdev] aarch64 status for generating SIMD instructions
I'm using Fedora 22 and gcc 4.9.2 to run llvm 3.5.1 on an ARM Juno reference box (Cortex-A53 and Cortex-A57).
I tried compiling some simple functions like dot product and axpy() into assembly to see if any of the SIMD instructions were generated (they weren't).
Perhaps I'm missing some compiler flag to enable it.
Does anyone know what the status is for aarch64 generating SIMD instructions?
2015 Feb 09
3
[LLVMdev] aarch64 status for generating SIMD instructions
% clang -S -O3 -mcpu=cortex-a57 -ffast-math -Rpass-analysis=loop-vectorize dot.c
dot.c:15:1: remark: loop not vectorized: value that could not be identified as
reduction is used outside the loop [-Rpass-analysis=loop-vectorize]
}
^
dot.c:15:1: note: could not determine the original source location for :0:0
I found “llvm-as < /dev/null | llc -march=aarch64 -mattr=help” which listed a
2014 Jun 26
2
[LLVMdev] Contributing the Apple ARM64 compiler backend
Hi Sanjay,
The behaviour I’m talking about I’ve actually pinned down to CodeGenPrepare not working too well with ISAs that don’t have a good scaled load. I have a patch to fix it that is going through performance testing now.
Your testcase seems specific to x86 – for aarch64 we get the rather spiffy:
_Z3fooPii: // @_Z3fooPii
// BB#0:
2017 Feb 15
2
(RFC) Adjusting default loop fully unroll threshold
Thanks for running these Kristof!
I'd still like to hear from Apple, and if we can get a few more x86
micro-architectures covered that'd be great, but it looks like -O3 is
uncontroversial, and the question is whether this makes sense at O2...
To me, it would help a lot to know the actual breakdown of benchmarks such
as yours Kristof (as they seem to have more codesize impact than others
2015 Jan 13
2
[LLVMdev] question about enabling cfl-aa and collecting a57 numbers
Hi folks,
Moving the discussion to llvm.dev.
None of the changes we talked about earlier help.
Find attached the C source code that you can use to reproduce the issue.
clang --target=aarch64-linux-gnu -c -mcpu=cortex-a57 -Ofast -fno-math-errno test.c -S -o test.s -mllvm -debug-only=licm
LICM hoisting to while.body.lr.ph: %21 = load double** %arrayidx8, align 8, !tbaa !5
LICM hoisting to
2009 Jul 10
3
strange strsplit gsub problem 0 is this a bug or a string length limitation?
I was working with the rmetrics portfolioBacktesting function and dug into
the code to try to find why my formula with 113 items, i.e. A1 thru A113,
was being truncated and I only get 85 items, not 113.
Is it due to a string length limitation in R or is it a bug in the strsplit
or gsub functions, or in my string?
I'd very much appreciate any suggestions
============Input script:
2015 Jan 14
2
[LLVMdev] question about enabling cfl-aa and collecting a57 numbers
Can you send me actual LLVM IR or a preprocessed source from using -E?
I don't have a machine handy that has headers that target that arch.
On Tue Jan 13 2015 at 4:33:29 PM Daniel Berlin <dberlin at dberlin.org> wrote:
> Anything other than noalias or mustalias should be getting passed down the
> stack, so either that is not happening or CFL aa is giving better answers
> and
2017 Feb 16
4
(RFC) Adjusting default loop fully unroll threshold
First off, I just want to say wow and thank you. This kind of data is
amazing. =D
On Thu, Feb 16, 2017 at 2:46 AM Kristof Beyls <Kristof.Beyls at arm.com> wrote:
> The biggest relative code size increases indeed didn't happen for the
> biggest programs, but instead for a few programs weighing in at about 100KB.
> I'm assuming the Google benchmark set covers much bigger
2015 Jan 14
3
[LLVMdev] question about enabling cfl-aa and collecting a57 numbers
On 13 January 2015 at 22:11, Daniel Berlin <dberlin at dberlin.org> wrote:
> This is caused by CFLAA returning PartialAlias for a query that BasicAA
> can prove is NoAlias.
>
One of them is wrong. Which one?
I'm not sure from your description that this is a chaining issue.
PartialAlias doesn't chain and isn't supposed to, it's a final answer just
like NoAlias and
2016 May 13
2
A question about AArch64 Cortex-A57 subtarget definition
Hello everybody,
I'm reading the .td files defining the Cortex-A57 processor,
which is a subtarget of AArch64 target, and there is something
confusing me in the `AArch64SchedA57.td` file.
At the top of `AArch64SchedA57.td`, various processor resources are
defined, as follows:
```
def A57UnitB : ProcResource<1>; // Type B micro-ops
def A57UnitI : ProcResource<2>; // Type
```
2017 Jan 24
3
[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines
Hi Sanjay,
Thank you for your analysis. It’s interesting that the x86 machine is not affected. Maybe the x86 backend is smarter than the AArch64 backend, or it might be due to micro-architectural differences.
I don’t mind keeping the changes on trunk.
What I’d like to see is who will/should be involved in solving the issue. What kind of help/support is needed? Should we (ARM Compilation Tools) start
2015 Jan 15
2
[LLVMdev] question about enabling cfl-aa and collecting a57 numbers
Yes.
I've attached an updated patch that does the following:
1. Fixes the partialalias of globals/arguments
2. Enables partialalias for cases where nothing has been unified to a
global/argument
3. Fixes that select was unifying the condition to the other pieces (the
condition does not need to be processed :P). This was causing unnecessary
aliasing.
4. Adds a regression test to
2015 Jan 14
3
[LLVMdev] question about enabling cfl-aa and collecting a57 numbers
Oh, sorry, I didn't rebase it when I changed the fix; you would have had to
apply the first on top of the second.
Here is one against HEAD.
On Wed, Jan 14, 2015 at 12:32 PM, Ana Pazos <apazos at codeaurora.org> wrote:
> Daniel, your patch does not apply cleanly. Are you on the tip?
>
> The code I see has no line if (QueryResult == MayAlias || QueryResult == PartialAlias)
2015 Jan 14
4
[LLVMdev] question about enabling cfl-aa and collecting a57 numbers
Inline
- George
> On Jan 14, 2015, at 10:49 AM, Daniel Berlin <dberlin at dberlin.org> wrote:
>
>
>
>> On Tue, Jan 13, 2015 at 11:26 PM, Nick Lewycky <nlewycky at google.com> wrote:
>>> On 13 January 2015 at 22:11, Daniel Berlin <dberlin at dberlin.org> wrote:
>>> This is caused by CFLAA returning PartialAlias for a query that BasicAA can
2017 Jan 23
2
[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines
I confirm there is no change in the IR if the hack is disabled in the sources.
David wrote that these instructions are created by SCEV.
Are other targets affected by the changes, e.g. X86?
Kind regards,
Evgeny Astigeevich
Senior Compiler Engineer
Compilation Tools
ARM
From: Sanjay Patel [mailto:spatel at rotateright.com]
Sent: Sunday, January 22, 2017 10:45 PM
To: Evgeny Astigeevich
Cc: llvm-dev; nd
2017 Feb 17
2
(RFC) Adjusting default loop fully unroll threshold
> On Feb 16, 2017, at 4:41 PM, Xinliang David Li via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>
>
> On Thu, Feb 16, 2017 at 3:45 PM, Chandler Carruth via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> First off, I just want to say wow and thank you. This kind of data is amazing. =D
>
> On Thu, Feb 16, 2017 at