thr3ads.net - similar to: "Passing literal -cpu model string to qemu"

Displaying 20 results from an estimated 8000 matches similar to: "Passing literal -cpu model string to qemu"

[RFC] Making -mcpu=generic the default for ARM armv7a and arm8a rather than -mcpu=cortex-a8 or -mcpu=cortex-a53

2017 Jun 01

[RFC] Making -mcpu=generic the default for ARM armv7a and arm8a rather than -mcpu=cortex-a8 or -mcpu=cortex-a53

Thanks for everyone giving their feedback! I saw pretty unanimous support for making -mcpu=generic the default and making -mcpu=generic schedule for an in-order CPU (Cortex-A8 in this case). I'll be making those changes shortly. I think the comments also make clear that it's less obvious whether we'd want -mcpu=native to become a default. It's probably good for some use cases, but

[RFC] Making -mcpu=generic the default for ARM armv7a and arm8a rather than -mcpu=cortex-a8 or -mcpu=cortex-a53

2017 May 31

[RFC] Making -mcpu=generic the default for ARM armv7a and arm8a rather than -mcpu=cortex-a8 or -mcpu=cortex-a53

Motivation At the moment, when targeting armv7a, clang defaults to generate code as if -mcpu=cortex-a8 was specified. When targeting armv8a, it defaults to generate code as if -mcpu=cortex-a53 was specified. This leads to surprising code generation, by the compiler optimizing for a specific micro-architecture, whereas the intent from the user was probably to generate code that is

[LLVMdev] aarch64 status for generating SIMD instructions

2015 Feb 09

[LLVMdev] aarch64 status for generating SIMD instructions

So far, all I have tried is -O3 and with & without "-mcpu=cortex-a57". I'm new to LLVM so I'm not familiar with what optimization flags are available. I tried poking around in the LLVM documentation but haven't found a definitive list. The clang man page is skimpy on details. From: Arnaud A. de Grandmaison [mailto:arnaud.degrandmaison at arm.com] Sent: Monday, February

[LLVMdev] Contributing the Apple ARM64 compiler backend

2014 Jun 26

[LLVMdev] Contributing the Apple ARM64 compiler backend

HI James, Thanks for your reply and hints on what can be done for the Aarch64 backend optimization for llvm We have SPEC license and v8 hardware. So I will start looking into it warm regards Manjunath On Wed, Jun 25, 2014 at 8:42 PM, James Molloy <james.molloy at arm.com> wrote: > Hi Manjunath, > > At the time of writing that status we had only done our initial analysis. >

[LLVMdev] aarch64 status for generating SIMD instructions

2015 Feb 09

[LLVMdev] aarch64 status for generating SIMD instructions

I'm using Fedora 22 and gcc 4.9.2 to run llvm 3.5.1 on an ARM Juno reference box (cortex A53 & A57). I tried compiling some simple functions like dot product and axpy() into assembly to see if any of the SIMD instructions were generated (they weren't). Perhaps I'm missing some compiler flag to enable it. Does anyone know what the status is for aarch64 generating SIMD instructions?

[LLVMdev] aarch64 status for generating SIMD instructions

2015 Feb 09

[LLVMdev] aarch64 status for generating SIMD instructions

% clang -S -O3 -mcpu=cortex-a57 -ffast-math -Rpass-analysis=loop-vectorize dot.c dot.c:15:1: remark: loop not vectorized: value that could not be identified as reduction is used outside the loop [-Rpass-analysis=loop-vectorize] } ^ dot.c:15:1: note: could not determine the original source location for :0:0 I found “llvm-as < /dev/null | llc -march=aarch64 -mattr=help” which listed a

[LLVMdev] Contributing the Apple ARM64 compiler backend

2014 Jun 26

[LLVMdev] Contributing the Apple ARM64 compiler backend

Hi Sanjay, The behaviour I’m talking about I’ve actually pinned down to CodeGenPrepare not working too well with ISA’s that don’t have a good scaled load. I have a patch to fix it that is going through performance testing now. Your testcase seems specific to x86 – for aarch64 we get the rather spiffy: _Z3fooPii: // @_Z3fooPii // BB#0:

(RFC) Adjusting default loop fully unroll threshold

2017 Feb 15

(RFC) Adjusting default loop fully unroll threshold

Thanks for running these Kristof! I'd still like to hear from Apple, and if we can get a few more x86 micro-architectures covered that'd be great, but it looks like -O3 is uncontroversial, and the question is whether this makes sense at O2... To me, it would help a lot to know the actual breakdown of benchmarks such as yours Kristof (as they seem to have more codesize impact than others

[LLVMdev] question about enabling cfl-aa and collecting a57 numbers

2015 Jan 13

[LLVMdev] question about enabling cfl-aa and collecting a57 numbers

Hi folks, Moving the discussion to llvm.dev. None of the changes we talked earlier help. Find attached the C source code that you can use to reproduce the issue. clang --target=aarch64-linux-gnu -c -mcpu=cortex-a57 -Ofast -fno-math-errno test.c -S -o test.s -mllvm -debug-only=licm LICM hoisting to while.body.lr.ph: %21 = load double** %arrayidx8, align 8, !tbaa !5 LICM hoisting to

strange strsplit gsub problem 0 is this a bug or a string length limitation?

2009 Jul 10

strange strsplit gsub problem 0 is this a bug or a string length limitation?

I was working with the rmetrics portfolioBacktesting function and dug into the code to try to find why my formula with 113 items, i.e. A1 thru A113, was being truncated and I only get 85 items, not 113. Is it due to a string length limitation in R or is it a bug in the strsplit or gsub functions, or in my string? I'd very much appreciate any suggestions ============Input script:

[LLVMdev] question about enabling cfl-aa and collecting a57 numbers

2015 Jan 14

[LLVMdev] question about enabling cfl-aa and collecting a57 numbers

Can you send me actual LLVM IR or a preprocessed source from using -E? I don't have a machine handy that has headers that target that arch. On Tue Jan 13 2015 at 4:33:29 PM Daniel Berlin <dberlin at dberlin.org> wrote: > Anything other than noalias or mustalias should be getting passed down the > stack, so either that is not happening or CFL aa is giving better answers > and

(RFC) Adjusting default loop fully unroll threshold

2017 Feb 16

(RFC) Adjusting default loop fully unroll threshold

First off, I just want to say wow and thank you. This kind of data is amazing. =D On Thu, Feb 16, 2017 at 2:46 AM Kristof Beyls <Kristof.Beyls at arm.com> wrote: > The biggest relative code size increases indeed didn't happen for the > biggest programs, but instead for a few programs weighing in at about 100KB. > I'm assuming the Google benchmark set covers much bigger

[LLVMdev] question about enabling cfl-aa and collecting a57 numbers

2015 Jan 14

[LLVMdev] question about enabling cfl-aa and collecting a57 numbers

On 13 January 2015 at 22:11, Daniel Berlin <dberlin at dberlin.org> wrote: > This is caused by CFLAA returning PartialAlias for a query that BasicAA > can prove is NoAlias. > One of them is wrong. Which one? I'm not sure from your description that this is a chaining issue. PartialAlias doesn't chain and isn't supposed to, it's a final answer just like NoAlias and

A question about AArch64 Cortex-A57 subtarget definition

2016 May 13

A question about AArch64 Cortex-A57 subtarget definition

Hello everybody, I'm reading the .td files defining the Cortex-A57 processor, which is a subtarget of AArch64 target, and there is something confusing me in the `AArch64SchedA57.td` file. In the top of `AArch64SchedA57.td`, various processor resource are defined, as follows ``` def A57UnitB : ProcResource<1>; // Type B micro-ops def A57UnitI : ProcResource<2>; // Type

[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines

2017 Jan 24

[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines

Hi Sanjay, Thank you for your analysis. It’s interesting why the x86 machine is not affected. Maybe the x86 backend is smarter than the AArch64 backend, or it might be micro-architectural differences. I don’t mind to keep the changes on trunk. What I’d like to see is who will/should be involved in solving the issue. What kind of help/support is needed? Should we (ARM Compilation Tools) start

[LLVMdev] question about enabling cfl-aa and collecting a57 numbers

2015 Jan 15

[LLVMdev] question about enabling cfl-aa and collecting a57 numbers

Yes. I've attached an updated patch that does the following: 1. Fixes the partialalias of globals/arguments 2. Enables partialalias for cases where nothing has been unified to a global/argument 3. Fixes that select was unifying the condition to the other pieces (the condition does not need to be processed :P). This was causing unnecessary aliasing. 4. Adds a regression test to

[LLVMdev] question about enabling cfl-aa and collecting a57 numbers

2015 Jan 14

[LLVMdev] question about enabling cfl-aa and collecting a57 numbers

Oh, sorry, i didn't rebase it when i changed the fix, you would have had to apply the first on top of the second. Here is one against HEAD On Wed, Jan 14, 2015 at 12:32 PM, Ana Pazos <apazos at codeaurora.org> wrote: > Daniel, your patch does not apply cleanly. Are you on the tip? > > The code I see there is no line if (QueryResult == MayAlias|| QueryResult == PartialAlias)

[LLVMdev] question about enabling cfl-aa and collecting a57 numbers

2015 Jan 14

[LLVMdev] question about enabling cfl-aa and collecting a57 numbers

Inline - George > On Jan 14, 2015, at 10:49 AM, Daniel Berlin <dberlin at dberlin.org> wrote: > > > >> On Tue, Jan 13, 2015 at 11:26 PM, Nick Lewycky <nlewycky at google.com> wrote: >>> On 13 January 2015 at 22:11, Daniel Berlin <dberlin at dberlin.org> wrote: >>> This is caused by CFLAA returning PartialAlias for a query that BasicAA can

[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines

2017 Jan 23

[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines

Confirm there is no change in IR if the hack is disabled in the sources. David wrote that these instructions are created by SCEV. Are other targets affected by the changes, e.g. X86? Kind regards, Evgeny Astigeevich Senior Compiler Engineer Compilation Tools ARM From: Sanjay Patel [mailto:spatel at rotateright.com] Sent: Sunday, January 22, 2017 10:45 PM To: Evgeny Astigeevich Cc: llvm-dev; nd

(RFC) Adjusting default loop fully unroll threshold

2017 Feb 17

(RFC) Adjusting default loop fully unroll threshold

> On Feb 16, 2017, at 4:41 PM, Xinliang David Li via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > > > On Thu, Feb 16, 2017 at 3:45 PM, Chandler Carruth via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: > First off, I just want to say wow and thank you. This kind of data is amazing. =D > > On Thu, Feb 16, 2017 at

similar to: Passing literal -cpu model string to qemu