thr3ads.net - search: "vmovs"

Displaying 20 results from an estimated 67 matches for "vmovs".

[LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?

2013 Jul 01

Hi, ** Problematic ** I am looking for advices to share some logic between DAG combine and target lowering. Basically, I need to know if a bitcast that is about to be inserted during target specific isel lowering will be eliminated during DAG combine. Let me know if there is another, better supported, approach for this kind of problems. ** Motivating Example ** The motivating example comes

[LLVMdev] Thumb-2 code generation error in Apple LLVM at all optimization levels

2011 Nov 12

This would be best reported to Apple's Radar bug database at http://bugreport.apple.com/ but its whole website has been down for a while. I have a 100% reproducible Thumb-2 code generation error that occurs at all of the levels of optimization available in the Xcode 4.2 for Snow Leopard build settings GUI: -O0, -O1, -O2, -O3 and -Os. However the bad machine code only occurs in Release

[LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?

2013 Jul 01

On Mon, Jul 1, 2013 at 11:30 AM, Quentin Colombet <qcolombet at apple.com> wrote:
> Hi,
>
> ** Problematic **
> I am looking for advices to share some logic between DAG combine and
> target lowering.
>
> Basically, I need to know if a bitcast that is about to be inserted during
> target specific isel lowering will be eliminated during DAG combine.
>
> Let me

[LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?

2013 Jul 01

On Jul 1, 2013, at 11:52 AM, Eli Friedman <eli.friedman at gmail.com> wrote:
> On Mon, Jul 1, 2013 at 11:30 AM, Quentin Colombet <qcolombet at apple.com> wrote:
> Hi,
>
> ** Problematic **
> I am looking for advices to share some logic between DAG combine and target lowering.
>
> Basically, I need to know if a bitcast that is about to be inserted during target

[LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?

2013 Jul 01

On Mon, Jul 1, 2013 at 1:33 PM, Quentin Colombet <qcolombet at apple.com> wrote:
> On Jul 1, 2013, at 11:52 AM, Eli Friedman <eli.friedman at gmail.com> wrote:
>
> On Mon, Jul 1, 2013 at 11:30 AM, Quentin Colombet <qcolombet at apple.com>
> wrote:
>
>> Hi,
>>
>> ** Problematic **
>> I am looking for advices to share some logic between DAG

[RFC] [ARM] Execute only support

2015 Dec 04

Hi, I'm planning to implement "execute only" support in the ARM code generator. This basically means that the compiler will not generate data access to the generated code sections (e.g. data and code are strictly separated into different sections).
Outline:
- Add the subtarget feature/attribute "execute-only" to the ARM code generator to enable the feature.

[LLVMdev] Simple NEON optimization

2010 Nov 12

Hi folks, me again,
So, I want to implement a simple optimization in a NEON case I've seen these days, most as a matter of exercise, but it also simplifies (just a bit) the code generated. The case is simple:
  uint32x2_t x, res;
  res = vceq_u32(x, vcreate_u32(0));
This will generate the following code:
  ; zero d16
  vmov.i32 d16, #0x0
  ; load a

[PATCH 5/5] resample: Add NEON optimized inner_product_single for floating point

2011 Sep 01

From: Jyri Sarha <jsarha at ti.com>
Also adds inline asm implementations of WORD2INT(x) macro for fixed and floating point.
---
 libspeex/resample_neon.h | 101 ++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 101 insertions(+), 0 deletions(-)
diff --git a/libspeex/resample_neon.h b/libspeex/resample_neon.h
index ba93e41..e7e981e 100644
--- a/libspeex/resample_neon.h
+++

[LLVMdev] RE : Vector argument passing abi for ARM ?

2012 Jul 05

Hi Duncan, I also thought it was a bug, especially since it worked with LLVM 3.0, but since it is not defined by ABI, I was not sure if I need to submit it as a BUG. I wanted to be sure that it is an actual BUG before submitting it and got the not-a-bug answer. Here is a small example to reproduce the problem I'm experiencing:
; ModuleID = 'bugparam.ll'
target datalayout =

[LLVMdev] LLVM CodeGen Engineer job opening with Apple's compiler team

2011 May 26

Hi all,
LLVM CodeGen and Tools team at Apple is looking for exceptional compiler engineers. This is a great opportunity to work with many of the leaders in the LLVM community. If you are interested in this position, please send your resume / CV and relevant information to evan.cheng at apple.com
Thanks,
Evan
Job description
The Apple compiler team is seeking an engineer who is strongly

[LLVMdev] Question about ARM/vfp/NEON code generation

2011 May 27

I have a code generation question for ARM with VFP and NEON. I am generating code for the following function as a test:
void FloatingPointTest(float f1, float f2, float f3)
{
    float f4 = f1 * f2;
    if (f4 > f3)
        printf("%f\n",f2);
    else
        printf("%f\n",f3);
}
I have tried compiling with:
1. -mfloat-abi=softfp and -mfpu=neon
2.

[LLVMdev] Simple NEON optimization

2010 Nov 12

On Nov 12, 2010, at 7:23 AM, Renato Golin wrote:
> Hi folks, me again,
>
> So, I want to implement a simple optimization in a NEON case I've seen
> these days, most as a matter of exercise, but it also simplifies (just
> a bit) the code generated.
>
> The case is simple:
>
> uint32x2_t x, res;
> res = vceq_u32(x, vcreate_u32(0));
>
> This

[LLVMdev] RE : Vector argument passing abi for ARM ?

2012 Jul 05

Hi Sebastien,
> I also thought it was a bug, especially since it worked with LLVM 3.0, but since it is not defined by ABI, I was not sure if I need to submit it as a BUG.
yes it is a bug.
> I wanted to be sure that it is an actual BUG before submitting it and got the not-a-bug answer.
I didn't read Nadav's reply as saying there was no bug, in fact he explicitly said in his email

Cannot compile speexdsp 1.2rc3 on ARM64

2016 Jul 30

I've filed a bug for aarch64 https://github.com/xiph/speexdsp/issues/7 and provided the port in a fork with a pull request. We need someone to review/merge in the pull request? It provides the source code, but my testing was under Android builds, so there would be some configure changes needed to build it stand alone. On Tue, Apr 19, 2016 at 4:32 PM, Frank Barchard <fbarchard at

[LLVMdev] Simple NEON optimization

2010 Nov 12

On 12 November 2010 17:52, Bob Wilson <bob.wilson at apple.com> wrote:
> I recommend implementing this as a target-specific DAG combine optimization. We already have target-specific DAG nodes for the relevant NEON comparison operations (ARMISD::VCEQ, etc. -- see ARMISelLowering.h) as well as the vmov (ARMISD::VMOVIMM). You just need to teach the DAG combiner how to fold them together.

[LLVMdev] neon registers llvm using

2014 Mar 10

Hi, Everyone: Can anyone let me know the default NEON registers llvm going to use with armv7 devices? For example, d10 and d11 are treated as default zero? I am using Xcode5 + llvm and I got a case that compiler will generate neon codes " vst.8 {d10, d11}, [r1] " from C codes: "int aMV[4]; ...... aMV[0] = aMV[1] = aMV[2] = aMV[3] = 0; " and I

how to build NE10 Project using llvm compiler

2018 Jul 30

Hello, I’m using NXP layerscape Arch (A53/A72), and I want to use the NE10 Project library and llvm compiler 3.8.1.1 (https://projectne10.github.io/Ne10/). When compiling the project file I get the following errors:
./NE10_abs.asm.s:59:9: error: unrecognized instruction mnemonic
vmov s2, r3
^
../NE10_abs.asm.s:62:9: error:

clang 4.0.0: Invalid code for builtin floating point function with -mfloat-abi=hard -ffast-math (ARM)

2017 Mar 21

Hello, clang/llvm 4.0.0 generates invalid calls for builtin functions with -mfloat-abi=hard -ffast-math. Small example fail.c:
// clang -O2 -target armv7a-none-none-eabi -mfloat-abi=hard -ffast-math -S fail.c -o -
extern float sinf (float x);
float sin1 (float x) {return (sinf (x));}
generates code to pass the parameter in r0 and expect the result in r0. The same code without

[PATCH 0/5] ARM NEON optimization for samplerate converter

2011 Sep 01

From: Jyri Sarha <jsarha at ti.com> I optimized Speex resampler for NEON capable ARM CPUs. The first patch should speed up resampling on any platform that can spare the increased memory usage. It would be nice to have these merged to the master branch. Please let me know if there is anything I can do to help the merge. The patches have been rebased on top of master branch in

[LLVMdev] Vector argument passing abi for ARM ?

2012 Jul 05

Hi Sebastien,
> Thanks for the quick answer, how do I know which type is legal/illegal with respect to calling convention ?
the code generators are supposed to produce working code no matter what the parameter type is. The fact that the ARM ABI doesn't specify how <2 x i8> is passed just means that the code generators can pass it using whatever technique it feels like (since it
