Displaying 20 results from an estimated 1121 matches for "neons".
Did you mean:
neon
2016 Mar 25
3
NEON FP flags
On 25 March 2016 at 04:11, Hal Finkel <hfinkel at anl.gov> wrote:
> As I understand it, the fundamental property being addresses here is: Are the semantics of scalar FP math the same as vector FP math? TTI seems like a good place to expose that information. If the semantics are indeed different, then the vectorizer would require fast-math flags in order to vectorize FP operations
2016 Mar 29
1
NEON FP flags
On Fri, Mar 25, 2016 at 01:23:03PM +0000, Renato Golin via llvm-dev wrote:
> On 25 March 2016 at 04:11, Hal Finkel <hfinkel at anl.gov> wrote:
> > As I understand it, the fundamental property being addresses here is: Are
> > the semantics of scalar FP math the same as vector FP math? TTI seems like
> > a good place to expose that information. If the semantics are indeed
2014 Nov 25
4
[RFC PATCHv1] cover: celt_pitch_xcorr: Introduce ARM neon intrinsics
On Nov 25, 2014, at 10:07 AM, Viswanath Puttagunta <viswanath.puttagunta at linaro.org> wrote:
>
> > Also is there plans to make the NEON optimisations on ARMv7 run time
> > detectable like they have in cairo/pixman? For generic distributions
> > it would nice to be able to be able to enable them as they offer
> > decent performance improvements but have the code
2016 Mar 25
0
NEON FP flags
Hi Renato,
As I understand it, the fundamental property being addresses here is: Are the semantics of scalar FP math the same as vector FP math? TTI seems like a good place to expose that information. If the semantics are indeed different, then the vectorizer would require fast-math flags in order to vectorize FP operations (similarly, gcc's man page says it requires
2011 May 27
2
[LLVMdev] Question about ARM/vfp/NEON code generation
Thanks, that helps a lot.
> All chips (to date) with NEON have VFP3, so it's safe to assume that a
-mfpu=neon will have VFP3, so all the decisions
> about code generated for VFP3 can safely be assumed by targets with
NEON.
Just to confirm my understanding, can I correctly say in general that
the llc code generator might blur distinctions between NEON and VFP3
when it can do so
2016 Mar 22
2
NEON FP flags
On 22 March 2016 at 11:34, James Molloy <James.Molloy at arm.com> wrote:
> I don’t think this part is right. The denormal flag would have to be set by
> whatever code generates the FP instruction, which would be Clang’s codegen
> layer. So the if (Darwin) would be there, not in TTI.
Right, I meant the information to set/not set would be in TTI, not the
actual setting.
I don't
2013 Jun 07
3
[LLVMdev] NEON vector instructions and the fast math IR flags
On 7 June 2013 07:05, Owen Anderson <resistor at mac.com> wrote:
> Darwin uses NEON for floating point, but does *not* (and should not).
> globally enable fast math flags. Use of NEON for FP needs to remain
> achievable without globally setting the fast math flags. Fast math may
> imply reasonably imply NEON, but the opposite direction is not accurate.
>
> That said, I
2013 Jun 07
2
[LLVMdev] NEON vector instructions and the fast math IR flags
>> Darwin uses NEON for floating point, but does *not* (and should not).
>> globally enable fast math flags. Use of NEON for FP needs to remain
>> achievable without globally setting the fast math flags. Fast math may
>> imply reasonably imply NEON, but the opposite direction is not accurate.
| Good point. Fast math is probably a too tough requirement. I need to
| look
2009 Nov 10
4
[LLVMdev] speed up memcpy intrinsic using ARM Neon registers
I tried to speed up Dhrystone on ARM Cortex-A8 by optimizing the
memcpy intrinsic. I used the Neon load multiple instruction to move up
to 48 bytes at a time . Over 15 scalar instructions collapsed down
into these 2 Neon instructions.
fldmiad r3, {d0, d1, d2, d3, d4, d5} @ SrcLine dhrystone.c 359
fstmiad r1, {d0, d1, d2, d3, d4, d5}
It seems like this should be faster. But I did
2013 Jun 07
0
[LLVMdev] NEON vector instructions and the fast math IR flags
On 06/06/2013 11:58 PM, Renato Golin wrote:
> On 7 June 2013 07:05, Owen Anderson <resistor at mac.com> wrote:
Hi Owen, hi Renato,
thanks for your replies.
>> Darwin uses NEON for floating point, but does *not* (and should not).
>> globally enable fast math flags. Use of NEON for FP needs to remain
>> achievable without globally setting the fast math flags. Fast
2011 May 27
0
[LLVMdev] Question about ARM/vfp/NEON code generation
On May 27, 2011, at 10:49 AM, David Dunkle wrote:
> Thanks, that helps a lot.
>
>> All chips (to date) with NEON have VFP3, so it's safe to assume that a
> -mfpu=neon will have VFP3, so all the decisions
>> about code generated for VFP3 can safely be assumed by targets with
> NEON.
>
> Just to confirm my understanding, can I correctly say in general that
>
2011 May 27
0
[LLVMdev] Question about ARM/vfp/NEON code generation
On 27 May 2011 02:04, David Dunkle <ddunkle at arxan.com> wrote:
> In all cases, I get code that looks pretty very the same; its like what
> is below. However, I am expecting to see instruction level differences
> between the vfp3 and neon versions. When I do the same with gcc 4.2 I do
> see differences in the generated code.
Hi David,
You could see different instructions (as
2009 Nov 10
3
[LLVMdev] speed up memcpy intrinsic using ARM Neon registers
On Nov 9, 2009, at 5:59 PM, David Conrad wrote:
> On Nov 9, 2009, at 7:34 PM, Neel Nagar wrote:
>
>> I tried to speed up Dhrystone on ARM Cortex-A8 by optimizing the
>> memcpy intrinsic. I used the Neon load multiple instruction to move
>> up
>> to 48 bytes at a time . Over 15 scalar instructions collapsed down
>> into these 2 Neon instructions.
Nice. Thanks
2013 Jun 07
0
[LLVMdev] NEON vector instructions and the fast math IR flags
> |I just looked again at the +neonfp flag. Compiling with and without
> |+neonfp flag seems to only affect scalar types in the attached test
> |case. If e.g. the LLVM vectorizer introduces vector instructions on
> |LLVM-IR level floating point vectors still yield NEON assembly even if
> |compiled with "-mattr=+neon,-neonfp". Is this expected?
>
> I'm virtually
2011 May 27
1
[LLVMdev] Question about ARM/vfp/NEON code generation
I have a code generation question for ARM with VFP and NEON.
I am generating code for the following function as a test:
void FloatingPointTest(float f1, float f2, float f3)
{
float f4 = f1 * f2;
if (f4 > f3)
printf("%f\n",f2);
else
printf("%f\n",f3);
}
I have tried compiling with:
1. -mfloat-abi=softfp and -mfpu=neon
2.
2014 Nov 25
2
[RFC PATCHv1] cover: celt_pitch_xcorr: Introduce ARM neon intrinsics
On 25 November 2014 at 10:11, Viswanath Puttagunta
<viswanath.puttagunta at linaro.org> wrote:
>
> On 25 November 2014 at 09:39, Jonathan Lennox <jonathan at vidyo.com> wrote:
> >
> > On Nov 25, 2014, at 10:07 AM, Viswanath Puttagunta <viswanath.puttagunta at linaro.org> wrote:
> >>
> >> > Also is there plans to make the NEON optimisations
2013 Jun 07
2
[LLVMdev] NEON vector instructions and the fast math IR flags
Hi,
I was recently looking into the translation of LLVM-IR vector instructions
to ARM NEON assembly. Specifically, when this is legal to do and when we
need to be careful.
I attached a very simple test case:
define <4 x float> @fooP(<4 x float> %A, <4 x float> %B)
{
%C = fmul <4 x float> %A, %B
ret <4 x float> %C
}
If fooP is compiled with "llc -march=arm
2013 Sep 26
2
[LLVMdev] ARM NEON intrinsics in clang
Hello LLVM Devs,
I am starting my PhD on Automatic Parallelization for DSP and want to play
with some ARM NEON intrinsics for a start. I spent the last three days
trying to compile a version of LLVM that would allow me to compile sources
that contain these intrinsics, but with no success.
In the process I found out that clang doesn't support NEON (as per
2009 Nov 11
0
[LLVMdev] speed up memcpy intrinsic using ARM Neon registers
On Nov 11, 2009, at 3:27 AM, Rodolph Perfetta wrote:
>
> If you know about the alignment, maybe use structured load/store
> (vst1.64/vld1.64 {dn-dm}). You may also want to work on whole cache
> lines
> (64 bytes on A8). You can find more in this discussion:
> http://groups.google.com/group/beagleboard/browse_thread/thread/12c7bd415fbc
>
2014 Mar 26
19
[LLVMdev] 3.4.1 Release Plans
Hi,
We are now about halfway between the 3.4 and 3.5 releases, and I would
like to start preparing for a 3.4.1 release. Here is my proposed release
schedule:
Mar 26 - April 9: Identify and backport additional bug fixes to the 3.4 branch.
April 9 - April 18: Testing Phase
April 18: 3.4.1 Release
How you can help:
- If you have any bug fixes you think should be included to 3.4.1, send
me an