Displaying 4 results from an estimated 4 matches for "featureneonforfp".

2013 Jun 07
3
[LLVMdev] NEON vector instructions and the fast math IR flags
...a weird combination of ARM7TDMI+NEON. There are two things to know at this point: 1. When the execution gets to resetSubtargetFeatures, what CPU has it detected for your arguments. You may also have to look at ARM.td to see if the CPU that it got detected has in its description the feature "FeatureNEONForFP". 2. If the CPU is correct (Cortex-A*), and it's neither A5 nor A8, do we still want to generate single-precision float on NEON when non-Darwin and safe math? I don't think so. Possibly, that condition should be extended to ignore the CPU you're using and *only* emit NEON SP-FP wh...
2013 Jun 07
0
[LLVMdev] NEON vector instructions and the fast math IR flags
On Jun 6, 2013, at 8:35 PM, Tobias Grosser <grosser at google.com> wrote: > I understand that some users do not require 754 compliant floating point behavior (clang on darwin?), which means they would probably not need this change. However, it should also not hurt them performance-wise as such users would probably set the relevant global fast-math flags to reduce the precision
2013 Jun 07
2
[LLVMdev] NEON vector instructions and the fast math IR flags
Hi, I was recently looking into the translation of LLVM-IR vector instructions to ARM NEON assembly. Specifically, when this is legal to do and when we need to be careful. I attached a very simple test case: define <4 x float> @fooP(<4 x float> %A, <4 x float> %B) { %C = fmul <4 x float> %A, %B ret <4 x float> %C } If fooP is compiled with "llc -march=arm
2013 Jun 07
0
[LLVMdev] NEON vector instructions and the fast math IR flags
...NEON. > > There are two things to know at this point: 1. When the execution gets to resetSubtargetFeatures, what CPU has it detected for your arguments. You may also have to look at ARM.td to see if the CPU that it got detected has in its description the feature "FeatureNEONForFP". 2. If the CPU is correct (Cortex-A*), and it's neither A5 nor A8, do we still want to generate single-precision float on NEON when non-Darwin and safe math? I don't think so. Possibly, that condition should be extended to ignore the CPU you're using and *...