thr3ads.net - search: "useneonforsingleprecisionfp"

[LLVMdev] NEON vector instructions and the fast math IR flags

2013 Jun 07

...table is being built over the last months. The only way to get this result is indirectly via the cost model but the backend must still support vectorized IR (it is part of the language) via scalarization. > Absolutely! There are two problems to solve: increase the cost for SPFP when UseNEONForSinglePrecisionFP is false, so that vectorizers don't generate such code, and legalize correctly in the backend, for vector code that does not respect that flag. (You can of course assign UMAX cost for all floating point vector types in the cost model for ARM and get the desired result - this won’t solve...
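The cost-model half of the idea in this snippet can be sketched in a few lines of standalone C++. This is only an illustration, not LLVM's actual cost model: `TargetInfo`, `ElemKind`, `vectorArithCost`, and the cost numbers are all hypothetical. It assumes a scalarized lane costs one extract, one scalar operation, and one insert.

```cpp
// Hypothetical stand-in for a target description; this is not LLVM's ARMSubtarget.
struct TargetInfo {
    bool useNEONForSinglePrecisionFP;
};

// Hypothetical element kinds for a vector type.
enum class ElemKind { Int32, Float32 };

// Illustrative cost query: relative cost of one vector arithmetic operation.
// The numbers are invented for the example.
unsigned vectorArithCost(const TargetInfo &target, ElemKind elem, unsigned numElems) {
    const unsigned nativeCost = 1;
    if (elem == ElemKind::Float32 && !target.useNEONForSinglePrecisionFP) {
        // Price the operation as if it were scalarized:
        // extract + scalar op + insert for every lane.
        return numElems * 3;
    }
    return nativeCost;
}
```

With the flag off, a `<4 x float>` operation costs 12 in this model versus 1 when the flag is on, so a vectorizer comparing costs would prefer scalar code. As the thread notes, raising the cost does not remove the need for the backend to legalize vector IR that arrives anyway.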

[LLVMdev] NEON vector instructions and the fast math IR flags

2013 Jun 07

On Jun 6, 2013, at 8:35 PM, Tobias Grosser <grosser at google.com> wrote: > I understand that some users do not require 754 compliant floating point behavior (clang on darwin?), which means they would probably not need this change. However, it should also not hurt them performance-wise as such users would probably set the relevant global fast-math flags to reduce the precision

[LLVMdev] NEON vector instructions and the fast math IR flags

2013 Jun 07

...solve the real problem in the ARM backend. > The only way to get this result is indirectly via the cost model but the backend must still support vectorized IR (it is part of the language) via scalarization. > > Absolutely! There are two problems to solve: increase the cost for SPFP when UseNEONForSinglePrecisionFP is false, so that vectorizers don't generate such code, and legalize correctly in the backend, for vector code that does not respect that flag. > > > (You can of course assign UMAX cost for all floating point vector types in the cost model for ARM and get the desired result - this wo...

[LLVMdev] NEON vector instructions and the fast math IR flags

2013 Jun 07

On Jun 7, 2013, at 9:22 AM, Renato Golin <renato.golin at linaro.org> wrote: > On 7 June 2013 14:49, Arnold Schwaighofer <aschwaighofer at apple.com> wrote: > It is not the vectorizer that is the issue, it is the ARM backend that currently translates vectorized floating point IR to NEON instructions (it should scalarize it if desired to do so - i.e. if people care about

[LLVMdev] NEON vector instructions and the fast math IR flags

2013 Jun 07

Hi, I was recently looking into the translation of LLVM-IR vector instructions to ARM NEON assembly. Specifically, when this is legal to do and when we need to be careful. I attached a very simple test case:

    define <4 x float> @fooP(<4 x float> %A, <4 x float> %B) {
      %C = fmul <4 x float> %A, %B
      ret <4 x float> %C
    }

If fooP is compiled with "llc -march=arm

[LLVMdev] NEON vector instructions and the fast math IR flags

2013 Jun 07

On 7 June 2013 14:49, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
> It is not the vectorizer that is the issue, it is the ARM backend that currently translates vectorized floating point IR to NEON instructions (it should scalarize it if desired to do so - i.e. if people care about denormals).

Hi Arnold, Can't the vectorizer not generate the v4f32
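The denormal concern can be made concrete with a small host-side simulation. ARMv7 NEON arithmetic flushes denormals to zero, while VFP can preserve them, so the same multiply may give different results depending on which unit runs it. The sketch below does not execute NEON code. The helper names `flushToZero`, `mulIEEE`, and `mulFlushToZero` are hypothetical, and it assumes the host compiler is run without fast-math options.

```cpp
#include <cmath>

// Replace a subnormal value with a zero of the same sign (models flush-to-zero).
float flushToZero(float x) {
    return std::fpclassify(x) == FP_SUBNORMAL ? std::copysign(0.0f, x) : x;
}

// IEEE 754 style multiply: a subnormal result is preserved.
float mulIEEE(float a, float b) {
    return a * b;
}

// Flush-to-zero multiply: subnormal inputs and subnormal results become zero.
float mulFlushToZero(float a, float b) {
    return flushToZero(flushToZero(a) * flushToZero(b));
}
```

Multiplying the smallest normal float (`std::numeric_limits<float>::min()`) by `0.5f` gives a nonzero subnormal from `mulIEEE` and zero from `mulFlushToZero`. That is the kind of difference users who care about denormals would observe.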

[LLVMdev] NEON vector instructions and the fast math IR flags

2013 Jun 07

> ...t think anyone would object to making VFP codegen available under non-Darwin triples. It's just a matter of making it happen.

Hi Owen, ARMSubtarget::resetSubtargetFeatures(StringRef CPU, StringRef FS) has a check to see if the target is Darwin or if UnsafeMath is enabled to set UseNEONForSinglePrecisionFP, but only for A5 and A8, where this was a problem. Maybe I was too conservative on my fix. Tobi, The march=arm option would default to ARMv4, while mattr=+neon would force NEON, but I'm not sure it would default to A8, which would be a weird combination of ARM7TDMI+NEON. There are two things...
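The check described here can be modelled as a small standalone function. This is a sketch of the logic as the message describes it, not the code in `ARMSubtarget`. `SubtargetQuery`, `resolveUseNEONForSinglePrecisionFP`, and the CPU name strings are hypothetical, and the assumption that other CPUs keep whatever value was already set is an interpretation.

```cpp
#include <string>

// Hypothetical inputs for the decision; the real subtarget works from feature bits.
struct SubtargetQuery {
    std::string cpu;    // e.g. "cortex-a8" (naming assumed for illustration)
    bool isDarwin;      // target triple is Darwin
    bool unsafeFPMath;  // unsafe floating-point math was requested
};

// Returns the value the flag would take after the check described above.
// currentValue is whatever the flag held before the check ran.
bool resolveUseNEONForSinglePrecisionFP(const SubtargetQuery &q, bool currentValue) {
    const bool affectedCPU = (q.cpu == "cortex-a5" || q.cpu == "cortex-a8");
    if (affectedCPU && (q.isDarwin || q.unsafeFPMath)) {
        return true;  // non-IEEE NEON f32 is acceptable for this configuration
    }
    return currentValue;  // other configurations are left untouched by this check
}
```

Under this model a Cortex-A8 on Darwin gets NEON single-precision FP, a Cortex-A8 on a non-Darwin triple without unsafe math does not, and other CPUs are unaffected by the check.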
