Displaying 20 results from an estimated 9000 matches similar to: "[LLVMdev] NEON vector instructions and the fast math IR flags"
2013 Jun 07
0
[LLVMdev] NEON vector instructions and the fast math IR flags
On Jun 6, 2013, at 8:35 PM, Tobias Grosser <grosser at google.com> wrote:
> I understand that some users do not require 754 compliant floating point behavior (clang on darwin?), which means they would probably not need this change. However, it should also not hurt them performance-wise as such users would probably set the relevant global fast-math flags to reduce the precision
2013 Jun 07
3
[LLVMdev] NEON vector instructions and the fast math IR flags
On 7 June 2013 07:05, Owen Anderson <resistor at mac.com> wrote:
> Darwin uses NEON for floating point, but does *not* (and should not)
> globally enable fast math flags. Use of NEON for FP needs to remain
> achievable without globally setting the fast math flags. Fast math may
> reasonably imply NEON, but the opposite direction is not accurate.
>
> That said, I
2013 Jun 07
0
[LLVMdev] NEON vector instructions and the fast math IR flags
On 06/06/2013 11:58 PM, Renato Golin wrote:
> On 7 June 2013 07:05, Owen Anderson <resistor at mac.com> wrote:
Hi Owen, hi Renato,
thanks for your replies.
>> Darwin uses NEON for floating point, but does *not* (and should not)
>> globally enable fast math flags. Use of NEON for FP needs to remain
>> achievable without globally setting the fast math flags. Fast
2013 Jun 07
3
[LLVMdev] NEON vector instructions and the fast math IR flags
On 7 June 2013 08:48, Tobias Grosser <tobias at grosser.es> wrote:
> When to set which subtarget feature is a policy decision on which I honestly
> don't have any opinion for clang. The best is probably to mirror the gcc
> behavior on linux targets.
>
Not really, since GCC has no special behaviour for Darwin, AFAIK.
My change will only generate SP-FP on NEON for A5 and A8
2013 Jun 07
0
[LLVMdev] NEON vector instructions and the fast math IR flags
On Jun 7, 2013, at 3:14 AM, Renato Golin <renato.golin at linaro.org> wrote:
> On 7 June 2013 08:48, Tobias Grosser <tobias at grosser.es> wrote:
> When to set which subtarget feature is a policy decision on which I honestly don't have any opinion for clang. The best is probably to mirror the gcc behavior on linux targets.
>
> Not really, since GCC has no special
2016 Feb 11
4
Vectorization with fast-math on irregular ISA sub-sets
----- Original Message -----
> From: "Renato Golin" <renato.golin at linaro.org>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "James Molloy" <James.Molloy at arm.com>, "Nadav Rotem" <nrotem at apple.com>, "Arnold Schwaighofer"
> <aschwaighofer at apple.com>, "LLVM Dev" <llvm-dev at
2013 Jun 07
0
[LLVMdev] NEON vector instructions and the fast math IR flags
> |I just looked again at the +neonfp flag. Compiling with and without
> |+neonfp flag seems to only affect scalar types in the attached test
> |case. If e.g. the LLVM vectorizer introduces vector instructions on
> |LLVM-IR level floating point vectors still yield NEON assembly even if
> |compiled with "-mattr=+neon,-neonfp". Is this expected?
>
> I'm virtually
2013 Jun 07
2
[LLVMdev] NEON vector instructions and the fast math IR flags
>> Darwin uses NEON for floating point, but does *not* (and should not)
>> globally enable fast math flags. Use of NEON for FP needs to remain
>> achievable without globally setting the fast math flags. Fast math may
>> reasonably imply NEON, but the opposite direction is not accurate.
| Good point. Fast math is probably too tough a requirement. I need to
| look
2013 Jun 07
2
[LLVMdev] NEON vector instructions and the fast math IR flags
On 06/07/2013 06:49 AM, Arnold Schwaighofer wrote:
>
> On Jun 7, 2013, at 3:14 AM, Renato Golin <renato.golin at linaro.org> wrote:
>
>> On 7 June 2013 08:48, Tobias Grosser <tobias at grosser.es> wrote:
>> When to set which subtarget feature is a policy decision on which I honestly don't have any opinion for clang. The best is probably to mirror the gcc
2016 Feb 15
2
Vectorization with fast-math on irregular ISA sub-sets
Hi,
> James, is that a correct assessment?
Yes, it is also my belief that the only way ARMv7 NEON differs from IEEE754 is lack of denormal support.
James
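
To make the denormal point above concrete, here is a minimal Python sketch. It is only a software model: `flush_to_zero`, `sub_ieee` and `sub_ftz` are hypothetical helper names, and the assumption that both inputs and results are flushed is a simplification for illustration, not a description of any particular core. It shows how two distinct single-precision values can subtract to exactly zero once denormals are flushed, which IEEE 754 gradual underflow does not allow.

```python
import math
import struct

FLT_MIN = 2.0 ** -126  # smallest positive normal binary32 value


def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def flush_to_zero(x: float) -> float:
    """Assumed model of flush-to-zero: denormal magnitudes become signed zero."""
    if x != 0.0 and abs(x) < FLT_MIN:
        return math.copysign(0.0, x)
    return x


def sub_ieee(a: float, b: float) -> float:
    """Single-precision subtraction with gradual underflow (denormals kept)."""
    return to_f32(to_f32(a) - to_f32(b))


def sub_ftz(a: float, b: float) -> float:
    """Single-precision subtraction with inputs and result flushed to zero."""
    a = flush_to_zero(to_f32(a))
    b = flush_to_zero(to_f32(b))
    return flush_to_zero(to_f32(a - b))


a = 1.5 * FLT_MIN  # normal binary32 value
b = 1.0 * FLT_MIN  # normal binary32 value, different from a

print(sub_ieee(a, b) == 2.0 ** -127)  # difference is a nonzero denormal
print(sub_ftz(a, b) == 0.0)           # difference is flushed to zero
```

Both inputs are normal numbers, so the two modes only diverge in the result. That is why code that is otherwise free of tiny values can still observe the difference.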
> On 11 Feb 2016, at 10:53, Renato Golin <renato.golin at linaro.org> wrote:
>
> Hal,
>
> I had a read on the ARM ARM about VFP and SIMD FP semantics and my
> analysis is that NEON's only problem is the
2013 Jun 10
0
[LLVMdev] NEON vector instructions and the fast math IR flags
| For programs that have mixed precision requirements for floating point
| operations we probably need to do this according to the fast math flags.
| Until we get there, a good first step would probably be to provide a
| global option similar to -enable-no-infs-fp-math that specifies if
| denormals should be allowed or not. This would allow the user to specify
| the precision requirements, without
2013 Jun 10
1
[LLVMdev] NEON vector instructions and the fast math IR flags
On 06/10/2013 01:56 AM, David Tweed wrote:
> | For programs that have mixed precision requirements for floating point
> | operations we probably need to do this according to the fast math flags.
> | Until we get there, a good first step would probably be to provide a
> | global option similar to -enable-no-infs-fp-math that specifies if
> | denormals should be allowed or not. This
2013 Jun 07
2
[LLVMdev] NEON vector instructions and the fast math IR flags
On 7 June 2013 14:49, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
> It is not the vectorizer that is the issue, it is the ARM backend that
> currently translates vectorized floating point IR to NEON instructions (it
> should scalarize it if desired to do so - i.e. if people care about
> denormals).
>
Hi Arnold,
Can't the vectorizer not generate the v4f32
2016 Mar 29
1
NEON FP flags
On Fri, Mar 25, 2016 at 01:23:03PM +0000, Renato Golin via llvm-dev wrote:
> On 25 March 2016 at 04:11, Hal Finkel <hfinkel at anl.gov> wrote:
> > As I understand it, the fundamental property being addressed here is: Are
> > the semantics of scalar FP math the same as vector FP math? TTI seems like
> > a good place to expose that information. If the semantics are indeed
2016 Mar 25
3
NEON FP flags
On 25 March 2016 at 04:11, Hal Finkel <hfinkel at anl.gov> wrote:
> As I understand it, the fundamental property being addressed here is: Are the semantics of scalar FP math the same as vector FP math? TTI seems like a good place to expose that information. If the semantics are indeed different, then the vectorizer would require fast-math flags in order to vectorize FP operations
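
The rule proposed in the quoted text can be sketched as a small decision function. This is an illustrative Python model only, not LLVM's API: `FPOp`, `may_vectorize`, `loop_may_vectorize` and the `vector_fp_matches_scalar` parameter are hypothetical names standing in for whatever the target information would expose.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class FPOp:
    """Hypothetical stand-in for a floating-point IR operation."""
    name: str
    fast_math: bool  # True if the operation carries fast-math flags


def may_vectorize(op: FPOp, vector_fp_matches_scalar: bool) -> bool:
    """An FP op may be vectorized if vector FP has scalar semantics,
    or if the op itself opts into relaxed semantics via fast-math flags."""
    return vector_fp_matches_scalar or op.fast_math


def loop_may_vectorize(ops: Iterable[FPOp], vector_fp_matches_scalar: bool) -> bool:
    """A loop is only vectorizable if every FP op in it may be vectorized."""
    return all(may_vectorize(op, vector_fp_matches_scalar) for op in ops)


strict = FPOp("fadd", fast_math=False)
relaxed = FPOp("fmul", fast_math=True)

# Target whose vector unit differs from scalar FP (assumed for illustration):
print(may_vectorize(strict, vector_fp_matches_scalar=False))   # False
print(may_vectorize(relaxed, vector_fp_matches_scalar=False))  # True
```

Keeping the check per operation rather than per function matches the idea of per-instruction fast-math flags: one strict operation in a loop is enough to block vectorization on a target whose vector FP semantics differ.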
2013 Jun 07
1
[LLVMdev] NEON vector instructions and the fast math IR flags
On 7 June 2013 18:08, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
> What I am suggesting is that (if you care about denormals):
>
> * the arm backend has to be fixed to scalarize floating point vector
> operations (behind a flag)
> * the arm target transform model has to correctly reflect that
>
Yup. What I had in mind, too. This is why I asked Tobi to create
2013 Jun 07
0
[LLVMdev] NEON vector instructions and the fast math IR flags
On Jun 7, 2013, at 9:22 AM, Renato Golin <renato.golin at linaro.org> wrote:
> On 7 June 2013 14:49, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
> It is not the vectorizer that is the issue, it is the ARM backend that currently translates vectorized floating point IR to NEON instructions (it should scalarize it if desired to do so - i.e. if people care about
2016 Mar 25
0
NEON FP flags
Hi Renato,
As I understand it, the fundamental property being addressed here is: Are the semantics of scalar FP math the same as vector FP math? TTI seems like a good place to expose that information. If the semantics are indeed different, then the vectorizer would require fast-math flags in order to vectorize FP operations (similarly, gcc's man page says it requires
2016 Feb 08
2
Vectorization with fast-math on irregular ISA sub-sets
Folks,
I'm now looking at https://llvm.org/bugs/show_bug.cgi?id=16274, which
seems to have some support in the vectorizer, but not as we need for
this particular case. I may have missed something obvious, please let
me know if there is a better way.
As you already know, ARM has two FP instruction sets: VFP and NEON.
VFP applies to single FP registers while NEON is a full SIMD. The
problem is
2013 Mar 19
0
[LLVMdev] ARM NEON VMUL.f32 issue
Hi Renato,
You're right. Strictly speaking, using NEON for scalar floating point isn't completely safe for exactly this reason (also NaNs, IIRC). We generally do it anyway because on common cores (cortex-a8) VFP is pretty terrible and the NEON approximation is correct for the vast majority of use-cases that people care about. Yes, that's cutting some corners. Would you mind making