thr3ads.net - similar to: "How to prevent clang/llvm from generating floating-point instructions?"

Displaying 20 results from an estimated 8000 matches similar to: "How to prevent clang/llvm from generating floating-point instructions?"

How to prevent clang/llvm from generating floating-point instructions?

2016 Mar 16


Hi Tim,

Thanks for your message! It turns out that the infrastructure (an outdated one) that I am working on uses gcc+dragonegg to generate LLVM code:

    gcc -m32 -S -c -O0 -fplugin=$(DRAGONEGG_SO) -fplugin-arg-dragonegg-emit-ir $< -o $@.tmp

It directly generates LLVM code with fadd, etc. I'm not familiar with the dragonegg plugin...

Thanks,
Xiaochu

On Wed, Mar 16, 2016 at 12:00 PM,

[LLVMdev] API change: add, sub, and mul no longer do floating-point

2010 May 03


Quick heads up: On LLVM trunk, the add, sub, and mul instructions no longer accept floating-point operands. The fadd, fsub, and fmul instructions should be used instead. This change actually happened back in LLVM 2.6; since then, LLVM has just been silently converting add into fadd and so on, and the change today is that it no longer does this silent conversion. Dan

[LLVMdev] Does DragonEgg support parameters like -fno-builtin in clang?

2013 Apr 10


Hi chenwj,

Thanks! I have tried it, but the generated byte code still uses `llvm.memset`. I guess the flag `-fno-builtin` is not used by DragonEgg, or I missed some other configuration parameters.

On Tue, Apr 9, 2013 at 7:45 PM, 陳韋任 (Wei-Ren Chen) <chenwj at iis.sinica.edu.tw> wrote:
> On Tue, Apr 09, 2013 at 04:39:14PM -0700, Jeff Jia wrote:
> > Hi,
> >
> > I have

[LLVMdev] SIMD instructions and memory alignment on X86

2013 Jul 18


Are you able to send any IR for others to reproduce this issue?

On Wed, Jul 17, 2013 at 11:23 PM, Peter Newman <peter at uformia.com> wrote:
> Unfortunately, this doesn't appear to be the bug I'm hitting. I applied
> the fix to my source and it didn't make a difference.
>
> Also further testing found me getting the same behavior with other SIMD
> instructions.

[LLVMdev] spilling & xmm register usage

2010 Sep 29


Hello everybody,

I have stumbled upon a test case (the attached module is a slightly reduced version) that shows extremely reduced performance on Linux compared to Windows when executed using LLVM's JIT. We narrowed the problem down to the actual code being generated; the source IR on both systems is the same. Try compiling the attached module:

    llc -O3 -filetype=asm -o BAD.s BAD.ll

Under

[LLVMdev] Does DragonEgg support parameters like -fno-builtin in clang?

2013 Apr 10


Hi,

I have figured out a way to achieve similar effects.

    gcc -S -c -O0 -fplugin=$(DRAGONEGG_SO) -fplugin-arg-dragonegg-emit-ir hello.c -o hello.bc
    opt -O3 -disable-simplify-libcalls hello.bc -o hello.bc

On Wed, Apr 10, 2013 at 12:54 PM, Jeff Jia <fjia at cs.ucsd.edu> wrote:
> Hi chenwj,
>
> Thanks! I have tried it, but the generated byte code still uses
>

[LLVMdev] SIMD instructions and memory alignment on X86

2013 Jul 18


Unfortunately, this doesn't appear to be the bug I'm hitting. I applied the fix to my source and it didn't make a difference. Further testing also showed the same behavior with other SIMD instructions. The common factor is that in each case ECX is set to 0x7fffffff and the operation uses xmm ptr ecx+offset. Additionally, turning the optimization level passed to

[LLVMdev] API change: add, sub, and mul no longer do floating-point

2010 May 03


On Mon, May 3, 2010 at 3:53 PM, Dan Gohman <gohman at apple.com> wrote:
> Quick heads up: On LLVM trunk, the add, sub, and mul instructions no
> longer accept floating-point operands. The fadd, fsub, and fmul instructions
> should be used instead.
>
> This change actually happened back in LLVM 2.6; since then, LLVM has just
> been silently converting add into fadd and so

clang invokes assembler when generating obj file?

2015 Sep 09


Nice! Thanks, Tom. It works.

On Wed, Sep 9, 2015 at 12:30 PM Tom Stellard <tom at stellard.net> wrote:
> On Wed, Sep 09, 2015 at 07:21:30PM +0000, Xiaochu Liu via llvm-dev wrote:
> > Dear there,
> >
> > I'm trying to use clang to invoke my backend to generate obj code using
> > command:
> >
> > clang -target x-linux-gnu global.c -c
> >

[LLVMdev] loop vectorizer erroneously finds 256 bit vectors

2013 Nov 10


The loop vectorizer is doing an amazing job so far. Most of the time. I just came across one function which led to unexpected behavior: on this function the loop vectorizer finds a 256-bit vector as the widest vector type for the x86-64 architecture. (!) This is strange, as it was always finding the correct size of 128 bit as the widest type. I isolated the IR of the function to check if this is

Fusing contract fadd/fsub with normal fmul

2017 Jun 10


Hi,

On LLVM 5.0 (current trunk), fadd/fsub and fmul that are both marked with `contract` or `fast` can be merged into an fma instruction by the backend. I'm wondering about the exact semantics of this new flag as well as `fast`, and in particular whether it would be valid to do this when only the `fadd`/`fsub` (and not the `fmul`) is marked with `contract` or at least `fast`. The reasoning is that doing
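Why the placement of `contract` matters can be seen from rounding alone: a fused multiply-add rounds once, whereas a separate fmul followed by fadd rounds twice. The following is only an illustrative sketch in Python, not LLVM IR and not taken from the thread. The helper names `unfused` and `fused` are hypothetical, and the fused result is emulated with exact rational arithmetic rather than a hardware fma.

```python
from fractions import Fraction


def unfused(a, b, c):
    """Multiply and round to double, then add and round again."""
    return (a * b) + c


def fused(a, b, c):
    """Emulate fma: compute a*b + c exactly, then round once to double."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


# Inputs chosen so that the exact product 1 - 2**-60 rounds to 1.0 in double.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

separate_result = unfused(a, b, c)  # the rounded product cancels against c
fused_result = fused(a, b, c)       # the low-order bits of the product survive

print(separate_result, fused_result)
```

With these inputs the two strategies give different answers, which is why whether fusion is permitted has to be specified per instruction.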

[LLVMdev] Modifications to SLP

2015 Jul 07


Hi all!

It takes the current SLP vectorizer too long to vectorize my scalar code. I am talking here about functions that have a single, huge basic block with O(10^6) instructions. Here's an example:

    %0 = getelementptr float* %arg1, i32 49
    %1 = load float* %0
    %2 = getelementptr float* %arg1, i32 4145
    %3 = load float* %2
    %4 = getelementptr float* %arg2, i32 49
    %5 = load

Is it a valid fp transformation?

2017 Mar 20


I agree. There’s implementation-defined behavior on the conversion of (arg*58) to int, but that shouldn’t be at issue here. The transform of (float)x + 1 => (float)(x + 1) is bogus.

> On Mar 20, 2017, at 10:41 AM, Sanjay Patel via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Looks broken to me; I don't think there's UB in the original program.
>
> The fold in
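One way a fold from `(float)x + 1` to `(float)(x + 1)` can change results is through rounding, since not every 32-bit integer is representable in a 32-bit float. The sketch below is an assumption-laden illustration in Python, not the case from the thread: the helper `to_f32` is hypothetical and emulates binary32 rounding via the `struct` module, and the input value was picked by hand.

```python
import struct


def to_f32(value):
    """Round a number to the nearest IEEE-754 binary32 value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


x = 2**24 + 1  # 16777217, not exactly representable in binary32

# (float)x + 1: convert first, then add in float precision.
convert_then_add = to_f32(to_f32(x) + 1.0)

# (float)(x + 1): add in integer arithmetic, then convert.
add_then_convert = to_f32(x + 1)

print(convert_then_add, add_then_convert)
```

The two orders of operation disagree for this input, so the fold is not value-preserving in general.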

[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef

2014 Sep 22


Hi Duncan,

On 17.09.2014 21:10, Duncan Sands wrote:
> Hi Oleg,
>
> On 17/09/14 18:45, Oleg Ranevskyy wrote:
>> Hi,
>>
>> Thank you for all your helpful comments.
>>
>> To sum up, below is the list of correct folding examples for fadd:
>> (1) fadd %x, -0.0 -> %x
>> (2) fadd undef, undef -> undef
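Folding example (1) uses -0.0 rather than +0.0 because of signed zeros: under the default IEEE-754 rounding mode, adding +0.0 to -0.0 yields +0.0, so only -0.0 is an identity for every input. A minimal sketch in Python follows; the helper name `is_negative_zero` is hypothetical and Python floats stand in for LLVM's floating-point types.

```python
import math


def is_negative_zero(value):
    """True only for -0.0 (a zero whose sign bit is set)."""
    return value == 0.0 and math.copysign(1.0, value) < 0


x = -0.0

plus_negative_zero = x + (-0.0)  # sign of x is preserved
plus_positive_zero = x + 0.0     # sign of x is lost

print(is_negative_zero(plus_negative_zero), is_negative_zero(plus_positive_zero))
```

So `fadd %x, +0.0 -> %x` would be wrong for `%x = -0.0` unless signed zeros may be ignored.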

[LLVMdev] loop vectorizer erroneously finds 256 bit vectors

2013 Nov 10


I looked more into this. For the previously sent IR the vector width of 256 bit is found mistakenly (and reproducibly) on this hardware:

    model name : Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz

For the same IR the loop vectorizer finds the correct vector width (128 bit) on:

    model name : Intel(R) Xeon(R) CPU E5630 @ 2.53GHz
    model name : Intel(R) Core(TM) i7 CPU M 640 @

how to simplify FP ops with an undef operand?

2018 Feb 28


    %y = fadd float %x, undef

Can we simplify this? Currently in IR, we do nothing for fadd/fsub/fmul. For fdiv/frem, we propagate undef. The code comment for fdiv/frem says: "the undef could be a snan". If that's correct, then shouldn't it be the same for fadd/fsub/fmul? But this can't be correct because we support targets that don't raise exceptions...and even targets

[LLVMdev] loop vectorizer erroneously finds 256 bit vectors

2013 Nov 10


Hi Frank,

I'm not an Intel expert, but it seems that your Xeon E5 supports AVX, which does have 256-bit vectors. The other two only support SSE instructions, which are only 128 bits wide.

cheers,
--renato

On 10 November 2013 06:05, Frank Winter <fwinter at jlab.org> wrote:
> I looked more into this. For the previously sent IR the vector width of
> 256 bit is found mistakenly

[LLVMdev] Issue with Machine Verifier and earlyclobber

2012 Jul 15


On Jul 15, 2012, at 9:20 AM, Borja Ferrer <borja.ferav at gmail.com> wrote:
> Jakob, one more hint, I've placed some asserts around the code you added and noticed that the InlineSpiller::insertReload() function is not being called.
>
> 2012/7/14 Borja Ferrer <borja.ferav at gmail.com>
> Hello Jakob,
>
> I'm still getting the error, I can give you any other

[RFC] Changes to llvm.experimental.vector.reduce intrinsics

2019 Apr 05


On 05/04/2019 09:37, Simon Pilgrim via llvm-dev wrote:
> On 04/04/2019 14:11, Sander De Smalen wrote:
>> Proposed change:
>>
>> ----------------------------
>>
>> In this RFC I propose changing the intrinsics for
>> llvm.experimental.vector.reduce.fadd and
>> llvm.experimental.vector.reduce.fmul (see options A and B). I also
>> propose renaming

Condition code in DAGCombiner::visitFADDForFMACombine?

2018 Aug 22


On 22.08.2018 13:29, Ryan Taylor wrote:
> The example starts as SPIR-V with the NoContraction decoration flag on
> the fmul.
>
> I think what you are saying seems valid in that if the user had put the
> flag on the fadd instead of the fmul it would not contract and so in
> this example the user needs to put the NoContraction on the fadd though
> I'm not sure
