Displaying 17 results from an estimated 17 matches for "kkhoo".

2013 Nov 16
2
[LLVMdev] struct with signed bitfield (PR17827)
...on were unsigned or the shl had an nsw flag, I think this would be okay. Since none of these is true, I don't think this transformation is correct. H. On Sat, Nov 16, 2013 at 1:41 AM, Mark Lacey <mark.lacey at apple.com> wrote: > > On Nov 15, 2013, at 3:42 PM, Kay Tiong Khoo <kkhoo at perfwizard.com> wrote: > > I've been diagnosing this bug: > http://llvm.org/bugs/show_bug.cgi?id=17827 > > Summary: I think the following program miscompiles at -O1 because the fact > that 'f0' is a signed 3-bit value is lost in the unoptimized LLVM IR. How >...
2013 Nov 16
0
[LLVMdev] struct with signed bitfield (PR17827)
On Nov 15, 2013, at 3:42 PM, Kay Tiong Khoo <kkhoo at perfwizard.com> wrote: > I've been diagnosing this bug: > http://llvm.org/bugs/show_bug.cgi?id=17827 > > Summary: I think the following program miscompiles at -O1 because the fact that 'f0' is a signed 3-bit value is lost in the unoptimized LLVM IR. How do we fix thi...
2013 Dec 19
0
[LLVMdev] LLVM ARM VMLA instruction
Just to clarify: gcc 4.8.1 generates that fma at -O2; no FP relaxation or other flags specified. On Wed, Dec 18, 2013 at 6:02 PM, Kay Tiong Khoo <kkhoo at perfwizard.com> wrote: > Thanks for the explanation, Tim! > > gcc 4.8.1 *does* generate an fma for your code example for an x86 target > that supports fma. I'd bet that the HW vendors' compilers do the same, but > I don't have any of those installed at the moment to...
2013 Nov 15
4
[LLVMdev] struct with signed bitfield (PR17827)
I've been diagnosing this bug: http://llvm.org/bugs/show_bug.cgi?id=17827 Summary: I think the following program miscompiles at -O1 because the fact that 'f0' is a signed 3-bit value is lost in the unoptimized LLVM IR. How do we fix this? $ cat bitfield.c /* %struct.S = type { i8, [3 x i8] } ??? */ struct S { int f0:3; } a; int foo (int p) { struct S c = a; c.f0 = p & 6;
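For intuition about why the signedness of 'f0' matters, here is a minimal sketch (not taken from the thread or from LLVM) of the value a signed 3-bit field reads back. The helper name `sext3` is hypothetical. A signed 3-bit field holds -4..3, while `p & 6` can produce 6, so the stored bits `110` should read back as -2 on a two's-complement implementation. The C standard leaves the out-of-range store itself implementation-defined, so the helper does the sign extension by hand instead of relying on a real bitfield.

```c
/* Hypothetical helper: sign-extend the low 3 bits of v, i.e. the value
   that a signed 3-bit field holding those bits reads back as. */
int sext3(int v) {
    v &= 0x7;               /* keep only the 3 stored bits */
    return (v ^ 0x4) - 0x4; /* flip the sign bit, then subtract its weight */
}
```

If an optimizer forgets that the field is signed, it would treat the stored 6 as 6 rather than -2, which is the kind of mismatch the report describes.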
2013 Dec 19
3
[LLVMdev] LLVM ARM VMLA instruction
...n? Can someone please clarify this point? The performance gain with the vmla instruction is huge. Somewhere I read that LLVM prefers precision over performance. Is this true, and is that why LLVM is not emitting vmla instructions for Cortex-A8? On Thu, Dec 19, 2013 at 6:41 AM, Kay Tiong Khoo <kkhoo at perfwizard.com> wrote: > Just to clarify: gcc 4.8.1 generates that fma at -O2; no FP relaxation or > other flags specified. > > > On Wed, Dec 18, 2013 at 6:02 PM, Kay Tiong Khoo <kkhoo at perfwizard.com> wrote: > >> Thanks for the explanation, Tim! >> >>...
2013 Dec 19
2
[LLVMdev] LLVM ARM VMLA instruction
Thanks for the explanation, Tim! gcc 4.8.1 *does* generate an fma for your code example for an x86 target that supports fma. I'd bet that the HW vendors' compilers do the same, but I don't have any of those installed at the moment to test that theory. So this is a bug in those compilers? Do you know how they justify it? I see section 6.5 "Expressions" in the C standard, and
2013 Nov 16
0
[LLVMdev] struct with signed bitfield (PR17827)
...> think this would be okay. Since none of these is true, I don't think this > transformation is correct. > > H. > > > > On Sat, Nov 16, 2013 at 1:41 AM, Mark Lacey <mark.lacey at apple.com> wrote: > >> >> On Nov 15, 2013, at 3:42 PM, Kay Tiong Khoo <kkhoo at perfwizard.com> wrote: >> >> I've been diagnosing this bug: >> http://llvm.org/bugs/show_bug.cgi?id=17827 >> >> Summary: I think the following program miscompiles at -O1 because the >> fact that 'f0' is a signed 3-bit value is lost in the unopti...
2014 Jan 14
2
[LLVMdev] Some bugs in x86 disasm (llvm-mc)
On Thu, Nov 28, 2013 at 1:03 AM, Kay Tiong Khoo <kkhoo at perfwizard.com> wrote: > Hi Jun, > > I'm not sure how to fix this yet, but this looks incorrectly defined in > lib/Target/X86/X86InstrInfo.td: > > def MOV32o32a : Ii32 <0xA1, RawFrm, (outs), (ins offset32:$src), > "mov{l}\t{$src, %eax|eax...
2013 Dec 20
2
[LLVMdev] Commutability of X86 FMA3 instructions.
...de like: double foo(double a, double b, double c) { return a * b + c; } Which will now require a vmovaps + vfmadd231. If this impacts real benchmarks we could add an optimization to change the FMA variant based on how it's used. - Lang. On Fri, Dec 20, 2013 at 8:29 AM, Kay Tiong Khoo <kkhoo at perfwizard.com> wrote: > Hi Lang, > > Unfortunately, I don't have an answer on the commutability question, but I > wanted to let you know that I filed a bug on this: > http://llvm.org/bugs/show_bug.cgi?id=17229 > > This also shows a memory operand variant of the fma t...
2013 Dec 23
2
[LLVMdev] Commutability of X86 FMA3 instructions.
...e c) { > return a * b + c; > } > > Which will now require a vmovaps + vfmadd231. > > If this impacts real benchmarks we could add an optimization to change the FMA variant based on how it's used. > > - Lang. > > On Fri, Dec 20, 2013 at 8:29 AM, Kay Tiong Khoo <kkhoo at perfwizard.com> wrote: >> Hi Lang, >> >> Unfortunately, I don't have an answer on the commutability question, >> but I wanted to let you know that I filed a bug on this: >> http://llvm.org/bugs/show_bug.cgi?id=17229 >> >> This also shows a memory...
2013 Nov 27
0
[LLVMdev] Some bugs in x86 disasm (llvm-mc)
Hi Jun, I'm not sure how to fix this yet, but this looks incorrectly defined in lib/Target/X86/X86InstrInfo.td: def MOV32o32a : Ii32 <0xA1, RawFrm, (outs), (ins offset32:$src), "mov{l}\t{$src, %eax|eax, $src}", [], IIC_MOV_MEM>, Requires<[In32BitMode]>; This instruction can be REX-prefixed for a 64-bit move, and that also
2013 Nov 27
3
[LLVMdev] Some bugs in x86 disasm (llvm-mc)
Hi, With objdump, I get this (Intel syntax): 64 a1 00 00 00 00 mov eax,fs:0x0 However, if I pass the same bytes to llvm-mc, I get: $ echo "0x64 0xa1 0x00 0x00 0x00 0x00"|./Release+Asserts/bin/llvm-mc -disassemble -arch=x86 --output-asm-variant=1 .text mov eax, dword ptr [0] You can see a big difference. This is on the latest code. Any idea how to
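To make the reported difference concrete, here is a toy decoder sketch for just this encoding in 32-bit mode: an optional segment-override prefix (0x64 selects FS) followed by opcode 0xA1 and a 32-bit little-endian offset. It is an illustration only, not LLVM's disassembler; both function names are hypothetical, and the output format loosely imitates the objdump line quoted above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: map a segment-override prefix byte to its register
   name, or return NULL if the byte is not such a prefix. */
static const char *segment_prefix_name(uint8_t byte) {
    switch (byte) {
    case 0x26: return "es";
    case 0x2E: return "cs";
    case 0x36: return "ss";
    case 0x3E: return "ds";
    case 0x64: return "fs";
    case 0x65: return "gs";
    default:   return NULL;
    }
}

/* Decode "[segment prefix] A1 offset32" as seen in 32-bit mode.
   Writes Intel-syntax text to out; returns bytes consumed, or 0 on failure. */
size_t decode_mov_eax_moffs32(const uint8_t *buf, size_t len,
                              char *out, size_t out_len) {
    size_t i = 0;
    const char *seg = NULL;

    if (i < len && (seg = segment_prefix_name(buf[i])) != NULL)
        i++;                        /* consume the prefix byte */
    if (len - i < 5 || buf[i] != 0xA1)
        return 0;                   /* not the encoding handled here */
    if (seg == NULL)
        seg = "ds";                 /* default segment for this form */

    /* Assemble the little-endian 32-bit offset. */
    uint32_t off = (uint32_t)buf[i + 1]
                 | (uint32_t)buf[i + 2] << 8
                 | (uint32_t)buf[i + 3] << 16
                 | (uint32_t)buf[i + 4] << 24;

    snprintf(out, out_len, "mov eax, %s:0x%x", seg, (unsigned)off);
    return i + 5;
}
```

Fed the six bytes from the report, this sketch consumes all of them and produces `mov eax, fs:0x0`, so the `fs` override is preserved. The llvm-mc output quoted above has no segment in it, which is the difference being reported.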
2013 Dec 20
0
[LLVMdev] Commutability of X86 FMA3 instructions.
Hi Lang, Unfortunately, I don't have an answer on the commutability question, but I wanted to let you know that I filed a bug on this: http://llvm.org/bugs/show_bug.cgi?id=17229 This also shows a memory operand variant of the fma that you may want to consider in your patch and testcases. Thanks! On Thu, Dec 19, 2013 at 10:45 PM, Lang Hames <lhames at gmail.com> wrote: > Hi all,
2013 Dec 20
2
[LLVMdev] Commutability of X86 FMA3 instructions.
Hi all, The 213 variant of the FMA3 instructions is currently marked commutable (see X86InstrFMA.td). Is that safe? According to the ISA the FMA3 instructions aren't commutable for non-numeric results, so I'd have thought commuting this would only be valid in fast-math mode? For the curious, the reason that I'm asking is that we currently always select the 213 variant, but this
2013 Nov 05
0
[LLVMdev] add "3.3" to the bugzilla version list for all components?
Recently, I've seen a few bugs filed against llvm "trunk" because "3.3" isn't in the list of versions in LLVM bugzilla. This is causing confusion for customers and wasting time for devs. "3.3" does exist for the clang product, but nowhere else it seems. Can someone with bugzilla admin power add "3.3" as a version for other products and the
2013 Nov 27
0
[LLVMdev] Some bugs in x86 disasm (llvm-mc)
Thanks, Tim! As Craig noted: http://llvm.org/bugs/show_bug.cgi?id=16962#c1 "There are many things wrong with these instructions." :) On Wed, Nov 27, 2013 at 10:17 AM, Tim Northover <t.p.northover at gmail.com> wrote: > > I would file a bugzilla in the x86 component and cc Craig Topper, the x86 > > disasm/codegen expert. > > If you chase down the revision
2013 Dec 18
2
[LLVMdev] LLVM ARM VMLA instruction
> "-ffp-contract=fast" is needed Correct - clang is different than gcc, icc, msvc, xlc, etc. on this. Still haven't seen any explanation for how this is better though... http://llvm.org/bugs/show_bug.cgi?id=17188 http://llvm.org/bugs/show_bug.cgi?id=17211 On Wed, Dec 18, 2013 at 6:02 AM, Tim Northover <t.p.northover at gmail.com> wrote: > > I believe that's the