Displaying 17 results from an estimated 17 matches for "perfwizard".
2013 Nov 16
2
[LLVMdev] struct with signed bitfield (PR17827)
...nsigned or the shl had a nsw flag, I think this
would be okay. Since none of these is true, I don't think this
transformation is correct.
H.
On Sat, Nov 16, 2013 at 1:41 AM, Mark Lacey <mark.lacey at apple.com> wrote:
>
> On Nov 15, 2013, at 3:42 PM, Kay Tiong Khoo <kkhoo at perfwizard.com> wrote:
>
> I've been diagnosing this bug:
> http://llvm.org/bugs/show_bug.cgi?id=17827
>
> Summary: I think the following program miscompiles at -O1 because the fact
> that 'f0' is a signed 3-bit value is lost in the unoptimized LLVM IR. How
> do we fix this...
2013 Nov 16
0
[LLVMdev] struct with signed bitfield (PR17827)
On Nov 15, 2013, at 3:42 PM, Kay Tiong Khoo <kkhoo at perfwizard.com> wrote:
> I've been diagnosing this bug:
> http://llvm.org/bugs/show_bug.cgi?id=17827
>
> Summary: I think the following program miscompiles at -O1 because the fact that 'f0' is a signed 3-bit value is lost in the unoptimized LLVM IR. How do we fix this?
I don’t ha...
2013 Dec 19
0
[LLVMdev] LLVM ARM VMLA instruction
Just to clarify: gcc 4.8.1 generates that fma at -O2; no FP relaxation or
other flags specified.
On Wed, Dec 18, 2013 at 6:02 PM, Kay Tiong Khoo <kkhoo at perfwizard.com> wrote:
> Thanks for the explanation, Tim!
>
> gcc 4.8.1 *does* generate an fma for your code example for an x86 target
> that supports fma. I'd bet that the HW vendors' compilers do the same, but
> I don't have any of those installed at the moment to test that the...
2013 Nov 15
4
[LLVMdev] struct with signed bitfield (PR17827)
I've been diagnosing this bug:
http://llvm.org/bugs/show_bug.cgi?id=17827
Summary: I think the following program miscompiles at -O1 because the fact
that 'f0' is a signed 3-bit value is lost in the unoptimized LLVM IR. How
do we fix this?
$ cat bitfield.c
/* %struct.S = type { i8, [3 x i8] } ??? */
struct S {
  int f0:3;
} a;
int foo (int p) {
  struct S c = a;
  c.f0 = p & 6;
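
To make the "signed 3-bit value" point concrete, here is a minimal sketch. It is not the reporter's test case, which is cut off in this listing. The names S3 and store_and_reload are hypothetical, and the field is declared `signed int` explicitly because the signedness of a plain `int` bit-field is implementation-defined. Converting an out-of-range value into the field is also implementation-defined; the comments describe what gcc and clang do on two's-complement targets.

```c
/* Hypothetical struct mirroring the 3-bit signed field from the report. */
struct S3 {
    signed int f0 : 3;   /* representable range is -4..3 */
};

/* Store v into the 3-bit field, then read it back as an int.
   The read sign-extends from bit 2, so the sign of the 3-bit value
   must survive in whatever IR the compiler generates. */
int store_and_reload(int v) {
    struct S3 s;
    s.f0 = v;    /* gcc and clang keep only the low 3 bits here */
    return s.f0;
}
```

With `p & 6` the only possible inputs are 0, 2, 4 and 6. On gcc and clang, 4 and 6 do not fit in the field and read back as -4 and -2, which is why losing the signedness of 'f0' changes the result.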
2013 Dec 19
3
[LLVMdev] LLVM ARM VMLA instruction
...meone please clarify on this
point? The performance gain with the vmla instruction is huge. Somewhere I read
that LLVM prefers precision accuracy over performance. Is this true and
hence LLVM is not emitting vmla instructions for cortex-a8?
On Thu, Dec 19, 2013 at 6:41 AM, Kay Tiong Khoo <kkhoo at perfwizard.com> wrote:
> Just to clarify: gcc 4.8.1 generates that fma at -O2; no FP relaxation or
> other flags specified.
>
>
> On Wed, Dec 18, 2013 at 6:02 PM, Kay Tiong Khoo <kkhoo at perfwizard.com> wrote:
>
>> Thanks for the explanation, Tim!
>>
>> gcc 4.8.1 *...
2013 Dec 19
2
[LLVMdev] LLVM ARM VMLA instruction
Thanks for the explanation, Tim!
gcc 4.8.1 *does* generate an fma for your code example for an x86 target
that supports fma. I'd bet that the HW vendors' compilers do the same, but
I don't have any of those installed at the moment to test that theory. So
this is a bug in those compilers? Do you know how they justify it?
I see section 6.5 "Expressions" in the C standard, and
2013 Nov 16
0
[LLVMdev] struct with signed bitfield (PR17827)
...this would be okay. Since none of these is true, I don't think this
> transformation is correct.
>
> H.
>
>
>
> On Sat, Nov 16, 2013 at 1:41 AM, Mark Lacey <mark.lacey at apple.com> wrote:
>
>>
>> On Nov 15, 2013, at 3:42 PM, Kay Tiong Khoo <kkhoo at perfwizard.com> wrote:
>>
>> I've been diagnosing this bug:
>> http://llvm.org/bugs/show_bug.cgi?id=17827
>>
>> Summary: I think the following program miscompiles at -O1 because the
>> fact that 'f0' is a signed 3-bit value is lost in the unoptimized LLVM IR....
2014 Jan 14
2
[LLVMdev] Some bugs in x86 disasm (llvm-mc)
On Thu, Nov 28, 2013 at 1:03 AM, Kay Tiong Khoo <kkhoo at perfwizard.com> wrote:
> Hi Jun,
>
> I'm not sure how to fix this yet, but this looks incorrectly defined in
> lib/Target/X86/X86InstrInfo.td:
>
> def MOV32o32a : Ii32 <0xA1, RawFrm, (outs), (ins offset32:$src),
> "mov{l}\t{$src, %eax|eax, $src}",...
2013 Dec 20
2
[LLVMdev] Commutability of X86 FMA3 instructions.
...double foo(double a, double b, double c) {
  return a * b + c;
}
Which will now require a vmovaps + vfmadd231.
If this impacts real benchmarks we could add an optimization to change
the FMA variant based on how it's used.
- Lang.
On Fri, Dec 20, 2013 at 8:29 AM, Kay Tiong Khoo <kkhoo at perfwizard.com> wrote:
> Hi Lang,
>
> Unfortunately, I don't have an answer on the commutability question, but I
> wanted to let you know that I filed a bug on this:
> http://llvm.org/bugs/show_bug.cgi?id=17229
>
> This also shows a memory operand variant of the fma that you may wa...
2013 Dec 23
2
[LLVMdev] Commutability of X86 FMA3 instructions.
...> return a * b + c;
> }
>
> Which will now require a vmovaps + vfmadd231.
>
> If this impacts real benchmarks we could add an optimization to change the FMA variant based on how it's used.
>
> - Lang.
>
> On Fri, Dec 20, 2013 at 8:29 AM, Kay Tiong Khoo <kkhoo at perfwizard.com> wrote:
>> Hi Lang,
>>
>> Unfortunately, I don't have an answer on the commutability question,
>> but I wanted to let you know that I filed a bug on this:
>> http://llvm.org/bugs/show_bug.cgi?id=17229
>>
>> This also shows a memory operand varian...
2013 Nov 27
0
[LLVMdev] Some bugs in x86 disasm (llvm-mc)
Hi Jun,
I'm not sure how to fix this yet, but this looks incorrectly defined in
lib/Target/X86/X86InstrInfo.td:
def MOV32o32a : Ii32 <0xA1, RawFrm, (outs), (ins offset32:$src),
"mov{l}\t{$src, %eax|eax, $src}", [], IIC_MOV_MEM>,
Requires<[In32BitMode]>;
This instruction can be REX-prefixed for a 64-bit move, and that also
2013 Nov 27
3
[LLVMdev] Some bugs in x86 disasm (llvm-mc)
Hi,
With objdump, I have this (Intel syntax):
64 a1 00 00 00 00 mov eax,fs:0x0
However, if I pass above string to llvm-mc, I would have:
$ echo "0x64 0xa1 0x00 0x00 0x00 0x00"|./Release+Asserts/bin/llvm-mc
-disassemble -arch=x86 --output-asm-variant=1
.text
mov eax, dword ptr [0]
You can see a big difference. This is on the latest code. Any idea how to
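
For readers unfamiliar with the encoding, the six bytes in this report are a segment-override prefix (0x64, fs) followed by opcode 0xA1 (mov eax, moffs32) and a 32-bit little-endian offset. The sketch below decodes only that one form in 32-bit mode. It is an illustration, not llvm-mc's or objdump's implementation; decode_mov_eax_moffs32 and segment_prefix_name are hypothetical names, and the output string merely imitates the objdump line quoted in the message.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Map a legacy segment-override prefix byte to its register name, or NULL. */
static const char *segment_prefix_name(uint8_t byte) {
    switch (byte) {
    case 0x26: return "es";
    case 0x2e: return "cs";
    case 0x36: return "ss";
    case 0x3e: return "ds";
    case 0x64: return "fs";
    case 0x65: return "gs";
    default:   return NULL;
    }
}

/* Hypothetical helper: decode "[segment prefix] A1 offset32" as seen in
   32-bit mode. Returns the number of bytes consumed, or 0 if the input
   is not this form or the output buffer is too small. */
size_t decode_mov_eax_moffs32(const uint8_t *bytes, size_t len,
                              char *out, size_t out_size) {
    size_t i = 0;
    const char *seg = NULL;

    /* An optional segment override comes before the opcode. */
    if (len > 0 && (seg = segment_prefix_name(bytes[0])) != NULL)
        i = 1;
    if (len < i + 5 || bytes[i] != 0xA1)
        return 0;

    /* The absolute offset follows the opcode, least significant byte first. */
    uint32_t offset = (uint32_t)bytes[i + 1]
                    | ((uint32_t)bytes[i + 2] << 8)
                    | ((uint32_t)bytes[i + 3] << 16)
                    | ((uint32_t)bytes[i + 4] << 24);

    /* Without a prefix the access defaults to ds. */
    int n = snprintf(out, out_size, "mov eax,%s:0x%x",
                     seg ? seg : "ds", (unsigned)offset);
    if (n < 0 || (size_t)n >= out_size)
        return 0;
    return i + 5;
}
```

The point of the sketch is that the prefix byte has to be carried into the printed operand; dropping it is what turns an fs-relative load into a plain `dword ptr [0]`.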
2013 Dec 20
0
[LLVMdev] Commutability of X86 FMA3 instructions.
Hi Lang,
Unfortunately, I don't have an answer on the commutability question, but I
wanted to let you know that I filed a bug on this:
http://llvm.org/bugs/show_bug.cgi?id=17229
This also shows a memory operand variant of the fma that you may want to
consider in your patch and testcases.
Thanks!
On Thu, Dec 19, 2013 at 10:45 PM, Lang Hames <lhames at gmail.com> wrote:
> Hi all,
2013 Dec 20
2
[LLVMdev] Commutability of X86 FMA3 instructions.
Hi all,
The 213 variant of the FMA3 instructions is currently marked
commutable (see X86InstrFMA.td). Is that safe? According to the ISA
the FMA3 instructions aren't commutable for non-numeric results, so
I'd have thought commuting this would only be valid in fast-math mode?
For the curious, the reason that I'm asking is that we currently
always select the 213 variant, but this
2013 Nov 05
0
[LLVMdev] add "3.3" to the bugzilla version list for all components?
Recently, I've seen a few bugs filed against llvm "trunk" because "3.3"
isn't in the list of versions in LLVM bugzilla. This is causing confusion
for customers and wasting time for devs.
"3.3" does exist for the clang product, but nowhere else it seems.
Can someone with bugzilla admin power add "3.3" as a version for other
products and the
2013 Nov 27
0
[LLVMdev] Some bugs in x86 disasm (llvm-mc)
Thanks, Tim!
As Craig noted:
http://llvm.org/bugs/show_bug.cgi?id=16962#c1
"There are many things wrong with these instructions."
:)
On Wed, Nov 27, 2013 at 10:17 AM, Tim Northover <t.p.northover at gmail.com> wrote:
> > I would file a bugzilla in the x86 component and cc Craig Topper, the x86
> > disasm/codegen expert.
>
> If you chase down the revision
2013 Dec 18
2
[LLVMdev] LLVM ARM VMLA instruction
> "-ffp-contract=fast" is needed
Correct - clang is different than gcc, icc, msvc, xlc, etc. on this. Still
haven't seen any explanation for how this is better though...
http://llvm.org/bugs/show_bug.cgi?id=17188
http://llvm.org/bugs/show_bug.cgi?id=17211
On Wed, Dec 18, 2013 at 6:02 AM, Tim Northover <t.p.northover at gmail.com> wrote:
> > I believe that's the