similar to: [LLVMdev] Miscompilation on MingW32

Displaying 20 results from an estimated 3000 matches similar to: "[LLVMdev] Miscompilation on MingW32"

2011 Nov 11
3
[LLVMdev] Misaligned SSE store problem (with reduced source)
Using LLVM 2.9, the following LLVM IR produces invalid x86 32 bit assembly (a misaligned SSE store). ; ModuleID = 'MisalignedStore' define void @MisalignedStore() nounwind readnone { entry: %v = alloca <4 x float>, align 16 store <4 x float> zeroinitializer, <4 x float>* %v, align 16 br label %post-block post-block: %f = alloca float ret void } If I feed
2011 Nov 11
0
[LLVMdev] Misaligned SSE store problem (with reduced source)
On Thu, Nov 10, 2011 at 6:13 PM, Aaron Dwyer <Aaron.Dwyer at imgtec.com> wrote: > Using LLVM 2.9, the following LLVM IR produces invalid x86 32 bit assembly > (a misaligned SSE store). > ; ModuleID = 'MisalignedStore' > define void @MisalignedStore() nounwind readnone { > entry: >   %v = alloca <4 x float>, align 16 >   store <4 x float>
2009 Aug 25
2
[LLVMdev] Build issues on Solaris
On 19/08/2009, at 4:00 AM, Anton Korobeynikov wrote: > Hello, Nathan > >> or if it should be a configure test, which might be safer. Are there >> any x86 platforms (other than apple) that don't need PLT-indirect >> calls? > Yes, mingw. However just tweaking the define is not enough - we're not Ok, so configure might be the way to go then, maybe something
2012 Mar 20
0
[LLVMdev] Runtime linker issue wtih X11R6 on i386 with -O3 optimization
I was told that my writeup lacked an example and details so I reproduced the code that X uses and I was able to boil down the issue to a couple of lines of code. Sorry again for the length of this email. Code was compiled on OpenBSD with clang 3.0-release. ======================================================================== With -O0 which works as X expects:
2016 Apr 04
2
How to call an (x86) cleanup/catchpad funclet
I've modified llvm to emit vc++ compatible SEH structures for my personality on x86/Windows and my handler works fine, but the only thing I can't figure out is how to call these funclets, they look like: Catch: "?catch$3@?0?m3 at 4HA": LBB4_3: # %BasicBlock26 pushl %ebp pushl %eax addl $12, %ebp movl %esp, -28(%ebp) movl $LBB4_5, %eax
2014 Dec 21
2
[LLVMdev] [RFC] [X86] Mov to push transformation in x86-32 call sequences
Which performance guidelines are you referring to? I'm not that familiar with decade-old CPUs, but to the best of my knowledge, this is not true on current hardware. There is one specific circumstance where PUSHes should be avoided - for Atom/Silvermont processors, the memory form of PUSH is inefficient, so the register-freeing optimization below may not be profitable (see 14.3.3.6 and
2006 Jun 26
0
[klibc 24/43] i386 support for klibc
The parts of klibc specific to the i386 architecture. Signed-off-by: H. Peter Anvin <hpa at zytor.com> --- commit bd0599e5290ca1a16bb7a68f7c362d395c612eb3 tree 8f33afdd02a14c22e7a3984da2bad13184e3f729 parent 84f6a72f42cf41e32daa59871a0b5424572093e4 author H. Peter Anvin <hpa at zytor.com> Sun, 25 Jun 2006 16:58:21 -0700 committer H. Peter Anvin <hpa at zytor.com> Sun, 25 Jun
2012 Dec 01
0
[LLVMdev] operator overloading fails while debugging with gdb for i386
Problem seems not only with operator overloading, It occurs with struct value returning also. gdb while debugging expects the return value in eax, gcc does returns in eax, But Clang returns in edx(it can be checked in gdb by printing the contents of edx). Code(sample code) struct A1 { int x; int y; }; A1 sum(const A1 one, const A1 two) { A1 plus = {0,0}; plus.x = one.x + two.x; plus.y
2012 Apr 05
0
[LLVMdev] Suboptimal code due to excessive spilling
I don't know much about this, but maybe -mllvm -unroll-count=1 can be used as a workaround? /Patrik Hägglund -----Original Message----- From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Brent Walker Sent: den 28 mars 2012 03:18 To: llvmdev Subject: [LLVMdev] Suboptimal code due to excessive spilling Hi, I have run into the following strange behavior
2007 Apr 18
1
[PATCH] (with benchmarks) binary patching of paravirt_ops call sites
Hi all, Sorry for the delay. This implements binary patching of call sites for interrupt-related paravirt ops, since no-doubt Andi wasn't the only one to believe this approach is slow. The benchmarks were done on a UP 3GHz Pentium 4 with 512MB of RAM. 2.6.17-rc4 vs 2.6.17-rc4 with CONFIG_PARAVIRT=y vs 2.6.17-rc4 CONFIG_PARAVIRT=y with patch. Summary: with binary patching, the difference
2007 Apr 18
1
[PATCH] (with benchmarks) binary patching of paravirt_ops call sites
Hi all, Sorry for the delay. This implements binary patching of call sites for interrupt-related paravirt ops, since no-doubt Andi wasn't the only one to believe this approach is slow. The benchmarks were done on a UP 3GHz Pentium 4 with 512MB of RAM. 2.6.17-rc4 vs 2.6.17-rc4 with CONFIG_PARAVIRT=y vs 2.6.17-rc4 CONFIG_PARAVIRT=y with patch. Summary: with binary patching, the difference
2012 Mar 28
2
[LLVMdev] Suboptimal code due to excessive spilling
Hi, I have run into the following strange behavior and wanted to ask for some advice. For the C program below, function sum() gets inlined in foo() but the code generated looks very suboptimal (the code is an extract from a larger program). Below I show the 32-bit x86 assembly as produced by the demo page on the llvm home page ("Output A"). As you can see from the assembly, after
2012 Dec 02
0
[LLVMdev] operator overloading fails while debugging with gdb for i386
Hi, As you told that function ends up returning void, I just confirmed it in the IR, the function is defined as: define *void* @_Z3sum2A1S_(*%struct.A1* noalias sret %agg.result*, %struct.A1* byval align 4 %one, %struct.A1* byval align 4 %two). But when i checked the register values in g++, eax contains an address of stack, which points to the value (object) returned by sum. That is if we
2013 Dec 13
0
[LLVMdev] GVNPRE /PRE is not effective
Hi All, The PRE or GVNPRE is not effective for the below use case. int sum; int phi =30; void f (int i, int *a) { if ((a[i] << (1)) > -15) sum =(phi+ 0x7fffffffL )/ a[i]; if ((a[i] << (2)) > -15) sum =(phi + 0x7fffffffL) /a[i]; } respective asm (clang on trunk ) #clang -O3 -S test.c BB#0: # %entry pushl %edi pushl %esi
2007 Apr 18
1
Patch: use .pushsection/.popsection
I think this might fix the X bug... -------------- next part -------------- diff -r e698e6ee2fa1 arch/i386/kernel/entry.S --- a/arch/i386/kernel/entry.S Tue Aug 08 10:18:34 2006 -0700 +++ b/arch/i386/kernel/entry.S Tue Aug 08 10:36:17 2006 -0700 @@ -162,17 +162,17 @@ 2: popl %es; \ 2: popl %es; \ CFI_ADJUST_CFA_OFFSET -4;\ /*CFI_RESTORE es;*/\ -.section .fixup,"ax"; \ +.pushsection
2007 Apr 18
1
Patch: use .pushsection/.popsection
I think this might fix the X bug... -------------- next part -------------- diff -r e698e6ee2fa1 arch/i386/kernel/entry.S --- a/arch/i386/kernel/entry.S Tue Aug 08 10:18:34 2006 -0700 +++ b/arch/i386/kernel/entry.S Tue Aug 08 10:36:17 2006 -0700 @@ -162,17 +162,17 @@ 2: popl %es; \ 2: popl %es; \ CFI_ADJUST_CFA_OFFSET -4;\ /*CFI_RESTORE es;*/\ -.section .fixup,"ax"; \ +.pushsection
2012 Dec 01
2
[LLVMdev] operator overloading fails while debugging with gdb for i386
Hi, Structures are passed by pointer, so the return value is not actually in eax. That code gets transformed into something like: void sum(A1 *out, const A1 one, const A1 two) { out->x = one.x + two.x out->y = one.y + two.y } So actually the function ends up returning void and operating on a hidden parameter, so %eax is dead at the end of the function and should not be being relied
2011 Mar 24
2
[LLVMdev] GCC vs. LLVM difference on simple code example
Hi, I have a question on why gcc and llvm-gcc compile the following simple code snippet differently: extern int a; extern int *b; void foo() { int i; for (i = 1; i < 100; ++i) a += b[i]; } gcc compiles this function hoisting the load of the global variable "b" outside of the loop, while llvm-gcc keeps it inside the loop. This results in slower code on the part of
2014 Dec 21
5
[LLVMdev] [RFC] [X86] Mov to push transformation in x86-32 call sequences
Hello all, In r223757 I've committed a patch that performs, for the 32-bit x86 calling convention, the transformation of MOV instructions that push function arguments onto the stack into actual PUSH instructions. For example, it will transform this: subl $16, %esp movl $4, 12(%esp) movl $3, 8(%esp) movl $2, 4(%esp) movl $1, (%esp) calll _func addl $16, %esp
2007 May 21
2
changing definition of paravirt_ops.iret
I'm implementing a more efficient version of the Xen iret paravirt_op, so that it can use the real iret instruction where possible. I really need to get access to per-cpu variables, so I can set the event mask state in the vcpu_info structure, but unfortunately at the point where INTERRUPT_RETURN is used in entry.S, the usermode %fs has already been restored. How would you feel if we changed