thr3ads.net - search: "movl"

Function calls keep increasing the stack usage

2018 Sep 14

2

Function calls keep increasing the stack usage

...@bar > .cfi_startproc > # %bb.0: # %entry > pushq %rbp > .cfi_def_cfa_offset 16 > .cfi_offset %rbp, -16 > movq %rsp, %rbp > .cfi_def_cfa_register %rbp > subq $16, %rsp > movl $1, %edi > movl $2, %esi > callq foo > movl $3, %edi > movl $4, %esi > movl %eax, -4(%rbp) # 4-byte Spill > callq foo > movl %eax, -8(%rbp) # 4-byte Spill > addq $16, %...

Function calls keep increasing the stack usage

2018 Sep 14

6

Function calls keep increasing the stack usage

...I found that LLVM generates redundant code when calling functions with constant parameters, with optimizations disabled. Consider the following C code snippet: int foo(int x, int y); void bar() { foo(1, 2); foo(3, 4); } Clang/LLVM 6.0 generates the following assembly code: _bar: subl $32, %esp movl $1, %eax movl $2, %ecx movl $1, (%esp) movl $2, 4(%esp) movl %eax, 28(%esp) movl %ecx, 24(%esp) calll _foo movl $3, %ecx movl $4, %edx movl $3, (%esp) movl $4, 4(%esp) movl %eax, 20(%esp) movl %ecx, 16(%esp) movl %edx, 12(%esp) calll _foo movl %eax, 8(%esp) addl $32, %esp retl Note how the constan...

Boot from CD -> system + data on USB storage

2004 Nov 12

2

Boot from CD -> system + data on USB storage

Hi, I am looking for a solution to boot MY system on any PC. To store most of the system and all of my data I want to use a USB storage device (in my case an external USB hard disk, 2.0 capable). Since booting off a USB device is not universally supported, I would prefer to have a boot disk with a minimal system - just enough to load most (all?) of the system from the attached USB device. Is this an

[LLVMdev] operator overloading fails while debugging with gdb for i386

2012 Dec 01

0

[LLVMdev] operator overloading fails while debugging with gdb for i386

...onst A1 one, const A1 two) { A1 plus = {0,0}; plus.x = one.x + two.x; plus.y = one.y + two.y; return (plus); } int main (void) { A1 one= {2,3}; A1 two= {4,5}; A1 three = sum(one,two); return 0; } gcc assembly (snippet of sum function) _Z3sum2A1S_: .loc 1 8 0 pushl %ebp movl %esp, %ebp .loc 1 9 0 movl 8(%ebp), %eax movl $0, (%eax) movl 8(%ebp), %eax movl $0, 4(%eax) .loc 1 10 0 movl 12(%ebp), %edx movl 20(%ebp), %eax addl %eax, %edx movl 8(%ebp), %eax movl %edx, (%eax) .loc 1 11 0 movl...

[LLVMdev] operator overloading fails while debugging with gdb for i386

2012 Nov 29

2

[LLVMdev] operator overloading fails while debugging with gdb for i386

For the given test: class A1 { int x; int y; public: A1(int a, int b) { x=a; y=b; } A1 operator+(const A1&); }; A1 A1::operator+(const A1& second) { A1 sum(0,0); sum.x = x + second.x; sum.y = y + second.y; return (sum); } int main (void) { A1 one(2,3); A1 two(4,5); return 0; } when the executable of this code is debugged in gdb for i386, we don't get the

[GE users] Apple Leopard has dtrace -- anyone used the SGE probes/scripts yet?

2007 Nov 14

10

[GE users] Apple Leopard has dtrace -- anyone used the SGE probes/scripts yet?

Hi, Chris (cc) and I are trying to get the SGE master monitor to work with Apple Leopard dtrace. Unfortunately we are stuck with the error message below. Does anyone have an idea what the cause could be? What I can rule out as a cause is function inlining, for the reasons explained below. Background information on the SGE master monitor implementation is under http://wiki.gridengine.info/wiki/index.php/Dtrace

[LLVMdev] [RFC] [X86] Mov to push transformation in x86-32 call sequences

2014 Dec 21

5

[LLVMdev] [RFC] [X86] Mov to push transformation in x86-32 call sequences

Hello all, In r223757 I've committed a patch that performs, for the 32-bit x86 calling convention, the transformation of MOV instructions that push function arguments onto the stack into actual PUSH instructions. For example, it will transform this: subl $16, %esp movl $4, 12(%esp) movl $3, 8(%esp) movl $2, 4(%esp) movl $1, (%esp) calll _func addl $16, %esp Into this: pushl $4 pushl $3 pushl $2 pushl $1 calll _func addl $16, %esp The main motivation for this is code size (a "pushl $4" is 2 bytes, a "...
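As a rough illustration of the call sequence quoted above, the C sketch below makes a four-argument call of the kind that would produce it. Only the name func comes from the snippet; the function body, the helper call_func, and the assumption of a 32-bit x86 target that passes arguments on the stack from right to left are illustrative and not taken from the original post.

```c
/* Hypothetical callee: the excerpt only names _func. This body is an
   assumption, chosen so that the order of the arguments is observable. */
int func(int a, int b, int c, int d)
{
    return a * 1000 + b * 100 + c * 10 + d;
}

/* Hypothetical caller. On a 32-bit x86 target that passes arguments on
   the stack, the constant 4 is placed at the highest address and the
   constant 1 at the lowest, whether the compiler uses movl stores or
   pushl instructions to put them there. */
int call_func(void)
{
    return func(1, 2, 3, 4);
}
```

Compiling a file like this for a 32-bit x86 target and inspecting the generated assembly should show one of the two sequences from the excerpt, depending on the compiler version and options used.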

How to avoid register spills at wide integer addition?

2016 May 31

0

How to avoid register spills at wide integer addition?

...as %px, i512*noalias %py) { %x = load i512* %px %y = load i512* %py %z = add i512 %x, %y store i512 %z, i512* %pz ret void } >llc-3.6 -O3 -march=x86 t.ll -o - add512: pushl %ebp pushl %ebx pushl %edi pushl %esi subl $56, %esp movl 84(%esp), %eax movl 80(%esp), %edi movl 8(%edi), %esi movl (%edi), %edx movl 4(%edi), %ebx movl 60(%eax), %ecx movl %ecx, 52(%esp) # 4-byte Spill movl 56(%eax), %ecx movl %ecx, 48(%esp) # 4-...

[LLVMdev] operator overloading fails while debugging with gdb for i386

2012 Dec 01

2

[LLVMdev] operator overloading fails while debugging with gdb for i386

...onst A1 one, const A1 two) { A1 plus = {0,0}; plus.x = one.x + two.x; plus.y = one.y + two.y; return (plus); } int main (void) { A1 one= {2,3}; A1 two= {4,5}; A1 three = sum(one,two); return 0; } gcc assembly (snippet of sum function) _Z3sum2A1S_: .loc 1 8 0 pushl %ebp movl %esp, %ebp .loc 1 9 0 movl 8(%ebp), %eax movl $0, (%eax) movl 8(%ebp), %eax movl $0, 4(%eax) .loc 1 10 0 movl 12(%ebp), %edx movl 20(%ebp), %eax addl %eax, %edx movl 8(%ebp), %eax movl %edx, (%eax) .loc 1 11 0 movl...

[RFC] __builtin_constant_p() Improvements

2018 Apr 12

3

[RFC] __builtin_constant_p() Improvements

...1; return 0; } static __attribute__((always_inline)) int mux() { if (__builtin_constant_p(37)) return 927; return 0; } int bar(int a) { if (a) return foo(42); else return mux(); } Now outputs this code at -O1: bar: .cfi_startproc # %bb.0: # %entry testl %edi, %edi movl $927, %ecx # imm = 0x39F movl $1, %eax cmovel %ecx, %eax retq And this code at -O0: bar: # @bar .cfi_startproc # %bb.0: # %entry pushq %rbp .cfi_def_cfa_offset 16 .cfi_offset %rbp, -16 movq %rsp, %rbp .cfi_def_cfa_regi...

[LLVMdev] Wrong assembly is written for x86_64 target in JIT without optimization?

2011 Jan 12

2

[LLVMdev] Wrong assembly is written for x86_64 target in JIT without optimization?

...0x0000000800989c5e: retq --- result after running llvm-as and llc on the same function --- subq $56, %rsp .Ltmp0: movq %rdi, 48(%rsp) # 8-byte Spill movq %rsi, 40(%rsp) # 8-byte Spill # BB#1: # %lbl1 movl $1, %eax movl $4, %esi movl $0, %ecx movl %eax, %edi movl %esi, 36(%rsp) # 4-byte Spill movl %ecx, %esi movl %ecx, %edx movl %eax, 32(%rsp) # 4-byte Spill movl %ecx, 28(%rsp)...

[LLVMdev] Efficient Pattern matching in Instruction Combine

2014 Aug 08

4

[LLVMdev] Efficient Pattern matching in Instruction Combine

...int a, b; scanf("%d %d", &a, &b); return cal(a,b); } *X86 .s file with clang at O2 for the above program:* suyog@suyog-Inspiron-N5010:~$ Open/rbuild/bin/clang -S -O2 1.c main: # @main # BB#0: subl $28, %esp leal 20(%esp), %eax movl %eax, 8(%esp) leal 24(%esp), %eax movl %eax, 4(%esp) movl $.L.str, (%esp) calll __isoc99_scanf movl 20(%esp), %eax *orl 24(%esp), %eax* addl $28, %esp retl As seen, the optimization happened at the IR level itself, as reflected in the .s file. *GCC output f...

[LLVMdev] [RFC] [X86] Mov to push transformation in x86-32 call sequences

2014 Dec 21

2

[LLVMdev] [RFC] [X86] Mov to push transformation in x86-32 call sequences

..., Michael M wrote: Hello all, In r223757 I've committed a patch that performs, for the 32-bit x86 calling convention, the transformation of MOV instructions that push function arguments onto the stack into actual PUSH instructions. For example, it will transform this: subl $16, %esp movl $4, 12(%esp) movl $3, 8(%esp) movl $2, 4(%esp) movl $1, (%esp) calll _func addl $16, %esp Into this: pushl $4 pushl $3 pushl $2 pushl $1 calll _func addl $16, %esp The main motivation for this is code size (a "pushl $4" is 2 bytes, a "...

[LLVMdev] Miscompilation on MingW32

2008 Jun 11

0

[LLVMdev] Miscompilation on MingW32

...gw32), using the current SVN and lli or llc, it returns a random value. The assembly below is the output of llc for that target. I can clearly see the 4 allocas: x1 is at %edi, x2 is at %ebx, %retval is on the stack at -16(%ebp), and %dummy, which is unused, is at %esp after the last alloca. The first movl after the two addl instructions uses a wrong address to store the result (%esp points to %dummy, but the result should not be stored there). ------------------------------------------------------------------------ .text .align 16 .def _tmp; .scl 3; .type 32; .endef _tmp: pushl %ebp Llabel1: movl %es...

[LLVMdev] RegAllocFast uses too much stack

2011 Jul 11

4

[LLVMdev] RegAllocFast uses too much stack

...void test() { foo(0); foo(1); foo(2); } This doesn't just spill out all the registers to the stack before each call, we also set up 0, 1 and 2 into regs first, then spill them and don't even get a chance to reuse stack slots. That's just bad: pushq %rax movl $2, %edi movl $1, %eax movl $0, %ecx movl %edi, 4(%rsp) # 4-byte Spill movl %ecx, %edi movl %eax, (%rsp) # 4-byte Spill callq foo movl (%rsp), %edi # 4-byte Reload callq foo...

[LLVMdev] Runtime linker issue with X11R6 on i386 with -O3 optimization

2012 Mar 20

0

[LLVMdev] Runtime linker issue with X11R6 on i386 with -O3 optimization

...ct The relevant bits of objdump -R $ objdump -R liba.so | grep ex_func 2000211c R_386_GLOB_DAT ex_func 20002108 R_386_JUMP_SLOT ex_func Now in the asm here we can see that in the TAILCALL case the ex_func is looked up in the GOT. $ grep ex_func a.s calll ex_func@PLT movl ex_func@GOT(%esi), %eax ... jmpl *%eax # TAILCALL I am aware that this might turn a bit religious; however, I think that always looking up the function pointer in the PLT would eliminate this issue. I don't see a benefit of using the GOT in this particular case. The asm of...

[newbie] trouble with global variables and CreateLoad/Store in JIT

2017 Jun 06

2

[newbie] trouble with global variables and CreateLoad/Store in JIT

It's useful to know that the static compilation code path works. Furthermore, as expected from that: 52: c7 05 04 00 00 00 d5 00 00 00 movl $213, 4 00000054: IMAGE_REL_I386_DIR32 _foo It looks like the offset `4` of the second field of your struct is correct in the object file, so this does seem to be a problem in the JIT-specific linking/loading. Can you try debugging into lib/ExecutionEngine/RuntimeDyld/...

[PATCH v5 32/75] x86/head/64: Load segment registers earlier

2020 Jul 24

0

[PATCH v5 32/75] x86/head/64: Load segment registers earlier

...rch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S index f958d4e4ee08..057c7bd3eeb6 100644 --- a/arch/x86/kernel/head_64.S +++ b/arch/x86/kernel/head_64.S @@ -174,6 +174,32 @@ SYM_CODE_START(secondary_startup_64) */ lgdt early_gdt_descr(%rip) + /* set up data segments */ + xorl %eax,%eax + movl %eax,%ds + movl %eax,%ss + movl %eax,%es + + /* + * We don't really need to load %fs or %gs, but load them anyway + * to kill any stale realmode selectors. This allows execution + * under VT hardware. + */ + movl %eax,%fs + movl %eax,%gs + + /* Set up %gs. + * + * The base of %gs always...

Suboptimal code generated by clang+llc in quite a common scenario (?)

2019 Aug 08

2

Suboptimal code generated by clang+llc in quite a common scenario (?)

...8* %2, i64 2 store i8 %k, i8* %arrayidx2, align 1, !tbaa !13 ret i32 0 } According to that, the variable ‘scscx’ is loaded three times even though it is never modified. The resulting assembly code is this: .globl _tst _tst: .cfi_startproc pushl %ebp .cfi_def_cfa_offset 8 .cfi_offset %ebp, -8 movl %esp, %ebp .cfi_def_cfa_register %ebp pushl %esi .cfi_offset %esi, -12 movb 16(%ebp), %al movb 12(%ebp), %cl movb 8(%ebp), %dl movl _scscx, %esi movb %dl, (%esi) movl _scscx, %edx movb %cl, 1(%edx) movl _scscx, %ecx movb %al, 2(%ecx) xorl %eax, %eax popl %esi popl %ebp retl .cfi_en...

[LLVMdev] operator overloading fails while debugging with gdb for i386

2012 Dec 02

0

[LLVMdev] operator overloading fails while debugging with gdb for i386

...y; > > return (plus); > } > > > int main (void) > { > A1 one= {2,3}; > A1 two= {4,5}; > A1 three = sum(one,two); > return 0; > } > > > gcc assembly (snippet of sum function) > > _Z3sum2A1S_: > .loc 1 8 0 > pushl %ebp > movl %esp, %ebp > .loc 1 9 0 > movl 8(%ebp), %eax > movl $0, (%eax) > movl 8(%ebp), %eax > movl $0, 4(%eax) > .loc 1 10 0 > movl 12(%ebp), %edx > movl 20(%ebp), %eax > addl %eax, %edx > movl 8(%ebp), %eax > ...
