thr3ads.net - search: "esps"

Displaying 20 results from an estimated 3784 matches for "esps".

2008 Jan 28

Strange signal 11 crashes

These crashes only happen at night/during the evening, way after maximum load:
Jan 27 21:19:57 postamt kernel: [1490698.849461] imap[15089]: segfault at 00000008 eip 080b779b esp bfbe20a0 error 4
Jan 27 21:20:50 postamt kernel: [1490752.022142] imap[15251]: segfault at 00000008 eip 080b779b esp bfd241e0 error 4
Jan 27 21:21:53 postamt kernel: [1490814.348208] imap[15482]: segfault at 00000008 eip
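To read log lines like the ones quoted above, the following is a minimal sketch that pulls out the fields and decodes the x86 page-fault error code. The names `SEGFAULT_RE`, `decode_error`, and `parse_segfault` are hypothetical helpers for illustration, not part of any tool mentioned in the thread, and the regular expression assumes the legacy 32-bit message layout shown in the snippet.

```python
import re

# Hypothetical pattern for the legacy 32-bit x86 Linux "segfault at" message
# layout seen in the snippet (an assumption; newer kernels print "ip"/"sp").
SEGFAULT_RE = re.compile(
    r"(?P<prog>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"eip (?P<eip>[0-9a-f]+) esp (?P<esp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)


def decode_error(code):
    """Decode the low three bits of the x86 page-fault error code."""
    return {
        "protection_violation": bool(code & 1),  # 0 means the page was not present
        "write": bool(code & 2),                 # 0 means the access was a read
        "user_mode": bool(code & 4),             # 1 means the fault came from user space
    }


def parse_segfault(line):
    """Return the parsed fields of one log line, or None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    return {
        "prog": m.group("prog"),
        "pid": int(m.group("pid")),
        "addr": int(m.group("addr"), 16),
        "eip": int(m.group("eip"), 16),
        "esp": int(m.group("esp"), 16),
        "flags": decode_error(int(m.group("err"), 16)),
    }


if __name__ == "__main__":
    sample = ("Jan 27 21:19:57 postamt kernel: [1490698.849461] imap[15089]: "
              "segfault at 00000008 eip 080b779b esp bfbe20a0 error 4")
    print(parse_segfault(sample))
```

Read this way, "error 4" with a fault address of 8 indicates a user-mode read of an unmapped page just above address zero, which is consistent with a field access through a null pointer. The identical eip across the crashes points at a single faulting instruction.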

2013 Jul 19

[LLVMdev] llvm.x86.sse2.sqrt.pd not using sqrtpd, calling a function that modifies ECX

(Changing subject line as diagnosis has changed.) I'm attaching the compiled code that I've been getting, both with CodeGenOpt::Default and CodeGenOpt::None. The crash isn't occurring with CodeGenOpt::None, but that seems to be because ECX isn't being used; it still gets set to 0x7fffffff by one of the calls to 76719BA1. I notice that X86::SQRTPD[m|r] appear in

2014 Aug 08

[LLVMdev] Efficient Pattern matching in Instruction Combine

Hi Duncan, David, Sean. Thanks for your reply.
> It'd be interesting if you could find a design that also treated these
> the same:
>
> (B ^ A) | ((A ^ B) ^ C) -> (A ^ B) | C
> (B ^ A) | ((B ^ C) ^ A) -> (A ^ B) | C
> (B ^ A) | ((C ^ A) ^ B) -> (A ^ B) | C
>
> I.e., `^` is also associative.
Agree with Duncan on including associative operations too.
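The three rewrites quoted above can be checked by brute force. The sketch below is only an illustration of why they are valid, not how InstCombine matches patterns; `lhs_variants`, `rhs`, and `all_equivalent` are hypothetical names. Because `^` and `|` operate on each bit independently, checking all 1-bit inputs is already exhaustive; wider inputs are included as a sanity check.

```python
from itertools import product


def lhs_variants(a, b, c):
    """The three left-hand sides from the quoted message."""
    return [
        (b ^ a) | ((a ^ b) ^ c),
        (b ^ a) | ((b ^ c) ^ a),
        (b ^ a) | ((c ^ a) ^ b),
    ]


def rhs(a, b, c):
    """The common simplified form."""
    return (a ^ b) | c


def all_equivalent(bits=1):
    """Check every left-hand side against rhs for all inputs of the given width."""
    values = range(1 << bits)
    return all(
        v == rhs(a, b, c)
        for a, b, c in product(values, repeat=3)
        for v in lhs_variants(a, b, c)
    )


if __name__ == "__main__":
    print(all_equivalent(1), all_equivalent(3))
```

The reason all three collapse to the same form: writing X for A ^ B, each left-hand side is X | (X ^ C) after reassociating the XORs, and wherever a bit of X is 0 the term X ^ C is just C.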

2013 Feb 14

[LLVMdev] Question about fastcc assumptions and seemingly superfluous %esp updates

Hello, While investigating one of the existing tests (test/CodeGen/X86/tailcallpic2.ll), I ran into IR that produces some interesting code. The IR is very straightforward:
define protected fastcc i32 @tailcallee(i32 %a1, i32 %a2, i32 %a3, i32 %a4) {
entry:
  ret i32 %a3
}
define fastcc i32 @tailcaller(i32 %in1, i32 %in2) {
entry:
  %tmp11 = tail call fastcc i32 @tailcallee( i32 %in1, i32 %in2, i32

2004 Sep 13

[LLVMdev] How could I get memory address for each assemble instruction?

Hi all, I am trying to disassemble *.bc to assembly code by using the llvm-dis command, but what I got looks like the following. So how could I get assembly code like objdump's? I mean the memory address for each instruction. Thanks, Qiuyu
llvm-dis:
    .text
    .align 16
    .globl adpcm_coder
    .type adpcm_coder, @function
adpcm_coder:
.LBBadpcm_coder_0: # entry
    sub %ESP, 116
    mov DWORD PTR [%ESP + 12],

2014 Dec 21

[LLVMdev] [RFC] [X86] Mov to push transformation in x86-32 call sequences

Hello all, In r223757 I've committed a patch that performs, for the 32-bit x86 calling convention, the transformation of MOV instructions that push function arguments onto the stack into actual PUSH instructions. For example, it will transform this:
subl $16, %esp
movl $4, 12(%esp)
movl $3, 8(%esp)
movl $2, 4(%esp)
movl $1, (%esp)
calll _func
addl $16, %esp

2018 Dec 01

Where's the optimiser gone? (part 5.c): missed tail calls, and more...

Compile the following functions with "-O3 -target i386-win32" (see <https://godbolt.org/z/exmjWY>): __int64 __fastcall div(__int64 foo, __int64 bar) { return foo / bar; } On the left the generated code; on the right the expected, properly optimised code: push dword ptr [esp + 16] | push dword ptr [esp + 16] | push dword ptr [esp + 16] |

2013 Jul 19

[LLVMdev] SIMD instructions and memory alignment on X86

Hmm, I'm not able to get those .ll files to compile if I disable SSE, and I end up with SSE instructions (including sqrtpd) if I don't disable it.
On Thu, Jul 18, 2013 at 10:53 PM, Peter Newman <peter at uformia.com> wrote:
> Is there something specifically required to enable SSE? If it's not
> detected as available (based from the target triple?) then I don't think

2010 Feb 03

[LLVMdev] jit X86 target compilation callback bug

Hello again. I still think that you are wrong. Realignment with "and esp,-16" does not always change the stack pointer. If esp is already aligned to a 16-byte boundary, it will not change! Take a look at the following example. Assume esp has the value 0x000001000 at the start of the X86CompilationCallback function. Then executing it will yield the following esp values:
0x000000FFC - after push ebp
0x000000FFC - after mov
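The arithmetic in this message can be reproduced with a small sketch that models esp as a 32-bit value. It assumes only what the snippet states (a 4-byte push, and an "and esp,-16" realignment); `push` and `align16` are hypothetical helper names, and the sketch does not model the rest of X86CompilationCallback.

```python
MASK32 = 0xFFFFFFFF  # esp is a 32-bit register


def push(esp):
    """Model a 4-byte push: esp decreases by 4."""
    return (esp - 4) & MASK32


def align16(esp):
    """Model 'and esp, -16': clear the low four bits of esp."""
    return esp & (-16 & MASK32)


if __name__ == "__main__":
    start = 0x1000
    after_push = push(start)
    print(hex(after_push))           # esp after push ebp
    print(hex(align16(after_push)))  # a misaligned esp moves down to a 16-byte boundary
    print(hex(align16(start)))       # an aligned esp is left unchanged
```

The last line is the point the author is making: the realignment subtracts anywhere from 0 to 15 bytes depending on the incoming esp, so code after it cannot assume a fixed offset back to the values pushed before it.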

2013 Feb 15

[LLVMdev] Question about fastcc assumptions and seemingly superfluous %esp updates

Hey Eli,
On Thu, Feb 14, 2013 at 5:45 PM, Eli Bendersky <eliben at google.com> wrote:
> Hello,
>
> While investigating one of the existing tests
> (test/CodeGen/X86/tailcallpic2.ll), I ran into IR that produces some
> interesting code. The IR is very straightforward:
>
> define protected fastcc i32 @tailcallee(i32 %a1, i32 %a2, i32 %a3, i32
> %a4) {
> entry:
>

2012 Mar 28

[LLVMdev] Suboptimal code due to excessive spilling

Hi, I have run into the following strange behavior and wanted to ask for some advice. For the C program below, function sum() gets inlined in foo() but the code generated looks very suboptimal (the code is an extract from a larger program). Below I show the 32-bit x86 assembly as produced by the demo page on the llvm home page ("Output A"). As you can see from the assembly, after

2017 Jun 05

[newbie] trouble with global variables and CreateLoad/Store in JIT

Since the getelementptrs were implicitly generated by the CreateStore/Load I'm not sure how to get access to them. So I hacked the assignment to be done thrice: once using a manual decomposition into two GEPs and stores, once using the "big" CreateStore, once via the setGlobal function, printing addresses and memory contents at each point to the degree that I have access to them.

2014 Aug 13

[LLVMdev] Efficient Pattern matching in Instruction Combine

Thanks Sean for the reference. I will go through it and see if I can implement it for generic boolean expression minimization.
Regards, Suyog
On Wed, Aug 13, 2014 at 2:30 AM, Sean Silva <chisophugis at gmail.com> wrote:
> Re-adding the mailing list (remember to hit "reply all")
>
>
> On Tue, Aug 12, 2014 at 9:36 AM, suyog sarda <sardask01 at gmail.com> wrote:

2017 Jun 06

[newbie] trouble with global variables and CreateLoad/Store in JIT

On Mon, Jun 5, 2017 at 1:34 PM, Nikodemus Siivola <nikodemus at random-state.net> wrote:
> Uh. Turns out that if I hide the pointer to @foo from LLVM by passing it
> through an opaque identity function ... then everything works fine.
>
> Is this a bug in LLVM or is there some magic involving globals I'm
> misunderstanding?
>
This looks like a bug in the handling of

2012 Jun 28

[LLVMdev] Counting instructions in MCJIT

Hi Verena, I think that we can count the number of instructions with the "-stats" command-line option. As you mentioned, this option uses the Statistic class, as in "STATISTIC(EmittedInsts, "Number of machine instrs printed");". I don't know exactly how it behaves in a parallel code generation environment, but this option seems to work correctly in the common case, as follows. This is

2017 Jun 06

[newbie] trouble with global variables and CreateLoad/Store in JIT

That's useful to know that the static compilation code path works. Furthermore, as expected from that:
52: c7 05 04 00 00 00 d5 00 00 00 movl $213, 4
00000054: IMAGE_REL_I386_DIR32 _foo
It looks like the offset `4` of the second field of your struct is correct in the object file, so this does seem to be a problem in the JIT-specific linking/loading.

2018 Sep 14

Function calls keep increasing the stack usage

Hi everyone, I found that LLVM generates redundant code when calling functions with constant parameters, with optimizations disabled. Consider the following C code snippet:
int foo(int x, int y);
void bar() {
    foo(1, 2);
    foo(3, 4);
}
Clang/LLVM 6.0 generates the following assembly code:
_bar:
    subl $32, %esp
    movl $1, %eax
    movl $2, %ecx
    movl $1, (%esp)
    movl $2, 4(%esp)
    movl %eax, 28(%esp)
    movl

2012 Apr 05

[LLVMdev] Suboptimal code due to excessive spilling

I don't know much about this, but maybe -mllvm -unroll-count=1 can be used as a workaround?
/Patrik Hägglund
-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Brent Walker
Sent: 28 March 2012 03:18
To: llvmdev
Subject: [LLVMdev] Suboptimal code due to excessive spilling
Hi, I have run into the following strange behavior

2014 Dec 21

[LLVMdev] [RFC] [X86] Mov to push transformation in x86-32 call sequences

Which performance guidelines are you referring to? I'm not that familiar with decade-old CPUs, but to the best of my knowledge, this is not true on current hardware. There is one specific circumstance where PUSHes should be avoided - for Atom/Silvermont processors, the memory form of PUSH is inefficient, so the register-freeing optimization below may not be profitable (see 14.3.3.6 and

2006 Apr 29

[LLVMdev] Intel vs. AT&T Assembly.

Hi Jeff,
> > I notice `lli -print-machineinstrs -x86-asm-syntax=(att|intel)' both
> > prefix registers with `%'. Is this right? I thought AT&T did this
> > and Intel didn't. The GNU gas manual concurs.
> >
> > http://www.gnu.org/software/binutils/manual/gas-2.9.1/html_chapter/as_16.html
>
> The Intel version is just a clone of the
