thr3ads.net - similar to: "[LLVMdev] Is PIC code defeating the branch predictor?"

Displaying 20 results from an estimated 4000 matches similar to: "[LLVMdev] Is PIC code defeating the branch predictor?"

[LLVMdev] Is PIC code defeating the branch predictor?

2011 Jan 04

On Jan 3, 2011, at 11:30 PM, Jakob Stoklund Olesen wrote:
> I noticed that we generate code like this for i386 PIC:
>
>     calll L0$pb
> L0$pb:
>     popl %eax
>     movl %eax, -24(%ebp) ## 4-byte Spill
>
> I worry that this defeats the return address prediction for returns in the function because calls and returns no longer are matched.

Yes, this will defeat the

[LLVMdev] Is PIC code defeating the branch predictor?

2011 Jan 04

On 04 Jan 2011, at 08:30, Jakob Stoklund Olesen wrote:
> I noticed that we generate code like this for i386 PIC:
>
>     calll L0$pb
> L0$pb:
>     popl %eax
>     movl %eax, -24(%ebp) ## 4-byte Spill
>
> I worry that this defeats the return address prediction for returns
> in the function because calls and returns no longer are matched.

According to benchmarks by
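The concern in the quoted message can be illustrated with a toy model of a return-address stack predictor. This is only a sketch under stated assumptions: the function name `count_mispredicted_returns`, the event tuples, and the addresses are made up for illustration, and the model assumes every call pushes a prediction and every return pops one. Real processors may treat a call to the next instruction specially, so this does not claim to describe any particular CPU.

```python
# Toy model of a return-address stack (RAS) predictor.
# Assumption: each call pushes its return address onto the RAS and each
# ret pops one entry as its prediction. Hardware may behave differently.

def count_mispredicted_returns(trace):
    """Count rets whose RAS prediction differs from the real target.

    trace is a list of events: ("call", return_addr), ("ret", target),
    or ("pop",) for a plain stack pop that the predictor never sees.
    """
    ras = []            # predictor's private stack of expected returns
    mispredicts = 0
    for event in trace:
        kind = event[0]
        if kind == "call":
            ras.append(event[1])
        elif kind == "ret":
            predicted = ras.pop() if ras else None
            if predicted != event[1]:
                mispredicts += 1
        # "pop" (e.g. popl %eax) changes only the architectural stack,
        # so the RAS keeps a stale entry.
    return mispredicts

# A caller invokes a function that returns to the hypothetical address 0x100.
balanced = [("call", 0x100), ("ret", 0x100)]

# Same function, but it begins with: calll L0$pb ; L0$pb: popl %eax
pic = [("call", 0x100), ("call", 0x200), ("pop",), ("ret", 0x100)]

print(count_mispredicted_returns(balanced))
print(count_mispredicted_returns(pic))
```

In this model the unmatched call leaves one extra entry on the predictor's stack, so the function's own return is predicted to go to the address after the `calll` rather than back to the caller.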

[LLVMdev] [ARM] [PIC] optimizing the loading of hidden global variable

2014 Mar 14

>> Any thoughts?
>
> I'm now struggling to see how GCC justifies it. What if a different
> translation-unit declared those variables in a different order? I also
> can't get the same behaviour here, do you have a more complete
> command-line?

Ah, I see; the translation-unit that does the optimisation needs to have them as a definition (i.e. "= {0}") rather

[LLVMdev] How to tell whether a GlobalValue is user-defined

2014 Aug 25

I think this is preventing constants in the constant pool (e.g., floating point literal) from being placed in the mergeable constant sections? We want to keep the const arrays declared in the program (s_dashArraySize1) out of the mergeable constant sections, but don't mind placing constants in the constant pool or constant arrays that the compiler defines, such as switch.table and

[LLVMdev] [RFC] [X86] Mov to push transformation in x86-32 call sequences

2014 Dec 21

Hello all,

In r223757 I've committed a patch that performs, for the 32-bit x86 calling convention, the transformation of MOV instructions that push function arguments onto the stack into actual PUSH instructions.

For example, it will transform this:

    subl $16, %esp
    movl $4, 12(%esp)
    movl $3, 8(%esp)
    movl $2, 4(%esp)
    movl $1, (%esp)
    calll _func
    addl $16, %esp
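To make the idea of the transformation concrete, here is a small sketch that interprets a MOV-based argument setup and a PUSH-based one on a toy stack and checks that both leave the same stack pointer and memory contents. The snippet above is cut off before it shows the transformed code, so the PUSH sequence used here is an assumption about what an equivalent sequence could look like, not a quotation of the patch's output. The `run` helper, the tuple encoding, and the starting stack pointer are hypothetical.

```python
# Toy interpreter for the three instruction shapes involved.
# Assumption: 4-byte stack slots and immediate operands only.

def run(instrs, esp=0x1000):
    """Execute a list of toy instructions; return (esp, memory dict)."""
    mem = {}
    for op, *args in instrs:
        if op == "subl":        # subl $imm, %esp
            esp -= args[0]
        elif op == "movl":      # movl $imm, off(%esp)
            imm, off = args
            mem[esp + off] = imm
        elif op == "pushl":     # pushl $imm
            esp -= 4
            mem[esp] = args[0]
        else:
            raise ValueError(f"unknown op: {op}")
    return esp, mem

# The MOV form quoted in the message (call and cleanup omitted).
mov_form = [
    ("subl", 16),
    ("movl", 4, 12),
    ("movl", 3, 8),
    ("movl", 2, 4),
    ("movl", 1, 0),
]

# Assumed PUSH form: arguments pushed from last to first.
push_form = [("pushl", 4), ("pushl", 3), ("pushl", 2), ("pushl", 1)]

print(run(mov_form) == run(push_form))
```

Both sequences end with the first argument at the top of the stack, which is why the pushes have to be issued in reverse argument order in this sketch.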

[LLVMdev] [RFC] [X86] Mov to push transformation in x86-32 call sequences

2014 Dec 21

Which performance guidelines are you referring to? I'm not that familiar with decade-old CPUs, but to the best of my knowledge, this is not true on current hardware. There is one specific circumstance where PUSHes should be avoided - for Atom/Silvermont processors, the memory form of PUSH is inefficient, so the register-freeing optimization below may not be profitable (see 14.3.3.6 and

[LLVMdev] Misaligned SSE store problem (with reduced source)

2011 Nov 11

Using LLVM 2.9, the following LLVM IR produces invalid x86 32 bit assembly (a misaligned SSE store).

; ModuleID = 'MisalignedStore'
define void @MisalignedStore() nounwind readnone {
entry:
  %v = alloca <4 x float>, align 16
  store <4 x float> zeroinitializer, <4 x float>* %v, align 16
  br label %post-block
post-block:
  %f = alloca float
  ret void
}

If I feed

How to call an (x86) cleanup/catchpad funclet

2016 Apr 04

I've modified llvm to emit vc++ compatible SEH structures for my personality on x86/Windows and my handler works fine, but the only thing I can't figure out is how to call these funclets. They look like:

Catch:

"?catch$3@?0?m3@4HA":
LBB4_3:                                 # %BasicBlock26
    pushl %ebp
    pushl %eax
    addl $12, %ebp
    movl %esp, -28(%ebp)
    movl $LBB4_5, %eax

[Release-testers] [7.0.0 Release] rc1 has been tagged

2018 Aug 06

On Sun, Aug 5, 2018 at 5:49 PM, Dimitry Andric <dimitry at andric.com> wrote:
> On 3 Aug 2018, at 13:37, Hans Wennborg via Release-testers <release-testers at lists.llvm.org> wrote:
>>
>> 7.0.0-rc1 was just tagged (from the branch at r338847).
>>
>> It's early in the release process, but I'd like to find out what the
>> status is of the branch

[LLVMdev] [ARM] [PIC] optimizing the loading of hidden global variable

2014 Mar 14

Hi Rafael,

Yes, merging gv prevents the linker from doing garbage collection. Should it be implemented as a peephole pass? If we do it too early, the distances between GVs are not fixed yet.

PS: Below is the GCC output with "extern" hidden:

    ldr r2, .L2
    stmfd sp!, {r3, lr}
    .save {r3, lr}
.LPIC0:
    add r0, pc, r2
    bl _Z4initPv(PLT)
    ldr r1, .L2+4
.LPIC1:
    add r0, pc, r1
    bl _Z4initPv(PLT)
    ldr

[newbie] trouble with global variables and CreateLoad/Store in JIT

2017 Jun 06

It's useful to know that the static compilation code path works. Furthermore, as expected from that:

    52: c7 05 04 00 00 00 d5 00 00 00    movl $213, 4
        00000054: IMAGE_REL_I386_DIR32 _foo

It looks like the offset `4` of the second field of your struct is correct in the object file, so this does seem to be a problem in the JIT-specific linking/loading.

[newbie] trouble with global variables and CreateLoad/Store in JIT

2017 Jun 06

On Mon, Jun 5, 2017 at 1:34 PM, Nikodemus Siivola <nikodemus at random-state.net> wrote:
> Uh. Turns out that if I hide the pointer to @foo from LLVM by passing it
> through an opaque identity function ... then everything works fine.
>
> Is this a bug in LLVM or is there some magic involving globals I'm
> misunderstanding?
>

This looks like a bug in the handling of

[newbie] trouble with global variables and CreateLoad/Store in JIT

2017 Jun 07

My code was hinky, but only in the sense that I was accidentally duplicating the variable definition in the module where the function was. With only the declaration in the second module, loading the bitcode reproduces the issue.

Managed an lli reproduction:

$ cat jit-0.ll
target datalayout = "e-m:x-p:32:32-i64:64-f80:32-n8:16:32-a:0:32-S32"
target triple =

[LLVMdev] Misaligned SSE store problem (with reduced source)

2011 Nov 11

On Thu, Nov 10, 2011 at 6:13 PM, Aaron Dwyer <Aaron.Dwyer at imgtec.com> wrote:
> Using LLVM 2.9, the following LLVM IR produces invalid x86 32 bit assembly
> (a misaligned SSE store).
> ; ModuleID = 'MisalignedStore'
> define void @MisalignedStore() nounwind readnone {
> entry:
>   %v = alloca <4 x float>, align 16
>   store <4 x float>

[LLVMdev] PIC documentation ?

2009 Jun 16

Anton,

>> Can I ask what platform ABI's are documented other than Itanium ?
> I'd bet all platform ABI are more or less documented.

Right. Maybe we should collect references and do some LLVM PIC documentation and put it on LLVM website ?

>> I need to get to understand PIC on x86, x86_64 and PowerPC for the COFF
>> and MachO backends.
> ABI is normally induced

[LLVMdev] Runtime linker issue with X11R6 on i386 with -O3 optimization

2012 Mar 20

I was told that my writeup lacked an example and details so I reproduced the code that X uses and I was able to boil down the issue to a couple of lines of code. Sorry again for the length of this email.

Code was compiled on OpenBSD with clang 3.0-release.

========================================================================
With -O0 which works as X expects:

crash on 2 gig file

2003 Jun 16

Hi, I'm still waiting for my list subscription, but if I don't send in this bug report now I won't for who-knows-how-long, and I want to get it in.... I'm using 2.5.6 compiled from CVS on SCO Open Server 5.0.6; both machines run the same version of the OS and the same copy of rsync. On my live machine one of the database files eventually got to be 2 gigs; the file has since then been

parsing numeric values

2009 Nov 18

Dear list, I'm seeking advice to extract some numeric values from a log file created by an external program. Consider the following example, input <- readLines(textConnection( "some text <ax> = 1.3770E-03 <bx> = 3.4644E-07 <ay> = 1.9412E-04 <by> = 4.8840E-08 other text <aax> = 1.3770E-03 <bbx> = 3.4644E-07
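One way to approach this kind of extraction is a single regular expression that captures each `<name> = value` pair. The sketch below uses Python rather than the R of the original post, and it is not taken from the thread. The sample text is rebuilt from the visible part of the example, and its line breaks are an assumption. `LOG`, `PAIR`, and `extract_values` are hypothetical names.

```python
import re

# Sample log text; the line layout is assumed, only the tokens come
# from the example in the post.
LOG = """some text
<ax> = 1.3770E-03 <bx> = 3.4644E-07
<ay> = 1.9412E-04 <by> = 4.8840E-08
other text"""

# <name> = number, where the number may use scientific notation.
PAIR = re.compile(r"<(\w+)>\s*=\s*([-+]?\d+(?:\.\d*)?(?:[eE][-+]?\d+)?)")

def extract_values(text):
    """Return a list of (name, float value) pairs in order of appearance."""
    return [(name, float(num)) for name, num in PAIR.findall(text)]

for name, value in extract_values(LOG):
    print(name, value)
```

Lines without a `<name> = value` pair, such as "some text", simply produce no matches, so no separate filtering step is needed in this sketch.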

[LLVMdev] llvm register reload/spilling around calls

2010 Oct 20

On Oct 20, 2010, at 7:46 AM, Roland Scheidegger wrote:
> On 20.10.2010 05:00, Jakob Stoklund Olesen wrote:
>> Look in X86InstrControl.td. The call instructions are all prefixed
>> by:
>>
>> let Defs = [RAX, RCX, RDX, RSI, RDI, R8, R9, R10, R11, FP0, FP1, FP2,
>> FP3, FP4, FP5, FP6, ST0, ST1, MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7,
>> XMM0, XMM1, XMM2, XMM3,

Function calls keep increasing the stack usage

2018 Sep 14

Hi everyone,

I found that LLVM generates redundant code when calling functions with constant parameters, with optimizations disabled. Consider the following C code snippet:

    int foo(int x, int y);
    void bar() {
        foo(1, 2);
        foo(3, 4);
    }

Clang/LLVM 6.0 generates the following assembly code:

_bar:
    subl $32, %esp
    movl $1, %eax
    movl $2, %ecx
    movl $1, (%esp)
    movl $2, 4(%esp)
    movl %eax, 28(%esp)
    movl
