similar to: [LLVMdev] x86-64 large stack offsets

2011 Sep 26
0
[LLVMdev] x86-64 large stack offsets
On Sep 26, 2011, at 12:02 PM, Cameron McInally wrote: > > Here's one of the offending instructions produced by 2.9: > > movsd -2147482472(%rsp), %xmm0 > > Fixing the displacement overflow is pretty easy. It's just a matter of changing a few variable types in LLVM from unsigned to uint64_t in the functions that calculate the stack offsets. The real trouble I'm
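The negative displacement quoted in the snippet above is what a stack offset slightly larger than 2 GiB looks like once it is squeezed into a signed 32-bit field. A minimal sketch of that arithmetic follows. The helper name `to_int32` and the offset value `2**31 + 1176` are illustrative assumptions, not taken from the thread or from LLVM's code.

```python
def to_int32(value: int) -> int:
    """Interpret the low 32 bits of value as a signed two's-complement integer."""
    low = value & 0xFFFFFFFF
    return low - (1 << 32) if low >= (1 << 31) else low


INT32_MAX = (1 << 31) - 1

# Hypothetical frame offset just past 2 GiB (an assumption for illustration).
offset = (1 << 31) + 1176

print(offset > INT32_MAX)  # the offset does not fit a signed 32-bit displacement
print(to_int32(offset))    # the value an encoder would see after truncation
```

With these assumed inputs the truncated value is -2147482472, the same displacement that appears in the quoted `movsd` instruction. Widening the intermediate types, as the message suggests, keeps the offset calculation itself from wrapping, but the encoded displacement is still limited to 32 bits.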
2010 Jul 08
1
[LLVMdev] simple way to print disassembly of final code from jit?
Thanks for all the hints everyone. Based on your suggestion, O.J., I've added code to toy.cpp from the tutorial to disassemble. ready> 1+1; ready> movabsq $140737353367568, %rax movsd (%rax), %xmm0 ret Evaluated to 2.000000 ready> Which looks correct by inspection - printing the byte array to stdout and feeding it to llvm-mc offline produces the same code as one would also
2010 Jul 07
0
[LLVMdev] simple way to print disassembly of final code from jit?
Hi Bill, I'm coincidentally planning right now on doing exactly the same things as you. I haven't yet had a chance to implement the code, but I can point you to how I currently believe you can get access to what you need. If you take a look at the code for the implementation of llvm::JIT::runJITOnFunction(Function *, MachineCodeInfo *), you'll see that if a MachineCodeInfo parameter is
2010 Jul 07
3
[LLVMdev] simple way to print disassembly of final code from jit?
Thanks Reid - I'm on Windows. I guess I just assumed I was missing something obvious in how to hook up the JIT and disassembler! Given the nice looking disassembly code I found, I thought people would be doing it all the time :-) b. On Tue, Jul 6, 2010 at 8:41 PM, Reid Kleckner <reid.kleckner at gmail.com> wrote: > If you're on a recent flavor of Linux, you may be able to just
2012 Mar 28
2
[LLVMdev] Suboptimal code due to excessive spilling
Hi, I have run into the following strange behavior and wanted to ask for some advice. For the C program below, function sum() gets inlined in foo() but the code generated looks very suboptimal (the code is an extract from a larger program). Below I show the 32-bit x86 assembly as produced by the demo page on the llvm home page ("Output A"). As you can see from the assembly, after
2017 Mar 01
2
[Codegen bug in LLVM 3.8?] br following `fcmp une` is present in ll, absent in asm
Hi, We seem to have found a bug in the LLVM 3.8 code generator. We are using MCJIT and have isolated working.ll and broken.ll after middle-end optimizations -- in the block merge128, notice that broken.ll has a fcmp une comparison to zero and a jump based on that branch: merge128: ; preds = %true71, %false72 %_rtB_724 = load %B_repro_T*, %B_repro_T**
2012 Apr 05
0
[LLVMdev] Suboptimal code due to excessive spilling
I don't know much about this, but maybe -mllvm -unroll-count=1 can be used as a workaround? /Patrik Hägglund -----Original Message----- From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Brent Walker Sent: den 28 mars 2012 03:18 To: llvmdev Subject: [LLVMdev] Suboptimal code due to excessive spilling Hi, I have run into the following strange behavior
2012 Jul 27
2
[LLVMdev] X86 FMA4
Just looked up the numbers from Agner Fog for Sandy Bridge for vmovaps/etc for loading/storing from memory. vmovaps - load takes 1 load mu op, 3 latency, with a reciprocal throughput of 0.5. vmovaps - store takes 1 store mu op, 1 load mu op for address calculation, 3 latency, with a reciprocal throughput of 1. He does not list vmovsd, but movsd has the same stats as vmovaps, so I feel it is a
2019 Oct 25
4
unnecessary reload of 8-byte struct on i386
Hello folks, I've recently been looking at the generated code for a few functions in Chromium while investigating crashes, and I came across a curious pattern. A smallish repro case is available at https://godbolt.org/z/Dsu1WI . In that case, the function Assembler::emit_arith receives a struct (Operand) by value and passes it by value to another function. That struct is 8 bytes long, so the
2012 Jan 13
2
[LLVMdev] Odd weak symbol thing on i386
Hi, I'm compiling lldiv.c from the NetBSD standard library. It works on ARM, Mips, Microblaze, ppc, ppc64, and x86_64. On i386 a very strange thing happens. Here's the source: #include <stdlib.h> #define __weak_alias(sym) __attribute__ ((weak, alias (#sym))) lldiv_t lldiv(long long int num, long long int denom) __weak_alias(_lldiv); lldiv_t _lldiv(long long num, long
2017 May 18
3
Memory accesses and determining aliasing at the MI level
In order to implement a subtle memory access optimisation during post-RA scheduling, I want to be able to determine some properties about the memory access. If I have two registers referring to memory, how can I determine if they are derived from the same base-pointer? Often LLVM will optimise to use intermediate registers holding partial displacements, for example, when a 'struct'
2013 Jul 19
4
[LLVMdev] SIMD instructions and memory alignment on X86
Hmm, I'm not able to get those .ll files to compile if I disable SSE, and I end up with SSE instructions (including sqrtpd) if I don't disable it. On Thu, Jul 18, 2013 at 10:53 PM, Peter Newman <peter at uformia.com> wrote: > Is there something specifically required to enable SSE? If it's not > detected as available (based from the target triple?) then I don't think
2017 Jan 06
2
RFC: LLD range extension thunks
After looking at this for a while, I do not think that this problem is NP-hard. With a finite "short branch" displacement k, I was not able to come up with a gadget that could create global constraints as would be needed to e.g. model an instance of 3SAT or vertex cover in terms of this problem. The problem is hard though. I believe that it is likely to be exponential in the "short
2017 Feb 13
5
[PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function
On Mon, Feb 13, 2017 at 03:12:45PM -0500, Waiman Long wrote: > On 02/13/2017 02:42 PM, Waiman Long wrote: > > On 02/13/2017 05:53 AM, Peter Zijlstra wrote: > >> On Mon, Feb 13, 2017 at 11:47:16AM +0100, Peter Zijlstra wrote: > >>> That way we'd end up with something like: > >>> > >>> asm(" > >>> push %rdi; > >>>
2012 Jan 04
1
[LLVMdev] How can I compile a c source file to use SSE2 Data Movement Instructions?
I wrote a small function and tested it under clang and gcc. File test.c: double X[100]; double Y[100]; double DA = 0.3; int f() { int i; for (i = 0; i < 100; i++) Y[i] = Y[i] - DA * X[i]; return 0; } clang -S -O3 -o test.s test.c -march=native -ccc-echo result: "D:/work/trunk/bin/Release/clang.exe" -cc1 -triple i686-pc-win32 -S -disable-free -disable-llvm-verifier
2013 Jan 04
3
[LLVMdev] instruction scheduling issue
Hi all, I'm trying to insert a function call "llvm_memory_profiling" right before each memory access. The function uses the effective address of the memory access as its single parameter. An example is as follows: the function call at 402a99 has a parameter passed to %rdi at 402a91. One can see that the function call is exactly before the memory access I want to monitor because
2018 Nov 15
2
[RFC][llvm-mca] Adding binary support to llvm-mca.
Introduction ----------------- Currently llvm-mca only accepts assembly code as input. We would like to extend llvm-mca to support object files, allowing users to analyze the performance of binaries. The proposed changes (which involve both clang and llvm) optionally introduce an object file section, but this can be stripped-out if desired. For the llvm-mca binary support feature to be useful, a
2017 Jan 06
3
RFC: LLD range extension thunks
On Fri, Jan 6, 2017 at 6:21 AM, Rui Ueyama via llvm-dev < llvm-dev at lists.llvm.org> wrote: > On Thu, Jan 5, 2017 at 8:15 PM, Peter Smith <peter.smith at linaro.org> > wrote: > >> Hello Rui, >> >> Thanks for the comments >> >> - Synthetic sections and rewriting relocations >> I think that this would definitely be worth trying. It should
2011 Mar 19
2
[LLVMdev] Apparent optimizer bug on X86_64
Compiling a simple automaton created by GNU bison with -O1 or -O2 resulted in the following machine code: 1300 /*-----------------------------. 1301 | yyreduce -- Do a reduction. | 1302 `-----------------------------*/ 1303 yyreduce: 1304 /* yyn is the number of a rule to reduce with. */ 1305 yylen = yyr2[yyn]; 0x0000000000400c14 <rpcalc_parse+628>: mov