thr3ads.net - similar to: "[RFC][llvm-mca] Adding binary support to llvm-mca."

[RFC][llvm-mca] Adding binary support to llvm-mca.

2018 Nov 21

Hi Andrea, Thanks for your input. On Wed, Nov 21, 2018 at 12:43:52PM +0000, Andrea Di Biagio wrote: [... snip ...] > About the suggested design: > I like the idea of being able to identify code regions using a numeric > identifier. > However, what happens if a code region spans through multiple basic blocks? The current patch does not take into consideration cases where the region

[RFC][llvm-mca] Adding binary support to llvm-mca.

2018 Nov 27

Thanks for clarifying it, Matt. In general, I quite like your suggested design. My only concern is about the semantics of the two new intrinsics. Your design doesn't allow mca ranges to span multiple basic blocks. That constraint is acceptable for now, since llvm-mca doesn't know how to deal with control flow. However, I am a bit concerned about what might happen in future if we

[RFC][llvm-mca] Adding binary support to llvm-mca.

2018 Dec 03

Hi Andrea, On Mon, Dec 03, 2018 at 01:21:33PM +0000, Andrea Di Biagio wrote: > So, I have been thinking a bit more about this whole design. > > The more I think about your suggested design, the more I am convinced that > we should do something more to support ranges in binary object files too. > My understanding is that the reason why we don't support object files in >

[RFC][llvm-mca] Adding binary support to llvm-mca.

2018 Dec 10

+1 to what Clement said. I believe the intrinsics are a better design to support many architectures. IACA users are probably decorating their code with IACA_START / IACA_END macros. One possibility is to provide a header that defines these macros in terms of the new intrinsics. On Mon, Dec 10, 2018 at 3:59 PM Clement Courbet <courbet at google.com> wrote: > Hi Matt/Andrea, > > I

[RFC][llvm-mca] Adding binary support to llvm-mca.

2018 Dec 10

Hi Matt, I can see a near future where perf-analysis tooling uses branch history profiler captures to determine how often loops/branches are taken and feeds that into llvm-mca, especially for hot/branchy loop analysis reports etc. Are you confident that your approach will be easily extendable for this? Similarly, being able to generally embed the profile markers in object libraries for

[LLVMdev] how to annotate assembler

2012 Mar 02

Hi, In GCC there is one useful option -dp (or -dP for more verbose output) to annotate assembler with instruction patterns, that was used when assembler was generated. For example: double test(long long s) { return s; } gcc -S -dp -O0 test.c test: .LFB0: .cfi_startproc pushq %rbp # 18 *pushdi2_rex64/1 [length = 1] .cfi_def_cfa_offset 16 movq %rsp, %rbp # 19 *movdi_1_rex64/2

[LLVMdev] Apparent optimizer bug on X86_64

2011 Mar 19

Compiling a simple automaton created by GNU bison with -O1 or -O2 resulted in the following machine code: 1300 /*-----------------------------. 1301 | yyreduce -- Do a reduction. | 1302 `-----------------------------*/ 1303 yyreduce: 1304 /* yyn is the number of a rule to reduce with. */ 1305 yylen = yyr2[yyn]; 0x0000000000400c14 <rpcalc_parse+628>: mov

[LLVMdev] how to annotate assembler

2012 Mar 02

On 02.03.2012, at 09:20, Konstantin Vladimirov wrote: > Hi, > > In GCC there is one useful option -dp (or -dP for more verbose output) > to annotate assembler with instruction patterns, that was used when > assembler was generated. For example: The internal "-mllvm -show-mc-inst" option is probably as close as you can get. $ clang -S -O0 test.c -mllvm -show-mc-inst -o

[LLVMdev] x86-64 large stack offsets

2011 Sep 26

Hey guys, I'm working on a bug for x86-64 in LLVM 2.9. Well, it's actually two issues. The assembly generated for large stack offsets has an overflow; And, once the overflow is fixed, the displacement is too large for GNU ld to handle it. void fool( int long n ) { double w[268435600]; double z[268435600]; unsigned long i; for ( i = 0; i < n; i++ ) { w[i] = 1.0; z[i] =

Finding caller-saved registers at a function call site

2016 Jun 22

Hi Rob, Rob Lyerly via llvm-dev wrote: > I'm looking for a way to get all the caller-saved registers (both the > register and the stack slot at which it was saved) for a given function > call site in the backend. What's the best way to grab this > information? Is it possible to get this information if I have the > MachineInstr of the function call? I'm currently

Finding caller-saved registers at a function call site

2016 Jun 22

Hi everyone, I'm looking for a way to get all the caller-saved registers (both the register and the stack slot at which it was saved) for a given function call site in the backend. What's the best way to grab this information? Is it possible to get this information if I have the MachineInstr of the function call? I'm currently targeting the AArch64 & X86 backends. Thanks! --

Finding caller-saved registers at a function call site

2016 Jun 27

Hi Sanjoy, I'm having trouble finding caller-saved registers using the RegMask operand you've mentioned. As an example, I've got a C function that looks like this: double recurse(int depth, double val) { if(depth < max_depth) return recurse(depth + 1, val * 1.2) + val; else return outer_func(val); } As a quick refresher, all "xmm" registers are considered

[Codegen bug in LLVM 3.8?] br following `fcmp une` is present in ll, absent in asm

2017 Mar 01

Hi, We seem to have found a bug in the LLVM 3.8 code generator. We are using MCJIT and have isolated working.ll and broken.ll after middle-end optimizations -- in the block merge128, notice that broken.ll has a fcmp une comparison to zero and a jump based on that branch: merge128: ; preds = %true71, %false72 %_rtB_724 = load %B_repro_T*, %B_repro_T**

[LLVMdev] Suboptimal code due to excessive spilling

2012 Mar 28

Hi, I have run into the following strange behavior and wanted to ask for some advice. For the C program below, function sum() gets inlined in foo() but the code generated looks very suboptimal (the code is an extract from a larger program). Below I show the 32-bit x86 assembly as produced by the demo page on the llvm home page ("Output A"). As you can see from the assembly, after

[LLVMdev] SIMD instructions and memory alignment on X86

2013 Jul 19

Hmm, I'm not able to get those .ll files to compile if I disable SSE and I end up with SSE instructions(including sqrtpd) if I don't disable it. On Thu, Jul 18, 2013 at 10:53 PM, Peter Newman <peter at uformia.com> wrote: > Is there something specifically required to enable SSE? If it's not > detected as available (based from the target triple?) then I don't think

unnecessary reload of 8-byte struct on i386

2019 Oct 25

Hello folks, I've recently been looking at the generated code for a few functions in Chromium while investigating crashes, and I came across a curious pattern. A smallish repro case is available at https://godbolt.org/z/Dsu1WI . In that case, the function Assembler::emit_arith receives a struct (Operand) by value and passes it by value to another function. That struct is 8 bytes long, so the

[LLVMdev] Suboptimal code due to excessive spilling

2012 Apr 05

I don't know much about this, but maybe -mllvm -unroll-count=1 can be used as a workaround? /Patrik Hägglund -----Original Message----- From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Brent Walker Sent: den 28 mars 2012 03:18 To: llvmdev Subject: [LLVMdev] Suboptimal code due to excessive spilling Hi, I have run into the following strange behavior

[LLVMdev] Odd weak symbol thing on i386

2012 Jan 13

Hi, I'm compiling lldiv.c from the NetBSD standard library. It works on ARM, Mips, Microblaze, ppc, ppc64, and x86_64. On i386 a very strange thing happens. Here's the source: #include <stdlib.h> #define __weak_alias(sym) __attribute__ ((weak, alias (#sym))) lldiv_t lldiv(long long int num, long long int denom) __weak_alias(_lldiv); lldiv_t _lldiv(long long num, long

[LLVMdev] How can I compile a c source file to use SSE2 Data Movement Instructions?

2012 Jan 04

I wrote a small function and tested it under clang and gcc. File test.c: double X[100]; double Y[100]; double DA = 0.3; int f() { int i; for (i = 0; i < 100; i++) Y[i] = Y[i] - DA * X[i]; return 0; } clang -S -O3 -o test.s test.c -march=native -ccc-echo result: "D:/work/trunk/bin/Release/clang.exe" -cc1 -triple i686-pc-win32 -S -disable-fr e -disable-llvm-verifier

[LLVMdev] Code Generation Problem llvm 1.9

2007 Dec 20

I sent a long message yesterday describing a problem I thought had to do with the JIT stubs. After further investigation, the problem seems to be in the code generation. The following basic block seems to have an error in its code generation: __exp.exit: ; preds = %codeRepl258, %__exp_bb_bb.exit phi double [ 1.000000e+00, %codeRepl258 ], [ %.reload.reload.i,
