thr3ads.net - similar to: "[LLVMdev] x86 REP-prefixed instructions seem to be dropped by instruction decoder?"

Displaying 20 results from an estimated 1000 matches similar to: "[LLVMdev] x86 REP-prefixed instructions seem to be dropped by instruction decoder?"

[LLVMdev] x86 REP-prefixed instructions seem to be dropped by instruction decoder?

2012 Aug 14

[LLVMdev] x86 REP-prefixed instructions seem to be dropped by instruction decoder?

On 13 August 2012 12:02, Andrew Ruef <awruef at umd.edu> wrote: > I think there's a bug somewhere in TableGen for the X86 disassembler > emitter. The following test: > > $ echo "0xF3 0xA5" | ./bin/llvm-mc -disassemble > .section __TEXT,__text,regular,pure_instructions > movsd > > (from llvm trunk) > > 0xF3 is the REP prefix, so the

[LLVMdev] Which floating-point comparison?

2010 Mar 28

[LLVMdev] Which floating-point comparison?

On Sun, Mar 28, 2010 at 7:45 AM, Russell Wallace <russell.wallace at gmail.com> wrote: > I notice llvm provides both ordered and unordered variants of > floating-point comparison. Which of these is the right one to use by > default? I suppose the two criteria would be, in order of importance: > > 1. Which is more efficient (more directly maps to typical hardware)? You can

[LLVMdev] Which floating-point comparison?

2010 Mar 28

[LLVMdev] Which floating-point comparison?

I notice llvm provides both ordered and unordered variants of floating-point comparison. Which of these is the right one to use by default? I suppose the two criteria would be, in order of importance: 1. Which is more efficient (more directly maps to typical hardware)? 2. Which is more familiar (more like the way C and Fortran do it)?

[LLVMdev] Suboptimal code due to excessive spilling

2012 Mar 28

[LLVMdev] Suboptimal code due to excessive spilling

Hi, I have run into the following strange behavior and wanted to ask for some advice. For the C program below, function sum() gets inlined in foo() but the code generated looks very suboptimal (the code is an extract from a larger program). Below I show the 32-bit x86 assembly as produced by the demo page on the llvm home page ("Output A"). As you can see from the assembly, after

[LLVMdev] Suboptimal code due to excessive spilling

2012 Apr 05

[LLVMdev] Suboptimal code due to excessive spilling

I don't know much about this, but maybe -mllvm -unroll-count=1 can be used as a workaround? /Patrik Hägglund -----Original Message----- From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Brent Walker Sent: den 28 mars 2012 03:18 To: llvmdev Subject: [LLVMdev] Suboptimal code due to excessive spilling Hi, I have run into the following strange behavior

[LLVMdev] Odd weak symbol thing on i386

2012 Jan 13

[LLVMdev] Odd weak symbol thing on i386

Hi, I'm compiling lldiv.c from the NetBSD standard library. It works on ARM, Mips, Microblaze,ppc, ppc64, and x86_64. On i386 a very strange thing happens. Here's the source: #include <stdlib.h> #define __weak_alias(sym) __attribute__ ((weak, alias (#sym))) lldiv_t lldiv(long long int num, long long int denom) __weak_alias(_lldiv); lldiv_t _lldiv(long long num, long

[Codegen bug in LLVM 3.8?] br following `fcmp une` is present in ll, absent in asm

2017 Mar 01

[Codegen bug in LLVM 3.8?] br following `fcmp une` is present in ll, absent in asm

Hi, We seem to have found a bug in the LLVM 3.8 code generator. We are using MCJIT and have isolated working.ll and broken.ll after middle-end optimizations -- in the block merge128, notice that broken.ll has a fcmp une comparison to zero and a jump based on that branch: merge128: ; preds = %true71, %false72 %_rtB_724 = load %B_repro_T*, %B_repro_T**

unnecessary reload of 8-byte struct on i386

2019 Oct 25

unnecessary reload of 8-byte struct on i386

Hello folks, I've recently been looking at the generated code for a few functions in Chromium while investigating crashes, and I came across a curious pattern. A smallish repro case is available at https://godbolt.org/z/Dsu1WI . In that case, the function Assembler::emit_arith receives a struct (Operand) by value and passes it by value to another function. That struct is 8 bytes long, so the

[LLVMdev] How can I compile a c source file to use SSE2 Data Movement Instructions?

2012 Jan 04

[LLVMdev] How can I compile a c source file to use SSE2 Data Movement Instructions?

I write a small function and test it under clang and gcc, filet test.c: double X[100]; double Y[100]; double DA = 0.3; int f() { int i; for (i = 0; i < 100; i++) Y[i] = Y[i] - DA * X[i]; return 0; } clang -S -O3 -o test.s test.c -march=native -ccc-echo result: "D:/work/trunk/bin/Release/clang.exe" -cc1 -triple i686-pc-win32 -S -disable-fr e -disable-llvm-verifier

[LLVMdev] Apparent optimizer bug on X86_64

2011 Mar 19

[LLVMdev] Apparent optimizer bug on X86_64

Compiling a simple automaton created by GNU bison with -O1 or -O2 resulted in the following machine code: 1300 /*-----------------------------. 1301 | yyreduce -- Do a reduction. | 1302 `-----------------------------*/ 1303 yyreduce: 1304 /* yyn is the number of a rule to reduce with. */ 1305 yylen = yyr2[yyn]; 0x0000000000400c14 <rpcalc_parse+628>: mov

[LLVMdev] instruction scheduling issue

2013 Jan 04

[LLVMdev] instruction scheduling issue

Hi all, I'm trying to insert a function call "llvm_memory_profiling " right before each memory access. The function uses the effective address of the memory access as its single parameter. A example is as follows: the function call at 402a99 has a parameter passed to %rdi at 402a91. One can see that the function call is exactly before the memory access I want to monitor because

[LLVMdev] Patterns with Multiple Stores

2008 Nov 17

[LLVMdev] Patterns with Multiple Stores

I want to write a pattern that looks something like this: def : Pat<(unalignedstore (v2f64 VR128:$src), addr:$dst), (MOVSDmr ADD64ri8(addr:$dst, imm:8), ( SHUFPDrri (VR128:$src, (MOVSDmr addr:$dst, FR64:$src))), imm:3) So I want to convert an unaligned vector store to a scalar store, a shuffle and a scalar store. There are several question I have: - Is the imm:3 syntax

[LLVMdev] instruction scheduling issue

2013 Jan 07

[LLVMdev] instruction scheduling issue

Liu, I do not think there is a trivial way to do it. Do you really _have_ to have those instructions together, or mere order is enough? Also, how much performance are you willing to sacrifice to do what you do? Maybe turning off scheduling all together is an acceptable solution? Sergei --- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux

[LLVMdev] x86-64 large stack offsets

2011 Sep 26

[LLVMdev] x86-64 large stack offsets

Hey guys, I'm working on a bug for x86-64 in LLVM 2.9. Well, it's actually two issues. The assembly generated for large stack offsets has an overflow; And, once the overflow is fixed, the displacement is too large for GNU ld to handle it. void fool( int long n ) { double w[268435600]; double z[268435600]; unsigned long i; for ( i = 0; i < n; i++ ) { w[i] = 1.0; z[i] =

[RFC][llvm-mca] Adding binary support to llvm-mca.

2018 Nov 15

[RFC][llvm-mca] Adding binary support to llvm-mca.

Introduction ----------------- Currently llvm-mca only accepts assembly code as input. We would like to extend llvm-mca to support object files, allowing users to analyze the performance of binaries. The proposed changes (which involve both clang and llvm) optionally introduce an object file section, but this can be stripped-out if desired. For the llvm-mca binary support feature to be useful, a

[LLVMdev] Code Generation Problem llvm 1.9

2007 Dec 20

[LLVMdev] Code Generation Problem llvm 1.9

I sent a long message yesterday describing a problem I thought had to do with the JIT stubs. After further investigating, the problem seems to be in the code generation. The following basic block seems to have an error in it's code generation: __exp.exit: ; preds = %codeRepl258, %__exp_bb_bb.exit phi double [ 1.000000e+00, %codeRepl258 ], [ %.reload.reload.i,

[LLVMdev] Patterns with Multiple Stores

2008 Nov 17

[LLVMdev] Patterns with Multiple Stores

On Monday 17 November 2008 14:28, David Greene wrote: > I want to write a pattern that looks something like this: > > def : Pat<(unalignedstore (v2f64 VR128:$src), addr:$dst), > (MOVSDmr ADD64ri8(addr:$dst, imm:8), ( SHUFPDrri (VR128:$src, > (MOVSDmr addr:$dst, FR64:$src))), imm:3) > > So I want to convert an unaligned vector store to a scalar store, a

[LLVMdev] how to annotate assembler

2012 Mar 02

[LLVMdev] how to annotate assembler

Hi, In GCC there is one useful option -dp (or -dP for more verbose output) to annotate assembler with instruction patterns, that was used when assembler was generated. For example: double test(long long s) { return s; } gcc -S -dp -O0 test.c test: .LFB0: .cfi_startproc pushq %rbp # 18 *pushdi2_rex64/1 [length = 1] .cfi_def_cfa_offset 16 movq %rsp, %rbp # 19 *movdi_1_rex64/2

[RFC][llvm-mca] Adding binary support to llvm-mca.

2018 Nov 21

[RFC][llvm-mca] Adding binary support to llvm-mca.

Hi Andrea, Thanks for your input. On Wed, Nov 21, 2018 at 12:43:52PM +0000, Andrea Di Biagio wrote: [... snip ...] > About the suggested design: > I like the idea of being able to identify code regions using a numeric > identifier. > However, what happens if a code region spans through multiple basic blocks? The current patch does not take into consideration cases where the region

[LLVMdev] how to annotate assembler

2012 Mar 02

[LLVMdev] how to annotate assembler

On 02.03.2012, at 09:20, Konstantin Vladimirov wrote: > Hi, > > In GCC there is one useful option -dp (or -dP for more verbose output) > to annotate assembler with instruction patterns, that was used when > assembler was generated. For example: The internal "-mllvm -show-mc-inst" option is probably as close as you can get. $ clang -S -O0 test.c -mllvm -show-mc-inst -o

similar to: [LLVMdev] x86 REP-prefixed instructions seem to be dropped by instruction decoder?