thr3ads.net - search: "edy"

Displaying 20 results from an estimated 1881 matches for "edy".

Did you mean: edx

2014 Jan 18

[LLVMdev] Scheduling quirks

Hello all! When I compile the following more or less stupid functions with clang++ -O3 -S test.cpp ===> int test_register(int x) { x ^= (x >> 2); x ^= (x >> 3); x = x ^ (x >> 4); int y = x; x >>= 5; x ^= y; // almost the same but explicit return x; } int test_scheduler(int x) { return ((x>>2) & 15) ^ ((x>>3) & 31); }

[LLVMdev] trunk's optimizer generates slower code than 3.5

2015 Feb 13

[LLVMdev] trunk's optimizer generates slower code than 3.5

I submitted the problem report to clang's bugzilla but no one seems to care so I have to send it to the mailing list. clang 3.7 svn (trunk 229055 as the time I was to report this problem) generates slower code than 3.5 (Apple LLVM version 6.0 (clang-600.0.56) (based on LLVM 3.5svn)) for the following code. It is a "8 queens puzzle" solver written as an educational example. As

A pattern for portable __builtin_add_overflow()

2018 Nov 20

A pattern for portable __builtin_add_overflow()

Hi LLVM, clang, I'm trying to write a portable version of __builtin_add_overflow() it a way that the compiler would recognize the pattern and use the add_overflow intrinsic / the best possible machine instruction. Here are docs about these builtins: https://clang.llvm.org/docs/LanguageExtensions.html#checked-arithmetic-builtins . With unsigned types this is easy: int uaddo_native(unsigned

[LLVMdev] trunk's optimizer generates slower code than 3.5

2015 Feb 14

[LLVMdev] trunk's optimizer generates slower code than 3.5

The regressions in the performance of generated code, introduced by the llvm 3.6 release, don't seem to be limited to this 8 queens puzzle" solver test case. See... http://www.phoronix.com/scan.php?page=article&item=llvm-clang-3.5-3.6-rc1&num=1 where a bit hit in the performance of the Sparse Matrix Multiply test of the SciMark v2.0 benchmark was observed as well as others.

[LLVMdev] trunk's optimizer generates slower code than 3.5

2015 Feb 14

[LLVMdev] trunk's optimizer generates slower code than 3.5

Using the SciMark 2.0 code from http://math.nist.gov/scimark2/scimark2_1c.zip compiled with the same... make CFLAGS="-O3 -march=native" I am able to reproduce the 22% performance regression in the run time of the Sparse matmult benchmark. For 10 runs of the scimark2 benechmark, I get 998.439+/-0.4828 with the release llvm clang 3.5.1 compiler and 1217.363+/-1.1004 for the current

An assembly optimization and fix

2004 Sep 10

An assembly optimization and fix

I have optimized FLAC__fixed_compute_best_predictor_asm_ia32_mmx_cmov function and fixed bug when data_len == 0. Now the function is about 50% faster and flac -5 is about 5% faster on my box. I have tested it thoroughly, I think it can go to flac 1.0.4. -- Miroslav Lichvar -------------- next part -------------- --- src/libFLAC/ia32/fixed_asm.nasm.orig 2002-01-26 19:05:12.000000000 +0100 +++

[LLVMdev] Suboptimal code due to excessive spilling

2012 Mar 28

[LLVMdev] Suboptimal code due to excessive spilling

Hi, I have run into the following strange behavior and wanted to ask for some advice. For the C program below, function sum() gets inlined in foo() but the code generated looks very suboptimal (the code is an extract from a larger program). Below I show the 32-bit x86 assembly as produced by the demo page on the llvm home page ("Output A"). As you can see from the assembly, after

[LLVMdev] RegAllocFast uses too much stack

2011 Jul 11

[LLVMdev] RegAllocFast uses too much stack

I discovered recently that RegAllocFast spills all the registers before every function call. This is the root cause of one of our recursive functions that takes about 150 bytes of stack when built with gcc (same at -O0 and -O2, or 120 bytes at llc -O2) taking 960 bytes of stack when built by llc -O0. That's pretty bad for situations where you have small stacks, which is not uncommon for

Recent -Os code size regressions

2015 Nov 21

Recent -Os code size regressions

On Fri, Nov 20, 2015 at 5:06 PM, James Molloy <james at jamesmolloy.co.uk> wrote: > > Hi, > > We'd need to look precisely at what's causing the code size bloat. The midend commit pointed out by Steve shouldn't cause bloat in and of itself - it should reduce code size. It removes a load of stores and branches. > > I know a backend change I made to ARM isn't

Ignored branch predictor hints

2018 May 09

Ignored branch predictor hints

Hello, #define likely(x) __builtin_expect((x),1) // switch like char * b(int e) { if (likely(e == 0)) return "0"; else if (e == 1) return "1"; else return "f"; } GCC correctly prefers the first case: b(int): mov eax, OFFSET FLAT:.LC0 test edi, edi jne .L7 ret But Clang seems to ignore _builtin_expect hints in this case.

[LLVMdev] Suboptimal code due to excessive spilling

2012 Apr 05

[LLVMdev] Suboptimal code due to excessive spilling

I don't know much about this, but maybe -mllvm -unroll-count=1 can be used as a workaround? /Patrik Hägglund -----Original Message----- From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Brent Walker Sent: den 28 mars 2012 03:18 To: llvmdev Subject: [LLVMdev] Suboptimal code due to excessive spilling Hi, I have run into the following strange behavior

[LLVMdev] Bug in X86CompilationCallback_SSE

2009 Mar 11

[LLVMdev] Bug in X86CompilationCallback_SSE

I don't know how to file a PR, but I have a patch (see below), that should work regardless of abi differences, since it relies on the compiler to do the though job. void X86CompilationCallback_SSE(void) { char * SAVEBUF= (char*) alloca(64+12); // alloca is 16byte aligned asm volatile ( "movl %%eax,(%0)\n" "movl %%edx,4(%0)\n" // Save EAX/EDX/ECX

[LLVMdev] Boostrap Failure -- Expected Differences?

2007 Apr 30

[LLVMdev] Boostrap Failure -- Expected Differences?

On Apr 27, 2007, at 3:50 PM, David Greene wrote: > The saga continues. > > I've been tracking the interface changes and merging them with > the refactoring work I'm doing. I got as far as building stage3 > of llvm-gcc but the object files from stage2 and stage3 differ: > > > warning: ./cc1-checksum.o differs > warning: ./cc1plus-checksum.o differs > >

[LLVMdev] Boostrap Failure -- Expected Differences?

2007 Apr 27

[LLVMdev] Boostrap Failure -- Expected Differences?

The saga continues. I've been tracking the interface changes and merging them with the refactoring work I'm doing. I got as far as building stage3 of llvm-gcc but the object files from stage2 and stage3 differ: warning: ./cc1-checksum.o differs warning: ./cc1plus-checksum.o differs (Are the above two ok?) The list below is clearly bad. I think it's every object file in the

How to debug instruction selection

2017 Aug 15

How to debug instruction selection

Hi there, I try to JIT compile some bitcode and seeing the following error: LLVM ERROR: Cannot select: 0x28ec830: ch,glue = X86ISD::CALL 0x28ec7c0, 0x28ef900, Register:i32 %EDI, Register:i8 %AL, RegisterMask:Untyped, 0x28ec7c0:1 0x28ef900: i32 = X86ISD::Wrapper TargetGlobalAddress:i32<void (i8*, ...)* @_ZN5FooBr7xprintfEPKcz> 0 0x28ec520: i32 = TargetGlobalAddress<void (i8*, ...)*

patch

2004 Sep 10

patch

So here is quick patch solving the problem, now it should be PIC. -- Miroslav Lichvar lichvarm@phoenix.inf.upol.cz -------------- next part -------------- --- lpc_asm.nasm.orig Wed Jul 18 02:23:40 2001 +++ lpc_asm.nasm Sat Nov 17 21:09:46 2001 @@ -59,10 +59,10 @@ ; ALIGN 16 cident FLAC__lpc_compute_autocorrelation_asm_ia32 - ;[esp + 24] == autoc[] - ;[esp + 20] == lag - ;[esp + 16] ==

Incidence Function Model in R help

2009 Jun 21

Incidence Function Model in R help

All: Though I am fairly new to R, I am trying to work my way through J Oksanen's "Incidence Function Model in R" and can't get past some error with my glm arguments. I'm getting through > attach(amphimedon_compressa) > plot(x.crd,y.crd,asp=1,xlab="Easting",ylab="Northing",pch=21,col=p+1,bg=5*p) > d<-dist(cbind(x.crd,y.crd)) > alpha<-1

[PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function

2017 Feb 13

[PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function

On Fri, Feb 10, 2017 at 12:00:43PM -0500, Waiman Long wrote: > >> +asm( > >> +".pushsection .text;" > >> +".global __raw_callee_save___kvm_vcpu_is_preempted;" > >> +".type __raw_callee_save___kvm_vcpu_is_preempted, @function;" > >> +"__raw_callee_save___kvm_vcpu_is_preempted:" > >> +FRAME_BEGIN >

[PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function

2017 Feb 13

[PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function

[LLVMdev] Float compare-for-equality and select optimization opportunity

2008 May 27

[LLVMdev] Float compare-for-equality and select optimization opportunity

Hi all, I'm trying to generate code containing an ordered float compare for equality, and select. The resulting code however has an unordered compare and some Boolean logic that I think could be eliminated. In C syntax the code looks like this: float x, y; int a, b, c if(x == y) // Rotate the integers { int t; t = a; a = b;

search for: edy