thr3ads.net - search: "esys"

Displaying 20 results from an estimated 1874 matches for "esys".

[LLVMdev] Suboptimal code due to excessive spilling

2012 Mar 28

Hi, I have run into the following strange behavior and wanted to ask for some advice. For the C program below, function sum() gets inlined in foo() but the code generated looks very suboptimal (the code is an extract from a larger program). Below I show the 32-bit x86 assembly as produced by the demo page on the llvm home page ("Output A"). As you can see from the assembly, after

[LLVMdev] Area for improvement

2005 Feb 22

I noticed that fourinarow is one of the programs in which LLVM is much slower than GCC, so I decided to take a look and see why that is so. The program has many loops that look like this: #define ROWS 6 #define COLS 7 void init_board(char b[COLS][ROWS+1]) { int i,j; for (i=0;i<COLS;i++) for (j=0;j<ROWS;j++) b[i][j]='.';

[LLVMdev] Suboptimal code due to excessive spilling

2012 Apr 05

I don't know much about this, but maybe -mllvm -unroll-count=1 can be used as a workaround? /Patrik Hägglund -----Original Message----- From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Brent Walker Sent: 28 March 2012 03:18 To: llvmdev Subject: [LLVMdev] Suboptimal code due to excessive spilling Hi, I have run into the following strange behavior

[LLVMdev] Area for improvement

2005 Feb 22

On Mon, 21 Feb 2005, Jeff Cohen wrote: > I noticed that fourinarow is one of the programs in which LLVM is much slower > than GCC, so I decided to take a look and see why that is so. The program > has many loops that look like this: > > #define ROWS 6 > #define COLS 7 > > void init_board(char b[COLS][ROWS+1]) > { > int i,j; > > for

[LLVMdev] Area for improvement

2005 Feb 22

Sorry, I thought I was running selection dag isel but I screwed up when trying out the really big array. You're right, it does clean it up except for the multiplication. So LoopStrengthReduce is not ready for prime time and doesn't actually get used? I might consider whipping it into shape. Does it still have to handle getelementptr in its full generality? Chris Lattner wrote:

esi cache server built on mongrel

2007 May 15

Hi all, I just wanted to share a project I've been working on now for a few months. It's still far from complete, but has already been very useful for me and my coworkers. I'm calling it mongrel-esi. ESI stands for Edge Side Include. The specs live here, http://www.w3.org/TR/esi-lang. I currently only have support for <esi:include, <esi:try,

Nasty Bug (BIOS?).

2005 Aug 18

At first I thought I was dealing with the known EBIOS/CBIOS problem. The symptom was exactly the same (hangs at ...EBIOS). As 3.10-pre8 and 3.10-pre9, contrary to what was mentioned on the ML, did not bring any improvement, I looked deeper into what could be my specific problem. I found out that the program just halted at 'cmp [esi],edx' (line 658; ldlinux.asm 3.10-pre9)! By replacing

[LLVMdev] Area for improvement

2005 Feb 22

When I increased COLS to the point where the loop could no longer be unrolled, the selection dag code generator generated effectively the same code as the default X86 code generator. Lots of redundant imul/movl/addl sequences. It can't clean it up either. Only unrolling all nested loops permits it to be optimized away, regardless of code generator. Jeff Cohen wrote: > I noticed
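The redundant imul/movl/addl sequences described in this thread come from recomputing the element address on every iteration. As a rough, hand-written illustration (not LLVM's actual output, and not code from the thread), the sketch below contrasts the indexed loop quoted in the original post with a version where the address arithmetic has been strength-reduced into pointer increments by hand. The function names `init_board_indexed`, `init_board_reduced` and `check_init_board` are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define ROWS 6
#define COLS 7

/* Indexed form, as in the quoted post: each b[i][j] access implies
   computing the offset i * (ROWS + 1) + j. */
static void init_board_indexed(char b[COLS][ROWS + 1])
{
    int i, j;
    for (i = 0; i < COLS; i++)
        for (j = 0; j < ROWS; j++)
            b[i][j] = '.';
}

/* Hand strength-reduced form (illustrative assumption): the row pointer
   and the element pointer are advanced by addition, so no multiplication
   is needed inside the loops. */
static void init_board_reduced(char b[COLS][ROWS + 1])
{
    char (*row)[ROWS + 1] = b;
    int i, j;
    for (i = 0; i < COLS; i++, row++) {
        char *p = *row;
        for (j = 0; j < ROWS; j++)
            *p++ = '.';
    }
}

/* Both forms must fill the board identically. */
static void check_init_board(void)
{
    char a[COLS][ROWS + 1];
    char c[COLS][ROWS + 1];
    memset(a, 0, sizeof a);
    memset(c, 0, sizeof c);
    init_board_indexed(a);
    init_board_reduced(c);
    assert(memcmp(a, c, sizeof a) == 0);
}
```

Calling `check_init_board()` from a test driver confirms the two loops are equivalent; whether a given compiler performs this transformation automatically is exactly what the thread is debating.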

[LLVMdev] Bug in X86CompilationCallback_SSE

2009 Mar 11

I don't know how to file a PR, but I have a patch (see below) that should work regardless of ABI differences, since it relies on the compiler to do the tough job. void X86CompilationCallback_SSE(void) { char * SAVEBUF= (char*) alloca(64+12); // alloca is 16-byte aligned asm volatile ( "movl %%eax,(%0)\n" "movl %%edx,4(%0)\n" // Save EAX/EDX/ECX

[LLVMdev] trunk's optimizer generates slower code than 3.5

2015 Feb 13

I submitted the problem report to clang's bugzilla but no one seems to care, so I have to send it to the mailing list. clang 3.7 svn (trunk 229055 at the time I reported this problem) generates slower code than 3.5 (Apple LLVM version 6.0 (clang-600.0.56) (based on LLVM 3.5svn)) for the following code. It is an "8 queens puzzle" solver written as an educational example. As

[LLVMdev] Area for improvement

2005 Feb 22

On Mon, 21 Feb 2005, Jeff Cohen wrote: > Sorry, I thought I was running selection dag isel but I screwed up when > trying out the really big array. You're right, it does clean it up except > for the multiplication. > > So LoopStrengthReduce is not ready for prime time and doesn't actually get > used? I don't know what the status of it is. You could try it out,

[LLVMdev] trunk's optimizer generates slower code than 3.5

2015 Feb 14

The regressions in the performance of generated code, introduced by the llvm 3.6 release, don't seem to be limited to this "8 queens puzzle" solver test case. See... http://www.phoronix.com/scan.php?page=article&item=llvm-clang-3.5-3.6-rc1&num=1 where a big hit in the performance of the Sparse Matrix Multiply test of the SciMark v2.0 benchmark was observed as well as others.

[LLVMdev] trunk's optimizer generates slower code than 3.5

2015 Feb 14

Using the SciMark 2.0 code from http://math.nist.gov/scimark2/scimark2_1c.zip compiled with the same... make CFLAGS="-O3 -march=native" I am able to reproduce the 22% performance regression in the run time of the Sparse matmult benchmark. For 10 runs of the scimark2 benchmark, I get 998.439+/-0.4828 with the release llvm clang 3.5.1 compiler and 1217.363+/-1.1004 for the current

Suboptimal code generated by clang+llc in quite a common scenario (?)

2019 Aug 08

I found something that I do not quite understand when compiling a common piece of code with the -Os flag. I found it while testing my own backend, but then I dug deeper and found that at least x86 is affected as well. This is the code in question: char pp[3]; char *scscx = pp; int tst( char i, char j, char k ) { scscx[0] = i; scscx[1] = j; scscx[2] = k; return 0; } The above gets

Rather poor code optimisation of current clang/LLVM targeting Intel x86 (both -64 and -32)

2018 Nov 06

Hi @ll, while clang/LLVM recognizes common bit-twiddling idioms/expressions like unsigned int rotate(unsigned int x, unsigned int n) { return (x << n) | (x >> (32 - n)); } and typically generates "rotate" machine instructions for this expression, it fails to recognize other also common bit-twiddling idioms/expressions. The standard IEEE CRC-32 for "big
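The rotate idiom quoted in this snippet can be tried in isolation. The following is a minimal sketch, assuming a 32-bit operand; the names `rotl32_plain`, `rotl32` and `check_rotl32` are hypothetical and do not come from the post. One caveat worth stating: the form as quoted is only well defined for 0 < n < 32, because shifting a 32-bit value by 32 is undefined behavior in C. The masked variant shown second is a commonly used way to make every count well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate left as written in the post; only valid for 0 < n < 32,
   because a shift by the full width (n == 0 gives 32 - n == 32)
   is undefined in C. */
static uint32_t rotl32_plain(uint32_t x, unsigned int n)
{
    return (x << n) | (x >> (32 - n));
}

/* Masked variant (hypothetical helper): both shift counts stay in
   the range 0..31, so the result is defined for every n. */
static uint32_t rotl32(uint32_t x, unsigned int n)
{
    return (uint32_t)((x << (n & 31u)) | (x >> (-n & 31u)));
}

/* Small self-check of the two helpers above. */
static void check_rotl32(void)
{
    assert(rotl32(0x80000001u, 1) == 0x00000003u);
    assert(rotl32(0x12345678u, 0) == 0x12345678u);
    assert(rotl32_plain(0x12345678u, 8) == rotl32(0x12345678u, 8));
}
```

Whether a particular compiler version turns either form into a single rotate instruction is not something this sketch demonstrates; inspecting the generated assembly is the only way to tell.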

[LLVMdev] Boostrap Failure -- Expected Differences?

2007 Apr 30

On Apr 27, 2007, at 3:50 PM, David Greene wrote: > The saga continues. > > I've been tracking the interface changes and merging them with > the refactoring work I'm doing. I got as far as building stage3 > of llvm-gcc but the object files from stage2 and stage3 differ: > > > warning: ./cc1-checksum.o differs > warning: ./cc1plus-checksum.o differs > >

cannot use winedbg on ubuntu feisty ?

2007 Aug 06

Hi everyone, This happens every time I start winedbg, no matter what program I want to debug. It's ubuntu feisty fawn and I have built wine from sources. user@machine:~$ winedbg "C:\Program Files\Diablo II\Diablo II.exe" WineDbg starting on pid 000a wine: Unhandled page fault on read access to 0x00000000 at address 0xb7d5cc23 (thread 0009), starting debugger... Unhandled exception:

Rather poor code optimisation of current clang/LLVM targeting Intel x86 (both -64 and -32)

2018 Nov 27

"Sanjay Patel" <spatel at rotateright.com> wrote: > IIUC, you want to use x86-specific bit-hacks (sbb masking) in cases like > this: > unsigned int foo(unsigned int crc) { > if (crc & 0x80000000) > crc <<= 1, crc ^= 0xEDB88320; > else > crc <<= 1; > return crc; > } To document this for x86 too: rewrite the function

Where's the optimiser gone? (part 5.c): missed tail calls, and more...

2018 Dec 01

Compile the following functions with "-O3 -target i386-win32" (see <https://godbolt.org/z/exmjWY>): __int64 __fastcall div(__int64 foo, __int64 bar) { return foo / bar; } On the left the generated code; on the right the expected, properly optimised code: push dword ptr [esp + 16] | push dword ptr [esp + 16] | push dword ptr [esp + 16] |

[LLVMdev] Float compare-for-equality and select optimization opportunity

2008 May 27

Hi all, I'm trying to generate code containing an ordered float compare for equality, and select. The resulting code, however, has an unordered compare and some Boolean logic that I think could be eliminated. In C syntax the code looks like this: float x, y; int a, b, c; if(x == y) // Rotate the integers { int t; t = a; a = b;
