Displaying 20 results from an estimated 1872 matches for "esy".
2012 Mar 28
2
[LLVMdev] Suboptimal code due to excessive spilling
Hi,
I have run into the following strange behavior and wanted to ask for
some advice. For the C program below, function sum() gets inlined in
foo() but the code generated looks very suboptimal (the code is an
extract from a larger program).
Below I show the 32-bit x86 assembly as produced by the demo page on
the llvm home page ("Output A"). As you can see from the assembly,
after
2005 Feb 22
5
[LLVMdev] Area for improvement
I noticed that fourinarow is one of the programs in which LLVM is much
slower than GCC, so I decided to take a look and see why that is so.
The program has many loops that look like this:
#define ROWS 6
#define COLS 7
void init_board(char b[COLS][ROWS+1])
{
int i,j;
for (i=0;i<COLS;i++)
for (j=0;j<ROWS;j++)
b[i][j]='.';
2012 Apr 05
0
[LLVMdev] Suboptimal code due to excessive spilling
I don't know much about this, but maybe -mllvm -unroll-count=1 can be used as a workaround?
/Patrik Hägglund
-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Brent Walker
Sent: 28 March 2012 03:18
To: llvmdev
Subject: [LLVMdev] Suboptimal code due to excessive spilling
Hi,
I have run into the following strange behavior
2005 Feb 22
0
[LLVMdev] Area for improvement
On Mon, 21 Feb 2005, Jeff Cohen wrote:
> I noticed that fourinarow is one of the programs in which LLVM is much slower
> than GCC, so I decided to take a look and see why that is so. The program
> has many loops that look like this:
>
> #define ROWS 6
> #define COLS 7
>
> void init_board(char b[COLS][ROWS+1])
> {
> int i,j;
>
> for
2005 Feb 22
2
[LLVMdev] Area for improvement
Sorry, I thought I was running selection dag isel but I screwed up when
trying out the really big array. You're right, it does clean it up
except for the multiplication.
So LoopStrengthReduce is not ready for prime time and doesn't actually
get used?
I might consider whipping it into shape. Does it still have to handle
getelementptr in its full generality?
Chris Lattner wrote:
2007 May 15
1
esi cache server built on mongrel
Hi all,
I just wanted to share a project I've been working on now for a few
months. It's still far from complete, but has already been very useful for
me and my coworkers. I'm calling it mongrel-esi. ESI stands for Edge Side
Include. The specs live here, http://www.w3.org/TR/esi-lang. I currently
only have support for <esi:include, <esi:try,
2005 Aug 18
2
Nasty Bug (BIOS?).
At first I thought I was dealing with the known EBIOS/CBIOS problem.
The symptom was exactly the same (hangs at ...EBIOS). Since 3.10-pre8 and
3.10-pre9, contrary to what was mentioned on the ML, did not bring any
improvement, I looked deeper into what my specific problem could be.
I found out that the program just halted at 'cmp [esi],edx' (line 658;
ldlinux.asm 3.10-pre9)! By replacing
2005 Feb 22
0
[LLVMdev] Area for improvement
When I increased COLS to the point where the loop could no longer be
unrolled, the selection dag code generator generated effectively the
same code as the default X86 code generator. Lots of redundant
imul/movl/addl sequences. It can't clean it up either. Only unrolling
all nested loops permits it to be optimized away, regardless of code
generator.
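To make the redundant address arithmetic concrete, here is a minimal C sketch of the transformation being discussed. It is written by hand for illustration and is not output from LLVM or from LoopStrengthReduce; the function names init_board_indexed, init_board_reduced and boards_self_check are hypothetical. The indexed form recomputes i*(ROWS+1)+j for every store, while the reduced form steps pointers instead.

```c
#include <assert.h>
#include <string.h>

#define ROWS 6
#define COLS 7

/* Indexed form, as in the quoted loop: every b[i][j] store implies the
   address computation b + i*(ROWS+1) + j. */
void init_board_indexed(char b[COLS][ROWS + 1])
{
    int i, j;
    for (i = 0; i < COLS; i++)
        for (j = 0; j < ROWS; j++)
            b[i][j] = '.';
}

/* Hand strength-reduced form (illustrative only): the multiply is replaced
   by stepping a row pointer, and the inner index by stepping a cell pointer. */
void init_board_reduced(char b[COLS][ROWS + 1])
{
    char (*row)[ROWS + 1];
    for (row = b; row != b + COLS; row++) {
        char *p;
        for (p = *row; p != *row + ROWS; p++)
            *p = '.';
    }
}

/* Hypothetical helper: checks that both forms write exactly the same bytes. */
void boards_self_check(void)
{
    char a[COLS][ROWS + 1];
    char c[COLS][ROWS + 1];
    memset(a, 0, sizeof a);
    memset(c, 0, sizeof c);
    init_board_indexed(a);
    init_board_reduced(c);
    assert(memcmp(a, c, sizeof a) == 0);
}
```

Both functions leave the last byte of each row untouched, so the two boards compare equal byte for byte.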
Jeff Cohen wrote:
> I noticed
2009 Mar 11
4
[LLVMdev] Bug in X86CompilationCallback_SSE
I don't know how to file a PR, but I have a patch (see below) that
should work regardless of ABI differences, since it relies on the
compiler to do the tough job.
void X86CompilationCallback_SSE(void) {
char * SAVEBUF= (char*) alloca(64+12); // alloca is 16byte aligned
asm volatile (
"movl %%eax,(%0)\n"
"movl %%edx,4(%0)\n" // Save EAX/EDX/ECX
2015 Feb 13
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
I submitted the problem report to clang's bugzilla but no one seems to
care so I have to send it to the mailing list.
clang 3.7 svn (trunk 229055 at the time I reported this problem)
generates slower code than 3.5 (Apple LLVM version 6.0
(clang-600.0.56) (based on LLVM 3.5svn)) for the following code.
It is an "8 queens puzzle" solver written as an educational example. As
2005 Feb 22
0
[LLVMdev] Area for improvement
On Mon, 21 Feb 2005, Jeff Cohen wrote:
> Sorry, I thought I was running selection dag isel but I screwed up when
> trying out the really big array. You're right, it does clean it up except
> for the multiplication.
>
> So LoopStrengthReduce is not ready for prime time and doesn't actually get
> used?
I don't know what the status of it is. You could try it out,
2015 Feb 14
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
The regressions in the performance of generated code introduced
by the llvm 3.6 release don't seem to be limited to this "8 queens
puzzle" solver test case. See...
http://www.phoronix.com/scan.php?page=article&item=llvm-clang-3.5-3.6-rc1&num=1
where a big hit in the performance of the Sparse Matrix Multiply test
of the SciMark v2.0 benchmark was observed, as well as others.
2015 Feb 14
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
Using the SciMark 2.0 code from
http://math.nist.gov/scimark2/scimark2_1c.zip compiled with the
same...
make CFLAGS="-O3 -march=native"
I am able to reproduce the 22% performance regression in the run time
of the Sparse matmult benchmark.
For 10 runs of the scimark2 benchmark, I get 998.439+/-0.4828 with
the release llvm clang 3.5.1 compiler
and 1217.363+/-1.1004 for the current
2019 Aug 08
2
Suboptimal code generated by clang+llc in quite a common scenario (?)
I found something that I don't quite understand when compiling a common piece of code using the -Os flag.
I found it while testing my own backend, but then I dug deeper and found that at least x86 is affected as well. This is the code in question:
char pp[3];
char *scscx = pp;
int tst( char i, char j, char k )
{
scscx[0] = i;
scscx[1] = j;
scscx[2] = k;
return 0;
}
The above gets
2018 Nov 06
4
Rather poor code optimisation of current clang/LLVM targeting Intel x86 (both -64 and -32)
Hi @ll,
while clang/LLVM recognizes common bit-twiddling idioms/expressions
like
unsigned int rotate(unsigned int x, unsigned int n)
{
return (x << n) | (x >> (32 - n));
}
and typically generates "rotate" machine instructions for this
expression, it fails to recognize other, equally common bit-twiddling
idioms/expressions.
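As a side note on the rotate idiom quoted above: written that way, the expression shifts by 32 when n is 0, which C leaves undefined for a 32-bit operand. A minimal sketch of a masked variant follows. The names rotl32 and rotl32_examples are hypothetical and not from the post, and whether a given compiler turns this into a single rotate instruction is not claimed here.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate left with the count reduced to 0..31, so neither shift count
   can ever reach 32. */
uint32_t rotl32(uint32_t x, unsigned int n)
{
    n &= 31;
    return (x << n) | (x >> (-n & 31));
}

/* Hypothetical usage examples. */
void rotl32_examples(void)
{
    assert(rotl32(0x80000001u, 1) == 0x00000003u);
    assert(rotl32(0x12345678u, 0) == 0x12345678u);  /* n == 0 is well defined here */
}
```

Masking the count means a rotate by 32 behaves like a rotate by 0, which matches the usual modulo-32 meaning of a 32-bit rotate.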
The standard IEEE CRC-32 for "big
2007 Apr 30
0
[LLVMdev] Bootstrap Failure -- Expected Differences?
On Apr 27, 2007, at 3:50 PM, David Greene wrote:
> The saga continues.
>
> I've been tracking the interface changes and merging them with
> the refactoring work I'm doing. I got as far as building stage3
> of llvm-gcc but the object files from stage2 and stage3 differ:
>
>
> warning: ./cc1-checksum.o differs
> warning: ./cc1plus-checksum.o differs
>
>
2007 Aug 06
0
cannot use winedbg on ubuntu feisty ?
Hi everyone,
This happens every time I start winedbg, no matter what program
I want to debug. It's ubuntu feisty fawn and I have built
wine from sources.
user@machine:~$ winedbg "C:\Program Files\Diablo II\Diablo II.exe"
WineDbg starting on pid 000a
wine: Unhandled page fault on read access to 0x00000000 at address
0xb7d5cc23 (thread 0009), starting debugger...
Unhandled exception:
2018 Nov 27
2
Rather poor code optimisation of current clang/LLVM targeting Intel x86 (both -64 and -32)
"Sanjay Patel" <spatel at rotateright.com> wrote:
> IIUC, you want to use x86-specific bit-hacks (sbb masking) in cases like
> this:
> unsigned int foo(unsigned int crc) {
> if (crc & 0x80000000)
> crc <<= 1, crc ^= 0xEDB88320;
> else
> crc <<= 1;
> return crc;
> }
To document this for x86 too: rewrite the function
2018 Dec 01
2
Where's the optimiser gone? (part 5.c): missed tail calls, and more...
Compile the following functions with "-O3 -target i386-win32"
(see <https://godbolt.org/z/exmjWY>):
__int64 __fastcall div(__int64 foo, __int64 bar)
{
return foo / bar;
}
On the left the generated code; on the right the expected,
properly optimised code:
push dword ptr [esp + 16] |
push dword ptr [esp + 16] |
push dword ptr [esp + 16] |
2008 May 27
3
[LLVMdev] Float compare-for-equality and select optimization opportunity
Hi all,
I'm trying to generate code containing an ordered float compare for
equality, and select. The resulting code however has an unordered compare
and some Boolean logic that I think could be eliminated. In C syntax the
code looks like this:
float x, y;
int a, b, c;
if(x == y) // Rotate the integers
{
int t;
t = a;
a = b;