Hi,
I've restarted my Elsa/LLVM project after three months of having real
life intrude. I upgraded my LLVM source to the current trunk. I had to
make a few changes to my source, e.g. LLVMFoldingBuilder became
IRBuilder and several instances of "new" became "Create".
Now, a test case that previously succeeded fails. I run the following
script:
#!/bin/sh
if [ 1 -ne 0 ] ; then
echo -n "ellsif:"
/usr/bin/time -f "real %e user %U sys %S" ../elsa/ellsif/ellsif -v \
-O0 -g -o ellbzip2 -D_FILE_OFFSET_BITS=64 $* bzip2.c crctable.c \
randtable.c compress.c blocksort.c huffman.c decompress.c bzlib.c
# Save the status first: inside "then", $? is the (zero) status of the [ test.
status=$?
if [ $status -ne 0 ] ; then exit $status; fi
echo -n "1:"; /usr/bin/time -f "real %e user %U sys %S" ./ellbzip2 -1 < sample1.ref > sample1.rb2
echo -n "2:"; /usr/bin/time -f "real %e user %U sys %S" ./ellbzip2 -2 < sample2.ref > sample2.rb2
echo -n "3:"; /usr/bin/time -f "real %e user %U sys %S" ./ellbzip2 -3 < sample3.ref > sample3.rb2
echo -n "4:"; /usr/bin/time -f "real %e user %U sys %S" ./ellbzip2 -d < sample1.bz2 > sample1.tst
echo -n "5:"; /usr/bin/time -f "real %e user %U sys %S" ./ellbzip2 -d < sample2.bz2 > sample2.tst
echo -n "6:"; /usr/bin/time -f "real %e user %U sys %S" ./ellbzip2 -ds < sample3.bz2 > sample3.tst
cmp sample1.bz2 sample1.rb2
cmp sample2.bz2 sample2.rb2
cmp sample3.bz2 sample3.rb2
cmp sample1.tst sample1.ref
cmp sample2.tst sample2.ref
cmp sample3.tst sample3.ref
fi
At the -O0 optimization level, it works just fine:
[~/bzip2-1.0.4] main% ./ellsamake
ellsif:<premain>: CommandLine Error: Argument 'machine-licm' defined more than once!
ellsif: CommandLine Error: Argument 'machine-licm' defined more than once!
adding bzip2.c as a C file
adding crctable.c as a C file
adding randtable.c as a C file
adding compress.c as a C file
adding blocksort.c as a C file
adding huffman.c as a C file
adding decompress.c as a C file
adding bzlib.c as a C file
Phase: Preprocessing
preprocess bzip2.c to become a preprocessed C file
/usr/bin/gcc -E -o bzip2.i bzip2.c
preprocess crctable.c to become a preprocessed C file
/usr/bin/gcc -E -o crctable.i crctable.c
preprocess randtable.c to become a preprocessed C file
/usr/bin/gcc -E -o randtable.i randtable.c
preprocess compress.c to become a preprocessed C file
/usr/bin/gcc -E -o compress.i compress.c
preprocess blocksort.c to become a preprocessed C file
/usr/bin/gcc -E -o blocksort.i blocksort.c
preprocess huffman.c to become a preprocessed C file
/usr/bin/gcc -E -o huffman.i huffman.c
preprocess decompress.c to become a preprocessed C file
/usr/bin/gcc -E -o decompress.i decompress.c
preprocess bzlib.c to become a preprocessed C file
/usr/bin/gcc -E -o bzlib.i bzlib.c
Phase: Translation
compile bzip2.i to become an unoptimized LLVM bitcode file
bzip2.i has been deleted
compile crctable.i to become an unoptimized LLVM bitcode file
crctable.i has been deleted
compile randtable.i to become an unoptimized LLVM bitcode file
randtable.i has been deleted
compile compress.i to become an unoptimized LLVM bitcode file
compress.i has been deleted
compile blocksort.i to become an unoptimized LLVM bitcode file
blocksort.i has been deleted
compile huffman.i to become an unoptimized LLVM bitcode file
huffman.i has been deleted
compile decompress.i to become an unoptimized LLVM bitcode file
decompress.i has been deleted
compile bzlib.i to become an unoptimized LLVM bitcode file
bzlib.i has been deleted
Phase: Optimization
optimize bzip2.ubc to become an LLVM bitcode file
optimize crctable.ubc to become an LLVM bitcode file
optimize randtable.ubc to become an LLVM bitcode file
optimize compress.ubc to become an LLVM bitcode file
optimize blocksort.ubc to become an LLVM bitcode file
optimize huffman.ubc to become an LLVM bitcode file
optimize decompress.ubc to become an LLVM bitcode file
optimize bzlib.ubc to become an LLVM bitcode file
Phase: Bitcode linking
bclink bzip2.bc to become a file that has been linked
bclink crctable.bc to become a file that has been linked
bclink randtable.bc to become a file that has been linked
bclink compress.bc to become a file that has been linked
bclink blocksort.bc to become a file that has been linked
bclink huffman.bc to become a file that has been linked
bclink decompress.bc to become a file that has been linked
bclink bzlib.bc to become a file that has been linked
bzip2.bc was consumed by the bitcode linker
crctable.bc was consumed by the bitcode linker
randtable.bc was consumed by the bitcode linker
compress.bc was consumed by the bitcode linker
blocksort.bc was consumed by the bitcode linker
huffman.bc was consumed by the bitcode linker
decompress.bc was consumed by the bitcode linker
bzlib.bc was consumed by the bitcode linker
bclink ellbzip2.bc added to the file list
Phase: Generating
generate ellbzip2.bc to become an assembly source file
Phase: Linking
assemble ellbzip2.s to become a file that has been linked
/usr/bin/gcc -fno-strict-aliasing -O3 -o ellbzip2 ellbzip2.s
ellbzip2.s has been deleted
real 5.72 user 5.50 sys 0.13
1:real 0.03 user 0.03 sys 0.00
2:real 0.07 user 0.07 sys 0.00
3:real 0.17 user 0.16 sys 0.00
4:real 0.00 user 0.00 sys 0.00
5:real 0.02 user 0.01 sys 0.00
6:real 0.01 user 0.01 sys 0.00
[~/bzip2-1.0.4] main%
If I use a higher optimization level, the compilation fails (e.g. -O5):
[~/bzip2-1.0.4] main% ./ellsamake
ellsif:<premain>: CommandLine Error: Argument 'machine-licm' defined more than once!
ellsif: CommandLine Error: Argument 'machine-licm' defined more than once!
adding bzip2.c as a C file
adding crctable.c as a C file
adding randtable.c as a C file
adding compress.c as a C file
adding blocksort.c as a C file
adding huffman.c as a C file
adding decompress.c as a C file
adding bzlib.c as a C file
Phase: Preprocessing
preprocess bzip2.c to become a preprocessed C file
/usr/bin/gcc -E -o bzip2.i bzip2.c
preprocess crctable.c to become a preprocessed C file
/usr/bin/gcc -E -o crctable.i crctable.c
preprocess randtable.c to become a preprocessed C file
/usr/bin/gcc -E -o randtable.i randtable.c
preprocess compress.c to become a preprocessed C file
/usr/bin/gcc -E -o compress.i compress.c
preprocess blocksort.c to become a preprocessed C file
/usr/bin/gcc -E -o blocksort.i blocksort.c
preprocess huffman.c to become a preprocessed C file
/usr/bin/gcc -E -o huffman.i huffman.c
preprocess decompress.c to become a preprocessed C file
/usr/bin/gcc -E -o decompress.i decompress.c
preprocess bzlib.c to become a preprocessed C file
/usr/bin/gcc -E -o bzlib.i bzlib.c
Phase: Translation
compile bzip2.i to become an unoptimized LLVM bitcode file
bzip2.i has been deleted
compile crctable.i to become an unoptimized LLVM bitcode file
crctable.i has been deleted
compile randtable.i to become an unoptimized LLVM bitcode file
randtable.i has been deleted
compile compress.i to become an unoptimized LLVM bitcode file
compress.i has been deleted
compile blocksort.i to become an unoptimized LLVM bitcode file
blocksort.i has been deleted
compile huffman.i to become an unoptimized LLVM bitcode file
huffman.i has been deleted
compile decompress.i to become an unoptimized LLVM bitcode file
decompress.i has been deleted
compile bzlib.i to become an unoptimized LLVM bitcode file
bzlib.i has been deleted
Phase: Optimization
optimize bzip2.ubc to become an LLVM bitcode file
ellsif: /home/rich/llvm-trunk-new/lib/VMCore/Value.cpp:63: virtual llvm::Value::~Value(): Assertion `use_empty() && "Uses remain when a value is destroyed!"' failed.
../elsa/ellsif/ellsif[0x899d61e]
../elsa/ellsif/ellsif[0x899d750]
[0x110420]
/lib/libc.so.6(abort+0x101)[0xa62f91]
/lib/libc.so.6(__assert_fail+0xee)[0xa5a93e]
../elsa/ellsif/ellsif[0x894eee4]
../elsa/ellsif/ellsif[0x821e061]
../elsa/ellsif/ellsif[0x890c215]
../elsa/ellsif/ellsif[0x890c863]
../elsa/ellsif/ellsif[0x891c54f]
../elsa/ellsif/ellsif[0x81b3b73]
../elsa/ellsif/ellsif[0x88004fd]
../elsa/ellsif/ellsif[0x87ff0da]
../elsa/ellsif/ellsif[0x87f9c27]
../elsa/ellsif/ellsif[0x87f9e9d]
../elsa/ellsif/ellsif[0x892ef3f]
../elsa/ellsif/ellsif[0x885d5f5]
../elsa/ellsif/ellsif[0x892ebc6]
../elsa/ellsif/ellsif[0x892ed7e]
../elsa/ellsif/ellsif[0x892edd1]
../elsa/ellsif/ellsif[0x80542ea]
../elsa/ellsif/ellsif[0x80578e9]
/lib/libc.so.6(__libc_start_main+0xe0)[0xa4e390]
../elsa/ellsif/ellsif[0x804d7a1]
Command terminated by signal 6
real 2.70 user 2.54 sys 0.12
[~/bzip2-1.0.4] main%
Strangely enough, I get the same failure for -O2 through -O4, but -O1 fails
differently: it gets stuck in the bitcode linker, consuming memory.
I assume I am creating some malformed bitcode and I'll track it down by
elimination, but I was hoping that someone might have some insight that
would help me narrow down my search.
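
For readers unfamiliar with the assertion in the log above: it fires when a value is destroyed while something still refers to it. Below is a minimal sketch of that bookkeeping. ToyValue, addOperand, replaceAllUses and safeToDestroy are hypothetical names invented for illustration; this is not LLVM's Value class, only a toy model of a use list under the assumption that every operand edge is recorded on both ends.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical miniature of a value with a use list; NOT LLVM's classes.
struct ToyValue {
  std::vector<ToyValue *> operands;  // values this one reads
  std::vector<ToyValue *> users;     // values that read this one
  bool useEmpty() const { return users.empty(); }
};

// Record that `user` reads `operand`, on both ends of the edge.
void addOperand(ToyValue &user, ToyValue &operand) {
  user.operands.push_back(&operand);
  operand.users.push_back(&user);
}

// Point every user of oldV at newV instead, leaving oldV unused.
void replaceAllUses(ToyValue &oldV, ToyValue &newV) {
  assert(&oldV != &newV && "cannot replace a value with itself");
  for (ToyValue *u : oldV.users) {
    std::replace(u->operands.begin(), u->operands.end(), &oldV, &newV);
    newV.users.push_back(u);
  }
  oldV.users.clear();
}

// The condition the assertion checks: no remaining users.
bool safeToDestroy(const ToyValue &v) { return v.useEmpty(); }
```

In this model, a value that still appears in another value's operand list is not safe to destroy until its uses have been redirected or the users themselves removed. A frontend or pass that deletes a value without doing so would trip the equivalent of the assertion.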
-Rich
Richard Pennington
2008-May-17 18:34 UTC
[LLVMdev] More info, was Help needed after hiatus
Hi,
I know my last question was very vague (i.e. "It stopped working, what
went wrong?"), so here is a more concrete example:
If I run the optimizer (opt) on this code snippet with -std-compile-opts,
the optimizer hangs.
; ModuleID = 'test.ubc'
target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-s0:0:64-f80:32:32"
target triple = "i686-pc-linux-gnu"
declare void @BZALLOC(i32)
define void @f(i32) {
entry:
%blockSize100k = alloca i32 ; <i32*> [#uses=2]
store i32 %0, i32* %blockSize100k
%n = alloca i32 ; <i32*> [#uses=2]
load i32* %blockSize100k ; <i32>:1 [#uses=1]
store i32 %1, i32* %n
load i32* %n ; <i32>:2 [#uses=1]
add i32 %2, 2 ; <i32>:3 [#uses=1]
mul i32 %3, ptrtoint (i32* getelementptr (i32* null, i32 1) to i32) ; <i32>:4 [#uses=1]
call void @BZALLOC( i32 %4 )
br label %return
return: ; preds = %entry
ret void
}
This is generated from this test program:
extern void BZALLOC(int s);
void f(int blockSize100k)
{
int n = blockSize100k;
BZALLOC( (n+2) * sizeof(unsigned int) );
}
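
The sizeof(unsigned int) in this program is what appears in the IR as ptrtoint (i32* getelementptr (i32* null, i32 1) to i32): the address of element 1 of an i32 array that would start at address 0. As a rough sketch of the same "distance to the next element" idea, the helper below measures that distance on a real array instead of a null pointer. strideOf is a hypothetical name chosen for this illustration, not something from LLVM or the standard library.

```cpp
#include <cstddef>

// strideOf is a hypothetical helper, not part of any library.
// It returns the byte distance between two adjacent array elements of type T,
// which is what the null-based getelementptr expression computes in the IR.
template <typename T>
std::size_t strideOf() {
  T pair[2] = {};  // a real two-element array, so no null pointer is involved
  const char *first = reinterpret_cast<const char *>(&pair[0]);
  const char *second = reinterpret_cast<const char *>(&pair[1]);
  return static_cast<std::size_t>(second - first);
}
```

For example, strideOf<unsigned int>() yields the same number as sizeof(unsigned int) on the target the code is compiled for.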
Besides the optimizer hanging, the strange thing about this is that it
doesn't hang if the blockSize100k variable is a local rather than a
parameter or if the n+2 is changed to n+1 (!?!).
Is this code intrinsically incorrect, especially the getelementptr for
sizeof(), or should I look at the optimizer? It seems to be hanging in
the createInstructionCombiningPass.
-Rich
On Sat, May 17, 2008 at 11:34 AM, Richard Pennington <rich at pennware.com> wrote:
> If I run the optimizer (opt) on this code snippet with -std-compile-opts
> the optimizer hangs.
>
>
> ; ModuleID = 'test.ubc'
> target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-s0:0:64-f80:32:32"
> target triple = "i686-pc-linux-gnu"
>
> declare void @BZALLOC(i32)
>
> define void @f(i32) {
> entry:
> %blockSize100k = alloca i32 ; <i32*> [#uses=2]
> store i32 %0, i32* %blockSize100k
> %n = alloca i32 ; <i32*> [#uses=2]
> load i32* %blockSize100k ; <i32>:1 [#uses=1]
> store i32 %1, i32* %n
> load i32* %n ; <i32>:2 [#uses=1]
> add i32 %2, 2 ; <i32>:3 [#uses=1]
> mul i32 %3, ptrtoint (i32* getelementptr (i32* null, i32 1) to i32) ; <i32>:4 [#uses=1]
> call void @BZALLOC( i32 %4 )
> br label %return
>
> return: ; preds = %entry
> ret void
> }

BTW, It's usually better to file a bug for this sort of thing.

The issue is around InstructionCombining:2507:

// W*X + Y*Z --> W * (X+Z) iff W == Y
if (I.getType()->isIntOrIntVector()) {
  Value *W, *X, *Y, *Z;
  if (match(LHS, m_Mul(m_Value(W), m_Value(X))) &&
      match(RHS, m_Mul(m_Value(Y), m_Value(Z)))) {

The issue starts with the lines:

add i32 %2, 2 ; <i32>:3 [#uses=1]
mul i32 %3, ptrtoint (i32* getelementptr (i32* null, i32 1) to i32) ; <i32>:4 [#uses=1]

Roughly, the multiplication gets distributed, resulting in something
like (loaded value) * (ptrtointexpr) + 2 * (ptrtointexpr). This gets
matched by the match, which then reverses the transformation. This, of
course, gets matched by the code to distribute the multiply, resulting
in a never-ending cycle.

A side note: I know I've seen suggestions that "ptrtoint (i32*
getelementptr (i32* null, i32 1) to i32)" is a suitable replacement for
sizeof, but if it is supposed to be legal, the documentation for
getelementptr should make that clear. I'm pretty sure the equivalent C,
"((int*)0)+1", has undefined behavior.

-Eli
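
The cycle described above can be illustrated with a toy model. This is only a sketch under stated assumptions: Form, distribute, factor and countRewrites are hypothetical names, the expression is reduced to two possible shapes, and none of this is instcombine's actual code. It shows why a "run until nothing changes" driver never stops when two rewrite rules undo each other.

```cpp
#include <cstddef>

// Two shapes of the same value; hypothetical stand-ins for the IR forms:
//   Factored:    (x + 2) * k
//   Distributed: x * k + 2 * k
enum class Form { Factored, Distributed };

// Rule A: distribute the multiply over the add. Returns true if it fired.
bool distribute(Form &f) {
  if (f != Form::Factored) return false;
  f = Form::Distributed;
  return true;
}

// Rule B: W*X + Y*Z --> W*(X+Z) iff W == Y. Returns true if it fired.
bool factor(Form &f) {
  if (f != Form::Distributed) return false;
  f = Form::Factored;
  return true;
}

// Apply rules until none fires or `limit` rewrites have happened.
// Returns the number of rewrites performed.
std::size_t countRewrites(Form start, bool enableFactoring, std::size_t limit) {
  Form f = start;
  std::size_t rewrites = 0;
  while (rewrites < limit) {
    if (distribute(f) || (enableFactoring && factor(f)))
      ++rewrites;
    else
      break;  // fixed point reached: no rule applies
  }
  return rewrites;
}
```

With only the distribute rule enabled, the driver stops after a single rewrite. With both rules enabled it keeps flipping between the two forms and only stops because of the iteration limit, which the real pass in this thread evidently did not have for this case.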
Anton Korobeynikov
2008-May-17 21:31 UTC
[LLVMdev] More info, was Help needed after hiatus
Hello, Eli

> BTW, It's usually better to file a bug for this sort of thing.

I already filed a PR for this stuff. In any case instcombine shouldn't
cycle :)

--
With best regards, Anton Korobeynikov.
Faculty of Mathematics & Mechanics, Saint Petersburg State University.