Hi,
I've restarted my Elsa/LLVM project after three months of having real 
life intrude. I upgraded my LLVM source to the current trunk. I had to 
make a few changes to my source, e.g. LLVMFoldingBuilder became 
IRBuilder and several instances of "new" became "Create".
Now, a test case that previously succeeded fails. I run the following 
script:
#!/bin/sh
if [ 1 -ne 0 ] ; then
   echo -n "ellsif:"
   /usr/bin/time -f "real %e user %U sys %S" ../elsa/ellsif/ellsif -v 
-O0 -g -o ellbzip2 -D_FILE_OFFSET_BITS=64 $* bzip2.c crctable.c 
randtable.c compress.c blocksort.c huffman.c decompress.c bzlib.c
   if [ $? -ne 0 ] ; then exit $?; fi
   echo -n "1:"; /usr/bin/time -f "real %e user %U sys %S"
./ellbzip2 -1
  < sample1.ref > sample1.rb2
   echo -n "2:"; /usr/bin/time -f "real %e user %U sys %S"
./ellbzip2 -2
  < sample2.ref > sample2.rb2
   echo -n "3:"; /usr/bin/time -f "real %e user %U sys %S"
./ellbzip2 -3
  < sample3.ref > sample3.rb2
   echo -n "4:"; /usr/bin/time -f "real %e user %U sys %S"
./ellbzip2 -d
  < sample1.bz2 > sample1.tst
   echo -n "5:"; /usr/bin/time -f "real %e user %U sys %S"
./ellbzip2 -d
  < sample2.bz2 > sample2.tst
   echo -n "6:"; /usr/bin/time -f "real %e user %U sys %S"
./ellbzip2
-ds < sample3.bz2 > sample3.tst
   cmp sample1.bz2 sample1.rb2
   cmp sample2.bz2 sample2.rb2
   cmp sample3.bz2 sample3.rb2
   cmp sample1.tst sample1.ref
   cmp sample2.tst sample2.ref
   cmp sample3.tst sample3.ref
fi
At the -O0 optimization level, it works just fine:
~/bzip2-1.0.4] main% ./ellsamake
ellsif:<premain>: CommandLine Error: Argument 'machine-licm'
defined
more than once!
ellsif: CommandLine Error: Argument 'machine-licm' defined more than
once!
   adding bzip2.c as a C file
   adding crctable.c as a C file
   adding randtable.c as a C file
   adding compress.c as a C file
   adding blocksort.c as a C file
   adding huffman.c as a C file
   adding decompress.c as a C file
   adding bzlib.c as a C file
Phase: Preprocessing
   preprocess bzip2.c to become a preprocessed C file
     /usr/bin/gcc -E -o bzip2.i bzip2.c
   preprocess crctable.c to become a preprocessed C file
     /usr/bin/gcc -E -o crctable.i crctable.c
   preprocess randtable.c to become a preprocessed C file
     /usr/bin/gcc -E -o randtable.i randtable.c
   preprocess compress.c to become a preprocessed C file
     /usr/bin/gcc -E -o compress.i compress.c
   preprocess blocksort.c to become a preprocessed C file
     /usr/bin/gcc -E -o blocksort.i blocksort.c
   preprocess huffman.c to become a preprocessed C file
     /usr/bin/gcc -E -o huffman.i huffman.c
   preprocess decompress.c to become a preprocessed C file
     /usr/bin/gcc -E -o decompress.i decompress.c
   preprocess bzlib.c to become a preprocessed C file
     /usr/bin/gcc -E -o bzlib.i bzlib.c
Phase: Translation
   compile bzip2.i to become an unoptimized LLVM bitcode file
   bzip2.i has been deleted
   compile crctable.i to become an unoptimized LLVM bitcode file
   crctable.i has been deleted
   compile randtable.i to become an unoptimized LLVM bitcode file
   randtable.i has been deleted
   compile compress.i to become an unoptimized LLVM bitcode file
   compress.i has been deleted
   compile blocksort.i to become an unoptimized LLVM bitcode file
   blocksort.i has been deleted
   compile huffman.i to become an unoptimized LLVM bitcode file
   huffman.i has been deleted
   compile decompress.i to become an unoptimized LLVM bitcode file
   decompress.i has been deleted
   compile bzlib.i to become an unoptimized LLVM bitcode file
   bzlib.i has been deleted
Phase: Optimization
   optimize bzip2.ubc to become an LLVM bitcode file
   optimize crctable.ubc to become an LLVM bitcode file
   optimize randtable.ubc to become an LLVM bitcode file
   optimize compress.ubc to become an LLVM bitcode file
   optimize blocksort.ubc to become an LLVM bitcode file
   optimize huffman.ubc to become an LLVM bitcode file
   optimize decompress.ubc to become an LLVM bitcode file
   optimize bzlib.ubc to become an LLVM bitcode file
Phase: Bitcode linking
   bclink bzip2.bc to become a file that has been linked
   bclink crctable.bc to become a file that has been linked
   bclink randtable.bc to become a file that has been linked
   bclink compress.bc to become a file that has been linked
   bclink blocksort.bc to become a file that has been linked
   bclink huffman.bc to become a file that has been linked
   bclink decompress.bc to become a file that has been linked
   bclink bzlib.bc to become a file that has been linked
   bzip2.bc was consumed by the bitcode linker
   crctable.bc was consumed by the bitcode linker
   randtable.bc was consumed by the bitcode linker
   compress.bc was consumed by the bitcode linker
   blocksort.bc was consumed by the bitcode linker
   huffman.bc was consumed by the bitcode linker
   decompress.bc was consumed by the bitcode linker
   bzlib.bc was consumed by the bitcode linker
   bclink ellbzip2.bc added to the file list
Phase: Generating
   generate ellbzip2.bc to become an assembly source file
Phase: Linking
   assemble ellbzip2.s to become a file that has been linked
     /usr/bin/gcc -fno-strict-aliasing -O3 -o ellbzip2 ellbzip2.s
   ellbzip2.s has been deleted
real 5.72 user 5.50 sys 0.13
1:real 0.03 user 0.03 sys 0.00
2:real 0.07 user 0.07 sys 0.00
3:real 0.17 user 0.16 sys 0.00
4:real 0.00 user 0.00 sys 0.00
5:real 0.02 user 0.01 sys 0.00
6:real 0.01 user 0.01 sys 0.00
[~/bzip2-1.0.4] main%
If I use a higher optimization level, the compilation fails (e.g. -O5):
[~/bzip2-1.0.4] main% ./ellsamake
ellsif:<premain>: CommandLine Error: Argument 'machine-licm'
defined
more than once!
ellsif: CommandLine Error: Argument 'machine-licm' defined more than
once!
   adding bzip2.c as a C file
   adding crctable.c as a C file
   adding randtable.c as a C file
   adding compress.c as a C file
   adding blocksort.c as a C file
   adding huffman.c as a C file
   adding decompress.c as a C file
   adding bzlib.c as a C file
Phase: Preprocessing
   preprocess bzip2.c to become a preprocessed C file
     /usr/bin/gcc -E -o bzip2.i bzip2.c
   preprocess crctable.c to become a preprocessed C file
     /usr/bin/gcc -E -o crctable.i crctable.c
   preprocess randtable.c to become a preprocessed C file
     /usr/bin/gcc -E -o randtable.i randtable.c
   preprocess compress.c to become a preprocessed C file
     /usr/bin/gcc -E -o compress.i compress.c
   preprocess blocksort.c to become a preprocessed C file
     /usr/bin/gcc -E -o blocksort.i blocksort.c
   preprocess huffman.c to become a preprocessed C file
     /usr/bin/gcc -E -o huffman.i huffman.c
   preprocess decompress.c to become a preprocessed C file
     /usr/bin/gcc -E -o decompress.i decompress.c
   preprocess bzlib.c to become a preprocessed C file
     /usr/bin/gcc -E -o bzlib.i bzlib.c
Phase: Translation
   compile bzip2.i to become an unoptimized LLVM bitcode file
   bzip2.i has been deleted
   compile crctable.i to become an unoptimized LLVM bitcode file
   crctable.i has been deleted
   compile randtable.i to become an unoptimized LLVM bitcode file
   randtable.i has been deleted
   compile compress.i to become an unoptimized LLVM bitcode file
   compress.i has been deleted
   compile blocksort.i to become an unoptimized LLVM bitcode file
   blocksort.i has been deleted
   compile huffman.i to become an unoptimized LLVM bitcode file
   huffman.i has been deleted
   compile decompress.i to become an unoptimized LLVM bitcode file
   decompress.i has been deleted
   compile bzlib.i to become an unoptimized LLVM bitcode file
   bzlib.i has been deleted
Phase: Optimization
   optimize bzip2.ubc to become an LLVM bitcode file
ellsif: /home/rich/llvm-trunk-new/lib/VMCore/Value.cpp:63: virtual 
llvm::Value::~Value(): Assertion `use_empty() && "Uses remain when
a
value is destroyed!"' failed.
../elsa/ellsif/ellsif[0x899d61e]
../elsa/ellsif/ellsif[0x899d750]
[0x110420]
/lib/libc.so.6(abort+0x101)[0xa62f91]
/lib/libc.so.6(__assert_fail+0xee)[0xa5a93e]
../elsa/ellsif/ellsif[0x894eee4]
../elsa/ellsif/ellsif[0x821e061]
../elsa/ellsif/ellsif[0x890c215]
../elsa/ellsif/ellsif[0x890c863]
../elsa/ellsif/ellsif[0x891c54f]
../elsa/ellsif/ellsif[0x81b3b73]
../elsa/ellsif/ellsif[0x88004fd]
../elsa/ellsif/ellsif[0x87ff0da]
../elsa/ellsif/ellsif[0x87f9c27]
../elsa/ellsif/ellsif[0x87f9e9d]
../elsa/ellsif/ellsif[0x892ef3f]
../elsa/ellsif/ellsif[0x885d5f5]
../elsa/ellsif/ellsif[0x892ebc6]
../elsa/ellsif/ellsif[0x892ed7e]
../elsa/ellsif/ellsif[0x892edd1]
../elsa/ellsif/ellsif[0x80542ea]
../elsa/ellsif/ellsif[0x80578e9]
/lib/libc.so.6(__libc_start_main+0xe0)[0xa4e390]
../elsa/ellsif/ellsif[0x804d7a1]
Command terminated by signal 6
real 2.70 user 2.54 sys 0.12
[~/bzip2-1.0.4] main%
Strangely enough, I get the same results for -O2 .. -O4 but -O1 fails by 
getting stuck in the bitcode linker and consuming memory.
I assume I am creating some malformed bitcode and I'll track it down by 
elimination, but I was hoping that someone might have some insight that 
would help me narrow down my search.
-Rich
Richard Pennington
2008-May-17  18:34 UTC
[LLVMdev] More info, was Help needed after hiatus
Hi,
I know my last question was very vague (i.e. "It stopped working, what 
went wrong?"), so here is a little more concrete example:
If I run the optimizer (opt) on this code snippet with -std-compile-opts 
the optimizer hangs.
; ModuleID = 'test.ubc'
target datalayout = 
"e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-s0:0:64-f80:32:32"
target triple = "i686-pc-linux-gnu"
declare void @BZALLOC(i32)
define void @f(i32) {
entry:
         %blockSize100k = alloca i32             ; <i32*> [#uses=2]
         store i32 %0, i32* %blockSize100k
         %n = alloca i32         ; <i32*> [#uses=2]
         load i32* %blockSize100k                ; <i32>:1 [#uses=1]
         store i32 %1, i32* %n
         load i32* %n            ; <i32>:2 [#uses=1]
         add i32 %2, 2           ; <i32>:3 [#uses=1]
         mul i32 %3, ptrtoint (i32* getelementptr (i32* null, i32 1) to 
i32)             ; <i32>:4 [#uses=1]
         call void @BZALLOC( i32 %4 )
         br label %return
return:         ; preds = %entry
         ret void
}
This is generated from this test program:
extern void BZALLOC(int s);
void f(int blockSize100k)
{
     int n = blockSize100k;
     BZALLOC( (n+2) * sizeof(unsigned int) );
}
Besides the optimizer hanging, the strange thing about this is that it 
doesn't hang if the blockSize100k variable is a local rather than a 
parameter or if the n+2 is changed to n+1 (!?!).
Is this code intrinsically incorrect, especially the getelementptr for 
sizeof(), or should I look at the optimizer? It seems to be hanging in 
the createInstructionCombiningPass.
-Rich
On Sat, May 17, 2008 at 11:34 AM, Richard Pennington <rich at pennware.com> wrote:> If I run the optimizer (opt) on this code snippet with -std-compile-opts > the optimizer hangs. > > > ; ModuleID = 'test.ubc' > target datalayout > "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-s0:0:64-f80:32:32" > target triple = "i686-pc-linux-gnu" > > declare void @BZALLOC(i32) > > define void @f(i32) { > entry: > %blockSize100k = alloca i32 ; <i32*> [#uses=2] > store i32 %0, i32* %blockSize100k > %n = alloca i32 ; <i32*> [#uses=2] > load i32* %blockSize100k ; <i32>:1 [#uses=1] > store i32 %1, i32* %n > load i32* %n ; <i32>:2 [#uses=1] > add i32 %2, 2 ; <i32>:3 [#uses=1] > mul i32 %3, ptrtoint (i32* getelementptr (i32* null, i32 1) to > i32) ; <i32>:4 [#uses=1] > call void @BZALLOC( i32 %4 ) > br label %return > > return: ; preds = %entry > ret void > }BTW, It's usually better to file a bug for this sort of thing. The issue is around InstructionCombining:2507: // W*X + Y*Z --> W * (X+Z) iff W == Y if (I.getType()->isIntOrIntVector()) { Value *W, *X, *Y, *Z; if (match(LHS, m_Mul(m_Value(W), m_Value(X))) && match(RHS, m_Mul(m_Value(Y), m_Value(Z)))) { The issue starts with the lines: add i32 %2, 2 ; <i32>:3 [#uses=1] mul i32 %3, ptrtoint (i32* getelementptr (i32* null, i32 1) to i32) ; <i32>:4 [#uses=1] Roughly, the multiplication gets distributed, resulting in something like (loaded value) * (ptrtointexpr) + 2 * (ptrtointexpr). This gets matched by the match, which then reverses the transformation. This, of course, gets matched by the code to distribute the multiply, resulting in a never-ending cycle. A side note: I know I've seen suggestions that "ptrtoint (i32* getelementptr (i32* null, i32 1) to i32)" is a suitable replacement for sizeof, but if it is supposed to be legal, the documentation for getelementptr should make that clear. I'm pretty sure the equivalent C, "((int*)0)+1", has undefined behavior. -Eli
Anton Korobeynikov
2008-May-17  21:31 UTC
[LLVMdev] More info, was Help needed after hiatus
Hello, Eli> BTW, It's usually better to file a bug for this sort of thing.I already filled a PR for this stuff. In any case instcombine shouldn't cycle :) -- With best regards, Anton Korobeynikov. Faculty of Mathematics & Mechanics, Saint Petersburg State University.