Hi,

I've restarted my Elsa/LLVM project after three months of having real life intrude. I upgraded my LLVM source to the current trunk. I had to make a few changes to my source, e.g. LLVMFoldingBuilder became IRBuilder and several instances of "new" became "Create".

Now a test case that previously succeeded fails. I run the following script:

#!/bin/sh
if [ 1 -ne 0 ] ; then
  echo -n "ellsif:"
  /usr/bin/time -f "real %e user %U sys %S" ../elsa/ellsif/ellsif -v -O0 -g -o ellbzip2 -D_FILE_OFFSET_BITS=64 $* \
      bzip2.c crctable.c randtable.c compress.c blocksort.c huffman.c decompress.c bzlib.c
  # Save the status first: inside the "then" branch, $? would be the status of the test itself.
  status=$?
  if [ $status -ne 0 ] ; then exit $status; fi
  echo -n "1:"; /usr/bin/time -f "real %e user %U sys %S" ./ellbzip2 -1 < sample1.ref > sample1.rb2
  echo -n "2:"; /usr/bin/time -f "real %e user %U sys %S" ./ellbzip2 -2 < sample2.ref > sample2.rb2
  echo -n "3:"; /usr/bin/time -f "real %e user %U sys %S" ./ellbzip2 -3 < sample3.ref > sample3.rb2
  echo -n "4:"; /usr/bin/time -f "real %e user %U sys %S" ./ellbzip2 -d < sample1.bz2 > sample1.tst
  echo -n "5:"; /usr/bin/time -f "real %e user %U sys %S" ./ellbzip2 -d < sample2.bz2 > sample2.tst
  echo -n "6:"; /usr/bin/time -f "real %e user %U sys %S" ./ellbzip2 -ds < sample3.bz2 > sample3.tst
  cmp sample1.bz2 sample1.rb2
  cmp sample2.bz2 sample2.rb2
  cmp sample3.bz2 sample3.rb2
  cmp sample1.tst sample1.ref
  cmp sample2.tst sample2.ref
  cmp sample3.tst sample3.ref
fi

At the -O0 optimization level, it works just fine:

[~/bzip2-1.0.4] main% ./ellsamake
ellsif:<premain>: CommandLine Error: Argument 'machine-licm' defined more than once!
ellsif: CommandLine Error: Argument 'machine-licm' defined more than once!
adding bzip2.c as a C file
adding crctable.c as a C file
adding randtable.c as a C file
adding compress.c as a C file
adding blocksort.c as a C file
adding huffman.c as a C file
adding decompress.c as a C file
adding bzlib.c as a C file
Phase: Preprocessing
preprocess bzip2.c to become a preprocessed C file
/usr/bin/gcc -E -o bzip2.i bzip2.c
preprocess crctable.c to become a preprocessed C file
/usr/bin/gcc -E -o crctable.i crctable.c
preprocess randtable.c to become a preprocessed C file
/usr/bin/gcc -E -o randtable.i randtable.c
preprocess compress.c to become a preprocessed C file
/usr/bin/gcc -E -o compress.i compress.c
preprocess blocksort.c to become a preprocessed C file
/usr/bin/gcc -E -o blocksort.i blocksort.c
preprocess huffman.c to become a preprocessed C file
/usr/bin/gcc -E -o huffman.i huffman.c
preprocess decompress.c to become a preprocessed C file
/usr/bin/gcc -E -o decompress.i decompress.c
preprocess bzlib.c to become a preprocessed C file
/usr/bin/gcc -E -o bzlib.i bzlib.c
Phase: Translation
compile bzip2.i to become an unoptimized LLVM bitcode file
bzip2.i has been deleted
compile crctable.i to become an unoptimized LLVM bitcode file
crctable.i has been deleted
compile randtable.i to become an unoptimized LLVM bitcode file
randtable.i has been deleted
compile compress.i to become an unoptimized LLVM bitcode file
compress.i has been deleted
compile blocksort.i to become an unoptimized LLVM bitcode file
blocksort.i has been deleted
compile huffman.i to become an unoptimized LLVM bitcode file
huffman.i has been deleted
compile decompress.i to become an unoptimized LLVM bitcode file
decompress.i has been deleted
compile bzlib.i to become an unoptimized LLVM bitcode file
bzlib.i has been deleted
Phase: Optimization
optimize bzip2.ubc to become an LLVM bitcode file
optimize crctable.ubc to become an LLVM bitcode file
optimize randtable.ubc to become an LLVM bitcode file
optimize compress.ubc to become an LLVM bitcode file
optimize blocksort.ubc to become an LLVM bitcode file
optimize huffman.ubc to become an LLVM bitcode file
optimize decompress.ubc to become an LLVM bitcode file
optimize bzlib.ubc to become an LLVM bitcode file
Phase: Bitcode linking
bclink bzip2.bc to become a file that has been linked
bclink crctable.bc to become a file that has been linked
bclink randtable.bc to become a file that has been linked
bclink compress.bc to become a file that has been linked
bclink blocksort.bc to become a file that has been linked
bclink huffman.bc to become a file that has been linked
bclink decompress.bc to become a file that has been linked
bclink bzlib.bc to become a file that has been linked
bzip2.bc was consumed by the bitcode linker
crctable.bc was consumed by the bitcode linker
randtable.bc was consumed by the bitcode linker
compress.bc was consumed by the bitcode linker
blocksort.bc was consumed by the bitcode linker
huffman.bc was consumed by the bitcode linker
decompress.bc was consumed by the bitcode linker
bzlib.bc was consumed by the bitcode linker
bclink ellbzip2.bc added to the file list
Phase: Generating
generate ellbzip2.bc to become an assembly source file
Phase: Linking
assemble ellbzip2.s to become a file that has been linked
/usr/bin/gcc -fno-strict-aliasing -O3 -o ellbzip2 ellbzip2.s
ellbzip2.s has been deleted
real 5.72 user 5.50 sys 0.13
1:real 0.03 user 0.03 sys 0.00
2:real 0.07 user 0.07 sys 0.00
3:real 0.17 user 0.16 sys 0.00
4:real 0.00 user 0.00 sys 0.00
5:real 0.02 user 0.01 sys 0.00
6:real 0.01 user 0.01 sys 0.00
[~/bzip2-1.0.4] main%

If I use a higher optimization level, the compilation fails (e.g. -O5):

[~/bzip2-1.0.4] main% ./ellsamake
ellsif:<premain>: CommandLine Error: Argument 'machine-licm' defined more than once!
ellsif: CommandLine Error: Argument 'machine-licm' defined more than once!
adding bzip2.c as a C file
adding crctable.c as a C file
adding randtable.c as a C file
adding compress.c as a C file
adding blocksort.c as a C file
adding huffman.c as a C file
adding decompress.c as a C file
adding bzlib.c as a C file
Phase: Preprocessing
preprocess bzip2.c to become a preprocessed C file
/usr/bin/gcc -E -o bzip2.i bzip2.c
preprocess crctable.c to become a preprocessed C file
/usr/bin/gcc -E -o crctable.i crctable.c
preprocess randtable.c to become a preprocessed C file
/usr/bin/gcc -E -o randtable.i randtable.c
preprocess compress.c to become a preprocessed C file
/usr/bin/gcc -E -o compress.i compress.c
preprocess blocksort.c to become a preprocessed C file
/usr/bin/gcc -E -o blocksort.i blocksort.c
preprocess huffman.c to become a preprocessed C file
/usr/bin/gcc -E -o huffman.i huffman.c
preprocess decompress.c to become a preprocessed C file
/usr/bin/gcc -E -o decompress.i decompress.c
preprocess bzlib.c to become a preprocessed C file
/usr/bin/gcc -E -o bzlib.i bzlib.c
Phase: Translation
compile bzip2.i to become an unoptimized LLVM bitcode file
bzip2.i has been deleted
compile crctable.i to become an unoptimized LLVM bitcode file
crctable.i has been deleted
compile randtable.i to become an unoptimized LLVM bitcode file
randtable.i has been deleted
compile compress.i to become an unoptimized LLVM bitcode file
compress.i has been deleted
compile blocksort.i to become an unoptimized LLVM bitcode file
blocksort.i has been deleted
compile huffman.i to become an unoptimized LLVM bitcode file
huffman.i has been deleted
compile decompress.i to become an unoptimized LLVM bitcode file
decompress.i has been deleted
compile bzlib.i to become an unoptimized LLVM bitcode file
bzlib.i has been deleted
Phase: Optimization
optimize bzip2.ubc to become an LLVM bitcode file
ellsif: /home/rich/llvm-trunk-new/lib/VMCore/Value.cpp:63: virtual llvm::Value::~Value(): Assertion `use_empty() && "Uses remain when a value is destroyed!"' failed.
../elsa/ellsif/ellsif[0x899d61e]
../elsa/ellsif/ellsif[0x899d750]
[0x110420]
/lib/libc.so.6(abort+0x101)[0xa62f91]
/lib/libc.so.6(__assert_fail+0xee)[0xa5a93e]
../elsa/ellsif/ellsif[0x894eee4]
../elsa/ellsif/ellsif[0x821e061]
../elsa/ellsif/ellsif[0x890c215]
../elsa/ellsif/ellsif[0x890c863]
../elsa/ellsif/ellsif[0x891c54f]
../elsa/ellsif/ellsif[0x81b3b73]
../elsa/ellsif/ellsif[0x88004fd]
../elsa/ellsif/ellsif[0x87ff0da]
../elsa/ellsif/ellsif[0x87f9c27]
../elsa/ellsif/ellsif[0x87f9e9d]
../elsa/ellsif/ellsif[0x892ef3f]
../elsa/ellsif/ellsif[0x885d5f5]
../elsa/ellsif/ellsif[0x892ebc6]
../elsa/ellsif/ellsif[0x892ed7e]
../elsa/ellsif/ellsif[0x892edd1]
../elsa/ellsif/ellsif[0x80542ea]
../elsa/ellsif/ellsif[0x80578e9]
/lib/libc.so.6(__libc_start_main+0xe0)[0xa4e390]
../elsa/ellsif/ellsif[0x804d7a1]
Command terminated by signal 6
real 2.70 user 2.54 sys 0.12
[~/bzip2-1.0.4] main%

Strangely enough, I get the same results for -O2 .. -O4, but -O1 fails by getting stuck in the bitcode linker and consuming memory.

I assume I am creating some malformed bitcode and I'll track it down by elimination, but I was hoping that someone might have some insight that would help me narrow down my search.

-Rich
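The "track it down by elimination" plan can be sketched as a greedy reduction loop. This is only an illustration, not something from the original post: the items might be the functions of the failing module or the individual optimization passes, and still_fails is a hypothetical predicate that would rebuild the reduced input and report whether the assertion still fires. LLVM's bugpoint tool automates this kind of reduction.

```python
def minimize(items, still_fails):
    """Greedily drop items one at a time while the failure persists.

    `still_fails` is a hypothetical callback: it receives a candidate list
    and returns True if the failure still reproduces with only those items.
    """
    kept = list(items)
    i = 0
    while i < len(kept):
        candidate = kept[:i] + kept[i + 1:]
        if still_fails(candidate):
            kept = candidate  # item i was not needed to reproduce the failure
        else:
            i += 1            # item i is required; keep it and move on
    return kept


# Toy predicate: pretend the failure needs both "b" and "d" to be present.
print(minimize(list("abcde"), lambda xs: "b" in xs and "d" in xs))
```

Each call to the predicate corresponds to one rebuild, so the loop costs at most one run per item plus one per successful removal.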
Richard Pennington
2008-May-17 18:34 UTC
[LLVMdev] More info, was Help needed after hiatus
Hi,

I know my last question was very vague (i.e. "It stopped working, what went wrong?"), so here is a slightly more concrete example. If I run the optimizer (opt) on this code snippet with -std-compile-opts, the optimizer hangs.

; ModuleID = 'test.ubc'
target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-s0:0:64-f80:32:32"
target triple = "i686-pc-linux-gnu"

declare void @BZALLOC(i32)

define void @f(i32) {
entry:
        %blockSize100k = alloca i32             ; <i32*> [#uses=2]
        store i32 %0, i32* %blockSize100k
        %n = alloca i32         ; <i32*> [#uses=2]
        load i32* %blockSize100k                ; <i32>:1 [#uses=1]
        store i32 %1, i32* %n
        load i32* %n            ; <i32>:2 [#uses=1]
        add i32 %2, 2           ; <i32>:3 [#uses=1]
        mul i32 %3, ptrtoint (i32* getelementptr (i32* null, i32 1) to i32)             ; <i32>:4 [#uses=1]
        call void @BZALLOC( i32 %4 )
        br label %return

return:         ; preds = %entry
        ret void
}

This is generated from this test program:

extern void BZALLOC(int s);
void f(int blockSize100k)
{
    int n = blockSize100k;
    BZALLOC( (n+2) * sizeof(unsigned int) );
}

Besides the optimizer hanging, the strange thing about this is that it doesn't hang if the blockSize100k variable is a local rather than a parameter, or if the n+2 is changed to n+1 (!?!).

Is this code intrinsically incorrect, especially the getelementptr for sizeof(), or should I look at the optimizer? It seems to be hanging in the instruction combining pass (createInstructionCombiningPass).

-Rich
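For reference, here is a small sketch of the value the sizeof expression above is meant to stand for. It is an illustration under stated assumptions rather than anything taken from LLVM: the helper names are made up, and the 4-byte element size is read off the i32 type in the snippet. Indexing one element past a null i32 pointer and converting the result to an integer gives the element size, so the call argument should be (n + 2) * 4 in wrapping 32-bit arithmetic.

```python
I32_SIZE_BYTES = 4  # assumption: i32 is 32 bits wide, i.e. 4 bytes


def gep_from_null(elem_size, index):
    """Byte address of element `index` in an array that starts at address 0."""
    return 0 + index * elem_size


def bzalloc_arg(n, bits=32):
    """Value passed to BZALLOC by f(n), with i32 wrap-around applied."""
    sizeof_uint = gep_from_null(I32_SIZE_BYTES, 1)
    return ((n + 2) * sizeof_uint) % (1 << bits)


print(gep_from_null(I32_SIZE_BYTES, 1))
print(bzalloc_arg(1))
```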
On Sat, May 17, 2008 at 11:34 AM, Richard Pennington <rich at pennware.com> wrote:
> If I run the optimizer (opt) on this code snippet with -std-compile-opts
> the optimizer hangs.
>
>
> ; ModuleID = 'test.ubc'
> target datalayout =
> "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-s0:0:64-f80:32:32"
> target triple = "i686-pc-linux-gnu"
>
> declare void @BZALLOC(i32)
>
> define void @f(i32) {
> entry:
>         %blockSize100k = alloca i32             ; <i32*> [#uses=2]
>         store i32 %0, i32* %blockSize100k
>         %n = alloca i32         ; <i32*> [#uses=2]
>         load i32* %blockSize100k                ; <i32>:1 [#uses=1]
>         store i32 %1, i32* %n
>         load i32* %n            ; <i32>:2 [#uses=1]
>         add i32 %2, 2           ; <i32>:3 [#uses=1]
>         mul i32 %3, ptrtoint (i32* getelementptr (i32* null, i32 1) to
> i32)             ; <i32>:4 [#uses=1]
>         call void @BZALLOC( i32 %4 )
>         br label %return
>
> return:         ; preds = %entry
>         ret void
> }

BTW, it's usually better to file a bug for this sort of thing.

The issue is around InstructionCombining:2507:

  // W*X + Y*Z --> W * (X+Z) iff W == Y
  if (I.getType()->isIntOrIntVector()) {
    Value *W, *X, *Y, *Z;
    if (match(LHS, m_Mul(m_Value(W), m_Value(X))) &&
        match(RHS, m_Mul(m_Value(Y), m_Value(Z)))) {

The issue starts with the lines:

        add i32 %2, 2           ; <i32>:3 [#uses=1]
        mul i32 %3, ptrtoint (i32* getelementptr (i32* null, i32 1) to i32)             ; <i32>:4 [#uses=1]

Roughly, the multiplication gets distributed, resulting in something like (loaded value) * (ptrtointexpr) + 2 * (ptrtointexpr). This is matched by the pattern above, which then reverses the transformation. The result, of course, is matched again by the code that distributes the multiply, resulting in a never-ending cycle.

A side note: I know I've seen suggestions that "ptrtoint (i32* getelementptr (i32* null, i32 1) to i32)" is a suitable replacement for sizeof, but if it is supposed to be legal, the documentation for getelementptr should make that clear. I'm pretty sure the equivalent C, "((int*)0)+1", has undefined behavior.

-Eli
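The cycle described above can be reproduced with a toy rewriter. This is a hedged sketch, not LLVM's code: expressions are nested tuples, distribute and factor are simplified stand-ins for the two instcombine transformations (they only look at the root of the expression and only factor a common right-hand operand), and "SIZEOF" stands in for the ptrtoint constant expression.

```python
def distribute(expr):
    """(a + b) * c  ->  a*c + b*c, at the root of the expression only."""
    if isinstance(expr, tuple) and expr[0] == "mul":
        _, lhs, c = expr
        if isinstance(lhs, tuple) and lhs[0] == "add":
            _, a, b = lhs
            return ("add", ("mul", a, c), ("mul", b, c))
    return expr


def factor(expr):
    """a*c + b*c  ->  (a + b) * c, at the root of the expression only."""
    if isinstance(expr, tuple) and expr[0] == "add":
        _, lhs, rhs = expr
        if (isinstance(lhs, tuple) and lhs[0] == "mul"
                and isinstance(rhs, tuple) and rhs[0] == "mul"
                and lhs[2] == rhs[2]):
            return ("mul", ("add", lhs[1], rhs[1]), lhs[2])
    return expr


def run_to_fixpoint(expr, rules, max_steps=100):
    """Apply the first rule that changes expr until nothing changes.

    Returns (history, cycled); cycled is True if an expression repeats.
    """
    history = [expr]
    for _ in range(max_steps):
        for rule in rules:
            rewritten = rule(expr)
            if rewritten != expr:
                break
        else:
            return history, False  # no rule fired: a fixed point was reached
        expr = rewritten
        cycled = expr in history
        history.append(expr)
        if cycled:
            return history, True
    raise RuntimeError("step limit reached without a fixed point or a cycle")


start = ("mul", ("add", "n", 2), "SIZEOF")
history, cycled = run_to_fixpoint(start, [distribute, factor])
print(cycled)
for step in history:
    print(step)
```

With both rules enabled the history returns to the starting expression after two steps; a worklist-driven pass without the history check would keep going forever. With only one of the two rules the loop reaches a fixed point, which is why two individually reasonable transformations can still hang the pass when combined.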
Anton Korobeynikov
2008-May-17 21:31 UTC
[LLVMdev] More info, was Help needed after hiatus
Hello, Eli

> BTW, It's usually better to file a bug for this sort of thing.

I have already filed a PR for this. In any case, instcombine shouldn't cycle :)

--
With best regards, Anton Korobeynikov.
Faculty of Mathematics & Mechanics, Saint Petersburg State University.