Hi Soham, You're right that in LLVM IR arithmetic (with the current definition of `undef`) is not distributive. You can't replace `A * (B + C)` with `A * B + A * C` in general, since (exactly as you said) for A `undef`, B = `1`, C = `-1` the former always computes `0` while the latter computes `undef`. This is fundamentally because replacing `A * (B + C)` with `A * B + A * C` increases the number of uses of `A` (which is `undef`), and thus injects more behavior into the program. I _think_ going from `A * B + A * C` to `A * (B + C)` is okay though. Here is a simpler example of a similar issue: `X - X` is not equivalent `0`, in that `0` cannot be replaced with `X - X`, even though `X - X` can be folded to `0`. -- Sanjoy
Sanjoy Das wrote: > Here is a simpler example of a similar issue: `X - X` is not > equivalent `0`, in that `0` cannot be replaced with `X - X`, even > though `X - X` can be folded to `0`. I forgot to state that `X` above is some unknown value which can potentially be `undef`. A full example would be: ``` define i32 @f_0(i32 %x) { ret i32 0 } define i32 @f_1(i32 %x) { %v = sub i32 %x, %x ret i32 %v } ``` `@f_1` can be optimized to `@f_0` but not vice versa, unless you somehow know that `%x` cannot be `undef`. -- Sanjoy
Thanks for your answers. Another example of unsound transformation on Boolean algebra. According to the LLVM documentation ( http://llvm.org/docs/LangRef.html#undefined-values) it is unsafe to consider ' a & undef = undef ' and ' a | undef = undef ' but 'undef xor undef = undef' is safe. Now, given an expression ((a & (~b)) | ((~a) & b)) where a and b are undef expressions, the value must be specialized to some constant (e.g. 0). But, if the expression is transformed to $a xor b$ then it may return 'undef'. As a result, the transformation ' ((a & (~b)) |((~a) & b)) ~> xor a b ' is unsound. LLVM presently performs this transformation. Best Regards, soham source ---------- int foo(){ int a,b; // unintialized and 'undef' return ((a & (~b)) |((~a) & b)); } unoptimized IR // clang++ -emit-llvm udf.cpp -S ---------------------- define i32 @_Z3foov() #0 { entry: %a = alloca i32, align 4 %b = alloca i32, align 4 %0 = load i32, i32* %a, align 4 %1 = load i32, i32* %b, align 4 %neg = xor i32 %1, -1 %and = and i32 %0, %neg %2 = load i32, i32* %a, align 4 %neg1 = xor i32 %2, -1 %3 = load i32, i32* %b, align 4 %and2 = and i32 %neg1, %3 %or = or i32 %and, %and2 ret i32 %or } Here %or must return some constant e.g. 0. optimized IR // opt -O3 udf.ll -S ------------------- define i32 @_Z3foov() #0 { entry: ret i32 undef } On Tue, Sep 13, 2016 at 7:39 AM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:> Sanjoy Das wrote: > > Here is a simpler example of a similar issue: `X - X` is not > > equivalent `0`, in that `0` cannot be replaced with `X - X`, even > > though `X - X` can be folded to `0`. > > I forgot to state that `X` above is some unknown value which can > potentially be `undef`. A full example would be: > > ``` > define i32 @f_0(i32 %x) { > ret i32 0 > } > > define i32 @f_1(i32 %x) { > %v = sub i32 %x, %x > ret i32 %v > } > ``` > > `@f_1` can be optimized to `@f_0` but not vice versa, unless you > somehow know that `%x` cannot be `undef`. > > -- Sanjoy >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160913/bb2331bc/attachment.html>
Thanks for your answers. Another example of unsound transformation on Boolean algebra. According to the LLVM documentation (http://llvm.org/docs/LangRef.html#undefined-values) it is unsafe to consider ' a & undef = undef ' and ' a | undef = undef ' but 'undef xor undef = undef' is safe. Now, given an expression ((a & (~b)) | ((~a) & b)) where a and b are undef expressions, the value must be specialized to some constant (e.g. 0). But, if the expression is transformed to $a xor b$ then it may return 'undef'. As a result, the transformation ' ((a & (~b)) |((~a) & b)) ~> xor a b ' is unsound. LLVM presently performs this transformation. Best Regards, soham source ---------- int foo(){ int a,b; // unintialized and 'undef' return ((a & (~b)) |((~a) & b)); } unoptimized IR // clang++ -emit-llvm udf.cpp -S ---------------------- define i32 @_Z3foov() #0 { entry: %a = alloca i32, align 4 %b = alloca i32, align 4 %0 = load i32, i32* %a, align 4 %1 = load i32, i32* %b, align 4 %neg = xor i32 %1, -1 %and = and i32 %0, %neg %2 = load i32, i32* %a, align 4 %neg1 = xor i32 %2, -1 %3 = load i32, i32* %b, align 4 %and2 = and i32 %neg1, %3 %or = or i32 %and, %and2 ret i32 %or } Here %or must return some constant e.g. 0. optimized IR // opt -O3 udf.ll -S ------------------- define i32 @_Z3foov() #0 { entry: ret i32 undef }> Sanjoy Das wrote: > > Here is a simpler example of a similar issue: `X - X` is not > > equivalent `0`, in that `0` cannot be replaced with `X - X`, even > > though `X - X` can be folded to `0`. > > I forgot to state that `X` above is some unknown value which can > potentially be `undef`. A full example would be: > > ``` > define i32 @f_0(i32 %x) { > ret i32 0 > } > > define i32 @f_1(i32 %x) { > %v = sub i32 %x, %x > ret i32 %v > } > ``` > > `@f_1` can be optimized to `@f_0` but not vice versa, unless you > somehow know that `%x` cannot be `undef`. > > -- Sanjoy >