Chandler Carruth via llvm-dev
2016-Feb-26 07:00 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
On Thu, Feb 25, 2016 at 10:40 PM Justin Bogner via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> Sanjoy Das via llvm-dev <llvm-dev at lists.llvm.org> writes:
> > On Thu, Feb 25, 2016 at 6:35 PM, Duncan P. N. Exon Smith
> > <dexonsmith at apple.com> wrote:
> >>> // In C
> >>> void foo() {
> >>>   int c;
> >>>   if (c) print("X");
> >>>   escape(&c); // escape is an empty function
> >>> }
> >>>
> >>> which I think is not UB in C (is it?), but will boil down to the kind
> >>> of IR above.
> >>
> >> I'm pretty sure the `if (c)` is UB because it's branching on an
> >> uninitialized value, which could have a trap representation.
> >
> > I am *way* out of my depth here, but what if 'c' was an 'unsigned
> > char' (and not an 'int')? Wouldn't that prevent UB, since it is
> > escaped (cannot be a register variable), and is an 'unsigned char'
> > (doesn't have a trap representation)?
>
> Nah, C's pretty explicit that using uninitialized locals is undefined
> behaviour, regardless of type. From C11 J.2:
>
>   The value of an object with automatic storage duration is used while
>   it is indeterminate (6.2.4, 6.7.9, 6.8).
>
> It's also stated in 6.2.4 that "the initial value of the object is
> indeterminate."

Reading an indeterminate value isn't UB in C. Even branching on it isn't.
Justin Bogner via llvm-dev
2016-Feb-26 08:33 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
Chandler Carruth <chandlerc at google.com> writes:
> On Thu, Feb 25, 2016 at 10:40 PM Justin Bogner via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>> I am *way* out of my depth here, but what if 'c' was an 'unsigned
>>> char' (and not an 'int')? Wouldn't that prevent UB, since it is
>>> escaped (cannot be a register variable), and is an 'unsigned char'
>>> (doesn't have a trap representation)?
>>
>> Nah, C's pretty explicit that using uninitialized locals is undefined
>> behaviour, regardless of type. From C11 J.2:
>>
>>   The value of an object with automatic storage duration is used while
>>   it is indeterminate (6.2.4, 6.7.9, 6.8).
>>
>> It's also stated in 6.2.4 that "the initial value of the object is
>> indeterminate."
>
> Reading an indeterminate value isn't UB in C. Even branching on it isn't.

Maybe not, but this is quite literally "[using] the value of an object
with automatic storage duration while it is indeterminate", unless
there's a very strange definition of use buried somewhere else in the
standard. This case is explicitly called out.
Sanjoy Das via llvm-dev
2016-Feb-26 17:03 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
Couple of other problematic transforms:
# undef refinement
After thinking about this a bit, I think undef refinement happens a
lot more often than I initially thought, and it happens implicitly.
Consider the following case:

  void @foo(int* ptr) available_externally {
    int k = *ptr;
    if (k == 1 && k == 2) print("X");
  }
  void main() {
    int* ptr = malloc();
    *ptr = 200;
    @foo(ptr);
  }

=>

  void @foo(int* ptr) readnone available_externally {
    //int k = *ptr;
    //if (false) print("X");
  }
  void main() {
    int* ptr = malloc();
    *ptr = 200;
    @foo(ptr);
  }

==>

  void @foo(int* ptr) readnone available_externally {
    //int k = *ptr;
    //if (false) print("X");
  }
  void main() {
    int* ptr = malloc();
    @foo(ptr); // since this is readnone
    *ptr = 200;
  }

This is a problem if we replace @foo with an unoptimized version
(`undef` can be both `== 1` and `== 2`): in such a case we can print
`"X"` while in the original program that wasn't possible.
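
To make the hazard concrete, here is a minimal C sketch of the
`k == 1 && k == 2` case. It is only a model, not how LLVM represents
`undef`: the names `undef_script`, `undef_use`, `foo_unoptimized_prints`
and `foo_optimized_prints` are invented for illustration, and the one
assumption is that each use of an `undef` value may observe a different
value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of `undef`: each use may observe a different value.
 * The scripted values stand in for choices an unoptimized build could end
 * up making; they are an assumption of this sketch. */
static const int undef_script[] = {1, 2};
static size_t undef_next = 0;

static int undef_use(void) {
    assert(undef_next < sizeof undef_script / sizeof undef_script[0]);
    return undef_script[undef_next++];
}

/* Unoptimized @foo: `k` is undef and each comparison is a separate use. */
static bool foo_unoptimized_prints(void) {
    return undef_use() == 1 && undef_use() == 2;
}

/* Optimized @foo: the condition was folded to `false`, which silently
 * assumes that both uses of `k` saw the same value. */
static bool foo_optimized_prints(void) {
    return false;
}
```

Under this model the optimized body can never print, while the
unoptimized body can, which is exactly the mismatch that matters once
the caller has been transformed on the strength of the optimized body.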
The problem here is that we're "implicitly" folding / constraining
`undef` when folding `(k == 1 && k == 2)` to `false`. In the case
where `k` is `undef` (the non-undef case is trivial), the fold is
justified by first constraining the two `undef`s to one value, and
then folding the comparison. And we do this kind of implicit `undef`
refinement all over the place -- the rewrite

  i = 0;
  N = ...;
  do {
  } while (i++ != N);

=>

  i = 0;
  N = ...;
  //do {
  //} while (i++ != N);
  i = N + 1;

makes the same refinement: **if** `N` is `undef`, then all of the
uses of `N` through the loop's iteration space are the same `N`. OTOH,
if we *knew* `N` was `undef`, we could have correctly replaced the
loop with `while (true)` (so the above loop is also problematic in
the same way as the first `k == 1 && k == 2` case).
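
The same point can be sketched in C for the loop. This is a model under
stated assumptions, not LLVM behaviour: `undef_N_at`,
`loop_exits_per_use_undef` and `loop_result_fixed_N` are hypothetical
names, `unsigned` is used to keep the arithmetic well defined, and the
iteration budget exists only so the sketch terminates.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical adversarial `undef`: the value seen by each evaluation of
 * the loop condition always differs from the current `i`. */
static unsigned undef_N_at(unsigned i) {
    return i + 1u;
}

/* Original loop with N re-chosen at every use. Returns true if the loop
 * exits within `budget` iterations. */
static bool loop_exits_per_use_undef(unsigned budget) {
    assert(budget > 0);
    unsigned i = 0;
    for (unsigned iter = 0; iter < budget; ++iter) {
        unsigned seen = i; /* the value compared by `i++ != N` */
        i++;
        if (seen == undef_N_at(seen)) {
            return true;
        }
    }
    return false; /* still looping when the budget ran out */
}

/* Original loop with N pinned to a single value, as the rewrite assumes. */
static unsigned loop_result_fixed_N(unsigned N) {
    unsigned i = 0;
    do {
    } while (i++ != N);
    return i;
}
```

With one fixed `N` the loop ends with `i == N + 1`, matching the
rewrite. With a per-use `N` the loop need not exit at all, which is the
`while (true)` behaviour described above.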
# dereferenceability
We cannot reorder `readnone nounwind available_externally` functions
across `@free` since there could have been:

  void @foo(int* ptr) available_externally {
    int unused = *ptr;
  }

where the load was optimized away in the current TU but is present in
the -O0 TU.
This is similar to the dead-arg-elimination transform Andy Ayers and I
mentioned earlier (passing in `undef` or `null` for `ptr` is also a
problem for similar reasons).
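
A small C model of the `@free` reordering, for illustration only. A real
use-after-free cannot be demonstrated safely, so the sketch tracks
liveness with a flag. `struct cell`, `checked_load`, `model_free`,
`foo_O0`, `foo_optimized` and `run` are all hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical one-cell heap: `live` says whether the cell may be loaded. */
struct cell {
    int value;
    bool live;
};

/* Models a load; reports a fault instead of invoking real undefined
 * behaviour. */
static bool checked_load(const struct cell *c, int *out) {
    if (!c->live) {
        return false;
    }
    *out = c->value;
    return true;
}

static void model_free(struct cell *c) {
    assert(c->live); /* a double free would be a bug in the model itself */
    c->live = false;
}

/* -O0 body of @foo: the "unused" load is still executed. */
static bool foo_O0(const struct cell *c) {
    int unused;
    return checked_load(c, &unused);
}

/* Optimized body: the dead load is gone, so nothing can fault. */
static bool foo_optimized(const struct cell *c) {
    (void)c;
    return true;
}

/* Calls `foo` before or after the free; returns true if no fault occurred. */
static bool run(bool (*foo)(const struct cell *), bool call_after_free) {
    struct cell c = {200, true};
    bool ok;
    if (call_after_free) {
        model_free(&c);
        ok = foo(&c);
    } else {
        ok = foo(&c);
        model_free(&c);
    }
    return ok;
}
```

In this model, moving the call after the free is harmless for the
optimized body but faults for the -O0 body, so the reordering is only
safe if the optimized body is the one that actually runs.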
-- Sanjoy