Grang, Mandeep Singh via llvm-dev
2018-Aug-09 18:52 UTC
[llvm-dev] Writing static analyzers to detect non-deterministic behavior?
Thanks for your response David. 1) I'm not sure it's do-able. I don't know of any nice way to track whether an ordered walk of an unordered container leaks out into the final output of the program. Only iterating over an unordered container is probably not a sufficient hint (it'd have a high false positive rate to warn on every instance of that) - and I don't have any great ideas about whether that false positive rate could be sufficiently reduced with maybe some heuristics about connecting one iteration to another container population, and to the final program output The idea I had in mind was to check the commutativity of each operation inside the iteration of the unordered container. I realize we may not be able to do this for every operation and even if we could that would still be a heuristic and prone to false positives. Maybe we can start with a simpler problem to tackle instead of the iteration order. That's why in the patch I pushed I wrote a very simple checker for std::sort with raw pointers. > 2) Maybe - but I would think that'd still end up using heuristics to guess at whether one iteration order impacts the order of another container, etc. Yes, maybe iteration order is more difficult to problem but we should be able to detect non-deterministic sorting order of keys with same values. The idea I had in mind is similar to what we do in llvm::sort (which is random shuffle of the container). We instrument the user code to sort a container twice - first is the user's normal std::sort, the second time we random shuffle the container and then sort again, and then check if there is a difference in the two outputs. The run time cost would be the cost of the shuffle + the extra sort + the cost of checking the two containers [O(n lg n) + 2 O(n)]. --Mandeep On 8/9/2018 11:13 AM, David Blaikie wrote:> 1) I'm not sure it's do-able. I don't know of any nice way to track > whether an ordered walk of an unordered container leaks out into the > final output of the program. Only iterating over an unordered > container is probably not a sufficient hint (it'd have a high false > positive rate to warn on every instance of that) - and I don't have > any great ideas about whether that false positive rate could be > sufficiently reduced with maybe some heuristics about connecting one > iteration to another container population, and to the final program > output... > > 2) Maybe - but I would think that'd still end up using heuristics to > guess at whether one iteration order impacts the order of another > container, etc. > > On Wed, Aug 8, 2018 at 7:43 PM Grang, Mandeep Singh via llvm-dev > <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: > > In the past, I had added the ability in LLVM to uncover 2 types of > non-deterministic behaviors: iteration of unordered containers with > pointer-like keys and sorting of elements with the same keys. > > Now, I wanted to add checkers for these (and other types of > non-deterministic behaviors) so that they could be applied more > widely. > I also realize that not all of these may be doable at compile-time. > > With that in mind, I have pushed a patch which I adds a new > category of > checks for non-determinism: https://reviews.llvm.org/D50488. > > The first checker which I have pushed here will simply check if raw > pointers are being sorted. In subsequent patches, I will fine tune > this > by handling more complex cases of sorting of pointer-like keys. > > I have another patch which checks for iteration of unordered > containers > but that may be very noisy in its current form. > > I would like comments/suggestions from the community on: > > 1. Are static analysis checks to detect non-determinism something > worth > doing/doable? > > 2. How about writing sanitizers to detect non-deterministic behavior? > Would that be too costly in terms of run time and instrumentation? > > --Mandeep > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180809/9a58f496/attachment.html>
David Blaikie via llvm-dev
2018-Aug-10 18:13 UTC
[llvm-dev] Writing static analyzers to detect non-deterministic behavior?
On Thu, Aug 9, 2018 at 11:52 AM Grang, Mandeep Singh <mgrang at codeaurora.org> wrote:> Thanks for your response David. > > 1) I'm not sure it's do-able. I don't know of any nice way to track > whether an ordered walk of an unordered container leaks out into the final > output of the program. Only iterating over an unordered container is > probably not a sufficient hint (it'd have a high false positive rate to > warn on every instance of that) - and I don't have any great ideas about > whether that false positive rate could be sufficiently reduced with maybe > some heuristics about connecting one iteration to another container > population, and to the final program output > > The idea I had in mind was to check the commutativity of each operation > inside the iteration of the unordered container. I realize we may not be > able to do this for every operation and even if we could that would still > be a heuristic and prone to false positives. Maybe we can start with a > simpler problem to tackle instead of the iteration order. That's why in the > patch I pushed I wrote a very simple checker for std::sort with raw > pointers. >You're referring to the llvm::sort work, that randomizes the elements due to the instability of std::sort? (or some other work on std::sort and raw pointers?) Changes like llvm::sort I don't see as "static/dynamic analysis" - but changing behavior & relying on the products own tests to validate the behavior is still correct. (unlike static or dynamic analysis which produces warnings/errors on the relevant codepaths).> > 2) Maybe - but I would think that'd still end up using heuristics to > guess at whether one iteration order impacts the order of another > container, etc. > > Yes, maybe iteration order is more difficult to problem but we should be > able to detect non-deterministic sorting order of keys with same values. > The idea I had in mind is similar to what we do in llvm::sort (which is > random shuffle of the container). We instrument the user code to sort a > container twice - first is the user's normal std::sort, the second time we > random shuffle the container and then sort again, and then check if there > is a difference in the two outputs. The run time cost would be the cost of > the shuffle + the extra sort + the cost of checking the two containers [O(n > lg n) + 2 O(n)]. >Though different results there doesn't necessarily represent a bug in the program - it's a question of whether that different order impacts the program behavior. Could experiment - but I'd guess that the false positive rate would be a bit high.. (that said, I'm a bit of a pessimist in general - so take my perspective with a grain of salt :) ) - Dave> --Mandeep > > On 8/9/2018 11:13 AM, David Blaikie wrote: > > 1) I'm not sure it's do-able. I don't know of any nice way to track > whether an ordered walk of an unordered container leaks out into the final > output of the program. Only iterating over an unordered container is > probably not a sufficient hint (it'd have a high false positive rate to > warn on every instance of that) - and I don't have any great ideas about > whether that false positive rate could be sufficiently reduced with maybe > some heuristics about connecting one iteration to another container > population, and to the final program output... > > 2) Maybe - but I would think that'd still end up using heuristics to guess > at whether one iteration order impacts the order of another container, etc. > > On Wed, Aug 8, 2018 at 7:43 PM Grang, Mandeep Singh via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> In the past, I had added the ability in LLVM to uncover 2 types of >> non-deterministic behaviors: iteration of unordered containers with >> pointer-like keys and sorting of elements with the same keys. >> >> Now, I wanted to add checkers for these (and other types of >> non-deterministic behaviors) so that they could be applied more widely. >> I also realize that not all of these may be doable at compile-time. >> >> With that in mind, I have pushed a patch which I adds a new category of >> checks for non-determinism: https://reviews.llvm.org/D50488. >> >> The first checker which I have pushed here will simply check if raw >> pointers are being sorted. In subsequent patches, I will fine tune this >> by handling more complex cases of sorting of pointer-like keys. >> >> I have another patch which checks for iteration of unordered containers >> but that may be very noisy in its current form. >> >> I would like comments/suggestions from the community on: >> >> 1. Are static analysis checks to detect non-determinism something worth >> doing/doable? >> >> 2. How about writing sanitizers to detect non-deterministic behavior? >> Would that be too costly in terms of run time and instrumentation? >> >> --Mandeep >> >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180810/fec2da14/attachment.html>
George Karpenkov via llvm-dev
2018-Aug-10 18:17 UTC
[llvm-dev] Writing static analyzers to detect non-deterministic behavior?
Iterating over unordered_map and unordered_set (or user-defined? llvm::DenseSet/DenseMap?) would seem like usual suspects for such checks.> I don’t know of any nice way to track whether an ordered walk of an unordered container leaks out into the final output of the programPrecisely, so if it’s not doable for a checker, it’s probably not reliably doable for an engineer as well. In my experience, most iterations over unordered containers tend to affect the final output, and if project owners really care about avoiding non-determinism it would IMO make sense to ban those.> On Aug 9, 2018, at 11:52 AM, Grang, Mandeep Singh via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > Thanks for your response David. > > 1) I'm not sure it's do-able. I don't know of any nice way to track whether an ordered walk of an unordered container leaks out into the final output of the program. Only iterating over an unordered container is probably not a sufficient hint (it'd have a high false positive rate to warn on every instance of that) - and I don't have any great ideas about whether that false positive rate could be sufficiently reduced with maybe some heuristics about connecting one iteration to another container population, and to the final program output > > The idea I had in mind was to check the commutativity of each operation inside the iteration of the unordered container. I realize we may not be able to do this for every operation and even if we could that would still be a heuristic and prone to false positives. Maybe we can start with a simpler problem to tackle instead of the iteration order. That's why in the patch I pushed I wrote a very simple checker for std::sort with raw pointers. > > > 2) Maybe - but I would think that'd still end up using heuristics to guess at whether one iteration order impacts the order of another container, etc. > > Yes, maybe iteration order is more difficult to problem but we should be able to detect non-deterministic sorting order of keys with same values. The idea I had in mind is similar to what we do in llvm::sort (which is random shuffle of the container). We instrument the user code to sort a container twice - first is the user's normal std::sort, the second time we random shuffle the container and then sort again, and then check if there is a difference in the two outputs. The run time cost would be the cost of the shuffle + the extra sort + the cost of checking the two containers [O(n lg n) + 2 O(n)]. > > --Mandeep > > > On 8/9/2018 11:13 AM, David Blaikie wrote: >> 1) I'm not sure it's do-able. I don't know of any nice way to track whether an ordered walk of an unordered container leaks out into the final output of the program. Only iterating over an unordered container is probably not a sufficient hint (it'd have a high false positive rate to warn on every instance of that) - and I don't have any great ideas about whether that false positive rate could be sufficiently reduced with maybe some heuristics about connecting one iteration to another container population, and to the final program output... >> >> 2) Maybe - but I would think that'd still end up using heuristics to guess at whether one iteration order impacts the order of another container, etc. >> >> On Wed, Aug 8, 2018 at 7:43 PM Grang, Mandeep Singh via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: >> In the past, I had added the ability in LLVM to uncover 2 types of >> non-deterministic behaviors: iteration of unordered containers with >> pointer-like keys and sorting of elements with the same keys. >> >> Now, I wanted to add checkers for these (and other types of >> non-deterministic behaviors) so that they could be applied more widely. >> I also realize that not all of these may be doable at compile-time. >> >> With that in mind, I have pushed a patch which I adds a new category of >> checks for non-determinism: https://reviews.llvm.org/D50488 <https://reviews.llvm.org/D50488>. >> >> The first checker which I have pushed here will simply check if raw >> pointers are being sorted. In subsequent patches, I will fine tune this >> by handling more complex cases of sorting of pointer-like keys. >> >> I have another patch which checks for iteration of unordered containers >> but that may be very noisy in its current form. >> >> I would like comments/suggestions from the community on: >> >> 1. Are static analysis checks to detect non-determinism something worth >> doing/doable? >> >> 2. How about writing sanitizers to detect non-deterministic behavior? >> Would that be too costly in terms of run time and instrumentation? >> >> --Mandeep >> >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev> > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180810/965cbaa0/attachment.html>
David Blaikie via llvm-dev
2018-Aug-10 18:20 UTC
[llvm-dev] Writing static analyzers to detect non-deterministic behavior?
On Fri, Aug 10, 2018 at 11:17 AM George Karpenkov <ekarpenkov at apple.com> wrote:> Iterating over unordered_map and unordered_set (or user-defined? > llvm::DenseSet/DenseMap?) would seem like usual suspects for such checks. > > > I don’t know of any nice way to track whether an ordered walk of an > unordered container leaks out into the final output of the program > > Precisely, so if it’s not doable for a checker, it’s probably not reliably > doable for an engineer as well. >I wouldn't think that was the case - you might iterate one hashed container and build another deep in other function calls - I've certainly debugged through many layers of that sort of thing when poking around in LLVM.> In my experience, most iterations over unordered containers tend to affect > the final output, >I'd be surprised if that was the case - though a bit hard to measure. I would guess LLVM has maybe O(hundreds) of iterations over hashed containers that don't effect the final output - but that's just a guess.> and if project owners really care about avoiding non-determinism it would > IMO make sense to ban those. >Seems a bit severe to me. But I could be wrong. - Dave> > On Aug 9, 2018, at 11:52 AM, Grang, Mandeep Singh via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > > Thanks for your response David. > > 1) I'm not sure it's do-able. I don't know of any nice way to track > whether an ordered walk of an unordered container leaks out into the final > output of the program. Only iterating over an unordered container is > probably not a sufficient hint (it'd have a high false positive rate to > warn on every instance of that) - and I don't have any great ideas about > whether that false positive rate could be sufficiently reduced with maybe > some heuristics about connecting one iteration to another container > population, and to the final program output > > The idea I had in mind was to check the commutativity of each operation > inside the iteration of the unordered container. I realize we may not be > able to do this for every operation and even if we could that would still > be a heuristic and prone to false positives. Maybe we can start with a > simpler problem to tackle instead of the iteration order. That's why in the > patch I pushed I wrote a very simple checker for std::sort with raw > pointers. > > > 2) Maybe - but I would think that'd still end up using heuristics to > guess at whether one iteration order impacts the order of another > container, etc. > > Yes, maybe iteration order is more difficult to problem but we should be > able to detect non-deterministic sorting order of keys with same values. > The idea I had in mind is similar to what we do in llvm::sort (which is > random shuffle of the container). We instrument the user code to sort a > container twice - first is the user's normal std::sort, the second time we > random shuffle the container and then sort again, and then check if there > is a difference in the two outputs. The run time cost would be the cost of > the shuffle + the extra sort + the cost of checking the two containers [O(n > lg n) + 2 O(n)]. > > --Mandeep > > On 8/9/2018 11:13 AM, David Blaikie wrote: > > 1) I'm not sure it's do-able. I don't know of any nice way to track > whether an ordered walk of an unordered container leaks out into the final > output of the program. Only iterating over an unordered container is > probably not a sufficient hint (it'd have a high false positive rate to > warn on every instance of that) - and I don't have any great ideas about > whether that false positive rate could be sufficiently reduced with maybe > some heuristics about connecting one iteration to another container > population, and to the final program output... > > 2) Maybe - but I would think that'd still end up using heuristics to guess > at whether one iteration order impacts the order of another container, etc. > > On Wed, Aug 8, 2018 at 7:43 PM Grang, Mandeep Singh via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> In the past, I had added the ability in LLVM to uncover 2 types of >> non-deterministic behaviors: iteration of unordered containers with >> pointer-like keys and sorting of elements with the same keys. >> >> Now, I wanted to add checkers for these (and other types of >> non-deterministic behaviors) so that they could be applied more widely. >> I also realize that not all of these may be doable at compile-time. >> >> With that in mind, I have pushed a patch which I adds a new category of >> checks for non-determinism: https://reviews.llvm.org/D50488. >> >> The first checker which I have pushed here will simply check if raw >> pointers are being sorted. In subsequent patches, I will fine tune this >> by handling more complex cases of sorting of pointer-like keys. >> >> I have another patch which checks for iteration of unordered containers >> but that may be very noisy in its current form. >> >> I would like comments/suggestions from the community on: >> >> 1. Are static analysis checks to detect non-determinism something worth >> doing/doable? >> >> 2. How about writing sanitizers to detect non-deterministic behavior? >> Would that be too costly in terms of run time and instrumentation? >> >> --Mandeep >> >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180810/efb33ae4/attachment.html>