Alexey Samsonov via llvm-dev
2016-Jan-12 22:57 UTC
[llvm-dev] RFC: Extend UBSan with qsort checks
(+correct cfe-dev list) On Tue, Jan 12, 2016 at 2:57 PM, Alexey Samsonov <vonosmas at gmail.com> wrote:> Hi Yuri, > > On Mon, Jan 11, 2016 at 9:53 AM, Yury Gribov via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> Hi all, >> >> UndefinedBehaviorSanitizer currently does not check for undefined >> behaviors which result from improper usage of standard library functions. >> > > It's not really an undefined behavior per se: you just violate the > contract provided by library functions (at least I haven't found a clause > in C++ standard > that specifies that the behavior of smth. like std::sort() with invalid > operator< is invalid). The actual behavior of these functions is > determined by library authors, and can be anything from UB to unspecified > result, to nicely formatted error report (if you're using a debug version > of system library): > I think I actually encountered all these cases. > > That said, the benefit of checking the sorting routines is clear: I'm not > surprised you were able to catch numerous bugs with your tool. > > UBSan currently doesn't have interceptors, and all of its checks [1] are > designed to work without runtime library support, so that it can be used as > a mitigation > tool (with -fsanitize-trap=undefined). It's challenging to add UBSan > interceptors at this point: this would limit the tool portability (at least > portability of some functionality), > and complicate interoperation of UBSan with another tools. There is an > opportunity to, say, add qsort() interceptor to ASan/TSan/MSan, and check > arguments there. > This functionality can be controlled by a runtime flag. > > Why do you need compiler instrumentation? Do you want to automatically > inject hooks to SortChecker into std::sort() functions, so that you don't > need to annotate C++ standard library code? > We submitted some annotations to libc++ code (e.g. to report containter > overflow bugs in sanitizers). > > [1] -fsanitize=vptr is an only notable exception > > One notorious instance of such errors is invalid usage of qsort or bsearch >> routines (or std::sort and friends in case of C++): >> * using comparison function that violates ordering axioms (reflexivity, >> symmetry, transitivity) >> * returning unstable results from comparison function >> * passing unsorted array to bsearch >> * etc. >> >> Errors like this will usually result in slightly invalid output (not >> fully sorted arrays, rare failed bsearches, etc.) but may as well cause >> aborts on some systems (https://bugzilla.samba.org/show_bug.cgi?id=3959). >> >> I've recently developed a simple proof-of-concept tool SortChecker ( >> https://github.com/yugr/sortcheck) which intercepts calls to qsort and >> friends (via LD_PRELOAD) and performs various sanity checks before passing >> control to libc e.g. >> * sign(cmp(a, b)) == - sign(cmp(b, a)) for all array elements >> * etc. >> >> Results were quite inspiring: I've found several errors in popular >> open-source packages (GCC, Harfbuzz, dpkg, libXt, etc.). I'm also pretty >> sure most C++ developers have failed to write correct comparison function >> at least once. >> >> Could SortChecker functionality be considered for integration into UBSan? >> Extending SortChecker to C++ land would require significant work which has >> already been done in sanitizers (compiler instrumentation, portable >> parallel code, etc.). On the other hand, ability to detect new classes of >> bugs should benefit UBSan and community. > > > >> >> -Y >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> > > > > -- > Alexey Samsonov > vonosmas at gmail.com >-- Alexey Samsonov vonosmas at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160112/b64894da/attachment-0001.html>
Kostya Serebryany via llvm-dev
2016-Jan-13 00:10 UTC
[llvm-dev] RFC: Extend UBSan with qsort checks
FTR, here is one way to implement this in the library: https://gcc.gnu.org/svn/gcc/branches/google/gcc-4_9/libstdc++-v3/include/bits/stl_algo.h Search for "check sort predicate for strict weak ordering" On Tue, Jan 12, 2016 at 2:57 PM, Alexey Samsonov via llvm-dev < llvm-dev at lists.llvm.org> wrote:> (+correct cfe-dev list) > > On Tue, Jan 12, 2016 at 2:57 PM, Alexey Samsonov <vonosmas at gmail.com> > wrote: > >> Hi Yuri, >> >> On Mon, Jan 11, 2016 at 9:53 AM, Yury Gribov via llvm-dev < >> llvm-dev at lists.llvm.org> wrote: >> >>> Hi all, >>> >>> UndefinedBehaviorSanitizer currently does not check for undefined >>> behaviors which result from improper usage of standard library functions. >>> >> >> It's not really an undefined behavior per se: you just violate the >> contract provided by library functions (at least I haven't found a clause >> in C++ standard >> that specifies that the behavior of smth. like std::sort() with invalid >> operator< is invalid). The actual behavior of these functions is >> determined by library authors, and can be anything from UB to unspecified >> result, to nicely formatted error report (if you're using a debug version >> of system library): >> I think I actually encountered all these cases. >> >> That said, the benefit of checking the sorting routines is clear: I'm not >> surprised you were able to catch numerous bugs with your tool. >> >> UBSan currently doesn't have interceptors, and all of its checks [1] are >> designed to work without runtime library support, so that it can be used as >> a mitigation >> tool (with -fsanitize-trap=undefined). It's challenging to add UBSan >> interceptors at this point: this would limit the tool portability (at least >> portability of some functionality), >> and complicate interoperation of UBSan with another tools. There is an >> opportunity to, say, add qsort() interceptor to ASan/TSan/MSan, and check >> arguments there. >> This functionality can be controlled by a runtime flag. >> >> Why do you need compiler instrumentation? Do you want to automatically >> inject hooks to SortChecker into std::sort() functions, so that you don't >> need to annotate C++ standard library code? >> We submitted some annotations to libc++ code (e.g. to report containter >> overflow bugs in sanitizers). >> >> [1] -fsanitize=vptr is an only notable exception >> >> One notorious instance of such errors is invalid usage of qsort or >>> bsearch routines (or std::sort and friends in case of C++): >>> * using comparison function that violates ordering axioms (reflexivity, >>> symmetry, transitivity) >>> * returning unstable results from comparison function >>> * passing unsorted array to bsearch >>> * etc. >>> >>> Errors like this will usually result in slightly invalid output (not >>> fully sorted arrays, rare failed bsearches, etc.) but may as well cause >>> aborts on some systems (https://bugzilla.samba.org/show_bug.cgi?id=3959 >>> ). >>> >>> I've recently developed a simple proof-of-concept tool SortChecker ( >>> https://github.com/yugr/sortcheck) which intercepts calls to qsort and >>> friends (via LD_PRELOAD) and performs various sanity checks before passing >>> control to libc e.g. >>> * sign(cmp(a, b)) == - sign(cmp(b, a)) for all array elements >>> * etc. >>> >>> Results were quite inspiring: I've found several errors in popular >>> open-source packages (GCC, Harfbuzz, dpkg, libXt, etc.). I'm also pretty >>> sure most C++ developers have failed to write correct comparison function >>> at least once. >>> >>> Could SortChecker functionality be considered for integration into >>> UBSan? Extending SortChecker to C++ land would require significant work >>> which has already been done in sanitizers (compiler instrumentation, >>> portable parallel code, etc.). On the other hand, ability to detect new >>> classes of bugs should benefit UBSan and community. >> >> >> >>> >>> -Y >>> _______________________________________________ >>> LLVM Developers mailing list >>> llvm-dev at lists.llvm.org >>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>> >> >> >> >> -- >> Alexey Samsonov >> vonosmas at gmail.com >> > > > > -- > Alexey Samsonov > vonosmas at gmail.com > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160112/9ec7f847/attachment.html>
Yury Gribov via llvm-dev
2016-Jan-13 06:27 UTC
[llvm-dev] RFC: Extend UBSan with qsort checks
On 01/13/2016 01:57 AM, Alexey Samsonov wrote:> (+correct cfe-dev list) > > On Tue, Jan 12, 2016 at 2:57 PM, Alexey Samsonov <vonosmas at gmail.com> wrote: > >> Hi Yuri, >> >> On Mon, Jan 11, 2016 at 9:53 AM, Yury Gribov via llvm-dev < >> llvm-dev at lists.llvm.org> wrote: >> >>> Hi all, >>> >>> UndefinedBehaviorSanitizer currently does not check for undefined >>> behaviors which result from improper usage of standard library functions. >>> >> >> It's not really an undefined behavior per se: you just violate the >> contract provided by library functions (at least I haven't found a clause >> in C++ standard >> that specifies that the behavior of smth. like std::sort() with invalid >> operator< is invalid).Well, C11 which says that "That is, for qsort they shall define a total ordering on the array, and for bsearch the same object shall always compare the same way with the key." ("total ordering" is of course a misnomer, AFAIK they meant weak ordering). AFAIK violation of "shall" usually means UB. As for C++11, it has for e.g. srtd::sort: "Requires: operator< (for the version with no arguments) or comp (for the version with a comparison argument) defines a strict weak ordering (25.4)." which also sounds like UB.> The actual behavior of these functions is >> determined by library authors, and can be anything from UB to unspecified >> result, to nicely formatted error report (if you're using a debug version >> of system library): >> I think I actually encountered all these cases. >> >> That said, the benefit of checking the sorting routines is clear: I'm not >> surprised you were able to catch numerous bugs with your tool. >> >> UBSan currently doesn't have interceptors, and all of its checks [1] are >> designed to work without runtime library support, so that it can be used as >> a mitigation >> tool (with -fsanitize-trap=undefined). It's challenging to add UBSan >> interceptors at this point: this would limit the tool portability (at least >> portability of some functionality), >> and complicate interoperation of UBSan with another tools.Could you give some hints about portability problems? E.g. I thought that ASan (which has a ton of interceptors) has no problems on Windows. In general, not being able to detect UB in stdlibs sounds like a limitation. >> There is an>> opportunity to, say, add qsort() interceptor to ASan/TSan/MSan, and check >> arguments there. >> This functionality can be controlled by a runtime flag.It's certainly doable but somewhat "ugly" as [ATM]San are not about UB. On the other hand TSan already has non-thread-related checks anyway. I wonder what other folks think about this.>> Why do you need compiler instrumentation? Do you want to automatically >> inject hooks to SortChecker into std::sort() functions, so that you don't >> need to annotate C++ standard library code?Exactly, LD_PRELOAD does not work for inline STL functions (actually it does not fully work even for Glibc which has inline bsearch).>> We submitted some annotations to libc++ code (e.g. to report containter >> overflow bugs in sanitizers).Annotating all popular stdlibs rather than doing a single wrapper would be a challenge. On the other hand, it may have an additional benefit of being extendable to other popular libraries which provide qsort-like APIs (Glib2, Berkeley DB and many others). E.g. we could have some kind of __attribute__((strict_order)) for callbacks which would then be handled by compiler.>> [1] -fsanitize=vptr is an only notable exception >> >> One notorious instance of such errors is invalid usage of qsort or bsearch >>> routines (or std::sort and friends in case of C++): >>> * using comparison function that violates ordering axioms (reflexivity, >>> symmetry, transitivity) >>> * returning unstable results from comparison function >>> * passing unsorted array to bsearch >>> * etc. >>> >>> Errors like this will usually result in slightly invalid output (not >>> fully sorted arrays, rare failed bsearches, etc.) but may as well cause >>> aborts on some systems (https://bugzilla.samba.org/show_bug.cgi?id=3959). >>> >>> I've recently developed a simple proof-of-concept tool SortChecker ( >>> https://github.com/yugr/sortcheck) which intercepts calls to qsort and >>> friends (via LD_PRELOAD) and performs various sanity checks before passing >>> control to libc e.g. >>> * sign(cmp(a, b)) == - sign(cmp(b, a)) for all array elements >>> * etc. >>> >>> Results were quite inspiring: I've found several errors in popular >>> open-source packages (GCC, Harfbuzz, dpkg, libXt, etc.). I'm also pretty >>> sure most C++ developers have failed to write correct comparison function >>> at least once. >>> >>> Could SortChecker functionality be considered for integration into UBSan? >>> Extending SortChecker to C++ land would require significant work which has >>> already been done in sanitizers (compiler instrumentation, portable >>> parallel code, etc.). On the other hand, ability to detect new classes of >>> bugs should benefit UBSan and community. >> >> >> >>> >>> -Y >>> _______________________________________________ >>> LLVM Developers mailing list >>> llvm-dev at lists.llvm.org >>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>> >> >> >> >> -- >> Alexey Samsonov >> vonosmas at gmail.com >> > > >
Yury Gribov via llvm-dev
2016-Jan-13 06:28 UTC
[llvm-dev] RFC: Extend UBSan with qsort checks
On 01/13/2016 03:10 AM, Kostya Serebryany wrote:> FTR, here is one way to implement this in the library: > https://gcc.gnu.org/svn/gcc/branches/google/gcc-4_9/libstdc++-v3/include/bits/stl_algo.h > Search for "check sort predicate for strict weak ordering"Nice, although this wouldn't catch violations of transitivity (which is probably the most important type of bug).> On Tue, Jan 12, 2016 at 2:57 PM, Alexey Samsonov via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> (+correct cfe-dev list) >> >> On Tue, Jan 12, 2016 at 2:57 PM, Alexey Samsonov <vonosmas at gmail.com> >> wrote: >> >>> Hi Yuri, >>> >>> On Mon, Jan 11, 2016 at 9:53 AM, Yury Gribov via llvm-dev < >>> llvm-dev at lists.llvm.org> wrote: >>> >>>> Hi all, >>>> >>>> UndefinedBehaviorSanitizer currently does not check for undefined >>>> behaviors which result from improper usage of standard library functions. >>>> >>> >>> It's not really an undefined behavior per se: you just violate the >>> contract provided by library functions (at least I haven't found a clause >>> in C++ standard >>> that specifies that the behavior of smth. like std::sort() with invalid >>> operator< is invalid). The actual behavior of these functions is >>> determined by library authors, and can be anything from UB to unspecified >>> result, to nicely formatted error report (if you're using a debug version >>> of system library): >>> I think I actually encountered all these cases. >>> >>> That said, the benefit of checking the sorting routines is clear: I'm not >>> surprised you were able to catch numerous bugs with your tool. >>> >>> UBSan currently doesn't have interceptors, and all of its checks [1] are >>> designed to work without runtime library support, so that it can be used as >>> a mitigation >>> tool (with -fsanitize-trap=undefined). It's challenging to add UBSan >>> interceptors at this point: this would limit the tool portability (at least >>> portability of some functionality), >>> and complicate interoperation of UBSan with another tools. There is an >>> opportunity to, say, add qsort() interceptor to ASan/TSan/MSan, and check >>> arguments there. >>> This functionality can be controlled by a runtime flag. >>> >>> Why do you need compiler instrumentation? Do you want to automatically >>> inject hooks to SortChecker into std::sort() functions, so that you don't >>> need to annotate C++ standard library code? >>> We submitted some annotations to libc++ code (e.g. to report containter >>> overflow bugs in sanitizers). >>> >>> [1] -fsanitize=vptr is an only notable exception >>> >>> One notorious instance of such errors is invalid usage of qsort or >>>> bsearch routines (or std::sort and friends in case of C++): >>>> * using comparison function that violates ordering axioms (reflexivity, >>>> symmetry, transitivity) >>>> * returning unstable results from comparison function >>>> * passing unsorted array to bsearch >>>> * etc. >>>> >>>> Errors like this will usually result in slightly invalid output (not >>>> fully sorted arrays, rare failed bsearches, etc.) but may as well cause >>>> aborts on some systems (https://bugzilla.samba.org/show_bug.cgi?id=3959 >>>> ). >>>> >>>> I've recently developed a simple proof-of-concept tool SortChecker ( >>>> https://github.com/yugr/sortcheck) which intercepts calls to qsort and >>>> friends (via LD_PRELOAD) and performs various sanity checks before passing >>>> control to libc e.g. >>>> * sign(cmp(a, b)) == - sign(cmp(b, a)) for all array elements >>>> * etc. >>>> >>>> Results were quite inspiring: I've found several errors in popular >>>> open-source packages (GCC, Harfbuzz, dpkg, libXt, etc.). I'm also pretty >>>> sure most C++ developers have failed to write correct comparison function >>>> at least once. >>>> >>>> Could SortChecker functionality be considered for integration into >>>> UBSan? Extending SortChecker to C++ land would require significant work >>>> which has already been done in sanitizers (compiler instrumentation, >>>> portable parallel code, etc.). On the other hand, ability to detect new >>>> classes of bugs should benefit UBSan and community. >>> >>> >>> >>>> >>>> -Y >>>> _______________________________________________ >>>> LLVM Developers mailing list >>>> llvm-dev at lists.llvm.org >>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>> >>> >>> >>> >>> -- >>> Alexey Samsonov >>> vonosmas at gmail.com >>> >> >> >> >> -- >> Alexey Samsonov >> vonosmas at gmail.com >> >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> >> >
Marshall Clow via llvm-dev
2016-Jan-15 06:11 UTC
[llvm-dev] [cfe-dev] RFC: Extend UBSan with qsort checks
On Tue, Jan 12, 2016 at 10:27 PM, Yury Gribov via cfe-dev < cfe-dev at lists.llvm.org> wrote:> > As for C++11, it has for e.g. srtd::sort: > > "Requires: operator< (for the version with no arguments) or comp (for the > version with a comparison argument) defines a strict weak ordering (25.4)." > > which also sounds like UB.Exactly correct. If your comparison operator does not define a SWO, then you have no guarantees of the results. Crashes are common. In principle, I am very much in favor of a checker like this. Passing an invalid comparison operator is one of the most common misuses of std::sort. I worry about the cost, though. -- Marshall -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160114/4c967cf5/attachment.html>