Yury Gribov via llvm-dev
2016-Jan-13 07:08 UTC
[llvm-dev] RFC: Extend UBSan with qsort checks
On 01/13/2016 09:57 AM, Kostya Serebryany wrote:> On Tue, Jan 12, 2016 at 10:28 PM, Yury Gribov <y.gribov at samsung.com> wrote: > >> On 01/13/2016 03:10 AM, Kostya Serebryany wrote: >> >>> FTR, here is one way to implement this in the library: >>> >>> https://gcc.gnu.org/svn/gcc/branches/google/gcc-4_9/libstdc++-v3/include/bits/stl_algo.h >>> Search for "check sort predicate for strict weak ordering" >>> >> >> Nice, although this wouldn't catch violations of transitivity (which is >> probably the most important type of bug). >> >> > I thought it would...But debug operator() only knows about 2 inputs whereas for transitivity you'll need to consider at least 3.> Can you give a test? > >> >> On Tue, Jan 12, 2016 at 2:57 PM, Alexey Samsonov via llvm-dev < >>> llvm-dev at lists.llvm.org> wrote: >>> >>> (+correct cfe-dev list) >>>> >>>> On Tue, Jan 12, 2016 at 2:57 PM, Alexey Samsonov <vonosmas at gmail.com> >>>> wrote: >>>> >>>> Hi Yuri, >>>>> >>>>> On Mon, Jan 11, 2016 at 9:53 AM, Yury Gribov via llvm-dev < >>>>> llvm-dev at lists.llvm.org> wrote: >>>>> >>>>> Hi all, >>>>>> >>>>>> UndefinedBehaviorSanitizer currently does not check for undefined >>>>>> behaviors which result from improper usage of standard library >>>>>> functions. >>>>>> >>>>>> >>>>> It's not really an undefined behavior per se: you just violate the >>>>> contract provided by library functions (at least I haven't found a >>>>> clause >>>>> in C++ standard >>>>> that specifies that the behavior of smth. like std::sort() with invalid >>>>> operator< is invalid). The actual behavior of these functions is >>>>> determined by library authors, and can be anything from UB to >>>>> unspecified >>>>> result, to nicely formatted error report (if you're using a debug >>>>> version >>>>> of system library): >>>>> I think I actually encountered all these cases. >>>>> >>>>> That said, the benefit of checking the sorting routines is clear: I'm >>>>> not >>>>> surprised you were able to catch numerous bugs with your tool. >>>>> >>>>> UBSan currently doesn't have interceptors, and all of its checks [1] are >>>>> designed to work without runtime library support, so that it can be >>>>> used as >>>>> a mitigation >>>>> tool (with -fsanitize-trap=undefined). It's challenging to add UBSan >>>>> interceptors at this point: this would limit the tool portability (at >>>>> least >>>>> portability of some functionality), >>>>> and complicate interoperation of UBSan with another tools. There is an >>>>> opportunity to, say, add qsort() interceptor to ASan/TSan/MSan, and >>>>> check >>>>> arguments there. >>>>> This functionality can be controlled by a runtime flag. >>>>> >>>>> Why do you need compiler instrumentation? Do you want to automatically >>>>> inject hooks to SortChecker into std::sort() functions, so that you >>>>> don't >>>>> need to annotate C++ standard library code? >>>>> We submitted some annotations to libc++ code (e.g. to report containter >>>>> overflow bugs in sanitizers). >>>>> >>>>> [1] -fsanitize=vptr is an only notable exception >>>>> >>>>> One notorious instance of such errors is invalid usage of qsort or >>>>> >>>>>> bsearch routines (or std::sort and friends in case of C++): >>>>>> * using comparison function that violates ordering axioms (reflexivity, >>>>>> symmetry, transitivity) >>>>>> * returning unstable results from comparison function >>>>>> * passing unsorted array to bsearch >>>>>> * etc. >>>>>> >>>>>> Errors like this will usually result in slightly invalid output (not >>>>>> fully sorted arrays, rare failed bsearches, etc.) but may as well cause >>>>>> aborts on some systems ( >>>>>> https://bugzilla.samba.org/show_bug.cgi?id=3959 >>>>>> ). >>>>>> >>>>>> I've recently developed a simple proof-of-concept tool SortChecker ( >>>>>> https://github.com/yugr/sortcheck) which intercepts calls to qsort and >>>>>> friends (via LD_PRELOAD) and performs various sanity checks before >>>>>> passing >>>>>> control to libc e.g. >>>>>> * sign(cmp(a, b)) == - sign(cmp(b, a)) for all array elements >>>>>> * etc. >>>>>> >>>>>> Results were quite inspiring: I've found several errors in popular >>>>>> open-source packages (GCC, Harfbuzz, dpkg, libXt, etc.). I'm also >>>>>> pretty >>>>>> sure most C++ developers have failed to write correct comparison >>>>>> function >>>>>> at least once. >>>>>> >>>>>> Could SortChecker functionality be considered for integration into >>>>>> UBSan? Extending SortChecker to C++ land would require significant work >>>>>> which has already been done in sanitizers (compiler instrumentation, >>>>>> portable parallel code, etc.). On the other hand, ability to detect new >>>>>> classes of bugs should benefit UBSan and community. >>>>>> >>>>> >>>>> >>>>> >>>>> >>>>>> -Y >>>>>> _______________________________________________ >>>>>> LLVM Developers mailing list >>>>>> llvm-dev at lists.llvm.org >>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> Alexey Samsonov >>>>> vonosmas at gmail.com >>>>> >>>>> >>>> >>>> >>>> -- >>>> Alexey Samsonov >>>> vonosmas at gmail.com >>>> >>>> _______________________________________________ >>>> LLVM Developers mailing list >>>> llvm-dev at lists.llvm.org >>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>> >>>> >>>> >>> >> >
Yury Gribov via llvm-dev
2016-Jan-13 07:09 UTC
[llvm-dev] RFC: Extend UBSan with qsort checks
On 01/13/2016 10:08 AM, Yury Gribov wrote:> On 01/13/2016 09:57 AM, Kostya Serebryany wrote: >> On Tue, Jan 12, 2016 at 10:28 PM, Yury Gribov <y.gribov at samsung.com> >> wrote: >> >>> On 01/13/2016 03:10 AM, Kostya Serebryany wrote: >>> >>>> FTR, here is one way to implement this in the library: >>>> >>>> https://gcc.gnu.org/svn/gcc/branches/google/gcc-4_9/libstdc++-v3/include/bits/stl_algo.h >>>> >>>> Search for "check sort predicate for strict weak ordering" >>>> >>> >>> Nice, although this wouldn't catch violations of transitivity (which is >>> probably the most important type of bug). >>> >>> >> I thought it would... > > But debug operator() only knows about 2 inputs whereas for transitivity > you'll need to consider at least 3. > >> Can you give a test?Here's an example of transitivity violation in GCC: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68988>> >>> >>> On Tue, Jan 12, 2016 at 2:57 PM, Alexey Samsonov via llvm-dev < >>>> llvm-dev at lists.llvm.org> wrote: >>>> >>>> (+correct cfe-dev list) >>>>> >>>>> On Tue, Jan 12, 2016 at 2:57 PM, Alexey Samsonov <vonosmas at gmail.com> >>>>> wrote: >>>>> >>>>> Hi Yuri, >>>>>> >>>>>> On Mon, Jan 11, 2016 at 9:53 AM, Yury Gribov via llvm-dev < >>>>>> llvm-dev at lists.llvm.org> wrote: >>>>>> >>>>>> Hi all, >>>>>>> >>>>>>> UndefinedBehaviorSanitizer currently does not check for undefined >>>>>>> behaviors which result from improper usage of standard library >>>>>>> functions. >>>>>>> >>>>>>> >>>>>> It's not really an undefined behavior per se: you just violate the >>>>>> contract provided by library functions (at least I haven't found a >>>>>> clause >>>>>> in C++ standard >>>>>> that specifies that the behavior of smth. like std::sort() with >>>>>> invalid >>>>>> operator< is invalid). The actual behavior of these functions is >>>>>> determined by library authors, and can be anything from UB to >>>>>> unspecified >>>>>> result, to nicely formatted error report (if you're using a debug >>>>>> version >>>>>> of system library): >>>>>> I think I actually encountered all these cases. >>>>>> >>>>>> That said, the benefit of checking the sorting routines is clear: I'm >>>>>> not >>>>>> surprised you were able to catch numerous bugs with your tool. >>>>>> >>>>>> UBSan currently doesn't have interceptors, and all of its checks >>>>>> [1] are >>>>>> designed to work without runtime library support, so that it can be >>>>>> used as >>>>>> a mitigation >>>>>> tool (with -fsanitize-trap=undefined). It's challenging to add UBSan >>>>>> interceptors at this point: this would limit the tool portability (at >>>>>> least >>>>>> portability of some functionality), >>>>>> and complicate interoperation of UBSan with another tools. There >>>>>> is an >>>>>> opportunity to, say, add qsort() interceptor to ASan/TSan/MSan, and >>>>>> check >>>>>> arguments there. >>>>>> This functionality can be controlled by a runtime flag. >>>>>> >>>>>> Why do you need compiler instrumentation? Do you want to >>>>>> automatically >>>>>> inject hooks to SortChecker into std::sort() functions, so that you >>>>>> don't >>>>>> need to annotate C++ standard library code? >>>>>> We submitted some annotations to libc++ code (e.g. to report >>>>>> containter >>>>>> overflow bugs in sanitizers). >>>>>> >>>>>> [1] -fsanitize=vptr is an only notable exception >>>>>> >>>>>> One notorious instance of such errors is invalid usage of qsort or >>>>>> >>>>>>> bsearch routines (or std::sort and friends in case of C++): >>>>>>> * using comparison function that violates ordering axioms >>>>>>> (reflexivity, >>>>>>> symmetry, transitivity) >>>>>>> * returning unstable results from comparison function >>>>>>> * passing unsorted array to bsearch >>>>>>> * etc. >>>>>>> >>>>>>> Errors like this will usually result in slightly invalid output (not >>>>>>> fully sorted arrays, rare failed bsearches, etc.) but may as well >>>>>>> cause >>>>>>> aborts on some systems ( >>>>>>> https://bugzilla.samba.org/show_bug.cgi?id=3959 >>>>>>> ). >>>>>>> >>>>>>> I've recently developed a simple proof-of-concept tool SortChecker ( >>>>>>> https://github.com/yugr/sortcheck) which intercepts calls to >>>>>>> qsort and >>>>>>> friends (via LD_PRELOAD) and performs various sanity checks before >>>>>>> passing >>>>>>> control to libc e.g. >>>>>>> * sign(cmp(a, b)) == - sign(cmp(b, a)) for all array elements >>>>>>> * etc. >>>>>>> >>>>>>> Results were quite inspiring: I've found several errors in popular >>>>>>> open-source packages (GCC, Harfbuzz, dpkg, libXt, etc.). I'm also >>>>>>> pretty >>>>>>> sure most C++ developers have failed to write correct comparison >>>>>>> function >>>>>>> at least once. >>>>>>> >>>>>>> Could SortChecker functionality be considered for integration into >>>>>>> UBSan? Extending SortChecker to C++ land would require >>>>>>> significant work >>>>>>> which has already been done in sanitizers (compiler instrumentation, >>>>>>> portable parallel code, etc.). On the other hand, ability to >>>>>>> detect new >>>>>>> classes of bugs should benefit UBSan and community. >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>>> -Y >>>>>>> _______________________________________________ >>>>>>> LLVM Developers mailing list >>>>>>> llvm-dev at lists.llvm.org >>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Alexey Samsonov >>>>>> vonosmas at gmail.com >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> Alexey Samsonov >>>>> vonosmas at gmail.com >>>>> >>>>> _______________________________________________ >>>>> LLVM Developers mailing list >>>>> llvm-dev at lists.llvm.org >>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>>> >>>>> >>>>> >>>> >>> >> >
Kostya Serebryany via llvm-dev
2016-Jan-14 00:17 UTC
[llvm-dev] RFC: Extend UBSan with qsort checks
Inviting Paul to the party (he wrote the libstdc++ sort checker <https://gcc.gnu.org/svn/gcc/branches/google/gcc-4_9/libstdc++-v3/include/bits/stl_algo.h> ). On Tue, Jan 12, 2016 at 11:09 PM, Yury Gribov <y.gribov at samsung.com> wrote:> On 01/13/2016 10:08 AM, Yury Gribov wrote: > >> On 01/13/2016 09:57 AM, Kostya Serebryany wrote: >> >>> On Tue, Jan 12, 2016 at 10:28 PM, Yury Gribov <y.gribov at samsung.com> >>> wrote: >>> >>> On 01/13/2016 03:10 AM, Kostya Serebryany wrote: >>>> >>>> FTR, here is one way to implement this in the library: >>>>> >>>>> >>>>> https://gcc.gnu.org/svn/gcc/branches/google/gcc-4_9/libstdc++-v3/include/bits/stl_algo.h >>>>> >>>>> Search for "check sort predicate for strict weak ordering" >>>>> >>>>> >>>> Nice, although this wouldn't catch violations of transitivity (which is >>>> probably the most important type of bug). >>>> >>>> >>>> I thought it would... >>> >> >> But debug operator() only knows about 2 inputs whereas for transitivity >> you'll need to consider at least 3. >> >> Can you give a test? >>> >> > Here's an example of transitivity violation in GCC: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68988 > > > >>> >>>> On Tue, Jan 12, 2016 at 2:57 PM, Alexey Samsonov via llvm-dev < >>>> >>>>> llvm-dev at lists.llvm.org> wrote: >>>>> >>>>> (+correct cfe-dev list) >>>>> >>>>>> >>>>>> On Tue, Jan 12, 2016 at 2:57 PM, Alexey Samsonov <vonosmas at gmail.com> >>>>>> wrote: >>>>>> >>>>>> Hi Yuri, >>>>>> >>>>>>> >>>>>>> On Mon, Jan 11, 2016 at 9:53 AM, Yury Gribov via llvm-dev < >>>>>>> llvm-dev at lists.llvm.org> wrote: >>>>>>> >>>>>>> Hi all, >>>>>>> >>>>>>>> >>>>>>>> UndefinedBehaviorSanitizer currently does not check for undefined >>>>>>>> behaviors which result from improper usage of standard library >>>>>>>> functions. >>>>>>>> >>>>>>>> >>>>>>>> It's not really an undefined behavior per se: you just violate the >>>>>>> contract provided by library functions (at least I haven't found a >>>>>>> clause >>>>>>> in C++ standard >>>>>>> that specifies that the behavior of smth. like std::sort() with >>>>>>> invalid >>>>>>> operator< is invalid). The actual behavior of these functions is >>>>>>> determined by library authors, and can be anything from UB to >>>>>>> unspecified >>>>>>> result, to nicely formatted error report (if you're using a debug >>>>>>> version >>>>>>> of system library): >>>>>>> I think I actually encountered all these cases. >>>>>>> >>>>>>> That said, the benefit of checking the sorting routines is clear: I'm >>>>>>> not >>>>>>> surprised you were able to catch numerous bugs with your tool. >>>>>>> >>>>>>> UBSan currently doesn't have interceptors, and all of its checks >>>>>>> [1] are >>>>>>> designed to work without runtime library support, so that it can be >>>>>>> used as >>>>>>> a mitigation >>>>>>> tool (with -fsanitize-trap=undefined). It's challenging to add UBSan >>>>>>> interceptors at this point: this would limit the tool portability (at >>>>>>> least >>>>>>> portability of some functionality), >>>>>>> and complicate interoperation of UBSan with another tools. There >>>>>>> is an >>>>>>> opportunity to, say, add qsort() interceptor to ASan/TSan/MSan, and >>>>>>> check >>>>>>> arguments there. >>>>>>> This functionality can be controlled by a runtime flag. >>>>>>> >>>>>>> Why do you need compiler instrumentation? Do you want to >>>>>>> automatically >>>>>>> inject hooks to SortChecker into std::sort() functions, so that you >>>>>>> don't >>>>>>> need to annotate C++ standard library code? >>>>>>> We submitted some annotations to libc++ code (e.g. to report >>>>>>> containter >>>>>>> overflow bugs in sanitizers). >>>>>>> >>>>>>> [1] -fsanitize=vptr is an only notable exception >>>>>>> >>>>>>> One notorious instance of such errors is invalid usage of qsort or >>>>>>> >>>>>>> bsearch routines (or std::sort and friends in case of C++): >>>>>>>> * using comparison function that violates ordering axioms >>>>>>>> (reflexivity, >>>>>>>> symmetry, transitivity) >>>>>>>> * returning unstable results from comparison function >>>>>>>> * passing unsorted array to bsearch >>>>>>>> * etc. >>>>>>>> >>>>>>>> Errors like this will usually result in slightly invalid output (not >>>>>>>> fully sorted arrays, rare failed bsearches, etc.) but may as well >>>>>>>> cause >>>>>>>> aborts on some systems ( >>>>>>>> https://bugzilla.samba.org/show_bug.cgi?id=3959 >>>>>>>> ). >>>>>>>> >>>>>>>> I've recently developed a simple proof-of-concept tool SortChecker ( >>>>>>>> https://github.com/yugr/sortcheck) which intercepts calls to >>>>>>>> qsort and >>>>>>>> friends (via LD_PRELOAD) and performs various sanity checks before >>>>>>>> passing >>>>>>>> control to libc e.g. >>>>>>>> * sign(cmp(a, b)) == - sign(cmp(b, a)) for all array elements >>>>>>>> * etc. >>>>>>>> >>>>>>>> Results were quite inspiring: I've found several errors in popular >>>>>>>> open-source packages (GCC, Harfbuzz, dpkg, libXt, etc.). I'm also >>>>>>>> pretty >>>>>>>> sure most C++ developers have failed to write correct comparison >>>>>>>> function >>>>>>>> at least once. >>>>>>>> >>>>>>>> Could SortChecker functionality be considered for integration into >>>>>>>> UBSan? Extending SortChecker to C++ land would require >>>>>>>> significant work >>>>>>>> which has already been done in sanitizers (compiler instrumentation, >>>>>>>> portable parallel code, etc.). On the other hand, ability to >>>>>>>> detect new >>>>>>>> classes of bugs should benefit UBSan and community. >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -Y >>>>>>>> _______________________________________________ >>>>>>>> LLVM Developers mailing list >>>>>>>> llvm-dev at lists.llvm.org >>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Alexey Samsonov >>>>>>> vonosmas at gmail.com >>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> Alexey Samsonov >>>>>> vonosmas at gmail.com >>>>>> >>>>>> _______________________________________________ >>>>>> LLVM Developers mailing list >>>>>> llvm-dev at lists.llvm.org >>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>> >>> >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160113/8c050509/attachment.html>