Marshall Clow
2015-Jun-01 14:20 UTC
[LLVMdev] RFC: Adding attribute(nonnull) to things in libc++
This weekend, I got an email from Nuno Lopes informing me that UBSAN now paid attention to attribute(nonnull), and he was having some problems with it going off when using libc++. I did some investigation, and found that he was exactly right - there were places (deep inside the vector code, for example) which called std::memcpy(null, null, 0) - which is definitely UB. In an ideal world, our C library would define ::memcpy with the correct annotations, and libc++ would import that into namespace std, and we'd be golden. But we don't have a C library - we use whatever is provided by the system we're running on, so that's not really an option. For my testing, I changed libc++'s <cstring> header: -using ::memcpy; +inline _LIBCPP_INLINE_VISIBILITY +void* memcpy(void* __s1, const void* __s2, size_t __n) __attribute__((nonnull(1, 2))) +{ return ::memcpy(__s1, __s2, __n); } (similarly for memmove and memcmp), and I found several cases of simple code that now UBSAN fires off on: such as: std::vector<int> v; v.push_back(1); and : int *p = NULL; std::copy(p,p,p); This seems fairly useful to me. I would like to hear other people's opinions about: * Is adding this kind of UB detection something that people want in libc++? * What do people think about wrapping the C library functions to enable UBSAN to catch them (this is a separate Q from the first Q, because I can see putting these kind of parameter checks into functions that have no counterpart in the C library). Sadly, this would NOT affect calls to ::memcpy (for example), just std::memcpy. * Is that the best way to annotate the declarations? Is there a more portable, standard way to do this (things that start with double underscores worry me). In any case, I would probably wrap this in a macro to support compilers that don't understand whatever mechanism we end up using. Thanks -- Marshall -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150601/41bb1111/attachment.html>
Reid Kleckner
2015-Jun-01 16:57 UTC
[LLVMdev] [cfe-dev] RFC: Adding attribute(nonnull) to things in libc++
Why should memset / memcpy be attribute nonnull? Is there standardese that supports that? On Mon, Jun 1, 2015 at 7:20 AM, Marshall Clow <mclow.lists at gmail.com> wrote:> This weekend, I got an email from Nuno Lopes informing me that UBSAN now > paid attention to attribute(nonnull), and he was having some problems with > it going off when using libc++. > > I did some investigation, and found that he was exactly right - there were > places (deep inside the vector code, for example) which called > std::memcpy(null, null, 0) - which is definitely UB. > > In an ideal world, our C library would define ::memcpy with the correct > annotations, and libc++ would import that into namespace std, and we'd be > golden. > > But we don't have a C library - we use whatever is provided by the system > we're running on, so that's not really an option. > > For my testing, I changed libc++'s <cstring> header: > > -using ::memcpy; > +inline _LIBCPP_INLINE_VISIBILITY > +void* memcpy(void* __s1, const void* __s2, size_t __n) > __attribute__((nonnull(1, 2))) > +{ return ::memcpy(__s1, __s2, __n); } > > (similarly for memmove and memcmp), and I found several cases of simple > code that now UBSAN fires off on: > > such as: std::vector<int> v; v.push_back(1); > and : int *p = NULL; std::copy(p,p,p); > > This seems fairly useful to me. > > I would like to hear other people's opinions about: > > * Is adding this kind of UB detection something that people want in libc++? > > * What do people think about wrapping the C library functions to enable > UBSAN to catch them (this is a separate Q from the first Q, because I can > see putting these kind of parameter checks into functions that have no > counterpart in the C library). Sadly, this would NOT affect calls to > ::memcpy (for example), just std::memcpy. > > * Is that the best way to annotate the declarations? Is there a more > portable, standard way to do this (things that start with double > underscores worry me). In any case, I would probably wrap this in a macro > to support compilers that don't understand whatever mechanism we end up > using. > > Thanks > > -- Marshall > > > _______________________________________________ > cfe-dev mailing list > cfe-dev at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150601/0b795d0c/attachment.html>
Joerg Sonnenberger
2015-Jun-01 17:35 UTC
[LLVMdev] [cfe-dev] RFC: Adding attribute(nonnull) to things in libc++
On Mon, Jun 01, 2015 at 09:57:17AM -0700, Reid Kleckner wrote:> Why should memset / memcpy be attribute nonnull? Is there standardese that > supports that?The generic entry text of the standard section. IMO this is a standard bug and someone should *please* get it fixed. It is ridiculous that zero sized operations are considered UB. Joerg
Nick Lewycky
2015-Jun-03 07:21 UTC
[LLVMdev] [cfe-dev] RFC: Adding attribute(nonnull) to things in libc++
On 1 June 2015 at 07:20, Marshall Clow <mclow.lists at gmail.com> wrote:> This weekend, I got an email from Nuno Lopes informing me that UBSAN now > paid attention to attribute(nonnull), and he was having some problems with > it going off when using libc++. >FYI, I also looked into turning this on, but with libstdc++, and found that they annotated basic_string<T>::assign(pointer, len) with attribute nonnull. That's a problem, because it's valid to call basic_string<T>::assign(nullptr, 0), but the reasoning why it's valid makes me want to ask the committee whether this is what they intended. The language std text claims that the pointer must point to an array of 'n' (second argument) length, but earlier in the text it also states that in the library, whenever it says "array" it means any pointer upon which address computations and accesses to objects (that would be valid if the pointer did point to the first element of such an array). Thus, nullptr is valid if 'n' is zero. This was changed in DR2235: http://cplusplus.github.io/LWG/lwg-defects.html#2235 The text and discussion of DR2235 sound like they intend to make the behaviour of assign match that of the constructor that takes the same arguments. What they actually did was change the constructor to match the behaviour of assign, and it doesn't look like removing the requirement of a nonnull pointer was considered and intended. At this point I made a note that somebody should ask the committee when they get the chance, and never got back around to it. Nick I did some investigation, and found that he was exactly right - there were> places (deep inside the vector code, for example) which called > std::memcpy(null, null, 0) - which is definitely UB. > > In an ideal world, our C library would define ::memcpy with the correct > annotations, and libc++ would import that into namespace std, and we'd be > golden. > > But we don't have a C library - we use whatever is provided by the system > we're running on, so that's not really an option. > > For my testing, I changed libc++'s <cstring> header: > > -using ::memcpy; > +inline _LIBCPP_INLINE_VISIBILITY > +void* memcpy(void* __s1, const void* __s2, size_t __n) > __attribute__((nonnull(1, 2))) > +{ return ::memcpy(__s1, __s2, __n); } > > (similarly for memmove and memcmp), and I found several cases of simple > code that now UBSAN fires off on: > > such as: std::vector<int> v; v.push_back(1); > and : int *p = NULL; std::copy(p,p,p); > > This seems fairly useful to me. > > I would like to hear other people's opinions about: > > * Is adding this kind of UB detection something that people want in libc++? > > * What do people think about wrapping the C library functions to enable > UBSAN to catch them (this is a separate Q from the first Q, because I can > see putting these kind of parameter checks into functions that have no > counterpart in the C library). Sadly, this would NOT affect calls to > ::memcpy (for example), just std::memcpy. > > * Is that the best way to annotate the declarations? Is there a more > portable, standard way to do this (things that start with double > underscores worry me). In any case, I would probably wrap this in a macro > to support compilers that don't understand whatever mechanism we end up > using. > > Thanks > > -- Marshall > > > _______________________________________________ > cfe-dev mailing list > cfe-dev at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150603/61c0b3ee/attachment.html>
Marshall Clow
2015-Jun-03 13:30 UTC
[LLVMdev] [cfe-dev] RFC: Adding attribute(nonnull) to things in libc++
On Wed, Jun 3, 2015 at 12:21 AM, Nick Lewycky <nlewycky at google.com> wrote:> On 1 June 2015 at 07:20, Marshall Clow <mclow.lists at gmail.com> wrote: > >> This weekend, I got an email from Nuno Lopes informing me that UBSAN now >> paid attention to attribute(nonnull), and he was having some problems with >> it going off when using libc++. >> > > FYI, I also looked into turning this on, but with libstdc++, and found > that they annotated basic_string<T>::assign(pointer, len) with attribute > nonnull. That's a problem, because it's valid to call > basic_string<T>::assign(nullptr, 0), but the reasoning why it's valid makes > me want to ask the committee whether this is what they intended. > > The language std text claims that the pointer must point to an array of > 'n' (second argument) length, but earlier in the text it also states that > in the library, whenever it says "array" it means any pointer upon which > address computations and accesses to objects (that would be valid if the > pointer did point to the first element of such an array). Thus, nullptr is > valid if 'n' is zero. > > This was changed in DR2235: > http://cplusplus.github.io/LWG/lwg-defects.html#2235 > The text and discussion of DR2235 sound like they intend to make the > behaviour of assign match that of the constructor that takes the same > arguments. What they actually did was change the constructor to match the > behaviour of assign, and it doesn't look like removing the requirement of a > nonnull pointer was considered and intended. > > At this point I made a note that somebody should ask the committee when > they get the chance, and never got back around to it. > >I think that the reference in the discussion to [res.on.arguments] (the relevant section there is paragraph 1.2) make it clear that the change was deliberate - to allow (null, 0) as a set of parameters. I do agree that this is different behavior than memcpy, memmove, etc. That should make Joerg happy :-) -- Marshall -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150603/b4ba5e32/attachment.html>