George Burgess IV via llvm-dev
2017-Jul-21 06:27 UTC
[llvm-dev] Which assumptions do llvm.memcpy/memmove/memset.* make when the count is 0?
As of earlier this year, we now explicitly ignore the nonnull attributes that glibc puts on memcpy (and other stdlib functions). I don't know how LLVM feels about dangling or underaligned pointers in particular, but AFAICT, we do try hard to make sure that memcpy(NULL, _, 0) works as the user probably intends.

Here's the thread I read about it: http://lists.llvm.org/pipermail/cfe-dev/2017-January/052066.html . As I recall, the tl;dr was "optimizing these assumptions to death doesn't realistically buy us much of anything, and there's a nontrivial amount of real-world code that depends on this behavior."

On Thu, Jul 20, 2017 at 9:06 PM, John Regehr via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Also note that whereas GCC exploits the tricky definition of memcpy(), LLVM
> at present doesn't appear to:
>
> https://godbolt.org/g/8uxvBQ
>
> In fact LLVM doesn't even assume the pointer is non-null in a case where I'd
> argue that it should:
>
> https://godbolt.org/g/svHQKL
>
> John
>
> On 07/20/2017 10:00 PM, John Regehr via llvm-dev wrote:
>>> So, the pointer arguments of memcpy *shall* (a violation of a shall
>>> clause is UB, per §4/2) have valid values, even though the function will
>>> copy zero characters.
>>
>> This is true in C but the question was about LLVM intrinsics.
>>
>> Since the LangRef does not mention any such restriction, I would assume
>> that memcpy(0,0,0) is not UB in LLVM. If it is UB then we must update the
>> LangRef to be clear on this point (actually we should update the LangRef
>> either way since this is a question that'll come up again).
>>
>> John
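
For C code that does not want to depend on any one compiler's leniency here, the zero-length case can be guarded at the call site. The following is only a minimal sketch: `copy_bytes` is a hypothetical helper name, not something from this thread or from any library, and it assumes the caller is happy for a zero-length copy to be a no-op.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the thread): copy n bytes, but never hand
 * memcpy a pointer when there is nothing to copy. */
static void *copy_bytes(void *dst, const void *src, size_t n)
{
    if (n == 0) {
        /* Nothing to copy: skip the call, so dst and src may be NULL. */
        return dst;
    }
    /* For n > 0 both pointers must refer to valid storage; NULL is the
     * only invalid case that can be checked cheaply here. */
    assert(dst != NULL && src != NULL);
    return memcpy(dst, src, n);
}
```

With this wrapper, `copy_bytes(NULL, NULL, 0)` returns NULL without ever reaching memcpy, so the question of what the C "shall" clause or a nonnull attribute permits does not arise for that call.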
John Regehr via llvm-dev
2017-Jul-21 15:31 UTC
[llvm-dev] Which assumptions do llvm.memcpy/memmove/memset.* make when the count is 0?
> Here's the thread I read about it:
> http://lists.llvm.org/pipermail/cfe-dev/2017-January/052066.html . As
> I recall, the tl;dr was "optimizing these assumptions to death doesn't
> realistically buy us much of anything, and there's a nontrivial amount
> of real-world code that depends on this behavior."

Yeah, I recall that thread. The issue is that the current question comes from Rust whereas the previous discussion was freely mixing C/C++ and middle-end issues. We need to separate these.

I propose documenting in the LangRef that memcpy and related intrinsics are defined even when src and dst don't refer to valid storage as long as the length argument is zero. Then we commit to implementing that behavior.

Is that OK with everyone? If so I can update the doc.

John
Ralf Jung via llvm-dev
2017-Jul-21 16:10 UTC
[llvm-dev] Which assumptions do llvm.memcpy/memmove/memset.* make when the count is 0?
Hi,

> So, the pointer arguments of memcpy shall (a violation of a shall clause is UB, per §4/2) have valid values, even though the function will copy zero characters.

So this puts a bound on what LLVM can do, right? However, (also judging from the other answers) LLVM sometimes guarantees more than C does.

>> Here's the thread I read about it:
>> http://lists.llvm.org/pipermail/cfe-dev/2017-January/052066.html . As
>> I recall, the tl;dr was "optimizing these assumptions to death doesn't
>> realistically buy us much of anything, and there's a nontrivial amount
>> of real-world code that depends on this behavior."
>
> Yeah, I recall that thread. The issue is that the current question comes
> from Rust whereas the previous discussion was freely mixing C/C++ and
> middle-end issues. We need to separate these.

Ah, I wanted to link to that thread but couldn't find it; thanks. Right, so this is specifically about the llvm intrinsics that Rust uses, and *not* about the C/C++ frontend.

> I propose documenting in the LangRef

Documenting such issues in the LangRef would be great. :) That's always the place I go to with corner cases like this, but often I don't find the answer there either. (Btw, when I come up with such a corner case -- is there a bugtracker where "please clarify LangRef"-kind of issues can be submitted to, or is the mailing list the best venue?)

> that memcpy and related intrinsics
> are defined even when src and dst don't refer to valid storage as long
> as the length argument is zero. Then we commit to implementing that
> behavior. Is that OK with everyone? If so I can update the doc.

Please also clarify the behavior for NULL or unaligned pointers. (There seems to be an entire lattice of "validity levels" for a pointer: Completely broken, non-NULL and/or aligned, as well as aligned and pointing to valid storage.) Judging from "memcpy(NULL, _, 0) is okay", I suppose NULL is okay (both for src and dest), which only leaves open the question of alignment.

Kind regards,
Ralf
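
To make the "validity levels" concrete, here is a rough C sketch of the two properties a program can actually observe at run time: nullness and alignment. Everything in it is hypothetical (`ptr_level`, `classify_ptr`), it flattens the lattice into three cases for simplicity, and it cannot detect the top level, "pointing to valid storage", because C offers no portable check for that.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum ptr_level {
    PTR_NULL,       /* null pointer */
    PTR_UNALIGNED,  /* non-null, but not a multiple of the alignment */
    PTR_ALIGNED     /* non-null and aligned; storage validity unknown */
};

static enum ptr_level classify_ptr(const void *p, size_t align)
{
    /* align must be a nonzero power of two. */
    assert(align != 0 && (align & (align - 1)) == 0);

    if (p == NULL)
        return PTR_NULL;

    /* The pointer-to-integer mapping is implementation-defined; this
     * assumes the usual flat address space. */
    if (((uintptr_t)p & (align - 1)) != 0)
        return PTR_UNALIGNED;

    return PTR_ALIGNED;
}
```

A dangling pointer would still be reported as `PTR_ALIGNED` or `PTR_UNALIGNED`, which is exactly why the dangling case has to be settled by specification rather than by a check.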
Joerg Sonnenberger via llvm-dev
2017-Jul-21 18:39 UTC
[llvm-dev] Which assumptions do llvm.memcpy/memmove/memset.* make when the count is 0?
On Fri, Jul 21, 2017 at 09:31:41AM -0600, John Regehr via llvm-dev wrote:
> I propose documenting in the LangRef that memcpy and related intrinsics are
> defined even when src and dst don't refer to valid storage as long as the
> length argument is zero. Then we commit to implementing that behavior. Is
> that OK with everyone? If so I can update the doc.

I don't think that was the conclusion of the discussion? I mean the result was that a NULL pointer should be explicitly valid if the length argument is zero. That's a bit more restrictive.

Joerg
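
The difference between the two readings can be written down as a pair of predicates. This is only a toy model under stated assumptions, not LLVM's or Clang's actual behavior: the three-way `ptr_kind` classification and both function names are invented for illustration, and alignment is left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical classification of a pointer argument. */
enum ptr_kind { KIND_VALID, KIND_NULL, KIND_INVALID /* e.g. dangling */ };

/* Reading A (the quoted proposal): with a zero length, any pointer
 * values are acceptable. */
static bool defined_any_ptr(enum ptr_kind dst, enum ptr_kind src, size_t len)
{
    if (len == 0)
        return true;
    return dst == KIND_VALID && src == KIND_VALID;
}

/* Reading B (the reply): a zero length additionally permits NULL, but
 * not other invalid pointers. */
static bool defined_null_only(enum ptr_kind dst, enum ptr_kind src, size_t len)
{
    if (len == 0)
        return dst != KIND_INVALID && src != KIND_INVALID;
    return dst == KIND_VALID && src == KIND_VALID;
}
```

The two predicates agree everywhere except on zero-length calls where at least one pointer is invalid without being NULL, for example a dangling pointer; that case is the whole of the disagreement.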