Nuno Lopes via llvm-dev
2017-Nov-08 23:18 UTC
[llvm-dev] Is it ok to allocate > half of address space?
> On 11/8/2017 9:24 AM, Nuno Lopes via llvm-dev wrote:
>> Hi,
>>
>> I was looking into the semantics of GEP inbounds and some BasicAA rules
>> and I'm wondering if it's valid in LLVM IR to allocate more than half of
>> the address space with a global variable or an alloca.
>> If that's a scenario we want to consider, then we have problems :)
>>
>> Consider this C code (32 bits):
>> #include <string.h>
>>
>> char obj[0x80000008];
>>
>> char f() {
>>   char *p = obj + 0x79999999;
>>   char *q = obj + 0x80000000;
>>   *q = 1;
>>   memcpy(p, "abcd", 4);
>>   return *q;
>> }
>>
>> Clearly the stores alias, and the memcpy should override the value
>> written by "*q = 1".
>>
>> I dunno if this is legal in C or not, but the IR produced by clang looks
>> like (32 bits):
>>
>> @obj = common global [2147483656 x i8] zeroinitializer, align 1
>>
>> define signext i8 @f() {
>>   store i8 1, i8* getelementptr inbounds (i8, i8* getelementptr inbounds
>>     ([2147483656 x i8], [2147483656 x i8]* @obj, i32 0, i32 0),
>>     i32 -2147483648), align 1
>>   call void @llvm.memcpy.p0i8.p0i8.i32(i8* getelementptr inbounds
>>     ([2147483656 x i8], [2147483656 x i8]* @obj, i32 0, i32 2040109465), i8*
>>     getelementptr inbounds ([5 x i8], [5 x i8]* @.str, i32 0, i32 0), i32 4,
>>     i32 1, i1 false)
>>   %1 = load i8, i8* getelementptr inbounds (i8, i8* getelementptr
>>     inbounds ([2147483656 x i8], [2147483656 x i8]* @obj, i32 0, i32 0),
>>     i32 -2147483648), align 1
>>   ret i8 %1
>> }
>>
>> With -O2, the store to q gets forwarded, and so we get "ret i8 1".
>> So, BasicAA concluded that p and q don't alias. The culprit is an
>> overflow in BasicAAResult::isGEPBaseAtNegativeOffset().
>>
>> So my question is: do we care about this use case where a single
>> allocation can take more than half of the address space?
>
> According to LangRef, your IR currently has undefined behavior: the rules
> for "inbounds" GEPs say that indexes are treated as signed values. And
> solving that would involve changing the way we represent GEPs in IR, so I
> think you can consider that out of scope.

Sorry, that was a typo. The test case was supposed to not have inbounds (it
should work without as well).

The current definition of GEP inbounds is complicated, though. It disallows
the following:

  %a = gep %p, 0x88888888
  %b = gep inbounds %a, 1

If %a is within bounds, the "gep inbounds" gives a signed overflow even though
it's just a +1 (since 0x88888888 + 1 overflows). So GEP inbounds disables
large objects outright.

BTW I've always wondered why EmitGEPOffset
(http://llvm.org/doxygen/Local_8h_source.html#l00247) doesn't use 'add nsw' if
the semantics of GEP inbounds allows that (if my reading of LangRef is
correct).

> Assuming we're not dealing with inbounds GEPs (e.g. you pass -fwrapv to
> clang), I don't see any particular reason to disallow allocations more
> than half the address-space.

OK, I can file bug reports for the cases I'm seeing. I can verify the
correctness of fixes as well, but only starting a week from now; I'm quite
busy at the moment.

Nuno
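
To make the signed-index point concrete, here is a minimal sketch in C that models a 32-bit target with fixed-width integers. The helper names as_i32 and signed_less are hypothetical and not part of LLVM. The sketch only shows how the two offsets from the test case look once they are read as signed i32 values; it does not reproduce the logic of BasicAAResult::isGEPBaseAtNegativeOffset().

```c
#include <stdint.h>

/* Hypothetical helper (not an LLVM API): read a 32-bit unsigned offset as
   a signed i32, the way an i32 GEP index is interpreted. Written without
   an out-of-range cast so the result is fully defined in C. */
static int32_t as_i32(uint32_t u) {
    if (u <= (uint32_t)INT32_MAX)
        return (int32_t)u;
    /* u is in [2^31, 2^32): subtract 2^32 using 64-bit arithmetic. */
    return (int32_t)((int64_t)u - 4294967296LL);
}

/* Hypothetical helper: order two offsets after signed reinterpretation. */
static int signed_less(uint32_t a, uint32_t b) {
    return as_i32(a) < as_i32(b);
}
```

With these helpers, as_i32(0x80000000u) is -2147483648 and as_i32(0x79999999u) is 2040109465, the same constants that appear in the IR. Under the signed view, signed_less(0x80000000u, 0x79999999u) is true, so q's offset sorts below p's even though, as unsigned byte offsets into obj, q lies after p.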
John Regehr via llvm-dev
2017-Nov-09 16:00 UTC
[llvm-dev] Is it ok to allocate > half of address space?
This blog post contains additional examples that may shed light on this topic: https://trust-in-soft.com/objects-larger-than-ptrdiff_max-bytes/ John
John Regehr via llvm-dev
2017-Nov-09 16:40 UTC
[llvm-dev] Is it ok to allocate > half of address space?
The signed cmov emitted in this example (mentioned in the blog post below)
seems to indicate that LLVM really has baked in the assumption that objects
are not larger than half the address space:

https://godbolt.org/g/8zhrZ1

John

On 11/09/2017 09:00 AM, John Regehr via llvm-dev wrote:
> This blog post contains additional examples that may shed light on this
> topic:
>
> https://trust-in-soft.com/objects-larger-than-ptrdiff_max-bytes/
>
> John
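
As a rough illustration of why a signed comparison bakes in that assumption, here is a minimal sketch in C that models pointer subtraction on a 32-bit target. It is not the code behind the godbolt link. The names ptr_diff32 and range_is_nonempty and the base address 0x1000 used in the note after the code are hypothetical, chosen only for the example.

```c
#include <stdint.h>

/* Hypothetical model of pointer subtraction on a 32-bit target: addresses
   are uint32_t and the difference is delivered in a signed 32-bit type,
   as ptrdiff_t would be there. */
static int32_t ptr_diff32(uint32_t hi, uint32_t lo) {
    uint32_t d = hi - lo;  /* wraps modulo 2^32, well defined for unsigned */
    if (d <= (uint32_t)INT32_MAX)
        return (int32_t)d;
    /* d is in [2^31, 2^32): the signed reading is d - 2^32. */
    return (int32_t)((int64_t)d - 4294967296LL);
}

/* A check written with a signed comparison, as code might be if it assumes
   that no object exceeds PTRDIFF_MAX bytes. */
static int range_is_nonempty(uint32_t begin, uint32_t end) {
    return ptr_diff32(end, begin) > 0;
}
```

For a 16-byte object at the assumed address 0x1000, range_is_nonempty(0x1000u, 0x1010u) is 1. For an object of 0x80000008 bytes at the same address, ptr_diff32(0x80001008u, 0x1000u) is -2147483640, so the signed check reports the range as empty.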