thr3ads.net - llvm dev - [llvm-dev] RFC: Allowing @llvm.objectsize to be more conservative with null. [Jan 2017]

If this information is useful, please help other people find it:
Share via:

Sanjoy Das via llvm-dev

2017-Jan-02 19:45 UTC

[llvm-dev] RFC: Allowing @llvm.objectsize to be more conservative with null.

Hi George,

On Mon, Jan 2, 2017 at 10:41 AM, George Burgess IV
<george.burgess.iv at gmail.com> wrote:> Thanks for the comments!
>
>> Have you considered changing our existing behavior to match GCC's
>> builtin_object_size instead of adding a new parameter
>
> Yup! My issue with turning `i1 %min` into `i8 %flags` is that
> __builtin_object_size would get lowered like:
>
> __builtin_object_size(p, 0) -> @llvm.objectsize(i8* %p, i8 2)
> __builtin_object_size(p, 1) -> @llvm.objectsize(i8* %p, i8 2)
> __builtin_object_size(p, 2) -> @llvm.objectsize(i8* %p, i8 3)
> (__builtin_object_size(p, 3) doesn't actually get lowered)
>
> ...Which might be a bit surprising to some people. If we think that's a
> non-issue, I'm happy to take the simpler approach and use an i8.
What I had in mind was:

 __builtin_object_size(p, 0) -> @llvm.objectsize(i8* %p, i1 0)
 __builtin_object_size(p, 1) -> @llvm.objectsize(i8* %p, i1 0)
 __builtin_object_size(p, 2) -> @llvm.objectsize(i8* %p, i1 1)
 __builtin_object_size(p, 3) -> @llvm.objectsize(i8* %p, i1 1)

and changing the specification of @llvm.objectsize to match.
>> We'll have to have some <min>-awareness either in clang (to
decide if
>> the <null-is-unknown> bit needs to be set) or in the middle end. 
What
>> is your plan here?
>
> My plan was just to always set the `null-is-unknown` bit when lowering a
> call to __builtin_object_size in clang. If `min` is true, we treat unknown
> and null values identically in @llvm.objectsize, so whether
> `null-is-unknown` is set in that case shouldn't matter.
Continuing from above: in other words, since your use case is always
setting the `null-is-unknown` bit, can you re-define @llvm.objectsize
to have that semantic without adding a new parameter?

That's not backward compatible, but there's a simple but conservative
update strategy.
>> However, since Malloc can return null:
>
> I think I was unclear: this behavior would only exist if @llvm.objectsize
> was actually handed null. I wasn't planning on changing how we'd
handle
> memory allocation functions that may return null (GCC gives back 2 for
> __builtin_object_size(malloc(2), 0)). In other words, this
`null-is-unknown`
> bit only makes the objectsize evaluator see `T* null` as though it was
`call
> T* @opaque_user_defined_function()`, nothing else.
I meant to say that it looks like the semantics of ObjectSizeMin(X) is
that "return the conservative minimum object size for all values that
X may take at runtime" (resp. ObjectSizeMax(X)).

For instance this:

void i(int c, volatile int* sink) {
  void* mem1 = malloc(20);
  void* mem2 = malloc(40);

  *sink = __builtin_object_size(c ? mem1 : mem2, 0);
  *sink = __builtin_object_size(c ? mem1 : mem2, 2);
}

is lowered to

        movl    $40, (%rsi)
        movl    $20, (%rsi)
        ret

by GCC.

Applying the same logic to malloc(N), since it returns a location that
has N dereferenceable bytes or NULL, it follows that
ObjectSizeMin(malloc(N)) should return the smaller of
ObjectSizeMin(NULL) and ObjectSizeMin(MemoryLocOfNBytes) == the
smaller of ObjectSizeMin(NULL) and N.  Given that we want the result
of ObjectSizeMin(malloc(N)) == UMIN(ObjectSizeMin(NULL), N) to be N,
we'd want ObjectSizeMin(NULL) to be (unsigned)-1 for consistency.

However, none of this matters if we're not in a position to change the
specification.

-- Sanjoy

> This is also consistent with how @llvm.objectsize already acts: if the
> pointer it's given is the result of a call to `malloc(N)`, it'll
return N
> regardless of the value of `min`.
>
> But yeah, I agree that some of the features of __builtin_object_size
aren't
> entirely intuitive. :)
>
> On Mon, Jan 2, 2017 at 12:01 AM, Sanjoy Das <sanjoy at
playingwithpointers.com>
> wrote:
>>
>> Hi George,
>>
>> Have you considered changing our existing behavior to match GCC's
>> builtin_object_size instead of adding a new parameter?  That may be
>> simpler overall.  There's also a clear upgrade strategy -- fold
every
>> old style call to "<min> ? 0 : 1".
>>
>> You probably already know this, but GCC folds
>> builtin_object_size(0, 0) to -1 and builtin_object_size(0, 2) to 0.
>> We'll have to have some <min>-awareness either in clang (to
decide if
>> the <null-is-unknown> bit needs to be set) or in the middle end. 
What
>> is your plan here?
>>
>> I also found gcc's choice of folding builtin_object_size(0, 2) to 0
and
>> builtin_object_size(0, 0) to -1 somewhat odd; I'd have expected the
>> inverse.  This follows from the following "intuitive" rules
>>
>> ObjSizeMax(X) = UMAX(ObjSizeMax(A), ObjSizeMax(B))
>> ObjSizeMin(X) = UMIN(ObjSizeMin(A), ObjSizeMin(B))
>>
>> (where X is a value that can either be A or B at runtime)
>>
>> and that we want to fold
>>
>> ObjSizeMax(Malloc(N)) = ObjSizeMin(Malloc(N)) = N
>>
>> However, since Malloc can return null:
>>
>> ObjSizeMax(Malloc(N)) = UMAX(N, ObjSizeMax(NULL)) = N
>> ObjSizeMin(Malloc(N)) = UMIN(N, ObjSizeMin(NULL)) = N
>>
>> and for that to be true ObjSizeMax(NULL) >>
builtin_object_size(NULL, 0) needs to be 0 and ObjSizeMin(NULL) >>
builtin_object_size(NULL, 2) needs to be (unsigned)-1.
>>
>> However, I'm not sure if it is up to us to change that; given the
very
>> motivation of this thread is GCC compatibility.
>>
>> -- Sanjoy
>>
>> On Sun, Jan 1, 2017 at 10:03 PM, George Burgess IV via llvm-dev
>> <llvm-dev at lists.llvm.org> wrote:
>> > Happy New Year ping. :)
>> >
>> > Will ping again on Wednesday. If I don't get comments by EOD
Thursday,
>> > I'll
>> > assume everyone's OK with this and put together a patch.
>> >
>> > On Wed, Dec 21, 2016 at 11:44 AM, George Burgess IV
>> > <george.burgess.iv at gmail.com> wrote:
>> >>
>> >> tl;dr: We'd like to add a bit to @llvm.objectsize to make
it optionally
>> >> be
>> >> conservative when it's handed a null pointer.
>> >>
>> >> Happy Wednesday!
>> >>
>> >> We're trying to fix PR23277, which is a bug about how
clang+LLVM treat
>> >> __builtin_object_size when it's given a null pointer. For
compatibility
>> >> with
>> >> GCC, clang would like to be able to hand back a conservative
result
>> >> (e.g.
>> >> (size_t)-1), but the LLVM intrinsic that clang lowers
>> >> __builtin_object_size(p, N) to, @llvm.objectsize, only hands
back 0
>> >> when
>> >> given a null pointer. Because it's often assumed that
>> >> __builtin_object_size
>> >> always folds to a constant, handling this only in clang may be
tricky:
>> >> if we
>> >> emit the IR equivalent to ((p) ? __builtin_object_size(p, N) :
-1ULL)
>> >> and
>> >> LLVM can't fold away the null check, we've failed to
emit a constant.
>> >>
>> >> So, the best path forward that I can see is to add a
"null is unknown
>> >> size" bit to @llvm.objectsize, much like the
"min" bit it already has.
>> >> If
>> >> this bit is true, null would be treated like an opaque
pointer.
>> >> Otherwise,
>> >> @llvm.objectsize would act no differently than it does today.
>> >>
>> >> If we like this approach, I'm assuming it would be best to
have this
>> >> bit
>> >> as a third argument to @llvm.objectsize, rather than making
the second
>> >> argument an i8 and using it as a bag of bits.
>> >>
>> >> All thoughts/comments/alternative approaches/etc. highly
appreciated.
>> >> :)
>> >>
>> >> Thanks (and Happy Holidays)!
>> >> George
>> >
>> >
>> >
>> > _______________________________________________
>> > LLVM Developers mailing list
>> > llvm-dev at lists.llvm.org
>> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> >
>>
>>
>>
>> --
>> Sanjoy Das
>> http://playingwithpointers.com
>
>

George Burgess IV via llvm-dev

2017-Jan-02 21:18 UTC

head link

[llvm-dev] RFC: Allowing @llvm.objectsize to be more conservative with null.

> Continuing from above: in other words, since your use case is always
> setting the `null-is-unknown` bit, can you re-define @llvm.objectsize
> to have that semantic without adding a new parameter?
>
> That's not backward compatible, but there's a simple but
conservative
> update strategy.
That would almost definitely be the simplest way forward, but I'm unsure
how happy this would make other users of objectsize. In particular, our
objectsize sanitizer will no longer catch any null-related issues like:

struct Foo { int a; };
int getA(struct Foo *f) { return f->a; }
int getA2() { struct Foo *f = 0; return f->a; }

// getA2 now traps rather than calling a special ubsan-check-failed function

Grepping through, that's the only other direct user of objectsize that I
can find. If we're okay with that loss in accuracy (and don't believe
that
we have other objectsize users that care), I'm happy to just alter
objectsize's default behavior and see how it does.
> However, none of this matters if we're not in a position to change the
> specification.
Sadly, we're not. :/

On Mon, Jan 2, 2017 at 11:45 AM, Sanjoy Das <sanjoy at
playingwithpointers.com>
wrote:
> Hi George,
>
> On Mon, Jan 2, 2017 at 10:41 AM, George Burgess IV
> <george.burgess.iv at gmail.com> wrote:
> > Thanks for the comments!
> >
> >> Have you considered changing our existing behavior to match
GCC's
> >> builtin_object_size instead of adding a new parameter
> >
> > Yup! My issue with turning `i1 %min` into `i8 %flags` is that
> > __builtin_object_size would get lowered like:
> >
> > __builtin_object_size(p, 0) -> @llvm.objectsize(i8* %p, i8 2)
> > __builtin_object_size(p, 1) -> @llvm.objectsize(i8* %p, i8 2)
> > __builtin_object_size(p, 2) -> @llvm.objectsize(i8* %p, i8 3)
> > (__builtin_object_size(p, 3) doesn't actually get lowered)
> >
> > ...Which might be a bit surprising to some people. If we think
that's a
> > non-issue, I'm happy to take the simpler approach and use an i8.
>
> What I had in mind was:
>
>  __builtin_object_size(p, 0) -> @llvm.objectsize(i8* %p, i1 0)
>  __builtin_object_size(p, 1) -> @llvm.objectsize(i8* %p, i1 0)
>  __builtin_object_size(p, 2) -> @llvm.objectsize(i8* %p, i1 1)
>  __builtin_object_size(p, 3) -> @llvm.objectsize(i8* %p, i1 1)
>
> and changing the specification of @llvm.objectsize to match.
>
> >> We'll have to have some <min>-awareness either in clang
(to decide if
> >> the <null-is-unknown> bit needs to be set) or in the middle
end.  What
> >> is your plan here?
> >
> > My plan was just to always set the `null-is-unknown` bit when lowering
a
> > call to __builtin_object_size in clang. If `min` is true, we treat
> unknown
> > and null values identically in @llvm.objectsize, so whether
> > `null-is-unknown` is set in that case shouldn't matter.
>
> Continuing from above: in other words, since your use case is always
> setting the `null-is-unknown` bit, can you re-define @llvm.objectsize
> to have that semantic without adding a new parameter?
>
> That's not backward compatible, but there's a simple but
conservative
> update strategy.
>
> >> However, since Malloc can return null:
> >
> > I think I was unclear: this behavior would only exist if
@llvm.objectsize
> > was actually handed null. I wasn't planning on changing how
we'd handle
> > memory allocation functions that may return null (GCC gives back 2 for
> > __builtin_object_size(malloc(2), 0)). In other words, this
> `null-is-unknown`
> > bit only makes the objectsize evaluator see `T* null` as though it was
> `call
> > T* @opaque_user_defined_function()`, nothing else.
>
> I meant to say that it looks like the semantics of ObjectSizeMin(X) is
> that "return the conservative minimum object size for all values that
> X may take at runtime" (resp. ObjectSizeMax(X)).
>
> For instance this:
>
> void i(int c, volatile int* sink) {
>   void* mem1 = malloc(20);
>   void* mem2 = malloc(40);
>
>   *sink = __builtin_object_size(c ? mem1 : mem2, 0);
>   *sink = __builtin_object_size(c ? mem1 : mem2, 2);
> }
>
> is lowered to
>
>         movl    $40, (%rsi)
>         movl    $20, (%rsi)
>         ret
>
> by GCC.
>
> Applying the same logic to malloc(N), since it returns a location that
> has N dereferenceable bytes or NULL, it follows that
> ObjectSizeMin(malloc(N)) should return the smaller of
> ObjectSizeMin(NULL) and ObjectSizeMin(MemoryLocOfNBytes) == the
> smaller of ObjectSizeMin(NULL) and N.  Given that we want the result
> of ObjectSizeMin(malloc(N)) == UMIN(ObjectSizeMin(NULL), N) to be N,
> we'd want ObjectSizeMin(NULL) to be (unsigned)-1 for consistency.
>
> However, none of this matters if we're not in a position to change the
> specification.
>
> -- Sanjoy
>
>
> > This is also consistent with how @llvm.objectsize already acts: if the
> > pointer it's given is the result of a call to `malloc(N)`,
it'll return N
> > regardless of the value of `min`.
> >
> > But yeah, I agree that some of the features of __builtin_object_size
> aren't
> > entirely intuitive. :)
> >
> > On Mon, Jan 2, 2017 at 12:01 AM, Sanjoy Das <
> sanjoy at playingwithpointers.com>
> > wrote:
> >>
> >> Hi George,
> >>
> >> Have you considered changing our existing behavior to match
GCC's
> >> builtin_object_size instead of adding a new parameter?  That may
be
> >> simpler overall.  There's also a clear upgrade strategy --
fold every
> >> old style call to "<min> ? 0 : 1".
> >>
> >> You probably already know this, but GCC folds
> >> builtin_object_size(0, 0) to -1 and builtin_object_size(0, 2) to
0.
> >> We'll have to have some <min>-awareness either in clang
(to decide if
> >> the <null-is-unknown> bit needs to be set) or in the middle
end.  What
> >> is your plan here?
> >>
> >> I also found gcc's choice of folding builtin_object_size(0, 2)
to 0 and
> >> builtin_object_size(0, 0) to -1 somewhat odd; I'd have
expected the
> >> inverse.  This follows from the following "intuitive"
rules
> >>
> >> ObjSizeMax(X) = UMAX(ObjSizeMax(A), ObjSizeMax(B))
> >> ObjSizeMin(X) = UMIN(ObjSizeMin(A), ObjSizeMin(B))
> >>
> >> (where X is a value that can either be A or B at runtime)
> >>
> >> and that we want to fold
> >>
> >> ObjSizeMax(Malloc(N)) = ObjSizeMin(Malloc(N)) = N
> >>
> >> However, since Malloc can return null:
> >>
> >> ObjSizeMax(Malloc(N)) = UMAX(N, ObjSizeMax(NULL)) = N
> >> ObjSizeMin(Malloc(N)) = UMIN(N, ObjSizeMin(NULL)) = N
> >>
> >> and for that to be true ObjSizeMax(NULL) > >>
builtin_object_size(NULL, 0) needs to be 0 and ObjSizeMin(NULL) > >>
builtin_object_size(NULL, 2) needs to be (unsigned)-1.
> >>
> >> However, I'm not sure if it is up to us to change that; given
the very
> >> motivation of this thread is GCC compatibility.
> >>
> >> -- Sanjoy
> >>
> >> On Sun, Jan 1, 2017 at 10:03 PM, George Burgess IV via llvm-dev
> >> <llvm-dev at lists.llvm.org> wrote:
> >> > Happy New Year ping. :)
> >> >
> >> > Will ping again on Wednesday. If I don't get comments by
EOD Thursday,
> >> > I'll
> >> > assume everyone's OK with this and put together a patch.
> >> >
> >> > On Wed, Dec 21, 2016 at 11:44 AM, George Burgess IV
> >> > <george.burgess.iv at gmail.com> wrote:
> >> >>
> >> >> tl;dr: We'd like to add a bit to @llvm.objectsize to
make it
> optionally
> >> >> be
> >> >> conservative when it's handed a null pointer.
> >> >>
> >> >> Happy Wednesday!
> >> >>
> >> >> We're trying to fix PR23277, which is a bug about how
clang+LLVM
> treat
> >> >> __builtin_object_size when it's given a null pointer.
For
> compatibility
> >> >> with
> >> >> GCC, clang would like to be able to hand back a
conservative result
> >> >> (e.g.
> >> >> (size_t)-1), but the LLVM intrinsic that clang lowers
> >> >> __builtin_object_size(p, N) to, @llvm.objectsize, only
hands back 0
> >> >> when
> >> >> given a null pointer. Because it's often assumed that
> >> >> __builtin_object_size
> >> >> always folds to a constant, handling this only in clang
may be
> tricky:
> >> >> if we
> >> >> emit the IR equivalent to ((p) ? __builtin_object_size(p,
N) : -1ULL)
> >> >> and
> >> >> LLVM can't fold away the null check, we've failed
to emit a constant.
> >> >>
> >> >> So, the best path forward that I can see is to add a
"null is unknown
> >> >> size" bit to @llvm.objectsize, much like the
"min" bit it already
> has.
> >> >> If
> >> >> this bit is true, null would be treated like an opaque
pointer.
> >> >> Otherwise,
> >> >> @llvm.objectsize would act no differently than it does
today.
> >> >>
> >> >> If we like this approach, I'm assuming it would be
best to have this
> >> >> bit
> >> >> as a third argument to @llvm.objectsize, rather than
making the
> second
> >> >> argument an i8 and using it as a bag of bits.
> >> >>
> >> >> All thoughts/comments/alternative approaches/etc. highly
appreciated.
> >> >> :)
> >> >>
> >> >> Thanks (and Happy Holidays)!
> >> >> George
> >> >
> >> >
> >> >
> >> > _______________________________________________
> >> > LLVM Developers mailing list
> >> > llvm-dev at lists.llvm.org
> >> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >> >
> >>
> >>
> >>
> >> --
> >> Sanjoy Das
> >> http://playingwithpointers.com
> >
> >
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170102/47503063/attachment.html>

Sanjoy Das via llvm-dev

2017-Jan-02 23:22 UTC

head link

[llvm-dev] RFC: Allowing @llvm.objectsize to be more conservative with null.

Hi George,

On Mon, Jan 2, 2017 at 1:18 PM, George Burgess IV
<george.burgess.iv at gmail.com> wrote:>> Continuing from above: in other words, since your use case is always
>> setting the `null-is-unknown` bit, can you re-define @llvm.objectsize
>> to have that semantic without adding a new parameter?
>>
>> That's not backward compatible, but there's a simple but
conservative
>> update strategy.
>
> That would almost definitely be the simplest way forward, but I'm
unsure how
> happy this would make other users of objectsize. In particular, our
> objectsize sanitizer will no longer catch any null-related issues like:
>
> struct Foo { int a; };
> int getA(struct Foo *f) { return f->a; }
> int getA2() { struct Foo *f = 0; return f->a; }
I guess that's another sign that the spec is wrong. :)

However, maybe we can use a trick here -- instead of lowering the
check to assert(ObjectSize(f) UGE 4) how about assert((1 +
ObjectSize(f)) UGT 4)?  The addition should always constant fold, and
if `f` is null then we'll assert((1 + (-1)) UGT 4) ==> assert(false).

-- Sanjoy
>
> // getA2 now traps rather than calling a special ubsan-check-failed
function
>
> Grepping through, that's the only other direct user of objectsize that
I can
> find. If we're okay with that loss in accuracy (and don't believe
that we
> have other objectsize users that care), I'm happy to just alter
objectsize's
> default behavior and see how it does.
>
>> However, none of this matters if we're not in a position to change
the
>> specification.
>
> Sadly, we're not. :/
>
> On Mon, Jan 2, 2017 at 11:45 AM, Sanjoy Das <sanjoy at
playingwithpointers.com>
> wrote:
>>
>> Hi George,
>>
>> On Mon, Jan 2, 2017 at 10:41 AM, George Burgess IV
>> <george.burgess.iv at gmail.com> wrote:
>> > Thanks for the comments!
>> >
>> >> Have you considered changing our existing behavior to match
GCC's
>> >> builtin_object_size instead of adding a new parameter
>> >
>> > Yup! My issue with turning `i1 %min` into `i8 %flags` is that
>> > __builtin_object_size would get lowered like:
>> >
>> > __builtin_object_size(p, 0) -> @llvm.objectsize(i8* %p, i8 2)
>> > __builtin_object_size(p, 1) -> @llvm.objectsize(i8* %p, i8 2)
>> > __builtin_object_size(p, 2) -> @llvm.objectsize(i8* %p, i8 3)
>> > (__builtin_object_size(p, 3) doesn't actually get lowered)
>> >
>> > ...Which might be a bit surprising to some people. If we think
that's a
>> > non-issue, I'm happy to take the simpler approach and use an
i8.
>>
>> What I had in mind was:
>>
>>  __builtin_object_size(p, 0) -> @llvm.objectsize(i8* %p, i1 0)
>>  __builtin_object_size(p, 1) -> @llvm.objectsize(i8* %p, i1 0)
>>  __builtin_object_size(p, 2) -> @llvm.objectsize(i8* %p, i1 1)
>>  __builtin_object_size(p, 3) -> @llvm.objectsize(i8* %p, i1 1)
>>
>> and changing the specification of @llvm.objectsize to match.
>>
>> >> We'll have to have some <min>-awareness either in
clang (to decide if
>> >> the <null-is-unknown> bit needs to be set) or in the
middle end.  What
>> >> is your plan here?
>> >
>> > My plan was just to always set the `null-is-unknown` bit when
lowering a
>> > call to __builtin_object_size in clang. If `min` is true, we treat
>> > unknown
>> > and null values identically in @llvm.objectsize, so whether
>> > `null-is-unknown` is set in that case shouldn't matter.
>>
>> Continuing from above: in other words, since your use case is always
>> setting the `null-is-unknown` bit, can you re-define @llvm.objectsize
>> to have that semantic without adding a new parameter?
>>
>> That's not backward compatible, but there's a simple but
conservative
>> update strategy.
>>
>> >> However, since Malloc can return null:
>> >
>> > I think I was unclear: this behavior would only exist if
>> > @llvm.objectsize
>> > was actually handed null. I wasn't planning on changing how
we'd handle
>> > memory allocation functions that may return null (GCC gives back 2
for
>> > __builtin_object_size(malloc(2), 0)). In other words, this
>> > `null-is-unknown`
>> > bit only makes the objectsize evaluator see `T* null` as though it
was
>> > `call
>> > T* @opaque_user_defined_function()`, nothing else.
>>
>> I meant to say that it looks like the semantics of ObjectSizeMin(X) is
>> that "return the conservative minimum object size for all values
that
>> X may take at runtime" (resp. ObjectSizeMax(X)).
>>
>> For instance this:
>>
>> void i(int c, volatile int* sink) {
>>   void* mem1 = malloc(20);
>>   void* mem2 = malloc(40);
>>
>>   *sink = __builtin_object_size(c ? mem1 : mem2, 0);
>>   *sink = __builtin_object_size(c ? mem1 : mem2, 2);
>> }
>>
>> is lowered to
>>
>>         movl    $40, (%rsi)
>>         movl    $20, (%rsi)
>>         ret
>>
>> by GCC.
>>
>> Applying the same logic to malloc(N), since it returns a location that
>> has N dereferenceable bytes or NULL, it follows that
>> ObjectSizeMin(malloc(N)) should return the smaller of
>> ObjectSizeMin(NULL) and ObjectSizeMin(MemoryLocOfNBytes) == the
>> smaller of ObjectSizeMin(NULL) and N.  Given that we want the result
>> of ObjectSizeMin(malloc(N)) == UMIN(ObjectSizeMin(NULL), N) to be N,
>> we'd want ObjectSizeMin(NULL) to be (unsigned)-1 for consistency.
>>
>> However, none of this matters if we're not in a position to change
the
>> specification.
>>
>> -- Sanjoy
>>
>>
>> > This is also consistent with how @llvm.objectsize already acts: if
the
>> > pointer it's given is the result of a call to `malloc(N)`,
it'll return
>> > N
>> > regardless of the value of `min`.
>> >
>> > But yeah, I agree that some of the features of
__builtin_object_size
>> > aren't
>> > entirely intuitive. :)
>> >
>> > On Mon, Jan 2, 2017 at 12:01 AM, Sanjoy Das
>> > <sanjoy at playingwithpointers.com>
>> > wrote:
>> >>
>> >> Hi George,
>> >>
>> >> Have you considered changing our existing behavior to match
GCC's
>> >> builtin_object_size instead of adding a new parameter?  That
may be
>> >> simpler overall.  There's also a clear upgrade strategy --
fold every
>> >> old style call to "<min> ? 0 : 1".
>> >>
>> >> You probably already know this, but GCC folds
>> >> builtin_object_size(0, 0) to -1 and builtin_object_size(0, 2)
to 0.
>> >> We'll have to have some <min>-awareness either in
clang (to decide if
>> >> the <null-is-unknown> bit needs to be set) or in the
middle end.  What
>> >> is your plan here?
>> >>
>> >> I also found gcc's choice of folding
builtin_object_size(0, 2) to 0 and
>> >> builtin_object_size(0, 0) to -1 somewhat odd; I'd have
expected the
>> >> inverse.  This follows from the following
"intuitive" rules
>> >>
>> >> ObjSizeMax(X) = UMAX(ObjSizeMax(A), ObjSizeMax(B))
>> >> ObjSizeMin(X) = UMIN(ObjSizeMin(A), ObjSizeMin(B))
>> >>
>> >> (where X is a value that can either be A or B at runtime)
>> >>
>> >> and that we want to fold
>> >>
>> >> ObjSizeMax(Malloc(N)) = ObjSizeMin(Malloc(N)) = N
>> >>
>> >> However, since Malloc can return null:
>> >>
>> >> ObjSizeMax(Malloc(N)) = UMAX(N, ObjSizeMax(NULL)) = N
>> >> ObjSizeMin(Malloc(N)) = UMIN(N, ObjSizeMin(NULL)) = N
>> >>
>> >> and for that to be true ObjSizeMax(NULL) >> >>
builtin_object_size(NULL, 0) needs to be 0 and ObjSizeMin(NULL) >>
>> builtin_object_size(NULL, 2) needs to be (unsigned)-1.
>> >>
>> >> However, I'm not sure if it is up to us to change that;
given the very
>> >> motivation of this thread is GCC compatibility.
>> >>
>> >> -- Sanjoy
>> >>
>> >> On Sun, Jan 1, 2017 at 10:03 PM, George Burgess IV via
llvm-dev
>> >> <llvm-dev at lists.llvm.org> wrote:
>> >> > Happy New Year ping. :)
>> >> >
>> >> > Will ping again on Wednesday. If I don't get comments
by EOD
>> >> > Thursday,
>> >> > I'll
>> >> > assume everyone's OK with this and put together a
patch.
>> >> >
>> >> > On Wed, Dec 21, 2016 at 11:44 AM, George Burgess IV
>> >> > <george.burgess.iv at gmail.com> wrote:
>> >> >>
>> >> >> tl;dr: We'd like to add a bit to @llvm.objectsize
to make it
>> >> >> optionally
>> >> >> be
>> >> >> conservative when it's handed a null pointer.
>> >> >>
>> >> >> Happy Wednesday!
>> >> >>
>> >> >> We're trying to fix PR23277, which is a bug about
how clang+LLVM
>> >> >> treat
>> >> >> __builtin_object_size when it's given a null
pointer. For
>> >> >> compatibility
>> >> >> with
>> >> >> GCC, clang would like to be able to hand back a
conservative result
>> >> >> (e.g.
>> >> >> (size_t)-1), but the LLVM intrinsic that clang lowers
>> >> >> __builtin_object_size(p, N) to, @llvm.objectsize,
only hands back 0
>> >> >> when
>> >> >> given a null pointer. Because it's often assumed
that
>> >> >> __builtin_object_size
>> >> >> always folds to a constant, handling this only in
clang may be
>> >> >> tricky:
>> >> >> if we
>> >> >> emit the IR equivalent to ((p) ?
__builtin_object_size(p, N) :
>> >> >> -1ULL)
>> >> >> and
>> >> >> LLVM can't fold away the null check, we've
failed to emit a
>> >> >> constant.
>> >> >>
>> >> >> So, the best path forward that I can see is to add a
"null is
>> >> >> unknown
>> >> >> size" bit to @llvm.objectsize, much like the
"min" bit it already
>> >> >> has.
>> >> >> If
>> >> >> this bit is true, null would be treated like an
opaque pointer.
>> >> >> Otherwise,
>> >> >> @llvm.objectsize would act no differently than it
does today.
>> >> >>
>> >> >> If we like this approach, I'm assuming it would
be best to have this
>> >> >> bit
>> >> >> as a third argument to @llvm.objectsize, rather than
making the
>> >> >> second
>> >> >> argument an i8 and using it as a bag of bits.
>> >> >>
>> >> >> All thoughts/comments/alternative approaches/etc.
highly
>> >> >> appreciated.
>> >> >> :)
>> >> >>
>> >> >> Thanks (and Happy Holidays)!
>> >> >> George
>> >> >
>> >> >
>> >> >
>> >> > _______________________________________________
>> >> > LLVM Developers mailing list
>> >> > llvm-dev at lists.llvm.org
>> >> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Sanjoy Das
>> >> http://playingwithpointers.com
>> >
>> >
>
>

llvm dev - Jan 2017 - RFC: Allowing @llvm.objectsize to be more conservative with null.

[llvm-dev] RFC: Allowing @llvm.objectsize to be more conservative with null.

[llvm-dev] RFC: Allowing @llvm.objectsize to be more conservative with null.

[llvm-dev] RFC: Allowing @llvm.objectsize to be more conservative with null.