thr3ads.net - llvm dev - [llvm-dev] [RFC] A proposal for byval in a world with opaque pointers [Jan 2016]

If this information is useful, please help other people find it:
Share via:

Eddy B. via llvm-dev

2016-Jan-19 22:47 UTC

[llvm-dev] [RFC] A proposal for byval in a world with opaque pointers

Hi,

In the past months, several options have been presented for making byval
(and similar attributes, such as inalloca or sret) work with opaque pointers.

The main two I've seen were byval(T) and byval(N) where N is the size of T.

They both have their upsides and downsides, for example: byval(T) would be
a type-parametric attribute, which, AFAIK, does not already exist and may
complicate the attribute system significantly, while byval(N) would be hard
to introduce in tests as computing N from T requires LLVM's DataLayout.

Also, this would have to be done for inalloca and sret as well - sret only
needs it when targeting SPARC, although still generally useful in analysis.

To sidestep some of the concerns and allow a smooth transition towards a
byval that works with opaque pointers, I've come up with a new approach:

Reuse dereferenceable(S) and align A for the size and alignment of byval.

That is, a byval dereferenceable(S) align A argument is guaranteed to have
S bytes available to read from, *and only S*, aligned to a multiple of A.
Reading past that size is UB, as LLVM will not copy more than S bytes.

An API can be provided to add the attribute alongside dereferenceable
and align attributes, for a given Type* and DataLayout.

A preliminary implementation (w/o sret) can be found at:
https://github.com/eddyb/llvm/compare/2579466...65ac99b

To maintain compatibility with existing code, dereferenceable and align
attributes are automatically injected as soon as a non-default DataLayout
is available. The "injection" mechanism could potentially be replaced
with
a pass, although it was easier to experiment with it being guaranteed.

This works out pretty well in practice, as analysis already understands
dereferenceable and can make decisions based on it.

The verifier checks that for byval & friends, dereferenceable(S) and
align A are present (clang always adds align, but not all tests have it)
and that S is the exact size of the pointee type (while we still know that).

That last bit is very important, because it allows a script to do the following:

1. Find all byval arguments in tests that are missing dereferenceable, e.g.
    ... i32* byval align 4 ...
    .... {i8, i64}* byval ...
2. Add a bogus dereferenceable(unique ID) to each of them, i.e.
    ... i32* byval dereferenceable(123400001) align 4 ...
    .... {i8, i16}* byval dereferenceable(123400002) ...
3. Run the tests and record the errors, which may look like:

Attribute 'byval' expects 'dereferenceable(4)' for type i32*,
    found 'dereferenceable(123400001)'

Attribute 'byval' expects 'dereferenceable(16) align 8' for type
{i8, i64}*,
    found 'dereferenceable(123400002)'

4. Use the verifier error messages to replace the bogus attributes
with the proper ones, which include align A when it is missing:
    ... i32* byval dereferenceable(4) align 4 ...
    .... {i8, i16}* byval dereferenceable(16) align 8 ...

For what is worth, the same scheme would also work for byval(N), and
would be entirely unnecessary for byval(T).

I would love to know your thoughts on this, and more specifically:
Which of the 3 (byval(T), byval(N) and byval + dereferenceable + align)
do you think would provide the easiest transition path for front-ends?

Thank you,
 - eddyb

David Blaikie via llvm-dev

2016-Jan-19 23:09 UTC

head link

[llvm-dev] [RFC] A proposal for byval in a world with opaque pointers

+Chandler, Duncan, David M, and Reid (pseudorandom selection of people who
might be interested/voiced opinions on prior threads/conversations on the
subject)

Firstly, thanks Eddy for doing a lot of work in the opaque pointer area -
it's greatly appreciated. This is a particularly thorny part of it (or, at
least I think so at the moment - sounds like you've seen thornier parts
further down the trail).

On Tue, Jan 19, 2016 at 2:47 PM, Eddy B. via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Hi,
>
> In the past months, several options have been presented for making byval
> (and similar attributes, such as inalloca or sret) work with opaque
> pointers.
>
> The main two I've seen were byval(T) and byval(N) where N is the size
of T.
>
> They both have their upsides and downsides, for example: byval(T) would be
> a type-parametric attribute, which, AFAIK, does not already exist and may
> complicate the attribute system significantly, while byval(N) would be hard
> to introduce in tests as computing N from T requires LLVM's DataLayout.
>
By "hard to introduce in tests" you just mean it's hard to edit
the test
cases to add the correct value of N automatically? Yeah - it's certainly
not a simple sed script like my previous transformations.

> Also, this would have to be done for inalloca and sret as well - sret only
> needs it when targeting SPARC, although still generally useful in analysis.
>
Reid, Dave - bringing this to your attention in case you've got any
thoughts on inalloca/sret (was sret part of the Windows-y things, or just
inalloca?), how they should be handled, if it should be different/the same
as byval, etc?

> To sidestep some of the concerns and allow a smooth transition towards a
> byval that works with opaque pointers, I've come up with a new
approach:
>
> Reuse dereferenceable(S) and align A for the size and alignment of byval.
>
Side note: align should already be what's used for the alignment of byval.
I think I remember seeing that there was one target hook and one target
using  that hook, that did some backend alignment fixes. If that is still
present, we should fix/remove it so that the align attribute is the one
source of truth for the alignment of a byval parameter. This should happen
before any changes to byval.

The dereferenceable part is the novelty here and I really don't know how
everyone feels about it. I don't much mind - I suppose I like it because it
means optimizations can, maybe, hack on byval parameters more effectively
without having to have byval-specific smarts necessarily. (also, because
you've already done a bunch of the work)

> That is, a byval dereferenceable(S) align A argument is guaranteed to have
> S bytes available to read from, *and only S*, aligned to a multiple of A.
> Reading past that size is UB, as LLVM will not copy more than S bytes.
>
> An API can be provided to add the attribute alongside dereferenceable
> and align attributes, for a given Type* and DataLayout.
>
> A preliminary implementation (w/o sret) can be found at:
> https://github.com/eddyb/llvm/compare/2579466...65ac99b
>
> To maintain compatibility with existing code,

^ this isn't actually a goal/necessity, fwiw. So we might not need the
complexity you're outlining below. Or if we do choose to take it, we may
only want/need to keep it around for a relatively short period.

> dereferenceable and align
> attributes are automatically injected as soon as a non-default DataLayout
> is available. The "injection" mechanism could potentially be
replaced with
> a pass, although it was easier to experiment with it being guaranteed.
>
> This works out pretty well in practice, as analysis already understands
> dereferenceable and can make decisions based on it.
>
> The verifier checks that for byval & friends, dereferenceable(S) and
> align A are present (clang always adds align, but not all tests have it)
>
Do all tests need it? Align is just for non-default alignment (> align(1))
so it may be OK for the tests to omit it & it's probably not right to
/require/ an align attribute on a byval parameter.

> and that S is the exact size of the pointee type (while we still know
> that).
>
> That last bit is very important, because it allows a script to do the
> following:
>
> 1. Find all byval arguments in tests that are missing dereferenceable, e.g.
>     ... i32* byval align 4 ...
>     .... {i8, i64}* byval ...
> 2. Add a bogus dereferenceable(unique ID) to each of them, i.e.
>     ... i32* byval dereferenceable(123400001) align 4 ...
>     .... {i8, i16}* byval dereferenceable(123400002) ...
> 3. Run the tests and record the errors, which may look like:
>
> Attribute 'byval' expects 'dereferenceable(4)' for type
i32*,
>     found 'dereferenceable(123400001)'
>
> Attribute 'byval' expects 'dereferenceable(16) align 8' for
type {i8,
> i64}*,
>     found 'dereferenceable(123400002)'
>
> 4. Use the verifier error messages to replace the bogus attributes
> with the proper ones, which include align A when it is missing:
>     ... i32* byval dereferenceable(4) align 4 ...
>     .... {i8, i16}* byval dereferenceable(16) align 8 ...
>
> For what is worth, the same scheme would also work for byval(N), and
> would be entirely unnecessary for byval(T).
>
> I would love to know your thoughts on this, and more specifically:
> Which of the 3 (byval(T), byval(N) and byval + dereferenceable + align)
> do you think would provide the easiest transition path for front-ends?
>
> Thank you,
>  - eddyb
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160119/ac98fc59/attachment.html>

Antoine Pitrou via llvm-dev

2016-Jan-19 23:11 UTC

head link

[llvm-dev] [RFC] A proposal for byval in a world with opaque pointers

On Wed, 20 Jan 2016 00:47:56 +0200
"Eddy B. via llvm-dev" <llvm-dev at lists.llvm.org>
wrote:> 
> I would love to know your thoughts on this, and more specifically:
> Which of the 3 (byval(T), byval(N) and byval + dereferenceable + align)
> do you think would provide the easiest transition path for front-ends?
We (llvmlite and Numba) don't use byval specifically but, were we to use
it, I think byval(T) would by far be the easiest for us.  By contrast,
computing ABI sizes is a bit of a PITA due to llvmlite's architecture
(llvmlite's IR generation side doesn't call into LLVM at all, it
generates textual IR in pure Python).

(as a matter of fact, I plan to keep typed pointers in llvmlite,
because they provide invaluable help discovering bugs early rather than
late - i.e. giving a proper error rather than crashing mysteriously)

Regards

Antoine.

Eddy B. via llvm-dev

2016-Jan-19 23:16 UTC

head link

[llvm-dev] [RFC] A proposal for byval in a world with opaque pointers

> Do all tests need it? Align is just for non-default alignment (>
align(1)) so it may be OK for the tests to omit it & it's probably not
right to /require/ an align attribute on a byval parameter.
I've had to fix some of the tests which checked that the right
alignment was applied,
by looking at the generated asm, but had no "align A" attributes and
relied on the
target alignment guessing, which I had taken out in my experimental
implementation.

The rest of the tests might be fine, but could behave surprisingly
upon closer inspection.

2016-01-20 1:09 GMT+02:00 David Blaikie <dblaikie at
gmail.com>:> +Chandler, Duncan, David M, and Reid (pseudorandom selection of people who
> might be interested/voiced opinions on prior threads/conversations on the
> subject)
>
> Firstly, thanks Eddy for doing a lot of work in the opaque pointer area -
> it's greatly appreciated. This is a particularly thorny part of it (or,
at
> least I think so at the moment - sounds like you've seen thornier parts
> further down the trail).
>
> On Tue, Jan 19, 2016 at 2:47 PM, Eddy B. via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>>
>> Hi,
>>
>> In the past months, several options have been presented for making
byval
>> (and similar attributes, such as inalloca or sret) work with opaque
>> pointers.
>>
>> The main two I've seen were byval(T) and byval(N) where N is the
size of
>> T.
>>
>> They both have their upsides and downsides, for example: byval(T) would
be
>> a type-parametric attribute, which, AFAIK, does not already exist and
may
>> complicate the attribute system significantly, while byval(N) would be
>> hard
>> to introduce in tests as computing N from T requires LLVM's
DataLayout.
>
>
> By "hard to introduce in tests" you just mean it's hard to
edit the test
> cases to add the correct value of N automatically? Yeah - it's
certainly not
> a simple sed script like my previous transformations.
>
>>
>> Also, this would have to be done for inalloca and sret as well - sret
only
>> needs it when targeting SPARC, although still generally useful in
>> analysis.
>
>
> Reid, Dave - bringing this to your attention in case you've got any
thoughts
> on inalloca/sret (was sret part of the Windows-y things, or just
inalloca?),
> how they should be handled, if it should be different/the same as byval,
> etc?
>
>>
>> To sidestep some of the concerns and allow a smooth transition towards
a
>> byval that works with opaque pointers, I've come up with a new
approach:
>>
>> Reuse dereferenceable(S) and align A for the size and alignment of
byval.
>
>
> Side note: align should already be what's used for the alignment of
byval. I
> think I remember seeing that there was one target hook and one target using
> that hook, that did some backend alignment fixes. If that is still present,
> we should fix/remove it so that the align attribute is the one source of
> truth for the alignment of a byval parameter. This should happen before any
> changes to byval.
>
> The dereferenceable part is the novelty here and I really don't know
how
> everyone feels about it. I don't much mind - I suppose I like it
because it
> means optimizations can, maybe, hack on byval parameters more effectively
> without having to have byval-specific smarts necessarily. (also, because
> you've already done a bunch of the work)
>
>>
>> That is, a byval dereferenceable(S) align A argument is guaranteed to
have
>> S bytes available to read from, *and only S*, aligned to a multiple of
A.
>> Reading past that size is UB, as LLVM will not copy more than S bytes.
>>
>> An API can be provided to add the attribute alongside dereferenceable
>> and align attributes, for a given Type* and DataLayout.
>>
>> A preliminary implementation (w/o sret) can be found at:
>> https://github.com/eddyb/llvm/compare/2579466...65ac99b
>>
>> To maintain compatibility with existing code,
>
>
> ^ this isn't actually a goal/necessity, fwiw. So we might not need the
> complexity you're outlining below. Or if we do choose to take it, we
may
> only want/need to keep it around for a relatively short period.
>
>>
>> dereferenceable and align
>> attributes are automatically injected as soon as a non-default
DataLayout
>> is available. The "injection" mechanism could potentially be
replaced with
>> a pass, although it was easier to experiment with it being guaranteed.
>>
>> This works out pretty well in practice, as analysis already understands
>> dereferenceable and can make decisions based on it.
>>
>> The verifier checks that for byval & friends, dereferenceable(S)
and
>> align A are present (clang always adds align, but not all tests have
it)
>
>
> Do all tests need it? Align is just for non-default alignment (>
align(1))
> so it may be OK for the tests to omit it & it's probably not right
to
> /require/ an align attribute on a byval parameter.
>
>>
>> and that S is the exact size of the pointee type (while we still know
>> that).
>>
>> That last bit is very important, because it allows a script to do the
>> following:
>>
>> 1. Find all byval arguments in tests that are missing dereferenceable,
>> e.g.
>>     ... i32* byval align 4 ...
>>     .... {i8, i64}* byval ...
>> 2. Add a bogus dereferenceable(unique ID) to each of them, i.e.
>>     ... i32* byval dereferenceable(123400001) align 4 ...
>>     .... {i8, i16}* byval dereferenceable(123400002) ...
>> 3. Run the tests and record the errors, which may look like:
>>
>> Attribute 'byval' expects 'dereferenceable(4)' for type
i32*,
>>     found 'dereferenceable(123400001)'
>>
>> Attribute 'byval' expects 'dereferenceable(16) align 8'
for type {i8,
>> i64}*,
>>     found 'dereferenceable(123400002)'
>>
>> 4. Use the verifier error messages to replace the bogus attributes
>> with the proper ones, which include align A when it is missing:
>>     ... i32* byval dereferenceable(4) align 4 ...
>>     .... {i8, i16}* byval dereferenceable(16) align 8 ...
>>
>> For what is worth, the same scheme would also work for byval(N), and
>> would be entirely unnecessary for byval(T).
>>
>> I would love to know your thoughts on this, and more specifically:
>> Which of the 3 (byval(T), byval(N) and byval + dereferenceable + align)
>> do you think would provide the easiest transition path for front-ends?
>>
>> Thank you,
>>  - eddyb
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>

Eddy B. via llvm-dev

2016-Jan-19 23:26 UTC

head link

[llvm-dev] [RFC] A proposal for byval in a world with opaque pointers

2016-01-20 1:11 GMT+02:00 Antoine Pitrou via llvm-dev <
llvm-dev at lists.llvm.org>:
> On Wed, 20 Jan 2016 00:47:56 +0200
> "Eddy B. via llvm-dev" <llvm-dev at lists.llvm.org> wrote:
> >
> > I would love to know your thoughts on this, and more specifically:
> > Which of the 3 (byval(T), byval(N) and byval + dereferenceable +
align)
> > do you think would provide the easiest transition path for front-ends?
>
> We (llvmlite and Numba) don't use byval specifically but, were we to
use
> it, I think byval(T) would by far be the easiest for us.  By contrast,
> computing ABI sizes is a bit of a PITA due to llvmlite's architecture
> (llvmlite's IR generation side doesn't call into LLVM at all, it
> generates textual IR in pure Python).
>If you have the datalayout string, the ABI sizes become almost trivial to
compute, the algorithms for structs are simple and I'm not aware of any
special-cases that would complicate a front-end's computation.

> (as a matter of fact, I plan to keep typed pointers in llvmlite,
> because they provide invaluable help discovering bugs early rather than
> late - i.e. giving a proper error rather than crashing mysteriously)
>I wholeheartedly agree: IMO, all LLVM front-ends should attach their own
typesystem information to LLVM values instead of passing bare Value*
around, or the textual equivalent, in your case.

In rustc we didn't do this from the start and it's become a complete
nightmare to work with those old parts of the codebase.

> Regards
>
> Antoine.
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160120/bfac4229/attachment.html>

Hal Finkel via llvm-dev

2016-Jan-20 06:49 UTC

head link

[llvm-dev] [RFC] A proposal for byval in a world with opaque pointers

----- Original Message -----> From: "Eddy B. via llvm-dev" <llvm-dev at lists.llvm.org>
> To: llvm-dev at lists.llvm.org
> Sent: Tuesday, January 19, 2016 4:47:56 PM
> Subject: [llvm-dev] [RFC] A proposal for byval in a world with opaque
pointers
> 
> Hi,
> 
> In the past months, several options have been presented for making
> byval
> (and similar attributes, such as inalloca or sret) work with opaque
> pointers.
> 
> The main two I've seen were byval(T) and byval(N) where N is the size
> of T.
> 
> They both have their upsides and downsides, for example: byval(T)
> would be
> a type-parametric attribute, which, AFAIK, does not already exist and
> may
> complicate the attribute system significantly, while byval(N) would
> be hard
> to introduce in tests as computing N from T requires LLVM's
> DataLayout.
>
> 
> Also, this would have to be done for inalloca and sret as well - sret
> only
> needs it when targeting SPARC, although still generally useful in
> analysis.
> 
> To sidestep some of the concerns and allow a smooth transition
> towards a
> byval that works with opaque pointers, I've come up with a new
> approach:
> 
> Reuse dereferenceable(S) and align A for the size and alignment of
> byval.
> 
> That is, a byval dereferenceable(S) align A argument is guaranteed to
> have
> S bytes available to read from, *and only S*, aligned to a multiple
> of A.
> Reading past that size is UB, as LLVM will not copy more than S
> bytes.
> 
> An API can be provided to add the attribute alongside dereferenceable
> and align attributes, for a given Type* and DataLayout.
This sounds like a reasonable plan to me.

 -Hal
> 
> A preliminary implementation (w/o sret) can be found at:
> https://github.com/eddyb/llvm/compare/2579466...65ac99b
> 
> To maintain compatibility with existing code, dereferenceable and
> align
> attributes are automatically injected as soon as a non-default
> DataLayout
> is available. The "injection" mechanism could potentially be
replaced
> with
> a pass, although it was easier to experiment with it being
> guaranteed.
> 
> This works out pretty well in practice, as analysis already
> understands
> dereferenceable and can make decisions based on it.
> 
> The verifier checks that for byval & friends, dereferenceable(S) and
> align A are present (clang always adds align, but not all tests have
> it)
> and that S is the exact size of the pointee type (while we still know
> that).
> 
> That last bit is very important, because it allows a script to do the
> following:
> 
> 1. Find all byval arguments in tests that are missing
> dereferenceable, e.g.
>     ... i32* byval align 4 ...
>     .... {i8, i64}* byval ...
> 2. Add a bogus dereferenceable(unique ID) to each of them, i.e.
>     ... i32* byval dereferenceable(123400001) align 4 ...
>     .... {i8, i16}* byval dereferenceable(123400002) ...
> 3. Run the tests and record the errors, which may look like:
> 
> Attribute 'byval' expects 'dereferenceable(4)' for type
i32*,
>     found 'dereferenceable(123400001)'
> 
> Attribute 'byval' expects 'dereferenceable(16) align 8' for
type {i8,
> i64}*,
>     found 'dereferenceable(123400002)'
> 
> 4. Use the verifier error messages to replace the bogus attributes
> with the proper ones, which include align A when it is missing:
>     ... i32* byval dereferenceable(4) align 4 ...
>     .... {i8, i16}* byval dereferenceable(16) align 8 ...
> 
> For what is worth, the same scheme would also work for byval(N), and
> would be entirely unnecessary for byval(T).
> 
> I would love to know your thoughts on this, and more specifically:
> Which of the 3 (byval(T), byval(N) and byval + dereferenceable +
> align)
> do you think would provide the easiest transition path for
> front-ends?
> 
> Thank you,
>  - eddyb
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> 
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Manuel Jacob via llvm-dev

2016-Jan-24 03:16 UTC

head link

[llvm-dev] [RFC] A proposal for byval in a world with opaque pointers

Hi,

As there seems to be some concern that adding type attributes 
complicates the attributes system, I decided to experiment with 
implementing type attributes and byval as a type attribute.

Adding type attributes required mostly adding boilerplate code.  The 
only slightly tricky part was updating the types in the linker 
(IRMover.cpp).  This however seems to follow the common pattern in 
IRMover.cpp.  See D16515 for the implementation [1].

Adding the byval attribute on top of this was straightforward as well 
and probably similar to adding an integer attribute.  The tricky bit 
here was bitcode compatibility.  When reading the attribute group, we 
don't have access to the parameter types, so the current implementation 
casts `-1` to Type * and fixes the attribute later (when the attribute 
set is referenced by an actual function).  A similar hack would likely 
also be needed if it was an integer attribute.  See D16516 for the 
implementation [2].


I think we can only go from byval(<type>) or byval(<size>) to 
dereferenceable(<size>), but not the other way around.  In particular, 
the current definition of the dereferenceable attribute contains "It is 
legal for the number of bytes to be less than the size of the pointee 
type.".

-Manuel

[1] http://reviews.llvm.org/D16515
[2] http://reviews.llvm.org/D16516

On 2016-01-19 23:47, Eddy B. via llvm-dev wrote:> Hi,
> 
> In the past months, several options have been presented for making 
> byval
> (and similar attributes, such as inalloca or sret) work with opaque 
> pointers.
> 
> The main two I've seen were byval(T) and byval(N) where N is the size 
> of T.
> 
> They both have their upsides and downsides, for example: byval(T) would 
> be
> a type-parametric attribute, which, AFAIK, does not already exist and 
> may
> complicate the attribute system significantly, while byval(N) would be 
> hard
> to introduce in tests as computing N from T requires LLVM's DataLayout.
> 
> Also, this would have to be done for inalloca and sret as well - sret 
> only
> needs it when targeting SPARC, although still generally useful in 
> analysis.
> 
> To sidestep some of the concerns and allow a smooth transition towards 
> a
> byval that works with opaque pointers, I've come up with a new 
> approach:
> 
> Reuse dereferenceable(S) and align A for the size and alignment of 
> byval.
> 
> That is, a byval dereferenceable(S) align A argument is guaranteed to 
> have
> S bytes available to read from, *and only S*, aligned to a multiple of 
> A.
> Reading past that size is UB, as LLVM will not copy more than S bytes.
> 
> An API can be provided to add the attribute alongside dereferenceable
> and align attributes, for a given Type* and DataLayout.
> 
> A preliminary implementation (w/o sret) can be found at:
> https://github.com/eddyb/llvm/compare/2579466...65ac99b
> 
> To maintain compatibility with existing code, dereferenceable and align
> attributes are automatically injected as soon as a non-default 
> DataLayout
> is available. The "injection" mechanism could potentially be
replaced
> with
> a pass, although it was easier to experiment with it being guaranteed.
> 
> This works out pretty well in practice, as analysis already understands
> dereferenceable and can make decisions based on it.
> 
> The verifier checks that for byval & friends, dereferenceable(S) and
> align A are present (clang always adds align, but not all tests have 
> it)
> and that S is the exact size of the pointee type (while we still know 
> that).
> 
> That last bit is very important, because it allows a script to do the 
> following:
> 
> 1. Find all byval arguments in tests that are missing dereferenceable, 
> e.g.
>     ... i32* byval align 4 ...
>     .... {i8, i64}* byval ...
> 2. Add a bogus dereferenceable(unique ID) to each of them, i.e.
>     ... i32* byval dereferenceable(123400001) align 4 ...
>     .... {i8, i16}* byval dereferenceable(123400002) ...
> 3. Run the tests and record the errors, which may look like:
> 
> Attribute 'byval' expects 'dereferenceable(4)' for type
i32*,
>     found 'dereferenceable(123400001)'
> 
> Attribute 'byval' expects 'dereferenceable(16) align 8' for
type {i8,
> i64}*,
>     found 'dereferenceable(123400002)'
> 
> 4. Use the verifier error messages to replace the bogus attributes
> with the proper ones, which include align A when it is missing:
>     ... i32* byval dereferenceable(4) align 4 ...
>     .... {i8, i16}* byval dereferenceable(16) align 8 ...
> 
> For what is worth, the same scheme would also work for byval(N), and
> would be entirely unnecessary for byval(T).
> 
> I would love to know your thoughts on this, and more specifically:
> Which of the 3 (byval(T), byval(N) and byval + dereferenceable + align)
> do you think would provide the easiest transition path for front-ends?
> 
> Thank you,
>  - eddyb
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Philip Reames via llvm-dev

2016-Jan-24 18:25 UTC

head link

[llvm-dev] [RFC] A proposal for byval in a world with opaque pointers

On 01/19/2016 02:47 PM, Eddy B. via llvm-dev wrote:> Hi,
>
> In the past months, several options have been presented for making byval
> (and similar attributes, such as inalloca or sret) work with opaque
pointers.
>
> The main two I've seen were byval(T) and byval(N) where N is the size
of T.
>
> They both have their upsides and downsides, for example: byval(T) would be
> a type-parametric attribute, which, AFAIK, does not already exist and may
> complicate the attribute system significantly, while byval(N) would be hard
> to introduce in tests as computing N from T requires LLVM's DataLayout.
>
> Also, this would have to be done for inalloca and sret as well - sret only
> needs it when targeting SPARC, although still generally useful in analysis.
>
> To sidestep some of the concerns and allow a smooth transition towards a
> byval that works with opaque pointers, I've come up with a new
approach:
>
> Reuse dereferenceable(S) and align A for the size and alignment of byval.
>
> That is, a byval dereferenceable(S) align A argument is guaranteed to have
> S bytes available to read from, *and only S*, aligned to a multiple of A.
> Reading past that size is UB, as LLVM will not copy more than S bytes.Just to make sure I understand, you are *not* planning on changing the 
semantics of the existing attributes right?  The current semantics for 
"dereferenceable(N)" are "N bytes are known to be
dereferenceable, more
bytes might be".  What worried my in your wording was the UB comment.  
It is not currently UB to have a read beyond the *known* dereferenceable 
bytes.  This is important to me and I would be very hesitant to change
it.>
> An API can be provided to add the attribute alongside dereferenceable
> and align attributes, for a given Type* and DataLayout.
>
> A preliminary implementation (w/o sret) can be found at:
> https://github.com/eddyb/llvm/compare/2579466...65ac99b
>
> To maintain compatibility with existing code, dereferenceable and align
> attributes are automatically injected as soon as a non-default DataLayout
> is available. The "injection" mechanism could potentially be
replaced with
> a pass, although it was easier to experiment with it being guaranteed.
>
> This works out pretty well in practice, as analysis already understands
> dereferenceable and can make decisions based on it.
>
> The verifier checks that for byval & friends, dereferenceable(S) and
> align A are present (clang always adds align, but not all tests have it)
> and that S is the exact size of the pointee type (while we still know
that).
>
> That last bit is very important, because it allows a script to do the
following:
>
> 1. Find all byval arguments in tests that are missing dereferenceable, e.g.
>      ... i32* byval align 4 ...
>      .... {i8, i64}* byval ...
> 2. Add a bogus dereferenceable(unique ID) to each of them, i.e.
>      ... i32* byval dereferenceable(123400001) align 4 ...
>      .... {i8, i16}* byval dereferenceable(123400002) ...
> 3. Run the tests and record the errors, which may look like:
>
> Attribute 'byval' expects 'dereferenceable(4)' for type
i32*,
>      found 'dereferenceable(123400001)'
>
> Attribute 'byval' expects 'dereferenceable(16) align 8' for
type {i8, i64}*,
>      found 'dereferenceable(123400002)'
>
> 4. Use the verifier error messages to replace the bogus attributes
> with the proper ones, which include align A when it is missing:
>      ... i32* byval dereferenceable(4) align 4 ...
>      .... {i8, i16}* byval dereferenceable(16) align 8 ...
>
> For what is worth, the same scheme would also work for byval(N), and
> would be entirely unnecessary for byval(T).
>
> I would love to know your thoughts on this, and more specifically:
> Which of the 3 (byval(T), byval(N) and byval + dereferenceable + align)
> do you think would provide the easiest transition path for front-ends?
>
> Thank you,
>   - eddyb
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Eddy B. via llvm-dev

2016-Jan-24 20:52 UTC

head link

[llvm-dev] [RFC] A proposal for byval in a world with opaque pointers

2016-01-24 20:25 GMT+02:00 Philip Reames <listmail at philipreames.com>:
>
>
> On 01/19/2016 02:47 PM, Eddy B. via llvm-dev wrote:
>
>> Hi,
>>
>> In the past months, several options have been presented for making
byval
>> (and similar attributes, such as inalloca or sret) work with opaque
>> pointers.
>>
>> The main two I've seen were byval(T) and byval(N) where N is the
size of
>> T.
>>
>> They both have their upsides and downsides, for example: byval(T) would
be
>> a type-parametric attribute, which, AFAIK, does not already exist and
may
>> complicate the attribute system significantly, while byval(N) would be
>> hard
>> to introduce in tests as computing N from T requires LLVM's
DataLayout.
>>
>> Also, this would have to be done for inalloca and sret as well - sret
only
>> needs it when targeting SPARC, although still generally useful in
>> analysis.
>>
>> To sidestep some of the concerns and allow a smooth transition towards
a
>> byval that works with opaque pointers, I've come up with a new
approach:
>>
>> Reuse dereferenceable(S) and align A for the size and alignment of
byval.
>>
>> That is, a byval dereferenceable(S) align A argument is guaranteed to
have
>> S bytes available to read from, *and only S*, aligned to a multiple of
A.
>> Reading past that size is UB, as LLVM will not copy more than S bytes.
>>
> Just to make sure I understand, you are *not* planning on changing the
> semantics of the existing attributes right?  The current semantics for
> "dereferenceable(N)" are "N bytes are known to be
dereferenceable, more
> bytes might be".  What worried my in your wording was the UB comment. 
It
> is not currently UB to have a read beyond the *known* dereferenceable
> bytes.  This is important to me and I would be very hesitant to change it.
The semantics I described were for byval+dereferenceable(N), not
dereferenceable(N) alone,
i.e. LLVM will only guarantee exactly N bytes will be copied, no less, no
more.

Although with mjacob prototyping byval(T) so quickly, I do feel that my
work is obsoleted.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160124/92348a7b/attachment.html>

Possibly Parallel Threads

Search for more maybe matching threads

llvm dev - Jan 2016 - [RFC] A proposal for byval in a world with opaque pointers

[llvm-dev] [RFC] A proposal for byval in a world with opaque pointers

[llvm-dev] [RFC] A proposal for byval in a world with opaque pointers

[llvm-dev] [RFC] A proposal for byval in a world with opaque pointers

[llvm-dev] [RFC] A proposal for byval in a world with opaque pointers

[llvm-dev] [RFC] A proposal for byval in a world with opaque pointers

[llvm-dev] [RFC] A proposal for byval in a world with opaque pointers

[llvm-dev] [RFC] A proposal for byval in a world with opaque pointers

[llvm-dev] [RFC] A proposal for byval in a world with opaque pointers

[llvm-dev] [RFC] A proposal for byval in a world with opaque pointers

Possibly Parallel Threads