Eddy B. via llvm-dev
2016-Jan-19 22:47 UTC
[llvm-dev] [RFC] A proposal for byval in a world with opaque pointers
Hi, In the past months, several options have been presented for making byval (and similar attributes, such as inalloca or sret) work with opaque pointers. The main two I've seen were byval(T) and byval(N) where N is the size of T. They both have their upsides and downsides, for example: byval(T) would be a type-parametric attribute, which, AFAIK, does not already exist and may complicate the attribute system significantly, while byval(N) would be hard to introduce in tests as computing N from T requires LLVM's DataLayout. Also, this would have to be done for inalloca and sret as well - sret only needs it when targeting SPARC, although still generally useful in analysis. To sidestep some of the concerns and allow a smooth transition towards a byval that works with opaque pointers, I've come up with a new approach: Reuse dereferenceable(S) and align A for the size and alignment of byval. That is, a byval dereferenceable(S) align A argument is guaranteed to have S bytes available to read from, *and only S*, aligned to a multiple of A. Reading past that size is UB, as LLVM will not copy more than S bytes. An API can be provided to add the attribute alongside dereferenceable and align attributes, for a given Type* and DataLayout. A preliminary implementation (w/o sret) can be found at: https://github.com/eddyb/llvm/compare/2579466...65ac99b To maintain compatibility with existing code, dereferenceable and align attributes are automatically injected as soon as a non-default DataLayout is available. The "injection" mechanism could potentially be replaced with a pass, although it was easier to experiment with it being guaranteed. This works out pretty well in practice, as analysis already understands dereferenceable and can make decisions based on it. The verifier checks that for byval & friends, dereferenceable(S) and align A are present (clang always adds align, but not all tests have it) and that S is the exact size of the pointee type (while we still know that). That last bit is very important, because it allows a script to do the following: 1. Find all byval arguments in tests that are missing dereferenceable, e.g. ... i32* byval align 4 ... .... {i8, i64}* byval ... 2. Add a bogus dereferenceable(unique ID) to each of them, i.e. ... i32* byval dereferenceable(123400001) align 4 ... .... {i8, i16}* byval dereferenceable(123400002) ... 3. Run the tests and record the errors, which may look like: Attribute 'byval' expects 'dereferenceable(4)' for type i32*, found 'dereferenceable(123400001)' Attribute 'byval' expects 'dereferenceable(16) align 8' for type {i8, i64}*, found 'dereferenceable(123400002)' 4. Use the verifier error messages to replace the bogus attributes with the proper ones, which include align A when it is missing: ... i32* byval dereferenceable(4) align 4 ... .... {i8, i16}* byval dereferenceable(16) align 8 ... For what is worth, the same scheme would also work for byval(N), and would be entirely unnecessary for byval(T). I would love to know your thoughts on this, and more specifically: Which of the 3 (byval(T), byval(N) and byval + dereferenceable + align) do you think would provide the easiest transition path for front-ends? Thank you, - eddyb
David Blaikie via llvm-dev
2016-Jan-19 23:09 UTC
[llvm-dev] [RFC] A proposal for byval in a world with opaque pointers
+Chandler, Duncan, David M, and Reid (pseudorandom selection of people who might be interested/voiced opinions on prior threads/conversations on the subject) Firstly, thanks Eddy for doing a lot of work in the opaque pointer area - it's greatly appreciated. This is a particularly thorny part of it (or, at least I think so at the moment - sounds like you've seen thornier parts further down the trail). On Tue, Jan 19, 2016 at 2:47 PM, Eddy B. via llvm-dev < llvm-dev at lists.llvm.org> wrote:> Hi, > > In the past months, several options have been presented for making byval > (and similar attributes, such as inalloca or sret) work with opaque > pointers. > > The main two I've seen were byval(T) and byval(N) where N is the size of T. > > They both have their upsides and downsides, for example: byval(T) would be > a type-parametric attribute, which, AFAIK, does not already exist and may > complicate the attribute system significantly, while byval(N) would be hard > to introduce in tests as computing N from T requires LLVM's DataLayout. >By "hard to introduce in tests" you just mean it's hard to edit the test cases to add the correct value of N automatically? Yeah - it's certainly not a simple sed script like my previous transformations.> Also, this would have to be done for inalloca and sret as well - sret only > needs it when targeting SPARC, although still generally useful in analysis. >Reid, Dave - bringing this to your attention in case you've got any thoughts on inalloca/sret (was sret part of the Windows-y things, or just inalloca?), how they should be handled, if it should be different/the same as byval, etc?> To sidestep some of the concerns and allow a smooth transition towards a > byval that works with opaque pointers, I've come up with a new approach: > > Reuse dereferenceable(S) and align A for the size and alignment of byval. >Side note: align should already be what's used for the alignment of byval. I think I remember seeing that there was one target hook and one target using that hook, that did some backend alignment fixes. If that is still present, we should fix/remove it so that the align attribute is the one source of truth for the alignment of a byval parameter. This should happen before any changes to byval. The dereferenceable part is the novelty here and I really don't know how everyone feels about it. I don't much mind - I suppose I like it because it means optimizations can, maybe, hack on byval parameters more effectively without having to have byval-specific smarts necessarily. (also, because you've already done a bunch of the work)> That is, a byval dereferenceable(S) align A argument is guaranteed to have > S bytes available to read from, *and only S*, aligned to a multiple of A. > Reading past that size is UB, as LLVM will not copy more than S bytes. > > An API can be provided to add the attribute alongside dereferenceable > and align attributes, for a given Type* and DataLayout. > > A preliminary implementation (w/o sret) can be found at: > https://github.com/eddyb/llvm/compare/2579466...65ac99b > > To maintain compatibility with existing code,^ this isn't actually a goal/necessity, fwiw. So we might not need the complexity you're outlining below. Or if we do choose to take it, we may only want/need to keep it around for a relatively short period.> dereferenceable and align > attributes are automatically injected as soon as a non-default DataLayout > is available. The "injection" mechanism could potentially be replaced with > a pass, although it was easier to experiment with it being guaranteed. > > This works out pretty well in practice, as analysis already understands > dereferenceable and can make decisions based on it. > > The verifier checks that for byval & friends, dereferenceable(S) and > align A are present (clang always adds align, but not all tests have it) >Do all tests need it? Align is just for non-default alignment (> align(1)) so it may be OK for the tests to omit it & it's probably not right to /require/ an align attribute on a byval parameter.> and that S is the exact size of the pointee type (while we still know > that). > > That last bit is very important, because it allows a script to do the > following: > > 1. Find all byval arguments in tests that are missing dereferenceable, e.g. > ... i32* byval align 4 ... > .... {i8, i64}* byval ... > 2. Add a bogus dereferenceable(unique ID) to each of them, i.e. > ... i32* byval dereferenceable(123400001) align 4 ... > .... {i8, i16}* byval dereferenceable(123400002) ... > 3. Run the tests and record the errors, which may look like: > > Attribute 'byval' expects 'dereferenceable(4)' for type i32*, > found 'dereferenceable(123400001)' > > Attribute 'byval' expects 'dereferenceable(16) align 8' for type {i8, > i64}*, > found 'dereferenceable(123400002)' > > 4. Use the verifier error messages to replace the bogus attributes > with the proper ones, which include align A when it is missing: > ... i32* byval dereferenceable(4) align 4 ... > .... {i8, i16}* byval dereferenceable(16) align 8 ... > > For what is worth, the same scheme would also work for byval(N), and > would be entirely unnecessary for byval(T). > > I would love to know your thoughts on this, and more specifically: > Which of the 3 (byval(T), byval(N) and byval + dereferenceable + align) > do you think would provide the easiest transition path for front-ends? > > Thank you, > - eddyb > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160119/ac98fc59/attachment.html>
Antoine Pitrou via llvm-dev
2016-Jan-19 23:11 UTC
[llvm-dev] [RFC] A proposal for byval in a world with opaque pointers
On Wed, 20 Jan 2016 00:47:56 +0200 "Eddy B. via llvm-dev" <llvm-dev at lists.llvm.org> wrote:> > I would love to know your thoughts on this, and more specifically: > Which of the 3 (byval(T), byval(N) and byval + dereferenceable + align) > do you think would provide the easiest transition path for front-ends?We (llvmlite and Numba) don't use byval specifically but, were we to use it, I think byval(T) would by far be the easiest for us. By contrast, computing ABI sizes is a bit of a PITA due to llvmlite's architecture (llvmlite's IR generation side doesn't call into LLVM at all, it generates textual IR in pure Python). (as a matter of fact, I plan to keep typed pointers in llvmlite, because they provide invaluable help discovering bugs early rather than late - i.e. giving a proper error rather than crashing mysteriously) Regards Antoine.
Eddy B. via llvm-dev
2016-Jan-19 23:16 UTC
[llvm-dev] [RFC] A proposal for byval in a world with opaque pointers
> Do all tests need it? Align is just for non-default alignment (> align(1)) so it may be OK for the tests to omit it & it's probably not right to /require/ an align attribute on a byval parameter.I've had to fix some of the tests which checked that the right alignment was applied, by looking at the generated asm, but had no "align A" attributes and relied on the target alignment guessing, which I had taken out in my experimental implementation. The rest of the tests might be fine, but could behave surprisingly upon closer inspection. 2016-01-20 1:09 GMT+02:00 David Blaikie <dblaikie at gmail.com>:> +Chandler, Duncan, David M, and Reid (pseudorandom selection of people who > might be interested/voiced opinions on prior threads/conversations on the > subject) > > Firstly, thanks Eddy for doing a lot of work in the opaque pointer area - > it's greatly appreciated. This is a particularly thorny part of it (or, at > least I think so at the moment - sounds like you've seen thornier parts > further down the trail). > > On Tue, Jan 19, 2016 at 2:47 PM, Eddy B. via llvm-dev > <llvm-dev at lists.llvm.org> wrote: >> >> Hi, >> >> In the past months, several options have been presented for making byval >> (and similar attributes, such as inalloca or sret) work with opaque >> pointers. >> >> The main two I've seen were byval(T) and byval(N) where N is the size of >> T. >> >> They both have their upsides and downsides, for example: byval(T) would be >> a type-parametric attribute, which, AFAIK, does not already exist and may >> complicate the attribute system significantly, while byval(N) would be >> hard >> to introduce in tests as computing N from T requires LLVM's DataLayout. > > > By "hard to introduce in tests" you just mean it's hard to edit the test > cases to add the correct value of N automatically? Yeah - it's certainly not > a simple sed script like my previous transformations. > >> >> Also, this would have to be done for inalloca and sret as well - sret only >> needs it when targeting SPARC, although still generally useful in >> analysis. > > > Reid, Dave - bringing this to your attention in case you've got any thoughts > on inalloca/sret (was sret part of the Windows-y things, or just inalloca?), > how they should be handled, if it should be different/the same as byval, > etc? > >> >> To sidestep some of the concerns and allow a smooth transition towards a >> byval that works with opaque pointers, I've come up with a new approach: >> >> Reuse dereferenceable(S) and align A for the size and alignment of byval. > > > Side note: align should already be what's used for the alignment of byval. I > think I remember seeing that there was one target hook and one target using > that hook, that did some backend alignment fixes. If that is still present, > we should fix/remove it so that the align attribute is the one source of > truth for the alignment of a byval parameter. This should happen before any > changes to byval. > > The dereferenceable part is the novelty here and I really don't know how > everyone feels about it. I don't much mind - I suppose I like it because it > means optimizations can, maybe, hack on byval parameters more effectively > without having to have byval-specific smarts necessarily. (also, because > you've already done a bunch of the work) > >> >> That is, a byval dereferenceable(S) align A argument is guaranteed to have >> S bytes available to read from, *and only S*, aligned to a multiple of A. >> Reading past that size is UB, as LLVM will not copy more than S bytes. >> >> An API can be provided to add the attribute alongside dereferenceable >> and align attributes, for a given Type* and DataLayout. >> >> A preliminary implementation (w/o sret) can be found at: >> https://github.com/eddyb/llvm/compare/2579466...65ac99b >> >> To maintain compatibility with existing code, > > > ^ this isn't actually a goal/necessity, fwiw. So we might not need the > complexity you're outlining below. Or if we do choose to take it, we may > only want/need to keep it around for a relatively short period. > >> >> dereferenceable and align >> attributes are automatically injected as soon as a non-default DataLayout >> is available. The "injection" mechanism could potentially be replaced with >> a pass, although it was easier to experiment with it being guaranteed. >> >> This works out pretty well in practice, as analysis already understands >> dereferenceable and can make decisions based on it. >> >> The verifier checks that for byval & friends, dereferenceable(S) and >> align A are present (clang always adds align, but not all tests have it) > > > Do all tests need it? Align is just for non-default alignment (> align(1)) > so it may be OK for the tests to omit it & it's probably not right to > /require/ an align attribute on a byval parameter. > >> >> and that S is the exact size of the pointee type (while we still know >> that). >> >> That last bit is very important, because it allows a script to do the >> following: >> >> 1. Find all byval arguments in tests that are missing dereferenceable, >> e.g. >> ... i32* byval align 4 ... >> .... {i8, i64}* byval ... >> 2. Add a bogus dereferenceable(unique ID) to each of them, i.e. >> ... i32* byval dereferenceable(123400001) align 4 ... >> .... {i8, i16}* byval dereferenceable(123400002) ... >> 3. Run the tests and record the errors, which may look like: >> >> Attribute 'byval' expects 'dereferenceable(4)' for type i32*, >> found 'dereferenceable(123400001)' >> >> Attribute 'byval' expects 'dereferenceable(16) align 8' for type {i8, >> i64}*, >> found 'dereferenceable(123400002)' >> >> 4. Use the verifier error messages to replace the bogus attributes >> with the proper ones, which include align A when it is missing: >> ... i32* byval dereferenceable(4) align 4 ... >> .... {i8, i16}* byval dereferenceable(16) align 8 ... >> >> For what is worth, the same scheme would also work for byval(N), and >> would be entirely unnecessary for byval(T). >> >> I would love to know your thoughts on this, and more specifically: >> Which of the 3 (byval(T), byval(N) and byval + dereferenceable + align) >> do you think would provide the easiest transition path for front-ends? >> >> Thank you, >> - eddyb >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > >
Eddy B. via llvm-dev
2016-Jan-19 23:26 UTC
[llvm-dev] [RFC] A proposal for byval in a world with opaque pointers
2016-01-20 1:11 GMT+02:00 Antoine Pitrou via llvm-dev < llvm-dev at lists.llvm.org>:> On Wed, 20 Jan 2016 00:47:56 +0200 > "Eddy B. via llvm-dev" <llvm-dev at lists.llvm.org> wrote: > > > > I would love to know your thoughts on this, and more specifically: > > Which of the 3 (byval(T), byval(N) and byval + dereferenceable + align) > > do you think would provide the easiest transition path for front-ends? > > We (llvmlite and Numba) don't use byval specifically but, were we to use > it, I think byval(T) would by far be the easiest for us. By contrast, > computing ABI sizes is a bit of a PITA due to llvmlite's architecture > (llvmlite's IR generation side doesn't call into LLVM at all, it > generates textual IR in pure Python). >If you have the datalayout string, the ABI sizes become almost trivial to compute, the algorithms for structs are simple and I'm not aware of any special-cases that would complicate a front-end's computation.> (as a matter of fact, I plan to keep typed pointers in llvmlite, > because they provide invaluable help discovering bugs early rather than > late - i.e. giving a proper error rather than crashing mysteriously) >I wholeheartedly agree: IMO, all LLVM front-ends should attach their own typesystem information to LLVM values instead of passing bare Value* around, or the textual equivalent, in your case. In rustc we didn't do this from the start and it's become a complete nightmare to work with those old parts of the codebase.> Regards > > Antoine. > > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160120/bfac4229/attachment.html>
Hal Finkel via llvm-dev
2016-Jan-20 06:49 UTC
[llvm-dev] [RFC] A proposal for byval in a world with opaque pointers
----- Original Message -----> From: "Eddy B. via llvm-dev" <llvm-dev at lists.llvm.org> > To: llvm-dev at lists.llvm.org > Sent: Tuesday, January 19, 2016 4:47:56 PM > Subject: [llvm-dev] [RFC] A proposal for byval in a world with opaque pointers > > Hi, > > In the past months, several options have been presented for making > byval > (and similar attributes, such as inalloca or sret) work with opaque > pointers. > > The main two I've seen were byval(T) and byval(N) where N is the size > of T. > > They both have their upsides and downsides, for example: byval(T) > would be > a type-parametric attribute, which, AFAIK, does not already exist and > may > complicate the attribute system significantly, while byval(N) would > be hard > to introduce in tests as computing N from T requires LLVM's > DataLayout. > > > Also, this would have to be done for inalloca and sret as well - sret > only > needs it when targeting SPARC, although still generally useful in > analysis. > > To sidestep some of the concerns and allow a smooth transition > towards a > byval that works with opaque pointers, I've come up with a new > approach: > > Reuse dereferenceable(S) and align A for the size and alignment of > byval. > > That is, a byval dereferenceable(S) align A argument is guaranteed to > have > S bytes available to read from, *and only S*, aligned to a multiple > of A. > Reading past that size is UB, as LLVM will not copy more than S > bytes. > > An API can be provided to add the attribute alongside dereferenceable > and align attributes, for a given Type* and DataLayout.This sounds like a reasonable plan to me. -Hal> > A preliminary implementation (w/o sret) can be found at: > https://github.com/eddyb/llvm/compare/2579466...65ac99b > > To maintain compatibility with existing code, dereferenceable and > align > attributes are automatically injected as soon as a non-default > DataLayout > is available. The "injection" mechanism could potentially be replaced > with > a pass, although it was easier to experiment with it being > guaranteed. > > This works out pretty well in practice, as analysis already > understands > dereferenceable and can make decisions based on it. > > The verifier checks that for byval & friends, dereferenceable(S) and > align A are present (clang always adds align, but not all tests have > it) > and that S is the exact size of the pointee type (while we still know > that). > > That last bit is very important, because it allows a script to do the > following: > > 1. Find all byval arguments in tests that are missing > dereferenceable, e.g. > ... i32* byval align 4 ... > .... {i8, i64}* byval ... > 2. Add a bogus dereferenceable(unique ID) to each of them, i.e. > ... i32* byval dereferenceable(123400001) align 4 ... > .... {i8, i16}* byval dereferenceable(123400002) ... > 3. Run the tests and record the errors, which may look like: > > Attribute 'byval' expects 'dereferenceable(4)' for type i32*, > found 'dereferenceable(123400001)' > > Attribute 'byval' expects 'dereferenceable(16) align 8' for type {i8, > i64}*, > found 'dereferenceable(123400002)' > > 4. Use the verifier error messages to replace the bogus attributes > with the proper ones, which include align A when it is missing: > ... i32* byval dereferenceable(4) align 4 ... > .... {i8, i16}* byval dereferenceable(16) align 8 ... > > For what is worth, the same scheme would also work for byval(N), and > would be entirely unnecessary for byval(T). > > I would love to know your thoughts on this, and more specifically: > Which of the 3 (byval(T), byval(N) and byval + dereferenceable + > align) > do you think would provide the easiest transition path for > front-ends? > > Thank you, > - eddyb > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory
Manuel Jacob via llvm-dev
2016-Jan-24 03:16 UTC
[llvm-dev] [RFC] A proposal for byval in a world with opaque pointers
Hi, As there seems to be some concern that adding type attributes complicates the attributes system, I decided to experiment with implementing type attributes and byval as a type attribute. Adding type attributes required mostly adding boilerplate code. The only slightly tricky part was updating the types in the linker (IRMover.cpp). This however seems to follow the common pattern in IRMover.cpp. See D16515 for the implementation [1]. Adding the byval attribute on top of this was straightforward as well and probably similar to adding an integer attribute. The tricky bit here was bitcode compatibility. When reading the attribute group, we don't have access to the parameter types, so the current implementation casts `-1` to Type * and fixes the attribute later (when the attribute set is referenced by an actual function). A similar hack would likely also be needed if it was an integer attribute. See D16516 for the implementation [2]. I think we can only go from byval(<type>) or byval(<size>) to dereferenceable(<size>), but not the other way around. In particular, the current definition of the dereferenceable attribute contains "It is legal for the number of bytes to be less than the size of the pointee type.". -Manuel [1] http://reviews.llvm.org/D16515 [2] http://reviews.llvm.org/D16516 On 2016-01-19 23:47, Eddy B. via llvm-dev wrote:> Hi, > > In the past months, several options have been presented for making > byval > (and similar attributes, such as inalloca or sret) work with opaque > pointers. > > The main two I've seen were byval(T) and byval(N) where N is the size > of T. > > They both have their upsides and downsides, for example: byval(T) would > be > a type-parametric attribute, which, AFAIK, does not already exist and > may > complicate the attribute system significantly, while byval(N) would be > hard > to introduce in tests as computing N from T requires LLVM's DataLayout. > > Also, this would have to be done for inalloca and sret as well - sret > only > needs it when targeting SPARC, although still generally useful in > analysis. > > To sidestep some of the concerns and allow a smooth transition towards > a > byval that works with opaque pointers, I've come up with a new > approach: > > Reuse dereferenceable(S) and align A for the size and alignment of > byval. > > That is, a byval dereferenceable(S) align A argument is guaranteed to > have > S bytes available to read from, *and only S*, aligned to a multiple of > A. > Reading past that size is UB, as LLVM will not copy more than S bytes. > > An API can be provided to add the attribute alongside dereferenceable > and align attributes, for a given Type* and DataLayout. > > A preliminary implementation (w/o sret) can be found at: > https://github.com/eddyb/llvm/compare/2579466...65ac99b > > To maintain compatibility with existing code, dereferenceable and align > attributes are automatically injected as soon as a non-default > DataLayout > is available. The "injection" mechanism could potentially be replaced > with > a pass, although it was easier to experiment with it being guaranteed. > > This works out pretty well in practice, as analysis already understands > dereferenceable and can make decisions based on it. > > The verifier checks that for byval & friends, dereferenceable(S) and > align A are present (clang always adds align, but not all tests have > it) > and that S is the exact size of the pointee type (while we still know > that). > > That last bit is very important, because it allows a script to do the > following: > > 1. Find all byval arguments in tests that are missing dereferenceable, > e.g. > ... i32* byval align 4 ... > .... {i8, i64}* byval ... > 2. Add a bogus dereferenceable(unique ID) to each of them, i.e. > ... i32* byval dereferenceable(123400001) align 4 ... > .... {i8, i16}* byval dereferenceable(123400002) ... > 3. Run the tests and record the errors, which may look like: > > Attribute 'byval' expects 'dereferenceable(4)' for type i32*, > found 'dereferenceable(123400001)' > > Attribute 'byval' expects 'dereferenceable(16) align 8' for type {i8, > i64}*, > found 'dereferenceable(123400002)' > > 4. Use the verifier error messages to replace the bogus attributes > with the proper ones, which include align A when it is missing: > ... i32* byval dereferenceable(4) align 4 ... > .... {i8, i16}* byval dereferenceable(16) align 8 ... > > For what is worth, the same scheme would also work for byval(N), and > would be entirely unnecessary for byval(T). > > I would love to know your thoughts on this, and more specifically: > Which of the 3 (byval(T), byval(N) and byval + dereferenceable + align) > do you think would provide the easiest transition path for front-ends? > > Thank you, > - eddyb > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Philip Reames via llvm-dev
2016-Jan-24 18:25 UTC
[llvm-dev] [RFC] A proposal for byval in a world with opaque pointers
On 01/19/2016 02:47 PM, Eddy B. via llvm-dev wrote:> Hi, > > In the past months, several options have been presented for making byval > (and similar attributes, such as inalloca or sret) work with opaque pointers. > > The main two I've seen were byval(T) and byval(N) where N is the size of T. > > They both have their upsides and downsides, for example: byval(T) would be > a type-parametric attribute, which, AFAIK, does not already exist and may > complicate the attribute system significantly, while byval(N) would be hard > to introduce in tests as computing N from T requires LLVM's DataLayout. > > Also, this would have to be done for inalloca and sret as well - sret only > needs it when targeting SPARC, although still generally useful in analysis. > > To sidestep some of the concerns and allow a smooth transition towards a > byval that works with opaque pointers, I've come up with a new approach: > > Reuse dereferenceable(S) and align A for the size and alignment of byval. > > That is, a byval dereferenceable(S) align A argument is guaranteed to have > S bytes available to read from, *and only S*, aligned to a multiple of A. > Reading past that size is UB, as LLVM will not copy more than S bytes.Just to make sure I understand, you are *not* planning on changing the semantics of the existing attributes right? The current semantics for "dereferenceable(N)" are "N bytes are known to be dereferenceable, more bytes might be". What worried my in your wording was the UB comment. It is not currently UB to have a read beyond the *known* dereferenceable bytes. This is important to me and I would be very hesitant to change it.> > An API can be provided to add the attribute alongside dereferenceable > and align attributes, for a given Type* and DataLayout. > > A preliminary implementation (w/o sret) can be found at: > https://github.com/eddyb/llvm/compare/2579466...65ac99b > > To maintain compatibility with existing code, dereferenceable and align > attributes are automatically injected as soon as a non-default DataLayout > is available. The "injection" mechanism could potentially be replaced with > a pass, although it was easier to experiment with it being guaranteed. > > This works out pretty well in practice, as analysis already understands > dereferenceable and can make decisions based on it. > > The verifier checks that for byval & friends, dereferenceable(S) and > align A are present (clang always adds align, but not all tests have it) > and that S is the exact size of the pointee type (while we still know that). > > That last bit is very important, because it allows a script to do the following: > > 1. Find all byval arguments in tests that are missing dereferenceable, e.g. > ... i32* byval align 4 ... > .... {i8, i64}* byval ... > 2. Add a bogus dereferenceable(unique ID) to each of them, i.e. > ... i32* byval dereferenceable(123400001) align 4 ... > .... {i8, i16}* byval dereferenceable(123400002) ... > 3. Run the tests and record the errors, which may look like: > > Attribute 'byval' expects 'dereferenceable(4)' for type i32*, > found 'dereferenceable(123400001)' > > Attribute 'byval' expects 'dereferenceable(16) align 8' for type {i8, i64}*, > found 'dereferenceable(123400002)' > > 4. Use the verifier error messages to replace the bogus attributes > with the proper ones, which include align A when it is missing: > ... i32* byval dereferenceable(4) align 4 ... > .... {i8, i16}* byval dereferenceable(16) align 8 ... > > For what is worth, the same scheme would also work for byval(N), and > would be entirely unnecessary for byval(T). > > I would love to know your thoughts on this, and more specifically: > Which of the 3 (byval(T), byval(N) and byval + dereferenceable + align) > do you think would provide the easiest transition path for front-ends? > > Thank you, > - eddyb > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Eddy B. via llvm-dev
2016-Jan-24 20:52 UTC
[llvm-dev] [RFC] A proposal for byval in a world with opaque pointers
2016-01-24 20:25 GMT+02:00 Philip Reames <listmail at philipreames.com>:> > > On 01/19/2016 02:47 PM, Eddy B. via llvm-dev wrote: > >> Hi, >> >> In the past months, several options have been presented for making byval >> (and similar attributes, such as inalloca or sret) work with opaque >> pointers. >> >> The main two I've seen were byval(T) and byval(N) where N is the size of >> T. >> >> They both have their upsides and downsides, for example: byval(T) would be >> a type-parametric attribute, which, AFAIK, does not already exist and may >> complicate the attribute system significantly, while byval(N) would be >> hard >> to introduce in tests as computing N from T requires LLVM's DataLayout. >> >> Also, this would have to be done for inalloca and sret as well - sret only >> needs it when targeting SPARC, although still generally useful in >> analysis. >> >> To sidestep some of the concerns and allow a smooth transition towards a >> byval that works with opaque pointers, I've come up with a new approach: >> >> Reuse dereferenceable(S) and align A for the size and alignment of byval. >> >> That is, a byval dereferenceable(S) align A argument is guaranteed to have >> S bytes available to read from, *and only S*, aligned to a multiple of A. >> Reading past that size is UB, as LLVM will not copy more than S bytes. >> > Just to make sure I understand, you are *not* planning on changing the > semantics of the existing attributes right? The current semantics for > "dereferenceable(N)" are "N bytes are known to be dereferenceable, more > bytes might be". What worried my in your wording was the UB comment. It > is not currently UB to have a read beyond the *known* dereferenceable > bytes. This is important to me and I would be very hesitant to change it.The semantics I described were for byval+dereferenceable(N), not dereferenceable(N) alone, i.e. LLVM will only guarantee exactly N bytes will be copied, no less, no more. Although with mjacob prototyping byval(T) so quickly, I do feel that my work is obsoleted. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160124/92348a7b/attachment.html>
Reasonably Related Threads
- [RFC] A proposal for byval in a world with opaque pointers
- [RFC] A proposal for byval in a world with opaque pointers
- [LLVMdev] byval in a world without pointee types
- [LLVMdev] byval in a world without pointee types
- RFC: A change in InstCombine canonical form