Finkel, Hal J. via llvm-dev
2020-Feb-20 01:52 UTC
[llvm-dev] The semantics of nonnull attribute
Two thoughts:

1. I think that we should aim for regularity, to the extent possible, and so we should treat nonnull, align, etc. similarly with respect to whether they produce poison or UB.

2. I was thinking about the following last night, and it clarified for me why having a not_poison attribute makes sense and seems useful, and how poison/UB might affect things on a function-call boundary itself. Imagine that we had a fastcc lowering strategy that took a pointer argument with an alignment attribute, followed by a suitably small integer argument, and implemented a calling convention that passed both in the same register. If the pointer value might be poison, and thus violate the alignment attribute (or might violate the alignment attribute otherwise and produce poison), then we can't implement this just by OR-ing together the two values (to pass them in the one register). We need to mask off the low bits of the pointer first. If the value can't be or generate poison, and violating the alignment constraint produces UB, then the masking is not needed and we can just OR together the two values (confident that the low bits of the pointer will always be zero).

-Hal

Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

________________________________
From: llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Finkel, Hal J. via llvm-dev <llvm-dev at lists.llvm.org>
Sent: Wednesday, February 19, 2020 2:29 AM
To: Doerfert, Johannes <johannesdoerfert at gmail.com>; Juneyoung Lee <juneyoung.lee at sf.snu.ac.kr>
Cc: llvm-dev <llvm-dev at lists.llvm.org>; Nuno Lopes <nuno.lopes at ist.utl.pt>; John Regehr <regehr at cs.utah.edu>
Subject: Re: [llvm-dev] The semantics of nonnull attribute

On 2/19/20 1:16 AM, Doerfert, Johannes via llvm-dev wrote:
> On 02/19, Juneyoung Lee via llvm-dev wrote:
>> Hello,
>>
>>> Would it be correct to resolve this by saying that dereferenceable(N)
>>> *implies* not_poison? This would be helpful as a clarification of how
>>> it all fits together.
>>
>> Yes, I think it makes sense.
>
> I don't think we should do that.
>
> Take the `gep inbounds` example:
>
>   char* foo(char *arg) {
>     return `gep inbounds %arg, -100`
>   }
>
> Here it depends on whether we want to deduce that the output is
> dereferenceable(100) or not. If we do, we need dereferenceable to mean
> poison if violated, as with nonnull, because it is derived from poison.
> Only if we don't derive dereferenceable for the return value can we say
> that dereferenceable violations are UB.

Can you please clarify what it means for the output of dereferenceable to be poison? If we tag a memory address as dereferenceable, is the optimizer free to insert a load of the address immediately following that? Or do we need to see some other access (prior to any thread synchronization?) to say that's valid?

Thanks again,
Hal

> In the end, I think, it boils down to the question of whether there are
> situations where violation of some attributes should be poison and
> violation of others should be UB. If such situations exist, it is
> unclear to me what makes the UB/poison ones special.
>
>> On Wed, Feb 19, 2020 at 12:14 PM Nicolai Hähnle <nhaehnle at gmail.com> wrote:
>>> On Wed, Feb 19, 2020 at 3:51 AM Juneyoung Lee via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>> I think not_poison (the keyword Johannes used) makes sense. We can
>>>> simulate the original UB semantics by simply attaching it, as explained.
>>>> For the attributes other than nonnull, we may need more discussion;
>>>> the align attribute seems to be okay with being defined as poison, while
>>>> dereferenceable may need UB even without nonnull (because it needs to be
>>>> non-poison, as shown in Nuno's hoisting example).
>>>
>>> For reference, the hoisting example was:
>>>
>>>   f(dereferenceable(4) %p) {
>>>     loop() {
>>>       %v = load %p
>>>       use(%v)
>>>     }
>>>   }
>>>   =>
>>>   f(dereferenceable(4) %p) {
>>>     %v = load %p
>>>     loop() {
>>>       use(%v)
>>>     }
>>>   }
>>>
>>> Would it be correct to resolve this by saying that dereferenceable(N)
>>> *implies* not_poison? This would be helpful as a clarification of how
>>> it all fits together.
>>>
>>> Cheers,
>>> Nicolai
>>
>> --
>> Juneyoung Lee
>> Software Foundation Lab, Seoul National University

_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
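The shared-register scenario from Hal's second point can be sketched in C roughly as follows. This is only an illustration, not an actual fastcc lowering: the 16-byte alignment, the helper names (pack_masked, pack_unmasked, unpack_ptr, unpack_small), and the use of plain uintptr_t values in place of real pointers are all assumptions. The two values are combined with a bitwise OR; the masked variant is the one that would be required if the low bits of the pointer cannot be trusted to be zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical encoding: a pointer aligned to 2^ALIGN_BITS bytes shares one
   register-sized word with an integer smaller than 2^ALIGN_BITS. */
#define ALIGN_BITS 4u
#define LOW_MASK ((uintptr_t)((1u << ALIGN_BITS) - 1u))

/* Needed if an alignment violation (or a poison pointer) only yields poison:
   the low pointer bits may be anything, so they are cleared before merging. */
static uintptr_t pack_masked(uintptr_t ptr_bits, unsigned small)
{
    return (ptr_bits & ~LOW_MASK) | ((uintptr_t)small & LOW_MASK);
}

/* Sufficient if an alignment violation is UB at the call boundary:
   the low pointer bits are then known to be zero, so no masking is done. */
static uintptr_t pack_unmasked(uintptr_t ptr_bits, unsigned small)
{
    return ptr_bits | (uintptr_t)small;
}

/* Callee side: split the shared word back into its two parts. */
static uintptr_t unpack_ptr(uintptr_t word)
{
    return word & ~LOW_MASK;
}

static unsigned unpack_small(uintptr_t word)
{
    return (unsigned)(word & LOW_MASK);
}
```

With a correctly aligned pointer both variants produce the same word. With a misaligned one, pack_unmasked lets the stray low bits corrupt the integer that the callee unpacks, which is the case the extra masking guards against.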
Juneyoung Lee via llvm-dev
2020-Feb-20 15:47 UTC
[llvm-dev] The semantics of nonnull attribute
Hello all,

The problem with defining all attributes as yielding poison is that certain attributes are not meaningful if they yield poison, or are only meaningful when combined with not_poison.

If a dereferenceable argument can be poison, I don't think the attribute will be useful on its own, because existing analyses cannot conclude that accessing the pointer is okay. To reach that conclusion, the attribute always has to be accompanied by not_poison. Forgetting the not_poison check will lead to a bug.

Among other attributes, byval seems problematic to me. The value pointed to by a byval pointer is copied to a temporary space, and the temporary address is passed to the callee. This means that a byval argument should raise UB if it is not dereferenceable; `f(byval null);` raises a segmentation fault even if the null pointer is not used inside f() (https://godbolt.org/z/sNC_RF). So we cannot define it as f(poison).

If the semantics of attributes should be consistent, I suggest UB should be the one. not_poison will also raise UB if the input is poison, so it satisfies the consistency requirement as well. The gep inbounds optimization should be fixed then. For dead argument elimination and function call hoisting, we should be able to drop the relevant attributes.

Best regards,
Juneyoung Lee

--
Juneyoung Lee
Software Foundation Lab, Seoul National University
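The point above about dereferenceable can be illustrated with a C analogy of the load-hoisting transformation discussed earlier in the thread. This is only a sketch: the function names are hypothetical, and an ordinary C pointer stands in for an LLVM argument marked dereferenceable(4). When the loop runs zero times, the original form never touches p, while the hoisted form always loads from it, so the hoisted form is valid only if p is guaranteed to be dereferenceable (and not poison) on entry.

```c
#include <assert.h>
#include <stddef.h>

/* Original form: the load from p happens only if the loop body executes. */
static int sum_in_loop(const int *p, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; ++i) {
        int v = *p; /* load inside the loop */
        total += v;
    }
    return total;
}

/* Hoisted form: the load executes unconditionally, even when n == 0.
   Calling this with an invalid p is an error regardless of n. */
static int sum_hoisted(const int *p, size_t n)
{
    int v = *p; /* load moved in front of the loop */
    int total = 0;
    for (size_t i = 0; i < n; ++i) {
        total += v;
    }
    return total;
}
```

For a valid pointer the two functions agree. Only sum_in_loop tolerates an invalid pointer when n is zero, which is why an optimizer that wants to perform this rewrite needs the stronger guarantee.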
Nicolai Hähnle via llvm-dev
2020-Feb-21 10:08 UTC
[llvm-dev] The semantics of nonnull attribute
On Thu, Feb 20, 2020 at 4:47 PM Juneyoung Lee via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> The problem with defining all attributes as yielding poison is that certain attributes are not meaningful if they yield poison, or are only meaningful when combined with not_poison.
> If a dereferenceable argument can be poison, I don't think the attribute will be useful on its own, because existing analyses cannot conclude that accessing the pointer is okay. To reach that conclusion, the attribute always has to be accompanied by not_poison. Forgetting the not_poison check will lead to a bug.
> Among other attributes, byval seems problematic to me.
> The value pointed to by a byval pointer is copied to a temporary space, and the temporary address is passed to the callee.
> This means that a byval argument should raise UB if it is not dereferenceable; `f(byval null);` raises a segmentation fault even if the null pointer is not used inside f(). (https://godbolt.org/z/sNC_RF) So we cannot define it as f(poison).

byval seems sufficiently different though. The attributes that have been mostly discussed in this thread (nonnull, dereferenceable, align) are attributes that claim a property of the value passed into the argument. byval is about how the value is passed into the function. There are other attributes that fall into different categories, such as nocapture and nofree.

Cheers,
Nicolai

--
Learn how the world really is, but never forget how it ought to be.
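To make the "how the value is passed" distinction concrete, here is a rough C model of what the caller of a byval argument does, following the description quoted above. The struct, the helper names, and the explicit memcpy are assumptions for illustration, not the actual lowering. The copy reads through the source pointer before the callee runs, whether or not the callee ever looks at its argument.

```c
#include <assert.h>
#include <string.h>

struct big {
    int data[4];
};

static int callee_calls = 0;

/* The callee receives the address of a private copy and never reads it. */
static void callee_ignores_arg(struct big *copy)
{
    (void)copy;
    ++callee_calls;
}

/* Models the caller side of a byval call: the pointee is copied into a
   temporary, and the address of that temporary is what the callee sees. */
static void call_byval(const struct big *src)
{
    struct big tmp;
    memcpy(&tmp, src, sizeof tmp); /* reads *src: src must be dereferenceable here */
    callee_ignores_arg(&tmp);
}
```

In this model the requirement on src comes from the copy in call_byval, not from anything the callee does, which matches the observation that passing a null pointer faults even when the callee ignores the argument.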