Duncan P. N. Exon Smith via llvm-dev
2016-Jul-11 21:13 UTC
[llvm-dev] [PATCH] D22161: SystemZ: Avoid implicit iterator conversions, NFC
> On 2016-Jul-11, at 09:05, Ulrich Weigand <ulrich.weigand at de.ibm.com> wrote: > > uweigand accepted this revision. > uweigand added a comment. > This revision is now accepted and ready to land. > > I'll defer to your expertise on that. Patch looks good to me. > > I guess I'm not fully familiar with some of the C++ language details here. Would you mind explaining in more detail what the problem is specifically, and what we should be doing to avoid this issue in the future? Using something like "&*I" in the LDCleanup file wouldn't have naturally occurred to me ... :-) >See PR26753 and "ilist/iplist are broken (maybe I'll fix them?)": http://lists.llvm.org/pipermail/llvm-dev/2015-October/091115.html I thought I was "finished" back in November, but I'd forgotten about MachineInstrBundleIterator (MachineBasicBlock::iterator) and then didn't find time to continue until the last few weeks. Let me summarize the big picture (adding llvm-dev since I haven't talked about this in a while): - List iterators work by having a pointer to a list node and walking "node->next" every time operator++ is called. - Lists need a sentinel of some sort to represent the "list.end()" iterator. - The way to do this efficiently and safely is to have the sentinel be some un-templated "list_node_base" class that has a next and prev pointer. The next/prev pointers point to "list_node_base", and "list_iterator<T>" downcasts to "T" on dereference. - However, ilist<T> assumes the sentinel is a full-bodied "T" and uses "T*" for the "next" pointers. In practice consumers override the sentinel traits so that the sentinel is only a "ilist_half_node" (to save space). This means that operator++ invokes UB every time you increment to the end iterator and downcast to "T*" (badness). - MachineInstrBundleIterator::operator MachineInstr*() relies on the member of ilist_iterator<MachineInstr> being an already-downcast MachineInstr*, indirectly relying on the UB in operator++. I can't safely refactor ilist/ilist_node/etc. until code stops (indirectly) relying on this UB. What's `&*I` all about? - An iterator<T> does not necessarily point at a valid T. - It could just be a sentinel object that T inherits from (in the case of lists), or an unallocated memory location (in the case of arrays). - If you really want a T*, you should dereference iterator<T> and then take the address. - Everyone (?) knows that it's undefined behaviour to dereference an iterator<T> that is not pointing at a valid T. `&*I` makes it clear at the call site that a pre-condition to getting a `T*` is for `I` to be a valid iterator. - I will be making the implicit operator illegal promptly when there's no code using it. Just waiting on a few backends. - Your code will be nicer to read -- particularly after my changes -- if you use references (T&) instead of pointers (T*) whenever variables/parameters/etc. cannot be nullptr. In many cases I've updated variables/parameters/etc. in this way, but sometimes I took the easy way out of just saying `&*I` (and sometimes that's the best option). I encourage following up with more reference-based logic!> > http://reviews.llvm.org/D22161 > > >
Ulrich Weigand via llvm-dev
2016-Jul-12 07:47 UTC
[llvm-dev] [PATCH] D22161: SystemZ: Avoid implicit iterator conversions, NFC
dexonsmith at apple.com wrote on 11.07.2016 23:13:17:> See PR26753 and "ilist/iplist are broken (maybe I'll fix them?)": > http://lists.llvm.org/pipermail/llvm-dev/2015-October/091115.html > > I thought I was "finished" back in November, but I'd forgotten about > MachineInstrBundleIterator (MachineBasicBlock::iterator) and then > didn't find time to continue until the last few weeks. > > Let me summarize the big picture (adding llvm-dev since I haven't > talked about this in a while): > - List iterators work by having a pointer to a list node and walking > "node->next" every time operator++ is called. > - Lists need a sentinel of some sort to represent the "list.end()"iterator.> - The way to do this efficiently and safely is to have the sentinel > be some un-templated "list_node_base" class that has a next and prev > pointer. The next/prev pointers point to "list_node_base", and > "list_iterator<T>" downcasts to "T" on dereference. > - However, ilist<T> assumes the sentinel is a full-bodied "T" and > uses "T*" for the "next" pointers. In practice consumers override > the sentinel traits so that the sentinel is only a "ilist_half_node" > (to save space). This means that operator++ invokes UB every time > you increment to the end iterator and downcast to "T*" (badness). > - MachineInstrBundleIterator::operator MachineInstr*() relies on the > member of ilist_iterator<MachineInstr> being an already-downcast > MachineInstr*, indirectly relying on the UB in operator++. I can't > safely refactor ilist/ilist_node/etc. until code stops (indirectly) > relying on this UB. > > What's `&*I` all about? > - An iterator<T> does not necessarily point at a valid T. > - It could just be a sentinel object that T inherits from (in the > case of lists), or an unallocated memory location (in the case ofarrays).> - If you really want a T*, you should dereference iterator<T> and > then take the address. > - Everyone (?) knows that it's undefined behaviour to dereference an > iterator<T> that is not pointing at a valid T. `&*I` makes it clear > at the call site that a pre-condition to getting a `T*` is for `I` > to be a valid iterator. > - I will be making the implicit operator illegal promptly when > there's no code using it. Just waiting on a few backends. > - Your code will be nicer to read -- particularly after my changes > -- if you use references (T&) instead of pointers (T*) whenever > variables/parameters/etc. cannot be nullptr. In many cases I've > updated variables/parameters/etc. in this way, but sometimes I took > the easy way out of just saying `&*I` (and sometimes that's the best > option). I encourage following up with more reference-based logic!Thanks for the explanation, that does make sense. So once the implicit operator is made illegal, the compiler should catch any attempt to use it at compile time, and we cannot accidentally reintroduce problematic code in the future, right? Bye, Ulrich -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160712/a0fc1ea8/attachment.html>
Duncan Exon Smith via llvm-dev
2016-Jul-12 14:04 UTC
[llvm-dev] [PATCH] D22161: SystemZ: Avoid implicit iterator conversions, NFC
> On Jul 12, 2016, at 00:47, Ulrich Weigand <Ulrich.Weigand at de.ibm.com> wrote: > > dexonsmith at apple.com wrote on 11.07.2016 23:13:17: > > > See PR26753 and "ilist/iplist are broken (maybe I'll fix them?)": > > http://lists.llvm.org/pipermail/llvm-dev/2015-October/091115.html > > > > I thought I was "finished" back in November, but I'd forgotten about > > MachineInstrBundleIterator (MachineBasicBlock::iterator) and then > > didn't find time to continue until the last few weeks. > > > > Let me summarize the big picture (adding llvm-dev since I haven't > > talked about this in a while): > > - List iterators work by having a pointer to a list node and walking > > "node->next" every time operator++ is called. > > - Lists need a sentinel of some sort to represent the "list.end()" iterator. > > - The way to do this efficiently and safely is to have the sentinel > > be some un-templated "list_node_base" class that has a next and prev > > pointer. The next/prev pointers point to "list_node_base", and > > "list_iterator<T>" downcasts to "T" on dereference. > > - However, ilist<T> assumes the sentinel is a full-bodied "T" and > > uses "T*" for the "next" pointers. In practice consumers override > > the sentinel traits so that the sentinel is only a "ilist_half_node" > > (to save space). This means that operator++ invokes UB every time > > you increment to the end iterator and downcast to "T*" (badness). > > - MachineInstrBundleIterator::operator MachineInstr*() relies on the > > member of ilist_iterator<MachineInstr> being an already-downcast > > MachineInstr*, indirectly relying on the UB in operator++. I can't > > safely refactor ilist/ilist_node/etc. until code stops (indirectly) > > relying on this UB. > > > > What's `&*I` all about? > > - An iterator<T> does not necessarily point at a valid T. > > - It could just be a sentinel object that T inherits from (in the > > case of lists), or an unallocated memory location (in the case of arrays). > > - If you really want a T*, you should dereference iterator<T> and > > then take the address. > > - Everyone (?) knows that it's undefined behaviour to dereference an > > iterator<T> that is not pointing at a valid T. `&*I` makes it clear > > at the call site that a pre-condition to getting a `T*` is for `I` > > to be a valid iterator. > > - I will be making the implicit operator illegal promptly when > > there's no code using it. Just waiting on a few backends. > > - Your code will be nicer to read -- particularly after my changes > > -- if you use references (T&) instead of pointers (T*) whenever > > variables/parameters/etc. cannot be nullptr. In many cases I've > > updated variables/parameters/etc. in this way, but sometimes I took > > the easy way out of just saying `&*I` (and sometimes that's the best > > option). I encourage following up with more reference-based logic! > > Thanks for the explanation, that does make sense. So once the implicit > operator is made illegal, the compiler should catch any attempt to use > it at compile time, and we cannot accidentally reintroduce problematic > code in the future, right?Exactly. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160712/261549f7/attachment.html>