On Wed, Dec 5, 2012 at 1:06 PM, David Blaikie <dblaikie at gmail.com> wrote:> On Thu, Nov 29, 2012 at 8:16 PM, Richard Smith <richard at metafoo.co.uk> wrote: >> On Thu, Nov 29, 2012 at 7:43 PM, Eli Friedman <eli.friedman at gmail.com> >> wrote: >>> >>> On Thu, Nov 29, 2012 at 3:49 PM, David Blaikie <dblaikie at gmail.com> wrote: >>> > Moving to LLVM dev to discuss the possibility of extending the cast >>> > infrastructure to handle this. >>> > >>> > On Tue, Nov 20, 2012 at 5:51 PM, John McCall <rjmccall at apple.com> wrote: >>> >> On Nov 18, 2012, at 5:05 PM, David Blaikie <dblaikie at gmail.com> wrote: >>> >>> TypeLoc casting looks bogus. >>> >>> >>> >>> TypeLoc derived types return true from classof when the dynamic type >>> >>> is not actually an instance of the derived type. The TypeLoc is then >>> >>> (erroneously) cast to the derived type and the derived type's implicit >>> >>> copy ctor is executed (so long as you remember to assign the result of >>> >>> cast<SpecificTypeLoc>), if you copy, otherwise the SpecificTypeLoc's >>> >>> member functions are actually invoked on a TypeLoc object - bogas, but >>> >>> it works (because there's no other members in the SpecificTypeLoc >>> >>> types). >>> >> >>> >> Yep. See also LLVM's IntrinsicInst. This kind of UB is very common >>> >> and >>> >> essentially harmless, but if you want to stamp it out, feel free. >>> >> >>> >>> Richard / Kostya: what's the word on catching this kind of UB >>> >>> (essentially: calling member functions of one type on an instance not >>> >>> of that type)? (in the specific case mentioned below, as discussed in >>> >>> the original thread, ASan or MSan might be able to help) >>> >>> >>> >>> Clang Dev: what should we do to fix this? I don't think it's helpful >>> >>> to add machinery to llvm::cast to handle concrete type conversion, so >>> >>> I think we should consider a non-llvm::cast solution for this >>> >>> hierarchy. Replace the llvm::isa calls with getTypeLocClass calls (or >>> >>> a separate wrapper for that) & add a SpecificTypeLoc(const TypeLoc&) >>> >>> ctor for every specific TypeLoc that has an appropriate assert & calls >>> >>> up through to the base copy ctor. It'll be a fair bit of code to add >>> >>> to TypeLoc, but nothing complicated. >>> >>> >>> >>> Any other ideas? >>> >> >>> >> I don't see why isa<> and cast<> aren't the right code interface for >>> >> this, >>> >> rather than reinventing something else. You just need to teach cast<> >>> >> to construct and return a proper temporary. >>> > >>> > LLVM-dev folks: what do you reckon? Should we extend the cast >>> > infrastructure to be able to handle value type conversions? >>> >>> No, I don't think that makes sense; the casting infrastructure is >>> complicated enough without having to deal with this. It would be easy >>> enough to implement a method on TypeLoc to handle this, >>> >>> > It doesn't really feel like the right fit to me, but I'll defer to the >>> > community's wisdom if it's overwhelming. I'd just like to see a >>> > solution wherein we can't write this particular kind of bug again, >>> > ideally. >>> >>> Attaching proof-of-concept solution which prevents this kind of bug. >>> (Ugly, and requires C++11 support as written, but potentially >>> interesting anyway.) >> >> >> The C++11 should be easy to remove (move the enable_if to the return type). >> This approach seems reasonable to me. > > Thanks Eli - that does seem to do the trick (took me a little while to > wrap my head around it, though). I've modified it, as Richard > suggested, to use enable_if in the return type & it builds in C++98 > and correctly catches the TypeLoc usage. > > I'll have to find some time to actually go & fix TypeLoc before we can > submit this, though. > > (& technically we could allow casting where the parameter type is a > temporary, so long as you only cast to the same type - but that's not > much use. I wonder if we should trivially disallow casting when the > source/dest type are the same anyway? But I suppose there's no harm in > allowing that either, except that it might confuse the reader a bit > when they see a no-op cast like that)Looking into this I've fixed a few other things that were a little broken by the looseness of llvm::cast & friends (see r174853). I've been prototyping a fix to TypeLoc specifically, hit some hiccups in ASTMatchers (probably surmountable - Manuel helped me with r174862 (might need one other change there)) but otherwise seems achievable. Beyond that, though, I've hit one hierarchy in the Static Analyzer that does this as well: ProgramPoint. On IRC Jordan Rose mentioned that there's another more pervasive use of this pattern in the Static Analyzer, the SVal hierarchy. So, Ted, how objectionable would it be for me to introduce something like (names subject to adjustment): template<typename T> T SVal::castAs(); template<typename T> llvm::Optional<T> SVal::getAs(); (the implementations of these functions might involve invoking private FooSVal(SVal) ctors in each SVal derived type - so adding a bit of boilerplate ctors to all those classes) and replace "cast<FooSVal>(bar)" with "bar.castAs<FooSVal>()" and dyn_cast with getAs similarly? & similarly for the ProgramPoint (& TypeLoc - which I'm already working on) hierarchies? Full disclosure: Obviously the more important part of this is fixing the easily-written bugs such as r174862 & that doesn't necessarily require us to fix the UB, nor to migrate these APIs away from the LLVM RTTI infrastructure. We could, for example, add value-type-conversion support to the LLVM RTTI infrastructure that checks that if you pass a temporary it would explicitly do a value conversion and return a value not a reference/pointer. (this wouldn't scale to dyn_cast, though - that'd have to return Optional<T> not T* since there's no T object for a T* to point to - we could still, technically, wedge this in to dyn_cast but that might be a bit much) And we could do this without fixing the UB type punning here (still reinterpret_casting the base to the derived type & using the implicit copy ctor) or while fixing the UB (require the derived types to have derived(base) ctors) but keeping the cast API usage the same. It's my personal opinion (& see Eli's agreement above) that it's best just to leave this out of the LLVM RTTI infrastructure entirely & have these hierarchies handle it themselves. Thanks, - David
On Mon, Feb 11, 2013 at 12:02 PM, David Blaikie <dblaikie at gmail.com> wrote:> On Wed, Dec 5, 2012 at 1:06 PM, David Blaikie <dblaikie at gmail.com> wrote: >> On Thu, Nov 29, 2012 at 8:16 PM, Richard Smith <richard at metafoo.co.uk> wrote: >>> On Thu, Nov 29, 2012 at 7:43 PM, Eli Friedman <eli.friedman at gmail.com> >>> wrote: >>>> >>>> On Thu, Nov 29, 2012 at 3:49 PM, David Blaikie <dblaikie at gmail.com> wrote: >>>> > Moving to LLVM dev to discuss the possibility of extending the cast >>>> > infrastructure to handle this. >>>> > >>>> > On Tue, Nov 20, 2012 at 5:51 PM, John McCall <rjmccall at apple.com> wrote: >>>> >> On Nov 18, 2012, at 5:05 PM, David Blaikie <dblaikie at gmail.com> wrote: >>>> >>> TypeLoc casting looks bogus. >>>> >>> >>>> >>> TypeLoc derived types return true from classof when the dynamic type >>>> >>> is not actually an instance of the derived type. The TypeLoc is then >>>> >>> (erroneously) cast to the derived type and the derived type's implicit >>>> >>> copy ctor is executed (so long as you remember to assign the result of >>>> >>> cast<SpecificTypeLoc>), if you copy, otherwise the SpecificTypeLoc's >>>> >>> member functions are actually invoked on a TypeLoc object - bogas, but >>>> >>> it works (because there's no other members in the SpecificTypeLoc >>>> >>> types). >>>> >> >>>> >> Yep. See also LLVM's IntrinsicInst. This kind of UB is very common >>>> >> and >>>> >> essentially harmless, but if you want to stamp it out, feel free. >>>> >> >>>> >>> Richard / Kostya: what's the word on catching this kind of UB >>>> >>> (essentially: calling member functions of one type on an instance not >>>> >>> of that type)? (in the specific case mentioned below, as discussed in >>>> >>> the original thread, ASan or MSan might be able to help) >>>> >>> >>>> >>> Clang Dev: what should we do to fix this? I don't think it's helpful >>>> >>> to add machinery to llvm::cast to handle concrete type conversion, so >>>> >>> I think we should consider a non-llvm::cast solution for this >>>> >>> hierarchy. Replace the llvm::isa calls with getTypeLocClass calls (or >>>> >>> a separate wrapper for that) & add a SpecificTypeLoc(const TypeLoc&) >>>> >>> ctor for every specific TypeLoc that has an appropriate assert & calls >>>> >>> up through to the base copy ctor. It'll be a fair bit of code to add >>>> >>> to TypeLoc, but nothing complicated. >>>> >>> >>>> >>> Any other ideas? >>>> >> >>>> >> I don't see why isa<> and cast<> aren't the right code interface for >>>> >> this, >>>> >> rather than reinventing something else. You just need to teach cast<> >>>> >> to construct and return a proper temporary. >>>> > >>>> > LLVM-dev folks: what do you reckon? Should we extend the cast >>>> > infrastructure to be able to handle value type conversions? >>>> >>>> No, I don't think that makes sense; the casting infrastructure is >>>> complicated enough without having to deal with this. It would be easy >>>> enough to implement a method on TypeLoc to handle this, >>>> >>>> > It doesn't really feel like the right fit to me, but I'll defer to the >>>> > community's wisdom if it's overwhelming. I'd just like to see a >>>> > solution wherein we can't write this particular kind of bug again, >>>> > ideally. >>>> >>>> Attaching proof-of-concept solution which prevents this kind of bug. >>>> (Ugly, and requires C++11 support as written, but potentially >>>> interesting anyway.) >>> >>> >>> The C++11 should be easy to remove (move the enable_if to the return type). >>> This approach seems reasonable to me. >> >> Thanks Eli - that does seem to do the trick (took me a little while to >> wrap my head around it, though). I've modified it, as Richard >> suggested, to use enable_if in the return type & it builds in C++98 >> and correctly catches the TypeLoc usage. >> >> I'll have to find some time to actually go & fix TypeLoc before we can >> submit this, though. >> >> (& technically we could allow casting where the parameter type is a >> temporary, so long as you only cast to the same type - but that's not >> much use. I wonder if we should trivially disallow casting when the >> source/dest type are the same anyway? But I suppose there's no harm in >> allowing that either, except that it might confuse the reader a bit >> when they see a no-op cast like that) > > Looking into this I've fixed a few other things that were a little > broken by the looseness of llvm::cast & friends (see r174853). I've > been prototyping a fix to TypeLoc specifically, hit some hiccups in > ASTMatchers (probably surmountable - Manuel helped me with r174862 > (might need one other change there)) but otherwise seems achievable. > > Beyond that, though, I've hit one hierarchy in the Static Analyzer > that does this as well: ProgramPoint. On IRC Jordan Rose mentioned > that there's another more pervasive use of this pattern in the Static > Analyzer, the SVal hierarchy. > > So, Ted, how objectionable would it be for me to introduce something > like (names subject to adjustment): > > template<typename T> > T SVal::castAs(); > > template<typename T> > llvm::Optional<T> SVal::getAs(); > > (the implementations of these functions might involve invoking private > FooSVal(SVal) ctors in each SVal derived type - so adding a bit of > boilerplate ctors to all those classes)Richard Smith pointed out that this could be done without the boilerplate with something like: Derived d; d.Base::operator=(base); a bit nasty, but valid/well-defined & would avoid writing the boilerplate delegating Derived(Base) ctors in every derived type> > and replace "cast<FooSVal>(bar)" with "bar.castAs<FooSVal>()" and > dyn_cast with getAs similarly? & similarly for the ProgramPoint (& > TypeLoc - which I'm already working on) hierarchies? > > > Full disclosure: > Obviously the more important part of this is fixing the easily-written > bugs such as r174862 & that doesn't necessarily require us to fix the > UB, nor to migrate these APIs away from the LLVM RTTI infrastructure. > > We could, for example, add value-type-conversion support to the LLVM > RTTI infrastructure that checks that if you pass a temporary it would > explicitly do a value conversion and return a value not a > reference/pointer. (this wouldn't scale to dyn_cast, though - that'd > have to return Optional<T> not T* since there's no T object for a T* > to point to - we could still, technically, wedge this in to dyn_cast > but that might be a bit much) And we could do this without fixing the > UB type punning here (still reinterpret_casting the base to the > derived type & using the implicit copy ctor) or while fixing the UB > (require the derived types to have derived(base) ctors) but keeping > the cast API usage the same. > > It's my personal opinion (& see Eli's agreement above) that it's best > just to leave this out of the LLVM RTTI infrastructure entirely & have > these hierarchies handle it themselves. > > Thanks, > - David
On Feb 11, 2013, at 12:02 PM, David Blaikie <dblaikie at gmail.com> wrote:> Beyond that, though, I've hit one hierarchy in the Static Analyzer > that does this as well: ProgramPoint. On IRC Jordan Rose mentioned > that there's another more pervasive use of this pattern in the Static > Analyzer, the SVal hierarchy. > > So, Ted, how objectionable would it be for me to introduce something > like (names subject to adjustment): > > template<typename T> > T SVal::castAs(); > > template<typename T> > llvm::Optional<T> SVal::getAs(); > > (the implementations of these functions might involve invoking private > FooSVal(SVal) ctors in each SVal derived type - so adding a bit of > boilerplate ctors to all those classes) > > and replace "cast<FooSVal>(bar)" with "bar.castAs<FooSVal>()" and > dyn_cast with getAs similarly? & similarly for the ProgramPoint (& > TypeLoc - which I'm already working on) hierarchies? >Sorry everyone for not seeing this. Jordan was nice enough to ping me. I would be perfectly fine with this change, and I think it looks cleaner anyways. I think it is a great idea. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130215/85b27e5d/attachment.html>
On Fri, Feb 15, 2013 at 4:55 PM, Ted Kremenek <kremenek at apple.com> wrote:> On Feb 11, 2013, at 12:02 PM, David Blaikie <dblaikie at gmail.com> wrote: > > Beyond that, though, I've hit one hierarchy in the Static Analyzer > that does this as well: ProgramPoint. On IRC Jordan Rose mentioned > that there's another more pervasive use of this pattern in the Static > Analyzer, the SVal hierarchy. > > So, Ted, how objectionable would it be for me to introduce something > like (names subject to adjustment): > > template<typename T> > T SVal::castAs(); > > template<typename T> > llvm::Optional<T> SVal::getAs(); > > (the implementations of these functions might involve invoking private > FooSVal(SVal) ctors in each SVal derived type - so adding a bit of > boilerplate ctors to all those classes) > > and replace "cast<FooSVal>(bar)" with "bar.castAs<FooSVal>()" and > dyn_cast with getAs similarly? & similarly for the ProgramPoint (& > TypeLoc - which I'm already working on) hierarchies? > > > Sorry everyone for not seeing this. Jordan was nice enough to ping me. I > would be perfectly fine with this change, and I think it looks cleaner > anyways. I think it is a great idea. >Thanks for your patience - the final change to llvm::cast & friends to disallow rvalues (based on Eli's suggestion, tweaked to work with LLVM type traits rather than C++11 traits) has been committed in r175462. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130221/52881f3c/attachment.html>