Chandler Carruth
2015-Jan-27 05:13 UTC
[LLVMdev] RFC: Introduce a section to the programmers manual about type hierarchies, polymorphism, and virtual dispatch.
The proposed addition to ProgrammersManuel.rst is below the break... This is essentially trying to explain the emerging design techniques being used in LLVM these days somewhere more accessible than the comments on a particular piece of infrastructure. It covers the "concepts-based polymorphism" that caused some confusion during initial reviews of the new pass manager as well as the tagged-dispatch mechanism used pervasively in LLVM and Clang. Perhaps most notably, I've tried to provide some criteria to help developers choose between these options when designing new pieces of infrastructure. Many thanks to Richard Smith who helped me crystalize this, as well as Eric, Dave, and Nick who helped me talk it through. You can find the actual patch and make comments on Phabricator here: http://reviews.llvm.org/D7191 I'm just going to paste the text below however as I think that'll more easily facilitate discussion! ======= .. _polymorphism: Designing Type Hiercharies and Polymorphic Interfaces ----------------------------------------------------- There are two different design patterns that tend to result in the use of virtual dispatch for methods in a type hierarchy in C++ programs. The first is a genuine type hierarchy where different types in the hierarchy model a specific subset of the functionality and semantics, and these types nest strictly within each other. Good examples of this can be seen in the ``Value`` or ``Type`` type hierarchies. A second is the desire to dispatch dynamically across a collection of polymorphic interface implementations. This latter use case can be modeled with virtual dispatch and inheritance by defining an abstract interface base class which all implementations derive from and override. However, this implementation strategy forces an **"is-a"** relationship to exist that is not actually meaningful. There is often not some nested hierarchy of useful generalizations which code might interact with and move up and down. Instead, there is a singular interface which is dispatched across a range of implementations. An alternate implementation strategy for the second use case which should be preferred within LLVM is to that of generic programming. For example, a template over some type parameter ``T`` can be instantiated across any particular implementation that conforms to the interface. A good example here is the highly generic properties of any type which models a node in a directed graph. LLVM models these primarily through templates and generic programming. Such templates include the ``LoopInfoBase`` and ``DominatorTreeBase``. When this type of polymorphism truly needs **dynamic** dispatch, you can generalize it using a technique called *concept-based polymorphism* which emulates the interfaces and behaviors of templates while using a very limited form of virtual dispatch for type erasure inside its implementation. You can find examples of this technique in the ``PassManager.h`` system, and there is a more detailed introduction to it by Sean Parent in several of his talks and papers: #. `Sean Parent's Papers and Presentations < http://github.com/sean-parent/sean-parent.github.com/wiki/Papers-and-Presentations>`_- A Github project full of links to slides, video, and sometimes code. #. `Value Semantics and Concepts-based Polymorphism <http://www.youtube.com/watch?v=_BpMYeUFXv8>`_ - The C++Now! 2012 talk describing this technique. #. `Inheritance Is The Base Class of Evil < http://channel9.msdn.com/Events/GoingNative/2013/Inheritance-Is-The-Base-Class-of-Evil>`_- The GoingNative 2013 talk describing this technique. When deciding between creating a type hierarchy and using virtual dispatch and using templates or concepts-based polymorphism, consider whether there is some refinement of an abstract base class which is actually a useful type on an interface boundary. If anything more refined than the root abstract interface is meaningless to talk about as a partial extension of the semantic model, then your use case fits better with polymorphism and you should avoid using virtual dispatch. However, there may be some exigent circumstances that require one technique or the other to be used. If you do need to introduce a type hierarchy, we often prefer to used explicitly closed type hierarchies with manual tagged dispatch rather than the open inheritance model and virtual dispatch that is more common in C++ code. This is because LLVM rarely encourages library consumers to extend its core types, and leverages the closed and tag-dispatched nature of its hierarchies to generate significantly more efficient code. We have also found that a large amount of our usage of type hierarchies fits better with tag-based pattern matching rather than dynamic dispatch across a common interface. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150126/beb41c64/attachment.html>
Philip Reames
2015-Jan-27 17:13 UTC
[LLVMdev] Closed virtual dispatch (was: Introduce a section to the programmers manual about type hierarchies, polymorphism, and virtual dispatch.)
On 01/26/2015 09:13 PM, Chandler Carruth wrote:> If you do need to introduce a type hierarchy, we often prefer to used > explicitly closed type hierarchies with manual tagged dispatch rather > than the > open inheritance model and virtual dispatch that is more common in C++ > code. > This is because LLVM rarely encourages library consumers to extend its > core > types, and leverages the closed and tag-dispatched nature of its > hierarchies to > generate significantly more efficient code. We have also found that a > large > amount of our usage of type hierarchies fits better with tag-based pattern > matching rather than dynamic dispatch across a common interface.Given we have a large number of compiler authors and contributors to the C++ language spec, it seems really odd to me that we can't solve this problem in a more elegant way. The tagged-dispatch as a sealed world replacement for virtual dispatch is one that comes up on a pretty regular basis. Might it be time to figure out how to generalize this and propose either a clang extension or a language mechanism for some future version of C++? Philip
David Chisnall
2015-Jan-28 09:05 UTC
[LLVMdev] Closed virtual dispatch (was: Introduce a section to the programmers manual about type hierarchies, polymorphism, and virtual dispatch.)
On 27 Jan 2015, at 17:13, Philip Reames <listmail at philipreames.com> wrote:> > Given we have a large number of compiler authors and contributors to the C++ language spec, it seems really odd to me that we can't solve this problem in a more elegant way. The tagged-dispatch as a sealed world replacement for virtual dispatch is one that comes up on a pretty regular basis. Might it be time to figure out how to generalize this and propose either a clang extension or a language mechanism for some future version of C++?One of the motivations for being able to mark a class as final in Java was to allow compilers to perform optimisations on the assumption that there would never be any subclasses (I don't believe any modern JVMs take advantage of this as it's mostly only useful to AoT compilers, but maybe ART does). For this idiom, the guarantee that you want is something slightly broader than final, it's that, at some known point, all subclasses will be visible to the optimiser. This is a bit difficult to do in the files-are-approximately-compilation-units model, but a bit easier with LTO (though re-inferring enough information to do devirtualisation at this level is a bit tricky). C++, at the language level, unfortunately has no notion of a shared library. This is very problematic in some places in C++11 (the semantics for when constructors and destructors are run for thread_local objects in the presence of multiple threads and library loading and unloading are very poorly defined) and also makes it difficult to have a package-final or similar to indicate that no subclasses outside of this library are permitted. Or, if the goal is to put a little bit more burden on the programmer but give greater flexibility in implementations, I believe that one of the people in this thread is the head of the reflection subcommittee in WG21, so may already be reviewing proposals that would make it easier to implement generically... David