I came across a PDB that caused llvm-pdbdump to crash. I tracked it down to a missing overload for a data type in the variable printer. When I tried to provide an implementation, I discovered that there were some bugs in the existing overloads and it was easy to come up with combinations of types that weren't being dumped right. I created some test cases and tried to fix these problems but realized that there's a limitation to how printing the type information is structured. The cause of the difficulty is the conflict between these two facts: 1. C and C++ variable declarations are described inside-out 2. Printing with color requires printing left-to-right Simple types generally work, because all the type information is on the left side, so VariableDumper can just work left-to-right. To handle more complex types (like arrays and function signatures), VariableDumper attempts to special case the complex types. Unfortunately, the special cases can't be combined generally (e.g., making an array of pointers to functions), so it's always possible to find a new bug by adding an extra layer of type information. I'm proposing a restructure of VariableDumper (PrettyVariableDumper.cpp) that I think should allow for pretty-printing of all types and actually simplify away most of the special-case code. The existing approach is to emit the type information for a symbol and then the identifier, using recursion as necessary to emit the type information of subtypes. For example, if foo is a pointer to an int, it's dumped by first recursively dumping the pointee type (int), then the '*', and then the name (foo). But if bar is a pointer to an array of ints, this breaks down. My proposal is to extend this slightly by having dumpers for the "left-side" and the "right-side" of the type information. The pattern for printing a variable them becomes: call dumpLeft with the symbol type emit the symbol name call dumpRight with the symbol type Most of the dumpLeft implementations would be identical to the current dump implementations. The default for a dumpRight implementation would be to emit nothing, but for arrays and function signatures, dumpRight would print the bracketed array size and argument lists. This depth-first order of the recursion should produce the necessary output in order, which means there should be no change to how we stream the output in color. Any questions, comments, objections? -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170407/e867b6f8/attachment-0001.html>
This is exactly how clang's type printer works, so I'd say this is probably the right implementation strategy. :) Check out clang/lib/AST/TypePrinter.cpp: class TypePrinter { ... void printBefore(const Type *ty, Qualifiers qs, raw_ostream &OS); void printBefore(QualType T, raw_ostream &OS); void printAfter(const Type *ty, Qualifiers qs, raw_ostream &OS); void printAfter(QualType T, raw_ostream &OS); ... #define ABSTRACT_TYPE(CLASS, PARENT) #define TYPE(CLASS, PARENT) \ void print##CLASS##Before(const CLASS##Type *T, raw_ostream &OS); \ void print##CLASS##After(const CLASS##Type *T, raw_ostream &OS); #include "clang/AST/TypeNodes.def" On Fri, Apr 7, 2017 at 1:33 PM, Adrian McCarthy via llvm-dev < llvm-dev at lists.llvm.org> wrote:> I came across a PDB that caused llvm-pdbdump to crash. I tracked it down > to a missing overload for a data type in the variable printer. When I > tried to provide an implementation, I discovered that there were some bugs > in the existing overloads and it was easy to come up with combinations of > types that weren't being dumped right. I created some test cases and tried > to fix these problems but realized that there's a limitation to how > printing the type information is structured. > > The cause of the difficulty is the conflict between these two facts: > > 1. C and C++ variable declarations are described inside-out > 2. Printing with color requires printing left-to-right > > Simple types generally work, because all the type information is on the > left side, so VariableDumper can just work left-to-right. To handle more > complex types (like arrays and function signatures), VariableDumper > attempts to special case the complex types. Unfortunately, the special > cases can't be combined generally (e.g., making an array of pointers to > functions), so it's always possible to find a new bug by adding an extra > layer of type information. > > I'm proposing a restructure of VariableDumper (PrettyVariableDumper.cpp) > that I think should allow for pretty-printing of all types and actually > simplify away most of the special-case code. > > The existing approach is to emit the type information for a symbol and > then the identifier, using recursion as necessary to emit the type > information of subtypes. For example, if foo is a pointer to an int, it's > dumped by first recursively dumping the pointee type (int), then the '*', > and then the name (foo). But if bar is a pointer to an array of ints, this > breaks down. > > My proposal is to extend this slightly by having dumpers for the > "left-side" and the "right-side" of the type information. The pattern for > printing a variable them becomes: > > call dumpLeft with the symbol type > emit the symbol name > call dumpRight with the symbol type > > Most of the dumpLeft implementations would be identical to the current > dump implementations. The default for a dumpRight implementation would be > to emit nothing, but for arrays and function signatures, dumpRight would > print the bracketed array size and argument lists. This depth-first order > of the recursion should produce the necessary output in order, which means > there should be no change to how we stream the output in color. > > Any questions, comments, objections? > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170407/11419776/attachment.html>
Since Phabricator isn't sending email: https://reviews.llvm.org/D31832 On Fri, Apr 7, 2017 at 1:54 PM, Reid Kleckner <rnk at google.com> wrote:> This is exactly how clang's type printer works, so I'd say this is > probably the right implementation strategy. :) > > Check out clang/lib/AST/TypePrinter.cpp: > class TypePrinter { > ... > void printBefore(const Type *ty, Qualifiers qs, raw_ostream &OS); > void printBefore(QualType T, raw_ostream &OS); > void printAfter(const Type *ty, Qualifiers qs, raw_ostream &OS); > void printAfter(QualType T, raw_ostream &OS); > ... > #define ABSTRACT_TYPE(CLASS, PARENT) > #define TYPE(CLASS, PARENT) \ > void print##CLASS##Before(const CLASS##Type *T, raw_ostream &OS); \ > void print##CLASS##After(const CLASS##Type *T, raw_ostream &OS); > #include "clang/AST/TypeNodes.def" > > > On Fri, Apr 7, 2017 at 1:33 PM, Adrian McCarthy via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> I came across a PDB that caused llvm-pdbdump to crash. I tracked it down >> to a missing overload for a data type in the variable printer. When I >> tried to provide an implementation, I discovered that there were some bugs >> in the existing overloads and it was easy to come up with combinations of >> types that weren't being dumped right. I created some test cases and tried >> to fix these problems but realized that there's a limitation to how >> printing the type information is structured. >> >> The cause of the difficulty is the conflict between these two facts: >> >> 1. C and C++ variable declarations are described inside-out >> 2. Printing with color requires printing left-to-right >> >> Simple types generally work, because all the type information is on the >> left side, so VariableDumper can just work left-to-right. To handle more >> complex types (like arrays and function signatures), VariableDumper >> attempts to special case the complex types. Unfortunately, the special >> cases can't be combined generally (e.g., making an array of pointers to >> functions), so it's always possible to find a new bug by adding an extra >> layer of type information. >> >> I'm proposing a restructure of VariableDumper (PrettyVariableDumper.cpp) >> that I think should allow for pretty-printing of all types and actually >> simplify away most of the special-case code. >> >> The existing approach is to emit the type information for a symbol and >> then the identifier, using recursion as necessary to emit the type >> information of subtypes. For example, if foo is a pointer to an int, it's >> dumped by first recursively dumping the pointee type (int), then the '*', >> and then the name (foo). But if bar is a pointer to an array of ints, this >> breaks down. >> >> My proposal is to extend this slightly by having dumpers for the >> "left-side" and the "right-side" of the type information. The pattern for >> printing a variable them becomes: >> >> call dumpLeft with the symbol type >> emit the symbol name >> call dumpRight with the symbol type >> >> Most of the dumpLeft implementations would be identical to the current >> dump implementations. The default for a dumpRight implementation would be >> to emit nothing, but for arrays and function signatures, dumpRight would >> print the bracketed array size and argument lists. This depth-first order >> of the recursion should produce the necessary output in order, which means >> there should be no change to how we stream the output in color. >> >> Any questions, comments, objections? >> >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170410/77aacabf/attachment.html>