I came across a PDB that caused llvm-pdbdump to crash.  I tracked it down
to a missing overload for a data type in the variable printer.  When I
tried to provide an implementation, I discovered that there were some bugs
in the existing overloads and it was easy to come up with combinations of
types that weren't being dumped right.  I created some test cases and tried
to fix these problems but realized that there's a limitation to how
printing the type information is structured.
The cause of the difficulty is the conflict between these two facts:
1.  C and C++ variable declarations are described inside-out
2.  Printing with color requires printing left-to-right
Simple types generally work, because all the type information is on the
left side, so VariableDumper can just work left-to-right.  To handle more
complex types (like arrays and function signatures), VariableDumper
attempts to special case the complex types.  Unfortunately, the special
cases can't be combined generally (e.g., making an array of pointers to
functions), so it's always possible to find a new bug by adding an extra
layer of type information.
I'm proposing a restructure of VariableDumper (PrettyVariableDumper.cpp)
that I think should allow for pretty-printing of all types and actually
simplify away most of the special-case code.
The existing approach is to emit the type information for a symbol and then
the identifier, using recursion as necessary to emit the type information
of subtypes.  For example, if foo is a pointer to an int, it's dumped by
first recursively dumping the pointee type (int), then the '*', and then
the name (foo).  But if bar is a pointer to an array of ints, this breaks
down.
My proposal is to extend this slightly by having dumpers for the
"left-side" and the "right-side" of the type information. 
The pattern for
printing a variable them becomes:
    call dumpLeft with the symbol type
    emit the symbol name
    call dumpRight with the symbol type
Most of the dumpLeft implementations would be identical to the current dump
implementations.  The default for a dumpRight implementation would be to
emit nothing, but for arrays and function signatures, dumpRight would print
the bracketed array size and argument lists.  This depth-first order of the
recursion should produce the necessary output in order, which means there
should be no change to how we stream the output in color.
Any questions, comments, objections?
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170407/e867b6f8/attachment-0001.html>
This is exactly how clang's type printer works, so I'd say this is
probably
the right implementation strategy. :)
Check out clang/lib/AST/TypePrinter.cpp:
  class TypePrinter {
...
    void printBefore(const Type *ty, Qualifiers qs, raw_ostream &OS);
    void printBefore(QualType T, raw_ostream &OS);
    void printAfter(const Type *ty, Qualifiers qs, raw_ostream &OS);
    void printAfter(QualType T, raw_ostream &OS);
...
#define ABSTRACT_TYPE(CLASS, PARENT)
#define TYPE(CLASS, PARENT) \
    void print##CLASS##Before(const CLASS##Type *T, raw_ostream &OS); \
    void print##CLASS##After(const CLASS##Type *T, raw_ostream &OS);
#include "clang/AST/TypeNodes.def"
On Fri, Apr 7, 2017 at 1:33 PM, Adrian McCarthy via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> I came across a PDB that caused llvm-pdbdump to crash.  I tracked it down
> to a missing overload for a data type in the variable printer.  When I
> tried to provide an implementation, I discovered that there were some bugs
> in the existing overloads and it was easy to come up with combinations of
> types that weren't being dumped right.  I created some test cases and
tried
> to fix these problems but realized that there's a limitation to how
> printing the type information is structured.
>
> The cause of the difficulty is the conflict between these two facts:
>
> 1.  C and C++ variable declarations are described inside-out
> 2.  Printing with color requires printing left-to-right
>
> Simple types generally work, because all the type information is on the
> left side, so VariableDumper can just work left-to-right.  To handle more
> complex types (like arrays and function signatures), VariableDumper
> attempts to special case the complex types.  Unfortunately, the special
> cases can't be combined generally (e.g., making an array of pointers to
> functions), so it's always possible to find a new bug by adding an
extra
> layer of type information.
>
> I'm proposing a restructure of VariableDumper
(PrettyVariableDumper.cpp)
> that I think should allow for pretty-printing of all types and actually
> simplify away most of the special-case code.
>
> The existing approach is to emit the type information for a symbol and
> then the identifier, using recursion as necessary to emit the type
> information of subtypes.  For example, if foo is a pointer to an int,
it's
> dumped by first recursively dumping the pointee type (int), then the
'*',
> and then the name (foo).  But if bar is a pointer to an array of ints, this
> breaks down.
>
> My proposal is to extend this slightly by having dumpers for the
> "left-side" and the "right-side" of the type
information.  The pattern for
> printing a variable them becomes:
>
>     call dumpLeft with the symbol type
>     emit the symbol name
>     call dumpRight with the symbol type
>
> Most of the dumpLeft implementations would be identical to the current
> dump implementations.  The default for a dumpRight implementation would be
> to emit nothing, but for arrays and function signatures, dumpRight would
> print the bracketed array size and argument lists.  This depth-first order
> of the recursion should produce the necessary output in order, which means
> there should be no change to how we stream the output in color.
>
> Any questions, comments, objections?
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170407/11419776/attachment.html>
Since Phabricator isn't sending email: https://reviews.llvm.org/D31832 On Fri, Apr 7, 2017 at 1:54 PM, Reid Kleckner <rnk at google.com> wrote:> This is exactly how clang's type printer works, so I'd say this is > probably the right implementation strategy. :) > > Check out clang/lib/AST/TypePrinter.cpp: > class TypePrinter { > ... > void printBefore(const Type *ty, Qualifiers qs, raw_ostream &OS); > void printBefore(QualType T, raw_ostream &OS); > void printAfter(const Type *ty, Qualifiers qs, raw_ostream &OS); > void printAfter(QualType T, raw_ostream &OS); > ... > #define ABSTRACT_TYPE(CLASS, PARENT) > #define TYPE(CLASS, PARENT) \ > void print##CLASS##Before(const CLASS##Type *T, raw_ostream &OS); \ > void print##CLASS##After(const CLASS##Type *T, raw_ostream &OS); > #include "clang/AST/TypeNodes.def" > > > On Fri, Apr 7, 2017 at 1:33 PM, Adrian McCarthy via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> I came across a PDB that caused llvm-pdbdump to crash. I tracked it down >> to a missing overload for a data type in the variable printer. When I >> tried to provide an implementation, I discovered that there were some bugs >> in the existing overloads and it was easy to come up with combinations of >> types that weren't being dumped right. I created some test cases and tried >> to fix these problems but realized that there's a limitation to how >> printing the type information is structured. >> >> The cause of the difficulty is the conflict between these two facts: >> >> 1. C and C++ variable declarations are described inside-out >> 2. Printing with color requires printing left-to-right >> >> Simple types generally work, because all the type information is on the >> left side, so VariableDumper can just work left-to-right. To handle more >> complex types (like arrays and function signatures), VariableDumper >> attempts to special case the complex types. Unfortunately, the special >> cases can't be combined generally (e.g., making an array of pointers to >> functions), so it's always possible to find a new bug by adding an extra >> layer of type information. >> >> I'm proposing a restructure of VariableDumper (PrettyVariableDumper.cpp) >> that I think should allow for pretty-printing of all types and actually >> simplify away most of the special-case code. >> >> The existing approach is to emit the type information for a symbol and >> then the identifier, using recursion as necessary to emit the type >> information of subtypes. For example, if foo is a pointer to an int, it's >> dumped by first recursively dumping the pointee type (int), then the '*', >> and then the name (foo). But if bar is a pointer to an array of ints, this >> breaks down. >> >> My proposal is to extend this slightly by having dumpers for the >> "left-side" and the "right-side" of the type information. The pattern for >> printing a variable them becomes: >> >> call dumpLeft with the symbol type >> emit the symbol name >> call dumpRight with the symbol type >> >> Most of the dumpLeft implementations would be identical to the current >> dump implementations. The default for a dumpRight implementation would be >> to emit nothing, but for arrays and function signatures, dumpRight would >> print the bracketed array size and argument lists. This depth-first order >> of the recursion should produce the necessary output in order, which means >> there should be no change to how we stream the output in color. >> >> Any questions, comments, objections? >> >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170410/77aacabf/attachment.html>