Frédéric Riss
2015-Feb-20 17:44 UTC
[LLVMdev] Questions before moving the new debug info hierarchy into place
> On Feb 20, 2015, at 9:00 AM, David Blaikie <dblaikie at gmail.com> wrote: > > > > On Thu, Feb 19, 2015 at 7:49 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com <mailto:dexonsmith at apple.com>> wrote: > I'm getting close to executing the transition to the new debug info > hierarchy. For reference, I've attached two WIP patches (which would be > squashed before commit) and the WIP upgrade script I'm using. > > - transition-code.patch: Change the `DIDescriptor` hierarchy to > lightweight wrappers around the subclasses of `DebugNode` (instead > of rather heavier wrappers around `MDTuple`), and update > `DIBuilder` to generate the right things. > - transition-testcases.patch: Upgrade of testcases, entirely > generated from the upgrade script (so far). > - upgrade-specialized-nodes.sh: Script to upgrade LLVM assembly. > > (Feel free to have a look, but I haven't updated LangRef, I don't quite > have `make check` passing, and I haven't even started on `make > check-clang` (I imagine that'll all be done by hand).) > > There are two outstanding issues I'd like some feedback on. > > Pretty-printing the flags > ========================> > I've noticed the `flags:` field is *harder* to read in the new assembly: > > !MDDerivedType(flags: 16384, ...) > > than the pretty-printed comments in the old: > > !{!"...\\0016384", ...} ; ... [public] [rvalue reference] > > I don't want to regress here. > > In `DIDescriptor`, the flags are described in an enum bitfield: > > FlagAccessibility = 1 << 0 | 1 << 1, > FlagPrivate = 1, > FlagProtected = 2, > FlagPublic = 3, > FlagFwdDecl = 1 << 2, > FlagAppleBlock = 1 << 3, > FlagBlockByrefStruct = 1 << 4, > FlagVirtual = 1 << 5, > FlagArtificial = 1 << 6, > FlagExplicit = 1 << 7, > FlagPrototyped = 1 << 8, > FlagObjcClassComplete = 1 << 9, > FlagObjectPointer = 1 << 10, > FlagVector = 1 << 11, > FlagStaticMember = 1 << 12, > FlagLValueReference = 1 << 13, > FlagRValueReference = 1 << 14 > > I think the right short-term solution is to use these names directly in > assembly, giving us: > > !MDDerivedType(flags: FlagPublic | FlagRValueReference, ...) > > This is easy to implement and easy to CHECK against. > > Sound good? > > (Eventually, I'd like to use the `DW_AT` symbols that each of these > corresponds to, but `FlagStaticMember` doesn't seem to line up with any > such `DW_AT` so that will take some refactoring (and I don't think it > makes sense for that to block moving the hierarchy into place).) > > Merging the two types of files > =============================> > In the old format, we have two types of files -- an untagged file node, > and a DIFile/DW_TAG_file_type/0x29 which references the untagged node. > > !0 = !{!"path/to/file", !"/path/to/dir"} > !1 = !{!"0x29", !0} > > In the actual metadata graph, most file references use !0, but in > DIBuilder !1 is always passed in and the !0 is extracted from it. > > I've been planning to merge these into: > > !1 = !MDFile(filename: "path/to/file", directory: "/path/to/dir") > > Anyone see a problem with that? Why? > > If the strings are deduplicated elsewhere, that should be fine. (I think I made the change originally to pull out the names because of the duplication I saw in many constructsn needing to reference the file/directory and they all contained the same file/directory)No specific comments on the approach which seems fine. One question though: Speaking of deduplication of filenames. I think we discussed that in the early stages of your work, but I just wanted to make sure I remember correctly: the new debug hierarchy will allow to implement specific uniquing behavior, right? So we will be able to tweak MDFile to unique: !MDFile(filename: "path/to/file", directory: "/path/to/dir") !MDFile(filename: "to/file", directory: "/path/to/dir/path") into the same object? I did some work in this area and in the current form it’s not really possible to do cleanly (don’t remember the details here, but I know punted it till we get smart uniquing capability). Thanks! Fred> I take it you'll have other constructs (I think all scopes have a file/directory) reference the MDFile directly? > > > If not, I'd like to roll that change into the same patch in order to > reduce testcase churn. (I've discovered that it won't complicate the > code patch at all.) > > > > > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150220/65114a9b/attachment.html>
Duncan P. N. Exon Smith
2015-Feb-20 19:14 UTC
[LLVMdev] Questions before moving the new debug info hierarchy into place
> On 2015-Feb-20, at 09:44, Frédéric Riss <friss at apple.com> wrote: > > Speaking of deduplication of filenames. I think we discussed that in the early stages of your work, but I just wanted to make sure I remember correctly: the new debug hierarchy will allow to implement specific uniquing behavior, right? So we will be able to tweak MDFile to unique: > > !MDFile(filename: "path/to/file", directory: "/path/to/dir") > !MDFile(filename: "to/file", directory: "/path/to/dir/path") > > into the same object? I did some work in this area and in the current form it’s not really possible to do cleanly (don’t remember the details here, but I know punted it till we get smart uniquing capability).IIRC, there was some talk about how the DWARF spec requires that references to a `directory` need to be the CWD of the compiler. It's not clear to me *which* references need that, but I guess we need to sort out exactly what the requirements are so we know which way to canonicalize things. I wonder if this only matters for DW_TAG_compile_unit? If so, we could canonicalize freely everywhere else. (For example, we could store a `path:` node everywhere (dropping the filename/directory distinction, or canonicalizing it to basename/dirname, etc.), and add a `cwd:` to the compile unit.) David also pointed out (in a previous thread about this) that the frontend (or `DIBuilder`) might be the right location for canonicalization. Certainly, we couldn't canonicalize `..` references without access to the filesystem (at least not on POSIX platforms), but maybe we don't care about that anyway?
Frédéric Riss
2015-Feb-20 19:34 UTC
[LLVMdev] Questions before moving the new debug info hierarchy into place
> On Feb 20, 2015, at 11:14 AM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote: > > >> On 2015-Feb-20, at 09:44, Frédéric Riss <friss at apple.com> wrote: >> >> Speaking of deduplication of filenames. I think we discussed that in the early stages of your work, but I just wanted to make sure I remember correctly: the new debug hierarchy will allow to implement specific uniquing behavior, right? So we will be able to tweak MDFile to unique: >> >> !MDFile(filename: "path/to/file", directory: "/path/to/dir") >> !MDFile(filename: "to/file", directory: "/path/to/dir/path") >> >> into the same object? I did some work in this area and in the current form it’s not really possible to do cleanly (don’t remember the details here, but I know punted it till we get smart uniquing capability). > > IIRC, there was some talk about how the DWARF spec requires that > references to a `directory` need to be the CWD of the compiler.If you refer to a conversation we had, I’m pretty sure I was referring to this text in the standard about the directories listed in the line table: "Entries in this sequence describe each path that was searched for included source files in this compilation. (The paths include those directories specified explicitly by the user for the compiler to search and those the compiler searches without explicit direction.) Each path entry is either a full path name or is relative to the current directory of the compilation." I think that at the time I had interpreted that has “if the user passes -I../include to the compiler, le line table should list ../include as a directory in the line table”. I think my interpretation was too strict though and that we could canonicalize the file/dir names as we’d like. The compilation dir will always be there as a special entry. It’s the implicit entry 0 in the directory list, but it’s value is not stored in the line table, it is a normal string that is referenced by the TAG_compile_unit DIE. If it is extracted from an MDFile, we need to make sure this one doesn’t get mangled, but I think this is the only one we really need to preserve.> It's not clear to me *which* references need that, but I guess > we need to sort out exactly what the requirements are so we know > which way to canonicalize things. I wonder if this only matters > for DW_TAG_compile_unit? If so, we could canonicalize freely > everywhere else. (For example, we could store a `path:` node > everywhere (dropping the filename/directory distinction, or > canonicalizing it to basename/dirname, etc.), and add a `cwd:` > to the compile unit.)As stated above, I think we are free to canonicalize as we wish.> David also pointed out (in a previous thread about this) that > the frontend (or `DIBuilder`) might be the right location for > canonicalization. Certainly, we couldn't canonicalize `..` > references without access to the filesystem (at least not on > POSIX platforms), but maybe we don't care about that anyway?What I attempted to do in the past was exactly that, canonicalize ‘..’ in the paths (The lldb folks would have loved to be able to consider file identifiers as unique in the line table). Fred -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150220/328f57d6/attachment.html>