thr3ads.net - llvm dev - [LLVMdev] [Debug Info PATCH] for support of ref_addr and removal of DIE duplication [Oct 2013]

If this information is useful, please help other people find it:
Share via:

Manman Ren

2013-Oct-17 04:10 UTC

[LLVMdev] [Debug Info PATCH] for support of ref_addr and removal of DIE duplication

On Wed, Oct 16, 2013 at 1:58 PM, David Blaikie <dblaikie at gmail.com>
wrote:
>
>
>
> On Wed, Oct 16, 2013 at 1:21 PM, Manman Ren <manman.ren at gmail.com>
wrote:
>
>>
>> There are a few places where we break the assumption:
>> 1> formal_parameter constructed in DwarfDebug when adding attribute
type
>>      we call SPCU->addType(Arg, ATy), where Arg does not belong to
SPCU.
>> 2> inlined_subroutine constructed in DwarfDebug when adding
attribute
>> abstract_origin
>>      The inlined_subroutine does not belong to the CU we call
addDIEEntry
>> on.
>>      We create the children of inlined_subroutine and call addDIEEntry
>> before we add it to an owner.
>> 3> ...
>>
>> When building xalan with "lto -g", I saw 3158 violations of
the
>> assumption. Should we try to fix all these violations to make the
>> assumption true?
>>
>
> As you've said, this assumption (that DIEs are always constructed by
the
> CU that will own them) isn't necessary for this patch - it will be
> necessary for cu-at-a-time emission. Perhaps we should discuss it in a
> separate thread rather than trying to make a stronger assumption than is
> needed for this patch.
>
> Of course without your patch the assumption is trivially correct (well,
> given your point (1) above, perhaps it wasn't - maybe we do do things
out
> of order and from outside the CU) and we might want to say that it's so
> important that your patch shouldn't violate it - but I'm not sure
that's
> necessary (Eric?). I haven't given it a great deal of thought.
>

Patches are reattached for convenience: remove DIE duplication with a
worklist (name it patch_with), remove DIE duplication without a worklist
but an assertion when we emit a ref4, we make sure the DIE and the
referenced DIE belong to the same CU (name it patch_without).

Without my patch, the assumption may be true, but it does not matter since
we should always use ref4.
I have provided some cases that the assertion fails with patch_without.

I didn't get a chance to implement another assertion we mentioned earlier
(verify that inside addDIEEntry if a DIE does not have a parent,
it should belong to "this" CU). If we want to check if the assumption
holds
without my patch, I can implement the assertion and check on
Xalan.

If the assumption does not hold even without my patch, are we okay with
using a worklist?
If the assumption does hold without my patch, but does not hold with my
patch, are we okay with using a worklist? If not, we will need to fix the
3000+ violations (which may belong to <10 categories).

>
>> As I stated earlier, this assumption, in my opinion, is not really
>> general, and if we only make this assumption to remove a worklist, it
may
>> not worth it.
>>
>
> There is still no discussion about "removing the worklist" - the
only
> discussion is about whether we need to /add/ the worklist. It's up to
the
> commiter to justify the addition of code, not for the community to justify
> its removal. (I'd say this is /almost/ even true of committed code - if
> something isn't tested, I'm likely to remove it and see what breaks
- if
> someone cared about it, they should've provided tests for it).
>Agree. I was under the impression we are trying to remove the worklist.

>
>
>> How can developers make sure this assumption is not later violated?
>>
>
> We can add the kind of assertions we've discussed in this thread,
keeping
> a mapping every time we rely on the assumption and checking the mapping
> whenever we resolve the parent chain. I don't think this is necessary,
but
> I wouldn't block such a change either.
>
>
>> If we are going to make assumptions,
>>
>
> We are always going to make assumptions - well, we're always going to
have
> invariants in our design where any violation of those invariants is a bug.
>
>
>> I will suggest making general assumptions such as
>> 1> always use a CU to construct a DIE
>> 2> Before constructing a DIE, figure out the owner CU first by
chasing
>> the context chain, then call construction on the owner CU.
>> 3> Before calling addDIEEntry in DwarfDebug, find the owner CU
first,
>> then call addDIEEntry on the owner CU.
>>
>
> I'm not sure how valuable this would be - perhaps highly, perhaps not.
> It'd require more work for your patch, though.
>
Agree. These 3 will help emit-one-CU-at-a-time. I was thinking if we are
going to enforce the single assumption, it may be easier to enforce 3
general assumptions which can make emit-one-CU-at-a-time easier.

Thanks,
Manman
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20131016/4cc45f3a/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: remove-DIE-duplication-without-worklist.patch
Type: application/octet-stream
Size: 15993 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20131016/4cc45f3a/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: remove-DIE-duplication-with-worklist.patch
Type: application/octet-stream
Size: 18123 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20131016/4cc45f3a/attachment-0001.obj>

David Blaikie

2013-Oct-17 05:32 UTC

head link

[LLVMdev] [Debug Info PATCH] for support of ref_addr and removal of DIE duplication

On Wed, Oct 16, 2013 at 9:10 PM, Manman Ren <manman.ren at gmail.com>
wrote:
>
>
>
> On Wed, Oct 16, 2013 at 1:58 PM, David Blaikie <dblaikie at
gmail.com> wrote:
>
>>
>>
>>
>> On Wed, Oct 16, 2013 at 1:21 PM, Manman Ren <manman.ren at
gmail.com> wrote:
>>
>>>
>>> There are a few places where we break the assumption:
>>> 1> formal_parameter constructed in DwarfDebug when adding
attribute type
>>>      we call SPCU->addType(Arg, ATy), where Arg does not belong
to SPCU.
>>> 2> inlined_subroutine constructed in DwarfDebug when adding
attribute
>>> abstract_origin
>>>      The inlined_subroutine does not belong to the CU we call
>>> addDIEEntry on.
>>>      We create the children of inlined_subroutine and call
addDIEEntry
>>> before we add it to an owner.
>>> 3> ...
>>>
>>> When building xalan with "lto -g", I saw 3158 violations
of the
>>> assumption. Should we try to fix all these violations to make the
>>> assumption true?
>>>
>>
>> As you've said, this assumption (that DIEs are always constructed
by the
>> CU that will own them) isn't necessary for this patch - it will be
>> necessary for cu-at-a-time emission. Perhaps we should discuss it in a
>> separate thread rather than trying to make a stronger assumption than
is
>> needed for this patch.
>>
>> Of course without your patch the assumption is trivially correct (well,
>> given your point (1) above, perhaps it wasn't - maybe we do do
things out
>> of order and from outside the CU) and we might want to say that
it's so
>> important that your patch shouldn't violate it - but I'm not
sure that's
>> necessary (Eric?). I haven't given it a great deal of thought.
>>
>
>
> Patches are reattached for convenience: remove DIE duplication with a
> worklist (name it patch_with), remove DIE duplication without a worklist
> but an assertion when we emit a ref4, we make sure the DIE and the
> referenced DIE belong to the same CU (name it patch_without).
>
OK, we're going round in circles here and I'm not sure there are many
other
ways I can communicate things.

> Without my patch, the assumption may be true, but it does not matter since
> we should always use ref4.
> I have provided some cases that the assertion fails with patch_without.
>
Yes, because it's a narrow assertion based on an assumption I've
explained
is not true (due to adding DIEs to types after-the-fact, due to things like
implicit special members, member templates, and nested types).

The assertion you would need to check that a worklist is not required would
be a mapping listing the "assumptions" (ie: we found this parentless
DIE
and assumed it was in the same DIE as this other DIE we're building) and
then, whenever a DIE is added to a parent that chains up to a CU we would
have to check that mapping to see if the assumption turned out to be true.

I don't mind if you implement this or not.

What I do mind and will not accept is committing the worklist version of
this patch without justification, and by that I mean a demonstrated example
where it is needed, not just where it causes an overly-narrow assertion to
fail, but where it causes us to produce the wrong DWARF. If you have such
an example, let's look at that and figure out how to fix that - it might
(probably isn't) not be the right answer to use a worklist, but simply to
rework the order of construction/registration/parenting.

>
> I didn't get a chance to implement another assertion we mentioned
earlier
> (verify that inside addDIEEntry if a DIE does not have a parent,
> it should belong to "this" CU). If we want to check if the
assumption
> holds without my patch, I can implement the assertion and check on
> Xalan.
>
> If the assumption does not hold even without my patch, are we okay with
> using a worklist?
>
No, because we're talking about the wrong assumption, so far as I can tell.
If I'm understanding you correctly you chose an overly narrow assumption
(that something without a parent belongs to the current CU, where it might
belong to another CU we're adding it to) that, while possibly sufficient,
isn't necessary to avoid the worklist.

> If the assumption does hold without my patch, but does not hold with my
> patch, are we okay with using a worklist? If not, we will need to fix the
> 3000+ violations (which may belong to <10 categories).
>
I'm still not following you here. If you take one of those failures, remove
the assertion, and run the non-worklist code, does it produce the wrong
DWARF?

>
>
>>
>>> As I stated earlier, this assumption, in my opinion, is not really
>>> general, and if we only make this assumption to remove a worklist,
it may
>>> not worth it.
>>>
>>
>> There is still no discussion about "removing the worklist" -
the only
>> discussion is about whether we need to /add/ the worklist. It's up
to the
>> commiter to justify the addition of code, not for the community to
justify
>> its removal. (I'd say this is /almost/ even true of committed code
- if
>> something isn't tested, I'm likely to remove it and see what
breaks - if
>> someone cared about it, they should've provided tests for it).
>>
> Agree. I was under the impression we are trying to remove the worklist.
>
Sorry, perhaps a phrasing problem. My point is that there is no worklist to
remove - we haven't added one yet. We must justify the addition, not the
removal. Which means you must justify why it should be included in your
patch - I don't have to justify why it shouldn't be included. The
addition
of code/complexity must be justified by the demonstration, in the form of a
test, that shows that without that code we would produce the wrong answer.

Until we have such a test case in the patch, the worklist code should not
be committed.

>
>
>>
>>
>>> How can developers make sure this assumption is not later violated?
>>>
>>
>> We can add the kind of assertions we've discussed in this thread,
keeping
>> a mapping every time we rely on the assumption and checking the mapping
>> whenever we resolve the parent chain. I don't think this is
necessary, but
>> I wouldn't block such a change either.
>>
>>
>>> If we are going to make assumptions,
>>>
>>
>> We are always going to make assumptions - well, we're always going
to
>> have invariants in our design where any violation of those invariants
is a
>> bug.
>>
>>
>>> I will suggest making general assumptions such as
>>> 1> always use a CU to construct a DIE
>>> 2> Before constructing a DIE, figure out the owner CU first by
chasing
>>> the context chain, then call construction on the owner CU.
>>> 3> Before calling addDIEEntry in DwarfDebug, find the owner CU
first,
>>> then call addDIEEntry on the owner CU.
>>>
>>
>> I'm not sure how valuable this would be - perhaps highly, perhaps
not.
>> It'd require more work for your patch, though.
>>
>
> Agree. These 3 will help emit-one-CU-at-a-time. I was thinking if we are
> going to enforce the single assumption, it may be easier to enforce 3
> general assumptions which can make emit-one-CU-at-a-time easier.
>
I'm just going to ignore the cu-at-a-time work/subthread for now, it may be
best to discuss that separately, unless you want to decide on a design for
the cross-CU-dies that is forward-looking enough to simplify cu-at-a-time
(ie: by never mutating a prior CU after it has been constructed) but I
don't think that would be helpful.

- David
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20131016/e0e3acc9/attachment.html>

Manman Ren

2013-Oct-17 18:11 UTC

head link

[LLVMdev] [Debug Info PATCH] for support of ref_addr and removal of DIE duplication

On Wed, Oct 16, 2013 at 10:32 PM, David Blaikie <dblaikie at gmail.com>
wrote:
>
>
>
> On Wed, Oct 16, 2013 at 9:10 PM, Manman Ren <manman.ren at gmail.com>
wrote:
>
>>
>>
>>
>> On Wed, Oct 16, 2013 at 1:58 PM, David Blaikie <dblaikie at
gmail.com>wrote:
>>
>>>
>>>
>>>
>>> On Wed, Oct 16, 2013 at 1:21 PM, Manman Ren <manman.ren at
gmail.com>wrote:
>>>
>>>>
>>>> There are a few places where we break the assumption:
>>>> 1> formal_parameter constructed in DwarfDebug when adding
attribute type
>>>>      we call SPCU->addType(Arg, ATy), where Arg does not
belong to SPCU.
>>>> 2> inlined_subroutine constructed in DwarfDebug when adding
attribute
>>>> abstract_origin
>>>>      The inlined_subroutine does not belong to the CU we call
>>>> addDIEEntry on.
>>>>      We create the children of inlined_subroutine and call
addDIEEntry
>>>> before we add it to an owner.
>>>> 3> ...
>>>>
>>>> When building xalan with "lto -g", I saw 3158
violations of the
>>>> assumption. Should we try to fix all these violations to make
the
>>>> assumption true?
>>>>
>>>
>>> As you've said, this assumption (that DIEs are always
constructed by the
>>> CU that will own them) isn't necessary for this patch - it will
be
>>> necessary for cu-at-a-time emission. Perhaps we should discuss it
in a
>>> separate thread rather than trying to make a stronger assumption
than is
>>> needed for this patch.
>>>
>>> Of course without your patch the assumption is trivially correct
(well,
>>> given your point (1) above, perhaps it wasn't - maybe we do do
things out
>>> of order and from outside the CU) and we might want to say that
it's so
>>> important that your patch shouldn't violate it - but I'm
not sure that's
>>> necessary (Eric?). I haven't given it a great deal of thought.
>>>
>>
>>
>> Patches are reattached for convenience: remove DIE duplication with a
>> worklist (name it patch_with), remove DIE duplication without a
worklist
>> but an assertion when we emit a ref4, we make sure the DIE and the
>> referenced DIE belong to the same CU (name it patch_without).
>>
>
> OK, we're going round in circles here and I'm not sure there are
many
> other ways I can communicate things.
>
>
>> Without my patch, the assumption may be true, but it does not matter
>> since we should always use ref4.
>> I have provided some cases that the assertion fails with patch_without.
>>
>
> Yes, because it's a narrow assertion based on an assumption I've
explained
> is not true (due to adding DIEs to types after-the-fact, due to things like
> implicit special members, member templates, and nested types).
>
> The assertion you would need to check that a worklist is not required
> would be a mapping listing the "assumptions" (ie: we found this
parentless
> DIE and assumed it was in the same DIE as this other DIE we're
building)
> and then, whenever a DIE is added to a parent that chains up to a CU we
> would have to check that mapping to see if the assumption turned out to be
> true.
>
> I don't mind if you implement this or not.
>
> What I do mind and will not accept is committing the worklist version of
> this patch without justification, and by that I mean a demonstrated example
> where it is needed, not just where it causes an overly-narrow assertion to
> fail, but where it causes us to produce the wrong DWARF. If you have such
> an example, let's look at that and figure out how to fix that - it
might
> (probably isn't) not be the right answer to use a worklist, but simply
to
> rework the order of construction/registration/parenting.
>
Let's agree on what assumptions you are trying to make:
I have stated a few times: inside addDIEEntry, if a DIE does not have a
parent, it should belong to "this" CU
--> that is the assumption I made in the patch and I thought that you
wanted to verify against
Now you are talking about a different assumption: "we found this parentless
DIE and assumed it was in the same DIE as this other DIE we're
building",
please be specific on where we should update the mapping list and what you
mean by "this other DIE".

With my understanding of the assumption, I implemented
"patch_without",
where
 +void CompileUnit::addDIEEntry(DIE *Die, uint16_t Attribute, DIEEntry
*Entry) {
+  DIE *DieCU = Die->getCompileUnitOrNull();
+  DIE *EntryCU = Entry->getEntry()->getCompileUnitOrNull();
+  if (!DieCU)
+    // We assume that Die belongs to this CU, if it is not linked to any
CU yet.
+    DieCU = getCUDie();
+  if (!EntryCU)
+    EntryCU = getCUDie();
+  Die->addValue(Attribute, EntryCU == DieCU ? dwarf::DW_FORM_ref4 :
+                                              dwarf::DW_FORM_ref_addr,
Entry);
 }

If we made the correct assumptions inside addDIEEntry, we would make the
correct decision between ref4 and ref_addr, so when we emitting a ref4, the
two DIEs must belong to the same CU.
If when emitting a ref4, the two DIEs do not belong to the same CU, it
means we made the wrong assumptions inside addDIEEntry.
If when emitting a ref4, the two DIEs do not belong to the same CU, that
means we are generating wrong DWARF. Why is it an overly-narrow assertion?

Since you are the one to ask for testing cases against some assumptions,
please be specific on how you want addDIEEntry to look like without a
worklist.

Thanks,
Manman

>
>>
>> I didn't get a chance to implement another assertion we mentioned
earlier
>> (verify that inside addDIEEntry if a DIE does not have a parent,
>> it should belong to "this" CU). If we want to check if the
assumption
>> holds without my patch, I can implement the assertion and check on
>> Xalan.
>>
>> If the assumption does not hold even without my patch, are we okay with
>> using a worklist?
>>
>
> No, because we're talking about the wrong assumption, so far as I can
> tell. If I'm understanding you correctly you chose an overly narrow
> assumption (that something without a parent belongs to the current CU,
> where it might belong to another CU we're adding it to) that, while
> possibly sufficient, isn't necessary to avoid the worklist.
>
>
>> If the assumption does hold without my patch, but does not hold with my
>> patch, are we okay with using a worklist? If not, we will need to fix
the
>> 3000+ violations (which may belong to <10 categories).
>>
>
> I'm still not following you here. If you take one of those failures,
> remove the assertion, and run the non-worklist code, does it produce the
> wrong DWARF?
>
>
>>
>>
>>>
>>>> As I stated earlier, this assumption, in my opinion, is not
really
>>>> general, and if we only make this assumption to remove a
worklist, it may
>>>> not worth it.
>>>>
>>>
>>> There is still no discussion about "removing the
worklist" - the only
>>> discussion is about whether we need to /add/ the worklist. It's
up to the
>>> commiter to justify the addition of code, not for the community to
justify
>>> its removal. (I'd say this is /almost/ even true of committed
code - if
>>> something isn't tested, I'm likely to remove it and see
what breaks - if
>>> someone cared about it, they should've provided tests for it).
>>>
>> Agree. I was under the impression we are trying to remove the worklist.
>>
>
> Sorry, perhaps a phrasing problem. My point is that there is no worklist
> to remove - we haven't added one yet. We must justify the addition, not
the
> removal. Which means you must justify why it should be included in your
> patch - I don't have to justify why it shouldn't be included. The
addition
> of code/complexity must be justified by the demonstration, in the form of a
> test, that shows that without that code we would produce the wrong answer.
>
> Until we have such a test case in the patch, the worklist code should not
> be committed.
>
>
>>
>>
>>>
>>>
>>>> How can developers make sure this assumption is not later
violated?
>>>>
>>>
>>> We can add the kind of assertions we've discussed in this
thread,
>>> keeping a mapping every time we rely on the assumption and checking
the
>>> mapping whenever we resolve the parent chain. I don't think
this is
>>> necessary, but I wouldn't block such a change either.
>>>
>>>
>>>> If we are going to make assumptions,
>>>>
>>>
>>> We are always going to make assumptions - well, we're always
going to
>>> have invariants in our design where any violation of those
invariants is a
>>> bug.
>>>
>>>
>>>> I will suggest making general assumptions such as
>>>> 1> always use a CU to construct a DIE
>>>> 2> Before constructing a DIE, figure out the owner CU first
by chasing
>>>> the context chain, then call construction on the owner CU.
>>>> 3> Before calling addDIEEntry in DwarfDebug, find the owner
CU first,
>>>> then call addDIEEntry on the owner CU.
>>>>
>>>
>>> I'm not sure how valuable this would be - perhaps highly,
perhaps not.
>>> It'd require more work for your patch, though.
>>>
>>
>> Agree. These 3 will help emit-one-CU-at-a-time. I was thinking if we
are
>> going to enforce the single assumption, it may be easier to enforce 3
>> general assumptions which can make emit-one-CU-at-a-time easier.
>>
>
> I'm just going to ignore the cu-at-a-time work/subthread for now, it
may
> be best to discuss that separately, unless you want to decide on a design
> for the cross-CU-dies that is forward-looking enough to simplify
> cu-at-a-time (ie: by never mutating a prior CU after it has been
> constructed) but I don't think that would be helpful.
>
> - David
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20131017/4ee1beec/attachment.html>

Reasonably Related Threads

Search for more reasonably related threads

llvm dev - Oct 2013 - [LLVMdev] [Debug Info PATCH] for support of ref_addr and removal of DIE duplication

[LLVMdev] [Debug Info PATCH] for support of ref_addr and removal of DIE duplication

[LLVMdev] [Debug Info PATCH] for support of ref_addr and removal of DIE duplication

[LLVMdev] [Debug Info PATCH] for support of ref_addr and removal of DIE duplication

Reasonably Related Threads