thr3ads.net - llvm dev - [LLVMdev] Issues with clang-llvm debug info validity [Jun 2014]

If this information is useful, please help other people find it:
Share via:

Adrian Prantl

2014-Jun-27 20:49 UTC

[LLVMdev] Issues with clang-llvm debug info validity

I looked into this a little. My proposal is this:

- extend DINamespace with an extra UniqueID filed (analogous to DICompositeType)
- add an extra case to DIScope::getRef()
- Let CGDebugInfo create a mangled name for each namespace to use as UID
  -> the UID of the anonymous namespace is null.

For example:

namespace a { namespace b {} } -> “_ZN1a", “_ZN1a1b"

!0 = [compile unit]
!1 = metadata !{i32 786489, metadata !0, metadata !”_ZN1a", metadata
!”b", i32 1, metadata !”_ZN1a1b"} ; [ DW_TAG_namespace ] [a]
!2 = metadata !{i32 786489, metadata !0, null, metadata !”a", i32 1,
metadata !”_ZN1a"} ; [ DW_TAG_namespace ] [a] [line 1]

namespace a1b {} -> “_ZN3a1b"
namespace {} -> null (!= “”) // Anonymous namespaces will remain
old-fashioned MDNode references to prevent uniquing.

With this change we will get the full type uniquing benefits. As a side effect,
definitions that are in the same namespace will be children of the same
DW_TAG_namespace in the debug info, which I’d argue is the expected behavior for
LTO anyway.

-- adrian
> On Jun 24, 2014, at 4:39 PM, Adrian Prantl <aprantl at apple.com>
wrote:
> 
> 
>> On Jun 24, 2014, at 4:34 PM, Eric Christopher <echristo at
gmail.com> wrote:
>> 
>>>> Without needing to change how namespaces are handled.
>>>> 
>>>> Yes, using DIRefs for namespaces would cause the actual type
nodes to
>>>> be deduplicated when they aren't even with the patch above,
and that
>>>> would help reduce debug info metadata further if that's
>>>> useful/necessary.
>>> 
>>> Exactly. While the above patch will cover up the problem, we will
still have many duplicate type entries during LTO that will take up memory, even
though only the last one to be entered into DIRefMap will actually be used for
emitting DWARF. Since DIRefs are also more expensive in terms of lookup speed, I
think I’d prefer to only use them only where necessary (=as far down the MDNode
DAG as possible).
>>> 
>> 
>> Right, though, to be fair, some numbers here might be nice. The
>> tradeoff of "extra lookup for ref" versus "extra memory
allocation for
>> larger debug info”.
> 
> Then again, it’s not like the extra lookups introduced by David’s proposed
patch would get rid of those extra memory allocations. Granted, they do get rid
of the assertion, which obviously provides some value ;-)
> 
>> 
>> I also don't think it'll matter for speed :)
>> 
> You’re probably right about the lookup, but depending on the project you’re
building the memory usage could matter.
> 
> -- adrian
> 
>> -eric
>> 
>>>> 
>>>>> 
>>>>> Question to the C++ experts: Is the name (“test” in this
case) a sufficient appropriate unique ID for a C++ namespace?
>>>> 
>>>> Potentially - assuming you don't do this for anything in an
anonymous
>>>> namespace. I'd probably see about using the standard
mangling
>>> Good point! We should avoid unifying the anonymous namespace :-)
>>>> machinery as we do for other things, though, if possible. Might
keep
>>>> it more consistent.
>>> 
>>> I will do some experiments with nested namespaces, but it seems as
if name mangling does not buy us much: this will merely turn into N4test.
>>> 
>>> -- adrian
>>> 
>>>> 
>>>>> 
>>>>> -- adrian
>>>>> 
>>>>> 
>>>>>> On Jun 24, 2014, at 1:49 PM, David Blaikie <dblaikie
at gmail.com> wrote:
>>>>>> 
>>>>>> +Adrian & Manman,
>>>>>> 
>>>>>> Looks like this is a case of non-DIRef references
ending up in the IR,
>>>>>> and thus the references not being deduplicated. This
could lead to
>>>>>> some of the IR bloat that you guys implemented the
DIRef stuff to
>>>>>> reduce/avoid.
>>>>>> 
>>>>>> The specific issue is that the type is slightly
different in the two
>>>>>> TUs (since the namespace scope it's contained
within is different in
>>>>>> the two TUs - note the namespace preceeding the header
include in
>>>>>> test2.cpp) and the actual DIRef-usage failure is when
building
>>>>>> subroutine types. The subroutine types are made with
direct references
>>>>>> to types, not with DIRef references to types. This
means they won't be
>>>>>> deduplicated/resolved to a single type during linking
correctly.
>>>>>> 
>>>>>> It also leads to an assertion as per the original
email. Figured you
>>>>>> guys might want to look into/fix this (I assume the
right fix is to
>>>>>> use DIRef more pervasively in the frontend - I'm
doing a little
>>>>>> experimenting with clang now to see how that looks).
>>>>>> 
>>>>>> - David
>>>>>> 
>>>>>> On Fri, Jun 13, 2014 at 11:08 PM, David Blaikie
<dblaikie at gmail.com> wrote:
>>>>>>> Yep, looks like a legit bug to me. I'll try to
reproduce it and reduce
>>>>>>> a test case, etc.
>>>>>>> 
>>>>>>> (the ODR doesn't require that the two copies of
the Test class
>>>>>>> template appear on the same line in the same file
in two distinct
>>>>>>> translation units, it only requires that the two
copies have the same
>>>>>>> sequence of tokens - which they do in your example)
>>>>>>> 
>>>>>>> On Fri, Jun 13, 2014 at 5:33 PM, Kevin Modzelewski
<kmod at dropbox.com> wrote:
>>>>>>>> Hi all, sorry to post to both lists, but
I'm running into an issue where
>>>>>>>> clang-generated debug info is deemed to be
invalid by LLVM tools (throws an
>>>>>>>> assertion error in both llc and mcjit), and
I'm not sure what the proper
>>>>>>>> resolution is.
>>>>>>>> 
>>>>>>>> Here's a test case; I last tested it on
revision r210953:
>>>>>>>> 
>>>>>>>> $ cat test1.cpp
>>>>>>>> #include "test.h"
>>>>>>>> test::Test<int> foo1() {
>>>>>>>> return test::Test<int>();
>>>>>>>> }
>>>>>>>> 
>>>>>>>> $ cat test2.cpp
>>>>>>>> namespace test {
>>>>>>>> }
>>>>>>>> #include "test.h"
>>>>>>>> test::Test<int> foo2() {
>>>>>>>> return test::Test<int>();
>>>>>>>> }
>>>>>>>> 
>>>>>>>> $ cat test.h
>>>>>>>> namespace test {
>>>>>>>> template <class T>
>>>>>>>> class Test {
>>>>>>>> };
>>>>>>>> }
>>>>>>>> 
>>>>>>>> $ clang++ -g -emit-llvm test1.cpp -S -o
test1.ll
>>>>>>>> $ clang++ -g -emit-llvm test2.cpp -S -o
test2.ll
>>>>>>>> $ llvm-link test1.ll test2.ll -S -o linked.ll
>>>>>>>> $ llc linked.ll
>>>>>>>> 
>>>>>>>> The last step raises an assertion error,
assuming you have assertions
>>>>>>>> enabled:
>>>>>>>> llc: lib/CodeGen/AsmPrinter/DwarfUnit.cpp:953:
llvm::DIE*
>>>>>>>> llvm::DwarfUnit::getOrCreateTypeDIE(const
llvm::MDNode*): Assertion `Ty =>>>>>>>>
resolve(Ty.getRef()) && "type was not uniqued, possible ODR
violation."'
>>>>>>>> failed
>>>>>>>> 
>>>>>>>> The initial introduction of the
"test" namespace in test2.cpp seems to
>>>>>>>> confuse clang -- test1.cpp and test2.cpp have
non-compatible debug
>>>>>>>> information for the test::Test type.  test1.cpp
says that test::Test's
>>>>>>>> namespace is defined in test.h, and test2.cpp
says that test::Test's
>>>>>>>> namespace is defined in test2.cpp (they both
say that the class itself is
>>>>>>>> defined in test.h).  Regardless of whether LLVM
tools should allow this,
>>>>>>>> this seems like the wrong output for these
files?
>>>>>>>> 
>>>>>>>> Another related example, is if we simply
defined the test::Test template in
>>>>>>>> both files (ie, manually executed the #include
"test.h") and got rid of the
>>>>>>>> test2.cpp "namespace test {}"
definition, we'd run into the same assertion,
>>>>>>>> without hitting the weird clang behavior with
the empty namespace
>>>>>>>> pre-declaration.
>>>>>>>> 
>>>>>>>> It's probably also worth noting that
compiling directly with clang, ie
>>>>>>>> adding a "int main() {}" function and
then doing
>>>>>>>> $ clang++ -g test1.cpp test2.cpp -o test
>>>>>>>> works fine.
>>>>>>>> 
>>>>>>>> I'm not a C++ expert and I don't know
if these template issues really are
>>>>>>>> violations of the ODR, but it feels like this
should be allowed, and
>>>>>>>> somewhere in the llvm toolchain (llvm-link?
DwarfUnit.cpp?) this should be
>>>>>>>> handled.  What do people think?
>>>>>>>> 
>>>>>>>> kmod
>>>>> 
>>> 
>>> 
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> 
> 
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

David Blaikie

2014-Jun-27 20:58 UTC

head link

[LLVMdev] Issues with clang-llvm debug info validity

On Fri, Jun 27, 2014 at 1:49 PM, Adrian Prantl <aprantl at apple.com>
wrote:> I looked into this a little.
Thanks for taking a look!
> My proposal is this:
>
> - extend DINamespace with an extra UniqueID filed (analogous to
DICompositeType)
> - add an extra case to DIScope::getRef()
> - Let CGDebugInfo create a mangled name for each namespace to use as UID
>   -> the UID of the anonymous namespace is null.
>
> For example:
>
> namespace a { namespace b {} } -> “_ZN1a", “_ZN1a1b"
>
> !0 = [compile unit]
> !1 = metadata !{i32 786489, metadata !0, metadata !”_ZN1a", metadata
!”b", i32 1, metadata !”_ZN1a1b"} ; [ DW_TAG_namespace ] [a]
> !2 = metadata !{i32 786489, metadata !0, null, metadata !”a", i32 1,
metadata !”_ZN1a"} ; [ DW_TAG_namespace ] [a] [line 1]
>
> namespace a1b {} -> “_ZN3a1b"
> namespace {} -> null (!= “”) // Anonymous namespaces will remain
old-fashioned MDNode references to prevent uniquing.
Will this do the right thing if there's a named namespace inside or
outside the anonymous namespace?
> With this change we will get the full type uniquing benefits. As a side
effect, definitions that are in the same namespace will be children of the same
DW_TAG_namespace in the debug info, which I’d argue is the expected behavior for
LTO anyway.
Sort of - what happens to functions defined in that namespace? Will
they all end up in one CU with one instance of the namespace under
LTO? Could do weird things to address ranges? (would we end up with
the address range in the CU the function was originally in - but the
subprogram DIE ends up in some other CU that doesn't include that
address range)

- David
> -- adrian
>
>> On Jun 24, 2014, at 4:39 PM, Adrian Prantl <aprantl at apple.com>
wrote:
>>
>>
>>> On Jun 24, 2014, at 4:34 PM, Eric Christopher <echristo at
gmail.com> wrote:
>>>
>>>>> Without needing to change how namespaces are handled.
>>>>>
>>>>> Yes, using DIRefs for namespaces would cause the actual
type nodes to
>>>>> be deduplicated when they aren't even with the patch
above, and that
>>>>> would help reduce debug info metadata further if that's
>>>>> useful/necessary.
>>>>
>>>> Exactly. While the above patch will cover up the problem, we
will still have many duplicate type entries during LTO that will take up memory,
even though only the last one to be entered into DIRefMap will actually be used
for emitting DWARF. Since DIRefs are also more expensive in terms of lookup
speed, I think I’d prefer to only use them only where necessary (=as far down
the MDNode DAG as possible).
>>>>
>>>
>>> Right, though, to be fair, some numbers here might be nice. The
>>> tradeoff of "extra lookup for ref" versus "extra
memory allocation for
>>> larger debug info”.
>>
>> Then again, it’s not like the extra lookups introduced by David’s
proposed patch would get rid of those extra memory allocations. Granted, they do
get rid of the assertion, which obviously provides some value ;-)
>>
>>>
>>> I also don't think it'll matter for speed :)
>>>
>> You’re probably right about the lookup, but depending on the project
you’re building the memory usage could matter.
>>
>> -- adrian
>>
>>> -eric
>>>
>>>>>
>>>>>>
>>>>>> Question to the C++ experts: Is the name (“test” in
this case) a sufficient appropriate unique ID for a C++ namespace?
>>>>>
>>>>> Potentially - assuming you don't do this for anything
in an anonymous
>>>>> namespace. I'd probably see about using the standard
mangling
>>>> Good point! We should avoid unifying the anonymous namespace
:-)
>>>>> machinery as we do for other things, though, if possible.
Might keep
>>>>> it more consistent.
>>>>
>>>> I will do some experiments with nested namespaces, but it seems
as if name mangling does not buy us much: this will merely turn into N4test.
>>>>
>>>> -- adrian
>>>>
>>>>>
>>>>>>
>>>>>> -- adrian
>>>>>>
>>>>>>
>>>>>>> On Jun 24, 2014, at 1:49 PM, David Blaikie
<dblaikie at gmail.com> wrote:
>>>>>>>
>>>>>>> +Adrian & Manman,
>>>>>>>
>>>>>>> Looks like this is a case of non-DIRef references
ending up in the IR,
>>>>>>> and thus the references not being deduplicated.
This could lead to
>>>>>>> some of the IR bloat that you guys implemented the
DIRef stuff to
>>>>>>> reduce/avoid.
>>>>>>>
>>>>>>> The specific issue is that the type is slightly
different in the two
>>>>>>> TUs (since the namespace scope it's contained
within is different in
>>>>>>> the two TUs - note the namespace preceeding the
header include in
>>>>>>> test2.cpp) and the actual DIRef-usage failure is
when building
>>>>>>> subroutine types. The subroutine types are made
with direct references
>>>>>>> to types, not with DIRef references to types. This
means they won't be
>>>>>>> deduplicated/resolved to a single type during
linking correctly.
>>>>>>>
>>>>>>> It also leads to an assertion as per the original
email. Figured you
>>>>>>> guys might want to look into/fix this (I assume the
right fix is to
>>>>>>> use DIRef more pervasively in the frontend -
I'm doing a little
>>>>>>> experimenting with clang now to see how that
looks).
>>>>>>>
>>>>>>> - David
>>>>>>>
>>>>>>> On Fri, Jun 13, 2014 at 11:08 PM, David Blaikie
<dblaikie at gmail.com> wrote:
>>>>>>>> Yep, looks like a legit bug to me. I'll try
to reproduce it and reduce
>>>>>>>> a test case, etc.
>>>>>>>>
>>>>>>>> (the ODR doesn't require that the two
copies of the Test class
>>>>>>>> template appear on the same line in the same
file in two distinct
>>>>>>>> translation units, it only requires that the
two copies have the same
>>>>>>>> sequence of tokens - which they do in your
example)
>>>>>>>>
>>>>>>>> On Fri, Jun 13, 2014 at 5:33 PM, Kevin
Modzelewski <kmod at dropbox.com> wrote:
>>>>>>>>> Hi all, sorry to post to both lists, but
I'm running into an issue where
>>>>>>>>> clang-generated debug info is deemed to be
invalid by LLVM tools (throws an
>>>>>>>>> assertion error in both llc and mcjit), and
I'm not sure what the proper
>>>>>>>>> resolution is.
>>>>>>>>>
>>>>>>>>> Here's a test case; I last tested it on
revision r210953:
>>>>>>>>>
>>>>>>>>> $ cat test1.cpp
>>>>>>>>> #include "test.h"
>>>>>>>>> test::Test<int> foo1() {
>>>>>>>>> return test::Test<int>();
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> $ cat test2.cpp
>>>>>>>>> namespace test {
>>>>>>>>> }
>>>>>>>>> #include "test.h"
>>>>>>>>> test::Test<int> foo2() {
>>>>>>>>> return test::Test<int>();
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> $ cat test.h
>>>>>>>>> namespace test {
>>>>>>>>> template <class T>
>>>>>>>>> class Test {
>>>>>>>>> };
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> $ clang++ -g -emit-llvm test1.cpp -S -o
test1.ll
>>>>>>>>> $ clang++ -g -emit-llvm test2.cpp -S -o
test2.ll
>>>>>>>>> $ llvm-link test1.ll test2.ll -S -o
linked.ll
>>>>>>>>> $ llc linked.ll
>>>>>>>>>
>>>>>>>>> The last step raises an assertion error,
assuming you have assertions
>>>>>>>>> enabled:
>>>>>>>>> llc:
lib/CodeGen/AsmPrinter/DwarfUnit.cpp:953: llvm::DIE*
>>>>>>>>> llvm::DwarfUnit::getOrCreateTypeDIE(const
llvm::MDNode*): Assertion `Ty =>>>>>>>>>
resolve(Ty.getRef()) && "type was not uniqued, possible ODR
violation."'
>>>>>>>>> failed
>>>>>>>>>
>>>>>>>>> The initial introduction of the
"test" namespace in test2.cpp seems to
>>>>>>>>> confuse clang -- test1.cpp and test2.cpp
have non-compatible debug
>>>>>>>>> information for the test::Test type. 
test1.cpp says that test::Test's
>>>>>>>>> namespace is defined in test.h, and
test2.cpp says that test::Test's
>>>>>>>>> namespace is defined in test2.cpp (they
both say that the class itself is
>>>>>>>>> defined in test.h).  Regardless of whether
LLVM tools should allow this,
>>>>>>>>> this seems like the wrong output for these
files?
>>>>>>>>>
>>>>>>>>> Another related example, is if we simply
defined the test::Test template in
>>>>>>>>> both files (ie, manually executed the
#include "test.h") and got rid of the
>>>>>>>>> test2.cpp "namespace test {}"
definition, we'd run into the same assertion,
>>>>>>>>> without hitting the weird clang behavior
with the empty namespace
>>>>>>>>> pre-declaration.
>>>>>>>>>
>>>>>>>>> It's probably also worth noting that
compiling directly with clang, ie
>>>>>>>>> adding a "int main() {}" function
and then doing
>>>>>>>>> $ clang++ -g test1.cpp test2.cpp -o test
>>>>>>>>> works fine.
>>>>>>>>>
>>>>>>>>> I'm not a C++ expert and I don't
know if these template issues really are
>>>>>>>>> violations of the ODR, but it feels like
this should be allowed, and
>>>>>>>>> somewhere in the llvm toolchain (llvm-link?
DwarfUnit.cpp?) this should be
>>>>>>>>> handled.  What do people think?
>>>>>>>>>
>>>>>>>>> kmod
>>>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

Adrian Prantl

2014-Jun-27 21:15 UTC

head link

[LLVMdev] Issues with clang-llvm debug info validity

> On Jun 27, 2014, at 1:58 PM, David Blaikie <dblaikie at gmail.com>
wrote:
> 
> On Fri, Jun 27, 2014 at 1:49 PM, Adrian Prantl <aprantl at apple.com>
wrote:
>> I looked into this a little.
> 
> Thanks for taking a look!
> 
>> My proposal is this:
>> 
>> - extend DINamespace with an extra UniqueID filed (analogous to
DICompositeType)
>> - add an extra case to DIScope::getRef()
>> - Let CGDebugInfo create a mangled name for each namespace to use as
UID
>>  -> the UID of the anonymous namespace is null.
>> 
>> For example:
>> 
>> namespace a { namespace b {} } -> “_ZN1a", “_ZN1a1b"
>> 
>> !0 = [compile unit]
>> !1 = metadata !{i32 786489, metadata !0, metadata !”_ZN1a",
metadata !”b", i32 1, metadata !”_ZN1a1b"} ; [ DW_TAG_namespace ] [a]
>> !2 = metadata !{i32 786489, metadata !0, null, metadata !”a", i32
1, metadata !”_ZN1a"} ; [ DW_TAG_namespace ] [a] [line 1]
>> 
>> namespace a1b {} -> “_ZN3a1b"
>> namespace {} -> null (!= “”) // Anonymous namespaces will remain
old-fashioned MDNode references to prevent uniquing.
> 
> Will this do the right thing if there's a named namespace inside or
> outside the anonymous namespace?
This is actually really tricky. Normally the ValueSymbolTable takes care of
adding a unique suffix to all conflicting symbols. i think the most reasonable
thing to do here is to add a rule:
  -> the UID of any namespace that has an anonymous namespace in its parent
chain is null.
> 
>> With this change we will get the full type uniquing benefits. As a side
effect, definitions that are in the same namespace will be children of the same
DW_TAG_namespace in the debug info, which I’d argue is the expected behavior for
LTO anyway.
> 
> Sort of - what happens to functions defined in that namespace? Will
> they all end up in one CU with one instance of the namespace under
> LTO? Could do weird things to address ranges? (would we end up with
> the address range in the CU the function was originally in - but the
> subprogram DIE ends up in some other CU that doesn't include that
> address range)
I think you just convinced me to save this for later and clone a namespace for
every compile unit instead. The decl_file/decl_line will have to identical for
all CUs though (and come from the module linked in first).

thanks!
adrian
> 
> - David
> 
>> -- adrian
>> 
>>> On Jun 24, 2014, at 4:39 PM, Adrian Prantl <aprantl at
apple.com> wrote:
>>> 
>>> 
>>>> On Jun 24, 2014, at 4:34 PM, Eric Christopher <echristo at
gmail.com> wrote:
>>>> 
>>>>>> Without needing to change how namespaces are handled.
>>>>>> 
>>>>>> Yes, using DIRefs for namespaces would cause the actual
type nodes to
>>>>>> be deduplicated when they aren't even with the
patch above, and that
>>>>>> would help reduce debug info metadata further if
that's
>>>>>> useful/necessary.
>>>>> 
>>>>> Exactly. While the above patch will cover up the problem,
we will still have many duplicate type entries during LTO that will take up
memory, even though only the last one to be entered into DIRefMap will actually
be used for emitting DWARF. Since DIRefs are also more expensive in terms of
lookup speed, I think I’d prefer to only use them only where necessary (=as far
down the MDNode DAG as possible).
>>>>> 
>>>> 
>>>> Right, though, to be fair, some numbers here might be nice. The
>>>> tradeoff of "extra lookup for ref" versus "extra
memory allocation for
>>>> larger debug info”.
>>> 
>>> Then again, it’s not like the extra lookups introduced by David’s
proposed patch would get rid of those extra memory allocations. Granted, they do
get rid of the assertion, which obviously provides some value ;-)
>>> 
>>>> 
>>>> I also don't think it'll matter for speed :)
>>>> 
>>> You’re probably right about the lookup, but depending on the
project you’re building the memory usage could matter.
>>> 
>>> -- adrian
>>> 
>>>> -eric
>>>> 
>>>>>> 
>>>>>>> 
>>>>>>> Question to the C++ experts: Is the name (“test” in
this case) a sufficient appropriate unique ID for a C++ namespace?
>>>>>> 
>>>>>> Potentially - assuming you don't do this for
anything in an anonymous
>>>>>> namespace. I'd probably see about using the
standard mangling
>>>>> Good point! We should avoid unifying the anonymous
namespace :-)
>>>>>> machinery as we do for other things, though, if
possible. Might keep
>>>>>> it more consistent.
>>>>> 
>>>>> I will do some experiments with nested namespaces, but it
seems as if name mangling does not buy us much: this will merely turn into
N4test.
>>>>> 
>>>>> -- adrian
>>>>> 
>>>>>> 
>>>>>>> 
>>>>>>> -- adrian
>>>>>>> 
>>>>>>> 
>>>>>>>> On Jun 24, 2014, at 1:49 PM, David Blaikie
<dblaikie at gmail.com> wrote:
>>>>>>>> 
>>>>>>>> +Adrian & Manman,
>>>>>>>> 
>>>>>>>> Looks like this is a case of non-DIRef
references ending up in the IR,
>>>>>>>> and thus the references not being deduplicated.
This could lead to
>>>>>>>> some of the IR bloat that you guys implemented
the DIRef stuff to
>>>>>>>> reduce/avoid.
>>>>>>>> 
>>>>>>>> The specific issue is that the type is slightly
different in the two
>>>>>>>> TUs (since the namespace scope it's
contained within is different in
>>>>>>>> the two TUs - note the namespace preceeding the
header include in
>>>>>>>> test2.cpp) and the actual DIRef-usage failure
is when building
>>>>>>>> subroutine types. The subroutine types are made
with direct references
>>>>>>>> to types, not with DIRef references to types.
This means they won't be
>>>>>>>> deduplicated/resolved to a single type during
linking correctly.
>>>>>>>> 
>>>>>>>> It also leads to an assertion as per the
original email. Figured you
>>>>>>>> guys might want to look into/fix this (I assume
the right fix is to
>>>>>>>> use DIRef more pervasively in the frontend -
I'm doing a little
>>>>>>>> experimenting with clang now to see how that
looks).
>>>>>>>> 
>>>>>>>> - David
>>>>>>>> 
>>>>>>>> On Fri, Jun 13, 2014 at 11:08 PM, David Blaikie
<dblaikie at gmail.com> wrote:
>>>>>>>>> Yep, looks like a legit bug to me. I'll
try to reproduce it and reduce
>>>>>>>>> a test case, etc.
>>>>>>>>> 
>>>>>>>>> (the ODR doesn't require that the two
copies of the Test class
>>>>>>>>> template appear on the same line in the
same file in two distinct
>>>>>>>>> translation units, it only requires that
the two copies have the same
>>>>>>>>> sequence of tokens - which they do in your
example)
>>>>>>>>> 
>>>>>>>>> On Fri, Jun 13, 2014 at 5:33 PM, Kevin
Modzelewski <kmod at dropbox.com> wrote:
>>>>>>>>>> Hi all, sorry to post to both lists,
but I'm running into an issue where
>>>>>>>>>> clang-generated debug info is deemed to
be invalid by LLVM tools (throws an
>>>>>>>>>> assertion error in both llc and mcjit),
and I'm not sure what the proper
>>>>>>>>>> resolution is.
>>>>>>>>>> 
>>>>>>>>>> Here's a test case; I last tested
it on revision r210953:
>>>>>>>>>> 
>>>>>>>>>> $ cat test1.cpp
>>>>>>>>>> #include "test.h"
>>>>>>>>>> test::Test<int> foo1() {
>>>>>>>>>> return test::Test<int>();
>>>>>>>>>> }
>>>>>>>>>> 
>>>>>>>>>> $ cat test2.cpp
>>>>>>>>>> namespace test {
>>>>>>>>>> }
>>>>>>>>>> #include "test.h"
>>>>>>>>>> test::Test<int> foo2() {
>>>>>>>>>> return test::Test<int>();
>>>>>>>>>> }
>>>>>>>>>> 
>>>>>>>>>> $ cat test.h
>>>>>>>>>> namespace test {
>>>>>>>>>> template <class T>
>>>>>>>>>> class Test {
>>>>>>>>>> };
>>>>>>>>>> }
>>>>>>>>>> 
>>>>>>>>>> $ clang++ -g -emit-llvm test1.cpp -S -o
test1.ll
>>>>>>>>>> $ clang++ -g -emit-llvm test2.cpp -S -o
test2.ll
>>>>>>>>>> $ llvm-link test1.ll test2.ll -S -o
linked.ll
>>>>>>>>>> $ llc linked.ll
>>>>>>>>>> 
>>>>>>>>>> The last step raises an assertion
error, assuming you have assertions
>>>>>>>>>> enabled:
>>>>>>>>>> llc:
lib/CodeGen/AsmPrinter/DwarfUnit.cpp:953: llvm::DIE*
>>>>>>>>>>
llvm::DwarfUnit::getOrCreateTypeDIE(const llvm::MDNode*): Assertion `Ty
=>>>>>>>>>> resolve(Ty.getRef()) &&
"type was not uniqued, possible ODR violation."'
>>>>>>>>>> failed
>>>>>>>>>> 
>>>>>>>>>> The initial introduction of the
"test" namespace in test2.cpp seems to
>>>>>>>>>> confuse clang -- test1.cpp and
test2.cpp have non-compatible debug
>>>>>>>>>> information for the test::Test type. 
test1.cpp says that test::Test's
>>>>>>>>>> namespace is defined in test.h, and
test2.cpp says that test::Test's
>>>>>>>>>> namespace is defined in test2.cpp (they
both say that the class itself is
>>>>>>>>>> defined in test.h).  Regardless of
whether LLVM tools should allow this,
>>>>>>>>>> this seems like the wrong output for
these files?
>>>>>>>>>> 
>>>>>>>>>> Another related example, is if we
simply defined the test::Test template in
>>>>>>>>>> both files (ie, manually executed the
#include "test.h") and got rid of the
>>>>>>>>>> test2.cpp "namespace test {}"
definition, we'd run into the same assertion,
>>>>>>>>>> without hitting the weird clang
behavior with the empty namespace
>>>>>>>>>> pre-declaration.
>>>>>>>>>> 
>>>>>>>>>> It's probably also worth noting
that compiling directly with clang, ie
>>>>>>>>>> adding a "int main() {}"
function and then doing
>>>>>>>>>> $ clang++ -g test1.cpp test2.cpp -o
test
>>>>>>>>>> works fine.
>>>>>>>>>> 
>>>>>>>>>> I'm not a C++ expert and I
don't know if these template issues really are
>>>>>>>>>> violations of the ODR, but it feels
like this should be allowed, and
>>>>>>>>>> somewhere in the llvm toolchain
(llvm-link? DwarfUnit.cpp?) this should be
>>>>>>>>>> handled.  What do people think?
>>>>>>>>>> 
>>>>>>>>>> kmod
>>>>>>> 
>>>>> 
>>>>> 
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>> 
>>> 
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>

llvm dev - Jun 2014 - [LLVMdev] Issues with clang-llvm debug info validity

[LLVMdev] Issues with clang-llvm debug info validity

[LLVMdev] Issues with clang-llvm debug info validity

[LLVMdev] Issues with clang-llvm debug info validity