thr3ads.net - llvm dev - [llvm-dev] [cfe-dev] RFC: Up front type information generation in clang and llvm [May 2016]

Smith, Kevin B via llvm-dev

2016-May-11 17:51 UTC

[llvm-dev] [cfe-dev] RFC: Up front type information generation in clang and llvm

>-----Original Message-----
>From: cfe-dev [mailto:cfe-dev-bounces at lists.llvm.org] On Behalf Of Reid Kleckner via cfe-dev
>Sent: Wednesday, May 11, 2016 10:40 AM
>To: Mehdi Amini <mehdi.amini at apple.com>
>Cc: llvm-dev <llvm-dev at lists.llvm.org>; Clang Dev <cfe-dev at lists.llvm.org>
>Subject: Re: [cfe-dev] [llvm-dev] RFC: Up front type information generation in clang and llvm
>
>Responses to Mehdi and Eric below.
>
>On Wed, Apr 27, 2016 at 4:53 PM, Eric Christopher <echristo at gmail.com>
>wrote:
>> I don't agree in general here because of:
>>
>> a) maintainability - there isn't a one true path through things and now is
>> scattering more windows knowledge through debug info and lto
>
>There was never going to be one true way to generate LLVM debug info
>for both formats. We need some help from the frontend.

I believe that Amjad Aboud has argued several times that there could be one true
way to generate LLVM debug info such that both Windows and DWARF debug info
could be generated from it. I know for a fact that within the Intel Compiler the
FE generates a single debug info representation, which then gets translated into
either MS PDB format or DWARF, depending on the target platform.

Architecturally, that is very desirable. You really do not want every FE to have
to know about, and generate, different debug info depending on whether it is
targeting Windows or a DWARF-enabled target, do you?

>
>> b) higher bar for implementing similar dwarf functionality - there's nothing
>> here that makes it at any point better for our general debug info support.
>> Incrementally updating to an intermediate step is much easier and a lower
>> bar than needing to implement everything up to and including a format aware
>> linker and support that through ThinLTO, the JIT, and full LTO.
>
>I claim that everything does not have to be format aware. All it has
>to do is call out to a library which is format aware. We can come up
>with reasonable high-level abstractions for operations that we'll want
>to do on types, such as "extract this type and everything it
>references".
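
The operation Reid names, "extract this type and everything it references", amounts to a reachability walk over the type graph. The following is only a minimal sketch under stated assumptions: the `TypeGraph` mapping and the function name are hypothetical, and a real format-aware library would discover references by parsing DWARF or CodeView records rather than reading a dictionary.

```python
from typing import Dict, List, Set

# Hypothetical type graph: each type name maps to the names it references.
# A format-aware library would compute these edges from the real records.
TypeGraph = Dict[str, List[str]]


def extract_with_references(graph: TypeGraph, root: str) -> List[str]:
    """Return root plus every type reachable from it, in first-visit order."""
    seen: Set[str] = set()
    order: List[str] = []
    stack = [root]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        order.append(name)
        # Push in reverse so references are visited in declaration order.
        stack.extend(reversed(graph.get(name, [])))
    return order


graph = {
    "Widget": ["Point", "int"],
    "Point": ["int", "int"],
    "Unrelated": ["float"],
}
closure = extract_with_references(graph, "Widget")
```

The caller never inspects a record; it only asks for the closure, which is the sense in which the surrounding code would not need to be format aware.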
>
>> c) if there's no reason to do this for dwarf there's no reason to do it for
>> windows. The existing proposal was a way to get you type emission in the
>> front end so that you'd have to do less work. Ultimately though I don't see
>> a reason to do this if all of the platforms don't look the same.
>
>There are reasons to do this for DWARF, but they are not compelling
>enough to do a total rewrite of our type information support.
>
>> d) ThinLTO/ORC won't support the debug info you have in your proposal right
>> now without patches
>>
>> e) You're regressing LTO linking performance hugely for windows with debug
>> until you write the patches that enable format aware linking of code view
>> information
>
>The way I see it, there is no existing CodeView debug info
>functionality to regress for any of ORC, LTO, or ThinLTO. Apparently
>we don't see this the same way.
>
>And I've already written the patch to do type merging:
>http://reviews.llvm.org/D20122 Regular LTO can call this code, and
>rewrite the DITypeIndex numbers with the map produced. While this may
>not be directly applicable to ORC and ThinLTO, I don't expect that
>supporting them will be much more work.
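
The merge-then-rewrite step described here (merge the type records of several modules, then rewrite each type index through the map the merge produces) can be sketched roughly as below. This is not the code in D20122. The `Record` shape, the function name, and the assumption that a record refers only to records that appear earlier in its own stream are all illustrative.

```python
from typing import Dict, List, Tuple

# Hypothetical type record: a kind plus the indices of the records it refers to.
Record = Tuple[str, Tuple[int, ...]]


def merge_types(
    dest: List[Record], dest_lookup: Dict[Record, int], src: List[Record]
) -> Dict[int, int]:
    """Append src records to dest, deduplicating; return the old->new index map."""
    remap: Dict[int, int] = {}
    for old_index, (kind, refs) in enumerate(src):
        # Assumed invariant: refs point at earlier records, already remapped.
        rewritten = (kind, tuple(remap[r] for r in refs))
        new_index = dest_lookup.get(rewritten)
        if new_index is None:
            new_index = len(dest)
            dest.append(rewritten)
            dest_lookup[rewritten] = new_index
        remap[old_index] = new_index
    return remap


merged: List[Record] = []
lookup: Dict[Record, int] = {}
module_a = [("int", ()), ("pointer", (0,))]
module_b = [("float", ()), ("int", ()), ("pointer", (1,))]
map_a = merge_types(merged, lookup, module_a)
map_b = merge_types(merged, lookup, module_b)
```

After both calls, a linker-like client would pass every type index stored in module B's IR through `map_b`, so that references to B's "pointer to int" land on the single merged copy.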
>
>
>
>On Tue, May 10, 2016 at 11:32 PM, Mehdi Amini via cfe-dev
><cfe-dev at lists.llvm.org> wrote:
>> On the other hand, it seems that what you're proposing is basically
>> "optimized" for "type units" (which are not supported on Darwin anyway) and
>> the only advantage we could see is to have an easy way of type-uniquing
>> directly in the IR.
>
>Splitting up the type information into opaque units lets you do
>format-agnostic type uniquing, but it doesn't let you extract forward
>declarations like ThinLTO wants to do.
>
>> Our conclusion was that for us, a single type blob with some kind of "smart
>> reference" to be able to point inside the blob from the outside is the most
>> efficient thing we can build upon. However the cost/benefit of getting
>> there is too high for us to prioritize working on this at this point.
>> (If I misrepresented anything, please Adrian/Duncan/Fred correct me)
>
>Yeah, this is kind of where I am. Having one blob per module is
>probably the most efficient thing possible that I could do for
>CodeView, but I estimate that the cost of also doing it for DWARF is
>very high. We have a lot of dependencies on the existing
>representation. We can attempt to try and generalize up-front emission
>to DWARF, but I think if we don't pay the full cost, we will end up
>with something half-baked for DWARF. I don't think I have the time to
>do it justice.
>
>Speaking of the idea of smart references that point out of the IR into
>separate type info, my current approach (DITypeIndex) is very
>CV-specific. However, I think if we allow one kind of smart reference,
>we can add support for more, and they can be format-specific. As long
>as we're OK making DITypeRefs opaque, adding new kinds of type refs is
>cheap.

Reid Kleckner via llvm-dev

2016-May-11 18:18 UTC

[llvm-dev] [cfe-dev] RFC: Up front type information generation in clang and llvm

On Wed, May 11, 2016 at 10:51 AM, Smith, Kevin B
<kevin.b.smith at intel.com> wrote:
> I believe that Amjad Aboud has argued several times that there could be one
> true way to generate LLVM debug info such that both windows and DWARF debug
> info could be generated from it.  I know for a fact that within the Intel
> Compiler that the FE generates a single set of debug info representation,
> that then gets translated into either MS PDB format, or DWARF depending on
> the target platform.
>
> Architecturally, that is very desirable. You really do not want to have every
> FE have to know about, and generate different debug info depending on whether
> they are targeting windows or a DWARF enabled target, do you?

If we go with the existing metadata representation, we will need to
extend it to be the union of DWARF and CodeView, and that will require
frontends to feed us more information specific to CodeView. In other
words, "we need help from the frontend." Depending on your
perspective, you could see this as spreading Windows knowledge across
the codebase.

I think extending the DI metadata is definitely workable. As you say,
it is obviously very useful for other frontends. I just feel that the
representation shift is needlessly inefficient and stands in our way
when we need to express things that it can't yet represent.

---

Anyway, at this point, many people have concerns about this idea, so I
think it would be best to move forward on CV by extending the DI
metadata. We can come back and revisit up front emission at some point
in the future. If we can demonstrate big compile-time and QoI
improvements, it might be worth supporting both approaches.

Mehdi Amini via llvm-dev

2016-May-11 18:29 UTC

[llvm-dev] [cfe-dev] RFC: Up front type information generation in clang and llvm

> On May 11, 2016, at 11:18 AM, Reid Kleckner <rnk at google.com> wrote:
>
> On Wed, May 11, 2016 at 10:51 AM, Smith, Kevin B
> <kevin.b.smith at intel.com> wrote:
>> I believe that Amjad Aboud has argued several times that there could be one
>> true way to generate LLVM debug info such that both windows and DWARF debug
>> info could be generated from it.  I know for a fact that within the Intel
>> Compiler that the FE generates a single set of debug info representation,
>> that then gets translated into either MS PDB format, or DWARF depending on
>> the target platform.
>>
>> Architecturally, that is very desirable. You really do not want to have every
>> FE have to know about, and generate different debug info depending on whether
>> they are targeting windows or a DWARF enabled target, do you?
> 
> If we go with the existing metadata representation, we will need to
> extend it to be the union of DWARF and CodeView, and that will require
> frontends to feed us more information specific to CodeView. In other
> words, "we need help from the frontend." Depending on your
> perspective, you could see this as spreading Windows knowledge across
> the codebase.
> 
> I think extending the DI metadata is definitely workable. As you say,
> it is obviously very useful for other frontends. I just feel that the
> representation shift is needlessly inefficient and stands in our way
> when we need to express things that it can't yet represent.

This is a bit blurry to me, as the two questions seem orthogonal: the fact that
there is an interface exposed to the frontends to emit debug info should be
almost independent from where we actually emit the blob.
So yes, such an interface would require the frontends to expose the union of the
information needed to emit DWARF and CodeView, but it does not imply that the
metadata representation needs to be extended (i.e. behind such an interface you
could get the current metadata for DWARF and the single blob for CodeView).
Did I miss something?

-- 
Mehdi
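
The separation argued for above, one frontend-facing interface with the storage decision hidden behind it, might look roughly like the sketch below. Every name here (`DebugTypeBuilder`, `MetadataBuilder`, `BlobBuilder`, `add_struct`) is hypothetical, and the text serialization is a toy stand-in, not the CodeView record format or LLVM's metadata.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple

Field = Tuple[str, str]  # (field name, field type name)


class DebugTypeBuilder(ABC):
    """Hypothetical interface a frontend would call to describe types."""

    @abstractmethod
    def add_struct(self, name: str, fields: List[Field]) -> object:
        """Record a struct type and return an opaque handle to it."""


class MetadataBuilder(DebugTypeBuilder):
    """Keeps each type as an individual node, in the spirit of a metadata graph."""

    def __init__(self) -> None:
        self.nodes: Dict[str, List[Field]] = {}

    def add_struct(self, name: str, fields: List[Field]) -> object:
        self.nodes[name] = list(fields)
        return name


class BlobBuilder(DebugTypeBuilder):
    """Serializes each type immediately into one growing blob."""

    def __init__(self) -> None:
        self.blob = bytearray()
        self.count = 0

    def add_struct(self, name: str, fields: List[Field]) -> object:
        # Toy text encoding; a real backend would write format-specific records.
        record = name + "{" + ",".join(f"{n}:{t}" for n, t in fields) + "}\n"
        self.blob += record.encode("utf-8")
        index = self.count
        self.count += 1
        return index


def emit_point(builder: DebugTypeBuilder) -> object:
    # Frontend code is identical whichever builder sits behind the interface.
    return builder.add_struct("Point", [("x", "int"), ("y", "int")])


md = MetadataBuilder()
cv = BlobBuilder()
md_handle = emit_point(md)
cv_handle = emit_point(cv)
```

In this arrangement the frontend supplies the union of information once, and whether it ends up as per-type nodes or as an up-front blob is a choice made behind the interface.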
