Smith, Kevin B via llvm-dev
2016-May-11 17:51 UTC
[llvm-dev] [cfe-dev] RFC: Up front type information generation in clang and llvm
>-----Original Message-----
>From: cfe-dev [mailto:cfe-dev-bounces at lists.llvm.org] On Behalf Of Reid
>Kleckner via cfe-dev
>Sent: Wednesday, May 11, 2016 10:40 AM
>To: Mehdi Amini <mehdi.amini at apple.com>
>Cc: llvm-dev <llvm-dev at lists.llvm.org>; Clang Dev <cfe-dev at lists.llvm.org>
>Subject: Re: [cfe-dev] [llvm-dev] RFC: Up front type information generation in
>clang and llvm
>
>Responses to Mehdi and Eric below.
>
>On Wed, Apr 27, 2016 at 4:53 PM, Eric Christopher <echristo at gmail.com>
>wrote:
>> I don't agree in general here because of:
>>
>> a) maintainability - there isn't a one true path through things and now is
>> scattering more windows knowledge through debug info and lto
>
>There was never going to be one true way to generate LLVM debug info
>for both formats. We need some help from the frontend.

I believe that Amjad Aboud has argued several times that there could be one true way to generate LLVM debug info such that both Windows and DWARF debug info could be generated from it. I know for a fact that within the Intel Compiler the FE generates a single debug info representation, which then gets translated into either MS PDB format or DWARF, depending on the target platform.

Architecturally, that is very desirable. You really do not want every FE to have to know about, and generate, different debug info depending on whether it is targeting Windows or a DWARF-enabled target, do you?

>
>> b) higher bar for implementing similar dwarf functionality - there's nothing
>> here that makes it at any point better for our general debug info support.
>> Incrementally updating to an intermediate step is much easier and a lower
>> bar than needing to implement everything up to and including a format aware
>> linker and support that through ThinLTO, the JIT, and full LTO.
>
>I claim that everything does not have to be format aware. All it has
>to do is call out to a library which is format aware. We can come up
>with reasonable high-level abstractions for operations that we'll want
>to do on types, such as "extract this type and everything it
>references".
>
>> c) if there's no reason to do this for dwarf there's no reason to do it for
>> windows. The existing proposal was a way to get you type emission in the
>> front end so that you'd have to do less work. Ultimately though I don't see
>> a reason to do this if all of the platforms don't look the same.
>
>There are reasons to do this for DWARF, but they are not compelling
>enough to do a total rewrite of our type information support.
>
>> d) ThinLTO/ORC won't support the debug info you have in your proposal right
>> now without patches
>>
>> e) You're regressing LTO linking performance hugely for windows with debug
>> until you write the patches that enable format aware linking of code view
>> information
>
>The way I see it, there is no existing CodeView debug info
>functionality to regress for any of ORC, LTO, or ThinLTO. Apparently
>we don't see this the same way.
>
>And I've already written the patch to do type merging:
>http://reviews.llvm.org/D20122 Regular LTO can call this code, and
>rewrite the DITypeIndex numbers with the map produced. While this may
>not be directly applicable to ORC and ThinLTO, I don't expect that
>supporting them will be much more work.
>
>
>
>On Tue, May 10, 2016 at 11:32 PM, Mehdi Amini via cfe-dev
><cfe-dev at lists.llvm.org> wrote:
>> On the other hand, it seems that what you're proposing is basically
>> "optimized" for "type units" (which are not supported on Darwin anyway) and
>> the only advantage we could see is to have an easy way of type-uniquing
>> directly in the IR.
>
>Splitting up the type information into opaque units lets you do
>format-agnostic type uniquing, but it doesn't let you extract forward
>declarations like ThinLTO wants to do.
>
>> Our conclusion was that for us, a single type blob with somehow "smart
>> reference" to be able to point inside the blob from the outside is the most
>> efficient things we can built upon. However the cost/benefit of getting
>> there is too high for us to prioritize working this at this point.
>> (If I misrepresented anything, please Adrian/Duncan/Fred correct me)
>
>Yeah, this is kind of where I am. Having one blob per module is
>probably the most efficient thing possible that I could do for
>CodeView, but I estimate that the cost of also doing it for DWARF is
>very high. We have a lot of dependencies on the existing
>representation. We can attempt to try and generalize up-front emission
>to DWARF, but I think if we don't pay the full cost, we will end up
>with something half-baked for DWARF. I don't think I have the time to
>do it justice.
>
>Speaking of the idea of smart references that point out of the IR into
>separate type info, my current approach (DITypeIndex) is very
>CV-specific. However, I think if we allow one kind of smart reference,
>we can add support for more, and they can be format-specific. As long
>as we're OK making DITypeRefs opaque, adding new kinds of type refs is
>cheap.
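
[Editorial sketch] The quoted remark that regular LTO can merge type records and then "rewrite the DITypeIndex numbers with the map produced" can be illustrated roughly as follows. This is only a minimal sketch and not the code in D20122: `TypeRecord`, `MergedTypes`, `mergeModuleTypes`, and `rewriteTypeRefs` are hypothetical names, records are modeled as plain strings, and references between records are ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a serialized type record. Real CodeView records
// are binary and can refer to other records; this sketch ignores that.
using TypeRecord = std::string;

// Hypothetical container for the merged (uniqued) type stream.
struct MergedTypes {
  std::vector<TypeRecord> Records;        // unique records, in merged order
  std::map<TypeRecord, uint32_t> IndexOf; // record -> index into Records
};

// Merge one module's records into Merged. The result maps each module-local
// type index to the index of the same record in the merged stream.
std::vector<uint32_t> mergeModuleTypes(MergedTypes &Merged,
                                       const std::vector<TypeRecord> &Module) {
  std::vector<uint32_t> Remap;
  Remap.reserve(Module.size());
  for (const TypeRecord &R : Module) {
    uint32_t Next = static_cast<uint32_t>(Merged.Records.size());
    auto Res = Merged.IndexOf.insert(std::make_pair(R, Next));
    if (Res.second) // first time this record is seen
      Merged.Records.push_back(R);
    Remap.push_back(Res.first->second);
  }
  return Remap;
}

// Rewrite the module-local type indices referenced from a module's IR so
// that they point into the merged stream.
void rewriteTypeRefs(std::vector<uint32_t> &Refs,
                     const std::vector<uint32_t> &Remap) {
  for (uint32_t &Ref : Refs) {
    assert(Ref < Remap.size() && "type index out of range");
    Ref = Remap[Ref];
  }
}
```

The point of the split is that the linker only needs the returned index map; all knowledge of the record format stays inside the merging routine.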
Reid Kleckner via llvm-dev
2016-May-11 18:18 UTC
[llvm-dev] [cfe-dev] RFC: Up front type information generation in clang and llvm
On Wed, May 11, 2016 at 10:51 AM, Smith, Kevin B <kevin.b.smith at intel.com> wrote:
> I believe that Amjad Aboud has argued several times that there could be one true way to generate LLVM debug info such that both
> windows and DWARF debug info could be generated from it. I know for a fact that within the Intel Compiler that the FE generates a single
> set of debug info representation, that then gets translated into either MS PDB format, or DWARF depending on the target platform.
>
> Architecturally, that is very desirable. You really do not want to have every FE have to know about, and generate different debug info depending
> on whether they are targeting windows or a DWARF enabled target, do you?

If we go with the existing metadata representation, we will need to extend it to be the union of DWARF and CodeView, and that will require frontends to feed us more information specific to CodeView. In other words, "we need help from the frontend." Depending on your perspective, you could see this as spreading Windows knowledge across the codebase.

I think extending the DI metadata is definitely workable. As you say, it is obviously very useful for other frontends. I just feel that the representation shift is needlessly inefficient and stands in our way when we need to express things that it can't yet represent.

---

Anyway, at this point, many people have concerns about this idea, so I think it would be best to move forward on CV by extending the DI metadata. We can come back and revisit up front emission at some point in the future. If we can demonstrate big compile-time and QoI improvements, it might be worth supporting both approaches.
Mehdi Amini via llvm-dev
2016-May-11 18:29 UTC
[llvm-dev] [cfe-dev] RFC: Up front type information generation in clang and llvm
> On May 11, 2016, at 11:18 AM, Reid Kleckner <rnk at google.com> wrote:
>
> On Wed, May 11, 2016 at 10:51 AM, Smith, Kevin B
> <kevin.b.smith at intel.com> wrote:
>> I believe that Amjad Aboud has argued several times that there could be one true way to generate LLVM debug info such that both
>> windows and DWARF debug info could be generated from it. I know for a fact that within the Intel Compiler that the FE generates a single
>> set of debug info representation, that then gets translated into either MS PDB format, or DWARF depending on the target platform.
>>
>> Architecturally, that is very desirable. You really do not want to have every FE have to know about, and generate different debug info depending
>> on whether they are targeting windows or a DWARF enabled target, do you?
>
> If we go with the existing metadata representation, we will need to
> extend it to be the union of DWARF and CodeView, and that will require
> frontends to feed us more information specific to CodeView. In other
> words, "we need help from the frontend." Depending on your
> perspective, you could see this as spreading Windows knowledge across
> the codebase.
>
> I think extending the DI metadata is definitely workable. As you say,
> it is obviously very useful for other frontends. I just feel that the
> representation shift is needlessly inefficient and stands in our way
> when we need to express things that it can't yet represent.

This is a bit blurry to me, as the two questions seem orthogonal: the fact that there is an interface exposed to the frontends to emit debug info should be almost independent of where we actually emit the blob.

So yes, such an interface would require the frontends to expose the union of the information needed to emit DWARF and CodeView, but it does not imply that the metadata representation needs to be extended (i.e. behind such an interface you could get the current metadata for DWARF and the single blob for CodeView).

Did I miss something?

--
Mehdi
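
[Editorial sketch] The separation Mehdi describes, one frontend-facing interface with the storage decision hidden behind it, might look something like the following. This is a sketch under stated assumptions and not an existing clang or LLVM API: `TypeDesc`, `DebugTypeSink`, `MetadataSink`, `BlobSink`, and `describeInt` are hypothetical names, and the blob encoding is made up purely for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical format-neutral description that a frontend hands over.
struct TypeDesc {
  std::string Name;
  uint64_t SizeInBits;
};

// Hypothetical frontend-facing interface. The frontend never learns how or
// where the type information is stored.
class DebugTypeSink {
public:
  virtual ~DebugTypeSink() = default;
  // Returns an opaque reference that the frontend can attach to variables.
  virtual uint32_t addType(const TypeDesc &T) = 0;
};

// Keeps one node per type, standing in for per-type IR metadata (the
// existing representation, used here for the DWARF path).
class MetadataSink : public DebugTypeSink {
public:
  std::vector<TypeDesc> Nodes;
  uint32_t addType(const TypeDesc &T) override {
    assert(!T.Name.empty() && "this sketch requires named types");
    Nodes.push_back(T);
    return static_cast<uint32_t>(Nodes.size() - 1);
  }
};

// Appends every type to a single blob, standing in for up-front emission
// (the CodeView path). The "name:size;" encoding is invented for the sketch.
class BlobSink : public DebugTypeSink {
public:
  std::string Blob;
  uint32_t NumTypes = 0;
  uint32_t addType(const TypeDesc &T) override {
    assert(!T.Name.empty() && "this sketch requires named types");
    Blob += T.Name;
    Blob += ':';
    Blob += std::to_string(T.SizeInBits);
    Blob += ';';
    return NumTypes++;
  }
};

// Frontend code is written once against the interface.
uint32_t describeInt(DebugTypeSink &Sink) {
  return Sink.addType(TypeDesc{"int", 32});
}
```

Calling `describeInt` with either sink yields an opaque index; only the sink decides whether the information ends up as per-type nodes or in one blob.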