On Thu, May 7, 2015 at 12:58 PM, Jim Grosbach <grosbach at apple.com> wrote:
> Hi Rui,
>
> Thank you for clarifying. This is very helpful.
>
> It’s unfortunate that you’re not seeing benefits from the increased
> semantic knowledge the atom based model can provide. I know you’ve explored
> the issue thoroughly, though, so I understand why you’re wanting to move in
> a different direction for your platform.
>
> It’s reasonable to me to split the logic along atom based vs. section
> based in the LLD codebase. Ideally I’d love for that not to be so, but the
> practical results indicate that it is. I agree there is still worthwhile
> code sharing that can and should be done between the two. We’re talking
> about expanding the LLD project’s scope to include multiple linking models,
> not forking the project.
>
> It will be important to keep the layering here such that the linking model
> choice is orthogonal to the file format choice. It should be possible to
> construct both an atom based ELF linker and a section based Mach-O linker,
> for example, even though the default choice for both formats is the other
> way around. That way different platforms with different constraints, such
> as what Alex talked about earlier, can make the choice of model and the
> choice of representation independently.
>
> As a second step, I would very much like to see the native format brought
> back, if only for the atom-based model. Do you feel this is doable?
>

Yes, it's doable, but I'd really like to see this back with different unit
tests, because the way the feature was tested was unrealistic and hard to
maintain. Previously, we tested the feature by dumping intermediate linker
state to a Native file and reading it back from the file to resume
processing. That was different from the expected use case, which is to use
Native files as an alternative object file format. Creating a checkpoint
file of the linker state is a different feature (and I guess nobody really
intended to implement such a feature).

I think we need to extend the yaml2obj tool to write outputs in the Native
format, and use the tool to feed Native object files to the linker.

Is there anyone who wants to own the feature? Or I could just revive the
code, but I'd really want to avoid doing that if that doesn't come with a
good test...

> -Jim
>
> On May 6, 2015, at 2:18 PM, Rui Ueyama <ruiu at google.com> wrote:
>
> I'm sorry if my suggestion gave the impression that I disregard the Mach-O
> port of the LLD linker. I do care about Mach-O. I do not plan to break or
> remove any functionality from the current Mach-O port of LLD. I don't
> propose to remove the atom model from the linker as long as it seems to be
> a good fit for the port (and it looks like it is).
>
> As to the proposal to have two different linkers, I'd think that that's
> not really a counter-proposal, as it's similar to what I'm proposing.
>
> Maybe the view of "future file formats vs the existing formats" (or
> "experimental platform vs. practical tool") is not the right way to capture
> the difference between the atom model and the section model, since Mach-O
> is an existing file format which we'd want to keep on the atom model. I
> think we want both even for the existing formats.
>
> My proposal can be read as suggesting we split the LLD linker into two
> major parts, the atom model-based and the section model-based, while
> keeping the two under the same project and repository. I still think that
> we can share code between the two, especially for LTO, which is why I
> prefer to have the two under the same repository.
>
> On Mon, May 4, 2015 at 12:52 PM, Chris Lattner <clattner at apple.com> wrote:
>
>> On May 1, 2015, at 12:31 PM, Rui Ueyama <ruiu at google.com> wrote:
>>
>> *Proposal*
>>
>> 1. Re-architect the linker based on the section model where it’s
>> appropriate.
>> 2. Stop simulating different linker semantics using the Unix model.
>> Instead, directly implement the native behavior.
>>
>> Preface: I have never personally contributed code to LLD, so don’t take
>> anything I’m about to say too seriously. This is not a mandate or
>> anything, just an observation/idea.
>>
>>
>> I think that there is an alternative solution to these exact same
>> problems. What you’ve identified here is that there are two camps of
>> people working on LLD, and they have conflicting goals:
>>
>> - Camp A: LLD is infrastructure for the next generation of awesome
>> linking and toolchain features, it should take advantage of how compilers
>> work to offer new features, performance, etc without deep concern for
>> compatibility.
>>
>> - Camp B: LLD is a drop in replacement system linker (notably for COFF
>> and ELF systems), which is best of breed and with no compromises w.r.t.
>> that goal.
>>
>>
>> I think the problem here is that these lead to natural and inescapable
>> tensions, and Alex summarized how Camp B has been steering LLD away from
>> what Camp A people want. This isn’t bad in and of itself, because what
>> Camp B wants is clearly and unarguably good for LLVM. However, it is also
>> not sufficient, and while innovation in the linker space (e.g. a new
>> “native” object file format generated directly from compiler structures)
>> may or may not actually “work” or be “worth it”, we won’t know unless we
>> try, and that won’t fulfill its promise if there are compromises to Camp B.
>>
>> So here’s my counterproposal: *two different linkers.*
>>
>> Let’s stop thinking about lld as one linker, and instead think of it as
>> two different ones. We’ll build a Camp B linker which is the best of breed
>> section based linker. It will support linker scripts and do everything
>> better than any existing section based linker. The first step of this is
>> to do what Rui proposes and rip atoms out of the model.
>>
>> We will *also* build a no-holds-barred awesome atom based linker that
>> takes advantage of everything it can from LLVM’s architecture to enable
>> innovative new tools without worrying too much about backwards
>> compatibility.
>>
>> These two linkers should share whatever code makes sense, but also
>> shouldn’t try to share code that doesn’t make sense. The split between the
>> semantic model of sections vs atoms seems like a very natural one to me.
>>
>> One question is: does it make sense for these to live in the same lld
>> subproject, or be split into two different subprojects? I think the answer
>> to that question is driven by whether there is shared code common between
>> the two linkers that doesn’t make sense to sink down to the llvm subproject
>> itself.
>>
>> What do you think?
>>
>> -Chris
>>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu llvm.cs.uiuc.edu
> lists.cs.uiuc.edu/mailman/listinfo/llvmdev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <lists.llvm.org/pipermail/llvm-dev/attachments/20150507/23bf6bc6/attachment.html>
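
Jim's layering point in the message above (the linking model should be chosen independently of the file format, with ELF defaulting to the section model and Mach-O to the atom model) could be sketched roughly as follows. This is only an illustration in Python, not LLD code: `LinkModel`, `FileFormat`, `LinkerConfig`, `DEFAULT_MODEL` and `make_config` are all hypothetical names invented for the sketch.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class LinkModel(Enum):
    """Hypothetical: the two linking models discussed in the thread."""
    ATOM = "atom"
    SECTION = "section"


class FileFormat(Enum):
    """Hypothetical: two of the file formats named in the thread."""
    ELF = "ELF"
    MACH_O = "Mach-O"


# Defaults as described in the thread: ELF on the section model,
# Mach-O on the atom model. The table is an assumption of this sketch.
DEFAULT_MODEL = {
    FileFormat.ELF: LinkModel.SECTION,
    FileFormat.MACH_O: LinkModel.ATOM,
}


@dataclass(frozen=True)
class LinkerConfig:
    file_format: FileFormat
    model: LinkModel


def make_config(file_format, model=None):
    """Use the format's default model unless the caller overrides it."""
    chosen = model if model is not None else DEFAULT_MODEL[file_format]
    return LinkerConfig(file_format, chosen)


# Orthogonality means every (format, model) pair is constructible,
# including the non-default ones such as an atom based ELF linker.
all_combinations = [make_config(f, m) for f, m in product(FileFormat, LinkModel)]
```

Here `make_config(FileFormat.ELF, LinkModel.ATOM)` yields the non-default "atom based ELF linker" Jim mentions. The model is a parameter, not something derived from the format, so neither choice constrains the other.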
Hi Rui,

I'd like to preserve the native format, and I'm happy to own it. I'm still
getting up to speed on LLD though, so it may take me a little while to
improve the tooling/testing for it.

Cheers,
Lang.

On Thu, May 7, 2015 at 1:28 PM, Rui Ueyama <ruiu at google.com> wrote:
> Is there anyone who wants to own the feature? Or I could just revive the
> code, but I'd really want to avoid doing that if that doesn't come with a
> good test...

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <lists.llvm.org/pipermail/llvm-dev/attachments/20150512/c262293b/attachment.html>
I'm sorry for not updating the thread -- I thought I had done that before.

I started experimenting with the idea by implementing a minimal linker using
the section-based design with some additional simplifications and
optimizations. It's already able to link small programs like LLD itself, and
the performance does indeed look better (LLD is probably too small to be a
good benchmark, but the new linker is more than 2x faster). I also believe
that the code is more readable than the current COFF port.

I've just hacked it up, so it needs time to clean up. I think I can send a
patch for review this week.

On Tue, May 12, 2015 at 12:38 PM, Lang Hames <lhames at gmail.com> wrote:
> Hi Rui,
>
> I'd like to preserve the native format, and I'm happy to own it. I'm still
> getting up to speed on LLD though, so it may take me a little while to
> improve the tooling/testing for it.
>
> Cheers,
> Lang.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <lists.llvm.org/pipermail/llvm-dev/attachments/20150524/937839a2/attachment.html>