In r226336 I shaved 1.2 seconds off the 9.8 seconds it takes lld to link lld. That was done by parallelizing archive member parsing. But I realized that was not the slowest pass.

The single slowest pass in LLD is LayoutPass. The sort() call at the end of LayoutPass::perform alone takes about 3 seconds (one third of total execution time). This is because the comparison function passed to sort, compareAtoms, does too much work.

It looks to me like the entire pass is overkill. We don't really need that complexity there. I think nobody actually depends on the details of the pass's behavior.

I'd like to simplify LayoutPass so that it sorts atoms only by file position and position in file (that is, atoms are sorted in the same order as the files on the command line). I believe that should be enough and we can still keep the output the same.

Any comments?
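To make the proposal concrete, here is a minimal sketch of what sorting only by file position and position in file could look like. The Atom struct, its fileOrdinal and atomOrdinal fields, and the function names are hypothetical stand-ins chosen for illustration; they are not lld's actual classes and not the eventual patch.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for an atom; NOT lld's real atom class.
struct Atom {
  uint64_t fileOrdinal; // assumed: position of the owning file on the command line
  uint64_t atomOrdinal; // assumed: position of the atom within that file
  std::string name;
};

// Orders atoms by (file position, position in file) and nothing else.
// Each call is a lexicographic comparison of two integer pairs.
inline bool compareByPosition(const Atom &a, const Atom &b) {
  return std::tie(a.fileOrdinal, a.atomOrdinal) <
         std::tie(b.fileOrdinal, b.atomOrdinal);
}

// Sorts atoms into command-line order. The (fileOrdinal, atomOrdinal)
// pairs are assumed to be unique, so stability does not matter here.
inline void sortByPosition(std::vector<Atom> &atoms) {
  std::sort(atoms.begin(), atoms.end(), compareByPosition);
}
```

The point of the sketch is only that the comparison stays cheap because it touches nothing but two integers per atom. Whether such a simple key can preserve the current output is exactly the open question in this thread.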
Does this mean you are going to kill the layout-specific references?

The kindLayoutAfter and kindInGroup references are used to lay out atoms by both Mach-O (atom order file) and ELF.

Shankar Easwaran

On 1/21/2015 10:07 PM, Rui Ueyama wrote:
> [...]
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by the Linux Foundation
No, I'm not proposing to kill the layout references, because components other than LayoutPass are using them.

On Thu, Jan 22, 2015 at 7:12 AM, Shankar Easwaran <shankare at codeaurora.org> wrote:
> Does this mean you are going to kill the Layout specific references ?
>
> The kindLayoutAfter, kindInGroup references are used to layout atoms by
> both Mach-O (atom order file) and ELF.
I'll start writing a patch. I'll probably only modify LayoutPass.cpp and try to make it faster without affecting existing features.

On Wed, Jan 21, 2015 at 8:07 PM, Rui Ueyama <ruiu at google.com> wrote:
> [...]
On Thu, Jan 22, 2015 at 4:07 AM, Rui Ueyama <ruiu at google.com> wrote:
> [...]

Have you tried switching to a parallel sort? On Windows I got a pretty good perf increase by doing that.

I would also love to simplify this pass, as long as we make it clear what guarantees we are making.

- Michael Spencer
On Mon, Jan 26, 2015 at 4:11 AM, Michael Spencer <bigcheesegs at gmail.com> wrote:
> Have you tried switching to a parallel sort? On Windows I got a pretty
> good perf increase by doing that.

Done in r227132, but I only saw a minor performance increase.

> I would also love to simplify this pass, as long as we make it clear
> what guarantees we are making.

Yeah. I took a look at the history of the file to understand how the feature set the pass currently provides was chosen, and found that some features were introduced without clear reasoning or identified users. Now that a few pieces of code depend on subtle behavior of the pass, I need to understand the details first.