That was likely type information and should mostly be fixed up. It's still
not lazily loaded, but is going to be ridiculously smaller now.

-eric

On Fri Jan 10 2014 at 12:11:52 AM, Sean Silva <chisophugis at gmail.com> wrote:

> This Summer I was working on LTO and Rafael mentioned to me that debug
> info is not lazy loaded, which was the cause for the insane resource usage
> I was seeing when doing LTO with debug info. This is likely the reason that
> the lazy loading was so ineffective for your debug build.
>
> Rafael, am I remembering this right/can you give more information? I
> expect that this will have to get fixed before pitching LLD as a turnkey
> LTO solution (not sure where in the priority list it is).
>
> -- Sean Silva
>
> On Thu, Jan 9, 2014 at 5:37 PM, Kevin Modzelewski <kmod at dropbox.com> wrote:
>
> Hi all, I'm trying to reduce the startup time for my JIT, but I'm running
> into the problem that the majority of the time is spent loading the bitcode
> for my standard library, and I suspect it's due to debug info. My stdlib
> is currently about 2kloc in a number of C++ files; I compile them with
> clang -g -emit-llvm, then link them together with llvm-link, call opt -O3
> on it, and arrive at a 1MB bitcode file. I then embed this as a binary
> blob into my executable, and call ParseBitcodeFile on it at startup.
>
> Unfortunately, this parsing takes about 60ms right now, which is the main
> component of my ~100ms time to run on an empty source file (another ~20ms
> is loading the pre-jit'd image through an ObjectCache). I thought I'd save
> some time by using getLazyBitcodeModule, since the IR isn't actually needed
> right away, but this only reduced the parsing time (ie the time of the
> actual getLazyBitcodeModule() call) to 45ms, which I thought was
> surprising. I also tested computing the bytewise-xor of the bitcode file
> to make sure that it was fully read into memory, which took about 5ms, so
> the majority of the time does seem to be spent parsing.
>
> Then I switched back to ParseBitcodeFile, but now I added the
> "-strip-debug" flag to my opt invocation, which reduced the bitcode file
> down to about 100KB, and reduced the parsing time to 20ms. What surprised
> me the most was that if I then switched to getLazyBitcodeModule, the
> parsing time was cut down to 3ms, which is what I was originally expecting.
> So when lazy loading, stripping out the debug info cuts down the
> initialization time from 45ms to 3ms, which is why I suspect that
> getLazyBitcodeModule is still parsing all of the debug info.
>
> To work around it, I can generate separate builds, one with debug info and
> one without, but I'd like to avoid doing that. I did some simple profiling
> of what getLazyBitcodeModule was doing, and it wasn't terribly informative
> (spends most of its time in parsing-related functions); does anyone have
> any ideas if this is something that could be fixable or if I should just
> move on?
>
> Thanks,
> Kevin
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
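
Kevin's quoted message describes XOR-ing every byte of the bitcode blob to confirm that reading the file into memory (about 5ms) is not what dominates the load time. A minimal sketch of such a check, assuming the blob is available as a plain byte buffer; the helper names `xor_bytes` and `time_xor_ms` are hypothetical and are not taken from his code or from LLVM:

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// XOR all bytes of the buffer together. Reading every byte forces the whole
// blob to be touched, so any page-in cost is paid here and not in the parser.
std::uint8_t xor_bytes(const std::uint8_t *data, std::size_t size) {
    std::uint8_t acc = 0;
    for (std::size_t i = 0; i < size; ++i)
        acc ^= data[i];
    return acc;
}

// Time one pass over the buffer, in milliseconds. The checksum is handed back
// through `out` so the compiler cannot discard the loop as dead code.
double time_xor_ms(const std::uint8_t *data, std::size_t size,
                   std::uint8_t &out) {
    auto start = std::chrono::steady_clock::now();
    out = xor_bytes(data, size);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Timing this pass and the parse call on the same buffer separates I/O cost from parsing cost, which is the comparison the message relies on.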
I briefly looked at the bitcode files and some types are not uniqued. Here
is one example:

!3903 = metadata !{i32 786454, metadata !3904, null, metadata !"int64_t", i32 198, i64 0, i64 0, i64 0, i32 0, metadata !2258} ; [ DW_TAG_typedef ] [int64_t] [line 198, size 0, align 0, offset 0] [from long int]

!4019 = metadata !{i32 786454, metadata !4020, null, metadata !"int64_t", i32 198, i64 0, i64 0, i64 0, i32 0, metadata !2258} ; [ DW_TAG_typedef ] [int64_t] [line 198, size 0, align 0, offset 0] [from long int]

!3904 = metadata !{metadata !"runtime/int.cpp", metadata !"/home/kmod/icbd/jit"}
!4020 = metadata !{metadata !"runtime/list.cpp", metadata !"/home/kmod/icbd/jit"}

The two typedefs differ only in the file node they reference (!3904 vs.
!4020): the file names are different.

Manman
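
The observation above can be illustrated with a toy model of content-based uniquing. This is only a sketch under stated assumptions: `TypedefNode` and `UniquingTable` are hypothetical names, and the tuple is a stand-in for a metadata node, not LLVM's actual representation. It shows why two otherwise identical typedef entries stay separate once the file is one of the compared fields:

```cpp
#include <set>
#include <string>
#include <tuple>

// Toy stand-in for a typedef debug-info node: (file, name, line, base type).
// Hypothetical model for illustration; not LLVM's metadata classes.
using TypedefNode =
    std::tuple<std::string, std::string, unsigned, std::string>;

struct UniquingTable {
    std::set<TypedefNode> nodes;

    // Insert a node. Returns true if an equal node was already present,
    // i.e. the new node was merged with an existing one.
    bool add(const TypedefNode &n) { return !nodes.insert(n).second; }
};
```

In this model, adding the `int64_t` typedef once with file "runtime/int.cpp" and once with "runtime/list.cpp" leaves two entries in the table, because the file field differs even though name, line, and base type match.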
On 13 January 2014 18:34, Manman Ren <manman.ren at gmail.com> wrote:
> I briefly looked at the bit code files and some types are not uniqued, here
> is one example:
> !3903 = metadata !{i32 786454, metadata !3904, null, metadata !"int64_t",
> i32 198, i64 0, i64 0, i64 0, i32 0, metadata !2258} ; [ DW_TAG_typedef ]
> [int64_t] [line 198, size 0, align 0, offset 0] [from long int]
>
> !4019 = metadata !{i32 786454, metadata !4020, null, metadata !"int64_t",
> i32 198, i64 0, i64 0, i64 0, i32 0, metadata !2258} ; [ DW_TAG_typedef ]
> [int64_t] [line 198, size 0, align 0, offset 0] [from long int]
>
> !3904 = metadata !{metadata !"runtime/int.cpp", metadata
> !"/home/kmod/icbd/jit"}
> !4020 = metadata !{metadata !"runtime/list.cpp", metadata
> !"/home/kmod/icbd/jit"}
>
> The file names are different for the two typedefs.

Has this been fixed by r199760?

Cheers,
Rafael
Adrian may have handled this recently?

On Jan 13, 2014 3:34 PM, "Manman Ren" <manman.ren at gmail.com> wrote:
> I briefly looked at the bit code files and some types are not uniqued,
> here is one example:
> !3903 = metadata !{i32 786454, metadata !3904, null, metadata !"int64_t",
> i32 198, i64 0, i64 0, i64 0, i32 0, metadata !2258} ; [ DW_TAG_typedef ]
> [int64_t] [line 198, size 0, align 0, offset 0] [from long int]
>
> !4019 = metadata !{i32 786454, metadata !4020, null, metadata !"int64_t",
> i32 198, i64 0, i64 0, i64 0, i32 0, metadata !2258} ; [ DW_TAG_typedef ]
> [int64_t] [line 198, size 0, align 0, offset 0] [from long int]
>
> !3904 = metadata !{metadata !"runtime/int.cpp", metadata
> !"/home/kmod/icbd/jit"}
> !4020 = metadata !{metadata !"runtime/list.cpp", metadata
> !"/home/kmod/icbd/jit"}
>
> The file names are different for the two typedefs.
>
> Manman