David Blaikie via llvm-dev
2016-Feb-29 23:46 UTC
[llvm-dev] Possible Memory Savings for tools emitting large amounts of existing data through MC
On Mon, Feb 29, 2016 at 3:36 PM, Adrian Prantl <aprantl at apple.com> wrote:

> On Feb 29, 2016, at 3:18 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
>> Just in case it interests anyone else, I'm playing around with trying to broaden the MCStreamer API to allow for emission of bytes without copying the contents into a local buffer first (either because you already have a buffer, or the bytes are already present in another file, etc) in http://reviews.llvm.org/D17694 . In theory there's some overlap with lld here (no doubt it already does this sort of thing, but not in a way, I assume, we could reuse from other tools at the moment) and my motivation, llvm-dwp, looks very much like "linking with a few extra steps".
>>
>> But to check that these changes might be more generally applicable, I thought I'd solicit data from anyone building tools that might be memory constrained as well.
>>
>> First that comes to mind (Eric suggested/mentioned) is llvm-dsymutil.
>>
>> Adrian/Fred - do you guys ever have trouble with memory usage of llvm-dsymutil? Do you have an example you could provide that has high memory usage, so I could see if any simple changes based on my prototype MC changes would help.
>
> Since dsymutil processes object files one after another,

As does llvm-dwp. Think of llvm-dwp more like a linker with a few extra bits. But the MCStreamer API means any bytes you write to the streamer stay in memory until you "Finish" - so if you're dwp/linking large enough inputs, you have them all in memory when you really don't need them. For example, the dwp file I was generating is 7GB, but the tool with the memory improvements only has a high water mark of 2.3GB.

> memory usage wasn’t really a problem so far, but you could try running llvm-dsymutil on bin/clang for a larger example (takes about a minute to finish).

Was thinking of something more accessible to me, on a non-Darwin platform. Is there a way I can generate the dsym inputs across Clang on a non-Darwin platform? (what happens if I run dsymutil on my ELF object files?)

>> A quick glance at dsymutil's code indicates it might benefit slightly, at least - in the string table emission, for example (it looks very similar to string table emission in dwp - just being able to reference the strings in the StringMap rather than copying them into MCStreamer could help (also I found using a DenseMap<StringRef to the memory mapped input helped as well - but that's a change you can make locally without any MCStreamer improvements) - other parts might be trickier, and consist of parts of referencable data (like the line table header) and parts that are not referencable (like their contents) - my prototype could be extended to handle that)
>
> -- adrian
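The buffering point in the message above can be illustrated with a small sketch. A streamer that copies every emitted byte holds the whole output until it finishes, while one that only records references to caller-owned bytes holds just what it had to synthesize itself. The following is a minimal C++17 sketch of that idea under stated assumptions: it is not the MCStreamer API and not the D17694 patch, and `ChunkStreamer`, `emitBytes`, `emitBytesByReference`, `ownedBytes`, `totalBytes`, `finish` and `str` are hypothetical names invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical illustration only; this is not the MCStreamer interface.
class ChunkStreamer {
  struct Chunk {
    std::string Owned;    // Holds the bytes when the chunk was copied.
    std::string_view Ref; // Points at caller-owned bytes otherwise.
    bool IsOwned;
  };
  std::vector<Chunk> Chunks;

public:
  // Copies Bytes into the streamer, as a purely buffering streamer would.
  void emitBytes(std::string_view Bytes) {
    Chunks.push_back({std::string(Bytes), {}, true});
  }

  // Records only a pointer and a length. The caller must keep Bytes alive
  // (for example, as a memory-mapped input) until finish() has run.
  void emitBytesByReference(std::string_view Bytes) {
    Chunks.push_back({std::string(), Bytes, false});
  }

  // Number of bytes the streamer itself holds in memory.
  std::size_t ownedBytes() const {
    std::size_t N = 0;
    for (const Chunk &C : Chunks)
      if (C.IsOwned)
        N += C.Owned.size();
    return N;
  }

  // Number of bytes that finish() will write.
  std::size_t totalBytes() const {
    std::size_t N = 0;
    for (const Chunk &C : Chunks)
      N += C.IsOwned ? C.Owned.size() : C.Ref.size();
    return N;
  }

  // Writes all chunks, in emission order, to OS.
  void finish(std::ostream &OS) const {
    for (const Chunk &C : Chunks) {
      std::string_view V = C.IsOwned ? std::string_view(C.Owned) : C.Ref;
      OS.write(V.data(), static_cast<std::streamsize>(V.size()));
    }
  }

  // Convenience for small examples: returns the finished output as a string.
  std::string str() const {
    std::ostringstream OS;
    finish(OS);
    return OS.str();
  }
};
```

After `S.emitBytes("hdr:")` and `S.emitBytesByReference(Input)` with a 6-byte `Input`, `S.ownedBytes()` is 4 while `S.totalBytes()` is 10. The cost of the design is a lifetime requirement: referenced bytes must outlive the call to `finish`.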
Adrian Prantl via llvm-dev
2016-Feb-29 23:51 UTC
[llvm-dev] Possible Memory Savings for tools emitting large amounts of existing data through MC
> On Feb 29, 2016, at 3:46 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
> Was thinking of something more accessible to me, on a non-Darwin platform. Is there a way I can generate the dsym inputs across Clang on a non-Darwin platform? (what happens if I run dsymutil on my ELF object files?)

At this point probably nothing. Dsymutil acts on STABS symbol table entries that are (I guess) not present in a typical ELF binary. Dsymutil also only implements Mach-O relocations and has lots of other things where the ELF implementation is missing. It’s probably not too much work to wire all this up, but so far nobody has done it.

-- adrian
David Blaikie via llvm-dev
2016-Mar-01 00:10 UTC
[llvm-dev] Possible Memory Savings for tools emitting large amounts of existing data through MC
On Mon, Feb 29, 2016 at 3:51 PM, Adrian Prantl <aprantl at apple.com> wrote:

> At this point probably nothing. Dsymutil acts on STABS symbol table entries that are (I guess) not present in a typical ELF binary. Dsymutil also only implements Mach-O relocations and has lots of other things where the ELF implementation is missing. It’s probably not too much work to wire all this up, but so far nobody has done it.

& no easy way for me to get a representative (or pathologically large, even) set of Mach-O files to play with, I take it? It's no worries - just figured I'd give it a go if it was convenient.
Frédéric Riss via llvm-dev
2016-Mar-01 04:41 UTC
[llvm-dev] Possible Memory Savings for tools emitting large amounts of existing data through MC
> On Feb 29, 2016, at 3:46 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
> As does llvm-dwp. Think of llvm-dwp more like a linker with a few extra bits. But the MCStreamer API means any bytes you write to the streamer stay in memory until you "Finish" - so if you're dwp/linking large enough inputs, you have them all in memory when you really don't need them. For example, the dwp file I was generating is 7GB, but the tool with the memory improvements only has a high water mark of 2.3GB.

I’m a bit surprised by those numbers. If the output is 7GB, don’t you need to have a high watermark of 7GB at emission time even with your scheme?

Also, in D17694 you mention that the memory peak goes from 9.6GB to 2.3GB. Is this dirty memory or allocated memory? When investigating the memory use of dsymutil, I found out that the exponential growth of the MC vectors would hide the real memory usage (e.g. showing 2GB when the code actually used just a bit over 1GB).

Just curious, I think your approach makes a lot of sense.

Fred
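The gap between allocated and used memory mentioned above can be shown with a little arithmetic. The following is a minimal sketch that assumes a buffer which starts small and doubles its capacity whenever it fills up; the actual growth policy of the MC vectors is not stated in the thread, so the doubling rule, the initial capacity of 64 bytes, and the names `capacityAfterDoubling` and `slackBytes` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Capacity of a buffer that starts at InitialCapacity bytes and doubles
// whenever it is full, after Size bytes have been appended. The doubling
// policy is an assumption, not a statement about any LLVM container.
inline std::size_t capacityAfterDoubling(std::size_t Size,
                                         std::size_t InitialCapacity = 64) {
  assert(InitialCapacity > 0 && "a zero capacity would never grow");
  std::size_t Capacity = InitialCapacity;
  while (Capacity < Size)
    Capacity *= 2;
  return Capacity;
}

// Bytes that are allocated but hold no data yet.
inline std::size_t slackBytes(std::size_t Size,
                              std::size_t InitialCapacity = 64) {
  return capacityAfterDoubling(Size, InitialCapacity) - Size;
}
```

Under these assumptions a buffer holding one byte more than 1 GiB sits in a 2 GiB allocation, so a tool that reports allocated bytes shows roughly twice what the code actually uses. Pages in the unused tail may never be touched, which is why dirty memory and allocated memory can differ.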
David Blaikie via llvm-dev
2016-Mar-01 04:59 UTC
[llvm-dev] Possible Memory Savings for tools emitting large amounts of existing data through MC
On Mon, Feb 29, 2016 at 8:41 PM, Frédéric Riss <friss at apple.com> wrote:

> I’m a bit surprised by those numbers. If the output is 7GB, don’t you need to have a high watermark of 7GB at emission time even with your scheme?

Nope, which is the great thing - the input files are memory mapped (reading with libObject) and by delaying the output a bit more, we can literally be reading bytes from the memory mapped input and writing them out to the output file - at no point do we then need to have the entire contents in memory.

> Also, in D17694 you mention that the memory peak goes from 9.6GB to 2.3GB. Is this dirty memory or allocated memory?

Allocated - I used valgrind's --tool=massif to analyze the memory usage.

> When investigating the memory use of dsymutil, I found out that the exponential growth of the MC vectors would hide the real memory usage (eg showing 2GB when the code actually used just a bit over 1GB).

True, there could be some allocated but undirtied pages. Not sure if Valgrind accounts for that.

> Just curious, I think your approach makes a lot of sense.
>
> Fred
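The "read from the memory-mapped input, write straight to the output" idea described above can be sketched with plain POSIX calls. This is a minimal illustration under stated assumptions, not how llvm-dwp or libObject do it: `appendMappedFile` is a hypothetical helper, it requires a POSIX system, and it does not retry a `write` that was interrupted by a signal.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: appends the contents of the file at InPath to the
// already-open descriptor OutFD without first copying them into a buffer
// owned by the program. Returns the number of bytes written, or -1 on error.
long long appendMappedFile(const char *InPath, int OutFD) {
  assert(InPath != nullptr);
  int InFD = open(InPath, O_RDONLY);
  if (InFD < 0)
    return -1;

  struct stat St;
  if (fstat(InFD, &St) != 0) {
    close(InFD);
    return -1;
  }
  std::size_t Size = static_cast<std::size_t>(St.st_size);
  if (Size == 0) { // mmap rejects zero-length mappings.
    close(InFD);
    return 0;
  }

  // The kernel pages the input in on demand and can drop clean pages again,
  // so the whole file never has to be resident at once.
  void *Map = mmap(nullptr, Size, PROT_READ, MAP_PRIVATE, InFD, 0);
  close(InFD); // The mapping stays valid after the descriptor is closed.
  if (Map == MAP_FAILED)
    return -1;

  const char *Bytes = static_cast<const char *>(Map);
  std::size_t Done = 0;
  while (Done < Size) {
    ssize_t N = write(OutFD, Bytes + Done, Size - Done);
    if (N < 0) { // Sketch only: EINTR is treated as an error.
      munmap(Map, Size);
      return -1;
    }
    Done += static_cast<std::size_t>(N);
  }
  munmap(Map, Size);
  return static_cast<long long>(Done);
}
```

Calling the helper once per input file with the same `OutFD` concatenates the inputs into the output, and the program's own heap usage stays independent of the input sizes.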