Jake Ehrlich via llvm-dev
2019-Nov-27 07:15 UTC
[llvm-dev] RFC: Loadable segments watermark for lld
The ELF file header isn't always covered by a segment but still affects loading. I think everything else that affects loading/dynamic linking is always covered by a PT_LOAD segment. As evidence, this is basically what --strip-sections in llvm-strip and eu-strip do, and they produce perfectly runnable binaries.

Having a hash of the actual memory map is interesting IMO. Build IDs can't really be verified, but with a hash of the memory map a binary would be loadable with the expected semantics if and only if the hash verifies. So if there's a use case for verification, then this seems sensible to me. I'm not sure where such verification matters, however.

On Tue, Nov 26, 2019, 10:04 PM Rui Ueyama via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> Hi Chris,
>
> Thank you for starting the thread! To make clear what the proposed
> feature can and can't do, let me ask a few questions (allow me to repeat
> questions that were discussed in the review thread):
>
> *Why is build-id not sufficient?*
>
> If you pass -build-id to the linker, the linker computes a hash of the
> entire output file and appends it to a .note section. This is not intended
> to be a checksum but more like a unique identifier. But you might be
> able to use it as a checksum and detect any post-link modification by
> recomputing the build-id and comparing it with the content of the .note
> section.
>
> *What kind of post-link modification are you expecting?*
>
> The first thing that comes to mind is the strip command, which removes
> debug info and the symbol table. But it looks like you are expecting more
> than that?
>
> *Is hashing memory-mapped sections strong enough to detect post-link
> modifications?*
>
> I wonder if there's some section or ELF header field that is not
> mapped to memory at run-time but affects how the loader works. If such a
> thing exists, computing a hash of all memory-mapped sections is not enough
> to catch post-link modifications.
>
> On Thu, Nov 21, 2019 at 8:43 PM Chris Jackson via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>> Hello all,
>>
>> I'm implementing a watermarking feature for lld that computes a hash of
>> loadable segments and places the result in a note section. Ongoing work
>> can be found here:
>>
>> https://reviews.llvm.org/D70316
>> https://reviews.llvm.org/D66426
>>
>> The purpose of this watermark is to enable detection of post-link
>> modifications to the loadable segments of the binary. Such modifications
>> may produce a binary that relies on functionality that is an incidental
>> detail of the OS, which may change in a future update and negatively
>> affect the runtime behaviour of the binary.
>>
>> As well as identifying reliance on unspecified behaviour, on detection of
>> post-link changes we can then look at improving our tooling to support
>> whatever changes had been applied.
>>
>> It's critical for us that the watermark has minimal impact on build time,
>> and cryptographic security is not the goal. Hence, xxhash is used, as our
>> experiments showed it has minimal overhead.
>>
>> Chris
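To make the idea in this message concrete, here is a minimal Python sketch of a checksum taken over PT_LOAD file contents only, plus a check for whether the ELF file header happens to fall inside a loadable segment. It is not the code under review in D70316/D66426. The function names are made up for illustration, it handles little-endian ELF64 only, and it uses zlib.crc32 purely as a standard-library stand-in for xxhash. How the real patch treats the watermark note's own bytes is not covered here.

```python
import struct
import zlib

PT_LOAD = 1
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # ELF64 file header, little-endian
PHDR = struct.Struct("<IIQQQQQQ")          # ELF64 program header, little-endian


def load_segments(image: bytes):
    """Yield (file offset, file size) for every PT_LOAD program header."""
    if image[:4] != b"\x7fELF" or image[4] != 2 or image[5] != 1:
        raise ValueError("this sketch handles little-endian ELF64 only")
    fields = EHDR.unpack_from(image, 0)
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]
    for i in range(e_phnum):
        phdr = PHDR.unpack_from(image, e_phoff + i * e_phentsize)
        p_type, p_offset, p_filesz = phdr[0], phdr[2], phdr[5]
        if p_type == PT_LOAD:
            yield p_offset, p_filesz


def segment_checksum(image: bytes) -> int:
    """Checksum of the file bytes backing PT_LOAD segments (CRC32 stand-in, not xxhash)."""
    crc = 0
    for offset, size in load_segments(image):
        crc = zlib.crc32(image[offset:offset + size], crc)
    return crc


def header_covered(image: bytes) -> bool:
    """True if some PT_LOAD segment spans the whole ELF file header."""
    e_ehsize = EHDR.unpack_from(image, 0)[8]
    return any(off == 0 and size >= e_ehsize for off, size in load_segments(image))
```

With this shape, bytes outside every PT_LOAD segment (for example a section header table or debug sections) do not influence the checksum, which is the property that lets a stripped binary still verify. When header_covered returns False, edits to the file header would go unnoticed by such a checksum, which is the gap pointed out at the top of this message.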
Fāng-ruì Sòng via llvm-dev
2019-Nov-27 19:40 UTC
[llvm-dev] RFC: Loadable segments watermark for lld
From previous reviews I recall that .note.llvm.watermark is a non-SHF_ALLOC section. Linkers normally place non-SHF_ALLOC sections after SHF_ALLOC ones. I think a post-link tool can be used to append .note.llvm.watermark to an ELF file. It just needs to update tens to a few hundred bytes (section header table + content of .note.llvm.watermark), assuming the position of .note.llvm.watermark does not matter.

I feel that the case for making .note.llvm.watermark an lld feature is not sufficiently strong. Does it need to be fast? (A benchmark measuring the performance would be useful.) .note.llvm.watermark seems to be used only in releases. Releases naturally involve a lot of preparation and verification. I can't imagine that running a post-link tool can be a bottleneck. During development .note.llvm.watermark is probably not very useful.

If .note.llvm.watermark is indeed non-SHF_ALLOC, it can be discarded by llvm-objcopy/llvm-strip --strip-all, but not by --strip-all-gnu (objcopy/strip --strip-all). Is this an expected modification?

If we reach the consensus that this section is useful, llvm-objcopy may be the right place to implement the update/verification features. If the performance is really critical (see my question above), we probably need to make llvm-objcopy's in-place update fast by not overwriting contents that have not changed.

> *Is computing memory-mapped sections strong enough to detect post-link modifications?*

In most cases, yes. A lot of people (including me) hold the opinion that non-SHF_ALLOC parts should not affect runtime execution. There are some counter-examples (runtime introspection), though:

1) The non-SHF_ALLOC .ARM.attributes (https://reviews.llvm.org/D69188) is used by Debian's patched glibc ld.so.
2) The .ctf developers intend .ctf to be non-strippable: https://sourceware.org/ml/binutils/2019-09/msg00209.html (see the thread in October; I even implemented objcopy --keep-section for them, but I may well lose the battle).
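The SHF_ALLOC distinction in this message can be checked mechanically. The following is a rough Python sketch, assuming a little-endian ELF64 file that still has its section header table, which lists the sections that are not marked SHF_ALLOC. The helper names are invented for illustration, and which of those sections a given strip tool actually removes is decided by that tool's own rules, not by this code.

```python
import struct

SHF_ALLOC = 0x2
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # ELF64 file header, little-endian
SHDR = struct.Struct("<IIQQQQIIQQ")        # ELF64 section header, little-endian


def section_flags(image: bytes) -> dict:
    """Map section name -> sh_flags; empty if the section headers were stripped."""
    if image[:4] != b"\x7fELF" or image[4] != 2 or image[5] != 1:
        raise ValueError("this sketch handles little-endian ELF64 only")
    fields = EHDR.unpack_from(image, 0)
    e_shoff, e_shentsize, e_shnum, e_shstrndx = fields[6], fields[11], fields[12], fields[13]
    if e_shnum == 0:
        return {}
    headers = [SHDR.unpack_from(image, e_shoff + i * e_shentsize) for i in range(e_shnum)]
    strtab_offset = headers[e_shstrndx][4]  # sh_offset of the section-name string table
    flags = {}
    for sh in headers[1:]:                  # index 0 is the reserved null section
        start = strtab_offset + sh[0]       # sh_name is an offset into the string table
        end = image.index(b"\0", start)
        flags[image[start:end].decode()] = sh[2]
    return flags


def unallocated_sections(image: bytes) -> list:
    """Names of sections without SHF_ALLOC, i.e. not part of the memory image."""
    return sorted(name for name, fl in section_flags(image).items() if not fl & SHF_ALLOC)
```

If .note.llvm.watermark shows up in the result of unallocated_sections, it belongs to the group of sections that the question about --strip-all versus --strip-all-gnu is concerned with.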
Chris Jackson via llvm-dev
2019-Nov-29 11:58 UTC
[llvm-dev] RFC: Loadable segments watermark for lld
The whole point of the watermark is to show that no post-link modifications have been made. If the watermark itself is added post-link, it does not achieve this aim: someone could, deliberately or accidentally, add a step before the watermarking happens.

In our case we always map .note.llvm.watermark to a PT_NOTE segment with a linker script. Thus, llvm-objcopy --strip-all does not discard the section. If --strip-all is used when the section is not mapped to a segment, --keep-section can be used to preserve the watermark section.
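Since the watermark is described here as living in a PT_NOTE segment, a verifier would need to walk the note entries to find it. Below is a minimal Python sketch of the generic ELF note layout (namesz, descsz, type, then name and descriptor, each padded to a 4-byte boundary, which is an assumption that holds for the common case). The owner name "LLVM" and type value 1 used in the usage lines are hypothetical placeholders. The thread does not state what the patch actually emits.

```python
import struct


def parse_notes(data: bytes) -> list:
    """Split the raw bytes of a note segment into (owner name, type, descriptor) tuples."""
    notes = []
    pos = 0
    while pos + 12 <= len(data):
        namesz, descsz, ntype = struct.unpack_from("<III", data, pos)
        pos += 12
        name = data[pos:pos + namesz].rstrip(b"\0").decode()
        pos += (namesz + 3) & ~3            # name is padded to a 4-byte boundary
        desc = data[pos:pos + descsz]
        pos += (descsz + 3) & ~3            # descriptor is padded the same way
        notes.append((name, ntype, desc))
    return notes


def make_note(name: str, ntype: int, desc: bytes) -> bytes:
    """Encode one note entry; used here only to produce sample data."""
    raw_name = name.encode() + b"\0"

    def pad(blob: bytes) -> bytes:
        return blob + b"\0" * (-len(blob) % 4)

    return struct.pack("<III", len(raw_name), len(desc), ntype) + pad(raw_name) + pad(desc)


# Sample data: a GNU build-id note (type 3) followed by a HYPOTHETICAL watermark note.
segment = make_note("GNU", 3, bytes(20)) + make_note("LLVM", 1, b"\xaa" * 8)
stored = [desc for name, ntype, desc in parse_notes(segment) if (name, ntype) == ("LLVM", 1)]
```

A check would then compare the stored descriptor against a freshly recomputed hash of the loadable segments and report a mismatch as a post-link modification.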
Chris Jackson via llvm-dev
2019-Dec-02 18:16 UTC
[llvm-dev] RFC: Loadable segments watermark for lld
There is discussion of this on Phabricator (https://reviews.llvm.org/D66426). There is no threat model, as this is not a security feature. We are trying to detect post-link modifications that result in a binary that relies on incidental details of the OS. Reliance on these details may impair future work on the platform. Also, if post-link modifications are detected, we may be able to identify functionality that is lacking in our platform.

On Fri, Nov 29, 2019 at 1:32 PM Jake Ehrlich <jakehehrlich at google.com> wrote:

> Could you clarify the threat model? Are we preventing bugs or malicious
> attackers?