Brian Gesiak via llvm-dev
2018-Mar-19 22:44 UTC
[llvm-dev] Suggestions for how coroutines and UBSan codegen can play nice with one another?
Hello all! (+cc Vedant Kumar, who I've been told knows a lot about UBSan!) I am trying to fix an assert that occurs when the transforms in llvm/lib/Transforms/Coroutines are applied to LLVM IR that has been generated with UBSan enabled -- specifically, '-fsanitize=null'. You can see an example of the assert in this 26-line C++ file here: godbolt.org/g/Gw9UZq Note that without the '-fsanitize=null' option this compiles fine, but when that option is used, Clang/LLVM crashes due to "error in backend: cannot move instruction since its users are not dominated by CoroBegin". The coroutine pass coro-split is responsible for transforming instructions within a coroutine function. Because a coroutine function can be suspended and then resumed much later in a program's execution, accesses to stack variables much be rewritten so that they access variables stored on the "coroutine frame," a heap-allocated piece of state that represents the execution frame of the coroutine. As part of this process, load instructions, such as the one generated for the '*this' expression in the source file above, are moved down past a call to the '@llvm.coro.begin' intrinsic. However, when UBSan is enabled, the load of '*this' is then immediately followed by a null check and, in the null case, a conditional branch to a call to '@ __ubsan_handle_type_mismatch_v1'. The coro-split pass is not written to move these compare and branch instructions, and instead asserts. You can see an example of the IR generated with and without '-fsanitize=null' here: gist.github.com/modocache/54a036c3bf9c06882fe85122e105d153 -- PR36578.ll lines 82 to 104 show the problematic UBSan branches. Note that PR36578.nosan.ll, which was compiled without '-fsanitize=null', does not include these branches. I have several questions: * What's the best way for the LLVM coroutines transform passes to play nicely with the code generated with UBSan's '-fsanitize=null'? * Are there other LLVM passes I could look at, to see examples of how they were modified to handle UBSan? * I've seen code in Clang's codebase that adds 'SanitizerKind::Null' to a set of skipped UBSan checks; should lib/CodeGen/CGCoroutine.cpp also do something like this, to avoid generating UBSan checks altogether? Would that be sufficient to avoid the error? Are there downsides to doing this? Any and all advice is appreciated -- thanks! - Brian Gesiak -------------- next part -------------- An HTML attachment was scrubbed... URL: <lists.llvm.org/pipermail/llvm-dev/attachments/20180319/68847def/attachment.html>
Vedant Kumar via llvm-dev
2018-Mar-19 23:01 UTC
[llvm-dev] Suggestions for how coroutines and UBSan codegen can play nice with one another?
> On Mar 19, 2018, at 3:44 PM, Brian Gesiak <modocache at gmail.com> wrote: > > Hello all! > (+cc Vedant Kumar, who I've been told knows a lot about UBSan!) > > I am trying to fix an assert that occurs when the transforms in llvm/lib/Transforms/Coroutines are applied to LLVM IR that has been generated with UBSan enabled -- specifically, '-fsanitize=null'. > > You can see an example of the assert in this 26-line C++ file here: godbolt.org/g/Gw9UZq <godbolt.org/g/Gw9UZq> > > Note that without the '-fsanitize=null' option this compiles fine, but when that option is used, Clang/LLVM crashes due to "error in backend: cannot move instruction since its users are not dominated by CoroBegin". > > The coroutine pass coro-split is responsible for transforming instructions within a coroutine function. Because a coroutine function can be suspended and then resumed much later in a program's execution, accesses to stack variables much be rewritten so that they access variables stored on the "coroutine frame," a heap-allocated piece of state that represents the execution frame of the coroutine. As part of this process, load instructions, such as the one generated for the '*this' expression in the source file above, are moved down past a call to the '@llvm.coro.begin' intrinsic. > > However, when UBSan is enabled, the load of '*this' is then immediately followed by a null check and, in the null case, a conditional branch to a call to '@ __ubsan_handle_type_mismatch_v1'. The coro-split pass is not written to move these compare and branch instructions, and instead asserts.It looks like there was a FIXME about this issue introduced circa r280678.> You can see an example of the IR generated with and without '-fsanitize=null' here: gist.github.com/modocache/54a036c3bf9c06882fe85122e105d153 <gist.github.com/modocache/54a036c3bf9c06882fe85122e105d153> -- PR36578.ll lines 82 to 104 show the problematic UBSan branches. Note that PR36578.nosan.ll, which was compiled without '-fsanitize=null', does not include these branches. > > I have several questions: > > * What's the best way for the LLVM coroutines transform passes to play nicely with the code generated with UBSan's '-fsanitize=null'?+ Gor, as he might be the best person to answer this.> * Are there other LLVM passes I could look at, to see examples of how they were modified to handle UBSan?No, no other llvm passes needed to be modified to handle ubsan instrumentation.> * I've seen code in Clang's codebase that adds 'SanitizerKind::Null' to a set of skipped UBSan checks; should lib/CodeGen/CGCoroutine.cpp also do something like this, to avoid generating UBSan checks altogether? Would that be sufficient to avoid the error?I don't think it would be sufficient. You might encounter the same issue with -fsanitize=alignment,object-size,vptr and possibly -fsanitize=address. Instead of blacklisting known bad sanitizers, it might be better to whitelist known good ones. Over time you could expand that set, adding tests as needed.> Are there downsides to doing this?I think a whitelisting approach is a reasonable way to prevent crashes. It at least allows you to enable sanitizers for coroutines later on. The alternative is to force users to partially disable sanitization (either with source annotations or by dropping the build configuration entirely). If that happens, it's much less likely that they'd re-enable it in the future. vedant> > Any and all advice is appreciated -- thanks! > > - Brian Gesiak >-------------- next part -------------- An HTML attachment was scrubbed... URL: <lists.llvm.org/pipermail/llvm-dev/attachments/20180319/8a9ea5a3/attachment.html>
Brian Gesiak via llvm-dev
2018-Mar-19 23:17 UTC
[llvm-dev] Suggestions for how coroutines and UBSan codegen can play nice with one another?
Thanks for the quick reply, Vedant! On Mon, Mar 19, 2018 at 7:01 PM, Vedant Kumar <vsk at apple.com> wrote:> > On Mar 19, 2018, at 3:44 PM, Brian Gesiak <modocache at gmail.com> wrote: > > Hello all! > (+cc Vedant Kumar, who I've been told knows a lot about UBSan!) > > I am trying to fix an assert that occurs when the transforms in > llvm/lib/Transforms/Coroutines are applied to LLVM IR that has been > generated with UBSan enabled -- specifically, '-fsanitize=null'. > > You can see an example of the assert in this 26-line C++ file here: > godbolt.org/g/Gw9UZq > > Note that without the '-fsanitize=null' option this compiles fine, but > when that option is used, Clang/LLVM crashes due to "error in backend: > cannot move instruction since its users are not dominated by CoroBegin". > > The coroutine pass coro-split is responsible for transforming instructions > within a coroutine function. Because a coroutine function can be suspended > and then resumed much later in a program's execution, accesses to stack > variables much be rewritten so that they access variables stored on the > "coroutine frame," a heap-allocated piece of state that represents the > execution frame of the coroutine. As part of this process, load > instructions, such as the one generated for the '*this' expression in the > source file above, are moved down past a call to the '@llvm.coro.begin' > intrinsic. > > However, when UBSan is enabled, the load of '*this' is then immediately > followed by a null check and, in the null case, a conditional branch to a > call to '@ __ubsan_handle_type_mismatch_v1'. The coro-split pass is not > written to move these compare and branch instructions, and instead asserts. > > > It looks like there was a FIXME about this issue introduced circa r280678. >Yup! Perhaps the time has come, as the comment suggests, to "make this more robust" :)> You can see an example of the IR generated with and without > '-fsanitize=null' here: gist.github.com/modocache > 54a036c3bf9c06882fe85122e105d153 -- PR36578.ll lines 82 to 104 show the > problematic UBSan branches. Note that PR36578.nosan.ll, which was compiled > without '-fsanitize=null', does not include these branches. > > I have several questions: > > * What's the best way for the LLVM coroutines transform passes to play > nicely with the code generated with UBSan's '-fsanitize=null'? > > > + Gor, as he might be the best person to answer this. > > > * Are there other LLVM passes I could look at, to see examples of how they > were modified to handle UBSan? > > > No, no other llvm passes needed to be modified to handle ubsan > instrumentation. > > > * I've seen code in Clang's codebase that adds 'SanitizerKind::Null' to a > set of skipped UBSan checks; should lib/CodeGen/CGCoroutine.cpp also do > something like this, to avoid generating UBSan checks altogether? Would > that be sufficient to avoid the error? > > > I don't think it would be sufficient. You might encounter the same issue > with -fsanitize=alignment,object-size,vptr and possibly > -fsanitize=address. Instead of blacklisting known bad sanitizers, it might > be better to whitelist known good ones. Over time you could expand that > set, adding tests as needed. > > > Are there downsides to doing this? > > > I think a whitelisting approach is a reasonable way to prevent crashes. It > at least allows you to enable sanitizers for coroutines later on. The > alternative is to force users to partially disable sanitization (either > with source annotations or by dropping the build configuration entirely). > If that happens, it's much less likely that they'd re-enable it in the > future. >Excellent, thanks for these suggestions! I think I'll try the whitelisting approach you suggested. I'm guessing I can accomplish this by using the 'clang::SanitizerBlacklist' class or 'clang::SanitizerSet' struct somehow...? 'clang::SanitizerSet::clear', in particular, looks promising. I'll look into it and see what I can do. Thanks! - Brian> > vedant > > > Any and all advice is appreciated -- thanks! > > - Brian Gesiak > > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <lists.llvm.org/pipermail/llvm-dev/attachments/20180319/1a31d91a/attachment.html>