Hi,

Given a binary and a corresponding input, I want to know which IR-level basic blocks are covered. I need the detailed information, that is, the set of all covered BBs rather than just a count.

Are there any tools that support this requirement? If not, I think instrumentation might help, but I do not know much about it. Any suggestions or ideas are welcome. Thank you so much.

Regards,
Muhui
This is a classic problem: how to derive complete coverage of all potential execution paths. The path covered depends not only on the input; it may also depend on external factors such as the OS environment and external resources like the network, storage, and memory. Complete coverage is normally computationally infeasible, but if you let a fuzzing library such as libFuzzer do the work of finding random inputs and environments, then a "most probabilistically likely coverage" may be constructed.

On Mon, Sep 3, 2018 at 1:17 PM Muhui Jiang via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> My goal is that given a binary and the corresponding input. I want to know what IR level basic blocks are covered. I need the detail information, which is the set of all the covered BBs rather than just a number.

--
Regards,
Peter Teoh
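For readers who have not used libFuzzer, here is a minimal sketch of the kind of fuzz target the reply above alludes to. The entry point LLVMFuzzerTestOneInput is the documented libFuzzer interface; starts_with_magic is a hypothetical function under test, invented purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical function under test: reports whether the buffer
   begins with the four bytes "LLVM". */
int starts_with_magic(const uint8_t *data, size_t size) {
  assert(data != NULL || size == 0);
  return size >= 4 && data[0] == 'L' && data[1] == 'L' &&
         data[2] == 'V' && data[3] == 'M';
}

/* libFuzzer calls this entry point repeatedly with generated inputs
   and uses coverage feedback to steer towards unexplored code. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  (void)starts_with_magic(data, size);
  return 0; /* libFuzzer expects 0 from the target */
}
```

Assuming the file is saved as fuzz_target.c (a hypothetical name), it would typically be built with something like: clang -g -fsanitize=fuzzer,address fuzz_target.c -o fuzz_target. Note that this explores many inputs; it does not by itself report the set of blocks covered by one particular input.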
Hi Peter,

I think you may have misunderstood my question. I have a binary and one input, and I run the binary with this input. Can I find out which IR basic blocks are covered? I want to know not only the count but also the BBs' labels/IDs. Many thanks.

Regards,
Muhui

On Mon, Sep 3, 2018 at 6:06 PM Peter Teoh <htmldeveloper at gmail.com> wrote:
> this is a classic problem: how to derive a complete coverage of all potential paths of execution.
Hi Peter,

Yeah, I know that BB IDs are virtual addresses. One method I can think of is to use the debugging information so that I can distinguish the BBs and map each of them to its IR BB.

Maybe I need to do instrumentation? Or could I just print out some debugging information so that I know which BBs are visited? Do you think that would work? I just want to distinguish the BBs. Maybe I can tailor some existing tool, but I don't know where to start. Many thanks.

Regards,
Muhui

On Mon, Sep 3, 2018 at 10:15 PM Peter Teoh <htmldeveloper at gmail.com> wrote:
> I did not say anything about "basic block" because that is what we are talking about. In fact, in LLVM tracing, or any runtime tracing, we only get back a "basic block" identifier, which is just the address indicating the start of the BB. So BB IDs are also virtual addresses.
>
> On Mon, Sep 3, 2018 at 9:41 PM Muhui Jiang <jiangmuhui at gmail.com> wrote:
>> I think you may misunderstand my question. I mean I have a binary and one input. I run the binary with this input. Can I know what IR basic blocks are covered.
On Mon, Sep 3, 2018 at 10:32 PM Muhui Jiang <jiangmuhui at gmail.com> wrote:
> Maybe I need to do instrumentation? Or I just print out some debugging information so that I know what BBs are visited. Do you think it works? I just want to distinguish the BBs. Maybe I can tailor some existing tools. I don't know where to start. Many thanks

Yes, instrumentation is needed. For example, you can use clang to compile and instrument the program for BB-level tracing (-fsanitize-coverage=trace-bb) or for function- or edge-level tracing (-fsanitize-coverage=[func,edge]). See this for more details:
https://bcain-llvm.readthedocs.io/projects/clang/en/release_39/SanitizerCoverage/

--
Regards,
Peter Teoh
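A minimal sketch of the instrumentation route discussed above, not the exact setup from the thread. It assumes the -fsanitize-coverage=trace-pc-guard mode documented for newer Clang releases rather than the trace-bb mode from the Clang 3.9 documentation linked above. The two __sanitizer_cov_trace_pc_guard* callbacks are the documented interface; MAX_GUARDS, is_covered, covered_count, and dump_covered are hypothetical names introduced for illustration. The guard IDs identify instrumentation points in the compiled code, so mapping them back to named IR basic blocks (for example via debug info) is a separate step that this sketch does not attempt.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_GUARDS 65536 /* hypothetical cap chosen for this sketch */

static unsigned char hit[MAX_GUARDS]; /* hit[id] != 0 once guard id ran */
static void *pc_of[MAX_GUARDS];       /* a code address seen for each id */
static uint32_t num_guards;           /* number of IDs handed out so far */

/* Returns 1 if the instrumentation point with this ID was executed. */
int is_covered(uint32_t id) { return id < MAX_GUARDS && hit[id]; }

/* Counts the distinct instrumentation points executed so far. */
uint32_t covered_count(void) {
  uint32_t n = 0;
  for (uint32_t id = 1; id <= num_guards && id < MAX_GUARDS; id++)
    n += hit[id];
  return n;
}

/* Prints the set of covered IDs, not just how many there are. */
void dump_covered(FILE *out) {
  for (uint32_t id = 1; id <= num_guards && id < MAX_GUARDS; id++)
    if (hit[id]) fprintf(out, "guard %u pc %p\n", (unsigned)id, pc_of[id]);
}

static void dump_at_exit(void) { dump_covered(stderr); }

/* Called by the instrumented code once per module: give every guard
   a unique non-zero ID. */
void __sanitizer_cov_trace_pc_guard_init(uint32_t *start, uint32_t *stop) {
  static int registered;
  assert(start <= stop);
  if (start == stop || *start) return; /* empty or already initialised */
  for (uint32_t *g = start; g < stop; g++) *g = ++num_guards;
  if (!registered) { registered = 1; atexit(dump_at_exit); }
}

/* Called by the instrumented code at each instrumentation point. */
void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
  uint32_t id = *guard;
  if (id == 0 || id >= MAX_GUARDS) return;
  if (!hit[id]) {
    hit[id] = 1;
    pc_of[id] = __builtin_return_address(0); /* address in the caller */
  }
}
```

Assuming the file is saved as cov_callbacks.c and the program under test is target.c (both names hypothetical), something like the following should build it: clang -g -fsanitize-coverage=trace-pc-guard -c target.c, followed by clang cov_callbacks.c target.o -o target. The callback file is compiled without the coverage flag so that the callbacks themselves are not instrumented. Running ./target with the input of interest then prints the covered guard IDs and their addresses to stderr at exit.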