Hi, I think I have a fairly nicely integrated Libfuzzer-based fuzzer
in Postgres now. I can run things like:

SELECT fuzz(100000,'select regexp_matches(''foo/bar/baz'',$1,''g'')')

Which makes it convenient to fuzz arbitrary public functions available
in SQL. (I haven't figured out what interface to make for fuzzing
internal functions which take char buffers that can have nuls. The SQL
interface will only be able to handle valid utf8 encoded strings which
contain no nuls.)

I have some feedback on things that are a bit awkward or that I miss
from AFL. Some of this may actually be there but I'm just not using it
right?

1) One minor thing: it's a bit of a pain to construct the argv when
you're not invoking it on the command line. Not a big deal but it
would be nice to bypass that and just allow the caller to set the
variables directly. Some of the parameters are not entirely clear
either -- I'm not clear what the distinction is between -runs and
-iterations and I'm not clear whether the timeout is for the whole run
or individual tests (it's not doing anything in my case which is
probably due to Postgres having its own ALRM handler).

2) I've caught a bug which causes enormous stack growth. AFL enforces
strict memory and time limits on the tests which is awfully
convenient. I can implement those myself in my fuzzer function (and in
fact in the Postgres context it would be best if I did) but some
simple AFL-style protection would be appreciated; as it is, it takes a
*looong* time to fail and doesn't leave behind the crashing test. It
would be nice if Libfuzzer took a page out of the sanitizer code's
tricks and kept an mmapped file where it wrote the current test being
run. If the file is never synced then it shouldn't cause any syscalls
or I/O until the program crashes and the file descriptor is closed.

My thinking is I need to set an RLIMIT_STACK setting and then install
a SEGV handler which will longjmp back to the top level and return to
the fuzzer. That will be risky since it's in theory impossible to
restore any state the SEGV caused but in practice if it's always
caused by a stack overflow it might be safe. I would also like to have
an ALRM handler but that requires calling alarm() on every call and
I'm not sure if the setitimer in Libfuzzer can be disabled or if it'll
interfere with that. Maybe there's a better approach: I could call
setitimer and if I see more than n ALRMs during the execution decide
it's a fault. Again it would be nice if Libfuzzer provided that
itself.

3) When it writes the minimal test corpus it seems to keep older tests
around too. I guess the intent is to pass two directories, one which
starts empty and is intended to receive the results and one which is
maintained as the working tree? I'm not sure how to use this mode.

4) The actual fuzzing seems to be less effective than AFL at finding
good cases. In particular I've found I have to use only_ascii mode or
else it spends all the time looking at encoding errors on random
binary inputs. Even in only_ascii mode it seems insistent on putting a
^L in a *lot* of tests even when the function being tested always ends
with the same error if one is present.

I'm hoping to try DFA mode and hoping it will help with this but all
the "experimental" warnings in the docs scare me. Is it just that
there's room for improvement or is there any downside to running in
that mode?

Another thing I'm not clear on, whether it's not implemented yet or
there's just no feedback yet, is the test for variable coverage. AFL
runs the same test repeatedly to test whether the coverage is
repeatable, which can be important for knowing whether your testing is
actually well implemented or whether you're failing to clean up state
sufficiently between runs.

5) I'm currently running 1M iterations per call then calling it again
(in a new process). It would be convenient if I could call it again in
the same process and in fact it would be most convenient if I could
make my code call the fuzzer repeatedly for, say, 1k invocations. I
could check for C-c once every 1k calls and do any other cleanup,
checking for memory leaks, etc at that time.

It would also be nice to be able to ask for the minimal corpus back in
memory along with meta information like coverage, runtime, etc so I
could, say, store them in the database :)

6) The crashing and slow tests are written to the current directory.
It would be nice to be able to provide a directory for them to go
into. Also, it would be nice to provide a callback or some other way
to override this. I could generate the whole SQL reproduction instead
of just having the binary data to pass and have to remember what
function I was testing.

In general the feedback is a bit unclear. It seems to print binary
strings in several different escape styles, sometimes using \x (though
it's not clear how many hex digits follow), sometimes using 0x and
sometimes using base64:

#755 NEW cov: 14667 bits: 476 units: 6 exec/s: 20 L: 4 \xa\x5\xcb*
0xa,0x5,0xcb,0x2a,

Test unit written to crash-b0f4bc53c8f72fd53ef0a6c1f46115bd7bd8fe50
Base64:
IZqM9rA71To7KDonOlb8pCEoJ3Mn2sAnO1I3XwYoITtxO0exSjwo7u4nKZ8hnilHeQo6GDshTI4pKipWLa8KXg=

Of these only the base64 is convenient for writing reproductions
(though a callback would be most convenient) but it's not so
convenient for watching the progress. And for many lines it seems to
print no test data which is definitely not helpful for watching
progress:

#2028804 NEW cov: 15330 bits: 6511 units: 127 exec/s: 5880 L: 39
#2045447 NEW cov: 15330 bits: 6512 units: 128 exec/s: 5877 L: 47

Also, all this feedback is currently going into the server log. I
would like to capture it and report it to the client. I'm currently
basically just doing my own progress feedback this way but it's
missing the information about coverage and number of units found. It
can only show the number of tests done and things like memory usage
etc.

7) If I open up the corpus files in emacs and accidentally hit any key
then emacs saves an autosave file but then deletes it when I undo the
accidental edit -- which causes Libfuzzer to pretty much immediately
crash with:

Can not stat: /var/tmp/corpus/.#16813d894b330e26fdf4520793501dfffc830eb9; exiting

I would suggest ignoring auto-save and backup files (.#* and *~) but
in any case this doesn't seem like it should be a fatal error. Just
warn about the disappearing file and move on to the next one.

-- 
greg
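
The filter suggested in point 7 could look roughly like the C sketch below. It is only an illustration of the two patterns named in the message (.#* and *~); the function name is_editor_temp_file is hypothetical and nothing here is taken from the Libfuzzer sources. A corpus loader could call it on each directory entry and skip the entry when it returns true.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: returns true when the last path component
 * matches one of the two patterns from point 7:
 *   .#*   (what emacs left behind in the report above)
 *   *~    (editor backup files)
 */
bool is_editor_temp_file(const char *path)
{
    assert(path != NULL);

    /* Look only at the file name, not at the directory part. */
    const char *slash = strrchr(path, '/');
    const char *name = slash ? slash + 1 : path;
    size_t len = strlen(name);

    if (len >= 2 && name[0] == '.' && name[1] == '#')
        return true;
    if (len >= 1 && name[len - 1] == '~')
        return true;
    return false;
}
```

Skipping such names only covers half of the request: a file that disappears between listing the directory and opening it would still need to be treated as a warning rather than a fatal error.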
On Sat, Sep 5, 2015 at 1:50 PM, Greg Stark <stark at mit.edu> wrote:
> 2) I've caught a bug which causes enormous stack growth. AFL enforces
> strict memory and time limits on the tests which is awfully
> convenient. I can implement those myself in my fuzzer function (and in
> fact in the Postgres context it would be best if I did) but some
> simple AFL-style protection would be appreciated; as it is, it takes a
> *looong* time to fail and doesn't leave behind the crashing test. It
> would be nice if Libfuzzer took a page out of the sanitizer code's
> tricks and kept an mmapped file where it wrote the current test being
> run. If the file is never synced then it shouldn't cause any syscalls
> or I/O until the program crashes and the file descriptor is closed.
>
> My thinking is I need to set an RLIMIT_STACK setting and then install
> a SEGV handler which will longjmp back to the top level and return to
> the fuzzer. That will be risky since it's in theory impossible to
> restore any state the SEGV caused but in practice if it's always
> caused by a stack overflow it might be safe. I would also like to have
> an ALRM handler but that requires calling alarm() on every call and
> I'm not sure if the setitimer in Libfuzzer can be disabled or if it'll
> interfere with that. Maybe there's a better approach: I could call
> setitimer and if I see more than n ALRMs during the execution decide
> it's a fault. Again it would be nice if Libfuzzer provided that
> itself.

It occurs to me that this is silly as Libfuzzer is currently not
capable of continuing once it finds one crash. There's no way for my
FuzzOne() call to report that the test "failed" but that it's still
prepared to continue fuzzing more inputs. This seems like the
fundamental problem I'm missing.

Also, one more thing: currently Libfuzzer does not catch SIGABRT and
treat it as a fatal event. I've added a SIGABRT handler to my own code
and moved StaticDeathCallback to public so I can call it from there.

-- 
greg
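
For readers following the RLIMIT_STACK plus SEGV-handler idea quoted above, here is a minimal C sketch of that approach using only POSIX calls (setrlimit, sigaltstack, sigaction, sigsetjmp/siglongjmp). All function names (limit_stack, install_crash_guard, run_guarded, demo_ok_target, demo_segv_target) are hypothetical and not part of Libfuzzer or Postgres. It assumes a single-threaded process and, as the message already warns, makes no attempt to repair whatever state the interrupted test left behind. It also assumes no sanitizer runtime is competing for the same signals.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/resource.h>

static sigjmp_buf guard_env;
static volatile sig_atomic_t caught_signo;

/* Runs on the alternate stack, so it still works after a stack overflow. */
static void guard_handler(int signo)
{
    caught_signo = signo;
    siglongjmp(guard_env, 1);
}

/* Lower the soft stack limit so runaway recursion faults early. */
int limit_stack(rlim_t bytes)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    if (rl.rlim_max != RLIM_INFINITY && bytes > rl.rlim_max)
        bytes = rl.rlim_max;
    rl.rlim_cur = bytes;
    return setrlimit(RLIMIT_STACK, &rl);
}

/* Install SIGSEGV and SIGABRT handlers that run on a separate stack. */
int install_crash_guard(void)
{
    /* Fixed size: SIGSTKSZ may not be a compile-time constant. */
    static char alt_stack[64 * 1024];
    stack_t ss;
    struct sigaction sa;

    ss.ss_sp = alt_stack;
    ss.ss_size = sizeof alt_stack;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) != 0)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = guard_handler;
    sa.sa_flags = SA_ONSTACK;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;
    return sigaction(SIGABRT, &sa, NULL);
}

/* Returns 0 if fn returned normally, else the signal that interrupted it. */
int run_guarded(void (*fn)(const unsigned char *, size_t),
                const unsigned char *data, size_t len)
{
    assert(fn != NULL);
    if (sigsetjmp(guard_env, 1) != 0)
        return caught_signo;   /* arrived here via siglongjmp */
    fn(data, len);
    return 0;
}

/* Demo targets, for illustration only. */
void demo_ok_target(const unsigned char *data, size_t len)
{
    (void)data;
    (void)len;
}

void demo_segv_target(const unsigned char *data, size_t len)
{
    (void)data;
    (void)len;
    raise(SIGSEGV);
}
```

A caller would run install_crash_guard() once, then wrap each test in run_guarded() and save the input itself whenever the result is nonzero. That addresses the "doesn't leave behind the crashing test" complaint, but not the deeper problem raised above: the fuzzer still has no way to be told that the input failed and that fuzzing should continue.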
Kostya Serebryany via llvm-dev
2015-Sep-05 17:38 UTC
[llvm-dev] Some feedback on Libfuzzer
Greg,

This is lots of useful feedback!
I'll reply to individual bullets when time permits (mostly after the
holidays).

If you find a bug in Postgres with libFuzzer, please let us know so
that we can add it to
http://llvm.org/docs/LibFuzzer.html#trophies

On Sat, Sep 5, 2015 at 8:40 AM, Greg Stark via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> On Sat, Sep 5, 2015 at 1:50 PM, Greg Stark <stark at mit.edu> wrote:
> > 2) I've caught a bug which causes enormous stack growth. AFL enforces
> > strict memory and time limits on the tests which is awfully
> > convenient. I can implement those myself in my fuzzer function (and in
> > fact in the Postgres context it would be best if I did) but some
> > simple AFL-style protection would be appreciated; as it is, it takes a
> > *looong* time to fail and doesn't leave behind the crashing test. It
> > would be nice if Libfuzzer took a page out of the sanitizer code's
> > tricks and kept an mmapped file where it wrote the current test being
> > run. If the file is never synced then it shouldn't cause any syscalls
> > or I/O until the program crashes and the file descriptor is closed.
> >
> > My thinking is I need to set an RLIMIT_STACK setting and then install
> > a SEGV handler which will longjmp back to the top level and return to
> > the fuzzer. That will be risky since it's in theory impossible to
> > restore any state the SEGV caused but in practice if it's always
> > caused by a stack overflow it might be safe. I would also like to have
> > an ALRM handler but that requires calling alarm() on every call and
> > I'm not sure if the setitimer in Libfuzzer can be disabled or if it'll
> > interfere with that. Maybe there's a better approach: I could call
> > setitimer and if I see more than n ALRMs during the execution decide
> > it's a fault. Again it would be nice if Libfuzzer provided that
> > itself.
>
> It occurs to me that this is silly as Libfuzzer is currently not
> capable of continuing once it finds one crash. There's no way for my
> FuzzOne() call to report that the test "failed" but that it's still
> prepared to continue fuzzing more inputs. This seems like the
> fundamental problem I'm missing.

This is more like a limitation of asan, not libFuzzer.
By design, asan does not recover from the first crash.
This feature has been criticized quite a lot, but I am still convinced
this is a feature, not a bug.
IMHO, recovery mode will be misused/abused too often to be useful;
besides, it adds complexity to the code.
(There is a patch under review right now to implement recovery mode
for asan, but I am not sure if or when this patch will be committed.)

> Also, one more thing: currently Libfuzzer does not catch SIGABRT and
> treat it as a fatal event. I've added a SIGABRT handler to my own code
> and moved StaticDeathCallback to public so I can call it from there.

Again, this is asan, not libFuzzer.
You need ASAN_OPTIONS=handle_abort=1
I hope to make it the default soon-ish.
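
When setting an environment variable is awkward, for instance in a server that starts its own backend processes, the same option can be compiled into the binary. The sketch below relies on the __asan_default_options hook that the AddressSanitizer runtime looks for. It is an assumption that this suits the Postgres build, and the ASAN_OPTIONS environment variable should still take precedence when it is set.

```c
/*
 * If the program defines this function, the AddressSanitizer runtime
 * calls it at startup to obtain default options.
 * "handle_abort=1" is the flag named in the reply above.
 */
const char *__asan_default_options(void)
{
    return "handle_abort=1";
}
```

Without ASan linked in, the function is simply never called, so it is harmless to leave in a build that is sometimes compiled without sanitizers.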
Kostya Serebryany via llvm-dev
2015-Sep-08 18:07 UTC
[llvm-dev] Some feedback on Libfuzzer
More replies below.
If you feel some of your questions were left unanswered, please ping
or file a bug.

On Sat, Sep 5, 2015 at 5:50 AM, Greg Stark via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> Hi, I think I have a fairly nicely integrated Libfuzzer-based fuzzer
> in Postgres now. I can run things like:
>
> SELECT fuzz(100000,'select regexp_matches(''foo/bar/baz'',$1,''g'')')
>
> Which makes it convenient to fuzz arbitrary public functions available
> in SQL. (I haven't figured out what interface to make for fuzzing
> internal functions which take char buffers that can have nuls. The SQL
> interface will only be able to handle valid utf8 encoded strings which
> contain no nuls.)
>
> I have some feedback on things that are a bit awkward or that I miss
> from AFL. Some of this may actually be there but I'm just not using it
> right?
>
> 1) One minor thing: it's a bit of a pain to construct the argv when
> you're not invoking it on the command line.

So, you want the code Fuzzer::FuzzingOptions (FuzzerInternal.h) to be
accessible to a user?
I thought about it and may do it one day. File a bug if you want to
track it.
(Not my first priority though.)

> Not a big deal but it
> would be nice to bypass that and just allow the caller to set the
> variables directly. Some of the parameters are not entirely clear
> either -- I'm not clear what the distinction is between -runs and
> -iterations

-iterations is an artifact from the past. Removed in 247030.

> and I'm not clear whether the timeout is for the whole run
> or individual tests

Individual tests.

> (it's not doing anything in my case which is
> probably due to Postgres having its own ALRM handler).

Yea, probably.

> 2) I've caught a bug which causes enormous stack growth. AFL enforces
> strict memory and time limits on the tests which is awfully
> convenient. I can implement those myself in my fuzzer function (and in
> fact in the Postgres context it would be best if I did) but some
> simple AFL-style protection would be appreciated; as it is, it takes a
> *looong* time to fail and doesn't leave behind the crashing test.

That's strange. Most likely this is the same problem as above:
Postgres redefines the ALRM handler?
libFuzzer should be able to detect a long running test and report it.

> It
> would be nice if Libfuzzer took a page out of the sanitizer code's
> tricks and kept an mmapped file where it wrote the current test being
> run. If the file is never synced then it shouldn't cause any syscalls
> or I/O until the program crashes and the file descriptor is closed.

Let's resolve the above problem first; maybe this will not be needed.

> My thinking is I need to set an RLIMIT_STACK setting and then install
> a SEGV handler which will longjmp back to the top level and return to
> the fuzzer. That will be risky since it's in theory impossible to
> restore any state the SEGV caused but in practice if it's always
> caused by a stack overflow it might be safe. I would also like to have
> an ALRM handler but that requires calling alarm() on every call and
> I'm not sure if the setitimer in Libfuzzer can be disabled or if it'll
> interfere with that. Maybe there's a better approach: I could call
> setitimer and if I see more than n ALRMs during the execution decide
> it's a fault. Again it would be nice if Libfuzzer provided that
> itself.

I am confused. Libfuzzer does set an alarm.

> 3) When it writes the minimal test corpus it seems to keep older tests
> around too. I guess the intent is to pass two directories,

Correct. If you want to minimize the corpus do it like this:

./fuzzer NEW_EMPTY_DIR OLD_CORPUS

The docs were a bit vague; I've tried to improve them in 247033.
I rarely use this option myself because libFuzzer does corpus
minimization at startup.
It may still be useful if you want e.g. to commit the corpus to a test
repository or to share the corpus with other fuzzers.

> one which
> starts empty and is intended to receive the results and one which is
> maintained as the working tree? I'm not sure how to use this mode.
>
> 4) The actual fuzzing seems to be less effective than AFL at finding
> good cases.

That's not entirely unexpected; AFL is extremely algorithmically
advanced. We are trying to catch up :)
Note that I've just added support for AFL-style dictionaries, which
may help in your case.
http://llvm.org/docs/LibFuzzer.html#dictionaries

> In particular I've found I have to use only_ascii mode or
> else it spends all the time looking at encoding errors on random
> binary inputs. Even in only_ascii mode it seems insistent on putting a
> ^L in a *lot* of tests even when the function being tested always ends
> with the same error if one is present.

Hmm.. I simply rely on isspace/isprint.
I may of course change it to not emit ^L in ascii mode, but another
way for you is to replace ^L with a space in your target function.

> I'm hoping to try DFA mode and hoping it will help with this but all
> the "experimental" warnings in the docs scare me. Is it just that
> there's room for improvement or is there any downside to running in
> that mode?

You mean the data flow feedback mode enabled by -use_traces=1
(and -fsanitize-coverage=trace-cmp)?
This is really a prototypish thing so far.
I've seen several cases in the wild where it breaks through a wall
(where the regular mode does not find new coverage for days), but it's
nowhere near complete. By all means, try it as one of the strategies,
but don't rely on it alone.

> Another thing I'm not clear on, whether it's not implemented yet or
> there's just no feedback yet, is the test for variable coverage. AFL
> runs the same test repeatedly to test whether the coverage is
> repeatable, which can be important for knowing whether your testing is
> actually well implemented or whether you're failing to clean up state
> sufficiently between runs.

Hmm.. Interesting. I don't think we have anything to check if the
target function produces stable coverage.
You probably can run the fuzzer 100 times with -runs=0 -seed=1 and see
if it produces the same INITED coverage. Is that what you need?

> 5) I'm currently running 1M iterations per call then calling it again
> (in a new process). It would be convenient if I could call it again in
> the same process and in fact it would be most convenient if I could
> make my code call the fuzzer repeatedly for, say, 1k invocations. I
> could check for C-c once every 1k calls and do any other cleanup,
> checking for memory leaks, etc at that time.

I'll need more explanations here.

> It would also be nice to be able to ask for the minimal corpus back in
> memory along with meta information like coverage, runtime, etc so I
> could, say, store them in the database :)

All this is doable, just not my priority for now.
If you come up with a simple patch, you are more than welcome.
(For large/complex patches now is not the best time though.)

> 6) The crashing and slow tests are written to the current directory.
> It would be nice to be able to provide a directory for them to go
> into.

In my todo; file a separate bug if you want to track it.

> Also, it would be nice to provide a callback or some other way
> to override this. I could generate the whole SQL reproduction instead
> of just having the binary data to pass and have to remember what
> function I was testing.

A bit more involved, but doable, of course.
You can probably also do it on your side and let us know how it works.

> In general the feedback is a bit unclear. It seems to print binary
> strings in several different escape styles, sometimes using \x (though
> it's not clear how many hex digits follow), sometimes using 0x and
> sometimes using base64:
>
> #755 NEW cov: 14667 bits: 476 units: 6 exec/s: 20 L: 4 \xa\x5\xcb*
> 0xa,0x5,0xcb,0x2a,
>
> Test unit written to crash-b0f4bc53c8f72fd53ef0a6c1f46115bd7bd8fe50
> Base64:
> IZqM9rA71To7KDonOlb8pCEoJ3Mn2sAnO1I3XwYoITtxO0exSjwo7u4nKZ8hnilHeQo6GDshTI4pKipWLa8KXg=
>
> Of these only the base64 is convenient for writing reproductions
> (though a callback would be most convenient) but it's not so
> convenient for watching the progress. And for many lines it seems to
> print no test data which is definitely not helpful for watching
> progress:
>
> #2028804 NEW cov: 15330 bits: 6511 units: 127 exec/s: 5880 L: 39
> #2045447 NEW cov: 15330 bits: 6512 units: 128 exec/s: 5877 L: 47

That's a trade-off between more output and less output.
You can watch the corpus itself if you want to see all cases.
Output like "0xa,0x5,0xcb,0x2a," is useful if you want to paste this
data back into a C program as a char array.
base64 is the simplest way to have a file repro.
Escaped text is printed only for very small units, just FYI.
Making things nicer is in my TODO, but someone will always dislike the
style and I really don't want to spend time (at this point) making the
output more customizable.

> Also, all this feedback is currently going into the server log. I
> would like to capture it and report it to the client. I'm currently
> basically just doing my own progress feedback this way but it's
> missing the information about coverage and number of units found. It
> can only show the number of tests done and things like memory usage
> etc.
>
> 7) If I open up the corpus files in emacs and accidentally hit any key
> then emacs saves an autosave file but then deletes it when I undo the
> accidental edit -- which causes Libfuzzer to pretty much immediately
> crash with:
>
> Can not stat: /var/tmp/corpus/.#16813d894b330e26fdf4520793501dfffc830eb9; exiting

I've seen a similar problem too (not with emacs).
I'll try to fix it. Again, file a bug if you want to track this.

> I would suggest ignoring auto-save and backup files (.#* and *~) but
> in any case this doesn't seem like it should be a fatal error. Just
> warn about the disappearing file and move on to the next one.
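
The workaround suggested under point 4, replacing ^L with a space inside the target function, could be sketched in C as follows. The helper name prepare_sql_input and its buffer-based signature are hypothetical; the sketch also assumes the constraint from the original message that the SQL interface cannot accept nul bytes, so such inputs are rejected before they reach the query.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical pre-processing step for a fuzz target.
 * Copies `size` bytes from `data` into `out` as a NUL-terminated
 * string, mapping form feed (^L, 0x0C) to a space.
 * Returns 0 on success, or -1 if the input contains a NUL byte or
 * does not fit into `out` (including the terminator).
 */
int prepare_sql_input(const uint8_t *data, size_t size,
                      char *out, size_t out_cap)
{
    assert(out != NULL && (data != NULL || size == 0));

    if (size >= out_cap)
        return -1;
    for (size_t i = 0; i < size; i++) {
        if (data[i] == '\0')
            return -1;
        out[i] = (data[i] == '\f') ? ' ' : (char)data[i];
    }
    out[size] = '\0';
    return 0;
}
```

The target function would call this first and return early on -1, so that the fuzzer stops spending time on inputs which all end in the same error. Whether mapping ^L to a space hides interesting behaviour in a particular SQL function is a judgment call left to whoever writes the target.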