Ivan Krylov
2023-Sep-26 21:18 UTC
[Rd] Crashes and other problems uncovered by static analysis
Hello R-devel,

I've been trying to see if static analysers could have caught PR18602. Surprisingly, the analysers I've tried don't seem to notice it, but they do uncover other problems.

While most of these either stem from slightly too defensive programming ("condition is always true/false" warnings), are mostly harmless style issues ("variable assigned twice without using it in between" / "unreachable assignment of <error code> after NORET Rf_error(...)"), are unlikely to cause issues in practice (a few unchecked dereferences of pointers that could in theory be null), or are just plain false positives (e.g. it's a warning every time the code calls snprintf(..., gettext(...), ...)), I now also have a small collection of stupid ways to crash R.

For example, there are stack buffer overflows in contour() (caused by unrealistically long labels) and the "concise traceback" generator (caused by unrealistically long function names). (The runtime terminates R immediately upon noticing the overflow; the SEGV handler is not called.) Also, there's a mostly harmless integer overflow in polyroot() that may end up asking for 134217728 Tb of memory given gigabyte-sized input. Additionally, gzcon() (but *not* gzfile()) will silently fail to work with .gz files containing embedded filenames that have 0xff bytes in them (e.g. ÿ on Latin-1 systems or Ъ on KOI-8 systems).

I'm approximately halfway through the ~600 warnings, so there will likely be a few more real problems.

I can post crashes (and likely fixes) to the Bugzilla, unlikely as they are to happen in real life, but how serious should a non-crashing problem be to warrant a PR and a suggested patch? I'd rather avoid making it like the SQLite forum where anyone with a new fuzzer or a static analyser posts a crash (or just a suspected warning) then goes off to celebrate and pad their CV.

-- 
Best regards,
Ivan