Laszlo Ersek
2021-Oct-11 22:36 UTC
[Libguestfs] [PATCH 3/3] daemon/yara: fix undefined behavior due to Yara 4.0 API changes
Currently, the Yara test case ("yara/test-yara-scan.sh") fails, with the following obscure error message:> ><fs> yara-scan /text.txt > libguestfs: error: deserialise_yara_detection_list:Namely, the Yara rule match list serialization / de-serialization, between the daemon and the library, is broken. It is caused by the following incompatible pointer passing (i.e., undefined behavior), in function do_internal_yara_scan(), file "daemon/yara.c":> r = yr_rules_scan_fd (rules, fd, 0, yara_rules_callback, (void *) path, 0);^^^^^^^^^^^^^^^^^^^ The prototype of yara_rules_callback() is:> static int > yara_rules_callback (int code, void *message, void *data)however, in Yara commit 2b121b166d25 ("Track string matches using YR_SCAN_CONTEXT.", 2020-02-27), which was included in the upstream v4.0.0 release, the rules callback prototype was changed as follows:> diff --git a/libyara/include/yara/types.h b/libyara/include/yara/types.h > index cad095cd70c2..f415033c4aa6 100644 > --- a/libyara/include/yara/types.h > +++ b/libyara/include/yara/types.h > @@ -661,6 +644,7 @@ struct YR_MEMORY_BLOCK_ITERATOR > > > typedef int (*YR_CALLBACK_FUNC)( > + YR_SCAN_CONTEXT* context, > int message, > void* message_data, > void* user_data);Therefore, the yara_rules_callback() function is entered with a mismatched parameter list in the daemon, and so it passes garbage to send_detection_info(), for serializing the match list. This incompatible change was in fact documented by the Yara project: https://github.com/VirusTotal/yara/wiki/Backward-incompatible-changes-in-YARA-4.0-API#scanning-callback Gcc too warns about the incompatible pointer type, under "-Wincompatible-pointer-types". However, libguestfs is built without "-Werror" by default, so the warning is easy to miss, and the bug only manifests at runtime. (The same problem exists for yr_compiler_set_callback() / compile_error_callback(): https://github.com/VirusTotal/yara/wiki/Backward-incompatible-changes-in-YARA-4.0-API#compiler-callback except that this instance of the problem is not triggered by the test case, as the rule list always compiles.) Rather than simply fixing the parameter lists, consider the following approach. If Yara's YR_CALLBACK_FUNC and YR_COMPILER_CALLBACK_FUNC typedefs were not for *pointer* types but actual *function* prototypes, then we could use them directly in the declarations of our callback functions. Then any future changes in the param lists would force a "conflicting types" *compilation error* (not a warning). Illustration: /* this is *not* a pointer type */ typedef int HELLO_FUNC (void); /* function declarations */ static HELLO_FUNC my_hello_good; static HELLO_FUNC my_hello_bad; /* function definition, with explicit parameter list */ static int my_hello_good (void) { return 1; } /* function definition with wrong param list -> compilation error */ static int my_hello_bad (int i) { return i; } Unfortunately, given that the Yara-provided typedefs are already pointers, we can't do this, in standard C anyway. Type derivation only allows for array, structure, union, function, and pointer type derivation; it does not allow "undoing" previous derivations. However, using gcc's "typeof" keyword, the idea is possible. Given YR_CALLBACK_FUNC, the expression (YR_CALLBACK_FUNC)NULL is a well-defined null pointer, and the function designator expression *(YR_CALLBACK_FUNC)NULL has the desired function type. Of course, evaluating this expression would be undefined behavior, but in the GCC extension expression typeof (*(YR_CALLBACK_FUNC)NULL) the operand of the "typeof" operator is never evaluated, as it does not have a variably modified type. We can therefore use this "typeof" in the same role as HELLO_FUNC had in the above example. Signed-off-by: Laszlo Ersek <lersek at redhat.com> --- daemon/yara.c | 20 ++++++++++++++++---- 1 file changed, 16 insertions(+), 4 deletions(-) diff --git a/daemon/yara.c b/daemon/yara.c index 9e7bc9414957..e5f5b89eb6d9 100644 --- a/daemon/yara.c +++ b/daemon/yara.c @@ -56,11 +56,22 @@ static YR_RULES *rules = NULL; static bool initialized = false; static int compile_rules_file (const char *); -static void compile_error_callback (int, const char *, int, const char *, void *); static void cleanup_destroy_yara_compiler (void *ptr); -static int yara_rules_callback (int , void *, void *); static int send_detection_info (const char *, YR_RULE *); +/* Typedefs that effectively strip the pointer derivation from Yara's + * YR_CALLBACK_FUNC and YR_COMPILER_CALLBACK_FUNC types, using GCC's "typeof" + * extension. + */ +typedef typeof (*(YR_CALLBACK_FUNC)NULL) guestfs_yr_callback; +typedef typeof (*(YR_COMPILER_CALLBACK_FUNC)NULL) guestfs_yr_compiler_callback; + +/* Declarations of our callback functions expressed in terms of Yara's + * typedefs. Note: these are *function declarations*. + */ +static guestfs_yr_callback yara_rules_callback; +static guestfs_yr_compiler_callback compile_error_callback; + /* Has one FileIn parameter. * Takes optional arguments, consult optargs_bitmask. */ @@ -210,7 +221,7 @@ compile_rules_file (const char *rules_path) */ static void compile_error_callback (int level, const char *name, int line, - const char *message, void *data) + const YR_RULE* rule, const char *message, void *data) { if (level == YARA_ERROR_LEVEL_ERROR) fprintf (stderr, "Yara error (line %d): %s\n", line, message); @@ -222,7 +233,8 @@ compile_error_callback (int level, const char *name, int line, * Return 0 on success, -1 on error. */ static int -yara_rules_callback (int code, void *message, void *data) +yara_rules_callback (YR_SCAN_CONTEXT *context, int code, void *message, + void *data) { int ret = 0; -- 2.19.1.3.g30247aa5d201
Richard W.M. Jones
2021-Oct-12 10:04 UTC
[Libguestfs] [PATCH 3/3] daemon/yara: fix undefined behavior due to Yara 4.0 API changes
Thanks - ACK series. It may be that the bad help string ("error" instead of "werror") is present in other projects since I just copied the configure.ac when splitting libguestfs up. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-df lists disk usage of guests without needing to install any software inside the virtual machine. Supports Linux and Windows. http://people.redhat.com/~rjones/virt-df/
Eric Blake
2021-Oct-12 14:16 UTC
[Libguestfs] [PATCH 3/3] daemon/yara: fix undefined behavior due to Yara 4.0 API changes
On Tue, Oct 12, 2021 at 12:36:27AM +0200, Laszlo Ersek wrote:> The prototype of yara_rules_callback() is: > > > static int > > yara_rules_callback (int code, void *message, void *data) > > however, in Yara commit 2b121b166d25 ("Track string matches using > YR_SCAN_CONTEXT.", 2020-02-27), which was included in the upstream v4.0.0 > release, the rules callback prototype was changed as follows: > > > diff --git a/libyara/include/yara/types.h b/libyara/include/yara/types.h > > index cad095cd70c2..f415033c4aa6 100644 > > --- a/libyara/include/yara/types.h > > +++ b/libyara/include/yara/types.h > > @@ -661,6 +644,7 @@ struct YR_MEMORY_BLOCK_ITERATOR > > > > > > typedef int (*YR_CALLBACK_FUNC)( > > + YR_SCAN_CONTEXT* context, > > int message, > > void* message_data, > > void* user_data);Do we intend to compile against both older and newer versions of Yara, in which case we'd need a configure-time probe of which variant we must compile against? I could not quickly find documentation of a minimum version of Yara that we are willing to support, at least not in README or HACKING.> Of course, evaluating this expression would be undefined behavior, but in > the GCC extension expression > > typeof (*(YR_CALLBACK_FUNC)NULL) > > the operand of the "typeof" operator is never evaluated, as it does not > have a variably modified type. We can therefore use this "typeof" in the > same role as HELLO_FUNC had in the above example.Clever.> +/* Typedefs that effectively strip the pointer derivation from Yara's > + * YR_CALLBACK_FUNC and YR_COMPILER_CALLBACK_FUNC types, using GCC's "typeof" > + * extension. > + */ > +typedef typeof (*(YR_CALLBACK_FUNC)NULL) guestfs_yr_callback; > +typedef typeof (*(YR_COMPILER_CALLBACK_FUNC)NULL) guestfs_yr_compiler_callback; > + > +/* Declarations of our callback functions expressed in terms of Yara's > + * typedefs. Note: these are *function declarations*. > + */ > +static guestfs_yr_callback yara_rules_callback; > +static guestfs_yr_compiler_callback compile_error_callback; > +As written, this portion of the code will compile with either old or new Yara...> /* Has one FileIn parameter. > * Takes optional arguments, consult optargs_bitmask. > */ > @@ -210,7 +221,7 @@ compile_rules_file (const char *rules_path) > */ > static void > compile_error_callback (int level, const char *name, int line, > - const char *message, void *data) > + const YR_RULE* rule, const char *message, void *data)...but this will cause compile failure on old Yara (redefinition to an incompatible type); if we want to support both versions, we need #ifdef alternatives depending on the version of Yara we are building against. [I did not check whether the Yara project properly bumped their .so version to match their incompatible ABI change; hopefully they did, so that anyone using dynamic linking against older libraries will be forced to recompile for the new .so, rather than being hit with runtime failures as you encountered in the libguestfs testsuite. But if not, that's a bug report for a different mailing list] -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Laszlo Ersek
2021-Oct-12 14:56 UTC
[Libguestfs] [PATCH 3/3] daemon/yara: fix undefined behavior due to Yara 4.0 API changes
On 10/12/21 00:36, Laszlo Ersek wrote:> Currently, the Yara test case ("yara/test-yara-scan.sh") fails, with the > following obscure error message: > >>> <fs> yara-scan /text.txt >> libguestfs: error: deserialise_yara_detection_list: > > Namely, the Yara rule match list serialization / de-serialization, between > the daemon and the library, is broken. It is caused by the following > incompatible pointer passing (i.e., undefined behavior), in function > do_internal_yara_scan(), file "daemon/yara.c": > >> r = yr_rules_scan_fd (rules, fd, 0, yara_rules_callback, (void *) path, 0); > ^^^^^^^^^^^^^^^^^^^ > > The prototype of yara_rules_callback() is: > >> static int >> yara_rules_callback (int code, void *message, void *data) > > however, in Yara commit 2b121b166d25 ("Track string matches using > YR_SCAN_CONTEXT.", 2020-02-27), which was included in the upstream v4.0.0 > release, the rules callback prototype was changed as follows: > >> diff --git a/libyara/include/yara/types.h b/libyara/include/yara/types.h >> index cad095cd70c2..f415033c4aa6 100644 >> --- a/libyara/include/yara/types.h >> +++ b/libyara/include/yara/types.h >> @@ -661,6 +644,7 @@ struct YR_MEMORY_BLOCK_ITERATOR >> >> >> typedef int (*YR_CALLBACK_FUNC)( >> + YR_SCAN_CONTEXT* context, >> int message, >> void* message_data, >> void* user_data); > > Therefore, the yara_rules_callback() function is entered with a mismatched > parameter list in the daemon, and so it passes garbage to > send_detection_info(), for serializing the match list. > > This incompatible change was in fact documented by the Yara project: > > https://github.com/VirusTotal/yara/wiki/Backward-incompatible-changes-in-YARA-4.0-API#scanning-callback > > Gcc too warns about the incompatible pointer type, under > "-Wincompatible-pointer-types". However, libguestfs is built without > "-Werror" by default, so the warning is easy to miss, and the bug only > manifests at runtime. > > (The same problem exists for yr_compiler_set_callback() / > compile_error_callback(): > > https://github.com/VirusTotal/yara/wiki/Backward-incompatible-changes-in-YARA-4.0-API#compiler-callback > > except that this instance of the problem is not triggered by the test > case, as the rule list always compiles.) > > Rather than simply fixing the parameter lists, consider the following > approach. > > If Yara's YR_CALLBACK_FUNC and YR_COMPILER_CALLBACK_FUNC typedefs were not > for *pointer* types but actual *function* prototypes, then we could use > them directly in the declarations of our callback functions. Then any > future changes in the param lists would force a "conflicting types" > *compilation error* (not a warning). Illustration: > > /* this is *not* a pointer type */ > typedef int HELLO_FUNC (void); > > /* function declarations */ > static HELLO_FUNC my_hello_good; > static HELLO_FUNC my_hello_bad; > > /* function definition, with explicit parameter list */ > static int my_hello_good (void) { return 1; } > > /* function definition with wrong param list -> compilation error */ > static int my_hello_bad (int i) { return i; } > > Unfortunately, given that the Yara-provided typedefs are already pointers, > we can't do this, in standard C anyway. Type derivation only allows for > array, structure, union, function, and pointer type derivation; it does > not allow "undoing" previous derivations. > > However, using gcc's "typeof" keyword, the idea is possible. Given > YR_CALLBACK_FUNC, the expression > > (YR_CALLBACK_FUNC)NULL > > is a well-defined null pointer, and the function designator expression > > *(YR_CALLBACK_FUNC)NULL > > has the desired function type. > > Of course, evaluating this expression would be undefined behavior, but in > the GCC extension expression > > typeof (*(YR_CALLBACK_FUNC)NULL) > > the operand of the "typeof" operator is never evaluated, as it does not > have a variably modified type. We can therefore use this "typeof" in the > same role as HELLO_FUNC had in the above example. > > Signed-off-by: Laszlo Ersek <lersek at redhat.com> > --- > daemon/yara.c | 20 ++++++++++++++++---- > 1 file changed, 16 insertions(+), 4 deletions(-) > > diff --git a/daemon/yara.c b/daemon/yara.c > index 9e7bc9414957..e5f5b89eb6d9 100644 > --- a/daemon/yara.c > +++ b/daemon/yara.c > @@ -56,11 +56,22 @@ static YR_RULES *rules = NULL; > static bool initialized = false; > > static int compile_rules_file (const char *); > -static void compile_error_callback (int, const char *, int, const char *, void *); > static void cleanup_destroy_yara_compiler (void *ptr); > -static int yara_rules_callback (int , void *, void *); > static int send_detection_info (const char *, YR_RULE *); > > +/* Typedefs that effectively strip the pointer derivation from Yara's > + * YR_CALLBACK_FUNC and YR_COMPILER_CALLBACK_FUNC types, using GCC's "typeof" > + * extension. > + */ > +typedef typeof (*(YR_CALLBACK_FUNC)NULL) guestfs_yr_callback; > +typedef typeof (*(YR_COMPILER_CALLBACK_FUNC)NULL) guestfs_yr_compiler_callback; > + > +/* Declarations of our callback functions expressed in terms of Yara's > + * typedefs. Note: these are *function declarations*. > + */ > +static guestfs_yr_callback yara_rules_callback; > +static guestfs_yr_compiler_callback compile_error_callback; > + > /* Has one FileIn parameter. > * Takes optional arguments, consult optargs_bitmask. > */ > @@ -210,7 +221,7 @@ compile_rules_file (const char *rules_path) > */ > static void > compile_error_callback (int level, const char *name, int line, > - const char *message, void *data) > + const YR_RULE* rule, const char *message, void *data)^^^^^^^^^^^^^ I cut and pasted this from the <https://github.com/VirusTotal/yara/wiki/Backward-incompatible-changes-in-YARA-4.0-API#compiler-callback> article; that's why it's not "YR_RULE *rule" (note the different placement of the space character vs. the asterisks). I'm fixing that up now, before I merge the series. (I'll re-run the test suite too.) (I had updated "YR_SCAN_CONTEXT *context" below, before posting the patch, relative to <https://github.com/VirusTotal/yara/wiki/Backward-incompatible-changes-in-YARA-4.0-API#scanning-callback>.) Thanks, Laszlo> { > if (level == YARA_ERROR_LEVEL_ERROR) > fprintf (stderr, "Yara error (line %d): %s\n", line, message); > @@ -222,7 +233,8 @@ compile_error_callback (int level, const char *name, int line, > * Return 0 on success, -1 on error. > */ > static int > -yara_rules_callback (int code, void *message, void *data) > +yara_rules_callback (YR_SCAN_CONTEXT *context, int code, void *message, > + void *data) > { > int ret = 0; > >