Sam Eiderman
2020-Apr-20 10:17 UTC
[Libguestfs] [PATCH] python: Fix UnicodeError in inspect_list_applications2() (RHBZ#1684004)
The python3 bindings create unicode objects from application strings
on the guest (e.g. installed rpm or deb packages).
rpm package fields such as the description are documented to be
UTF-8 encoded. However, in some cases they are not valid UTF-8. On
SLES11 SP4 the following packages fail to be converted to unicode
by guestfs_int_py_fromstring() (which invokes
PyUnicode_FromString()):
PackageKit
aaa_base
coreutils
dejavu
desktop-data-SLED
gnome-utils
hunspell
hunspell-32bit
hunspell-tools
libblocxx6
libexif
libgphoto2
libgtksourceview-2_0-0
libmpfr1
libopensc2
libopensc2-32bit
liborc-0_4-0
libpackagekit-glib10
libpixman-1-0
libpixman-1-0-32bit
libpoppler-glib4
libpoppler5
libsensors3
libtelepathy-glib0
m4
opensc
opensc-32bit
permissions
pinentry
poppler-tools
python-gtksourceview
splashy
syslog-ng
tar
tightvnc
xorg-x11
xorg-x11-xauth
yast2-mouse
This is a surgical fix for inspect_list_applications2()'s description
field.
Signed-off-by: Sam Eiderman <sameid@google.com>
---
generator/python.ml | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/generator/python.ml b/generator/python.ml
index f0d6b5d96..7394a943a 100644
--- a/generator/python.ml
+++ b/generator/python.ml
@@ -170,6 +170,14 @@ and generate_python_structs ()
         function
         | name, FString ->
             pr " value = guestfs_int_py_fromstring (%s->%s);\n" typ name;
+            (match typ, name with
+             | "application", "app_description"
+             | "application2", "app2_description" ->
+                 pr " if (value == NULL) {\n";
+                 pr " value = guestfs_int_py_fromstring (\"\");\n";
+                 pr " PyErr_Clear ();\n";
+                 pr " }\n";
+             | _ -> pr ""; );
             pr " if (value == NULL)\n";
             pr " goto err;\n";
             pr " PyDict_SetItemString (dict, \"%s\", value);\n" name;
--
2.26.1.301.g55bc3eb7cb9-goog
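
For readers following the thread, the failure and the fallback in this patch can be approximated in pure Python. This is only a sketch, not the generated C code: `decode_strict` and `description_or_empty` are hypothetical helper names, and the sample bytes (a Latin-1 encoded "é") are an assumed example of non-UTF-8 data, not taken from the SLES11 packages listed above.

```python
def decode_strict(raw: bytes) -> str:
    """Strict UTF-8 decoding, roughly what PyUnicode_FromString() does."""
    return raw.decode("utf-8")


def description_or_empty(raw: bytes) -> str:
    """Mimic the patch: fall back to an empty string when decoding fails."""
    try:
        return decode_strict(raw)
    except UnicodeDecodeError:
        # Corresponds to the PyErr_Clear() + empty-string branch in the patch.
        return ""


valid = "café".encode("utf-8")
invalid = "café".encode("latin-1")  # b"caf\xe9": a lone 0xE9 is not valid UTF-8

print(repr(description_or_empty(valid)))
print(repr(description_or_empty(invalid)))
```

With this approach a single bad byte discards the whole description, which is the trade-off discussed in the reply that follows.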
Daniel P. Berrangé
2020-Apr-20 11:41 UTC
Re: [Libguestfs] [PATCH] python: Fix UnicodeError in inspect_list_applications2() (RHBZ#1684004)
On Mon, Apr 20, 2020 at 01:17:35PM +0300, Sam Eiderman wrote:
> [...]
> +            (match typ, name with
> +             | "application", "app_description"
> +             | "application2", "app2_description" ->
> +                 pr " if (value == NULL) {\n";
> +                 pr " value = guestfs_int_py_fromstring (\"\");\n";
> +                 pr " PyErr_Clear ();\n";
> +                 pr " }\n";

I don't think this is especially friendly/helpful to users.

I'm assuming that there's just a handful of characters that are not
valid UTF-8. I think we really want a graceful conversion that will
convert as much as possible, replacing any invalid UTF-8 with some
generic placeholder character.

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
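
The graceful conversion suggested in this reply can be illustrated in Python with the standard `errors="replace"` handler, which substitutes U+FFFD for each undecodable sequence. This is a sketch of the idea only: `decode_lossy` is a hypothetical helper name and the sample bytes are an assumed example. At the C level, `PyUnicode_DecodeUTF8()` with the `"replace"` error handler is one candidate for the same behaviour; whether any later revision of the patch uses it is not stated in this thread.

```python
def decode_lossy(raw: bytes) -> str:
    """Decode as UTF-8, replacing invalid sequences with U+FFFD."""
    return raw.decode("utf-8", errors="replace")


# Assumed sample: Latin-1 encoded text embedded in an otherwise ASCII string.
sample = b"caf\xe9 au lait"

print(repr(decode_lossy(sample)))  # 'caf\ufffd au lait'
```

Valid UTF-8 passes through unchanged, so only the offending bytes are lost rather than the whole field.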
Sam Eiderman
2020-Apr-20 12:38 UTC
Re: [Libguestfs] [PATCH] python: Fix UnicodeError in inspect_list_applications2() (RHBZ#1684004)
I uploaded a v2, which does as you requested, more globally (across all
python bindings) - tell me what you think.

On Mon, Apr 20, 2020 at 2:42 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
> On Mon, Apr 20, 2020 at 01:17:35PM +0300, Sam Eiderman wrote:
> > [...]
>
> I don't think this is especially friendly/helpful to users.
>
> I'm assuming that there's just a handful of characters that are not
> valid UTF-8. I think we really want a graceful conversion that will
> convert as much as possible, replacing any invalid UTF-8 with some
> generic placeholder character.
>
> Regards,
> Daniel