Hi,

I have been working on a Python application that uses hivex. Along the way I have encountered some issues in the Python bindings that could be fixed.

The next issue I see is with the value_value function. It is briefly documented as: "return data length, data type and data of a value".

For Perl, Python and OCaml, this is not true. All three return a tuple without the length (as this can be calculated from the data value). Ruby is the outlier here: it uses a dictionary with three keys. I am not familiar with Ruby, nor do I know any Ruby users of hivex.

The documentation should probably be fixed to exclude the length, but what about the Ruby API? Is it correct, or should a documentation note be added that Ruby differs?

Kind regards,
Peter
https://lekensteyn.nl
Richard W.M. Jones
2014-Aug-10 15:26 UTC
Re: [Libguestfs] About the return value of value_value
On Sun, Aug 10, 2014 at 05:04:01PM +0200, Peter Wu wrote:
> Hi,
>
> I have been working on a Python application that uses hivex. Meanwhile I have
> encountered some Python bindings issues which could be fixed.
>
> The next issue I see now is about the value_value function. This is briefly
> documented as: "return data length, data type and data of a value".
>
> For Perl, Python and OCaml, this is not true. A tuple is returned
> for both without the length (as this can be calculated from the data
> value). Ruby is the outlier here that uses a dictionary with three
> keys. I am not familar with Ruby and neither do I know Ruby users of
> hivex.
>
> The documentation should likely be fixed to exclude the length, but
> what about the Ruby API? Is it correct or should a documentation
> note be added that Ruby differs

Note that the documentation applies to the C API (where length is returned). Since the same generated documentation is used for the other languages too, that can make it a bit patchy.

In Ruby it seems as if the length could be calculated from the string. On the other hand, I'm not sure there is any point in intentionally removing the length from the return value, as that might break callers for no particular reason.

The best plan here is probably to add a note to the Ruby documentation for RLenTypeVal saying what the hash contains on Ruby.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW
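The point that the length is redundant can be shown with a small Python sketch. It takes a (type, data) pair of the kind the Python binding is described as returning and expands it into a three-key mapping. The helper name with_length is hypothetical, the key names "len", "type" and "value" are only an assumption about what the Ruby hash contains, and the pair below is a stand-in rather than data read from a real hive.

```python
REG_DWORD = 4  # standard Windows registry type code for a 32-bit integer


def with_length(type_and_data):
    """Expand a (type, data) pair into a mapping that also carries the length.

    Hypothetical helper: the key names are an assumption, not taken from
    the hivex sources.
    """
    value_type, data = type_and_data
    # The length never has to be returned separately; it is len(data).
    return {"len": len(data), "type": value_type, "value": data}


# Stand-in for what h.value_value(value_handle) would return in Python.
pair = (REG_DWORD, b"\x04\x00\x00\x00")
print(with_length(pair))
```

Going the other way (dropping the length from a three-key mapping) loses no information, which is why a documentation note may be enough.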
Peter Wu
2014-Aug-10 20:19 UTC
[Libguestfs] New Python API? (was: Re: About the return value of value_value)
(renaming subject as I am partially getting off-topic)

On Sunday 10 August 2014 16:26:07 Richard W.M. Jones wrote:
> > The next issue I see now is about the value_value function. This is
> > briefly documented as: "return data length, data type and data of a
> > value".
> >
> > For Perl, Python and OCaml, this is not true. A tuple is returned
> > for both without the length (as this can be calculated from the data
> > value). Ruby is the outlier here that uses a dictionary with three
> > keys. I am not familar with Ruby and neither do I know Ruby users of
> > hivex.
> >
> > The documentation should likely be fixed to exclude the length, but
> > what about the Ruby API? Is it correct or should a documentation
> > note be added that Ruby differs
>
> Note that the documentation applies to the C API (where length is
> returned). Since the same generated documentation is used for the
> other languages too, that can make it a bit patchy.

The Python documentation is scarce on the types of the various parameters and return values. Moreover, it states "Read the hivex(3) man page to find out how to use the API." Perhaps a second API should be created that is more pythonic (read: easier to use)?

I mean, right now you have to use this (with some patches[0][1], also available at git[2]):

    import hivex
    from hivex.hive_types import *

    h = hivex.Hivex("system", write=True)
    ccs_name = "ControlSet001"
    ccs = h.node_get_child(h.root(), ccs_name)
    services = h.node_get_child(ccs, "Services")
    svc_viostor = h.node_get_child(services, "viostor")
    start_id = h.node_get_value(svc_viostor, "Start")
    #node_type, node_value = h.value_value(start_id)
    dword_value = h.value_dword(start_id)
    if dword_value != 4:
        new_value = {
            "key": "Start",
            # constant from hivex.hive_types
            "t": REG_DWORD,
            # alternative of b'\4\0\0\0'
            "value": 4
        }
        h.node_set_value(svc_viostor, new_value)
        h.commit()

It would be great if something like this could be done instead:

    import hivex

    hive = hivex.Hivex2("system", write=True)
    ccs_name = "ControlSet001"
    svc_viostor = hive.root()[ccs_name].Services.viostor
    if svc_viostor.Start != 4:
        # Automatically detect that int '4' is a DWORD
        svc_viostor.Start = 4
        hive.commit()

I (ab)use the __getattr__ methods of an object to allow this kind of modification. See also the RegistryHandle helper class at https://github.com/Lekensteyn/qemu-tools/blob/master/vbox-to-qemu.py (_import_callback at line 216 may also be interesting).

This is a quick implementation with not much thought put into it, but what do you think of the idea of providing an easier API next to the current one?

In the current implementation, Python 3 bytes (Python 2 strings) are treated as plain bytes(*). That is fine, but Unicode is not handled correctly. This might also be an opportunity to treat Unicode strings as UTF-16 (LE) strings which must be nul-terminated. So u'Bar' should become b'B\0a\0r\0\0\0'.

(*) Actually, Hivex 1.3.10 is broken in Python 3: it tries to convert all strings from UTF-8 to bytes and segfaults on other input, so it does not work for UTF-16 strings[0].

> In Ruby it seems as if the length could be calculated from the string.
> On the other hand, I'm not sure there is any point in intentionally
> removing the length from the return value, as that might break callers
> for no particular reason.
>
> The best plan here is probably to add a note to the Ruby documentation
> for RLenTypeVal saying what the hash contains on Ruby.

... and mention that all other language bindings return a tuple / list / array with just two elements as the length can be found from the value?

Kind regards,
Peter
https://lekensteyn.nl

[0]: https://www.redhat.com/archives/libguestfs/2014-August/msg00050.html
[1]: https://www.redhat.com/archives/libguestfs/2014-August/msg00053.html
[2]: https://github.com/Lekensteyn/hivex/compare/master...develop
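
The type auto-detection and UTF-16 (LE) handling proposed in this message could look roughly like the sketch below. It assumes Python 3 and produces the "key" / "t" / "value" dictionary shape shown in the node_set_value example. The helper name make_value is hypothetical and not part of hivex, and the choice of REG_SZ for text and REG_BINARY for raw bytes is an assumption made for illustration; the numeric type codes are the standard Windows registry ones.

```python
import struct

# Standard Windows registry type codes.
REG_SZ = 1
REG_BINARY = 3
REG_DWORD = 4


def make_value(key, value):
    """Hypothetical helper: build a value dict from a plain Python object."""
    if isinstance(value, bool):
        # bool is a subclass of int; reject it rather than guess.
        raise TypeError("pass an int instead of a bool")
    if isinstance(value, int):
        if not 0 <= value <= 0xFFFFFFFF:
            raise ValueError("value does not fit in a DWORD")
        # A DWORD is stored as 4 little-endian bytes.
        return {"key": key, "t": REG_DWORD, "value": struct.pack("<I", value)}
    if isinstance(value, str):
        # Text becomes nul-terminated UTF-16 (LE).
        data = (value + "\0").encode("utf-16-le")
        return {"key": key, "t": REG_SZ, "value": data}
    if isinstance(value, (bytes, bytearray)):
        # Raw bytes are passed through untouched.
        return {"key": key, "t": REG_BINARY, "value": bytes(value)}
    raise TypeError("unsupported type: %s" % type(value).__name__)


print(make_value("Start", 4))
print(make_value("Foo", "Bar"))
```

An attribute-style wrapper could then call such a helper from its __setattr__ before handing the dictionary to node_set_value.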