noxdafox
2016-Mar-07 18:14 UTC
Re: [Libguestfs] [PATCH 0/2] added icat and fls0 APIs for deleted files recovery
On 07/03/16 13:29, Richard W.M. Jones wrote:
> On Sun, Mar 06, 2016 at 05:42:24PM +0200, Matteo Cafasso wrote:
>> As discussed in the topic:
>> https://www.redhat.com/archives/libguestfs/2016-March/msg00018.html
>>
>> I'd like to add to libguestfs the disk forensics capabilities
>> offered by The Sleuth Kit.
>> http://www.sleuthkit.org/
>>
>> The two APIs I'm adding with the patch are a simple example of which
>> type of features TSK can enable.
> A few comments in general terms:
>
> The current splitting of the commits doesn't make much sense to me.
> I think it would be better as:
>
> - commit to add TSK to the appliance
>
> - commit to add the icat API
>
> - tests for icat
>
> - commit to add the fls0 API
>
> - tests for fls0
>
> although it would be fine to combine the tests with the new API, or
> even have all the tests as a single separate commit (as now).
>
> This benefits you because it will allow patches to go upstream
> earlier. For example, a commit to add TSK to the appliance is a
> simple and obvious change that I see no problem with. Also the icat
> API is closer to being ready than the fls0 API (see below for
> explanation).

Indeed, I've done quite a poor job of this. I will split it as suggested.

>>> <fs> fls0 /dev/sda2 /home/noxdafox/disk-content.txt
>> r/r 15711-128-1: $Recycle.Bin/S-1-5-21-2379395878-2832339042-1309242031-1000/desktop.ini
>> -/r * 60015-128-1: $Recycle.Bin/S-1-5-21-2379395878-2832339042-1309242031-1000/$R07QQZ2.txt
>> -/r * 60015-128-3: $Recycle.Bin/S-1-5-21-2379395878-2832339042-1309242031-1000/$R07QQZ2.txt:Zone.Identifier
> What is `/home/noxdafox/disk-content.txt'?

It's the local (host side) file where the command output is stored.

> The problem with this API is it pushes all the parsing up in the
> stack, to libguestfs consumers.
>
> In general we'd like to avoid that and have just one place where all
> parsing needs to be done (ie. libguestfs itself), so it'd be nicer to
> have an API that returns a list of structs (RStructList) with all the
> important fields parsed out.

As the API documentation says, this is the low level API, which I have
provided as an example.

I took inspiration from the guestfs_ls0 API, which does a similar job:
it stores the content of a directory in a host file.

If I understood correctly (the dynamic code generation is still
confusing me a bit), the way Libguestfs implements commands which could
have a large output is by first dumping the output onto a local file and
then iterating over it.
This command would list the entire content of a disk, including the
deleted files, so we need to expect a large output.

What is missing is the higher level implementation, which would pretty
much look like the guestfs_ls API. I need to better understand how to
implement it, and suggestions are more than appreciated. I tried to trace
back how guestfs_find is implemented, for example, but I'm still a bit
disoriented by the automagic code generation.

> Does TSK have a machine-readable mode? If it does, it'll definitely
> make things easier if (eg) JSON or XML output is available. If not,
> push upstream to add that to TSK -- it's a simple change for them,
> which will make their tools much more usable, a win for everyone.

I personally disagree on this. The TSK `fls` command is a clone of the
bash `ls` one. Maybe it's more similar to `ls -al`, as it returns
additional information. IMHO asking upstream to add a JSON or XML output
format would sound pretty much like asking the same of bash for the `ls`
utility.

The end result is still to return a list of structs or a list of
strings. But parsing the `fls` output shouldn't be that hard. Its
documentation is here: http://wiki.sleuthkit.org/index.php?title=Fls

> Rich.
>
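To make the "list of structs" idea concrete, here is a minimal Python sketch that parses the three `fls` lines quoted above into records. The line shape `<name type>/<meta type> [*] <inode>: <path>` is inferred only from that sample, and `FlsEntry` and `parse_fls_line` are hypothetical names, not part of libguestfs or TSK. Real `fls` output may carry extra markers that this sketch does not handle.

```python
import re
from typing import NamedTuple, Optional


class FlsEntry(NamedTuple):
    """One parsed fls line (hypothetical struct, for illustration only)."""
    name_type: str  # type letter before the slash, e.g. "r" or "-"
    meta_type: str  # type letter after the slash, e.g. "r"
    deleted: bool   # True when the line carries the "*" marker
    inode: str      # metadata address, e.g. "60015-128-1"
    path: str       # everything after the first ": "


# Assumed shape, inferred from the sample output in the message above:
#   <name type>/<meta type> [*] <inode>: <path>
FLS_LINE = re.compile(
    r"^(?P<name_type>\S)/(?P<meta_type>\S)\s+"
    r"(?P<deleted>\*\s+)?"
    r"(?P<inode>[^:\s]+):\s*(?P<path>.*)$"
)


def parse_fls_line(line: str) -> Optional[FlsEntry]:
    """Return an FlsEntry, or None when the line does not match the assumed shape."""
    m = FLS_LINE.match(line.rstrip("\n"))
    if m is None:
        return None
    return FlsEntry(
        name_type=m["name_type"],
        meta_type=m["meta_type"],
        deleted=m["deleted"] is not None,
        inode=m["inode"],
        path=m["path"],
    )
```

Only the first colon ends the inode field, so an NTFS alternate data stream name such as `$R07QQZ2.txt:Zone.Identifier` stays intact in `path`.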
Richard W.M. Jones
2016-Mar-07 19:31 UTC
Re: [Libguestfs] [PATCH 0/2] added icat and fls0 APIs for deleted files recovery
On Mon, Mar 07, 2016 at 08:14:41PM +0200, noxdafox wrote:
> As the API documentation says, this is the low level API which I
> have provided as an example.
>
> I took inspiration from the guestfs_ls0 API which does a similar job
> storing the content of a directory onto a host file.
>
> If I understood correctly (the dynamic code generation is still
> confusing me a bit), the way Libguestfs implements commands which
> could have a large output is via first dumping it onto a local file
> and then iterating over it.
> This command would list the entire content of a disk including the
> deleted files therefore we need to expect a large output.

Your understanding is correct. But the fls0 API still isn't safe
because (I assume) it cannot handle filenames containing '\n'.

There's another API which handles arbitrary length RStructList
returns, which is: guestfs_lxattrlist / guestfs_internal_lxattrlist
(see src/file.c:guestfs_impl_lxattrlist and daemon/xattr.c).

> What is missing is the higher level implementation which would
> pretty much look like the guestfs_ls API. I need to better
> understand how to implement it and suggestions are more than
> appreciated. I tried to trace back how the guestfs_find is
> implemented for example, but I'm still a bit disoriented by the
> automagic code generation.

See guestfs_impl_ls in src/file.c. All non_daemon_functions are
implemented by some guestfs_impl_* function in the library side.

> >Does TSK have a machine-readable mode? If it does, it'll definitely
> >make things easier if (eg) JSON or XML output is available. If not,
> >push upstream to add that to TSK -- it's a simple change for them,
> >which will make their tools much more usable, a win for everyone.
> I personally disagree on this. The TSK `fls` command is a clone of
> the bash `ls` one. Maybe it's more similar to `ls -al` as it returns
> additional information. IMHO asking upstream to add JSON or XML
> output format would sound pretty much like asking the same of bash for
> the `ls` utility.
>
> The end result is to still return a list of structs or a list of
> strings. But parsing the `fls` output shouldn't be that hard. Its
> documentation is here:
> http://wiki.sleuthkit.org/index.php?title=Fls

Well I still think it would be better to make this parsable. If I
want to get information about a file in a shell script, I use the
'stat(1)' program since that has machine-readable output (stat -c).

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html
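The '\n' concern can be shown with a few lines of Python. This is only an illustrative sketch: `join_records` and `split_records` are hypothetical helpers, and nothing here reflects how `fls` or the fls0 API actually emits names. It demonstrates the general point that a newline-separated listing cannot tell "two names" apart from "one name containing a newline", whereas a NUL separator cannot collide with a POSIX filename.

```python
from typing import List


def join_records(names: List[bytes], sep: bytes) -> bytes:
    """Serialize names as a listing, terminating each one with sep."""
    return b"".join(name + sep for name in names)


def split_records(data: bytes, sep: bytes) -> List[bytes]:
    """Split a listing back into names, dropping the empty tail after the last sep."""
    if not data:
        return []
    parts = data.split(sep)
    if parts[-1] == b"":
        parts.pop()
    return parts


names = [b"report.txt", b"odd\nname.txt", b"notes.md"]

# Newline-separated: the embedded '\n' turns three names into four.
as_lines = split_records(join_records(names, b"\n"), b"\n")

# NUL-separated: the round trip is lossless.
as_nul = split_records(join_records(names, b"\0"), b"\0")
```

After running this, `as_lines` holds four entries (`b"odd"` and `b"name.txt"` have been split apart), while `as_nul` equals the original `names`.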
noxdafox
2016-Mar-07 19:46 UTC
Re: [Libguestfs] [PATCH 0/2] added icat and fls0 APIs for deleted files recovery
On 07/03/16 21:31, Richard W.M. Jones wrote:
> On Mon, Mar 07, 2016 at 08:14:41PM +0200, noxdafox wrote:
>> As the API documentation says, this is the low level API which I
>> have provided as an example.
>>
>> I took inspiration from the guestfs_ls0 API which does a similar job
>> storing the content of a directory onto a host file.
>>
>> If I understood correctly (the dynamic code generation is still
>> confusing me a bit), the way Libguestfs implements commands which
>> could have a large output is via first dumping it onto a local file
>> and then iterating over it.
>> This command would list the entire content of a disk including the
>> deleted files therefore we need to expect a large output.
> Your understanding is correct. But the fls0 API still isn't safe
> because (I assume) it cannot handle filenames containing '\n'.

I hadn't considered this issue. This is why guestfs_ls0 separates the
results with a '\0' character, right?

I'll try to reproduce this and see how TSK gives me the output.

> There's another API which handles arbitrary length RStructList
> returns, which is: guestfs_lxattrlist / guestfs_internal_lxattrlist
> (see src/file.c:guestfs_impl_lxattrlist and daemon/xattr.c).

I will take a look at these, thanks!

>> What is missing is the higher level implementation which would
>> pretty much look like the guestfs_ls API. I need to better
>> understand how to implement it and suggestions are more than
>> appreciated. I tried to trace back how the guestfs_find is
>> implemented for example, but I'm still a bit disoriented by the
>> automagic code generation.
> See guestfs_impl_ls in src/file.c. All non_daemon_functions are
> implemented by some guestfs_impl_* function in the library side.

I guess I'll come back with a complete solution, with both the low level
and the high level implementation.

>>> Does TSK have a machine-readable mode? If it does, it'll definitely
>>> make things easier if (eg) JSON or XML output is available. If not,
>>> push upstream to add that to TSK -- it's a simple change for them,
>>> which will make their tools much more usable, a win for everyone.
>> I personally disagree on this. The TSK `fls` command is a clone of
>> the bash `ls` one. Maybe it's more similar to `ls -al` as it returns
>> additional information. IMHO asking upstream to add JSON or XML
>> output format would sound pretty much like asking the same of bash for
>> the `ls` utility.
>>
>> The end result is to still return a list of structs or a list of
>> strings. But parsing the `fls` output shouldn't be that hard. Its
>> documentation is here:
>> http://wiki.sleuthkit.org/index.php?title=Fls
> Well I still think it would be better to make this parsable. If I
> want to get information about a file in a shell script, I use the
> 'stat(1)' program since that has machine-readable output (stat -c).

Indeed, but in that case you know what to expect, as the set of
information is closed and well defined. With `fls` the options are
unfortunately many.

I can of course propose the idea to upstream, but I guess they won't
like it much.

> Rich.
>
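The "dump to a local file, then iterate" pattern discussed in this thread can be sketched for a NUL-separated dump as follows. This is an assumption-laden illustration in Python, not the libguestfs implementation (which lives in C under src/file.c): `iter_nul_records` is a hypothetical helper, and the dump is assumed to contain one record per '\0' terminator.

```python
import io
from typing import BinaryIO, Iterator


def iter_nul_records(stream: BinaryIO, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield NUL-terminated records from stream without loading it all into memory."""
    pending = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        pending += chunk
        # Everything before the last NUL is complete; the tail may be a partial record.
        *complete, pending = pending.split(b"\0")
        yield from complete
    if pending:
        # Final record that was not followed by a NUL terminator.
        yield pending


# Usage with an in-memory stand-in for the host-side dump file.
dump = io.BytesIO(b"a\0b\nc\0dd\0")
records = list(iter_nul_records(dump, chunk_size=2))
```

Reading in fixed-size chunks keeps memory use bounded by the longest single record rather than by the size of the whole listing, which matters when the dump covers an entire disk including deleted files.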