thr3ads.net - Libguestfs - Re: [Libguestfs] extract NTFS Master File Table for analysis [Feb 2016]

Richard W.M. Jones

2016-Feb-02 19:35 UTC

Re: [Libguestfs] extract NTFS Master File Table for analysis

On Tue, Feb 02, 2016 at 07:40:12PM +0200, noxdafox wrote:
> Greetings,
> 
> I'm playing around with an idea and I'd like to ask you some questions.
> 
> I'd like to extract the MFT table from a disk image file. The idea
> is to employ it to build a sort of reverse lookup table which, given
> a cluster, could retrieve the corresponding file with the related
> metadata.
> 
> Such table could be used to optimize the analysis of disk snapshots
> in order to collect the changes which happened on the disk. As the
> disk snapshots contain only the new or modified clusters, I could
> avoid exploring the whole FS content and focus on what has really
> changed on disk.
> 
> Did you explore the concept anyhow?

No.

> Is there a way I can use libguestfs to locate and extract the MFT
> table from a disk image?

If there's an ntfsprogs command that does this (ntfsinfo --mft maybe?)
then it's really easy to extract the output from that command.  You
could hack it together using `debug sh', search this page:

  http://libguestfs.org/guestfs-faq.1.html

... but if you wanted to do it "properly" then you could add an API
modelled on one of the `FileOut' APIs, eg:

  https://github.com/libguestfs/libguestfs/blob/master/daemon/base64.c#L100

For information on adding APIs, see:

  http://libguestfs.org/guestfs-hacking.1.html#adding-a-new-api

This question of how do you find which disk block is associated with a
particular file comes up often enough that I have looked at it various
times on my blog:

 
  https://rwmj.wordpress.com/2014/02/21/use-guestfish-and-nbdkit-to-examine-physical-disk-locations/

  https://rwmj.wordpress.com/2014/11/23/mapping-files-to-disk/

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top
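
The reverse lookup asked about in the question above (given a cluster, find the file that owns it) can be sketched as a sorted list of extents searched with bisection. This is only an illustration under stated assumptions: the extent tuples, paths, and function names below are hypothetical, and the sketch assumes the extents have already been decoded from the MFT data runs by some other tool and do not overlap.

```python
import bisect
from typing import List, Optional, Tuple

# Each extent is (first_cluster, cluster_count, file_name).
# The layout is an assumption for illustration, not an MFT format.
Extent = Tuple[int, int, str]
Index = Tuple[List[int], List[Extent]]


def build_index(extents: List[Extent]) -> Index:
    """Sort extents by starting cluster so lookups can use binary search."""
    ordered = sorted(extents)
    starts = [extent[0] for extent in ordered]
    return starts, ordered


def file_for_cluster(index: Index, cluster: int) -> Optional[str]:
    """Return the file owning `cluster`, or None if no extent covers it."""
    starts, ordered = index
    # Find the last extent whose first cluster is <= the requested cluster.
    pos = bisect.bisect_right(starts, cluster) - 1
    if pos < 0:
        return None
    first, count, name = ordered[pos]
    return name if cluster < first + count else None


# Hypothetical extents, as they might look after decoding data runs.
index = build_index([
    (4096, 16, "/Windows/notepad.exe"),
    (100, 8, "/pagefile.sys"),
    (4200, 4, "/Users/test/report.docx"),
])
```

Each lookup costs O(log n) in the number of extents, so the clusters listed in a snapshot can be resolved one by one without walking the filesystem.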

noxdafox

2016-Feb-18 19:41 UTC

Re: [Libguestfs] extract NTFS Master File Table for analysis

On 02/02/16 21:35, Richard W.M. Jones wrote:
> On Tue, Feb 02, 2016 at 07:40:12PM +0200, noxdafox wrote:
>> Greetings,
>>
>> I'm playing around with an idea and I'd like to ask you some questions.
>>
>> I'd like to extract the MFT table from a disk image file. The idea
>> is to employ it to build a sort of reverse lookup table which, given
>> a cluster, could retrieve the corresponding file with the related
>> metadata.
>>
>> Such table could be used to optimize the analysis of disk snapshots
>> in order to collect the changes which happened on the disk. As the
>> disk snapshots contain only the new or modified clusters, I could
>> avoid exploring the whole FS content and focus on what has really
>> changed on disk.
>>
>> Did you explore the concept anyhow?
> No.
>
>> Is there a way I can use libguestfs to locate and extract the MFT
>> table from a disk image?
> If there's an ntfsprogs command that does this (ntfsinfo --mft maybe?)
> then it's really easy to extract the output from that command.  You
> could hack it together using `debug sh', search this page:
>
>    http://libguestfs.org/guestfs-faq.1.html
>
> ... but if you wanted to do it "properly" then you could add an API
> modelled on one of the `FileOut' APIs, eg:
>
>    https://github.com/libguestfs/libguestfs/blob/master/daemon/base64.c#L100
>
> For information on adding APIs, see:
>
>    http://libguestfs.org/guestfs-hacking.1.html#adding-a-new-api

I played around a bit and I must confess I am impressed by how easy it
is to add functionality to libguestfs.

I could easily extract the Master File Table using the download API and 
parse it with third party tools.
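
For readers without such a tool at hand, a first sanity check on a downloaded $MFT could look like the sketch below. It is not the parser used in this thread. It assumes the common 1024-byte record size (the real value is stored in the boot sector) and reads only the "FILE" signature and the 16-bit flags field at offset 22 of each record header. The function names are made up for this example.

```python
import struct

RECORD_SIZE = 1024       # assumed default; the real size comes from the boot sector
FLAG_IN_USE = 0x0001     # record describes a live file or directory
FLAG_DIRECTORY = 0x0002  # record describes a directory


def iter_mft_records(data: bytes, record_size: int = RECORD_SIZE):
    """Yield (record_number, flags) for every slot carrying a FILE signature."""
    for offset in range(0, len(data) - record_size + 1, record_size):
        record = data[offset:offset + record_size]
        if record[:4] != b"FILE":
            continue  # never-used or damaged slot
        (flags,) = struct.unpack_from("<H", record, 22)
        yield offset // record_size, flags


def summarize(data: bytes):
    """Return (records in use, how many of those are directories)."""
    in_use = directories = 0
    for _, flags in iter_mft_records(data):
        if flags & FLAG_IN_USE:
            in_use += 1
            if flags & FLAG_DIRECTORY:
                directories += 1
    return in_use, directories
```

Calling summarize() on the bytes of the downloaded file gives a quick check that the extraction produced a plausible table before handing it to a full parser.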

I'd also like to extract the Update Sequence Number Journal
($UsnJrnl), but it seems inaccessible via its path (C:\$Extend\$UsnJrnl).
I tried on a real disk and it seems to be a limitation of the NTFS-3G
driver: it can extract C:\$MFT and C:\$LogFile, and it can list the
contents of C:\$Extend, but it cannot access the files inside it.

Curiously enough, the stat() syscall on C:\$Extend\$UsnJrnl seems to work
and returns the correct inode number. Yet the reported size is wrong: it
is 0, while the real size is more than 9 MB.

The next step I tried was to use the ntfscat command in the following 
manner: ntfscat -i <UsnJrnl inode number> /dev/sdXX and it worked 
flawlessly.
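
The same invocation can be scripted. The sketch below only wraps the command line quoted above. The helper names, the device path, and the output file are placeholders, and the code must run somewhere ntfscat is installed and the device is readable.

```python
import subprocess
from typing import List


def ntfscat_inode_argv(device: str, inode: int) -> List[str]:
    """Build the argument vector for `ntfscat -i <inode> <device>`."""
    if inode < 0:
        raise ValueError("inode must be non-negative")
    return ["ntfscat", "-i", str(inode), device]


def dump_inode(device: str, inode: int, out_path: str) -> None:
    """Run ntfscat for the given inode and write its output to out_path."""
    with open(out_path, "wb") as out:
        # check=True raises CalledProcessError if ntfscat exits non-zero.
        subprocess.run(ntfscat_inode_argv(device, inode), stdout=out, check=True)
```

Passing the arguments as a list avoids shell quoting problems. The inode number itself would come from the stat() call mentioned earlier.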

So I proceeded to add such an API to libguestfs and I could extract the 
journal without any issue. The UsnJrnl file is very handy for checking what 
changes were made on disk. Not only is it faster than using virt-diff on 
two different snapshots, but it also shows much more relevant 
information. I could, for example, track down temporary files created and 
deleted within the two snapshots.

All of this to say I'd like to add the possibility of extracting files 
via their inode. This functionality has the advantage of not requiring 
the FS to be mounted. Would libguestfs benefit from this?

If so how should I proceed? Which API names to use?

Most straightforward would be something like:

   ntfsicat(device, inode)

or

   ntfsidownload(device, inode)

I guess Linux guest disks would also benefit from this, but that requires
a bit more research.

>
> This question of how do you find which disk block is associated with a
> particular file comes up often enough that I have looked at it various
> times on my blog:
>
>    https://rwmj.wordpress.com/2014/02/21/use-guestfish-and-nbdkit-to-examine-physical-disk-locations/
>
>    https://rwmj.wordpress.com/2014/11/23/mapping-files-to-disk/
>
> Rich.
>

Richard W.M. Jones

2016-Feb-19 10:51 UTC

Re: [Libguestfs] extract NTFS Master File Table for analysis

On Thu, Feb 18, 2016 at 09:41:51PM +0200, noxdafox wrote:
> All of this to say I'd like to add the possibility of extracting
> files via their inode. This functionality has the advantage of not
> requiring the FS to be mounted. Would libguestfs benefit from this?
> 
> If so how should I proceed? Which API names to use?

We generally tend to stick to API names which are the same as the
underlying utility, so "ntfscat".  In this case however ntfscat has
lots of different modes, so we'd use a name like "ntfscat_i" for this
API.

> Most straightforward would be something like:
> 
>   ntfsicat(device, inode)

  { defaults with
    name = "ntfscat_i";
    style = RErr, [Mountable "device"; Int64 "inode"; FileOut "filename"], [];
    ...
  }

seems like the right sort of API to use.

> I guess also linux guest disks would benefit from this but this
> requires a bit more research.

Not sure if there is any way to download a file by inode from a Linux
filesystem.  But it doesn't matter for this case.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html
