Edward Ned Harvey
2010-May-01 13:37 UTC
[zfs-discuss] Reverse lookup: inode to name lookup
Forget about files for the moment, because directories are fundamentally easier to deal with.

Let's suppose I've got the inode number of some directory in the present filesystem.

    [root at FILER ~]# ls -id /share/projects/foo/goo/rev1.0/working
    14363 /share/projects/foo/goo/rev1.0/working/

I want to identify the previous names & locations of that directory from snapshots.

    find /share/.zfs/snapshot -inum 14363

And I want to do it fast. I don't want to use "find" or anything else that needs to walk every tree of every snapshot. The answer needs to be essentially zero-time, just like the "ls -id" is essentially zero-time.

I understand you cannot lookup names by inode number in general, because that would present a security violation. Joe User should not be able to find the name of an item that's in a directory where he does not have permission.

But, even if it can only be run by root, is there some way to lookup the name of an object based on inode number?
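For readers who want to see why the "find -inum" approach is slow, here is a minimal Python sketch of the same idea: visit every entry under a root and compare inode numbers. The function name `find_by_inode` is made up for illustration, and this is not how find itself is implemented; it only shows that the cost grows with the number of entries visited.

```python
import os

def find_by_inode(root, inode):
    """Return every path under root whose inode number equals inode.

    Like "find root -inum N", this has to lstat() every entry, so its
    running time scales with the total number of files and directories.
    """
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat() so that symlinks report their own inode, not the target's
            if os.lstat(path).st_ino == inode:
                matches.append(path)
    return matches
```

Called as `find_by_inode("/share/.zfs/snapshot", 14363)`, it would walk every tree of every snapshot, which is exactly the work the question is trying to avoid.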
Casper.Dik at Sun.COM
2010-May-01 14:23 UTC
[zfs-discuss] Reverse lookup: inode to name lookup
>I understand you cannot lookup names by inode number in general, because
>that would present a security violation. Joe User should not be able to
>find the name of an item that's in a directory where he does not have
>permission.
>
>But, even if it can only be run by root, is there some way to lookup the
>name of an object based on inode number?

Sure, that's typically how NFS works.

The inode itself is not sufficient; an inode number might be recycled, and an old snapshot with the same inode number may refer to a different file.

Casper
On Sat, May 1, 2010 at 16:23, <Casper.Dik at sun.com> wrote:
>
>>I understand you cannot lookup names by inode number in general, because
>>that would present a security violation. Joe User should not be able to
>>find the name of an item that's in a directory where he does not have
>>permission.
>>
>>But, even if it can only be run by root, is there some way to lookup the
>>name of an object based on inode number?
>
> Sure, that's typically how NFS works.
>
> The inode itself is not sufficient; an inode number might be recycled, and
> an old snapshot with the same inode number may refer to a different file.

No, a NFS client will not ask the NFS server for a name by sending the inode or NFS-handle. There is no need for a NFS client to do that.

There is no way to get a name from an inode number.
Casper.Dik at Sun.COM
2010-May-01 14:49 UTC
[zfs-discuss] Reverse lookup: inode to name lookup
>No, a NFS client will not ask the NFS server for a name by sending the
>inode or NFS-handle. There is no need for a NFS client to do that.

The NFS clients, certainly version 2 and 3, only use the "file handle"; the file handle can be decoded by the server. The file handle does not contain the name, only the FSid, the inode number and the generation.

>There is no way to get a name from an inode number.

The nfs server knows how, so it is clearly possible. It is not exported to userland, but the kernel can find a file by its inumber.

Casper
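The role of the generation field can be shown with a toy model. The Python sketch below is not the NFS wire format or any real server's data structure; `FileHandle` and `InodeTable` are hypothetical names, and the only thing taken from the thread is that a handle carries an FSid, an inode number and a generation. It shows how a generation counter lets a server tell a recycled inode number apart from the object the handle was issued for.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FileHandle:
    # The three fields named in the message above; no file name is stored.
    fsid: int
    inode: int
    generation: int

class InodeTable:
    """Toy server-side table mapping inode number -> current generation."""

    def __init__(self):
        self._generation = {}

    def allocate(self, inode):
        # Each (re)use of an inode number bumps its generation.
        gen = self._generation.get(inode, 0) + 1
        self._generation[inode] = gen
        return gen

    def is_current(self, handle):
        # A handle is stale if the inode number has been recycled since.
        return self._generation.get(handle.inode) == handle.generation
```

For example, a handle built with `FileHandle(1, 14363, table.allocate(14363))` is current until `table.allocate(14363)` is called again, after which the old handle no longer matches. An inode number alone carries no such check, which is the recycling problem raised earlier in the thread.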
On Sat, May 1, 2010 at 16:49, <Casper.Dik at sun.com> wrote:
>
>>No, a NFS client will not ask the NFS server for a name by sending the
>>inode or NFS-handle. There is no need for a NFS client to do that.
>
> The NFS clients, certainly version 2 and 3, only use the "file handle";
> the file handle can be decoded by the server. The file handle does not
> contain the name, only the FSid, the inode number and the generation.
>
>>There is no way to get a name from an inode number.
>
> The nfs server knows how, so it is clearly possible. It is not exported to
> userland, but the kernel can find a file by its inumber.

The nfs server can find the file but not the file _name_.

inode is all that the NFS server needs, it does not need the file name if it has the inode number.
Edward Ned Harvey
2010-May-01 17:44 UTC
[zfs-discuss] Reverse lookup: inode to name lookup
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-bounces at opensolaris.org] On Behalf Of Mattias Pantzare
>
> The nfs server can find the file but not the file _name_.
>
> inode is all that the NFS server needs, it does not need the file name
> if it has the inode number.

It is not useful or helpful for you guys to debate whether or not this is possible. And it is especially not helpful to flat out say "it's not possible."

Here is the final word on whether or not it's possible:

Whenever any process calls "open('/some/path/filename')" that system call is handled by the kernel, recursively resolving name to inode number, checking permissions, and opening that inode number, until the final inode is identified and opened, or some error is encountered.

The point is: Obviously, the kernel has the facility to open an inode by number. However, for security reasons (enforcing permissions of parent directories before the parent directories have been identified), the ability to open an arbitrary inode by number is not normally made available to user level applications, except perhaps when run by root.

At present, a file inode does not contain any reference to its parent directory or directories. But that's just a problem inherent to files. It is fundamentally easier to reverse lookup a directory by inode number, because this information is already in the filesystem. No filesystem enhancements are needed to reverse lookup a directory by inode number, because: (a) every directory contains an entry ".." which refers to its parent by number, and (b) every directory has precisely one parent, and no more. There is no such thing as a hardlink copy of a directory. Therefore, there is exactly one absolute path to any directory in any ZFS filesystem.

If the kernel (or root) can open an arbitrary directory by inode number, then the kernel (or root) can find the inode number of its parent by looking at the '..' entry, which the kernel (or root) can then open, and identify both: (a) the name of the child subdir whose inode number is already known, and (b) yet another '..' entry. The kernel (or root) can repeat this process recursively, up to the root of the filesystem tree. At that time, the kernel (or root) has completely identified the absolute path of the inode that it started with.

The only question I want answered right now is:

Although it is possible, is it implemented? Is there any kind of function, or existing program, which can be run by root, to obtain either the complete path of a directory by inode number, or to simply open an inode by number, which would leave the recursion and absolute path generation yet to be completed?
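The '..' walk described above can be sketched in Python. Two assumptions are worth stating loudly. First, the Python standard library has no call to open a directory by inode number, so this sketch starts from a path that already reaches the directory; it illustrates only the climb to the root, not the open-by-inode step the question asks about. Second, the function name `dir_path_via_dotdot` is hypothetical. Entries are compared by device and inode together, so that a directory which is a mount point is still matched correctly in its parent.

```python
import os

def dir_path_via_dotdot(start="."):
    """Rebuild the absolute path of directory `start` by following '..'."""
    parts = []
    here = start
    here_st = os.stat(here)
    while True:
        parent = os.path.join(here, "..")
        parent_st = os.stat(parent)
        here_id = (here_st.st_dev, here_st.st_ino)
        if (parent_st.st_dev, parent_st.st_ino) == here_id:
            break  # at the root, '..' refers to the directory itself
        # Scan the parent for the entry whose (device, inode) is the child's.
        name = None
        with os.scandir(parent) as entries:
            for entry in entries:
                if not entry.is_dir(follow_symlinks=False):
                    continue
                st = entry.stat(follow_symlinks=False)
                if (st.st_dev, st.st_ino) == here_id:
                    name = entry.name
                    break
        if name is None:
            raise FileNotFoundError("no entry for child found in " + parent)
        parts.append(name)
        here, here_st = parent, parent_st
    return "/" + "/".join(reversed(parts))
```

Each step scans one parent directory, so the cost depends on the depth of the path and the size of each ancestor directory, not on the size of the whole filesystem.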
> If the kernel (or root) can open an arbitrary directory by inode number,
> then the kernel (or root) can find the inode number of its parent by looking
> at the '..' entry, which the kernel (or root) can then open, and identify
> both: (a) the name of the child subdir whose inode number is already known, and
> (b) yet another '..' entry. The kernel (or root) can repeat this process
> recursively, up to the root of the filesystem tree. At that time, the
> kernel (or root) has completely identified the absolute path of the inode
> that it started with.
>
> The only question I want answered right now is:
>
> Although it is possible, is it implemented? Is there any kind of function,
> or existing program, which can be run by root, to obtain either the complete
> path of a directory by inode number, or to simply open an inode by number,
> which would leave the recursion and absolute path generation yet to be
> completed?

You can do it in the kernel by calling vnodetopath(). I don't know if it is exposed to user space.

But that could be slow if you have large directories so you have to think about where you would use it.
Casper.Dik at Sun.COM
2010-May-02 11:17 UTC
[zfs-discuss] Reverse lookup: inode to name lookup
>You can do it in the kernel by calling vnodetopath(). I don't know if it
>is exposed to user space.

Yes, in /proc/*/path (kinda).

>But that could be slow if you have large directories so you have to
>think about where you would use it.

The kernel caches file names; however, it cannot be used for files that aren't in use.

It is certainly possible to create a .zfs/snapshot_byinode, but it is not clear when it helps; it can be used for finding the earlier copy of a directory (netapp/.snapshot).

Casper
Edward Ned Harvey
2010-May-02 14:56 UTC
[zfs-discuss] Reverse lookup: inode to name lookup
> From: casper at holland.sun.com [mailto:casper at holland.sun.com] On Behalf
> Of Casper.Dik at Sun.COM
>
> It is certainly possible to create a .zfs/snapshot_byinode but it is not
> clear when it helps but it can be used for finding the earlier copy of a
> directory (netapp/.snapshot)

Do you happen to have any idea how easy/difficult it would be, to create something like ".zfs/snapshot_by_inode"? And although I hypothetically described the behavior of such a thing once before, I don't recall seeing any response to that. So I don't know if there's any agreement on the proposed behavior.

That suggestion, again, was:

Inside the ".zfs" directory, there is presently only one subdirectory, "snapshot." Let there be more:

    .zfs/snapshot
    .zfs/name_from_inode
    .zfs/inode_from_name

Inside the ".zfs/snapshot" directory, there's a list of snapshots. The same is true for the name_from_inode, and inode_from_name directories too. However, inside the ".zfs/snapshot/snapname" directory, there is an actual snapshot of all the files and directories of the filesystem. The other two directories behave as follows:

-- #1 -- .zfs/name_from_inode:

The .zfs/name_from_inode directory provides a mechanism to perform reverse inode-->name lookup. If a user does "ls .zfs/name_from_inode/snapname" then they will see nothing. (The system will not generate a complete list of all the inodes in the filesystem; that would be crazy). But if they explicitly "cat .zfs/name_from_inode/snapname/12345" then the system does an inode-->name reverse lookup, and if the result is accessible with the user's permission level, then the result is just a text output of the pathname of the object. (Presumably a directory, because there is currently no facility to reverse lookup a file. Directories do have a reference to their parent, via '..' entries, but files have no such thing.)

Thus, if a user wants to find all the old snapshots of a directory in the present filesystem, even if the name or location of that directory may have changed, they could do this:

    ls -di /tank/path/to/some/dirname
        (result inode number 12345)
    cat /tank/.zfs/name_from_inode/snapname/12345
        (result path/to/previous/old-dirname)
    ls /tank/.zfs/snapshot/snapname/path/to/previous/old-dirname

And I am slowly working on scripts now, to simplify all the above into a single command:

    zhist ls /tank/path/to/some/dirname

would display all the former snapnames for that object.

It is possible for the OS, in between two snapshots, to recycle an inode number. So it is possible to mistakenly identify some completely unrelated object as a former snapshot of a present object. As far as I know, this is unavoidable, but also unlikely.

-- #2 -- .zfs/inode_from_name:

The .zfs/inode_from_name directory provides a mechanism to find the inode number of an object, when "ls -di" is not possible, because CIFS doesn't support inodes. So a CIFS client would do this:

    cat /tank/.zfs/inode_from_name/path/to/some/dirname
        (result inode number 12345)

The rest of the process would be the same as above.

    cat /tank/.zfs/name_from_inode/snapname/12345
        (result path/to/previous/old-dirname)
    ls /tank/.zfs/snapshot/snapname/path/to/previous/old-dirname
On 2010-May-02 01:44:51 +0800, Edward Ned Harvey <solaris2 at nedharvey.com> wrote:
>Obviously, the kernel has the facility to open an inode by number. However,
>for security reasons (enforcing permissions of parent directories before the
>parent directories have been identified), the ability to open an arbitrary
>inode by number is not normally made available to user level applications,
>except perhaps when run by root.

There is no provision in normal Unix to open a file by inode from userland. Some filesystems (eg HP Tru64) may expose a special pseudo-directory that exposes all the inodes. Note that opening a file by inode number is a completely different issue to mapping an inode number to a pathname.

>because: (a) every directory contains an entry ".." which refers to its
>parent by number, and (b) every directory has precisely one parent, and no
>more. There is no such thing as a hardlink copy of a directory. Therefore,
>there is exactly one absolute path to any directory in any ZFS filesystem.

s/is/should be/ - I haven't checked with ZFS but it may be possible to trick/corrupt the filesystem into allowing a second real name (though the filesystem is then inconsistent).

>If the kernel (or root) can open an arbitrary directory by inode number,
>then the kernel (or root) can find the inode number of its parent by looking
>at the '..' entry, which the kernel (or root) can then open, and identify
>both: (a) the name of the child subdir whose inode number is already known, and
>(b) yet another '..' entry. The kernel (or root) can repeat this process
>recursively, up to the root of the filesystem tree. At that time, the
>kernel (or root) has completely identified the absolute path of the inode
>that it started with.

Any user can do this (subject to permissions) and this is how 'pwd' was traditionally implemented. Note that you need to check device and inode, not just inode, to correctly handle mountpoints.

-- Peter Jeremy
Edward Ned Harvey
2010-May-06 03:03 UTC
[zfs-discuss] Reverse lookup: inode to name lookup
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-bounces at opensolaris.org] On Behalf Of Edward Ned Harvey

Thanks to Victor, here is at least proof of concept that yes, it is possible to reverse resolve, inode number --> pathname, and yes, it is almost infinitely faster than doing something like "find":

Root can reverse lookup names of inodes with this command:

    zdb -dddd <dataset_name> <object_number>

(on a tangent) Surprisingly, it is not limited to just looking up directories. It finds files too (sort of). Apparently a file inode does contain *one* reference to its latest parent. But if you hardlink more than once, you'll only find the latest parent, and if you rm the latest hardlink, then it'll still find only the latest parent, which has been unlinked, and therefore not valid. But it works perfectly for directories. (back from tangent)

Regardless of how big the filesystem is, regardless of cache warmness, regardless of how many inodes you want to reverse-lookup, this zdb command takes between 1 and 2 seconds per filesystem, fixed. In other words, the operation of performing reverse-lookup per inode is essentially zero time, but there is some kind of "startup" overhead.

In theory at least, the reverse lookup could be equally as fast as a regular forward lookup, such as "ls" or "stat". But my measurements also show that a forward lookup incurs some form of "startup" overhead. A forward lookup on an already mounted filesystem should require a few ms. But in my example below, it takes several hundred ms per snapshot, which means there's a "warmup" period for some reason, to open up each snapshot.

Find, of course, scales linearly with the total number of directories/files in the filesystem.

On my company filer, I got these results:

Just do a forward lookup:
    time ls -d /tank/somefilesystem/.zfs/snapshot/*/some_object
    took 24 sec, on my 53 snapshots (that's 0.45 sec per snapshot)

Using a for loop and zdb to reverse lookup those things:
    took 1m 3sec, on my 53 snapshots (that's 1.19 sec per snapshot)

Using "find -inum" to locate all those things:
    I only let it complete 4 snapshots. Took 33 mins per snapshot.

So that's a marvelous proof of concept. Yes, reverse lookup is possible, and it's essentially infinitely faster than "find -inum" can be. I have a feeling a reverse-lookup application could be even faster, if it were an application designed specifically for this purpose.

Zdb is not a suitable long term solution for this purpose. Zdb is only sufficient here, as a proof of concept.

Here's the problem with zdb:

man zdb
DESCRIPTION
     The zdb command is used by support engineers to diagnose failures and gather statistics. Since the ZFS file system is always consistent on disk and is self-repairing, zdb should only be run under the direction by a support engineer.

     If no arguments are specified, zdb, performs basic consistency checks on the pool and associated datasets, and report any problems detected.

     Any options supported by this command are internal to Sun and subject to change at any time.
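A rough Python sketch of wrapping the zdb command follows. The only thing taken from the message above is the command form `zdb -dddd <dataset_name> <object_number>`. Everything else is an assumption: the thread does not show zdb's output, so the parser assumes the output contains a line starting with the word "path" followed by the pathname, and the function names are hypothetical. Since the man page warns that zdb's options may change at any time, treat this as a proof-of-concept wrapper only.

```python
import re
import subprocess

def path_from_zdb_output(text):
    """Return the value of the first 'path' line in the output, or None.

    Assumption: the output has a line like "<whitespace>path<whitespace>/a/b".
    """
    for line in text.splitlines():
        match = re.match(r"\s*path\s+(\S.*)$", line)
        if match:
            return match.group(1).strip()
    return None

def reverse_lookup(dataset, object_number):
    """Run 'zdb -dddd <dataset> <object>' (needs root) and parse the result."""
    result = subprocess.run(
        ["zdb", "-dddd", dataset, str(object_number)],
        capture_output=True, text=True, check=True,
    )
    return path_from_zdb_output(result.stdout)
```

A loop over snapshots would call `reverse_lookup` once per snapshot; whether a snapshot name of the form `pool/fs@snap` can be passed as the dataset is also an assumption here. For scale, using only the figures quoted above: 53 snapshots at 33 minutes each would be about 29 hours for find, against 63 seconds for the zdb loop.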