# dtrace -n 'syscall::open64:return /execname == "cat"/ { trace(fds[arg0].fi_pathname); }' &
dtrace: description 'syscall::open64:return ' matched 1 probe
# touch /tmp/old.name
# mv /tmp/old.name /tmp/new.name
# cat /tmp/new.name
CPU     ID                    FUNCTION:NAME
  0  40825                    open64:return   /tmp/old.name

fds[] uses ->v_path directly, but a rename(2) never updates the cached
path. I assume there's a good reason that ->v_path can't be updated on
rename(2)? It also means that pfiles(1) can't resolve the path of the
file.

regards
john
On Fri, Apr 07, 2006 at 06:34:41PM +0100, John Levon wrote:
> fds[] uses ->v_path directly, but a rename(2) never updates the cached path. I
> assume there's a good reason that ->v_path can't be updated on rename(2)? It
> also means that pfiles(1) can't resolve the path of the file, as well.

The introduction of v_path was purely for observability, not
correctness. It's actually impossible to guarantee correctness in all
circumstances. That being said, we should strive to improve
observability in the common cases.

For rename() in particular, the reason it wasn't done is that, due to
the nature of the VFS interfaces, it can't be done in a generic way.
This means that every filesystem would need to call back into a private
VFS interface to update the path.

There's nothing stopping bundled ON filesystems from doing this. In
fact, NFS was enhanced to do just this not too long ago. It would be a
small amount of code to fix UFS and ZFS to do this callback as well.

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
Eric Schrock wrote On 04/07/06 12:44,:
>There's nothing stopping bundled ON filesystems from doing this. In
>fact, NFS was enhanced to do just this not too long ago. It would be a
>small amount of code to fix UFS and ZFS to do this callback as well.

I made the change to nfs for v_path to be updated on rename, but this is
only slightly better. There are still cases where the path is wrong;
consider:

$ echo hi > a
$ ln a b
$ rm a
$ echo bye > a

Now there will be two different vnodes with the same v_path.

I've thought about a way to fix this, but I'm not sure it is worth it.

Also, what happens when a directory is renamed? The v_path for that
directory would get updated, but what about any file in that directory
that already has the v_path set with the old directory name?

Just my thoughts.

Jim
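The link/rm sequence above can be replayed in userland to see the collision. The following is only a sketch: it uses a Python dict keyed by inode number as a hypothetical stand-in for the kernel's per-vnode v_path (the dict, the function name, and the temporary directory are all assumptions for illustration, not anything from the Solaris source). It shows that the original inode survives under the name "b" while a second, different inode now owns the name "a", so a path cached at first lookup is identical for both.

```python
import os
import tempfile


def demo_same_cached_path():
    """Replay 'echo hi > a; ln a b; rm a; echo bye > a' in a scratch directory."""
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a")
        b = os.path.join(d, "b")

        # Hypothetical stand-in for v_path: inode number -> path cached
        # the first time the object was looked up, never updated afterwards.
        cached_path = {}

        with open(a, "w") as f:            # $ echo hi > a
            f.write("hi\n")
        first = os.stat(a).st_ino
        cached_path[first] = a

        os.link(a, b)                      # $ ln a b
        os.unlink(a)                       # $ rm a  (the inode lives on as "b")

        with open(a, "w") as f:            # $ echo bye > a
            f.write("bye\n")
        second = os.stat(a).st_ino
        cached_path[second] = a

        # The first inode is still reachable, but only through "b".
        first_still_alive = os.stat(b).st_ino == first
        return (first, second,
                cached_path[first], cached_path[second],
                first_still_alive)
```

Calling demo_same_cached_path() returns two distinct inode numbers paired with the same cached string, which is the "two different vnodes with the same v_path" situation described above.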
On Fri, Apr 07, 2006 at 04:26:46PM -0500, james wahlig wrote:
> Now there will be two different vnodes with the same v_path.
>
> I've thought about a way to fix this, but I'm not sure it is worth it.
>
> Also, what happens when a directory is renamed? The v_path for that
> directory would get updated, but what about any file in that directory
> that already has the v_path set with the old directory name?
>
> Just my thoughts.

You could make this all work for local filesystems, but for distributed
filesystems that lack event notifications you couldn't really, and even
then, you couldn't make such updates synchronous...

> Eric Schrock wrote On 04/07/06 12:44,:
> >For rename() in particular, the reason it wasn't done is that due to the
> >nature of the VFS interfaces, it can't be done in a generic way. This
> >means that every filesystem would need to call back into a private VFS
> >interface to update the path.

What about FEM? But, yes, FEM would have to be pushed all over the
place. And FEM isn't a public interface, IIRC.

Nico
>>>>> "Jim" == james wahlig <James.Wahlig at sun.com> writes:

Jim> $ echo hi > a
Jim> $ ln a b
Jim> $ rm a
Jim> $ echo bye > a

Jim> Now there will be two different vnodes with the same v_path.

Jim> I've thought about a way to fix this, but I'm not sure it is worth
Jim> it.

Jim> Also, what happens when a directory is renamed? The v_path for
Jim> that directory would get updated, but what about any file in that
Jim> directory that already has the v_path set with the old directory
Jim> name?

The NFSv4 client code has to deal with these issues for servers that use
volatile filehandles. So there's a file name struct (nfs4_fname_t) that
was written specifically to address issues like renaming an ancestor
directory. I'd expect it to handle the link/rm case, too.

The code might not do exactly what you need (IIRC, there are some
locking issues with extracting the path name), but it might be a good
start.

mike
>I made the change to nfs for v_path to be updated on rename, but this is
>only slightly better. There are still cases where the path is wrong,
>consider:
>
>$ echo hi > a
>$ ln a b
>$ rm a
>$ echo bye > a

Quite; I remember Linux is broken in more or less the same way.

>Now there will be two different vnodes with the same v_path.

v_path is a hint which needs to be verified.

>Also, what happens when a directory is renamed? The v_path for that
>directory would get updated, but what about any file in that directory
>that already has the v_path set with the old directory name?

Yep, utter chaos :-)

Casper
John Levon wrote:
> fds[] uses ->v_path directly, but a rename(2) never updates the cached path. I
> assume there's a good reason that ->v_path can't be updated on rename(2)? It
> also means that pfiles(1) can't resolve the path of the file, as well.

Wouldn't you be better off defining v_path as "the name the file was
opened with", documenting it as "may not be the only name for the file,
and could become meaningless at any time" - and making pfiles show it
too?

It seems to me that what you're asking is in fundamental conflict with
the nature of a traditional Unix filesystem.

Cheers,
  Jeremy
On Sat, Apr 08, 2006 at 03:45:33PM +0100, Jeremy Harris wrote:
> Wouldn't you be better off defining v_path as "the name the file was
> opened with"

But this is precisely what is /not/ happening in my example, and indeed
pfiles(1) explicitly calls out this definition as incorrect. ->v_path is
slippier to pin down than that.

> documenting it as "may not be the only name for the
> file, and could become meaningless at any time" - and making pfiles
> show it too?

I'm not sure what our documentation says/will say about this, but it
should certainly make clear that ->v_path is not guaranteed to be
"correct". The case I describe is (AFAIK) fairly simple to deal with:
it's certainly not sensible to make ->v_path always "correct", but this
particular case should be a small change for a larger gain.

regards
john
John Levon wrote:
> On Sat, Apr 08, 2006 at 03:45:33PM +0100, Jeremy Harris wrote:
>>Wouldn't you be better off defining v_path as "the name the file was
>>opened with"
>
> But this is precisely what is /not/ happening in my example, and indeed
> pfiles(1) explicitly calls out this definition as incorrect. ->v_path is
> slippier to pin down than that.

Yes. I'm not convinced it's particularly useful in that state. Change
the implementation.

> I'm not sure what our documentation says/will say about this, but it should
> certainly make clear that ->v_path is not guaranteed to be "correct". The case
> I describe is (AFAIK) fairly simple to deal with: it's certainly not sensible
> to make ->v_path always "correct", but this particular case should be a small
> change for a larger gain.

It's not sufficient though; it leaves v_path still with a host of
inconsistencies. I don't see that patching this case is worthwhile.
Instead, the requirement should be readdressed.

Cheers,
  Jeremy
On Sat, Apr 08, 2006 at 04:53:36PM +0100, Jeremy Harris wrote:
> >I'm not sure what our documentation says/will say about this, but it should
> >certainly make clear that ->v_path is not guaranteed to be "correct".
>
> It's not sufficient though; it leaves v_path still with a host of
> inconsistencies. I don't see that patching this case is worthwhile.
> Instead, the requirement should be readdressed.

Please, there are two main points here:

#1) The path information stored in v_path is just an estimate. The
    "documentation" for an internal implementation detail is, and
    always will be, the source. A simple look at v_path, the comments,
    and how it's used is enough to know that it's not guaranteed. Now,
    the DTrace documentation for the I/O provider could probably point
    this out, perhaps that is "our documentation" you're referring to?

#2) The public interface for this information is /proc/<pid>/path (which
    pfiles uses), and this is GUARANTEED to be correct[1], or to be
    unavailable.

And to re-emphasize the points already made:

#1) It's impossible to guarantee the correctness of v_path for reasons
    outlined before. And anything more complicated than a simple string
    makes it impossible for use by DTrace. I don't know how "the
    requirement should be readdressed", but it's fundamentally not
    possible given the VFS and DTrace constraints.

#2) If there is a common case (such as a rename) where a simple fix can
    improve observability by a significant amount, then we should do it.
    This has already been shown to be a significant win with NFS
    observability.

Hope that clears things up.

- Eric

[1] Obviously, there is an inherent asynchronous nature of pfiles(1). If
    a file is renamed after pfiles has looked up the path but before
    it's printed out, there's not much that we can do.
On Sat, Apr 08, 2006 at 04:18:26PM +0100, John Levon wrote:
> On Sat, Apr 08, 2006 at 03:45:33PM +0100, Jeremy Harris wrote:
> > Wouldn't you be better off defining v_path as "the name the file was
> > opened with"
>
> But this is precisely what is /not/ happening in my example, and indeed
> pfiles(1) explicitly calls out this definition as incorrect. ->v_path is
> slippier to pin down than that.

The name it had when the vnode was looked up? A name the file once had.

Since it can change anytime I'm not sure that it's useful to track the
name the file had when it was opened, unless that is tracked in the
struct file (file descriptor), rather than in the vnode (which would be
the wrong place for tracking the name the file had when a particular
file descriptor for it was opened).

What you may want is f_audit_data (which is only available if auditing
is turned on).

Nico
On Sat, 2006-04-08 at 04:58, Casper.Dik at Sun.COM wrote:
> >Also, what happens when a directory is renamed? The v_path for that
> >directory would get updated, but what about any file in that directory
> >that already has the v_path set with the old directory name?
> Yep, utter chaos :-)

On the other hand, if vnode-to-name information were to be cached on a
pathname component-by-pathname component basis (essentially an inverse
DNLC query which, if given a vp, yields <parent vp, entryname> pairs)
you would get (at least some) directory renames for free.
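A toy model may make the component-by-component idea easier to picture. This is a sketch under stated assumptions, not the DNLC: the Vnode class, its fields, and path_of() are hypothetical names invented here, and the model ignores hard links, locking, and cache purging entirely. Each object stores only its <parent vp, entryname> pair, and the full path is rebuilt by walking upward, so renaming a directory touches a single entry and every descendant's reconstructed path follows.

```python
class Vnode:
    """Hypothetical toy vnode that remembers only <parent vnode, entry name>."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def path_of(vp):
    """Rebuild a path by walking parent links until the root is reached."""
    parts = []
    while vp.parent is not None:
        parts.append(vp.name)
        vp = vp.parent
    return "/" + "/".join(reversed(parts))


def demo_directory_rename():
    root = Vnode("")
    directory = Vnode("old", root)
    leaf = Vnode("file", directory)

    before = path_of(leaf)

    # Renaming the directory changes one entry name only; nothing stored
    # on the leaf has to be rewritten.
    directory.name = "new"

    after = path_of(leaf)
    return before, after
```

The contrast with a flat cached string is that the leaf holds no copy of the directory name that could go stale. The walk itself is the cost, and this userland toy says nothing about whether such a walk could be done safely from probe context.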
On Sat, Apr 08, 2006 at 03:51:05PM -0400, Bill Sommerfeld wrote:
> on the other hand, if vnode-to-name information were to be cached on a
> pathname component-by-pathname component basis (essentially an inverse
> DNLC query which, if given a vp, yields <parent vp, entryname> pairs)
> you would get (at least some) directory renames for free.

Except that DTrace wouldn't be able to traverse this hierarchy from
arbitrary probe context.

- Eric
>on the other hand, if vnode-to-name information were to be cached on a
>pathname component-by-pathname component basis (essentially an inverse
>DNLC query which, if given a vp, yields <parent vp, entryname> pairs)
>you would get (at least some) directory renames for free.

Yes, but that requires the DNLC cache to be unpurgeable for active
vnodes and the containing directories.

Casper
On Sat, Apr 08, 2006 at 10:35:34AM -0700, Eric Schrock wrote:
> #1) The path information stored in v_path is just an estimate. The
>     "documentation" for an internal implementation detail is, and
>     always will be, the source. A simple look at v_path, the comments,
>     and how it's used is enough to know that it's not guaranteed. Now,
>     the DTrace documentation for the I/O provider could probably point
>     this out, perhaps that is "our documentation" you're referring to?

I was referring to the user-visible implementations based upon ->v_path,
namely the future documentation for DTrace's fds[]. I do not expect
anything more than a brief sentence mentioning that fds[].fi_pathname
and friends are not guaranteed correct.

regards,
john
>I was referring to the user-visible implementations based upon ->v_path, namely
>the future documentation for DTrace's fds[]. I do not expect anything more than
>a brief sentence mentioning that fds[].fi_pathname and friends is not
>guaranteed correct.

Is there any reason for the dtrace program not to use filtering
similar to pfiles?

Casper
On Sun, Apr 09, 2006 at 10:08:39AM +0200, Casper.Dik at Sun.COM wrote:
> Is there any reason for the dtrace program not to use filtering
> similar to pfiles?

I don't know how you could do the equivalent of /proc/pid/path/ handling
in probe context, and I'd rather have something that might be out of
date than nothing at all.

regards,
john
>On Sun, Apr 09, 2006 at 10:08:39AM +0200, Casper.Dik at Sun.COM wrote:
>> Is there any reason for the dtrace program not to use filtering
>> similar to pfiles?
>
>I don't know how you could do the equivalent of /proc/pid/path/ handling in
>probe context, and I'd rather have something that might be out of date than
>nothing at all.

I wasn't thinking about probe context; I was thinking about the
context of the dtrace process. (Probe context saves v_path and
<dev, ino>; dtrace(1) uses stat() to verify)

Casper
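The userland half of that suggestion, checking a recorded path hint against a recorded <dev, ino> pair with stat(), might look roughly like the sketch below. The function names and the scratch-directory demo are assumptions made for illustration; nothing here reflects how dtrace(1) or pfiles(1) is actually implemented, and how probe context would record the <dev, ino> pair is left out entirely.

```python
import os
import tempfile


def verify_path_hint(path, dev, ino):
    """Return path if it still names the object (dev, ino), otherwise None."""
    try:
        st = os.stat(path)
    except OSError:
        # The name no longer exists (or cannot be reached): hint is stale.
        return None
    if (st.st_dev, st.st_ino) == (dev, ino):
        return path
    # The name exists but now refers to some other object.
    return None


def demo_verify():
    """Replay the touch/mv case from the start of the thread."""
    with tempfile.TemporaryDirectory() as d:
        old = os.path.join(d, "old.name")
        new = os.path.join(d, "new.name")

        with open(old, "w"):               # touch old.name
            pass
        st = os.stat(old)                  # the <dev, ino> a probe would save

        ok_before = verify_path_hint(old, st.st_dev, st.st_ino) == old
        os.rename(old, new)                # mv old.name new.name
        stale = verify_path_hint(old, st.st_dev, st.st_ino)
        ok_after = verify_path_hint(new, st.st_dev, st.st_ino) == new
        return ok_before, stale, ok_after
```

After the rename, the stale hint is rejected rather than printed, so the consumer reports "unavailable" instead of a wrong name. The check is still a snapshot: the file can be renamed again between the stat() and the moment the path is displayed.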
On Sun, Apr 09, 2006 at 04:12:04PM +0200, Casper.Dik at Sun.COM wrote:
> I wasn't thinking about probe context; I was thinking about the
> context of the dtrace process. (Probe context saves v_path and
> <dev, ino>; dtrace(1) uses stat() to verify)

I've thought about this for syscall arguments also. Provide an option
to stop the victim and let the DTrace consumer use /proc to get at the
syscall arguments, thus obviating the page fault issue. Yes, this is
destructive, and so counter to the spirit of DTrace, but then, DTrace
already provides destructive actions, like stop().

Nico
On Sun, Apr 09, 2006 at 04:12:04PM +0200, Casper.Dik at Sun.COM wrote:
> I wasn't thinking about probe context; I was thinking about the
> context of the dtrace process. (Probe context saves v_path and
> <dev, ino>; dtrace(1) uses stat() to verify)

Might be a useful extension, possibly, but would only help if the path
is just being stored rather than acted upon.

regards,
john
On Sun, Apr 09, 2006 at 07:37:44PM +0100, John Levon wrote:
> On Sun, Apr 09, 2006 at 04:12:04PM +0200, Casper.Dik at Sun.COM wrote:
> > I wasn't thinking about probe context; I was thinking about the
> > context of the dtrace process. (Probe context saves v_path and
> > <dev, ino>; dtrace(1) uses stat() to verify)
>
> Might be a useful extension, possibly, but would only help if the path is just
> being stored rather than acted upon.

Precisely. Postprocessing is generally bad news, because as John points
out, it means that you can't use the result in a predicate. Or worse,
the use becomes inconsistent: it means one thing when stored, but
another when compared against. This is why ufunc(), usym(), umod(), etc.
are actions and not variables -- and it limits their utility. In the
case of ufunc() and co. we don't have a choice; in the case of the path,
we do -- and we will opt to stick with only the processing that we can
safely do in probe context.

- Bryan

--------------------------------------------------------------------------
Bryan Cantrill, Solaris Kernel Development.       http://blogs.sun.com/bmc