Hello, I have?recently started using iosnoop from the dtrace toolkit to monitor?file level io. It works great (!) on ufs file systems, but I?am unable to see any file names?when monitoring?processes running on?a zfs file system. As is documented,?"<none>" is displayed for the file name when it can not be determined from the file system. ? Ideally I would like to gather file level ( named) IO Read/Write Bytes as well as IO Read/Write Counts, like the output of iosnoop. ? During my search of the list archives it appears that this issue came up in?March 2008.?But I haven''t been able to determine if there was a resolution. If there has not been a solution to this, then?it would be great if someone could point me in the right direction?so that I can experiment with some?other probes.?I''ve noticed that the fbt provider has many zfs related probes, but I am unable to find the docs that describe them?individually in?detail. Thanks, johndz
John Dzilvelis wrote:> Hello, > I have recently started using iosnoop from the dtrace toolkit to monitor file level io. It works great (!) on ufs file systems, but I am unable to see any file names when monitoring processes running on a zfs file system. As is documented, "<none>" is displayed for the file name when it can not be determined from the file system. > Ideally I would like to gather file level ( named) IO Read/Write Bytes as well as IO Read/Write Counts, like the output of iosnoop. > During my search of the list archives it appears that this issue came up in March 2008. But I haven''t been able to determine if there was a resolution. > If there has not been a solution to this, then it would be great if someone could point me in the right direction so that I can experiment with some other probes. I''ve noticed that the fbt provider has many zfs related probes, but I am unable to find the docs that describe them individually in detail.Tracing IO through ZFS is very complex because of the various layers through which IO passes. If you want to watch arbitrary file io requests you''ll want to stick with syscall probes and the like. DTracing IO at a high level is easy, same at a low level, its the in between with ZFS thats tough. The FBT provider exposes every functions entry and return, so for a given fbt:zfs:whatever probe just look for that function in the ZFS code (cvs.opensolaris.org). How you trace will depend on what you want to know, but I find that a good starting point is to aggregate callstacks from low level zio functions, thus allowing me to see how we got there; ie: dtrace ''fbt:zfs:zio_read:entry{ @[stack(20)] = count(); }''. Based on the call stack I can use other FBT probes to explore. When all else fails I have gone so far as to trace every function in the kernel (this works for about 1-2 seconds and produces an output file around 500M on my personal workstation at home)... not something I''d do in production but an amazing learning experience. ;) benr.
On Aug 22, 2008, at 7:31 AM, Ben Rockwood wrote:> John Dzilvelis wrote: >> Hello, >> I have recently started using iosnoop from the dtrace toolkit to >> monitor file level io. It works great (!) on ufs file systems, but >> I am unable to see any file names when monitoring processes running >> on a zfs file system. As is documented, "<none>" is displayed for >> the file name when it can not be determined from the file system. >> Ideally I would like to gather file level ( named) IO Read/Write >> Bytes as well as IO Read/Write Counts, like the output of iosnoop. >> During my search of the list archives it appears that this issue >> came up in March 2008. But I haven''t been able to determine if >> there was a resolution. >> If there has not been a solution to this, then it would be great if >> someone could point me in the right direction so that I can >> experiment with some other probes. I''ve noticed that the fbt >> provider has many zfs related probes, but I am unable to find the >> docs that describe them individually in detail. > > Tracing IO through ZFS is very complex because of the various layers > through which IO passes. If you want to watch arbitrary file io > requests you''ll want to stick with syscall probes and the like. > DTracing IO at a high level is easy, same at a low level, its the in > between with ZFS thats tough. > > The FBT provider exposes every functions entry and return, so for a > given fbt:zfs:whatever probe just look for that function in the ZFS > code > (cvs.opensolaris.org). > > How you trace will depend on what you want to know, but I find that a > good starting point is to aggregate callstacks from low level zio > functions, thus allowing me to see how we got there; ie: dtrace > ''fbt:zfs:zio_read:entry{ @[stack(20)] = count(); }''. Based on the > call > stack I can use other FBT probes to explore. When all else fails I > have > gone so far as to trace every function in the kernel (this works for > about 1-2 seconds and produces an output file around 500M on my > personal > workstation at home)... not something I''d do in production but an > amazing learning experience. ;)yep - you won''t see filenames from iosnoop on zfs since it doesn''t call bdev_strategy() directly which is what io:::start works with .. this came up last year with Mauro on the zfs-discuss alias .. if you''re looking for file pathnames take a look at the following example I whipped up after that discussion for use on a project to track timings to zfs based files (which can be particularly enlightening.) #!/usr/sbin/dtrace -s # pragma D option quiet zfs_write:entry, zfs_read:entry, zfs_putpage:entry, zfs_getpage:entry { self->ts = timestamp; self->filepath = args[0]->v_path; } zfs_write:return, zfs_read:return, zfs_putpage:return, zfs_getpage:return /self->ts && self->filepath/ { printf("%s on %s took %d nsecs\n", probefunc, stringof(self->filepath), timestamp - self->ts); self->ts = 0; self->filepath = 0; } --- for IO sizes and counts though .. take a look at the calls to extract the right arguments you might be looking for --- .je
For those that are interested there is an open bug on this: 6266202: DTrace io provider doesn''t work with ZFS Jon.> On Aug 22, 2008, at 7:31 AM, Ben Rockwood wrote: > > >> John Dzilvelis wrote: >> >>> Hello, >>> I have recently started using iosnoop from the dtrace toolkit to >>> monitor file level io. It works great (!) on ufs file systems, but >>> I am unable to see any file names when monitoring processes running >>> on a zfs file system. As is documented, "<none>" is displayed for >>> the file name when it can not be determined from the file system. >>> Ideally I would like to gather file level ( named) IO Read/Write >>> Bytes as well as IO Read/Write Counts, like the output of iosnoop. >>> During my search of the list archives it appears that this issue >>> came up in March 2008. But I haven''t been able to determine if >>> there was a resolution. >>> If there has not been a solution to this, then it would be great if >>> someone could point me in the right direction so that I can >>> experiment with some other probes. I''ve noticed that the fbt >>> provider has many zfs related probes, but I am unable to find the >>> docs that describe them individually in detail. >>> >> Tracing IO through ZFS is very complex because of the various layers >> through which IO passes. If you want to watch arbitrary file io >> requests you''ll want to stick with syscall probes and the like. >> DTracing IO at a high level is easy, same at a low level, its the in >> between with ZFS thats tough. >> >> The FBT provider exposes every functions entry and return, so for a >> given fbt:zfs:whatever probe just look for that function in the ZFS >> code >> (cvs.opensolaris.org). >> >> How you trace will depend on what you want to know, but I find that a >> good starting point is to aggregate callstacks from low level zio >> functions, thus allowing me to see how we got there; ie: dtrace >> ''fbt:zfs:zio_read:entry{ @[stack(20)] = count(); }''. Based on the >> call >> stack I can use other FBT probes to explore. When all else fails I >> have >> gone so far as to trace every function in the kernel (this works for >> about 1-2 seconds and produces an output file around 500M on my >> personal >> workstation at home)... not something I''d do in production but an >> amazing learning experience. ;) >> > > yep - you won''t see filenames from iosnoop on zfs since it doesn''t > call bdev_strategy() directly which is what io:::start works with .. > this came up last year with Mauro on the zfs-discuss alias .. if > you''re looking for file pathnames take a look at the following example > I whipped up after that discussion for use on a project to track > timings to zfs based files (which can be particularly enlightening.) > > #!/usr/sbin/dtrace -s > > # pragma D option quiet > > zfs_write:entry, > zfs_read:entry, > zfs_putpage:entry, > zfs_getpage:entry > { > self->ts = timestamp; > self->filepath = args[0]->v_path; > } > > zfs_write:return, > zfs_read:return, > zfs_putpage:return, > zfs_getpage:return > /self->ts && self->filepath/ > { > printf("%s on %s took %d nsecs\n", probefunc, > stringof(self->filepath), timestamp - self->ts); > self->ts = 0; > self->filepath = 0; > } > > --- > for IO sizes and counts though .. take a look at the calls to extract > the right arguments you might be looking for > > --- > .je > _______________________________________________ > dtrace-discuss mailing list > dtrace-discuss at opensolaris.org >
Hi, I''m working in a zfs_log monitor, mi idea is to create a backup fileserver, at first place the source server log each zfs syscall and then another server takes them and replicate the information copying each diference (via NFS), My problem is that some transactions don''t appear in the log, i''ve been reading a lot about it thinking about "drops" , then i changed the swhitchrate and the buffersize but they didn''t work. i leave you mi d scritp: Tanks! #!/usr/sbin/dtrace -s #pragma D option bufsize=10m #pragma D option switchrate=20hz # pragma D option quiet zfs_mkdir:entry { self->ts = timestamp; self->filepath = args[0]->v_path; self->dirname = args[1]; self->perms = args[2]->va_mode; } zfs_mkdir:return /self->filepath && self->ts && args[1] == 0/ { printf("%s\t%s\t%d\n", probefunc, stringof(self->filepath), timestamp - self->ts); self->filepath = 0; self->dirname = 0; self->ts = 0; self->perms = 0; } zfs_write:entry, zfs_read:entry { self->ts = timestamp; self->filepath = args[0]->v_path; } zfs_write:return, zfs_read:return /self->filepath && self->ts && args[1] == 0/ { printf("%s\t%s\t%d\n", probefunc, stringof(self->filepath),timestamp ); self->filepath = 0; self->ts = 0; } zfs_create:entry { self->ts = timestamp; self->filepath = args[0]->v_path; self->filename = args[1]; self->perms = args[2]->va_mode; } zfs_create:return /self->ts && self->filepath && self->filename && self->perms && args[1] == 0/ { printf("%s\t%s/%s\t%o\t%d\n", probefunc, stringof(self->filepath), stringof(self->filename), self->perms,timestamp); self->filepath = 0; self->filename = 0; self->ts = 0; self->perms = 0; } zfs_rmdir:entry { self->ts = timestamp; self->filepath = args[0]->v_path; self->dirname = args[1]; } zfs_rmdir:return /self->filepath && self->dirname && self->ts && args[1] == 0/ { printf("%s\t%s/%s\t%d\n", probefunc, stringof(self->filepath), stringof(self->dirname), timestamp); self->filepath = 0; self->dirname = 0; self->ts = 0; } zfs_remove:entry { self->ts = timestamp; self->filepath = args[0]->v_path; self->filename = args[1]; } zfs_remove:return /self->filepath && self->ts && self->filename && args[1] == 0/ { printf("%s\t%s/%s\t%d\n", probefunc, stringof(self->filepath), stringof(self->filename), timestamp); self->filepath = 0; self->filename = 0; self->ts = 0; } zfs_rename:entry { self->ts = timestamp; self->filepathsource = args[0]->v_path; self->filenamesource = args[1]; self->filepathdest = args[2]->v_path; self->filenamedest = args[3]; } zfs_rename:return /self->ts && self->filepathsource && self->filenamesource && self->filepathdest && self->filenamedest && args[1] == 0/ { printf("%s\t%s/%s\t%s/%s\t%d\n", probefunc, stringof(self->filepathsource), stringof(self->filenamesource),stringof(self->filepathdest), stringof(self->filenamedest), timestamp ); self->filepathsource = 0; self->filenamesource = 0; self->filepathdest = 0; self->filenamedest = 0; self->ts = 0; } zfs_symlink:entry { self->ts = timestamp; self->filepathdest = args[0]->v_path; self->filenamedest = args[1]; self->linkname = args[3]; self->perms = args[2]->va_mode; } zfs_symlink:return /self->filepathdest&& self->ts && self->filenamedest && self->linkname && args[1] == 0/ { printf("%s\t%s\t%s/%s\t%o\t%d\n", probefunc, stringof(self->linkname), stringof(self->filepathdest), stringof(self->filenamedest), self->perms,timestamp ); self->filepathsdest= 0; self->filenamesdest= 0; self->flinkname= 0; self->ts = 0; self->perms = 0; } zfs_link:entry { self->ts = timestamp; self->filepathdest = args[0]->v_path; self->filenamedest = args[2]; self->filenamesource = args[1]->v_path; } zfs_link:return /self->ts && self->filepathdest && self->filenamedest && self->filenamesource && args[1] == 0/ { printf("%s\t%s\t%s/%s\t%d\n", probefunc, stringof(self->filenamesource),stringof(self->filepathdest), stringof(self->filenamedest), timestamp ); self->filepathdest = 0; self->filenamesource = 0; self->filenamesource = 0; self->ts = 0; } zfs_setattr:entry { self->ts = timestamp; self->fileid = args[0]->v_path; } zfs_setattr:return /self->ts && self->fileid && args[1] == 0/ { printf("%s\t%s\t%d\n", probefunc, stringof(self->fileid), timestamp ); self->fileid = 0; self->ts = 0; } zfs_readdir:entry { self->ts = timestamp; self->fileid = args[0]->v_path; } zfs_readdir:return /self->ts && self->fileid && args[1] == 0/ { printf("%s\t%s\t%d\n", probefunc, stringof(self->fileid), timestamp); self->fileid = 0; self->ts = 0; } -- This message posted from opensolaris.org
i want to add more information: I can see dtrace isn''t generating some zfs_mkdir''s syscalls , i deleted all the "if conditionals" but it didn''t work either. i''m using OpenSolaris, Is there any difference between Open Solaris and Solairs? (speaking about dtrace). I''ve read zfs_vnops.c looking for another sycall which create directories but i haven''t found it. Maybe there is another way to create directories (another syscall?) and i''m not getting this syscall in my dtrace script. Thanks! -- This message posted from opensolaris.org