I recently converted my home directory to zfs on an external disk drive.
Approximately every three seconds I can hear the disk being accessed,
even if I'm doing nothing.  The noise is driving me crazy!

I tried using dtrace to find out what process might be accessing the
disk.  I used the iosnoop program from the dtrace toolkit.  No matter
what I do, iosnoop doesn't report any accesses to the disk.  Is it
broken?  Is it a limitation of zfs?  Am I doing something wrong?

Running just plain "iosnoop" shows accesses to lots of files, but none
on my zfs disk.  Using "iosnoop -d c1t1d0" or "iosnoop -m /export/home/shannon"
shows nothing at all.  I tried /usr/demo/dtrace/iosnoop.d too, still nothing.

Meanwhile, "zpool iostat 5" shows the disk being accessed.  Glenn Skinner
told me about the parameter zfs:spa_max_replication_override.  Setting it
to 1 is supposed to disable ditto blocks in zfs.  Here's what I get when
I set it to 1:

$ zpool iostat 5
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
home        12.4G  4.55G      0      2  11.7K  41.1K
home        12.4G  4.55G      0      6      0   133K
home        12.4G  4.55G      0      6      0   162K
home        12.4G  4.55G      0      0      0      0
home        12.4G  4.55G      0      0      0      0
home        12.4G  4.55G      0      0      0      0
home        12.4G  4.55G      0      7      0   144K
home        12.4G  4.55G      0      6      0   153K
home        12.4G  4.55G      0      6      0   139K
home        12.4G  4.55G      0      6      0   132K
home        12.4G  4.55G      0      6      0   144K
home        12.4G  4.55G      0      6      0   135K
home        12.4G  4.55G      0     12      0   186K
home        12.4G  4.55G      0      0      0      0
home        12.4G  4.55G      0     13      0   265K
home        12.4G  4.55G      0      0      0      0
home        12.4G  4.55G      0      0      0      0
home        12.4G  4.55G      0      0      0      0

Setting it back to its original value of 3 while still running gives:

home        12.4G  4.55G      0     12      0   185K
home        12.4G  4.55G      0     12      0   258K
home        12.4G  4.55G      0     11      0   188K
home        12.4G  4.55G      0     11      0   175K
home        12.4G  4.55G      0     11      0   184K
home        12.4G  4.55G      0     12      0   180K

That seems to be related to my problem, but it doesn't seem to be the
whole problem.

Curiously, when I came in to my office this morning, I didn't hear my
disk making noise.
It wasn't until after I unlocked the screen that the noise started,
which makes me think it must be something related to my desktop.

If I check the I/O activity now, logged in remotely with my screen
locked, I get:

$ zpool iostat 5
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
home        12.4G  4.55G      0      2  10.9K  39.6K
home        12.4G  4.55G      0     11      0   171K
home        12.4G  4.55G      0      0      0      0
home        12.4G  4.55G      0     10      0   149K
home        12.4G  4.55G      0      0      0      0
home        12.4G  4.55G      0      0      0      0
home        12.4G  4.55G      0      0      0      0
home        12.4G  4.55G      0      0      0      0
home        12.4G  4.55G      0      0      0      0
home        12.4G  4.55G      0      0      0      0
home        12.4G  4.55G      0      0      0      0
home        12.4G  4.55G      0      0      0      0
home        12.4G  4.55G      0     11      0   146K
home        12.4G  4.55G      0     12      0   186K

Much less activity, but still not zero.

Anyway, anyone have any ideas of how I can use dtrace or some other tool
to track this down?  Any chance that there's something about zfs that is
causing this problem?  Is there something about zfs that prevents dtrace
from working properly?

Oh ya, I'm running snv_79b on SPARC.

Thanks.
On 01 March, 2008 - Bill Shannon sent me these 4,2K bytes:

> I recently converted my home directory to zfs on an external disk drive.
> Approximately every three seconds I can hear the disk being accessed,
> even if I'm doing nothing.  The noise is driving me crazy!

Just a guess.. Something looking at files, ZFS updating atime?

Try zfs set atime=off yourpool and see..

/Tomas
-- 
Tomas Ögren, stric at acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
Marty Itzkowitz
2008-Mar-01 15:07 UTC
[zfs-discuss] [dtrace-discuss] periodic ZFS disk accesses: try er_kernel
Interesting problem.  I've used disk rattle as a measurement of I/O
activity before there were such tools for measurement.  It's crude,
but effective.

To answer your question: you could try er_kernel.  It uses DTrace to
do statistical callstack sampling, and is described on our kernel
profiling page <http://dpweb.sfbay/products/analyzer/kernel.html>.
That page is woefully out of date, but is basically correct w.r.t. usage.
We test it on Nevada, S10, and S9.

Roch Bourbonnais (PAE), who did most of the development, used it to track
down the root cause of a similar-sounding problem in VxFS.  See the slides
towards the end of the presentation on Kernel Profiling
<http://dpweb.sfbay/products/analyzer/kprof.pdf> dated June, 2002.

The best version to use on a late snv is the one from our nightly-build
<http://dpweb.sfbay/products/analyzer/access.html#nightly>

If you have any problems, please contact Roch and/or me; if you give it
a try, please report your experience back to our interest alias,
er_kernel at sun.com <mailto:er_kernel at sun.com>.

    Marty

Bill Shannon wrote:

> I recently converted my home directory to zfs on an external disk drive.
> Approximately every three seconds I can hear the disk being accessed,
> even if I'm doing nothing.  The noise is driving me crazy!
>
> Meanwhile, "zpool iostat 5" shows the disk being accessed.  Glenn Skinner
> told me about the parameter zfs:spa_max_replication_override.
On Sat, 1 Mar 2008, Bill Shannon wrote:

> Curiously, when I came in to my office this morning, I didn't hear my
> disk making noise.  It wasn't until after I unlocked the screen that
> the noise started, which makes me think it must be something related
> to my desktop.

Gnome periodically checks files under 'Desktop', and likely other places
as well, to see if there are any updates.  There may also be Gnome
software which periodically saves its current state.  I doubt that the
accesses can be blamed on ZFS.

Bob
=====================================
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
Jonathan Edwards
2008-Mar-01 18:48 UTC
[dtrace-discuss] [zfs-discuss] periodic ZFS disk accesses
On Mar 1, 2008, at 3:41 AM, Bill Shannon wrote:

> Running just plain "iosnoop" shows accesses to lots of files, but none
> on my zfs disk.  Using "iosnoop -d c1t1d0" or "iosnoop -m /export/home/shannon"
> shows nothing at all.  I tried /usr/demo/dtrace/iosnoop.d too, still
> nothing.

hi Bill

this came up sometime last year .. io:::start won't work since ZFS
doesn't call bdev_strategy() directly .. you'll want to use something
more like zfs_read:entry, zfs_write:entry and zfs_putpage or zfs_getpage
for mmap'd ZFS files

here's one i hacked from our discussion back then to track some timings
on files:

> cat zfs_iotime.d
#!/usr/sbin/dtrace -s

#pragma D option quiet

zfs_write:entry,
zfs_read:entry,
zfs_putpage:entry,
zfs_getpage:entry
{
        self->ts = timestamp;
        self->filepath = args[0]->v_path;
}

zfs_write:return,
zfs_read:return,
zfs_putpage:return,
zfs_getpage:return
/self->ts && self->filepath/
{
        printf("%s on %s took %d nsecs\n", probefunc,
            stringof(self->filepath), timestamp - self->ts);
        self->ts = 0;
        self->filepath = 0;
}

---
.je
Bill Shannon
2008-Mar-01 21:03 UTC
[zfs-discuss] [dtrace-discuss] periodic ZFS disk accesses: try er_kernel
Marty Itzkowitz wrote:

> To answer your question: you could try er_kernel.  It uses DTrace to
> do statistical callstack sampling, and is described on our kernel
> profiling page <http://dpweb.sfbay/products/analyzer/kernel.html>.
> That page is woefully out of date, but is basically correct w.r.t. usage.

I looked through the page and the presentation and it wasn't clear to me
that this was going to give me information on which file was being
accessed or which process was doing the accessing.  I really want
"tracing", not "profiling".  Statistically, the disk isn't being
accessed at all.
Bob Friesenhahn wrote:

> Gnome periodically checks files under 'Desktop', and likely other places
> as well, to see if there are any updates.  There may also be Gnome
> software which periodically saves its current state.  I doubt that the
> accesses can be blamed on ZFS.

Saving the state every 3 seconds, even when nothing is happening, seems
like a bug.  I just need a tool to find the culprit so that I can file
a bug against the correct program, or stop using the program.
Jonathan Edwards wrote:

> this came up sometime last year .. io:::start won't work since ZFS
> doesn't call bdev_strategy() directly .. you'll want to use something
> more like zfs_read:entry, zfs_write:entry and zfs_putpage or zfs_getpage
> for mmap'd ZFS files

Ah ha!  Maybe someone should report this to the right people and get the
dtrace toolkit and dtrace demo programs fixed?

> here's one i hacked from our discussion back then to track some timings
> on files: [zfs_iotime.d, quoted in full earlier in the thread]

Ok, that's much better!  At least I'm getting output when I touch files
on zfs.  However, even though zpool iostat is reporting activity, the
above program isn't showing any file accesses when the system is idle.

Any ideas?
Martin.Itzkowitz at Sun.COM
2008-Mar-01 21:31 UTC
[zfs-discuss] [dtrace-discuss] periodic ZFS disk accesses: try er_kernel
Bill Shannon wrote:

> I looked through the page and the presentation and it wasn't clear to me
> that this was going to give me information on which file was being
> accessed or which process was doing the accessing.  I really want
> "tracing", not "profiling".  Statistically, the disk isn't being
> accessed at all.

It should give you information on what callstacks in the kernel are
triggering the disk usage.  You should see something with the same
periodicity as the rattle you hear.

    Marty
On Mar 1, 2008, at 4:14 PM, Bill Shannon wrote:

> Ok, that's much better!  At least I'm getting output when I touch files
> on zfs.  However, even though zpool iostat is reporting activity, the
> above program isn't showing any file accesses when the system is idle.
>
> Any ideas?

assuming that you're running an interval (ie: zpool iostat -v 5) and
skipping past the initial summary .. you know it's not file read/write
activity .. you might want to check other vop calls .. eg:
http://blogs.sun.com/erickustarz/resource/zvop_times_fsid.d

to see what's happening .. scrubs or resilvering perhaps?

just a thought

---
.je
Roch Bourbonnais
2008-Mar-02 00:22 UTC
[zfs-discuss] [dtrace-discuss] periodic ZFS disk accesses
On 1 Mar 08, at 22:14, Bill Shannon wrote:

> Jonathan Edwards wrote:
>> this came up sometime last year .. io:::start won't work since ZFS
>> doesn't call bdev_strategy() directly .. you'll want to use something
>> more like zfs_read:entry, zfs_write:entry and zfs_putpage or zfs_getpage
>> for mmap'd ZFS files

Ed:
That's not entirely accurate.  I believe ZFS does lead to bdev_strategy
being called and io:::start will fire for ZFS I/Os.  The problem is that
a ZFS I/O can be servicing a number of ZFS operations on a number of
different files (which is a great performance enabler).  Given that we
can't map an I/O to a file, iosnoop does not report the file info.

> Ah ha!  Maybe someone should report this to the right people and get
> the dtrace toolkit and dtrace demo programs fixed?

Bill
Did the dtrace cmd I sent you give any hints ?

-r
Jeff Bonwick
2008-Mar-02 01:21 UTC
[zfs-discuss] [dtrace-discuss] periodic ZFS disk accesses
> I recently converted my home directory to zfs on an external disk drive.
> Approximately every three seconds I can hear the disk being accessed,
> even if I'm doing nothing.  The noise is driving me crazy!
> [...]
> Anyway, anyone have any ideas of how I can use dtrace or some other tool
> to track this down?

Yes.  Everything in ZFS is transactional, so it is not possible for
anyone to modify anything without first calling dmu_tx_create().
Therefore I suggest:

# dtrace -n 'dmu_tx_create:entry { trace(execname); stack() }'

to see what process is causing this.

Jeff
Bill Shannon
2008-Mar-02 04:58 UTC
[zfs-discuss] [dtrace-discuss] periodic ZFS disk accesses
Roch Bourbonnais wrote:

>>> this came up sometime last year .. io:::start won't work since ZFS
>>> doesn't call bdev_strategy() directly .. you'll want to use something
>>> more like zfs_read:entry, zfs_write:entry and zfs_putpage or zfs_getpage
>>> for mmap'd ZFS files
>
> Ed:
> That's not entirely accurate.  I believe ZFS does lead to bdev_strategy
> being called and io:::start will fire for ZFS I/Os.  The problem is that
> a ZFS I/O can be servicing a number of ZFS operations on a number of
> different files (which is a great performance enabler).  Given that we
> can't map an I/O to a file, iosnoop does not report the file info.

Still seems like iosnoop ought to be fixed, or a warning added.

> Bill
> Did the dtrace cmd I sent you give any hints ?

It gives far too much data.  I haven't been able to correlate it with
the disk activity.
Jonathan Edwards wrote:

> assuming that you're running an interval (ie: zpool iostat -v 5) and
> skipping past the initial summary .. you know it's not file read/write
> activity .. you might want to check other vop calls .. eg:
> http://blogs.sun.com/erickustarz/resource/zvop_times_fsid.d
>
> to see what's happening .. scrubs or resilvering perhaps?

I ended up combining a few programs (cut & paste programming without
really understanding what I was doing!) and ended up with a program
that traces all the zfs_* entry points.  Based on the data I'm getting
from that, correlated with the zpool iostat output, it appears that
the culprit is ksh93!  It seems that ksh93 is doing a setattr call of
some sort on the .history file.

I think I've reached the limit of what I can do remotely.  Now I have
to repeat all these experiments when I'm sitting next to the disk and
can actually hear it and see if the correlation remains.  Then, it may
be time to dig into the ksh93 code and figure out what it thinks it's
doing.  Fortunately, I've been there before...

Thanks everyone for the help!
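A minimal sketch of the kind of catch-all zfs_* tracer described above
(this is an assumption about its shape, not Bill's actual cut-and-paste
script, which was never posted; it relies on the fbt provider):

```d
#!/usr/sbin/dtrace -s
/* Count every zfs_* kernel entry point, per process; dump and reset
 * the aggregation every 5 seconds so bursts line up with zpool iostat. */
#pragma D option quiet

fbt:zfs:zfs_*:entry
{
        @[execname, probefunc] = count();
}

tick-5sec
{
        printa("%-16s %-24s %@d\n", @);
        trunc(@);
}
```

Correlating each 5-second dump with the corresponding `zpool iostat 5`
interval is what points the finger at a particular process and VOP.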
On Sat, 1 Mar 2008, Bill Shannon wrote:

> I think I've reached the limit of what I can do remotely.  Now I have
> to repeat all these experiments when I'm sitting next to the disk and
> can actually hear it and see if the correlation remains.  Then, it may
> be time to dig into the ksh93 code and figure out what it thinks it's
> doing.  Fortunately, I've been there before...

One thing that can make a shell periodically active is if it is checking
for new mail.  Check the ksh man page for descriptions of MAIL,
MAILCHECK, and MAILPATH.  Perhaps whenever it checks for new mail, it
also updates this file.  Unsetting the MAIL environment variable may
make the noise go away.

Bob
=====================================
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
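Bob's suggestion can be tried directly in the suspect interactive shell;
a minimal sketch (it assumes the login environment set MAIL in the first
place, per the MAIL/MAILCHECK/MAILPATH description in ksh(1)):

```shell
# With no mail file configured, the shell has nothing to poll on its
# periodic MAILCHECK interval, so mail-related file activity should stop.
unset MAIL MAILPATH
echo "MAIL is now ${MAIL:-unset}"
```

If the disk rattle persists after this, mail checking can be ruled out.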
Jonathan Edwards
2008-Mar-02 07:58 UTC
[zfs-discuss] [dtrace-discuss] periodic ZFS disk accesses
On Mar 1, 2008, at 7:22 PM, Roch Bourbonnais wrote:

> That's not entirely accurate.  I believe ZFS does lead to bdev_strategy
> being called and io:::start will fire for ZFS I/Os.  The problem is
> that a ZFS I/O can be servicing a number of ZFS operations on a number
> of different files (which is a great performance enabler).  Given that
> we can't map an I/O to a file, iosnoop does not report the file info.

yep - bdev_strategy gets called through ldi_strategy, but this is only
going to happen after the ZIO pipeline in vdev_io_start() .. and since
the ZIO pipeline gets filled from the dmu_tx routines (for the whole
pool), i guess it would make the most sense to look at the
dmu_tx_create() entry from vnops (as Jeff already pointed out.)

---
jonathan
Brendan Gregg - Sun Microsystems
2008-Mar-02 18:10 UTC
[zfs-discuss] [dtrace-discuss] periodic ZFS disk accesses
G'Day,

On Sat, Mar 01, 2008 at 08:58:53PM -0800, Bill Shannon wrote:

> > That's not entirely accurate.  I believe ZFS does lead to bdev_strategy
> > being called and io:::start will fire for ZFS I/Os.  The problem is
> > that a ZFS I/O can be servicing a number of ZFS operations on a number
> > of different files (which is a great performance enabler).  Given that
> > we can't map an I/O to a file, iosnoop does not report the file info.
>
> Still seems like iosnoop ought to be fixed, or a warning added.

Yes, I've added that to my todo list.  It should explain why in
Notes/iosnoop_notes.txt, and note the behaviour of the cached pathname
in the manpage and the script.  Sorry for the confusion.

I'll investigate if it's appropriate to include fssnoop/fstop scripts,
for snooping at the VFS level (especially if I can make them fsinfo:::
based) where pathnames should still be available in ZFS.  It'll include
logical I/O, but may still solve the question of which files and which
processes are causing disk I/O.

This is the correct forum for DTraceToolkit bugs, BTW.  I might not be
able to respond right away, but they will get read and added to the
DTraceToolkit todo list if appropriate.  :)

cheers,

Brendan

-- 
Brendan
[CA, USA]
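A VFS-level snoop of the sort Brendan describes might look roughly like
this (a sketch only, not the eventual DTraceToolkit script; it assumes
the Solaris fsinfo provider, whose probes carry a fileinfo_t argument
with fi_fs and fi_pathname members):

```d
#!/usr/sbin/dtrace -s
/* Sketch of an fsinfo-based "fssnoop": logical (VFS-level) operations,
 * where pathnames are still resolvable even on ZFS.  Reports logical
 * I/O, not physical disk I/O. */
#pragma D option quiet

fsinfo:::
/args[0]->fi_fs == "zfs"/
{
        printf("%-16s %-12s %s\n", execname, probename,
            args[0]->fi_pathname);
}
```

Because it fires above the DMU, every operation still maps cleanly to a
file and a process, which is exactly the mapping iosnoop loses on ZFS.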
Roland Mainz
2008-Mar-02 20:28 UTC
[zfs-discuss] Periodic ZFS disk accesses by ksh93... / was: Re: [dtrace-discuss] periodic ZFS disk accesses
Bill Shannon wrote:

> I ended up combining a few programs (cut & paste programming without
> really understanding what I was doing!) and ended up with a program
> that traces all the zfs_* entry points.  Based on the data I'm getting
> from that, correlated with the zpool iostat output, it appears that
> the culprit is ksh93!  It seems that ksh93 is doing a setattr call of
> some sort on the .history file.

What's the exact filename and how often are the accesses ?  Is this an
interactive shell or is this a script (an interactive shell session will
do periodical lookups for things like the MAIL*-variables (see ksh(1)
and ksh93(1) manual pages) while scripts may do random stuff as intended
by the script's author(s)) ?

And how does the output of $ set # look like ?

----

Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) roland.mainz at nrubsig.org
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 7950090
 (;O/ \/ \O;)
Bill Shannon
2008-Mar-02 23:47 UTC
[zfs-discuss] Periodic ZFS disk accesses by ksh93... / was: Re: [dtrace-discuss] periodic ZFS disk accesses
Roland Mainz wrote:

> What's the exact filename and how often are the accesses ?  Is this an
> interactive shell or is this a script (an interactive shell session will
> do periodical lookups for things like the MAIL*-variables (see ksh(1)
> and ksh93(1) manual pages) while scripts may do random stuff as intended
> by the script's author(s)) ?
> And how does the output of $ set # look like ?

The filename is /home/shannon/.history.datsun, which is what I have
HISTFILE set to.  Again, it's doing setattr, which it shouldn't be doing
for $MAIL.  And, based on the dtrace output, the setattrs aren't at
any obvious period.  Even stranger, despite the fact that I have something
like eight shells running, the calls are coming from a shell from which I
started another (superuser) shell, from which I'm running the dtrace
command.

What is "set #"?
Roland Mainz
2008-Mar-03 06:19 UTC
[zfs-discuss] Periodic ZFS disk accesses by ksh93... / was: Re: [dtrace-discuss] periodic ZFS disk accesses
Bill Shannon wrote:

> The filename is /home/shannon/.history.datsun, which is what I have
> HISTFILE set to.  Again, it's doing setattr, which it shouldn't be doing
> for $MAIL.  And, based on the dtrace output, the setattrs aren't at
> any obvious period.

Do you have a userland stacktrace for these setattr calls ?

> Even stranger, despite the fact that I have something
> like eight shells running, the calls are coming from a shell from which I
> started another (superuser) shell, from which I'm running the dtrace
> command.

That sounds weird... is it possible that something in the interactive
environment may cause this ?

> What is "set #"?

"set" prints all shell variables (local, global, environment) to stdout,
including all their values... the '$' character was meant as the shell
prompt, and the '#' character is the shell's comment character, to make
sure that nothing gets executed past this point when someone is
copy&pasting such lines from the email client to a terminal window...

----

Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) roland.mainz at nrubsig.org
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 7950090
 (;O/ \/ \O;)
Bill Shannon
2008-Mar-03 07:49 UTC
[zfs-discuss] Periodic ZFS disk accesses by ksh93... / was: Re: [dtrace-discuss] periodic ZFS disk accesses
Roland Mainz wrote:

> Do you have a userland stacktrace for these setattr calls ?

No.  I want to do more testing to assure myself that that's really the
cause of my problem.  (Then someone will have to tell me how to get a
stacktrace using dtrace.)

> That sounds weird... is it possible that something in the interactive
> environment may cause this ?

Like what?

> "set" prints all shell variables (local, global, environment) to stdout,
> including all their values...

Ya, I know about "set".  I couldn't figure out what you were asking for
since you didn't just ask for that.  I assumed it must've been a typo
or something.
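As for getting a userland stacktrace with dtrace, a minimal sketch (the
fbt probe name zfs_setattr is an assumption about the VOP entry point
involved and may differ by build; ustack() captures the calling
process's user stack at that moment):

```d
#!/usr/sbin/dtrace -s
/* Aggregate user-level stacks of whatever process drives zfs_setattr;
 * the aggregation is printed automatically when dtrace exits. */
fbt::zfs_setattr:entry
{
        @[execname, ustack()] = count();
}
```

Run until a few rattles have been heard, then Ctrl-C; the hottest stack
shows which ksh93 code path is touching the .history file.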
Bill Shannon
2008-Mar-14 23:57 UTC
[zfs-discuss] [dtrace-discuss] periodic ZFS disk accesses
I just wanted to follow up on this issue I raised a few weeks ago.

With help from several of you, I had all the information and tools I
needed to start debugging my problem.  Which of course meant that my
problem disappeared!

At one point my theory was that ksh93 was updating my .history file
periodically.  Well, it still does update my .history file at times when
nothing else seems to be going on, but that's not the cause of the
periodic disk accesses I was seeing.  (Maybe someone else wants to track
this down further but I'm ignoring it.)

Another problem I've noticed is that thunderbird very frequently updates
the panacea.dat file.  Does anyone know what's in this file?  If I take
thunderbird offline, the updates become much less frequent, although
they don't seem to stop entirely.

Right now it seems most likely that thunderbird was the cause of the
periodic disk accesses I was seeing.

Thanks for all your help!
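[Editor's note: a quick way to confirm which processes are writing that
file is a one-liner like the sketch below, run as root.  Note the
assumptions: `fi_name` holds only the final path component, so any file
named "panacea.dat" matches, and the fds[] translator only resolves
names for file descriptors it can map.]

```d
/* Sketch: count writes, per process, to any file named "panacea.dat". */
syscall::write:entry
/fds[args[0]].fi_name == "panacea.dat"/
{
        @[execname, pid] = count();
}
```

If thunderbird is the only process that shows up, that pins down the
writer without needing to know anything about its internals.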
Bill,

On Sat, Mar 15, 2008 at 7:57 AM, Bill Shannon <bill.shannon at sun.com> wrote:
> I just wanted to follow up on this issue I raised a few weeks ago.
>
> With help from several of you, I had all the information and tools
> I needed to start debugging my problem.  Which of course meant that
> my problem disappeared!
>
> At one point my theory was that ksh93 was updating my .history file
> periodically.  Well, it still does update my .history file at times
> when nothing else seems to be going on, but that's not the cause of
> the periodic disk accesses I was seeing.  (Maybe someone else wants
> to track this down further but I'm ignoring it.)
>
> Another problem I've noticed is that thunderbird very frequently
> updates the panacea.dat file.  Does anyone know what's in this file?
> If I take thunderbird offline, the updates become much less frequent,
> although they don't seem to stop entirely.
>
> Right now it seems most likely that thunderbird was the cause of the
> periodic disk accesses I was seeing.
>
> Thanks for all your help!

Will it help to aggregate ustack on writes from thunderbird.bin?  E.g.

syscall::write:entry
/execname == "thunderbird.bin"/
{
        @[fds[args[0]].fi_pathname, ustack()] = count();
}

-- 
Just me,
Wire ...
Blog: <prstat.blogspot.com>
Wee Yeh Tan wrote:
> Will it help to aggregate ustack on writes from thunderbird.bin?  E.g.
>
> syscall::write:entry
> /execname == "thunderbird.bin"/
> {
>         @[fds[args[0]].fi_pathname, ustack()] = count();
> }

It might help someone, but it won't help me.  I don't know anything
about the internals of thunderbird, and I don't plan to debug it.
Bill Shannon wrote:
> I just wanted to follow up on this issue I raised a few weeks ago.
>
> With help from several of you, I had all the information and tools
> I needed to start debugging my problem.  Which of course meant that
> my problem disappeared!
>
> At one point my theory was that ksh93 was updating my .history file
> periodically.  Well, it still does update my .history file at times
> when nothing else seems to be going on, but that's not the cause of
> the periodic disk accesses I was seeing.  (Maybe someone else wants
> to track this down further but I'm ignoring it.)
>
> Another problem I've noticed is that thunderbird very frequently
> updates the panacea.dat file.  Does anyone know what's in this file?
> If I take thunderbird offline, the updates become much less frequent,
> although they don't seem to stop entirely.

http://kb.mozillazine.org/Files_and_folders_in_the_profile_-_Thunderbird
says "Mail folder cache".  It's vaguely textish (they have their own
goofy compressed database format that I've looked at before, called
"mork") and seems to be date and time stamps of other mail cache files.
So, sort of a master cache directory.

http://kb.mozillazine.org/Profile_folder_-_Thunderbird#Files_and_folders_in_the_profile
has some info too.