Philip Brown
2006-Jun-17 00:03 UTC
[dtrace-discuss] tracing use of disk write cache toggle
Hi folks,

I'm trying to use dtrace to see exactly when the disk write cache is toggled on/off on some SATA drives (using S10U2).

I thought that using

    fbt::sata_set_cache_mode:entry,
    fbt::sata_init_write_cache_mode:entry
    {
        printf("%s called\n", probefunc);
    }

would work, but when I use format -e to enable/disable the cache on a SATA drive, these probes do not seem to fire.

Any suggestions on what to do instead?
Seth Goldberg
2006-Jun-17 00:34 UTC
[dtrace-discuss] tracing use of disk write cache toggle
Hi,

format uses uscsi, so the cache enable (mode select page 8) does not go through those interfaces. You can take a look at the OpenSolaris code in sata_mode_select_page_8(), which ends up calling mv_start() with the command register set to the SET FEATURES command, with the appropriate parameters.

--S
Seth Goldberg
2006-Jun-17 00:35 UTC
[dtrace-discuss] tracing use of disk write cache toggle
Too fast on the trigger :). Of course, it only calls mv_start() if you're running on a Marvell controller; if you're using the sil3124, it calls the appropriate start entry point...

--S
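As a rough illustration of the suggestion in this reply, a probe on sata_mode_select_page_8() might look like the sketch below. It assumes that the function is visible to the fbt provider on the running kernel (static functions can be inlined, in which case no probe exists); the output format is only an example.

```d
#!/usr/sbin/dtrace -s
#pragma D option quiet

/*
 * Assumption: sata_mode_select_page_8() has an fbt entry probe on this
 * kernel. If it was inlined by the compiler, dtrace will report that the
 * probe description matches no probes.
 */
fbt::sata_mode_select_page_8:entry
{
    /* Show who triggered the mode select and how the kernel got here. */
    printf("%s called by %s (pid %d)\n", probefunc, execname, pid);
    stack();
}
```

Running `dtrace -l -n 'fbt::sata_mode_select_page_8:entry'` first lists the probe if it exists, which is a quick way to check the assumption before toggling the cache with format -e.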
Philip Brown
2006-Jun-17 00:50 UTC
[dtrace-discuss] tracing use of disk write cache toggle
Seth Goldberg wrote:
> format uses uscsi so the cache enable (mode select page 8) does not go
> through those interfaces to enable the cache;

ooo. wow (ick).

Well, what about if I want to trace when zfs supposedly attempts to do it?

Allegedly, zfs now makes an ldi ioctl call when a zpool is brought online, and using dtrace on ldi_ioctl, it does indeed seem to make the call.

    fbt::ldi_ioctl:entry { printf("ldi_ioctl called with %x\n", args[1]); }

says ldi_ioctl gets called with DKIOCSETWCE (0x425),

but... the drives don't actually seem to get their cache enabled if it was not previously turned on.

I'm trying to avoid digging through layer upon layer of kernel code, and instead set dtrace on the lowest layer that "should" get called to set this thing.

I thought that layer would be sata_set_cache_mode(), but according to dtrace, it doesn't appear to get called.
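The one-liner in this message prints every ldi_ioctl call. A narrower variant, sketched here, keeps only the DKIOCSETWCE requests and records the kernel stack so the caller can be identified. The value 0x425 is taken from the message above; the predicate and output format are illustrative choices, not something from the thread.

```d
#!/usr/sbin/dtrace -s
#pragma D option quiet

/*
 * args[1] is the ioctl command, as in the one-liner from the message.
 * 0x425 is the DKIOCSETWCE value reported in the thread.
 */
fbt::ldi_ioctl:entry
/args[1] == 0x425/
{
    printf("%Y %s: ldi_ioctl(DKIOCSETWCE)\n", walltimestamp, execname);
    /* The stack shows which kernel path issued the request. */
    stack();
}
```

If the stack shows the ZFS vdev code, that would confirm the request originates where expected and that the question is what happens below ldi_ioctl.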
Seth Goldberg
2006-Jun-17 01:05 UTC
[dtrace-discuss] tracing use of disk write cache toggle
Hi,

> but... the drives don't actually seem to get their cache enabled, if it was
> not previously turned on.

Yep, that's a known bug:

6399231 sd arbitrarily assumes that caching mode page is saveable, fixed in Build 8.

> I thought that layer would be sata_set_cache_mode(), but according to dtrace,
> it doesn't appear to get called.

No, that's our obfuscation strategy ;).

--S
Philip Brown
2006-Jun-17 02:10 UTC
[dtrace-discuss] tracing use of disk write cache toggle
Seth Goldberg wrote:
> Yep, that's a known bug:
>
> 6399231 sd arbitrarily assumes that caching mode page is saveable, fixed
> in Build 8.

Not quite the same thing, I think,

unless the kernel actually saves somewhere "I already enabled write cache before last reboot", and then ignores explicit calls of "hey, enable write cache now"?

I'm interested in finding out what actually happens when zfs makes the ldi_ioctl call, and where it is apparently stopping.

> No, that's our obfuscation strategy ;).

>.<
Seth Goldberg
2006-Jun-17 02:49 UTC
[dtrace-discuss] tracing use of disk write cache toggle
> not quite the same thing I think.
>
> unless the kernel actually saves somewhere, "I already enabled write cache
> before last reboot", and then ignores explicit calls "hey, enable write cache
> now"

Are you running on build 8 or later? If not, please upgrade and you shall see. If you are, then this is another issue.

--S
Phi Tran
Seth Goldberg wrote:
> Yep, that's a known bug:
>
> 6399231 sd arbitrarily assumes that caching mode page is saveable,
> fixed in Build 8.

I just checked the bug and it says fixed in build 37.

Phi
Phi Tran
Phi Tran wrote:
> I just checked the bug and it says fixed in build 37.

I get it, you mean s10 build 8 and I mean NV 37 :)

Phi
George Wilson
2006-Jun-18 12:38 UTC
[dtrace-discuss] tracing use of disk write cache toggle
Philip,

Take a look at sd_cache_control(); this is what ZFS ends up calling into when it wants to enable/disable the write cache. This may be what you are looking to dtrace.

Thanks,
George
Philip Brown
2006-Jun-19 19:27 UTC
[dtrace-discuss] tracing use of disk write cache toggle
George Wilson wrote:
> Take a look at sd_cache_control() this is where ZFS ends up calling into
> when it wants to enable/disable the write_cache. This maybe what you are
> looking to dtrace.

Doesn't work, which kinda proves my point that zfs isn't enabling the cache even though people think it should already be doing so...

The following is on
Solaris 10 6/06 s10x_u2wos_09 X86

#############################################################

    #!/usr/sbin/dtrace -s
    fbt::sata_set_cache_mode:entry,
    fbt::sata_init_write_cache_mode:entry,
    fbt::sd_cache_control:entry
    {
        printf("%s called\n", probefunc);
    }

I force write cache to be disabled (via format -e) on a set of test disks:
    c5t1d0
    c5t4d0
    c5t5d0

I run "zpool create philpool c5t1d0 c5t4d0 c5t5d0".

The above dtrace script does not get triggered, and, checking with format -e, write cache is NOT auto-enabled on
    c5t1d0
    c5t4d0
    c5t5d0
after the zpool is created and mounted by the above command.

This is on SATA disks, which is why I have those other lines in the dtrace script, but none of the probes get triggered, as previously mentioned.
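When none of the hand-picked probes fire, one way to find out which functions do run is to count every function entry in the drivers of interest while the zpool create command executes. The sketch below assumes the kernel modules are named sd and sata on the system under test; that naming is an assumption, not something confirmed in the thread.

```d
#!/usr/sbin/dtrace -s
#pragma D option quiet

/*
 * Assumption: the disk driver and SATA framework modules are named
 * "sd" and "sata". Run with -Z if one of them might not be loaded.
 */
fbt:sd::entry,
fbt:sata::entry
{
    /* Count calls per module and function instead of printing each one. */
    @calls[probemod, probefunc] = count();
}

dtrace:::END
{
    printa("%-8s %-40s %@d\n", @calls);
}
```

Start the script, run the zpool create command in another terminal, then press Ctrl-C to print the counts. An empty result for sd would suggest that the disks are not being driven by sd at all.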
Dan Mick
Apologies if you've said, Phil, but: are you sure the disks are being driven by sd and the SATA framework? (i.e., what kind of controller, and is it being recognized as "native SATA" and not "legacy mode")?
George Wilson
2006-Jun-19 20:09 UTC
[dtrace-discuss] tracing use of disk write cache toggle
Phil,

I tried this on a non-SATA drive and I get this:

    # dtrace -s /var/tmp/cache.d
    dtrace: script '/var/tmp/cache.d' matched 1 probe
    CPU     ID                    FUNCTION:NAME
      0  19227           sd_cache_control:entry sd_cache_control called

I'll try to find a system with SATA drives to test further.

Thanks,
George
Seth Goldberg
2006-Jun-19 20:19 UTC
[dtrace-discuss] tracing use of disk write cache toggle
Hi,

Phil, the reason sd_cache_control wasn't called in your test case is that when you used format -e to set the write cache mode, you basically bypassed sd and went "right to the disk". As a result, sd still thought that the write cache was enabled, and thought that the zfs request was superfluous (and thus didn't do it). So, the moral of this story is don't do that :). ZFS does enable write caching when the caching policy is not altered by the user.

sd should probably be aware of the cache changes implemented through format, whatever interface format uses to change the write-caching policy (it isn't today because format uses the uscsi ioctl to effect this change), so let's open a bug on this.

Thanks,
--S
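The behavior described in this message (the request reaches sd, but sd decides it is superfluous) could be observed with a script along the following lines. It is only a sketch: it assumes that sd's ioctl entry point is a function named sdioctl with the command as its second argument, and it reuses the 0x425 value for DKIOCSETWCE reported earlier in the thread.

```d
#!/usr/sbin/dtrace -s
#pragma D option quiet

/*
 * Assumption: sd's ioctl entry point is sdioctl() and has fbt probes;
 * arg1 is the ioctl command. 0x425 is DKIOCSETWCE as seen in the thread.
 */
fbt:sd:sdioctl:entry
/(int)arg1 == 0x425/
{
    self->setwce = 1;
    printf("sdioctl(DKIOCSETWCE) from %s\n", execname);
}

/* Fires only if sd actually goes on to change the cache setting. */
fbt:sd:sd_cache_control:entry
/self->setwce/
{
    printf("  -> sd_cache_control called\n");
}

fbt:sd:sdioctl:return
/self->setwce/
{
    /* On fbt return probes, arg1 holds the function's return value. */
    printf("sdioctl returned %d\n", (int)arg1);
    self->setwce = 0;
}
```

If the first and last lines print without the middle one, the ioctl reached sd but was short-circuited, which would match the explanation above.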
George Wilson
2006-Jun-19 20:24 UTC
[dtrace-discuss] tracing use of disk write cache toggle
Following this thread, I tried this on an IBM system with SATA drives running in 'legacy mode':

    1. c3d0 <WDC WD80- WD-WMAM9D59118-0001-74.54GB>
       /pci@0,0/pci1022,7460@6/pci-ide@6/ide@1/cmdk@0,0

You can see that the drive is being driven by the cmdk driver. This means that there won't be an attempt to enable the write cache.

Thanks,
George
Philip Brown
2006-Jun-19 21:21 UTC
[dtrace-discuss] tracing use of disk write cache toggle
Seth Goldberg wrote:
> Phil, the reason why sd_cache_control wasn't called in your test case
> is because when you used format -e to set the write cache mode, you
> basically bypassed sd and went "right to the disk". As a result, sd
> still thought that the write cache was enabled, and thought that the zfs
> request was superfluous (and thus, didn't do it).

That's what I was wondering: whether sd tried to be "too intelligent".

> sd should probably be aware of the cache changes implemented through
> format (but it isn't today because format uses the uscsi ioctl to effect
> this change), whatever interface format uses to change the write-caching
> policy, so let's open a bug on this.

I think the bug should be "make the sd driver quit trying to be so 'intelligent'" :-)

To put it another way: If I tell the sd driver, "send a flush_cache command now", I expect it to send the command, whether or not sd thinks I "need" it.

Similarly, if I tell the sd driver, "enable/disable write cache now", I expect it to send the command to the drive, whether or not sd thinks I "need" it.
Seth Goldberg
2006-Jun-19 21:32 UTC
[dtrace-discuss] tracing use of disk write cache toggle
> To put it another way: If I tell the sd driver, "send a flush_cache command
> now", I expect it to send the command, whether or not sd thinks I "need" it.

Flush cache will *always* be executed. There is no intelligence in sd for that command.

> Similarly, if I tell the sd driver, "enable/disable write cache now", I
> expect it to send the command to the drive, whether or not sd thinks I "need"

That's certainly one way to fix the problem; the other way is to ensure that sd is in the loop with respect to the current caching state. In any case, we can take this resolution off this list; we're already woefully off-topic.

Thanks,
--S