Bill Sommerfeld
2006-Mar-12 01:42 UTC
[zfs-discuss] scary boot-time messages triggered by write cache enable?
I just upgraded a server to snv_35.

At boot I was greeted with 22 error messages of this form:

    WARNING: /scsi_vhci/ssd@g20000004cf29b948 (ssd28):
        Error for Command: read(10)    Error Level: Retryable
        Requested Block: 290    Error Block: 290
        Vendor: SEAGATE    Serial Number: 0140K0LQ70
        Sense Key: Unit_Attention
        ASC: 0x2a (mode parameters changed), ASCQ: 0x1, FRU: 0x2

These corresponded to the 22 disks in an A5200 containing half of the ZFS pool on the system. All of them were requesting the same block (290).

(The pool is built out of two A5200's; one is connected to a pair of mpxio-capable FC ports, while the other is connected to non-mpxio-capable FC ports -- an artifact of what controllers were available to me).

I had not seen this sort of error before. I suspect that it's related to:

    6322205 Enable disk write cache if ZFS owns the disk

which integrated in nv_35 (spot checks of the disks in the pool show that their write caches are now enabled).

zpool status does not show any I/O errors or the equivalent, and the pool appears to be otherwise behaving itself.

For now I'm going to assume that these errors are spurious, but I'm tempted to file a bug against this....

- Bill
Bill Baker
2006-Mar-13 17:56 UTC
[zfs-discuss] scary boot-time messages triggered by write cache enable?
Hmmm, I never saw messages like this during my testing of the changes for 6322205. That change enables the write cache when ZFS first opens the device. Odd that that would cause a Unit_Attention on a subsequent read.

Any SCSI experts on this alias who can interpret these messages or their root cause?

-- 
Bill Baker, 512-401-1081, x64081
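As a starting point for interpreting the warning, here is a minimal sketch that pulls the error level, sense key, ASC and ASCQ out of one message and looks the ASC/ASCQ pair up in a small table. The table holds only a few entries taken from the standard SCSI additional sense code assignments, the warning text is abbreviated from the one reported above, and the names `decode`, `ASC_ASCQ` and `WARNING` are made up for this example; none of this is how the ssd driver itself formats or decodes sense data.

```python
import re

# A few ASC/ASCQ pairs from the standard SCSI assignments (far from complete).
ASC_ASCQ = {
    (0x29, 0x00): "power on, reset, or bus device reset occurred",
    (0x2A, 0x00): "parameters changed",
    (0x2A, 0x01): "mode parameters changed",
    (0x2A, 0x02): "log parameters changed",
}

# Abbreviated copy of the warning from the original report.
WARNING = (
    "Error for Command: read(10) Error Level: Retryable "
    "Requested Block: 290 Error Block: 290 "
    "Sense Key: Unit_Attention "
    "ASC: 0x2a (mode parameters changed), ASCQ: 0x1, FRU: 0x2"
)


def decode(text):
    """Extract the fields that matter from one ssd warning message."""
    level = re.search(r"Error Level: (\w+)", text).group(1)
    key = re.search(r"Sense Key: (\w+)", text).group(1)
    asc = int(re.search(r"\bASC: (0x[0-9a-fA-F]+)", text).group(1), 16)
    ascq = int(re.search(r"\bASCQ: (0x[0-9a-fA-F]+)", text).group(1), 16)
    return {
        "level": level,
        "sense_key": key,
        "asc": asc,
        "ascq": ascq,
        "meaning": ASC_ASCQ.get((asc, ascq), "not in this table"),
    }


if __name__ == "__main__":
    for field, value in decode(WARNING).items():
        print(f"{field}: {value}")
```

Read this way, the message says the disk reported that its mode parameters had been changed, and the driver classified the failed read as retryable rather than as a media or hardware error.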
Chris Horne
2006-Mar-13 18:15 UTC
[zfs-discuss] scary boot-time messages triggered by write cache enable?
Do you have multiple paths to the drive? Maybe you are getting a check condition reporting that the operational parameters changed (via WCE) the next time a path other than the one used to issue the WCE enable command is used (in this case, for a read).

-Chris
Bill Sommerfeld
2006-Mar-13 19:03 UTC
[zfs-discuss] scary boot-time messages triggered by write cache enable?
On Mon, 2006-03-13 at 13:15, Chris Horne wrote:
> Do you have multiple paths to the drive?

Yes, there are multiple paths to all of the disks which reported this warning. For instance:

    # luxadm display /dev/rdsk/c14t20000004CF29B948d0s2
    DEVICE PROPERTIES for disk: /dev/rdsk/c14t20000004CF29B948d0s2
      Status(Port A):       O.K.
      Status(Port B):       O.K.
      Vendor:               SEAGATE
      Product ID:           ST373405FSUN72G
      WWN(Node):            20000004cf29b948
      WWN(Port A):          21000004cf29b948
      WWN(Port B):          22000004cf29b948
      Revision:             0638
      Serial Num:           3EK0LQ700000
      Unformatted capacity: 70007.195 MBytes
      Write Cache:          Enabled
      Read Cache:           Enabled
        Minimum prefetch:   0x0
        Maximum prefetch:   0xffff
      Location:             In slot 0 in the Front of the enclosure named: zhadum_p0
      Device Type:          Disk device
      Path(s):
      /dev/rdsk/c14t20000004CF29B948d0s2
      /devices/scsi_vhci/ssd@g20000004cf29b948:c,raw
       Controller                  /devices/sbus@6,0/SUNW,qlc@2,30400/fp@0,0
        Device Address             22000004cf29b948,0
        Host controller port WWN   210100e08b2aaf66
        Class                      primary
        State                      ONLINE
       Controller                  /devices/sbus@2,0/SUNW,qlc@2,30400/fp@0,0
        Device Address             21000004cf29b948,0
        Host controller port WWN   210100e08b2ab066
        Class                      primary
        State                      ONLINE

and in scsi_vhci.conf, I see:

    load-balance="round-robin";

(I believe this is the default setting).

> Maybe you are
> getting a check condition to report that the operational
> parameters changed (via WCE) the next time the path not
> used to issue the WCE enable command is used (in this case
> for a read).

From what little I know about SCSI, that sounds at least plausible.
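To make that hypothesis concrete, here is a toy model of it, not a description of what the ssd or scsi_vhci drivers actually do. It assumes a dual-ported disk that, when one port changes a mode parameter, queues a unit attention for every other port and reports it once on that port's next command. It also assumes the host alternates strictly between the two paths, retries included. The names `Disk`, `mode_select_wce` and `read_with_retry` are invented for the sketch.

```python
from itertools import cycle

# Sense data the toy disk reports: key, ASC, ASCQ (mode parameters changed).
UNIT_ATTENTION = ("Unit_Attention", 0x2A, 0x01)


class Disk:
    """Toy dual-ported disk that tracks one pending unit attention per port."""

    def __init__(self, ports):
        self.wce = False
        self.pending = {port: None for port in ports}

    def mode_select_wce(self, port, enable):
        """Change the write-cache setting via one port."""
        self.wce = enable
        # Every port except the one that made the change is notified later.
        for other in self.pending:
            if other != port:
                self.pending[other] = UNIT_ATTENTION

    def read(self, port, block):
        """Return (status, sense); a pending unit attention is reported once."""
        sense = self.pending[port]
        if sense is not None:
            self.pending[port] = None
            return ("CHECK CONDITION", sense)
        return ("GOOD", None)


def read_with_retry(disk, paths, block, log):
    """Issue a read on the next path, retrying until it succeeds."""
    while True:
        port = next(paths)
        status, sense = disk.read(port, block)
        if status == "GOOD":
            return port
        log.append((port, block, sense))  # stands in for the console warning


ports = ["A", "B"]
disk = Disk(ports)
paths = cycle(ports)                       # round-robin path selection

disk.mode_select_wce(next(paths), True)    # cache enabled via port A
log = []
served_by = read_with_retry(disk, paths, 290, log)

print("warnings:", log)
print("read finally served via port", served_by)
```

In this model the first read after the cache is enabled lands on the other port, draws exactly one retryable unit attention, and then succeeds on retry. That would be consistent with one warning per multipathed disk and no I/O errors in zpool status, though it does not prove this is what happened here.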
Bill Sommerfeld
2006-Mar-13 20:29 UTC
[zfs-discuss] scary boot-time messages triggered by write cache enable?
I just filed

    6397679 Scary Unit Attention warning/error when ZFS enables cache on multipathed disk

against kernel/zfs, as this was the category for

    6322205 Enable disk write cache if ZFS owns the disk

- Bill