Jay
2008-Dec-30 14:22 UTC
[zfs-discuss] read/write errors on storage pool (poss. ahci/hw related?)
hi *,
i'm currently playing around with the setup of an opensolaris server as
home nas and am experiencing occasional read/write problems with the
zfs pool.
the short version (details below/attached):
* 6-disk raidz pool attached to the sata controller on an nvidia MCP78S chipset
* first scrub of the pool with some data on it marks device sd5 as faulted
  due to "WARNING: ahci0: watchdog port 5 satapkt 0xffffff01c7d8d660 timed out"
  and a plethora of "Error for Command: read(10)" (see attached messages)
* these messages appeared also for sd1, sd2 and sd3, but only sd5 failed in the end
* replaced the disk, resilvering started
* the same timeouts appear for sd0 and sd1 while resilvering, to prevent the
  pool from failing completely, i (rather brute force) rebooted the machine
* resilvering ends eventually, data seems intact
* everything seems normal for a few days, reading/writing is ok, no errors show
  up, the data is accessible
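to see which ports are affected and how often, the watchdog lines can be tallied per port. this is only a minimal python sketch, not something from the original setup: the regular expression is modelled on the single warning quoted above, the helper name count_timeouts is made up, and all sample lines except the first are invented for illustration.

```python
import re
from collections import Counter

# Modelled on the one warning quoted in this post:
#   WARNING: ahci0: watchdog port 5 satapkt 0xffffff01c7d8d660 timed out
WATCHDOG = re.compile(
    r"(ahci\d+): watchdog port (\d+) satapkt 0x[0-9a-fA-F]+ timed out"
)


def count_timeouts(lines):
    """Count watchdog timeouts, keyed by (controller, port)."""
    counts = Counter()
    for line in lines:
        match = WATCHDOG.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


sample = [
    # quoted from the post
    "WARNING: ahci0: watchdog port 5 satapkt 0xffffff01c7d8d660 timed out",
    # invented lines, for illustration only
    "WARNING: ahci0: watchdog port 5 satapkt 0xffffff01c7d8d661 timed out",
    "WARNING: ahci0: watchdog port 0 satapkt 0xffffff01c7d8d662 timed out",
    "Error for Command: read(10)",  # not a watchdog line, so it is skipped
]

for (controller, port), hits in sorted(count_timeouts(sample).items()):
    print(f"{controller} port {port}: {hits} timeout(s)")
```

on the real machine the same function could be fed an open log file instead of the sample list, assuming the log lines keep the wording shown above.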
today, i saw the same errors reported for sd4 in the logfile and when trying a
'zpool status' it became unresponsive, with timeouts showing up for sd0. after
another reboot, everything still looks ok, zpool status is ok, read and write
access are ok.
the disks themselves should be ok, i had them running a burn-in before
installing opensolaris and the WD diagnostics passed them - even the faulted one
i replaced passed another test as being perfectly ok.
can anybody shed some light on this? i'm guessing it's related
to the sata controller, but i'd appreciate any help or insight.
(at the moment, i'm not really worried about data loss as you might
guess from the brute force rebooting, all the data on the pool is also
stored on an old linux machine. i'm reacquainting myself with solaris,
so it's more or less a playground for now. but i'd like to replace the
old linux server sometime - mainly because of zfs)
thanks,
jay
-----
the hardware setup is
* MSI K9N2GM-FIH mainboard, geforce 8200 chipset (nvidia MCP78S)
* amd athlon x2 4450e
* 4G ram
* 6x WD 750GB disks (WD7500AACS)
* old samsung 200G as system disk on the IDE controller
installed is opensolaris 2008.11 (snv_101b) with a raidz pool spanning the 6 WD
disks.
-----
jay@space:/var/adm# zpool status storage-b
  pool: storage-b
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        storage-b   ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c3t0d0  ONLINE       0     0     0
            c3t1d0  ONLINE       0     0     0
            c3t2d0  ONLINE       0     0     0
            c3t3d0  ONLINE       0     0     0
            c3t4d0  ONLINE       0     0     0
            c3t5d0  ONLINE       0     0     0

errors: No known data errors
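since the pool looks clean right after a reboot, it may help to check the status output automatically and only report rows that are not ONLINE or carry error counts. a minimal python sketch under stated assumptions: it parses text shaped like the output above, the function name unhealthy_devices is made up, and the DEGRADED sample is invented for illustration (the pool never reported exactly that).

```python
def unhealthy_devices(status_text):
    """Return (name, state, read, write, cksum) for each config row that is
    not ONLINE or has a non-zero error counter."""
    rows = []
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[:2] == ["NAME", "STATE"]:
            in_config = True  # device rows start after this header
            continue
        if fields[0] == "errors:":
            break  # the config table is over
        if in_config and len(fields) >= 5:
            name, state, read, write, cksum = fields[:5]
            if state != "ONLINE" or (read, write, cksum) != ("0", "0", "0"):
                rows.append((name, state, read, write, cksum))
    return rows


# Shortened copy of the output shown in this post.
HEALTHY = """\
  pool: storage-b
 state: ONLINE
config:
        NAME        STATE     READ WRITE CKSUM
        storage-b   ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c3t4d0  ONLINE       0     0     0
            c3t5d0  ONLINE       0     0     0
errors: No known data errors
"""

# Invented example of a pool with one faulted disk.
DEGRADED = """\
  pool: storage-b
 state: DEGRADED
config:
        NAME        STATE     READ WRITE CKSUM
        storage-b   DEGRADED     0     0     0
          raidz1    DEGRADED     0     0     0
            c3t4d0  ONLINE       0     0     0
            c3t5d0  FAULTED      3     0     0  too many errors
errors: No known data errors
"""

print(unhealthy_devices(HEALTHY))
for row in unhealthy_devices(DEGRADED):
    print(row)
```

the counters are compared as strings on purpose, so a value that is not a plain number is still treated as "not zero" instead of raising an error.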
-----
jay@space:/var/adm# cfgadm -la
Ap_Id                  Type   Receptacle   Occupant     Condition
sata4/0::dsk/c3t0d0    disk   connected    configured   ok
sata4/1::dsk/c3t1d0    disk   connected    configured   ok
sata4/2::dsk/c3t2d0    disk   connected    configured   ok
sata4/3::dsk/c3t3d0    disk   connected    configured   ok
sata4/4::dsk/c3t4d0    disk   connected    configured   ok
sata4/5::dsk/c3t5d0    disk   connected    configured   ok
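the watchdog warnings name a port number, while the pool names disks like c3t5d0, so a small lookup between the two is handy. a minimal python sketch: it assumes the number after the slash in the Ap_Id is the same port number the ahci warning reports, which is an assumption and not something confirmed in this thread. the function name port_to_disk is made up.

```python
import re

# Ap_Id entries as listed above, e.g. "sata4/5::dsk/c3t5d0"
AP_ID = re.compile(r"^(sata\d+)/(\d+)::dsk/(\S+)$")


def port_to_disk(cfgadm_text):
    """Map the port number in each SATA Ap_Id to its disk name."""
    mapping = {}
    for line in cfgadm_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        match = AP_ID.match(fields[0])
        if match:
            mapping[int(match.group(2))] = match.group(3)
    return mapping


# Copy of the cfgadm output shown in this post.
CFGADM = """\
Ap_Id                  Type   Receptacle   Occupant     Condition
sata4/0::dsk/c3t0d0    disk   connected    configured   ok
sata4/1::dsk/c3t1d0    disk   connected    configured   ok
sata4/2::dsk/c3t2d0    disk   connected    configured   ok
sata4/3::dsk/c3t3d0    disk   connected    configured   ok
sata4/4::dsk/c3t4d0    disk   connected    configured   ok
sata4/5::dsk/c3t5d0    disk   connected    configured   ok
"""

disks = port_to_disk(CFGADM)
print(disks[5])
```

with that assumption, "watchdog port 5" would point at c3t5d0; whether sd5 is the same disk still has to be checked separately.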
attached: the output of prtconf -vp and the log of the failing scrub and the
resilvering
--
This message posted from opensolaris.org
-------------- next part --------------
System Configuration: Sun Microsystems i86pc
Memory size: 3968 Megabytes
System Peripherals (PROM Nodes):
Node 0x000001
bios-boot-device: ''80''
stdout: 00000000
name: ''i86pc''
Node 0x000002
existing: 00d94000.00000000.028e7801.00000000
name: ''ramdisk''
Node 0x000003
bus-type: ''isa''
device_type: ''isa''
name: ''isa''
Node 0x000004
compatible: ''pciex_root_complex''
device_type: ''pciex''
reg: 00000000.00000000.00000000
#size-cells: 00000002
#address-cells: 00000003
name: ''pci''
Node 0x000005
reg: 00000000.00000000.00000000.00000000.00000000
compatible: ''pci10de,754.1462.7508.a2'' +
''pci10de,754.1462.7508'' + ''pci1462,7508'' +
''pci10de,754.a2'' + ''pci10de,754'' +
''pciclass,050000'' + ''pciclass,0500''
model: ''Ram''
power-consumption: 00000001.00000001
66mhz-capable:
fast-back-to-back:
devsel-speed: 00000000
max-latency: 00000000
min-grant: 00000000
subsystem-vendor-id: 00001462
subsystem-id: 00007508
unit-address: ''0''
class-code: 00050000
revision-id: 000000a2
vendor-id: 000010de
device-id: 00000754
name: ''pci1462,7508''
Node 0x000006
assigned-addresses: 81000810.00000000.00002f00.00000000.00000100
reg:
00000800.00000000.00000000.00000000.00000000.01000810.00000000.00000000.00000000.00000100
compatible: ''pci10de,75c.1462.7508.a2'' +
''pci10de,75c.1462.7508'' + ''pci1462,7508'' +
''pci10de,75c.a2'' + ''pci10de,75c'' +
''pciclass,060100'' + ''pciclass,0601''
model: ''ISA bridge''
power-consumption: 00000001.00000001
66mhz-capable:
fast-back-to-back:
devsel-speed: 00000000
max-latency: 00000000
min-grant: 00000000
subsystem-vendor-id: 00001462
subsystem-id: 00007508
unit-address: ''1''
class-code: 00060100
revision-id: 000000a2
vendor-id: 000010de
device-id: 0000075c
name: ''pci1462,7508''
Node 0x000007
assigned-addresses:
81000910.00000000.00002900.00000000.00000040.81000920.00000000.00002d00.00000000.00000040.81000924.00000000.00002e00.00000000.00000040
reg:
00000900.00000000.00000000.00000000.00000000.01000910.00000000.00000000.00000000.00000040.01000920.00000000.00000000.00000000.00000040.01000924.00000000.00000000.00000000.00000040
compatible: ''pci10de,752.1462.7508.a1'' +
''pci10de,752.1462.7508'' + ''pci1462,7508'' +
''pci10de,752.a1'' + ''pci10de,752'' +
''pciclass,0c0500'' + ''pciclass,0c05''
model: ''SMBus (System Management Bus)''
power-consumption: 00000001.00000001
66mhz-capable:
fast-back-to-back:
devsel-speed: 00000000
interrupts: 00000001
max-latency: 00000000
min-grant: 00000000
subsystem-vendor-id: 00001462
subsystem-id: 00007508
unit-address: ''1,1''
class-code: 000c0500
revision-id: 000000a1
vendor-id: 000010de
device-id: 00000752
name: ''pci1462,7508''
Node 0x000008
reg: 00000a00.00000000.00000000.00000000.00000000
compatible: ''pci10de,751.1462.7508.a1'' +
''pci10de,751.1462.7508'' + ''pci1462,7508'' +
''pci10de,751.a1'' + ''pci10de,751'' +
''pciclass,050000'' + ''pciclass,0500''
model: ''Ram''
power-consumption: 00000001.00000001
66mhz-capable:
fast-back-to-back:
devsel-speed: 00000000
max-latency: 00000000
min-grant: 00000000
subsystem-vendor-id: 00001462
subsystem-id: 00007508
unit-address: ''1,2''
class-code: 00050000
revision-id: 000000a1
vendor-id: 000010de
device-id: 00000751
name: ''pci1462,7508''
Node 0x000009
assigned-addresses: 82000b10.00000000.fce80000.00000000.00080000
reg:
00000b00.00000000.00000000.00000000.00000000.02000b10.00000000.00000000.00000000.00080000
compatible: ''pci10de,753.1462.7508.a2'' +
''pci10de,753.1462.7508'' + ''pci1462,7508'' +
''pci10de,753.a2'' + ''pci10de,753'' +
''pciclass,0b4000'' + ''pciclass,0b40''
model: ''Co-processor''
power-consumption: 00000001.00000001
66mhz-capable:
fast-back-to-back:
devsel-speed: 00000000
interrupts: 00000002
max-latency: 00000001
min-grant: 00000003
subsystem-vendor-id: 00001462
subsystem-id: 00007508
unit-address: ''1,3''
class-code: 000b4000
revision-id: 000000a2
vendor-id: 000010de
device-id: 00000753
name: ''pci1462,7508''
Node 0x00000a
reg: 00000c00.00000000.00000000.00000000.00000000
compatible: ''pci10de,568.1462.7508.a1'' +
''pci10de,568.1462.7508'' + ''pci1462,7508'' +
''pci10de,568.a1'' + ''pci10de,568'' +
''pciclass,050000'' + ''pciclass,0500''
model: ''Ram''
power-consumption: 00000001.00000001
66mhz-capable:
fast-back-to-back:
devsel-speed: 00000000
max-latency: 00000000
min-grant: 00000000
subsystem-vendor-id: 00001462
subsystem-id: 00007508
unit-address: ''1,4''
class-code: 00050000
revision-id: 000000a1
vendor-id: 000010de
device-id: 00000568
name: ''pci1462,7508''
Node 0x00000b
#size-cells: 00000000
#address-cells: 00000001
device_type: ''pci-ide''
assigned-addresses:
81003010.00000000.000001f0.00000000.00000008.81003014.00000000.000003f6.00000000.00000001.81003018.00000000.00000170.00000000.00000008.8100301c.00000000.00000376.00000000.00000001.81003020.00000000.0000ffa0.00000000.00000010
reg:
00003000.00000000.00000000.00000000.00000000.81003010.00000000.000001f0.00000000.00000008.81003014.00000000.000003f6.00000000.00000001.81003018.00000000.00000170.00000000.00000008.8100301c.00000000.00000376.00000000.00000001.01003020.00000000.00000000.00000000.00000010
compatible: ''pci10de,759.1462.7508.a1'' +
''pci10de,759.1462.7508'' + ''pci1462,7508'' +
''pci10de,759.a1'' + ''pci10de,759'' +
''pciclass,01018a'' + ''pciclass,0101''
model: ''IDE controller''
power-consumption: 00000001.00000001
66mhz-capable:
fast-back-to-back:
devsel-speed: 00000000
max-latency: 00000001
min-grant: 00000003
subsystem-vendor-id: 00001462
subsystem-id: 00007508
unit-address: ''6''
class-code: 0001018a
revision-id: 000000a1
vendor-id: 000010de
device-id: 00000759
name: ''pci-ide''
Node 0x00000c
reg: 00000000
name: ''ide''
Node 0x00000d
reg: 00000001
name: ''ide''
Node 0x00000e
assigned-addresses: 82003810.00000000.fce78000.00000000.00004000
reg:
00003800.00000000.00000000.00000000.00000000.02003810.00000000.00000000.00000000.00004000
compatible: ''pci10de,774.1462.7508.a1'' +
''pci10de,774.1462.7508'' + ''pci1462,7508'' +
''pci10de,774.a1'' + ''pci10de,774'' +
''pciclass,040300'' + ''pciclass,0403''
model: ''Mixed Mode device''
power-consumption: 00000001.00000001
66mhz-capable:
fast-back-to-back:
devsel-speed: 00000000
interrupts: 00000001
max-latency: 00000005
min-grant: 00000002
subsystem-vendor-id: 00001462
subsystem-id: 00007508
unit-address: ''7''
class-code: 00040300
revision-id: 000000a1
vendor-id: 000010de
device-id: 00000774
name: ''pci1462,7508''
Node 0x00000f
slot-names: 00000300.746f6c53.6c530032.0031746f
reg: 00004000.00000000.00000000.00000000.00000000
compatible: ''pci10de,75a.a1'' +
''pci10de,75a'' + ''pciclass,060401'' +
''pciclass,0604''
model: ''Subtractive Decode PCI-PCI bridge''
ranges:
81000000.00000000.0000d000.81000000.00000000.0000d000.00000000.00001000.82000000.00000000.fcf00000.82000000.00000000.fcf00000.00000000.00100000
bus-range: 00000001.00000001
#size-cells: 00000002
#address-cells: 00000003
device_type: ''pci''
power-consumption: 00000001.00000001
66mhz-capable:
fast-back-to-back:
devsel-speed: 00000000
unit-address: ''8''
class-code: 00060401
revision-id: 000000a1
vendor-id: 000010de
device-id: 0000075a
name: ''pci10de,75a''
Node 0x00001a
assigned-addresses:
81014010.00000000.0000d800.00000000.00000100.82014014.00000000.fcfffc00.00000000.00000100
reg:
00014000.00000000.00000000.00000000.00000000.01014010.00000000.00000000.00000000.00000100.02014014.00000000.00000000.00000000.00000100
compatible: ''pci10ec,8169.1385.311a.10'' +
''pci10ec,8169.1385.311a'' + ''pci1385,311a'' +
''pci10ec,8169.10'' + ''pci10ec,8169'' +
''pciclass,020000'' + ''pciclass,0200''
model: ''Ethernet controller''
power-consumption: 00000001.00000001
66mhz-capable:
fast-back-to-back:
devsel-speed: 00000001
interrupts: 00000001
max-latency: 00000040
min-grant: 00000020
subsystem-vendor-id: 00001385
subsystem-id: 0000311a
unit-address: ''8''
class-code: 00020000
revision-id: 00000010
vendor-id: 000010ec
device-id: 00008169
name: ''pci1385,311a''
Node 0x000010
assigned-addresses:
81004810.00000000.0000c480.00000000.00000008.81004814.00000000.0000c400.00000000.00000004.81004818.00000000.0000c080.00000000.00000008.8100481c.00000000.0000c000.00000000.00000004.81004820.00000000.0000bc00.00000000.00000010.82004824.00000000.fce7c000.00000000.00002000
reg:
00004800.00000000.00000000.00000000.00000000.01004810.00000000.00000000.00000000.00000008.01004814.00000000.00000000.00000000.00000004.01004818.00000000.00000000.00000000.00000008.0100481c.00000000.00000000.00000000.00000004.01004820.00000000.00000000.00000000.00000010.02004824.00000000.00000000.00000000.00002000
compatible: ''pci10de,ad4.1462.7508.a2'' +
''pci10de,ad4.1462.7508'' + ''pci1462,7508'' +
''pci10de,ad4.a2'' + ''pci10de,ad4'' +
''pciclass,010601'' + ''pciclass,0106''
model: ''SATA AHCI 1.0 Interface''
power-consumption: 00000001.00000001
66mhz-capable:
fast-back-to-back:
devsel-speed: 00000000
interrupts: 00000001
max-latency: 00000001
min-grant: 00000003
subsystem-vendor-id: 00001462
subsystem-id: 00007508
unit-address: ''9''
class-code: 00010601
revision-id: 000000a2
vendor-id: 000010de
device-id: 00000ad4
name: ''pci1462,7508''
Node 0x000011
reg: 00005800.00000000.00000000.00000000.00000000
compatible: ''pci10de,569.a1'' +
''pci10de,569'' + ''pciclass,060400'' +
''pciclass,0604''
model: ''PCI-PCI bridge''
ranges:
81000000.00000000.0000e000.81000000.00000000.0000e000.00000000.00001000.82000000.00000000.fd000000.82000000.00000000.fd000000.00000000.01b00000.c2000000.00000000.ce000000.c2000000.00000000.ce000000.00000000.12000000
bus-range: 00000002.00000002
#size-cells: 00000002
#address-cells: 00000003
device_type: ''pci''
power-consumption: 00000001.00000001
devsel-speed: 00000000
unit-address: ''b''
class-code: 00060400
revision-id: 000000a1
vendor-id: 000010de
device-id: 00000569
name: ''pci10de,569''
Node 0x00001b
assigned-addresses:
82020010.00000000.fd000000.00000000.01000000.c3020014.00000000.d0000000.00000000.10000000.c302001c.00000000.ce000000.00000000.02000000.81020024.00000000.0000ec00.00000000.00000080.a1020000.00000000.000003b0.00000000.0000000c.a1020000.00000000.000003c0.00000000.00000020.82020000.00000000.000a0000.00000000.00020000
reg:
00020000.00000000.00000000.00000000.00000000.02020010.00000000.00000000.00000000.01000000.43020014.00000000.00000000.00000000.10000000.4302001c.00000000.00000000.00000000.02000000.01020024.00000000.00000000.00000000.00000080.a1020000.00000000.000003b0.00000000.0000000c.a1020000.00000000.000003c0.00000000.00000020.82020000.00000000.000a0000.00000000.00020000
compatible: ''pci10de,849.1462.7508.a2'' +
''pci10de,849.1462.7508'' + ''pci1462,7508'' +
''pci10de,849.a2'' + ''pci10de,849'' +
''pciclass,030000'' + ''pciclass,0300''
model: ''VGA compatible controller''
power-consumption: 00000001.00000001
devsel-speed: 00000000
interrupts: 00000001
max-latency: 00000000
min-grant: 00000000
subsystem-vendor-id: 00001462
subsystem-id: 00007508
device_type: ''display''
unit-address: ''0''
class-code: 00030000
revision-id: 000000a2
vendor-id: 000010de
device-id: 00000849
name: ''display''
Node 0x000012
reg: 00008000.00000000.00000000.00000000.00000000
compatible: ''pciex10de,778.a1'' +
''pciex10de,778'' + ''pciexclass,060400'' +
''pciexclass,0604'' + ''pci10de,778.a1'' +
''pci10de,778'' + ''pciclass,060400'' +
''pciclass,0604''
model: ''PCI-PCI bridge''
bus-range: 00000003.00000003
#size-cells: 00000002
#address-cells: 00000003
device_type: ''pciex''
power-consumption: 00000001.00000001
slot-names: 00000001.65696370.00000031
physical-slot#: 00000001
devsel-speed: 00000000
interrupts: 00000001
unit-address: ''10''
class-code: 00060400
revision-id: 000000a1
vendor-id: 000010de
device-id: 00000778
pcie-capid-pointer: 00000080
pcie-capid-reg: 00000142
pcie-slotcap-reg: 00082580
name: ''pci10de,778''
Node 0x000013
reg: 00009000.00000000.00000000.00000000.00000000
compatible: ''pciex10de,75b.a1'' +
''pciex10de,75b'' + ''pciexclass,060400'' +
''pciexclass,0604'' + ''pci10de,75b.a1'' +
''pci10de,75b'' + ''pciclass,060400'' +
''pciclass,0604''
model: ''PCI-PCI bridge''
bus-range: 00000004.00000004
#size-cells: 00000002
#address-cells: 00000003
device_type: ''pciex''
power-consumption: 00000001.00000001
slot-names: 00000001.65696370.00000033
physical-slot#: 00000003
devsel-speed: 00000000
interrupts: 00000001
unit-address: ''12''
class-code: 00060400
revision-id: 000000a1
vendor-id: 000010de
device-id: 0000075b
pcie-capid-pointer: 00000080
pcie-capid-reg: 00000141
pcie-slotcap-reg: 00180500
name: ''pci10de,75b''
Node 0x000014
reg: 00009800.00000000.00000000.00000000.00000000
compatible: ''pciex10de,77a.a1'' +
''pciex10de,77a'' + ''pciexclass,060400'' +
''pciexclass,0604'' + ''pci10de,77a.a1'' +
''pci10de,77a'' + ''pciclass,060400'' +
''pciclass,0604''
model: ''PCI-PCI bridge''
ranges:
82000000.00000000.feb00000.82000000.00000000.feb00000.00000000.00100000
bus-range: 00000005.00000005
#size-cells: 00000002
#address-cells: 00000003
device_type: ''pciex''
power-consumption: 00000001.00000001
slot-names: 00000001.65696370.00000034
physical-slot#: 00000004
devsel-speed: 00000000
interrupts: 00000001
unit-address: ''13''
class-code: 00060400
revision-id: 000000a1
vendor-id: 000010de
device-id: 0000077a
pcie-capid-pointer: 00000080
pcie-capid-reg: 00000141
pcie-slotcap-reg: 00200500
name: ''pci10de,77a''
Node 0x00001c
assigned-addresses:
82050010.00000000.febff800.00000000.00000800.82050014.00000000.febff400.00000000.00000080.82050020.00000000.febff000.00000000.00000080.82050024.00000000.febfec00.00000000.00000080
reg:
00050000.00000000.00000000.00000000.00000000.02050010.00000000.00000000.00000000.00000800.02050014.00000000.00000000.00000000.00000080.02050020.00000000.00000000.00000000.00000080.02050024.00000000.00000000.00000000.00000080
compatible: ''pciex197b,2380.1462.508d.0'' +
''pciex197b,2380.1462.508d'' +
''pciex197b,2380.0'' + ''pciex197b,2380'' +
''pciexclass,0c0010'' + ''pciexclass,0c00'' +
''pci197b,2380.1462.508d.0'' +
''pci197b,2380.1462.508d'' + ''pci1462,508d'' +
''pci197b,2380.0'' + ''pci197b,2380'' +
''pciclass,0c0010'' + ''pciclass,0c00''
model: ''FireWire (IEEE 1394) OpenHCI compliant''
power-consumption: 00000001.00000001
devsel-speed: 00000000
interrupts: 00000001
subsystem-vendor-id: 00001462
subsystem-id: 0000508d
unit-address: ''0''
class-code: 000c0010
revision-id: 00000000
vendor-id: 0000197b
device-id: 00002380
pcie-capid-pointer: 00000080
pcie-capid-reg: 00000001
name: ''pci1462,508d''
Node 0x000015
reg: 0000a000.00000000.00000000.00000000.00000000
compatible: ''pciex10de,77a.a1'' +
''pciex10de,77a'' + ''pciexclass,060400'' +
''pciexclass,0604'' + ''pci10de,77a.a1'' +
''pci10de,77a'' + ''pciclass,060400'' +
''pciclass,0604''
model: ''PCI-PCI bridge''
bus-range: 00000006.00000006
#size-cells: 00000002
#address-cells: 00000003
device_type: ''pciex''
power-consumption: 00000001.00000001
slot-names: 00000001.65696370.00000035
physical-slot#: 00000005
devsel-speed: 00000000
interrupts: 00000001
unit-address: ''14''
class-code: 00060400
revision-id: 000000a1
vendor-id: 000010de
device-id: 0000077a
pcie-capid-pointer: 00000080
pcie-capid-reg: 00000141
pcie-slotcap-reg: 00280500
name: ''pci10de,77a''
Node 0x000016
reg: 0000c000.00000000.00000000.00000000.00000000
compatible: ''pci1022,1100.0'' +
''pci1022,1100'' + ''pciclass,060000'' +
''pciclass,0600''
model: ''Host bridge''
power-consumption: 00000001.00000001
devsel-speed: 00000000
max-latency: 00000000
min-grant: 00000000
unit-address: ''18''
class-code: 00060000
revision-id: 00000000
vendor-id: 00001022
device-id: 00001100
name: ''pci1022,1100''
Node 0x000017
reg: 0000c100.00000000.00000000.00000000.00000000
compatible: ''pci1022,1101.0'' +
''pci1022,1101'' + ''pciclass,060000'' +
''pciclass,0600''
model: ''Host bridge''
power-consumption: 00000001.00000001
devsel-speed: 00000000
max-latency: 00000000
min-grant: 00000000
unit-address: ''18,1''
class-code: 00060000
revision-id: 00000000
vendor-id: 00001022
device-id: 00001101
name: ''pci1022,1101''
Node 0x000018
reg: 0000c200.00000000.00000000.00000000.00000000
compatible: ''pci1022,1102.0'' +
''pci1022,1102'' + ''pciclass,060000'' +
''pciclass,0600''
model: ''Host bridge''
power-consumption: 00000001.00000001
devsel-speed: 00000000
max-latency: 00000000
min-grant: 00000000
unit-address: ''18,2''
class-code: 00060000
revision-id: 00000000
vendor-id: 00001022
device-id: 00001102
name: ''pci1022,1102''
Node 0x000019
reg: 0000c300.00000000.00000000.00000000.00000000
compatible: ''pci1022,1103.0'' +
''pci1022,1103'' + ''pciclass,060000'' +
''pciclass,0600''
model: ''Host bridge''
power-consumption: 00000001.00000001
devsel-speed: 00000000
max-latency: 00000000
min-grant: 00000000
unit-address: ''18,3''
class-code: 00060000
revision-id: 00000000
vendor-id: 00001022
device-id: 00001103
name: ''pci1022,1103''
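a note on reading the dump: the class-code property packs the pci base class, sub-class and programming interface into one value, which is how node 0x000010 (class-code 00010601) can be identified as the ahci sata controller. a minimal python sketch of the decoding; the function name and the small lookup table are made up for illustration and only cover three codes that appear in the dump above.

```python
def decode_class_code(value):
    """Split a 24-bit PCI class code into (base class, sub-class, prog-if)."""
    base = (value >> 16) & 0xFF
    sub = (value >> 8) & 0xFF
    prog_if = value & 0xFF
    return base, sub, prog_if


# Illustrative lookup limited to base/sub-class pairs seen in the dump.
KNOWN = {
    (0x01, 0x01): "mass storage, IDE",
    (0x01, 0x06): "mass storage, SATA",
    (0x02, 0x00): "network, ethernet",
}

for raw in ("00010601", "0001018a", "00020000"):
    base, sub, prog_if = decode_class_code(int(raw, 16))
    label = KNOWN.get((base, sub), "not in this table")
    print(f"{raw}: base={base:02x} sub={sub:02x} prog-if={prog_if:02x} ({label})")
```

for 00010601 this gives base 01, sub-class 06 and programming interface 01, matching the "SATA AHCI 1.0 Interface" model string and the pciclass,010601 entry in that node.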
-------------- next part --------------
A non-text attachment was scrubbed...
Name: messages.0
Type: application/octet-stream
Size: 278813 bytes
Desc: not available
URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20081230/01868ea1/attachment.obj>
Richard Elling
2008-Dec-30 18:11 UTC
[zfs-discuss] read/write errors on storage pool (poss. ahci/hw related?)
Jay wrote:
> hi *,
>
> i'm currently playing around with the setup of an opensolaris server as
> home nas and am experiencing occasional read/write problems with the
> zfs pool.
>
> the short version (details below/attached):
> * 6-disk raidz pool attached to the sata controller on an nvidia MCP78S chipset
> * first scrub of the pool with some data on it marks device sd5 as faulted
>   due to "WARNING: ahci0: watchdog port 5 satapkt 0xffffff01c7d8d660 timed out"
>   and a plethora of "Error for Command: read(10)" (see attached messages)
>

Jay, if you search the bugs database for this error message,
http://bugs.opensolaris.org
you will find a number of hits. Many possibly related bugs have been
fixed by b101, but there may be more. You should also ask this question
on the drivers-discuss forum as that is where the device driver writers
hang out.
 -- richard

> * these messages appeared also for sd1, sd2 and sd3, but only sd5 failed in the end
> * replaced the disk, resilvering started
> * the same timeouts appear for sd0 and sd1 while resilvering, to prevent the pool from
>   failing completely, i (rather brute force) rebooted the machine
> * resilvering ends eventually, data seems intact
> * everything seems normal for a few days, reading/writing is ok, no errors show up, the
>   data is accessible
>
> today, i saw the same errors reported for sd4 in the logfile and when trying a 'zpool status'
> it became unresponsive, with timeouts showing up for sd0. after another reboot, everything
> still looks ok, zpool status is ok, read and write access are ok.
>
> the disks themselves should be ok, i had them running a burn-in before installing
> opensolaris and the WD diagnostics passed them - even the faulted one i replaced passed
> another test as being perfectly ok.
>
> can anybody shed some light on this? i'm guessing it's related to the sata controller,
> but i'd appreciate any help or insight.
>
> (at the moment, i'm not really worried about data loss as you might guess from the brute
> force rebooting, all the data on the pool is also stored on an old linux machine. i'm
> reacquainting myself with solaris, so it's more or less a playground for now. but i'd
> like to replace the old linux server sometime - mainly because of zfs)
>
> thanks,
> jay
Jay
2008-Dec-31 11:44 UTC
[zfs-discuss] read/write errors on storage pool (poss. ahci/hw related?)
hi richard,

the bugs database ... figures ... now that you said it, it's really quite obvious :)
thanks, and thanks for the hint towards the drivers-discuss forum.

bye,
jay