Michael Stalnaker
2008-Jan-12 19:47 UTC
[zfs-discuss] Disk array problems - any suggestions?
All;
I have a 24-disk SATA array attached to an HP DL160 with an LSI 3801E as the
controller. We've been seeing errors that look like:
WARNING: /pci@0,0/pci8086,25f7@2/pci8086,350c@0,3/pci1000,30e0@2 (mpt0);
Disconnected command timeout for Target 23
WARNING: /pci@0,0/pci8086,25f7@2/pci8086,350c@0,3/pci1000,30e0@2 (mpt0);
Disconnected command timeout for Target 23
SCSI transport failed: reason 'reset': giving up
WARNING: /pci@0,0/pci8086,25f7@2/pci8086,350c@0,3/pci1000,30e0@2 (mpt0);
Disconnected command timeout for Target 23
WARNING: /pci@0,0/pci8086,25f7@2/pci8086,350c@0,3/pci1000,30e0@2 (mpt0);
Disconnected command timeout for Target 23
When these occur, the system hangs on any access to the array and never
recovers. After some discussions with some folks at Sun, I rebuilt the
system from Solaris 10 x86 Update 4 to run OpenSolaris. It's currently on
Solaris Express (Nevada) build 78, and the errors are continuing. The
drives are 750 GB Hitachis, and after a power cycle and reboot, the error
does not persist on any one drive. Each of the drives is in a carrier with some
active electronics to adapt the SATA drives for SAS use. My fear at the
moment is that there's some sort of problem with the 24-drive enclosure
itself, as the drives appear to be fine and I cannot believe we're seeing an
intermittent failure across a number of drives.
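For anyone who wants to check whether the timeouts stay on one target or move
between several, a small script along these lines could tally them from the
system log. This is only a rough sketch: the function names are made up for
illustration, and it assumes the messages appear in the log with the same
wording as in the excerpt above.

```python
import re
from collections import Counter

# Matches the mpt timeout message exactly as it is worded in the log excerpt.
TIMEOUT_RE = re.compile(r"Disconnected command timeout for Target (\d+)")


def count_timeouts(lines):
    """Return a Counter mapping target number -> number of timeout messages."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


def report(path="/var/adm/messages"):
    """Print a per-target tally for a log file.

    Assumption: the messages are in /var/adm/messages (the usual Solaris
    system log); pass a different path if they are logged elsewhere.
    """
    with open(path, errors="replace") as log:
        counts = count_timeouts(log)
    for target, n in sorted(counts.items()):
        print(f"Target {target}: {n} timeout(s)")
```

Calling report() on the box would show at a glance whether only one target is
timing out or whether the problem is spread across the enclosure.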
Any suggestions would be appreciated.
--Mike Stalnaker