Well, I have a zpool that contains four vdevs. Each vdev is a mirror of a T3B LUN and a corresponding LUN from a SE3511 brick. I set it up this way because I was new to ZFS and wanted to be sure my data would survive an array failure. It turns out that I was smart for doing this :)

I had a hardware failure on the SE3511 that killed the entire RAID5 LUN on that array (at first glance it showed 6 drives failed :( ). I expected ZFS to detect the failed mirror halves and offline them, the way ODS and VxVM would. To my shock, it basically hung the server. I eventually had to unmap the SE3511 LUNs and replace them with space I had available from another brick in the SE3511. I then did a zpool replace and ZFS resilvered the data.

So, why did ZFS hang my server?

This is on Solaris 11/06 with kernel patch 127111-05 and ZFS version 4.
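[For readers following along: the layout and the repair described above map onto the standard zpool commands. This is only a rough sketch; "tank" and the cXtYdZ device names are made up, and the real LUN paths for the T3B and SE3511 will differ on your system.]

    # each top-level vdev mirrors one T3B LUN against one SE3511 LUN
    zpool create tank \
        mirror c2t0d0 c3t0d0 \
        mirror c2t1d0 c3t1d0 \
        mirror c2t2d0 c3t2d0 \
        mirror c2t3d0 c3t3d0

    # after the SE3511 LUN died and replacement space was mapped in,
    # swap the dead half of the mirror for the new LUN and let ZFS resilver
    zpool replace tank c3t0d0 c3t4d0
    zpool status tank       # watch the resilver progress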
Matthew C Aycock wrote:
> [...]
> I had a hardware failure on the SE3511 that killed the entire RAID5 LUN on
> that array (at first glance it showed 6 drives failed :( ). I expected ZFS
> to detect the failed mirror halves and offline them, the way ODS and VxVM
> would. To my shock, it basically hung the server.
> [...]
> So, why did ZFS hang my server?

It was patiently waiting.

> This is on Solaris 11/06 with kernel patch 127111-05 and ZFS version 4.

Additional failure management improvements were integrated into NV b72
(IIRC). I'm not sure when or if those changes will make it into Solaris 10,
but update 6 would be a good guess.
 -- richard
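[Editor's note: the "patiently waiting" behaviour is the default when every path to a device is gone: outstanding I/O simply blocks. On much newer ZFS bits (well after ZFS version 4, so it does not help on the poster's release) there is a pool-level failmode property that controls this. A hedged sketch, assuming a pool named tank and a release that has the property:]

    # show the current behaviour on total device failure (default is "wait")
    zpool get failmode tank

    # return errors to new I/O instead of blocking indefinitely
    zpool set failmode=continue tank

    # or panic the host, e.g. so a cluster failover can take over
    zpool set failmode=panic tank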
Richard Elling wrote:
> Matthew C Aycock wrote:
>> [...]
>> So, why did ZFS hang my server?
>
> It was patiently waiting.
>
>> This is on Solaris 11/06 with kernel patch 127111-05 and ZFS version 4.
>
> Additional failure management improvements were integrated into NV b72
> (IIRC). I'm not sure when or if those changes will make it into Solaris 10,
> but update 6 would be a good guess.
>  -- richard

My understanding, from talking with the relevant folks, is that the fix will
be in Solaris 10 Update 6, but is not likely to be available as a patch
beforehand.

--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)