On Thu, Nov 6, 2008 at 1:50 AM, Eric Sproul <esproul at omniti.com> wrote:
> Ben Rockwood wrote:
>> This is troubling because it means one disk can go wonky and render your
>> storage system useless until someone can respond, and I'd imagine most
>> admins would "solve" the problem via reboot, a very poor solution.
>
> At the risk of being just AOL-style "me too" noise... this has bitten us too,
> but in our case it seemed to be poor hardware/drivers. The system had the same
> pathology you describe, but the errors were retryable writes to a disk that had
> failed. The controller (Adaptec SATA RAID, aac) would not fail the device,
> instead it just kept issuing retryables and the effect was the same. The only
One would expect the file system to time out and subsequently
disconnect/fail the respective drive. Peculiarly, ZFS with its
self-healing doesn't.
Regards,
Andrey
> way I could recover was to shut down and physically detach the offending disk.
>
> My disks show up as SCSI in this scenario, and IIRC (memory is hazy from the
> stress) cfgadm commands were failing as well.
>
> Eric