thr3ads.net - freebsd stable - 10.1 RC4 r273903 - zpool scrub on ssd mirror

Stephane LAPIE

2014-Dec-06 07:13 UTC

10.1 RC4 r273903 - zpool scrub on ssd mirror - ahci command timeout

On 11/24/2014 02:56 AM, Olivier Cochard-Labbé wrote:
> On Thu, Nov 6, 2014 at 12:32 AM, Kai Gallasch <k at free.de> wrote:
>
>> Hi.
>>
>> Not sure if this is 10.1 related or more a problem of the ssd
>> model and/or ahci controller..
>>
>>
>> I find the following kernel message in the output of 'dmesg' (after
>> running zpool scrub two times):
>>
>>
>> ahcich2: Timeout on slot 15 port 0
>> ahcich2: is 00000000 cs 000f0000 ss 000f8000 rs 000f8000 tfd 40 serr 00000000 cmd 0024cf17
>> (ada2:ahcich2:0:0:0): READ_FPDMA_QUEUED. ACB: 60 8b a6 1d 56 40 0d 00 00 00 00 00
>> (ada2:ahcich2:0:0:0): CAM status: Command timeout
>> (ada2:ahcich2:0:0:0): Retrying command
>> ahcich2: Timeout on slot 23 port 0
>> ahcich2: is 00000000 cs 0f000000 ss 0f800000 rs 0f800000 tfd 40 serr 00000000 cmd 0024d817
>> (ada2:ahcich2:0:0:0): READ_FPDMA_QUEUED. ACB: 60 1b 23 81 bc 40 06 00 00 00 00 00
>> (ada2:ahcich2:0:0:0): CAM status: Command timeout
>> (ada2:ahcich2:0:0:0): Retrying command
>> ahcich2: Timeout on slot 3 port 0
>> ahcich2: is 00000000 cs 00000030 ss 00000038 rs 00000038 tfd 40 serr 00000000 cmd 0024c317
>> (ada2:ahcich2:0:0:0): READ_FPDMA_QUEUED. ACB: 60 26 bd 18 8e 40 12 00 00 00 00 00
>> (ada2:ahcich2:0:0:0): CAM status: Command timeout
>> (ada2:ahcich2:0:0:0): Retrying command
>>
>>
> Hi,
>
> I have been hitting the same "CAM status: Command timeout" problem since
> my upgrade from 10.0 to 10.1 on an HP ProLiant MicroServer (AMD
> SB7x0/SB8x0/SB9x0 SATA Controller). A "zpool scrub" is one way to trigger
> this error message, but writing big files (a few GB) is another.

Hi,

I'm encountering something along those lines too,
with Transcend SSDs on AHCI controllers.
(Although the controllers are set up and identified as ATA
because of an option ROM clash on that system,
which I should eventually get around to fixing...)

I have two ZFS pools:
- 15 disks on an LSI controller
- 2 SSDs on the aforementioned Intel AHCI controller

Ever since upgrading to 10.1-RELEASE,
I eventually get a warning in dmesg for either of the two SSDs:
ata0: already running!
ata1: already running!

Source code for ATA indicates that it is just reprocessing the same query,
and gstat indicates that operations are stuck in the I/O queue without
ever being purged.
Once it comes to that, the only way to (temporarily) recover is to reboot.
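
For comparing reports like these, it can help to turn the hex masks in the "ahcich2: is ... cs ... ss ... rs ..." lines quoted above into slot numbers. What follows is only a minimal sketch: the function names are made up for illustration, and the reading of the fields (cs as the port's command-issue mask, ss as the SATA-active mask, rs as the driver's own view of running slots) is an assumption about the ahci driver's output, not something confirmed in this thread.

```python
import re

# Pulls the cs/ss/rs masks out of an "ahcichN: is ... cs ... ss ... rs ..." line.
# The field meanings are assumed (see the note above), not taken from the thread.
DUMP_RE = re.compile(
    r"is [0-9a-f]{8} cs ([0-9a-f]{8}) ss ([0-9a-f]{8}) rs ([0-9a-f]{8})"
)


def slots(mask_hex):
    """Return the bit positions (slot numbers) set in a 32-bit hex mask."""
    mask = int(mask_hex, 16)
    return [bit for bit in range(32) if mask & (1 << bit)]


def decode_dump(line):
    """Decode one register dump line into slot lists, or None if it does not match."""
    match = DUMP_RE.search(line)
    if match is None:
        return None
    cs, ss, rs = match.groups()
    return {"cs": slots(cs), "ss": slots(ss), "rs": slots(rs)}


if __name__ == "__main__":
    # The three dumps from the quoted dmesg output, with the reported slot.
    samples = [
        (15, "ahcich2: is 00000000 cs 000f0000 ss 000f8000 rs 000f8000 tfd 40 serr"),
        (23, "ahcich2: is 00000000 cs 0f000000 ss 0f800000 rs 0f800000 tfd 40 serr"),
        (3, "ahcich2: is 00000000 cs 00000030 ss 00000038 rs 00000038 tfd 40 serr"),
    ]
    for slot, line in samples:
        print(slot, decode_dump(line))
```

In all three quoted dumps, the slot named in the "Timeout on slot N" line is the lowest slot set in ss and rs and is not set in cs. What that implies about the drive or the controller is left open here.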

Cheers,

-- 
Stephane LAPIE, EPITA SRS, Promo 2005
"Even when they have digital readouts, I can't understand them."
--MegaTokyo


