Ray Van Dolson
2010-Aug-24 23:05 UTC
[zfs-discuss] SCSI write retry errors on ZIL SSD drives...
I posted a thread on this once long ago[1] -- but we're still fighting
with this problem and I wanted to throw it out here again.

All of our hardware is from Silicon Mechanics (SuperMicro chassis and
motherboards). Up until now, all of the hardware has had a single
24-disk expander / backplane -- but we recently got one of the new
SC847-based models with 24 disks up front and 12 in the back -- a
dual-backplane setup.

We're using two SSDs in the front backplane as mirrored ZIL/OS (I
don't think we have the 4K alignment set up correctly) and two drives
in the back as L2ARC. The rest of the disks are 1TB SATA disks which
make up a single large zpool via three 8-disk RAIDZ2s. As you can see,
we don't have the server maxed out on drives...

In any case, this new server gets between 400 and 600 of these timeout
errors an hour:

  Aug 21 03:10:17 dev-zfs1 scsi: [ID 365881 kern.info] /pci@0,0/pci8086,340f@8/pci15d9,1@0 (mpt0):
  Aug 21 03:10:17 dev-zfs1        Log info 31126000 received for target 8.
  Aug 21 03:10:17 dev-zfs1        scsi_status=0, ioc_status=804b, scsi_state=c
  Aug 21 03:10:17 dev-zfs1 scsi: [ID 365881 kern.info] /pci@0,0/pci8086,340f@8/pci15d9,1@0 (mpt0):
  Aug 21 03:10:17 dev-zfs1        Log info 31126000 received for target 8.
  Aug 21 03:10:17 dev-zfs1        scsi_status=0, ioc_status=804b, scsi_state=c
  Aug 21 03:10:17 dev-zfs1 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340f@8/pci15d9,1@0/sd@8,0 (sd0):
  Aug 21 03:10:17 dev-zfs1        Error for Command: write(10)   Error Level: Retryable
  Aug 21 03:10:17 dev-zfs1 scsi: [ID 107833 kern.notice]  Requested Block: 21230708   Error Block: 21230708
  Aug 21 03:10:17 dev-zfs1 scsi: [ID 107833 kern.notice]  Vendor: ATA   Serial Number: CVEM002600EW
  Aug 21 03:10:17 dev-zfs1 scsi: [ID 107833 kern.notice]  Sense Key: Unit Attention
  Aug 21 03:10:17 dev-zfs1 scsi: [ID 107833 kern.notice]  ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0
  Aug 21 03:10:21 dev-zfs1 scsi: [ID 365881 kern.info] /pci@0,0/pci8086,340f@8/pci15d9,1@0 (mpt0):

iostat -xnMCez shows that the first of the two ZIL drives receives
about twice the number of "errors" as the second drive. There are no
other errors on any other drives -- including the L2ARC SSDs -- and
the asvc_t times seem reasonably low and don't indicate a bad drive to
my eyes...

The timeouts above exact a rather large performance penalty on the
system, both in I/O and in general usage from an SSH console: obvious
pauses and glitches when accessing the filesystem.

The problem _follows_ the ZIL and isn't tied to hardware. IOW, if I
switch to using the L2ARC drives as ZIL, those drives suddenly exhibit
the timeout problems...

If we connect the SSD drives directly to the LSI controller instead of
hanging them off the hot-swap backplane, the timeouts go away. If we
use SSDs attached to the SATA controllers as ZIL, there are also no
performance issues or timeout errors. So the problem only occurs with
SSDs acting as ZIL attached to the backplane.

This is leading me to believe we have a driver issue of some sort --
the mpt subsystem being unable to cope with the longer command path of
multiple backplanes. Someone alluded to this in [1] as well, and it
makes sense to me.

One quick fix, it seems to me, would be upping the SCSI timeout
values. How do you do this with the mpt driver?
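(The only knob I've found documented so far is the sd driver's global
command timeout, settable from /etc/system -- an untested sketch
below, and I honestly don't know whether it has any effect on the mpt
retry behavior we're seeing:

  * /etc/system -- raise the sd target command timeout from the
  * default of 60 seconds to 120 (takes effect after a reboot)
  set sd:sd_io_time = 0x78

Corrections welcome if that's the wrong knob entirely.)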
We haven't yet been able to try OpenSolaris or Nexenta on one of these
systems to see if the problem goes away in later releases of the
kernel or driver, but I'm curious if anyone out there has any bright
ideas as to what we might be running into here and what's involved in
fixing it. We've swapped out backplanes and drives, and the problem
happens on every single Silicon Mechanics system we have, so at this
point I'm really doubting it's a hardware issue :)

Hardware details are as follows:

Silicon Mechanics Storform iServ R518
(based on the SuperMicro SC847E16-R1400 chassis)
SuperMicro X8DT3 motherboard w/ onboard LSI1068 controller:
  - One LSI port goes to the front backplane (where the bulk of the
    SATA drives are, plus the two SSDs used as ZIL/OS).
  - The other LSI port goes to the rear backplane (where the two L2ARC
    drives are, along with a couple of SATA drives).

We've got 6GB of RAM and two quad-core Xeons in the box as well. The
SSDs themselves are all Intel X25-E's (32GB) with firmware 8860, and
the LSI 1068 is a SAS1068E B3 with firmware 011c0200 (1.28.02.00).
We're running Solaris 10U8, mostly up to date, with MPT HBA driver
v1.92.

Thoughts, theories and conjectures would be much appreciated... Sun
these days wants us to reproduce the problem on Sun hardware before
they'll give us much support... Silicon Mechanics has been helpful,
but it seems they don't have a large enough inventory on hand to
replicate our hardware setup. :(

Ray

[1] http://markmail.org/message/gfz2cui2iua4dxpy
Andrew Gabriel
2010-Aug-24 23:46 UTC
[zfs-discuss] SCSI write retry errors on ZIL SSD drives...
Ray Van Dolson wrote:

> I posted a thread on this once long ago[1] -- but we're still
> fighting with this problem and I wanted to throw it out here again.
[...]
> In any case, this new server gets between 400 and 600 of these
> timeout errors an hour:
[...]
> Aug 21 03:10:17 dev-zfs1 scsi: [ID 107833 kern.notice]  Sense Key: Unit Attention
> Aug 21 03:10:17 dev-zfs1 scsi: [ID 107833 kern.notice]  ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0
[...]
> The timeouts above exact a rather large performance penalty on the
> system, both in I/O and in general usage from an SSH console: obvious
> pauses and glitches when accessing the filesystem.

This isn't a timeout. "Unit Attention" is the drive saying back to the
computer that it's been reset and has forgotten any negotiation which
happened with the controller. It's a couple of decades since I was
working on SCSI at this level, but IIRC, a drive will return a "Unit
Attention" error to the first command issued to it after a
reset/powerup, except for a Test Unit Ready command. As it says, this
might be caused by a power on, reset, or bus reset.

> The problem _follows_ the ZIL and isn't tied to hardware. IOW, if I
> switch to using the L2ARC drives as ZIL, those drives suddenly
> exhibit the timeout problems...

A possibility is that the problem is related to the nature of the load
a ZIL drive attracts. One scenario could be that you are crashing the
drive firmware, causing it to reset and reinitialize itself, and
therefore to return "Unit Attention" to the next command.
(I don't know if X25-E's can behave this way.)

I would try to correct the 4K alignment on the ZIL at least -- that
does significantly affect the work the drive has to do internally (as
well as its performance), although I've no idea if that's related to
the issue you're seeing.

> If we connect the SSD drives directly to the LSI controller instead
> of hanging them off the hot-swap backplane, the timeouts go away.

Again, this may be related to some combination of the load type and
physical characteristics.

> If we use SSDs attached to the SATA controllers as ZIL, there are
> also no performance issues or timeout errors.

Why not do this then? It also avoids using the SATA tunneling protocol
across the SAS links and port expanders.

> So the problem only occurs with SSDs acting as ZIL attached to the
> backplane.
>
> This is leading me to believe we have a driver issue of some sort --
> the mpt subsystem being unable to cope with the longer command path
> of multiple backplanes. Someone alluded to this in [1] as well, and
> it makes sense to me.
>
> One quick fix, it seems to me, would be upping the SCSI timeout
> values.

The error you included isn't a timeout.

> The SSDs themselves are all Intel X25-E's (32GB) with firmware 8860,
> and the LSI 1068 is a SAS1068E B3 with firmware 011c0200 (1.28.02.00).

I'm not intimately familiar with the firmware versions, but if you're
having problems, making sure you have the latest firmware is probably
a good thing to do.

--
Andrew Gabriel
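P.S. In case it helps, a minimal sketch of the sort of realignment I
mean, assuming you give the log its own EFI-labelled slice (the pool
and device names here are hypothetical, and repartitioning destroys
the data on the drive):

  # Put an EFI label on the SSD and start the slog slice at sector 256
  # (256 x 512 B = 128 KiB, a multiple of 4 KiB):
  format -e c1t8d0    # partition -> modify; slice 0 starting sector 256
  # Then attach the aligned slices as the mirrored log:
  zpool add tank log mirror c1t8d0s0 c1t9d0s0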
Ray Van Dolson
2010-Aug-25 00:27 UTC
[zfs-discuss] SCSI write retry errors on ZIL SSD drives...
On Tue, Aug 24, 2010 at 04:46:23PM -0700, Andrew Gabriel wrote:

> Ray Van Dolson wrote:
> > In any case, this new server gets between 400 and 600 of these
> > timeout errors an hour:
[...]
> This isn't a timeout. "Unit Attention" is the drive saying back to
> the computer that it's been reset and has forgotten any negotiation
> which happened with the controller. It's a couple of decades since I
> was working on SCSI at this level, but IIRC, a drive will return a
> "Unit Attention" error to the first command issued to it after a
> reset/powerup, except for a Test Unit Ready command. As it says, this
> might be caused by a power on, reset, or bus reset.

Interesting. Thanks for the insight.

> > The problem _follows_ the ZIL and isn't tied to hardware. IOW, if I
> > switch to using the L2ARC drives as ZIL, those drives suddenly
> > exhibit the timeout problems...
> A possibility is that the problem is related to the nature of the
> load a ZIL drive attracts. One scenario could be that you are
> crashing the drive firmware, causing it to reset and reinitialize
> itself, and therefore to return "Unit Attention" to the next command.
> (I don't know if X25-E's can behave this way.)
>
> I would try to correct the 4K alignment on the ZIL at least -- that
> does significantly affect the work the drive has to do internally (as
> well as its performance), although I've no idea if that's related to
> the issue you're seeing.

Will definitely give this a go -- certainly can't hurt.

> > If we connect the SSD drives directly to the LSI controller instead
> > of hanging them off the hot-swap backplane, the timeouts go away.
>
> Again, this may be related to some combination of the load type and
> physical characteristics.
>
> > If we use SSDs attached to the SATA controllers as ZIL, there are
> > also no performance issues or timeout errors.
>
> Why not do this then? It also avoids using the SATA tunneling
> protocol across the SAS links and port expanders.

We may -- however, the main reason we'd gone with the port expander
was for convenient hot-swappability. Though I guess SATA is
technically hot-swappable, it's not as convenient :)

> > So the problem only occurs with SSDs acting as ZIL attached to the
> > backplane.
[...]
> > One quick fix, it seems to me, would be upping the SCSI timeout
> > values.
>
> The error you included isn't a timeout.
>
> > The SSDs themselves are all Intel X25-E's (32GB) with firmware
> > 8860, and the LSI 1068 is a SAS1068E B3 with firmware 011c0200
> > (1.28.02.00).
>
> I'm not intimately familiar with the firmware versions, but if you're
> having problems, making sure you have the latest firmware is probably
> a good thing to do.

Appreciate the response, Gabriel. We also plan to compare Solaris 10U8
and OpenSolaris / Nexenta at some point, when this hardware is freed
up...

Ray
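P.S. For the archives, the switch to onboard SATA itself looks
straightforward -- something like the following, assuming a pool named
"tank", hypothetical onboard-SATA device names, and a pool version new
enough to support log device removal (>= 19):

  zpool status tank              # note the name of the current log vdev
  zpool remove tank mirror-1     # remove the old slog mirror by that name
  zpool add tank log mirror c2t0d0s0 c2t1d0s0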
Andreas Grüninger
2010-Aug-25 18:47 UTC
[zfs-discuss] SCSI write retry errors on ZIL SSD drives...
Ray,

Supermicro does not support the use of SSDs behind an expander. You
must put the SSD in the head or use an interposer card; see here:

http://www.lsi.com/storage_home/products_home/standard_product_ics/sas_sata_protocol_bridge/lsiss9252/index.html

Supermicro offers an interposer card too: AOC-SMP-LSISS9252.

Andreas
--
This message posted from opensolaris.org
Ray Van Dolson
2010-Aug-25 19:23 UTC
[zfs-discuss] SCSI write retry errors on ZIL SSD drives...
On Wed, Aug 25, 2010 at 11:47:38AM -0700, Andreas Grüninger wrote:

> Supermicro does not support the use of SSDs behind an expander. You
> must put the SSD in the head or use an interposer card; see here:
> http://www.lsi.com/storage_home/products_home/standard_product_ics/sas_sata_protocol_bridge/lsiss9252/index.html
>
> Supermicro offers an interposer card too: AOC-SMP-LSISS9252.

Hmm, interesting. FAQ #3 on this page[1] seems to indicate otherwise
-- at least in the case of the Intel X25-E (SSDSA2SH064G1GC) with
firmware 8860 (which we are running).

Ray

[1] http://www.supermicro.com/support/faqs/results.cfm?id=95
Andreas Grüninger
2010-Aug-25 20:47 UTC
[zfs-discuss] SCSI write retry errors on ZIL SSD drives...
This was the information I got from the distributor, but this FAQ is
newer. Anyway, you still have the problems.

When we installed the Intel X25-Es, we also had problems with
timeouts. We replaced the original Sun StorageTek SAS HBA (LSI-based
1068E, newest firmware) with an original Sun StorageTek SAS RAID HBA
(Sun OEM version of the Adaptec 5085). No timeouts since this
replacement.

Andreas
--
This message posted from opensolaris.org
Ray Van Dolson
2010-Sep-21 15:58 UTC
[zfs-discuss] SCSI write retry errors on ZIL SSD drives...
Just wanted to post a quick follow-up to this. The original thread is
here[1] -- not quoted, for brevity.

Andrew Gabriel suggested[2] that this could possibly be a
workload-triggered issue. We wanted to rule out a driver problem, so
we tested various configurations under Solaris 10U9 and OpenSolaris
with correct 4K block alignment.

The "Unit Attention" errors appeared under all operating environments
for any X25-E (we haven't tested other brands) when used as ZIL and
attached to one of the LSI port expanders used in Silicon Mechanics
hardware. As soon as we move the drives to the onboard SATA controller
or attach them directly to the LSI controller (bypassing the
expander), the issues go away.

Perhaps tweaking the firmware on the port expander would have resolved
the issue, but we're not able to test that scenario currently.

Of note, a heavy workload wasn't required to trigger the problem. We
ran bonnie++ hard on the system -- which appeared to tax the ZIL quite
a bit -- but got no errors. However, as soon as we set up an NFS
VMware datastore and loaded a couple of VMs on it, the Unit Attention
errors began popping up -- even when the VMs weren't particularly
busy.

In any case, we'll probably stop chasing our tails on this issue and
will begin mounting all drives used for ZIL internally, directly
attached to the onboard SATA controllers.

Thanks,
Ray

[1] http://mail.opensolaris.org/pipermail/zfs-discuss/2010-August/044362.html
[2] http://mail.opensolaris.org/pipermail/zfs-discuss/2010-August/044364.html
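P.S. For anyone trying to reproduce this: the shape of the two loads
was roughly the following (parameters are illustrative, not our exact
command lines, and the pool/dataset names are hypothetical):

  # Local synthetic load -- taxed the ZIL but did NOT trigger the errors:
  bonnie++ -d /tank/bench -s 16384 -n 128 -u root

  # NFS-shared VMware datastore -- triggered the errors almost at once:
  zfs create tank/vmware
  zfs set sharenfs=rw tank/vmware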
Richard Elling
2010-Sep-21 16:15 UTC
[zfs-discuss] SCSI write retry errors on ZIL SSD drives...
Other SATA HDDs and SSDs are similarly affected; this can vary by
firmware rev. The prescription seems to be consistent with my
experience.
 -- richard

On Sep 21, 2010, at 8:58 AM, Ray Van Dolson wrote:

> Just wanted to post a quick follow-up to this. The original thread is
> here[1] -- not quoted, for brevity.
[...]
> In any case, we'll probably stop chasing our tails on this issue and
> will begin mounting all drives used for ZIL internally, directly
> attached to the onboard SATA controllers.

--
OpenStorage Summit, October 25-27, Palo Alto, CA
http://nexenta-summit2010.eventbrite.com
ZFS and performance consulting
http://www.RichardElling.com