Michael Kucharski
2007-Oct-09 10:29 UTC
[zfs-discuss] ZFS 60 second pause times to read 1K
Every day we see pause times of sometimes 60 seconds to read 1K of a file, for local reads as well as over NFS, in a test setup. We have an x4500 set up as a single 4*(raidz2 9+2) + 2 spare pool, with the file systems mounted over krb5 (Kerberos v5) NFS and also accessed directly. The pool is a 20TB pool and is using . There are three filesystems: backup, test and home. Test has about 20 million files and uses 4TB; the files range from 100B to 200MB. A cron job takes snapshots of test every 15 minutes, starting at 1 minute past the hour. Every 15 minutes, at 2 minutes past the hour, a cron batch job runs zfs send/recv to the backup filesystem. Home has only 100GB.

The test dir has 3 directories: 1 has 130,000 files, the other 2 have 10,000,000 each. We run 4 processes, 2 over NFS and 2 local: 2 read the dir with 130,000 files, the other 2 read the dir with 10,000,000. Every 35 seconds each process reads 1K at a 64K offset of 10 files and records the latency, then reads for 1 second and records the throughput.

At times of no other activity (outside the snapshot and send/recv windows) we see read latencies of up to 60 seconds, maybe once a day, at random times. We are running an unpatched Solaris 10 08/07 build. Pauses this long can lead to timeouts and cause jobs to fail, which is problematic for us. Is this expected behaviour? Can anything be done to mitigate or diagnose the issue? This message posted from opensolaris.org
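The latency probe described above could be sketched roughly as follows. This is a hypothetical reconstruction, not the poster's actual script; it assumes GNU-style `date` with nanosecond support (`%N`), and the file path is illustrative.

```shell
# Read 1K at a 64K offset of a file and report the read latency in ms.
# Hypothetical sketch of the probe described in the post.
probe_read() {
    f="$1"
    start=$(date +%s%N)                                  # nanoseconds before the read
    dd if="$f" of=/dev/null bs=1k count=1 skip=64 2>/dev/null
    end=$(date +%s%N)                                    # nanoseconds after the read
    echo "$f $(( (end - start) / 1000000 )) ms"
}
```

Run against each of the 10 sample files every 35 seconds from cron or a loop; a 60-second result from this probe corresponds to the stalls reported above.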
Are there any clues in the logs? I have had a similar problem when a disk bad block was uncovered by ZFS. I've also seen this when using the Silicon Image driver without the recommended patch. The former became evident when I ran a scrub: I saw SCSI timeout errors pop up in the "kern" syslogs. I solved it by replacing the disk. Gary
Excellent.

Oct 9 13:36:01 zeta1 scsi: [ID 107833 kern.warning] WARNING: /pci@2,0/pci1022,7458@8/pci11ab,11ab@1/disk@2,0 (sd13):
Oct 9 13:36:01 zeta1    Error for Command: read    Error Level: Retryable

Scrubbing now. Big thanks ggendel.
Michael <m.kucharski <at> bigfoot.com> writes:
> Excellent.
>
> Oct 9 13:36:01 zeta1 scsi: [ID 107833 kern.warning] WARNING:
> /pci@2,0/pci1022,7458@8/pci11ab,11ab@1/disk@2,0 (sd13):
> Oct 9 13:36:01 zeta1    Error for Command: read    Error Level: Retryable
>
> Scrubbing now.

This is only a part of the complete error message. Look a few lines above this one. If you see something like:

sata: [ID 801593 kern.notice] NOTICE: /pci@0,0/pci8086,25ae@1c/pci11ab,11ab@2: port 1: device reset
sata: [ID 801593 kern.notice] NOTICE: /pci@0,0/pci8086,25ae@1c/pci11ab,11ab@2: port 1: link lost
sata: [ID 801593 kern.notice] NOTICE: /pci@0,0/pci8086,25ae@1c/pci11ab,11ab@2: port 1: link established
marvell88sx: [ID 812950 kern.warning] WARNING: marvell88sx0: error on port 1:
marvell88sx: [ID 517869 kern.info] device disconnected
marvell88sx: [ID 517869 kern.info] device connected

then it means you are probably affected by bug http://bugs.opensolaris.org/view_bug.do?bug_id=6587133

This bug is fixed in Solaris Express build 73 and above, and will likely be fixed in Solaris 10 Update 5. The workaround is to disable SATA NCQ and queuing by adding "set sata:sata_func_enable = 0x4" to /etc/system.

-marc
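The workaround Marc describes is a one-line addition to /etc/system followed by a reboot. A minimal sketch of applying it idempotently, parameterized on the config path so it can be dry-run against a scratch file first (the function name is illustrative; run as root against the real /etc/system):

```shell
# Append the NCQ-disable tunable from bug 6587133 to the given config
# file (default /etc/system) if it is not already present.
# A reboot is required for the setting to take effect.
add_workaround() {
    cfg="${1:-/etc/system}"
    grep -q 'sata_func_enable' "$cfg" 2>/dev/null || \
        echo 'set sata:sata_func_enable = 0x4' >> "$cfg"
}
```

Guarding with `grep -q` avoids stacking duplicate entries if the script is run more than once.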
Thanks. Looks like I have this bug. Is it a hardware problem combined with a software problem?

Oct 9 09:35:43 zeta1 sata: [ID 801593 kern.notice] NOTICE: /pci@2,0/pci1022,7458@7/pci11ab,11ab@1:
Oct 9 09:35:43 zeta1    port 3: device reset
Oct 9 09:35:43 zeta1 scsi: [ID 107833 kern.warning] WARNING: /pci@2,0/pci1022,7458@7/pci11ab,11ab@1/disk@3,0 (sd18):
Oct 9 09:35:43 zeta1    Error for Command: read(10)    Error Level: Retryable
Oct 9 09:35:43 zeta1 scsi: [ID 107833 kern.notice]    Requested Block: 6063576    Error Block: 6063576
Oct 9 09:35:43 zeta1 scsi: [ID 107833 kern.notice]    Vendor: ATA    Serial Number:
Oct 9 09:35:43 zeta1 scsi: [ID 107833 kern.notice]    Sense Key: No Additional Sense
Oct 9 09:35:43 zeta1 scsi: [ID 107833 kern.notice]    ASC: 0x0 (no additional sense info), ASCQ: 0x0, FRU: 0x0
Oct 9 09:35:43 zeta1 sata: [ID 801593 kern.notice] NOTICE: /pci@2,0/pci1022,7458@7/pci11ab,11ab@1:
Oct 9 09:35:43 zeta1    port 3: device reset
Oct 9 09:35:43 zeta1 sata: [ID 801593 kern.notice] NOTICE: /pci@2,0/pci1022,7458@7/pci11ab,11ab@1:
Oct 9 09:35:43 zeta1    port 3: link lost
Oct 9 09:35:43 zeta1 sata: [ID 801593 kern.notice] NOTICE: /pci@2,0/pci1022,7458@7/pci11ab,11ab@1:
Oct 9 09:35:43 zeta1    port 3: link established
Oct 9 09:35:43 zeta1 marvell88sx: [ID 812950 kern.warning] WARNING: marvell88sx4: error on port 3:
Oct 9 09:35:43 zeta1 marvell88sx: [ID 517869 kern.info]    device disconnected
Oct 9 09:35:43 zeta1 marvell88sx: [ID 517869 kern.info]    device connected
I'm not sure. But when I re-ran the scrub, I got the errors at the same block numbers, which indicated that the disk was really bad. It wouldn't hurt to make the entry in the /etc/system file, reboot, and then try the scrub again. If the problem disappears, then it is a driver bug. Gary
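Gary's distinction (driver bug vs. genuinely failing disk) can be checked by counting the reset/disconnect pattern in the syslog before and after the workaround. A minimal sketch, assuming the standard Solaris log location (the function name is illustrative):

```shell
# Count syslog lines matching the SATA reset/disconnect pattern from
# the bug report. Compare the count across scrubs: counts that keep
# growing at the same block numbers point at the disk, not the driver.
scan_sata_errors() {
    log="${1:-/var/adm/messages}"
    grep -E -c 'device reset|link lost|device disconnected' "$log"
}
```

Pair this with `zpool status -v` output after each scrub to see whether the error blocks repeat.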
Michael Kucharski wrote:
> We have a x4500 setup as a single 4*(raidz2 9+2) + 2 spare pool and have
> the file systems mounted over v5 krb5 NFS and accessed directly. The pool
> is a 20TB pool and is using . There are three filesystems, backup, test
> and home. Test has about 20 million files and uses 4TB. These files range
> from 100B to 200MB. Test has a cron job to take snapshots every 15 minutes
> from 1m on the hour. Every 15min at 2min on the hour a cron batch job runs
> to zfs send/recv to the backup filesystem. Home has only 100GB.

Are you doing this send|recv within the same pool? What's the motivation for that? Can't you just use "zfs clone", which would be much faster and use less disk space? Or if you want another copy (which seems unlikely, since you can already tolerate any 2 disks failing in your pool), then use "zfs set copies=2 <fs>".

--matt
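Matt's two alternatives can be sketched side by side. The pool, filesystem, and snapshot names below are illustrative, and the commands are printed in dry-run form rather than executed, since they would otherwise act on a real pool:

```shell
# Print (dry-run) the two same-pool alternatives to zfs send/recv:
# a writable clone of a snapshot, or per-block duplication via copies=2.
show_alternatives() {
    pool="$1"; fs="$2"; snap="$3"
    # zfs clone is near-instant and shares unchanged blocks with the snapshot
    echo "zfs clone ${pool}/${fs}@${snap} ${pool}/backup"
    # copies=2 stores two copies of every newly written block in the same pool
    echo "zfs set copies=2 ${pool}/${fs}"
}
```

Note the trade-off Matt implies: a clone consumes space only as it diverges from the snapshot, while copies=2 doubles the space of all data written after the property is set.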
I've got the box on eval and am just putting it through its paces. Ideally I would be replicating to another x4500, but I don't have another one and didn't want to use 22 disks for another pool.