Chris Siebenmann
2008-May-09 20:19 UTC
[zfs-discuss] Weird performance issue with ZFS with lots of simultaneous reads
I have a ZFS-based NFS server (Solaris 10 U4 on x86) where I am seeing a weird performance degradation as the number of simultaneous sequential reads increases.

Setup:
  NFS client -> Solaris NFS server -> iSCSI target machine

There are 12 physical disks on the iSCSI target machine. Each of them is sliced up into 11 parts and the parts are exported as individual LUNs to the Solaris server. The Solaris server uses each LUN as a separate ZFS pool (giving 132 pools in total) and exports them all to the NFS client.

(The NFS client and the iSCSI target machine are both running Linux. The Solaris NFS server has 4 GB of RAM.)

When the NFS client starts a sequential read against one filesystem from each physical disk, the iSCSI target machine and the NFS client both use the full network bandwidth and each individual read gets 1/12th of it (about 9.something MBytes/sec). Starting a second set of sequential reads against each disk (to a different pool) behaves the same, as does starting a third set.

However, when I add a fourth set of reads things change; while the NFS server continues to read from the iSCSI target at full speed, the data rate to the NFS client drops significantly. By the time I hit 9 reads per physical disk, the NFS client is getting a *total* of 8 MBytes/sec. In other words, it seems that ZFS on the NFS server is somehow discarding most of what it reads from the iSCSI disks, although I can't see any sign of this in 'vmstat' output on Solaris.

Also, this may not be just an NFS issue; in limited testing with local IO on the Solaris machine it seems that I may be seeing the same effect at roughly the same magnitude.

(The testing is limited because it is harder to accurately measure what aggregate data rate I'm getting and harder to run that many simultaneous reads; if I run too many of them, the Solaris machine locks up due to overload.)

Does anyone have any ideas of what might be going on here, and how I might be able to tune things on the Solaris machine so that it performs better in this situation (ideally without harming performance under smaller loads)? Would partitioning the physical disks on Solaris instead of splitting them up on the iSCSI target make a significant difference?

Thanks in advance.

	- cks
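To make the load concrete: each "set" of reads is one sequential reader per physical disk, each against a different pool on that disk. A rough sketch of how one such test round might be driven from the Linux NFS client looks like the following; the mount points and file names here are hypothetical, not the actual ones:

  # hypothetical layout: /mnt/diskD-sliceS is the NFS mount of the pool
  # built on slice S of physical disk D; four "sets" = four readers per disk
  for disk in 1 2 3 4 5 6 7 8 9 10 11 12; do
      for slice in 1 2 3 4; do
          dd if=/mnt/disk${disk}-slice${slice}/bigfile of=/dev/null bs=1024k &
      done
  done
  wait

Twelve streams at about 9 MBytes/sec each is roughly 110 MBytes/sec aggregate, which is consistent with the "full network bandwidth" above being a saturated gigabit link.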
Robin Guo
2008-May-09 21:52 UTC
[zfs-discuss] Weird performance issue with ZFS with lots of simultaneous reads
Hi Chris,

Good topic; I'd like to see comments from the experts as well.

First, I think part of the penalty comes from NFS: ZFS served over NFS has a known performance cost, and the L2ARC cache feature is, so far, the way to address it. (It is in OpenSolaris, but not in s10u4 yet; it is targeted for the s10u6 release.)

I have also seen a performance loss when trying iSCSI from the local machine, but I haven't gathered accurate data yet. That may be a problem worth evaluating.

I'll follow this thread to see if there is any progress. Thanks for bringing up the topic.

- Regards,

Robin Guo

Chris Siebenmann wrote:
> I have a ZFS-based NFS server (Solaris 10 U4 on x86) where I am seeing
> a weird performance degradation as the number of simultaneous sequential
> reads increases.
> [...]
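For reference, once L2ARC support is available, attaching a cache device to a pool should look roughly like the following; the pool and device names are just examples, and the exact behaviour may differ by release:

  # hypothetical pool "tank" and spare SSD/disk c2t0d0
  zpool add tank cache c2t0d0
  zpool status tank      # the device then appears under a "cache" section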
Robert Milkowski
2008-May-14 23:30 UTC
[zfs-discuss] Weird performance issue with ZFS with lots of simultaneous reads
Hello Chris,

Friday, May 9, 2008, 9:19:53 PM, you wrote:

CS> I have a ZFS-based NFS server (Solaris 10 U4 on x86) where I am seeing
CS> a weird performance degradation as the number of simultaneous sequential
CS> reads increases.
CS> [...]
CS> However, when I add a fourth set of reads things change; while the
CS> NFS server continues to read from the iSCSI target at full speed, the
CS> data rate to the NFS client drops significantly. By the time I hit
CS> 9 reads per physical disk, the NFS client is getting a *total* of 8
CS> MBytes/sec. In other words, it seems that ZFS on the NFS server is
CS> somehow discarding most of what it reads from the iSCSI disks, although
CS> I can't see any sign of this in 'vmstat' output on Solaris.

Keep in mind that you will end up with a lot of seeks on the physical drives once you do multiple sequential reads from different disk regions. Nevertheless, I wouldn't expect much difference in throughput between the NFS client and the iSCSI server.

I'm thinking that maybe you are hitting an issue with the vdev cache, as you probably ended up with 8KB reads over NFS (RSIZE) and 64KB reads from iSCSI. You have 4GB of RAM and I'm assuming most of it is free (used by the ARC cache)... or maybe that is actually not the case, so the vdev cache reads 64KB, the NFS client reads 8KB, and by the time it asks for the next 8KB the data is already gone. Since your box "locks up" - maybe the iSCSI target or some other application has a memory leak? Is your system using the swap device just before it "locks up"?

Try mounting the filesystems on the NFS client with RSIZE=32KB, and make sure your scripts/programs are also requesting at least 32KB at a time. Check if that helps. If it doesn't, then disable the vdev cache on the Solaris box (by setting zfs_vdev_cache_max to 1) and check again.

CS> (The testing is limited because it is harder to accurately measure what
CS> aggregate data rate I'm getting and harder to run that many simultaneous
CS> reads; if I run too many of them, the Solaris machine locks up due to
CS> overload.)

That's strange - what exactly happens when it "locks up"? Does it panic?

CS> smaller loads)? Would partitioning the physical disks on Solaris instead
CS> of splitting them up on the iSCSI target make a significant difference?

Why do you want to partition them in the first place? Why not present each disk as an iSCSI LUN, then create a pool out of them and, if necessary, create multiple filesystems inside it. And what about data protection - don't you want to use any RAID?

--
Best regards,
Robert Milkowski
mailto:milek at task.gda.pl
http://milek.blogspot.com
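To put those two suggestions concretely (the server and export names here are just examples):

  # on the Linux NFS client: ask the server for 32KB reads
  mount -t nfs -o rsize=32768 solaris-server:/export/pool1 /mnt/pool1

  # on the Solaris box: effectively disable the vdev cache
  # (add to /etc/system and reboot)
  set zfs:zfs_vdev_cache_max = 1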
Chris Siebenmann
2008-May-15 04:42 UTC
[zfs-discuss] Weird performance issue with ZFS with lots of simultaneous reads
I wrote:
| I have a ZFS-based NFS server (Solaris 10 U4 on x86) where I am
| seeing a weird performance degradation as the number of simultaneous
| sequential reads increases.

To update zfs-discuss on this: after more investigation, this seems to be due to file-level prefetching. Turning file-level prefetching off (following the directions of the ZFS Evil Tuning Guide) returns NFS server performance to full network bandwidth when there are lots of simultaneous sequential reads. Unfortunately it significantly reduces the performance of a single sequential read (when the server is otherwise idle).

The problem is definitely not an issue of having too many pools or too many LUNs; I saw the same issue with a single striped pool made from 12 whole-disk LUNs. (And the issue happens locally as well as remotely, so it's not NFS; it's just easier to measure with an NFS client, because you can clearly see the (maximum) aggregate data rate to all of the sequential reads.)

| CS> (The testing is limited because it is harder to accurately measure
| CS> what aggregate data rate I'm getting and harder to run that many
| CS> simultaneous reads; if I run too many of them, the Solaris machine
| CS> locks up due to overload.)
|
| That's strange - what exactly happens when it "locks up"? Does it
| panic?

I have to apologize; this happened during an earlier round of tests, when the Solaris machine had too little memory for the number of pools I had on it. According to my notes, the behavior in the with-prefetch state is that the machine can survive but is extremely unresponsive until the test programs finish. (I haven't retested with file prefetching turned off.)

(Here 'locks up' means it becomes basically totally unresponsive, although it seems to still be doing IO.)

I am using a test program that is basically dd with some reporting; it reads a 1 MB buffer from standard input and writes it to standard output. In these tests, each reader's stdin is a (different) 10 GB file and its stdout is /dev/null.

	- cks
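In outline, the reader is essentially the following; this is a minimal sketch in C rather than the actual program, which also does the throughput reporting:

/* Minimal sketch of the test reader: copy stdin to stdout in 1 MB chunks.
 * Run as:  ./reader < /pool/bigfile > /dev/null  */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

#define BUFSIZE (1024 * 1024)   /* 1 MB per read, as described above */

int main(void)
{
    static char buf[BUFSIZE];
    ssize_t n;

    while ((n = read(STDIN_FILENO, buf, BUFSIZE)) > 0) {
        ssize_t off = 0;
        while (off < n) {               /* handle short writes */
            ssize_t w = write(STDOUT_FILENO, buf + off, n - off);
            if (w < 0) {
                perror("write");
                return 1;
            }
            off += w;
        }
    }
    if (n < 0) {
        perror("read");
        return 1;
    }
    return 0;
}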
Robert Milkowski
2008-May-16 08:09 UTC
[zfs-discuss] Weird performance issue with ZFS with lots of simultaneous reads
Hello Chris,

Thursday, May 15, 2008, 5:42:32 AM, you wrote:

CS> I wrote:
CS> | I have a ZFS-based NFS server (Solaris 10 U4 on x86) where I am
CS> | seeing a weird performance degradation as the number of simultaneous
CS> | sequential reads increases.

CS> To update zfs-discuss on this: after more investigation, this seems
CS> to be due to file-level prefetching. Turning file-level prefetching
CS> off (following the directions of the ZFS Evil Tuning Guide) returns
CS> NFS server performance to full network bandwidth when there are lots
CS> of simultaneous sequential reads. Unfortunately it significantly
CS> reduces the performance of a single sequential read (when the server
CS> is otherwise idle).

Have you tried disabling vdev caching while leaving file-level prefetching enabled?

--
Best regards,
Robert Milkowski
mailto:milek at task.gda.pl
http://milek.blogspot.com
Chris Siebenmann
2008-May-16 14:45 UTC
[zfs-discuss] Weird performance issue with ZFS with lots of simultaneous reads
| Have you tried disabling vdev caching while leaving file-level
| prefetching enabled?

If you mean setting zfs_vdev_cache_bshift to 13 (per the ZFS Evil Tuning Guide) to turn off device-level prefetching, then yes, I have tried turning off just that; it made no difference. If there's another tunable, then I don't know about it and haven't tried it (and would be pleased to).

	- cks
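For anyone following along, the tunables discussed in this thread go into /etc/system along these lines. This is a sketch based on the ZFS Evil Tuning Guide of that era; zfs_prefetch_disable is, as far as I know, the name of the file-level prefetch switch, and the values should be double-checked against your release:

  * /etc/system sketch -- the knobs discussed in this thread
  * (lines beginning with '*' are comments in /etc/system)

  * turn off file-level prefetching entirely
  set zfs:zfs_prefetch_disable = 1

  * shrink vdev-cache (device-level) read inflation from 64KB (2^16)
  * to 8KB (2^13), i.e. effectively turn off device-level prefetching
  set zfs:zfs_vdev_cache_bshift = 13

  * or disable the vdev cache outright, as suggested earlier in the thread
  set zfs:zfs_vdev_cache_max = 1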