hi all,

currently having trouble with sustained write performance with my setup...

ms server 2003 / ms iscsi initiator 2.08 w/ intel e1000g nic, directly connected to snv_101 w/ intel e1000g nic.

basically, given enough time, the sustained write behavior is perfectly periodic. if i copy a large file to the iscsi target, iostat reports 10 seconds or so of -no- writes to disk, just small reads... then 2-3 seconds of disk-maxed writes, during which time windows reports the write performance dropping to zero (disk queues maxed).

so iostat will report something like this for each of my zpool disks (with iostat -xtc 1):

1s:  %b 0
2s:  %b 0
3s:  %b 0
4s:  %b 0
5s:  %b 0
6s:  %b 0
7s:  %b 0
8s:  %b 0
9s:  %b 0
10s: %b 0
11s: %b 100
12s: %b 100
13s: %b 100
14s: %b 0
15s: %b 0

it looks like solaris hangs out caching the writes and not actually committing them to disk... when the cache gets flushed, the iscsitgt (or whatever) just stops accepting writes.

this is happening across controllers and zpools. also, a test copy of a 10gb file from one zpool to another (not iscsi) yielded similar iostat results: 10 seconds of big reads from the source zpool, 2-3 seconds of big writes to the target zpool (the target zpool is 5x bigger than the source zpool).

anyone got any ideas? point me in the right direction?

thanks,

milosz
--
This message posted from opensolaris.org
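for reference, the same cadence can also be watched at the pool level rather than per disk; a minimal sketch, with the pool name "tank" standing in for the real pool:

    # pool-wide bandwidth and ops, sampled once per second; with the behavior
    # described above, the writes should show up as short periodic bursts
    zpool iostat tank 1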
Rob at Logan.com
2008-Dec-09 01:19 UTC
[zfs-discuss] zfs & iscsi sustained write performance
> (with iostat -xtc 1)

it sure would be nice to know if actv > 0, so we would know if the lun was busy because its queue is full or just slow (svc_t > 200).

for tracking errors, try `iostat -xcen 1` and `iostat -E`.

Rob
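for example, a minimal sketch of what these report (standard Solaris iostat flags; the exact output layout can vary a little between releases):

    # extended per-device stats with error counters and descriptive (cXtYdZ) names, once per second
    iostat -xcen 1

    # cumulative soft/hard/transport error counts per device since boot
    iostat -E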
On Mon, Dec 8, 2008 at 3:09 PM, milosz <mewash at gmail.com> wrote:
> basically, given enough time, the sustained write behavior is
> perfectly periodic. if i copy a large file to the iscsi target,
> iostat reports 10 seconds or so of -no- writes to disk, just small
> reads... then 2-3 seconds of disk-maxed writes, during which time
> windows reports the write performance dropping to zero (disk queues
> maxed).

Are you running compression? I see this behavior with heavy loads and GZIP compression enabled.

What does 'zfs get compression' say?

--
Brent Jones
brent at servuhome.net
compression is off across the board.

svc_t is only maxed during the periods of heavy write activity (2-3 seconds every 10 or so seconds)... otherwise the disks are basically idling.
--
This message posted from opensolaris.org
Bob Friesenhahn
2008-Dec-09 02:37 UTC
[zfs-discuss] zfs & iscsi sustained write performance
On Mon, 8 Dec 2008, milosz wrote:
> compression is off across the board.
>
> svc_t is only maxed during the periods of heavy write activity (2-3
> seconds every 10 or so seconds)... otherwise disks are basically
> idling.

Check for some hardware anomaly which might impact disks 11, 12, and 13 but not the other disks. For example, perhaps they share a cable, share the same controller, or there is some other common point which is slow or producing recoverable errors.

Bob
======================================
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
my apologies... 11s, 12s, and 13s in my earlier message represent the number of seconds into a read/write period, not disk numbers. so, 11 seconds into a period, %b suddenly jumps to 100 after having been 0 for the first 10.
--
This message posted from opensolaris.org
Roch Bourbonnais
2009-Jan-03 15:24 UTC
[zfs-discuss] zfs & iscsi sustained write performance
On Dec 9, 2008, at 03:16, Brent Jones wrote:
> On Mon, Dec 8, 2008 at 3:09 PM, milosz <mewash at gmail.com> wrote:
>> basically, given enough time, the sustained write behavior is
>> perfectly periodic. if i copy a large file to the iscsi target,
>> iostat reports 10 seconds or so of -no- writes to disk, just small
>> reads... then 2-3 seconds of disk-maxed writes, during which time
>> windows reports the write performance dropping to zero (disk queues
>> maxed).

This looks consistent with being limited by network factors. The disks are idling while the next ZFS transaction group is being formed.

What is less clear is why the windows write performance drops to zero. One possible explanation is that during the write bursts the small reads are being starved, preventing progress on the Initiator side.

-r
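one way to see the transaction-group rhythm Roch describes is to timestamp each pool sync as it starts; a rough sketch using the fbt provider (spa_sync is an internal ZFS function, so this probe is an unstable interface and the name may differ between builds):

    # print the wall-clock time every time a txg sync begins
    dtrace -n 'fbt::spa_sync:entry { printf("%Y\n", walltimestamp); }'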
> What is less clear is why windows write performance drops to zero.

Perhaps the tweak for Nagle's Algorithm in Windows would be in order?

http://blogs.sun.com/constantin/entry/x4500_solaris_zfs_iscsi_perfect
--
This message posted from opensolaris.org
thanks for your responses, guys...

the nagle's tweak is the first thing i did, actually.

not sure what the network limiting factors could be here... there's no switch, jumbo frames are on... maybe it's the e1000g driver? it's been wonky since build 94 or so. even during the write bursts i'm only getting 60% of gigabit on average.
--
This message posted from opensolaris.org
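a couple of quick checks on the solaris side might narrow down whether e1000g itself is the bottleneck; a sketch, where the instance name e1000g0 is a guess for whichever interface carries the iscsi traffic:

    # per-second interface packet/error/collision counts
    netstat -i -I e1000g0 1

    # raw driver kstats (look for rx/tx error, no-buffer, and interrupt counters), once per second
    kstat -m e1000g 1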
Roch Bourbonnais
2009-Jan-12 14:09 UTC
[zfs-discuss] zfs & iscsi sustained write performance
On Jan 4, 2009, at 21:09, milosz wrote:
> thanks for your responses, guys...
>
> the nagle's tweak is the first thing i did, actually.
>
> not sure what the network limiting factors could be here... there's
> no switch, jumbo frames are on... maybe it's the e1000g driver?
> it's been wonky since build 94 or so. even during the write bursts
> i'm only getting 60% of gigabit on average.

How about the tcp window size (particularly tcp_recv_hiwat on the receive side), and whether or not some CPU is saturated (particularly the interrupt CPU on the receive side; check with mpstat 1)?

There is also some magic incantation to allow a bigger transfer size in iscsi (blaise should have the details).

Can you verify the single-connection throughput using one of iperf, uperf, or netperf?

-r
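for reference, a sketch of checking and raising the receive window on the receiving box (the 400000-byte value is only an example, not a recommendation, and changes made with ndd do not persist across reboots):

    # current default tcp receive buffer (bytes)
    ndd -get /dev/tcp tcp_recv_hiwat

    # raise it for new connections, e.g. to roughly 400 KB
    ndd -set /dev/tcp tcp_recv_hiwat 400000

    # watch per-cpu interrupt and syscall load while a copy is running
    mpstat 1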
Roch Bourbonnais wrote:
> There is also some magic incantation to allow a bigger transfer size
> in iscsi (blaise should have the details).

For Solaris, the value can be set on either the iSCSI target or initiator, replacing 65536 (64K) with a value of one's choosing.

[ target ]
    iscsitadm modify target --maxrecv 65536 <target-IQN>

[ initiator ]
    iscsiadm modify target-param -p maxrecvdataseglen=65536 <target-IQN>

Jim
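a small follow-up that may be useful: on a Solaris initiator you can check the per-target parameters that are actually in effect with something like the following (the IQN is a placeholder):

    # lists the target parameters, including the max receive data segment length
    iscsiadm list target-param -v <target-IQN>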
iperf test coming out fine, actually...

iperf -s -w 64k

iperf -c -w 64k -t 900 -i 5

[ ID] Interval           Transfer     Bandwidth
[  5] 0.0-899.9 sec      81.1 GBytes  774 Mbits/sec

totally steady. i could probably implement some tweaks to improve it, but if i were getting a steady 77% of gigabit i'd be very happy.

not seeing any cpu saturation with mpstat... nothing unusual other than low activity while zfs commits writes to disk (ostensibly this is when the transfer rate troughs)...
--
This message posted from opensolaris.org
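to see whether the single stream is window-limited, a couple of variants worth trying (the target address is a placeholder; -w sets the socket buffer, -P runs parallel streams):

    iperf -c <target-ip> -w 256k -t 60 -i 5
    iperf -c <target-ip> -w 64k -P 4 -t 60 -i 5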
milosz writes:
> iperf test coming out fine, actually...
>
> [ ID] Interval           Transfer     Bandwidth
> [  5] 0.0-899.9 sec      81.1 GBytes  774 Mbits/sec
>
> totally steady. i could probably implement some tweaks to improve it,
> but if i were getting a steady 77% of gigabit i'd be very happy.

So you're trying to get from 60% to 77%. IIRC you had some small amount of reads going on. If you can find out where those come from and eliminate them, that could help.

Did we cover maxrecvdataseglen also? I've seen this help throughput using the solaris initiator:

    iscsiadm list target | grep ^Target | awk '{print $2}' | while read x ; do
        iscsiadm modify target-param -p maxrecvdataseglen=65536 $x
    done

-r
sorry, that 60% statement was misleading... i will VERY OCCASIONALLY get a spike to 60%, but i'm averaging more like 15%, with the throughput often dropping to zero for several seconds at a time.

that iperf test more or less demonstrates it isn't a network problem, no?

also, i have been using the microsoft iscsi initiator... i will try doing a solaris-solaris test later.
--
This message posted from opensolaris.org
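in case it saves someone a lookup, a rough sketch of wiring up a second solaris box as the initiator for that test (the target address, IQN, and resulting disk device are placeholders; the raw-device dd is destructive, so only point it at a scratch LUN):

    # on the initiator: enable sendtargets discovery against the target box
    iscsiadm add discovery-address <target-ip>:3260
    iscsiadm modify discovery --sendtargets enable

    # create device nodes for the discovered LUNs
    devfsadm -i iscsi

    # simple sequential-write test against the raw LUN (writes roughly 10 GB, destroying its contents)
    dd if=/dev/zero of=/dev/rdsk/<cXtYdZ>s0 bs=128k count=80000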