We are having a really tough time accepting the performance of the ZFS and NFS interaction. I have tried many different ways to make it work (even zfs set:zil_disable 1) and I'm still nowhere near the performance of a standard NFS-mounted UFS filesystem - it is insanely slow, especially on file rewrites.

We have been combing the message boards, and it looks like there was a lot of talk about this zfs+nfs interaction back in November and before, but since then I have not seen much. It seems the only fix up to that date was to disable the ZIL - is that still the case? Did anyone ever get closure on this?

We are running Solaris 10 (SPARC), latest patched 11/06 release, connecting directly via FC to a 6120 with 2 RAID-5 volumes, serving NFS over a bge (gigabit) interface. I tried raidz, mirror and stripe with no appreciable difference in speed. The clients connecting to this machine are HP-UX 11i and OS X 10.4.9, and they both show the same performance characteristics.

Any insight would be appreciated - we really like ZFS compared to any filesystem we have EVER worked on and don't want to revert if at all possible!

TIA,

Andy Lubel
When you say rewrites, can you give more detail? For example, are you rewriting in 8K chunks, random sizes, etc.? The reason I ask is that ZFS will, by default, use 128K blocks for large files. If you then rewrite a small chunk at a time, ZFS is forced to read 128K, modify the small chunk you're changing, and then write 128K back out. Obviously, this has adverse effects on performance. :)

If your typical workload has a preferred block size that it uses, you might try setting the recordsize property in ZFS to match - that should help.

If you're completely rewriting the file, then I can't imagine why it would be slow. The only thing I can think of is the forced sync that NFS does on file close. But if you set zil_disable in /etc/system and reboot, you shouldn't see poor performance in that case.

Other folks have had good success with NFS/ZFS performance (while others have not). If it's possible, could you characterize your workload in a bit more detail?

--Bill

On Fri, Apr 20, 2007 at 04:07:44PM -0400, Andy Lubel wrote:
> We are having a really tough time accepting the performance with ZFS
> and NFS interaction. I have tried so many different ways trying to
> make it work (even zfs set:zil_disable 1) and I'm still nowhere near
> the performance of using a standard NFS mounted UFS filesystem [...]
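As a minimal sketch of the two suggestions above (assuming an 8K application/NFS write size, and borrowing the pool/dataset name that appears later in the thread - both are illustrative):

    # match the ZFS recordsize to the workload's write size
    # (affects newly created files only)
    zfs set recordsize=8k se6120/rfs-v10
    zfs get recordsize se6120/rfs-v10

    # for testing only, the ZIL tunable goes in /etc/system (reboot needed);
    # it trades NFS sync semantics for speed and is not a production fix
    set zfs:zil_disable = 1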
andy.Lubel at gtsi.com said:
> We have been combing the message boards and it looks like there was a lot of
> talk about this interaction of zfs+nfs back in november and before but since
> i have not seen much. It seems the only fix up to that date was to disable
> zil, is that still the case? Did anyone ever get closure on this?

There's a way to tell your 6120 to ignore ZFS cache flushes, until ZFS learns to do that itself. See:
http://mail.opensolaris.org/pipermail/zfs-discuss/2006-December/024194.html

Regards,

Marion
Marion Hakanson wrote:
> There's a way to tell your 6120 to ignore ZFS cache flushes, until ZFS
> learns to do that itself. See:
> http://mail.opensolaris.org/pipermail/zfs-discuss/2006-December/024194.html

The 6120 isn't the same as a 6130/6140/6540. The instructions referenced above won't work on a T3/T3+/6120/6320.
Yeah, I saw that post about the other arrays, but none for this EOL'd hunk of metal. I have some 6130s, but hopefully by the time they are implemented we will have retired this NFS stuff and stepped into zvol iSCSI targets.

Thanks anyway.. back to the drawing board on how to resolve this!

-Andy

Torrey McMahon wrote:
> The 6120 isn't the same as a 6130/6140/6540. The instructions
> referenced above won't work on a T3/T3+/6120/6320
tmcmahon2 at yahoo.com said:
> The 6120 isn't the same as a 6130/6140/6540. The instructions referenced
> above won't work on a T3/T3+/6120/6320

Sigh. I can't keep up (:-). Thanks for the correction.

Marion
I'm not sure about the workload, but I did configure the volumes with the block size in mind.. it didn't seem to do much. It could be because I'm layering ZFS raid on top of HW raid, and I just don't know the equation to define a smarter blocksize. It seems like if I have 2 arrays striped at 64k each, then 128k would be ideal for my ZFS datasets, but again.. my logic isn't infinite when it comes to this fun stuff ;)

The 6120 has 2 volumes, each with a 64k stripe-unit size. I then raidz'ed the 2 volumes together and tried both 64k and 128k recordsize; I do get a bit of a performance gain on rewrite at 128k.

These are dd tests, by the way.

This one is local, and works just great:

bash-3.00# date ; uname -a
Thu Apr 19 21:11:22 EDT 2007
SunOS yuryaku 5.10 Generic_125100-04 sun4u sparc SUNW,Sun-Fire-V210
bash-3.00# df -k
Filesystem            kbytes    used    avail capacity  Mounted on
...
se6120             697761792       26 666303904     1%  /pool/se6120
se6120/rfs-v10      31457280  9710895  21746384    31%  /pool/se6120/rfs-v10
bash-3.00# time dd if=/dev/zero of=/pool/se6120/rfs-v10/rw-test-1.loo bs=8192 count=131072
131072+0 records in
131072+0 records out

real    0m13.783s
real    0m14.136s
user    0m0.331s
sys     0m9.947s

This one is from an HP-UX 11i system mounted to the V210 listed above:

onyx:/rfs># date ; uname -a
Thu Apr 19 21:15:02 EDT 2007
HP-UX onyx B.11.11 U 9000/800 1196424606 unlimited-user license
onyx:/rfs># bdf
Filesystem          kbytes    used   avail %used Mounted on
...
yuryaku.sol:/pool/se6120/rfs-v10
                  31457280 9710896 21746384   31% /rfs/v10
onyx:/rfs># time dd if=/dev/zero of=/rfs/v10/rw-test-2.loo bs=8192 count=131072
131072+0 records in
131072+0 records out

real    1m2.25s
real    0m29.02s
real    0m50.49s
user    0m0.30s
sys     0m8.16s

My 6120 tidbits of interest:

6120 Release 3.2.6 Mon Feb  5 02:26:22 MST 2007 (xxx.xxx.xxx.xxx)
Copyright (C) 1997-2006 Sun Microsystems, Inc. All Rights Reserved.

daikakuji:/:<1>vol mode
volume    mounted   cache        mirror
v1        yes       writebehind  off
v2        yes       writebehind  off

daikakuji:/:<5>vol list
volume    capacity     raid   data        standby
v1        340.851 GB   5      u1d01-06    u1d07
v2        340.851 GB   5      u1d08-13    u1d14

daikakuji:/:<6>sys list
controller      : 2.5
blocksize       : 64k
cache           : auto
mirror          : auto
mp_support      : none
naca            : off
rd_ahead        : off
recon_rate      : med
sys memsize     : 256 MBytes
cache memsize   : 1024 MBytes
fc_topology     : auto
fc_speed        : 2Gb
disk_scrubber   : on
ondg            : befit

Am I missing something? As for the rewrite test, I will tinker some more and paste the results soonish.

Thanks in advance,

Andy Lubel
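One way to see what the backend is actually doing during runs like the ones above (a sketch; the pool and dataset names are taken from the df output, and the 1-second interval is arbitrary) is to watch the pool while the dd is in flight and to confirm the recordsize that new writes will use:

    # in another window while the NFS dd runs
    zpool iostat -v se6120 1

    # confirm the dataset recordsize in effect
    zfs get recordsize se6120/rfs-v10

If the NFS case shows many small writes interleaved with reads that the local case does not, that points at the per-write commit and read-modify-write behavior discussed earlier in the thread.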
Welcome to the club, Andy...

I tried several times to attract the attention of the community to the dramatic performance degradation (about 3 times) of the NFS/ZFS vs. NFS/UFS combination - without any result: [1] http://www.opensolaris.org/jive/thread.jspa?messageID=98592 , [2] http://www.opensolaris.org/jive/thread.jspa?threadID=24015

Just look at the two graphs in my posting dated August 2006 (http://napobo3.blogspot.com/2006/08/spec-sfs-bencmark-of-zfsufsvxfs.html) to see how bad the situation was; unfortunately, this situation hasn't changed much recently: http://photos1.blogger.com/blogger/7591/428/1600/sfs.1.png

I don't think the storage array is the source of the problems you reported. It's somewhere else...

-- leon
Roch,

isn't there another flag in /etc/system to force ZFS not to send flush requests to NVRAM?

s.

On 4/20/07, Marion Hakanson <hakansom at ohsu.edu> wrote:
> There's a way to tell your 6120 to ignore ZFS cache flushes, until ZFS
> learns to do that itself. See:
>   http://mail.opensolaris.org/pipermail/zfs-discuss/2006-December/024194.html
On Sat, Apr 21, 2007 at 09:05:01AM +0200, Selim Daoud wrote:
> isn't there another flag in /etc/system to force zfs not to send flush
> requests to NVRAM?

I think it's zfs_nocacheflush=1, according to Matthew Ahrens in
http://blogs.digitar.com/jjww/?itemid=44.

--
albert chin (china at thewrittenword.com)
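For reference, that tunable goes in /etc/system just like the zil_disable setting discussed earlier (a sketch; a reboot is required, and as Roch notes below it is only appropriate on arrays with battery-backed NVRAM):

    * tell ZFS not to issue cache-flush requests to the devices
    set zfs:zfs_nocacheflush = 1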
So what you are saying is that if we were using NFS v4, things should be dramatically better?

Do you think this applies to any NFSv4 client, or only Sun's?

Erblichs wrote:
> The benchmark (SFS) page specifies NFSv3,v2 support, so I question
> whether you ran NFSv4. I would expect a major change in performance
> just from NFS version 4 and ZFS. [...]
On Apr 21, 2007, at 9:46 AM, Andy Lubel wrote:
> so what you are saying is that if we were using NFS v4 things
> should be dramatically better?

I certainly don't support this assertion (if it was being made).

NFSv4 does have some advantages from the perspective of enabling more aggressive file data caching; that will enable NFSv4 to outperform NFSv3 in some specific workloads. In general, however, NFSv4 performs similarly to NFSv3.

Spencer
Don't take this as gospel, and someone chime in if I'm off here, but I just saw an ARC case about this issue.... The firmware in the T3 line might already honor the SYNC_NV request. If it doesn't, then we'll have a conf file where you can set the behavior per array. Also, I would think the module or conf file would ship with Sun arrays already listed.

Andy Lubel wrote:
> yeah i saw that post about the other arrays but none for this EOL'd hunk
> of metal. i have some 6130's but hopefully by the time they are implemented
> we will have retired this nfs stuff and stepped into zvol iscsi targets.
Leon Koll,

As a knowledgeable outsider I can say something.

The benchmark (SFS) page specifies NFSv3/v2 support, so I question whether you ran NFSv4. I would expect a major change in performance just from moving to NFS version 4 with ZFS.

The benchmark seems to stress your configuration enough that the latency to service NFS ops increases to the point of non-serviced NFS requests. However, you don't know the byte count per I/O op. Reads are bottlenecked against the rtt of the connection, and writes are normally sub-1K with a later commit. Many ops are probably just file handle verifications, which again are limited by your connection rtt (round trip time). So my initial guess is that the number of NFS threads is somewhat related to the number of non-state (v4 now has state) per-file-handle ops. Thus, if a 64k ZFS block is being modified by 1 byte, COW would require a 64k read, a 1-byte modify, and then allocation and a write of another 64k block. So, for every write op, you COULD be writing a full ZFS block.

This COW philosophy works best with extending delayed writes, etc., where later reads make the trade-off of increased latency for the larger block on a read op versus being able to minimize the number of seeks on the write and read - basically increasing the block size from, say, 8k to 64k. Thus, your read latency goes up just to get the data off the disk while minimizing the number of seeks, and dropping the read-ahead logic for the needed 8k-to-64k file offset.

I do NOT know that "THAT" 4000 IO OPS load would match your maximal load, or that your actual load would never increase past 2000 IO ops. Secondly, jumping from 2000 to 4000 seems to be too big of a jump for your environment; going to 2500 or 3000 might be more appropriate. Lastly, wrt the benchmark, some remnants (NFS and/or ZFS and/or benchmark) seem to remain that have a negative impact.

Finally, my guess is that this NFS load and the benchmark are stressing small partial-block writes, and that is probably one of the worst-case scenarios for ZFS. So my guess is the proper analogy is trying to kill a gnat with a sledgehammer: each write I/O op really needs to be equal to a full-size ZFS block to get the full benefit of ZFS on a per-byte basis.

Mitchell Erblich
Sr Software Engineer
-----------------

Leon Koll wrote:
> I tried several times to attract the attention of the community to the
> dramatic performance degradation (about 3 times) of the NFS/ZFS vs.
> NFS/UFS combination - without any result [...]
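To put rough, illustrative numbers on the partial-block rewrite cost described above (assuming the ZFS default 128K recordsize and the 8K writes used elsewhere in this thread), the worst case per small rewrite is:

    read 128K (old record) + write 128K (new record) = 256K of device I/O
    for 8K of application data, i.e. roughly 32x inflation

In practice the ZIL and transaction-group batching can amortize some of this, but it shows why matching recordsize to the application write size, or doing full-record writes, matters so much for rewrite-heavy NFS workloads.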
Spencer,

Summary: I am not sure that v4 would have a significant advantage over v3 or v2 in all environments. I just believe it can have a significant advantage (with no or minimal drawbacks), and one should use it if at all possible, to verify that it is not the bottleneck.

So, no, I cannot say that NFSv3 has the same performance as v4. At its worst I don't believe v4 performs below v3, and at best it performs up to 2x or more better than v3.

The assumptions are:
- v4 is being actively worked on,
- v3 is stable but no major changes are being done on it,
- leases,
- better data caching (delegations and client callbacks),
- stateful behaviour,
- compound NFS requests (procs) to remove the sequential rtt of individual NFS requests,
- significantly improved lookups for pathing (multi-lookup) and later attr requests - I am sure the attr calls are/were a significant percentage of NFS ops,
- etc.

(I am not telling Spencer anything here that he doesn't already know.)

So, with the compound procs in v4, the increased latencies of some ops might show a different congestion-type behaviour: it scales better under more environments and lets the I/O bandwidth become more of the limiting factor. So yes, my assumption is that NFSv4 has a good possibility of significantly outperforming v3. Either way, I know of no degradation in any op when moving to v4.

So, again, if we are tuning a setup, I would rather see what ZFS does with v4, knowing that a few performance holes were closed or nearly closed versus v3. I don't think this is specific to Sun; it would apply to all NFSv4 environments. (Even when the public NFSv4 paper was written, SFS support was stated as not yet done.)

LASTLY, I would also be interested in the actual timing of the different TCP segments - to see whether acks are constantly in the pipeline between the dst and src, or whether "slow-start restart" behaviour is occurring. It is also possible that with delayed acks at the dst, the number of acks is reduced, which reduces the bandwidth (I/O ops) of subsequent data bursts. Also, is Allman's ABC being used in the TCP implementation?

Mitchell Erblich
----------------

Spencer Shepler wrote:
> NFSv4 does have some advantages from the perspective of enabling
> more aggressive file data caching; that will enable NFSv4 to
> outperform NFSv3 in some specific workloads. In general, however,
> NFSv4 performs similarly to NFSv3.
Albert Chin writes:
> On Sat, Apr 21, 2007 at 09:05:01AM +0200, Selim Daoud wrote:
> > isn't there another flag in /etc/system to force zfs not to send flush
> > requests to NVRAM?
>
> I think it's zfs_nocacheflush=1, according to Matthew Ahrens in
> http://blogs.digitar.com/jjww/?itemid=44.

Correct. So one might use this bit while waiting for the complete SYNC_NV solution. However, setting zfs_nocacheflush=1 opens a small possibility of pool corruption because it bypasses the cache flushes around uberblock updates. So it is definitely not something to use on non-NVRAM storage.

I think it's really best to find out how to disable the flushing at the storage-array level, which is more in line with what the proper SYNC_NV fix does.

-r
Leon Koll writes:
> I don't think the storage array is a source of the problems you reported.
> It's somewhere else...

Why do you say this?

My reading is that almost all NFS/ZFS complaints are either comparing NFS performance against direct attach, comparing UFS vs ZFS on disk with the write cache enabled, or complaining about ZFS running on storage with NVRAM. Your complaint is the one exception: SFS being worse with a ZFS backend than with, say, UFS or VxFS.

My points being: NFS cannot match direct attach for some loads - that is a fact we can't get around. Enabling the write cache is not a valid way to run NFS over UFS. And for ZFS on NVRAM storage, we need to make sure the storage does not flush its cache in response to ZFS requests.

As for SFS over ZFS, it is being investigated by others within Sun. I believe we have stuff in the pipe to make ZFS match or exceed UFS on small server-level loads. So I think your complaint is being heard.

I personally find it incredibly hard to do performance engineering around SFS, so my perspective is that improving the SFS numbers will more likely come from finding ZFS/NFS performance deficiencies on simpler benchmarks.

-r
What I'm saying is that ZFS doesn't play nice with NFS in any of the scenarios I could think of:

- A single second disk in a V210 (Sun 72GB), write cache on and off: roughly 1/3 the performance of UFS when writing files with dd over an NFS mount of the same disk.

- Two RAID-5 volumes of 6 spindles each on a StorEdge 6120 with battery-backed cache, taking ~53 seconds to write 1 GB over an NFS-mounted ZFS stripe, raidz or mirror, with zil_disable'd and write cache off/on. In some testing dd would even seem to 'hang'. When any volslice is formatted UFS and served to the same NFS client, it's ~17 seconds!

We are likely going to try iSCSI instead; the problem behavior doesn't show up there. At some point, though, we would like to use ZFS-based NFS mounts for things.. the current difference in performance just scares us!

-Andy

Roch - PAE wrote:
> My reading is that almost all NFS/ZFS complaints are either complaining
> about NFS performance vs direct attach, comparing UFS vs ZFS on disk with
> write cache enabled, or complaining about ZFS running on storage with
> NVRAM. [...]
On Apr 23, 2007, at 10:56 AM, Andy Lubel wrote:
> What I'm saying is ZFS doesn't play nice with NFS in all the
> scenarios I could think of:
>
> -Single second disk in a v210 (sun72g) write cache on and off =
> ~1/3 the performance of UFS when writing files using dd over an NFS
> mount using the same disk.

If the write cache is enabled under UFS, then it's not a fair comparison, as UFS is open to corruption in that case. ZFS with the write cache enabled vs. UFS with the write cache disabled is a fair comparison. ZFS will enable the write cache by default if it owns the whole disk - something to watch out for when doing successive tests, say doing UFS after ZFS, as the cache will be enabled without you explicitly doing it.

> -2 raid 5 volumes composing of 6 spindles each taking ~53 seconds
> to write 1gb over a NFS mounted zfs stripe, raidz or mirror of a
> storedge 6120 array with bbc, zil_disable'd and write cache off/on.
> In some testing dd would even seem to 'hang'. When any volslice is
> formatted UFS with the same NFS client - its ~17 seconds!

Can you show the output of 'zpool status' for ZFS and the corresponding SVM/UFS setup?

eric
Hello, Roch

<...>
> Then SFS over ZFS is being investigated by others within Sun.
> I believe we have stuff in the pipe to make ZFS match or exceed UFS
> on small server level loads. So I think your complaint is being heard.

You're the first one who has said this, and I am glad I'm being heard.

> I personally find it always incredibly hard to do performance engineering
> around SFS. So my perspective is that improving the SFS numbers will more
> likely come from finding ZFS/NFS performance deficiencies on simpler
> benchmarks.

There is a new version of SPEC SFS in its beta phase (with NFSv4 and CIFS support), available to SPEC members only. I am very interested to see its results on ZFS. Is there anybody from the "others within Sun" who has tested it?

Thanks,

-- leon
I am pretty sure the T3/6120/6320 firmware does not support the SYNCHRONIZE_CACHE commands. Off the top of my head, I do not know whether that triggers any change in behavior on the Solaris side...

The firmware does support the use of the FUA bit, which would potentially lead to similar flushing behavior...

I will try to check in my infinite spare time...

-Joel
I think you have a problem with pool fragmentation. We had the same problem, and changing the recordsize helped. You have to set a smaller recordsize for the pool (all filesystems must use the same recordsize or a smaller one).

First check whether you have problems finding free blocks, with this dtrace script:

#!/usr/sbin/dtrace -s

fbt::space_map_alloc:entry
{
        self->s = arg1;
}

fbt::space_map_alloc:return
/arg1 != -1/
{
        self->s = 0;
}

fbt::space_map_alloc:return
/self->s && (arg1 == -1)/
{
        @s = quantize(self->s);
        self->s = 0;
}

tick-10s
{
        printa(@s);
}
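A quick note on running the script above (the file name is just an example): save it as, say, spacemap.d, make it executable, and leave it running while the NFS write load is applied:

    chmod +x spacemap.d
    ./spacemap.d

Every 10 seconds it prints a histogram of the allocation sizes for which space_map_alloc failed to find space (returned -1). If large sizes (e.g. 128K) keep showing up while smaller ones succeed, that suggests the free space is too fragmented for full-size records, which is when lowering the recordsize as described above tends to help.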
OK... got a break from the 25xx release... Trying to catch up, so sorry for the late response...

The 6120 firmware does not support the cache sync command at all. You could try using a smaller blocksize setting on the array to reduce the number of read/modify/writes you will incur.

It is also important to understand how ZFS attempts to make aligned transactions, since a single 128k write that starts at the beginning of a RAID stripe is guaranteed to do a full-stripe write vs. 2 read/modify/write stripes.

I have considered making an unsupported firmware that turns it into a caching JBOD... I just have not had any "infinite spare time"....

-Joel