Hi,

I'm running SunOS Release 5.10 Version Generic_118855-36 64-bit, and in /etc/system I put:

  set zfs:zfs_nocacheflush = 1

And after rebooting, I get the message:

  sorry, variable 'zfs_nocacheflush' is not defined in the 'zfs' module

So is this variable not available in the Solaris kernel?

I'm getting really poor write performance with ZFS on a RAID5 volume (5 disks) from a StorageTek 6140 array. I've searched the web and these forums and it seems that this zfs_nocacheflush option is the solution, but I'm open to others as well.

Thanks,
Grant
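A quick way to check whether the tunable exists in the kernel that is actually running, independent of the boot-time message, is to ask mdb for the symbol. This is only a sketch; the exact error text can differ between releases:

  # echo 'zfs_nocacheflush/D' | mdb -k

If the zfs module defines the variable, mdb prints its current value; if not, it fails with something like "mdb: failed to dereference symbol: unknown symbol name".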
On Thu, May 24, 2007 at 11:55:58AM -0700, Grant Kelly wrote:
> I'm running SunOS Release 5.10 Version Generic_118855-36 64-bit
> and in /etc/system I put:
>
> set zfs:zfs_nocacheflush = 1
>
> And after rebooting, I get the message:
>
> sorry, variable 'zfs_nocacheflush' is not defined in the 'zfs' module
>
> So is this variable not available in the Solaris kernel?

I think zfs:zfs_nocacheflush is only available in Nevada.

> I'm getting really poor write performance with ZFS on a RAID5 volume
> (5 disks) from a StorageTek 6140 array. I've searched the web and
> these forums and it seems that this zfs_nocacheflush option is the
> solution, but I'm open to others as well.

What type of poor performance? Is it because of ZFS? You can test this
by creating a RAID-5 volume on the 6140, creating a UFS file system on
it, and then comparing performance with what you get against ZFS.

It would also be worthwhile doing something like the following to
determine the max throughput the H/W RAID is giving you:

  # time dd of=<raw disk> if=/dev/zero bs=1048576 count=1000

For a 2Gbps 6140 with 300GB/10K drives, we get ~46MB/s on a
single-drive RAID-0 array, ~83MB/s on a 4-disk RAID-0 array w/128k
stripe, and ~69MB/s on a seven-disk RAID-5 array w/128k stripe.

--
albert chin (china at thewrittenword.com)
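For the UFS side of that comparison, a minimal sketch might look like the following; the device name c2t0d0s0 is only a placeholder for whatever LUN the 6140 presents, and newfs will destroy anything already on it:

  # newfs /dev/rdsk/c2t0d0s0
  # mount /dev/dsk/c2t0d0s0 /mnt
  # time dd of=/mnt/testfile if=/dev/zero bs=1048576 count=1000
  # umount /mnt

Writing through the file system rather than to the raw device shows how much overhead UFS itself adds on top of the raw RAID-5 numbers.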
Albert Chin wrote:
> On Thu, May 24, 2007 at 11:55:58AM -0700, Grant Kelly wrote:
>
>> I'm getting really poor write performance with ZFS on a RAID5 volume
>> (5 disks) from a StorageTek 6140 array. I've searched the web and
>> these forums and it seems that this zfs_nocacheflush option is the
>> solution, but I'm open to others as well.
>
> What type of poor performance? Is it because of ZFS? You can test this
> by creating a RAID-5 volume on the 6140, creating a UFS file system on
> it, and then comparing performance with what you get against ZFS.

If it's ZFS then you might want to check into modifying the 6540 NVRAM
as mentioned in this thread:

http://mail.opensolaris.org/pipermail/zfs-discuss/2006-December/024194.html

There is a fix that doesn't involve modifying the NVRAM in the works. (I
don't have an estimate.)
On Fri, May 25, 2007 at 12:14:45AM -0400, Torrey McMahon wrote:
> Albert Chin wrote:
> > On Thu, May 24, 2007 at 11:55:58AM -0700, Grant Kelly wrote:
> >
> >> I'm getting really poor write performance with ZFS on a RAID5 volume
> >> (5 disks) from a StorageTek 6140 array. I've searched the web and
> >> these forums and it seems that this zfs_nocacheflush option is the
> >> solution, but I'm open to others as well.
> >
> > What type of poor performance? Is it because of ZFS? You can test this
> > by creating a RAID-5 volume on the 6140, creating a UFS file system on
> > it, and then comparing performance with what you get against ZFS.
>
> If it's ZFS then you might want to check into modifying the 6540 NVRAM
> as mentioned in this thread:
>
> http://mail.opensolaris.org/pipermail/zfs-discuss/2006-December/024194.html
>
> There is a fix that doesn't involve modifying the NVRAM in the works. (I
> don't have an estimate.)

The above URL helps only if you have Santricity.

--
albert chin (china at thewrittenword.com)
I'm using:

  set zfs:zil_disable = 1

On my SE6130 with ZFS, accessed over NFS, write performance almost
doubled. Since you have BBC (battery-backed cache), why not just set
that?

-Andy


On 5/24/07 4:16 PM, "Albert Chin"
<opensolaris-zfs-discuss at mlists.thewrittenword.com> wrote:

> On Thu, May 24, 2007 at 11:55:58AM -0700, Grant Kelly wrote:
>> I'm running SunOS Release 5.10 Version Generic_118855-36 64-bit
>> and in /etc/system I put:
>>
>> set zfs:zfs_nocacheflush = 1
>>
>> And after rebooting, I get the message:
>>
>> sorry, variable 'zfs_nocacheflush' is not defined in the 'zfs' module
>>
>> So is this variable not available in the Solaris kernel?
>
> I think zfs:zfs_nocacheflush is only available in Nevada.
>
>> I'm getting really poor write performance with ZFS on a RAID5 volume
>> (5 disks) from a StorageTek 6140 array. I've searched the web and
>> these forums and it seems that this zfs_nocacheflush option is the
>> solution, but I'm open to others as well.
>
> What type of poor performance? Is it because of ZFS? You can test this
> by creating a RAID-5 volume on the 6140, creating a UFS file system on
> it, and then comparing performance with what you get against ZFS.
>
> It would also be worthwhile doing something like the following to
> determine the max throughput the H/W RAID is giving you:
>   # time dd of=<raw disk> if=/dev/zero bs=1048576 count=1000
> For a 2Gbps 6140 with 300GB/10K drives, we get ~46MB/s on a
> single-drive RAID-0 array, ~83MB/s on a 4-disk RAID-0 array w/128k
> stripe, and ~69MB/s on a seven-disk RAID-5 array w/128k stripe.

--
On Fri, May 25, 2007 at 12:01:45PM -0400, Andy Lubel wrote:
> I'm using:
>
>   set zfs:zil_disable = 1
>
> On my SE6130 with ZFS, accessed over NFS, write performance almost
> doubled. Since you have BBC (battery-backed cache), why not just set
> that?

I don't think having battery-backed cache is enough to justify
zil_disable=1. Besides, I don't know anyone from Sun recommending
zil_disable=1. Whether your storage array has battery-backed cache
doesn't matter here. What matters is what happens when the ZIL isn't
flushed and your file server crashes: the ZFS file system is still
consistent, but you lose recent synchronous writes that applications
had already been told were committed. Even having your file server on a
UPS won't help here.

http://blogs.sun.com/erickustarz/entry/zil_disable discusses some of
the issues affecting zil_disable=1.

We know we get better performance with zil_disable=1, but we're not
taking any chances.

> -Andy
>
> On 5/24/07 4:16 PM, "Albert Chin"
> <opensolaris-zfs-discuss at mlists.thewrittenword.com> wrote:
>
> > On Thu, May 24, 2007 at 11:55:58AM -0700, Grant Kelly wrote:
> >> I'm running SunOS Release 5.10 Version Generic_118855-36 64-bit
> >> and in /etc/system I put:
> >>
> >> set zfs:zfs_nocacheflush = 1
> >>
> >> And after rebooting, I get the message:
> >>
> >> sorry, variable 'zfs_nocacheflush' is not defined in the 'zfs' module
> >>
> >> So is this variable not available in the Solaris kernel?
> >
> > I think zfs:zfs_nocacheflush is only available in Nevada.
> >
> >> I'm getting really poor write performance with ZFS on a RAID5 volume
> >> (5 disks) from a StorageTek 6140 array. I've searched the web and
> >> these forums and it seems that this zfs_nocacheflush option is the
> >> solution, but I'm open to others as well.
> >
> > What type of poor performance? Is it because of ZFS? You can test this
> > by creating a RAID-5 volume on the 6140, creating a UFS file system on
> > it, and then comparing performance with what you get against ZFS.
> >
> > It would also be worthwhile doing something like the following to
> > determine the max throughput the H/W RAID is giving you:
> >   # time dd of=<raw disk> if=/dev/zero bs=1048576 count=1000
> > For a 2Gbps 6140 with 300GB/10K drives, we get ~46MB/s on a
> > single-drive RAID-0 array, ~83MB/s on a 4-disk RAID-0 array w/128k
> > stripe, and ~69MB/s on a seven-disk RAID-5 array w/128k stripe.
> --

--
albert chin (china at thewrittenword.com)
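For reference, zil_disable is usually set the same way as other ZFS module tunables. This is a sketch only; the tunable is unsupported, and it reportedly only takes effect for file systems (re)mounted after the change. In /etc/system, applied at the next reboot:

  set zfs:zil_disable = 1

or at runtime with mdb, which reverts at the next reboot:

  # echo zil_disable/W1 | mdb -kw
  # echo zil_disable/D | mdb -k     (check the current value)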
> It would also be worthwhile doing something like the following to
> determine the max throughput the H/W RAID is giving you:
>   # time dd of=<raw disk> if=/dev/zero bs=1048576 count=1000
> For a 2Gbps 6140 with 300GB/10K drives, we get ~46MB/s on a
> single-drive RAID-0 array, ~83MB/s on a 4-disk RAID-0 array w/128k
> stripe, and ~69MB/s on a seven-disk RAID-5 array w/128k stripe.
>
> --
> albert chin (china at thewrittenword.com)

Well, the Solaris kernel is telling me that it doesn't understand
zfs_nocacheflush, but the array sure is acting like it!

I ran the dd example, but increased the count for a longer running time.

5-disk RAID5 with UFS: ~79 MB/s
5-disk RAID5 with ZFS: ~470 MB/s

I'm assuming there's some caching going on with ZFS that's really
helping out?

Also, no Santricity, just Sun's Common Array Manager. Is it possible to
use both without completely confusing the array?
On Fri, May 25, 2007 at 09:54:04AM -0700, Grant Kelly wrote:
> > It would also be worthwhile doing something like the following to
> > determine the max throughput the H/W RAID is giving you:
> >   # time dd of=<raw disk> if=/dev/zero bs=1048576 count=1000
> > For a 2Gbps 6140 with 300GB/10K drives, we get ~46MB/s on a
> > single-drive RAID-0 array, ~83MB/s on a 4-disk RAID-0 array w/128k
> > stripe, and ~69MB/s on a seven-disk RAID-5 array w/128k stripe.
>
> Well, the Solaris kernel is telling me that it doesn't understand
> zfs_nocacheflush, but the array sure is acting like it!
> I ran the dd example, but increased the count for a longer running time.

I don't think a longer running time is going to give you a more
accurate measurement.

> 5-disk RAID5 with UFS: ~79 MB/s

What about against a raw RAID-5 device?

> 5-disk RAID5 with ZFS: ~470 MB/s

I don't think you want to use if=/dev/zero on ZFS. There's probably
some optimization going on. Better to use /dev/urandom, or to
concatenate n-many files made up of random bits.

> I'm assuming there's some caching going on with ZFS that's really
> helping out?

Yes.

> Also, no Santricity, just Sun's Common Array Manager. Is it possible
> to use both without completely confusing the array?

I think both are ok. CAM is free. Dunno about Santricity.

--
albert chin (china at thewrittenword.com)
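One way to avoid measuring the speed of /dev/urandom itself (which is far slower than the array) is to generate a random file once and then time copying it onto the ZFS file system; /tank/fs below is just a placeholder mount point:

  # dd if=/dev/urandom of=/var/tmp/random.1g bs=1048576 count=1024
  # time dd if=/var/tmp/random.1g of=/tank/fs/testfile bs=1048576

The first dd is slow but only has to run once; the second dd is the actual measurement.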
Albert Chin wrote:
> I don't think you want to use if=/dev/zero on ZFS. There's probably
> some optimization going on. Better to use /dev/urandom, or to
> concatenate n-many files made up of random bits.

Unless you have turned on compression, that is not the case. By default
there is no optimization for all zeros.

--matt
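To confirm which case applies, the compression property can be checked (and enabled, if wanted) per dataset; tank/fs below is just a placeholder dataset name:

  # zfs get compression tank/fs
  # zfs set compression=on tank/fs    (only affects blocks written afterwards)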
Robert Milkowski    2007-May-27 12:57 UTC    [zfs-discuss] Re: No zfs_nocacheflush in Solaris 10?
Hello Grant,

Friday, May 25, 2007, 6:54:04 PM, you wrote:

>> It would also be worthwhile doing something like the following to
>> determine the max throughput the H/W RAID is giving you:
>>   # time dd of=<raw disk> if=/dev/zero bs=1048576 count=1000
>> For a 2Gbps 6140 with 300GB/10K drives, we get ~46MB/s on a
>> single-drive RAID-0 array, ~83MB/s on a 4-disk RAID-0 array w/128k
>> stripe, and ~69MB/s on a seven-disk RAID-5 array w/128k stripe.
>>
>> --
>> albert chin (china at thewrittenword.com)

GK> Well, the Solaris kernel is telling me that it doesn't understand
GK> zfs_nocacheflush, but the array sure is acting like it!
GK> I ran the dd example, but increased the count for a longer running time.

GK> 5-disk RAID5 with UFS: ~79 MB/s
GK> 5-disk RAID5 with ZFS: ~470 MB/s

GK> I'm assuming there's some caching going on with ZFS that's really
GK> helping out?

How did you measure the performance?

When you set up RAID-5 on the array and then put ZFS on top of it, it's
possible to get much better performance than with UFS for some
workloads, because ZFS will 'convert' most write I/Os into sequential
writes and will bundle most of the I/Os from the last 5 seconds - which
means the array will do mostly full-stripe writes.

However, in your case, with only a single dd command, that shouldn't be
the explanation - I guess you have plenty of RAM and you are really
measuring how long dd runs rather than how fast the data reaches disk.
Better to use iostat to see what your actual performance is in both
cases.

--
Best regards,
 Robert                          mailto:rmilkowski at task.gda.pl
                                 http://milek.blogspot.com
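A minimal way to watch the actual I/O rates while a test is running, rather than timing dd alone (the pool name and interval below are placeholders):

  # zpool iostat -v tank 5      (per-pool and per-vdev throughput, every 5s)
  # iostat -xnz 5               (per-device throughput and service times)

For the UFS run only the second command applies, since no pool is involved.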