Hi experts,

I am new to ZFS and have a question about ZFS sequential performance: I have read some blogs saying that NetApp's WAFL can suffer a "sequential read after random write" (SRARW) performance penalty. Since ZFS also does no update-in-place, can ZFS have the same problem?

Thanks
Victor
--
This message posted from opensolaris.org
On 23/02/2010 09:00, v wrote:
> Hi experts,
> I am new to ZFS and have a question about ZFS sequential performance: I have read some blogs saying that NetApp's WAFL can suffer a "sequential read after random write" (SRARW) performance penalty. Since ZFS also does no update-in-place, can ZFS have the same problem?
>
> Thanks
> Victor

Yes, but on the other hand it usually means much faster random writes.

--
Robert Milkowski
http://milek.blogspot.com
On Feb 23, 2010, at 1:00 AM, v wrote:
> Hi experts,
> I am new to ZFS and have a question about ZFS sequential performance: I have read some blogs saying that NetApp's WAFL can suffer a "sequential read after random write" (SRARW) performance penalty. Since ZFS also does no update-in-place, can ZFS have the same problem?

I know of only one study of this effect, by Allan Packer and Neel using MySQL.
http://www.youtube.com/watch?v=a31NhwzlAxs
slides at
http://blogs.sun.com/realneel/resource/MySQL_Conference_2009_ZFS_MySQL.pdf

However, there is much speculation in the ZFS archives...
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
http://nexenta-atlanta.eventbrite.com (March 16-18, 2010)
On 23/02/2010 17:18, Richard Elling wrote:
> On Feb 23, 2010, at 1:00 AM, v wrote:
>
>> Hi experts,
>> I am new to ZFS and have a question about ZFS sequential performance: I have read some blogs saying that NetApp's WAFL can suffer a "sequential read after random write" (SRARW) performance penalty. Since ZFS also does no update-in-place, can ZFS have the same problem?
>
> I know of only one study of this effect, by Allan Packer and Neel using MySQL.
> http://www.youtube.com/watch?v=a31NhwzlAxs
> slides at
> http://blogs.sun.com/realneel/resource/MySQL_Conference_2009_ZFS_MySQL.pdf

From my own experience with MySQL on top of ZFS on Solaris 10: after a fresh database restore the backup takes some amount of time X; after a couple of weeks of the database being live, the same backup takes 2-3x X.

--
Robert Milkowski
http://milek.blogspot.com
Hi,
Thanks for the reply.
So the sequential read after random write problem does exist in ZFS.
I wonder if it is a real problem, i.e. does it, for example, cause longer backup times, and will it be addressed in the future?

So I should ask another question: is ZFS suitable for an environment that has lots of data changes? I think for random I/O there will be no such performance penalty, but if you back up a ZFS dataset, must the backup utility read the blocks of the dataset sequentially? Is a ZFS dataset suitable for database temporary tablespaces or online redo logs?

Will a defrag utility be implemented?

Regards
Victor
--
This message posted from opensolaris.org
On 24/02/2010 02:21, v wrote:
> Hi,
> Thanks for the reply.
> So the sequential read after random write problem does exist in ZFS.
> I wonder if it is a real problem, i.e. does it, for example, cause longer backup times, and will it be addressed in the future?

Once the famous bp rewriter is integrated and a defrag functionality is built on top of it, you will be able to re-arrange your data so that it is laid out sequentially again.

> So I should ask another question: is ZFS suitable for an environment that has lots of data changes? I think for random I/O there will be no such performance penalty, but if you back up a ZFS dataset, must the backup utility read the blocks of the dataset sequentially? Is a ZFS dataset suitable for database temporary tablespaces or online redo logs?
>
> Will a defrag utility be implemented?

From my own experience it is not an issue in most cases - environments with lots of random writes tend to do random reads as well anyway. Then, when it comes to backups - yes, they might get slower over time, but in most cases the read rate will still be higher than your network bandwidth, so it would not be the bottleneck. Additionally, if your environment is doing lots of random updates, ZFS (or a CoW filesystem in general) should be able to perform them much faster than other filesystems. But of course YMMV.

--
Robert Milkowski
http://milek.blogspot.com
> I wonder if it is a real problem, i.e. does it, for example, cause
> longer backup times, and will it be addressed in the future?

It doesn't cause longer backup times as long as you're doing a "zfs send | zfs receive" (see the example at the end of this message). But it could cause longer backup times if you're using something like tar.

The only way to "solve" it is to eliminate copy-on-write (negating the value of ZFS), or to choose to pay the price during regular operation, resulting in an overall slower system. You can expect it won't be changed or addressed in any way. You can also expect you'll never be able to detect or measure this as a performance problem that you care about. ZFS and copy-on-write are so much faster at other things, such as backups, and add so much value in terms of snapshots and data reliability ...

There is a special case where the performance is lower. I don't mean to disrespect the concerns of anybody who is affected by that special case, but I believe it's uncommon.

> So I should ask another question: is ZFS suitable for an environment
> that has lots of data changes? I think for random I/O there will be no
> such performance penalty, but if you back up a ZFS dataset, must the
> backup utility read the blocks of the dataset sequentially? Is a ZFS
> dataset suitable for database temporary tablespaces or online redo logs?

Yes, ZFS is great for environments that write a lot and do random writes a lot. There is only one situation where the performance is lower, and it's specific:

* You write a large amount of sequential data.
* Then you do a lot of random writes *inside* that large sequential file.
* Then you sequentially read the data back.

Performance is not hurt if you eliminate any one of those points:

* If you did not start by writing a large file in one shot, you won't have a problem.
* If you do lots of random writes, but they're not in the middle of a large sequential file, you won't have a problem.
* If you always read or write that file randomly, you won't have a problem.
* The only time you have a problem is when you sequentially read a large file that previously had many random writes in the middle.

Even in that case, the penalty you pay is usually small enough that you wouldn't notice. But it's possible.
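For reference, the "zfs send | zfs receive" style of backup mentioned above might look something like the sketch below; the pool, dataset, and snapshot names (tank/data, backup/data, @mon, @tue) are made up for illustration:

  # Take a snapshot and stream the whole dataset to a backup pool:
  zfs snapshot tank/data@mon
  zfs send tank/data@mon | zfs receive backup/data

  # Later backups can send only the blocks changed since the previous snapshot:
  zfs snapshot tank/data@tue
  zfs send -i tank/data@mon tank/data@tue | zfs receive backup/data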
> Once the famous bp rewriter is integrated and a defrag functionality is
> built on top of it, you will be able to re-arrange your data so that it
> is laid out sequentially again.

Then again, this would also rearrange your data so it is sequential again:

cp -p somefile somefile.tmp ; mv -f somefile.tmp somefile
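For example, a rough sketch of applying the same trick to every file in a directory (assumes no other process is writing to the files, and that there is enough free space for the temporary copies; /tank/data is just an illustrative path):

  for f in /tank/data/*; do
      # copy the file (rewriting its blocks sequentially), then rename it back over the original
      cp -p "$f" "$f.tmp" && mv -f "$f.tmp" "$f"
  done

Note that the rewritten files no longer share blocks with any existing snapshots, so if snapshots exist this will increase space usage.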