In the "Thoughts on ZFS Pool Backup Strategies" thread it was stated that zfs send sends uncompressed data and uses the ARC.

If "zfs send" sends uncompressed data which has already been compressed, this is not very efficient, and it would be *nice* to see it send the original compressed data (or have an option to do so).

I thought I would ask a true-or-false type question, mainly for curiosity's sake.

If "zfs send" uses the standard ARC cache (when something is not already in the ARC), I would expect this to hurt (to some degree?) the performance of the system. (I.e. I assume it has the effect of replacing current/useful data in the cache with not-very-useful/old data, depending on how large the zfs send is.)

If the above is true, zfs send and a hypothetical "zfs backup" (if the command existed to back up and restore a file or set of files with all ZFS attributes) would improve the performance of normal reads/writes by avoiding the ARC cache (or, if easier to implement, by having its own private ARC cache). Or does it use the same sort of code as setting "primarycache=none" on a file system?

Has anyone monitored ARC hit rates while doing a large zfs send?

Cheers
--
This message posted from opensolaris.org
On Mar 25, 2010, at 6:13 AM, Damon Atkins wrote:
> In the "Thoughts on ZFS Pool Backup Strategies" thread it was stated that
> zfs send sends uncompressed data and uses the ARC.
>
> If "zfs send" sends uncompressed data which has already been compressed,
> this is not very efficient, and it would be *nice* to see it send the
> original compressed data (or have an option to do so).
>
> I thought I would ask a true-or-false type question, mainly for
> curiosity's sake.
>
> If "zfs send" uses the standard ARC cache (when something is not already
> in the ARC), I would expect this to hurt (to some degree?) the performance
> of the system. (I.e. I assume it has the effect of replacing
> current/useful data in the cache with not-very-useful/old data, depending
> on how large the zfs send is.)

If you restrict answers to "true/false" then the answer is false :-)
Actually, the answer is mostly false. The ARC is divided into a most frequently used cache and a most recently used cache. The send data should stick to the most recently used side.

> If the above is true, zfs send and a hypothetical "zfs backup" (if the
> command existed to back up and restore a file or set of files with all ZFS
> attributes) would improve the performance of normal reads/writes by
> avoiding the ARC cache (or, if easier to implement, by having its own
> private ARC cache).

The zio pipeline can, in theory, be tapped between the checksum and decompression stages, but I think you will find that this defeats both piped compression and receive compression.

> Or does it use the same sort of code as setting "primarycache=none" on a
> file system?
>
> Has anyone monitored ARC hit rates while doing a large zfs send?

Yes. I see very good ARC hit rates when I send from a high-transaction system. This is a good thing because recently written data is likely to be in the ARC.
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com
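To see how a large send affects ARC hit rates, the raw counters can be sampled from the arcstats kstat on (Open)Solaris. The helper below is a sketch of my own (the function name and awk approach are not from the thread); it turns `kstat -p` output, which is `module:instance:name:statistic<TAB>value` per line, into an overall hit rate.

```shell
# arc_hit_rate: compute an overall ARC hit percentage from kstat -p lines.
# Hypothetical helper; sample before and after a big zfs send and compare.
arc_hit_rate() {
  awk -F'\t' '
    $1 ~ /arcstats:hits$/   { hits = $2 }
    $1 ~ /arcstats:misses$/ { misses = $2 }
    END { printf "%.1f%%\n", 100 * hits / (hits + misses) }
  '
}

# On a Solaris host (kstat is Solaris-specific, so this line is illustrative):
#   kstat -p zfs:0:arcstats:hits zfs:0:arcstats:misses | arc_hit_rate
```

Nothing here depends on ZFS itself; the function only parses the two counter lines, so it can be checked with canned input.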
Whether it is efficient to send the compressed or uncompressed data depends on a lot of factors. If the data is already in the ARC for some other reason, then it is likely much more efficient to use that, because sending the compressed blocks involves doing I/O to disk; reading the version from the in-memory ARC does not. If the data is in the L2ARC, that is still better than going out to the main pool disks to get the compressed version. Reading from disk is always slower than reading from memory.

Depending on what your working set of data in the ARC is and the size of the dataset you are sending, it is possible that the 'zfs send' will cause data that was in the ARC to be evicted to make room for the blocks that 'zfs send' needs. This is a perfect use case for having a large L2ARC if you can't fit your working set and the blocks for the 'zfs send' into the ARC.

If you are using incremental 'zfs send' streams, the chances of thrashing the ARC are probably reduced, particularly if you do them frequently enough that they aren't too big.

I know people have monitored the ARC hit rates when doing large zfs sends. Using the DTrace Analytics in an SS7000 makes this very easy.

It really comes down to the size of your working set in the ARC, the size of your L2ARC, and your pattern of data access, combined with the volume of data you are 'zfs send'ing.

--
Darren J Moffat
On Thu, Mar 25, 2010 at 04:23:38PM +0000, Darren J Moffat wrote:
> If the data is in the L2ARC that is still better than going out to
> the main pool disks to get the compressed version.

<advocate customer='devil'>
Well, one could just compress it... If you'd otherwise put compression in the ssh pipe (or elsewhere) then you could stop doing that.
</advocate customer='devil'>

Nico
--
> In the "Thoughts on ZFS Pool Backup Strategies" thread it was stated
> that zfs send sends uncompressed data and uses the ARC.
>
> If "zfs send" sends uncompressed data which has already been compressed,
> this is not very efficient, and it would be *nice* to see it send the
> original compressed data (or have an option to do so).

You've got two questions in your post. The one above first ...

It's true that "zfs send" sends uncompressed data. So I've heard; I haven't tested it personally.

I seem to remember there's some work to improve this, but it's not available yet. The uncompressed send was easier to implement, and it is already super-fast compared to all the alternatives.

> I thought I would ask a true-or-false type question, mainly for
> curiosity's sake.
>
> If "zfs send" uses the standard ARC cache (when something is not already
> in the ARC), I would expect this to hurt (to some degree?) the performance
> of the system. (I.e. I assume it has the effect of replacing
> current/useful data in the cache with not-very-useful/old data.)

And this is a separate question. I can't say first-hand what ZFS does, but I have an educated guess. I would say, for every block the "zfs send" needs to read: if the block is in the ARC or L2ARC, then it won't be fetched again from disk. But it is not obliterating the ARC or L2ARC with old data, because it's smart enough to work at a lower level than a user-space process and tell the kernel (or whatever) something like "I'm only reading this block once; don't bother caching it for my sake."
On Fri, March 26, 2010 07:06, Edward Ned Harvey wrote:
>> In the "Thoughts on ZFS Pool Backup Strategies" thread it was stated
>> that zfs send sends uncompressed data and uses the ARC.
>>
>> If "zfs send" sends uncompressed data which has already been compressed,
>> this is not very efficient, and it would be *nice* to see it send the
>> original compressed data (or have an option to do so).
>
> You've got two questions in your post. The one above first ...
>
> It's true that "zfs send" sends uncompressed data. So I've heard; I
> haven't tested it personally.
>
> I seem to remember there's some work to improve this, but it's not
> available yet. The uncompressed send was easier to implement, and it is
> already super-fast compared to all the alternatives.

I don't know that it makes sense to. There are lots of existing filter packages that do compression; so if you want compression, just put them in your pipeline. That way you're not limited by what zfs send has implemented, either. When they implement bzip98 with a new compression-technology breakthrough, you can just use it :-) .

--
David Dyer-Bennet, dd-b at dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
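The pipeline approach above can be sketched as follows. The pool, snapshot, and host names are made-up examples, and the filter could just as well be bzip2 or anything else that reads stdin and writes stdout; zfs send emits the stream on stdout, so any filter slots in.

```shell
# Hedged sketch of compressing a send stream in the pipeline. tank/home,
# the snapshot names, and backuphost are hypothetical:
#
#   zfs send -i tank/home@monday tank/home@tuesday \
#     | gzip -3 \
#     | ssh backuphost 'gunzip | zfs receive -F backup/home'
#
# The filter sees only a byte stream, so the same roundtrip works on any
# stand-in data:
printf 'stand-in for a zfs send stream' | gzip -3 | gunzip
```

The design point is exactly David's: because the interface is a plain pipe, swapping the compressor requires no changes to ZFS at all.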
On Fri, March 26, 2010 09:46, David Dyer-Bennet wrote:
> I don't know that it makes sense to. There are lots of existing filter
> packages that do compression; so if you want compression, just put them in
> your pipeline. That way you're not limited by what zfs send has
> implemented, either. When they implement bzip98 with a new compression
> technology breakthrough, you can just use it :-) .

Actually, a better example may be using parallel implementations of popular algorithms:

http://www.zlib.net/pigz/
http://www.google.com/search?q=parallel+bzip

Given the number of cores we have nowadays (especially on the Niagara-based CPUs), we might as well use them. There are also better algorithms out there (some of which assume parallelism):

http://en.wikipedia.org/wiki/Xz
http://en.wikipedia.org/wiki/7z

If you're using OpenSSH, there are also some third-party patches that may help performance:

http://www.psc.edu/networking/projects/hpn-ssh/

However, if the data is already compressed (and/or deduped), there's no sense in doing it again. If ZFS does have to go to disk, it might as well send the data as-is.
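The last point is easy to demonstrate: compressing already-compressed (effectively incompressible) data does not shrink it, it only adds container overhead. A quick check with gzip on random data (the /tmp file names are arbitrary):

```shell
# Compress random (incompressible) data once, then compress the result
# again, and compare sizes. The second pass should be no smaller than
# the first; gzip just wraps it in more header/trailer bytes.
head -c 65536 /dev/urandom > /tmp/zfs_demo_sample
gzip -c9 /tmp/zfs_demo_sample    > /tmp/zfs_demo_sample.gz
gzip -c9 /tmp/zfs_demo_sample.gz > /tmp/zfs_demo_sample.gz.gz
wc -c /tmp/zfs_demo_sample /tmp/zfs_demo_sample.gz /tmp/zfs_demo_sample.gz.gz
```

The same reasoning applies to a send stream from a compressed or deduped pool: once the bytes are already dense, a second compressor in the pipe is pure overhead.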