Hi,

I've got a large file (about 20 GB) stored in a ZFS dataset. I snapshotted the dataset and then rsynced a newer, updated version of the same file over it. The new version is slightly larger, but most of its data is unchanged.

Given how rsync works, the data actually sent should be much less than the size of the file. However, I noticed that the space used by the dataset plus snapshot was the combined size of the older and newer files. I found this a bit puzzling, because I expected the space used by dataset plus snapshot to grow only by the size of the changed blocks.

I guess ZFS has no way of knowing which small deltas rsync uses to reconstruct the updated file, so it writes every part to disk. But do you think it would be possible to create a version of rsync that works more closely with ZFS, so that only the changes are written to disk?

Comments/insight would be appreciated!

--
This message posted from opensolaris.org
On Sat, 17 Apr 2010, G. Ander wrote:
>
> According to how rsync works, the actual sent data should be much
> less than the size of the file(s), however, I noticed that the used
> space of the dataset+snapshot was the combined size of the older and
> newer file.

The increase in space consumed is because rsync copied to a new temporary file in order to ensure that there is no corruption.

> But do you think it would be possible to create a version of rsync
> that worked more closely with ZFS in such a way that only the
> changes could be written to disk?

Use these rsync options to achieve the desired behavior:

   --inplace --no-whole-file

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
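To make the difference concrete: when the new version is written to a temporary file and renamed over the old one, every block of the new file is freshly allocated, while the snapshot keeps holding all the old blocks. When the file is updated in place, only the blocks that are actually rewritten need new space on a copy-on-write filesystem. The Python sketch below illustrates that idea only. It is not rsync's delta algorithm (which uses rolling checksums), and the function name `update_in_place` and the 128 KiB default block size are assumptions chosen for illustration.

```python
import os


def update_in_place(path, new_data, block_size=128 * 1024):
    """Rewrite only the blocks of `path` that differ from `new_data`.

    Hypothetical helper for illustration. Returns the number of blocks
    that were actually written.
    """
    written = 0
    # Open for read/write without truncating; create the file if missing.
    mode = "r+b" if os.path.exists(path) else "w+b"
    with open(path, mode) as f:
        for offset in range(0, len(new_data), block_size):
            new_block = new_data[offset:offset + block_size]
            f.seek(offset)
            old_block = f.read(len(new_block))
            if old_block != new_block:
                # Only differing (or brand-new) blocks are written.
                f.seek(offset)
                f.write(new_block)
                written += 1
        # Drop any leftover tail if the new version is shorter.
        f.truncate(len(new_data))
    return written
```

Note that this naive fixed-offset comparison rewrites everything after an insertion that shifts the data, so how much is saved depends on how the file changed.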
Bob Friesenhahn wrote:
> On Sat, 17 Apr 2010, G. Ander wrote:
>> But do you think it would be possible to create a version of rsync
>> that worked more closely with ZFS in such a way that only the changes
>> could be written to disk?
>
> Use these rsync options to achieve the desired behavior:
>
>   --inplace --no-whole-file
>
> Bob

Bob's entirely correct, with one additional note: make sure to use rsync 3.0.0 or later ON BOTH ENDS of the copy, as there are some important bugfixes around the --inplace option rolled up by that point.

--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
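One way to check the version requirement on each host is to parse the first line printed by `rsync --version`. The sketch below is a minimal Python example, assuming that line looks like "rsync  version 3.0.7  protocol version 30"; the function names and sample strings are hypothetical, and the output format may differ between builds.

```python
import re


def rsync_version(version_output):
    """Extract (major, minor, patch) from `rsync --version` output.

    Assumes a first line of the form 'rsync  version X.Y.Z ...'.
    """
    match = re.search(r"rsync\s+version\s+v?(\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("could not find an rsync version string")
    return tuple(int(part) for part in match.groups())


def new_enough_for_inplace(version_output, minimum=(3, 0, 0)):
    """True if the reported version is at least `minimum`."""
    return rsync_version(version_output) >= minimum
```

The check would need to pass for the output collected on both the sending and the receiving machine.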