Okay, since I'm finally starting to care (projecting long-term use, so I can set my snapshot taking and retention policy, and estimate disk needs), I need a review of reading snapshot size information.

From the user end, there are three "sizes" that I can imagine caring about for a snapshot:

1) The total amount of space that sending this snapshot to a new filesystem would take. This is the largest possible value for snapshot "size" -- the size of everything that can be accessed through this snapshot.

I believe this is what zfs list reports as "referenced".

2) The amount of space that would be freed up in the pool if this snapshot were to be destroyed. This has the obvious use -- if you're short of space and looking for something to delete, this is the number you need to consider.

I believe this is "usedbydataset".

3) The amount of space needed to represent the difference between this snapshot and the preceding stored state. This should be roughly the size that an incremental ZFS send would be from the preceding state to this state.

(2 and 3 would be the same IF the snapshot in question was the most recent state of the filesystem, nothing changed since then.)

The "used" space (zfs list output) isn't any of these; it's what the snapshot plus all descendants uses. And the "usedby*" give lots of other kinds of detail. "used" is changes in this and later datasets, I guess. So the difference between "used" in adjacent datasets is my #3, at least roughly.

(And I understand that there are timing issues involved in testing expecting to see exact numbers.)

I guess this stuff is decently documented; at least unless I misunderstood a bunch. Let me know if anything is badly wrong!

--
David Dyer-Bennet, dd-b at dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
Cindy spent a lot of time trying to make this understandable in the ZFS Admin Guide. A quick refresher below...

On Mar 1, 2010, at 8:55 AM, David Dyer-Bennet wrote:
> Okay, since I'm finally starting to care (projecting long-term use, so I
> can set my snapshot taking and retention policy, and estimate disk needs),
> I need a review of reading snapshot size information.
>
> From the user end, there are three "sizes" that I can imagine caring about
> for a snapshot:
>
> 1) The total amount of space that sending this snapshot to a new
> filesystem would take. This is the largest possible value for snapshot
> "size" -- the size of everything that can be accessed through this
> snapshot.
>
> I believe this is what zfs list reports as "referenced".
>
> 2) The amount of space that would be freed up in the pool if this
> snapshot were to be destroyed. This has the obvious use -- if you're
> short of space and looking for something to delete, this is the number you
> need to consider.
>
> I believe this is "usedbydataset".
>
> 3) The amount of space needed to represent the difference between this
> snapshot and the preceding stored state. This should be roughly the size
> that an incremental ZFS send would be from the preceding state to this
> state.
>
> (2 and 3 would be the same IF the snapshot in question was the most recent
> state of the filesystem, nothing changed since then.)
>
> The "used" space (zfs list output) isn't any of these; it's what the
> snapshot plus all descendants uses. And the "usedby*" give lots of other
> kinds of detail. "used" is changes in this and later datasets, I guess.
> So the difference between "used" in adjacent datasets is my #3, at least
> roughly.
>
> (And I understand that there are timing issues involved in testing
> expecting to see exact numbers.)
>
> I guess this stuff is decently documented; at least unless I misunderstood
> a bunch. Let me know if anything is badly wrong!

used property = usedbychildren + usedbydataset + usedbyrefreservation + usedbysnapshots

The "zfs list -o space" command will nicely format these for you.

There is also a "USED" column for snapshot listings. This is just the size of the unique data in the snapshot. This can help planning the size of an incremental send.

This is slightly confusing, but it gets worse with dedup...
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
http://nexenta-atlanta.eventbrite.com (March 16-18, 2010)
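For anyone who wants to check the "used" identity above on their own pool, here is a rough Python sketch. It only parses text, so it assumes you have already captured the output of something like "zfs get -Hp -o name,property,value used,usedbychildren,usedbydataset,usedbyrefreservation,usedbysnapshots <dataset>" (tab-separated, exact byte counts). The function names and the dataset name tank/home are made up for illustration.

```python
# Sketch: verify used = usedbychildren + usedbydataset
#                      + usedbyrefreservation + usedbysnapshots
# from captured `zfs get -Hp -o name,property,value ...` output.
# The helper names and the sample dataset are hypothetical.

COMPONENTS = (
    "usedbychildren",
    "usedbydataset",
    "usedbyrefreservation",
    "usedbysnapshots",
)


def parse_zfs_get(text):
    """Return {dataset: {property: bytes}} from tab-separated output."""
    result = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        name, prop, value = fields
        if not value.isdigit():
            continue  # e.g. "-" where a property does not apply
        result.setdefault(name, {})[prop] = int(value)
    return result


def space_breakdown(props):
    """Return (sum of the usedby* components, used minus that sum)."""
    total = sum(props[c] for c in COMPONENTS)
    return total, props["used"] - total


# Made-up numbers standing in for real command output.
SAMPLE = "\n".join([
    "tank/home\tused\t1000",
    "tank/home\tusedbychildren\t100",
    "tank/home\tusedbydataset\t600",
    "tank/home\tusedbyrefreservation\t0",
    "tank/home\tusedbysnapshots\t300",
])

if __name__ == "__main__":
    for dataset, props in parse_zfs_get(SAMPLE).items():
        total, remainder = space_breakdown(props)
        print(dataset, "components:", total, "difference from used:", remainder)
```

On a live system the two figures may not match exactly if the pool is changing while the properties are read, which is the timing issue David mentions.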
Except none of these actually gives you the information about "send size".

Used will give you unique bytes, so in my understanding it would change as you add more snapshots, and especially with dedup. Referenced is more or less equivalent to running du on the .zfs/snapshot directory.

So basically, you can know total or unique bytes, but not the bytes unique since the last snapshot. This is such a basic need that we ended up literally doing a send just to get that basic accounting info.

I would happily be corrected, but as far as I know, there is simply no way to figure out the unique difference between a snapshot and the one before it.

--
This message posted from opensolaris.org
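The "do a send just to count it" workaround described above could look roughly like the Python sketch below. It runs "zfs send -i <previous> <snapshot>" and counts the bytes of the stream without storing it. The function names are invented for this example, and the snapshot names in the usage note are placeholders; the measuring function needs a real pool and sufficient privileges, so treat it as an outline rather than a tested tool.

```python
# Sketch: measure an incremental send by generating the stream and
# counting its bytes. Function names here are hypothetical.
import subprocess


def incremental_send_cmd(prev_snap, snap):
    """Build the argv for an incremental send from prev_snap to snap."""
    return ["zfs", "send", "-i", prev_snap, snap]


def count_bytes(stream, chunk_size=1 << 20):
    """Read a binary stream to the end and return how many bytes it held."""
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
    return total


def measure_incremental_send(prev_snap, snap):
    """Run the send and return the stream size in bytes (data is discarded)."""
    proc = subprocess.Popen(
        incremental_send_cmd(prev_snap, snap), stdout=subprocess.PIPE
    )
    size = count_bytes(proc.stdout)
    if proc.wait() != 0:
        raise RuntimeError("zfs send exited with status %d" % proc.returncode)
    return size
```

Usage would be something like measure_incremental_send("tank/home@monday", "tank/home@tuesday"). Note that this reads every changed block, so it costs about as much I/O as the real send would.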