In a thread elsewhere, trying to analyse why the zfs auto-snapshot cleanup code was cleaning up more aggressively than expected, I discovered some interesting properties of a zvol.

http://mail.opensolaris.org/pipermail/zfs-auto-snapshot/2010-January/000232.html

The zvol is not thin-provisioned. The entire volume has been written to (it was dd'd off a physical disk), and:

    volsize    = refreservation
    referenced = usedbydataset = (volsize + a little overhead)

This is as expected. Not expected is that:

    usedbyrefreservation = refreservation

I would expect this to be 0, since all the reserved space has been allocated. As a result, used is over twice the size of the volume (plus a few small snapshots as well).

I think others may have seen similar problems; it may be the root cause behind several other complaints that time-slider-cleanup deleted snapshots to free up space when the pool still had plenty free.

A quick followup test shows that usedbyrefreservation behaves as expected for a new test zvol:

http://mail.opensolaris.org/pipermail/zfs-auto-snapshot/2010-January/000233.html

So apparently it may be a problem picked up along the upgrade path through many zpool version upgrades. The pool, and the zvol, would first have been created on b111 or shortly after. The pool has been used with both xvm kernels, and native kernels running virtualbox, in that time.

Who can help me figure out what's going on with the older zvol? Any useful zdb info I can dump out? I could "fix" it by copying and replacing the zvol, gaining compression and dedup in the process, but before I do I don't want to destroy what may be useful debug info. I'll check later whether the send|recv snapshots of this zvol on my backup server show similar problems, but I doubt they will.

-- Dan.
On 01/27/10 21:17, Daniel Carosone wrote:
> This is as expected. Not expected is that:
>
>     usedbyrefreservation = refreservation
>
> I would expect this to be 0, since all the reserved space has been
> allocated.

This would be the case if the volume had no snapshots.

> As a result, used is over twice the size of the volume (plus
> a few small snapshots as well).

I'm seeing essentially the same thing with a recently-created zvol with snapshots, which I export via iscsi for time machine backups on a mac.

    % zfs list -r -o name,refer,used,usedbyrefreservation,refreservation,volsize z/tm/mcgarrett
    NAME            REFER   USED  USEDREFRESERV  REFRESERV  VOLSIZE
    z/tm/mcgarrett  26.7G  88.2G            60G        60G      60G

The actual volume footprint is a bit less than half of the volume size, but the refreservation ensures that there is enough free space in the pool to allow me to overwrite every block of the zvol with uncompressable data without any writes failing due to the pool being out of space.

If you were to disable time-based snapshots and then overwrite a measurable fraction of the zvol, I'd expect USEDBYREFRESERVATION to shrink as the reserved blocks were actually used.

If you want to allow for overcommit, you need to delete the refreservation.

- Bill
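The accounting Bill describes can be sketched as a toy model. This is illustrative arithmetic only, not the actual ZFS implementation; sizes are in GiB and metadata overhead is ignored. The idea: blocks unique to the live dataset already "consume" their share of the refreservation, while blocks shared with snapshots would need fresh copy-on-write space if overwritten, so they still count against it.

```python
# Toy model of the refreservation accounting described in this thread.
# Illustrative only -- NOT the real ZFS code, just the arithmetic.

def usedbyrefreservation(refreservation, referenced, shared_with_snapshots):
    """Space the refreservation must still hold back so that every block
    of the volume can be overwritten.  Blocks unique to the live dataset
    already count against the reservation; blocks shared with snapshots
    would need fresh CoW space, so they remain reserved."""
    unique = referenced - shared_with_snapshots
    return max(0.0, refreservation - unique)

# Bill's zvol: 60G refreservation, 26.7G referenced, essentially all of it
# shared with the time-machine snapshots -> the full 60G stays reserved.
print(usedbyrefreservation(60.0, 26.7, 26.7))   # 60.0

# A fully-written zvol with no snapshots: everything referenced is unique
# to the dataset, so usedbyrefreservation drops to 0.
print(usedbyrefreservation(60.0, 60.0, 0.0))    # 0.0
```

This matches both observations in the thread: Dan's fresh test zvol shows usedbyrefreservation of 0, while any zvol whose blocks are pinned by snapshots shows it near the full refreservation.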
On Wed, Jan 27, 2010 at 09:57:08PM -0800, Bill Sommerfeld wrote:

Hi Bill! :-)

> On 01/27/10 21:17, Daniel Carosone wrote:
>> This is as expected. Not expected is that:
>>
>>     usedbyrefreservation = refreservation
>>
>> I would expect this to be 0, since all the reserved space has been
>> allocated.
>
> This would be the case if the volume had no snapshots.

Hmm....

> The actual volume footprint is a bit less than half of the volume
> size, but the refreservation ensures that there is enough free space
> in the pool to allow me to overwrite every block of the zvol with
> uncompressable data without any writes failing due to the pool being
> out of space.

Hmm.... this is new (to me) and undescribed (in the manpage) behaviour, but it does explain what I observed. In other words, usedbyrefreservation includes blocks currently shared with snapshots, representing a reservation for potential future CoW of those blocks.

Does this happen for filesystems, or only volumes? I hope it's both, and just more commonly encountered with volumes because refreservation is more commonly used there.

> If you were to disable time-based snapshots and then overwrite a
> measurable fraction of the zvol, I'd expect "USEDBYREFRESERVATION"
> to shrink as the reserved blocks were actually used.

Right. If I repeat my quick test with snapshots, then when the first snapshot is taken I would see usedbyrefreservation jump back up to the full size of the volume; at that point the whole volume is shared with the snapshot. As data is overwritten, the space for the retained copy would be added to usedbysnapshots, and the space that's now unique to the dataset would come off usedbyrefreservation, with the used total staying constant - until another snapshot is taken.

I'll do that for my own interest, but it now makes perfect sense and is quite reasonable. The trouble is that the documentation doesn't point to this, so it's surprising and unexpected.
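The snapshot/overwrite cycle Dan describes can be walked through with the same kind of toy arithmetic. Again, this is only a sketch of the accounting, not ZFS itself; sizes are in GiB, a 10G fully-written zvol is assumed, and metadata overhead is ignored.

```python
# Toy walk-through of the repeat test described above (sizes in GiB).
# Illustrative accounting only, not the real ZFS implementation.
volsize = 10.0
refreservation = volsize

def properties(unique, snap_retained):
    """unique: space referenced only by the live dataset;
    snap_retained: old block copies kept alive only by snapshots."""
    usedbyrefreserv = max(0.0, refreservation - unique)
    usedbydataset = volsize                      # zvol is fully written
    used = usedbydataset + snap_retained + usedbyrefreserv
    return (usedbyrefreserv, snap_retained, used)

# Fully written, no snapshots: every block is unique to the dataset.
print(properties(unique=10.0, snap_retained=0.0))   # (0.0, 0.0, 10.0)

# Take a snapshot: all blocks are now shared with the snapshot, so
# usedbyrefreservation jumps back up to the full volume size.
print(properties(unique=0.0, snap_retained=0.0))    # (10.0, 0.0, 20.0)

# Overwrite 4G: that space becomes unique again and comes off
# usedbyrefreservation; the retained old copies move to usedbysnapshots.
# Note the 'used' total stays constant, as predicted above.
print(properties(unique=4.0, snap_retained=4.0))    # (6.0, 4.0, 20.0)
```

The constant `used` total between snapshots is the key point: taking a snapshot of a refreserved zvol effectively charges the pool for a second full copy up front.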
There's text in the description of the refreservation property about "snapshots will only be allowed if there's enough free space". What needs to be made clear is that this is achieved through the behaviour of usedbyrefreservation - partly by adding text to the description of that property (that it includes space shared with snapshots), and partly by improving the wording about "free space" here. I'll see if I can knock together some better wording later.

> If you want to allow for overcommit, you need to delete the refreservation.

Of course. I just wasn't thinking of taking a snapshot as having this cost, though of course it does.

-- Dan.
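For reference, deleting the refreservation to allow overcommit is a one-line change. The dataset name below is hypothetical, and the commands need a live pool and appropriate privileges; this is a sketch, not a recommendation for any particular setup.

```shell
# Allow overcommit (thin provisioning) by removing the refreservation.
# "tank/myvol" is a placeholder dataset name.
zfs set refreservation=none tank/myvol

# Confirm the reservation (and the space it held back) is gone:
zfs get refreservation,usedbyrefreservation tank/myvol
```

With the reservation removed, writes to the zvol can fail with ENOSPC if the pool fills, which is the trade-off the thread is discussing.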