Hi.

When deleting a snapshot, I have observed that the disk space used by that snapshot is not immediately released (according to statvfs or df). Neither "sync" nor "btrfs filesystem sync" releases the disk space. The only way I have found to actually fully release the disk space is to issue the sync and then sleep until the statvfs free numbers stop changing.

This is a rather problematic approach to managing disk space. Is there any way to force a wait until the disk space has been released?

My application is automatically managing disk space in the presence of snapshots. I allow the disk (a backup) to fill up with snapshots until it is nearly full, and then delete snapshots until I have a threshold free. However, without the disk space being released promptly and with no way to wait until it is released, the loop can't tell how many snapshots to delete.

--
Bruce Guenter <bruce@untroubled.org> http://untroubled.org/

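For readers who want to see the workaround described above in concrete form, here is a minimal sketch of "sleep until the statvfs free numbers stop changing". It is only an illustration, not code from the original application: the function names, the polling interval, and the "N identical readings in a row" rule are all assumptions, and on a filesystem with other writers the free space may never settle, so the heuristic can give up or return too early.

```python
import os
import time
from typing import Callable, Optional


def free_bytes(path: str) -> int:
    """Free space available to unprivileged users, as statvfs reports it."""
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize


def wait_until_free_space_settles(
    path: str,
    interval: float = 5.0,
    stable_samples: int = 3,
    max_samples: Optional[int] = None,
    sample: Callable[[str], int] = free_bytes,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll free space until it is unchanged for `stable_samples` readings.

    Hypothetical helper: the caller is expected to have deleted the
    snapshot and issued a sync before calling this. Raises TimeoutError
    if `max_samples` readings are taken without the value settling.
    """
    last = sample(path)
    unchanged = 0  # consecutive readings equal to `last`
    taken = 1      # total readings so far
    while unchanged < stable_samples:
        if max_samples is not None and taken >= max_samples:
            raise TimeoutError("free space did not settle")
        sleep(interval)
        current = sample(path)
        taken += 1
        if current == last:
            unchanged += 1
        else:
            # Space is still being released (or something else is
            # writing); start counting stable readings again.
            unchanged = 0
            last = current
    return last
```

A pruning loop would delete one snapshot, call `os.sync()`, call `wait_until_free_space_settles(mount_point)`, and compare the result with its threshold before deciding whether to delete another. The `sample` and `sleep` parameters exist only so the logic can be exercised without a real filesystem.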
On Mon, May 10, 2010 at 12:23:52PM -0600, Bruce Guenter wrote:
> When deleting a snapshot, I have observed that the disk space used by
> that snapshot is not immediately released (according to statvfs or df).
> [...]

The way BTRFS's COW works is that we can't free up space until after a transaction has committed. After the transaction commits (after a sync) we walk the list of pinned extents and free them asynchronously. We could probably make btrfs filesystem sync wait for that part to finish tho. It shouldn't be too hard to do; feel free to take a crack at it.

Thanks,
Josef
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

On Tue, May 11, 2010 at 2:23 AM, Bruce Guenter <bruce@untroubled.org> wrote:
> When deleting a snapshot, I have observed that the disk space used by
> that snapshot is not immediately released (according to statvfs or df).
> [...]

This is because the snapshot deleting ioctl only removes the link. The corresponding tree is dropped in the background by a kernel thread.

We could probably add another ioctl that waits until the tree has been completely dropped.

Yan, Zheng

On Tue, May 11, 2010 at 08:10:38AM +0800, Yan, Zheng wrote:
> This is because the snapshot deleting ioctl only removes the link.

Right, I understand that. That part is not unexpected, as it works just like unlink would. However...

> The corresponding tree is dropped in the background by a kernel thread.

The surprise is that 'sync', in any form I was able to try, does not wait until all or even most of the I/O is completed. Apparently the standards spec for sync(2) says it is not required to wait for I/O to complete, but AFAIK all other Linux FS do wait (the man page for sync(2) implies as much, as does the info page for sync in glibc).

The only way I've found so far to force this behavior is to unmount, and that's rather intrusive to other users of the FS.

> We could probably add another ioctl that waits until the tree has been
> completely dropped.

Since the expected behavior for sync is to wait until all pending I/O has been completed, I would argue this should be the default action for sync. Am I misunderstanding something?

--
Bruce Guenter <bruce@untroubled.org> http://untroubled.org/

On Tue, May 11, 2010 at 11:45 PM, Bruce Guenter <bruce@untroubled.org> wrote:
> Since the expected behavior for sync is to wait until all pending I/O
> has been completed, I would argue this should be the default action for
> sync. Am I misunderstanding something?

Dropping a tree can be lengthy. It's not good to let sync wait for hours. For most Linux FS, 'sync' just forces a transaction/journal commit. I don't think they wait for large operations that can span multiple transactions to complete.

Yan, Zheng

On 12 May 2010 06:02, Yan, Zheng <yanzheng@21cn.com> wrote:
> Dropping a tree can be lengthy. It's not good to let sync wait for hours.
> For most Linux FS, 'sync' just forces a transaction/journal commit. I don't
> think they wait for large operations that can span multiple transactions to
> complete.

Disclaimer: I know nothing about the internals of Btrfs!

I have an analogy as a way of thinking about what deleting a snapshot entails (which I hope isn't totally bogus). Deleting a clone of a file system is not like unlinking a single file. It is analogous to deleting a directory tree. Syncing in the middle of a recursive delete will wait for the in-flight I/O to complete, but it will not wait for the unlink requests from the portion of the directory tree not yet traversed.

The same would be true when the kernel thread deletes the snapshot by recursing through its tree.

Mike

On Wed, May 12, 2010 at 01:02:07PM +0800, Yan, Zheng wrote:
> Dropping a tree can be lengthy. It's not good to let sync wait for hours.
> For most Linux FS, 'sync' just forces a transaction/journal commit. I don't
> think they wait for large operations that can span multiple transactions to
> complete.

What happens to the consistency of the filesystem if a crash happens during this process?

--
Bruce Guenter <bruce@untroubled.org> http://untroubled.org/

On Mon, May 31, 2010 at 12:01 PM, Bruce Guenter <bruce@untroubled.org> wrote:
> What happens to the consistency of the filesystem if a crash happens
> during this process?

There's a good test case for you to try. Let us know what you find.

On Tue, Jun 1, 2010 at 3:01 AM, Bruce Guenter <bruce@untroubled.org> wrote:
> What happens to the consistency of the filesystem if a crash happens
> during this process?

This does not break the consistency of the filesystem. The next mount will find the partially dropped tree and restart the dropping process.

Yan, Zheng