I currently have a btrfs filesystem that I am unmounting, and it has been "unmounting" for the last 20 minutes.

I'm pretty sure I know exactly what is going on, and in my current situation it's not a huge issue, but it would be a problem if this were a production system and I was trying to do maintenance.

Here is how I got into this situation: I am migrating my data from one pair of disks (mirrored with btrfs) to another pair of disks. I rsync'd my data from the original btrfs filesystem to the other. When it completed, my new filesystem showed 165GB used. The original showed 1.8TB used. I came to the conclusion that it must be the daily snapshots I keep that were using the majority of the space, and because I was going to destroy the filesystem anyway, I decided, what the heck, let me destroy the snapshots and see what it looks like.

To my surprise, removing all the snapshots only dropped the usage from 1.8TB to 1.7TB. I re-ran my rsync; it completed without transferring any new data. I then did a du -s in the mountpoint for the original filesystem, and it reported back 165GB, which agrees with what rsync and df on the new filesystem report.

My first thought was that I must have some sort of bizarre corruption on the original filesystem. And then I went to unmount it, and the unmount still has not returned.

What I now suspect is going on is that while deleting the snapshots was quick, it probably kicks off a background thread which actually does the heavy lifting. I noticed a btrfs-cleaner process that was in an io-wait state, which I presumed was the process in question. However, now 40 minutes later, my unmount is still hung and the btrfs-cleaner process is sleeping, so perhaps I am wrong.

At this point I am going to power-cycle my system, but I figured I would check and see if anyone else knew for certain whether this is the type of behavior one would expect to see when removing large snapshots and then immediately trying to unmount the filesystem.
If so, it seems like this is something that would need to change before someone would seriously consider using btrfs w/ snapshots in a production environment. I know btrfs is not considered production-ready yet (well, at least not by the developers, regardless of what Oracle and SUSE say). At the same time, I've not been able to find any mention of similar problems, so I figured it was worth mentioning.

--
Michael Johnson - MJ
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
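The usage comparison described above boils down to a pair of commands; a sketch, with /mnt/old and /mnt/new standing in as hypothetical mount points for the original and the freshly rsync'd filesystem:

```shell
# /mnt/old and /mnt/new are hypothetical mount points.
for m in /mnt/old /mnt/new; do
  if [ -d "$m" ]; then
    # du walks the visible files and reports apparent usage:
    # the ~165GB of live data, regardless of snapshots.
    du -s "$m"
    # df asks the filesystem itself how much space is allocated,
    # which is where the 1.8TB figure (snapshot extents included,
    # and extents not yet reclaimed by the cleaner) came from.
    df -h "$m"
  else
    echo "$m not mounted here; commands are illustrative"
  fi
done
```

The mismatch between the two is expected on btrfs: du cannot see space held by snapshot extents or by deletions still pending in the background.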
On Sun, 10 Mar 2013 22:31:08 -0700
Michael Johnson - MJ <mj@revmj.com> wrote:

> What I now suspect is going on is that while deleting the snapshots
> was quick, that probably kicks off a background thread which actually
> does the heavy lifting.

Exactly that; the snapshot deletion only "syncs" on unmount, and there is no other way to ensure it is complete. If you have some patience and let it unmount properly and then remount it, you may find that you have gained much more free space, due to all the snapshots being actually deleted and the space they were occupying being freed only just now.

--
With respect,
Roman
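In practice this means the delete command returns long before the space is reclaimed. A sketch, assuming a hypothetical mount point /mnt/pool and snapshot name; note that btrfs-progs later grew a `btrfs subvolume sync` subcommand to wait for the cleaner, which was not available at the time of this thread:

```shell
# /mnt/pool and the snapshot name are hypothetical.
if command -v btrfs >/dev/null 2>&1 && [ -d /mnt/pool/.snapshots ]; then
  # Returns almost immediately; the real work of freeing extents is
  # deferred to the in-kernel btrfs-cleaner thread.
  btrfs subvolume delete /mnt/pool/.snapshots/daily-2013-03-10

  # Later btrfs-progs can block until the cleaner has actually
  # processed all deleted subvolumes:
  btrfs subvolume sync /mnt/pool
else
  echo "btrfs-progs or /mnt/pool not available; commands are illustrative"
fi
```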
On Mon, Mar 11, 2013 at 12:11:43PM +0600, Roman Mamedov wrote:
> On Sun, 10 Mar 2013 22:31:08 -0700
> Michael Johnson - MJ <mj@revmj.com> wrote:
>
> > What I now suspect is going on is that while deleting the snapshots
> > was quick, that probably kicks off a background thread which actually
> > does the heavy lifting.
>
> Exactly that, the snapshot deletion only "syncs" on unmount, there is no
> other way to ensure it is complete.
>
> If you have some patience and let it unmount properly and then remount it, you
> may find that you have gained much more free space, due to all the snapshots
> being actually deleted and the space they were occupying freed only just now.

A recent commit (fa6ac8765c48a06dfed914e8c8c3a903f9d313a0, "Btrfs: fix cleaner thread not working with inode cache option") may improve the situation. You may want to try it.

thanks,
liubo
On 11/03/2013 07:47, Liu Bo wrote:
> A recent commit (commit fa6ac8765c48a06dfed914e8c8c3a903f9d313a0
> Btrfs: fix cleaner thread not working with inode cache option)
> may improve the situation.

Hi Liu,

I have never seen this issue with btrfs-cleaner not working; when I delete snapshots it typically kicks in a few seconds later and works "until done". Does the bug you mention affect only specific kernel versions?

AFAIK I use inode_cache (it's not in my fstab, but I mounted my FSes using it manually, and I believe it's a "persistent" option? - I may possibly be wrong...)

TIA, kind regards.

--
Swâmi Petaramesh <swami@petaramesh.org> http://petaramesh.org PGP 9076E32E
Don't bother looking: I am not on Facebook.
On Sun, Mar 10, 2013 at 10:31:08PM -0700, Michael Johnson - MJ wrote:
> I currently have a btrfs filesystem that I am unmounting and it has
> been "unmounting" for the last 20 minutes.
>
> I'm pretty sure I know exactly what is going on and in my current
> situation it's not a huge issue, but it would be a problem if this
> was a production system and I was trying to do maintenance.
>
> Here is how I got into this situation:
>
> I am migrating my data from one pair of disks (mirrored with btrfs) to
> another pair of disks.  I rsync'd my data from the original btrfs file
> system to the other.  When it completed, my new filesystem showed
> 165GB used.  The original showed 1.8TB used.  I came to the conclusion
> that it must be the daily snapshots I have that were using the
> majority of the space and because I was going to destroy the
> filesystem, I decided, what the heck, let me destroy the snapshots and
> see what it looks like.
>
> To my surprise, removing all the snapshots resulted in the usage
> dropping from 1.8TB to 1.7TB.  I re-ran my rsync; it completed without
> transferring any new data.  I then did a du -s in the mountpoint for
> the original filesystem and it reported back 165GB, which agrees with
> what rsync and df on the new filesystem report.
>
> My first thought was that I must have some sort of bizarre corruption
> on the original filesystem.  And then I went to unmount it and it
> still has not returned.
>
> What I now suspect is going on is that while deleting the snapshots
> was quick, that probably kicks off a background thread which actually
> does the heavy lifting.  I noticed a btrfs-cleaner process that was in
> an io wait state, which I presumed was the process in question.
> However, now 40 minutes later, my unmount is still hung and the
> btrfs-cleaner process is sleeping, so perhaps I am wrong.

You're right: umount will wake up the cleaner kthread to do the 'real work' of cleaning up snapshots/subvolumes marked for deletion.
But while btrfs-cleaner is sleeping, could you please show what unmount is waiting for? Maybe 'cat /proc/xxxx/stack' will be helpful in figuring out why.

thanks,
liubo

> At this point I am going to powercycle my system, but I figured I
> would check and see if anyone else knew for certain if this was the
> type of behavior one would expect to see when removing large snapshots
> and then immediately trying to unmount the filesystem.  If so, it
> seems like this is something that would need to change before someone
> would want to seriously consider using btrfs w/ snapshots in a
> production environment.  I know btrfs is not considered production
> ready yet (well, at least not by the developers, regardless of what
> Oracle and Suse say).  At the same time, I've not been able to find
> any mention of similar problems, so I figured it was worth mentioning.
>
> --
> Michael Johnson - MJ
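The suggested stack dump can be scripted; a sketch that looks the PIDs up with pgrep instead of the literal xxxx placeholder (reading /proc/<pid>/stack requires root):

```shell
# Dump the kernel stack of a blocked umount, if one is running.
pid=$(pgrep -x umount || true)
if [ -n "$pid" ]; then
  cat "/proc/$pid/stack" || echo "need root to read kernel stacks"
else
  echo "no umount process currently running"
fi

# Do the same for any btrfs-cleaner kthreads, to see whether the
# cleaner is doing I/O or genuinely idle.
for p in $(pgrep -x btrfs-cleaner || true); do
  echo "== btrfs-cleaner pid $p =="
  cat "/proc/$p/stack" || echo "need root to read kernel stacks"
done
```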
On Mon, Mar 11, 2013 at 11:20:15AM +0100, Swâmi Petaramesh wrote:
> On 11/03/2013 07:47, Liu Bo wrote:
> > A recent commit (commit fa6ac8765c48a06dfed914e8c8c3a903f9d313a0
> > Btrfs: fix cleaner thread not working with inode cache option)
> > may improve the situation.
>
> Hi Liu,
>
> I have never seen this issue with btrfs-cleaner not working, when I
> delete snapshots it typically kicks in a few seconds later and works
> "until done".

The 'not working' is a little confusing, sorry. It means that the cleaner thread does not do its work in time. When we delete a snapshot/subvolume, we a) invalidate all of the inodes that belong to it, and then b) add it to a list for the cleaner thread to do the real work, once the last inode is destroyed from memory.

What the commit fixes is that the inode-cache inode would remain in memory, which kept the snapshot/subvolume from being added to the cleanup list. This resulted in the situation where our space is not freed as we wish.

So, back to the thread: if you notice that even the cleaner thread does not get you free space after you've deleted the snapshot/subvolume, there are probably some inodes of that snapshot/subvolume remaining in memory.

> Does the bug you mention affect only specific kernel versions?

Every kernel since we have had the inode cache.

> AFAIK I use inode_cache (it's not in my fstab but I mounted my FSes
> using it manually, and I believe it's a "persistent" option? - I may
> possibly be wrong...)

It's only in effect when you mount with it; it helps you reuse inode ids.

thanks,
liubo
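Given the mechanism described above, on an affected kernel one plausible workaround (an assumption, not something stated in the thread) is to evict cached inodes so the deleted subvolume can finally reach the cleanup list; /mnt/pool is a hypothetical mount point and both commands need root:

```shell
# inode_cache is only in effect for mounts that pass the option
# explicitly; it is not remembered on disk.  /mnt/pool is hypothetical.
if [ "$(id -u)" -eq 0 ]; then
  mount -o remount,inode_cache /mnt/pool || echo "remount failed (hypothetical mount point)"

  # Ask the kernel to evict reclaimable dentries and inodes (value 2).
  # If a deleted subvolume was pinned only by an in-memory inode-cache
  # inode, this may let btrfs-cleaner finally queue it for cleanup.
  echo 2 > /proc/sys/vm/drop_caches || echo "could not write drop_caches"
else
  echo "run as root to remount and drop caches"
fi
```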
On Sun, Mar 10, 2013 at 10:31:08PM -0700, Michael Johnson - MJ wrote:
> I currently have a btrfs filesystem that I am unmounting and it has
> been "unmounting" for the last 20 minutes.
>
> I'm pretty sure I know exactly what is going on and in my current
> situation it's not a huge issue, but it would be a problem if this
> was a production system and I was trying to do maintenance.
>
> Here is how I got into this situation:
>
> What I now suspect is going on is that while deleting the snapshots
> was quick, that probably kicks off a background thread which actually
> does the heavy lifting.  I noticed a btrfs-cleaner process that was in
> an io wait state, which I presumed was the process in question.
> However, now 40 minutes later, my unmount is still hung and the
> btrfs-cleaner process is sleeping, so perhaps I am wrong.

The umount being blocked by the cleaner is known, and I now have a patch ready to improve that:

http://thread.gmane.org/gmane.comp.file-systems.btrfs/23212

With the patch, the cleaner does not insist on finishing the background work for all deleted snapshots, and it is able to return in the middle of processing the current one when the fs is going down.

There's another umount blocker, when a huge orphan file is being cleaned up, but at first look it also seems possible to exit early there if umount is detected.

david