I currently have a btrfs filesystem that I am unmounting, and it has been "unmounting" for the last 20 minutes.

I'm pretty sure I know exactly what is going on, and in my current situation it's not a huge issue, but it would be a problem if this were a production system and I was trying to do maintenance.

Here is how I got into this situation: I am migrating my data from one pair of disks (mirrored with btrfs) to another pair of disks. I rsync'd my data from the original btrfs filesystem to the other. When it completed, my new filesystem showed 165GB used. The original showed 1.8TB used. I came to the conclusion that it must be the daily snapshots I keep that were using the majority of the space, and because I was going to destroy the filesystem anyway, I decided, what the heck, let me destroy the snapshots and see what it looks like.

To my surprise, removing all the snapshots only dropped the usage from 1.8TB to 1.7TB. I re-ran my rsync; it completed without transferring any new data. I then did a du -s in the mountpoint for the original filesystem, and it reported back 165GB, which agrees with what rsync and df on the new filesystem report.

My first thought was that I must have some sort of bizarre corruption on the original filesystem. And then I went to unmount it, and the unmount still has not returned.

What I now suspect is going on is that while deleting the snapshots was quick, it probably kicks off a background thread which actually does the heavy lifting. I noticed a btrfs-cleaner process that was in an io-wait state, which I presumed was the process in question. However, now 40 minutes later, my unmount is still hung and the btrfs-cleaner process is sleeping, so perhaps I am wrong.

At this point I am going to power-cycle my system, but I figured I would check and see if anyone else knew for certain whether this is the type of behavior one would expect to see when removing large snapshots and then immediately trying to unmount the filesystem.
If so, it seems like this is something that would need to change before someone would seriously consider using btrfs w/ snapshots in a production environment. I know btrfs is not considered production-ready yet (well, at least not by the developers, regardless of what Oracle and SUSE say). At the same time, I've not been able to find any mention of similar problems, so I figured it was worth mentioning.

--
Michael Johnson - MJ
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
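The usage comparison described above boils down to a pair of commands; a sketch, with /mnt/old and /mnt/new standing in as hypothetical mount points for the original and the freshly rsync'd filesystem:

```shell
# /mnt/old and /mnt/new are hypothetical mount points.
for m in /mnt/old /mnt/new; do
  if [ -d "$m" ]; then
    # du walks the visible files and reports apparent usage:
    # the ~165GB of live data, regardless of snapshots.
    du -s "$m"
    # df asks the filesystem itself how much space is allocated,
    # which is where the 1.8TB figure (snapshot extents included,
    # and extents not yet reclaimed by the cleaner) came from.
    df -h "$m"
  else
    echo "$m not mounted here; commands are illustrative"
  fi
done
```

The mismatch between the two is expected on btrfs: du cannot see space held by snapshot extents or by deletions still pending in the background.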
On Sun, 10 Mar 2013 22:31:08 -0700
Michael Johnson - MJ <mj@revmj.com> wrote:

> What I now suspect is going on is that while deleting the snapshots
> was quick, that probably kicks off a background thread which actually
> does the heavy lifting.

Exactly that; the snapshot deletion only "syncs" on unmount, and there is no other way to ensure it is complete. If you have some patience and let it unmount properly and then remount it, you may find that you have gained much more free space, due to all the snapshots being actually deleted and the space they were occupying being freed only just now.

--
With respect,
Roman
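In practice this means the delete command returns long before the space is reclaimed. A sketch, assuming a hypothetical mount point /mnt/pool and snapshot name; note that btrfs-progs later grew a `btrfs subvolume sync` subcommand to wait for the cleaner, which was not available at the time of this thread:

```shell
# /mnt/pool and the snapshot name are hypothetical.
if command -v btrfs >/dev/null 2>&1 && [ -d /mnt/pool/.snapshots ]; then
  # Returns almost immediately; the real work of freeing extents is
  # deferred to the in-kernel btrfs-cleaner thread.
  btrfs subvolume delete /mnt/pool/.snapshots/daily-2013-03-10

  # Later btrfs-progs can block until the cleaner has actually
  # processed all deleted subvolumes:
  btrfs subvolume sync /mnt/pool
else
  echo "btrfs-progs or /mnt/pool not available; commands are illustrative"
fi
```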
On Mon, Mar 11, 2013 at 12:11:43PM +0600, Roman Mamedov wrote:
> On Sun, 10 Mar 2013 22:31:08 -0700
> Michael Johnson - MJ <mj@revmj.com> wrote:
>
> > What I now suspect is going on is that while deleting the snapshots
> > was quick, that probably kicks off a background thread which actually
> > does the heavy lifting.
>
> Exactly that, the snapshot deletion only "syncs" on unmount, there is no
> other way to ensure it is complete.
>
> If you have some patience and let it unmount properly and then remount it, you
> may find that you have gained much more free space, due to all the snapshots
> being actually deleted and the space they were occupying freed only just now.

A recent commit (fa6ac8765c48a06dfed914e8c8c3a903f9d313a0, "Btrfs: fix cleaner thread not working with inode cache option") may improve the situation. You may want to try it.

thanks,
liubo
On 11/03/2013 07:47, Liu Bo wrote:
> A recent commit (commit fa6ac8765c48a06dfed914e8c8c3a903f9d313a0
> Btrfs: fix cleaner thread not working with inode cache option)
> may improve the situation.

Hi Liu,

I have never seen this issue with btrfs-cleaner not working; when I delete snapshots it typically kicks in a few seconds later and works "until done". Does the bug you mention affect only specific kernel versions?

AFAIK I use inode_cache (it's not in my fstab, but I mounted my FSes using it manually, and I believe it's a "persistent" option? - I may possibly be wrong...)

TIA, kind regards.

--
Swâmi Petaramesh <swami@petaramesh.org> http://petaramesh.org PGP 9076E32E
Don't bother looking: I am not on Facebook.
On Sun, Mar 10, 2013 at 10:31:08PM -0700, Michael Johnson - MJ wrote:
> I currently have a btrfs filesystem that I am unmounting and it has
> been "unmounting" for the last 20 minutes.
>
> I'm pretty sure I know exactly what is going on and in my current
> situation it's not a huge issue, but it would be a problem if this
> was a production system and I was trying to do maintenance.
>
> Here is how I got into this situation:
>
> I am migrating my data from one pair of disks (mirrored with btrfs) to
> another pair of disks.  I rsync'd my data from the original btrfs file
> system to the other.  When it completed, my new filesystem showed
> 165GB used.  The original showed 1.8TB used.  I came to the conclusion
> that it must be the daily snapshots I have that were using the
> majority of the space and because I was going to destroy the
> filesystem, I decided, what the heck, let me destroy the snapshots and
> see what it looks like.
>
> To my surprise, removing all the snapshots resulted in the usage
> dropping from 1.8TB to 1.7TB.  I re-ran my rsync; it completed without
> transferring any new data.  I then did a du -s in the mountpoint for
> the original filesystem and it reported back 165GB, which agrees with
> what rsync and df on the new filesystem report.
>
> My first thought was that I must have some sort of bizarre corruption
> on the original filesystem.  And then I went to unmount it and it
> still has not returned.
>
> What I now suspect is going on is that while deleting the snapshots
> was quick, that probably kicks off a background thread which actually
> does the heavy lifting.  I noticed a btrfs-cleaner process that was in
> an io wait state, which I presumed was the process in question.
> However, now 40 minutes later, my unmount is still hung and the
> btrfs-cleaner process is sleeping, so perhaps I am wrong.

You're right: umount will wake up the cleaner kthread to do the 'real work' of cleaning up snapshots/subvolumes marked for deletion.
But while btrfs-cleaner is sleeping, could you please show what unmount is waiting for? Maybe 'cat /proc/xxxx/stack' will be helpful in figuring out why.

thanks,
liubo

> At this point I am going to powercycle my system, but I figured I
> would check and see if anyone else knew for certain if this was the
> type of behavior one would expect to see when removing large snapshots
> and then immediately trying to unmount the filesystem.  If so, it
> seems like this is something that would need to change before someone
> would want to seriously consider using btrfs w/ snapshots in a
> production environment.  I know btrfs is not considered production
> ready yet (well, at least not by the developers, regardless of what
> Oracle and Suse say).  At the same time, I've not been able to find
> any mention of similar problems, so I figured it was worth mentioning.
>
> --
> Michael Johnson - MJ
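The suggested stack dump can be scripted; a sketch that looks the PIDs up with pgrep instead of the literal xxxx placeholder (reading /proc/<pid>/stack requires root):

```shell
# Dump the kernel stack of a blocked umount, if one is running.
pid=$(pgrep -x umount || true)
if [ -n "$pid" ]; then
  cat "/proc/$pid/stack" || echo "need root to read kernel stacks"
else
  echo "no umount process currently running"
fi

# Do the same for any btrfs-cleaner kthreads, to see whether the
# cleaner is doing I/O or genuinely idle.
for p in $(pgrep -x btrfs-cleaner || true); do
  echo "== btrfs-cleaner pid $p =="
  cat "/proc/$p/stack" || echo "need root to read kernel stacks"
done
```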
On Mon, Mar 11, 2013 at 11:20:15AM +0100, Swâmi Petaramesh wrote:
> On 11/03/2013 07:47, Liu Bo wrote:
> > A recent commit (commit fa6ac8765c48a06dfed914e8c8c3a903f9d313a0
> > Btrfs: fix cleaner thread not working with inode cache option)
> > may improve the situation.
>
> Hi Liu,
>
> I have never seen this issue with btrfs-cleaner not working, when I
> delete snapshots it typically kicks in a few seconds later and works
> "until done".

The 'not working' is a little confusing, sorry. It means that the cleaner thread does not do its work in time. When we delete a snapshot/subvolume, we a) invalidate all of the inodes that belong to it, and then b) add it to a list for the cleaner thread to do the real work, once the last inode is destroyed from memory.

What the commit fixes is that the inode-cache inode would remain in memory, which kept the snapshot/subvolume from being added to the cleanup list. This resulted in the situation where our space is not freed as we wish.

So, back to the thread: if you notice that even the cleaner thread does not get you free space after you've deleted the snapshot/subvolume, there are probably some inodes of that snapshot/subvolume remaining in memory.

> Does the bug you mention affect only specific kernel versions?

Every kernel since we have had the inode cache.

> AFAIK I use inode_cache (it's not in my fstab but I mounted my FSes
> using it manually, and I believe it's a "persistent" option? - I may
> possibly be wrong...)

It's only in effect when you mount with it; it helps you reuse inode ids.

thanks,
liubo
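Given the mechanism described above, on an affected kernel one plausible workaround (an assumption, not something stated in the thread) is to evict cached inodes so the deleted subvolume can finally reach the cleanup list; /mnt/pool is a hypothetical mount point and both commands need root:

```shell
# inode_cache is only in effect for mounts that pass the option
# explicitly; it is not remembered on disk.  /mnt/pool is hypothetical.
if [ "$(id -u)" -eq 0 ]; then
  mount -o remount,inode_cache /mnt/pool || echo "remount failed (hypothetical mount point)"

  # Ask the kernel to evict reclaimable dentries and inodes (value 2).
  # If a deleted subvolume was pinned only by an in-memory inode-cache
  # inode, this may let btrfs-cleaner finally queue it for cleanup.
  echo 2 > /proc/sys/vm/drop_caches || echo "could not write drop_caches"
else
  echo "run as root to remount and drop caches"
fi
```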
On Sun, Mar 10, 2013 at 10:31:08PM -0700, Michael Johnson - MJ wrote:
> I currently have a btrfs filesystem that I am unmounting and it has
> been "unmounting" for the last 20 minutes.
>
> I'm pretty sure I know exactly what is going on and in my current
> situation it's not a huge issue, but it would be a problem if this
> was a production system and I was trying to do maintenance.
>
> Here is how I got into this situation:
>
> What I now suspect is going on is that while deleting the snapshots
> was quick, that probably kicks off a background thread which actually
> does the heavy lifting.  I noticed a btrfs-cleaner process that was in
> an io wait state, which I presumed was the process in question.
> However, now 40 minutes later, my unmount is still hung and the
> btrfs-cleaner process is sleeping, so perhaps I am wrong.

The umount being blocked by the cleaner is known, and I now have a patch ready to improve that:

http://thread.gmane.org/gmane.comp.file-systems.btrfs/23212

With the patch, the cleaner does not insist on finishing the background work for all deleted snapshots, and it is able to return in the middle of processing the current one when the fs is going down.

There's another umount blocker, when a huge orphan file is being cleaned up, but at first look it also seems possible to exit early there if umount is detected.

david