At work we have a backup server system running btrfs. The main backup
pool is a 1.3TB partition (on LVM). Every night, a series of backups
are done with rsync, with each backup onto a separate subvolume, and a
snapshot made of each backup. Compression is used to minimize disk
space used.

When the system was first provisioned, all the operations ran
impressively fast, but now it is almost unusably slow. It has taken 2
days to delete snapshots that recovered 30GB of space, and some of the
backups are taking longer than 24 hours to complete.

Is there any way to monitor the progress of the snapshot deletions?

What could be causing it to run so slow?

What could we do to speed things up?

Kernel is 2.6.39-gentoo-r3. It was last rebooted 2 days ago, and
rebooting does not seem to improve the situation. Mount options
according to /proc are "rw,nosuid,noatime,nodiratime,compress=zlib,noacl"

btrfs filesystem df shows:

  Data: total=1.15TB, used=1.10TB
  System, DUP: total=8.00MB, used=184.00KB
  System: total=4.00MB, used=0.00
  Metadata, DUP: total=49.38GB, used=31.59GB
  Metadata: total=8.00MB, used=0.00

--
Bruce Guenter <bruce@untroubled.org>                http://untroubled.org/
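[Editorial note: one rough way to watch deletion progress, since btrfs at this
vintage reports none directly, is to poll "btrfs filesystem df" and diff the
Data "used" figure as the cleaner frees extents. A sketch, not from the
thread; the /backup mount point and 60-second interval are assumptions:]

```shell
# Sketch: poll 'btrfs filesystem df' and report changes in the Data
# "used" figure while snapshot deletion runs in the background.
# MNT=/backup is a hypothetical mount point; adjust to the real pool.
MNT=${MNT:-/backup}
prev=""
while sleep 60; do
    cur=$(btrfs filesystem df "$MNT" | awk -F'used=' '/^Data/ {print $2}')
    [ -n "$prev" ] && [ "$cur" != "$prev" ] && echo "Data used: $prev -> $cur"
    prev=$cur
done
```

A shrinking "used" figure at least confirms the cleaner is making forward
progress, even if it gives no ETA.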
Excerpts from Bruce Guenter's message of 2011-07-28 16:04:45 -0400:
> When the system was first provisioned, all the operations ran
> impressively fast, but now it is almost unusably slow. It has taken 2
> days to delete snapshots that recovered 30GB of space, and some of the
> backups are taking longer than 24 hours to complete.
> [...]

The slow performance is probably coming from reading in the metadata
associated with the snapshot extents. The new readahead extensions from
Arne should help once we've adapted them to it. The easiest way to make
sure is to hit sysrq-w a few times while it is slow (or run a patched
latencytop).

-chris
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Excerpts from Chris Mason's message of 2011-07-28 16:28:04 -0400:
> The slow performance is probably coming from reading in the metadata
> associated with the snapshot extents. The new readahead extensions from
> Arne should help once we've adapted them to it. The easiest way to make
> sure is to hit sysrq-w a few times while it is slow (or run a patched
> latencytop).

Sorry, hit send too soon. Here is the latencytop patch; after you
recompile, run latencytop -c for a few minutes and send the output here.

http://oss.oracle.com/~mason/latencytop.patch

-chris
On Thu, Jul 28, 2011 at 04:28:04PM -0400, Chris Mason wrote:
> The slow performance is probably coming from reading in the metadata
> associated with the snapshot extents. The new readahead extensions from
> Arne should help once we've adapted them to it. The easiest way to make
> sure is to hit sysrq-w a few times while it is slow (or run a patched
> latencytop).

We shut down the rsync backups for now, and are waiting for the
snapshots to complete deletion so we can unmount the filesystem to
reboot (missing kernel options for latencytop). In the mean time I hit
SysRq-W quite a number of times, and nothing is showing up in kernel
messages. However, btrfs-cleaner and/or btrfs-transaction (sometimes
both, sometimes just one) have been running high in top, taking up all
the CPU time.

--
Bruce Guenter <bruce@untroubled.org>                http://untroubled.org/
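[Editorial note: sysrq-w only dumps tasks in uninterruptible (D) sleep, so it
shows nothing when the worker threads are busy but runnable. Sampling their
kernel stacks directly via /proc can still reveal where the time goes. A
sketch, not from the thread; requires root, and the thread names are taken
from the top output described above:]

```shell
# Sketch: repeatedly sample the kernel stacks of the busy btrfs worker
# threads (run as root). Names match the 'top' output above; note that
# 'btrfs-transacti' is the comm name as truncated to 15 characters.
for name in btrfs-cleaner btrfs-transacti; do
    for pid in $(pgrep "$name"); do
        for i in 1 2 3; do
            echo "=== $name ($pid) sample $i ==="
            cat "/proc/$pid/stack"
            sleep 2
        done
    done
done
```

If the same few frames dominate across samples, that is a good hint at the
hot path even without latencytop.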
On Mon, Aug 1, 2011 at 4:26 PM, Bruce Guenter <bruce@untroubled.org> wrote:
> We shut down the rsync backups for now, and are waiting for the
> snapshots to complete deletion so we can unmount the filesystem to
> reboot (missing kernel options for latencytop). In the mean time I hit
> SysRq-W quite a number of times, and nothing is showing up in kernel
> messages.

The keystroke should be alt-sysrq-w.
On Mon, Aug 01, 2011 at 04:41:10PM -0600, cwillu wrote:
> The keystroke should be alt-sysrq-w.

Right. What I meant is that no backtraces are showing up. I was
actually using

  echo w > /proc/sysrq-trigger

and all I was seeing is:

  SysRq : Show Blocked State
    task                        PC stack   pid father

--
Bruce Guenter <bruce@untroubled.org>                http://untroubled.org/
Excerpts from Bruce Guenter's message of 2011-08-01 18:59:15 -0400:
> Right. What I meant is that no backtraces are showing up. I was
> actually using
>
>   echo w > /proc/sysrq-trigger
>
> and all I was seeing is:
>
>   SysRq : Show Blocked State
>     task                        PC stack   pid father

# once per boot, run dmesg -n8
dmesg -n8
echo w > /proc/sysrq-trigger
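[Editorial note: the two commands above can be wrapped so each blocked-task
dump lands in its own file, ready to mail to the list. A sketch, not from the
thread; requires root, and the /tmp output path is illustrative:]

```shell
# Sketch: raise the console loglevel once per boot, then snapshot the
# tail of the kernel log around a sysrq-w trigger. Run as root; the
# output path and 200-line tail are arbitrary choices.
dmesg -n 8                      # once per boot
echo w > /proc/sysrq-trigger
sleep 1
dmesg | tail -n 200 > "/tmp/sysrq-w-$(date +%Y%m%d-%H%M%S).txt"
```

Raising the loglevel matters because "Show Blocked State" output is emitted
at a level that many distributions suppress on the console by default.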
On Thu, Jul 28, 2011 at 04:51:24PM -0400, Chris Mason wrote:
> Sorry, hit send too soon. Here is the latencytop patch; after you
> recompile, run latencytop -c for a few minutes and send the output here.
>
> http://oss.oracle.com/~mason/latencytop.patch

Ok, I ran it for more than a few minutes. Early on (shortly after
boot), the output shows this for a long time:

===============
Mon Aug  1 21:09:30 2011
Globals: Cause Maximum Percentage

Process details:

Process ksoftirqd/0 (3) Total: 3.9 msec
  [run_ksoftirqd]  3.9 msec  100.0 %
    run_ksoftirqd kthread kernel_thread_helper

Process kworker/1:1 (983) Total: 31.2 msec
  .  3.9 msec  100.0 %
    worker_thread kthread kernel_thread_helper

Process sshd (3948) Total: 3.9 msec
  Waiting for event (select)  3.9 msec  100.0 %
    poll_schedule_timeout do_select core_sys_select sys_select
    system_call_fastpath

Process btrfs-cleaner (4345) Total: 2433.6 msec
  [sleep_on_page]  31.3 msec  100.0 %
    sleep_on_page wait_on_page_bit read_extent_buffer_pages
    btree_read_extent_buffer_pages read_tree_block read_block_for_search
    btrfs_search_slot lookup_inline_extent_backref __btrfs_free_extent
    run_clustered_refs btrfs_run_delayed_refs __btrfs_end_transaction

Process btrfs-transacti (4346) Total: 2437.5 msec
  [sleep_on_page]  27.3 msec  100.0 %
    sleep_on_page wait_on_page_bit read_extent_buffer_pages
    btree_read_extent_buffer_pages read_tree_block read_block_for_search
    btrfs_search_slot lookup_inline_extent_backref __btrfs_free_extent
    run_clustered_refs btrfs_run_delayed_refs btrfs_commit_transaction

This repeats identically (the numbers and backtraces are all very
stable) for a long time.
Much later, I see this:

===============
Tue Aug  2 13:46:29 2011
Globals: Cause Maximum Percentage

Process details:

Process btrfs-submit-0 (4332) Total: 199.2 msec
  Creating block layer request  58.6 msec  100.0 %
    get_request_wait __make_request generic_make_request submit_bio
    run_scheduled_bios pending_bios_fn worker_loop kthread
    kernel_thread_helper

Process btrfs-cleaner (4345) Total: 3.9 msec
  Creating block layer request  3.9 msec  100.0 %
    get_request_wait __make_request generic_make_request submit_bio
    btrfs_map_bio btree_submit_bio_hook submit_one_bio
    read_extent_buffer_pages readahead_tree_block reada_walk_down
    do_walk_down walk_down_tree

Process btrfs-endio-met (4700) Total: 35.2 msec
  [worker_loop]  3.9 msec  100.0 %
    worker_loop kthread kernel_thread_helper

Process kworker/1:2 (26315) Total: 50.8 msec
  .  3.9 msec  100.0 %
    worker_thread kthread kernel_thread_helper

[...snip...]

===============
Thu Aug  4 12:34:08 2011
Globals: Cause Maximum Percentage

Process details:

Process btrfs-submit-0 (4332) Total: 199.2 msec
  Creating block layer request  58.6 msec  100.0 %
    get_request_wait __make_request generic_make_request submit_bio
    run_scheduled_bios pending_bios_fn worker_loop kthread
    kernel_thread_helper

Process btrfs-cleaner (4345) Total: 3.9 msec
  Creating block layer request  3.9 msec  100.0 %
    get_request_wait __make_request generic_make_request submit_bio
    btrfs_map_bio btree_submit_bio_hook submit_one_bio
    read_extent_buffer_pages readahead_tree_block reada_walk_down
    do_walk_down walk_down_tree

Process btrfs-endio-met (4700) Total: 35.2 msec
  [worker_loop]  3.9 msec  100.0 %
    worker_loop kthread kernel_thread_helper

--
Bruce Guenter <bruce@untroubled.org>                http://untroubled.org/
On Thu, Aug 04, 2011 at 10:35:36AM -0600, Bruce Guenter wrote:
> Ok, I ran it for more than a few minutes. Early on (shortly after
> boot), the output shows this for a long time:

Does anybody have any ideas what might be going on?

--
Bruce Guenter <bruce@untroubled.org>                http://untroubled.org/
On Thu, Aug 11, 2011 at 09:04:56AM -0600, Bruce Guenter wrote:
> Does anybody have any ideas what might be going on?

By chance, does "vmstat 1" or top show a high system cpu% while this is
happening, and if so, does cat /proc/<pid>/stack of the PIDs in R state
sometimes show things under btrfs_run_delayed_refs like
setup_cluster_no_bitmap() and rb_next()?

I haven't been able to build a 2.6.39 to verify yet, but 2.6.38.5 does
not show the problem for some reason. This is on a backup server, too,
with 20 3 TB btrfs volumes.

You should also be able to see a stack trace in this case (NOT the
uninterruptible sleep case) with echo l > /proc/sysrq-trigger, or use
oprofile or "perf top", which is what I am using. I see output such as:

   samples  pcnt function                 DSO
   _______ _____ ________________________ ________
  13537.00 59.2% rb_next                  [kernel]
   3539.00 15.5% _raw_spin_lock           [kernel]
   1668.00  7.3% setup_cluster_no_bitmap  [kernel]
    799.00  3.5% tree_search_offset       [kernel]
    476.00  2.1% fill_window              [kernel]

and "perf record -g; perf report --stdio", which gives me:

  47.60%  btrfs-transacti  [kernel.kallsyms]  [k] rb_next
          |
          --- rb_next
             |
             |--97.36%-- setup_cluster_no_bitmap
             |          btrfs_find_space_cluster
             |          find_free_extent
             |          btrfs_reserve_extent
             |          btrfs_alloc_free_block
             |          __btrfs_cow_block
             |          btrfs_cow_block
             |          |
             |          |--99.95%-- btrfs_search_slot
             |          |          |
             |          |          |--55.01%-- lookup_inline_extent_backref
             |          |          |          |
             |          |          |          |--97.17%-- __btrfs_free_extent
             |          |          |          |          run_clustered_refs
             |          |          |          |          btrfs_run_delayed_refs
             |          |          |          |          |
             |          |          |          |          |--88.72%-- btrfs_commit_transaction
             |          |          |          |          |          transaction_kthread

At one point, I rebooted the box without unmounting, and mounting of
the 60 TB of btrfs volumes never actually finished within 8 hours. It
seemed to get stuck on one of them about half way through, constantly
writing (slowly) and taking lots of CPU. This is just from mounting,
not any use. Booting back into 2.6.38.5 was fine. The kernel giving
problems was 3.0, and now 3.1-rc1.

Before, when it actually was mounted, "sync" (with nothing else
dirtying pages) also took 4 minutes to run and pegged the CPU with a
profile similar to the above.

I started trying to do a bisection, but tons of other things have
changed and it takes quite a while to verify each pass. It also
doesn't help that I can't seem to reproduce it on any other
non-production boxes.

Simon-
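[Editorial note: the profiling workflow Simon describes can be condensed into
a few commands. A sketch, not from the thread; requires a perf build matching
the running kernel, and the 30-second sampling window is arbitrary:]

```shell
# Sketch of the 'perf record -g' / 'perf report' workflow from the
# message above, aimed at the busy btrfs transaction thread.
pid=$(pgrep -o btrfs-transacti)       # -o: oldest matching thread
perf record -g -p "$pid" -- sleep 30  # sample that PID with call graphs
perf report --stdio | head -n 40      # top of the callgraph report
```

Attaching to the specific kernel thread keeps the profile focused, instead
of system-wide sampling that buries the btrfs frames in unrelated activity.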
On Thu, Aug 11, 2011 at 05:40:36PM -0700, Simon Kirby wrote:
> By chance, does "vmstat 1" or top show a high system cpu% while this is
> happening,

Absolutely, usually around 80-90%.

> and if so, does cat /proc/<pid>/stack of the PIDs in R state
> sometimes show things under btrfs_run_delayed_refs like
> setup_cluster_no_bitmap() and rb_next()?

No, I am not seeing what you describe. The stack traces are usually
showing what I originally quoted. I don't know if I've seen
setup_cluster_no_bitmap or rb_* at all.

--
Bruce Guenter <bruce@untroubled.org>                http://untroubled.org/
On Fri, Aug 12, 2011 at 12:21:44PM -0600, Bruce Guenter wrote:
> > By chance, does "vmstat 1" or top show a high system cpu% while this is
> > happening,
>
> Absolutely, usually around 80-90%

...which is perhaps why you had difficulty obtaining traces with
sysrq-w, since most of the time there were no D-state processes.
Instead, sysrq-l gives a backtrace of each running CPU.

> > and if so, does cat /proc/<pid>/stack of the PIDs in R state
> > sometimes show things under btrfs_run_delayed_refs like
> > setup_cluster_no_bitmap() and rb_next()?
>
> No, I am not seeing what you describe. The stack traces are usually
> showing what I originally quoted. I don't know if I've seen
> setup_cluster_no_bitmap or rb_* at all.

Ok, oh well. :)

Simon-