Jose Ildefonso Camargo Tolosa
2014-Aug-08 21:35 UTC
40TB volume taking over 16 hours to mount, any ideas?
Greetings,
I have been having some issues with btrfs for the past couple of days.
Some info (this has changed as I tried multiple things):
btrfs fi show
Btrfs v3.12
uname -a
Linux server1 3.15.8-031508-generic #201407311933 SMP Thu Jul 31
23:34:33 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
The complete story:
The filesystem was created on Ubuntu 12.04, running kernel 3.11.
mount options included compress=zlib .
After having some issues with it, specifically that it would mount
itself read-only, and would then be "stuck" while trying to mount it
again, we decided to upgrade 14.04, kernel 3.13, and 3.12 btrfs tools.
It worked just fine for a few days, until it mounted itself read-only
again. Someone rebooted the server, trying to get it back up: it
never came back. On console access, there was "hung task" messages,
lots of them. Well, I rebooted with a rescue disk, and removed btrfs
filesytem entry from fstab, and booted back to the system.
After this, tried to mount btrfs again, it would start reading at
~1MB/s from disk, then all disk activity would cease, and "mount"
would start taking 100% CPU. Well, we left the server like that
overnight: after ~12 hours, it was just the same, and I had messages
very similar to this one on dmesg output:
http://pastebin.com/dhPTrDSp
Then, after reading here and there, decided to try to use a newer
kernel, tried 3.15.8. Well, it is still mounting after ~16 hours, and
I got messages like these at first:
[25490.214875] BTRFS info (device sdb1): force clearing of disk cache
[25490.214882] BTRFS info (device sdb1): enabling auto recovery
[25556.240243] BTRFS: detected SSD devices, enabling SSD mode
[25812.123804] INFO: task btrfs-transacti:31532 blocked for more than
120 seconds.
[25812.125408] Not tainted 3.15.8-031508-generic #201407311933
[25812.126732] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[25812.128449] btrfs-transacti D 0000000000000003 0 31532 2 0x00000000
[25812.128458] ffff880608c73dc8 0000000000000002 ffff880608c73d68
ffff880608c73fd8
[25812.128461] 0000000000014500 0000000000014500 ffff8804698b3260
ffff880867440000
[25812.128463] ffff880608c73dd8 ffff8804658ec000 ffff880868357000
ffff880608c73e00
[25812.128465] Call Trace:
[25812.128476] [<ffffffff817784f9>] schedule+0x29/0x70
[25812.128503] [<ffffffffa010273d>]
btrfs_commit_transaction+0x25d/0xa70 [btrfs]
[25812.128514] [<ffffffff810b5450>] ? __wake_up_sync+0x20/0x20
[25812.128524] [<ffffffffa0100475>] transaction_kthread+0x1d5/0x250
[btrfs]
[25812.128535] [<ffffffffa01002a0>] ? open_ctree+0x1ee0/0x1ee0 [btrfs]
[25812.128539] [<ffffffff81091469>] kthread+0xc9/0xe0
[25812.128542] [<ffffffff810913a0>] ? flush_kthread_worker+0xb0/0xb0
[25812.128545] [<ffffffff8178567c>] ret_from_fork+0x7c/0xb0
[25812.128547] [<ffffffff810913a0>] ? flush_kthread_worker+0xb0/0xb0
mmm... not this is not an SSD device, this is a hardware RAID5 array
of rotational disks. I am not sure why btrfs thought this is a SSD.
Current mount command:
mount -o clear_cache,compress=zlib,recovery /dev/sdb1 /pot/
And it kept going until later, when I just got this:
[26893.263397] INFO: task btrfs-transacti:31532 blocked for more than
120 seconds.
[26893.342166] Not tainted 3.15.8-031508-generic #201407311933
[26893.342167] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[26893.342170] btrfs-transacti D 0000000000000003 0 31532 2 0x00000000
[26893.342173] ffff880608c73dc8 0000000000000002 ffff880608c73d68
ffff880608c73fd8
[26893.342174] 0000000000014500 0000000000014500 ffff8804698b3260
ffff880867440000
[26893.342175] ffff880608c73dd8 ffff8804658ec000 ffff880868357000
ffff880608c73e00
[26893.342176] Call Trace:
[26893.342186] [<ffffffff817784f9>] schedule+0x29/0x70
[26893.342214] [<ffffffffa010273d>]
btrfs_commit_transaction+0x25d/0xa70 [btrfs]
[26893.342220] [<ffffffff810b5450>] ? __wake_up_sync+0x20/0x20
[26893.342231] [<ffffffffa0100475>] transaction_kthread+0x1d5/0x250
[btrfs]
[26893.342241] [<ffffffffa01002a0>] ? open_ctree+0x1ee0/0x1ee0 [btrfs]
[26893.342246] [<ffffffff81091469>] kthread+0xc9/0xe0
[26893.342249] [<ffffffff810913a0>] ? flush_kthread_worker+0xb0/0xb0
[26893.342252] [<ffffffff8178567c>] ret_from_fork+0x7c/0xb0
[26893.342253] [<ffffffff810913a0>] ? flush_kthread_worker+0xb0/0xb0
[43614.686424] perf interrupt took too long (2505 > 2500), lowering
kernel.perf_event_max_sample_rate to 50000
[72858.744039] ip_tables: (C) 2000-2006 Netfilter Core Team
[85708.015922] ip_tables: (C) 2000-2006 Netfilter Core Team
The last iptables thing was a few minutes ago. Right now, "mount" is
using 100% CPU, there is no disk activity, memory usage looks low.
perf top output looks like this:
7.57% [btrfs] [k] generic_bin_search.constprop.44
6.77% [btrfs] [k] btrfs_tree_read_unlock
6.70% [kernel] [k] __radix_tree_lookup
5.45% [btrfs] [k] map_private_extent_buffer
5.13% [btrfs] [k] btrfs_comp_cpu_keys
4.67% [btrfs] [k] btrfs_search_slot
4.66% [btrfs] [k] comp_keys
4.49% [btrfs] [k] btrfs_try_tree_read_lock
3.96% [btrfs] [k] find_extent_buffer
3.12% [kernel] [k] _raw_read_lock
2.97% [btrfs] [k] btrfs_clear_path_blocking
2.96% [btrfs] [k] btrfs_get_token_64
2.84% [btrfs] [k] release_extent_buffer
2.52% [kernel] [k] _raw_spin_lock
2.21% [btrfs] [k] btrfs_set_lock_blocking_rw
2.20% [btrfs] [k] btrfs_get_token_16
2.12% [kernel] [k] memcmp
1.97% [btrfs] [k] unlock_up
1.66% [btrfs] [k] btrfs_tree_read_lock
1.50% [btrfs] [k] btrfs_tree_read_unlock_blocking
1.45% [btrfs] [k] read_block_for_search.isra.40
1.29% [btrfs] [k] btrfs_release_path
1.17% [btrfs] [k] btrfs_root_node
1.16% [btrfs] [k] check_buffer_tree_ref
1.06% [btrfs] [k] btrfs_clear_lock_blocking_rw
0.96% [btrfs] [k] btrfs_get_token_8
0.86% [btrfs] [k] btrfs_buffer_uptodate
0.85% [btrfs] [k] verify_parent_transid
0.81% [btrfs] [k] bin_search
0.78% [kernel] [k] memcpy
0.77% [btrfs] [k] free_extent_buffer.part.39
0.76% [btrfs] [k] btrfs_get_token_32
0.72% [btrfs] [k] setup_nodes_for_search
0.67% [btrfs] [k] free_extent_buffer
0.67% [btrfs] [k] mark_extent_buffer_accessed
0.66% [kernel] [k] __kmalloc
0.57% [btrfs] [k] btrfs_set_path_blocking
0.55% [btrfs] [k] check_item_in_log
0.48% [btrfs] [k] read_extent_buffer
0.42% [btrfs] [k] memcmp_extent_buffer
0.38% [kernel] [k] crc32c_intel_le_hw
0.36% [btrfs] [k] btrfs_match_dir_item_name.isra.3
0.35% [btrfs] [k] verify_dir_item
0.33% [aacraid] [k] aac_queuecommand
0.32% [kernel] [k] _raw_spin_unlock
0.31% [btrfs] [k] btrfs_crc32c
0.31% [kernel] [k] kfree
0.29% [btrfs] [k] replay_dir_deletes
0.29% [btrfs] [k] extent_buffer_uptodate
0.27% [kernel] [k] mark_page_accessed
0.26% [btrfs] [k] btrfs_lookup_dir_item
0.25% [kernel] [k] radix_tree_lookup
0.14% [btrfs] [k] btrfs_find_tree_block
0.13% [kernel] [k] crypto_shash_update
I am compiling 3.16 kernel now to give it a try, but I am not sure on
this. I mean, this 40TB volume is holding just backups, but it took
quite some time to put the backups there, it will take ~2 weeks to put
that data in there again, I'd really prefer *not* to do it.
Any ideas?
Thanks!
--
Ildefonso Camargo
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC
@cmdpromptinc - 509-416-6579
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html