Dave Chinner
2012-Jan-04 21:44 UTC
[3.2-rc7] slowdown, warning + oops creating lots of files
Hi there buttery folks,

I just hit this warning and oops running a parallel fs_mark create
workload on a test VM using a 17TB btrfs filesystem (12 disk dm
RAID0) using default mkfs and mount parameters, mounted on
/mnt/scratch. The VM has 8p and 4GB RAM, and the fs_mark command
line was:

$ ./fs_mark -D 10000 -S0 -n 100000 -s 0 -L 250 \
        -d /mnt/scratch/0 -d /mnt/scratch/1 \
        -d /mnt/scratch/2 -d /mnt/scratch/3 \
        -d /mnt/scratch/4 -d /mnt/scratch/5 \
        -d /mnt/scratch/6 -d /mnt/scratch/7

The attached image should give you a better idea of the performance
drop-off that was well under way when the crash occurred at about 96
million files created.

I'm rerunning the test on a fresh filesystem, so I guess I'll see if
this is a one-off in the next couple of hours....

Cheers,

Dave.

[70972.788725] device fsid 5c8b0884-4e94-41ac-9a7e-8628a03a6f6d devid 1 transid 4 /dev/vdc
[70972.795441] btrfs: disk space caching is enabled
[121727.817810] ------------[ cut here ]------------
[121727.819430] WARNING: at fs/btrfs/extent-tree.c:4771 __btrfs_free_extent+0x630/0x6d0()
[121727.821729] Hardware name: Bochs
[121727.823417] Modules linked in:
[121727.824601] Pid: 2984, comm: btrfs-transacti Not tainted 3.2.0-rc7-dgc+ #167
[121727.826784] Call Trace:
[121727.827508]  [<ffffffff8108a69f>] warn_slowpath_common+0x7f/0xc0
[121727.829290]  [<ffffffff8108a6fa>] warn_slowpath_null+0x1a/0x20
[121727.830785]  [<ffffffff815b0680>] __btrfs_free_extent+0x630/0x6d0
[121727.832486]  [<ffffffff815ac110>] ? block_group_cache_tree_search+0x90/0xc0
[121727.833907]  [<ffffffff815b42f1>] run_clustered_refs+0x381/0x800
[121727.834627]  [<ffffffff815b483a>] btrfs_run_delayed_refs+0xca/0x220
[121727.836118]  [<ffffffff815c5469>] btrfs_commit_transaction+0x359/0x840
[121727.837042]  [<ffffffff810ac420>] ? add_wait_queue+0x60/0x60
[121727.838053]  [<ffffffff815c5da4>] ? start_transaction+0x94/0x2b0
[121727.839133]  [<ffffffff810986d0>] ? usleep_range+0x50/0x50
[121727.840642]  [<ffffffff815be0d5>] transaction_kthread+0x265/0x280
[121727.842918]  [<ffffffff815bde70>] ? btrfs_congested_fn+0xb0/0xb0
[121727.844480]  [<ffffffff810ab97c>] kthread+0x8c/0xa0
[121727.845772]  [<ffffffff81abb5f4>] kernel_thread_helper+0x4/0x10
[121727.847204]  [<ffffffff810ab8f0>] ? flush_kthread_worker+0xa0/0xa0
[121727.848803]  [<ffffffff81abb5f0>] ? gs_change+0x13/0x13
[121727.850073] ---[ end trace 804317a2ff1cf200 ]---
[121727.851207] btrfs unable to find ref byte nr 14368411648 parent 0 root 2 owner 0 offset 0
[121727.853324] BUG: unable to handle kernel NULL pointer dereference at (null)
[121727.855261] IP: [<ffffffff815e60f2>] map_private_extent_buffer+0x12/0x150
[121727.856730] PGD 0
[121727.857105] Oops: 0000 [#1] SMP
[121727.857243] CPU 1
[121727.857243] Modules linked in:
[121727.857243]
[121727.857243] Pid: 2984, comm: btrfs-transacti Tainted: G W 3.2.0-rc7-dgc+ #167 Bochs Bochs
[121727.857243] RIP: 0010:[<ffffffff815e60f2>]  [<ffffffff815e60f2>] map_private_extent_buffer+0x12/0x150
[121727.857243] RSP: 0018:ffff880117b49b00  EFLAGS: 00010286
[121727.857243] RAX: 0000000000000000 RBX: 0000000000000065 RCX: ffff880117b49b30
[121727.857243] RDX: 0000000000000004 RSI: 000000000000007a RDI: 0000000000000000
[121727.857243] RBP: ffff880117b49b20 R08: ffff880117b49b38 R09: ffff880117b49b40
[121727.857243] R10: 0000000000000000 R11: 0000000000000001 R12: 000000000000007a
[121727.857243] R13: 0000000000000000 R14: 00000000ffffffe4 R15: 0000000000001000
[121727.857243] FS:  0000000000000000(0000) GS:ffff88011fc40000(0000) knlGS:0000000000000000
[121727.857243] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[121727.857243] CR2: 0000000000000000 CR3: 0000000001ec1000 CR4: 00000000000006e0
[121727.857243] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[121727.857243] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[121727.857243] Process btrfs-transacti (pid: 2984, threadinfo ffff880117b48000, task ffff88010f38e280)
[121727.857243] Stack:
[121727.857243]  00000000ffffffe4 0000000000000065 000000000000007a 0000000000000000
[121727.857243]  ffff880117b49b70 ffffffff815db96c ffff880117b49b80 ffff880117b49b40
[121727.857243]  0000000000000002 00000003586c9000 0000000000000000 ffff8801042d97f0
[121727.857243] Call Trace:
[121727.857243]  [<ffffffff815db96c>] btrfs_item_size+0x3c/0x90
[121727.857243]  [<ffffffff815b0282>] __btrfs_free_extent+0x232/0x6d0
[121727.857243]  [<ffffffff815ac110>] ? block_group_cache_tree_search+0x90/0xc0
[121727.857243]  [<ffffffff815b42f1>] run_clustered_refs+0x381/0x800
[121727.857243]  [<ffffffff815b483a>] btrfs_run_delayed_refs+0xca/0x220
[121727.857243]  [<ffffffff815c5469>] btrfs_commit_transaction+0x359/0x840
[121727.857243]  [<ffffffff810ac420>] ? add_wait_queue+0x60/0x60
[121727.857243]  [<ffffffff815c5da4>] ? start_transaction+0x94/0x2b0
[121727.857243]  [<ffffffff810986d0>] ? usleep_range+0x50/0x50
[121727.857243]  [<ffffffff815be0d5>] transaction_kthread+0x265/0x280
[121727.857243]  [<ffffffff815bde70>] ? btrfs_congested_fn+0xb0/0xb0
[121727.857243]  [<ffffffff810ab97c>] kthread+0x8c/0xa0
[121727.857243]  [<ffffffff81abb5f4>] kernel_thread_helper+0x4/0x10
[121727.857243]  [<ffffffff810ab8f0>] ? flush_kthread_worker+0xa0/0xa0
[121727.857243]  [<ffffffff81abb5f0>] ? gs_change+0x13/0x13
[121727.857243] Code: 83 c0 01 48 89 85 78 ff ff ff e9 c0 fc ff ff 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 55 41 54 53 48 83 ec 08 66 66 66 66 90 <4c> 8b 27 48 89
[121727.857243] RIP  [<ffffffff815e60f2>] map_private_extent_buffer+0x12/0x150
[121727.857243]  RSP <ffff880117b49b00>
[121727.857243] CR2: 0000000000000000
[121727.904461] ---[ end trace 804317a2ff1cf201 ]---

--
Dave Chinner
david@fromorbit.com
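[Editor's note: for readers without fs_mark handy, the shape of the workload above — several parallel writers each creating large numbers of zero-length files in their own directory — can be sketched in plain shell. This is a scaled-down stand-in run in a temp dir, not a reproduction of the 17TB setup or of fs_mark's timing/logging behaviour:]

```shell
#!/bin/sh
# Tiny stand-in for the fs_mark invocation above: one background
# writer per directory, zero-length files, all running in parallel.
# Uses a throwaway temp dir instead of /mnt/scratch -- illustrative only.
scratch=$(mktemp -d)

for d in 0 1 2 3; do
    mkdir -p "$scratch/$d"
    # Each subshell plays the role of one "-d /mnt/scratch/$d" thread.
    (
        i=1
        while [ "$i" -le 100 ]; do
            : > "$scratch/$d/file.$i"
            i=$((i + 1))
        done
    ) &
done
wait

# 4 directories x 100 files = 400 zero-length files created.
find "$scratch" -type f | wc -l
```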
Dave Chinner
2012-Jan-04 22:11 UTC
Re: [3.2-rc7] slowdown, warning + oops creating lots of files
On Thu, Jan 05, 2012 at 08:44:45AM +1100, Dave Chinner wrote:
> Hi there buttery folks,
>
> I just hit this warning and oops running a parallel fs_mark create
> workload on a test VM using a 17TB btrfs filesystem (12 disk dm
> RAID0) using default mkfs and mount parameters, mounted on
> /mnt/scratch. The VM has 8p and 4GB RAM, and the fs_mark command
> line was:
>
> $ ./fs_mark -D 10000 -S0 -n 100000 -s 0 -L 250 \
>         -d /mnt/scratch/0 -d /mnt/scratch/1 \
>         -d /mnt/scratch/2 -d /mnt/scratch/3 \
>         -d /mnt/scratch/4 -d /mnt/scratch/5 \
>         -d /mnt/scratch/6 -d /mnt/scratch/7
>
> The attached image should give you a better idea of the performance
> drop-off that was well under way when the crash occurred at about 96
> million files created.
>
> I'm rerunning the test on a fresh filesystem, so I guess I'll see if
> this is a one-off in the next couple of hours....

Looks to be reproducible. With a fresh filesystem, performance was
all over the place from the start, and the warning/oops occurred at
about 43M files created. The failure stacks this time are:

[ 1490.841957] device fsid 4b7ec51b-9747-4244-a568-fbecdb157383 devid 1 transid 4 /dev/vdc
[ 1490.847408] btrfs: disk space caching is enabled
[ 3027.690722] ------------[ cut here ]------------
[ 3027.692612] WARNING: at fs/btrfs/extent-tree.c:4771 __btrfs_free_extent+0x630/0x6d0()
[ 3027.695607] Hardware name: Bochs
[ 3027.696968] Modules linked in:
[ 3027.697894] Pid: 3460, comm: fs_mark Not tainted 3.2.0-rc7-dgc+ #167
[ 3027.699581] Call Trace:
[ 3027.700452]  [<ffffffff8108a69f>] warn_slowpath_common+0x7f/0xc0
[ 3027.701973]  [<ffffffff8108a6fa>] warn_slowpath_null+0x1a/0x20
[ 3027.703480]  [<ffffffff815b0680>] __btrfs_free_extent+0x630/0x6d0
[ 3027.704981]  [<ffffffff815ac110>] ? block_group_cache_tree_search+0x90/0xc0
[ 3027.706368]  [<ffffffff815b42f1>] run_clustered_refs+0x381/0x800
[ 3027.707897]  [<ffffffff815b483a>] btrfs_run_delayed_refs+0xca/0x220
[ 3027.709347]  [<ffffffff815b8f1c>] ? btrfs_update_root+0x9c/0xe0
[ 3027.710909]  [<ffffffff815c3c33>] commit_cowonly_roots+0x33/0x1e0
[ 3027.712370]  [<ffffffff81ab168e>] ? _raw_spin_lock+0xe/0x20
[ 3027.713220]  [<ffffffff815c54cf>] btrfs_commit_transaction+0x3bf/0x840
[ 3027.714412]  [<ffffffff810ac420>] ? add_wait_queue+0x60/0x60
[ 3027.715460]  [<ffffffff815c5da4>] ? start_transaction+0x94/0x2b0
[ 3027.716790]  [<ffffffff815ac80c>] may_commit_transaction+0x6c/0x100
[ 3027.717843]  [<ffffffff815b2b47>] reserve_metadata_bytes.isra.71+0x5a7/0x660
[ 3027.719223]  [<ffffffff81073c23>] ? __wake_up+0x53/0x70
[ 3027.720328]  [<ffffffff815a43ba>] ? btrfs_free_path+0x2a/0x40
[ 3027.721511]  [<ffffffff815b2f9e>] btrfs_block_rsv_add+0x3e/0x70
[ 3027.722610]  [<ffffffff81666dfb>] ? security_d_instantiate+0x1b/0x30
[ 3027.723765]  [<ffffffff815c5f65>] start_transaction+0x255/0x2b0
[ 3027.725204]  [<ffffffff815c6283>] btrfs_start_transaction+0x13/0x20
[ 3027.726273]  [<ffffffff815d2236>] btrfs_create+0x46/0x220
[ 3027.727275]  [<ffffffff8116c204>] vfs_create+0xb4/0xf0
[ 3027.728344]  [<ffffffff8116e1d7>] do_last.isra.45+0x547/0x7c0
[ 3027.729400]  [<ffffffff8116f7ab>] path_openat+0xcb/0x3d0
[ 3027.730363]  [<ffffffff81ab168e>] ? _raw_spin_lock+0xe/0x20
[ 3027.731394]  [<ffffffff8117cc1e>] ? vfsmount_lock_local_unlock+0x1e/0x30
[ 3027.733077]  [<ffffffff8116fbd2>] do_filp_open+0x42/0xa0
[ 3027.733949]  [<ffffffff8117c487>] ? alloc_fd+0xf7/0x150
[ 3027.734911]  [<ffffffff8115f8e7>] do_sys_open+0xf7/0x1d0
[ 3027.735894]  [<ffffffff810b572a>] ? do_gettimeofday+0x1a/0x50
[ 3027.737304]  [<ffffffff8115f9e0>] sys_open+0x20/0x30
[ 3027.738099]  [<ffffffff81ab9502>] system_call_fastpath+0x16/0x1b
[ 3027.739199] ---[ end trace df586861a93ef3bf ]---
[ 3027.740348] btrfs unable to find ref byte nr 19982405632 parent 0 root 2 owner 0 offset 0
[ 3027.742001] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 3027.743502] IP: [<ffffffff815e60f2>] map_private_extent_buffer+0x12/0x150
[ 3027.744982] PGD 109d8e067 PUD 1050a9067 PMD 0
[ 3027.745968] Oops: 0000 [#1] SMP
[ 3027.745968] CPU 7
[ 3027.745968] Modules linked in:
[ 3027.745968]
[ 3027.745968] Pid: 3460, comm: fs_mark Tainted: G W 3.2.0-rc7-dgc+ #167 Bochs Bochs
[ 3027.745968] RIP: 0010:[<ffffffff815e60f2>]  [<ffffffff815e60f2>] map_private_extent_buffer+0x12/0x150
[ 3027.745968] RSP: 0018:ffff8800d2ac36d8  EFLAGS: 00010296
[ 3027.745968] RAX: 0000000000000000 RBX: 0000000000000065 RCX: ffff8800d2ac3708
[ 3027.745968] RDX: 0000000000000004 RSI: 000000000000007a RDI: 0000000000000000
[ 3027.745968] RBP: ffff8800d2ac36f8 R08: ffff8800d2ac3710 R09: ffff8800d2ac3718
[ 3027.745968] R10: 0000000000000000 R11: 0000000000000001 R12: 000000000000007a
[ 3027.745968] R13: 0000000000000000 R14: 00000000ffffffe4 R15: 0000000000001000
[ 3027.745968] FS:  00007f3bf8ab5700(0000) GS:ffff88011fdc0000(0000) knlGS:0000000000000000
[ 3027.745968] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 3027.745968] CR2: 00007fe424b0e000 CR3: 0000000106c33000 CR4: 00000000000006e0
[ 3027.745968] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 3027.745968] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 3027.745968] Process fs_mark (pid: 3460, threadinfo ffff8800d2ac2000, task ffff88006c81c100)
[ 3027.745968] Stack:
[ 3027.745968]  00000000ffffffe4 0000000000000065 000000000000007a 0000000000000000
[ 3027.745968]  ffff8800d2ac3748 ffffffff815db96c ffff8800d2ac3758 ffff8800d2ac3718
[ 3027.745968]  0000000000000002 00000004a70b5000 0000000000000000 ffff88010bf1a910
[ 3027.745968] Call Trace:
[ 3027.745968]  [<ffffffff815db96c>] btrfs_item_size+0x3c/0x90
[ 3027.745968]  [<ffffffff815b0282>] __btrfs_free_extent+0x232/0x6d0
[ 3027.745968]  [<ffffffff815ac110>] ? block_group_cache_tree_search+0x90/0xc0
[ 3027.745968]  [<ffffffff815b42f1>] run_clustered_refs+0x381/0x800
[ 3027.745968]  [<ffffffff815b483a>] btrfs_run_delayed_refs+0xca/0x220
[ 3027.745968]  [<ffffffff815b8f1c>] ? btrfs_update_root+0x9c/0xe0
[ 3027.745968]  [<ffffffff815c3c33>] commit_cowonly_roots+0x33/0x1e0
[ 3027.745968]  [<ffffffff81ab168e>] ? _raw_spin_lock+0xe/0x20
[ 3027.745968]  [<ffffffff815c54cf>] btrfs_commit_transaction+0x3bf/0x840
[ 3027.745968]  [<ffffffff810ac420>] ? add_wait_queue+0x60/0x60
[ 3027.745968]  [<ffffffff815c5da4>] ? start_transaction+0x94/0x2b0
[ 3027.745968]  [<ffffffff815ac80c>] may_commit_transaction+0x6c/0x100
[ 3027.745968]  [<ffffffff815b2b47>] reserve_metadata_bytes.isra.71+0x5a7/0x660
[ 3027.745968]  [<ffffffff81073c23>] ? __wake_up+0x53/0x70
[ 3027.745968]  [<ffffffff815a43ba>] ? btrfs_free_path+0x2a/0x40
[ 3027.745968]  [<ffffffff815b2f9e>] btrfs_block_rsv_add+0x3e/0x70
[ 3027.745968]  [<ffffffff81666dfb>] ? security_d_instantiate+0x1b/0x30
[ 3027.745968]  [<ffffffff815c5f65>] start_transaction+0x255/0x2b0
[ 3027.745968]  [<ffffffff815c6283>] btrfs_start_transaction+0x13/0x20
[ 3027.745968]  [<ffffffff815d2236>] btrfs_create+0x46/0x220
[ 3027.745968]  [<ffffffff8116c204>] vfs_create+0xb4/0xf0
[ 3027.745968]  [<ffffffff8116e1d7>] do_last.isra.45+0x547/0x7c0
[ 3027.745968]  [<ffffffff8116f7ab>] path_openat+0xcb/0x3d0
[ 3027.745968]  [<ffffffff81ab168e>] ? _raw_spin_lock+0xe/0x20
[ 3027.745968]  [<ffffffff8117cc1e>] ? vfsmount_lock_local_unlock+0x1e/0x30
[ 3027.745968]  [<ffffffff8116fbd2>] do_filp_open+0x42/0xa0
[ 3027.745968]  [<ffffffff8117c487>] ? alloc_fd+0xf7/0x150
[ 3027.745968]  [<ffffffff8115f8e7>] do_sys_open+0xf7/0x1d0
[ 3027.745968]  [<ffffffff810b572a>] ? do_gettimeofday+0x1a/0x50
[ 3027.745968]  [<ffffffff8115f9e0>] sys_open+0x20/0x30
[ 3027.745968]  [<ffffffff81ab9502>] system_call_fastpath+0x16/0x1b
[ 3027.745968] Code: 83 c0 01 48 89 85 78 ff ff ff e9 c0 fc ff ff 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 55 41 54 53 48 83 ec 08 66 66 66 66 90 <4c> 8b 27 48 89
[ 3027.745968] RIP  [<ffffffff815e60f2>] map_private_extent_buffer+0x12/0x150
[ 3027.745968]  RSP <ffff8800d2ac36d8>
[ 3027.745968] CR2: 0000000000000000
[ 3027.817725] ---[ end trace df586861a93ef3c0 ]---

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Chris Samuel
2012-Jan-04 22:23 UTC
Re: [3.2-rc7] slowdown, warning + oops creating lots of files
On 05/01/12 09:11, Dave Chinner wrote:
> Looks to be reproducible.

Does this happen with rc6 ?

If not then it might be easy to track down, as there are only
2 modifications between rc6 and rc7..

commit 08c422c27f855d27b0b3d9fa30ebd938d4ae6f1f
Author: Al Viro <viro@ZenIV.linux.org.uk>
Date:   Fri Dec 23 07:58:13 2011 -0500

    Btrfs: call d_instantiate after all ops are setup

    This closes races where btrfs is calling d_instantiate too soon
    during inode creation. All of the callers of btrfs_add_nondir are
    updated to instantiate after the inode is fully setup in memory.

    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    Signed-off-by: Chris Mason <chris.mason@oracle.com>

commit 8d532b2afb2eacc84588db709ec280a3d1219be3
Author: Chris Mason <chris.mason@oracle.com>
Date:   Fri Dec 23 07:53:00 2011 -0500

    Btrfs: fix worker lock misuse in find_worker

    Dan Carpenter noticed that we were doing a double unlock on the
    worker lock, and sometimes picking a worker thread without the
    lock held. This fixes both errors.

    Signed-off-by: Chris Mason <chris.mason@oracle.com>
    Reported-by: Dan Carpenter <dan.carpenter@oracle.com>

--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
Dave Chinner
2012-Jan-04 23:01 UTC
Re: [3.2-rc7] slowdown, warning + oops creating lots of files
On Thu, Jan 05, 2012 at 09:23:52AM +1100, Chris Samuel wrote:
> On 05/01/12 09:11, Dave Chinner wrote:
>
> > Looks to be reproducible.
>
> Does this happen with rc6 ?

I haven't tried. All I'm doing is running some benchmarks to get
numbers for a talk I'm giving about improvements in XFS metadata
scalability, so I wanted to update my last set of numbers from
2.6.39.

As it was, these benchmarks also failed on btrfs with oopsen and
corruptions back in the 2.6.39 time frame. e.g. same VM, same
test, different crashes, similar slowdowns as reported here:
http://comments.gmane.org/gmane.comp.file-systems.btrfs/11062

Given that there is now a history of this simple test uncovering
problems, perhaps this is a test that should be run more regularly
by btrfs developers?

> If not then it might be easy to track down, as there are only
> 2 modifications between rc6 and rc7..

They don't look like they'd be responsible for fixing an extent tree
corruption, and I don't really have the time to do an open-ended
bisect to find where the problem arose.

As it is, the 3rd attempt failed at 22m inodes, without the warning
this time:

[   59.433452] device fsid 4d27dc14-562d-4722-9591-723bd2bbe94c devid 1 transid 4 /dev/vdc
[   59.437050] btrfs: disk space caching is enabled
[  753.258465] ------------[ cut here ]------------
[  753.259806] kernel BUG at fs/btrfs/extent-tree.c:5797!
[  753.260014] invalid opcode: 0000 [#1] SMP
[  753.260014] CPU 7
[  753.260014] Modules linked in:
[  753.260014]
[  753.260014] Pid: 2874, comm: fs_mark Not tainted 3.2.0-rc7-dgc+ #167 Bochs Bochs
[  753.260014] RIP: 0010:[<ffffffff815b475b>]  [<ffffffff815b475b>] run_clustered_refs+0x7eb/0x800
[  753.260014] RSP: 0018:ffff8800430258a8  EFLAGS: 00010286
[  753.260014] RAX: 00000000ffffffe4 RBX: ffff88009c8ab1c0 RCX: 0000000000000000
[  753.260014] RDX: 0000000000000008 RSI: 0000000000000282 RDI: 0000000000000000
[  753.260014] RBP: ffff880043025988 R08: 0000000000000000 R09: 0000000000000002
[  753.260014] R10: ffff8801188f6000 R11: ffff880101b50d20 R12: ffff88008fc1ad40
[  753.260014] R13: ffff88003940a6c0 R14: ffff880118a49000 R15: ffff88010fc77e80
[  753.260014] FS:  00007f416ce90700(0000) GS:ffff88011fdc0000(0000) knlGS:0000000000000000
[  753.260014] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  753.260014] CR2: 00007f416c2f6000 CR3: 000000003aaea000 CR4: 00000000000006e0
[  753.260014] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  753.260014] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  753.260014] Process fs_mark (pid: 2874, threadinfo ffff880043024000, task ffff8800090e6180)
[  753.260014] Stack:
[  753.260014]  0000000000000000 0000000000000000 ffff880000000001 0000000000000000
[  753.260014]  ffff88010fc77f38 0000000000000e92 0000000000000000 0000000000000002
[  753.260014]  0000000000000e03 0000000000000e68 0000000000000000 ffff8800430259d8
[  753.260014] Call Trace:
[  753.260014]  [<ffffffff815b483a>] btrfs_run_delayed_refs+0xca/0x220
[  753.260014]  [<ffffffff815c5469>] btrfs_commit_transaction+0x359/0x840
[  753.260014]  [<ffffffff810ac420>] ? add_wait_queue+0x60/0x60
[  753.260014]  [<ffffffff815c5da4>] ? start_transaction+0x94/0x2b0
[  753.260014]  [<ffffffff815ac80c>] may_commit_transaction+0x6c/0x100
[  753.260014]  [<ffffffff815b2b47>] reserve_metadata_bytes.isra.71+0x5a7/0x660
[  753.260014]  [<ffffffff81073c23>] ? __wake_up+0x53/0x70
[  753.260014]  [<ffffffff815a43ba>] ? btrfs_free_path+0x2a/0x40
[  753.260014]  [<ffffffff815b2f9e>] btrfs_block_rsv_add+0x3e/0x70
[  753.260014]  [<ffffffff81666dfb>] ? security_d_instantiate+0x1b/0x30
[  753.260014]  [<ffffffff815c5f65>] start_transaction+0x255/0x2b0
[  753.260014]  [<ffffffff815c6283>] btrfs_start_transaction+0x13/0x20
[  753.260014]  [<ffffffff815d2236>] btrfs_create+0x46/0x220
[  753.260014]  [<ffffffff8116c204>] vfs_create+0xb4/0xf0
[  753.260014]  [<ffffffff8116e1d7>] do_last.isra.45+0x547/0x7c0
[  753.260014]  [<ffffffff8116f7ab>] path_openat+0xcb/0x3d0
[  753.260014]  [<ffffffff81ab168e>] ? _raw_spin_lock+0xe/0x20
[  753.260014]  [<ffffffff8117cc1e>] ? vfsmount_lock_local_unlock+0x1e/0x30
[  753.260014]  [<ffffffff8116fbd2>] do_filp_open+0x42/0xa0
[  753.260014]  [<ffffffff8117c487>] ? alloc_fd+0xf7/0x150
[  753.260014]  [<ffffffff8115f8e7>] do_sys_open+0xf7/0x1d0
[  753.260014]  [<ffffffff810b572a>] ? do_gettimeofday+0x1a/0x50
[  753.260014]  [<ffffffff8115f9e0>] sys_open+0x20/0x30
[  753.260014]  [<ffffffff81ab9502>] system_call_fastpath+0x16/0x1b
[  753.260014] Code: ff e9 37 f9 ff ff be 95 00 00 00 48 c7 c7 43 6f df 81 e8 99 5f ad ff e9 36 f9 ff ff 80 fa b2 0f 84 d0 f9 ff ff 0f 0b 0f 0b 0f 0b <0f> 0b 0f 0b 0f
[  753.260014] RIP  [<ffffffff815b475b>] run_clustered_refs+0x7eb/0x800
[  753.260014]  RSP <ffff8800430258a8>
[  753.330089] ---[ end trace f3d0e286a928c349 ]---

It's hard to tell exactly what path gets to that BUG_ON(); so much
code is inlined by the compiler into run_clustered_refs() that I
can't tell exactly how it got to the BUG_ON() triggered in
alloc_reserved_tree_block().

(As an aside, this is why XFS uses noinline for most of its static
functions - so that stack traces are accurate when a problem occurs.
Debuggability of complex code paths is far more important than the
small speed improvement automatic inlining of static functions
gives...)

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
Liu Bo
2012-Jan-05 02:23 UTC
Re: [3.2-rc7] slowdown, warning + oops creating lots of files
On 01/04/2012 06:01 PM, Dave Chinner wrote:
> On Thu, Jan 05, 2012 at 09:23:52AM +1100, Chris Samuel wrote:
>> On 05/01/12 09:11, Dave Chinner wrote:
>>
>>> Looks to be reproducible.
>>
>> Does this happen with rc6 ?
>
> I haven't tried. All I'm doing is running some benchmarks to get
> numbers for a talk I'm giving about improvements in XFS metadata
> scalability, so I wanted to update my last set of numbers from
> 2.6.39.
>
> As it was, these benchmarks also failed on btrfs with oopsen and
> corruptions back in the 2.6.39 time frame. e.g. same VM, same
> test, different crashes, similar slowdowns as reported here:
> http://comments.gmane.org/gmane.comp.file-systems.btrfs/11062
>
> Given that there is now a history of this simple test uncovering
> problems, perhaps this is a test that should be run more regularly
> by btrfs developers?
>
>> If not then it might be easy to track down, as there are only
>> 2 modifications between rc6 and rc7..
>
> They don't look like they'd be responsible for fixing an extent tree
> corruption, and I don't really have the time to do an open-ended
> bisect to find where the problem arose.
>
> As it is, the 3rd attempt failed at 22m inodes, without the warning
> this time:
>
> [   59.433452] device fsid 4d27dc14-562d-4722-9591-723bd2bbe94c devid 1 transid 4 /dev/vdc
> [   59.437050] btrfs: disk space caching is enabled
> [  753.258465] ------------[ cut here ]------------
> [  753.259806] kernel BUG at fs/btrfs/extent-tree.c:5797!
> .....
> [  753.260014] RIP  [<ffffffff815b475b>] run_clustered_refs+0x7eb/0x800
> [  753.260014]  RSP <ffff8800430258a8>
> [  753.330089] ---[ end trace f3d0e286a928c349 ]---
>
> It's hard to tell exactly what path gets to that BUG_ON(); so much
> code is inlined by the compiler into run_clustered_refs() that I
> can't tell exactly how it got to the BUG_ON() triggered in
> alloc_reserved_tree_block().

This seems to be an oops caused by ENOSPC.

Thanks for the report.

> (As an aside, this is why XFS uses noinline for most of its static
> functions - so that stack traces are accurate when a problem occurs.
> Debuggability of complex code paths is far more important than the
> small speed improvement automatic inlining of static functions
> gives...)

Makes sense.

thanks,
liubo

> Cheers,
>
> Dave.
Dave Chinner
2012-Jan-05 02:26 UTC
Re: [3.2-rc7] slowdown, warning + oops creating lots of files
On Wed, Jan 04, 2012 at 09:23:18PM -0500, Liu Bo wrote:
> On 01/04/2012 06:01 PM, Dave Chinner wrote:
> > On Thu, Jan 05, 2012 at 09:23:52AM +1100, Chris Samuel wrote:
> > > On 05/01/12 09:11, Dave Chinner wrote:
> > >
> > > > Looks to be reproducible.
> > >
> > > Does this happen with rc6 ?
> >
> > I haven't tried. All I'm doing is running some benchmarks to get
> > numbers for a talk I'm giving about improvements in XFS metadata
> > scalability, so I wanted to update my last set of numbers from
> > 2.6.39.
.....
> > It's hard to tell exactly what path gets to that BUG_ON(); so much
> > code is inlined by the compiler into run_clustered_refs() that I
> > can't tell exactly how it got to the BUG_ON() triggered in
> > alloc_reserved_tree_block().
>
> This seems to be an oops caused by ENOSPC.

At the time of the oops, this is the space used on the filesystem:

$ df -h /mnt/scratch
Filesystem            Size  Used Avail Use% Mounted on
/dev/vdc               17T   31G   17T   1% /mnt/scratch

It's less than 0.2% full, so I think ENOSPC can be ruled out here.

I have noticed one thing, however: there are significant numbers of
reads coming from disk when the slowdowns and oops occur. When
everything runs fast, there are virtually no reads occurring at all.
It looks to me like the working set of metadata is being kicked out
of memory, only to be read back in again a short while later. Maybe
that is a contributing factor.

BTW, there is a lot of CPU time being spent on the tree locks. perf
shows this as the top 2 CPU consumers:

-   9.49%  [kernel]  [k] __write_lock_failed
   - __write_lock_failed
      - 99.80% _raw_write_lock
         - 79.35% btrfs_try_tree_write_lock
              99.99% btrfs_search_slot
         - 20.63% btrfs_tree_lock
              89.19% btrfs_search_slot
              10.54% btrfs_lock_root_node
                     btrfs_search_slot
-   9.25%  [kernel]  [k] _raw_spin_unlock_irqrestore
   - _raw_spin_unlock_irqrestore
      - 55.87% __wake_up
         + 93.89% btrfs_clear_lock_blocking_rw
         + 3.46% btrfs_tree_read_unlock_blocking
         + 2.35% btrfs_tree_unlock

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
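[Editor's note: for a quick sense of how much CPU those two entries represent together, the flat "overhead" column of a perf-report-style listing can be totalled with a one-liner. The sample file below just re-types the two top entries quoted above:]

```shell
# Sum the flat overhead column of a perf-report-style listing.
# The two lines are the top entries from the report above.
cat > perf_top.txt <<'EOF'
9.49%  [kernel]  [k] __write_lock_failed
9.25%  [kernel]  [k] _raw_spin_unlock_irqrestore
EOF

# awk coerces "9.49%" to 9.49, so the trailing % is harmless.
awk '{sum += $1} END {printf "%.2f%%\n", sum}' perf_top.txt   # 18.74%
```

So nearly a fifth of the sampled CPU time in this profile is lock-related, which matches the "a lot of CPU time being spent on the tree locks" observation.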
Dave Chinner
2012-Jan-05 11:43 UTC
Re: [3.2-rc7] slowdown, warning + oops creating lots of files
On Thu, Jan 05, 2012 at 02:11:31PM -0500, Liu Bo wrote:
> On 01/04/2012 09:26 PM, Dave Chinner wrote:
> > On Wed, Jan 04, 2012 at 09:23:18PM -0500, Liu Bo wrote:
> >> On 01/04/2012 06:01 PM, Dave Chinner wrote:
.....
> >> This seems to be an oops caused by ENOSPC.
> >
> > At the time of the oops, this is the space used on the filesystem:
> >
> > $ df -h /mnt/scratch
> > Filesystem            Size  Used Avail Use% Mounted on
> > /dev/vdc               17T   31G   17T   1% /mnt/scratch
> >
> > It's less than 0.2% full, so I think ENOSPC can be ruled out here.
>
> This bug has done something with our block reservation allocator, not
> the real disk space.
>
> Can you try the below one and see what happens?

Still crashes, still has severe slowdowns.

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
Chris Mason
2012-Jan-05 18:46 UTC
Re: [3.2-rc7] slowdown, warning + oops creating lots of files
On Thu, Jan 05, 2012 at 10:01:22AM +1100, Dave Chinner wrote:
> On Thu, Jan 05, 2012 at 09:23:52AM +1100, Chris Samuel wrote:
> > On 05/01/12 09:11, Dave Chinner wrote:
> >
> > > Looks to be reproducable.
> >
> > Does this happen with rc6 ?
>
> I haven't tried. All I'm doing is running some benchmarks to get
> numbers for a talk I'm giving about improvements in XFS metadata
> scalability, so I wanted to update my last set of numbers from
> 2.6.39.
>
> As it was, these benchmarks also failed on btrfs with oopsen and
> corruptions back in 2.6.39 time frame. e.g. same VM, same
> test, different crashes, similar slowdowns as reported here:
> http://comments.gmane.org/gmane.comp.file-systems.btrfs/11062
>
> Given that there is now a history of this simple test uncovering
> problems, perhaps this is a test that should be run more regularly
> by btrfs developers?

Unfortunately, this one works for me. I'll try it again and see if I
can push harder. If not, I'll see if I can trade beer for some
diagnostic runs.

Thanks Dave

-chris
Liu Bo
2012-Jan-05 19:11 UTC
Re: [3.2-rc7] slowdown, warning + oops creating lots of files
On 01/04/2012 09:26 PM, Dave Chinner wrote:
> On Wed, Jan 04, 2012 at 09:23:18PM -0500, Liu Bo wrote:
>> On 01/04/2012 06:01 PM, Dave Chinner wrote:
>>> On Thu, Jan 05, 2012 at 09:23:52AM +1100, Chris Samuel wrote:
>>>> On 05/01/12 09:11, Dave Chinner wrote:
>>>>
>>>>> Looks to be reproducable.
>>>>
>>>> Does this happen with rc6 ?
>>>
>>> I haven't tried. All I'm doing is running some benchmarks to get
>>> numbers for a talk I'm giving about improvements in XFS metadata
>>> scalability, so I wanted to update my last set of numbers from
>>> 2.6.39.
>>>
>>> As it was, these benchmarks also failed on btrfs with oopsen and
>>> corruptions back in 2.6.39 time frame. e.g. same VM, same
>>> test, different crashes, similar slowdowns as reported here:
>>> http://comments.gmane.org/gmane.comp.file-systems.btrfs/11062
>>>
>>> Given that there is now a history of this simple test uncovering
>>> problems, perhaps this is a test that should be run more regularly
>>> by btrfs developers?
>>>
>>>> If not then it might be easy to track down as there are only
>>>> 2 modifications between rc6 and rc7..
>>>
>>> They don't look like they'd be responsible for fixing an extent tree
>>> corruption, and I don't really have the time to do an open-ended
>>> bisect to find where the problem fix arose.
>>>
>>> As it is, 3rd attempt failed at 22m inodes, without the warning this
>>> time:
>
> .....
>
>>> It's hard to tell exactly what path gets to that BUG_ON(), so much
>>> code is inlined by the compiler into run_clustered_refs() that I
>>> can't tell exactly how it got to the BUG_ON() triggered in
>>> alloc_reserved_tree_block().
>>
>> This seems to be an oops led by ENOSPC.
>
> At the time of the oops, this is the space used on the filesystem:
>
> $ df -h /mnt/scratch
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/vdc         17T   31G   17T   1% /mnt/scratch
>
> It's less than 0.2% full, so I think ENOSPC can be ruled out here.
>

This bug has done something with our block reservation allocator, not
the real disk space.

Can you try the below one and see what happens?

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index b1c8732..5a7f918 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3978,8 +3978,8 @@ static u64 calc_global_metadata_size(struct btrfs_fs_info *fs_info)
 		csum_size * 2;
 	num_bytes += div64_u64(data_used + meta_used, 50);
 
-	if (num_bytes * 3 > meta_used)
-		num_bytes = div64_u64(meta_used, 3);
+	if (num_bytes * 2 > meta_used)
+		num_bytes = div64_u64(meta_used, 2);
 
 	return ALIGN(num_bytes, fs_info->extent_root->leafsize << 10);
 }

> I have noticed one thing, however, in that there are significant
> numbers of reads coming from disk when the slowdowns and oops occur.
> When everything runs fast, there are virtually no reads occurring at
> all. It looks to me that maybe the working set of metadata is being
> kicked out of memory, only to be read back in again a short while
> later. Maybe that is a contributing factor.
>
> BTW, there is a lot of CPU time being spent on the tree locks. perf
> shows this as the top 2 CPU consumers:
>
> -   9.49%  [kernel]  [k] __write_lock_failed
>    - __write_lock_failed
>       - 99.80% _raw_write_lock
>          - 79.35% btrfs_try_tree_write_lock
>               99.99% btrfs_search_slot
>          - 20.63% btrfs_tree_lock
>               89.19% btrfs_search_slot
>               10.54% btrfs_lock_root_node
>                    btrfs_search_slot
> -   9.25%  [kernel]  [k] _raw_spin_unlock_irqrestore
>    - _raw_spin_unlock_irqrestore
>       - 55.87% __wake_up
>          + 93.89% btrfs_clear_lock_blocking_rw
>          + 3.46% btrfs_tree_read_unlock_blocking
>          + 2.35% btrfs_tree_unlock
>

hmm, the new extent_buffer lock scheme written by Chris is aimed at
avoiding such cases; maybe he can provide some advice.

thanks,
liubo

> Cheers,
>
> Dave.
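[Editorial note: the patch above only changes the ceiling applied to the global metadata reservation estimate. The standalone sketch below models that arithmetic to show the effect of the 1/3 → 1/2 change; it is an illustration, not the kernel code — the csum/leafsize terms and the final ALIGN rounding are deliberately omitted, and the function name is made up for this example.]

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model of calc_global_metadata_size()'s cap logic.
 * The estimate starts at ~2% of (data_used + meta_used); it is then
 * capped at meta_used / cap_divisor. The patch changes the divisor
 * from 3 to 2, letting the global reserve grow to half of the
 * metadata currently in use instead of a third. */
static uint64_t global_rsv_estimate(uint64_t data_used, uint64_t meta_used,
                                    uint64_t cap_divisor)
{
    uint64_t num_bytes = (data_used + meta_used) / 50; /* ~2% */

    if (num_bytes * cap_divisor > meta_used)
        num_bytes = meta_used / cap_divisor;
    return num_bytes;
}
```

With 10000 units of data and 30 of metadata in use, the 2% estimate (200) blows past both caps: the old divisor limits the reserve to 10, the patched one to 15 — the only behavioral change is a larger ceiling when data dwarfs metadata.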
Chris Mason
2012-Jan-05 19:45 UTC
Re: [3.2-rc7] slowdown, warning + oops creating lots of files
On Thu, Jan 05, 2012 at 01:46:57PM -0500, Chris Mason wrote:
> On Thu, Jan 05, 2012 at 10:01:22AM +1100, Dave Chinner wrote:
> > On Thu, Jan 05, 2012 at 09:23:52AM +1100, Chris Samuel wrote:
> > > On 05/01/12 09:11, Dave Chinner wrote:
> > >
> > > > Looks to be reproducable.
> > >
> > > Does this happen with rc6 ?
> >
> > I haven't tried. All I'm doing is running some benchmarks to get
> > numbers for a talk I'm giving about improvements in XFS metadata
> > scalability, so I wanted to update my last set of numbers from
> > 2.6.39.
> >
> > As it was, these benchmarks also failed on btrfs with oopsen and
> > corruptions back in 2.6.39 time frame. e.g. same VM, same
> > test, different crashes, similar slowdowns as reported here:
> > http://comments.gmane.org/gmane.comp.file-systems.btrfs/11062
> >
> > Given that there is now a history of this simple test uncovering
> > problems, perhaps this is a test that should be run more regularly
> > by btrfs developers?
>
> Unfortunately, this one works for me. I'll try it again and see if I
> can push harder. If not, I'll see if I can trade beer for some
> diagnostic runs.

Aha, if I try it just on the ssd instead of on my full array it triggers
at 88M files. Great.

-chris
Dave Chinner
2012-Jan-05 20:12 UTC
Re: [3.2-rc7] slowdown, warning + oops creating lots of files
On Thu, Jan 05, 2012 at 02:45:00PM -0500, Chris Mason wrote:
> On Thu, Jan 05, 2012 at 01:46:57PM -0500, Chris Mason wrote:
> > On Thu, Jan 05, 2012 at 10:01:22AM +1100, Dave Chinner wrote:
> > > On Thu, Jan 05, 2012 at 09:23:52AM +1100, Chris Samuel wrote:
> > > > On 05/01/12 09:11, Dave Chinner wrote:
> > > >
> > > > > Looks to be reproducable.
> > > >
> > > > Does this happen with rc6 ?
> > >
> > > I haven't tried. All I'm doing is running some benchmarks to get
> > > numbers for a talk I'm giving about improvements in XFS metadata
> > > scalability, so I wanted to update my last set of numbers from
> > > 2.6.39.
> > >
> > > As it was, these benchmarks also failed on btrfs with oopsen and
> > > corruptions back in 2.6.39 time frame. e.g. same VM, same
> > > test, different crashes, similar slowdowns as reported here:
> > > http://comments.gmane.org/gmane.comp.file-systems.btrfs/11062
> > >
> > > Given that there is now a history of this simple test uncovering
> > > problems, perhaps this is a test that should be run more regularly
> > > by btrfs developers?
> >
> > Unfortunately, this one works for me. I'll try it again and see if I
> > can push harder. If not, I'll see if I can trade beer for some
> > diagnostic runs.
>
> Aha, if I try it just on the ssd instead of on my full array it triggers
> at 88M files. Great.

Good to know. The error that is generating the BUG on my machine is
-28 (ENOSPC). Given there's 17TB free on my filesystem....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
Chris Mason
2012-Jan-05 21:02 UTC
Re: [3.2-rc7] slowdown, warning + oops creating lots of files
On Fri, Jan 06, 2012 at 07:12:16AM +1100, Dave Chinner wrote:
> On Thu, Jan 05, 2012 at 02:45:00PM -0500, Chris Mason wrote:
> > On Thu, Jan 05, 2012 at 01:46:57PM -0500, Chris Mason wrote:
> > > On Thu, Jan 05, 2012 at 10:01:22AM +1100, Dave Chinner wrote:
> > > > On Thu, Jan 05, 2012 at 09:23:52AM +1100, Chris Samuel wrote:
> > > > > On 05/01/12 09:11, Dave Chinner wrote:
> > > > >
> > > > > > Looks to be reproducable.
> > > > >
> > > > > Does this happen with rc6 ?
> > > >
> > > > I haven't tried. All I'm doing is running some benchmarks to get
> > > > numbers for a talk I'm giving about improvements in XFS metadata
> > > > scalability, so I wanted to update my last set of numbers from
> > > > 2.6.39.
> > > >
> > > > As it was, these benchmarks also failed on btrfs with oopsen and
> > > > corruptions back in 2.6.39 time frame. e.g. same VM, same
> > > > test, different crashes, similar slowdowns as reported here:
> > > > http://comments.gmane.org/gmane.comp.file-systems.btrfs/11062
> > > >
> > > > Given that there is now a history of this simple test uncovering
> > > > problems, perhaps this is a test that should be run more regularly
> > > > by btrfs developers?
> > >
> > > Unfortunately, this one works for me. I'll try it again and see if I
> > > can push harder. If not, I'll see if I can trade beer for some
> > > diagnostic runs.
> >
> > Aha, if I try it just on the ssd instead of on my full array it triggers
> > at 88M files. Great.
>
> Good to know. The error that is generating the BUG on my machine is
> -28 (ENOSPC). Given there's 17TB free on my filesystem....

Yeah, same thing here. I'm testing a fix now, it's pretty dumb. We're
not allocating more metadata chunks from the drive because of where the
allocation is happening, so it is just a check for "do we need a new
chunk" in the right place.

I'll make sure it can fill my ssd and then send to you.
-chris
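[Editorial note: to make the shape of the fix Chris describes concrete — checking "do we need a new chunk" on the allocation path before the reservation fails with ENOSPC — here is a purely illustrative helper. The name, signature, and the ~80% fullness threshold are assumptions for this sketch, not the actual btrfs patch.]

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch: decide whether a new metadata chunk should be
 * allocated from the device. Returns nonzero when the pending
 * reservation cannot fit in the existing chunks, or when those
 * chunks are already ~80% full. The real fix places a check like
 * this where metadata reservations are made, so the filesystem grows
 * its metadata space instead of returning -28 (ENOSPC). */
static int need_new_chunk(uint64_t total_bytes, uint64_t bytes_used,
                          uint64_t bytes_needed)
{
    if (bytes_used + bytes_needed > total_bytes)
        return 1;
    /* bytes_used / total_bytes >= 0.8, kept in integer math */
    return bytes_used * 10 >= total_bytes * 8;
}
```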
Chris Samuel
2012-Jan-05 21:24 UTC
Re: [3.2-rc7] slowdown, warning + oops creating lots of files
On Fri, 6 Jan 2012, 08:02:55 EST, Chris Mason <chris.mason@oracle.com> wrote:
> Yeah, same thing here. I'm testing a fix now, it's pretty dumb. We're
> not allocating more metadata chunks from the drive because of where the
> allocation is happening, so it is just a check for "do we need a new
> chunk" in the right place.
>
> I'll make sure it can fill my ssd and then send to you.

*If* the fix works would this be a candidate for a
stable 3.2.1 release rather than having to wait for
3.3?

cheers,
Chris
-- 
Chris Samuel - http://www.csamuel.org/ - on mobile
Chris Mason
2012-Jan-06 01:22 UTC
Re: [3.2-rc7] slowdown, warning + oops creating lots of files
On Fri, Jan 06, 2012 at 08:24:32AM +1100, Chris Samuel wrote:
> On Fri, 6 Jan 2012, 08:02:55 EST, Chris Mason <chris.mason@oracle.com> wrote:
>
> > Yeah, same thing here. I'm testing a fix now, it's pretty dumb. We're
> > not allocating more metadata chunks from the drive because of where the
> > allocation is happening, so it is just a check for "do we need a new
> > chunk" in the right place.
> >
> > I'll make sure it can fill my ssd and then send to you.
>
> *If* the fix works would this be a candidate for a
> stable 3.2.1 release rather than having to wait for
> 3.3?

Yes, it's pretty safe.

-chris
Christian Brunner
2012-Jan-07 21:34 UTC
Re: [3.2-rc7] slowdown, warning + oops creating lots of files
2012/1/5 Chris Mason <chris.mason@oracle.com>:
> On Fri, Jan 06, 2012 at 07:12:16AM +1100, Dave Chinner wrote:
>> On Thu, Jan 05, 2012 at 02:45:00PM -0500, Chris Mason wrote:
>> > On Thu, Jan 05, 2012 at 01:46:57PM -0500, Chris Mason wrote:
>> > > On Thu, Jan 05, 2012 at 10:01:22AM +1100, Dave Chinner wrote:
>> > > > On Thu, Jan 05, 2012 at 09:23:52AM +1100, Chris Samuel wrote:
>> > > > > On 05/01/12 09:11, Dave Chinner wrote:
>> > > > >
>> > > > > > Looks to be reproducable.
>> > > > >
>> > > > > Does this happen with rc6 ?
>> > > >
>> > > > I haven't tried. All I'm doing is running some benchmarks to get
>> > > > numbers for a talk I'm giving about improvements in XFS metadata
>> > > > scalability, so I wanted to update my last set of numbers from
>> > > > 2.6.39.
>> > > >
>> > > > As it was, these benchmarks also failed on btrfs with oopsen and
>> > > > corruptions back in 2.6.39 time frame. e.g. same VM, same
>> > > > test, different crashes, similar slowdowns as reported here:
>> > > > http://comments.gmane.org/gmane.comp.file-systems.btrfs/11062
>> > > >
>> > > > Given that there is now a history of this simple test uncovering
>> > > > problems, perhaps this is a test that should be run more regularly
>> > > > by btrfs developers?
>> > >
>> > > Unfortunately, this one works for me. I'll try it again and see if I
>> > > can push harder. If not, I'll see if I can trade beer for some
>> > > diagnostic runs.
>> >
>> > Aha, if I try it just on the ssd instead of on my full array it triggers
>> > at 88M files. Great.
>>
>> Good to know. The error that is generating the BUG on my machine is
>> -28 (ENOSPC). Given there's 17TB free on my filesystem....
>
> Yeah, same thing here. I'm testing a fix now, it's pretty dumb. We're
> not allocating more metadata chunks from the drive because of where the
> allocation is happening, so it is just a check for "do we need a new
> chunk" in the right place.
>
> I'll make sure it can fill my ssd and then send to you.

Could you send the patch to the list (or to me), please? Telling from
what you mentioned on IRC this sounds quite interesting and I would
like to see if this solves my performance problems with ceph, too...

Thanks,
Christian
Christian Brunner
2012-Jan-12 16:18 UTC
Re: [3.2-rc7] slowdown, warning + oops creating lots of files
2012/1/7 Christian Brunner <chb@muc.de>:
> 2012/1/5 Chris Mason <chris.mason@oracle.com>:
>> On Fri, Jan 06, 2012 at 07:12:16AM +1100, Dave Chinner wrote:
>>> On Thu, Jan 05, 2012 at 02:45:00PM -0500, Chris Mason wrote:
>>> > On Thu, Jan 05, 2012 at 01:46:57PM -0500, Chris Mason wrote:
>>> > >
>>> > > Unfortunately, this one works for me. I'll try it again and see if I
>>> > > can push harder. If not, I'll see if I can trade beer for some
>>> > > diagnostic runs.
>>> >
>>> > Aha, if I try it just on the ssd instead of on my full array it triggers
>>> > at 88M files. Great.
>>>
>>> Good to know. The error that is generating the BUG on my machine is
>>> -28 (ENOSPC). Given there's 17TB free on my filesystem....
>>
>> Yeah, same thing here. I'm testing a fix now, it's pretty dumb. We're
>> not allocating more metadata chunks from the drive because of where the
>> allocation is happening, so it is just a check for "do we need a new
>> chunk" in the right place.
>>
>> I'll make sure it can fill my ssd and then send to you.
>
> Could you send the patch to the list (or to me), please? Telling from
> what you mentioned on IRC this sounds quite interesting and I would
> like to see if this solves my performance problems with ceph, too...

I apologize for bothering you again, but I would really like to give it
a spin.

Thanks,
Christian