Hi everyone,

I've pushed out a new integration-test branch, and it includes a new
reader/writer locking scheme for the btree locks.

We've seen a number of benchmarks dominated by contention on the root
node lock. This changes our locks into a simple reader/writer lock.
They are based on mutexes so that we still take advantage of the mutex
adaptive spins for write locks (rwsemaphores were much slower).

I'm also sending the individual commits, please do take a look.

-chris

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
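As a rough illustration of the idea in the announcement (a reader/writer lock layered on a mutex), here is a minimal userspace sketch using POSIX threads. It is not the btrfs code: the demo_* names are hypothetical, and a default pthread mutex does not model the kernel mutex's adaptive spinning. Writers hold the mutex for their whole critical section; readers only take it briefly to bump a counter.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace analogue; NOT the btrfs implementation. */
struct demo_rwlock {
	pthread_mutex_t mutex;      /* held by a writer for its whole critical section */
	pthread_cond_t no_readers;  /* broadcast when the last reader leaves */
	int readers;                /* number of active readers */
};

static void demo_rwlock_init(struct demo_rwlock *l)
{
	pthread_mutex_init(&l->mutex, NULL);
	pthread_cond_init(&l->no_readers, NULL);
	l->readers = 0;
}

static void demo_read_lock(struct demo_rwlock *l)
{
	/* Blocks while a writer holds the mutex. */
	pthread_mutex_lock(&l->mutex);
	l->readers++;
	pthread_mutex_unlock(&l->mutex);
}

static void demo_read_unlock(struct demo_rwlock *l)
{
	pthread_mutex_lock(&l->mutex);
	assert(l->readers > 0);
	/* Wake every waiting writer; each re-checks the reader count. */
	if (--l->readers == 0)
		pthread_cond_broadcast(&l->no_readers);
	pthread_mutex_unlock(&l->mutex);
}

static void demo_write_lock(struct demo_rwlock *l)
{
	pthread_mutex_lock(&l->mutex);
	/* cond_wait drops the mutex while sleeping, so readers can finish. */
	while (l->readers > 0)
		pthread_cond_wait(&l->no_readers, &l->mutex);
	/* Returns with the mutex held: no new reader or writer can enter. */
}

static void demo_write_unlock(struct demo_rwlock *l)
{
	pthread_mutex_unlock(&l->mutex);
}
```

In this sketch several readers can be inside at once, while a writer excludes everyone. It makes no attempt at fairness, so a steady stream of readers could starve a writer.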
Tsutomu Itoh
2011-Jul-20 02:08 UTC
Re: new metadata reader/writer locks in integration-test
(2011/07/20 2:30), Chris Mason wrote:
> Hi everyone,
>
> I've pushed out a new integration-test branch, and it includes a new
> reader/writer locking scheme for the btree locks.
>
> We've seen a number of benchmarks dominated by contention on the root
> node lock. This changes our locks into a simple reader/writer lock.
> They are based on mutexes so that we still take advantage of the mutex
> adaptive spins for write locks (rwsemaphores were much slower).
>
> I'm also sending the individual commits, please do take a look.

I pulled the new integration-test branch, and I got the following
warning messages.

Jul 20 10:03:30 luna kernel: ------------[ cut here ]------------
Jul 20 10:03:30 luna kernel: WARNING: at fs/btrfs/extent-tree.c:5704 btrfs_alloc_free_block+0x178/0x340 [btrfs]()
Jul 20 10:03:30 luna kernel: Hardware name: PRIMERGY
Jul 20 10:03:30 luna kernel: Modules linked in: btrfs zlib_deflate crc32c libcrc32c autofs4 sunrpc 8021q garp stp llc cpufreq_ondemand acpi_cpufreq freq_table mperf ipv6 ext3 jbd dm_mirror dm_region_hash dm_log dm_mod kvm uinput ppdev parport_pc parport sg pcspkr i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support tg3 shpchp pci_hotplug i3000_edac edac_core ext4 mbcache jbd2 crc16 sd_mod crc_t10dif sr_mod cdrom megaraid_sas pata_acpi ata_generic ata_piix libata scsi_mod floppy [last unloaded: microcode]
Jul 20 10:03:30 luna kernel: Pid: 8311, comm: btrfs-endio-wri Tainted: G W 2.6.39btrfs-tc1+ #1
Jul 20 10:03:30 luna kernel: Call Trace:
Jul 20 10:03:30 luna kernel: [<ffffffff8106004f>] warn_slowpath_common+0x7f/0xc0
Jul 20 10:03:30 luna kernel: [<ffffffff810600aa>] warn_slowpath_null+0x1a/0x20
Jul 20 10:03:30 luna kernel: [<ffffffffa0455468>] btrfs_alloc_free_block+0x178/0x340 [btrfs]
Jul 20 10:03:30 luna kernel: [<ffffffffa047f377>] ? read_extent_buffer+0xb7/0x190 [btrfs]
Jul 20 10:03:30 luna kernel: [<ffffffffa04469ba>] split_leaf+0x14a/0x8c0 [btrfs]
Jul 20 10:03:30 luna kernel: [<ffffffffa044292a>] ? btrfs_leaf_free_space+0x8a/0xe0 [btrfs]
Jul 20 10:03:30 luna kernel: [<ffffffffa0447abc>] btrfs_search_slot+0x98c/0x9f0 [btrfs]
Jul 20 10:03:30 luna kernel: [<ffffffffa04730e4>] ? btrfs_drop_extents+0x7c4/0xa40 [btrfs]
Jul 20 10:03:30 luna kernel: [<ffffffffa0448aed>] btrfs_insert_empty_items+0x8d/0xf0 [btrfs]
Jul 20 10:03:30 luna kernel: [<ffffffffa047fdad>] ? set_extent_bit+0x22d/0x5d0 [btrfs]
Jul 20 10:03:30 luna kernel: [<ffffffffa0466f1e>] insert_reserved_file_extent.clone.0+0xbe/0x270 [btrfs]
Jul 20 10:03:30 luna kernel: [<ffffffffa046bb4b>] btrfs_finish_ordered_io+0x2eb/0x360 [btrfs]
Jul 20 10:03:30 luna kernel: [<ffffffffa046bc10>] btrfs_writepage_end_io_hook+0x50/0xa0 [btrfs]
Jul 20 10:03:30 luna kernel: [<ffffffffa049d306>] end_compressed_bio_write+0x86/0xf0 [btrfs]
Jul 20 10:03:30 luna kernel: [<ffffffff8117f96d>] bio_endio+0x1d/0x40
Jul 20 10:03:30 luna kernel: [<ffffffffa045ccd4>] end_workqueue_fn+0xf4/0x130 [btrfs]
Jul 20 10:03:30 luna kernel: [<ffffffffa048b35e>] worker_loop+0x13e/0x540 [btrfs]
Jul 20 10:03:30 luna kernel: [<ffffffffa048b220>] ? btrfs_queue_worker+0x2d0/0x2d0 [btrfs]
Jul 20 10:03:30 luna kernel: [<ffffffffa048b220>] ? btrfs_queue_worker+0x2d0/0x2d0 [btrfs]
Jul 20 10:03:30 luna kernel: [<ffffffff81081756>] kthread+0x96/0xa0
Jul 20 10:03:30 luna kernel: [<ffffffff81486004>] kernel_thread_helper+0x4/0x10
Jul 20 10:03:30 luna kernel: [<ffffffff810816c0>] ? kthread_worker_fn+0x1a0/0x1a0
Jul 20 10:03:30 luna kernel: [<ffffffff81486000>] ? gs_change+0x13/0x13
Jul 20 10:03:30 luna kernel: ---[ end trace c52e468b5140fdf0 ]---
Jul 20 10:07:27 luna kernel: ------------[ cut here ]------------
Jul 20 10:07:27 luna kernel: WARNING: at fs/btrfs/extent-tree.c:3860 btrfs_free_block_groups+0x217/0x290 [btrfs]()
Jul 20 10:07:27 luna kernel: Hardware name: PRIMERGY
Jul 20 10:07:27 luna kernel: Modules linked in: btrfs zlib_deflate crc32c libcrc32c autofs4 sunrpc 8021q garp stp llc cpufreq_ondemand acpi_cpufreq freq_table mperf ipv6 ext3 jbd dm_mirror dm_region_hash dm_log dm_mod kvm uinput ppdev parport_pc parport sg pcspkr i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support tg3 shpchp pci_hotplug i3000_edac edac_core ext4 mbcache jbd2 crc16 sd_mod crc_t10dif sr_mod cdrom megaraid_sas pata_acpi ata_generic ata_piix libata scsi_mod floppy [last unloaded: microcode]
Jul 20 10:07:27 luna kernel: Pid: 12632, comm: umount Tainted: G W 2.6.39btrfs-tc1+ #1
Jul 20 10:07:27 luna kernel: Call Trace:
Jul 20 10:07:27 luna kernel: [<ffffffff8106004f>] warn_slowpath_common+0x7f/0xc0
Jul 20 10:07:27 luna kernel: [<ffffffff810600aa>] warn_slowpath_null+0x1a/0x20
Jul 20 10:07:27 luna kernel: [<ffffffffa044c3e7>] btrfs_free_block_groups+0x217/0x290 [btrfs]
Jul 20 10:07:27 luna kernel: [<ffffffffa045dfb9>] close_ctree+0x1e9/0x390 [btrfs]
Jul 20 10:07:27 luna kernel: [<ffffffff81168931>] ? dispose_list+0x41/0x60
Jul 20 10:07:27 luna kernel: [<ffffffff8147c5b6>] ? down_write+0x16/0x40
Jul 20 10:07:27 luna kernel: [<ffffffffa043d5dd>] btrfs_put_super+0x1d/0x30 [btrfs]
Jul 20 10:07:27 luna kernel: [<ffffffff81151d52>] generic_shutdown_super+0x72/0xf0
Jul 20 10:07:27 luna kernel: [<ffffffff81151e66>] kill_anon_super+0x16/0x60
Jul 20 10:07:27 luna kernel: [<ffffffff81152575>] deactivate_locked_super+0x45/0x70
Jul 20 10:07:27 luna kernel: [<ffffffff811531da>] deactivate_super+0x4a/0x70
Jul 20 10:07:27 luna kernel: [<ffffffff8116cadc>] mntput_no_expire+0x13c/0x1c0
Jul 20 10:07:27 luna kernel: [<ffffffff8116d2bb>] sys_umount+0x7b/0x3a0
Jul 20 10:07:27 luna kernel: [<ffffffff81484ec2>] system_call_fastpath+0x16/0x1b
Jul 20 10:07:27 luna kernel: ---[ end trace c52e468b5140fdf1 ]---
Arne Jansen
2011-Jul-20 06:55 UTC
Re: new metadata reader/writer locks in integration-test
Hi Chris,

On 19.07.2011 19:30, Chris Mason wrote:
> Hi everyone,
>
> I've pushed out a new integration-test branch, and it includes a new
> reader/writer locking scheme for the btree locks.
>

I rebased my for-chris branch containing the readahead patches for scrub
to your integration-test branch. It had only trivial conflicts.
Hopefully it can go into 3.1 as well.

-Arne

> We've seen a number of benchmarks dominated by contention on the root
> node lock. This changes our locks into a simple reader/writer lock.
> They are based on mutexes so that we still take advantage of the mutex
> adaptive spins for write locks (rwsemaphores were much slower).
>
> I'm also sending the individual commits, please do take a look.
>
> -chris
Chris Mason
2011-Jul-20 07:58 UTC
Re: new metadata reader/writer locks in integration-test
Excerpts from Tsutomu Itoh's message of 2011-07-19 22:08:38 -0400:
> (2011/07/20 2:30), Chris Mason wrote:
> > Hi everyone,
> >
> > I've pushed out a new integration-test branch, and it includes a new
> > reader/writer locking scheme for the btree locks.
> >
> > We've seen a number of benchmarks dominated by contention on the root
> > node lock. This changes our locks into a simple reader/writer lock.
> > They are based on mutexes so that we still take advantage of the mutex
> > adaptive spins for write locks (rwsemaphores were much slower).
> >
> > I'm also sending the individual commits, please do take a look.
>
> I pulled the new integration-test branch, and I got the following
> warning messages.
>
> Jul 20 10:03:30 luna kernel: ------------[ cut here ]------------
> Jul 20 10:03:30 luna kernel: WARNING: at fs/btrfs/extent-tree.c:5704 btrfs_alloc_free_block+0x178/0x340 [btrfs]()

Thanks, I think this one is related to Josef's enospc changes, but I'll
double check. What was the test?

-chris
Tsutomu Itoh
2011-Jul-20 08:36 UTC
Re: new metadata reader/writer locks in integration-test
(2011/07/20 16:58), Chris Mason wrote:
> Excerpts from Tsutomu Itoh's message of 2011-07-19 22:08:38 -0400:
>> (2011/07/20 2:30), Chris Mason wrote:
>>> Hi everyone,
>>>
>>> I've pushed out a new integration-test branch, and it includes a new
>>> reader/writer locking scheme for the btree locks.
>>>
>>> We've seen a number of benchmarks dominated by contention on the root
>>> node lock. This changes our locks into a simple reader/writer lock.
>>> They are based on mutexes so that we still take advantage of the mutex
>>> adaptive spins for write locks (rwsemaphores were much slower).
>>>
>>> I'm also sending the individual commits, please do take a look.
>>
>> I pulled the new integration-test branch, and I got the following
>> warning messages.
>>
>> Jul 20 10:03:30 luna kernel: ------------[ cut here ]------------
>> Jul 20 10:03:30 luna kernel: WARNING: at fs/btrfs/extent-tree.c:5704 btrfs_alloc_free_block+0x178/0x340 [btrfs]()
>
> Thanks, I think this one is related to Josef's enospc changes, but I'll
> double check.
> What was the test?

I ran my own test script. It concurrently creates and deletes a large
number of files, creates and deletes a big file, and so on.

Thanks,
Tsutomu
Arne Jansen
2011-Jul-20 16:49 UTC
Re: new metadata reader/writer locks in integration-test
On 20.07.2011 08:55, Arne Jansen wrote:
> Hi Chris,
>
> On 19.07.2011 19:30, Chris Mason wrote:
>> Hi everyone,
>>
>> I've pushed out a new integration-test branch, and it includes a new
>> reader/writer locking scheme for the btree locks.
>>
>
> I rebased my for-chris branch containing the readahead patches for scrub
> to your integration-test branch. It had only trivial conflicts.
> Hopefully it can go into 3.1 as well.

The readahead series contained a stupid bug I introduced in v5. The
corrected version is pushed out.

>
> -Arne
>
>> We've seen a number of benchmarks dominated by contention on the root
>> node lock. This changes our locks into a simple reader/writer lock.
>> They are based on mutexes so that we still take advantage of the mutex
>> adaptive spins for write locks (rwsemaphores were much slower).
>>
>> I'm also sending the individual commits, please do take a look.
>>
>> -chris
>
Chris Mason
2011-Jul-20 17:21 UTC
Re: new metadata reader/writer locks in integration-test
Excerpts from Chris Mason's message of 2011-07-19 13:30:22 -0400:
> Hi everyone,
>
> I've pushed out a new integration-test branch, and it includes a new
> reader/writer locking scheme for the btree locks.
>
> We've seen a number of benchmarks dominated by contention on the root
> node lock. This changes our locks into a simple reader/writer lock.
> They are based on mutexes so that we still take advantage of the mutex
> adaptive spins for write locks (rwsemaphores were much slower).
>
> I'm also sending the individual commits, please do take a look.

Hi everyone,

I just rebased Josef's enospc fixes into integration-test, it should fix
the warnings in extent-tree.c

-chris
Chris Mason
2011-Jul-20 18:51 UTC
Re: new metadata reader/writer locks in integration-test
Excerpts from Chris Mason's message of 2011-07-20 13:21:47 -0400:
> Excerpts from Chris Mason's message of 2011-07-19 13:30:22 -0400:
> > Hi everyone,
> >
> > I've pushed out a new integration-test branch, and it includes a new
> > reader/writer locking scheme for the btree locks.
> >
> > We've seen a number of benchmarks dominated by contention on the root
> > node lock. This changes our locks into a simple reader/writer lock.
> > They are based on mutexes so that we still take advantage of the mutex
> > adaptive spins for write locks (rwsemaphores were much slower).
> >
> > I'm also sending the individual commits, please do take a look.
>
> Hi everyone,
>
> I just rebased Josef's enospc fixes into integration-test, it should fix
> the warnings in extent-tree.c

And one more rebase to fix the x86-32 problems.

-chris
Tsutomu Itoh
2011-Jul-21 00:48 UTC
Re: new metadata reader/writer locks in integration-test
(2011/07/21 2:21), Chris Mason wrote:
> Excerpts from Chris Mason's message of 2011-07-19 13:30:22 -0400:
>> Hi everyone,
>>
>> I've pushed out a new integration-test branch, and it includes a new
>> reader/writer locking scheme for the btree locks.
>>
>> We've seen a number of benchmarks dominated by contention on the root
>> node lock. This changes our locks into a simple reader/writer lock.
>> They are based on mutexes so that we still take advantage of the mutex
>> adaptive spins for write locks (rwsemaphores were much slower).
>>
>> I'm also sending the individual commits, please do take a look.
>
> Hi everyone,
>
> I just rebased Josef's enospc fixes into integration-test, it should fix
> the warnings in extent-tree.c
>

Unfortunately, I got the following messages.

Jul 21 09:41:22 luna kernel: ------------[ cut here ]------------
Jul 21 09:41:22 luna kernel: WARNING: at fs/btrfs/extent-tree.c:5564 btrfs_alloc_reserved_file_extent+0xf8/0x100 [btrfs]()
Jul 21 09:41:22 luna kernel: Hardware name: PRIMERGY
Jul 21 09:41:22 luna kernel: Modules linked in: btrfs zlib_deflate crc32c libcrc32c autofs4 sunrpc 8021q garp stp llc cpufreq_ondemand acpi_cpufreq freq_table mperf ipv6 ext3 jbd dm_mirror dm_region_hash dm_log dm_mod kvm uinput ppdev parport_pc parport sg pcspkr i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support tg3 shpchp pci_hotplug i3000_edac edac_core ext4 mbcache jbd2 crc16 sd_mod crc_t10dif sr_mod cdrom megaraid_sas floppy pata_acpi ata_generic ata_piix libata scsi_mod [last unloaded: microcode]
Jul 21 09:41:22 luna kernel: Pid: 5517, comm: btrfs-endio-wri Tainted: G W 2.6.39btrfs-tc1+ #1
Jul 21 09:41:22 luna kernel: Call Trace:
Jul 21 09:41:22 luna kernel: [<ffffffff8106004f>] warn_slowpath_common+0x7f/0xc0
Jul 21 09:41:22 luna kernel: [<ffffffff810600aa>] warn_slowpath_null+0x1a/0x20
Jul 21 09:41:22 luna kernel: [<ffffffffa044a068>] btrfs_alloc_reserved_file_extent+0xf8/0x100 [btrfs]
Jul 21 09:41:22 luna kernel: [<ffffffffa0464121>] insert_reserved_file_extent.clone.0+0x201/0x270 [btrfs]
Jul 21 09:41:22 luna kernel: [<ffffffffa0468c0b>] btrfs_finish_ordered_io+0x2eb/0x360 [btrfs]
Jul 21 09:41:22 luna kernel: [<ffffffff8106fe23>] ? try_to_del_timer_sync+0x83/0xe0
Jul 21 09:41:22 luna kernel: [<ffffffffa0468cd0>] btrfs_writepage_end_io_hook+0x50/0xa0 [btrfs]
Jul 21 09:41:22 luna kernel: [<ffffffffa049a3c6>] end_compressed_bio_write+0x86/0xf0 [btrfs]
Jul 21 09:41:22 luna kernel: [<ffffffff8117f96d>] bio_endio+0x1d/0x40
Jul 21 09:41:22 luna kernel: [<ffffffffa0459d84>] end_workqueue_fn+0xf4/0x130 [btrfs]
Jul 21 09:41:22 luna kernel: [<ffffffffa048841e>] worker_loop+0x13e/0x540 [btrfs]
Jul 21 09:41:22 luna kernel: [<ffffffffa04882e0>] ? btrfs_queue_worker+0x2d0/0x2d0 [btrfs]
Jul 21 09:41:22 luna kernel: [<ffffffffa04882e0>] ? btrfs_queue_worker+0x2d0/0x2d0 [btrfs]
Jul 21 09:41:22 luna kernel: [<ffffffff81081756>] kthread+0x96/0xa0
Jul 21 09:41:22 luna kernel: [<ffffffff81486004>] kernel_thread_helper+0x4/0x10
Jul 21 09:41:22 luna kernel: [<ffffffff810816c0>] ? kthread_worker_fn+0x1a0/0x1a0
Jul 21 09:41:22 luna kernel: [<ffffffff81486000>] ? gs_change+0x13/0x13
Jul 21 09:41:22 luna kernel: ---[ end trace 02c1fa3044677043 ]---
Jul 21 09:42:21 luna kernel: ------------[ cut here ]------------
Jul 21 09:42:21 luna kernel: WARNING: at fs/btrfs/extent-tree.c:3860 btrfs_free_block_groups+0x217/0x290 [btrfs]()
Jul 21 09:42:21 luna kernel: Hardware name: PRIMERGY
Jul 21 09:42:21 luna kernel: Modules linked in: btrfs zlib_deflate crc32c libcrc32c autofs4 sunrpc 8021q garp stp llc cpufreq_ondemand acpi_cpufreq freq_table mperf ipv6 ext3 jbd dm_mirror dm_region_hash dm_log dm_mod kvm uinput ppdev parport_pc parport sg pcspkr i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support tg3 shpchp pci_hotplug i3000_edac edac_core ext4 mbcache jbd2 crc16 sd_mod crc_t10dif sr_mod cdrom megaraid_sas floppy pata_acpi ata_generic ata_piix libata scsi_mod [last unloaded: microcode]
Jul 21 09:42:21 luna kernel: Pid: 26136, comm: umount Tainted: G W 2.6.39btrfs-tc1+ #1
Jul 21 09:42:21 luna kernel: Call Trace:
Jul 21 09:42:21 luna kernel: [<ffffffff8106004f>] warn_slowpath_common+0x7f/0xc0
Jul 21 09:42:21 luna kernel: [<ffffffff810600aa>] warn_slowpath_null+0x1a/0x20
Jul 21 09:42:21 luna kernel: [<ffffffffa04493e7>] btrfs_free_block_groups+0x217/0x290 [btrfs]
Jul 21 09:42:21 luna kernel: [<ffffffffa045b069>] close_ctree+0x1e9/0x390 [btrfs]
Jul 21 09:42:21 luna kernel: [<ffffffff81168931>] ? dispose_list+0x41/0x60
Jul 21 09:42:21 luna kernel: [<ffffffff8147c5b6>] ? down_write+0x16/0x40
Jul 21 09:42:21 luna kernel: [<ffffffffa043a5dd>] btrfs_put_super+0x1d/0x30 [btrfs]
Jul 21 09:42:21 luna kernel: [<ffffffff81151d52>] generic_shutdown_super+0x72/0xf0
Jul 21 09:42:21 luna kernel: [<ffffffff81151e66>] kill_anon_super+0x16/0x60
Jul 21 09:42:21 luna kernel: [<ffffffff81152575>] deactivate_locked_super+0x45/0x70
Jul 21 09:42:21 luna kernel: [<ffffffff811531da>] deactivate_super+0x4a/0x70
Jul 21 09:42:21 luna kernel: [<ffffffff8116cadc>] mntput_no_expire+0x13c/0x1c0
Jul 21 09:42:21 luna kernel: [<ffffffff8116d2bb>] sys_umount+0x7b/0x3a0
Jul 21 09:42:21 luna kernel: [<ffffffff81484ec2>] system_call_fastpath+0x16/0x1b
Jul 21 09:42:21 luna kernel: ---[ end trace 02c1fa3044677044 ]---
Jul 21 09:42:21 luna kernel: ------------[ cut here ]------------
Jul 21 09:42:21 luna kernel: WARNING: at fs/btrfs/extent-tree.c:3861 btrfs_free_block_groups+0x285/0x290 [btrfs]()
Jul 21 09:42:21 luna kernel: Hardware name: PRIMERGY
Jul 21 09:42:21 luna kernel: Modules linked in: btrfs zlib_deflate crc32c libcrc32c autofs4 sunrpc 8021q garp stp llc cpufreq_ondemand acpi_cpufreq freq_table mperf ipv6 ext3 jbd dm_mirror dm_region_hash dm_log dm_mod kvm uinput ppdev parport_pc parport sg pcspkr i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support tg3 shpchp pci_hotplug i3000_edac edac_core ext4 mbcache jbd2 crc16 sd_mod crc_t10dif sr_mod cdrom megaraid_sas floppy pata_acpi ata_generic ata_piix libata scsi_mod [last unloaded: microcode]
Jul 21 09:42:21 luna kernel: Pid: 26136, comm: umount Tainted: G W 2.6.39btrfs-tc1+ #1
Jul 21 09:42:21 luna kernel: Call Trace:
Jul 21 09:42:21 luna kernel: [<ffffffff8106004f>] warn_slowpath_common+0x7f/0xc0
Jul 21 09:42:21 luna kernel: [<ffffffff810600aa>] warn_slowpath_null+0x1a/0x20
Jul 21 09:42:21 luna kernel: [<ffffffffa0449455>] btrfs_free_block_groups+0x285/0x290 [btrfs]
Jul 21 09:42:21 luna kernel: [<ffffffffa045b069>] close_ctree+0x1e9/0x390 [btrfs]
Jul 21 09:42:21 luna kernel: [<ffffffff81168931>] ? dispose_list+0x41/0x60
Jul 21 09:42:21 luna kernel: [<ffffffff8147c5b6>] ? down_write+0x16/0x40
Jul 21 09:42:21 luna kernel: [<ffffffffa043a5dd>] btrfs_put_super+0x1d/0x30 [btrfs]
Jul 21 09:42:21 luna kernel: [<ffffffff81151d52>] generic_shutdown_super+0x72/0xf0
Jul 21 09:42:21 luna kernel: [<ffffffff81151e66>] kill_anon_super+0x16/0x60
Jul 21 09:42:21 luna kernel: [<ffffffff81152575>] deactivate_locked_super+0x45/0x70
Jul 21 09:42:21 luna kernel: [<ffffffff811531da>] deactivate_super+0x4a/0x70
Jul 21 09:42:21 luna kernel: [<ffffffff8116cadc>] mntput_no_expire+0x13c/0x1c0
Jul 21 09:42:21 luna kernel: [<ffffffff8116d2bb>] sys_umount+0x7b/0x3a0
Jul 21 09:42:21 luna kernel: [<ffffffff81484ec2>] system_call_fastpath+0x16/0x1b
Jul 21 09:42:21 luna kernel: ---[ end trace 02c1fa3044677045 ]---
Jul 21 09:42:21 luna kernel: ------------[ cut here ]------------
Jul 21 09:42:21 luna kernel: WARNING: at fs/btrfs/extent-tree.c:6923 btrfs_free_block_groups+0x1cb/0x290 [btrfs]()
Jul 21 09:42:21 luna kernel: Hardware name: PRIMERGY
Jul 21 09:42:21 luna kernel: Modules linked in: btrfs zlib_deflate crc32c libcrc32c autofs4 sunrpc 8021q garp stp llc cpufreq_ondemand acpi_cpufreq freq_table mperf ipv6 ext3 jbd dm_mirror dm_region_hash dm_log dm_mod kvm uinput ppdev parport_pc parport sg pcspkr i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support tg3 shpchp pci_hotplug i3000_edac edac_core ext4 mbcache jbd2 crc16 sd_mod crc_t10dif sr_mod cdrom megaraid_sas floppy pata_acpi ata_generic ata_piix libata scsi_mod [last unloaded: microcode]
Jul 21 09:42:21 luna kernel: Pid: 26136, comm: umount Tainted: G W 2.6.39btrfs-tc1+ #1
Jul 21 09:42:21 luna kernel: Call Trace:
Jul 21 09:42:21 luna kernel: [<ffffffff8106004f>] warn_slowpath_common+0x7f/0xc0
Jul 21 09:42:21 luna kernel: [<ffffffff810600aa>] warn_slowpath_null+0x1a/0x20
Jul 21 09:42:21 luna kernel: [<ffffffffa044939b>] btrfs_free_block_groups+0x1cb/0x290 [btrfs]
Jul 21 09:42:21 luna kernel: [<ffffffffa045b069>] close_ctree+0x1e9/0x390 [btrfs]
Jul 21 09:42:21 luna kernel: [<ffffffff81168931>] ? dispose_list+0x41/0x60
Jul 21 09:42:21 luna kernel: [<ffffffff8147c5b6>] ? down_write+0x16/0x40
Jul 21 09:42:21 luna kernel: [<ffffffffa043a5dd>] btrfs_put_super+0x1d/0x30 [btrfs]
Jul 21 09:42:21 luna kernel: [<ffffffff81151d52>] generic_shutdown_super+0x72/0xf0
Jul 21 09:42:21 luna kernel: [<ffffffff81151e66>] kill_anon_super+0x16/0x60
Jul 21 09:42:21 luna kernel: [<ffffffff81152575>] deactivate_locked_super+0x45/0x70
Jul 21 09:42:21 luna kernel: [<ffffffff811531da>] deactivate_super+0x4a/0x70
Jul 21 09:42:21 luna kernel: [<ffffffff8116cadc>] mntput_no_expire+0x13c/0x1c0
Jul 21 09:42:21 luna kernel: [<ffffffff8116d2bb>] sys_umount+0x7b/0x3a0
Jul 21 09:42:21 luna kernel: [<ffffffff81484ec2>] system_call_fastpath+0x16/0x1b
Jul 21 09:42:21 luna kernel: ---[ end trace 02c1fa3044677046 ]---
Chris Mason wrote:
> Excerpts from Chris Mason's message of 2011-07-20 13:21:47 -0400:
>> Excerpts from Chris Mason's message of 2011-07-19 13:30:22 -0400:
>>> Hi everyone,
>>>
>>> I've pushed out a new integration-test branch, and it includes a new
>>> reader/writer locking scheme for the btree locks.
>>>
>>> We've seen a number of benchmarks dominated by contention on the root
>>> node lock. This changes our locks into a simple reader/writer lock.
>>> They are based on mutexes so that we still take advantage of the mutex
>>> adaptive spins for write locks (rwsemaphores were much slower).
>>>
>>> I'm also sending the individual commits, please do take a look.
>>
>> Hi everyone,
>>
>> I just rebased Josef's enospc fixes into integration-test, it should fix
>> the warnings in extent-tree.c
>
> And one more rebase to fix the x86-32 problems.
>

We can simply use page_address() in this macro:

#define BTRFS_SETGET_HEADER_FUNCS(name, type, member, bits) \
static inline u##bits btrfs_##name(struct extent_buffer *eb) \
{ \
        type *p = kmap_atomic(eb->first_page, KM_USER0); \
        u##bits res = le##bits##_to_cpu(p->member); \
        kunmap_atomic(p, KM_USER0); \
        return res; \
}
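For readers following along, the page_address() variant suggested above might look roughly like the sketch below. This is a userspace mock, not kernel code: struct page, struct extent_buffer, page_address(), le64_to_cpu() and struct demo_header are simplified stand-ins so the macro can compile on its own, and the demo_* names are hypothetical. In the kernel, page_address() only returns a usable pointer for pages with a permanent kernel mapping, so this assumes the extent buffer pages are never in highmem.

```c
#include <stdint.h>

typedef uint64_t u64;

/* Userspace stand-ins (assumptions for illustration, not kernel definitions). */
struct page { void *virt; };
struct extent_buffer { struct page *first_page; };

/* Mock: the kernel version returns the page's permanent kernel mapping. */
static inline void *page_address(const struct page *page)
{
	return page->virt;
}

/* Portable little-endian decode of a 64-bit value stored in memory. */
static inline u64 le64_to_cpu(u64 v)
{
	const unsigned char *b = (const unsigned char *)&v;
	u64 r = 0;
	int i;

	for (i = 7; i >= 0; i--)
		r = (r << 8) | b[i];
	return r;
}

/* Same shape as the quoted macro, with no kmap_atomic/kunmap_atomic pair. */
#define BTRFS_SETGET_HEADER_FUNCS(name, type, member, bits) \
static inline u##bits btrfs_##name(struct extent_buffer *eb) \
{ \
	type *p = page_address(eb->first_page); \
	return le##bits##_to_cpu(p->member); \
}

/* Hypothetical on-disk header with a single little-endian field. */
struct demo_header { u64 generation; };

/* Expands to: static inline u64 btrfs_demo_generation(struct extent_buffer *eb) */
BTRFS_SETGET_HEADER_FUNCS(demo_generation, struct demo_header, generation, 64)
```

The accessor no longer needs a temporary variable, because there is no mapping to tear down before returning.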
Arne Jansen
2011-Jul-21 05:44 UTC
Re: new metadata reader/writer locks in integration-test
On 20.07.2011 19:21, Chris Mason wrote:
> Excerpts from Chris Mason's message of 2011-07-19 13:30:22 -0400:
>> Hi everyone,
>>
>> I've pushed out a new integration-test branch, and it includes a new
>> reader/writer locking scheme for the btree locks.
>>
>> We've seen a number of benchmarks dominated by contention on the root
>> node lock. This changes our locks into a simple reader/writer lock.
>> They are based on mutexes so that we still take advantage of the mutex
>> adaptive spins for write locks (rwsemaphores were much slower).
>>
>> I'm also sending the individual commits, please do take a look.
>
> Hi everyone,
>
> I just rebased Josef's enospc fixes into integration-test, it should fix
> the warnings in extent-tree.c
>

With the current integration-test branch I get very early enospc on a
7G volume created with -m single -d single and

fs_mark-3.3/fs_mark -d /mnt/fsm -D 512 -t 16 -n 4096 -s 51200 -L5 -S0 -R1

It hits enospc at about 20%, but I can continue to fill it up to 94%.

-Arne
Arne Jansen
2011-Jul-21 05:46 UTC
Re: new metadata reader/writer locks in integration-test
On 21.07.2011 02:48, Tsutomu Itoh wrote:
> (2011/07/21 2:21), Chris Mason wrote:
>> Excerpts from Chris Mason's message of 2011-07-19 13:30:22 -0400:
>>> Hi everyone,
>>>
>>> I've pushed out a new integration-test branch, and it includes a new
>>> reader/writer locking scheme for the btree locks.
>>>
>>> We've seen a number of benchmarks dominated by contention on the root
>>> node lock. This changes our locks into a simple reader/writer lock.
>>> They are based on mutexes so that we still take advantage of the mutex
>>> adaptive spins for write locks (rwsemaphores were much slower).
>>>
>>> I'm also sending the individual commits, please do take a look.
>>
>> Hi everyone,
>>
>> I just rebased Josef's enospc fixes into integration-test, it should fix
>> the warnings in extent-tree.c
>>
>
> Unfortunately, I got the following messages.
>
>
> Jul 21 09:41:22 luna kernel: ------------[ cut here ]------------
> Jul 21 09:41:22 luna kernel: WARNING: at fs/btrfs/extent-tree.c:5564 btrfs_alloc_reserved_file_extent+0xf8/0x100 [btrfs]()
> Jul 21 09:41:22 luna kernel: Hardware name: PRIMERGY
> Jul 21 09:41:22 luna kernel: Modules linked in: btrfs zlib_deflate crc32c libcrc32c autofs4 sunrpc 8021q garp stp llc cpufreq_ondemand acpi_cpufreq freq_table mperf ipv6 ext3 jbd dm_mirror dm_region_hash dm_log dm_mod kvm uinput ppdev parport_pc parport sg pcspkr i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support tg3 shpchp pci_hotplug i3000_edac edac_core ext4 mbcache jbd2 crc16 sd_mod crc_t10dif sr_mod cdrom megaraid_sas floppy pata_acpi ata_generic ata_piix libata scsi_mod [last unloaded: microcode]
> Jul 21 09:41:22 luna kernel: Pid: 5517, comm: btrfs-endio-wri Tainted: G W 2.6.39btrfs-tc1+ #1
> Jul 21 09:41:22 luna kernel: Call Trace:
> Jul 21 09:41:22 luna kernel: [<ffffffff8106004f>] warn_slowpath_common+0x7f/0xc0
> Jul 21 09:41:22 luna kernel: [<ffffffff810600aa>] warn_slowpath_null+0x1a/0x20
> Jul 21 09:41:22 luna kernel: [<ffffffffa044a068>] btrfs_alloc_reserved_file_extent+0xf8/0x100 [btrfs]
> Jul 21 09:41:22 luna kernel: [<ffffffffa0464121>] insert_reserved_file_extent.clone.0+0x201/0x270 [btrfs]
> Jul 21 09:41:22 luna kernel: [<ffffffffa0468c0b>] btrfs_finish_ordered_io+0x2eb/0x360 [btrfs]
> Jul 21 09:41:22 luna kernel: [<ffffffff8106fe23>] ? try_to_del_timer_sync+0x83/0xe0
> Jul 21 09:41:22 luna kernel: [<ffffffffa0468cd0>] btrfs_writepage_end_io_hook+0x50/0xa0 [btrfs]
> Jul 21 09:41:22 luna kernel: [<ffffffffa049a3c6>] end_compressed_bio_write+0x86/0xf0 [btrfs]
> Jul 21 09:41:22 luna kernel: [<ffffffff8117f96d>] bio_endio+0x1d/0x40
> Jul 21 09:41:22 luna kernel: [<ffffffffa0459d84>] end_workqueue_fn+0xf4/0x130 [btrfs]
> Jul 21 09:41:22 luna kernel: [<ffffffffa048841e>] worker_loop+0x13e/0x540 [btrfs]
> Jul 21 09:41:22 luna kernel: [<ffffffffa04882e0>] ? btrfs_queue_worker+0x2d0/0x2d0 [btrfs]
> Jul 21 09:41:22 luna kernel: [<ffffffffa04882e0>] ? btrfs_queue_worker+0x2d0/0x2d0 [btrfs]
> Jul 21 09:41:22 luna kernel: [<ffffffff81081756>] kthread+0x96/0xa0
> Jul 21 09:41:22 luna kernel: [<ffffffff81486004>] kernel_thread_helper+0x4/0x10
> Jul 21 09:41:22 luna kernel: [<ffffffff810816c0>] ? kthread_worker_fn+0x1a0/0x1a0
> Jul 21 09:41:22 luna kernel: [<ffffffff81486000>] ? gs_change+0x13/0x13
> Jul 21 09:41:22 luna kernel: ---[ end trace 02c1fa3044677043 ]---
>

A very similar warning here, but without compression involved:

Jul 21 07:42:55 qualactin kernel: [57061.396898] ------------[ cut here ]------------
Jul 21 07:42:55 qualactin kernel: [57061.396923] WARNING: at fs/btrfs/extent-tree.c:5564 btrfs_alloc_reserved_file_extent+0xf8/0x100 [btrfs]()
Jul 21 07:42:55 qualactin kernel: [57061.396927] Hardware name: X8SIL
Jul 21 07:42:55 qualactin kernel: [57061.396930] Modules linked in: btrfs mpt2sas scsi_transport_sas raid_class [last unloaded: btrfs]
Jul 21 07:42:55 qualactin kernel: [57061.396943] Pid: 10500, comm: btrfs-endio-wri Tainted: G W 2.6.39+ #53
Jul 21 07:42:55 qualactin kernel: [57061.396947] Call Trace:
Jul 21 07:42:55 qualactin kernel: [57061.396958] [<ffffffff81091f0a>] warn_slowpath_common+0x7a/0xb0
Jul 21 07:42:55 qualactin kernel: [57061.396965] [<ffffffff81091f55>] warn_slowpath_null+0x15/0x20
Jul 21 07:42:55 qualactin kernel: [57061.396982] [<ffffffffa0264908>] btrfs_alloc_reserved_file_extent+0xf8/0x100 [btrfs]
Jul 21 07:42:55 qualactin kernel: [57061.396991] [<ffffffff818742d6>] ? _raw_spin_unlock+0x26/0x30
Jul 21 07:42:55 qualactin kernel: [57061.397007] [<ffffffffa027f6ba>] T.1250+0x20a/0x260 [btrfs]
Jul 21 07:42:55 qualactin kernel: [57061.397018] [<ffffffffa02839cb>] btrfs_finish_ordered_io+0x2db/0x330 [btrfs]
Jul 21 07:42:55 qualactin kernel: [57061.397028] [<ffffffffa0283a69>] btrfs_writepage_end_io_hook+0x49/0x90 [btrfs]
Jul 21 07:42:55 qualactin kernel: [57061.397039] [<ffffffffa029778e>] end_bio_extent_writepage+0x13e/0x180 [btrfs]
Jul 21 07:42:55 qualactin kernel: [57061.397043] [<ffffffff811ad1c8>] bio_endio+0x18/0x40
Jul 21 07:42:55 qualactin kernel: [57061.397052] [<ffffffffa027500c>] end_workqueue_fn+0xec/0x120 [btrfs]
Jul 21 07:42:55 qualactin kernel: [57061.397062] [<ffffffffa02a3e3c>] worker_loop+0x14c/0x540 [btrfs]
Jul 21 07:42:55 qualactin kernel: [57061.397072] [<ffffffffa02a3cf0>] ? btrfs_queue_worker+0x320/0x320 [btrfs]
Jul 21 07:42:55 qualactin kernel: [57061.397076] [<ffffffff810b37de>] kthread+0x9e/0xb0
Jul 21 07:42:55 qualactin kernel: [57061.397079] [<ffffffff8187cb14>] kernel_thread_helper+0x4/0x10
Jul 21 07:42:55 qualactin kernel: [57061.397082] [<ffffffff8187423b>] ? _raw_spin_unlock_irq+0x2b/0x40
Jul 21 07:42:55 qualactin kernel: [57061.397085] [<ffffffff81874540>] ? retint_restore_args+0xe/0xe
Jul 21 07:42:55 qualactin kernel: [57061.397088] [<ffffffff810b3740>] ? __init_kthread_worker+0x70/0x70
Jul 21 07:42:55 qualactin kernel: [57061.397090] [<ffffffff8187cb10>] ? gs_change+0xb/0xb
Jul 21 07:42:55 qualactin kernel: [57061.397092] ---[ end trace dd9e6d8cc54aa5e0 ]---
Chris Mason
2011-Jul-22 00:53 UTC
Re: new metadata reader/writer locks in integration-test
Excerpts from Arne Jansen's message of 2011-07-21 01:46:55 -0400:
> On 21.07.2011 02:48, Tsutomu Itoh wrote:
> > (2011/07/21 2:21), Chris Mason wrote:
> >> Excerpts from Chris Mason's message of 2011-07-19 13:30:22 -0400:
> >>> Hi everyone,
> >>>
> >>> I've pushed out a new integration-test branch, and it includes a new
> >>> reader/writer locking scheme for the btree locks.
> >>>
> >>> We've seen a number of benchmarks dominated by contention on the root
> >>> node lock. This changes our locks into a simple reader/writer lock.
> >>> They are based on mutexes so that we still take advantage of the mutex
> >>> adaptive spins for write locks (rwsemaphores were much slower).
> >>>
> >>> I'm also sending the individual commits, please do take a look.
> >>
> >> Hi everyone,
> >>
> >> I just rebased Josef's enospc fixes into integration-test, it should fix
> >> the warnings in extent-tree.c
> >>
> >
> > Unfortunately, I got the following messages.
> >
> >
> > Jul 21 09:41:22 luna kernel: ------------[ cut here ]------------
> > Jul 21 09:41:22 luna kernel: WARNING: at fs/btrfs/extent-tree.c:5564 btrfs_alloc_reserved_file_extent+0xf8/0x100 [btrfs]()
> > Jul 21 09:41:22 luna kernel: Hardware name: PRIMERGY
> > Jul 21 09:41:22 luna kernel: Modules linked in: btrfs zlib_deflate crc32c libcrc32c autofs4 sunrpc 8021q garp stp llc cpufreq_ondemand acpi_cpufreq freq_table mperf ipv6 ext3 jbd dm_mirror dm_region_hash dm_log dm_mod kvm uinput ppdev parport_pc parport sg pcspkr i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support tg3 shpchp pci_hotplug i3000_edac edac_core ext4 mbcache jbd2 crc16 sd_mod crc_t10dif sr_mod cdrom megaraid_sas floppy pata_acpi ata_generic ata_piix libata scsi_mod [last unloaded: microcode]
> > Jul 21 09:41:22 luna kernel: Pid: 5517, comm: btrfs-endio-wri Tainted: G W 2.6.39btrfs-tc1+ #1
> > Jul 21 09:41:22 luna kernel: Call Trace:
> > Jul 21 09:41:22 luna kernel: [<ffffffff8106004f>] warn_slowpath_common+0x7f/0xc0
> > Jul 21 09:41:22 luna kernel: [<ffffffff810600aa>] warn_slowpath_null+0x1a/0x20
> > Jul 21 09:41:22 luna kernel: [<ffffffffa044a068>] btrfs_alloc_reserved_file_extent+0xf8/0x100 [btrfs]
> > Jul 21 09:41:22 luna kernel: [<ffffffffa0464121>] insert_reserved_file_extent.clone.0+0x201/0x270 [btrfs]
> > Jul 21 09:41:22 luna kernel: [<ffffffffa0468c0b>] btrfs_finish_ordered_io+0x2eb/0x360 [btrfs]
> > Jul 21 09:41:22 luna kernel: [<ffffffff8106fe23>] ? try_to_del_timer_sync+0x83/0xe0
> > Jul 21 09:41:22 luna kernel: [<ffffffffa0468cd0>] btrfs_writepage_end_io_hook+0x50/0xa0 [btrfs]
> > Jul 21 09:41:22 luna kernel: [<ffffffffa049a3c6>] end_compressed_bio_write+0x86/0xf0 [btrfs]
> > Jul 21 09:41:22 luna kernel: [<ffffffff8117f96d>] bio_endio+0x1d/0x40
> > Jul 21 09:41:22 luna kernel: [<ffffffffa0459d84>] end_workqueue_fn+0xf4/0x130 [btrfs]
> > Jul 21 09:41:22 luna kernel: [<ffffffffa048841e>] worker_loop+0x13e/0x540 [btrfs]
> > Jul 21 09:41:22 luna kernel: [<ffffffffa04882e0>] ? btrfs_queue_worker+0x2d0/0x2d0 [btrfs]
> > Jul 21 09:41:22 luna kernel: [<ffffffffa04882e0>] ? btrfs_queue_worker+0x2d0/0x2d0 [btrfs]
> > Jul 21 09:41:22 luna kernel: [<ffffffff81081756>] kthread+0x96/0xa0
> > Jul 21 09:41:22 luna kernel: [<ffffffff81486004>] kernel_thread_helper+0x4/0x10
> > Jul 21 09:41:22 luna kernel: [<ffffffff810816c0>] ? kthread_worker_fn+0x1a0/0x1a0
> > Jul 21 09:41:22 luna kernel: [<ffffffff81486000>] ? gs_change+0x13/0x13
> > Jul 21 09:41:22 luna kernel: ---[ end trace 02c1fa3044677043 ]---
> >
>
> a very similar warning here, but without compression involved:

Ok, these are probably the enospc fixes. Could you please try bisecting
out some of Josef's patches?

-chris
On Thu, 21 Jul 2011 20:53:24 -0400, Chris Mason wrote:
>>>> Hi everyone,
>>>>
>>>> I just rebased Josef's enospc fixes into integration-test, it should fix
>>>> the warnings in extent-tree.c
>>>>
>>>
>>> Unfortunately, I got the following messages.
>>>
>>>
>>> Jul 21 09:41:22 luna kernel: ------------[ cut here ]------------
>>> Jul 21 09:41:22 luna kernel: WARNING: at fs/btrfs/extent-tree.c:5564 btrfs_alloc_reserved_file_extent+0xf8/0x100 [btrfs]()
>>> Jul 21 09:41:22 luna kernel: Hardware name: PRIMERGY
>>> Jul 21 09:41:22 luna kernel: Modules linked in: btrfs zlib_deflate crc32c libcrc32c autofs4 sunrpc 8021q garp stp llc cpufreq_ondemand acpi_cpufreq freq_table mperf ipv6 ext3 jbd dm_mirror dm_region_hash dm_log dm_mod kvm uinput ppdev parport_pc parport sg pcspkr i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support tg3 shpchp pci_hotplug i3000_edac edac_core ext4 mbcache jbd2 crc16 sd_mod crc_t10dif sr_mod cdrom megaraid_sas floppy pata_acpi ata_generic ata_piix libata scsi_mod [last unloaded: microcode]
>>> Jul 21 09:41:22 luna kernel: Pid: 5517, comm: btrfs-endio-wri Tainted: G W 2.6.39btrfs-tc1+ #1
>>> Jul 21 09:41:22 luna kernel: Call Trace:
>>> Jul 21 09:41:22 luna kernel: [<ffffffff8106004f>] warn_slowpath_common+0x7f/0xc0
>>> Jul 21 09:41:22 luna kernel: [<ffffffff810600aa>] warn_slowpath_null+0x1a/0x20
>>> Jul 21 09:41:22 luna kernel: [<ffffffffa044a068>] btrfs_alloc_reserved_file_extent+0xf8/0x100 [btrfs]
>>> Jul 21 09:41:22 luna kernel: [<ffffffffa0464121>] insert_reserved_file_extent.clone.0+0x201/0x270 [btrfs]
>>> Jul 21 09:41:22 luna kernel: [<ffffffffa0468c0b>] btrfs_finish_ordered_io+0x2eb/0x360 [btrfs]
>>> Jul 21 09:41:22 luna kernel: [<ffffffff8106fe23>] ? try_to_del_timer_sync+0x83/0xe0
>>> Jul 21 09:41:22 luna kernel: [<ffffffffa0468cd0>] btrfs_writepage_end_io_hook+0x50/0xa0 [btrfs]
>>> Jul 21 09:41:22 luna kernel: [<ffffffffa049a3c6>] end_compressed_bio_write+0x86/0xf0 [btrfs]
>>> Jul 21 09:41:22 luna kernel: [<ffffffff8117f96d>] bio_endio+0x1d/0x40
>>> Jul 21 09:41:22 luna kernel: [<ffffffffa0459d84>] end_workqueue_fn+0xf4/0x130 [btrfs]
>>> Jul 21 09:41:22 luna kernel: [<ffffffffa048841e>] worker_loop+0x13e/0x540 [btrfs]
>>> Jul 21 09:41:22 luna kernel: [<ffffffffa04882e0>] ? btrfs_queue_worker+0x2d0/0x2d0 [btrfs]
>>> Jul 21 09:41:22 luna kernel: [<ffffffffa04882e0>] ? btrfs_queue_worker+0x2d0/0x2d0 [btrfs]
>>> Jul 21 09:41:22 luna kernel: [<ffffffff81081756>] kthread+0x96/0xa0
>>> Jul 21 09:41:22 luna kernel: [<ffffffff81486004>] kernel_thread_helper+0x4/0x10
>>> Jul 21 09:41:22 luna kernel: [<ffffffff810816c0>] ? kthread_worker_fn+0x1a0/0x1a0
>>> Jul 21 09:41:22 luna kernel: [<ffffffff81486000>] ? gs_change+0x13/0x13
>>> Jul 21 09:41:22 luna kernel: ---[ end trace 02c1fa3044677043 ]---
>>>
>>
>> a very similar warning here, but without compression involved:
>
> Ok, these are probably the enospc fixes. Could you please try bisecting
> out some of Josef's patches?

I did binary search and found the following patch led to this problem.

commit 97ffc7d564f55787c7d9ea557d5d30d9ecb2f003
Author: Josef Bacik <josef@redhat.com>
Date: Fri Jul 15 18:29:11 2011 +0000

Btrfs: don't be as agressive with delalloc metadata reservations

Currently we reserve enough space to COW an entirely full btree for every ex
we have reserved for an inode. This _sucks_, because you only need to COW o
and then everybody else is ok. Unfortunately we don't know we'll all be abl
get into the same transaction so that's what we have had to do. But the glo
reserve holds a reservation large enough to cover a large percentage of all
metadata currently in the fs. So all we really need to account for is any n
blocks that we may allocate.
So fix this by
……

The reason is the calculation of the reservation is wrong, the nodes in the search path
may be split, and new nodes may be created, but the above patch didn't reserve space for
these new nodes.

The following patch can fix it. Though my test passed, I still need Arne's verification
to make sure it can fix all the reported problems.
Arne, Could you test it for me?

Subject: [PATCH] Btrfs: fix wrong calculation of the reservation for the transaction

At worst, Btrfs may split all the nodes in the search path, so we must take
those new nodes into account when we calculate the space that need be reserved.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
 fs/btrfs/ctree.h | 8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index d813a67..4f23819 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -2133,10 +2133,16 @@ static inline bool btrfs_mixed_space_info(struct btrfs_space_info *space_info)
 }

 /* extent-tree.c */
+/*
+ * This inline function is used to calc the size of new nodes/leaves that we
+ * may create. At worst, we may split all the nodes in the path and create
+ * two leaves for the insertion of one item.
+ */
 static inline u64 btrfs_calc_trans_metadata_size(struct btrfs_root *root,
 						 unsigned num_items)
 {
-	return root->leafsize * 3 * num_items;
+	return (root->leafsize * 2 + root->nodesize * (BTRFS_MAX_LEVEL - 1)) *
+		num_items;
 }

 void btrfs_put_block_group(struct btrfs_block_group_cache *cache);
--
1.7.1
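To make the arithmetic in the proposed change concrete, here is a minimal standalone sketch that compares the two formulas. It is not kernel code: the function names and the SKETCH_ prefix are made up for illustration, and it assumes BTRFS_MAX_LEVEL is 8, as defined in ctree.h. It only reproduces the per-item byte counts and says nothing about whether the larger reservation is sufficient.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: mirrors the kernel's BTRFS_MAX_LEVEL (8); hard-coded for this sketch. */
#define SKETCH_BTRFS_MAX_LEVEL 8

/* Formula before the patch: three leaf-sized blocks per item. */
static uint64_t sketch_old_reservation(uint64_t leafsize, unsigned num_items)
{
	assert(leafsize > 0);
	return leafsize * 3 * num_items;
}

/*
 * Formula proposed in the patch: two leaves, plus one node for each of
 * the remaining (max level - 1) levels of the search path, per item.
 */
static uint64_t sketch_new_reservation(uint64_t leafsize, uint64_t nodesize,
				       unsigned num_items)
{
	assert(leafsize > 0 && nodesize > 0);
	return (leafsize * 2 + nodesize * (SKETCH_BTRFS_MAX_LEVEL - 1)) *
		num_items;
}
```

With 4096-byte leaves and nodes, sketch_old_reservation(4096, 1) gives 12288 bytes and sketch_new_reservation(4096, 4096, 1) gives 36864 bytes, so under these assumptions the proposed formula reserves three times as much per item.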
On Fri, 22 Jul 2011 12:06:40 +0800, Miao Xie wrote:
> On Thu, 21 Jul 2011 20:53:24 -0400, Chris Mason wrote:
>>>>> Hi everyone,
>>>>>
>>>>> I just rebased Josef's enospc fixes into integration-test, it should fix
>>>>> the warnings in extent-tree.c
>>>>>
>>>>
>>>> Unfortunately, I got the following messages.
>>>>
>>>>
>>>> Jul 21 09:41:22 luna kernel: ------------[ cut here ]------------
>>>> Jul 21 09:41:22 luna kernel: WARNING: at fs/btrfs/extent-tree.c:5564 btrfs_alloc_reserved_file_extent+0xf8/0x100 [btrfs]()
>>>> Jul 21 09:41:22 luna kernel: Hardware name: PRIMERGY
>>>> Jul 21 09:41:22 luna kernel: Modules linked in: btrfs zlib_deflate crc32c libcrc32c autofs4 sunrpc 8021q garp stp llc cpufreq_ondemand acpi_cpufreq freq_table mperf ipv6 ext3 jbd dm_mirror dm_region_hash dm_log dm_mod kvm uinput ppdev parport_pc parport sg pcspkr i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support tg3 shpchp pci_hotplug i3000_edac edac_core ext4 mbcache jbd2 crc16 sd_mod crc_t10dif sr_mod cdrom megaraid_sas floppy pata_acpi ata_generic ata_piix libata scsi_mod [last unloaded: microcode]
>>>> Jul 21 09:41:22 luna kernel: Pid: 5517, comm: btrfs-endio-wri Tainted: G W 2.6.39btrfs-tc1+ #1
>>>> Jul 21 09:41:22 luna kernel: Call Trace:
>>>> Jul 21 09:41:22 luna kernel: [<ffffffff8106004f>] warn_slowpath_common+0x7f/0xc0
>>>> Jul 21 09:41:22 luna kernel: [<ffffffff810600aa>] warn_slowpath_null+0x1a/0x20
>>>> Jul 21 09:41:22 luna kernel: [<ffffffffa044a068>] btrfs_alloc_reserved_file_extent+0xf8/0x100 [btrfs]
>>>> Jul 21 09:41:22 luna kernel: [<ffffffffa0464121>] insert_reserved_file_extent.clone.0+0x201/0x270 [btrfs]
>>>> Jul 21 09:41:22 luna kernel: [<ffffffffa0468c0b>] btrfs_finish_ordered_io+0x2eb/0x360 [btrfs]
>>>> Jul 21 09:41:22 luna kernel: [<ffffffff8106fe23>] ? try_to_del_timer_sync+0x83/0xe0
>>>> Jul 21 09:41:22 luna kernel: [<ffffffffa0468cd0>] btrfs_writepage_end_io_hook+0x50/0xa0 [btrfs]
>>>> Jul 21 09:41:22 luna kernel: [<ffffffffa049a3c6>] end_compressed_bio_write+0x86/0xf0 [btrfs]
>>>> Jul 21 09:41:22 luna kernel: [<ffffffff8117f96d>] bio_endio+0x1d/0x40
>>>> Jul 21 09:41:22 luna kernel: [<ffffffffa0459d84>] end_workqueue_fn+0xf4/0x130 [btrfs]
>>>> Jul 21 09:41:22 luna kernel: [<ffffffffa048841e>] worker_loop+0x13e/0x540 [btrfs]
>>>> Jul 21 09:41:22 luna kernel: [<ffffffffa04882e0>] ? btrfs_queue_worker+0x2d0/0x2d0 [btrfs]
>>>> Jul 21 09:41:22 luna kernel: [<ffffffffa04882e0>] ? btrfs_queue_worker+0x2d0/0x2d0 [btrfs]
>>>> Jul 21 09:41:22 luna kernel: [<ffffffff81081756>] kthread+0x96/0xa0
>>>> Jul 21 09:41:22 luna kernel: [<ffffffff81486004>] kernel_thread_helper+0x4/0x10
>>>> Jul 21 09:41:22 luna kernel: [<ffffffff810816c0>] ? kthread_worker_fn+0x1a0/0x1a0
>>>> Jul 21 09:41:22 luna kernel: [<ffffffff81486000>] ? gs_change+0x13/0x13
>>>> Jul 21 09:41:22 luna kernel: ---[ end trace 02c1fa3044677043 ]---
>>>>
>>>
>>> a very similar warning here, but without compression involved:
>>
>> Ok, these are probably the enospc fixes. Could you please try bisecting
>> out some of Josef's patches?
>
> I did binary search and found the following patch led to this problem.
>
> commit 97ffc7d564f55787c7d9ea557d5d30d9ecb2f003
> Author: Josef Bacik <josef@redhat.com>
> Date: Fri Jul 15 18:29:11 2011 +0000
>
> Btrfs: don't be as agressive with delalloc metadata reservations
>
> Currently we reserve enough space to COW an entirely full btree for every ex
> we have reserved for an inode. This _sucks_, because you only need to COW o
> and then everybody else is ok. Unfortunately we don't know we'll all be abl
> get into the same transaction so that's what we have had to do. But the glo
> reserve holds a reservation large enough to cover a large percentage of all
> metadata currently in the fs.
> So all we really need to account for is any n
> blocks that we may allocate. So fix this by
> ……

Please ignore my analysis and patch, which can not fix the problem.

> The reason is the calculation of the reservation is wrong, the nodes in the search path
> may be split, and new nodes may be created, but the above patch didn't reserve space for
> these new nodes.
>
> The following patch can fix it. Though my test passed, I still need Arne's verification
> to make sure it can fix all the reported problems.
> Arne, Could you test it for me?
>
> Subject: [PATCH] Btrfs: fix wrong calculation of the reservation for the transaction
>
> At worst, Btrfs may split all the nodes in the search path, so we must take
> those new nodes into account when we calculate the space that need be reserved.
>
> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
> ---
>  fs/btrfs/ctree.h | 8 +++++++-
>  1 files changed, 7 insertions(+), 1 deletions(-)
>
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index d813a67..4f23819 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -2133,10 +2133,16 @@ static inline bool btrfs_mixed_space_info(struct btrfs_space_info *space_info)
>  }
>
>  /* extent-tree.c */
> +/*
> + * This inline function is used to calc the size of new nodes/leaves that we
> + * may create. At worst, we may split all the nodes in the path and create
> + * two leaves for the insertion of one item.
> + */
>  static inline u64 btrfs_calc_trans_metadata_size(struct btrfs_root *root,
>  						 unsigned num_items)
>  {
> -	return root->leafsize * 3 * num_items;
> +	return (root->leafsize * 2 + root->nodesize * (BTRFS_MAX_LEVEL - 1)) *
> +		num_items;
>  }
>
>  void btrfs_put_block_group(struct btrfs_block_group_cache *cache);
Christoph Hellwig
2011-Jul-22 15:01 UTC
rw_semaphore performance, was: new metadata reader/writer locks in integration-test
On Tue, Jul 19, 2011 at 01:30:22PM -0400, Chris Mason wrote:
> We've seen a number of benchmarks dominated by contention on the root
> node lock. This changes our locks into a simple reader/writer lock.
> They are based on mutexes so that we still take advantage of the mutex
> adaptive spins for write locks (rwsemaphores were much slower).

Interesting. Have you set up some artificial benchmarks for this?

I wonder if the lack of adaptive spinning has something to do with the
slightly slower XFS performance on Joern's flash testing, given that
we extensively use the rw_semaphore as the primary I/O mutex, while
all others rely on plain mutexes as the primary synchronization
primitive.
Chris Mason
2011-Jul-22 15:14 UTC
Re: rw_semaphore performance, was: new metadata reader/writer locks in integration-test
Excerpts from Christoph Hellwig's message of 2011-07-22 11:01:51 -0400:
> On Tue, Jul 19, 2011 at 01:30:22PM -0400, Chris Mason wrote:
> > We've seen a number of benchmarks dominated by contention on the root
> > node lock. This changes our locks into a simple reader/writer lock.
> > They are based on mutexes so that we still take advantage of the mutex
> > adaptive spins for write locks (rwsemaphores were much slower).
>
> Interesting. Have you set up some artificial benchmarks for this?
>
> I wonder if the lack of adaptive spinning has something to do with the
> slightly slower XFS performance on Joern's flash testing, given that
> we extensively use the rw_semaphore as the primary I/O mutex, while
> all others rely on plain mutexes as the primary synchronization
> primitive.

For the rw locks I had three main tests.

1) dbench 10. This is interesting only because it is mostly bound by
how quickly we can do metadata operations in ram. There's not much IO
and there's a good mixture of read and write btree operations (about
50/50).

rwsemaphores ran at 200MB/s while my current code runs at 2400MB/s. The
old btrfs implementation runs at 3000MB/s.

We all love and hate dbench, so I don't put a huge amount of stock in
2400 vs 3000. But, 200 vs 2400...people notice that in real world
stuff.

2) fs_mark doing parallel zero byte file creates. No fsyncs here, all
metadata operations. The old btrfs locking was completely bound by
getting write locks on the root node. The new code is much better here,
overall about 30-50% faster. I didn't do the rw semaphores on this one,
I'll give it a shot.

3) A stat-hammer program. This creates a bunch of files in parallel,
and then times how long it takes us to stat all the inodes. I went from
3s of CPU time down to .9s. rwsems were about the same here (very
fast), but that's because it's 100% reader locks.

My money for Joern's benchmarks is end-io latencies. xfs and btrfs are
doing more at endio time. But I need to sit down and run them myself
and take a look.

-chris
Chris Mason
2011-Jul-22 15:49 UTC
Re: new metadata reader/writer locks in integration-test
On Wed, Jul 20, 2011 at 05:36:09PM +0900, Tsutomu Itoh wrote:
> (2011/07/20 16:58), Chris Mason wrote:
> > Excerpts from Tsutomu Itoh's message of 2011-07-19 22:08:38 -0400:
> >> (2011/07/20 2:30), Chris Mason wrote:
> >>> Hi everyone,
> >>>
> >>> I've pushed out a new integration-test branch, and it includes a new
> >>> reader/writer locking scheme for the btree locks.
> >>>
> >>> We've seen a number of benchmarks dominated by contention on the root
> >>> node lock. This changes our locks into a simple reader/writer lock.
> >>> They are based on mutexes so that we still take advantage of the mutex
> >>> adaptive spins for write locks (rwsemaphores were much slower).
> >>>
> >>> I'm also sending the individual commits, please do take a look.
> >>
> >> I pulled the new integration-test branch, and I got the following
> >> warning messages.
> >>
> >> Jul 20 10:03:30 luna kernel: ------------[ cut here ]------------
> >> Jul 20 10:03:30 luna kernel: WARNING: at fs/btrfs/extent-tree.c:5704 btrfs_alloc_free_block+0x178/0x340 [btrfs]()
> >
> > Thanks, I think this one is related to Josef's enospc changes, but I'll
> > double check.
>
> > What was the test?
>
> I ran my original test script.
> This script concurrently executes the making deletion of a lot of files,
> and the making deletion of a big file, etc.

I'm having a hard time triggering this with Josef's current patch (after
my rebase). Could you please send along the reproduction script?

-chris
Arne Jansen
2011-Jul-22 15:58 UTC
Re: new metadata reader/writer locks in integration-test
On 21.07.2011 07:44, Arne Jansen wrote:
> On 20.07.2011 19:21, Chris Mason wrote:
>> Excerpts from Chris Mason's message of 2011-07-19 13:30:22 -0400:
>>> Hi everyone,
>>>
>>> I've pushed out a new integration-test branch, and it includes a new
>>> reader/writer locking scheme for the btree locks.
>>>
>>> We've seen a number of benchmarks dominated by contention on the root
>>> node lock. This changes our locks into a simple reader/writer lock.
>>> They are based on mutexes so that we still take advantage of the mutex
>>> adaptive spins for write locks (rwsemaphores were much slower).
>>>
>>> I'm also sending the individual commits, please do take a look.
>>
>> Hi everyone,
>>
>> I just rebased Josef's enospc fixes into integration-test, it should fix
>> the warnings in extent-tree.c
>>
>
> With the current integration-test branch I get very early enospc on
> a 7G volume create with -m single -d single and
>
> fs_mark-3.3/fs_mark -d /mnt/fsm -D 512 -t 16 -n 4096 -s 51200 -L5 -S0 -R1
>
> It enospces at about 20%, but I can continue to fill it up to 94%.

I tried to bisect this, but it turned out to be hard. Sooner or later
I get this early enospc on every revision, on some sooner, on others
later. At least the current for-linus branch is much worse than
integration-test.

>
> -Arne