thr3ads.net - Btrfs devel - [2.6.38-rc6] create->rebalance->mount crash... [Feb 2011]

Daniel J Blueman

2011-Feb-24 08:13 UTC

[2.6.38-rc6] create->rebalance->mount crash...

When creating a filesystem (single or redundant) with BTRFS and
subsequently executing a balance [1], we see a kernel oops at the next
mount [2].

Thanks,
  Daniel

--- [1]

# mkfs.btrfs /dev/sdb
# mount /dev/sdb /store
# btrfs filesystem balance /store
# umount /store

--- [2]

# mount /dev/sdb /store
Killed
# dmesg
device fsid bc4a5f28339d255f-6eccdd738ea4f0ac devid 1 transid 13 /dev/sdb
btrfs: relocating block group 29360128 flags 36
btrfs: found 2 extents
btrfs: relocating block group 20971520 flags 34
btrfs allocation failed flags 34, wanted 4096
space_info has 0 free, is not full
space_info total=12582912, used=4096, pinned=0, reserved=0, may_use=0,
readonly=12578816
block group 20971520 has 8388608 bytes, 4096 used 0 pinned 0 reserved
entry offset 20975616, bytes 8384512, bitmap no
block group has cluster?: no
1 blocks of free space at or bigger than bytes is
block group 0 has 4194304 bytes, 0 used 0 pinned 0 reserved
entry offset 131072, bytes 4063232, bitmap no
block group has cluster?: no
1 blocks of free space at or bigger than bytes is
btrfs: relocating block group 12582912 flags 1
btrfs: relocating block group 4194304 flags 4
device fsid bc4a5f28339d255f-6eccdd738ea4f0ac devid 1 transid 30 /dev/sdb
BUG: unable to handle kernel NULL pointer dereference at 00000000000000b0
IP: [<ffffffff81037cc9>] __ticket_spin_lock+0x9/0x20
PGD 305e2c067 PUD 305732067 PMD 0
Oops: 0002 [#1] SMP
last sysfs file: /sys/devices/virtual/bdi/btrfs-4/uevent
CPU 1
Modules linked in: lp i7core_edac ioatdma edac_core parport psmouse
dca serio_raw joydev raid10 raid456 async_raid6_recov async_pq usbhid
hid raid6_pq async_xor xor async_memcpy async_tx ahci libahci raid1
raid0 multipath e1000e linear btrfs zlib_deflate libcrc32c

Pid: 1013, comm: mount Tainted: G        W   2.6.38-020638rc6-generic
#201102220910 Supermicro X8STi/X8STi
RIP: 0010:[<ffffffff81037cc9>]  [<ffffffff81037cc9>]
__ticket_spin_lock+0x9/0x20
RSP: 0018:ffff880303be5a18  EFLAGS: 00010246
RAX: 0000000000000100 RBX: 00000000000000b0 RCX: ffff880305ed5750
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 00000000000000b0
RBP: ffff880303be5a18 R08: ffff8803056e62e8 R09: ffff880303be58c0
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000004 R14: ffff8803056e4140 R15: ffff8803056e4000
FS:  00007f78247707e0(0000) GS:ffff8800df480000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000b0 CR3: 0000000303e85000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process mount (pid: 1013, threadinfo ffff880303be4000, task ffff880305710000)
Stack:
 ffff880303be5a28 ffffffff815ace4e ffff880303be5a68 ffffffffa00231af
 ffff880304b56000 ffff8803056e4000 ffff8803056e6298 ffff880305ed5700
 ffff8803056e4140 ffff8803056e4000 ffff880303be5aa8 ffffffffa00232b7
Call Trace:
 [<ffffffff815ace4e>] _raw_spin_lock+0xe/0x20
 [<ffffffffa00231af>] calc_global_metadata_size+0x4f/0x120 [btrfs]
 [<ffffffffa00232b7>] update_global_block_rsv+0x37/0xe0 [btrfs]
 [<ffffffffa0023efb>] init_global_block_rsv+0xcb/0xe0 [btrfs]
 [<ffffffffa002a9df>] btrfs_read_block_groups+0x37f/0x4d0 [btrfs]
 [<ffffffff815ace4e>] ? _raw_spin_lock+0xe/0x20
 [<ffffffffa003780d>] open_ctree+0x10cd/0x1480 [btrfs]
 [<ffffffff812d8716>] ? vsnprintf+0x186/0x530
 [<ffffffff8116081c>] ? set_anon_super+0x7c/0x120
 [<ffffffffa001624e>] btrfs_fill_super+0x7e/0x140 [btrfs]
 [<ffffffff81161cd8>] ? sget+0x238/0x260
 [<ffffffff812d5b0f>] ? strlcpy+0x4f/0x70
 [<ffffffffa001790b>] btrfs_mount+0x31b/0x3b0 [btrfs]
 [<ffffffff811611fa>] vfs_kern_mount+0x8a/0x200
 [<ffffffff81161463>] do_kern_mount+0x53/0xb0
 [<ffffffff8117da7a>] do_new_mount+0x7a/0xb0
 [<ffffffff8117e148>] do_mount+0x188/0x1d0
 [<ffffffff8117e21f>] sys_mount+0x8f/0xd0
 [<ffffffff8100c002>] system_call_fastpath+0x16/0x1b
Code: ff 48 c7 c2 fe 7a 03 81 48 c7 c1 01 7b 03 81 e9 fe fe ff ff 90
90 90 90 90 90 90 90 90 90 90 90 90 90 55 b8 00 01 00 00 48 89 e5 <f0>
66 0f c1 07 38 e0 74 06 f3 90 8a 07 eb f6 c9 c3 66 0f 1f 44
RIP  [<ffffffff81037cc9>] __ticket_spin_lock+0x9/0x20
 RSP <ffff880303be5a18>
CR2: 00000000000000b0
---[ end trace a7919e7f17c0a728 ]---
-- 
Daniel J Blueman
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

liubo

2011-Feb-24 12:48 UTC

Re: [2.6.38-rc6] create->rebalance->mount crash...

On 02/24/2011 04:13 PM, Daniel J Blueman wrote:
> When creating a filesystem (single or redundant) with BTRFS and
> subsequently executing a balance [1], we see a kernel oops at the next
> mount [2].
> 
Hi, Daniel,

After digging into this, I've come up with a patch; would you please
test it on your box?  I hope this is helpful. Thanks.

From: Liu Bo <liubo2009@cn.fujitsu.com>

[PATCH] btrfs: fix OOPS of empty filesystem after balance

btrfs will exclude unused block groups via a thread.
When an empty filesystem is balanced, the block group with tag "DATA"
may be dropped, and after umount, this will lead to an OOPS when we
mount it again.

Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
---
 fs/btrfs/extent-tree.c |   16 ++++++++++++++--
 1 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 100e409..4749ab0 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3856,10 +3856,14 @@ static void update_global_block_rsv(struct btrfs_fs_info *fs_info)
 	spin_unlock(&block_rsv->lock);
 }
 
-static void init_global_block_rsv(struct btrfs_fs_info *fs_info)
+static int init_global_block_rsv(struct btrfs_fs_info *fs_info)
 {
 	struct btrfs_space_info *space_info;
 
+	space_info = __find_space_info(fs_info, BTRFS_BLOCK_GROUP_DATA);
+	if (!space_info)
+		return -EAGAIN;
+
 	space_info = __find_space_info(fs_info, BTRFS_BLOCK_GROUP_SYSTEM);
 	fs_info->chunk_block_rsv.space_info = space_info;
 	fs_info->chunk_block_rsv.priority = 10;
@@ -3884,6 +3888,8 @@ static void init_global_block_rsv(struct btrfs_fs_info *fs_info)
 	btrfs_add_durable_block_rsv(fs_info, &fs_info->delalloc_block_rsv);
 
 	update_global_block_rsv(fs_info);
+
+	return 0;
 }
 
 static void release_global_block_rsv(struct btrfs_fs_info *fs_info)
@@ -8514,7 +8520,13 @@ int btrfs_read_block_groups(struct btrfs_root *root)
 			set_block_group_ro(cache);
 	}
 
-	init_global_block_rsv(info);
+again:
+	ret = init_global_block_rsv(info);
+	if (ret == -EAGAIN) {
+		update_space_info(info, BTRFS_BLOCK_GROUP_DATA, 0, 0,
+				  &space_info);
+		goto again;
+	}
 	ret = 0;
 error:
 	btrfs_free_path(path);
-- 
1.6.5.2



Chris Mason

2011-Feb-24 14:35 UTC

Re: [2.6.38-rc6] create->rebalance->mount crash...

Excerpts from liubo's message of 2011-02-24 07:48:22 -0500:
> On 02/24/2011 04:13 PM, Daniel J Blueman wrote:
> > When creating a filesystem (single or redundant) with BTRFS and
> > subsequently executing a balance [1], we see a kernel oops at the next
> > mount [2].
> > 
> 
> Hi, Daniel,
> 
> After digging into this, I've come up with a patch; would you please
> test it on your box?  I hope this is helpful. Thanks.
> 
> From: Liu Bo <liubo2009@cn.fujitsu.com>
> 
> [PATCH] btrfs: fix OOPS of empty filesystem after balance
> 
> btrfs will exclude unused block groups via a thread.
> When an empty filesystem is balanced, the block group with tag "DATA"
> may be dropped, and after umount, this will lead to an OOPS when we
> mount it again.

Thanks for tracking this down!  Comment below:

> 
> Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
> ---
>  fs/btrfs/extent-tree.c |   16 ++++++++++++++--
>  1 files changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 100e409..4749ab0 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -3856,10 +3856,14 @@ static void update_global_block_rsv(struct btrfs_fs_info *fs_info)
>      spin_unlock(&block_rsv->lock);
>  }
>  
> -static void init_global_block_rsv(struct btrfs_fs_info *fs_info)
> +static int init_global_block_rsv(struct btrfs_fs_info *fs_info)
>  {
>      struct btrfs_space_info *space_info;
>  
> +    space_info = __find_space_info(fs_info, BTRFS_BLOCK_GROUP_DATA);
> +    if (!space_info)
> +        return -EAGAIN;
> +
>      space_info = __find_space_info(fs_info, BTRFS_BLOCK_GROUP_SYSTEM);
>      fs_info->chunk_block_rsv.space_info = space_info;
>      fs_info->chunk_block_rsv.priority = 10;
> @@ -3884,6 +3888,8 @@ static void init_global_block_rsv(struct btrfs_fs_info *fs_info)
>      btrfs_add_durable_block_rsv(fs_info, &fs_info->delalloc_block_rsv);
>  
>      update_global_block_rsv(fs_info);
> +
> +    return 0;
>  }
>  
>  static void release_global_block_rsv(struct btrfs_fs_info *fs_info)
> @@ -8514,7 +8520,13 @@ int btrfs_read_block_groups(struct btrfs_root *root)
>              set_block_group_ro(cache);
>      }
>  
> -    init_global_block_rsv(info);
> +again:
> +    ret = init_global_block_rsv(info);
> +    if (ret == -EAGAIN) {
> +        update_space_info(info, BTRFS_BLOCK_GROUP_DATA, 0, 0,
> +                  &space_info);
> +        goto again;
> +    }
>      ret = 0;

Are we looping here because we expect the init_global_block_rsv to fail
more than once?  If so we need a cond_resched or something in there.

But if the EAGAIN is only returned once we should avoid the loop and
open code the call again.

-chris

Miao Xie

2011-Feb-25 02:19 UTC

Re: [2.6.38-rc6] create->rebalance->mount crash...

Hi, Chris and Liu

On Thu, 24 Feb 2011 09:35:32 -0500, Chris Mason wrote:
[SNIP]
>> [PATCH] btrfs: fix OOPS of empty filesystem after balance
>>
>> btrfs will exclude unused block groups via a thread.
>> When an empty filesystem is balanced, the block group with tag "DATA"
>> may be dropped, and after umount, this will lead to an OOPS when we
>> mount it again.
>
> Thanks for tracking this down!  Comment below:
[SNIP]
>> -static void init_global_block_rsv(struct btrfs_fs_info *fs_info)
>> +static int init_global_block_rsv(struct btrfs_fs_info *fs_info)
>>   {
>>       struct btrfs_space_info *space_info;
>>
>> +    space_info = __find_space_info(fs_info, BTRFS_BLOCK_GROUP_DATA);
>> +    if (!space_info)
>> +        return -EAGAIN;
>> +
>>       space_info = __find_space_info(fs_info, BTRFS_BLOCK_GROUP_SYSTEM);
>>       fs_info->chunk_block_rsv.space_info = space_info;
>>       fs_info->chunk_block_rsv.priority = 10;
>> @@ -3884,6 +3888,8 @@ static void init_global_block_rsv(struct btrfs_fs_info *fs_info)
>>       btrfs_add_durable_block_rsv(fs_info, &fs_info->delalloc_block_rsv);
>>
>>       update_global_block_rsv(fs_info);
>> +
>> +    return 0;
>>   }
>>
>>   static void release_global_block_rsv(struct btrfs_fs_info *fs_info)
>> @@ -8514,7 +8520,13 @@ int btrfs_read_block_groups(struct btrfs_root *root)
>>               set_block_group_ro(cache);
>>       }
>>
>> -    init_global_block_rsv(info);
>> +again:
>> +    ret = init_global_block_rsv(info);
>> +    if (ret == -EAGAIN) {
>> +        update_space_info(info, BTRFS_BLOCK_GROUP_DATA, 0, 0,
>> +                  &space_info);
>> +        goto again;
>> +    }
>>       ret = 0;
>
> Are we looping here because we expect the init_global_block_rsv to fail
> more than once?  If so we need a cond_resched or something in there.
>
> But if the EAGAIN is only returned once we should avoid the loop and
> open code the call again.

I don't think we should create a space information object in
init_global_block_rsv(), whose only job is to initialize the global block
reservation object.

I think it is better to split btrfs_read_block_groups() into three steps.
Step 1: create and initialize the space information object.
Step 2: read the block groups and update the space information.
Step 3: initialize the global block reservation object.
This way, the logic of the code is clear, and we avoid some trivial mistakes.

BTW: I found that the btrfs filesystem has just three types of data (file
data, metadata, and system metadata). Why not add a space information array
with three elements to fs_info? This way, we can simplify the space
information code, and we needn't use the RCU lock to protect the space
information object list. (I didn't find a lock protecting the space
information object list on the write side. Is that right?)

Regards
Miao

Daniel J Blueman

2011-Feb-26 04:16 UTC

Re: [2.6.38-rc6] create->rebalance->mount crash...

On 24 February 2011 20:48, liubo <liubo2009@cn.fujitsu.com> wrote:
> On 02/24/2011 04:13 PM, Daniel J Blueman wrote:
>> When creating a filesystem (single or redundant) with BTRFS and
>> subsequently executing a balance [1], we see a kernel oops at the next
>> mount [2].
>>
>
> Hi, Daniel,
>
> After digging into this, I've come up with a patch; would you please
> test it on your box?  I hope this is helpful. Thanks.
>
> From: Liu Bo <liubo2009@cn.fujitsu.com>
>
> [PATCH] btrfs: fix OOPS of empty filesystem after balance
>
> btrfs will exclude unused block groups via a thread.
> When an empty filesystem is balanced, the block group with tag "DATA"
> may be dropped, and after umount, this will lead to an OOPS when we
> mount it again.
[snip]

Thanks, Bo; the patch addresses the oops.

Daniel

Reported-by: Daniel J Blueman <daniel.blueman@gmail.com>
Tested-by: Daniel J Blueman <daniel.blueman@gmail.com>
-- 
Daniel J Blueman