thr3ads.net - Btrfs devel - Drive without DISCARD support crashes btrfs [May 2011]

If this information is useful, please help other people find it:
Share via:

David Sterba

2011-May-17 15:27 UTC

Drive without DISCARD support crashes btrfs

Hi,

while playing with lockedp, I started to hit consistent crashes with
xfstests/013 (fsstress):

[  537.284234] ------------[ cut here ]------------
[  537.288040] kernel BUG at fs/btrfs/tree-log.c:1792!
[  537.288040] invalid opcode: 0000 [#1] SMP
[  537.288040] last sysfs file: /sys/devices/virtual/bdi/btrfs-10/uevent
[  537.288040] CPU 1
[  537.288040] Modules linked in: btrfs
[  537.288040]
[  537.288040] Pid: 12069, comm: fsstress Not tainted 2.6.39-rc7-default+ #24
Intel
[  537.288040] RIP: 0010:[<ffffffffa0063c79>]  [<ffffffffa0063c79>]
walk_up_log_tree+0x2f9/0x330 [btrfs]
[  537.288040] RSP: 0018:ffff880058191c08  EFLAGS: 00010282
[  537.288040] RAX: 00000000ffffffa1 RBX: ffff88004f863910 RCX: 0000000000000001
[  537.288040] RDX: 0000000000000f0e RSI: 0000000000000001 RDI: ffff880078bd8600
[  537.288040] RBP: ffff880058191c88 R08: 0000000000000000 R09: 0000000000000001
[  537.288040] R10: 0000000000000000 R11: 0000000000006945 R12: ffff880058191cb4
[  537.288040] R13: 0000000000000000 R14: ffff880000000000 R15: ffff880058191cf8
[  537.288040] FS:  00007f469ba3d700(0000) GS:ffff88007e600000(0000)
knlGS:0000000000000000
[  537.288040] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  537.288040] CR2: 00007f469af27430 CR3: 000000007ba3c000 CR4: 00000000000006e0
[  537.288040] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  537.288040] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  537.288040] Process fsstress (pid: 12069, threadinfo ffff880058190000, task
ffff8800782f1140)
[  537.288040] Stack:
[  537.288040]  ffff880058191fd8 ffff88005e326f20 ffff880058191fd8
fffffffffffffffa
[  537.288040]  ffff880058191fd8 ffff880058191fd8 ffff880058191fd8
ffff880058191fd8
[  537.288040]  ffff88004eca4598 ffff8800580a6000 ffff880058191da8
0000000000000000
[  537.288040] Call Trace:
[  537.288040]  [<ffffffffa0064348>] walk_log_tree+0x108/0x2a0 [btrfs]
[  537.288040]  [<ffffffffa0064523>] free_log_tree+0x43/0xc0 [btrfs]
[  537.288040]  [<ffffffffa0061950>] ? insert_one_name+0xc0/0xc0 [btrfs]
[  537.288040]  [<ffffffffa006663f>] btrfs_free_log+0x1f/0x30 [btrfs]
[  537.288040]  [<ffffffffa002bb61>] commit_fs_roots+0x91/0x150 [btrfs]
[  537.288040]  [<ffffffffa002cc70>] ?
btrfs_commit_transaction+0x340/0x7d0 [btrfs]
[  537.288040]  [<ffffffffa002cc70>] ?
btrfs_commit_transaction+0x340/0x7d0 [btrfs]
[  537.288040]  [<ffffffffa002cc7b>] btrfs_commit_transaction+0x34b/0x7d0
[btrfs]
[  537.288040]  [<ffffffff810b8c80>] ? wake_up_bit+0x40/0x40
[  537.288040]  [<ffffffff811a8cb0>] ? __sync_filesystem+0x90/0x90
[  537.288040]  [<ffffffffa0003210>] btrfs_sync_fs+0x60/0xe0 [btrfs]
[  537.288040]  [<ffffffff811a8c7e>] __sync_filesystem+0x5e/0x90
[  537.288040]  [<ffffffff811a8ccf>] sync_one_sb+0x1f/0x30
[  537.288040]  [<ffffffff8117f287>] iterate_supers+0x77/0xe0
[  537.288040]  [<ffffffff811a8bd0>] sync_filesystems+0x20/0x30
[  537.616540]  [<ffffffff811a8d71>] sys_sync+0x21/0x40
[  537.616540]  [<ffffffff81e94f42>] system_call_fastpath+0x16/0x1b
[  537.616540] Code: 00 00 48 83 c4 58 b8 01 00 00 00 5b 41 5c 41 5d 41 5e 41 5f
c9 c3 0f 1f 40 00 48 83 c4 58 b0 01 5b 41 5c 41 5d 41 5e 41 5f c9 c3 <0f>
0b be fc 06 00 00 48 c7 c7 c8 f4 07 a0 e8 34 08 03 e1 e9 c0
[  537.616540] RIP  [<ffffffffa0063c79>] walk_up_log_tree+0x2f9/0x330
[btrfs]
[  537.616540]  RSP <ffff880058191c08>
[  537.619306] ---[ end trace 5f9597f2f24fa95c ]---

it''s applied on top of today''s linus c1d10d18c5. Mount options
are
compress and discard.

fsstress process is in D-state with
# cat /proc/12068/stack
[<ffffffffa002d419>] start_transaction+0x79/0x2f0 [btrfs]
[<ffffffffa002da13>] btrfs_start_transaction+0x13/0x20 [btrfs]
[<ffffffffa002e1aa>] __unlink_start_trans+0x4a/0x400 [btrfs]
[<ffffffffa0035997>] btrfs_unlink+0x37/0xe0 [btrfs]
[<ffffffff8118ab6a>] vfs_unlink+0x8a/0xf0
[<ffffffff8118ad53>] do_unlinkat+0x183/0x1c0
[<ffffffff8118bbc6>] sys_unlink+0x16/0x20
[<ffffffff81e94f42>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

shutdown is stuck, sync of btrfs does not proceed due to blocked
transaction commit due to previsou failures.

crash site:
1789                                 ret = btrfs_free_reserved_extent(root,
1790                                                
path->nodes[*level]->start,
1791                                                
path->nodes[*level]->len);
1792                                 BUG_ON(ret);

if EAX contains ret, 0xFFFFFFa1, which seems to be EOPNOTSUPP comming from
discard handling in btrfs_free_reserved_extent().

walk_up_log_tree
  btrfs_free_reserved_extent
    btrfs_discard_extent
      return -EOPNOTSUPP
    BUG_ON ret

I suspect line 1793 in btrfs_discard_extent():

1766 static int btrfs_discard_extent(struct btrfs_root *root, u64 bytenr,
1767                                 u64 num_bytes, u64 *actual_bytes)
1768 {
...
1793         if (discarded_bytes && ret == -EOPNOTSUPP)
1794                 ret = 0;
...
1800         return ret;
1801 }

discarded_bytes is initially 0, ret is EOPNOTSUPP from not-done
btrfs_issue_discard, true branch is not taken and nonzero retcode is
propagated.

I think the "discarded_bytes != 0" check should be dropped, I do not
see why
it''s there. Going to test now.


david
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

David Sterba

2011-May-17 16:00 UTC

head link

[PATCH] btrfs: fix crash when no drive supports DISCARD

xfstests/013 crashes when the test partition is mounted with -o discard:

walk_up_log_tree
  btrfs_free_reserved_extent
    btrfs_discard_extent
      return -EOPNOTSUPP
    BUG_ON ret

btrfs_discard_extent() should be fine when drive does not support the
DISCARD operation and filter the EOPNOTSUPP retcode, but currently it
does this only when some bytes were succesfully discarded.

Signed-off-by: David Sterba <dsterba@suse.cz>
CC: stable@kernel.org
---
 fs/btrfs/extent-tree.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 9ee6bd5..feab2ab 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -1790,7 +1790,7 @@ static int btrfs_discard_extent(struct btrfs_root *root,
u64 bytenr,
 		}
 		kfree(multi);
 	}
-	if (discarded_bytes && ret == -EOPNOTSUPP)
+	if (ret == -EOPNOTSUPP)
 		ret = 0;
 
 	if (actual_bytes)
-- 
1.7.3.4

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Chris Mason

2011-May-17 18:49 UTC

head link

Re: [PATCH] btrfs: fix crash when no drive supports DISCARD

Excerpts from David Sterba''s message of 2011-05-17 12:00:31
-0400:> xfstests/013 crashes when the test partition is mounted with -o discard:
> 
> walk_up_log_tree
>   btrfs_free_reserved_extent
>     btrfs_discard_extent
>       return -EOPNOTSUPP
>     BUG_ON ret
> 
> btrfs_discard_extent() should be fine when drive does not support the
> DISCARD operation and filter the EOPNOTSUPP retcode, but currently it
> does this only when some bytes were succesfully discarded.
Fair enough, we aren''t really doing anything with EOPNOTSUPP now, so we
can just drop it in btrfs_discard_extent.

-chris
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Li Dongyang

2011-May-18 03:29 UTC

head link

Re: [PATCH] btrfs: fix crash when no drive supports DISCARD

On Wednesday, May 18, 2011 12:00:31 AM David Sterba
wrote:> xfstests/013 crashes when the test partition is mounted with -o discard:
> 
> walk_up_log_tree
>   btrfs_free_reserved_extent
>     btrfs_discard_extent
>       return -EOPNOTSUPP
>     BUG_ON ret
> 
> btrfs_discard_extent() should be fine when drive does not support the
> DISCARD operation and filter the EOPNOTSUPP retcode, but currently it
> does this only when some bytes were succesfully discarded.
> 
> Signed-off-by: David Sterba <dsterba@suse.cz>
> CC: stable@kernel.org
> ---
>  fs/btrfs/extent-tree.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 9ee6bd5..feab2ab 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -1790,7 +1790,7 @@ static int btrfs_discard_extent(struct btrfs_root
*root, u64 bytenr,
>  		}
>  		kfree(multi);
>  	}
> -	if (discarded_bytes && ret == -EOPNOTSUPP)
> +	if (ret == -EOPNOTSUPP)
>  		ret = 0;
>  
>  	if (actual_bytes)
> Thanks for the fix, I thought EOPNOTSUPP could be useful by the caller that
time, and maybe we could do somthing
like remove the discard mount_opt in the fs_info so we can avoid calling it
again.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

David Sterba

2011-May-20 11:52 UTC

head link

Re: [PATCH] btrfs: fix crash when no drive supports DISCARD

On Wed, May 18, 2011 at 11:29:14AM +0800, Li Dongyang
wrote:> Thanks for the fix, I thought EOPNOTSUPP could be useful by the caller
> that time, and maybe we could do somthing like remove the discard
> mount_opt in the fs_info so we can avoid calling it again.
I do not agree that discard should be removed from mount_opt, because
one may add TRIM-capable devices later, it''s a whole filesystem option,
while. A disappeared discard mount option will probably cause panic
in administrator''s head.

However, if a drive does not support TRIM, the btrfs_issue_discard calls
can take a shortcut and do not call up to blkdev_issue_discard (though
it does return immediatelly), caching the state after first call. But
this is matter of the lower level call (blkdev) and should not be
propagated beyond to the extent "level" (ie. btrfs_discard_extent).

david
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Josef Bacik

2011-May-20 14:15 UTC

head link

Re: [PATCH] btrfs: fix crash when no drive supports DISCARD

On 05/20/2011 07:52 AM, David Sterba wrote:> On Wed, May 18, 2011 at 11:29:14AM +0800, Li Dongyang wrote:
>> Thanks for the fix, I thought EOPNOTSUPP could be useful by the caller
>> that time, and maybe we could do somthing like remove the discard
>> mount_opt in the fs_info so we can avoid calling it again.
> 
> I do not agree that discard should be removed from mount_opt, because
> one may add TRIM-capable devices later, it''s a whole filesystem
option,
> while. A disappeared discard mount option will probably cause panic
> in administrator''s head.
> 
> However, if a drive does not support TRIM, the btrfs_issue_discard calls
> can take a shortcut and do not call up to blkdev_issue_discard (though
> it does return immediatelly), caching the state after first call. But
> this is matter of the lower level call (blkdev) and should not be
> propagated beyond to the extent "level" (ie.
btrfs_discard_extent).
> 
There ought to just be a flag added to btrfs_device to say whether or
not it supports discard, and if we get back EOPNOTSUPP we stop putting
discards down on that device, that way if we have some devices that do
it and some that don''t (like for instance if we do that tiered caching
with ssd''s thing) we can make sure discard is actually done on drives
that care about it.  Thanks,

Josef
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Li Dongyang

2011-May-21 08:26 UTC

head link

Re: [PATCH] btrfs: fix crash when no drive supports DISCARD

On Friday, May 20, 2011 10:15:46 PM Josef Bacik wrote:> On 05/20/2011 07:52 AM, David Sterba wrote:
> > On Wed, May 18, 2011 at 11:29:14AM +0800, Li Dongyang wrote:
> >> Thanks for the fix, I thought EOPNOTSUPP could be useful by the
caller
> >> that time, and maybe we could do somthing like remove the discard
> >> mount_opt in the fs_info so we can avoid calling it again.
> > 
> > I do not agree that discard should be removed from mount_opt, because
> > one may add TRIM-capable devices later, it''s a whole
filesystem option,
> > while. A disappeared discard mount option will probably cause panic
> > in administrator''s head.
> > 
> > However, if a drive does not support TRIM, the btrfs_issue_discard
calls
> > can take a shortcut and do not call up to blkdev_issue_discard (though
> > it does return immediatelly), caching the state after first call. But
> > this is matter of the lower level call (blkdev) and should not be
> > propagated beyond to the extent "level" (ie.
btrfs_discard_extent).
> > 
> 
> There ought to just be a flag added to btrfs_device to say whether or
> not it supports discard, and if we get back EOPNOTSUPP we stop putting
> discards down on that device, that way if we have some devices that do
> it and some that don''t (like for instance if we do that tiered
caching
> with ssd''s thing) we can make sure discard is actually done on
drives
> that care about it.  Thanks,
> 
> Josef
> This should be the best way. but if we do not export the EOPNOTSUPP to callers
of btrfs_discard_extent, we will get no errors if we run fstrim on a btrfs which
all the devices don''t have trim, I think this is not what we want.

Before commit 5378e60734f5b7bfe1b43dc191aaf6131c1befe7, btrfs_discard_extent
will only return the error of btrfs_map_block,so I think we should move the
BUG_ONs in
to btrfs_discard_extent from the callers as actually they are testing the result
of
btrfs_map_block.

Thanks
Li Dongyang
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Btrfs devel - May 2011 - Drive without DISCARD support crashes btrfs

Drive without DISCARD support crashes btrfs

[PATCH] btrfs: fix crash when no drive supports DISCARD

Re: [PATCH] btrfs: fix crash when no drive supports DISCARD

Re: [PATCH] btrfs: fix crash when no drive supports DISCARD

Re: [PATCH] btrfs: fix crash when no drive supports DISCARD

Re: [PATCH] btrfs: fix crash when no drive supports DISCARD

Re: [PATCH] btrfs: fix crash when no drive supports DISCARD