Josef Bacik
2010-Nov-16 21:22 UTC
[PATCH] Btrfs: do not loop through raid types when looking for free extent
There is a bug in find_free_extent where if we don''t find a free extent in the raid type we are looking for, we loop through to the next raid type. This is not ok since we need to make sure we honor the raid types we are given. So instead kill this check and get the proper index for the raid type we want from the allocator. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com> --- fs/btrfs/extent-tree.c | 20 ++++++++------------ 1 files changed, 8 insertions(+), 12 deletions(-) diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 0c097f3..a917e3c 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -4785,16 +4785,16 @@ wait_block_group_cache_done(struct btrfs_block_group_cache *cache) return 0; } -static int get_block_group_index(struct btrfs_block_group_cache *cache) +static int get_block_group_index(u64 flags) { int index; - if (cache->flags & BTRFS_BLOCK_GROUP_RAID10) + if (flags & BTRFS_BLOCK_GROUP_RAID10) index = 0; - else if (cache->flags & BTRFS_BLOCK_GROUP_RAID1) + else if (flags & BTRFS_BLOCK_GROUP_RAID1) index = 1; - else if (cache->flags & BTRFS_BLOCK_GROUP_DUP) + else if (flags & BTRFS_BLOCK_GROUP_DUP) index = 2; - else if (cache->flags & BTRFS_BLOCK_GROUP_RAID0) + else if (flags & BTRFS_BLOCK_GROUP_RAID0) index = 3; else index = 4; @@ -4834,7 +4834,7 @@ static noinline int find_free_extent(struct btrfs_trans_handle *trans, struct btrfs_space_info *space_info; int last_ptr_loop = 0; int loop = 0; - int index = 0; + int index = get_block_group_index(data); bool found_uncached_bg = false; bool failed_cluster_refill = false; bool failed_alloc = false; @@ -4913,7 +4913,6 @@ ideal_cache: btrfs_put_block_group(block_group); up_read(&space_info->groups_sem); } else { - index = get_block_group_index(block_group); goto have_block_group; } } else if (block_group) { @@ -5145,14 +5144,11 @@ checks: loop: failed_cluster_refill = false; failed_alloc = false; - BUG_ON(index != get_block_group_index(block_group)); + BUG_ON(index != get_block_group_index(block_group->flags)); btrfs_put_block_group(block_group); } up_read(&space_info->groups_sem); - if (!ins->objectid && ++index < BTRFS_NR_RAID_TYPES) - goto search; - /* LOOP_FIND_IDEAL, only search caching/cached bg''s, and don''t wait for * for them to make caching progress. Also * determine the best possible bg to cache @@ -8204,7 +8200,7 @@ int btrfs_free_block_groups(struct btrfs_fs_info *info) static void __link_block_group(struct btrfs_space_info *space_info, struct btrfs_block_group_cache *cache) { - int index = get_block_group_index(cache); + int index = get_block_group_index(cache->flags); down_write(&space_info->groups_sem); list_add_tail(&cache->list, &space_info->block_groups[index]); -- 1.6.6.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Yan, Zheng
2010-Nov-17 01:37 UTC
Re: [PATCH] Btrfs: do not loop through raid types when looking for free extent
On Wed, Nov 17, 2010 at 5:22 AM, Josef Bacik <josef@redhat.com> wrote:> There is a bug in find_free_extent where if we don''t find a free extent in the > raid type we are looking for, we loop through to the next raid type. This is > not ok since we need to make sure we honor the raid types we are given. So > instead kill this check and get the proper index for the raid type we want from > the allocator. Thanks, >Loop through raid types is for handling failure in the middle of raid type conversion. Yan, Zheng -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Yan, Zheng
2010-Nov-17 01:38 UTC
Re: [PATCH] Btrfs: do not loop through raid types when looking for free extent
On Wed, Nov 17, 2010 at 5:22 AM, Josef Bacik <josef@redhat.com> wrote:> There is a bug in find_free_extent where if we don''t find a free extent in the > raid type we are looking for, we loop through to the next raid type. This is > not ok since we need to make sure we honor the raid types we are given. So > instead kill this check and get the proper index for the raid type we want from > the allocator. Thanks, >Loop through raid types is for handling failure in the middle of raid type conversion. Yan, Zheng -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Josef Bacik
2010-Nov-17 02:31 UTC
Re: [PATCH] Btrfs: do not loop through raid types when looking for free extent
On Wed, Nov 17, 2010 at 09:37:29AM +0800, Yan, Zheng wrote:> On Wed, Nov 17, 2010 at 5:22 AM, Josef Bacik <josef@redhat.com> wrote: > > There is a bug in find_free_extent where if we don''t find a free extent in the > > raid type we are looking for, we loop through to the next raid type. This is > > not ok since we need to make sure we honor the raid types we are given. So > > instead kill this check and get the proper index for the raid type we want from > > the allocator. Thanks, > > > > Loop through raid types is for handling failure in the middle of raid type > conversion. >We need to figure out a different way to deal with it then, at the moment we have users getting all of the different raid types across their disks because it''s really easy to exit out of the main loop without an allocation, for example when doing the ideal caching stuff, we just look for the best block group to cache before even trying to make an allocation, so doing that loop will automatically make us change the raid type from what we asked for. Or in the case that the hint doesn''t actually point at a compatible block group we''ll get the first index, and if thats metadata we''ll end up not using DUP if thats the allocation policy. We can deal with raid type conversion later, as it stands this stuff is broken. Thanks, Josef -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Chris Mason
2010-Dec-14 01:33 UTC
Re: [PATCH] Btrfs: do not loop through raid types when looking for free extent
Excerpts from Yan, Zheng''s message of 2010-11-16 20:38:23 -0500:> On Wed, Nov 17, 2010 at 5:22 AM, Josef Bacik <josef@redhat.com> wrote: > > There is a bug in find_free_extent where if we don''t find a free extent in the > > raid type we are looking for, we loop through to the next raid type. Â This is > > not ok since we need to make sure we honor the raid types we are given. Â So > > instead kill this check and get the proper index for the raid type we want from > > the allocator. Â Thanks, > > > > Loop through raid types is for handling failure in the middle of raid type > conversion.The problem is that nothing prevents it from looping back to a raid0 chunk when we really want raid1 or raid10. And mkfs leaves behind a small raid0 chunk (4MB) that is uses as it assembles all the devices. I confirmed that we often use the small raid0 chunk even in raid1 or raid10. Please take a look at this commit: http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-unstable.git;a=commit;h=83a50de97fe96aca82389e061862ed760ece2283 -chris -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Yan, Zheng
2010-Dec-14 01:54 UTC
Re: [PATCH] Btrfs: do not loop through raid types when looking for free extent
On Tue, Dec 14, 2010 at 9:33 AM, Chris Mason <chris.mason@oracle.com> wrote:> Excerpts from Yan, Zheng''s message of 2010-11-16 20:38:23 -0500: >> On Wed, Nov 17, 2010 at 5:22 AM, Josef Bacik <josef@redhat.com> wrote: >> > There is a bug in find_free_extent where if we don''t find a free extent in the >> > raid type we are looking for, we loop through to the next raid type. This is >> > not ok since we need to make sure we honor the raid types we are given. So >> > instead kill this check and get the proper index for the raid type we want from >> > the allocator. Thanks, >> > >> >> Loop through raid types is for handling failure in the middle of raid type >> conversion. > > The problem is that nothing prevents it from looping back to a raid0 > chunk when we really want raid1 or raid10. And mkfs leaves behind a > small raid0 chunk (4MB) that is uses as it assembles all the devices.check code at end of btrfs_read_block_groups, it prevents allocating from raid0 when there are mirrored block groups.> > I confirmed that we often use the small raid0 chunk even in raid1 or > raid10. > > Please take a look at this commit: > > http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-unstable.git;a=commit;h=83a50de97fe96aca82389e061862ed760ece2283 > > -chris > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html >-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Chris Mason
2010-Dec-14 01:59 UTC
Re: [PATCH] Btrfs: do not loop through raid types when looking for free extent
Excerpts from Yan, Zheng''s message of 2010-12-13 20:54:12 -0500:> On Tue, Dec 14, 2010 at 9:33 AM, Chris Mason <chris.mason@oracle.com> wrote: > > Excerpts from Yan, Zheng''s message of 2010-11-16 20:38:23 -0500: > >> On Wed, Nov 17, 2010 at 5:22 AM, Josef Bacik <josef@redhat.com> wrote: > >> > There is a bug in find_free_extent where if we don''t find a free extent in the > >> > raid type we are looking for, we loop through to the next raid type.  This is > >> > not ok since we need to make sure we honor the raid types we are given.  So > >> > instead kill this check and get the proper index for the raid type we want from > >> > the allocator.  Thanks, > >> > > >> > >> Loop through raid types is for handling failure in the middle of raid type > >> conversion. > > > > The problem is that nothing prevents it from looping back to a raid0 > > chunk when we really want raid1 or raid10.  And mkfs leaves behind a > > small raid0 chunk (4MB) that is uses as it assembles all the devices. > > check code at end of btrfs_read_block_groups, it prevents allocating > from raid0 when there are mirrored block groups.We''re still allocating from them though. It''s a big problem when we mount degraded. -chris> > > > > I confirmed that we often use the small raid0 chunk even in raid1 or > > raid10. > > > > Please take a look at this commit: > > > > http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-unstable.git;a=commit;h=83a50de97fe96aca82389e061862ed760ece2283 > > > > -chris > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at  http://vger.kernel.org/majordomo-info.html > >-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html