Hi everyone, This spot in the chunk allocation code has seen a lot of little tweaks, so I wanted to send this patch out for more eyes. -- We try to limit the size of a chunk to 10GB, which keeps the unit of work reasonable during balance and resize operations. The limit checks were taking into account the number of copies of the data we had but what they really should be doing is comparing against the logical size of the chunk we''re creating. This moves the code around a little to use the count of data stripes from raid5/6. Signed-off-by: Chris Mason <chris.mason@fusionio.com> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 5d6010b..538c5cf 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -3837,10 +3837,6 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans, */ data_stripes = num_stripes / ncopies; - if (stripe_size * ndevs > max_chunk_size * ncopies) { - stripe_size = max_chunk_size * ncopies; - do_div(stripe_size, ndevs); - } if (type & BTRFS_BLOCK_GROUP_RAID5) { raid_stripe_len = find_raid56_stripe_len(ndevs - 1, btrfs_super_stripesize(info->super_copy)); @@ -3851,6 +3847,27 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans, btrfs_super_stripesize(info->super_copy)); data_stripes = num_stripes - 2; } + + /* + * Use the number of data stripes to figure out how big this chunk + * is really going to be in terms of logical address space, + * and compare that answer with the max chunk size + */ + if (stripe_size * data_stripes > max_chunk_size) { + u64 mask = (1ULL << 24) - 1; + stripe_size = max_chunk_size; + do_div(stripe_size, data_stripes); + + /* bump the answer up to a 16MB boundary */ + stripe_size = (stripe_size + mask) & ~mask; + + /* but don''t go higher than the limits we found + * while searching for free extents + */ + if (stripe_size > devices_info[ndevs-1].max_avail) + stripe_size = devices_info[ndevs-1].max_avail; + } + do_div(stripe_size, dev_stripes); /* align to BTRFS_STRIPE_LEN */ -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Goffredo Baroncelli
2013-Feb-21 18:25 UTC
Re: [PATCH] Btrfs: fix max chunk size on raid5/6
Hi Chris, my comments below On 02/20/2013 10:32 PM, Chris Mason wrote:> Hi everyone, > > This spot in the chunk allocation code has seen a lot of little tweaks, > so I wanted to send this patch out for more eyes. > > -- > > We try to limit the size of a chunk to 10GB, which keeps the unit of > work reasonable during balance and resize operations. The limit checks > were taking into account the number of copies of the data we had but > what they really should be doing is comparing against the logical > size of the chunk we''re creating. > > This moves the code around a little to use the count of data stripes > from raid5/6. > > Signed-off-by: Chris Mason <chris.mason@fusionio.com> > > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c > index 5d6010b..538c5cf 100644 > --- a/fs/btrfs/volumes.c > +++ b/fs/btrfs/volumes.c > @@ -3837,10 +3837,6 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans, > */ > data_stripes = num_stripes / ncopies; > > - if (stripe_size * ndevs > max_chunk_size * ncopies) { > - stripe_size = max_chunk_size * ncopies; > - do_div(stripe_size, ndevs); > - } > if (type & BTRFS_BLOCK_GROUP_RAID5) { > raid_stripe_len = find_raid56_stripe_len(ndevs - 1, > btrfs_super_stripesize(info->super_copy)); > @@ -3851,6 +3847,27 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans, > btrfs_super_stripesize(info->super_copy)); > data_stripes = num_stripes - 2; > } > + > + /* > + * Use the number of data stripes to figure out how big this chunk > + * is really going to be in terms of logical address space, > + * and compare that answer with the max chunk size > + */ > + if (stripe_size * data_stripes > max_chunk_size) { > + u64 mask = (1ULL << 24) - 1;1<<24 should be a #define or better a parameter tunable via sysfs and/or a superblock field. May be we can use everywhere BTRFS_STRIPE_LEN ? (of course increasing its value to 16MB or more ?) I am asking that because I don''t know if it makes sense to have BTRFS_STRIPE_LEN different from the "round up" value ...> + stripe_size = max_chunk_size; > + do_div(stripe_size, data_stripes); > + > + /* bump the answer up to a 16MB boundary */ > + stripe_size = (stripe_size + mask) & ~mask; > + > + /* but don''t go higher than the limits we found > + * while searching for free extents > + */ > + if (stripe_size > devices_info[ndevs-1].max_avail) > + stripe_size = devices_info[ndevs-1].max_avail; > + } > + > do_div(stripe_size, dev_stripes);The logic to me seems correct. However my fear is that a "16MB" boundary is too low. I would felt better with a round-up to 128MB. So the stripe size would vary from 128MB to 1GB in step of 128MB. The combination wouldn''t be too high. -- gpg @keyserver.linux.it: Goffredo Baroncelli (kreijackATinwind.it> Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html