Josef Bacik
2009-Jul-22 20:19 UTC
[PATCH] btrfs: prefer cached block groups when allocating
This patch makes find_free_extent much more complicated by adding three extra
loops to the allocator. First, we look for nothing but cached block groups; if
a block group is uncached or in the middle of caching, we skip it and move on
to the next one. The next loop also looks through partially cached block
groups, but if there is not enough free space in one of them we move on to the
next block group instead of waiting for it to finish caching. The final straw
is to look at every block group, and if we didn't find space in one that is
still caching, wait for it to finish caching and then look at it again. If all
of these steps fail, fall back to our old faithful loops of adding a new block
group and setting empty_size to 0 to see if we can find anything.

In order to keep us from kicking off a caching kthread for _every_ block group
on the disk the first time around, if we are in one of the "don't wait" loops
we only start a new thread when fewer than 2 caching threads are already
running. This keeps us from starting so many caching kthreads that everything
else on the box comes to a screeching halt.

In my performance tests I fill up the disk with fs_mark and then unmount. Then
I do the following:

mount /dev/sda1 /mnt/btrfs
time touch /mnt/btrfs/file
umount /dev/sda1
mount /dev/sda1 /mnt/btrfs
time dd /mnt/btrfs/file
time umount /mnt/btrfs

The timing of the umount is, I think, the most interesting because it is the
time we have to wait for all of the caching kthreads to stop before we can
finish. The three times without this patch are

real    0m0.810s
user    0m0.000s
sys     0m0.000s

real    0m0.448s
user    0m0.000s
sys     0m0.004s

real    0m12.224s
user    0m0.008s

And the three times with this patch are

real    0m0.308s
user    0m0.000s
sys     0m0.012s

real    0m0.030s
user    0m0.000s
sys     0m0.012s

real    0m5.705s
user    0m0.000s
sys     0m0.044s

So a significant advantage on the unmount front: we haven't started as many
caching kthreads, so it takes less time to wait for them to finish, and we are
much smarter about doing our allocations, so the touch and dd take less time
as well. Thanks,

Signed-off-by: Josef Bacik <jbacik@redhat.com>
---
 fs/btrfs/extent-tree.c |   65 +++++++++++++++++++++++++++++++++++++----------
 1 files changed, 51 insertions(+), 14 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index f0da696..8fd3669 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3660,6 +3660,7 @@ static noinline int find_free_extent(struct btrfs_trans_handle *trans,
        struct btrfs_space_info *space_info;
        int last_ptr_loop = 0;
        int loop = 0;
+       bool found_uncached_bg = false;
 
        WARN_ON(num_bytes < root->sectorsize);
        btrfs_set_key_type(ins, BTRFS_EXTENT_ITEM_KEY);
@@ -3699,7 +3700,12 @@ static noinline int find_free_extent(struct btrfs_trans_handle *trans,
        if (search_start == hint_byte) {
                block_group = btrfs_lookup_block_group(root->fs_info,
                                                       search_start);
-               if (block_group && block_group_bits(block_group, data)) {
+               /*
+                * we don't want to use the block group if it doesn't match our
+                * allocation bits, or if its not cached.
+                */
+               if (block_group && block_group_bits(block_group, data) &&
+                   block_group_cache_done(block_group)) {
                        down_read(&space_info->groups_sem);
                        if (list_empty(&block_group->list) ||
                            block_group->ro) {
@@ -3729,11 +3735,30 @@ search:
 
 have_block_group:
                if (unlikely(block_group->cached == BTRFS_CACHE_NO)) {
-                       ret = cache_block_group(block_group);
-                       BUG_ON(ret);
+                       /*
+                        * we want to start caching kthreads, but not too many
+                        * right off the bat so we don't overwhelm the system,
+                        * so only start them if there are less than 2 and we're
+                        * in the initial allocation phase.
+                        */
+                       if (loop > 1 ||
+                           atomic_read(&root->fs_info->async_caching_threads)
+                           < 2) {
+                               ret = cache_block_group(block_group);
+                               BUG_ON(ret);
+                       }
                }
 
                cached = block_group_cache_done(block_group);
+               if (unlikely(!cached)) {
+                       found_uncached_bg = true;
+
+                       /*
+                        * if loop == 0 we only want cached bgs, so keep going
+                        */
+                       if (loop == 0)
+                               goto loop;
+               }
                if (unlikely(block_group->ro))
                        goto loop;
 
@@ -3813,7 +3838,7 @@ refill_cluster:
                                spin_unlock(&last_ptr->refill_lock);
                                goto checks;
                        }
-               } else if (!cached) {
+               } else if (!cached && loop > 1) {
                        spin_unlock(&last_ptr->refill_lock);
 
                        wait_block_group_cache_progress(block_group,
@@ -3827,7 +3852,7 @@ refill_cluster:
                         * cluster.  Free the cluster we've been trying
                         * to use, and go to the next block group
                         */
-                       if (loop < 2) {
+                       if (loop < 4) {
                                btrfs_return_cluster_to_free_space(NULL,
                                                                   last_ptr);
                                spin_unlock(&last_ptr->refill_lock);
@@ -3838,9 +3863,9 @@ refill_cluster:
 
                offset = btrfs_find_space_for_alloc(block_group, search_start,
                                                    num_bytes, empty_size);
-               if (!offset && cached) {
+               if (!offset && (cached || (!cached && loop == 1))) {
                        goto loop;
-               } else if (!offset) {
+               } else if (!offset && (!cached && loop > 1)) {
                        wait_block_group_cache_progress(block_group,
                                        num_bytes + empty_size);
                        goto have_block_group;
@@ -3892,13 +3917,25 @@ loop:
        }
        up_read(&space_info->groups_sem);
 
-       /* loop == 0, try to find a clustered alloc in every block group
-        * loop == 1, try again after forcing a chunk allocation
-        * loop == 2, set empty_size and empty_cluster to 0 and try again
+       /* loop == 0, only search fully cached block groups
+        * loop == 1, search partially cached block groups, but don't wait for
+        *            them to finish caching
+        * loop == 2, search everything, and wait if our bg is caching
+        * loop == 3, force a chunk allocation and try again
+        * loop == 4, set empty_size and empty_cluster to 0 and try again
         */
-       if (!ins->objectid && loop < 3 &&
-           (empty_size || empty_cluster || allowed_chunk_alloc)) {
-               if (loop >= 2) {
+       if (!ins->objectid && loop < 4 &&
+           (found_uncached_bg || empty_size || empty_cluster ||
+            allowed_chunk_alloc)) {
+               if (found_uncached_bg) {
+                       found_uncached_bg = false;
+                       if (loop < 2) {
+                               loop++;
+                               goto search;
+                       }
+               }
+
+               if (loop == 3) {
                        empty_size = 0;
                        empty_cluster = 0;
                }
@@ -3911,7 +3948,7 @@ loop:
                        space_info->force_alloc = 1;
                }
 
-               if (loop < 3) {
+               if (loop < 4) {
                        loop++;
                        goto search;
                }
--
1.5.4.3
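For readers following the changelog rather than the diff, the staged retry can
be summarized in a few lines of ordinary C. This is a minimal, self-contained
sketch of the idea only; the stage numbers mirror the loop values described
above, while try_block_groups() and its toy success condition are hypothetical
stand-ins for the real search over the block group list in find_free_extent().

/*
 * Toy model of the staged allocator retry described in the changelog.
 * Stage 0: fully cached block groups only.  Stage 1: partially cached
 * groups, but never wait on caching.  Stage 2: wait for caching when
 * needed.  Stage 3: force a chunk allocation.  Stage 4: retry with
 * empty_size/empty_cluster set to 0.
 */
#include <stdbool.h>
#include <stdio.h>

static bool try_block_groups(int stage, bool *found_uncached)
{
        /* pretend space only turns up once we are willing to wait (stage 2) */
        *found_uncached = (stage < 2);
        return stage >= 2;
}

int main(void)
{
        int stage = 0;

        while (stage <= 4) {
                bool found_uncached = false;

                if (try_block_groups(stage, &found_uncached)) {
                        printf("allocated at stage %d\n", stage);
                        return 0;
                }
                /*
                 * nothing found: the real code only retries while uncached
                 * groups were skipped or other fallbacks (chunk allocation,
                 * empty_size = 0) are still available
                 */
                if (!found_uncached && stage == 4)
                        break;
                stage++;
        }
        printf("allocation failed\n");
        return 1;
}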
Josef Bacik
2009-Jul-22 20:56 UTC
Re: [PATCH] btrfs: prefer cached block groups when allocating UPDATED
On Wed, Jul 22, 2009 at 04:19:09PM -0400, Josef Bacik wrote:
> This patch makes find_free_extent much more complicated by adding three extra
> loops to the allocator.
> [ ... rest of the changelog and timing results snipped ... ]

Updated to use an enum instead of just magic numbers for the loop state
machine. Thanks,

Signed-off-by: Josef Bacik <jbacik@redhat.com>
---
 fs/btrfs/extent-tree.c |   78 +++++++++++++++++++++++++++++++++++++----------
 1 files changed, 61 insertions(+), 17 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index f0da696..925bf8e 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3635,6 +3635,14 @@ wait_block_group_cache_progress(struct btrfs_block_group_cache *cache,
        return 0;
 }
 
+enum btrfs_loop_type {
+       LOOP_CACHED_ONLY = 0,
+       LOOP_CACHING_NOWAIT = 1,
+       LOOP_CACHING_WAIT = 2,
+       LOOP_ALLOC_CHUNK = 3,
+       LOOP_NO_EMPTY_SIZE = 4,
+};
+
 /*
  * walks the btree of allocated extents and find a hole of a given size.
  * The key ins is changed to record the hole:
@@ -3660,6 +3668,7 @@ static noinline int find_free_extent(struct btrfs_trans_handle *trans,
        struct btrfs_space_info *space_info;
        int last_ptr_loop = 0;
        int loop = 0;
+       bool found_uncached_bg = false;
 
        WARN_ON(num_bytes < root->sectorsize);
        btrfs_set_key_type(ins, BTRFS_EXTENT_ITEM_KEY);
@@ -3691,15 +3700,18 @@ static noinline int find_free_extent(struct btrfs_trans_handle *trans,
        search_start = max(search_start, first_logical_byte(root, 0));
        search_start = max(search_start, hint_byte);
 
-       if (!last_ptr) {
+       if (!last_ptr)
                empty_cluster = 0;
-               loop = 1;
-       }
 
        if (search_start == hint_byte) {
                block_group = btrfs_lookup_block_group(root->fs_info,
                                                       search_start);
-               if (block_group && block_group_bits(block_group, data)) {
+               /*
+                * we don't want to use the block group if it doesn't match our
+                * allocation bits, or if its not cached.
+                */
+               if (block_group && block_group_bits(block_group, data) &&
+                   block_group_cache_done(block_group)) {
                        down_read(&space_info->groups_sem);
                        if (list_empty(&block_group->list) ||
                            block_group->ro) {
@@ -3729,11 +3741,28 @@ search:
 
 have_block_group:
                if (unlikely(block_group->cached == BTRFS_CACHE_NO)) {
-                       ret = cache_block_group(block_group);
-                       BUG_ON(ret);
+                       /*
+                        * we want to start caching kthreads, but not too many
+                        * right off the bat so we don't overwhelm the system,
+                        * so only start them if there are less than 2 and we're
+                        * in the initial allocation phase.
+                        */
+                       if (loop > LOOP_CACHING_NOWAIT ||
+                           atomic_read(&root->fs_info->async_caching_threads)
+                           < 2) {
+                               ret = cache_block_group(block_group);
+                               BUG_ON(ret);
+                       }
                }
 
                cached = block_group_cache_done(block_group);
+               if (unlikely(!cached)) {
+                       found_uncached_bg = true;
+
+                       /* if we only want cached bgs, loop */
+                       if (loop == LOOP_CACHED_ONLY)
+                               goto loop;
+               }
                if (unlikely(block_group->ro))
                        goto loop;
 
@@ -3813,7 +3842,7 @@ refill_cluster:
                                spin_unlock(&last_ptr->refill_lock);
                                goto checks;
                        }
-               } else if (!cached) {
+               } else if (!cached && loop > LOOP_CACHING_NOWAIT) {
                        spin_unlock(&last_ptr->refill_lock);
 
                        wait_block_group_cache_progress(block_group,
@@ -3827,7 +3856,7 @@ refill_cluster:
                         * cluster.  Free the cluster we've been trying
                         * to use, and go to the next block group
                         */
-                       if (loop < 2) {
+                       if (loop < LOOP_NO_EMPTY_SIZE) {
                                btrfs_return_cluster_to_free_space(NULL,
                                                                   last_ptr);
                                spin_unlock(&last_ptr->refill_lock);
@@ -3838,9 +3867,11 @@ refill_cluster:
 
                offset = btrfs_find_space_for_alloc(block_group, search_start,
                                                    num_bytes, empty_size);
-               if (!offset && cached) {
+               if (!offset && (cached || (!cached &&
+                                          loop == LOOP_CACHING_NOWAIT))) {
                        goto loop;
-               } else if (!offset) {
+               } else if (!offset && (!cached &&
+                                      loop > LOOP_CACHING_NOWAIT)) {
                        wait_block_group_cache_progress(block_group,
                                        num_bytes + empty_size);
                        goto have_block_group;
@@ -3892,13 +3923,26 @@ loop:
        }
        up_read(&space_info->groups_sem);
 
-       /* loop == 0, try to find a clustered alloc in every block group
-        * loop == 1, try again after forcing a chunk allocation
-        * loop == 2, set empty_size and empty_cluster to 0 and try again
+       /* LOOP_CACHED_ONLY, only search fully cached block groups
+        * LOOP_CACHING_NOWAIT, search partially cached block groups, but
+        *                      don't wait for them to finish caching
+        * LOOP_CACHING_WAIT, search everything, and wait if our bg is caching
+        * LOOP_ALLOC_CHUNK, force a chunk allocation and try again
+        * LOOP_NO_EMPTY_SIZE, set empty_size and empty_cluster to 0 and try
+        *                     again
         */
-       if (!ins->objectid && loop < 3 &&
-           (empty_size || empty_cluster || allowed_chunk_alloc)) {
-               if (loop >= 2) {
+       if (!ins->objectid && loop < LOOP_NO_EMPTY_SIZE &&
+           (found_uncached_bg || empty_size || empty_cluster ||
+            allowed_chunk_alloc)) {
+               if (found_uncached_bg) {
+                       found_uncached_bg = false;
+                       if (loop < LOOP_CACHING_WAIT) {
+                               loop++;
+                               goto search;
+                       }
+               }
+
+               if (loop == LOOP_ALLOC_CHUNK) {
                        empty_size = 0;
                        empty_cluster = 0;
                }
@@ -3911,7 +3955,7 @@ loop:
                        space_info->force_alloc = 1;
                }
 
-               if (loop < 3) {
+               if (loop < LOOP_NO_EMPTY_SIZE) {
                        loop++;
                        goto search;
                }
--
1.5.4.3
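Spelled out on its own, the caching-kthread throttling rule that both versions
of the patch apply reads as below. This is just a restatement for clarity:
should_start_caching_thread() is a hypothetical helper, and the real test is
written inline in find_free_extent() against fs_info->async_caching_threads.

/* assumes the enum introduced by the updated patch above */
#include <stdbool.h>
#include <stdio.h>

enum btrfs_loop_type {
        LOOP_CACHED_ONLY = 0,
        LOOP_CACHING_NOWAIT = 1,
        LOOP_CACHING_WAIT = 2,
        LOOP_ALLOC_CHUNK = 3,
        LOOP_NO_EMPTY_SIZE = 4,
};

/* hypothetical helper restating the inline condition from the patch */
static bool should_start_caching_thread(enum btrfs_loop_type loop,
                                        int async_caching_threads)
{
        /*
         * past the "don't wait" stages we always kick off the cache fill;
         * in the early stages we cap the number of caching kthreads at 2
         * so the initial burst of allocations doesn't swamp the box
         */
        return loop > LOOP_CACHING_NOWAIT || async_caching_threads < 2;
}

int main(void)
{
        /* early stage with 3 caching threads already running: hold off */
        printf("%d\n", should_start_caching_thread(LOOP_CACHED_ONLY, 3));
        /* patient stage: always start caching the block group */
        printf("%d\n", should_start_caching_thread(LOOP_CACHING_WAIT, 3));
        return 0;
}

With named states the check reads as intent ("we are past the impatient
phases") rather than as a comparison against a magic 1, which is the point of
switching to the enum in the updated patch.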