thr3ads.net - Btrfs devel - [patch]btrfs: finish read pages in the order they are submitted [Feb 2010]

Shaohua Li

2010-Feb-03 07:45 UTC

[patch]btrfs: finish read pages in the order they are submitted

Endio is done in reverse order of the bio vectors. That means that for a
sequential read, the page submitted first will finish last in a bio.
Considering that we checksum every page (which makes the cache hot), this
introduces a delay (and a chance of squeezing out cache that will be used
soon) for pages submitted at the beginning. I don't observe an obvious
performance difference with the patch below in my simple test, but it seems
more natural to finish reads in the order they are submitted.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 96577e8..4df0c56 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -1750,7 +1750,8 @@ static void end_bio_extent_writepage(struct bio *bio, int err)
 static void end_bio_extent_readpage(struct bio *bio, int err)
 {
 	int uptodate = test_bit(BIO_UPTODATE, &bio->bi_flags);
-	struct bio_vec *bvec = bio->bi_io_vec + bio->bi_vcnt - 1;
+	struct bio_vec *bvec_end = bio->bi_io_vec + bio->bi_vcnt - 1;
+	struct bio_vec *bvec = bio->bi_io_vec;
 	struct extent_io_tree *tree;
 	u64 start;
 	u64 end;
@@ -1773,7 +1774,7 @@ static void end_bio_extent_readpage(struct bio *bio, int err)
 		else
 			whole_page = 0;
 
-		if (--bvec >= bio->bi_io_vec)
+		if (++bvec <= bvec_end)
 			prefetchw(&bvec->bv_page->flags);
 
 		if (uptodate && tree->ops && tree->ops->readpage_end_io_hook) {
@@ -1818,7 +1819,7 @@ static void end_bio_extent_readpage(struct bio *bio, int err)
 			}
 			check_page_locked(tree, page);
 		}
-	} while (bvec >= bio->bi_io_vec);
+	} while (bvec <= bvec_end);
 
 	bio_put(bio);
 }
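
For readers who want to see the loop shape outside the kernel, here is a minimal userspace sketch of the forward walk the patch switches to. It is not the btrfs code: `struct seg`, `complete_in_order`, and the `out` array are hypothetical stand-ins, and it assumes (as in the kernel loop) that the current element is captured before the cursor is advanced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bio_vec; only the id matters here. */
struct seg {
	int id;
};

/*
 * Walk segs[0..count-1] first to last, mirroring the patched loop shape:
 * remember the current element, advance the cursor early (where the
 * kernel loop issues its prefetch), then finish the remembered element.
 * The completion order is recorded in out[], which must hold count ints.
 */
static size_t complete_in_order(const struct seg *segs, size_t count, int *out)
{
	const struct seg *cur = segs;
	const struct seg *last = segs + count - 1;
	size_t n = 0;

	assert(count > 0);	/* a do/while body always runs at least once */
	do {
		const struct seg *done = cur;

		if (++cur <= last) {
			/* the kernel loop prefetches the next page's flags here */
		}
		out[n++] = done->id;
	} while (cur <= last);

	return n;
}
```

Calling it on three segments with ids 10, 20, 30 fills `out` in that same order. The pre-patch loop starts the cursor at the last element and decrements instead, so it would record 30, 20, 10.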
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Chris Mason

2010-Feb-03 18:18 UTC

Re: [patch]btrfs: finish read pages in the order they are submitted

On Wed, Feb 03, 2010 at 03:45:11PM +0800, Shaohua Li wrote:
> Endio is done in reverse order of the bio vectors. That means that for a
> sequential read, the page submitted first will finish last in a bio.
> Considering that we checksum every page (which makes the cache hot), this
> introduces a delay (and a chance of squeezing out cache that will be used
> soon) for pages submitted at the beginning. I don't observe an obvious
> performance difference with the patch below in my simple test, but it seems
> more natural to finish reads in the order they are submitted.

Interesting, I wonder if we'd be able to see this on a higher throughput
system.  Jens, care to give it a shot (patch below)?

-chris


Signed-off-by: Shaohua Li <shaohua.li@intel.com>

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 96577e8..4df0c56 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -1750,7 +1750,8 @@ static void end_bio_extent_writepage(struct bio *bio, int err)
 static void end_bio_extent_readpage(struct bio *bio, int err)
 {
 	int uptodate = test_bit(BIO_UPTODATE, &bio->bi_flags);
-	struct bio_vec *bvec = bio->bi_io_vec + bio->bi_vcnt - 1;
+	struct bio_vec *bvec_end = bio->bi_io_vec + bio->bi_vcnt - 1;
+	struct bio_vec *bvec = bio->bi_io_vec;
 	struct extent_io_tree *tree;
 	u64 start;
 	u64 end;
@@ -1773,7 +1774,7 @@ static void end_bio_extent_readpage(struct bio *bio, int err)
 		else
 			whole_page = 0;
 
-		if (--bvec >= bio->bi_io_vec)
+		if (++bvec <= bvec_end)
 			prefetchw(&bvec->bv_page->flags);
 
 		if (uptodate && tree->ops && tree->ops->readpage_end_io_hook) {
@@ -1818,7 +1819,7 @@ static void end_bio_extent_readpage(struct bio *bio, int err)
 			}
 			check_page_locked(tree, page);
 		}
-	} while (bvec >= bio->bi_io_vec);
+	} while (bvec <= bvec_end);
 
 	bio_put(bio);
 }

Jens Axboe

2010-Feb-08 10:59 UTC

Re: [patch]btrfs: finish read pages in the order they are submitted

On Wed, Feb 03 2010, Chris Mason wrote:
> On Wed, Feb 03, 2010 at 03:45:11PM +0800, Shaohua Li wrote:
> > Endio is done in reverse order of the bio vectors. That means that for a
> > sequential read, the page submitted first will finish last in a bio.
> > Considering that we checksum every page (which makes the cache hot), this
> > introduces a delay (and a chance of squeezing out cache that will be used
> > soon) for pages submitted at the beginning. I don't observe an obvious
> > performance difference with the patch below in my simple test, but it seems
> > more natural to finish reads in the order they are submitted.
> 
> Interesting, I wonder if we'd be able to see this on a higher throughput
> system.  Jens, care to give it a shot (patch below)?

Sure, I gave it a spin. Baseline is current -git (-rc7'ish), and the
workload is just stream reading 8 16GB files. I used large streaming
reads as the bigger ios would hopefully help show the effect of doing
the reverse completions. The run takes ~1 minute, and the results are
averaged over 3 runs.

Throughput:

Kernel          Slowest         Fastest         Average
-------------------------------------------------------
baseline        2041MB/sec      2229MB/sec      2155MB/sec
patched         2052MB/sec      2071MB/sec      2062MB/sec

Completion latency average (msecs):

Kernel          Best            Worst           Average
-------------------------------------------------------
baseline        1.72            1.89            1.79
patched         1.83            1.89            1.85

Probably would need a LOT more runs to get a statistically significant
number here. It would be nice if O_DIRECT worked (hint, hint!), which
usually makes these things easier to test. If I look at the throughput
of the runs, the baseline usually starts a little slower (1.8GB/sec or
so) and gets faster, while the patched run starts much higher (close to
3.0GB/sec) and drops to 2.0GB/sec after that for the rest of the run.

So I did some perf stat checks too, to see if we see an improvement for
cache utilization. Results below.

Cache stats (millions)

Kernel          References              Misses
----------------------------------------------
baseline        3547                    2387
patched         3822                    2351

These numbers are very stable, the above were also averaged over 3 runs,
but variability was very low.

My feeling is that the patch should be included. Cache misses are
provably down and the patch makes a lot of sense just logically. The
patched runs seemed more stable, and my gut tells me that the unpatched
runs may have been a bit flukey (one fast run, should probably be
excluded).

Let me know if you want more tests.

-- 
Jens Axboe


Jens Axboe

2010-Feb-08 11:44 UTC

Re: [patch]btrfs: finish read pages in the order they are submitted

On Mon, Feb 08 2010, Jens Axboe wrote:
> Cache stats (millions)
> 
> Kernel          References              Misses
> ----------------------------------------------
> baseline        3547                    2387
> patched         3822                    2351
> 
> These numbers are very stable, the above were also averaged over 3 runs,
> but variability was very low.

Update on this. I set up the storage system for more stable runs and
repeated the above test. It runs a bit faster as well, completing the
workload at 2.5GB/sec on average.

Cache stats (millions)

Kernel          References              Misses
----------------------------------------------
baseline        3384                    2318
baseline        3417                    2313
baseline        3382                    2323
baseline avg    3394                    2318

patched         3518                    2258
patched         3428                    2201
patched         3536                    2274
patched avg     3494                    2244

So for those runs, ~3% more references and ~3% fewer misses. Even with the
variability here, that looks like a win in my book.
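
The percentages above can be re-derived from the table with a few lines of C. This is only an illustrative sketch of the arithmetic; the helper names `avg3` and `pct_change` are made up here and are not part of any tool used in the test.

```c
#include <assert.h>

/* Mean of three runs. */
static double avg3(double a, double b, double c)
{
	return (a + b + c) / 3.0;
}

/* Percentage change from base to patched; negative means a reduction. */
static double pct_change(double base, double patched)
{
	assert(base != 0.0);
	return (patched - base) / base * 100.0;
}
```

Feeding in the three baseline and three patched reference counts gives a change of a little under +3%, and the miss counts give a change of a little over -3%, which matches the rounded figures quoted.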

-- 
Jens Axboe
