Josef Bacik
2012-Sep-24 18:11 UTC
[PATCH] Btrfs: do not async metadata csums if we have hardware crc32c
The reason we offload csumming is because it is CPU intensive, except it is
not on modern intel CPUs.  So check to see if we support hardware crc32c,
and if we do just do the csumming in our current threads context.  Otherwise
we can farm it off.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
---
 fs/btrfs/disk-io.c |   17 +++++++++++++++++
 1 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index dcaf556..830b9af 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -31,6 +31,7 @@
 #include <linux/migrate.h>
 #include <linux/ratelimit.h>
 #include <asm/unaligned.h>
+#include <asm/cpufeature.h>
 #include "compat.h"
 #include "ctree.h"
 #include "disk-io.h"
@@ -880,6 +881,22 @@ static int btree_submit_bio_hook(struct inode *inode, int rw, struct bio *bio,
 	}
 
 	/*
+	 * Pretty sure I'm going to hell for this.  If our CPU can do crc32cs in
+	 * the hardware then there is no reason to do the csum stuff
+	 * asynchronously, it will be faster to do it inline, so test to see if
+	 * our CPU can do hardware crc32c and if it can just do the csum in our
+	 * threads context.
+	 */
+#ifdef CONFIG_X86
+	if (cpu_has_xmm4_2) {
+		printk(KERN_ERR "doing it the fast way\n");
+		ret = btree_csum_one_bio(bio);
+		if (ret)
+			return ret;
+		return btrfs_map_bio(BTRFS_I(inode)->root, rw, bio, mirror_num, 0);
+	}
+#endif
+	/*
 	 * kthread helpers are used to submit writes so that checksumming
 	 * can happen in parallel across all CPUs
 	 */
-- 
1.7.7.6

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
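For readers who want to see why checksumming without hardware support counts as CPU intensive, here is a minimal userspace sketch of a bit-at-a-time CRC-32C. It is not the kernel's implementation: the function name crc32c_sw is made up for illustration, and it uses the common CRC-32C convention (initial value 0xFFFFFFFF, final XOR 0xFFFFFFFF), which may differ from the seed handling btrfs applies. The point is only that the software path spends eight shift/XOR steps per input byte, work that a single crc32 instruction on SSE4.2-capable CPUs replaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reflected form of the Castagnoli polynomial 0x1EDC6F41. */
#define CRC32C_POLY_REFLECTED 0x82F63B78u

/*
 * Bit-at-a-time CRC-32C (hypothetical helper, illustration only).
 * Uses the common convention: start from 0xFFFFFFFF and invert the
 * result at the end.
 */
static uint32_t crc32c_sw(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	uint32_t crc = 0xFFFFFFFFu;

	assert(buf != NULL || len == 0);

	while (len--) {
		crc ^= *p++;
		/* Eight conditional shift/XOR steps for every input byte. */
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED & (0u - (crc & 1u)));
	}
	return crc ^ 0xFFFFFFFFu;
}
```

With this convention, the nine ASCII bytes "123456789" produce the standard CRC-32C check value 0xE3069283, which is a convenient way to validate any faster table-driven or hardware variant against the slow one.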
Arne Jansen
2012-Sep-24 18:19 UTC
Re: [PATCH] Btrfs: do not async metadata csums if we have hardware crc32c
On 09/24/12 20:11, Josef Bacik wrote:
> The reason we offload csumming is because it is CPU intensive, except it is
> not on modern intel CPUs.  So check to see if we support hardware crc32c,
> and if we do just do the csumming in our current threads context.  Otherwise
> we can farm it off.  Thanks,
>
> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
> ---
>  fs/btrfs/disk-io.c |   17 +++++++++++++++++
>  1 files changed, 17 insertions(+), 0 deletions(-)
>
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index dcaf556..830b9af 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -31,6 +31,7 @@
>  #include <linux/migrate.h>
>  #include <linux/ratelimit.h>
>  #include <asm/unaligned.h>
> +#include <asm/cpufeature.h>
>  #include "compat.h"
>  #include "ctree.h"
>  #include "disk-io.h"
> @@ -880,6 +881,22 @@ static int btree_submit_bio_hook(struct inode *inode, int rw, struct bio *bio,
>  	}
>
>  	/*
> +	 * Pretty sure I'm going to hell for this.  If our CPU can do crc32cs in
> +	 * the hardware then there is no reason to do the csum stuff
> +	 * asynchronously, it will be faster to do it inline, so test to see if
> +	 * our CPU can do hardware crc32c and if it can just do the csum in our
> +	 * threads context.
> +	 */
> +#ifdef CONFIG_X86
> +	if (cpu_has_xmm4_2) {
> +		printk(KERN_ERR "doing it the fast way\n");

You'll probably go to hell for the printk...

> +		ret = btree_csum_one_bio(bio);
> +		if (ret)
> +			return ret;
> +		return btrfs_map_bio(BTRFS_I(inode)->root, rw, bio, mirror_num, 0);
> +	}
> +#endif
> +	/*
>  	 * kthread helpers are used to submit writes so that checksumming
>  	 * can happen in parallel across all CPUs
>  	 */
>
Josef Bacik
2012-Sep-24 18:33 UTC
Re: [PATCH] Btrfs: do not async metadata csums if we have hardware crc32c
On Mon, Sep 24, 2012 at 12:19:20PM -0600, Arne Jansen wrote:
> On 09/24/12 20:11, Josef Bacik wrote:
> > The reason we offload csumming is because it is CPU intensive, except it is
> > not on modern intel CPUs.  So check to see if we support hardware crc32c,
> > and if we do just do the csumming in our current threads context.  Otherwise
> > we can farm it off.  Thanks,
> >
> > Signed-off-by: Josef Bacik <jbacik@fusionio.com>
> > ---
> >  fs/btrfs/disk-io.c |   17 +++++++++++++++++
> >  1 files changed, 17 insertions(+), 0 deletions(-)
> >
> > diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> > index dcaf556..830b9af 100644
> > --- a/fs/btrfs/disk-io.c
> > +++ b/fs/btrfs/disk-io.c
> > @@ -31,6 +31,7 @@
> >  #include <linux/migrate.h>
> >  #include <linux/ratelimit.h>
> >  #include <asm/unaligned.h>
> > +#include <asm/cpufeature.h>
> >  #include "compat.h"
> >  #include "ctree.h"
> >  #include "disk-io.h"
> > @@ -880,6 +881,22 @@ static int btree_submit_bio_hook(struct inode *inode, int rw, struct bio *bio,
> >  	}
> >
> >  	/*
> > +	 * Pretty sure I'm going to hell for this.  If our CPU can do crc32cs in
> > +	 * the hardware then there is no reason to do the csum stuff
> > +	 * asynchronously, it will be faster to do it inline, so test to see if
> > +	 * our CPU can do hardware crc32c and if it can just do the csum in our
> > +	 * threads context.
> > +	 */
> > +#ifdef CONFIG_X86
> > +	if (cpu_has_xmm4_2) {
> > +		printk(KERN_ERR "doing it the fast way\n");
>
> You'll probably go to hell for the printk...
>

Hahah oops, at least I remembered to take out the other printk, it had much
more colorful language ;).  Thanks,

Josef
Chris Mason
2012-Sep-24 18:58 UTC
Re: [PATCH] Btrfs: do not async metadata csums if we have hardware crc32c
On Mon, Sep 24, 2012 at 12:19:20PM -0600, Arne Jansen wrote:
> On 09/24/12 20:11, Josef Bacik wrote:
> > The reason we offload csumming is because it is CPU intensive, except it is
> > not on modern intel CPUs.  So check to see if we support hardware crc32c,
> > and if we do just do the csumming in our current threads context.  Otherwise
> > we can farm it off.  Thanks,
> >
> > Signed-off-by: Josef Bacik <jbacik@fusionio.com>
> > ---
> >  fs/btrfs/disk-io.c |   17 +++++++++++++++++
> >  1 files changed, 17 insertions(+), 0 deletions(-)
> >
> > diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> > index dcaf556..830b9af 100644
> > --- a/fs/btrfs/disk-io.c
> > +++ b/fs/btrfs/disk-io.c
> > @@ -31,6 +31,7 @@
> >  #include <linux/migrate.h>
> >  #include <linux/ratelimit.h>
> >  #include <asm/unaligned.h>
> > +#include <asm/cpufeature.h>
> >  #include "compat.h"
> >  #include "ctree.h"
> >  #include "disk-io.h"
> > @@ -880,6 +881,22 @@ static int btree_submit_bio_hook(struct inode *inode, int rw, struct bio *bio,
> >  	}
> >
> >  	/*
> > +	 * Pretty sure I'm going to hell for this.  If our CPU can do crc32cs in
> > +	 * the hardware then there is no reason to do the csum stuff
> > +	 * asynchronously, it will be faster to do it inline, so test to see if
> > +	 * our CPU can do hardware crc32c and if it can just do the csum in our
> > +	 * threads context.
> > +	 */
> > +#ifdef CONFIG_X86
> > +	if (cpu_has_xmm4_2) {
> > +		printk(KERN_ERR "doing it the fast way\n");
>
> You'll probably go to hell for the printk...

;)

Testing with dd on my recent intel box, I can hardware crc32c at
1.3GB/s.  Anything beyond that and you really want more cpus jumping
into the mix.

I wanted to use this test for data crcs too, but I suppose the helpers
only really hurt for the synchronous IO.

-chris
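To put the 1.3GB/s figure in perspective, the following back-of-the-envelope sketch estimates how many CPUs would have to checksum in parallel to keep up with a given write rate. It treats 1.3GB/s as 1300 MB/s per CPU and assumes throughput scales linearly with CPU count, which real hardware will not quite deliver. The function name cpus_needed is hypothetical.

```c
#include <assert.h>

/*
 * Rough estimate of how many CPUs must checksum in parallel to sustain
 * target_mb_s, given a measured per-CPU rate. Assumes perfectly linear
 * scaling, so treat the result as a lower bound.
 */
static unsigned int cpus_needed(unsigned int target_mb_s,
				unsigned int per_cpu_mb_s)
{
	assert(per_cpu_mb_s > 0);

	/* Ceiling division: any remainder needs one more CPU. */
	return target_mb_s / per_cpu_mb_s + (target_mb_s % per_cpu_mb_s != 0);
}
```

Under these assumptions a device that can absorb 3000 MB/s would need three CPUs checksumming at once, while anything at or below 1300 MB/s fits on one, which is the case where doing the work inline avoids the handoff to helper threads.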
David Sterba
2012-Sep-24 21:03 UTC
Re: [PATCH] Btrfs: do not async metadata csums if we have hardware crc32c
On Mon, Sep 24, 2012 at 02:11:04PM -0400, Josef Bacik wrote:
> +#ifdef CONFIG_X86
> +	if (cpu_has_xmm4_2) {
> +		printk(KERN_ERR "doing it the fast way\n");
> +		ret = btree_csum_one_bio(bio);
> +		if (ret)
> +			return ret;
> +		return btrfs_map_bio(BTRFS_I(inode)->root, rw, bio, mirror_num, 0);
> +	}
> +#endif

Could you please put the check into a separate helper and avoid the
#ifdef? This is a second candidate for a standalone utils.c where non-fs
support code could reside. Or you can call it hellpers.c .

david
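The shape of the requested change can be sketched in userspace C: the architecture-specific test lives in one small helper, and the caller only sees a boolean. This is an illustration, not the eventual kernel code. The names cpu_has_hw_crc32c, pick_csum_path and csum_path_name are hypothetical, and the GCC/Clang builtin __builtin_cpu_supports("sse4.2") stands in for the kernel's cpu_has_xmm4_2 test.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: the only place that knows about the architecture.
 * On x86 with GCC or Clang it asks the CPU whether SSE4.2 (which
 * provides the crc32 instruction) is present; elsewhere it reports false.
 */
static bool cpu_has_hw_crc32c(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
	__builtin_cpu_init();
	return __builtin_cpu_supports("sse4.2") != 0;
#else
	return false;
#endif
}

enum csum_path { CSUM_INLINE, CSUM_ASYNC };

/* Caller-side decision, free of preprocessor conditionals. */
static enum csum_path pick_csum_path(bool hw_crc32c)
{
	return hw_crc32c ? CSUM_INLINE : CSUM_ASYNC;
}

/* Human-readable name, handy for debugging output. */
static const char *csum_path_name(enum csum_path path)
{
	assert(path == CSUM_INLINE || path == CSUM_ASYNC);
	return path == CSUM_INLINE ? "inline" : "async";
}
```

A caller would write pick_csum_path(cpu_has_hw_crc32c()) and branch on the result, so the submit path itself stays identical on every architecture.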
David Sterba
2012-Sep-25 10:51 UTC
Re: [PATCH] Btrfs: do not async metadata csums if we have hardware crc32c
On Mon, Sep 24, 2012 at 11:03:49PM +0200, David Sterba wrote:
> Could you please put the check into a separate helper

Please note that checksum will become a variable per-filesystem
property, stored within the superblock, so the helper should be passed a
fs_info pointer.

thanks,
david
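One way to read this suggestion, sketched with made-up types: once the checksum algorithm is a per-filesystem property, CPU support for crc32c only matters when the filesystem actually uses crc32c. Everything below (struct sketch_fs_info, enum sketch_csum_type, sketch_csum_inline_ok) is a hypothetical stand-in and not the real btrfs fs_info layout; the hardware capability is passed in as a plain boolean to keep the example self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical checksum identifiers; a real list would come from the
 * on-disk format definition. */
enum sketch_csum_type {
	SKETCH_CSUM_CRC32C,
	SKETCH_CSUM_OTHER,
};

/* Hypothetical stand-in for the per-filesystem state. */
struct sketch_fs_info {
	enum sketch_csum_type csum_type; /* as recorded in the superblock */
};

/*
 * Inline checksumming is only worthwhile when this filesystem uses
 * crc32c AND the CPU accelerates it; any other algorithm keeps using
 * the asynchronous helpers.
 */
static bool sketch_csum_inline_ok(const struct sketch_fs_info *fs_info,
				  bool hw_crc32c)
{
	assert(fs_info != NULL);
	return fs_info->csum_type == SKETCH_CSUM_CRC32C && hw_crc32c;
}
```

Taking the filesystem state as a parameter means a later checksum algorithm only needs a new case in the helper, with no change at the call sites.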
ching
2012-Sep-25 11:40 UTC
Re: [PATCH] Btrfs: do not async metadata csums if we have hardware crc32c
On 09/25/2012 06:51 PM, David Sterba wrote:
> On Mon, Sep 24, 2012 at 11:03:49PM +0200, David Sterba wrote:
>> Could you please put the check into a separate helper
> Please note that checksum will become a variable per-filesystem
> property, stored within the superblock, so the helper should be passed a
> fs_info pointer.
>
> thanks,
> david
>

How about enhancing the "*thread_pool=/number" mount option instead?

thread_pool=n enable threadpool/**/for compression and checksum, MAY improve bandwidth/*
*/thread_pool=0 disable threadpool for compression and checksum, MIGHT reduce latency
thread_pool=-1 or not provided automatically managed (current behaviour and default choice)

This should allow user to tradeoff between latency and bandwidth,
furthermore, you do not need to assume that btrfs may use crc32c
algorithm only forever.
/*
ching
2012-Sep-25 11:54 UTC
Re: [PATCH] Btrfs: do not async metadata csums if we have hardware crc32c
On 09/25/2012 06:51 PM, David Sterba wrote:
> On Mon, Sep 24, 2012 at 11:03:49PM +0200, David Sterba wrote:
>> Could you please put the check into a separate helper
> Please note that checksum will become a variable per-filesystem
> property, stored within the superblock, so the helper should be passed a
> fs_info pointer.
>
> thanks,
> david
>

How about enhancing the "thread_pool=number" mount option instead?

thread_pool=n                    enable threadpool for compression and checksum,
                                 MAY improve bandwidth
thread_pool=0                    disable threadpool for compression and checksum,
                                 MIGHT reduce latency
thread_pool=-1 or not provided   automatically managed (current behavior and
                                 default choice)

This should allow the user to trade off between latency and bandwidth.
Furthermore, you do not need to assume that btrfs will only ever use the
crc32c algorithm.
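The three proposed cases can be sketched as a small userspace parser. This only models the proposal in this message, not the behaviour of the existing mount option; the names pool_mode, pool_config and parse_thread_pool are invented for the example, and a NULL value stands for "option not provided".

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical modes matching the three cases in the proposal. */
enum pool_mode {
	POOL_AUTO,     /* thread_pool=-1 or option absent */
	POOL_DISABLED, /* thread_pool=0: do the work inline */
	POOL_FIXED,    /* thread_pool=n with n > 0 */
};

struct pool_config {
	enum pool_mode mode;
	int nr_threads; /* meaningful only for POOL_FIXED */
};

/*
 * Parse the value of a proposed thread_pool= option.
 * Returns 0 on success, -1 if the value is malformed or out of range.
 */
static int parse_thread_pool(const char *value, struct pool_config *out)
{
	char *end = NULL;
	long n;

	assert(out != NULL);

	if (value == NULL) {
		out->mode = POOL_AUTO;
		out->nr_threads = 0;
		return 0;
	}

	errno = 0;
	n = strtol(value, &end, 10);
	if (errno != 0 || end == value || *end != '\0' || n < -1 || n > INT_MAX)
		return -1;

	if (n == -1) {
		out->mode = POOL_AUTO;
		out->nr_threads = 0;
	} else if (n == 0) {
		out->mode = POOL_DISABLED;
		out->nr_threads = 0;
	} else {
		out->mode = POOL_FIXED;
		out->nr_threads = (int)n;
	}
	return 0;
}
```

Keeping "auto" as a distinct mode, rather than resolving it to a number at parse time, leaves room for the filesystem to pick inline or threaded checksumming per request, which is what the original patch does based on CPU features.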
David Sterba
2012-Sep-25 12:55 UTC
Re: [PATCH] Btrfs: do not async metadata csums if we have hardware crc32c
On Tue, Sep 25, 2012 at 07:40:17PM +0800, ching wrote:
> How about enhancing the "*thread_pool=/number" mount option instead?
>
> thread_pool=n enable threadpool/**/for compression and checksum, MAY improve bandwidth/*
> */thread_pool=0 disable threadpool for compression and checksum, MIGHT reduce latency
> thread_pool=-1 or not provided automatically managed (current behaviour and default choice)

Sorry, I don't understand the syntax, can you please write it more
clearly? Thanks.

> This should allow user to tradeoff between latency and bandwidth,
> furthermore, you do not need to assume that btrfs may use crc32c
> algorithm only forever.

Some sort of finer control over the threads makes sense. We should
distinguish between cpu-bound processing, where parallelism wins, and
io-bound processing, where it is not helpful to add more and more
threads that hammer a single device (namely in the case of an HDD, so
this should be tunable for devices with cheap seeks).

david
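The distinction drawn here could be expressed as a simple policy function. The sketch below is purely illustrative: the names work_kind and worker_limit are hypothetical, and the concrete limits (one worker per online CPU for cpu-bound work, a single submitter for io-bound work on a device with expensive seeks) are assumptions chosen to show the shape of the rule, not values taken from btrfs.

```c
#include <assert.h>
#include <stdbool.h>

enum work_kind {
	WORK_CPU_BOUND, /* e.g. checksumming, compression */
	WORK_IO_BOUND,  /* e.g. submitting bios to one device */
};

/*
 * Hypothetical policy: how many workers to allow for one kind of work.
 * cheap_seeks would be true for SSD-like devices and false for
 * rotating disks.
 */
static unsigned int worker_limit(enum work_kind kind,
				 unsigned int online_cpus,
				 bool cheap_seeks)
{
	assert(kind == WORK_CPU_BOUND || kind == WORK_IO_BOUND);

	if (online_cpus == 0)
		online_cpus = 1; /* always allow at least one worker */

	/* CPU-bound work benefits from spreading across all CPUs. */
	if (kind == WORK_CPU_BOUND)
		return online_cpus;

	/*
	 * IO-bound work: more threads only help when the device can
	 * serve scattered requests cheaply. The value 1 for seek-bound
	 * devices is an assumed placeholder, meant to be tunable.
	 */
	return cheap_seeks ? online_cpus : 1;
}
```

With these assumed limits, an 8-CPU machine would get eight checksum workers regardless of the device, but only one submitter for a rotating disk.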