David Hildenbrand
2014-Sep-17  12:00 UTC
blk-mq crash under KVM in multiqueue block code (with virtio-blk and ext4)
> >>> Does anyone have an idea? > >>> The request itself is completely filled with cc > >> > >> That is very weird, the 'rq' is got from hctx->tags, and rq should be > >> valid, and rq->q shouldn't have been changed even though it was > >> double free or double allocation. > >> > >>> I am currently asking myself if blk_mq_map_request should protect against softirq here but I cant say for sure,as I have never looked into that code before. > >> > >> No, it needn't the protection. > >> > >> Thanks, > >> > > > > -- > > To unsubscribe from this list: send the line "unsubscribe kvm" in > > the body of a message to majordomo at vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > >Digging through the code, I think I found a possible cause: tags->rqs[..] is not initialized with zeroes (via alloc_pages_node in blk-mq.c:blk_mq_init_rq_map()). When a request is created: 1. __blk_mq_alloc_request() gets a free tag (thus e.g. removing it from bitmap_tags) 2. __blk_mq_alloc_request() initializes is via blk_mq_rq_ctx_init(). The struct is filled with life and rq->q is set. When blk_mq_hw_ctx_check_timeout() is called: 1. blk_mq_tag_busy_iter() is used to call blk_mq_timeout_check() on all busy tags. 2. This is done by collecting all free tags using bt_for_each_free() and handing them to blk_mq_timeout_check(). This uses bitmap_tags. 3. blk_mq_timeout_check() calls blk_mq_tag_to_rq() to get the rq. Could we have a race between - getting the tag (turning it busy) and initializing it and - detecting a tag to be busy and trying to access it? I haven't looked at the details yet. If so, we might either do some locking (if there is existing infrastructure), or somehow mark a request as not being initialized prior to accessing the data.
Ming Lei
2014-Sep-17  13:52 UTC
blk-mq crash under KVM in multiqueue block code (with virtio-blk and ext4)
On Wed, 17 Sep 2014 14:00:34 +0200 David Hildenbrand <dahi at linux.vnet.ibm.com> wrote:> > >>> Does anyone have an idea? > > >>> The request itself is completely filled with cc > > >> > > >> That is very weird, the 'rq' is got from hctx->tags, and rq should be > > >> valid, and rq->q shouldn't have been changed even though it was > > >> double free or double allocation. > > >> > > >>> I am currently asking myself if blk_mq_map_request should protect against softirq here but I cant say for sure,as I have never looked into that code before. > > >> > > >> No, it needn't the protection. > > >> > > >> Thanks, > > >> > > > > > > -- > > > To unsubscribe from this list: send the line "unsubscribe kvm" in > > > the body of a message to majordomo at vger.kernel.org > > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > > > > Digging through the code, I think I found a possible cause: > > tags->rqs[..] is not initialized with zeroes (via alloc_pages_node in > blk-mq.c:blk_mq_init_rq_map()).Yes, it may cause problem when the request is allocated at the 1st time, and timeout handler may comes just after the allocation and before its initialization, then oops triggered because of garbage data in the request. --
David Hildenbrand
2014-Sep-17  14:11 UTC
blk-mq crash under KVM in multiqueue block code (with virtio-blk and ext4)
> On Wed, 17 Sep 2014 14:00:34 +0200 > David Hildenbrand <dahi at linux.vnet.ibm.com> wrote: > > > > >>> Does anyone have an idea? > > > >>> The request itself is completely filled with cc > > > >> > > > >> That is very weird, the 'rq' is got from hctx->tags, and rq should be > > > >> valid, and rq->q shouldn't have been changed even though it was > > > >> double free or double allocation. > > > >> > > > >>> I am currently asking myself if blk_mq_map_request should protect against softirq here but I cant say for sure,as I have never looked into that code before. > > > >> > > > >> No, it needn't the protection. > > > >> > > > >> Thanks, > > > >> > > > > > > > > -- > > > > To unsubscribe from this list: send the line "unsubscribe kvm" in > > > > the body of a message to majordomo at vger.kernel.org > > > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > > > > > > > > Digging through the code, I think I found a possible cause: > > > > tags->rqs[..] is not initialized with zeroes (via alloc_pages_node in > > blk-mq.c:blk_mq_init_rq_map()). > > Yes, it may cause problem when the request is allocated at the 1st time, > and timeout handler may comes just after the allocation and before its > initialization, then oops triggered because of garbage data in the request. > > -- > From ffd0824b7b686074c2d5d70bc4e6bba3ba56a30c Mon Sep 17 00:00:00 2001 > From: Ming Lei <ming.lei at canonical.com> > Date: Wed, 17 Sep 2014 21:00:34 +0800 > Subject: [PATCH] blk-mq: initialize request before the 1st allocation > > Otherwise the request can be accessed from timeout handler > just after its 1st allocation from tag pool and before > initialization in blk_mq_rq_ctx_init(), so cause oops since > the request is filled up with garbage data. > > Signed-off-by: Ming Lei <ming.lei at canonical.com> > --- > block/blk-mq.c | 10 ++++++++++ > 1 file changed, 10 insertions(+) > > diff --git a/block/blk-mq.c b/block/blk-mq.c > index 4aac826..d24673f 100644 > --- a/block/blk-mq.c > +++ b/block/blk-mq.c > @@ -514,6 +514,10 @@ struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags, unsigned int tag) > { > struct request *rq = tags->rqs[tag]; > > + /* uninitialized request */ > + if (!rq->q || rq->tag == -1) > + return rq; > + > if (!is_flush_request(rq, tag)) > return rq; > > @@ -1401,6 +1405,12 @@ static struct blk_mq_tags *blk_mq_init_rq_map(struct blk_mq_tag_set *set, > left -= to_do * rq_size; > for (j = 0; j < to_do; j++) { > tags->rqs[i] = p; > + > + /* Avoiding early access from timeout handler */ > + tags->rqs[i]->tag = -1; > + tags->rqs[i]->q = NULL; > + tags->rqs[i]->cmd_flags = 0;I was playing with a simple patch that just sets cmd_flags and action_flags to 0. That should already be sufficient to hinder blk_mq_tag_to_rq and the calling method to do the wrong thing. Will see the result tomorrow after testing.> + > if (set->ops->init_request) { > if (set->ops->init_request(set->driver_data, > tags->rqs[i], hctx_idx, i,
Jens Axboe
2014-Sep-17  14:22 UTC
blk-mq crash under KVM in multiqueue block code (with virtio-blk and ext4)
On 2014-09-17 07:52, Ming Lei wrote:> On Wed, 17 Sep 2014 14:00:34 +0200 > David Hildenbrand <dahi at linux.vnet.ibm.com> wrote: > >>>>>> Does anyone have an idea? >>>>>> The request itself is completely filled with cc >>>>> >>>>> That is very weird, the 'rq' is got from hctx->tags, and rq should be >>>>> valid, and rq->q shouldn't have been changed even though it was >>>>> double free or double allocation. >>>>> >>>>>> I am currently asking myself if blk_mq_map_request should protect against softirq here but I cant say for sure,as I have never looked into that code before. >>>>> >>>>> No, it needn't the protection. >>>>> >>>>> Thanks, >>>>> >>>> >>>> -- >>>> To unsubscribe from this list: send the line "unsubscribe kvm" in >>>> the body of a message to majordomo at vger.kernel.org >>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>>> >>> >> >> Digging through the code, I think I found a possible cause: >> >> tags->rqs[..] is not initialized with zeroes (via alloc_pages_node in >> blk-mq.c:blk_mq_init_rq_map()). > > Yes, it may cause problem when the request is allocated at the 1st time, > and timeout handler may comes just after the allocation and before its > initialization, then oops triggered because of garbage data in the request. > > -- > From ffd0824b7b686074c2d5d70bc4e6bba3ba56a30c Mon Sep 17 00:00:00 2001 > From: Ming Lei <ming.lei at canonical.com> > Date: Wed, 17 Sep 2014 21:00:34 +0800 > Subject: [PATCH] blk-mq: initialize request before the 1st allocation > > Otherwise the request can be accessed from timeout handler > just after its 1st allocation from tag pool and before > initialization in blk_mq_rq_ctx_init(), so cause oops since > the request is filled up with garbage data. > > Signed-off-by: Ming Lei <ming.lei at canonical.com> > --- > block/blk-mq.c | 10 ++++++++++ > 1 file changed, 10 insertions(+) > > diff --git a/block/blk-mq.c b/block/blk-mq.c > index 4aac826..d24673f 100644 > --- a/block/blk-mq.c > +++ b/block/blk-mq.c > @@ -514,6 +514,10 @@ struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags, unsigned int tag) > { > struct request *rq = tags->rqs[tag]; > > + /* uninitialized request */ > + if (!rq->q || rq->tag == -1) > + return rq; > + > if (!is_flush_request(rq, tag)) > return rq; > > @@ -1401,6 +1405,12 @@ static struct blk_mq_tags *blk_mq_init_rq_map(struct blk_mq_tag_set *set, > left -= to_do * rq_size; > for (j = 0; j < to_do; j++) { > tags->rqs[i] = p; > + > + /* Avoiding early access from timeout handler */ > + tags->rqs[i]->tag = -1; > + tags->rqs[i]->q = NULL; > + tags->rqs[i]->cmd_flags = 0; > + > if (set->ops->init_request) { > if (set->ops->init_request(set->driver_data, > tags->rqs[i], hctx_idx, i,Another way would be to ensure that the timeout handler doesn't touch hw_ctx or tag_sets that aren't fully initialized yet. But I think this is safer/cleaner. -- Jens Axboe
Maybe Matching Threads
- blk-mq crash under KVM in multiqueue block code (with virtio-blk and ext4)
- blk-mq crash under KVM in multiqueue block code (with virtio-blk and ext4)
- blk-mq crash under KVM in multiqueue block code (with virtio-blk and ext4)
- blk-mq crash under KVM in multiqueue block code (with virtio-blk and ext4)
- blk-mq crash under KVM in multiqueue block code (with virtio-blk and ext4)