Daniel Stodden
2010-Aug-23 06:54 UTC
[Xen-devel] Fix the occasional xen-blkfront deadlock, when irqbalancing.
Hi.

Please pull upstream/xen/blkfront from
git://xenbits.xensource.com/people/dstodden/linux.git

Cheers,
Daniel
Daniel Stodden
2010-Aug-23 06:54 UTC
[Xen-devel] [PATCH] blkfront: Move blkif_interrupt into a tasklet.
Response processing doesn't really belong into hard irq context.

Another potential problem this avoids is that switching interrupt cpu
affinity in Xen domains can presently lead to event loss, if
RING_FINAL_CHECK is run from hard irq context.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Cc: Tom Kopec <tek@acm.org>
---
 drivers/block/xen-blkfront.c |   16 ++++++++++++++--
 1 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 6c00538..75576d3 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -86,6 +86,7 @@ struct blkfront_info
 	struct blkif_front_ring ring;
 	struct scatterlist sg[BLKIF_MAX_SEGMENTS_PER_REQUEST];
 	unsigned int evtchn, irq;
+	struct tasklet_struct tasklet;
 	struct request_queue *rq;
 	struct work_struct work;
 	struct gnttab_free_callback callback;
@@ -676,13 +677,14 @@ static void blkif_completion(struct blk_shadow *s)
 		gnttab_end_foreign_access(s->req.seg[i].gref, 0, 0UL);
 }
 
-static irqreturn_t blkif_interrupt(int irq, void *dev_id)
+static void
+blkif_do_interrupt(unsigned long data)
 {
+	struct blkfront_info *info = (struct blkfront_info *)data;
 	struct request *req;
 	struct blkif_response *bret;
 	RING_IDX i, rp;
 	unsigned long flags;
-	struct blkfront_info *info = (struct blkfront_info *)dev_id;
 	int error;
 
 	spin_lock_irqsave(&info->io_lock, flags);
@@ -743,6 +745,15 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id)
 
 out:
 	spin_unlock_irqrestore(&info->io_lock, flags);
+}
+
+
+static irqreturn_t
+blkif_interrupt(int irq, void *dev_id)
+{
+	struct blkfront_info *info = (struct blkfront_info *)dev_id;
+
+	tasklet_schedule(&info->tasklet);
 
 	return IRQ_HANDLED;
 }
@@ -893,6 +904,7 @@ static int blkfront_probe(struct xenbus_device *dev,
 	info->connected = BLKIF_STATE_DISCONNECTED;
 	INIT_WORK(&info->work, blkif_restart_queue);
 	spin_lock_init(&info->io_lock);
+	tasklet_init(&info->tasklet, blkif_do_interrupt, (unsigned long)info);
 
 	for (i = 0; i < BLK_RING_SIZE; i++)
 		info->shadow[i].req.id = i+1;
-- 
1.7.0.4
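[For context, the hardirq-to-softirq deferral pattern the patch applies looks like this in isolation. This is a minimal sketch with invented names such as my_dev and my_do_responses, not the driver code itself:]

#include <linux/interrupt.h>
#include <linux/spinlock.h>

struct my_dev {
	spinlock_t lock;
	struct tasklet_struct tasklet;
	/* ... ring and queue state ... */
};

/* Bottom half: runs in softirq context, so it can take its time without
 * holding off further hardirqs on this CPU. */
static void my_do_responses(unsigned long data)
{
	struct my_dev *d = (struct my_dev *)data;
	unsigned long flags;

	spin_lock_irqsave(&d->lock, flags);
	/* drain the response ring, complete requests, restart the queue */
	spin_unlock_irqrestore(&d->lock, flags);
}

/* Hardirq handler: does nothing except schedule the bottom half. */
static irqreturn_t my_interrupt(int irq, void *dev_id)
{
	struct my_dev *d = dev_id;

	tasklet_schedule(&d->tasklet);
	return IRQ_HANDLED;
}

/* At probe time:
 *	tasklet_init(&d->tasklet, my_do_responses, (unsigned long)d);
 */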
Daniel Stodden
2010-Aug-23 07:01 UTC
[Xen-devel] Re: [PATCH] blkfront: Move blkif_interrupt into a tasklet.
This is upstream/xen/blkfront at
git://xenbits.xensource.com/people/dstodden/linux.git

Daniel

On Mon, 2010-08-23 at 02:54 -0400, Daniel Stodden wrote:
> Response processing doesn't really belong into hard irq context.
>
> Another potential problem this avoids is that switching interrupt cpu
> affinity in Xen domains can presently lead to event loss, if
> RING_FINAL_CHECK is run from hard irq context.
>
> Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
> Cc: Tom Kopec <tek@acm.org>
> [...]
Jeremy Fitzhardinge
2010-Aug-23 21:09 UTC
[Xen-devel] Re: Fix the occasional xen-blkfront deadlock, when irqbalancing.
On 08/22/2010 11:54 PM, Daniel Stodden wrote:
> Please pull upstream/xen/blkfront from
> git://xenbits.xensource.com/people/dstodden/linux.git

I think this change is probably worthwhile on its own merits, but it
just papers over the irqbalancing problem. I'd like to make sure that's
nailed down before pulling this patch.

Thanks,
J
Jeremy Fitzhardinge
2010-Sep-02 22:46 UTC
Re: [Xen-devel] [PATCH] blkfront: Move blkif_interrupt into a tasklet.
On 08/22/2010 11:54 PM, Daniel Stodden wrote:
> Response processing doesn't really belong into hard irq context.
>
> Another potential problem this avoids is that switching interrupt cpu
> affinity in Xen domains can presently lead to event loss, if
> RING_FINAL_CHECK is run from hard irq context.

I just got this warning from a 32-bit pv domain. I think it may relate
to this change. The warning is

void blk_start_queue(struct request_queue *q)
{
	WARN_ON(!irqs_disabled());

Oddly, I only saw this pair once at boot, and after that the system
seemed fine...

[    4.376451] ------------[ cut here ]------------
[    4.377415] WARNING: at /home/jeremy/git/linux/block/blk-core.c:337 blk_start_queue+0x20/0x36()
[    4.377415] Modules linked in: xfs exportfs xen_blkfront [last unloaded: scsi_wait_scan]
[    4.377415] Pid: 0, comm: swapper Not tainted 2.6.32.21 #32
[    4.377415] Call Trace:
[    4.377415]  [<c1039f74>] warn_slowpath_common+0x65/0x7c
[    4.377415]  [<c11b3ae1>] ? blk_start_queue+0x20/0x36
[    4.377415]  [<c1039f98>] warn_slowpath_null+0xd/0x10
[    4.377415]  [<c11b3ae1>] blk_start_queue+0x20/0x36
[    4.377415]  [<edc74712>] kick_pending_request_queues+0x1c/0x2a [xen_blkfront]
[    4.377415]  [<edc74ec4>] blkif_do_interrupt+0x176/0x189 [xen_blkfront]
[    4.377415]  [<c103e063>] tasklet_action+0x63/0xa8
[    4.377415]  [<c103f2d5>] __do_softirq+0xac/0x152
[    4.377415]  [<c103f3ac>] do_softirq+0x31/0x3c
[    4.377415]  [<c103f484>] irq_exit+0x29/0x5c
[    4.377415]  [<c121a1b6>] xen_evtchn_do_upcall+0x29/0x34
[    4.377415]  [<c100a027>] xen_do_upcall+0x7/0xc
[    4.377415]  [<c10023a7>] ? hypercall_page+0x3a7/0x1005
[    4.377415]  [<c10065a9>] ? xen_safe_halt+0x12/0x1f
[    4.377415]  [<c10042cb>] xen_idle+0x27/0x38
[    4.377415]  [<c100877e>] cpu_idle+0x49/0x63
[    4.377415]  [<c14a6427>] rest_init+0x53/0x55
[    4.377415]  [<c179c814>] start_kernel+0x2d4/0x2d9
[    4.377415]  [<c179c0a8>] i386_start_kernel+0x97/0x9e
[    4.377415]  [<c179f478>] xen_start_kernel+0x576/0x57e
[    4.377415] ---[ end trace 0bfb98f0ed515cdb ]---
[    4.377415] ------------[ cut here ]------------
[    4.377415] WARNING: at /home/jeremy/git/linux/block/blk-core.c:245 blk_remove_plug+0x20/0x7e()
[    4.377415] Modules linked in: xfs exportfs xen_blkfront [last unloaded: scsi_wait_scan]
[    4.377415] Pid: 0, comm: swapper Tainted: G W 2.6.32.21 #32
[    4.377415] Call Trace:
[    4.377415]  [<c1039f74>] warn_slowpath_common+0x65/0x7c
[    4.377415]  [<c11b3961>] ? blk_remove_plug+0x20/0x7e
[    4.377415]  [<c1039f98>] warn_slowpath_null+0xd/0x10
[    4.377415]  [<c11b3961>] blk_remove_plug+0x20/0x7e
[    4.377415]  [<c11b39ca>] __blk_run_queue+0xb/0x5e
[    4.377415]  [<c11b3af4>] blk_start_queue+0x33/0x36
[    4.377415]  [<edc74712>] kick_pending_request_queues+0x1c/0x2a [xen_blkfront]
[    4.377415]  [<edc74ec4>] blkif_do_interrupt+0x176/0x189 [xen_blkfront]
[    4.377415]  [<c103e063>] tasklet_action+0x63/0xa8
[    4.377415]  [<c103f2d5>] __do_softirq+0xac/0x152
[    4.377415]  [<c103f3ac>] do_softirq+0x31/0x3c
[    4.377415]  [<c103f484>] irq_exit+0x29/0x5c
[    4.377415]  [<c121a1b6>] xen_evtchn_do_upcall+0x29/0x34
[    4.377415]  [<c100a027>] xen_do_upcall+0x7/0xc
[    4.377415]  [<c10023a7>] ? hypercall_page+0x3a7/0x1005
[    4.377415]  [<c10065a9>] ? xen_safe_halt+0x12/0x1f
[    4.377415]  [<c10042cb>] xen_idle+0x27/0x38
[    4.377415]  [<c100877e>] cpu_idle+0x49/0x63
[    4.377415]  [<c14a6427>] rest_init+0x53/0x55
[    4.377415]  [<c179c814>] start_kernel+0x2d4/0x2d9
[    4.377415]  [<c179c0a8>] i386_start_kernel+0x97/0x9e
[    4.377415]  [<c179f478>] xen_start_kernel+0x576/0x57e
[    4.377415] ---[ end trace 0bfb98f0ed515cdc ]---

J
Daniel Stodden
2010-Sep-02 23:08 UTC
Re: [Xen-devel] [PATCH] blkfront: Move blkif_interrupt into a tasklet.
On Thu, 2010-09-02 at 18:46 -0400, Jeremy Fitzhardinge wrote:
> On 08/22/2010 11:54 PM, Daniel Stodden wrote:
> > Response processing doesn't really belong into hard irq context.
> >
> > Another potential problem this avoids is that switching interrupt cpu
> > affinity in Xen domains can presently lead to event loss, if
> > RING_FINAL_CHECK is run from hard irq context.
>
> I just got this warning from a 32-bit pv domain. I think it may relate
> to this change. The warning is

We clearly spin_lock_irqsave all through the blkif_do_interrupt frame.

It follows that something underneath quite unconditionally chose to
reenable them again (?)

Either: Can you add a bunch of similar WARN_ONs along that path?

Or: This lock is quite coarse-grained. The lock only matters for queue
access, and we know irqs are reenabled, so no need for flags. In fact we
only need to spin_lock_irq around the __blk_end_ calls and
kick_pending_.

But I don't immediately see what's to blame, so I'd be curious.

Daniel

> void blk_start_queue(struct request_queue *q)
> {
> 	WARN_ON(!irqs_disabled());
>
> Oddly, I only saw this pair once at boot, and after that the system
> seemed fine...
>
> [...]
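[Roughly, the lock narrowing Daniel suggests would look like the sketch below — untested, and only illustrative. Since the tasklet always runs in softirq context with irqs enabled, the flags-saving lock variant can be dropped:]

static void blkif_do_interrupt(unsigned long data)
{
	struct blkfront_info *info = (struct blkfront_info *)data;

	/* softirq context: irqs are known to be enabled on entry,
	 * so plain _irq locking suffices and no flags are carried */
	spin_lock_irq(&info->io_lock);
	/* ... consume responses, __blk_end_request_all() per request ... */
	kick_pending_request_queues(info);
	spin_unlock_irq(&info->io_lock);
}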
On 09/03/2010 09:08 AM, Daniel Stodden wrote:
> On Thu, 2010-09-02 at 18:46 -0400, Jeremy Fitzhardinge wrote:
>> On 08/22/2010 11:54 PM, Daniel Stodden wrote:
>>> Response processing doesn't really belong into hard irq context.
>>>
>>> Another potential problem this avoids is that switching interrupt cpu
>>> affinity in Xen domains can presently lead to event loss, if
>>> RING_FINAL_CHECK is run from hard irq context.
>> I just got this warning from a 32-bit pv domain. I think it may relate
>> to this change. The warning is
> We clearly spin_lock_irqsave all through the blkif_do_interrupt frame.
>
> It follows that something underneath quite unconditionally chose to
> reenable them again (?)
>
> Either: Can you add a bunch of similar WARN_ONs along that path?
>
> Or: This lock is quite coarse-grained. The lock only matters for queue
> access, and we know irqs are reenabled, so no need for flags. In fact we
> only need to spin_lock_irq around the __blk_end_ calls and
> kick_pending_.
>
> But I don't immediately see what's to blame, so I'd be curious.

I haven't got around to investigating this in more detail yet, but
there's also this long-standing lockdep hiccup in blktap:

Starting auto Xen domains: lurch
alloc irq_desc for 1235 on node 0
alloc kstat_irqs on node 0
block tda: sector-size: 512 capacity: 614400
INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
Pid: 4266, comm: tapdisk2 Not tainted 2.6.32.21 #146
Call Trace:
 [<ffffffff8107f0a4>] __lock_acquire+0x1df/0x16e5
 [<ffffffff8100f955>] ? xen_force_evtchn_callback+0xd/0xf
 [<ffffffff81010082>] ? check_events+0x12/0x20
 [<ffffffff810f0359>] ? apply_to_page_range+0x295/0x37d
 [<ffffffff81080677>] lock_acquire+0xcd/0xf1
 [<ffffffff810f0359>] ? apply_to_page_range+0x295/0x37d
 [<ffffffff810f0259>] ? apply_to_page_range+0x195/0x37d
 [<ffffffff81506f7d>] _spin_lock+0x31/0x40
 [<ffffffff810f0359>] ? apply_to_page_range+0x295/0x37d
 [<ffffffff810f0359>] apply_to_page_range+0x295/0x37d
 [<ffffffff812ab37c>] ? blktap_map_uaddr_fn+0x0/0x55
 [<ffffffff8100d0cf>] ? xen_make_pte+0x8a/0x8e
 [<ffffffff812ac34e>] blktap_device_process_request+0x43d/0x954
 [<ffffffff8100f955>] ? xen_force_evtchn_callback+0xd/0xf
 [<ffffffff81010082>] ? check_events+0x12/0x20
 [<ffffffff8100f955>] ? xen_force_evtchn_callback+0xd/0xf
 [<ffffffff81010082>] ? check_events+0x12/0x20
 [<ffffffff8107d687>] ? mark_held_locks+0x52/0x70
 [<ffffffff81506ddb>] ? _spin_unlock_irq+0x30/0x3c
 [<ffffffff8107d949>] ? trace_hardirqs_on_caller+0x125/0x150
 [<ffffffff812acba6>] blktap_device_run_queue+0x1c5/0x28f
 [<ffffffff812a0234>] ? unbind_from_irq+0x18/0x198
 [<ffffffff81010082>] ? check_events+0x12/0x20
 [<ffffffff812ab14d>] blktap_ring_poll+0x7c/0xc7
 [<ffffffff81124e9b>] do_select+0x387/0x584
 [<ffffffff81124b14>] ? do_select+0x0/0x584
 [<ffffffff811255de>] ? __pollwait+0x0/0xcc
 [<ffffffff811256aa>] ? pollwake+0x0/0x56
 [<ffffffff811256aa>] ? pollwake+0x0/0x56
 [<ffffffff811256aa>] ? pollwake+0x0/0x56
 [<ffffffff811256aa>] ? pollwake+0x0/0x56
 [<ffffffff8108059b>] ? __lock_acquire+0x16d6/0x16e5
 [<ffffffff8100f955>] ? xen_force_evtchn_callback+0xd/0xf
 [<ffffffff81010082>] ? check_events+0x12/0x20
 [<ffffffff8100f955>] ? xen_force_evtchn_callback+0xd/0xf
 [<ffffffff81010082>] ? check_events+0x12/0x20
 [<ffffffff8100f955>] ? xen_force_evtchn_callback+0xd/0xf
 [<ffffffff811252a4>] core_sys_select+0x20c/0x2da
 [<ffffffff811250d6>] ? core_sys_select+0x3e/0x2da
 [<ffffffff81010082>] ? check_events+0x12/0x20
 [<ffffffff8101006f>] ? xen_restore_fl_direct_end+0x0/0x1
 [<ffffffff81108661>] ? kmem_cache_free+0x18e/0x1c8
 [<ffffffff8141e912>] ? sock_destroy_inode+0x19/0x1b
 [<ffffffff811299bd>] ? destroy_inode+0x2f/0x44
 [<ffffffff8102ef22>] ? pvclock_clocksource_read+0x4b/0xa2
 [<ffffffff8100fe8b>] ? xen_clocksource_read+0x21/0x23
 [<ffffffff81010003>] ? xen_clocksource_get_cycles+0x9/0x16
 [<ffffffff81075700>] ? ktime_get_ts+0xb2/0xbf
 [<ffffffff811255b6>] sys_select+0x96/0xbe
 [<ffffffff81013d32>] system_call_fastpath+0x16/0x1b
block tdb: sector-size: 512 capacity: 20971520
block tdc: sector-size: 512 capacity: 146800640
block tdd: sector-size: 512 capacity: 188743680

J
On Mon, 2010-09-06 at 21:39 -0400, Jeremy Fitzhardinge wrote:
> On 09/03/2010 09:08 AM, Daniel Stodden wrote:
> > On Thu, 2010-09-02 at 18:46 -0400, Jeremy Fitzhardinge wrote:
> >> I just got this warning from a 32-bit pv domain. I think it may relate
> >> to this change. The warning is
> > We clearly spin_lock_irqsave all through the blkif_do_interrupt frame.
> >
> > It follows that something underneath quite unconditionally chose to
> > reenable them again (?)
> >
> > Either: Can you add a bunch of similar WARN_ONs along that path?
> >
> > Or: This lock is quite coarse-grained. The lock only matters for queue
> > access, and we know irqs are reenabled, so no need for flags. In fact we
> > only need to spin_lock_irq around the __blk_end_ calls and
> > kick_pending_.
> >
> > But I don't immediately see what's to blame, so I'd be curious.
>
> I haven't got around to investigating this in more detail yet, but
> there's also this long-standing lockdep hiccup in blktap:

Ack. Let's fix that somewhere this week and see if we can clean up the
spin locking problem too then.

Daniel

> [...]
Jeremy Fitzhardinge
2010-Sep-08 02:03 UTC
Re: [Xen-devel] [PATCH] blkfront: Move blkif_interrupt into a tasklet.
On 09/03/2010 09:08 AM, Daniel Stodden wrote:
> We clearly spin_lock_irqsave all through the blkif_do_interrupt frame.
>
> It follows that something underneath quite unconditionally chose to
> reenable them again (?)
>
> Either: Can you add a bunch of similar WARN_ONs along that path?
>
> Or: This lock is quite coarse-grained. The lock only matters for queue
> access, and we know irqs are reenabled, so no need for flags. In fact we
> only need to spin_lock_irq around the __blk_end_ calls and
> kick_pending_.
>
> But I don't immediately see what's to blame, so I'd be curious.

It looks like __blk_end_request_all(req, error); (line 743) is
returning with interrupts enabled sometimes (not consistently). I had a
quick look through the code, but I couldn't see where it touches the
interrupt state at all.

J
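[The kind of WARN_ON instrumentation Daniel asked for might bracket the suspect call like this — placement and exact checks are illustrative, not an actual patch:]

	spin_lock_irqsave(&info->io_lock, flags);
	/* ... */
	WARN_ON(!irqs_disabled());		/* expected: still disabled */
	__blk_end_request_all(req, error);	/* the call at line 743 */
	WARN_ON(!irqs_disabled());		/* fires if the call re-enabled irqs */
	/* ... */
	spin_unlock_irqrestore(&info->io_lock, flags);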
Daniel Stodden
2010-Sep-08 02:21 UTC
Re: [Xen-devel] [PATCH] blkfront: Move blkif_interrupt into a tasklet.
On Tue, 2010-09-07 at 22:03 -0400, Jeremy Fitzhardinge wrote:
> On 09/03/2010 09:08 AM, Daniel Stodden wrote:
> > We clearly spin_lock_irqsave all through the blkif_do_interrupt frame.
> >
> > It follows that something underneath quite unconditionally chose to
> > reenable them again (?)
> >
> > Either: Can you add a bunch of similar WARN_ONs along that path?
> >
> > Or: This lock is quite coarse-grained. The lock only matters for queue
> > access, and we know irqs are reenabled, so no need for flags. In fact we
> > only need to spin_lock_irq around the __blk_end_ calls and
> > kick_pending_.
> >
> > But I don't immediately see what's to blame, so I'd be curious.
>
> It looks like __blk_end_request_all(req, error); (line 743) is
> returning with interrupts enabled sometimes (not consistently). I had a
> quick look through the code, but I couldn't see where it touches the
> interrupt state at all.

Oha. Was this found on 2.6.32 or later?

Thanks,
Daniel
Jeremy Fitzhardinge
2010-Sep-08 06:37 UTC
Re: [Xen-devel] [PATCH] blkfront: Move blkif_interrupt into a tasklet.
On 09/08/2010 12:21 PM, Daniel Stodden wrote:
> On Tue, 2010-09-07 at 22:03 -0400, Jeremy Fitzhardinge wrote:
>> On 09/03/2010 09:08 AM, Daniel Stodden wrote:
>>> We clearly spin_lock_irqsave all through the blkif_do_interrupt frame.
>>>
>>> It follows that something underneath quite unconditionally chose to
>>> reenable them again (?)
>>>
>>> Either: Can you add a bunch of similar WARN_ONs along that path?
>>>
>>> Or: This lock is quite coarse-grained. The lock only matters for queue
>>> access, and we know irqs are reenabled, so no need for flags. In fact we
>>> only need to spin_lock_irq around the __blk_end_ calls and
>>> kick_pending_.
>>>
>>> But I don't immediately see what's to blame, so I'd be curious.
>> It looks like __blk_end_request_all(req, error); (line 743) is
>> returning with interrupts enabled sometimes (not consistently). I had a
>> quick look through the code, but I couldn't see where it touches the
>> interrupt state at all.
> Oha. Was this found on 2.6.32 or later?

Yeah, xen/next-2.6.32.

J
Andrew Jones
2010-Sep-23 16:08 UTC
Re: [Xen-devel] [PATCH] blkfront: Move blkif_interrupt into a tasklet.
On 09/03/2010 12:46 AM, Jeremy Fitzhardinge wrote:
> On 08/22/2010 11:54 PM, Daniel Stodden wrote:
>> Response processing doesn't really belong into hard irq context.
>>
>> Another potential problem this avoids is that switching interrupt cpu
>> affinity in Xen domains can presently lead to event loss, if
>> RING_FINAL_CHECK is run from hard irq context.
>
> I just got this warning from a 32-bit pv domain. I think it may relate
> to this change. The warning is
>
> void blk_start_queue(struct request_queue *q)
> {
> 	WARN_ON(!irqs_disabled());
>
> Oddly, I only saw this pair once at boot, and after that the system
> seemed fine...
>
> [...]

Hi Jeremy,

Any developments with this? I've got a report of the exact same warnings
on a RHEL6 guest. See

https://bugzilla.redhat.com/show_bug.cgi?id=632802

RHEL6 doesn't have the 'Move blkif_interrupt into a tasklet' patch, so
that can be ruled out. Unfortunately I don't have this reproducing on a
test machine, so it's difficult to debug. The report I have showed that
in at least one case it occurred on boot up, right after initting the
block device. I'm trying to get confirmation if that's always the case.

Thanks in advance for any pointers you might have.

Drew
Jeremy Fitzhardinge
2010-Sep-23 16:23 UTC
Re: [Xen-devel] [PATCH] blkfront: Move blkif_interrupt into a tasklet.
On 09/23/2010 09:08 AM, Andrew Jones wrote:
> On 09/03/2010 12:46 AM, Jeremy Fitzhardinge wrote:
>> I just got this warning from a 32-bit pv domain. I think it may relate
>> to this change. The warning is
>>
>> void blk_start_queue(struct request_queue *q)
>> {
>> 	WARN_ON(!irqs_disabled());
>>
>> Oddly, I only saw this pair once at boot, and after that the system
>> seemed fine...
>>
>> [...]
>
> Hi Jeremy,
>
> Any developments with this? I've got a report of the exact same warnings
> on a RHEL6 guest. See
>
> https://bugzilla.redhat.com/show_bug.cgi?id=632802
>
> RHEL6 doesn't have the 'Move blkif_interrupt into a tasklet' patch, so
> that can be ruled out. Unfortunately I don't have this reproducing on a
> test machine, so it's difficult to debug. The report I have showed that
> in at least one case it occurred on boot up, right after initting the
> block device. I'm trying to get confirmation if that's always the case.
>
> Thanks in advance for any pointers you might have.

Yes, I see it even after reverting that change as well. However I only
see it on my domain with an XFS filesystem, but I haven't dug any deeper
to see if that's relevant.

Do you know when this appeared? Is it recent? What changes are in the
rhel6 kernel in question?

J
Paolo Bonzini
2010-Sep-23 16:38 UTC
[Xen-devel] Re: [PATCH] blkfront: Move blkif_interrupt into a tasklet.
On 09/23/2010 06:23 PM, Jeremy Fitzhardinge wrote:
>> Any developments with this? I've got a report of the exact same warnings
>> on a RHEL6 guest. See
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=632802
>>
>> RHEL6 doesn't have the 'Move blkif_interrupt into a tasklet' patch, so
>> that can be ruled out. Unfortunately I don't have this reproducing on a
>> test machine, so it's difficult to debug. The report I have showed that
>> in at least one case it occurred on boot up, right after initting the
>> block device. I'm trying to get confirmation if that's always the case.
>>
>> Thanks in advance for any pointers you might have.
>
> Yes, I see it even after reverting that change as well. However I only
> see it on my domain with an XFS filesystem, but I haven't dug any deeper
> to see if that's relevant.
>
> Do you know when this appeared? Is it recent? What changes are in the
> rhel6 kernel in question?

It's got pretty much everything in stable-2.6.32.x, up to the 16 patch
blkfront series you posted last July. There are some RHEL-specific
workarounds for PV-on-HVM, but for PV domains everything matches
upstream.

Paolo
Jeremy Fitzhardinge
2010-Sep-23 18:36 UTC
[Xen-devel] Re: [PATCH] blkfront: Move blkif_interrupt into a tasklet.
On 09/23/2010 09:38 AM, Paolo Bonzini wrote:
> On 09/23/2010 06:23 PM, Jeremy Fitzhardinge wrote:
>>> Any developments with this? I've got a report of the exact same
>>> warnings on a RHEL6 guest. See
>>>
>>> https://bugzilla.redhat.com/show_bug.cgi?id=632802
>>>
>>> RHEL6 doesn't have the 'Move blkif_interrupt into a tasklet' patch, so
>>> that can be ruled out. Unfortunately I don't have this reproducing on a
>>> test machine, so it's difficult to debug. The report I have showed that
>>> in at least one case it occurred on boot up, right after initting the
>>> block device. I'm trying to get confirmation if that's always the case.
>>>
>>> Thanks in advance for any pointers you might have.
>>
>> Yes, I see it even after reverting that change as well. However I only
>> see it on my domain with an XFS filesystem, but I haven't dug any deeper
>> to see if that's relevant.
>>
>> Do you know when this appeared? Is it recent? What changes are in the
>> rhel6 kernel in question?
>
> It's got pretty much everything in stable-2.6.32.x, up to the 16 patch
> blkfront series you posted last July. There are some RHEL-specific
> workarounds for PV-on-HVM, but for PV domains everything matches
> upstream.

Have you tried bisecting to see when this particular problem appeared?
It looks to me like something is accidentally re-enabling interrupts -
perhaps a stack overrun is corrupting the "flags" argument between a
spin_lock_irqsave()/restore pair.

Is it only on 32-bit kernels?

J
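[Schematically, the failure mode Jeremy describes: flags is an on-stack variable, so a callee that overruns its stack frame can rewrite it between save and restore. A sketch with an invented some_deep_call_chain(), not real driver code:]

	unsigned long flags;			/* lives on the caller's stack */

	spin_lock_irqsave(&lock, flags);	/* EFLAGS saved; IF now clear */
	some_deep_call_chain();			/* hypothetical: if this scribbles
						 * over 'flags' with a value whose
						 * IF bit is set... */
	spin_unlock_irqrestore(&lock, flags);	/* ...interrupts come back enabled
						 * even though they were off before */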
Andrew Jones
2010-Sep-24 07:14 UTC
Re: [Xen-devel] Re: [PATCH] blkfront: Move blkif_interrupt into a tasklet.
On 09/23/2010 08:36 PM, Jeremy Fitzhardinge wrote:
> On 09/23/2010 09:38 AM, Paolo Bonzini wrote:
>> [...]
>> It's got pretty much everything in stable-2.6.32.x, up to the 16 patch
>> blkfront series you posted last July. There are some RHEL-specific
>> workarounds for PV-on-HVM, but for PV domains everything matches
>> upstream.
>
> Have you tried bisecting to see when this particular problem appeared?
> It looks to me like something is accidentally re-enabling interrupts -
> perhaps a stack overrun is corrupting the "flags" argument between a
> spin_lock_irqsave()/restore pair.

Unfortunately I don't have a test machine where I can do a bisection
(yet). I'm looking for one. I only have this one report so far, and it's
on a production machine.

> Is it only on 32-bit kernels?

This one report I have is a 32b guest on a 64b host.

Drew
Jeremy Fitzhardinge
2010-Sep-24 18:50 UTC
Re: [Xen-devel] Re: [PATCH] blkfront: Move blkif_interrupt into a tasklet.
On 09/24/2010 12:14 AM, Andrew Jones wrote:
> On 09/23/2010 08:36 PM, Jeremy Fitzhardinge wrote:
>> [...]
>> Have you tried bisecting to see when this particular problem appeared?
>> It looks to me like something is accidentally re-enabling interrupts -
>> perhaps a stack overrun is corrupting the "flags" argument between a
>> spin_lock_irqsave()/restore pair.
>
> Unfortunately I don't have a test machine where I can do a bisection
> (yet). I'm looking for one. I only have this one report so far, and it's
> on a production machine.

The report says that it's repeatedly killing the machine though? In my
testing, it seems to hit the warning once at boot, but is OK after that
(not that I'm doing anything very stressful on the domain).

>> Is it only on 32-bit kernels?
>
> This one report I have is a 32b guest on a 64b host.

Is it using XFS by any chance? So far I've traced the re-enable to
xfs_buf_bio_end_io(). However, my suspicion is that it might be related
to the barrier changes we did.

J
Andrew Jones
2010-Sep-27 07:41 UTC
Re: [Xen-devel] Re: [PATCH] blkfront: Move blkif_interrupt into a tasklet.
On 09/24/2010 08:50 PM, Jeremy Fitzhardinge wrote:
> On 09/24/2010 12:14 AM, Andrew Jones wrote:
>> [...]
>> Unfortunately I don't have a test machine where I can do a bisection
>> (yet). I'm looking for one. I only have this one report so far, and it's
>> on a production machine.
>
> The report says that it's repeatedly killing the machine though? In my
> testing, it seems to hit the warning once at boot, but is OK after that
> (not that I'm doing anything very stressful on the domain).

It looks like the crash is from failing to read swap due to a bad page
map. It's possibly another issue, but I wanted to try and clean this
issue up first to see what happens.

>>> Is it only on 32-bit kernels?
>>
>> This one report I have is a 32b guest on a 64b host.
>
> Is it using XFS by any chance? So far I've traced the re-enable to
> xfs_buf_bio_end_io(). However, my suspicion is that it might be related
> to the barrier changes we did.

I'll check on the xfs and let you know.
Daniel Stodden
2010-Sep-27 09:46 UTC
Re: [Xen-devel] Re: [PATCH] blkfront: Move blkif_interrupt into a tasklet.
On Mon, 2010-09-27 at 03:41 -0400, Andrew Jones wrote:
> On 09/24/2010 08:50 PM, Jeremy Fitzhardinge wrote:
> > [...]
> > The report says that it's repeatedly killing the machine though? In my
> > testing, it seems to hit the warning once at boot, but is OK after that
> > (not that I'm doing anything very stressful on the domain).
>
> It looks like the crash is from failing to read swap due to a bad page
> map. It's possibly another issue, but I wanted to try and clean this
> issue up first to see what happens.

Uh oh. Sure this was a frontend crash? If you see it again, a stack
trace to look at would be great.

Thanks,
Daniel
Andrew Jones
2010-Sep-27 10:21 UTC
Re: [Xen-devel] Re: [PATCH] blkfront: Move blkif_interrupt into a tasklet.
On 09/27/2010 11:46 AM, Daniel Stodden wrote:
> On Mon, 2010-09-27 at 03:41 -0400, Andrew Jones wrote:
>> [...]
>> It looks like the crash is from failing to read swap due to a bad page
>> map. It's possibly another issue, but I wanted to try and clean this
>> issue up first to see what happens.
>
> Uh oh. Sure this was a frontend crash? If you see it again, a stack
> trace to look at would be great.

Hi Daniel,

You can take a look at this bug:

https://bugzilla.redhat.com/show_bug.cgi?id=632802

There are stacks for the swap issue in the comments, and also this
attached dmesg:

https://bugzilla.redhat.com/attachment.cgi?id=447789

Thanks,
Drew
imammedo
2011-Aug-16 11:26 UTC
[Xen-devel] Re: [PATCH] blkfront: Move blkif_interrupt into a tasklet.
Jeremy Fitzhardinge wrote:
> Have you tried bisecting to see when this particular problem appeared?
> It looks to me like something is accidentally re-enabling interrupts -
> perhaps a stack overrun is corrupting the "flags" argument between a
> spin_lock_irqsave()/restore pair.
>
> Is it only on 32-bit kernels?

------------[ cut here ]------------
[604001.659925] WARNING: at block/blk-core.c:239 blk_start_queue+0x70/0x80()
[604001.659964] Modules linked in: nfs lockd fscache auth_rpcgss nfs_acl sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables xen_netfront pcspkr [last unloaded: scsi_wait_scan]
[604001.660147] Pid: 336, comm: udevd Tainted: G W 3.0.0+ #50
[604001.660181] Call Trace:
[604001.660209]  [<c045c512>] warn_slowpath_common+0x72/0xa0
[604001.660243]  [<c06643a0>] ? blk_start_queue+0x70/0x80
[604001.660275]  [<c06643a0>] ? blk_start_queue+0x70/0x80
[604001.660310]  [<c045c562>] warn_slowpath_null+0x22/0x30
[604001.660343]  [<c06643a0>] blk_start_queue+0x70/0x80
[604001.660379]  [<c075e231>] kick_pending_request_queues+0x21/0x30
[604001.660417]  [<c075e42f>] blkif_interrupt+0x19f/0x2b0
...
------------[ cut here ]------------

I've debugged the blk-core warning a bit and can say:
- Yes, it is a 32-bit PAE kernel, and so far it happens only there.
- It affects PV Xen guests; bare-metal and KVM configs are not affected.
- The upstream kernel is affected as well.
- It reproduces on Xen 4.1.1 and 3.1.2 hosts.

The IF flag is always restored at drivers/md/dm.c

static void clone_endio(struct bio *bio, int error)
...
	dm_endio_fn endio = tio->ti->type->end_io;
...

when a page fault happens while accessing the tio->ti->type field.

After a successful resync with the kernel's pagetable in
do_page_fault->vmalloc_fault, io continues happily on, however with the
IF flag restored even if the faulted context's eflags register had no IF
flag set. It happens with a random task every time.

Here is the ftrace call graph showing the problematic place:
=======================================================
# tracer: function_graph
#
# function_graph latency trace v1.1.5 on 3.0.0+
# --------------------------------------------------------------------
# latency: 0 us, #42330/242738181, CPU#0 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:1)
#    -----------------
#    | task: -0 (uid:0 nice:0 policy:0 rt_prio:0)
#    -----------------
#
#      _-----=> irqs-off
#     / _----=> need-resched
#    | / _---=> hardirq/softirq
#    || / _--=> preempt-depth
#    ||| /
#  CPU||||  DURATION                  FUNCTION CALLS
#  |  ||||   |   |                     |   |   |   |
 0)  d...              |  xen_evtchn_do_upcall() {
 0)  d...              |    irq_enter() {
 0)  d.h.    2.880 us  |    }
 0)  d.h.              |    __xen_evtchn_do_upcall() {
 0)  d.h.    0.099 us  |      irq_to_desc();
 0)  d.h.              |      handle_edge_irq() {
 0)  d.h.    0.107 us  |        _raw_spin_lock();
 0)  d.h.              |        ack_dynirq() {
 0)  d.h.    3.153 us  |        }
 0)  d.h.              |        handle_irq_event() {
 0)  d.h.              |          handle_irq_event_percpu() {
 0)  d.h.              |            blkif_interrupt() {
 0)  d.h.    0.110 us  |              _raw_spin_lock_irqsave();
 0)  d.h.              |              __blk_end_request_all() {
 0)  d.h.              |                blk_update_bidi_request() {
 0)  d.h.              |                  blk_update_request() {
 0)  d.h.              |                    req_bio_endio() {
 0)  d.h.              |                      bio_endio() {
 0)  d.h.              |                        endio() {
 0)  d.h.              |                          bio_put() {
 0)  d.h.    4.149 us  |                          }
 0)  d.h.              |                          dec_count() {
 0)  d.h.              |                            mempool_free() {
 0)  d.h.    1.395 us  |                            }
 0)  d.h.              |                            read_callback() {
 0)  d.h.              |                              bio_endio() {
 0)  d.h.              |                                clone_endio() {
 0)  d.h.              |                                  /* ==> enter clone_endio: tio: c1e14c70 */
 0)  d.h.    0.104 us  |                                  arch_irqs_disabled_flags();
 0)  d.h.              |                                  /* ==> clone_endio: endio = tio->ti->type->end_io: tio->ti c918c040 */
 0)  d.h.    0.100 us  |                                  arch_irqs_disabled_flags();
 0)  d.h.    0.117 us  |                                  mirror_end_io();
 0)  d.h.              |                                  free_tio() {
 0)  d.h.    2.269 us  |                                  }
 0)  d.h.              |                                  bio_put() {
 0)  d.h.    3.933 us  |                                  }
 0)  d.h.              |                                  dec_pending() {
 0)  d.h.    0.100 us  |                                    atomic_dec_and_test();
 0)  d.h.              |                                    end_io_acct() {
 0)  d.h.    5.655 us  |                                    }
 0)  d.h.              |                                    free_io() {
 0)  d.h.    1.992 us  |                                    }
 0)  d.h.    0.098 us  |                                    trace_block_bio_complete();
 0)  d.h.              |                                    bio_endio() {
 0)  d.h.              |                                      clone_endio() {
 0)  d.h.              |                                        /* ==> enter clone_endio: tio: c1e14ee0 */
 0)  d.h.    0.098 us  |                                        arch_irqs_disabled_flags();
 0)  d.h.              |                                        do_page_fault() {
 0)  d.h.    0.103 us  |                                          xen_read_cr2();
 0)  d.h.              |                                          /* dpf: tsk: c785a6a0 mm: 0 comm: kworker/0:0 */
 0)  d.h.              |                                          /* before vmalloc_fault (c9552044) regs: c786db1c ip: c082bb20 eflags: 10002 err: 0 irq: off */
                                                                                                                                           ^^^ - fault error code
 0)  d.h.              |                                          vmalloc_fault() {
 0)  d.h.    0.104 us  |                                            xen_read_cr3();
 0)  d.h.              |                                            xen_pgd_val();
 0)  d.h.              |                                            xen_pgd_val();
 0)  d.h.              |                                            xen_set_pmd();
 0)  d.h.              |                                            xen_pmd_val();
 0)  d.h.  + 14.599 us |                                          }
 0)  d.h.  + 18.019 us |                                        }
                                                                  v -- irq enabled
 0)  ..h.              |                                        /* ==> clone_endio: endio = tio->ti->type->end_io: tio->ti c9552040 */
 0)  ..h.    0.102 us  |                                        arch_irqs_disabled_flags();
 0)  ..h.              |                                        /* <7>clone_endio BUG DETECTED irq */
=======================================

So the IF flag is restored right after exiting from do_page_fault().
Any thoughts why it might happen?

PS: Full logs, an additional trace patch, the kernel config, and a way to
reproduce the bug can be found at
https://bugzilla.redhat.com/show_bug.cgi?id=707552
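[The "==>" annotations in the graph appear to come from debug instrumentation along these lines. This is a reconstruction under assumptions — trace_printk-style markers and an irqs_disabled() snapshot — not the actual patch, which is attached to the referenced bug:]

static void clone_endio(struct bio *bio, int error)
{
	struct dm_target_io *tio = bio->bi_private;
	bool was_disabled = irqs_disabled();	/* snapshot on entry */
	dm_endio_fn endio;

	trace_printk("==> enter clone_endio: tio: %p\n", tio);

	/* This dereference can vmalloc-fault on a not-yet-synced mapping;
	 * the graph shows IF set again once do_page_fault()/vmalloc_fault()
	 * returns. */
	endio = tio->ti->type->end_io;

	trace_printk("==> clone_endio: endio = tio->ti->type->end_io: "
		     "tio->ti %p\n", tio->ti);
	if (was_disabled && !irqs_disabled())
		trace_printk("clone_endio BUG DETECTED irq\n");
	/* ... original clone_endio body continues ... */
}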
Konrad Rzeszutek Wilk
2011-Aug-16 14:57 UTC
[Xen-devel] Re: [PATCH] blkfront: Move blkif_interrupt into a tasklet.
On Tue, Aug 16, 2011 at 04:26:54AM -0700, imammedo [via Xen] wrote:
>
> Jeremy Fitzhardinge wrote:
> >
> > Have you tried bisecting to see when this particular problem appeared?
> > It looks to me like something is accidentally re-enabling interrupts -
> > perhaps a stack overrun is corrupting the "flags" argument between a
> > spin_lock_irqsave()/restore pair.
> >
> > Is it only on 32-bit kernels?
> >

Any specific reason you did not include xen-devel in this email? I am
CC-ing it here.

> ------------[ cut here ]------------
> [604001.659925] WARNING: at block/blk-core.c:239 blk_start_queue+0x70/0x80()
> [...]
> [604001.660343] [<c06643a0>] blk_start_queue+0x70/0x80
> [604001.660379] [<c075e231>] kick_pending_request_queues+0x21/0x30
> [604001.660417] [<c075e42f>] blkif_interrupt+0x19f/0x2b0
> ...
> ------------[ cut here ]------------
>
> I've debugged the blk-core warning a bit and can say:
>  - Yes, it is a 32-bit PAE kernel and it happens only there so far.
>  - Affects PV xen guests; bare-metal and kvm configs are not affected.
>  - The upstream kernel is affected as well.
>  - Reproduces on xen 4.1.1 and 3.1.2 hosts.
>
> [ftrace call graph and analysis snipped -- see the message above]
>
> So the IF flag is restored right after exiting from do_page_fault().
>
> Any thoughts why it might happen?
>
> PS:
> Full logs, an additional trace patch, the kernel config and a way to
> reproduce the bug can be found at
> https://bugzilla.redhat.com/show_bug.cgi?id=707552

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Aug-17 02:38 UTC
Re: [Xen-devel] Re: [PATCH] blkfront: Move blkif_interrupt into a tasklet.
On Tue, Aug 16, 2011 at 04:26:55AM -0700, imammedo wrote:
>
> Jeremy Fitzhardinge wrote:
> >
> > Have you tried bisecting to see when this particular problem appeared?
> > It looks to me like something is accidentally re-enabling interrupts -
> > perhaps a stack overrun is corrupting the "flags" argument between a
> > spin_lock_irqsave()/restore pair.
> >
> > Is it only on 32-bit kernels?
> >
> [warning and ftrace analysis snipped]
>
> I've debugged the blk-core warning a bit and can say:
>  - Yes, it is a 32-bit PAE kernel and it happens only there so far.
>  - Affects PV xen guests; bare-metal and kvm configs are not affected.
>  - The upstream kernel is affected as well.
>  - Reproduces on xen 4.1.1 and 3.1.2 hosts.

And the dom0 is 2.6.18 right? This problem is not present
when you use a 3.0 dom0?

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Paolo Bonzini
2011-Aug-17 07:30 UTC
[Xen-devel] Re: [PATCH] blkfront: Move blkif_interrupt into a tasklet.
On 08/16/2011 07:38 PM, Konrad Rzeszutek Wilk wrote:
>> I've debugged the blk-core warning a bit and can say:
>>  - Yes, it is a 32-bit PAE kernel and it happens only there so far.
>>  - Affects PV xen guests; bare-metal and kvm configs are not affected.
>>  - The upstream kernel is affected as well.
>>  - Reproduces on xen 4.1.1 and 3.1.2 hosts.
>
> And the dom0 is 2.6.18 right? This problem is not present
> when you use a 3.0 dom0?

Never say never, but why should the dom0 version matter if the problem
occurs at an IRET?

Igor, perhaps you can try printing the saved_upcall_mask field of the
VCPU context in your traces.

Paolo

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
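[If it helps: the usual way to peek at that field from the hypervisor side
is a one-off debug print in the page-fault path. The line below is a
hypothetical sketch -- gdprintk() and guest_cpu_user_regs() are standard
Xen helpers, but the exact placement inside do_page_fault() is a guess:]

	/* xen/arch/x86/traps.c, somewhere in do_page_fault() -- sketch only */
	gdprintk(XENLOG_DEBUG, "saved_upcall_mask=%u\n",
	         (unsigned int)guest_cpu_user_regs()->saved_upcall_mask);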
Igor Mammedov
2011-Aug-17 09:07 UTC
Re: [Xen-devel] Re: [PATCH] blkfront: Move blkif_interrupt into a tasklet.
On 08/17/2011 04:38 AM, Konrad Rzeszutek Wilk wrote:
> On Tue, Aug 16, 2011 at 04:26:55AM -0700, imammedo wrote:
>> [warning and debug notes snipped]
>>
>> I've debugged the blk-core warning a bit and can say:
>>  - Yes, it is a 32-bit PAE kernel and it happens only there so far.
>>  - Affects PV xen guests; bare-metal and kvm configs are not affected.
>>  - The upstream kernel is affected as well.
>>  - Reproduces on xen 4.1.1 and 3.1.2 hosts.
>
> And the dom0 is 2.6.18 right? This problem is not present
> when you use a 3.0 dom0?

For the xen 4.1.1 testing, I've used Jeremy's 2.6.32.43 as dom0.

--
Thanks,
Igor

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Aug-24 15:36 UTC
Re: [Xen-devel] Re: [PATCH] blkfront: Move blkif_interrupt into a tasklet.
On Wed, Aug 17, 2011 at 11:07:19AM +0200, Igor Mammedov wrote:
> On 08/17/2011 04:38 AM, Konrad Rzeszutek Wilk wrote:
>> [snipped]
>>
>> And the dom0 is 2.6.18 right? This problem is not present
>> when you use a 3.0 dom0?
>
> For the xen 4.1.1 testing, I've used Jeremy's 2.6.32.43 as dom0.

Jeremy pointed me to this:
https://patchwork.kernel.org/patch/1091772/
(and http://groups.google.com/group/linux.kernel/browse_thread/thread/39a397566cafc979)
which looks to have a similar backtrace.

Perhaps Peter's fix solves the issue?

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Igor Mammedov
2011-Aug-24 16:36 UTC
Re: [Xen-devel] Re: [PATCH] blkfront: Move blkif_interrupt into a tasklet.
On 08/24/2011 05:36 PM, Konrad Rzeszutek Wilk wrote:
> [snipped]
>
> Jeremy pointed me to this:
> https://patchwork.kernel.org/patch/1091772/
> (and http://groups.google.com/group/linux.kernel/browse_thread/thread/39a397566cafc979)
> which looks to have a similar backtrace.
>
> Perhaps Peter's fix solves the issue?

I've applied these patches:
 sched-separate-the-scheduler-entry-for-preemption.patch
 sched-move-blk_schedule_flush_plug-out-of-__schedule.patch
 block-shorten-interrupt-disabled-regions.patch

Unfortunately they don't help; the problem is still there.

--
Thanks,
Igor

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Aug-29 19:46 UTC
Re: [Xen-devel] Re: [PATCH] blkfront: Move blkif_interrupt into a tasklet.
On Wed, Aug 24, 2011 at 06:36:58PM +0200, Igor Mammedov wrote:
> On 08/24/2011 05:36 PM, Konrad Rzeszutek Wilk wrote:
>> [snipped]
>>
>> Perhaps Peter's fix solves the issue?
>
> I've applied these patches:
>  sched-separate-the-scheduler-entry-for-preemption.patch
>  sched-move-blk_schedule_flush_plug-out-of-__schedule.patch
>  block-shorten-interrupt-disabled-regions.patch
>
> Unfortunately they don't help; the problem is still there.

Those patches were a bit fresh. Both Peter and Thomas have some updated
ones:

http://git.kernel.org/?p=linux/kernel/git/x86/linux-2.6-tip.git;a=commitdiff;h=9c40cef2b799f9b5e7fa5de4d2ad3a0168ba118c
http://git.kernel.org/?p=linux/kernel/git/x86/linux-2.6-tip.git;a=commitdiff;h=c259e01a1ec90063042f758e409cd26b2a0963c8

Please try those out.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2011-Aug-31 22:37 UTC
[Xen-devel] Re: [PATCH] xen: x86_32: do not enable interrupts when returning from exception in interrupt context
On 08/31/2011 04:47 PM, Igor Mammedov wrote:
> If a vmalloc page fault happens inside an interrupt handler with
> interrupts disabled, then on the exit path from the exception handler,
> when there are no pending interrupts, the following code
> (arch/x86/xen/xen-asm_32.S:112):
>
> 	cmpw $0x0001, XEN_vcpu_info_pending(%eax)
> 	sete XEN_vcpu_info_mask(%eax)
>
> will enable interrupts even if they have been previously disabled,
> according to the eflags from the bounce frame
> (arch/x86/xen/xen-asm_32.S:99):
>
> 	testb $X86_EFLAGS_IF>>8, 8+1+ESP_OFFSET(%esp)
> 	setz XEN_vcpu_info_mask(%eax)
>
> The solution is to set XEN_vcpu_info_mask only when it should be set,
> according to
> 	cmpw $0x0001, XEN_vcpu_info_pending(%eax)
> but not to clear it if there aren't any pending events.

Wow, that's a great find. I guess it shows how rarely we end up doing
an exception return with interrupts disabled, since that's been there
since, erm, 2.6.23?

But this could definitely explain some bugs where interrupts became
unexpectedly re-enabled. Were you tracking one down when you found this?

> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> ---
>  arch/x86/xen/xen-asm_32.S |    6 +++++-
>  1 files changed, 5 insertions(+), 1 deletions(-)
>
> diff --git a/arch/x86/xen/xen-asm_32.S b/arch/x86/xen/xen-asm_32.S
> index 22a2093..313dca7 100644
> --- a/arch/x86/xen/xen-asm_32.S
> +++ b/arch/x86/xen/xen-asm_32.S
> @@ -113,10 +113,14 @@ xen_iret_start_crit:
>
>  	/*
>  	 * If there's something pending, mask events again so we can
> -	 * jump back into xen_hypervisor_callback
> +	 * jump back into xen_hypervisor_callback. Otherwise do not
> +	 * touch XEN_vcpu_info_mask.
>  	 */
> +	jne ignore_vcpu_info_mask
>  	sete XEN_vcpu_info_mask(%eax)
>
> +ignore_vcpu_info_mask:
> +

This should be:

	jne 1f
	movb $1, XEN_vcpu_info_mask(%eax)

1:	popl %eax

There's no point in using sete if we're already using a conditional jump
to avoid the write, and it's better to use local labels for little
control flow changes like this.

Thanks,
	J

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Igor Mammedov
2011-Aug-31 23:47 UTC
[Xen-devel] [PATCH] xen: x86_32: do not enable interrupts when returning from exception in interrupt context
If a vmalloc page fault happens inside an interrupt handler with interrupts
disabled, then on the exit path from the exception handler, when there are
no pending interrupts, the following code (arch/x86/xen/xen-asm_32.S:112):

	cmpw $0x0001, XEN_vcpu_info_pending(%eax)
	sete XEN_vcpu_info_mask(%eax)

will enable interrupts even if they have been previously disabled, according
to the eflags from the bounce frame (arch/x86/xen/xen-asm_32.S:99):

	testb $X86_EFLAGS_IF>>8, 8+1+ESP_OFFSET(%esp)
	setz XEN_vcpu_info_mask(%eax)

The solution is to set XEN_vcpu_info_mask only when it should be set,
according to
	cmpw $0x0001, XEN_vcpu_info_pending(%eax)
but not to clear it if there aren't any pending events.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
 arch/x86/xen/xen-asm_32.S |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/arch/x86/xen/xen-asm_32.S b/arch/x86/xen/xen-asm_32.S
index 22a2093..313dca7 100644
--- a/arch/x86/xen/xen-asm_32.S
+++ b/arch/x86/xen/xen-asm_32.S
@@ -113,10 +113,14 @@ xen_iret_start_crit:
 
 	/*
 	 * If there's something pending, mask events again so we can
-	 * jump back into xen_hypervisor_callback
+	 * jump back into xen_hypervisor_callback. Otherwise do not
+	 * touch XEN_vcpu_info_mask.
 	 */
+	jne ignore_vcpu_info_mask
 	sete XEN_vcpu_info_mask(%eax)
 
+ignore_vcpu_info_mask:
+
 	popl %eax
 
 	/*
-- 
1.7.1

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
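[For readers who don't speak AT&T assembly, the bug and the fix look
roughly like this in C-equivalent terms. This is a sketch of the semantics
only -- the real code is the assembly above; evtchn_upcall_pending and
evtchn_upcall_mask are the vcpu_info fields behind the XEN_vcpu_info_*
asm offsets:]

	/* old exit path: mask := (pending == 1) */
	vcpu->evtchn_upcall_mask = (vcpu->evtchn_upcall_pending == 1);
	/* when nothing is pending this writes 0, unmasking events even
	 * though the bounce frame said the interrupted context had IF
	 * cleared -- that is the spurious re-enable */

	/* fixed exit path: only ever *set* the mask here */
	if (vcpu->evtchn_upcall_pending == 1)
		vcpu->evtchn_upcall_mask = 1;	/* mask, re-enter callback */
	/* otherwise leave the mask exactly as the earlier
	 * X86_EFLAGS_IF test (xen-asm_32.S:99) computed it */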
Igor Mammedov
2011-Sep-01 08:19 UTC
[Xen-devel] Re: [PATCH] xen: x86_32: do not enable interrupts when returning from exception in interrupt context
On 09/01/2011 12:37 AM, Jeremy Fitzhardinge wrote:
> On 08/31/2011 04:47 PM, Igor Mammedov wrote:
>> [commit message and patch snipped]
>
> Wow, that's a great find. I guess it shows how rarely we end up doing
> an exception return with interrupts disabled, since that's been there
> since, erm, 2.6.23?
>
> But this could definitely explain some bugs where interrupts became
> unexpectedly re-enabled. Were you tracking one down when you found this?
>
> This should be:
>
> 	jne 1f
> 	movb $1, XEN_vcpu_info_mask(%eax)
>
> 1:	popl %eax
>
> There's no point in using sete if we're already using a conditional jump
> to avoid the write, and it's better to use local labels for little
> control flow changes like this.
>
> Thanks,
> 	J

Jeremy,

Thanks for the review, I'll re-post it soon.

-------
Thanks,
Igor

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Igor Mammedov
2011-Sep-01 11:46 UTC
[Xen-devel] [PATCH v2] xen: x86_32: do not enable interrupts when returning from exception in interrupt context
If a vmalloc page fault happens inside an interrupt handler with interrupts
disabled, then on the exit path from the exception handler, when there are
no pending interrupts, the following code (arch/x86/xen/xen-asm_32.S:112):

	cmpw $0x0001, XEN_vcpu_info_pending(%eax)
	sete XEN_vcpu_info_mask(%eax)

will enable interrupts even if they have been previously disabled, according
to the eflags from the bounce frame (arch/x86/xen/xen-asm_32.S:99):

	testb $X86_EFLAGS_IF>>8, 8+1+ESP_OFFSET(%esp)
	setz XEN_vcpu_info_mask(%eax)

The solution is to set XEN_vcpu_info_mask only when it should be set,
according to
	cmpw $0x0001, XEN_vcpu_info_pending(%eax)
but not to clear it if there aren't any pending events.

A reproducer for the bug is attached to RHBZ 707552.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
---
 arch/x86/xen/xen-asm_32.S |    8 +++++---
 1 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/x86/xen/xen-asm_32.S b/arch/x86/xen/xen-asm_32.S
index 22a2093..b040b0e 100644
--- a/arch/x86/xen/xen-asm_32.S
+++ b/arch/x86/xen/xen-asm_32.S
@@ -113,11 +113,13 @@ xen_iret_start_crit:
 
 	/*
 	 * If there's something pending, mask events again so we can
-	 * jump back into xen_hypervisor_callback
+	 * jump back into xen_hypervisor_callback. Otherwise do not
+	 * touch XEN_vcpu_info_mask.
 	 */
-	sete XEN_vcpu_info_mask(%eax)
+	jne 1f
+	movb $1, XEN_vcpu_info_mask(%eax)
 
-	popl %eax
+1:	popl %eax
 
 	/*
 	 * From this point on the registers are restored and the stack
-- 
1.7.1

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Sep-01 15:45 UTC
[Xen-devel] Re: [PATCH v2] xen: x86_32: do not enable interrupts when returning from exception in interrupt context
On Thu, Sep 01, 2011 at 01:46:55PM +0200, Igor Mammedov wrote:
> [commit message and patch snipped]
>
> A reproducer for the bug is attached to RHBZ 707552.
>
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>

Stuck it in the queue for 3.1 and stable. Thanks for finding this one!

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2011-Sep-01 16:46 UTC
[Xen-devel] Re: [PATCH v2] xen: x86_32: do not enable interrupts when returning from exception in interrupt context
On 09/01/2011 04:46 AM, Igor Mammedov wrote:
> [commit message and patch snipped]
>
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>

One nit, this should be acked-by or reviewed-by, not signed-off-by,
since the patch isn't passing through my hands.

	J

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Igor Mammedov
2011-Sep-02 08:18 UTC
[Xen-devel] Re: [PATCH v2] xen: x86_32: do not enable interrupts when returning from exception in interrupt context
On 09/01/2011 06:46 PM, Jeremy Fitzhardinge wrote:
> On 09/01/2011 04:46 AM, Igor Mammedov wrote:
>> [commit message and patch snipped]
>>
>> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
>> Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
>
> One nit, this should be acked-by or reviewed-by, not signed-off-by,
> since the patch isn't passing through my hands.
>
> 	J

I'm new to this stuff, would you like me to re-post it?
Next time I will do it right.

-------
Thanks,
Igor

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Igor Mammedov
2011-Sep-02 09:19 UTC
Re: [Xen-devel] [PATCH v2] xen: x86_32: do not enable interrupts when returning from exception in interrupt context
BTW, while debugging this issue, I've tried to print saved_upcall_mask
inside xen when handling a page fault from the guest. And its value is
always 0. Looking at the asm code, for example in
xen/arch/x86/x86_32/entry.S:382:

        movzwl TRAPBOUNCE_cs(%edx),%eax
                 ^^^^^ upper 16 bits are 0, set in propagate_page_fault
                       by "tb->cs = ti->cs;"
        /* Null selectors (0-3) are not allowed. */
        testl  $~3,%eax
        jz     domain_crash_synchronous
        movl   %eax,UREGS_cs+4(%esp)
                    ^^^^^^^^^^^^^^^^ and here is the only place we set the
                    saved_upcall_mask field in the current cpu_user_regs

It looks like "saved_upcall_mask" in cpu_user_regs is not really used
any more for what it means, and its presence in the struct is just
confusing and misleading.

So why not delete it and extend _pad1 to 4 bytes?

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
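[For context: the field sits next to cs in the public struct
cpu_user_regs, so a 32-bit store to the cs slot -- like the movl above --
also writes it. Roughly, from xen/include/public/arch-x86/xen-x86_32.h;
quoted from memory, so treat the exact layout as an assumption:]

	uint16_t cs;
	uint8_t  saved_upcall_mask;
	uint8_t  _pad0;
	uint32_t eflags;	/* eflags.IF == !saved_upcall_mask */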
Keir Fraser
2011-Sep-02 10:00 UTC
Re: [Xen-devel] [PATCH v2] xen: x86_32: do not enable interrupts when returning from exception in interrupt context
On 02/09/2011 10:19, "Igor Mammedov" <imammedo@redhat.com> wrote:

> BTW, while debugging this issue, I've tried to print saved_upcall_mask
> inside xen when handling a page fault from the guest. And its value is
> always 0.
>
> [asm analysis snipped]
>
> It looks like "saved_upcall_mask" in cpu_user_regs is not really used
> any more for what it means, and its presence in the struct is just
> confusing and misleading.
>
> So why not delete it and extend _pad1 to 4 bytes?

It's part of the guest ABI.

 -- Keir

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Sep-02 13:40 UTC
Re: [Xen-devel] Re: [PATCH v2] xen: x86_32: do not enable interrupts when returning from exception in interrupt context
On Fri, Sep 02, 2011 at 10:18:23AM +0200, Igor Mammedov wrote:
> On 09/01/2011 06:46 PM, Jeremy Fitzhardinge wrote:
>> One nit, this should be acked-by or reviewed-by, not signed-off-by,
>> since the patch isn't passing through my hands.
>
> I'm new to this stuff, would you like me to re-post it?

That is OK. I fixed it up in the git commit. Thanks for finding this one!

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Igor Mammedov
2011-Sep-02 14:01 UTC
Re: [Xen-devel] Re: [PATCH v2] xen: x86_32: do not enable interrupts when returning from exception in interrupt context
On 09/02/2011 03:40 PM, Konrad Rzeszutek Wilk wrote:
> On Fri, Sep 02, 2011 at 10:18:23AM +0200, Igor Mammedov wrote:
>> I'm new to this stuff, would you like me to re-post it?
>
> That is OK. I fixed it up in the git commit. Thanks for finding this one!

You're welcome!
I've learned a lot while debugging it. In particular, how to use kvm and
qemu's gdbstub to debug xen and the guest using gdb.

--
Thanks,
Igor

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
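[The rough shape of that setup, for anyone curious -- the paths and the
multiboot module syntax below are illustrative assumptions, not the exact
commands used in the debugging session:]

	# run xen + dom0 nested under kvm, with qemu's gdbstub listening
	# on tcp::1234 and the machine frozen at startup (-s -S)
	qemu-system-x86_64 -enable-kvm -m 2048 -s -S \
	        -kernel xen.gz \
	        -initrd "vmlinuz-dom0 console=hvc0,initrd-dom0.img"

	# in another terminal, attach gdb with the hypervisor's symbols
	gdb xen-syms
	(gdb) target remote :1234
	(gdb) hbreak do_page_fault
	(gdb) continue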
Konrad Rzeszutek Wilk
2011-Sep-02 14:47 UTC
Re: [Xen-devel] Re: [PATCH v2] xen: x86_32: do not enable interrupts when returning from exception in interrupt context
On Fri, Sep 02, 2011 at 04:01:35PM +0200, Igor Mammedov wrote:
> You're welcome!
> I've learned a lot while debugging it. In particular, how to use kvm and
> qemu's gdbstub to debug xen and the guest using gdb.

Oh, would you be interested in writing a blog article on xen.org by any
chance? That sounds pretty nifty!

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Igor Mammedov
2011-Sep-06 09:16 UTC
Re: [Xen-devel] Re: [PATCH v2] xen: x86_32: do not enable interrupts when returning from exception in interrupt context
On 09/02/2011 04:47 PM, Konrad Rzeszutek Wilk wrote:
> On Fri, Sep 02, 2011 at 04:01:35PM +0200, Igor Mammedov wrote:
>> I've learned a lot while debugging it. In particular, how to use kvm and
>> qemu's gdbstub to debug xen and the guest using gdb.
>
> Oh, would you be interested in writing a blog article on xen.org by any
> chance? That sounds pretty nifty!

Sure, that is on my todo list. I'll gather the separate bits of
information together on how to prepare a host for debugging, plus a
description of the problems and limitations of this approach that I've
stumbled on.

--
Thanks,
Igor

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel