Sander Eikelenboom
2010-Jul-25 15:35 UTC
[Xen-devel] xen pci passthrough hung task instead of terminate
Hi Konrad,

I have tried both your trees, together with some experimental usb3 stuff.

It seems to work, apart from some usb3 problems after several hours of videograbbing. In the end these crash the program, but instead of terminating it keeps hanging.
Since xen_evtchn is on the trace stack, I'm wondering if any Xen parts are causing it to hang instead of terminate.

--
Sander

Jul 25 16:54:26 security kernel: [26400.136170] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 25 16:54:26 security kernel: [26400.136191] motion D ffffffff810049f9 0 1556 1 0x00000000
Jul 25 16:54:26 security kernel: [26400.136220] ffff88001fce6800 0000000000000286 0000000000000001 0000000000014580
Jul 25 16:54:26 security kernel: [26400.136254] ffff88001e251fd8 ffff88001e251fd8 ffff88001e088100 0000000000014580
Jul 25 16:54:26 security kernel: [26400.136285] 0000000000014580 0000000000014580 ffff88001e088100 0000000000000001
Jul 25 16:54:26 security kernel: [26400.136316] Call Trace:
Jul 25 16:54:26 security kernel: [26400.136346] [<ffffffff8142c33c>] ? __mutex_lock_slowpath+0xda/0x125
Jul 25 16:54:26 security kernel: [26400.136374] [<ffffffff8142c1e1>] ? mutex_lock+0x12/0x28
Jul 25 16:54:26 security kernel: [26400.136399] [<ffffffffa0015ea5>] ? videobuf_streamoff+0x13/0x34 [videobuf_core]
Jul 25 16:54:26 security kernel: [26400.136424] [<ffffffff81005cc5>] ? xen_force_evtchn_callback+0x9/0xa
Jul 25 16:54:26 security kernel: [26400.136449] [<ffffffffa008b5a8>] ? vidioc_streamoff+0x7e/0xb5 [em28xx]
Jul 25 16:54:26 security kernel: [26400.136473] [<ffffffffa00355fe>] ? __video_do_ioctl+0x181f/0x3cc7 [videodev]
Jul 25 16:54:26 security kernel: [26400.136496] [<ffffffff8100631f>] ? xen_restore_fl_direct_end+0x0/0x1
Jul 25 16:54:26 security kernel: [26400.136517] [<ffffffff8142d2a4>] ? _raw_spin_unlock_irqrestore+0xc/0xd
Jul 25 16:54:26 security kernel: [26400.136539] [<ffffffff81393cda>] ? sock_def_readable+0x3b/0x5d
Jul 25 16:54:26 security kernel: [26400.136561] [<ffffffff81404296>] ? unix_dgram_sendmsg+0x428/0x4b2
Jul 25 16:54:26 security kernel: [26400.136580] [<ffffffff810058fa>] ? xen_set_pte_at+0x196/0x1b6
Jul 25 16:54:26 security kernel: [26400.136600] [<ffffffff810036bd>] ? __raw_callee_save_xen_make_pte+0x11/0x1e
Jul 25 16:54:26 security kernel: [26400.136620] [<ffffffff81390c1e>] ? sock_sendmsg+0xd1/0xec
Jul 25 16:54:26 security kernel: [26400.136641] [<ffffffff810b117c>] ? __do_fault+0x3eb/0x426
Jul 25 16:54:26 security kernel: [26400.136662] [<ffffffffa0037d38>] ? video_ioctl2+0x292/0x32e [videodev]
Jul 25 16:54:26 security kernel: [26400.136684] [<ffffffff8139271a>] ? sys_sendto+0x10d/0x127
Jul 25 16:54:26 security kernel: [26400.136702] [<ffffffff81006332>] ? check_events+0x12/0x20
Jul 25 16:54:26 security kernel: [26400.136722] [<ffffffffa003310b>] ? v4l2_ioctl+0x38/0x3a [videodev]
Jul 25 16:54:26 security kernel: [26400.136742] [<ffffffff810d45be>] ? vfs_ioctl+0x69/0x92
Jul 25 16:54:26 security kernel: [26400.136760] [<ffffffff810d4a6e>] ? do_vfs_ioctl+0x411/0x43c
Jul 25 16:54:26 security kernel: [26400.136779] [<ffffffff810c874c>] ? vfs_write+0x134/0x169
Jul 25 16:54:26 security kernel: [26400.136797] [<ffffffff810d4aea>] ? sys_ioctl+0x51/0x70
Jul 25 16:54:26 security kernel: [26400.136815] [<ffffffff810086c2>] ? system_call_fastpath+0x16/0x1b

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Jul-26 15:53 UTC
Re: [Xen-devel] xen pci passthrough hung task instead of terminate
On Sun, Jul 25, 2010 at 05:35:07PM +0200, Sander Eikelenboom wrote:
> Hi Konrad,
>
> I have tried both your trees, together with some experimental usb3 stuff.

How many CPUs do you have assigned to your guest?

I presume this problem does not appear under baremetal? Though, looking at the stack, I would think it would too: it does not look Xen-specific, just that a mutex is deadlocked.

> It seems to work apart from some usb3 problems after several hours of videograbbing, in the end it crashes the program, but instead of terminating it keeps hanging.
> Since xen_evtchn is on the trace stack I'm wondering if any xen parts are causing it to hang instead of terminate.

Here is what the comment above mutex_lock says:

/***
 * mutex_lock - acquire the mutex
 * @lock: the mutex to be acquired
 *
 * Lock the mutex exclusively for this task. If the mutex is not
 * available right now, it will sleep until it can get it.
 *
 * The mutex must later on be released by the same task that
 * acquired it. Recursive locking is not allowed. The task
 * may not exit without first unlocking the mutex. Also, kernel
 * memory where the mutex resides mutex must not be freed with
 * the mutex still locked. The mutex must first be initialized
 * (or statically defined) before it can be locked. memset()-ing
 * the mutex to 0 is not allowed.
 *
 * ( The CONFIG_DEBUG_MUTEXES .config option turns on debugging
 *   checks that will enforce the restrictions and will also do
 *   deadlock debugging. )
 *
 * This function is similar to (but not equivalent to) down()

So I think the next step is to build the kernel with CONFIG_DEBUG_MUTEXES and see what it tells you.
Sander Eikelenboom
2010-Jul-26 15:55 UTC
Re: [Xen-devel] xen pci passthrough hung task instead of terminate
2 vcpus. Good idea to try just 1 for now :)

Monday, July 26, 2010, 5:53:12 PM, you wrote:

> How many CPUs do you have assigned to your guest?

--
Best regards,
 Sander                            mailto:linux@eikelenboom.it