I am not able to see these with v3.9, but with v3.10 I can easily see them.
And I can only see them when I build the kernel with these options:
CONFIG_DEBUG_MUTEXES=y
CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_PROVE_LOCKING=y
CONFIG_DEBUG_SPINLOCK_SLEEP=y
Attached is the full serial log, but here are the excerpts:
(XEN) HVM1: 130MB medium detected
(XEN) HVM1: Booting from 0000:7c00
[ 182.836965] BUG: scheduling while atomic: qemu-dm/3621/0x00000101
[ 182.863930] no locks held by qemu-dm/3621.
[ 182.888475] Modules linked in: dm_multipath dm_mod xen_evtchn
iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi libcrc32c
crc32c nouveau mxm_wmi radeon ttm sg sr_mod sd_mod cdrom ahci libahci mperf
crc32c_intel libata scsi_mod fbcon tilebli xen_blkfront xen_netfront fb_sys_fops
sysimgblt sysfillrect syscopyarea xenfs xen_privcmd
[ 183.012005] CPU: 0 PID: 3621 Comm: qemu-dm Not tainted
3.9.0upstream-10936-g51a26ae #1
[ 183.042583] Hardware name: LENOVO ThinkServer TS130/ , BIOS 9HKT47AUS
01/10/2012
[ 183.073531] 0000000000000000 ffff88007fa03c38 ffffffff8169d092
ffff88007fa03c58
[ 183.104037] ffffffff810c23d5 ffff88007fa14b00 ffff88007fa14b00
ffff88007fa03ce8
[ 183.134392] ffffffff8169f16f 000000010e4341c0 ffff880012405fd8
ffff880012404000
[ 183.164498] Call Trace:
[ 183.189376] <IRQ> [<ffffffff8169d092>] dump_stack+0x19/0x1b
[ 183.217888] [<ffffffff810c23d5>] __schedule_bug+0x65/0x90
[ 183.246280] [<ffffffff8169f16f>] __schedule+0x81f/0x840
[ 183.274147] [<ffffffff8169f254>] schedule+0x24/0x70
[ 183.301306] [<ffffffff8169dfb0>]
schedule_hrtimeout_range_clock+0xc0/0x160
[ 183.330515] [<ffffffff810b98f0>] ? update_rmtp+0x80/0x80
[ 183.357663] [<ffffffff810baaff>] ? hrtimer_start_range_ns+0xf/0x20
[ 183.385601] [<ffffffff8169e05e>] schedule_hrtimeout_range+0xe/0x10
[ 183.413258] [<ffffffff8109e18b>] usleep_range+0x3b/0x40
[ 183.439494] [<ffffffffa007fc6d>] e1000_irq_enable+0x1ad/0x1e0 [e1000e]
[ 183.467222] [<ffffffffa007fe18>] e1000e_poll+0x178/0x2e0 [e1000e]
[ 183.494288] [<ffffffff81540b78>] ? net_rx_action+0xd8/0x280
[ 183.520433] [<ffffffff81540bd5>] net_rx_action+0x135/0x280
[ 183.546316] [<ffffffff81096bd9>] __do_softirq+0x119/0x2d0
[ 183.571792] [<ffffffff81096efd>] irq_exit+0xed/0x100
[ 183.596388] [<ffffffff813b742f>] xen_evtchn_do_upcall+0x2f/0x40
[ 183.621833] [<ffffffff816aac1e>] xen_do_hypervisor_callback+0x1e/0x30
[ 183.647781] <EOI> [<ffffffff8100122a>] ?
xen_hypercall_xen_version+0xa/0x20
[ 183.674269] [<ffffffff8100122a>] ? xen_hypercall_xen_version+0xa/0x20
[ 183.699930] [<ffffffff810420ed>] ? xen_force_evtchn_callback+0xd/0x10
[ 183.725964] [<ffffffff81042a22>] ? check_events+0x12/0x20
[ 183.750676] [<ffffffff810429c9>] ? xen_irq_enable_direct_rel[
183.776451] [<ffffffff816a970c>] ? system_call_after_swapgs+0x19/0x60
[ 183.802194] NOHZ: local_softirq_pending 282
[ 183.827712] sh (3751) used greatest stack depth: 2344
[ 184.035913] BUG: scheduling while atomic: qemu-dm/3621/0x00000101
[ 184.035916] BUG: scheduling while atomic: sshd/3582/0x00000604
[ 184.035918] 7 locks held by sshd/3582:
[ 184.035924] #0: (sk_lock-AF_INET){+.+.+.}, at: [<ffffffff8159de57>]
tcp_sendmsg[ 184.035927] #1: (rcu_read_lock){.+.+..}, at:
[<ffffffff815916d0>] ip_queue_xmit+0x0/0x510
[ 184.035930] #2: (rcu_read_lock_bh){.+....}, at: [<ffffffff81590ecb>]
ip_finish_output2+0x7b/0x3e0
[ 184.035933] #3: (r..}, at: [<ffffffff815418b0>]
dev_queue_xmit+0x0/0x690
[ 184.035937] #4: (rcu_read_lock){.+.+..}, at: [<ffffffff81649640>]
br_dev_xmit+0x0/0x1b0
[ 184.035939] #5: (rcu_read_lock_bh){.+....}, at: [<ffffffff815418b0>]
dev_queue_xmit+0x0/0x690
[ 184.035943] #6: (_xmit_ETHER#2){+.-...}, at: [<ffffffff815607b7>]
sch_direct_xmit+0xb7/0x280
And so on. It keeps happening while QEMU runs, and at some point the kernel
crashes due to corruption:
[ 204.049337] #0: (rcu_read_lock){.+.+..}, at: [<ffffffff811ca4fb>]
fget_light+0x3b/0x150
[ 204.072019] BUG: unable to handle kernel paging request at 00000002e66c9780
[ 204.093663] IP: [<ffffffff810bed42>] task_curr+0x12/0x30
[ 204.113615] PGD 69dac067 PUD 0
[ 204.131150] Thread overran stack, or stack corrupted
[ 204.150870] Oops: 0000 [#1] SMP
[ 204.168495] Modules linked in: dm_multipath dm_mod xen_evtchn
iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transportd_mod cdrom ahci
libahci mperf crc32c_intel libata scsi_mod fbcon tileblit font bitblit i915
softcursor e1000e drm_kms_helper video tpm_tis wmi xen_blkfront xen_netfront
fb_sys_fops sysimgblt sysfillrec syscopyarea xenfs xen_privcmd
[ 204.270133] CPU: 0 PID: 3621 Comm: qemu-dm Tainted: G W
3.9.0upstream-10936-g51a26ae #1
[ 204.296935] Hardware name: LENOVO ThinkServer TS130/ , BIOS 9HKT47AUS
01/10/2012
[ 204.323074] task: ffff88006c942200 ti: ffff880012404000 task.ti:
ffff880012404000
[ 204.348978] RIP: e030:[<ffffffff810bed[ 204.375336] RSP:
e02b:ffff880012404240 EFLAGS: 00010046
[ 204.399162] RAX: 0000000000014b00 RBX: ffff88006c942200 RCX: 000000000000000d
[ 204.425264] RDX: 000000006c942200 RSI: ffff88006c942200 RDI: ffff88006c942200
[ 204.451354] RBP: ffff880012404240 R08: ffced0004a5b006a R09: 0000000000000001
[ 204.477580] R10: 0000000000000001 R11: 0000000000000001 R12: 000000000000000e
[ 204.503954] R13: 000000000000000e R14: 0000000000000001 R15: ffff88001268e0c0
[ 204.530235] FS: 00007f983fc23700(0000) GS:ffff88007fa00000(0000)
knlGS:0000000000000000
[ 204.557773] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 204.582781] CR2: 00000002e66c9780 CR3: 000000006ba6e000 CR4: 0000000000042660
[ 204.609561] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 204.636335] DR3: 0000000000000000 DR6: 00000ffff0ff0 DR7: 0000000000000400
[ 204.662967] Stack:
[ 204.683881] ffff8800124042a0 ffffffff810a2766 ffffffff00000001
00ff88000000000d
[ 204.710994] 0000000000000000 ffff88006
[ 204.738283] ffff88006c942200 000000000000000e 0000000000000001
0000000000000000
[ 204.765580] Call Trace:
[ 204.787773] [<ffffffff810a2766>] complete_signal+0x146/0x220
[ 204.813972] [<ffffffff810a5c1b>] send_sigqueue+0xcb/0x1e0
[ 204.839795] [<ffffffff810b513f>] posix_timer_event+0x7f/0xc0
[ 204.865905] [<ffffffff810b50c0>] ?
posix_timers_register_clock+0xe0/0xe0
[ 204.893286] [<ffffffff810ef570>] ? lock_release+0xf0/0x250
[ 204.919256] [<ffffffff810b51d6>] posix_timer_fn+0x56/0xe0
[ 204.945028] [<ffffffff810b9f9f>] __run_hrtimer+0x6f/0x220
[ 204.970792] [<ffffffff810b5180>] ? posix_timer_event+0xc0/0xc0
[ 204.997134] [<ffffffff810ba42e>] hrtimer_interrupt+0x10e/0x290
[ 205.023520] [<ffffffff8104261f>] xen_timer_interrupt+0x2f/0x1b0
[ 205.049996] [<ffffffff8111da7c>] handle_irq_event_percpu+0x7c/0x240
[ 205.077010] [<ffffffff81120cd9>] handle_percpu_irq+0x49/0x70
[ 205.103378] [<ffffffff813b73dd>] __xen_evtchn_do_upcall+0x38d/0x3a0
[ 205.130601] [<ffffffff810e998d>] ? trace_hardirqs_off+0xd/0x10
[ 205.157356] [<ffffffff810c8b37>] ? irqtime_account_irq+0xe7/0x100
[ 205.184300] [<ffffffff813b742a>] xen_evtchn_do_upcall+0x2a/0x40
[ 205.211032] [<ffffffff816aac1e>] xen_do_hypervisor_callback+0x1e/0x30
[ 205.238212] [<ffffffff8100122a>] ? xen_hypercall_xen_version+0xa/0x20
[ 205.265131] [<ffffffff8100122a>] ? xen_hypercall_xen_version+0xa/0x20
[ 205.291516] [<ffffffff810420ed>] ? xen_force_evtchn_callback+0xd/0x10
[ 205.317653] [<ffffffff81042a22>] ? check_events+0x12/0x20
[ 205.342515] [<ffffffff81042a0f>] ? xen_restore_fl_direct_reloc+0x4/0x4
[ 205.368739] [<ffffffff81090ac1>] ? vprintk_emit+0x251/0x520
[ 205.393899] [<ffffffff81042a01>] ? xen_restore_fl_direct+0
[ 205.419209] [<ffffffff8169cf43>] ? printk+0x48/0x4a
[ 205.442402] [<ffffffff811ca4fb>] ? fget_light+0x3b/0x150
[ 205.465431] [<ffffffff811ca4fb>] ? fget_light+0x3b/0x150
[ 205.487665] [<ffffffff810eb465>] ? print_lock+0x55/0xb0
[ 205.509345] [<ffffffff810eb53f>] ? lockdep_print_held_locks+0x[
205.532297] [<ffffffff810eb795>] ? debug_show_held_locks+0x15/0x30
[ 205.554631] [<ffffffff810c23bf>] ? __schedule_bug+0x4f/0x90
[ 205.575915] [<ffffffff8169f16f>] ? __schedule+0x81f/0x840
[ 205.596715] [<ffffffff8169f254>] ? schedule+0x24/0x70
[ 205.616849] [<ffffffff8169dfb0>] ?
schedule_hrtimeout_range_clock+0xc0/0x160
[ 205.639418] [<ffffffff810b98f0>] ? update_rmtp+0x80/0x80
[ 205.660007] [<ffffffff810baaff>] ? hrtimer_start_range_ns+0xf/0x20
[ 205.681604] [<ffffffff8169e05e>] ? schedule_hrtimeout_range+0xe/0x10
[ 205.703486] [<ffffffff8109e18b>] ? usleep_range+0[ 205.724082]
[<ffffffffa007baf5>] ? e1000e_update_tdt_wa+0x55/0xe0 [e1000e]
[ 205.746269] [<ffffffffa007cc28>] ? e1000_xm[ 205.768163]
[<ffffffff8153e8f2>] ? dev_queue_xmit_nit+0x202/0x280
[ 205.789490] [<ffffffff8153e6f0>] ? net_tx_action+0x2[ 205.810349]
[<ffffffff8153ec78>] ? dev_hard_start_xmit+0x308/0x5a0
[ 205.831671] [<ffffffff815607fe>] ? sch_direct_xmit+[ 205.852275]
[<ffffffff81541a39>] ? dev_queue_xmit+0x189/0x690
[ 205.872734] [<ffffffff815418b0>] ? dev_loopback_xmit+0x1e0/0x1e0
[ 205.893410] [<ffffffff8164b0b5>] ? br_dev_queue_push_xmit+0x55/0x70
[ 205.914143] [<ffffffff8164b20d>] ? br_forward_finish+0x1d/0x60
[ 205.934283] [<ffffffff81649640>] ? br_netpoll_setup+0x90/0x90
[ 205.954247] [<ffffffff8164b290>] ? __br_deliver+0x40/0x1
[ 205.973967] [<ffffffff8164b3cd>] ? br_deliver+0x3d/0x50
[ 205.993208] [<ffffffff816497ce>] ? br_dev_xmit+0x18e/0x1b0
[ 206.012589] [<ffffffff81649640>] ? br_netpoll_setup+0x90/0x90
[ 206.032322] [<ffffffff8153ec78>] ? dev_hard_start_xmit+0[ 206.052573]
[<ffffffff81541b87>] ? dev_queue_xmit+0x2d7/0x690
[ 206.072298] [<ffffffff815418b0>] ? dev_loopback_xmit+0x1e0/0x1e0
[ 206.092421] [<ffffffff81591020>] ? ip_finish_output2+0x1d0/0x3e0
[ 206.112370] [<ffffffff81590ecb>] ? ip_finish_output2+[ 206.132015]
[<ffffffff8156dc04>] ? nf_hook_slow+0x134/0x190
[ 206.151307] [<ffffffff81592890>] ? ip_fragment+0x8a0/0x8a0[
206.170420] [<ffffffff8159293e>] ? ip_finish_output+0xae/0x200
[ 206.189796] [<ffffffff81592ae4>] ? ip_output+0x54/0xe0[ 206.208336]
[<ffffffff81591258>] ? ip_local_out+0x28/0x80
[ 206.227053] [<ffffffff8159185b>] ? ip_queue_xmit+0x18b/0x510[
206.245981] [<ffffffff815916d0>] ? ip_send_unicast_reply+0x390/0x390
[ 206.265688] [<ffffffff815a83c5>] ? tcp_transmit_skb+0x465/0x880
[ 206.285137] [<ffffffff815a96fc>] ? tcp_send_ack+0xec/0x120
[ 206.304062] [<ffffffff815a0c09>] ? __tcp_ack_snd_check+0x59[
206.323558] [<ffffffff815a737c>] ? tcp_rcv_established+0x22c/0x810
[ 206.343227] [<ffffffff815b23ec>] ? tcp_v4_do_rcv+0x[ 206.362232]
[<ffffffff815b3011>] ? tcp_v4_rcv+0x5e1/0x7f0
[ 206.380800] [<ffffffff810ef0b0>] ? lock_acquire+0xb0/0x120
[ 206.399462] [<ffffffff8158b753>] ? ip_local_deliver_finish+0x43/0x350
[ 206.419330] [<ffffffff8158b710>] ? ip_local_deliver+0x80/0x80
[ 206.438526] [<ffffffff8158b808>] ? ip_local_deliver_finish+0xf8/0x350
[ 206.458487] [<ffffffff8158b753>] ? ip_local_deliver_finish+0x43/0x350
[ 206.478270] [<ffffffff8158b6d2>] ? ip_local_deliver+0x42/0x80
[ 206.497093] [<ffffffff8158bbec>] ? ip_rcv_finish+0x18c/0x4b0
[ 206.515770] [<ffffffff8158b599>] ? ip_rcv+0x219/0x310
[ 206.533777] [<ffffffff8153ff1a>] ?
__netif_receive_skb_core+0x6ca/0x850
[ 206.553680] [<ffffffff8153f951>] ?
__netif_receive_skb_core+0x101/0x850
[ 206.573385] [<ffffffff815400bd>] ? __netif_receive_skb+0x1d/0x70
[ 206.592123] [<ffffffff81540310>] ? netif_receive_skb+[ 206.610742]
[<ffffffff8164c2cd>] ? br_handle_frame_finish+0x1cd/0x2c0
[ 206.629932] [<ffffffff810ef0b0>] ? lock_acquire+[ 206.648109]
[<ffffffff8164c01a>] ? br_handle_frame+0x1aa/0x290
[ 206.666609] [<ffffffff8164be70>] ? br_handle_local_finish+0x40/0x40
[ 206.685485] [<ffffffff8153fb49>] ?
__netif_receive_skb_core+0x2f9/0x850
[ 206.704771] [<ffffffff8153f951>] ? __netif_rec[ 206.723887]
[<ffffffff815c65b0>] ? inet_gso_send_check+0x160/0x160
[ 206.742506] [<ffffffff815400bd>] ? __netif_receive_[ 206.761026]
[<ffffffff81540310>] ? netif_receive_skb+0x20/0x120
[ 206.779387] [<ffffffff815c66a3>] ? inet_gro_complete+0xf3/0x140
[ 206.797749] [<ffffffff815c65b0>] ? inet_gso_send_check+0x160/0x160
[ 206.816318] [<ffffffff810ef570>] ? lock_release+0xf0/0x250
[ 206.834163] [<ffffffff8154052c>] ? napi_gro_complete+0x11c/
[ 206.852682] [<ffffffff81540430>] ? napi_gro_complete+0x20/0x140
[ 206.871004] [<ffffffff810ef570>] ? lock_release+0xf0/0x250
[ 206.888823] [<ffffffff81540826>] ? dev_gro_receive+0x2d6/0x430
[ 206.907056] [<ffffffff81540748>] ? dev_gro_receive+0x1f8/0x430
[ 206.925189] [<ffffffff8119d993>] ? kmem_cache_free+0x123/0x370
[ 206.943303] [<ffffffff810ed400>] ? trace_hardirqs_on_ca[ 206.962279]
[<ffffffff81541026>] ? napi_gro_receive+0x56/0x150
[ 206.980487] [<ffffffffa007a8a5>] ? e1000_receive_skb+0x75/0xf0
[e1000e]
[ 206.999791] [<ffffffffa007d7a8>] ? e1000_clean_rx_irq+0x298/0x4a0
[e1000e]
[ 207.019536] [<ffffffffa007fd28>] ? e1000e_poll+0x88/0x2e0 [e1000e]
[ 207.038513] [<ffffffff81540b78>] ? net_rx_action+0xd8/0x280
[ 207.056877] [<ffffffff81540bd5>] ? net_rx_action+0x135/0x280
[ 207.075231] [<ffffffff81096bd9>] ? __do_softirq+0x119/0x2d0
[ 207.093466] [<ffffffff81096efd>] ? irq_exit+0xed/0x100
[ 207.111271] [<ffffffff813b742f>] ? xen_evtchn_do_upcall+0x2f/0x[
207.130320] [<ffffffff816aac1e>] ? xen_do_hypervisor_callback+0x1e/0x30
[ 207.149980] [<ffffffff811ca561>] ? fget_light+[ 207.168130]
[<ffffffff811ca531>] ? fget_light+0x71/0x150
[ 207.186070] [<ffffffff811ca4fb>] ? fget_light+0x3b/0x150
[ 207.203830] [<ffffffff811c067e>] ? do_select+0x36e/0x6e0
[ 207.221524] [<ffffffff811c0310>] ?
select_estimate_accuracy+007.240538] [<ffffffff811c00d0>] ?
poll_freewait+0x90/0x90
[ 207.258534] [<ffffffff811c01c0>] ? __pollwait+0xf0/0xf0
[ 207.276217] [<ffffffff815392bd>] ?
net_rps_action_and_irq_enable+0x8d/0xa0
[ 207.295849] [<ffffffff8111d882>] ? __irq_get_desc_lock+0x62/0xb0
[ 207.314629] [<ffffffff810c122d>] ? __wake_up+0x2d/0x7
[ 207.332497] [<ffffffff810edfde>] ? __lock_acquire+0x7be/0x17e0
[ 207.351329] [<ffffffff810eda39>] ? __lock_acquire+0x219[ 207.370039]
[<ffffffff810ef1c8>] ? lock_release_non_nested+0xa8/0x360
[ 207.389539] [<ffffffff8131668e>] ? do_raw_spin_u[ 207.408549]
[<ffffffff81178b6e>] ? might_fault+0x4e/0xa0
[ 207.426757] [<ffffffff81178b6e>] ? might_fault+0x4e/0xa0
[ 207.444973] [<ffffffff81178b6e>] ? might_fault+0x4e/0xa0
[ 207.463071] [<ffffffff810ef570>] ? lock_release+0xf0/0x250
[ 207.481346] [<ffffffff811c143c>] ? core_sys_select+0x21c/0x350
[ 207.499957] [<ffffffff811c1268>] ? core_sys_select+0x48/0x350
[ 207.518281] [<ffffffff810427d9>] ? xen_clocksource_read+0x39/0x50
[ 207.536978] [<ffffffff810ef1c8>] ? lock_release_non_nested+0xa8/0x360
[ 207.556248] [<ffffffff810e998d>] ? trace_hardirqs_off+0xd/0x10
[ 207.574735] [<ffffffff81125367>] ? rcu_irq_exit+0x87/0xe0
[ 207.592900] [<ffffffff81178b6e>] ? might_fault+0x4e/0xa0
[ 207.610858] [<ffffffff810427d9>] ? xen_clocksource_read+0x39/0x50
[ 207.629792] [<ffffffff810429a9>] ? xen_clocksource_g[ 207.649195]
[<ffffffff810dfa77>] ? ktime_get_ts+0x47/0xf0
[ 207.667330] [<ffffffff811c17d2>] ? SyS_select+0x42/0x110
[ 207.685402] [<ffffffff816a9769>] ? system_call_fastpath+0x16/0x1b
[ 207.704362] Code: 5f c9 c3 66 0f 1f 44 00 00 55 31 c000 48 89 e5 8b 52 18
<48> 8b 14 d5 80 87 cb 81 48 39 bc 10 98 08 00 00 c9 0f 94 c0 0f
[ 207.749634] RIP [<ffffffff810bed42>[ 207.768941] RSP
<ffff880012404240>
[ 207.786147] CR2: 00000002e66c9780
[ 207.803299] ---[ end trace f347e5b235e48095 ]---
[ 207.821933] Kernel panic - not syncing: Fatal exception in interrupt
(XEN) Domain 0 crashed: ''noreboot'' set - not rebooting.
If anybody has some time to run a git bisect to help identify the culprit,
it would be very much welcome.
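
For anyone reading the "scheduling while atomic" lines above: the trailing hex
value (0x00000101 for qemu-dm, 0x00000604 for sshd) is the task's preempt_count.
Below is a minimal sketch of how it could be decoded. It assumes the bit layout
I believe kernels of this era use in include/linux/hardirq.h (8 preemption bits,
8 softirq bits, 10 hardirq bits); please check that against your tree. The
helper names decode_preempt_count and parse_bug_line are made up for this
illustration and are not part of any kernel tool.

```python
import re

# Assumed preempt_count layout (Linux ~3.10, include/linux/hardirq.h).
# Verify against the kernel tree being debugged.
PREEMPT_SHIFT, PREEMPT_BITS = 0, 8
SOFTIRQ_SHIFT, SOFTIRQ_BITS = 8, 8
HARDIRQ_SHIFT, HARDIRQ_BITS = 16, 10


def _field(value, shift, bits):
    """Extract a bit field of the given width starting at 'shift'."""
    return (value >> shift) & ((1 << bits) - 1)


def decode_preempt_count(value):
    """Split a raw preempt_count into its three nesting counters."""
    return {
        "preempt": _field(value, PREEMPT_SHIFT, PREEMPT_BITS),
        "softirq": _field(value, SOFTIRQ_SHIFT, SOFTIRQ_BITS),
        "hardirq": _field(value, HARDIRQ_SHIFT, HARDIRQ_BITS),
    }


# Matches e.g. "BUG: scheduling while atomic: qemu-dm/3621/0x00000101"
BUG_RE = re.compile(r"scheduling while atomic: (.+)/(\d+)/0x([0-9a-fA-F]+)")


def parse_bug_line(line):
    """Return (comm, pid, decoded fields), or None if the line does not match."""
    m = BUG_RE.search(line)
    if m is None:
        return None
    comm, pid, raw = m.group(1), int(m.group(2)), int(m.group(3), 16)
    return comm, pid, decode_preempt_count(raw)


if __name__ == "__main__":
    for line in (
        "[ 182.836965] BUG: scheduling while atomic: qemu-dm/3621/0x00000101",
        "[ 184.035916] BUG: scheduling while atomic: sshd/3582/0x00000604",
    ):
        print(parse_bug_line(line))
```

Under that assumed layout, both reports have a non-zero softirq field, which
means the task was either running a softirq or had bottom halves disabled when
schedule() was called. For the first report this agrees with the __do_softirq
frame in its call trace.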