thr3ads.net - Xen devel - System freeze with IGD passthrough [Dec 2012]

G.R.

2012-Dec-18 17:28 UTC

System freeze with IGD passthrough

Hi Stefano,

I recently tried to play some 3D games on my Linux guest.
The game starts without problems, but it freezes the entire system after
some time (a minute or so?).
By this I mean that both the host and the domU become unresponsive.
SSH freezes and I had to shut down the machine directly with the power button.

I did not find anything obvious from the host log. But from the guest,
I can find this:

Dec 18 20:28:38 debvm kernel: [    0.899860] resource map sanity check
conflict: 0xfeff5018 0xfeff7017 0xfeff7000 0xffffffff reserved
Dec 18 20:28:38 debvm kernel: [    0.899862] ------------[ cut here
]------------
Dec 18 20:28:38 debvm kernel: [    0.899869] WARNING: at
arch/x86/mm/ioremap.c:171 __ioremap_caller+0x2c4/0x33c()
Dec 18 20:28:38 debvm kernel: [    0.899870] Hardware name: HVM domU
Dec 18 20:28:38 debvm kernel: [    0.899872] Info: mapping multiple
BARs. Your kernel is fine.
Dec 18 20:28:38 debvm kernel: [    0.899873] Modules linked in:
Dec 18 20:28:38 debvm kernel: [    0.899878] Pid: 1, comm: swapper/0
Not tainted 3.6.9 #4
Dec 18 20:28:38 debvm kernel: [    0.899892] Call Trace:
Dec 18 20:28:38 debvm kernel: [    0.899896]  [<ffffffff8103d194>] ?
warn_slowpath_common+0x76/0x8a
Dec 18 20:28:38 debvm kernel: [    0.899898]  [<ffffffff8103d240>] ?
warn_slowpath_fmt+0x45/0x4a
Dec 18 20:28:38 debvm kernel: [    0.899900]  [<ffffffff81032a6c>] ?
__ioremap_caller+0x2c4/0x33c
Dec 18 20:28:38 debvm kernel: [    0.899902]  [<ffffffff812c3be3>] ?
intel_opregion_setup+0x9c/0x201
Dec 18 20:28:38 debvm kernel: [    0.899904]  [<ffffffff812bcb75>] ?
intel_setup_gmbus+0x175/0x19d
Dec 18 20:28:38 debvm kernel: [    0.899907]  [<ffffffff8128a37a>] ?
i915_driver_load+0x548/0x90d
Dec 18 20:28:38 debvm kernel: [    0.899910]  [<ffffffff812ff804>] ?
setup_hpet_msi_remapped+0x20/0x20
Dec 18 20:28:38 debvm kernel: [    0.899912]  [<ffffffff81272706>] ?
drm_get_pci_dev+0x152/0x259
Dec 18 20:28:38 debvm kernel: [    0.899915]  [<ffffffff813d4883>] ?
_raw_spin_lock_irqsave+0x21/0x45
Dec 18 20:28:38 debvm kernel: [    0.899918]  [<ffffffff811d9ecc>] ?
local_pci_probe+0x5a/0xa0
Dec 18 20:28:38 debvm kernel: [    0.899920]  [<ffffffff811d9fcf>] ?
pci_device_probe+0xbd/0xe7
Dec 18 20:28:38 debvm kernel: [    0.899922]  [<ffffffff812cd887>] ?
driver_probe_device+0x1b0/0x1b0
Dec 18 20:28:38 debvm kernel: [    0.899923]  [<ffffffff812cd887>] ?
driver_probe_device+0x1b0/0x1b0
Dec 18 20:28:38 debvm kernel: [    0.899925]  [<ffffffff812cd769>] ?
driver_probe_device+0x92/0x1b0
Dec 18 20:28:38 debvm kernel: [    0.899926]  [<ffffffff812cd8da>] ?
__driver_attach+0x53/0x73
Dec 18 20:28:38 debvm kernel: [    0.899928]  [<ffffffff812cc06f>] ?
bus_for_each_dev+0x46/0x77
Dec 18 20:28:38 debvm kernel: [    0.899930]  [<ffffffff812ccf8f>] ?
bus_add_driver+0xd5/0x1f4
Dec 18 20:28:38 debvm kernel: [    0.899931]  [<ffffffff812cde14>] ?
driver_register+0x89/0x101
Dec 18 20:28:38 debvm kernel: [    0.899933]  [<ffffffff811d9336>] ?
__pci_register_driver+0x49/0xa3
Dec 18 20:28:38 debvm kernel: [    0.899935]  [<ffffffff816d55c7>] ?
ttm_init+0x63/0x63
Dec 18 20:28:38 debvm kernel: [    0.899937]  [<ffffffff81002085>] ?
do_one_initcall+0x75/0x12c
Dec 18 20:28:38 debvm kernel: [    0.899940]  [<ffffffff816a6cc2>] ?
kernel_init+0x13c/0x1c0
Dec 18 20:28:38 debvm kernel: [    0.899941]  [<ffffffff816a6565>] ?
do_early_param+0x83/0x83
Dec 18 20:28:38 debvm kernel: [    0.899943]  [<ffffffff813d9f44>] ?
kernel_thread_helper+0x4/0x10
Dec 18 20:28:38 debvm kernel: [    0.899945]  [<ffffffff816a6b86>] ?
start_kernel+0x3e1/0x3e1
Dec 18 20:28:38 debvm kernel: [    0.899947]  [<ffffffff813d9f40>] ?
gs_change+0x13/0x13
Dec 18 20:28:38 debvm kernel: [    0.899950] ---[ end trace
db461543ce599b44 ]---

I'm not sure if this has anything to do with the freeze. It seems to
show up on every boot since I upgraded to Xen version 4.2.1-rc2. Both
Debian kernels, 3.2.32 and 3.6.9, produce the same log. But the whole-system
freeze happens only during gaming, which is much less frequent.
So I'm not sure if the two are related. In any case, could you comment
on what this log means?

I can find one of the mentioned addresses in the qemu_dm log:
pt_pci_write_config: [00:02:0] address=00fc val=0xfeff5000 len=4
igd_write_opregion: Map OpRegion: cd996018 -> feff5018
igd_write_opregion: [00:02:0] addr=fc len=2 val=feff5000

PS: I also run XBMC on a domU and it plays back video with HW
acceleration (VAAPI) without any problem. XBMC is itself a
graphics-intensive program. But that runs on a pure HVM guest, while
the failing case is on PVHVM.

PS2: I also hit another instability yesterday. It happened while I
was compiling a kernel inside the domU. The host rebooted suddenly.
Since I was not using graphics at that time (the Xorg session was idle and I
was connected through SSH), this may be a different issue.

Thanks,
Timothy

G.R.

2012-Dec-19 06:20 UTC

Re: System freeze with IGD passthrough

Adding Jean, the author of the OpRegion patch.

Jean, I believe the warning is due to the offset within the page.
To accommodate the offset, you would need to reserve another page for it.
Will the extra page cause any unexpected problem?
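
The arithmetic behind the page-offset argument can be sketched as follows. This is only an illustration, assuming 4 KiB pages and an 8 KiB OpRegion (the size implied by the 0xfeff5018-0xfeff7017 range in the guest log); the helper name `pages_spanned` is hypothetical and not taken from the patch.

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB pages


def pages_spanned(start, size, page_size=PAGE_SIZE):
    """Return (first_page, last_page, page_count) for the byte range
    [start, start + size)."""
    end = start + size - 1
    first_page = start & ~(page_size - 1)
    last_page = end & ~(page_size - 1)
    count = (last_page - first_page) // page_size + 1
    return first_page, last_page, count


# Guest address from the qemu_dm log; 0x18 is the offset within the page.
guest_start = 0xFEFF5018
opregion_size = 0x2000  # assumption: 0xfeff7017 - 0xfeff5018 + 1

first, last, count = pages_spanned(guest_start, opregion_size)
print(hex(first), hex(last), count)

# The same region without the in-page offset, for comparison.
aligned = pages_spanned(0xFEFF5000, opregion_size)
print(aligned[2])
```

With the 0x18 offset the region touches three pages instead of two, and the third page starts at 0xfeff7000, which is the start of the range reported as conflicting in the warning.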

The original thread is about an instability issue that freezes the host outright.
I believe the warning above should not have such an effect.
What do you think? Any suggestions?

Thanks,
Timothy


G.R.

2012-Dec-19 16:04 UTC

Re: System freeze with IGD passthrough

On Wed, Dec 19, 2012 at 2:20 PM, G.R. <firemeteor@users.sourceforge.net> wrote:
> Adding Jean, the author to the opregion patch.
>
> Jean, I believe the warning is due to the offset within the page.
> To accommodate the offset, you would need to reserve another page for it.
> Will the extra page cause any unexpected problem?
>
> The original thread is about an instability issue that directly freeze the host.
> I believe this warning above should not has such effect.
> What do you think? And any suggestion?
>
Jean appears to be no longer reachable.
The warning I found turns out not to be relevant.
According to the OpRegion spec, the tail part is reserved and should
never be touched by the guest.
Anyway, I have a local fix that gets rid of the warning by reserving
one more page and mapping it when the host OpRegion is not page-aligned.
I'll send it in a separate thread.

Back to the topic. I updated to Xen 4.2.1 and tried three times tonight.
Two of the attempts led to a total freeze with no error log available, after
a couple of minutes of game play.
The last attempt ended with a GPU hang after 10+ minutes of game play.
This was a guest-only hang. But I still have no way to check the GPU error
state, even though it has been collected:

[ 1553.588076] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer
elapsed... GPU hung
[ 1553.592112] [drm] capturing error event; look for more information
in /debug/dri/0/i915_error_state
[ 1582.004075] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer
elapsed... GPU hung
[ 1597.220075] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer
elapsed... GPU hung
[ 1613.220074] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer
elapsed... GPU hung
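
For reference, one way the captured state might be looked up from inside the guest is sketched below. It assumes debugfs is mounted at the conventional /sys/kernel/debug location (or at /debug, as the log message prints) and that the guest is still responsive enough to run it as root; the function names are hypothetical.

```python
import os

# Candidate locations; both are assumptions about where debugfs is mounted.
CANDIDATE_PATHS = (
    "/sys/kernel/debug/dri/0/i915_error_state",  # conventional debugfs mount
    "/debug/dri/0/i915_error_state",             # path printed in the log
)


def find_error_state(paths=CANDIDATE_PATHS):
    """Return the first candidate path that exists as a file, else None."""
    for path in paths:
        if os.path.isfile(path):
            return path
    return None


def read_error_state(path, limit=4096):
    """Read up to `limit` characters of the error state dump."""
    with open(path, "r", errors="replace") as f:
        return f.read(limit)


if __name__ == "__main__":
    path = find_error_state()
    if path is None:
        print("i915_error_state not found; is debugfs mounted?")
    else:
        print(read_error_state(path))
```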

I'm wondering if the two symptoms are due to the same underlying
cause.
But I guess a GPU hang caused by a guest driver issue should not freeze
the host. Is that true?

I'm going to try more configurations: different kernel
versions, with / without PVOPS, native run vs. VM, etc.
But this is rather blind, since I have no clue at all. If you have
anything to suspect, a pointer would be highly appreciated.

Thanks,
Timothy

Jean Guyader

2012-Dec-19 18:18 UTC

Re: System freeze with IGD passthrough

Hi Timothy,

Could you send /proc/iomem, lspci -vvvv and the e820 from dmesg for this VM?

Thanks,
Jean

G.R.

2012-Dec-20 13:52 UTC

Re: System freeze with IGD passthrough

On Thu, Dec 20, 2012 at 2:18 AM, Jean Guyader <jean.guyader@gmail.com> wrote:
> Hi Timothy,
>
> Could you send /proc/iomem, lspci -vvvv and the e820 from dmesg for this VM?
>
Thanks Jean, here is the info you asked for.
Could I ask what it is for? The warning in the kernel log or the host
freezing issue?
If it's about the former, I should mention that the log I posted was taken
with a local patch already applied (sent in a separate thread with you
involved) that reserves one more page in e820.

/proc/iomem:
00000000-0000ffff : reserved
00010000-0009dfff : System RAM
0009e000-0009ffff : reserved
000a0000-000bffff : PCI Bus 0000:00
000c0000-000ce3ff : Video ROM
000ce800-000cf1ff : Adapter ROM
000e0000-000fffff : reserved
  000f0000-000fffff : System ROM
00100000-dfffffff : System RAM
  01000000-013dcc77 : Kernel code
  013dcc78-0168f03f : Kernel data
  01727000-01804fff : Kernel bss
e0000000-fbffffff : PCI Bus 0000:00
  e0000000-efffffff : 0000:00:02.0
  f0000000-f0ffffff : 0000:00:03.0
    f0000000-f0ffffff : xen-platform-pci
  f1000000-f13fffff : 0000:00:02.0
  f1400000-f1403fff : i915 MCHBAR
  f1620000-f1623fff : 0000:00:05.0
    f1620000-f1623fff : ICH HD audio
  f1624000-f1624fff : 0000:00:06.0
    f1624000-f1624fff : ehci_hcd
fc000000-feff3fff : reserved
  fec00000-fec003ff : IOAPIC 0
  fed00000-fed003ff : HPET 0
  fee00000-fee00fff : Local APIC
feff4000-feff6fff : ACPI Non-volatile Storage
feff7000-ffffffff : reserved
100000000-11c7fffff : System RAM
11c800000-11fffffff : RAM buffer
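
As a rough cross-check, the listing above can be searched for entries that overlap the range named in the guest warning (0xfeff5018-0xfeff7017). The sketch below is only an illustration: the parser and the function names are hypothetical, and the sample text is a subset of the listing as posted.

```python
def parse_iomem(text):
    """Parse /proc/iomem-style lines into (start, end, name) tuples."""
    regions = []
    for line in text.splitlines():
        if " : " not in line:
            continue
        span, name = line.strip().split(" : ", 1)
        lo, hi = (int(x, 16) for x in span.split("-"))
        regions.append((lo, hi, name))
    return regions


def overlapping(regions, start, end):
    """Return the regions that share at least one byte with [start, end]."""
    return [r for r in regions if r[0] <= end and start <= r[1]]


# Subset of the /proc/iomem listing posted above.
SAMPLE = """\
fc000000-feff3fff : reserved
  fec00000-fec003ff : IOAPIC 0
feff4000-feff6fff : ACPI Non-volatile Storage
feff7000-ffffffff : reserved
100000000-11c7fffff : System RAM
"""

hits = overlapping(parse_iomem(SAMPLE), 0xFEFF5018, 0xFEFF7017)
for lo, hi, name in hits:
    print(f"{lo:08x}-{hi:08x} : {name}")
```

On this sample the range overlaps both the ACPI NVS entry and the reserved entry that starts at feff7000, which matches the two ranges named in the "resource map sanity check conflict" line.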


dmesg lines containing 'e820':
[    0.000000] e820: BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009dfff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009e000-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000dfffffff] usable
[    0.000000] BIOS-e820: [mem 0x00000000fc000000-0x00000000feff3fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000feff4000-0x00000000feff6fff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x00000000feff7000-0x00000000ffffffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000011c7fffff] usable
[    0.000000] e820: update [mem 0x00000000-0x0000ffff] usable ==> reserved
[    0.000000] e820: remove [mem 0x000a0000-0x000fffff] usable
[    0.000000] e820: last_pfn = 0x11c800 max_arch_pfn = 0x400000000
[    0.000000] e820: last_pfn = 0xe0000 max_arch_pfn = 0x400000000
[    0.000000] e820: [mem 0xe0000000-0xfbffffff] available for PCI devices
[    0.439596] e820: reserve RAM buffer [mem 0x0009e000-0x0009ffff]
[    0.439597] e820: reserve RAM buffer [mem 0x11c800000-0x11fffffff]

Please find the lspci -vvv log in the attachment; it's a bit lengthy.


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

G.R.

2012-Dec-20 18:20 UTC

Re: System freeze with IGD passthrough

On Thu, Dec 20, 2012 at 12:04 AM, G.R. <firemeteor@users.sourceforge.net> wrote:
>>> PS2: I also suffered another instability yesterday. It happens when I
>>> was compiling kernel in side the domU. The host reboots suddenly.
>>> Since I'm not using graphics at that time (Xorg session is idle, I
>>> connected through SSH), this may be a different issue.
I tried once more to rebuild the kernel in the Debian VM. It was a total
mess this time.
The whole system (including dom0) unexpectedly rebooted several times
during the compilation.
This corrupted the kernel tree and I failed to build the kernel.
I suspect this has something to do with the disk driver, since the reboots
tend to happen under high disk load (like linking vmlinux).
I will run iozone to check tomorrow.

It seems that this issue has little to do with IGD passthrough.
I'm not sure if it's the same issue as the host freezing
during game play.
Maybe I should track them separately.

Thanks,
Timothy

Konrad Rzeszutek Wilk

2012-Dec-21 19:38 UTC

Re: System freeze with IGD passthrough

On Thu, Dec 20, 2012 at 12:04:01AM +0800, G.R. wrote:
> On Wed, Dec 19, 2012 at 2:20 PM, G.R. <firemeteor@users.sourceforge.net> wrote:
> > Adding Jean, the author to the opregion patch.
> >
> > Jean, I believe the warning is due to the offset within the page.
> > To accommodate the offset, you would need to reserve another page for it.
> > Will the extra page cause any unexpected problem?
> >
> > The original thread is about an instability issue that directly freeze the host.
> > I believe this warning above should not has such effect.
> > What do you think? And any suggestion?
> >
> 
> Jean appears to be no longer reach able.
> The warning I found turns out to be not relevant.
> According to the OpRegion spec, the tail part is reserved and should
> never be touched by the guest.
> But anyway, I had a local fix to get rid of the warning, but reserving
> one more page and map it when the host opregion is not page aligned.
> I'll send it to a separate thread.
> 
> Back to the topic. I updated to xen 4.2.1 and tried three times tonight.
> Two of them lead to total freeze with no error log available, after
> game playing for a couple of minutes.
> And the last try ended up with GPU hang after 10+ minutes of game playing.
> This is a guest only hang. But I still have no way to check GPU error
> state even it has been collected:
> 
> [ 1553.588076] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer
> elapsed... GPU hung
> [ 1553.592112] [drm] capturing error event; look for more information
> in /debug/dri/0/i915_error_state
> [ 1582.004075] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer
> elapsed... GPU hung
> [ 1597.220075] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer
> elapsed... GPU hung
> [ 1613.220074] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer
> elapsed... GPU hung

Those also appear on bare metal (Linus actually mentioned this).

> 
> I'm wondering if the two syndromes are due to the same underlying cause.
> But I guess a GPU hang caused by guest driver issue should not freeze
> the host. Is it true?

It shouldn't. Is the machine usable while this guest is frozen?

> 
> I'm going to try more with different config -- different kernel
> version, with / without PVOPS, native run vs VM etc.
> But this is kind of blindly since I have no clue at all. If you have
> anything to suspect, it will be highly appreciated.
> 
> Thanks,
> Timothy
> 
> > Thanks,
> > Timothy
> >
> > On Wed, Dec 19, 2012 at 1:28 AM, G.R.
<firemeteor@users.sourceforge.net> wrote:
> >> Hi Stefano,
> >>
> >> I recently tried to play some 3D games on my linux guest.
> >> The game starts without problem but it freezes the entire system
after
> >> a some time (a minute or so?).
> >> Here I mean both the host and domU are not responsive anymore.
> >> The ssh freezes and i had to shutdown the machine using power
button directly.
> >>
> >> I did not find anything obvious from the host log. But from the
guest,
> >> I can find this:
> >>
> >> Dec 18 20:28:38 debvm kernel: [    0.899860] resource map sanity check conflict: 0xfeff5018 0xfeff7017 0xfeff7000 0xffffffff reserved
> >> Dec 18 20:28:38 debvm kernel: [    0.899862] ------------[ cut here ]------------
> >> Dec 18 20:28:38 debvm kernel: [    0.899869] WARNING: at arch/x86/mm/ioremap.c:171 __ioremap_caller+0x2c4/0x33c()
> >> Dec 18 20:28:38 debvm kernel: [    0.899870] Hardware name: HVM domU
> >> Dec 18 20:28:38 debvm kernel: [    0.899872] Info: mapping multiple BARs. Your kernel is fine.
> >> Dec 18 20:28:38 debvm kernel: [    0.899873] Modules linked in:
> >> Dec 18 20:28:38 debvm kernel: [    0.899878] Pid: 1, comm: swapper/0 Not tainted 3.6.9 #4
> >> Dec 18 20:28:38 debvm kernel: [    0.899892] Call Trace:
> >> Dec 18 20:28:38 debvm kernel: [    0.899896]  [<ffffffff8103d194>] ? warn_slowpath_common+0x76/0x8a
> >> Dec 18 20:28:38 debvm kernel: [    0.899898]  [<ffffffff8103d240>] ? warn_slowpath_fmt+0x45/0x4a
> >> Dec 18 20:28:38 debvm kernel: [    0.899900]  [<ffffffff81032a6c>] ? __ioremap_caller+0x2c4/0x33c
> >> Dec 18 20:28:38 debvm kernel: [    0.899902]  [<ffffffff812c3be3>] ? intel_opregion_setup+0x9c/0x201
> >> Dec 18 20:28:38 debvm kernel: [    0.899904]  [<ffffffff812bcb75>] ? intel_setup_gmbus+0x175/0x19d
> >> Dec 18 20:28:38 debvm kernel: [    0.899907]  [<ffffffff8128a37a>] ? i915_driver_load+0x548/0x90d
> >> Dec 18 20:28:38 debvm kernel: [    0.899910]  [<ffffffff812ff804>] ? setup_hpet_msi_remapped+0x20/0x20
> >> Dec 18 20:28:38 debvm kernel: [    0.899912]  [<ffffffff81272706>] ? drm_get_pci_dev+0x152/0x259
> >> Dec 18 20:28:38 debvm kernel: [    0.899915]  [<ffffffff813d4883>] ? _raw_spin_lock_irqsave+0x21/0x45
> >> Dec 18 20:28:38 debvm kernel: [    0.899918]  [<ffffffff811d9ecc>] ? local_pci_probe+0x5a/0xa0
> >> Dec 18 20:28:38 debvm kernel: [    0.899920]  [<ffffffff811d9fcf>] ? pci_device_probe+0xbd/0xe7
> >> Dec 18 20:28:38 debvm kernel: [    0.899922]  [<ffffffff812cd887>] ? driver_probe_device+0x1b0/0x1b0
> >> Dec 18 20:28:38 debvm kernel: [    0.899923]  [<ffffffff812cd887>] ? driver_probe_device+0x1b0/0x1b0
> >> Dec 18 20:28:38 debvm kernel: [    0.899925]  [<ffffffff812cd769>] ? driver_probe_device+0x92/0x1b0
> >> Dec 18 20:28:38 debvm kernel: [    0.899926]  [<ffffffff812cd8da>] ? __driver_attach+0x53/0x73
> >> Dec 18 20:28:38 debvm kernel: [    0.899928]  [<ffffffff812cc06f>] ? bus_for_each_dev+0x46/0x77
> >> Dec 18 20:28:38 debvm kernel: [    0.899930]  [<ffffffff812ccf8f>] ? bus_add_driver+0xd5/0x1f4
> >> Dec 18 20:28:38 debvm kernel: [    0.899931]  [<ffffffff812cde14>] ? driver_register+0x89/0x101
> >> Dec 18 20:28:38 debvm kernel: [    0.899933]  [<ffffffff811d9336>] ? __pci_register_driver+0x49/0xa3
> >> Dec 18 20:28:38 debvm kernel: [    0.899935]  [<ffffffff816d55c7>] ? ttm_init+0x63/0x63
> >> Dec 18 20:28:38 debvm kernel: [    0.899937]  [<ffffffff81002085>] ? do_one_initcall+0x75/0x12c
> >> Dec 18 20:28:38 debvm kernel: [    0.899940]  [<ffffffff816a6cc2>] ? kernel_init+0x13c/0x1c0
> >> Dec 18 20:28:38 debvm kernel: [    0.899941]  [<ffffffff816a6565>] ? do_early_param+0x83/0x83
> >> Dec 18 20:28:38 debvm kernel: [    0.899943]  [<ffffffff813d9f44>] ? kernel_thread_helper+0x4/0x10
> >> Dec 18 20:28:38 debvm kernel: [    0.899945]  [<ffffffff816a6b86>] ? start_kernel+0x3e1/0x3e1
> >> Dec 18 20:28:38 debvm kernel: [    0.899947]  [<ffffffff813d9f40>] ? gs_change+0x13/0x13
> >> Dec 18 20:28:38 debvm kernel: [    0.899950] ---[ end trace db461543ce599b44 ]---
> >>
> >> I'm not sure if this has anything to do with the freeze. This seems to
> >> show up on every boot after I upgraded to xen version 4.2.1-rc2. Both
> >> debian kernel 3.2.32 / 3.6.9 suffers from the same log. But whole
> >> system freeze happens only during gaming, which is much less frequent.
> >> So I'm not sure if the two are related. But anyway, could you comment
> >> about what does this log mean?
> >>
> >> I can find the one of the mentioned address in the qemu_dm log:
> >> pt_pci_write_config: [00:02:0] address=00fc val=0xfeff5000 len=4
> >> igd_write_opregion: Map OpRegion: cd996018 -> feff5018
> >> igd_write_opregion: [00:02:0] addr=fc len=2 val=feff5000
> >>
> >> PS: I also run xbmc on domU and it playbacks video under HW
> >> acceleration (VAAPI) without any problem. XBMC by itself is also an
> >> graphics intensive program. But this runs on an pure HVM guest, while
> >> the failing case is on PVHVM.
> >>
> >> PS2: I also suffered another instability yesterday. It happens when I
> >> was compiling kernel in side the domU. The host reboots suddenly.
> >> Since I'm not using graphics at that time (Xorg session is idle, I
> >> connected through SSH), this may be a different issue.
> >>
> >> Thanks,
> >> Timothy
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
>

Konrad Rzeszutek Wilk

2012-Dec-21 19:39 UTC

Re: System freeze with IGD passthrough

On Fri, Dec 21, 2012 at 02:20:17AM +0800, G.R. wrote:
> On Thu, Dec 20, 2012 at 12:04 AM, G.R. <firemeteor@users.sourceforge.net> wrote:
> >>> PS2: I also suffered another instability yesterday. It happens when I
> >>> was compiling kernel in side the domU. The host reboots suddenly.
> >>> Since I'm not using graphics at that time (Xorg session is idle, I
> >>> connected through SSH), this may be a different issue.
> 
> I tried once more to rebuild kernel in the debian VM. It's a total
> mess this time.
> The whole system (including dom0) unexpectedly reboots several times
> during the compilation.
> This destroyed the kernel tree and I failed to build the kernel.
> I suspect this has something to do with disk driver, since the reboot
> tend to happen during high disk load (like linking vmlinux).
Is the AHCI controller sharing the same interrupt line as the IGD?
> Will run iozone to check tomorrow.
> 
> It seems that this issue has little to do with IGD passthrough.
> I'm not sure if it's the same issue for the host freezing during game play.
> Maybe I should track them separately.
> 
> Thanks,
> Timothy
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
>
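
One way to check the question raised above, whether the AHCI controller and the IGD share an interrupt line, is to look for rows in /proc/interrupts that list more than one handler. The Python sketch below is illustrative only: the function name shared_irqs and the sample text are made up for this example (neither comes from the thread), and it assumes the usual layout in which handlers sharing an IRQ are listed comma-separated at the end of the row and handler names contain no spaces.

```python
def shared_irqs(interrupts_text):
    """Return {irq_number: [handler names]} for IRQ rows with 2+ handlers."""
    shared = {}
    for line in interrupts_text.splitlines():
        label, sep, rest = line.partition(":")
        label = label.strip()
        # Skip the CPU header row and non-numeric rows such as NMI or LOC.
        if not sep or not label.isdigit():
            continue
        fields = rest.split()
        # Drop the leading per-CPU interrupt counters.
        i = 0
        while i < len(fields) and fields[i].isdigit():
            i += 1
        tail = " ".join(fields[i:])
        parts = [p.strip() for p in tail.split(",") if p.strip()]
        if len(parts) < 2:
            continue  # only one handler on this line, so it is not shared
        # The first part still carries the interrupt chip name; keep its last word.
        shared[int(label)] = [parts[0].split()[-1]] + parts[1:]
    return shared


# Hypothetical sample, NOT taken from the machine discussed in this thread.
SAMPLE = """\
           CPU0       CPU1
  0:         36          0   IO-APIC-edge      timer
 16:       1200        340   IO-APIC-fasteoi   ahci, i915
NMI:          0          0   Non-maskable interrupts
"""

if __name__ == "__main__":
    print(shared_irqs(SAMPLE))
```

On a real host the same function could be fed the contents of /proc/interrupts from dom0, for example shared_irqs(open("/proc/interrupts").read()). The chip and handler names that appear there depend on the kernel and on Xen.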

G.R.

2012-Dec-23 05:37 UTC

Re: System freeze with IGD passthrough

> Is the AHCI controller sharing the same interrupt line as the IGD?
>
Thanks for your help, Konrad.
I did some more experiments and this turns out to be due to my own stupidity, again.

So basically the instability comes from the HW directly: it panics
once heavy load is present, either gaming or kernel compilation.
The direct cause of this HW instability is that I had under-volted
my processor, which I had almost forgotten about.
That config works fine on a native build -- it passes stress testing
with prime95.
However, virtualization seems to be more demanding and does not
work well at that voltage setting.

After removing the under-voltage trick, the virtualized system works just fine.
So all known functionality issues with the linux build have been solved.
Thank you all, and I apologize for wasting your time.

Thanks,
Timothy

PS: The bad news is that this instability fix does not help the
win7 guest in any way.
It's as broken as before.
