Yasir Assam
2010-Feb-10 11:44 UTC
[Xen-devel] xen.git pvops kernel bug: i915 bug after memory upgrade
I upgraded my RAM from 2GB to 8GB today, and I'm no longer able to run X. My guess is this is a bug in the xen.git kernel (the dom0 kernel) in the i915 module. Other kernels (vanilla 2.6.32.x) work fine.

I have attached the full dmesg log. The problem is completely reproducible on my machine.

Thanks
Yasir

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Feb-10 20:57 UTC
Re: [Xen-devel] xen.git pvops kernel bug: i915 bug after memory upgrade
On Wed, Feb 10, 2010 at 10:44:59PM +1100, Yasir Assam wrote:
> I upgraded my RAM from 2GB to 8GB today, and I'm no longer able to run
> X. My guess is this is a bug in the xen.git kernel (the dom0 kernel) in
> the i915 module. Other kernels (vanilla 2.6.32.x) work fine.
>
> I have attached the full dmesg log. The problem is completely
> reproducible on my machine.

1) Can you give me the hardware specs?

.. snip ..
> [ 23.261678] BUG: unable to handle kernel paging request at ffffc900000c6000
> [ 23.261685] IP: [<ffffffffa0015226>] intel_i915_chipset_flush+0x22/0x3e [intel_agp]
> [ 23.261694] PGD 33d2067 PUD 33d3067 PMD 33d4067 PTE 0
> [ 23.261700] Oops: 0002 [#1] SMP
> [ 23.261703] last sysfs file: /sys/module/i2c_core/initstate
> [ 23.261705] CPU 0
> [ 23.261707] Modules linked in: i915(+) drm i2c_algo_bit video output ppdev lp parport sco bnep rfcomm l2cap bluetooth rfkill battery cpufreq_stats cpufreq_userspace cpufreq_conservative cpufreq_powersave fuse hwmon_vid k8temp eeprom i2c_nforce2 firewire_sbp2 firewire_core crc_itu_t loop snd_hda_codec_intelhdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_midi snd_rawmidi processor snd_seq_midi_event evdev pcspkr i2c_i801 i2c_core asus_atk0110 snd_seq snd_timer button snd_seq_device acpi_processor snd soundcore snd_page_alloc ext3 jbd mbcache dm_mod raid1 md_mod sg sd_mod crc_t10dif sr_mod cdrom usbhid hid pata_jmicron ata_generic ata_piix libata scsi_mod ide_pci_generic ehci_hcd r8169 mii ide_core usbcore nls_base intel_agp thermal fan thermal_sys [last unloaded: scsi_wait_scan]
> [ 23.261775] Pid: 2379, comm: modprobe Not tainted 2.6.31.6-pvops-dom0 #7 System Product Name
> [ 23.261777] RIP: e030:[<ffffffffa0015226>] [<ffffffffa0015226>] intel_i915_chipset_flush+0x22/0x3e [intel_agp]

Can you disassemble the instructions around this to see what it is doing?
Look here on how to do it: http://lists.xensource.com/archives/html/xen-devel/2009-10/msg00008.html

> [ 23.261783] RSP: e02b:ffff880002155a58 EFLAGS: 00010286
> [ 23.261785] RAX: 0000000000000001 RBX: ffff88001e0f7300 RCX: 0000000000001000
> [ 23.261787] RDX: ffffc900000c6000 RSI: 00000000000007e9 RDI: ffff88001d5efe00
> [ 23.261789] RBP: ffff88001e96c000 R08: 0000000000000040 R09: ffff8800016f1000
> [ 23.261792] R10: ffff880000000000 R11: 6db6db6db6db6db7 R12: 0000000000000001
> [ 23.261794] R13: 00000000007e9000 R14: ffff88001e0f7f00 R15: 00000000007e9000
> [ 23.261799] FS: 00007f1a00ddd6f0(0000) GS:ffffc90000000000(0000) knlGS:0000000000000000
> [ 23.261801] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 23.261803] CR2: ffffc900000c6000 CR3: 000000001dd65000 CR4: 0000000000002660
> [ 23.261806] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 23.261808] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 23.261811] Process modprobe (pid: 2379, threadinfo ffff880002154000, task ffff8800198b8000)
> [ 23.261812] Stack:
> [ 23.261814] 0000000d00000000 000000007ea42086 000000007ea42086 ffffffffa03c387c
> [ 23.261818] <0> ffff88001e0f7f00 000000007ea42086 ffff88001e96c000 ffff88001e0f7300
> [ 23.261823] <0> ffff88001e0f7f00 ffffffffa03c4fc3 ffff88001e0f7300 0000000000000000
> [ 23.261828] Call Trace:
> [ 23.261840] [<ffffffffa03c387c>] ? i915_gem_object_flush_cpu_write_domain+0x30/0x53 [i915]
> [ 23.261849] [<ffffffffa03c4fc3>] ? i915_gem_object_set_to_gtt_domain+0x57/0x9d [i915]
> [ 23.261860] [<ffffffffa03d909a>] ? intelfb_create+0x1e5/0x7a3 [i915]
> [ 23.261866] [<ffffffff81033525>] ? xen_force_evtchn_callback+0x1d/0x37
> [ 23.261877] [<ffffffffa03d9a1e>] ? intelfb_probe+0x3c6/0x62e [i915]
> [ 23.261881] [<ffffffff8103400f>] ? xen_restore_fl_direct_end+0x0/0x1
> [ 23.261894] [<ffffffffa039d239>] ? drm_helper_initial_config+0x176/0x19c [drm]
> [ 23.261902] [<ffffffffa03be2e7>] ? i915_driver_load+0xaa7/0xb3c [i915]
> [ 23.261913] [<ffffffffa0393399>] ? drm_get_dev+0x321/0x444 [drm]
> [ 23.261919] [<ffffffff811fc04b>] ? local_pci_probe+0x22/0x3e
> [ 23.261922] [<ffffffff81033525>] ? xen_force_evtchn_callback+0x1d/0x37
> [ 23.261925] [<ffffffff811fd30e>] ? pci_device_probe+0x68/0xab
> [ 23.261930] [<ffffffff81299c91>] ? driver_probe_device+0xa2/0x13a
> [ 23.261933] [<ffffffff8103400f>] ? xen_restore_fl_direct_end+0x0/0x1
> [ 23.261936] [<ffffffff81299d8c>] ? __driver_attach+0x63/0x9a
> [ 23.261939] [<ffffffff81299d29>] ? __driver_attach+0x0/0x9a
> [ 23.261942] [<ffffffff812990ab>] ? bus_for_each_dev+0x54/0x9d
> [ 23.261945] [<ffffffff81299674>] ? bus_add_driver+0xbc/0x218
> [ 23.261948] [<ffffffff8129a185>] ? driver_register+0xa3/0x122
> [ 23.261951] [<ffffffff811fd5b6>] ? __pci_register_driver+0x5e/0xe7
> [ 23.261959] [<ffffffffa0383000>] ? i915_init+0x0/0x74 [i915]
> [ 23.261962] [<ffffffff8100a0f5>] ? do_one_initcall+0x77/0x1c1
> [ 23.261966] [<ffffffff810ae08f>] ? sys_init_module+0xda/0x223
> [ 23.261970] [<ffffffff81038fc2>] ? system_call_fastpath+0x16/0x1b
> [ 23.261972] Code: 86 51 06 e1 48 83 c4 18 c3 48 83 ec 18 48 8b 15 f1 80 00 00 65 48 8b 04 25 28 00 00 00 48 89 44 24 10 31 c0 48 85 d2 74 04 b0 01 <89> 02 48 8b 44 24 10 65 48 33 04 25 28 00 00 00 74 05 e8 48 51
> [ 23.262012] RIP [<ffffffffa0015226>] intel_i915_chipset_flush+0x22/0x3e [intel_agp]
> [ 23.262017] RSP <ffff880002155a58>
> [ 23.262019] CR2: ffffc900000c6000
> [ 23.262022] ---[ end trace cf5e2ee5497e2d52 ]---
> [ 26.955198] eth0: no IPv6 routers present
> [ 27.230515] peth0: no IPv6 routers present
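A first pass at the disassembly request can be done by hand from the oops itself. The sketch below is not from the thread; it is a minimal Python helper (the function names `split_code_line` and `decode_pf_error` are made up for illustration) that splits the "Code:" line at the `<xx>` marker, which flags the byte at the faulting RIP, and decodes the low bits of the page-fault error code printed after "Oops:". It assumes the standard x86 page-fault error-code layout and does not replace a real disassembler such as objdump.

```python
import re

# "Code:" line copied from the oops above; <..> marks the byte at the faulting RIP.
CODE_LINE = (
    "86 51 06 e1 48 83 c4 18 c3 48 83 ec 18 48 8b 15 f1 80 00 00 "
    "65 48 8b 04 25 28 00 00 00 48 89 44 24 10 31 c0 48 85 d2 74 04 "
    "b0 01 <89> 02 48 8b 44 24 10 65 48 33 04 25 28 00 00 00 74 05 "
    "e8 48 51"
)


def split_code_line(line):
    """Return (bytes before the faulting RIP, bytes from the faulting RIP on)."""
    before, after = [], []
    seen_marker = False
    for tok in line.split():
        m = re.fullmatch(r"<([0-9a-fA-F]{2})>", tok)
        if m:
            seen_marker = True
            tok = m.group(1)
        (after if seen_marker else before).append(int(tok, 16))
    if not seen_marker:
        raise ValueError("no <xx> marker found in Code: line")
    return bytes(before), bytes(after)


def decode_pf_error(code):
    """Decode the low bits of an x86 page-fault error code."""
    return {
        "protection_violation": bool(code & 0x01),  # False: page not present
        "write": bool(code & 0x02),                 # True: fault on a write
        "user_mode": bool(code & 0x04),             # False: kernel-mode access
        "reserved_bit": bool(code & 0x08),
        "instruction_fetch": bool(code & 0x10),
    }


before, after = split_code_line(CODE_LINE)
# The symbol offset is +0x22, so the function starts 0x22 bytes before the fault.
function_prefix = before[-0x22:]
print("function starts with:", function_prefix[:4].hex(" "))
print("faulting bytes:", after[:2].hex(" "))
print("Oops 0002:", decode_pf_error(0x0002))
```

Read this way, the faulting bytes `89 02` encode `mov %eax,(%rdx)`, a 32-bit store through RDX. RDX holds ffffc900000c6000, the same value as CR2, and RAX is 1 (loaded by the preceding `b0 01`). Error code 0002 means a kernel-mode write to a page that is not present, which agrees with "PTE 0" in the oops. So the driver appears to be writing 1 through a pointer that was loaded from a global and has no valid mapping behind it.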
Konrad Rzeszutek Wilk
2010-Jun-15 23:49 UTC
Re: [Xen-devel] xen.git pvops kernel bug: i915 bug after memory upgrade
On Wed, Feb 10, 2010 at 03:57:38PM -0500, Konrad Rzeszutek Wilk wrote:
> On Wed, Feb 10, 2010 at 10:44:59PM +1100, Yasir Assam wrote:
> > I upgraded my RAM from 2GB to 8GB today, and I'm no longer able to run
> > X. My guess is this is a bug in the xen.git kernel (the dom0 kernel) in
> > the i915 module. Other kernels (vanilla 2.6.32.x) work fine.
> >
> > I have attached the full dmesg log. The problem is completely
> > reproducible on my machine.
>
> 1) Can you give me the hardware specs?

Note: Per personal conversation, it was an Asus P7H55-M Pro, which has the Intel H55 chipset or I965.

> .. snip ..
> > [ 23.261678] BUG: unable to handle kernel paging request at ffffc900000c6000
> > [ 23.261685] IP: [<ffffffffa0015226>] intel_i915_chipset_flush+0x22/0x3e [intel_agp]
> > .. snip (the rest of the oops is quoted in full in the previous message) ..

In the latest PV-OPS kernel (and in 2.6.31.x) nothing stands out as an obvious reason why this would happen. There are two things that I think might be at fault here:

1) CONFIG_DMAR was not set and you ended up using the non-PCI DMA mapping of pages.
2) We mapped the wrong address.

I am perplexed here.
But we can narrow this down.

1) Apply the attached patch.
2) With a working setup (perhaps booting the PV-OPS kernel without Xen) but still with 8GB of RAM, run 'lspci -vvv' and also 'dmesg'.
3) Get a PCI or PCI-e serial card. I've been using the Rosewill RC-301 and RC-301EU with success. I had to figure out the I/O ports from 'lspci' and put this in my Xen command line: "com1=115200,8n1,0xd800,0". The 0xd800 is what lspci told me was the first I/O port of that serial card.
4) Also add to your Xen command line: "console=com1,vga guest_loglvl=all"
5) On your Linux kernel command line add: "initcall_debug debug"
6) Compile the kernel and reboot. Make sure to have CONFIG_DMAR=y set.
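For steps 3 and 4, the I/O port can be pulled out of the lspci output mechanically. The following is only a sketch under stated assumptions: the sample lspci text, the device name in it, and the helper names `first_io_port` and `xen_serial_args` are hypothetical; only the "I/O ports at ..." region line format and the two Xen options are taken from lspci and from the steps above.

```python
import re

# Hypothetical 'lspci -vvv' excerpt for a PCI serial card. The device line is
# made up for illustration; 0xd800 mirrors the value mentioned in step 3.
SAMPLE_LSPCI = """\
05:01.0 Serial controller: Example 16550 UART (hypothetical)
\tControl: I/O+ Mem- BusMaster-
\tRegion 0: I/O ports at d800 [size=8]
\tRegion 1: I/O ports at d400 [size=8]
"""


def first_io_port(lspci_text, device_keyword="Serial controller"):
    """Return the first I/O port of the first matching device, or None."""
    in_device = False
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            # An unindented line starts a new device record.
            in_device = device_keyword in line
            continue
        if in_device:
            m = re.search(r"I/O ports at ([0-9a-fA-F]+)", line)
            if m:
                return int(m.group(1), 16)
    return None


def xen_serial_args(io_port, baud=115200):
    """Build the Xen command-line options from steps 3 and 4."""
    return f"com1={baud},8n1,{io_port:#x},0 console=com1,vga guest_loglvl=all"


port = first_io_port(SAMPLE_LSPCI)
if port is not None:
    print(xen_serial_args(port))
```

On a real machine the text would come from running 'lspci -vvv' (for example via `subprocess.run`), and the device keyword may need adjusting to match how the card identifies itself.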