Martin Wilck
2010-Oct-08 20:04 UTC
[Xen-devel] system freeze when processor.ko is loaded during boot
Hello,

I see a system freeze with xen-unstable and the jeremy/xen/2.6.32-next kernel when the processor ACPI module is loaded during boot on my Samsung X50 notebook. I first saw this problem with Xen under OpenSUSE 11.3 (https://bugzilla.novell.com/show_bug.cgi?id=623680) with the OpenSUSE hypervisor and kernel.

So far, I have found out the following:

- no problem with the OpenSUSE default kernel
- no problem with Xen if I don't load processor.ko
- no problem with the "max_cstate=1" hypervisor parameter (but max_cstate=2 freezes)

- When the freeze happens, I usually see some unrelated messages about USB or SATA device initialization. No keys work at all, and sometimes the disk LED stays on without the disk making any noise. I haven't been able to see any hypervisor messages (even with "vga=keep"; unfortunately the system has no serial port). The Xen "watchdog" parameter causes the machine to reboot without any messages.

- The weirdest thing is that once the system has booted, I seem to be able to load processor.ko without problems (at least the system doesn't freeze, even if I let it idle for a long time or do normal work under X). With the SUSE hypervisor + kernel I was even able to verify that cpuidle was working and C3 was being used (with xen-unstable, I couldn't get xenpm to work so far).

- However, if processor.ko is loaded during boot, the system always freezes (100% reproducible). I tried loading it in the initrd (SUSE default), early after mounting the root FS (autoloaded by udev, I think), and even compiling it into the kernel proper (tried that only with the jeremy kernel). In all cases, the system freezes hard.

Probably the freeze occurs when the CPU enters a deep C-state while some HW initialization is going on at boot time.

I am running out of ideas for how to debug this further; any hints would be welcome.

The info below was taken with the normal OpenSUSE kernel.
Thanks for any hints
Martin

martin@athene:~$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 13
model name : Intel(R) Pentium(R) M processor 2.13GHz
stepping : 8
cpu MHz : 800.000
cache size : 2048 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov clflush dts acpi mmx fxsr sse sse2 ss tm pbe up bts est tm2
bogomips : 1596.03
clflush size : 64
cache_alignment : 64
address sizes : 32 bits physical, 32 bits virtual
power management:

martin@athene:~$ cat /proc/acpi/processor/CPU0/info
processor id: 0
acpi id: 0
bus mastering control: yes
power management: yes
throttling control: yes
limit interface: yes

martin@athene:~$ cat /proc/acpi/processor/CPU0/power
active state: C0
max_cstate: C8
maximum allowed latency: 2000000000 usec
states:
C1: type[C1] promotion[--] demotion[--] latency[001] usage[00001563] duration[00000000000000000000]
C2: type[C2] promotion[--] demotion[--] latency[001] usage[00308057] duration[00000000002512350868]
C3: type[C3] promotion[--] demotion[--] latency[085] usage[00126109] duration[00000000000772728489]
C4: type[C3] promotion[--] demotion[--] latency[185] usage[00537539] duration[00000000008753288513]

[ 3.345147] processor_driver-0392 [00] processor_get_info : Bus mastering arbitration control present
[ 3.345187] processor_driver-0474 [00] processor_get_info : Processor [0:0]
[ 3.345215] processor_throttling-1141 [00] processor_get_throttli: pblk_address[0x00001010] duty_offset[3] duty_width[1]
[ 3.345233] processor_throttling-1187 [00] processor_get_throttli: Found 2 throttling states
[ 3.345246] processor_throttling-0657 [00] processor_get_throttli: Throttling state is T0 (1000% throttling applied)
[ 3.345585] processor_idle-0526 [00] processor_get_power_in: Found 4 power states
[ 3.345931] processor_throttling-0218 [00] processor_throttling_i: Assume no T-state coordination
[ 82.129875] processor_perflib-0349 [00] processor_get_performa: Found 6 performance states
[ 82.129889] processor_perflib-0367 [00] processor_get_performa: Extracting state 0
[ 82.129901] processor_perflib-0385 [00] processor_get_performa: State [0]: core_frequency[2133] power[27000] transition_latency[10] bus_master_latency[10] control[0x1029] status[0x1029]
[ 82.129918] processor_perflib-0367 [00] processor_get_performa: Extracting state 1
[ 82.129928] processor_perflib-0385 [00] processor_get_performa: State [1]: core_frequency[1867] power[24000] transition_latency[10] bus_master_latency[10] control[0xe25] status[0xe25]
[ 82.129945] processor_perflib-0367 [00] processor_get_performa: Extracting state 2
[ 82.129955] processor_perflib-0385 [00] processor_get_performa: State [2]: core_frequency[1600] power[21000] transition_latency[10] bus_master_latency[10] control[0xc20] status[0xc20]
[ 82.129972] processor_perflib-0367 [00] processor_get_performa: Extracting state 3
[ 82.129982] processor_perflib-0385 [00] processor_get_performa: State [3]: core_frequency[1333] power[19000] transition_latency[10] bus_master_latency[10] control[0xa1c] status[0xa1c]
[ 82.129998] processor_perflib-0367 [00] processor_get_performa: Extracting state 4
[ 82.130009] processor_perflib-0385 [00] processor_get_performa: State [4]: core_frequency[1067] power[16000] transition_latency[10] bus_master_latency[10] control[0x817] status[0x817]
[ 82.130025] processor_perflib-0367 [00] processor_get_performa: Extracting state 5
[ 82.130035] processor_perflib-0385 [00] processor_get_performa: State [5]: core_frequency[800] power[13000] transition_latency[10] bus_master_latency[10] control[0x612] status[0x612]
[ 82.130173] processor_perflib-0486 [00] processor_notify_smm : No SMI port or pstate_control

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
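For readers who want the performance states from the dmesg excerpt above in tabular form, the following is a minimal sketch, not part of the original mail. It assumes the "State [n]: core_frequency[...] power[...]" line layout shown above; the function name parse_pstates and the SAMPLE_DMESG variable are hypothetical. ACPI reports core_frequency in MHz and power in mW.

```python
import re

# Matches the "State [n]: core_frequency[...] power[...]" fragments printed by
# the ACPI processor driver. \s+ tolerates the line wrapping that mail clients
# introduce in the middle of a log line.
PSTATE_RE = re.compile(
    r"State\s+\[(\d+)\]:\s+core_frequency\[(\d+)\]\s+power\[(\d+)\]"
)


def parse_pstates(dmesg_text):
    """Return a list of (index, frequency_mhz, power_mw) tuples."""
    return [
        (int(index), int(freq), int(power))
        for index, freq, power in PSTATE_RE.findall(dmesg_text)
    ]


# Two lines copied from the excerpt above (hypothetical variable name).
SAMPLE_DMESG = """\
[ 82.129901] processor_perflib-0385 [00] processor_get_performa: State [0]: core_frequency[2133] power[27000] transition_latency[10] bus_master_latency[10] control[0x1029] status[0x1029]
[ 82.130035] processor_perflib-0385 [00] processor_get_performa: State [5]: core_frequency[800] power[13000] transition_latency[10] bus_master_latency[10] control[0x612] status[0x612]
"""

if __name__ == "__main__":
    for index, freq, power in parse_pstates(SAMPLE_DMESG):
        print(f"P{index}: {freq} MHz, {power / 1000:.1f} W")
```

Feeding the full dmesg text to parse_pstates should yield one tuple per performance state, which makes it easy to compare logs from different kernels.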
Jan Beulich
2010-Oct-20 06:57 UTC
[Xen-devel] Re: system freeze when processor.ko is loaded during boot
(Stub reply, to make the mail visible to the list - apparently the original is still sitting in the to-be-approved queue.)
Jiang, Yunhong
2010-Nov-04 14:57 UTC
RE: [Xen-devel] Re: system freeze when processor.ko is loaded during boot
Martin, did you try the latest origin/xen/stable-2.6.32.x branch to see if anything is different? There was a processor-related bug in the .32 branch before, which should have been fixed. And can you share the boot log and the features supported on this platform?

I also remember a report about Xen freezing on some platforms with HPET broadcast enabled (http://lists.xensource.com/archives/html/xen-users/2010-09/msg00370.html), but I am not sure whether it is the same issue as yours (I checked that issue when it was reported and made no progress).

Thanks
--jyh
Martin Wilck
2010-Nov-23 23:16 UTC
Re: [Xen-devel] Re: system freeze when processor.ko is loaded during boot
Hello Yunhong,

thanks for your reply, I am sorry it took me so long to respond.

> Martin, did you try latest origin/xen/stable-2.6.32.x branch to see if anything different?

I ran 2.6.32.25 from Jeremy's branch, no change - the system still freezes if processor.ko is loaded at boot time.

xen-unstable changeset: 22417:c0c1f5f0745e
xen/stable-2.6.32.x commit: 481bd8e6b8dafed2ea445e8cde2abbbb95b49ec1

> There was a bug in .32 branch on processor before, which should have been fixed. And can you share the boot log and the feature supported in this platform?

dmesg.txt (from non-xen kernel) is attached.

$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 13
model name : Intel(R) Pentium(R) M processor 2.13GHz
stepping : 8
cpu MHz : 800.000
cache size : 2048 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov clflush dts acpi mmx fxsr sse sse2 ss tm pbe up bts est tm2
bogomips : 1596.16
clflush size : 64
cache_alignment : 64
address sizes : 32 bits physical, 32 bits virtual
power management:

$ cat /proc/acpi/processor/CPU0/power
active state: C0
max_cstate: C8
maximum allowed latency: 2000000000 usec
states:
C1: type[C1] promotion[--] demotion[--] latency[001] usage[00001533] duration[00000000000000000000]
C2: type[C2] promotion[--] demotion[--] latency[001] usage[00294108] duration[00000000004868085138]
C3: type[C3] promotion[--] demotion[--] latency[085] usage[00470473] duration[00000000008899295934]

$ cat /proc/acpi/processor/CPU0/info
processor id: 0
acpi id: 0
bus mastering control: yes
power management: yes
throttling control: yes
limit interface: yes

> And I remember report about Xen freeze on some platform with HPET broadcast enabled http://lists.xensource.com/archives/html/xen-users/2010-09/msg00370.html , but not sure if it's the same issue with you (I checked that issue when it was reported and made no progress).

I played around with the hpet_broadcast Xen parameter, but it didn't have an effect.

Regards
Martin
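To see at a glance how idle time was split between C-states in the /proc/acpi/processor/CPU0/power output above, here is a minimal sketch, not part of the original mail. It assumes the "Cn: type[...] ... usage[...] duration[...]" layout shown above; the names cstate_shares and SAMPLE_POWER are hypothetical. It makes no assumption about the unit of the duration counter and only reports each state's fraction of the recorded total.

```python
import re

# One match per C-state entry. DOTALL lets ".*?" span line breaks, since the
# latency/usage/duration fields are often wrapped onto a second line in mails.
CSTATE_RE = re.compile(
    r"(C\d+):\s+type\[C\d+\].*?usage\[(\d+)\]\s+duration\[(\d+)\]",
    re.DOTALL,
)


def cstate_shares(power_text):
    """Map each C-state name to its fraction of the total recorded duration."""
    rows = [(name, int(duration)) for name, _usage, duration in CSTATE_RE.findall(power_text)]
    total = sum(duration for _name, duration in rows)
    # If no duration was recorded at all, report 0.0 rather than dividing by zero.
    return {name: (duration / total if total else 0.0) for name, duration in rows}


# Copied from the message above (hypothetical variable name).
SAMPLE_POWER = """\
active state: C0
max_cstate: C8
maximum allowed latency: 2000000000 usec
states:
C1: type[C1] promotion[--] demotion[--] latency[001] usage[00001533] duration[00000000000000000000]
C2: type[C2] promotion[--] demotion[--] latency[001] usage[00294108] duration[00000000004868085138]
C3: type[C3] promotion[--] demotion[--] latency[085] usage[00470473] duration[00000000008899295934]
"""

if __name__ == "__main__":
    for name, share in cstate_shares(SAMPLE_POWER).items():
        print(f"{name}: {share:.1%} of recorded idle duration")
```

The C1 entry reports a duration of zero in this output, so its share comes out as zero even though its usage counter is non-zero.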
Liu, Jinsong
2011-Jan-10 15:37 UTC
RE: [Xen-devel] Re: system freeze when processor.ko is loaded during boot
Martin,

I'd like to reproduce the bug on my desktop and have a look at it. I'm setting up a debug environment now, and need some environment/config info from your side:

1. xen-unstable changeset
2. Jeremy pvops kernel version / git commit / .config file
3. ioemu git commit
4. grub.conf file
5. processor.ko related config used to load the module at boot time (for the Jeremy pvops kernel, not SUSE)
6. xen/kernel boot serial log

BTW, is Xen still alive when the dom0 kernel freezes? If yes, dump logs from debug keys such as '0' / 'c' / 'd' / 'q' would be very welcome.

Thanks,
Jinsong
Liu, Jinsong
2011-Jan-11 14:29 UTC
RE: [Xen-devel] Re: system freeze when processor.ko is loaded during boot
Martin,

Any update? I need to reproduce the bug, so your environment/config info is really needed.

BTW, have you tried some other platform besides the Samsung X50 laptop? Is it a machine-specific issue?

Thanks,
Jinsong
power[13000] transition_latency[10] >>> bus_master_latency[10] control[0x612] status[0x612] [ 82.130173] >>> processor_perflib-0486 [00] processor_notify_smm : No SMI port or >>> pstate_control >> >> >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xensource.com >> http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Martin Wilck
2011-Jan-13 21:29 UTC
Re: [Xen-devel] Re: system freeze when processor.ko is loaded during boot
Dear Jinsong,

> I'd like to reproduce the bug at my desktop and have a look at it.
> I'm setting up debug environment now, and need some environment/config info at your side:

I sent some info in November, but it may have got lost on its way to the mailing list.

> 1. xen-unstable changeset

22417:c0c1f5f0745e

> 2. Jeremy pvops kernel version/ git commit/ .config file

xen/stable-2.6.32.x commit: 481bd8e6b8dafed2ea445e8cde2abbbb95b49ec1
config file attached

> 3. ioemu git commit

60766b459c41e429a4b2405124b42512ea362984

But I am certain none of these really matters much - I got the same result with all kernels and hypervisors I have tried since upgrading to OpenSUSE 11.3. My previous Xen hypervisor had no cpuidle or cpufreq support.

> 4. grub.conf file

title Xen4.1
root (hd0,5)
kernel /xen-4.1.gz vga=mode-0,keep cpufreq=xen cpuidle loglvl=all
module /vmlinuz-2.6.32.25 root=/dev/mapper/vg-os11.2 vga=0 nomodeset debug sysrq=9 S
module /initrd-2.6.32.25

> 5. processor.ko related config to load the modules at booting time (of Jeremy pvops kernel, not SUSE)

I am not sure what you are referring to here. I have tried different things. The usual SUSE way is to load processor.ko in the initrd (INITRD_MODULES="ahci processor" in /etc/sysconfig/kernel).

> 6. xen/kernel booting serial log

I'd love to provide it, but I have no serial port. I am considering buying a docking station to be able to get one.

> BTW, is xen still alive when dom0 kernel freeze?

I don't think so.

> If yes, some dump log like key '0'/ 'c'/ 'd'/ 'q' is highly welcomed.

Please give me more hints - just hit '0', no Alt-SysRq or anything?

> BTW, have you tried at some other platform beside Samsung XS50 laptop? is it a machine specific issue?

It could be. No one else has reported anything similar to SUSE. I have only seen it on this laptop.

Martin
Liu, Jinsong
2011-Jan-14 07:53 UTC
RE: [Xen-devel] Re: system freeze when processor.ko is loaded during boot
Martin,

I cannot reproduce the bug on my desktop with the environment/config you sent me. I tried inserting processor.ko both during boot and after boot, and passing max_cstate=2 to processor.ko through /etc/modprobe.conf; dom0 booted successfully in each case.

However, some minor suggestions for you :)
1. For the Jeremy kernel .config, set 'CONFIG_SYSFS_DEPRECATED=y'. Without it, dom0 will panic during boot.
2. Would you please try some other platform? Maybe it's a machine-specific issue.

I think in your case Xen is still alive. With a serial console, typing 'ctrl+a' 3 times will switch to the Xen console, and under the Xen console, typing the keys '0'/'c'/'d'/'q' will produce some Xen dump log (and prove Xen is alive).

Thanks,
Jinsong
Liu, Jinsong
2011-Mar-18 03:40 UTC
RE: [Xen-devel] Re: system freeze when processor.ko is loaded during boot
Hi Martin,

Any update on the bug? We are still keeping an eye on it :)

Thanks,
Jinsong
Martin Wilck
2011-Mar-28 22:31 UTC
Re: [Xen-devel] Re: system freeze when processor.ko is loaded during boot
Hi Liu,

> Any update? I need reproduce the bug so your environment/config is really needed.

I think I have some news, finally. I got a docking station and a serial cable and was finally able to capture some more debug output. You were right, the hypervisor seems to be still alive (but timers seem to be stalled). When the problem occurs, I see

(XEN) Platform timer appears to have unexpectedly wrapped 10 or more times.

This has been reported elsewhere, but I haven't seen anyone reporting a total freeze in this case, like me.

I am attaching a serial capture. When the problem occurred, I switched the serial port to the hypervisor and ran some diagnostics. Hints appreciated. This was done with the hypervisor and kernel of OpenSUSE 11.4.

xen-4.0.2_02-4.7.1.i586
kernel-xen-2.6.37.1-1.2.2.i586

> BTW, have you tried at some other platform beside Samsung XS50 laptop? is it a machine specific issue?

I haven't seen it on any other machine. But I don't have a representative set of machines, either.

Regards
Martin
Martin Wilck
2011-Mar-28 22:48 UTC
Re: [Xen-devel] Re: system freeze when processor.ko is loaded during boot
Here is one more capture. It shows that (unfortunately) clocksource=pit doesn't help here, and that the Xen watchdog fires if I configure it (only the reboot doesn't work, and I can see the output only because I've been using the serial console).

Martin
Haitao Shan
2011-Mar-31 06:23 UTC
Re: [Xen-devel] Re: system freeze when processor.ko is loaded during boot
Hi Martin,

I have checked your dump info obtained via the debug keys. I saw that the EIPs remained the same between two successive dumps. However, without the symbols I could not identify which code the kernel was hanging in. Could you find this information by disassembling the kernel binaries (with symbols)? Or could you please repeat your test using an upstream Xen and kernel, so that I can compile the same kernel that you would be using?

And I see your CPU is a very old model: UP, without 64-bit support and no PAE, right?

Shan Haitao
Jan Beulich
2011-Mar-31 09:52 UTC
Re: [Xen-devel] Re: system freeze when processor.ko is loaded during boot
>>> On 29.03.11 at 00:31, Martin Wilck <mwilck@arcor.de> wrote:
> (XEN) Platform timer appears to have unexpectedly wrapped 10 or more times.

The native kernel on here says

"Marking TSC unstable due to TSC halts in idle"

Knowing (from the bug entry we have for this) that when limiting C-state use to C1 the machine boots, I suspect that there's some problem with recovering the TSC after exiting C2 or C3.

Knowing whether the problem is still present with 4.1 would of course be useful.

Jan
Jan Beulich
2011-Mar-31 11:48 UTC
Re: [Xen-devel] Re: system freeze when processor.ko is loaded during boot
>>> On 29.03.11 at 00:48, Martin Wilck <mwilck@arcor.de> wrote:
> Here is one more capture. It shows that (unfortunately) clocksource=pit
> doesn't help here, and that the xen watchdog hits if I configure it
> (just that the reboot doesn't work, and I can only see the output since
> I've been using the serial console).

The stack evaluates to

logarithmic_accumulation
update_wall_time
do_timer(0x898d7)
tick_do_update_jiffies64
tick_sched_timer
__run_hrtimer
hrtimer_interrupt
timer_interrupt

(matches the previously sent one, just that there the tick count passed to do_timer() is "only" 0x179ab).

So the kernel, afaict, is busy recovering from the time jump in Xen.

It is clearly also a bad sign that the NMI hit while Dom0 was executing, as that guarantees interrupts aren't disabled (and hence timer interrupts can occur, and timers would not be prevented from running - presumably the time jump suppressed the invocation of, among others, the NMI timer).

Jan
Liu, Jinsong
2011-Apr-01 02:26 UTC
RE: [Xen-devel] Re: system freeze when processor.ko is loaded during boot
Yeah, sounds reasonable :)

CC'ing Haitao, who will continue to take care of the bug.

Thanks,
Jinsong
Martin Wilck
2011-Apr-03 13:46 UTC
Re: [Xen-devel] Re: system freeze when processor.ko is loaded during boot
On 03/31/2011 08:23 AM, Haitao Shan wrote:

> I have checked your dump info via debug key. I saw that the EIPs
> remained the same between two successive dump. However, without the
> symbols I could not identify which code kernel was hanging on. Is it
> possible that you can find this information by disassembling the kernel
> binaries (with symbols).

I think Jan did just that already, I am attaching his analysis again.

> Or could you please repeat your test using an
> upstreaming Xen and kernel so that I could compile a same kernel just as
> you would be using?

I can do that, but it needs some time.

> And I see you CPU is a very old model, UP without 64 bit support and no
> PAE? Right?

It has PAE, but it is UP and has neither 64-bit support nor VT-x.

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 13
model name : Intel(R) Pentium(R) M processor 2.13GHz
stepping : 8
cpu MHz : 800.000
cache size : 2048 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov clflush dts acpi mmx fxsr sse sse2 ss tm pbe up bts est tm2
bogomips : 1596.45
clflush size : 64
cache_alignment : 64
address sizes : 32 bits physical, 32 bits virtual

Martin
Jan Beulich
2011-Apr-04 09:22 UTC
Re: [Xen-devel] Re: system freeze when processor.ko is loaded during boot
>>> On 03.04.11 at 15:46, Martin Wilck <mwilck@arcor.de> wrote:
> processor : 0
> vendor_id : GenuineIntel
> cpu family : 6
> model : 13

Quoting SDM Vol 3a Section 16.11:

"When WRMSR is used to write the time-stamp counter on processors before family [0FH], models [03H, 04H]: only the low-order 32-bits of the time-stamp counter can be written (the high-order 32 bits are cleared to 0). For family [0FH], models [03H, 04H, 06H]; for family [06H], model [0EH, 0FH]; for family [06H], DisplayModel [17H, 1AH, 1CH, 1DH]: all 64 bits are writable."

Quite obviously nothing good can result if we write the TSC on a CPU that zeroes the upper 32 bits. Hopefully, none of the affected CPUs has X86_FEATURE_CONSTANT_TSC, since otherwise time_calibration_tsc_rendezvous() could get used, which also uses write_tsc().

Haitao, while it is quite clear that with the current implementation we just can't use C states above C1 on CPUs that may halt the TSC in C2 or C3 *and* that don't allow writing the full TSC, this family/model based determination clearly isn't nice (and since it is a white list, it can't possibly be complete). An alternative would seem to be to probe for how TSC writes behave (thus at once covering possible other vendors' CPUs that may have similar shortcomings). That of course would need to be done early, so that resetting the upper bits to zero wouldn't have any adverse effect. What do you think?

Jan
Jan Beulich
2011-Apr-06 09:58 UTC
Re: [Xen-devel] Re: system freeze when processor.ko is loaded during boot
>>> On 04.04.11 at 11:22, "Jan Beulich" <JBeulich@novell.com> wrote:
> Haitao, while it is quite clear that with the current
> implementation we just can't use C states above C1 on CPUs
> that may halt the TSC in C2 or C3 *and* that don't allow
> writing the full TSC, this family/model based determination
> clearly isn't nice (and since it is a white list, it can't possibly be
> complete). An alternative would seem to be to probe for how
> TSC writes behave (thus at once covering eventual other
> vendors' CPUs that may have similar shortcomings). That of
> course would need to be done early, so that resetting the
> upper bits to zero wouldn't have any adverse effect. What
> do you think?

The probing itself seems to work fine. I'm confused by something else though: synchronize_tsc_{master,slave}() execute their loops (at boot or during hotplug) on any CPU that doesn't have X86_FEATURE_TSC_RELIABLE, including such where TSC writes don't really work (luckily I still haven't thrown out one that is affected by this). What is the point of doing this synchronization if we can happily live with it actually not working (Xen runs fine on that box afaict)? c/s 21468:26c2922da53c is also not very verbose about why this got (re-)added...

Should the body perhaps really only be run for X86_FEATURE_CONSTANT_TSC but !X86_FEATURE_NONSTOP_TSC CPUs?

Jan
Keir Fraser
2011-Apr-12 13:59 UTC
Re: [Xen-devel] Re: system freeze when processor.ko is loaded during boot
On 04/04/2011 10:22, "Jan Beulich" <JBeulich@novell.com> wrote:

> Haitao, while it is quite clear that with the current
> implementation we just can't use C states above C1 on CPUs
> that may halt the TSC in C2 or C3 *and* that don't allow
> writing the full TSC, this family/model based determination
> clearly isn't nice (and since it is a white list, it can't possibly be
> complete). An alternative would seem to be to probe for how
> TSC writes behave (thus at once covering eventual other
> vendors' CPUs that may have similar shortcomings). That of
> course would need to be done early, so that resetting the
> upper bits to zero wouldn't have any adverse effect. What
> do you think?

We should do an early run-time test of this from the BSP and then, on failure, avoid all further potential uses of write_tsc() in an appropriate way (e.g., bail early in cstate_restore_tsc(), synchronize_tsc_*(), and avoid use of time_calibration_tsc_rendezvous()).

 -- Keir
Jan Beulich
2011-Apr-12 14:12 UTC
Re: [Xen-devel] Re: system freeze when processor.ko is loaded during boot
>>> On 12.04.11 at 15:59, Keir Fraser <keir@xen.org> wrote:
> We should do early run-time test of this from the BSP then, on failure,
> avoid all further potential uses of write_tsc() in an appropriate way (e.g.,
> bail early in cstate_restore_tsc(), synchronize_tsc_*(), and avoid use of
> time_calibration_tsc_rendezvous()).

Okay, that matches what I have so far (just need to implement the mechanism to suppress synchronize_tsc_*() then).

Thanks, Jan