Ed Smith
2006-Sep-26 14:04 UTC
[Xen-devel] Testing status of HVM (Intel VT) on 64bit XEN unstable c/s 11616
Summary:
Changeset 11616
- NEW: 32bit SMP HVM Guests hang on boot:
  "Uncompressing Linux... OK booting the kernel" (failure.6)
- NEW: 64bit UP and SMP guests crash domain on boot:
  domain_crash_sync called from vmx.c:2268 (failure.5)

Test Configuration:
Dell Precision WorkStation 380, Dual Core, 2GB, 3 SATA (Intel VT)
64bit XEN Hypervisor on a RHEL4U2 64bit root (/dev/sda)
32bit fully virtualized (HVM) guest RHEL4U2 256MB (/dev/sdb)
      pae=1(smp) pae=0(up), acpi=1, apic=1
      kernargs noapic
64bit fully virtualized (HVM) guest RHEL4U2 256MB (/dev/sdc)
      pae=1, acpi=1, apic=1
      kernargs noapic

Boot Tests:
Boot a fully virtualized (HVM) guest to the login prompt
Results are marked Pass|Fail where (n) points to a failure description

Regression Tests:
852 tests (851 ltp tests and one 30 minute user load test)
Tests are marked #Pass/#Fail where (n) points to a failure description

XEN 64bit 2 CPU Hypervisor (booted smp):
------------------------------------------------------------------------
| XEN      | Guest Kernel (SMP kernels booted with 2 CPUs)             |
| Changeset|-----------------------------------------------------------|
|          | 32bit UP     | 32bit SMP    | 64bit UP     | 64bit SMP    |
|          |--------------|--------------|--------------|--------------|
|          | Boot | Test  | Boot | Test  | Boot | Test  | Boot | Test  |
|----------|------|-------|------|-------|------|-------|------|-------|
| 11616    | Pass |       | Fail |       | Fail |       | Fail |       |
|          |      |       | (6)  |       | (5)  |       | (5)  |       |
|----------|------|-------|------|-------|------|-------|------|-------|
| 11600    | Pass |       | Pass | 851/1 | Pass |       | Pass | 852/0 |
|          |      |       |      | (1,4) |      |       |      |       |
|----------|------|-------|------|-------|------|-------|------|-------|
| 11486    | Pass |       | Pass | 851/1 | Pass |       | Pass | 852/0 |
|          |      |       |      |(1,2,3)|      |       |      |       |
|----------|------|-------|------|-------|------|-------|------|-------|
| 11483    | Pass |       | Pass | 850/2 | Pass |       | Pass | 852/0 |
|          |      |       |      | (1)   |      |       |      |       |
|----------|------|-------|------|-------|------|-------|------|-------|
| 11470    | Pass |       | Pass | 849/3 | Pass |       | Pass | 852/0 |
|          |      |       |      | (1)   |      |       |      |       |
------------------------------------------------------------------------

Multiple Guest Boot Test
Test is a 30 minute user load on both Guests

XEN 64bit 2 CPU Hypervisor (booted smp):
----------------------------------------------
| XEN      | Guest Kernel                    |
| Changeset|---------------------------------|
|          | 32bit 1CPU UP  | 32bit 2CPU SMP |
|          | 64bit 1CPU UP  | 64bit 2CPU SMP |
|          |----------------|----------------|
|          | Boot | Test    | Boot | Test    |
|----------|------|---------|------|---------|
| 11616    | Fail |         | Fail | Fail    |
|          | (5)  |         | (5)  |         |
|----------|------|---------|------|---------|
| 11600    | Pass | Pass    | Pass | Pass    |
|          |      |         |      |         |
|----------|------|---------|------|---------|
| 11486    | Pass | Pass    | Pass | Pass    |
|          |      |         |      |         |
|----------|------|---------|------|---------|
| 11483    | Pass | Pass    | Pass | Pass    |
|          |      |         |      |         |
|----------|------|---------|------|---------|
| 11470    | Pass | Pass    | Pass | Pass    |
|          |      |         |      |         |
----------------------------------------------

Failures:
 1. BUG 666: 32bit guests fail ltp gettimeofday02 and
    nanosleep01/02 with clock problems
 2. XEN crash on an xm destroy of a dead guest
    32bit SMP HVM guest:
    Fatal page fault - put_page_from_l1e+0x85/0x140
 3. XEN crash running ltp "mtest01 -p80" on 32bit SMP HVM guest:
    BUG at multi.c:2864 from sh_page_fault__shadow_3_guest_3
 4. XEN crash running ltp "mtest01 -p80" on 32bit SMP HVM guest:
    BUG at multi.c:3958 from sh_clear_shadow_entry__shadow_3_guest_3
 5. 64bit UP and SMP guests crash domain on boot
    domain_crash_sync called from vmx.c:2268 (failure.5)
 6. NEW: 32bit SMP HVM Guests hang on boot:
    "Uncompressing Linux... OK booting the kernel" (failure.6)

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
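The "#Pass/#Fail" cells in the regression table can be sanity-checked against the stated total of 852 tests. The following is a minimal sketch of that arithmetic; the cell values are copied from the report, while the names `RESULTS`, `parse_cell` and `summarize` are made up for illustration and are not part of any test harness mentioned in this thread.

```python
import re

# Test cells copied from the single-guest table in the report, keyed by
# changeset. Only the SMP guests have a Test result in that table.
RESULTS = {
    11600: {"32bit SMP": "851/1", "64bit SMP": "852/0"},
    11486: {"32bit SMP": "851/1", "64bit SMP": "852/0"},
    11483: {"32bit SMP": "850/2", "64bit SMP": "852/0"},
    11470: {"32bit SMP": "849/3", "64bit SMP": "852/0"},
}

TOTAL_TESTS = 852  # 851 ltp tests plus one 30 minute user load test


def parse_cell(cell):
    """Split a '#Pass/#Fail' cell into a (passed, failed) pair of ints."""
    match = re.fullmatch(r"(\d+)/(\d+)", cell.strip())
    if match is None:
        raise ValueError("not a #Pass/#Fail cell: %r" % cell)
    return int(match.group(1)), int(match.group(2))


def summarize(results):
    """Return (changeset, guest, passed, failed, adds_up) rows, newest first."""
    rows = []
    for changeset in sorted(results, reverse=True):
        for guest, cell in results[changeset].items():
            passed, failed = parse_cell(cell)
            rows.append((changeset, guest, passed, failed,
                         passed + failed == TOTAL_TESTS))
    return rows


if __name__ == "__main__":
    for row in summarize(RESULTS):
        print(row)
```

Every cell in the report adds up to 852, so the failure counts (1, 1, 2, 3 for the 32bit SMP guest) can be read directly as the number of failing tests in each run.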
Steven Hand
2006-Sep-26 14:42 UTC
Re: [Xen-devel] Testing status of HVM (Intel VT) on 64bit XEN unstable c/s 11616
Hi Ed,

any chance you can test with debug=y ? The back-traces aren't
really very useful otherwise.

It'd also be good to know what the h/w platform is - AMD or Intel.

> Summary:
> Changeset 11616
> - NEW: 32bit SMP HVM Guests hang on boot:
>   "Uncompressing Linux... OK booting the kernel" (failure.6)

By '32bit' do you mean PAE? What guest is this? What's the guest
config? I cannot reproduce this myself on 11616...

> - NEW: 64bit UP and SMP guests crash domain on boot:
>   domain_crash_sync called from vmx.c:2268 (failure.5)

This looks like the same bug to me, also cannot repro.
Once more, guest + guest config info would be useful.

> 2. XEN crash on an xm destroy of a dead guest
>    32bit SMP HVM guest:
>    Fatal page fault - put_page_from_l1e+0x85/0x140

Hmm - when you say "GUEST CRASHED IN GUEST CONSOLE", what actually
happens? Can you post the output? Have you seen this post 11486, or
has it gone away?

Can you let us know what guest and guest config you are using?

Testing with a debug build of Xen will also help.

> 3. XEN crash running ltp "mtest01 -p80" on 32bit SMP HVM guest:
>    BUG at multi.c:2864 from sh_page_fault__shadow_3_guest_3

This is probably the same as #2, just taking a different path through
Xen. Have you seen this post 11486, or has it gone away?

[Again: guest, guest config, debug xen]

> 4. XEN crash running ltp "mtest01 -p80" on 32bit SMP HVM guest:
>    BUG at multi.c:3958 from sh_clear_shadow_entry__shadow_3_guest_3

May be related to #2 and #3.

[Again: guest, guest config, debug xen]

cheers,

S.
Ed Smith
2006-Sep-26 15:33 UTC
Re: [Xen-devel] Testing status of HVM (Intel VT) on 64bit XEN unstable c/s 11616
Steven Hand wrote:
> Hi Ed,
>
> any chance you can test with debug=y ? The back-traces aren't
> really very useful otherwise.
>

These are automated builds and tests that run each night. We don't
normally build a debug XEN as we try and test the bits a customer
would run. If these backtraces are useful in the release build how
will we diagnose crashes on a customer's site?

> It'd also be good to know what the h/w platform is - AMD or Intel.

This hardware and guest information is in the test report but I've
added it here as well. Also I tried booting the guests with/without
the kernarg noapic, no difference. Did you need more guest config
information than this? I can send you the actual config if you like
but these are the key settings.

Test Configuration:
Dell Precision WorkStation 380, Dual Core, 2GB, 3 SATA (Intel VT)
64bit XEN Hypervisor on a RHEL4U2 64bit root (/dev/sda)
32bit fully virtualized (HVM) guest RHEL4U2 256MB (/dev/sdb)
      pae=1(smp) pae=0(up), acpi=1, apic=1
      kernargs noapic
64bit fully virtualized (HVM) guest RHEL4U2 256MB (/dev/sdc)
      pae=1, acpi=1, apic=1
      kernargs noapic

>
>> Summary:
>> Changeset 11616
>> - NEW: 32bit SMP HVM Guests hang on boot:
>>   "Uncompressing Linux... OK booting the kernel" (failure.6)
>
> By '32bit' do you mean PAE? What guest is this? What's the guest
> config? I cannot reproduce this myself on 11616...

RedHat 4 SMP kernels as shipped are PAE, while their UP kernels are
non-PAE. The appropriate setting is used in HVM config depending on
which kernel I'm booting.

>
>> - NEW: 64bit UP and SMP guests crash domain on boot:
>>   domain_crash_sync called from vmx.c:2268 (failure.5)
>
> This looks like the same bug to me, also cannot repro.
> Once more, guest + guest config info would be useful.

Hardware and guest information is in the report, see above.

>
>> 2. XEN crash on an xm destroy of a dead guest
>>    32bit SMP HVM guest:
>>    Fatal page fault - put_page_from_l1e+0x85/0x140
>
> Hmm - when you say "GUEST CRASHED IN GUEST CONSOLE", what actually
> happens? Can you post the output? Have you seen this post 11486, or
> has it gone away?

I did not have the guest configured for serial console so the stack
traces scrolled by on the guest's VGA console. If I can reproduce this
I'll capture the serial console output and post it. I did not reproduce
this yesterday on c/s 11600.

>
> Can you let us know what guest and guest config you are using?

In report.

>
> Testing with a debug build of Xen will also help.
>
>
>> 3. XEN crash running ltp "mtest01 -p80" on 32bit SMP HVM guest:
>>    BUG at multi.c:2864 from sh_page_fault__shadow_3_guest_3
>
> This is probably the same as #2, just taking a different path through
> Xen. Have you seen this post 11486, or has it gone away?
>
> [Again: guest, guest config, debug xen]

Again in report. This particular crash was not reproduced yesterday
on c/s 11600.

>
>> 4. XEN crash running ltp "mtest01 -p80" on 32bit SMP HVM guest:
>>    BUG at multi.c:3958 from sh_clear_shadow_entry__shadow_3_guest_3
>
> May be related to #2 and #3.

This bug was reproduced on c/s 11600.

>
> [Again: guest, guest config, debug xen]

Again in report.

>
> cheers,
>
> S.
>

Thanks,
Ed
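For readers trying to reproduce the setup, the key settings listed in the test configuration could be expressed in an xm guest config file (which uses Python syntax) roughly as below. This is only a sketch under assumptions: the memory size, CPU count and the pae/acpi/apic values come from the report, but the guest name and hvmloader path are hypothetical, the disk line is left out because the actual mapping was not posted, and the real config file was never shown in this thread.

```python
# Hypothetical xm config sketch for the 64bit SMP HVM guest.
# This is NOT the actual file used for the nightly tests.

builder = "hvm"                          # fully virtualized (HVM) guest
kernel = "/usr/lib/xen/boot/hvmloader"   # assumed hvmloader location
name = "rhel4u2-64-smp"                  # hypothetical guest name

memory = 256   # MB, from the test report
vcpus = 2      # SMP kernels were booted with 2 CPUs

pae = 1        # report: pae=1 for SMP and 64bit guests, pae=0 for 32bit UP
acpi = 1       # from the test report
apic = 1       # from the test report

# "kernargs noapic" in the report refers to a guest kernel argument.
# For an HVM guest that is presumably set in the guest's own boot loader
# rather than in this file (an assumption, not stated in the thread).
```

For the 32bit UP guest the same sketch would apply with `vcpus = 1` and `pae = 0`, matching the "pae=0(up)" note in the report.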
Steven Hand
2006-Sep-26 16:19 UTC
Re: [Xen-devel] Testing status of HVM (Intel VT) on 64bit XEN unstable c/s 11616
>Steven Hand wrote:
>> Hi Ed,
>>
>> any chance you can test with debug=y ? The back-traces aren't
>> really very useful otherwise.
>>
>
>These are automated builds and tests that run each night. We don't
>normally build a debug XEN as we try and test the bits a customer
>would run. If these backtraces are useful in the release build how
>will we diagnose crashes on a customer's site?

Erm, not sure what you're going to do if you have customers with Xen
crashes but that doesn't seem to be a good reason not to use debug
builds to try and track down known issues...

>> It'd also be good to know what the h/w platform is - AMD or Intel.
>
>This hardware and guest information is in the test report but I've
>added it here as well. Also I tried booting the guests with/without
>the kernarg noapic, no difference. Did you need more guest config
>information than this? I can send you the actual config if you like
>but these are the key settings.
>
>Test Configuration:
>Dell Precision WorkStation 380, Dual Core, 2GB, 3 SATA (Intel VT)
>64bit XEN Hypervisor on a RHEL4U2 64bit root (/dev/sda)
>32bit fully virtualized (HVM) guest RHEL4U2 256MB (/dev/sdb)
>      pae=1(smp) pae=0(up), acpi=1, apic=1
>      kernargs noapic
>64bit fully virtualized (HVM) guest RHEL4U2 256MB (/dev/sdc)
>      pae=1, acpi=1, apic=1
>      kernargs noapic

Ah great, thanks. Missed that first time around.

>>> 2. XEN crash on an xm destroy of a dead guest
>>>    32bit SMP HVM guest:
>>>    Fatal page fault - put_page_from_l1e+0x85/0x140
>>
>> Hmm - when you say "GUEST CRASHED IN GUEST CONSOLE", what actually
>> happens? Can you post the output? Have you seen this post 11486, or
>> has it gone away?
>
>I did not have the guest configured for serial console so the stack
>traces scrolled by on the guest's VGA console. If I can reproduce this
>I'll capture the serial console output and post it. I did not reproduce
>this yesterday on c/s 11600.

Ok, great.

>>> 4. XEN crash running ltp "mtest01 -p80" on 32bit SMP HVM guest:
>>>    BUG at multi.c:3958 from sh_clear_shadow_entry__shadow_3_guest_3
>>
>> May be related to #2 and #3.
>
>This bug was reproduced on c/s 11600.

Ok - we're looking at this now.

cheers,

S.
Keir Fraser
2006-Sep-26 16:28 UTC
Re: [Xen-devel] Testing status of HVM (Intel VT) on 64bit XEN unstable c/s 11616
On 26/9/06 16:33, "Ed Smith" <esmith@virtualiron.com> wrote:

>> any chance you can test with debug=y ? The back-traces aren't
>> really very useful otherwise.
>
> These are automated builds and tests that run each night. We don't
> normally build a debug XEN as we try and test the bits a customer
> would run. If these backtraces are useful in the release build how
> will we diagnose crashes on a customer's site?

They take more deciphering, sometimes with the aid of the xen-syms file, as
the backtrace contains functions that aren't really in the call chain (and
misses some that are). It's less time consuming with a debug build, and
there's less reliance on the xen-syms file, as we include frame pointers.

Trying to match customer bits doesn't make sense. There is no customer for
these bits! So why throw away debug info during development testing just
because there are situations where debug info is not available? Is it
considered good training for developers? :-)

 -- Keir
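As background on the deciphering step: a backtrace entry such as put_page_from_l1e+0x85/0x140 follows the usual symbol+offset/size notation, i.e. the fault was 0x85 bytes into a function that is 0x140 bytes long. The sketch below decodes such an entry and, given a symbol start address, computes the absolute address to look up. The start address in the usage example is invented for illustration; a real one would have to come from the symbol table of the matching xen-syms file, and the helper names are hypothetical.

```python
import re

# symbol+0xOFFSET/0xSIZE, as printed in the crash reports quoted in this thread
ENTRY = re.compile(
    r"(?P<symbol>[A-Za-z_]\w*)\+0x(?P<offset>[0-9a-fA-F]+)/0x(?P<size>[0-9a-fA-F]+)"
)


def parse_entry(text):
    """Return the symbol, offset and function size found in a backtrace line."""
    match = ENTRY.search(text)
    if match is None:
        return None
    return {
        "symbol": match.group("symbol"),
        "offset": int(match.group("offset"), 16),
        "size": int(match.group("size"), 16),
    }


def resolve(entry, symbol_starts):
    """Absolute address of the entry, given a {symbol: start address} table."""
    return symbol_starts[entry["symbol"]] + entry["offset"]


if __name__ == "__main__":
    entry = parse_entry("Fatal page fault - put_page_from_l1e+0x85/0x140")
    print(entry)
    # Made-up start address, standing in for a value read from xen-syms.
    starts = {"put_page_from_l1e": 0xFFFF830000100000}
    print(hex(resolve(entry, starts)))
```

This only decodes a single entry. It does not address the point made above that a release-build backtrace may list functions that are not really in the call chain, which is why frame pointers in a debug build make the trace easier to trust.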
Ed Smith
2006-Sep-26 18:31 UTC
Re: [Xen-devel] Testing status of HVM (Intel VT) on 64bit XEN unstable c/s 11616
Keir Fraser wrote:
> On 26/9/06 16:33, "Ed Smith" <esmith@virtualiron.com> wrote:
>
>>> any chance you can test with debug=y ? The back-traces aren't
>>> really very useful otherwise.
>> These are automated builds and tests that run each night. We don't
>> normally build a debug XEN as we try and test the bits a customer
>> would run. If these backtraces are useful in the release build how
>> will we diagnose crashes on a customer's site?
>
> They take more deciphering, sometimes with the aid of the xen-syms file, as
> the backtrace contains functions that aren't really in the call chain (and
> misses some that are). It's less time consuming with a debug build, and
> there's less reliance on the xen-syms file, as we include frame pointers.
>
> Trying to match customer bits doesn't make sense. There is no customer for
> these bits! So why throw away debug info during development testing just
> because there are situations where debug info is not available? Is it
> considered good training for developers? :-)
>
>  -- Keir
>

Debug builds are fine and certainly easier to, well, debug with, but they
often run slower than release builds and hide problems. Humm... I wonder if
that's why you are not seeing this problem. Also when we rely on debug builds
to diagnose problems we do not design in the ability to diagnose problems when
the bits are in customers' hands. 'Good training for developers'? No, just
trying to work towards a released product that is easier to debug because
just enough debug-ability is built-in.

I'm building a debug build now and will post the results of booting a 64bit
HVM guest on it. Hopefully that will help diagnose this problem.

Thanks,
Ed
Keir Fraser
2006-Sep-26 18:34 UTC
Re: [Xen-devel] Testing status of HVM (Intel VT) on 64bit XEN unstable c/s 11616
On 26/9/06 7:31 pm, "Ed Smith" <esmith@virtualiron.com> wrote:

> Debug builds are fine and certainly easier to, well, debug with, but they
> often run slower than release builds and hide problems. Humm... I wonder if
> that's why you are not seeing this problem.

It is usually the other way round, since debug builds contain lots of cross
checks and assertions that are not included in production builds. Certainly
a few bugs do only crop up in production builds, and so we test both types
of builds ourselves, but it's rare and the first thing we'll do if we see a
production-build crash is to try and repro with a debug build.

> Also when we rely on debug builds to
> diagnose problems we do not design in the ability to diagnose problems when
> the bits are in customers' hands. 'Good training for developers'? No, just
> trying to work towards a released product that is easier to debug because
> just enough debug-ability is built-in.

It's not entirely a diagnosis issue (production build backtraces can be
useful, but as I said they are more hassle). A debug build will often crash
earlier and with more immediately useful information about what has gone
wrong.

 -- Keir
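The "crash earlier, with more useful information" point can be illustrated with a small analogy. The sketch below is not Xen code: `check`, `release_page` and the `debug` flag are hypothetical stand-ins for an assertion macro that is compiled in only for debug builds. With the check enabled, the bad state is reported at the point where it happens; with it disabled, the state is silently corrupted and any failure would surface later, somewhere else.

```python
def check(condition, message, debug):
    """Stand-in for a debug-only assertion: active only when debug is True."""
    if debug and not condition:
        raise AssertionError(message)


def release_page(refcounts, page, debug):
    """Drop one reference to a page and return the new count."""
    check(refcounts.get(page, 0) > 0,
          "page %d released with no references" % page, debug)
    refcounts[page] = refcounts.get(page, 0) - 1
    return refcounts[page]


if __name__ == "__main__":
    # "Production" behaviour: the bogus release goes unnoticed here and
    # leaves a negative count behind for some later code path to trip over.
    print(release_page({}, 7, debug=False))

    # "Debug" behaviour: the same mistake fails immediately, naming the cause.
    try:
        release_page({}, 7, debug=True)
    except AssertionError as err:
        print("debug check fired:", err)
```

As the later messages in this thread show, the reverse can also happen: a problem may appear only in the release build, which is why both kinds of build are worth testing.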
Ed Smith
2006-Sep-26 20:26 UTC
Re: [Xen-devel] Testing status of HVM (Intel VT) on 64bit XEN unstable c/s 11616
Keir Fraser wrote:
> On 26/9/06 7:31 pm, "Ed Smith" <esmith@virtualiron.com> wrote:
>
>> Debug builds are fine and certainly easier to, well, debug with, but they
>> often run slower than release builds and hide problems. Humm... I wonder if
>> that's why you are not seeing this problem.
>
> It is usually the other way round, since debug builds contain lots of cross
> checks and assertions that are not included in production builds. Certainly
> a few bugs do only crop up in production builds, and so we test both types
> of builds ourselves, but it's rare and the first thing we'll do if we see a
> production-build crash is to try and repro with a debug build.

This must be one of those rare ones ;') Debug build works, release
build fails.

I did a 64bit XEN debug build and tried booting 64bit RHEL4U2 2CPU 256MB HVM
guest and it boots fine. I then did a 64bit XEN release build and tried
booting the same guest and I crash in vmx.c:2268. dom0 console output for
both debug and release builds is attached.

This is c/s 11616.

Cheers,
Ed
Keir Fraser
2006-Sep-27 09:29 UTC
Re: [Xen-devel] Testing status of HVM (Intel VT) on 64bit XEN unstable c/s 11616
> I did a 64bit XEN debug build and tried booting 64bit RHEL4U2 2CPU 256MB HVM
> guest and it boots fine. I then did a 64bit XEN release build and tried
> booting the same guest and I crash in vmx.c:2268. dom0 console output for
> both debug and release builds is attached.
>
> This is c/s 11616.

11626 has a possible fix for this and at least will print out some more info
(even on production builds ;-) if the MSR_EFER write fails.

 -- Keir