Li, Haicheng
2008-Feb-03 07:48 UTC
[Xen-devel] most of qcow images cannot boot since c/s 16958.
Hi all,

We found that since c/s 16958, most of our qcow images hang on boot; xm dmesg shows:

(XEN) HVM3: int13_harddisk: function 41, unmapped device for ELDL=81
(XEN) HVM3: int13_harddisk: function 08, unmapped device for ELDL=81
(XEN) HVM3: *** int 15h function AX=00C0, BX=0000 not yet supported!

When this hang happens, "xm destroy" can no longer destroy the domain, and xend stops responding after "xm destroy" is run until the qemu process is killed with "kill -9".

This issue blocks our testing. c/s 16945 does not have this problem.

Detailed xm dmesg log is attached.

-- haicheng

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Keir Fraser
2008-Feb-03 08:53 UTC
Re: [Xen-devel] most of qcow images cannot boot since c/s 16958.
I'm not sure what causes that. 'xm destroy' hangs because qemu-dm is now killed by SIGTERM rather than SIGKILL. I think that was a mistake to change, so I'll probably revert that. But it doesn't explain why your qemu-dm went bad in the first place. I think you will have to work out exactly which changeset caused the problem.

-- Keir
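To illustrate the SIGTERM versus SIGKILL point, here is a minimal sketch, not the xend or qemu-dm code. It assumes a POSIX system and uses a stand-in child process that ignores SIGTERM (an assumption chosen to mimic a process that never acts on the signal); the helper name `stop_process` is hypothetical.

```python
import signal
import subprocess
import sys

# Stand-in for a wedged process: it ignores SIGTERM and then just sleeps.
CHILD = (
    "import signal, time\n"
    "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
    "print('ready', flush=True)\n"
    "time.sleep(60)\n"
)


def stop_process(proc, grace_seconds=0.5):
    """Send SIGTERM first; fall back to SIGKILL if the process does not exit."""
    proc.terminate()  # SIGTERM: can be caught or ignored by the target
    try:
        proc.wait(timeout=grace_seconds)
        return "SIGTERM"
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL: cannot be caught or ignored
        proc.wait()
        return "SIGKILL"


def demo():
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD], stdout=subprocess.PIPE, text=True
    )
    proc.stdout.readline()  # wait until the child has installed SIG_IGN
    how = stop_process(proc)
    proc.stdout.close()
    return how, proc.returncode


if __name__ == "__main__":
    print(demo())
```

A caller that only ever sends SIGTERM and then waits would block on such a process indefinitely, which matches the symptom of xend not responding until someone runs "kill -9".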
Li, Haicheng
2008-Feb-03 09:08 UTC
RE: [Xen-devel] most of qcow images cannot boot since c/s 16958.
Thanks for the quick response. I tried to locate the exact changeset, but for most changesets from c/s 16946 to c/s 16957 the tools fail to compile. To help track this issue, I filed a bug in Bugzilla: http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1161.

-- haicheng
Keir Fraser
2008-Feb-03 10:30 UTC
Re: [Xen-devel] most of qcow images cannot boot since c/s 16958.
You'll have to selectively remove changesets. A good starting point would be:

    hg update 16955
    hg export 16954 | patch -Rp1
    hg export 16949 | patch -Rp1
    hg export 16946 | patch -Rp1

...and see if you can reproduce the problem. If you can, it is one of changesets 16947, 16948, 16950, 16951, 16952, 16953, 16955. Otherwise it is one of 16946, 16949, 16954, 16956, 16957, 16958.

-- Keir
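The two candidate lists in the suggestion above follow mechanically from which changesets remain in the test tree. The sketch below reproduces that partition; it assumes linear history (after `hg update N` the tree holds every changeset numbered N or lower) and takes 16946 to 16958 as the suspect range because 16945 is reported good and 16958 bad. The function name `split_suspects` is hypothetical.

```python
def split_suspects(suspects, updated_to, reverted):
    """Partition suspects by whether they are still present in the test tree.

    Assumes linear history: the tree holds every changeset <= updated_to,
    except those backed out with `patch -R`.
    """
    reverted = set(reverted)
    present = [cs for cs in suspects if cs <= updated_to and cs not in reverted]
    absent = [cs for cs in suspects if cs not in present]
    return present, absent


suspects = range(16946, 16959)  # 16945 is known good, 16958 is known bad
present, absent = split_suspects(
    suspects, updated_to=16955, reverted=[16954, 16949, 16946]
)
print("culprit if the bug reproduces:", present)
print("culprit if the bug disappears:", absent)
```

One build and boot test therefore cuts thirteen suspects down to seven or six, and repeating the same split on the surviving list narrows it further.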
You, Yongkang
2008-Feb-04 06:38 UTC
RE: [Xen-devel] most of qcow images cannot boot since c/s 16958.
Hi Keir and Haicheng,

I found that this issue might be caused by 16947.
(http://xenbits.xensource.com/xen-unstable.hg?rev/32b898768217)

After reverting 16947 and rebuilding qemu-dm, HVM domains boot successfully.

Best Regards,
Yongkang You
Keir Fraser
2008-Feb-04 07:37 UTC
Re: [Xen-devel] most of qcow images cannot boot since c/s 16958.
It's certainly plausible. If you can confirm it then I'll revert it.

-- Keir
You, Yongkang
2008-Feb-04 09:07 UTC
RE: [Xen-devel] most of qcow images cannot boot since c/s 16958.
On Monday, February 04, 2008 3:38 PM, "Keir Fraser" wrote:
> It's certainly plausible. If you can confirm it then I'll revert it.

We rebuilt a Xen environment from cset #16971 with the patch from #16974 removed. A subset of our nightly cases (all of them using qcow) passes on it, but the same cases do not pass on unmodified #16971. So #16974 appears to contain a bug.

====================================================================
                            Total   Pass   Fail   NoResult   Crash
====================================================================
control_panel                   7      7      0          0       0
Restart                         2      2      0          0       0
gtest                           4      3      1          0       0
====================================================================
control_panel                   7      7      0          0       0
 :XEN_linux_win_64_g64          1      1      0          0       0
 :XEN_256M_guest_64_g64         1      1      0          0       0
 :XEN_256M_guest_64_gPAE        1      1      0          0       0
 :XEN_256M_xenu_64_gPAE         1      1      0          0       0
 :XEN_vmx_4vcpu_64_g64          1      1      0          0       0
 :XEN_SR_64_g64                 1      1      0          0       0
 :XEN_two_winxp_64_g64          1      1      0          0       0
Restart                         2      2      0          0       0
 :GuestPAE_64_gPAE              1      1      0          0       0
 :Guest64_64_gPAE               1      1      0          0       0
gtest                           4      3      1          0       0
 :boot_up_acpi_win2k3_64_       1      1      0          0       0
 :reboot_fc6_64_g64             1      0      1          0       0
 :boot_up_acpi_xp_64_g64        1      1      0          0       0
 :bootx_64_g64                  1      1      0          0       0
====================================================================
Total                          13     12      1          0       0

BTW, we also verified that HVM save/restore and live migration pass on the latest xen-unstable.

Best Regards,
Yongkang You
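As a quick consistency check on the result table in the message above, the per-suite rows can be summed and compared with the reported Total row. This is only a sketch; the numbers are copied from the table and the helper name `tally` is hypothetical.

```python
# Columns: Total, Pass, Fail, NoResult, Crash (copied from the posted table).
suite_rows = {
    "control_panel": (7, 7, 0, 0, 0),
    "Restart": (2, 2, 0, 0, 0),
    "gtest": (4, 3, 1, 0, 0),
}


def tally(rows):
    """Sum each column across all suites."""
    return tuple(sum(column) for column in zip(*rows.values()))


totals = tally(suite_rows)
print(totals)

# Each suite's Total should equal Pass + Fail + NoResult + Crash.
for name, (total, *outcomes) in suite_rows.items():
    assert total == sum(outcomes), name
```

The summed columns agree with the posted Total row of 13 cases, 12 passes and 1 failure.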
You, Yongkang
2008-Feb-04 09:15 UTC
RE: [Xen-devel] most of qcow images cannot boot since c/s 16958.
On Monday, February 04, 2008 5:07 PM, "You, Yongkang" wrote:
> We rebuilt a Xen environment by cset #16971 without patch of #16974.

It should be 16947, not 16974. Sorry.

> A subset of nightly cases (they all used qcow) can pass smoothly on
> it. But they can not pass on real #16971. So, #16947 should include a
> bug.

Best Regards,
Yongkang You