Sorry if this is more of a xen-users question, but I'm at a loss on how to fix this. I have a system with an AMD A10-6800K on an ASUS F2A85-M PRO, and any HVM domain hangs the whole system. I started with XenServer 6.2, then CentOS 6.4, and finally Ubuntu 13.04. All have the same results.

It's not terribly consistent when the hang happens, but every HVM domain will eventually hang the system if left running long enough. Sometimes the system hangs immediately, sometimes after 20 minutes. I disabled all power management (turned off PowerNow!) and that actually seems to make it freeze faster. So I'm running with the smallest HVM configuration possible using xl, as below:

    name = 'test'
    memory = 1024
    builder = 'hvm'

Since the system just hangs, I get no error messages or anything. Has anybody successfully run HVM on an A10 AMD CPU? It's one of the new Richland APUs. Can anybody point me to some info on how to debug this issue? FWIW, KVM runs fine.

Thanks,
Darren

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
On 14/08/13 06:51, Darren Shepherd wrote:
> Since the system just hangs I get no error messages or anything. Has
> anybody successfully run HVM on an A10 AMD CPU? It's one of the new
> Richland APUs. Can anybody point me to some info on how to debug this
> issue? FWIW, KVM runs fine.

Can you boot Xen with the additional options "loglvl=all iommu=debug,verbose apic_verbosity=debug cpuinfo" and attach Xen's dmesg (`xl dmesg`) and dom0's `dmesg` as a starting point?

I have to say that we have never encountered symptoms like this when testing XenServer, but on the other hand, we don't appear to have any similar hardware.

Are you able to get a serial console attached?

~Andrew
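For anyone following along, here is a minimal sketch of how the options Andrew lists might be applied and the two logs collected. It assumes a Debian/Ubuntu dom0 where GRUB reads `GRUB_CMDLINE_XEN_DEFAULT` from /etc/default/grub; other distributions set the hypervisor command line elsewhere. The output paths under /tmp are made up for illustration.

```shell
#!/bin/sh
# Hypervisor options quoted from Andrew's message.
xen_opts="loglvl=all iommu=debug,verbose apic_verbosity=debug cpuinfo"

# Assumption: Debian/Ubuntu-style GRUB configuration.
grub_line="GRUB_CMDLINE_XEN_DEFAULT=\"$xen_opts\""
echo "Add this line to /etc/default/grub, then run update-grub and reboot:"
echo "  $grub_line"

# With a serial port, options such as "com1=115200,8n1 console=com1,vga"
# are commonly appended as well; adjust them to the actual hardware.

# After rebooting into Xen, capture both logs (hypothetical output paths).
if command -v xl >/dev/null 2>&1; then
    xl dmesg > /tmp/xen-dmesg.txt
    dmesg > /tmp/dom0-dmesg.txt
fi
```

The script only prints the GRUB line instead of editing the file, so it is safe to run on a machine that is not a Xen host.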
>>> On 14.08.13 at 07:51, Darren Shepherd <darren.s.shepherd@gmail.com> wrote:
> It's not terribly consistent when the hang happens, but every HVM domain
> will eventually hang the system if left running long enough. Sometimes the
> system hangs immediately, sometimes after 20 minutes. I disabled all power
> management (turned off PowerNow!) and that actually seems to make it freeze
> faster. So I'm running with the smallest HVM configuration possible using
> xl as below

If disabling power management makes it hang faster, then I'd assume a cooling problem in your system.

Anyway, knowing whether something gets output upon the hang (crash?) by the hypervisor, as Andrew already asked for, will be essential here.

Jan
I would like to help you because my team at AMD does not see this problem, but maybe there is something unique in your environment that we have not tested with. So you are using the Xen hypervisor from XenServer 6.2 and you have tested with CentOS and Ubuntu guests? Could you tell me your guest configurations (vcpus and memory per guest)? Any information you can provide would be appreciated.

From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Darren Shepherd
Sent: Wednesday, August 14, 2013 12:51 AM
To: xen-devel@lists.xen.org
Subject: [Xen-devel] HVM hangs system on AMD A10-6800K + Hudson-D4
On Wed, Aug 14, 2013 at 11:22 AM, Hurwitz, Sherry <sherry.hurwitz@amd.com> wrote:
> I would like to help you because my team at AMD does not see this
> problem but maybe there is something unique in your environment that we
> have not tested with. So you are using the Xen hypervisor from XenServer
> 6.2 and you have tested with CentOS and Ubuntu guests? Could you tell me
> your guest configurations (vcpus and memory per guest). Any information
> you can provide would be appreciated.

I found another user on the xen-users list who experienced this same problem [1] with the same APU but with a different motherboard that has a Hudson-D3. He suggested compiling and running Xen from the latest source, and that seemed to work. I haven't extensively tested this, but before, the system would freeze in under a minute, and now I've run it for a couple of hours with no issues. So I have run all of the following dom0 setups and they have failed:

    XenServer 6.2
    CentOS 6.4
    Ubuntu 13.04
    Ubuntu 13.04 + Xen 4.3 compiled from source

So there is something that was checked in since Xen 4.3 that seems to fix the issue, or did something to mask it.

Specifically, what made it fail was running a Windows XP installation from ISO. During the installation, around the time of entering the product key or configuring networking, the system would hang. On XenServer the domain was configured however XenServer does it when selecting the OS type as Windows XP. On CentOS and Ubuntu I was using the smallest xl cfg I could, which was basically name, builder, disks, vif, memory.

I started debugging the issue further on Ubuntu 13.04 and got it so that I can easily reproduce it. I'm running basically a fresh install of 13.04 64-bit with "apt-get install xen-hypervisor-4.2". I then disabled all the features I could in the BIOS to start ruling things out. So I turned off SMP, IOMMU, CPB, NX, C6, and PowerNow!. All other settings in the BIOS are default. If I use the following xl conf

    name = 'blank'
    builder = 'hvm'
    memory = 1024

and run "xl create -f blank.cfg" so that the domain is created, and then about a second later run "xl create -f blank.cfg 'name="blank2"'", it's about 50/50 whether the system freezes on the first "xl create" or the second. So basically, I can reproduce the issue with no domU OS.

Unfortunately I don't have a COM port plugged into the motherboard, and I haven't had time to go and buy one, so I have no clue whether Xen is logging any errors, but I definitely don't get anything on the Linux console or in the logs. Below are the specs of the system I have:

    AMD A10-6800K
    G.SKILL Ripjaws X Series 32GB (4 x 8GB) DDR3 2400 <-- running at 1333 MHz (which is the BIOS default)
    ASUS F2A85-M PRO FM2 AMD A85X

I'm particularly interested in what has made this work in latest git because I'd like the patch to be backported into Xen 4.2.x so that CentOS can pick it up.

Thanks,
Darren

[1] http://lists.xenproject.org/archives/html/xen-users/2013-08/msg00148.html
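The reproduction steps in this message can be collected into one script. This is only a sketch: the config contents and the two `xl create` invocations are taken from the message, while the temporary file location, the `sync` call, and the `command -v` guard are additions for illustration. On an affected host the script is expected to hang the machine, so it should only be run on a box where that is acceptable.

```shell
#!/bin/sh
# Write the minimal HVM config from the message to the given path.
write_cfg() {
    cat > "$1" <<'EOF'
name = 'blank'
builder = 'hvm'
memory = 1024
EOF
}

# Hypothetical location for the config file.
cfg=${TMPDIR:-/tmp}/blank.cfg
write_cfg "$cfg"

# Only attempt domain creation when the xl toolstack is present.
if command -v xl >/dev/null 2>&1; then
    sync                                  # flush dom0 logs before a possible hang
    xl create -f "$cfg"
    sleep 1
    xl create -f "$cfg" 'name="blank2"'   # same config, overriding the name
fi
```

Running it in a loop, destroying both domains between iterations, would give a rough failure rate to compare across builds.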
On Sun, Aug 18, 2013 at 3:14 AM, Darren Shepherd <darren.s.shepherd@gmail.com> wrote:
> I'm particularly interested in what has made this work in latest git because
> I'd like the patch to be backported into Xen 4.2.x so that CentOS can
> pick it up.

Andrew, Jan, any idea which patch might be the culprit for fixing this problem?

Barring that, Darren, you can try to do a git bisect to figure out when things started working again:

http://webchick.net/node/99

-George
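Bisecting for the commit that fixed something is the reverse of the usual case, and reasonably recent git versions (2.7 and later) let you rename the bisect terms to keep that straight. The sketch below demonstrates the mechanics on a throwaway repository with made-up commits, not on xen.git. In the real tree the endpoints would be RELEASE-4.3.0 (broken) and master (fixed), and each step would mean building, booting, and running the reproduction by hand rather than using `git bisect run`.

```shell
#!/bin/sh
set -eu

# Throwaway repository standing in for xen.git (demo data only).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.invalid"
git config user.name "Demo"

# Eight commits; the "hang" is fixed from commit 5 onwards.
for i in 1 2 3 4 5 6 7 8; do
    if [ "$i" -ge 5 ]; then echo fixed > state; else echo broken > state; fi
    echo "$i" > counter
    git add state counter
    git commit -q -m "commit $i"
    if [ "$i" -eq 5 ]; then expected=$(git rev-parse HEAD); fi
done
first=$(git rev-list --max-parents=0 HEAD)

# Custom terms: "broken" is the old state, "fixed" is the new state.
git bisect start --term-old=broken --term-new=fixed >/dev/null
git bisect fixed HEAD >/dev/null
git bisect broken "$first" >/dev/null

# The test command exits 0 for the old state (broken) and 1 for the new one.
git bisect run sh -c 'grep -q broken state' >/dev/null

# refs/bisect/<new term> points at the first commit with the new behaviour.
found=$(git rev-parse refs/bisect/fixed)
git bisect reset >/dev/null 2>&1
echo "first fixed commit: $found"
```

The exit-code convention is the part that is easy to get backwards: `git bisect run` treats exit status 0 as the old term, whatever that term is called.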
On Aug 19, 2013, at 3:31 AM, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
> Andrew, Jan, any idea which patch might be the culprit for fixing this problem?
>
> Barring that, Darren, you can try to do a git bisect to figure out
> when things started working again:
>
> http://webchick.net/node/99
>
> -George

Okay, so I compiled, ran and tested everything between RELEASE-4.3.0 (bad) and master (good). Unfortunately I came to the conclusion that it's not a patch but the debug flag. If I build RELEASE-4.3.0 it fails my test case because Config.mk has "debug ?= n". If I then just rebuild xen.gz with "make debug=y" it works.

I really couldn't figure out a way to bisect with debugging. I tried compiling xen.gz with debug on and then compiling just a few objects with debug off, but that produced weird results, and I have no clue whether it is even valid to have a linked file with half debug .o's and half not.

I'll keep fooling around and see if I can narrow this down any further. So my simple test fails with debug off, but I don't know if debug on just makes it less likely to happen. I'm going to run more extensive tests later. I'll also try to get a serial port set up on the box.

Darren
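To compare the two cases side by side, it may help to build both hypervisor variants from the same checkout and keep them under distinct names. The sketch below assumes it is run from the top of a Xen source tree at RELEASE-4.3.0, that `make -C xen debug=...` overrides the `debug ?= n` default in Config.mk, and that the result lands in xen/xen.gz. The output file names and the `DRY_RUN` switch are hypothetical. A clean between builds avoids the mixed debug and non-debug objects mentioned above.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0
# inside a real Xen tree to execute them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Build one hypervisor variant; $1 is "y" or "n".
build_variant() {
    debug=$1
    run make -C xen clean                       # never mix objects across settings
    run make -C xen debug="$debug"
    run cp xen/xen.gz "xen-debug-$debug.gz"     # hypothetical output name
}

build_variant n
build_variant y
```

Each resulting image can then get its own boot entry, so the same reproduction can be run against both without rebuilding.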
>>> On 20.08.13 at 03:42, Darren Shepherd <darren.s.shepherd@gmail.com> wrote:
> Okay, so I compiled, ran and tested everything between RELEASE-4.3.0 (bad)
> with master (good). Unfortunately I came to the conclusion it's not a patch
> but instead it's the debug flag. So if I build RELEASE-4.3.0 it fails my test
> case because Config.mk has "debug ?= n". If I then just rebuild
> xen.gz with "make debug=y" it works.

Pretty unusual - normally we'd expect more problems with debug=y (because of extra checking done), not less.

> I'll keep fooling around and see if I can narrow this down any further. So
> my simple test fails with debug off, but I don't know if debug on just makes
> it less likely to happen. I'm going to run more extensive tests later. I'll
> also try to get a serial port setup on the box.

I'm afraid that's the only viable route.

Jan
On 08/20/2013 09:26 AM, Jan Beulich wrote:
>>>> On 20.08.13 at 03:42, Darren Shepherd <darren.s.shepherd@gmail.com> wrote:
>> Okay, so I compiled, ran and tested everything between RELEASE-4.3.0 (bad)
>> with master (good). Unfortunately I came to the conclusion it's not a patch
>> but instead it's the debug flag. So if I build RELEASE-4.3.0 it fails my test
>> case because Config.mk has "debug ?= n". If I then just rebuild
>> xen.gz with "make debug=y" it works.
>
> Pretty unusual - normally we'd expect more problems with debug=y
> (because of extra checking done), not less.

Unfortunately the kind of bug that changes when you add debugging is usually pretty difficult to track down: race conditions, clobbered variables... nearly impossible without a serial console.

>> I'll keep fooling around and see if I can narrow this down any further. So
>> my simple test fails with debug off, but I don't know if debug on just makes
>> it less likely to happen. I'm going to run more extensive tests later. I'll
>> also try to get a serial port setup on the box.
>
> I'm afraid that's the only viable route.
>
> Jan