Has anybody noticed that the domain ID number seems to increment more than usual on xen unstable? The first domU that I start gets ID 2, after a reboot it's ID 4, and so on.

I am also having some problems with "xm create" taking a very long time and sometimes failing because devices are still in use or "hotplug scripts are not responding". Where I am using file-backed block devices, the loop device is sometimes left connected after the domain is shut down; I have to run losetup -d to remove it before I can start the domain again.

Perhaps it's related.

Andy

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
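The manual losetup -d workaround described above could be scripted along these lines. This is only a sketch: stale_loops_for is a hypothetical helper, and it assumes `losetup -a` prints lines of the form "/dev/loopN: [dev]:inode (/path/to/image)", which may differ between util-linux versions (older versions can also truncate long file names).

```shell
#!/bin/sh
# Hypothetical helper: read `losetup -a` style lines on stdin and print the
# loop devices whose backing file is exactly the image path given as $1.
# Assumed input format:  /dev/loop0: [0801]:12345 (/path/to/disk.img)
stale_loops_for() {
    image="$1"
    while IFS= read -r line; do
        case "$line" in
            # The quoted part of the pattern is matched literally.
            *"($image)"*) printf '%s\n' "${line%%:*}" ;;
        esac
    done
}

# Usage (as root), detaching every loop device still bound to the image:
#   losetup -a | stale_loops_for /path/to/disk.img | while read -r dev; do
#       losetup -d "$dev"
#   done
```

Only run the detach step once the domain is really gone; detaching a loop device that a running domain still uses would pull its disk out from under it.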

On 06/03/2009 19:20, "Andrew Lyon" <andrew.lyon@gmail.com> wrote:

> Has anybody noticed that the domain id number seems to increment more
> than usual on xen unstable? the first domU that I start gets id 2,
> after a reboot its id 4, etc etc.
>
> I am also having some problems with "xm create" taking a very long
> time and sometimes failing because devices are still in use or
> "hotplug scripts are not responding", where I am using file backed
> block devices the loop device is sometimes left connected after the
> domain is shutdown, i have to losetup -d to remove it before i can
> start the domain again.

Allocation should still be sequential. Maybe xend is taking two goes to start a domain.

-- Keir

On Fri, Mar 6, 2009 at 7:42 PM, Keir Fraser <keir.fraser@eu.citrix.com> wrote:

> On 06/03/2009 19:20, "Andrew Lyon" <andrew.lyon@gmail.com> wrote:
>
>> Has anybody noticed that the domain id number seems to increment more
>> than usual on xen unstable? the first domU that I start gets id 2,
>> after a reboot its id 4, etc etc.
>>
>> I am also having some problems with "xm create" taking a very long
>> time and sometimes failing because devices are still in use or
>> "hotplug scripts are not responding", where I am using file backed
>> block devices the loop device is sometimes left connected after the
>> domain is shutdown, i have to losetup -d to remove it before i can
>> start the domain again.
>
> Allocation should still be sequential. Maybe xend is taking two goes to
> start a domain.
>
> -- Keir

The very first domain I create gets ID=2:

xm create win2008.cfg
Using config file "./win2008.cfg".
Started domain win2008 (id=2)

It should have ID 1.

I've also noticed that if I shut down the domain cleanly (i.e. from inside Windows) then I can start it again without any problems, but if I have to use xm destroy, then the next time I try to start the domain the xm process hangs. If I Ctrl+C the xm process, the domain is listed and paused; I can unpause it and it seems to work OK.

xenstore-ls shows that some of the domain's devices etc. are not removed. Should they be when it is destroyed?

Andy
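One way to check for leftover xenstore entries after xm destroy is to compare the domain IDs that still have backend entries with the IDs of the domains that are actually running. A rough sketch follows. It assumes block backends live under /local/domain/0/backend/vbd (usual when dom0 is the backend, but an assumption here) and that the ID is the second column of xm list output; stale_domids is a hypothetical helper, not part of the Xen tools.

```shell
#!/bin/sh
# Hypothetical helper: print every domain ID read from stdin (one per line)
# that is NOT among the running domain IDs passed as arguments.
stale_domids() {
    running=" $* "
    while IFS= read -r id; do
        case "$running" in
            *" $id "*) ;;                  # still running, entry is expected
            *) printf '%s\n' "$id" ;;      # no such domain, entry is stale
        esac
    done
}

# Usage in dom0 (the paths and the xm list column are assumptions, see above):
#   xenstore-list /local/domain/0/backend/vbd |
#       stale_domids $(xm list | awk 'NR > 1 { print $2 }')
```

Any ID this prints has backend entries but no running domain, which would match the symptom described above.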

On 08/03/2009 14:32, "Andrew Lyon" <andrew.lyon@gmail.com> wrote:

> I've also noticed that if i shutdown the domain cleanly (ie from
> inside windows) then I can start it again without any problems, but if
> i have to use xm destroy then when I next try to start the domain the
> xm process hangs, if I ctrl+c the xm process the domain is listed and
> paused, I can unpause it and it seems to work ok.
>
> xenstore-ls shows that some of the domain devices etc are not removed,
> should they be when it is destroyed?

What changeset of xen-unstable are you running and when did you start seeing this? I have my suspicions about changeset 19250, which was checked in on Monday -- you could try reverting it (hg revert 19250), re-install dom0 tools, and see if that helps.

-- Keir

On Sun, Mar 8, 2009 at 3:44 PM, Keir Fraser <keir.fraser@eu.citrix.com> wrote:

> On 08/03/2009 14:32, "Andrew Lyon" <andrew.lyon@gmail.com> wrote:
>
>> I've also noticed that if i shutdown the domain cleanly (ie from
>> inside windows) then I can start it again without any problems, but if
>> i have to use xm destroy then when I next try to start the domain the
>> xm process hangs, if I ctrl+c the xm process the domain is listed and
>> paused, I can unpause it and it seems to work ok.
>>
>> xenstore-ls shows that some of the domain devices etc are not removed,
>> should they be when it is destroyed?
>
> What changeset of xen-unstable are you running and when did you start seeing
> this? I have my suspicions about changeset 19250, which was checked in on
> Monday -- you could try reverting it (hg revert 19250), re-install dom0
> tools, and see if that helps.
>
> -- Keir

Keir,

I tried that but it didn't help. However, I noticed that there were several xen-hotplug-cleanup scripts running, and that script has changed between 3.3.1 and unstable. On my 3.3.1 system I can run the script and it exits very quickly, but on my unstable system the script can only be run once; after that it hangs trying to get the lock. I added a couple of echos to the script, as you can see below. Only the first one is executed; the second is not, and as the rest of the script is not run, the lock is never released.

So the offending line is vm=$(xenstore-read "/local/domain/${path_array[2]}/vm")

Andy

#! /bin/bash

dir=$(dirname "$0")
. "$dir/xen-hotplug-common.sh"

# Claim the lock protecting /etc/xen/scripts/block. This stops a race whereby
# paths in the store would disappear underneath that script as it attempted to
# read from the store checking for device sharing.
# Any other scripts that do similar things will have to have their lock
# claimed too.
# This is pretty horrible, but there's not really a nicer way of solving this.
claim_lock "block"

# split backend/DEVCLASS/VMID/DEVID on slashes
path_array=( ${XENBUS_PATH//\// } )
# get /vm/UUID path
echo "echo test 1"
vm=$(xenstore-read "/local/domain/${path_array[2]}/vm")
echo "echo test 2"
# construct /vm/UUID/device/DEVCLASS/DEVID
vm_dev="$vm/device/${path_array[1]}/${path_array[3]}"

# remove device frontend store entries
xenstore-rm -t \
  $(xenstore-read "$XENBUS_PATH/frontend" 2>/dev/null) 2>/dev/null || true

# remove device backend store entries
xenstore-rm -t "$XENBUS_PATH" 2>/dev/null || true
xenstore-rm -t "error/$XENBUS_PATH" 2>/dev/null || true

# remove device path from /vm/UUID
xenstore-rm -t "$vm_dev" 2>/dev/null || true

release_lock "block"
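The hang described above leaves the "block" lock held because the script never reaches release_lock. The sketch below illustrates two pieces of that script in isolation: how XENBUS_PATH is split into fields, and how a store read could be bounded in time so that a stuck xenstore-read cannot hold the lock forever. It is written in portable sh rather than with the bash array used in the script. xenbus_field and run_with_timeout are hypothetical helpers, timeout is the GNU coreutils command, and none of this is the upstream fix.

```shell
#!/bin/sh
# Hypothetical helper: print one field of a XENBUS_PATH of the form
# backend/DEVCLASS/VMID/DEVID. The index is 0-based so that it lines up
# with the path_array indices used in xen-hotplug-cleanup.
xenbus_field() {
    path="$1"
    index="$2"
    printf '%s\n' "$path" | cut -d/ -f"$((index + 1))"
}

# Hypothetical helper: run a command but give up after $1 seconds.
# GNU timeout exits with status 124 when the time limit is hit.
run_with_timeout() {
    secs="$1"
    shift
    timeout "$secs" "$@"
}

# Possible use inside the cleanup script (an illustration, not upstream code):
#   domid=$(xenbus_field "$XENBUS_PATH" 2)
#   vm=$(run_with_timeout 5 xenstore-read "/local/domain/$domid/vm") || vm=""
```

With a guard like this the script would carry on to release_lock even if the read stalls, at the cost of possibly skipping the /vm/UUID cleanup for that device. It works around the symptom and does not explain why the read blocks in the first place.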

> > Monday -- you could try reverting it (hg revert 19250), re-install dom0
> > tools, and see if that helps.
> >
> > -- Keir
>
> Keir,
>
> I tried that but it didn't help, however I noticed that there were
> several xen-hotplug-cleanup scripts running, and that script has
> changed between 3.3.1 and unstable, on my 3.3.1 system I can run the
> script and it exits very quickly, but on my unstable system the script
> can only be run once, after that it hangs trying to get the lock, I
> added a couple of echo's to the script as you can see below, only the
> first one is executed, the second is not and as the rest of the script
> is not run the lock is never released.
>
> So the offending line is vm=$(xenstore-read
> "/local/domain/${path_array[2]}/vm")
>
> Andy
>

Andy,

What Dom0 kernel are you actually running? I spent ages tracking down a problem that didn't seem to be affecting anyone else but it turned out to be a bug in the Linux Dom0 kernel.

The symptoms were (IIRC) that 'xm reboot' would shut the domain down but it would still be there in a cleaning-up state. If I executed 'xm create' it would complete the reboot, but bits and pieces were also left running in Xen and dom0, so you could only get the dom ID up to about 70 or so before the whole system ran out of resources.

The bug would only come about if you limited Dom0 to a single CPU via '(dom0-cpus 1)' in xend-config.sxp.

If you are running a kernel that roughly matches your version of unstable then this won't be your problem.

James

On Mon, Mar 9, 2009 at 10:35 AM, James Harper <james.harper@bendigoit.com.au> wrote:

>> > Monday -- you could try reverting it (hg revert 19250), re-install dom0
>> > tools, and see if that helps.
>> >
>> > -- Keir
>>
>> Keir,
>>
>> I tried that but it didn't help, however I noticed that there were
>> several xen-hotplug-cleanup scripts running, and that script has
>> changed between 3.3.1 and unstable, on my 3.3.1 system I can run the
>> script and it exits very quickly, but on my unstable system the script
>> can only be run once, after that it hangs trying to get the lock, I
>> added a couple of echo's to the script as you can see below, only the
>> first one is executed, the second is not and as the rest of the script
>> is not run the lock is never released.
>>
>> So the offending line is vm=$(xenstore-read
>> "/local/domain/${path_array[2]}/vm")
>>
>> Andy
>>
>
> Andy,
>
> What Dom0 kernel are you actually running? I spent ages tracking down a problem that didn't seem to be affecting anyone else but it turned out to be a bug in the linux Dom0 kernel.
>

The latest from http://xenbits.xensource.com/ext/linux-2.6.27-xen.hg

> The symptoms were (IIRC) that 'xm reboot' would shut the domain down but it would still be there in a cleaning up state. If I executed 'xm create' it would complete the reboot, but bits and pieces were also left running in xen and dom0 so you could only get dom id up to about 70 or so before the whole system ran out of resources.
>
> The bug would only come about if you limited Dom0 to a single CPU via '(dom0-cpus 1)' in xend-config.sxp.
>
> If you are running a kernel that roughly matches your version of unstable then this won't be your problem.
>
> James

I had the same bug, and SUSE support determined that you need at least two processors assigned to dom0; otherwise you get that odd behavior.

Federico

-----Original Message-----
From: xen-devel-bounces@lists.xensource.com [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Andrew Lyon
Sent: Monday, March 09, 2009 6:50 AM
To: James Harper
Cc: xen-devel@lists.xensource.com; Keir Fraser
Subject: Re: [Xen-devel] domain id number on xen unstable

On Mon, Mar 9, 2009 at 10:35 AM, James Harper <james.harper@bendigoit.com.au> wrote:

>> > Monday -- you could try reverting it (hg revert 19250), re-install dom0
>> > tools, and see if that helps.
>> >
>> > -- Keir
>>
>> Keir,
>>
>> I tried that but it didn't help, however I noticed that there were
>> several xen-hotplug-cleanup scripts running, and that script has
>> changed between 3.3.1 and unstable, on my 3.3.1 system I can run the
>> script and it exits very quickly, but on my unstable system the script
>> can only be run once, after that it hangs trying to get the lock, I
>> added a couple of echo's to the script as you can see below, only the
>> first one is executed, the second is not and as the rest of the script
>> is not run the lock is never released.
>>
>> So the offending line is vm=$(xenstore-read
>> "/local/domain/${path_array[2]}/vm")
>>
>> Andy
>>
>
> Andy,
>
> What Dom0 kernel are you actually running? I spent ages tracking down a problem that didn't seem to be affecting anyone else but it turned out to be a bug in the linux Dom0 kernel.
>

The latest from http://xenbits.xensource.com/ext/linux-2.6.27-xen.hg

> The symptoms were (IIRC) that 'xm reboot' would shut the domain down but it would still be there in a cleaning up state. If I executed 'xm create' it would complete the reboot, but bits and pieces were also left running in xen and dom0 so you could only get dom id up to about 70 or so before the whole system ran out of resources.
>
> The bug would only come about if you limited Dom0 to a single CPU via '(dom0-cpus 1)' in xend-config.sxp.
>
> If you are running a kernel that roughly matches your version of unstable then this won't be your problem.
>
> James

On 09/03/2009 10:49, "Andrew Lyon" <andrew.lyon@gmail.com> wrote:

>> What Dom0 kernel are you actually running? I spent ages tracking down a
>> problem that didn't seem to be affecting anyone else but it turned out to be
>> a bug in the linux Dom0 kernel.
>>
>
> The latest from http://xenbits.xensource.com/ext/linux-2.6.27-xen.hg

Yeah, that has the bug. Probably you have (dom0-cpus 1) or similar in your xend-config.sxp? I will check in the fix to the 2.6.27 tree.

Venefax: I'm pretty sure that the fix I will port across will fix the issue for dom0-cpus set to any value, even 1.

-- Keir

On 09/03/2009 11:01, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:

>>> What Dom0 kernel are you actually running? I spent ages tracking down a
>>> problem that didn't seem to be affecting anyone else but it turned out to be
>>> a bug in the linux Dom0 kernel.
>>>
>>
>> The latest from http://xenbits.xensource.com/ext/linux-2.6.27-xen.hg
>
> Yeah, that has the bug. Probably you have (dom0-cpus 1) or similar in your
> xend-config.sxp? I will check in the fix to the 2.6.27 tree.
>
> Venefax: I'm pretty sure that the fix I will port across will fix the issue
> for dom0-cpus set to any value, even 1.

Now applied to http://xenbits.xensource.com/ext/linux-2.6.27-xen.hg as changeset 1. Thanks James for catching this!

-- Keir

On Mon, Mar 09, 2009 at 10:49:53AM +0000, Andrew Lyon wrote:

> On Mon, Mar 9, 2009 at 10:35 AM, James Harper
> <james.harper@bendigoit.com.au> wrote:
> >> > Monday -- you could try reverting it (hg revert 19250), re-install dom0
> >> > tools, and see if that helps.
> >> >
> >> > -- Keir
> >>
> >> Keir,
> >>
> >> I tried that but it didn't help, however I noticed that there were
> >> several xen-hotplug-cleanup scripts running, and that script has
> >> changed between 3.3.1 and unstable, on my 3.3.1 system I can run the
> >> script and it exits very quickly, but on my unstable system the script
> >> can only be run once, after that it hangs trying to get the lock, I
> >> added a couple of echo's to the script as you can see below, only the
> >> first one is executed, the second is not and as the rest of the script
> >> is not run the lock is never released.
> >>
> >> So the offending line is vm=$(xenstore-read
> >> "/local/domain/${path_array[2]}/vm")
> >>
> >> Andy
> >>
> >
> > Andy,
> >
> > What Dom0 kernel are you actually running? I spent ages tracking down a problem that didn't seem to be affecting anyone else but it turned out to be a bug in the linux Dom0 kernel.
> >
>
> The latest from http://xenbits.xensource.com/ext/linux-2.6.27-xen.hg
>

AFAIK that tree is not maintained by anyone, i.e. there was only the initial commit, and the known bugs etc. are not fixed in that tree.

I don't know why the Novell guys don't submit their patches/fixes there.

-- Pasi

On Mon, Mar 9, 2009 at 11:06 AM, Keir Fraser <keir.fraser@eu.citrix.com> wrote:

> On 09/03/2009 11:01, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:
>
>>>> What Dom0 kernel are you actually running? I spent ages tracking down a
>>>> problem that didn't seem to be affecting anyone else but it turned out to be
>>>> a bug in the linux Dom0 kernel.
>>>>
>>>
>>> The latest from http://xenbits.xensource.com/ext/linux-2.6.27-xen.hg
>>
>> Yeah, that has the bug. Probably you have (dom0-cpus 1) or similar in your
>> xend-config.sxp? I will check in the fix to the 2.6.27 tree.

I don't.

xend-config.sxp:# In SMP system, dom0 will use dom0-cpus # of CPUS
xend-config.sxp:# If dom0-cpus = 0, dom0 will take all cpus available
xend-config.sxp:(dom0-cpus 0)

>>
>> Venefax: I'm pretty sure that the fix I will port across will fix the issue
>> for dom0-cpus set to any value, even 1.
>
> Now applied to http://xenbits.xensource.com/ext/linux-2.6.27-xen.hg as
> changeset 1. Thanks James for catching this!
>
> -- Keir
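The check shown in the output above can be made a little stricter by ignoring the commented-out mentions and printing only the active dom0-cpus value. A small sketch, assuming the setting appears as "(dom0-cpus N)" at the start of a line as in the output above; dom0_cpus_setting is a hypothetical helper and the file location varies by installation.

```shell
#!/bin/sh
# Hypothetical helper: print the active (uncommented) dom0-cpus value from
# the xend-config.sxp file given as $1. Lines starting with '#' are skipped
# because the pattern only matches a line that begins with "(dom0-cpus".
dom0_cpus_setting() {
    sed -n 's/^[[:space:]]*(dom0-cpus[[:space:]][[:space:]]*\([0-9][0-9]*\)).*/\1/p' "$1" |
        tail -n 1    # if the setting appears more than once, report the last
}

# Usage (the path is an assumption; adjust to where xend-config.sxp lives):
#   dom0_cpus_setting /etc/xen/xend-config.sxp
```

According to the comment in the file quoted above, a value of 0 means dom0 takes all available CPUs, while 1 is the single-CPU case discussed earlier in the thread.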

On 09/03/2009 11:07, "Andrew Lyon" <andrew.lyon@gmail.com> wrote:

>>> Yeah, that has the bug. Probably you have (dom0-cpus 1) or similar in your
>>> xend-config.sxp? I will check in the fix to the 2.6.27 tree.
>
> I dont.
>
> xend-config.sxp:# In SMP system, dom0 will use dom0-cpus # of CPUS
> xend-config.sxp:# If dom0-cpus = 0, dom0 will take all cpus available
> xend-config.sxp:(dom0-cpus 0)

Oh well, it has to be a different issue then really, I think. Unfortunately the 2.6.27 tree isn't being actively maintained or tested by us, although I'll port across fixes as and when they are pointed out, as in this case. Undoubtedly the best place to get a supported Suse 2.6.2x tree is from Suse. ;-)

-- Keir

On Mon, Mar 9, 2009 at 11:17 AM, Keir Fraser <keir.fraser@eu.citrix.com> wrote:

> On 09/03/2009 11:07, "Andrew Lyon" <andrew.lyon@gmail.com> wrote:
>
>>>> Yeah, that has the bug. Probably you have (dom0-cpus 1) or similar in your
>>>> xend-config.sxp? I will check in the fix to the 2.6.27 tree.
>>
>> I dont.
>>
>> xend-config.sxp:# In SMP system, dom0 will use dom0-cpus # of CPUS
>> xend-config.sxp:# If dom0-cpus = 0, dom0 will take all cpus available
>> xend-config.sxp:(dom0-cpus 0)
>
> Oh well, it has to be a different issue then really, I think. Unfortunately
> the 2.6.27 tree isn't being actively maintained or tested by us, although
> I'll port across fixes as and when they are pointed out, as in this case.
> Undoubtedly the best place to get a supported Suse 2.6.2x tree is from Suse.
> ;-)
>
> -- Keir

I maintain a Gentoo ebuild for the dom0 kernel which uses the openSUSE Xen patches, rebased to apply to vanilla without the usual extra patches that openSUSE uses. The most recent SUSE Xen kernel is 2.6.27.13, which I've also tried with Xen unstable; the same problem exists.

Of course that kernel could still be lacking something which is causing this bug. Replacing xen-hotplug-cleanup with the one from 3.3.1 seems to work a lot better, but it's probably not cleaning up all objects and something will break eventually.

There's nothing more I can do; the xensource kernel does not support the hardware on my test system sufficiently to boot and test.

Andy

On 09/03/2009 19:18, "Andrew Lyon" <andrew.lyon@gmail.com> wrote:

> I maintain a gentoo ebuild for dom0 kernel which uses opensuse xen
> patches rebased to apply to vanilla without the usual extra patches
> that opensuse uses, the most recent suse xen kernel is 2.6.27.13 which
> I've also tried with Xen unstable, the same problem exists.
>
> Of course that kernel could still be lacking something which is
> causing this bug, replacing xen-hotplug-cleanup with the one from
> 3.3.1 seems to work a lot better, but its probably not cleaning up all
> objects and something will break eventually.
>
> Theres nothing more I can do, the xensource kernel does not support
> the hardware on my test system sufficiently to boot and test.

The pv_ops patchqueue from Jeremy will be the future maintained kernel for Xen, but although it's inching ever closer I think it's not really quite there yet still. Limited manpower is slowing down that effort.

With 2.6.18, the 'XenSource' product kernel has loads of drivers backported onto it. I'll look into getting that better published or perhaps a snapshot taken. Obviously it's distributed with Citrix products and freely redistributable, but it's not currently on xenbits.

-- Keir