Thank you Ian and Niels for your help with the memory problem.

Now onto the next one: I start the migration and it gets to the second iteration (moving memory, I would assume), then freezes, and the guest domain itself is locked up.

I reviewed the logs for any info and couldn't see any, but I am new to migration so there could easily be something I am missing.

xend.log and xend-debug.log were both last modified on Jul 9 and xen-hotplug.log was modified on Sept 26, so I don't think they are relevant. xl-Win98.log and qemu-dm-Win81.log are included from the migration that I started on Friday; it ran all weekend with the result I described above. I have included the output from the Friday attempt in xl_migrate_Fri.txt and from a previous attempt where I turned on verbose output in Verbose_xl_migrate_output.txt.

I scoured the internet as much as I could Friday and couldn't find any reference to a situation like this, but if I missed something I apologize.

Any insight would be greatly appreciated.

Thanks

--
Shane D. Johnson
IT Administrator
Rasmussen Equipment
In digging deeper into this, I was looking at the CPU flags and noticed that they differ considerably between the two machines. Could this cause an issue? I thought CPU flags were only an issue in XCP. If anyone has any insights it would be greatly appreciated.

These are from the Destination:

fpu de tsc msr pae mce cx8 apic mca cmov pat clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow rep_good nopl extd_apicid pni cx16 hypervisor lahf_lm cmp_legacy extapic cr8_legacy 3dnowprefetch

And these are from the Originator:

fpu de tsc msr pae mce cx8 apic mca cmov pat clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm constant_tsc rep_good nopl nonstop_tsc extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 popcnt aes f16c hypervisor lahf_lm cmp_legacy extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch xop fma4 tce tbm perfctr_core perfctr_nb arat cpb hw_pstate

Thank you.
Shane
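For anyone comparing flag lists like the two above, a set difference makes the gap easier to see than reading the lists side by side. The following is only a sketch: the two strings are the lists as posted in this thread, and parse_flags and compare_flags are helper names made up for illustration.

```python
# Flag lists as posted in this thread for the two hosts.
DESTINATION = """
fpu de tsc msr pae mce cx8 apic mca cmov pat clflush mmx fxsr sse sse2 ht
syscall nx mmxext fxsr_opt lm 3dnowext 3dnow rep_good nopl extd_apicid pni
cx16 hypervisor lahf_lm cmp_legacy extapic cr8_legacy 3dnowprefetch
"""

ORIGINATOR = """
fpu de tsc msr pae mce cx8 apic mca cmov pat clflush mmx fxsr sse sse2 ht
syscall nx mmxext fxsr_opt lm constant_tsc rep_good nopl nonstop_tsc
extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 popcnt aes f16c
hypervisor lahf_lm cmp_legacy extapic cr8_legacy abm sse4a misalignsse
3dnowprefetch xop fma4 tce tbm perfctr_core perfctr_nb arat cpb hw_pstate
"""


def parse_flags(text):
    """Split a whitespace-separated flags string into a set of names."""
    return set(text.split())


def compare_flags(source, destination):
    """Return (flags only on the source, flags only on the destination)."""
    src = parse_flags(source)
    dst = parse_flags(destination)
    return src - dst, dst - src


missing, extra = compare_flags(ORIGINATOR, DESTINATION)
print("Only on originator: ", " ".join(sorted(missing)))
print("Only on destination:", " ".join(sorted(extra)))
```

Any flag that shows up only on the originator is a feature a guest started there might already be relying on, which is why that direction is the one to look at first.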
On Mon, 2013-10-07 at 16:44 -0600, Shane Johnson wrote:
> In digging deeper into this, I was looking at the CPU flags and
> noticed that between the two machines, they are considerably
> different. Could this cause an issue? I thought CPU flags were only
> an issue in XCP.

CPU flags are always an issue, since a guest which starts on one host expects to find at least the same set of processor features on the destination too.

You can use the cpuid syntax in your cfg file to hide features on the source domain, or I think there is a way to do it host-wide with a Xen command line parameter. It's a bit more manual with regular Xen as opposed to XCP, and I'm afraid I'm not 100% sure of the details. I suppose the docs and/or wiki ought to have something.

Ian.
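As a rough illustration of the cfg-file approach mentioned above, the sketch below formats a cpuid line in the libxl string form described in xl.cfg(5), i.e. "host" followed by comma-separated key=value pairs. Several things here are assumptions: build_cpuid_line is a hypothetical helper, the flag names are taken from /proc/cpuinfo-style output and may not match the key names libxl accepts, and the chosen flags are only an example. Check every key against xl.cfg(5) before using the result.

```python
def build_cpuid_line(flags_to_hide):
    """Format a libxl-style cpuid setting that clears each listed feature.

    Assumption: each name in flags_to_hide is a key libxl recognises;
    this function does not validate the names.
    """
    parts = ["host"] + ["{}=0".format(flag) for flag in flags_to_hide]
    return 'cpuid = "{}"'.format(",".join(parts))


# Illustrative subset of features present on one host but not the other.
example_flags = ["ssse3", "sse4_1", "sse4_2", "popcnt", "aes"]
print(build_cpuid_line(example_flags))
```

The generated line would go in the guest's cfg file on the host with the larger feature set, so that the guest never sees features the other host lacks.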
On Mon, 2013-10-07 at 08:47 -0600, Shane Johnson wrote:
> relevant. xl-Win98.log and qemu-dm-Win81.log are included from the

You might find the *-Win98-incoming*.log on the target contains some of the more interesting stuff; from the sending end things look mostly ok. I don't know if all of the

    xc: detail: type fail: page 136 mfn 0001f888

stuff is worrying or not...
Ian,

Thank you for the info. I can't find the log file you mentioned, though. Where should it be located, and what does it mean if it doesn't exist?

Thanks
Shane
On Tue, 2013-10-08 at 07:23 -0600, Shane Johnson wrote:
> Ian,
> Thank you for the info, I can't find the log file you mentioned though
> where should it be located

It should be in /var/log/xen on the target system, i.e. the one you are migrating to.

> and what does it mean if it doesn't exist?

I'm not sure. Your sender-side logs seem to suggest that something is receiving, so I'd have thought the log really should be there...
On Tue, Oct 8, 2013 at 7:32 AM, Ian Campbell <ian.campbell@citrix.com> wrote:
> I'm not sure, your sender side logs seem to suggest that something is
> receiving, so I'd have thought the log really should be there...

Nope, no incoming log.

I did some more testing and here is what I found. First some details: scutter is the oldest machine, builder is next, and sdj is the newest. I tested starting the DomU on each machine and seeing where I could migrate to. I can migrate from scutter to either of the other machines, but that is the only source-to-destination combination that works. With this, I am thinking it's the CPU flags. During all these tests, the *-incoming*.log file wasn't created until the end of the migration, so there is no info there on why they are locking up.

I guess I am lucky these machines don't have to be up 24/7, so I can do after-hours migrations. Since it's not essential that I have live migration, I am going to have to put research on masking the flags on hold for now.

Ian, you mentioned that the flags aren't as big of a problem with XCP. Could you provide a quick and dirty explanation of why?

Thank you Ian for all your help.

--
Shane D. Johnson
On Tue, 2013-10-08 at 09:15 -0600, Shane Johnson wrote:
> With this, I am thinking it's the CPU flags. During all these tests,
> the *-incoming*.log file wasn't created until the end of the migration
> so no info there on why they are locking up.

It might be possible to hack things so that the "xl migrate-receive" on the target gets -vvv too. ISTR having to change xl_cmdimpl.c to do this though...

It might also be worth having a look at Xen's dmesg on the target.

> Ian, you mentioned that the flags aren't as big of a problem with XCP.
> Could you provide a quick and dirty explanation on why?

XCP has more user interface to simplify things when doing host-wide levelling, by figuring out some of the magic numbers you need.

Ian.
<snip>
> It might be possible to hack things so that the "xl migrate-receive" on
> the target gets -vvv too. ISTR having to change xl_cmdimpl.c to do this
> though...

So using the -s option in the migrate command wouldn't suffice for passing the -vvv? I thought I saw something on the web yesterday about passing special ssh commands through using that switch. I'll do some digging when I get a chance to see if I can come up with something.

I'll take a look at possibly moving to XCP and see if that will alleviate some of the pain.

Thanks again.

--
Shane D. Johnson
Shane said:
> I'll take a look at possibly moving to XCP and see if that will alleviate some of the pain.

Hi Shane - I'm not sure if that's a stop-gap fix, but I believe XCP releases have stopped now that XenServer has been open-sourced. Maybe there is some possibility of improving the method XenServer uses to resolve the issue, or something?

Cheers,

Mitch.
Mitch,

Thanks for the info. When I get to that point, I will be exploring all options to see what is going to work best. I am thinking I may transition from virtualization to a true private cloud. We'll have to see when I get to that point.

Shane