Kris
2012-Feb-08 19:44 UTC
Question about xen network and strange happenings with migrating a node
Hi,

I've been experiencing some issues with Xen-3.4.2-2.el5 (from gitco) on CentOS 5.4 when migrating VMs from one physical node to another. Each VM has 3 network interfaces (vifs).

When doing a live migration (xm migrate --live <params>), the network of the migrated VM seems to go into a very odd state once the migration is done. I've searched high and low for documentation on what the state, evt-ch and tx-/rx-ring-ref fields mean with their particular integer values, but 1, -1 and -1/-1 respectively seem to indicate that something has gone badly. A state of 4 seems to indicate that the interface is working.

Here is the output of xm network-list VM on the node the VM was migrated *to*. In this case all of the interfaces are broken, but sometimes only one or two are in this state:

Idx BE MAC Addr.   handle state evt-ch tx-/rx-ring-ref BE-path
0   0  <REDACTED>  0      1     -1     -1   /-1        /local/domain/0/backend/vif/2/0
1   0  <REDACTED>  1      1     -1     -1   /-1        /local/domain/0/backend/vif/2/1
2   0  <REDACTED>  2      1     -1     -1   /-1        /local/domain/0/backend/vif/2/2

The result is that the network interfaces with the output above do not work.

Is there anything I can do to debug this better? Is there any documentation I'm missing that would explain this stuff? Any troubleshooting help is appreciated. I've looked at the xend logs and there doesn't seem to be anything indicative of why this is occurring.

Thanks,
Kris
Ian Campbell
2012-Feb-09 13:11 UTC
Re: Question about xen network and strange happenings with migrating a node
On Wed, 2012-02-08 at 19:44 +0000, Kris wrote:
> [...]
> Is there anything I can do to perhaps debug this better? Is there any
> documentation I'm missing out that would explain this stuff? Any
> troubleshooting help is appreciated. I've looked at the xend logs and
> there doesn't seem to be anything indicative of why this is occurring.

There's a good chance this is a guest kernel issue. What sort of guests are they and what kernel version are they running? Is there anything in the guest kernel logs or on the guest console?

The output of "xenstore-ls -fp" after this has happened would also be potentially interesting.

Ian.
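As a rough illustration of how a "xenstore-ls -fp" dump could be checked once it has been captured, here is a minimal Python sketch. It assumes each dump line looks like `/path = "value"` optionally followed by a permissions suffix, and that a working netfront publishes `state`, `event-channel`, `tx-ring-ref` and `rx-ring-ref` under `/local/domain/<domid>/device/vif/<n>/`. The sample dump, the helper names and the domain ID 2 (taken from the BE-path in the original post) are illustrative assumptions, not output from the affected system.

```python
import re

# Keys a netfront driver is assumed to publish in its frontend directory.
REQUIRED_FRONTEND_KEYS = ("state", "event-channel", "tx-ring-ref", "rx-ring-ref")

# Assumed line shape: /some/path = "value"   (optional permissions)
LINE_RE = re.compile(r'^(?P<path>/\S+) = "(?P<value>.*?)"')


def parse_dump(text):
    """Turn dump text into a {path: value} dictionary, skipping other lines."""
    entries = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            entries[match.group("path")] = match.group("value")
    return entries


def missing_frontend_keys(entries, domid):
    """Return {vif index: [required keys absent from that vif's frontend dir]}."""
    prefix = "/local/domain/%d/device/vif/" % domid
    seen = {}
    for path in entries:
        if not path.startswith(prefix):
            continue
        parts = path[len(prefix):].split("/")
        keys = seen.setdefault(parts[0], set())
        if len(parts) == 2:
            keys.add(parts[1])
    return {dev: [k for k in REQUIRED_FRONTEND_KEYS if k not in keys]
            for dev, keys in sorted(seen.items())}


# Hypothetical dump: vif 0 never republished its rings, vif 1 did.
SAMPLE = '''
/local/domain/2/device/vif/0 = ""   (n2,r0)
/local/domain/2/device/vif/0/state = "1"   (n2,r0)
/local/domain/2/device/vif/1 = ""   (n2,r0)
/local/domain/2/device/vif/1/state = "4"   (n2,r0)
/local/domain/2/device/vif/1/event-channel = "9"   (n2,r0)
/local/domain/2/device/vif/1/tx-ring-ref = "768"   (n2,r0)
/local/domain/2/device/vif/1/rx-ring-ref = "769"   (n2,r0)
'''

if __name__ == "__main__":
    print(missing_frontend_keys(parse_dump(SAMPLE), 2))
```

A vif whose list of missing keys is non-empty would be a candidate for the broken state described in the original post.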
Kris
2012-Feb-09 15:16 UTC
Re: Question about xen network and strange happenings with migrating a node
Hi Ian,

Thanks so much for the response.

The guest kernel is 2.6.32-71.29.1.el6.x86_64 on a CentOS 6.0 system. However, this also happens with other kernel versions on CentOS 5.x guests. I've had it happen *anecdotally* on systems from CentOS 5.0 through 6.0.

I'll try to replicate this and run 'xenstore-ls -fp' afterwards.

If there's anything else you or anyone can think of, please suggest it. In particular, some sort of explanation of what those integers mean in the network-list output would help.

Thanks again,
Kris

On 2012-02-09, at 8:11 AM, Ian Campbell wrote:
> There's a good chance this is a guest kernel issue. What sort of guests
> are they and what kernel version are they running? Is there anything in
> the guest kernel logs or on the guest console?
>
> The output of "xenstore-ls -fp" after this has happened would also be
> potentially interesting.
>
> Ian.
Ian Campbell
2012-Feb-09 15:28 UTC
Re: Question about xen network and strange happenings with migrating a node
Please don't top post.

On Thu, 2012-02-09 at 15:16 +0000, Kris wrote:
> Especially some sort of explanation of what those integers mean in the
> network-list?

IIRC those numbers come from the xenstore entry for the device. I expect -1 means "entry is missing". The guest should have written them, hence the question about guest logs and versions etc.

Ian.
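To make the reading of those integers concrete, here is a small Python sketch that decodes one data row of the network-list output. The state names follow the XenbusState enumeration in Xen's public io/xenbus.h header (1 is Initialising, 4 is Connected). Treating -1 as "entry missing from xenstore" follows the expectation stated above rather than confirmed documentation, and the function name and the assumed column order (copied from the output in the original post) are illustrative.

```python
# XenbusState values as defined in Xen's public io/xenbus.h header.
XENBUS_STATES = {
    0: "Unknown",
    1: "Initialising",
    2: "InitWait",
    3: "Initialised",
    4: "Connected",
    5: "Closing",
    6: "Closed",
    7: "Reconfiguring",
    8: "Reconfigured",
}


def describe_vif(row):
    """Summarise one data row of `xm network-list` output.

    Assumes whitespace-separated columns in the order
    Idx BE MAC handle state evt-ch tx-ring-ref /rx-ring-ref BE-path,
    with the rx value carrying the leading "/" seen in the thread.
    """
    idx, _be, _mac, _handle, state, evtch, tx, rx, _be_path = row.split()
    state = int(state)
    refs = (
        ("event-channel", int(evtch)),
        ("tx-ring-ref", int(tx)),
        ("rx-ring-ref", int(rx.lstrip("/"))),
    )
    # Assumption: -1 means the guest never wrote the entry to xenstore.
    missing = [name for name, value in refs if value == -1]
    return {
        "idx": int(idx),
        "state": XENBUS_STATES.get(state, "unrecognised"),
        "missing": missing,
        "connected": state == 4 and not missing,
    }


if __name__ == "__main__":
    row = "0 0 <REDACTED> 0 1 -1 -1 /-1 /local/domain/0/backend/vif/2/0"
    print(describe_vif(row))
```

Under these assumptions, the rows from the original post decode as state Initialising with all three frontend entries missing, which would fit a frontend that never reconnected after the migration.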
Kris
2012-Feb-09 20:14 UTC
Re: Question about xen network and strange happenings with migrating a node
On 2012-02-09, at 8:11 AM, Ian Campbell wrote:
> There's a good chance this is a guest kernel issue. What sort of guests
> are they and what kernel version are they running? Is there anything in
> the guest kernel logs or on the guest console?

I did some digging around in the guest's kernel logs post-migration and found an entry that indicates the failure.

Feb 1 18:21:03 XX kernel: netfront: device eth0 has copying receive path.
Feb 1 18:21:03 XX kernel: netfront: device eth1 has copying receive path.
Feb 1 18:21:03 XX kernel: vif vif-3: 2 reading other end details from device/vif/3
Feb 1 18:21:03 XX kernel: xenbus: resume (talk_to_otherend) vif-3 failed: -2
Feb 1 18:21:03 XX kernel: Initializing CPU#1
Feb 1 18:21:03 XX kernel: Initializing CPU#2
Feb 1 18:21:03 XX kernel: Initializing CPU#3

Notice lines 3 and 4. I'll continue to try to replicate this, but if this rings any bells, please comment.

Kris

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
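For readers trying to interpret the resume failure in the guest log above, the following Python sketch extracts the error code from such a line and maps it to its errno name. It assumes the kernel reports a negated Linux errno value, in which case -2 corresponds to ENOENT ("no such file or directory"). That would be consistent with a missing xenstore entry, though this is an interpretation and not something the thread confirms. The regular expression and function name are illustrative.

```python
import errno
import os
import re

# Matches lines such as:
#   xenbus: resume (talk_to_otherend) vif-3 failed: -2
RESUME_FAIL_RE = re.compile(
    r"xenbus: resume \((?P<step>\w+)\) (?P<device>\S+) failed: (?P<code>-?\d+)"
)


def decode_resume_failure(line):
    """Return details of a xenbus resume failure, or None if the line differs."""
    match = RESUME_FAIL_RE.search(line)
    if match is None:
        return None
    # Assumption: the logged value is a negated errno, so take its magnitude.
    code = abs(int(match.group("code")))
    return {
        "device": match.group("device"),
        "step": match.group("step"),
        "errno": errno.errorcode.get(code, "unknown"),
        "message": os.strerror(code),
    }


if __name__ == "__main__":
    line = ("Feb 1 18:21:03 XX kernel: xenbus: resume "
            "(talk_to_otherend) vif-3 failed: -2")
    print(decode_resume_failure(line))
```

The same helper could be run over a whole guest log to list every device whose resume step failed and with which error.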