George Shuklin
2010-Jul-21 22:04 UTC
[Xen-users] XCP - grace arp missing while migrating VM
This problem was already present in XCP 0.11, and I can now confirm it in XCP 0.5 as well.

When an idle VM is migrated between hosts, no ARP packets are sent on behalf of the VM (or by the guest kernel), so for a few minutes (depending on the settings of the switches in the network) the VM becomes unreachable over IP.

In more detail:

Diagram (monospace font):

Before migration:
+-----------+
|           |----port1 ----[HOST1 (VM) ]
|L2 switch  |
|           |----port2 ----[HOST2]
+-----------+

After:
+-----------+
|           |----port1 ----[HOST1 ]
|L2 switch  |
|           |----port2 ----[HOST2 (VM) ]
+-----------+

When the VM boots on host1, it sends an ARP packet to its neighbors, and the switch learns that any traffic for MAC=XX:XX:XX:XX:XX:XX must be sent to port1.

After that, if we migrate the VM to host2 and the VM does not send any data, the switch does not learn about the port change and continues to direct all traffic for MAC XX:XX:XX:XX:XX:XX to port1 (where it is silently ignored by the OVS/dom0 kernel).

This continues until either:

1) the VM sends any packet out, or
2) the switch decides to update its MAC-to-port table.

This is a real problem: in my test, a Debian guest without any network services did not send any data out for hours, and the switch does not refresh its table often.

How can this be fixed? I see two ways:

1) dom0 on host2 sends a fake ARP packet after accepting the migrated VM (a single packet does not cause problems for anyone). In this case dom0 controls the "network migration" process.

2) The domU, after receiving the migration event, initiates an ARP broadcast.

I don't know which way is better (#1 seems more controllable to me), but without one of these actions, migration will sometimes create the illusion of a hung VM (no reply to pings).

---
wBR, George.

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
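For illustration, the "fake ARP packet" in option 1 could be a gratuitous ARP request built as in the sketch below. This is only a minimal example under stated assumptions, not what XCP or OVS actually does: the function names, the interface name passed to send_frame, and the example MAC/IP values are all hypothetical. The part that matters for the switch is the Ethernet source address: a switch learns the port from the source MAC of frames it receives, so a broadcast frame carrying the VM's MAC and leaving through host2 should move the entry to port2.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6
ETHERTYPE_ARP = 0x0806


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_gratuitous_arp(vm_mac: str, vm_ip: str) -> bytes:
    """Build a broadcast ARP request in which sender IP == target IP."""
    src = mac_to_bytes(vm_mac)
    ip = socket.inet_aton(vm_ip)
    # Ethernet header: destination, source, EtherType.
    eth_header = BROADCAST_MAC + src + struct.pack("!H", ETHERTYPE_ARP)
    # ARP fixed part: hardware type 1 (Ethernet), protocol 0x0800 (IPv4),
    # hardware length 6, protocol length 4, opcode 1 (request).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    arp += src + ip           # sender MAC and sender IP
    arp += b"\x00" * 6 + ip   # target MAC unset, target IP same as sender
    return eth_header + arp


def send_frame(ifname: str, frame: bytes) -> None:
    """Send a raw frame on ifname (Linux only, requires root)."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW) as sock:
        sock.bind((ifname, 0))
        sock.send(frame)


# Hypothetical values, for illustration only.
frame = build_gratuitous_arp("00:16:3e:12:34:56", "192.0.2.10")
```

On the destination host, something like send_frame("eth0", frame) would be called once the migrated VM has been resumed; the interface name is an assumption and depends on the bridge layout in dom0.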
Frank Pikelner
2010-Jul-23 18:11 UTC
Re: [Xen-users] XCP - grace arp missing while migrating VM
On Thu, 2010-07-22 at 02:04 +0400, George Shuklin wrote:
> [...]
> I see two ways:
> 1) dom0 in host2 sent fake arp packet after accepting migrating VM
> (single packet does not create any problem to anyone). In this case dom0
> controls "network migration" process
>
> 2) domU after receiving event about migration initiate arp broadcast
> message.
> [...]

If a machine is idle, the switch CAM tables may not even contain its MAC address if there has been no network traffic for a period of time. That remains the case until the machine either generates traffic or an incoming packet is flooded to all ports to locate the machine. If I'm not mistaken, when a physical machine is disconnected from a physical switch port, the MAC table entries for that port are cleared. In a VM migration where there are multiple virtual switches, it is probably the switches that need to clear their CAM table entries, rather than the machines sending out gratuitous ARP requests, since the machines should not really have to be aware that they are being migrated (i.e. the operational state is saved, then restored, and the machine continues on).

Best,
Frank Pikelner
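The CAM table behaviour discussed in this thread can be modelled with a small sketch. It is a toy model under stated assumptions, not the behaviour of any particular switch or of OVS: the class and method names are hypothetical, and the 300-second aging time is only an assumed default. A lookup that returns None stands for "unknown destination, flood to all ports".

```python
class LearningSwitch:
    """Toy model of a MAC learning table with aging (hypothetical names)."""

    def __init__(self, aging_seconds: float = 300.0):
        self.aging_seconds = aging_seconds
        self.table = {}  # mac -> (port, time the MAC was last seen as a source)

    def receive(self, src_mac: str, in_port: int, now: float) -> None:
        """Learn (or refresh) the port from the source MAC of a frame."""
        self.table[src_mac] = (in_port, now)

    def port_down(self, port: int) -> None:
        """Drop every entry learned on a port whose link went down."""
        self.table = {m: e for m, e in self.table.items() if e[0] != port}

    def lookup(self, dst_mac: str, now: float):
        """Return the learned port, or None if the frame must be flooded."""
        entry = self.table.get(dst_mac)
        if entry is None:
            return None
        port, last_seen = entry
        if now - last_seen > self.aging_seconds:
            del self.table[dst_mac]  # entry aged out
            return None
        return port


switch = LearningSwitch()
vm = "XX:XX:XX:XX:XX:XX"
switch.receive(vm, in_port=1, now=0.0)       # VM boots on host1 and sends ARP
stale = switch.lookup(vm, now=20.0)          # VM silently moved to host2: still port 1
switch.receive(vm, in_port=2, now=21.0)      # any frame from the VM via host2
relearned = switch.lookup(vm, now=22.0)      # now port 2
flooded = switch.lookup(vm, now=1000.0)      # idle past the aging time: None
```

In this model the stale entry disappears in three ways: a frame from the VM arrives on the new port, the entry ages out, or the old port goes down. During a live migration the physical link of host1 stays up, so port_down never fires for it, which is why the entry can outlive the move.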