We have seen cases where, when a guest is shutdown during heavy network traffic, the backend vif hangs around for considerable amounts of time holding up the xenbus thread for the duration of the cleanup. We have seen cases (HP, Novell) where it has taken hours to cleanup the interface and for that duration we cannot launch new guests. We have traced this problem to the fact that the front-end can shutdown with transmit resources still in transit on the back-end side. On the back-end side, the moment the carrier is turned off on the interface, the interface is not eligible for further scheduling in the backend. Since turning off the carrier happens completely asynchronously with the rest of the packet processing in the back-end (via tasklets today), we can have situations where the vif cannot be cleaned up for a significant amount of time. I am attaching a patch for addressing this issue. I have tried to fix the problem with minimal changes to the current code. The main idea is to cleanup the guest transmit resources quickly and aggressively. With this patch, the cleanup time has been dramatically reduced. Even in this patch, the xenbus thread is held up for the duration of the cleanup - however we attempt to cleanup aggressively. This is in comparison to the current code where the xenbus thread just waits for the refcnt on the interface to drop to zero. An obvious improvement would be to not tie up the xenbus thread for the duration of cleaning up the interface. If there is interest, I can submit a patch for doing this. This patch has been tested on 2.6.16 (SLES10 sp2, sp3 kernels) as well as 2.6.27 (sles11) kernel. This patch is however against the 2.6.18-xen.hg sources. Signed-off-by: K. Y. Srinivasan <ksrinivasan@novell.com> Regards, K. Y _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
This looks like a hack. I think you''d get most of the benefit by simply always calling net_tx_action_dealloc() at the top of net_tx_action() (i.e., remove the dealloc_cons!=dealloc_prod conditional). I think the presence of that conditional is actually a bug, and removing it should get interface shutdowns to a reasonable time with no other changes. -- Keir On 06/12/2009 17:18, "Ky Srinivasan" <ksrinivasan@novell.com> wrote:> We have seen cases where, when a guest is shutdown during heavy network > traffic, the backend vif hangs around for considerable amounts of time holding > up the xenbus thread for the duration of the cleanup. We have seen cases (HP, > Novell) where it has taken hours to cleanup the interface and for that > duration we cannot launch new guests. We have traced this problem to the fact > that the front-end can shutdown with transmit resources still in transit on > the back-end side. On the back-end side, the moment the carrier is turned off > on the interface, the interface is not eligible for further scheduling in the > backend. Since turning off the carrier happens completely asynchronously with > the rest of the packet processing in the back-end (via tasklets today), we can > have situations where the vif cannot be cleaned up for a significant amount of > time. > > I am attaching a patch for addressing this issue. I have tried to fix the > problem with minimal changes to the current code. The main idea is to cleanup > the guest transmit resources quickly and aggressively. With this patch, the > cleanup time has been dramatically reduced. Even in this patch, the xenbus > thread is held up for the duration of the cleanup - however we attempt to > cleanup aggressively. This is in comparison to the current code where the > xenbus thread just waits for the refcnt on the interface to drop to zero. An > obvious improvement would be to not tie up the xenbus thread for the duration > of cleaning up the interface. If there is interest, I can submit a patch for > doing this. This patch has been tested on 2.6.16 (SLES10 sp2, sp3 kernels) as > well as 2.6.27 (sles11) kernel. This patch is however against the > 2.6.18-xen.hg sources. > > Signed-off-by: K. Y. Srinivasan <ksrinivasan@novell.com> > > > > Regards, > > K. Y >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>>> On 12/6/2009 at 4:44 PM, in message <C741D648.3B7A%keir.fraser@eu.citrix.com>,Keir Fraser <keir.fraser@eu.citrix.com> wrote:> This looks like a hack. I think you''d get most of the benefit by simply > always calling net_tx_action_dealloc() at the top of net_tx_action() (i.e., > remove the dealloc_cons!=dealloc_prod conditional). I think the presence of > that conditional is actually a bug, and removing it should get interface > shutdowns to a reasonable time with no other changes.That was my first impression as well and was the first patch we tested. It does not work. Take the case where we have only one netif that netback is handling that gets into this weird state, then we would never call net_tx_action_dealloc() if the carrier is turned off on the netif that is being cleaned up. This patch, fixes that problem by forcefully scheduling the netif whose carrier has been turned off. In the interest of minimizing the code changes, I went about making these changes one at a time, and the combination of what I have in the patch appears to reduce the cleanup time the most. Even with these hacks, I am not particularly happy about the maximum time we have seen the cleanup take, and for this duration the physical host is unusable for launching new guests. So, as I noted in my earlier email, we may still want to not tie up the xenbus thread for the duration of this cleanup. Regards, K. Y> > -- Keir > > On 06/12/2009 17:18, "Ky Srinivasan" <ksrinivasan@novell.com> wrote: > >> We have seen cases where, when a guest is shutdown during heavy network >> traffic, the backend vif hangs around for considerable amounts of time > holding >> up the xenbus thread for the duration of the cleanup. We have seen cases > (HP, >> Novell) where it has taken hours to cleanup the interface and for that >> duration we cannot launch new guests. We have traced this problem to the > fact >> that the front-end can shutdown with transmit resources still in transit on >> the back-end side. On the back-end side, the moment the carrier is turned off >> on the interface, the interface is not eligible for further scheduling in > the >> backend. Since turning off the carrier happens completely asynchronously > with >> the rest of the packet processing in the back-end (via tasklets today), we > can >> have situations where the vif cannot be cleaned up for a significant amount > of >> time. >> >> I am attaching a patch for addressing this issue. I have tried to fix the >> problem with minimal changes to the current code. The main idea is to > cleanup >> the guest transmit resources quickly and aggressively. With this patch, the >> cleanup time has been dramatically reduced. Even in this patch, the xenbus >> thread is held up for the duration of the cleanup - however we attempt to >> cleanup aggressively. This is in comparison to the current code where the >> xenbus thread just waits for the refcnt on the interface to drop to zero. > An >> obvious improvement would be to not tie up the xenbus thread for the > duration >> of cleaning up the interface. If there is interest, I can submit a patch for >> doing this. This patch has been tested on 2.6.16 (SLES10 sp2, sp3 kernels) > as >> well as 2.6.27 (sles11) kernel. This patch is however against the >> 2.6.18-xen.hg sources. >> >> Signed-off-by: K. Y. Srinivasan <ksrinivasan@novell.com> >> >> >> >> Regards, >> >> K. Y >>_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 06/12/2009 23:15, "Ky Srinivasan" <ksrinivasan@novell.com> wrote:>> This looks like a hack. I think you''d get most of the benefit by simply >> always calling net_tx_action_dealloc() at the top of net_tx_action() (i.e., >> remove the dealloc_cons!=dealloc_prod conditional). I think the presence of >> that conditional is actually a bug, and removing it should get interface >> shutdowns to a reasonable time with no other changes. > That was my first impression as well and was the first patch we tested. It > does not work. Take the case where we have only one netif that netback is > handling that gets into this weird state, then we would never call > net_tx_action_dealloc() if the carrier is turned off on the netif that is > being cleaned up. This patch, fixes that problem by forcefully scheduling the > netif whose carrier has been turned off.I think that this is simply another bug. There is an early ''return'' in net_tx_action() which should actually goto the end of the function which checks the pending list and conditionally calls mod_timer(). Otherwise the pending list can get partially flushed but the timer doesn''t get re-mod()ed.> In the interest of minimizing the > code changes, I went about making these changes one at a time, and the > combination of what I have in the patch appears to reduce the cleanup time the > most. Even with these hacks, I am not particularly happy about the maximum > time we have seen the cleanup take, and for this duration the physical host is > unusable for launching new guests. So, as I noted in my earlier email, we may > still want to not tie up the xenbus thread for the duration of this cleanup.Perhaps we could decouple the whole thing from xenbus thread. In any case, the above bug fixes are to me preferable to the original patch, and will get the orders-of-magnitude low-hanging fruit (delay xenbus thread for maybe a second, rather than hours!). -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>>> On 12/7/2009 at 3:04 AM, in message <C7426771.3BA0%keir.fraser@eu.citrix.com>,Keir Fraser <keir.fraser@eu.citrix.com> wrote:> On 06/12/2009 23:15, "Ky Srinivasan" <ksrinivasan@novell.com> wrote: > >>> This looks like a hack. I think you''d get most of the benefit by simply >>> always calling net_tx_action_dealloc() at the top of net_tx_action() (i.e., >>> remove the dealloc_cons!=dealloc_prod conditional). I think the presence of >>> that conditional is actually a bug, and removing it should get interface >>> shutdowns to a reasonable time with no other changes. >> That was my first impression as well and was the first patch we tested. It >> does not work. Take the case where we have only one netif that netback is >> handling that gets into this weird state, then we would never call >> net_tx_action_dealloc() if the carrier is turned off on the netif that is >> being cleaned up. This patch, fixes that problem by forcefully scheduling > the >> netif whose carrier has been turned off. > > I think that this is simply another bug. There is an early ''return'' in > net_tx_action() which should actually goto the end of the function which > checks the pending list and conditionally calls mod_timer(). Otherwise the > pending list can get partially flushed but the timer doesn''t get re-mod()ed. >I suspect you are referring to the conditional early return ( if (mop==tx_map_ops)) . Anna Fischer of HP has a test environment where we can see this problem most easily. I will give her a netback with these changes and report back on the cleanup times. Thanks, K. Y>> In the interest of minimizing the >> code changes, I went about making these changes one at a time, and the >> combination of what I have in the patch appears to reduce the cleanup time > the >> most. Even with these hacks, I am not particularly happy about the maximum >> time we have seen the cleanup take, and for this duration the physical host > is >> unusable for launching new guests. So, as I noted in my earlier email, we > may >> still want to not tie up the xenbus thread for the duration of this cleanup. > > Perhaps we could decouple the whole thing from xenbus thread. In any case, > the above bug fixes are to me preferable to the original patch, and will get > the orders-of-magnitude low-hanging fruit (delay xenbus thread for maybe a > second, rather than hours!). > > -- Keir_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel