Ky Srinivasan
2006-Jun-05 03:18 UTC
[xen-devel][PATCH] concurrency control issues in netback
I have been looking at the netback code (netback.c) and I think there may be a bug in the way concurrency is managed here. Specifically, the function that cleans up completed requests (net_tx_action_dealloc()) is executed only from within a tasklet and hence does not acquire any locks to manipulate global state that could potentially be modified from other cpus. The code seems to assume writes are ordered without explicitly forcing the ordering. The problematic global state is the index that keeps track of the last valid index to be dealloced (dealloc_prod). This could potentially be manipulated outside of the tasklet. The dealloc_ring[] is populated and the dealloc_prod variable is advanced in the function netif_idx_release() under the protection of a spin lock. However, if the two memory writes (assigning at the current dealloc_prod index and advancing the dealloc_prod index) in netif_idx_release() are seen out of order on the cpu executing net_tx_action_dealloc(), we can get into a situation where we could attempt to operate on an index that has invalid data. The fix would be fairly simple. We could snapshot the current value of dealloc_prod in net_tx_action_dealloc() under the protection of the same spin lock that is used to advance the dealloc_prod index in netif_idx_release. This would give a consistent snapshot of the dealloc_prod index and deal with the problem described above. The attached patch does just that. Signed- off- by K. Y. Srinivasan <ksrinivasan@novell.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2006-Jun-05 13:48 UTC
Re: [xen-devel][PATCH] concurrency control issues in netback
On 5 Jun 2006, at 04:18, Ky Srinivasan wrote:> I have been looking at the netback code (netback.c) and I think there > may be a bug in the way concurrency is managed here. Specifically, the > function that cleans up completed requests (net_tx_action_dealloc()) is > executed only from within a tasklet and hence does not acquire any > locks > to manipulate global state that could potentially be modified from > other > cpus. The code seems to assume writes are ordered without explicitly > forcing the ordering. The problematic global state is the index that > keeps track of the last valid index to be dealloced (dealloc_prod). > This > could potentially be manipulated outside of the tasklet.That''s a good catch! Does your patch fix the network problems that you guys have been seeing? The patch needs cleaning up: at least to rename the lock, but actually just explicitly forcing the ordering we need is probably better. I''ll sort that out and check in a suitable fix. Thanks! Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ky Srinivasan
2006-Jun-05 14:13 UTC
Re: [xen-devel][PATCH] concurrency control issues in netback
>>> On Mon, Jun 5, 2006 at 9:48 AM, in message<daacab32d9e77bbbae6fa0c9ef088a91@cl.cam.ac.uk>, Keir Fraser <Keir.Fraser@cl.cam.ac.uk> wrote:> On 5 Jun 2006, at 04:18, Ky Srinivasan wrote: > >> I have been looking at the netback code (netback.c) and I thinkthere>> may be a bug in the way concurrency is managed here. Specifically,the>> function that cleans up completed requests (net_tx_action_dealloc())is>> executed only from within a tasklet and hence does not acquire any >> locks >> to manipulate global state that could potentially be modified from >> other >> cpus. The code seems to assume writes are ordered withoutexplicitly>> forcing the ordering. The problematic global state is the indexthat>> keeps track of the last valid index to be dealloced (dealloc_prod).>> This >> could potentially be manipulated outside of the tasklet. > > That''s a good catch! Does your patch fix the network problems thatyou> guys have been seeing?I am hoping it will fix the problem! The machines needed to reproduce the problem are in our Provo labs. We will get the tests going this morning. We will keep you posted.> > The patch needs cleaning up: at least to rename the lock, butactually> just explicitly forcing the ordering we need is probably better. I''llAgreed.> sort that out and check in a suitable fix. > > Thanks! > Keir >Regards, K. Y _______________________________________________> Xen- devel mailing list > Xen- devel@lists.xensource.com > http://lists.xensource.com/xen- devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel