Hi Wei Liu,

I am doing some network performance testing on Xen 4.1.2 with kernel 3.0, and occasionally hit a crash at BUG_ON(netbk->mmap_pages[idx] != page) in netbk_gop_frag(). From analyzing the drivers/xen/netback module, I think the cause is as follows when sending packets from VM1 to VM2:

1) Two netback threads (the first for VM1's transmit, the second for VM2's receive) run concurrently.

2) The first netback thread does a delayed copy from a foreign granted page to local memory when outstanding packets have been pending too long (above half of one HZ). Then netbk->mmap_pages[idx] is replaced with a newly allocated page.

3) If the packets are forwarded to VM2 by the virtual switch, netbk_gop_frag() is called in the second netback thread. That function checks whether each page in skb frags[] is foreign in order to decide how to do the grant copy.

4) If the page replacement happens after the foreign-page check in netbk_gop_frag(), the BUG is triggered because the page from skb frags[] differs from mmap_pages[idx].

I tried using a spin_lock to protect the page access, but found no appropriate solution. How can this problem be fixed? Would you share some opinions?

In addition, I have tried turning off copy_skb. Then the vif netdevice may not be released after shutting down the VM, because outstanding packets hold a reference on the device for too long, for some as yet unknown reason. It may be that the NIC does not release packets after DMA. Has anyone met such problems? Thanks.

Best regards,
Jerry
>>> On 16.10.13 at 06:13, jerry <jerry.lilijun@huawei.com> wrote:
> Hi Wei Liu,
>
> I am doing some network performance on Xen4.1.2 and kernel 3.0, and get a
> crash with BUG_ON(netbk->mmap_pages[idx] != page) in netbk_gop_frag()
> accidentally.
>
> By analyzing the module drivers/xen/netback,

You aren't looking at the upstream driver, are you? If so, Wei is
very likely the wrong addressee.

Assuming that you instead talk of the SLE11 kernel, I can only
point out that a problem in that code was found and fixed a
couple of months ago (resulting in the BUG_ON() you quoted not
being there anymore), so you're simply not looking at up-to-date
code.

Jan

> I think the reason is as follows when sending packets from VM1 to VM2:
[...]
> Best regards,
> Jerry
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
Hi Jan,

Thanks for your reply. Yes, I am using the SLE11 kernel 3.0.58, which is not up-to-date as you assumed. I found a related patch named xen-netback-generalize, committed on Aug 7 and applied to the SLE11 kernel 3.0.98; the BUG_ON(netbk->mmap_pages[idx] != page) has been removed by that patch.

But there may still be concurrency problems in my test. If the page replacement in copy_pending_req() happens after netif_get_page_ext() in netbk_gop_frag(), copy_gop->flags is wrongly marked with GNTCOPY_source_gref. At that point the page in the skb has been replaced with Dom0-local memory, so the later HYPERVISOR_multicall() with GNTTABOP_copy in netbk_rx_actions() gets errors. The message shown is:

(XEN) grant_table.c:305:d0 Bad flags (0) or dom (0). (expected dom 0)

Would you like to share some opinions?

Regards,
Jerry

On 2013/10/16 19:10, Jan Beulich wrote:
> You aren't looking at the upstream driver, are you? If so, Wei is
> very likely the wrong addressee.
>
> Assuming that you instead talk of the SLE11 kernel, I can only
> point out that a problem in that code was found and fixed a
> couple of months ago (resulting in the BUG_ON() you quoted not
> being there anymore), so you're simply not looking at up-to-date
> code.
>
> Jan
[...]
>>> On 17.10.13 at 09:41, jerry <jerry.lilijun@huawei.com> wrote:
> But there may still be concurrency problems in my test.
> If the page replacement in copy_pending_req() happens after
> netif_get_page_ext() in netbk_gop_frag(), copy_gop->flags is wrongly marked
> with GNTCOPY_source_gref.
[...]
> (XEN) grant_table.c:305:d0 Bad flags (0) or dom (0). (expected dom 0)
>
> Would you like to share some opinions?

At a first glance that seems possible, but the question is - does it
cause any problems other than the quoted message to be issued
(and the problematic packet getting re-transmitted)? I'm asking
mainly because fixing this would appear to imply adding locking to
these paths - with the risk of adversely affecting performance.

Jan
Hi Jan,

In my test, the grant table copy error may cause the VM to crash. The stack is as follows:

kernel BUG at /linux/driver/redhat6.2/xen-vnif/xen-netfront.c:372!
Pid: 2658, comm: iperf Not tainted 2.6.32-220.el6.x86_64 #1 Xen HVM domU
RIP: 0010:[<ffffffffa01166ca>] [<ffffffffa01166ca>] xennet_tx_buf_gc+0x18a/0x1f0 [xen_netfront]
RSP: 0018:ffff880004403df8 EFLAGS: 00010096
RAX: 0000000000000049 RBX: ffff8800821986e0 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000046
RBP: ffff880004403e48 R08: ffffffff81c00690 R09: 0000000000000080
R10: 0000000000013030 R11: 0000000000000000 R12: 000000000000003b
R13: 000000000000023d R14: 0000000000000011 R15: 0000000000000011
FS: 00007fd8fd97e700(0000) GS:ffff880004400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000030270aab70 CR3: 0000000080cf4000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process iperf (pid: 2658, threadinfo ffff8800813ba000, task ffff880080d0eb00)
Stack:
 ffff880082198020 ffff880082198f90 ffff88007f8d00c0 0000003f04415fc0
<0> ffff880004403e28 ffff880082198768 ffff880082198020 ffff8800821986e0
<0> 0000000000000282 0000000000000100 ffff880004403e78 ffffffffa0117d4c
Call Trace:
 <IRQ>
 [<ffffffffa0117d4c>] xennet_interrupt+0x4c/0xb0 [xen_netfront]
 [<ffffffff810d94f0>] handle_IRQ_event+0x60/0x170
 [<ffffffff8109b8a3>] ? ktime_get+0x63/0xe0
 [<ffffffff810dbc2e>] handle_edge_irq+0xde/0x180
 [<ffffffff812fe809>] __xen_evtchn_do_upcall+0x1b9/0x1f0
 [<ffffffff812fedbf>] xen_evtchn_do_upcall+0x2f/0x50
 [<ffffffff8100c373>] xen_hvm_callback_vector+0x13/0x20

The BUG code in xen-netfront.c xennet_tx_buf_gc() is:

	if (unlikely(gnttab_query_foreign_access(
		np->grant_tx_ref[id]) != 0)) {
		printk(KERN_ALERT "xennet_tx_buf_gc: warning "
		       "-- grant still in use by backend "
		       "domain.\n");
		BUG();
	}

My guess at the reason is as follows:
1) Xen: _set_status(), called in the hypercall path __gnttab_copy() -> __acquire_grant_for_copy(), fails, and the grant ref is not ended, so the GTF_reading bit cannot be cleared.
2) Netfront: this module invokes BUG() when it sees that the GTF_reading bit is still set.

Regards,
Jerry

On 2013/10/17 16:00, Jan Beulich wrote:
> At a first glance that seems possible, but the question is - does it
> cause any problems other than the quoted message to be issued
> (and the problematic packet getting re-transmitted)? I'm asking
> mainly because fixing this would appear to imply adding locking to
> these paths - with the risk of adversely affecting performance.
>
> Jan
[...]
>>> On 17.10.13 at 12:26, jerry <jerry.lilijun@huawei.com> wrote:
> Hi Jan,

please don't top post.

> In my test, the grant table copy error may cause the VM to crash.
> The stack is as follows:
> kernel BUG at /linux/driver/redhat6.2/xen-vnif/xen-netfront.c:372!
> ...
> The BUG code in xen-netfront.c xennet_tx_buf_gc() is:
> 	if (unlikely(gnttab_query_foreign_access(
> 		np->grant_tx_ref[id]) != 0)) {
> 		printk(KERN_ALERT "xennet_tx_buf_gc: warning "
> 		       "-- grant still in use by backend "
> 		       "domain.\n");
> 		BUG();
>
> My guess at the reason is as follows:
> 1) Xen: _set_status(), called in the hypercall path __gnttab_copy() ->
> __acquire_grant_for_copy(), fails, and the grant ref is not ended,
> so the GTF_reading bit cannot be cleared.
> 2) Netfront: this module invokes BUG() when it sees that the GTF_reading bit is
> still set.

If that was the case, this would be a hypervisor bug: a grant copy
operation is supposed to hold the grant active only for as long as
the copy operation takes. You'll in particular notice that
__acquire_grant_for_copy() in its error path clears GTF_reading
(and GTF_writing, as appropriate) again. You'd likely need to
instrument the code to demonstrate (via a couple of extra log
messages) what you think is not working properly here.

Jan

[...]
On 2013/10/17 20:11, Jan Beulich wrote:
> If that was the case, this would be a hypervisor bug: a grant copy
> operation is supposed to hold the grant active only for as long as
> the copy operation takes. You'll in particular notice that
> __acquire_grant_for_copy() in its error path clears GTF_reading
> (and GTF_writing, as appropriate) again. You'd likely need to
> instrument the code to demonstrate (via a couple of extra log
> messages) what you think is not working properly here.

I have verified that GTF_reading and GTF_writing are indeed cleared after __gnttab_copy(). So the question is where GTF_reading gets set. Could the hypervisor be doing a grant copy operation while the VM's netfront is calling xennet_tx_buf_gc()? Any ideas?

> Jan
[...]
>>> On 22.10.13 at 03:18, jerry <jerry.lilijun@huawei.com> wrote:
> I have verified that GTF_reading and GTF_writing are indeed cleared after
> __gnttab_copy().
> So the question is where GTF_reading gets set.
> Could the hypervisor be doing a grant copy operation while the VM's netfront
> is calling xennet_tx_buf_gc()?

Surely not - grant copy operations would only ever be invoked by
netback.

Jan
On 2013/10/16 19:10, Jan Beulich wrote:
>>>> On 16.10.13 at 06:13, jerry <jerry.lilijun@huawei.com> wrote:
>> In addition, I have tried to turn off copy_skb. Then the vif netdevice may
>> not be released after shutting down VM,
>> that's because outstanding packets hold the reference count of the device
>> too long for some unknown reason.
[...]

The reason why the vif net-device isn't released after shutting down the VM has been found, with copy_skb disabled. Suppose VM1 (vif1.0) sends packets to VM2 (vif2.0) through the virtual switch:

1) VM2's OS is Windows 2003, and it had previously been shut down for some unexpected reason. After being recreated, VM2 stopped during boot at the prompt window named "Shutdown Event Tracker", waiting for the user to enter a message explaining why the computer shut down unexpectedly.

2) VM2 already had vif2.0 created. I then added a new vif net-device using virsh commands. The new vif2.1 was not completely created (it has no interrupts), but its state is running and its TX queue is started by default. The function connect() in xenbus.c has not been called for vif2.1. The related information in xenstore is as follows:

linux-szRoyS:/ # xenstore-ls -f | grep 2 | grep state
/local/domain/0/device-model/2/state = "running"
/local/domain/0/backend/vbd/2/51712/state = "4"
/local/domain/0/backend/vbd/2/51760/state = "4"
/local/domain/0/backend/vif/2/0/state = "4"
/local/domain/0/backend/vif/2/1/state = "2"
/local/domain/0/backend/console/2/0/state = "1"
/local/domain/2/control/uvp/vm_state = "running"
/local/domain/2/device/vbd/51712/state = "4"
/local/domain/2/device/vbd/51760/state = "4"
/local/domain/2/device/vif/0/state = "4"
/local/domain/2/device/vif/1/state = "1"

3) The KOBJ_ONLINE event was generated in backend_create_netif(), called from netback_probe(). This event runs the network script named "vif-bridge", which adds vif2.1 to the virtual switch. Packets from vif1.0 (VM1) are then forwarded or flooded to vif2.1 by the virtual switch. vif2.1 drops these packets because it is not netif_schedulable() in netif_be_start_xmit().

4) After setting vif2.1 down and then up again, the TX queue cannot be started in net_open() because the carrier is off. So its qdisc became a fifo qdisc while the TX queue state stayed stopped. In this case, packets are held in the qdisc queue and cannot be dequeued in dequeue_skb() because of vif2.1's stopped TX queue.

5) If VM1 is destroyed, the packets from vif1.0 cannot be released, so vif1.0 cannot be disconnected. vif1.0 remains unreleased until vif2.1 is set down.

The root cause is that vif2.1 was not created successfully and got into a strange state: running, but with its TX queue stopped. The function backend_create_netif() is called in two places, netback_probe() and frontend_changed(). I think we can remove the backend_create_netif() call in netback_probe(), so that we can make sure the vif net-device is created completely only after the front-end changes to XenbusStateConnected.

The patch is as follows:

--- drivers/xen/netback/xenbus.c.old	2013-10-26 16:23:07.000000000 +0800
+++ drivers/xen/netback/xenbus.c	2013-10-26 16:23:31.000000000 +0800
@@ -156,9 +156,6 @@
 	if (err)
 		goto fail;
 
-	/* This kicks hotplug scripts, so do it immediately. */
-	backend_create_netif(be);
-
 	return 0;
 
 abort_transaction:

Do you have some ideas?
>>> On 26.10.13 at 10:32, jerry <jerry.lilijun@huawei.com> wrote:
[...]
> The root cause is that vif2.1 was not created successfully and got
> into a strange state: running, but with its TX queue stopped.
> The function backend_create_netif() is called in two places,
> netback_probe() and frontend_changed(). I think we can remove the
> backend_create_netif() call in netback_probe().
>
> The patch is as follows:
> --- drivers/xen/netback/xenbus.c.old	2013-10-26 16:23:07.000000000 +0800
> +++ drivers/xen/netback/xenbus.c	2013-10-26 16:23:31.000000000 +0800
> @@ -156,9 +156,6 @@
>  	if (err)
>  		goto fail;
>  
> -	/* This kicks hotplug scripts, so do it immediately. */
> -	backend_create_netif(be);
> -
>  	return 0;
>  
>  abort_transaction:
>
> Do you have some ideas?

No, not really. Would be helpful if this could be matched up to
behavior (and eventual changes thereto) of the upstream driver.

Jan
On Sat, Oct 26, 2013 at 04:32:08PM +0800, jerry wrote:
[...]
> The patch is as follows:
> --- drivers/xen/netback/xenbus.c.old	2013-10-26 16:23:07.000000000 +0800
> +++ drivers/xen/netback/xenbus.c	2013-10-26 16:23:31.000000000 +0800
> @@ -156,9 +156,6 @@
>  	if (err)
>  		goto fail;
>  
> -	/* This kicks hotplug scripts, so do it immediately. */
> -	backend_create_netif(be);
> -
>  	return 0;
>  
>  abort_transaction:
>
> Do you have some ideas?

My gut feeling is that this sort of change is regression-prone, but we
have to live with that.

In any case, is upstream changeset ea732dff5c (xen-netback: Handle
backend state transitions in a more robust way) useful to you?

Wei.
On 2013/10/28 15:43, Jan Beulich wrote:
>>>> On 26.10.13 at 10:32, jerry <jerry.lilijun@huawei.com> wrote:
[...]
>> Do you have some ideas?
>
> No, not really. Would be helpful if this could be matched up to
> behavior (and eventual changes thereto) of the upstream driver.

Hi Wei and Jan,

Thanks for your reply. My VMs are running with the SUSE11 SP2 netback drivers, so the upstream xen-netback driver has not been tested in this situation. The earlier patch may introduce some problems when migrating VMs, so I have a new solution to fix my problem. The patch is as follows:

--- drivers/xen/netback/interface.c.old	2013-10-29 11:46:36.000000000 +0800
+++ drivers/xen/netback/interface.c	2013-10-29 11:46:47.000000000 +0800
@@ -111,8 +111,8 @@
 	netif_t *netif = netdev_priv(dev);
 	if (netback_carrier_ok(netif)) {
 		__netif_up(netif);
-		netif_start_queue(dev);
 	}
+	netif_start_queue(dev);
 	return 0;
 }

With this modification, when a vif is not yet connected to its front-end, the qdisc still dequeues skbs to the vif, where they are then dropped. In other words, the qdisc queue should not cache skbs while the vif has not been completely created. Any ideas?

Jerry
On Mon, 2013-10-28 at 11:43 +0000, Wei Liu wrote:
> On Sat, Oct 26, 2013 at 04:32:08PM +0800, jerry wrote:
> [...]
> > The patch is as follows:
[...]
> > Do you have some ideas?
>
> My gut feeling is that this sort of change is regression-prone, but we
> have to live with that.

This thread/fix doesn't apply to upstream netback, which doesn't have
copy_skb mode, right?

> In any case, is upstream changeset ea732dff5c (xen-netback: Handle
> backend state transitions in a more robust way) useful to you?
>
> Wei.
On Thu, Oct 31, 2013 at 03:17:11PM +0000, Ian Campbell wrote:
[...]
> This thread/fix doesn't apply to upstream netback, which doesn't have
> copy_skb mode, right?

No, it's the SuSE kernel.

> In any case, is upstream changeset ea732dff5c (xen-netback: Handle
> backend state transitions in a more robust way) useful to you?
On 2013/10/31 23:32, Wei Liu wrote:
> On Thu, Oct 31, 2013 at 03:17:11PM +0000, Ian Campbell wrote:
[...]
>> This thread/fix doesn't apply to upstream netback, which doesn't have
>> copy_skb mode, right?
>
> No, it's the SuSE kernel.

Yes, I am using the SuSE11 SP2 kernel. The two main points from my other emails can be summarized as follows:

1) With copy_skb mode enabled, some grant copy operations in the other (RX) netbk thread will fail. This error causes packet retransmits, and sometimes the VM crashes. I currently have no appropriate solution for that problem, so I have to turn copy_skb mode off.

2) With copy_skb disabled, a vif cannot be disconnected when its VM is destroyed while packets it sent have not yet been consumed. I found that those packets were cached in another, abnormal vif's qdisc queues. My solution is to keep the vif TX queue started whenever the vif is set up, so packets are dropped if the vif has been created but is not yet connected. The fix patch is shown as follows:

--- drivers/xen/netback/interface.c.old	2013-10-29 11:46:36.000000000 +0800
+++ drivers/xen/netback/interface.c	2013-10-29 11:46:47.000000000 +0800
@@ -111,8 +111,8 @@
 	netif_t *netif = netdev_priv(dev);
 	if (netback_carrier_ok(netif)) {
 		__netif_up(netif);
-		netif_start_queue(dev);
 	}
+	netif_start_queue(dev);
 	return 0;
 }

> In any case, is upstream changeset ea732dff5c (xen-netback: Handle
> backend state transitions in a more robust way) useful to you?