I think I have stumbled upon a bug in the GPLPV drivers. If you try to re-enable a NIC adapter after disabling it, it just disappears out of the blue, and you have to reboot to get it back.

This is with the latest version of the GPLPV drivers on Windows 2008 R2.

Has anyone else noticed the same issue?

I wonder if this is related to hot-plugging, something like blogs.vmware.com/kb/2010/06/nic-is-missing-in-my-virtual-machine.html#.UZIop3G53Rg ?
On May 14, 2013, at 8:34 AM, Micky <mickylmartin@gmail.com> wrote:
> I guess I have stumbled upon a bug in gplpv drivers.
> If you try to re-enable an nic adapter after disabling it, it just
> hides out of blue. And you have to reboot.
>
> Latest version of gplpv drivers and Windows 2008 R2 here.
>
> Anyone else noticed the same issue?
>
> I wonder if this is related to hot-plugging something like
> blogs.vmware.com/kb/2010/06/nic-is-missing-in-my-virtual-machine.html#.UZIop3G53Rg
> ??

I had the same problem with Xen 4.2.1 (or 4.2, I'm not sure off the top of my head); dom0 is Debian 6.0.7 x64, with a Win7 x64 domU.

Oddly, and in addition to your issue, every time I launch DotA 2, the Xen Net device loses all network connectivity; the connectivity indicator in the system tray shows the yellow exclamation point.

I worked around it by giving it a physical NIC, but just wanted to chime in.

-Andrew
Perhaps James can shed some light on it. The GPLPV drivers are working fine for me other than the disappearing adapters. I am going to continue testing, just to make sure that things work as expected before I push the drivers into production.

On Tue, May 14, 2013 at 6:13 PM, Andrew Bobulsky <rulerof@gmail.com> wrote:
> I had the same problem, Xen 4.2.1 (or 4.2... Not sure of the top of my
> head), dom0 is Debian 6.0.7 x64, on a win7 x64 domU.
>
> Oddly, and in addition to your issue, every time I launch DotA 2, the
> Xen Net device would lose all network connectivity; it gives the
> yellow exclamation point on the connectivity indicator in the system
> tray.
>
> I worked around it by giving it a physical NIC, but just wanted to chime in.
>
> -Andrew
Further, I just noticed that devmgmt shows this error for the Xen Net Device Driver: "Device cannot start. (Code 10)"

On Tue, May 14, 2013 at 7:12 PM, Micky <mickylmartin@gmail.com> wrote:
> Perhaps James can shed some light on it. Gplpv drivers are working
> fine for me other than the disappearance of adapters. I am gonna
> continue some testing, just to make sure that things work as expected
> before I push the drivers into production.
On May 14, 2013, at 10:36 AM, Micky <mickylmartin@gmail.com> wrote:
> Further, I just noticed that devmgmt shows this error for Xen Net Device Driver
> "Device cannot start. (Code 10)"

Ah, yes! I recall the same.
> I guess I have stumbled upon a bug in gplpv drivers.
> If you try to re-enable an nic adapter after disabling it, it just
> hides out of blue. And you have to reboot.
>
> Latest version of gplpv drivers and Windows 2008 R2 here.
>
> Anyone else noticed the same issue?

With version 0.11.0.402 I cannot reproduce this problem. Disabling the network interface unloads the xennet driver, and enabling it reloads the driver and it all works fine.

If you had two network interfaces loaded, disabling wouldn't unload the driver so the code path would be different. I can't test that at the moment but if that is your situation I will set something up to reproduce.

If you are using 0.11.0.402 then please install the debug version of the drivers, and send me the output of /var/log/xen/qemu-dm-<domu name>.log after you have disabled and then enabled the network. If you are using something older then you can do the same and I can have a look, but I recommend you install the latest drivers from the testing directory as they have been a lot more robust for me. The only problem is that they are test-signed at the moment until I can sort out getting a certificate, so you would need to do bcdedit /set testsigning on to be able to install them on a 64 bit OS.

James
Just tried with the latest testing version, 0.11.0.402, and it still behaves the same, i.e. re-enabling the NIC crashes the driver. Or rather, with this latest testing version the "enabling" dialogue gets stuck and the only way out is to forcefully destroy the domain. As usual, I have two NIC adapters in the domU.

The requested debug log is here: pastebin.com/2A7eJbKM

On Wed, May 15, 2013 at 6:48 PM, Micky <mickylmartin@gmail.com> wrote:
> On Wed, May 15, 2013 at 4:49 AM, James Harper <james.harper@bendigoit.com.au> wrote:
>> With version 0.11.0.402 I cannot reproduce this problem. Disabling the
>> network interface unloads the xennet driver, and enabling it reloads the
>> driver and it all works fine.
>>
>> If you had two network interfaces loaded, disabling wouldn't unload the
>> driver so the code path would be different. I can't test that at the moment
>> but if that is your situation I will set something up to reproduce.
>
> Thank you for your response and time, James. I do appreciate the work
> you do to keep this thing up and running!
> I am using 0.11.0.372. And yes, I do have TWO network adapters added
> to the DomU.
> At the time I checked, I really didn't look into the "/testing/" drivers --
> I assumed that the stable version would be good enough. But I will
> test the testing version 0.11.0.402 and will report back if the same
> issue exists there.
>
> By the way, why would adding two interfaces not let the driver be
> unloaded when one interface is re-enabled? Just curious.
> Just tried with the latest testing version 0.11.0.402 and it still
> behaves the same. i.e Re-enabling the nic crashes the driver. Rather,
> the "enabling" dialogue gets stuck with this latest testing version
> and the only way out is by forcefully destroying the domain. As
> usually I have two nic adapters in domu.
>
> The required debug log is here:
> pastebin.com/2A7eJbKM

Thanks for that. Unfortunately there is nothing pointing to an actual error there, which I guess is expected if you don't get a crash.

I just added a second network adapter to a 2008R2 vm and it crashed before I could log in, so I guess there is another bug somewhere too!

>> By the way, why would adding two interfaces would not let driver to be
>> unloaded when one interface is re-enabled? Just curious.

Windows will unload a driver when nothing is using it. With two network adapters, if you disable one there is still a device left using it so it has to stay in memory.

James
> Just tried with the latest testing version 0.11.0.402 and it still
> behaves the same. i.e Re-enabling the nic crashes the driver. Rather,
> the "enabling" dialogue gets stuck with this latest testing version
> and the only way out is by forcefully destroying the domain. As
> usually I have two nic adapters in domu.

I fixed my previous problem (old driver installed), and cannot reproduce what you are seeing. "Enabling..." comes up, then "Enabled", then it works as normal.

Do you have any other software loaded that might be binding to the network stack? Firewall or antivirus software would be the obvious ones but some VPN software can trip things up.

I guess next I need to see what's in the xenstore. Get the id of the domain, then disable and try to re-enable the adapter, then run xenstore-ls /local/domain/<id>/device/vif (and let me know which instance is stuck). Also get the backend value and do a xenstore-ls on that. Send me the output.

James
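The two lookups described above (list the frontend vif nodes, then list each backend path they point to) can be sketched as a small helper. This is only an illustration in Python: the function names `frontend_vif_path` and `backend_commands` are made up for this example, the domain id 12 is hypothetical, and the sample listing is abbreviated. Only the `xenstore-ls` command and the `/local/domain/<id>/device/vif` path come from the thread.

```python
import re


def frontend_vif_path(domid):
    """Xenstore path that holds the frontend vif nodes of a domain."""
    return f"/local/domain/{domid}/device/vif"


def backend_commands(frontend_listing):
    """Build one `xenstore-ls <backend path>` argv per backend in the listing.

    `frontend_listing` is the text printed by
    `xenstore-ls /local/domain/<id>/device/vif`.
    """
    # Match only the `backend = "..."` key, not `backend-id = "..."`.
    paths = re.findall(r'^\s*backend = "([^"]+)"', frontend_listing,
                       flags=re.MULTILINE)
    return [["xenstore-ls", path] for path in paths]


# Abbreviated, hypothetical listing for a domain with id 12.
SAMPLE_LISTING = '''0 = ""
 backend = "/local/domain/0/backend/vif/12/0"
 backend-id = "0"
 state = "1"
1 = ""
 backend = "/local/domain/0/backend/vif/12/1"
 backend-id = "0"
 state = "4"
'''

print(["xenstore-ls", frontend_vif_path(12)])
for argv in backend_commands(SAMPLE_LISTING):
    print(argv)
```

On a real dom0 each argv list could be passed to `subprocess.run(argv, capture_output=True, text=True)`; the sketch deliberately stops at building the commands so it can be tried without a Xen host.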
> I fixed my previous problem (old driver installed), and cannot reproduce
> what you are seeing. "Enabling..." comes up, then "Enabled", then it works
> as normal.

Thanks for checking. That does sound good, but it makes my case look odd.

> Do you have any other software loaded that might be binding to the network
> stack? Firewall or antivirus software would be the obvious ones but some
> VPN software can trip things up.

It was a clean install. The only thing I have done differently is that I disabled gso and task offload on dom0 on all interfaces. It doesn't seem like that would cause any problems, as it's a pretty standard procedure. Or does it?

> I guess next I need to see what's in the xenstore. Get the id of the domain
> then disable and try and re-enable the adapter then xenstore-ls
> /local/domain/<id>/device/vif (and let me know which instance is stuck).
> Also get the backend value and do a xenstore-ls on that. Send me the output.

Below is what xenstore-ls looks like:

# xenstore-ls /local/domain/158/device/vif
0 = ""
 backend = "/local/domain/0/backend/vif/158/0"
 backend-id = "0"
 state = "1"
 handle = "0"
 mac = "00:16:3e:9e:55:03"
 tx-ring-ref = "16358"
 rx-ring-ref = "16366"
 event-channel = "9"
 request-rx-copy = "1"
 feature-rx-notify = "1"
 feature-no-csum-offload = "0"
 feature-sg = "1"
 feature-gso-tcpv4 = "1"
1 = ""
 backend = "/local/domain/0/backend/vif/158/1"
 backend-id = "0"
 state = "4"
 handle = "1"
 mac = "00:16:3e:7d:30:be"
 tx-ring-ref = "16351"
 rx-ring-ref = "16142"
 event-channel = "10"
 request-rx-copy = "1"
 feature-rx-notify = "1"
 feature-no-csum-offload = "0"
 feature-sg = "1"
 feature-gso-tcpv4 = "1"
> It was a clean install. The only thing that I have different is that I
> disabled gso and task offload on dom0 on all interfaces. Doesn't seem
> like if it would cause any problems as that's the pretty standard
> process. Or does it?

Shouldn't matter but it's easy enough for me to test.

> Below is what xenstore-ls looks like:

Ok, so vif0 has state = 1, which means it's not running, but everything else seems okay. Can you do a xenstore-ls on the backend value, e.g. xenstore-ls /local/domain/0/backend/vif/158/0

James
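For readers following along, the numeric `state` values in the listing correspond to the XenbusState values defined in Xen's public `xenbus.h` header (1 is Initialising, 4 is Connected). The following is a minimal Python sketch that pulls the per-vif state out of `xenstore-ls` output; the function name `vif_states` and the abbreviated sample are assumptions made for illustration, not part of any Xen tool.

```python
# XenbusState values as defined in Xen's public io/xenbus.h header.
XENBUS_STATES = {
    0: "Unknown",
    1: "Initialising",
    2: "InitWait",
    3: "Initialised",
    4: "Connected",
    5: "Closing",
    6: "Closed",
    7: "Reconfiguring",
    8: "Reconfigured",
}


def vif_states(xenstore_ls_output):
    """Return {vif index: state name} from `xenstore-ls .../device/vif` output."""
    states = {}
    current = None
    for raw in xenstore_ls_output.splitlines():
        line = raw.strip()
        if " = " not in line:
            continue
        key, _, value = line.partition(" = ")
        value = value.strip('"')
        if key.isdigit() and value == "":
            # A line such as `0 = ""` opens the subtree of one vif.
            current = int(key)
        elif key == "state" and current is not None:
            states[current] = XENBUS_STATES.get(int(value), "?")
    return states


# Abbreviated from the listing posted in the thread.
SAMPLE = '''0 = ""
 backend = "/local/domain/0/backend/vif/158/0"
 backend-id = "0"
 state = "1"
 handle = "0"
1 = ""
 backend = "/local/domain/0/backend/vif/158/1"
 backend-id = "0"
 state = "4"
 handle = "1"
'''

print(vif_states(SAMPLE))
```

For the sample above this reports vif 0 as Initialising and vif 1 as Connected, which matches the reading that vif0 never got going again after being re-enabled.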
> Shouldn't matter but it's easy enough for me to test.

Ah, thanks!

> Ok so vif0 has state = 1 so it's not running, but everything else seems
> okay. Can you do a xenstore-ls on the backend value, eg xenstore-ls
> /local/domain/0/backend/vif/158/0

Yeah, that was captured when the adapter inside the domU was disabled and re-enabling crashed the driver. After a reboot, things come back and it shows a state of 4.

# xenstore-ls /local/domain/0/backend/vif/159/0
frontend = "/local/domain/159/device/vif/0"
frontend-id = "159"
online = "1"
state = "4"
script = "/etc/xen/scripts/vif-bridge"
mac = "00:16:3e:9e:55:03"
bridge = "br0"
handle = "0"
type = "vif_ioemu"
feature-sg = "1"
feature-gso-tcpv4 = "1"
feature-rx-copy = "1"
feature-rx-flip = "0"
hotplug-status = "connected"
> Yea that's when the adapter inside domu was disabled and re-enabling
> crashed the driver. After a reboot, things come back and it shows a
> state of 4.

I really need to see the frontend and backend xenstore when the driver is in a failed/hung state.

Currently there is no timeout implemented; xennet just waits forever for the backend to progress to the next state. I can put a timeout in there which will give you a more sensible error (e.g. Windows will complain that the device couldn't start or something) instead of hanging, but that won't solve the underlying problem.

You only included the xennet stuff in your logs so I can't see if xenpci is doing the right thing. I can't imagine it will tell me anything different though.

Can you also send me the end of the kernel logs, as there should be some messages logged as the backend changes state.

James
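The difference between waiting forever and waiting with a timeout can be illustrated with a small model. This is a hedged Python sketch, not the xennet code: `wait_for_state`, its parameters, and the polling approach are all assumptions chosen to keep the example short, and a real driver would more likely react to xenstore watch events than poll.

```python
import time


def wait_for_state(read_state, wanted, timeout_s=5.0, poll_s=0.1):
    """Poll read_state() until it returns `wanted`.

    Returns True if the state was reached, False if `timeout_s` elapsed
    first. Without the deadline check this loop would spin forever when
    the backend never progresses, which is the hang described above.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if read_state() == wanted:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


# Hypothetical backend that is stuck in state 1 and never reaches 4.
def stuck_backend():
    return 1


# Hypothetical backend that moves through 1, 2 and then stays at 4.
_progress = iter([1, 2])


def healthy_backend():
    return next(_progress, 4)


print(wait_for_state(stuck_backend, 4, timeout_s=0.3, poll_s=0.05))
print(wait_for_state(healthy_backend, 4, timeout_s=1.0, poll_s=0.01))
```

With a deadline in place, the stuck case returns a failure the caller can report (for example as a device start error) instead of blocking indefinitely, but as noted above it does nothing to explain why the backend did not move on.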
> I really need to see the frontend and backend xenstore when the driver is
> in a failed/hung state.

I should have mentioned the dom0 kernel version at the beginning. Must have forgotten. It is 3.8.8-1.el6xen.x86_64, not self-compiled but taken from CRC's repo.

> Currently there is no timeout implemented, xennet just waits forever for
> the backend to progress to the next state. I can put a timeout in there
> which will give you a more sensible error (eg windows will complain that
> device couldn't start or something) instead of hanging, but won't solve
> the underlying problem.

Yeah, right now the problem is not irritating, but it makes me curious since some people encounter it. Certainly not the end of the world, but it would help if the driver didn't crash.

> You only included the xennet stuff in your logs so I can't see if xenpci is
> doing the right thing, I can't imagine it will tell me anything different
> though.
> Can you also send me the end of the kernel logs as there should be some
> messages logged as the backend changes state.

Sure. A snippet of the logs is below. But I am afraid the kernel tells us little beyond a few telltale signs that a vif was disabled.

br0: port 3(vif159.0) entered disabled state
device vif159.0 left promiscuous mode
br0: port 3(vif159.0) entered disabled state
frontend_changed: backend/vif/159/0: prepare for reconnect

Thanks.
> I should have mentioned the dom0 kernel version at the beginning. Must
> have forgotten.
> It is 3.8.8-1.el6xen.x86_64 not self compiled but taken from CRC's repo.

My test machine is Debian 3.8.5, which should be close enough, although it's possible there is a patch that changes the state transition in a subtle way.

> Sure. Snippet of the logs is below. But I am afraid if kernel tells
> anything but few telltale signs of what's happening other than telling
> a vif is disabled.
>
> br0: port 3(vif159.0) entered disabled state
> device vif159.0 left promiscuous mode
> br0: port 3(vif159.0) entered disabled state
> frontend_changed: backend/vif/159/0: prepare for reconnect

I'd expect an error would be logged if the reconnect failed, but I can't be sure.

Can you try disabling both adapters so the driver unloads, then enable them both again (even if it gets stuck when the first one loads)?

James
> My test machine is Debian 3.8.5 which should be close enough although it's
> possible there is a patch that changes the state transition in a subtle way.

That is quite interesting. I do think this could be the case, since we have tried everything else. Someday I may be able to try this on a Debian dom0.

> Can you try disabling both adapters so the driver unloads then enable them
> both again (even if it gets stuck when the first one loads)?

I guess that was the first obvious thing I did when an adapter disappeared while re-enabling, as funny as it sounds, LOL. But I did just try again; both adapters disappear and the driver crashes with the same error. A reboot brings them back.
> I guess that was the first apparent thing that I did when an adapter
> disappeared while re-enabling, yea as funny as it sounds, LOL. But I
> did just try again; both adapters disappear and driver crashes with
> same error. Reboot brings them back.

I just uploaded a version 404 to testing which has some timeouts implemented (and a PAE/x64 fix for vbd). That won't fix the problem but might tell me more about the error if you can send me the debug log.

When you say crash, is that a BSoD? I can't remember if I've asked you that before.

James
> I just uploaded a version 404 to testing which has some timeouts
> implemented (and a PAE/x64 fix for vbd). That won't fix the problem but
> might tell me more about the error if you can send me the debug log.

Thanks. I'll take a peek soon.

> When you say crash is that a BSoD? I can't remember if I've asked you that
> before.

Not a BSoD, but a driver crash with "device cannot start (error 10)" in Device Manager.