mukesh agrawal
2005-Jan-15 01:38 UTC
[Xen-devel] (repeatable) cross-domain networking failure
Summary:

I'm running into a situation where, after sending some UDP traffic between two xen domains (Domain 0 and Domain 1) the networking between the domains fails. This failure is 100% repeatable.

In more detail:

I have two xen domains. They run the kernels from the 2.0.3 release. (I've run into the same problem with 2.0.1 as well.) Domain 0 has 5 physical ethernet interfaces, and a virtual interface to Domain 1. Domain 1 has just the virtual interface to Domain 0.

D0 is configured with IP address 192.168.0.1, and D1 with 192.168.1.1. The netmask is set to 255.255.0.0.

When I bring up D1, I can ping D1 from D0, ssh into D1, etc.

I then start a UDP server in D0, and a traffic generator in D1. After the traffic generator sends its 128th packet, networking between the domains fails. The 128th packet is received successfully by the UDP server, but no later traffic arrives in D0. This includes UDP, TCP, ICMP, and ARP.

Looking at the interrupt counts in /proc/interrupts, I see that D0 no longer receives packets sent by D1. D1, however, does receive packets sent by D0. (To be clear, D0->D1 traffic is ICMP ping requests, unrelated to the UDP traffic. There is not UDP traffic sent from D0 to D1.)

(I suspect the stuff in this paragraph doesn't matter, but include it for completeness.) Eventually, D0's ARP cache entry for D1 expires. D0 ARPs for D1, and D1 replies. But D0 never receives these replies. And eventually, D1 stops replying to the ARPs entirely. (D1's sending behavior is observed via tcpdump running in the console connection to D1.)

Note that the networking failure only occurs if the UDP packets are delivered to a user-level process in D0. In particular, UDP traffic to D0's kernel NFS server does not induce the failure. Nor does traffic sent to D0 for which there is no user process to accept the packets. And neither does traffic which is forwarded on to other hosts via NAT. (I haven't tested the regular forwarding case.)

Also, for what it's worth, Domain 0's network connectivity on its other interfaces (which are connected to the world at large) is unaffected.

Looking through the mailing list archive, I saw a prior bug that seemed similar, but involved IP fragmentation. That is not the case here, as the UDP packets sent by D1 are small (<100 bytes).

Any suggestions for debugging this?

Thanks,
mukesh

-------------------------------------------------------
The SF.Net email is sponsored by: Beat the post-holiday blues
Get a FREE limited edition SourceForge.net t-shirt from ThinkGeek.
It's fun and FREE -- well, almost....http://www.thinkgeek.com/sfshirt
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xen-devel
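As a side check on the addressing described in the report, the following Python sketch (not part of the original message) uses the standard `ipaddress` module to confirm that 192.168.0.1 and 192.168.1.1 fall in the same network under the 255.255.0.0 netmask. If so, dom0 and domU would be expected to resolve each other directly via ARP rather than through a router, which is why the ARP behaviour matters here. The variable and function names are illustrative assumptions.

```python
import ipaddress

# Addresses and netmask as given in the report.
dom0 = ipaddress.ip_interface("192.168.0.1/255.255.0.0")
domu = ipaddress.ip_interface("192.168.1.1/255.255.0.0")


def same_network(a, b):
    """Return True if both interfaces sit in the same IP network."""
    return a.network == b.network


# Both interfaces are expected to share one /16 network.
print(dom0.network, same_network(dom0, domu))
```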
Keir Fraser
2005-Jan-15 17:04 UTC
Re: [Xen-devel] (repeatable) cross-domain networking failure
Maybe add some tracing to the backend driver -- it's possible the backend isn't sending responses for those packets back to domU, and so things seize up for a while. If no responses are being generated it is because the backend thinks the packets are still in flight, so there would be some bug-hunting to find out why that is.

-- Keir
mukesh agrawal
2005-Jan-15 17:26 UTC
Re: Fwd: [Xen-devel] (repeatable) cross-domain networking failure
> Maybe add some tracing to the backend driver -- it's possible the
> backend isn't sending responses for those packets back to domU, and so
> things seize up for a while. If no responses are being generated it is
> because the backend thinks the packets are still in flight, so there
> would be some bug-hunting to find out why that is.

I'm not at all familiar with the details of the networking implementation, so please bear with my questions. (Feel free to point me at existing documentation on the details that I may have overlooked.)

1. When you say "the backend", is there just one backend (running, perhaps, in dom0)? Or is there a backend in each domain?

2. When you talk about responses not being generated, are you referring to the ICMP and ARP traffic? (For the UDP traffic, there isn't expected to be any packet sent back from dom0 to domU.)

Thanks,
mukesh
Nivedita Singhvi
2005-Jan-15 21:14 UTC
Re: [Xen-devel] (repeatable) cross-domain networking failure
mukesh agrawal wrote:
> Summary:
>
> After sending some UDP traffic between two xen domains (Domain 0 and
> Domain 1) the networking between the domains fails. This failure is 100%
> repeatable.

I don't have boxes at the moment and can't reproduce till Monday, but can you show us the output of netstat -uan and netstat -s on both domains? Is there stuff in the receive or send queues?

And was all the udp traffic going to the same port? i.e. any successful udp traffic to another endpoint?

> I then start a UDP server in D0, and a traffic generator in D1. After
> the traffic generator sends its 128-th packet, networking between the
> domains fails. The 128th packet is received successfully by the UDP
> server, but no later traffic arrives in D0. This includes UDP, TCP,
> ICMP, and ARP.

What does ifconfig on dom0 show? Are there any error messages in /var/log/messages?

> Looking at the interrupt counts in /proc/interrupts, I see that D0 no
> longer receives packets sent by D1. D1, however, does receive packets
> sent by D0. (To be clear, D0->D1 traffic is ICMP ping requests,
> unrelated to the UDP traffic. There is not UDP traffic sent from D0 to D1.)

Is there any other successful traffic from D0 -> D1 (tcp?)

thanks,
Nivedita
mukesh agrawal
2005-Jan-16 20:49 UTC
Re: [Xen-devel] (repeatable) cross-domain networking failure
Nivedita Singhvi <niv@us.ibm.com> wrote:
> I don't have boxes at the moment and can't reproduce till
> Monday, but can you show us the output of netstat -uan and
> netstat -s on both domains? Is there stuff in the receive
> or send queues?

The detailed output of netstat follows. But there is neither anything in the send queue on domU, nor anything in the receive queue on dom0. (The UDP server in question is running on port 2000.)

On dom0:

    $ netstat -uan
    Active Internet connections (servers and established)
    Proto Recv-Q Send-Q Local Address           Foreign Address         State
    udp        0      0 0.0.0.0:1024            0.0.0.0:*
    udp        0      0 0.0.0.0:2049            0.0.0.0:*
    udp        0      0 0.0.0.0:514             0.0.0.0:*
    udp        0      0 0.0.0.0:1027            0.0.0.0:*
    udp        0      0 155.98.36.34:1028       155.98.32.70:8509       ESTABLISHED
    udp        0      0 0.0.0.0:775             0.0.0.0:*
    udp        0      0 0.0.0.0:653             0.0.0.0:*
    udp        0      0 192.168.0.1:2000        192.168.1.1:1024        ESTABLISHED
    udp        0      0 224.4.0.1:2917          0.0.0.0:*
    udp        0      0 224.4.0.1:2917          0.0.0.0:*
    udp        0      0 224.4.0.1:2917          0.0.0.0:*
    udp        0      0 0.0.0.0:111             0.0.0.0:*
    udp        0      0 0.0.0.0:759             0.0.0.0:*

On domU:

    # netstat -uan
    Active Internet connections (servers and established)
    Proto Recv-Q Send-Q Local Address           Foreign Address         State
    udp        0      0 192.168.1.1:1024        192.168.0.1:2000        ESTABLISHED

The netstat -s output is a bit long, so I've attached those, instead of including them inline.

> And was all the udp traffic going to the same port? i.e. any successful
> udp traffic to another endpoint?

All the traffic was going to port 2000. Trying to send UDP traffic from domU to a different port in dom0 (after the networking failure) does not succeed. (If you're asking if traffic could be sent to multiple ports while the networking is functional, I believe the answer is yes, but would double check.)

> What does ifconfig on dom0 show?
> Are there any error messages in /var/log/messages?

    $ ifconfig vif1.0
    vif1.0    Link encap:Ethernet  HWaddr AA:00:01:7B:92:C2
              inet addr:192.168.0.1  Bcast:192.168.0.255  Mask:255.255.0.0
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:134 errors:0 dropped:0 overruns:0 frame:0
              TX packets:16 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:5884 (5.7 Kb)  TX bytes:676 (676.0 b)

    $ sudo tail /var/log/messages
    Jan 16 19:34:09 node1 ntpd[993]: kernel time sync disabled 0041
    Jan 16 19:35:15 node1 ntpd[993]: kernel time sync enabled 0001
    Jan 16 19:39:29 node1 ntpd[993]: synchronized to 155.98.33.74, stratum=2
    Jan 16 19:49:07 node1 ntpd[993]: time correction of -18001 seconds exceeds sanity limit (1000); set clock manually to the correct UTC time.
    Jan 16 19:59:15 node1 sshd(pam_unix)[1457]: session opened for user mukesh by (uid=30245)
    Jan 16 19:59:18 node1 sshd(pam_unix)[1486]: session opened for user mukesh by (uid=30245)
    Jan 16 19:59:30 node1 sshd(pam_unix)[1517]: session opened for user mukesh by (uid=30245)
    Jan 16 20:09:29 node1 modprobe: modprobe: Can't open dependencies file /lib/modules/2.4.27-xen0/modules.dep (No such file or directory)
    Jan 16 20:09:44 node1 last message repeated 2 times
    Jan 16 20:16:02 node1 kernel: device vif1.0 entered promiscuous mode

>> Looking at the interrupt counts in /proc/interrupts, I see that D0 no
>> longer receives packets sent by D1. D1, however, does receive packets
>> sent by D0. (To be clear, D0->D1 traffic is ICMP ping requests,
>> unrelated to the UDP traffic. There is not UDP traffic sent from D0 to D1.)
>
> Is there any other successful traffic from D0 -> D1 (tcp?)

Any traffic is successful from D0->D1, even after the network stops working. This includes ICMP, UDP, and TCP. (Sorry if my comment about "There is not UDP traffic sent from D0 to D1" was confusing. What I meant was that I wasn't sending any UDP traffic from D0 to D1. Not that such traffic fails.)

This is subject to the limitation mentioned in my first message. Namely, that dom0's ARP cache entry for domU eventually times out. At that point, dom0 attempts to ARP for domU's MAC. domU sees this, and replies (as seen by tcpdump on domU). But dom0 never gets the ARP replies, so eventually D0->D1 traffic fails as well. (E.g. "telnet 192.168.1.1" returns "No route to host".)

Also, let me add some more detail to my original report:

1. The networking fails after the 128th UDP packet received in dom0, even if I restart domU. Specifically:

   - If I send one UDP packet from domU to dom0, shut down domU, and start a fresh domU, then I can only send 127 (rather than 128) UDP packets from the new domU before networking will fail.

   - If I shut down domU after the networking failure, and start a new domU, networking between the new domU and dom0 does not work.

2. The server run in dom0 is

       nc -l -u -p 2000

3. The traffic generator run in domU is

       i=0;
       while true; do
           ((++i));
           echo $i
           echo $i | nc -u -w 1 192.168.0.1 2000
       done &

thanks,
mukesh
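For readers without nc at hand, the traffic generator above might be approximated in Python as follows. This is a sketch, not the script used in the report: the function names `payload_for` and `send_counters` and the one-second pacing (standing in for nc's `-w 1`) are assumptions. It sends each counter value as a short text datagram, which is what `echo $i | nc -u` is assumed to do.

```python
import socket
import time


def payload_for(i):
    """Mimic `echo $i`: the decimal counter followed by a newline."""
    return f"{i}\n".encode("ascii")


def send_counters(host, port, count, delay=1.0):
    """Send `count` small UDP datagrams, pausing `delay` seconds between them.

    Hypothetical helper; the pacing only roughly imitates nc's `-w 1`.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for i in range(1, count + 1):
            sock.sendto(payload_for(i), (host, port))
            print(i)
            time.sleep(delay)
    finally:
        sock.close()

# Example call, using the address and port from the report
# (200 is an arbitrary count above 128):
#     send_counters("192.168.0.1", 2000, 200)
```

The call is left as a comment so that importing or pasting the block does not start sending traffic by accident.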
Keir Fraser
2005-Jan-16 21:09 UTC
Re: [Xen-devel] (repeatable) cross-domain networking failure
> Also, let me add some more detail to my original report:
>
> 1. The networking fails after the 128th UDP packet received in dom0, even
>    if I restart domU. Specifically:
>
>    - If I send one UDP packet from domU to dom0, shut down domU, and
>      start a fresh domU, then I can only send 127 (rather than
>      128) UDP packets from the new domU before networking will fail.
>
>    - If I shut down domU after the networking failure, and start a
>      new domU, networking between the new domU and dom0 does not
>      work.

This corroborates my initial guess that the backend driver (in DOM0) is sending the packets into the DOM0 networking layer, and never hearing back when the packet is freed. Normally this would trigger a response to be sent back to the domU and resources in the backend driver would get freed up. This isn't happening and you eventually hit a limit on the number of packets that the driver will simultaneously put in flight.

Either those UDP packets are queued up somewhere in the DOM0 network stack, or the destructor callback is not getting called for some reason or has got overwritten(!).

-- Keir
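The failure mode described here can be illustrated with a toy model. The Python sketch below is not the netback code: the class `ToyBackend`, its methods, and the limit of 128 (chosen only to match the packet count observed in this thread) are all assumptions. It models a backend that takes a slot for each packet handed to the dom0 stack and gives the slot back only if the packet's destructor fires.

```python
class ToyBackend:
    """Toy model of a backend with a fixed budget of in-flight packets."""

    def __init__(self, limit=128):
        self.limit = limit      # assumed cap on simultaneous in-flight packets
        self.in_flight = 0      # slots currently held

    def deliver(self, destructor_fires):
        """Try to pass one packet up the stack.

        Returns False when every slot is taken. If `destructor_fires` is
        True the slot is released again; otherwise it leaks permanently.
        """
        if self.in_flight >= self.limit:
            return False
        self.in_flight += 1
        if destructor_fires:
            self.in_flight -= 1
        return True


def packets_until_stall(backend, destructor_fires):
    """Count packets accepted before the backend refuses any more."""
    accepted = 0
    # Bounded loop so the healthy case (no leak) also terminates.
    for _ in range(10 * backend.limit):
        if not backend.deliver(destructor_fires):
            break
        accepted += 1
    return accepted
```

In this model a fresh backend whose destructors never fire accepts 128 packets and then stalls, and a backend that has already leaked one slot accepts only 127. Because the slots live in the dom0 side of the model, restarting the sender does not reset them, which is consistent with the 127-packet observation quoted above.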
mukesh agrawal
2005-Jan-16 21:56 UTC
Re: [Xen-devel] (repeatable) cross-domain networking failure
On Sun, 16 Jan 2005, Keir Fraser wrote:
> This corroborates my initial guess that the backend driver (in DOM0) is
> sending the packets into the DOM0 networking layer, and never hearing
> back when the packet is freed. Normally this would trigger a response
> to be sent back to the domU and resources in the backend driver would
> get freed up. This isn't happening and you eventually hit a limit on
> the number of packets that the driver will simultaneously put in
> flight.

When you say "resources in the backend driver would get freed up", that's the domU (sender) backend driver?

> Either those UDP packets are queued up somewhere in the DOM0 network
> stack, or the destructor callback is not getting called for some
> reason or has got overwritten(!).

Well, the packets aren't stuck in the dom0 network stack... They get delivered all the way up to the application just fine (nc in the trivial test case).

So I think it must be the latter... After delivering the UDP packet to the application, the destructor is not being called back. Further, this seems to be specific to the receive path for packets delivered to userspace (since traffic to the kernel NFS server doesn't seem to trigger it, nor traffic to closed ports).

What (specific source files or documentation) would you suggest starting at, to see an example of how the destruction is supposed to be done? I guess the TCP receive code works properly, so maybe I should compare that to the UDP code?

Thanks,
mukesh
Ian Pratt
2005-Jan-16 22:52 UTC
RE: [Xen-devel] (repeatable) cross-domain networking failure
> What (specific source files or documentation) would you suggest starting
> at, to see an example of how the destruction is supposed to be done? I
> guess the TCP receive code works properly, so maybe I should compare that
> to the UDP code?

Have you modified the config of your kernel at all? Can you reproduce with one of the kernels compiled by us?

To debug this, I'd start off by instrumenting calls to skb_dequeue in netback's net_rx_action, along with calls to skb_free and __kfree_skb in skbuff.c

Ian
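Once instrumentation of this kind is in place, the resulting trace has to be checked for buffers that were dequeued but never freed. The Python sketch below shows one way to do that matching offline. The trace line format (`dequeue <addr>` and `free <addr>`) is hypothetical, invented for this example; real kernel log output would need its own parser.

```python
def unfreed_buffers(trace_lines):
    """Return addresses seen in a 'dequeue' event with no later 'free' event.

    Each line is expected to look like 'dequeue 0xc1234000' or
    'free 0xc1234000' (a made-up format used only for illustration).
    """
    outstanding = []
    for line in trace_lines:
        parts = line.split()
        if len(parts) != 2:
            continue                    # skip anything that is not an event
        event, addr = parts
        if event == "dequeue":
            outstanding.append(addr)
        elif event == "free" and addr in outstanding:
            outstanding.remove(addr)    # match the oldest outstanding dequeue
    return outstanding
```

A list is used rather than a set because buffer addresses can be reused, so the same address may legitimately be dequeued again after it has been freed.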
mukesh agrawal
2005-Jan-16 22:57 UTC
RE: [Xen-devel] (repeatable) cross-domain networking failure
On Sun, 16 Jan 2005, Ian Pratt wrote:
> Have you modified the config of your kernel at all? Can you reproduce
> with one of the kernels compiled by us?

Yep. I've experienced these hangs with the kernels and hypervisor from the Xen 2.0.3 release.

> To debug this, I'd start off by instrumenting calls to skb_dequeue in
> netback's net_rx_action, along with calls to skb_free and __kfree_skb in
> skbuff.c

Ok, will do.

Thanks,
mukesh
Ian Pratt
2005-Jan-17 23:14 UTC
RE: [Xen-devel] (repeatable) cross-domain networking failure
OK, I have a good handle on the problem with UDP hangs into user-space of domain 0.

It's down to message size: if the UDP payload size is less than 24 bytes, the buffer is not freed properly. Bizarre, but it explains why our regression tests weren't picking it up as they all use larger message sizes.

Anyhow, now we can reproduce, a fix should be forthcoming.

Ian
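To see why the reproduction earlier in this thread falls into the small-message case, the Python sketch below computes the UDP payload sizes produced by sending a decimal counter plus newline, as `echo $i | nc -u` is assumed to do (one line per datagram). The 24-byte threshold is the figure from the message above; the constant and helper names are assumptions made for this example.

```python
SMALL_PAYLOAD_THRESHOLD = 24  # bytes, per the diagnosis in this thread


def counter_payload_size(i):
    """Size in bytes of a datagram carrying the output of `echo $i`."""
    return len(f"{i}\n".encode("ascii"))


def below_threshold(size):
    """True if a payload of `size` bytes is under the reported threshold."""
    return size < SMALL_PAYLOAD_THRESHOLD


# Payload sizes for the first 128 counter values of the test loop.
sizes = [counter_payload_size(i) for i in range(1, 129)]
print(min(sizes), max(sizes), all(below_threshold(s) for s in sizes))
```

Every datagram in that test is only a few bytes long, so each one would be expected to hit the affected path.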
Adam Heath
2005-Jan-18 02:06 UTC
RE: [Xen-devel] (repeatable) cross-domain networking failure
On Mon, 17 Jan 2005, Ian Pratt wrote:
> OK, I have a good handle on the problem with UDP hangs into user-space
> of domain 0.
>
> It's down to message size: if the UDP payload size is less than 24
> bytes, the buffer is not freed properly. Bizarre, but it explains why
> our regression tests weren't picking it up as they all use larger
> message sizes.
>
> Anyhow, now we can reproduce, a fix should be forthcoming.

Is it possible for an nfs request/response to be less than 24 bytes in size?
Keir Fraser
2005-Jan-18 11:05 UTC
Re: [Xen-devel] (repeatable) cross-domain networking failure
> OK, I have a good handle on the problem with UDP hangs into user-space
> of domain 0.
>
> It's down to message size: if the UDP payload size is less than 24
> bytes, the buffer is not freed properly. Bizarre, but it explains why
> our regression tests weren't picking it up as they all use larger
> message sizes.
>
> Anyhow, now we can reproduce, a fix should be forthcoming.
>
> Ian

This bug is now (hopefully) fixed in the testing and unstable trees.

-- Keir
Keir Fraser
2005-Jan-18 11:28 UTC
Re: [Xen-devel] (repeatable) cross-domain networking failure
> OK, I have a good handle on the problem with UDP hangs into user-space
> of domain 0.
>
> It's down to message size: if the UDP payload size is less than 24
> bytes, the buffer is not freed properly. Bizarre, but it explains why
> our regression tests weren't picking it up as they all use larger
> message sizes.
>
> Anyhow, now we can reproduce, a fix should be forthcoming.
>
> Ian

This bug is now (hopefully) fixed in the testing and unstable trees.

-- Keir
Nivedita Singhvi
2005-Jan-18 16:04 UTC
Re: [Xen-devel] (repeatable) cross-domain networking failure
Keir Fraser wrote:
>> Anyhow, now we can reproduce, a fix should be forthcoming.
>>
>> Ian
>
> This bug is now (hopefully) fixed in the testing and unstable trees.

Many thanks, Ian and Keir!

I know this was recently mentioned on a thread but I'm unable to remember or locate it - but are your regression tests available publicly? I'm currently assisting some engineers to put some automated testing in place for this internally. The small message test (a netperf with msg size going from say 1 byte in steps to > ~64K) is very handy indeed, it has often exposed problems. We'd be glad to throw some tests at you as well.

thanks,
Nivedita
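A message-size sweep of the kind described here could be driven by a size schedule like the one generated below. This Python sketch only produces the list of sizes; it is not netperf and does not send anything. The doubling step and the function name are assumptions, and the sweep stops at 65507 bytes, the largest UDP payload that fits in a single IPv4 datagram (65535 minus a 20-byte IP header and an 8-byte UDP header).

```python
MAX_UDP_PAYLOAD_IPV4 = 65507  # 65535 - 20 (IPv4 header) - 8 (UDP header)


def sweep_sizes(start=1, limit=MAX_UDP_PAYLOAD_IPV4):
    """Return payload sizes doubling from `start`, ending exactly at `limit`."""
    sizes = []
    size = start
    while size < limit:
        sizes.append(size)
        size *= 2
    sizes.append(limit)     # always finish with the largest size
    return sizes
```

Starting at 1 byte means the schedule includes several sizes under 24 bytes, the range that exposed the bug in this thread.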
mukesh agrawal
2005-Jan-19 23:17 UTC
Re: [Xen-devel] (repeatable) cross-domain networking failure
On Tue, 18 Jan 2005, Keir Fraser wrote:
> This bug is now (hopefully) fixed in the testing and unstable trees.

Yep, works for me now. Thanks!

-------------------------------------------------------
This SF.Net email is sponsored by: IntelliVIEW -- Interactive
Reporting Tool for open source databases. Create drag-&-drop reports.
Save time by over 75%! Publish reports on the web. Export to DOC, XLS, RTF, etc.
Download a FREE copy at http://www.intelliview.com/go/osdn_nl
Adam Heath
2005-Jan-20 19:11 UTC
Re: [Xen-devel] (repeatable) cross-domain networking failure
On Tue, 18 Jan 2005, Keir Fraser wrote:
> > OK, I have a good handle on the problem with UDP hangs into user-space
> > of domain 0.
> >
> > It's down to message size: if the UDP payload size is less than 24
> > bytes, the buffer is not freed properly. Bizarre, but it explains why
> > our regression tests weren't picking it up as they all use larger
> > message sizes.
> >
> > Anyhow, now we can reproduce, a fix should be forthcoming.
> >
> > Ian
>
> This bug is now (hopefully) fixed in the testing and unstable trees.

Does this bug exist in the stable(2.0) tree?
Ian Pratt
2005-Jan-20 22:08 UTC
RE: [Xen-devel] (repeatable) cross-domain networking failure
> > > It's down to message size: if the UDP payload size is less than 24
> > > bytes, the buffer is not freed properly. Bizarre, but it explains why
> > > our regression tests weren't picking it up as they all use larger
> > > message sizes.
> > >
> > > Anyhow, now we can reproduce, a fix should be forthcoming.
> > >
> > > Ian
> >
> > This bug is now (hopefully) fixed in the testing and unstable trees.
>
> Does this bug exist in the stable(2.0) tree?

Yes - it will be fixed in 2.0.4. It was pretty obscure (having been in there ever since 1.3) so we're not rushing headlong into doing a new release.

Ian