Kamble, Nitin A
2006-Feb-25 00:10 UTC
[Xen-devel] [PATCH] Fox for pcnet device model data corruption
Hi Ian,

The attached patch fixes the pcnet data corruption for VMX guests that you reported.

All packets go through the generic qemu packet interface to the specific device model (DM), which in this case is pcnet. The pcnet device model's receiver is registered with it like this:

    qemu_add_read_packet(nd, pcnet_can_receive, pcnet_receive, d);

The pcnet_can_receive() function tells the generic qemu framework whether the DM can receive packets. It is supposed to block incoming packets when the pcnet driver has not yet been started by the OS, when the pcnet device is suspended or stopped by the OS, or when it is not ready to receive more packets.

When traffic on the DM is heavy, its receive ring can fill up, and it then has to drop incoming packets. This patch detects this situation in pcnet_can_receive() and so avoids dropping packets. The mechanism works as bandwidth handshaking between the device model and the sender: the DM is saying, "send me packets only at the rate I can handle."

Signed-Off-By: Nitin A Kamble <nitin.a.kamble@intel.com>

Thanks & Regards,
Nitin
--------------------------------------------------
Open Source Technology Center, Intel Corp

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
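As a rough illustration of the handshake described in the message above, here is a minimal sketch in C. It is not the qemu pcnet code: every `toy_*` name, the ring layout, and the sizes are made up for illustration. Only the idea comes from the thread, namely that the generic layer asks a can-receive callback before it calls the receive callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_RING_SIZE 4   /* hypothetical ring length */
#define TOY_BUF_SIZE  64  /* hypothetical per-descriptor buffer size */

/* Hypothetical, simplified receive descriptor: the device may only
 * write into a slot while dev_owns is true. */
struct toy_desc {
    bool dev_owns;
    size_t len;
    unsigned char buf[TOY_BUF_SIZE];
};

struct toy_nic {
    bool started;   /* guest driver has started the device */
    size_t next;    /* next descriptor the device will fill */
    struct toy_desc ring[TOY_RING_SIZE];
};

/* Hand every descriptor to the device and mark the device started. */
void toy_nic_start(struct toy_nic *nic)
{
    memset(nic, 0, sizeof(*nic));
    for (size_t i = 0; i < TOY_RING_SIZE; i++)
        nic->ring[i].dev_owns = true;
    nic->started = true;
}

/* Stand-in for a can-receive callback: refuse packets when the device
 * is stopped or when the next descriptor still belongs to the guest. */
int toy_can_receive(const struct toy_nic *nic)
{
    return nic->started && nic->ring[nic->next].dev_owns;
}

/* Stand-in for a receive callback: copy the packet into the next slot
 * and hand that slot over to the guest. */
void toy_receive(struct toy_nic *nic, const unsigned char *pkt, size_t len)
{
    struct toy_desc *d = &nic->ring[nic->next];

    assert(d->dev_owns && len <= TOY_BUF_SIZE);
    memcpy(d->buf, pkt, len);
    d->len = len;
    d->dev_owns = false;
    nic->next = (nic->next + 1) % TOY_RING_SIZE;
}

/* What the generic layer does in this sketch: ask first, deliver only
 * if allowed. Returns false when the sender must hold the packet. */
bool toy_deliver(struct toy_nic *nic, const unsigned char *pkt, size_t len)
{
    if (!toy_can_receive(nic))
        return false;
    toy_receive(nic, pkt, len);
    return true;
}

/* Guest driver has consumed slot idx and returns it to the device. */
void toy_guest_release(struct toy_nic *nic, size_t idx)
{
    nic->ring[idx % TOY_RING_SIZE].dev_owns = true;
}
```

With this shape, a full ring makes `toy_deliver()` return false instead of entering a drop path, and delivery resumes once the guest releases the slot the device wants next.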
Daniel Stekloff
2006-Feb-25 04:44 UTC
Re: [Xen-devel] [PATCH] Fox for pcnet device model data corruption
On Fri, 2006-02-24 at 16:10 -0800, Kamble, Nitin A wrote:
> Hi Ian,
>
> The attached patch fixes pcnet data corruption for VMX guests as
> reported by you.

Hi Nitin,

This doesn't fix the problem for me. If I try to transfer a 2 GB file from Dom0 to DomU with scp, the transfer is disconnected with the following error:

Received disconnect from 192.168.1.3: 2: Corrupted MAC on input.

Unlike with dropped packets, domU is seeing actual corruption and ending the transfer. There's a buffer that's being overwritten. I can reproduce this on an SMP system.

Thanks,

Dan
Li, Xin B
2006-Feb-25 06:51 UTC
RE: [Xen-devel] [PATCH] Fox for pcnet device model data corruption
>> Hi Ian,
>>
>> The attached patch fixes pcnet data corruption for VMX guests as
>> reported by you.
>
>Hi Nitin,
>
>This doesn't fix the problem for me. If I try to transfer, with scp, a
>2gb file to DomU from Dom0, the transfer is disconnected with the
>following error:
>
>Received disconnect from 192.168.1.3: 2: Corrupted MAC on input.

This fix is for VMX domains only.

-Xin
Keir Fraser
2006-Feb-25 11:23 UTC
[Xen-devel] Re: [PATCH] Fox for pcnet device model data corruption
On 25 Feb 2006, at 00:10, Kamble, Nitin A wrote:
> Hi Ian,
> The attached patch fixes pcnet data corruption for VMX guests as
> reported by you.
> All the packets go through the qemu generic packet interface to the
> specific device model. In this case the device model is pcnet.
> The pcnet device model receiver is registered with it like this.
> qemu_add_read_packet(nd, pcnet_can_receive, pcnet_receive, d);
> pcnet_can_receive function is used to tell the generic qemu
> framework that the DM can receive packets. It is supposed to block
> incoming packets in the cases such as when the pcnet driver is not yet
> started by the OS or pcnet device is suspended or stopped by the OS or
> it is not ready to receive more packets.
> When the traffic is heavy on the DM, its receive rings can get
> filled up, and it will have to drop the received packets. This patch
> detects this situation in the pcnet_can_receive() function and avoids
> dropping of packets. This mechanism is working as a bandwidth
> handshaking between device model and the sender. Dm is saying send me
> up to the rate at which I can handle it.

I can see that this may avoid packet loss, but does pcnet_receive really get confused and corrupt data if there is no spare space? It appears to check the same status flag that you check in your patch?

-- Keir
Kamble, Nitin A
2006-Feb-25 15:27 UTC
[Xen-devel] RE: [PATCH] Fox for pcnet device model data corruption
Hi Keir,

While copying a big file to a VMX domain, I observed that the receive ring in the DM was getting completely full, partly because the VMX domain driver was not able to pull packets off the DM fast enough. In that case the DM was dropping the packet, setting the status to 0x1000, and sending an interrupt to notify the Linux driver that packets were lost because no space was available in the DM's receive ring.

The pcnet driver handles this situation differently. Based on real pcnet issues on some real hardware, the driver tries to clear out the receive ring, assuming it is full of errors. For the emulated DM that is not the case, and things go wrong from that point onwards. I think the error handling part of the pcnet DM is not correct, and it causes the buffer overwrites that result in the corruption we see.

The patch lets the DM detect the receive-ring-full condition in advance, so that packets are not pushed to the DM in that situation. That is better, because otherwise the DM is just going to drop the packet and raise an error.

Yes, the DM checks for this condition in pcnet_receive() in order to signal the OS driver that packets were dropped. But by then it is too late, because the OS driver's handling of this situation does not work properly with the DM.

Thanks & Regards,
Nitin
--------------------------------------------------
Open Source Technology Center, Intel Corp

-----Original Message-----
From: Keir Fraser [mailto:Keir.Fraser@cl.cam.ac.uk]
Sent: Sat 2/25/2006 3:23 AM
To: Kamble, Nitin A
Cc: xen-devel@lists.xensource.com; Ian Pratt
Subject: Re: [PATCH] Fox for pcnet device model data corruption

> I can see that this may avoid packet loss, but does pcnet_receive
> really get confused and corrupt data if there is no spare space? It
> appears to check the same status flag that you check in your patch?
>
> -- Keir
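To make the difference between the two paths concrete, here is a small C sketch that counts what happens when a burst of packets arrives and the guest makes no progress. It is a toy model under stated assumptions, not the device model's code: the `sim_*` names and the ring size are hypothetical, and the only value taken from the thread is the 0x1000 status that the message above says is set on a drop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SIM_RING_SIZE   4        /* hypothetical ring length */
#define SIM_STATUS_MISS 0x1000u  /* status value quoted in the thread */

struct sim_dev {
    unsigned used;     /* descriptors holding packets the guest has not read */
    uint16_t status;   /* sticky status bits visible to the guest driver */
    unsigned err_irqs; /* error interrupts raised towards the guest */
    unsigned dropped;  /* packets lost inside the device */
};

/* Unchecked path: the packet is always pushed to the device, which
 * drops it and raises an error when no descriptor is free. */
void sim_receive_unchecked(struct sim_dev *d)
{
    assert(d->used <= SIM_RING_SIZE);
    if (d->used == SIM_RING_SIZE) {
        d->status |= SIM_STATUS_MISS;
        d->err_irqs++;
        d->dropped++;
        return;
    }
    d->used++;
}

/* Gated path: check for space first. When this returns false the
 * caller keeps the packet, so the error path is never entered. */
bool sim_receive_gated(struct sim_dev *d)
{
    assert(d->used <= SIM_RING_SIZE);
    if (d->used == SIM_RING_SIZE)
        return false;
    d->used++;
    return true;
}
```

In the gated variant the guest driver never sees the error status, so whatever recovery code it runs for a missed frame is not triggered. Whether that recovery code is the actual source of the corruption is the open question in this thread; the sketch does not settle it.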
Kamble, Nitin A
2006-Feb-25 15:27 UTC
RE: [Xen-devel] [PATCH] Fox for pcnet device model data corruption
Yes, the pcnet issue is specific to VMX domains. FYI, VMX domains see the pcnet network adapter, which is actually emulated in the device model. I believe there is still a similar issue for dom0 and domU, and people are working on that.

Thanks & Regards,
Nitin
--------------------------------------------------
Open Source Technology Center, Intel Corp

> This fix is to VMX domain only.
> -Xin
Daniel Stekloff
2006-Feb-25 15:45 UTC
RE: [Xen-devel] [PATCH] Fox for pcnet device model data corruption
On Sat, 2006-02-25 at 14:51 +0800, Li, Xin B wrote:
> >> Hi Ian,
> >>
> >> The attached patch fixes pcnet data corruption for VMX guests as
> >> reported by you.
> >
> >Hi Nitin,
> >
> >This doesn't fix the problem for me. If I try to transfer, with scp, a
> >2gb file to DomU from Dom0, the transfer is disconnected with the
> >following error:
> >
> >Received disconnect from 192.168.1.3: 2: Corrupted MAC on input.
>
> This fix is to VMX domain only.
> -Xin

Yeah, sorry if I wasn't specific in my comment, referring to DomU and not a VMX domain. But the comment remains.

This patch does not completely fix the race in qemu-dm for VMX domains. I tested this patch on my SMP system and the corruption still happens. I couldn't successfully transfer a 2 GB file from Dom0 to a VMX domain. The VMX domain is still seeing corruption.

Thanks,

Dan
Keir Fraser
2006-Feb-25 15:54 UTC
[Xen-devel] Re: [PATCH] Fox for pcnet device model data corruption
On 25 Feb 2006, at 15:27, Kamble, Nitin A wrote:
> The patch is letting the DM detect the receive ring full condition
> in advance, so that packets will not be pushed to DM, in that
> situation, and that is better because otherwise it is just going to
> drop the packet and raise an error.
> Yeh, the DM is checking for this condition in the pcnet_receive(),
> to signal the OS driver that packets are dropped. But it is too late
> because the OS driver handling for this situation does not work
> properly for the DM.

Yes, it's a shame there's no supported device model of a decent NIC. The pcnet one is not keeping up with qemu developments, and is apparently buggy. It'll be a pain when we want to upgrade to a more recent version of qemu too, although I think I've seen an upgraded pcnet patch floating around (though not necessarily from anyone who understands either the particular device model or qemu all that well).

-- Keir
Kamble, Nitin A
2006-Feb-27 16:19 UTC
RE: [Xen-devel] [PATCH] Fox for pcnet device model data corruption
Hi Dan,

If you are scping from a different host, then it is possible that you are seeing the dom0 network issue in the VMX guest. Can you try scping the file from dom0 to the VMX guest?

Thanks & Regards,
Nitin
--------------------------------------------------
Open Source Technology Center, Intel Corp

>-----Original Message-----
>From: Daniel Stekloff [mailto:dsteklof@us.ibm.com]
>Sent: Saturday, February 25, 2006 7:46 AM
>To: Li, Xin B
>Cc: Kamble, Nitin A; Ian Pratt; xen-devel@lists.xensource.com
>Subject: RE: [Xen-devel] [PATCH] Fox for pcnet device model data corruption
>
>Yeah.. sorry if I wasn't specific in my comment, referring to DomU and
>not VMX domain. But the comment remains.
>
>This patch does not completely fix the race in qemu-dm for VMX domains.
>I tested this patch on my SMP system and the corruption still happens. I
>couldn't successfully transfer a 2 gb file from Dom0 to a VMX domain.
>The VMX domain is still seeing corruption.
>
>Thanks,
>
>Dan
Daniel Stekloff
2006-Feb-27 16:40 UTC
RE: [Xen-devel] [PATCH] Fox for pcnet device model data corruption
Hi Nitin,

I set up a private bridge on dom0 and have the VMX domain use that bridge. I use the private bridge to isolate the problem, so nothing is copied off the system. I'm using a 2 GB file for the test. I'm running on 32-bit and 64-bit SMP systems. I'm using a rhel3u5 Linux guest image, which has a 2.4 kernel. If I add any debugging in pcnet_receive(), it slows things down enough for the transfer to work.

What environment are you testing in? What is your test case? Are you not seeing the error anymore?

Is anyone else still seeing the issue like I am?

Since I applied your patch, it fails sooner: a few MB into the transfer rather than hundreds of MB as before.

I will try a Linux 2.6 guest today.

Thanks,

Dan

On Mon, 2006-02-27 at 08:19 -0800, Kamble, Nitin A wrote:
> Hi Dan,
> If you are scping from different host, then it is possible that you
> are seeing the dom0 network issue in the vmx guest. Can you try scping
> the file from dom0 to vmx guest?
>
> Thanks & Regards,
> Nitin
Daniel Stekloff
2006-Feb-27 22:43 UTC
RE: [Xen-devel] [PATCH] Fox for pcnet device model data corruption
On Mon, 2006-02-27 at 08:19 -0800, Kamble, Nitin A wrote:
> Hi Dan,
> If you are scping from different host, then it is possible that you
> are seeing the dom0 network issue in the vmx guest. Can you try scping
> the file from dom0 to vmx guest?

I have now run the test with an FC4 VMX domain, and the 2 GB transfer failed. The transfer rate was between 900kb/sec and 1.6kb/sec, mostly around 1.1mb/sec. What's the acceptable rate here?
Kamble, Nitin A
2006-Feb-28 17:42 UTC
RE: [Xen-devel] [PATCH] Fox for pcnet device model data corruption
Hi Dan,

I am using i386 FC3 as the VMX guest and x86_64 FC4 as the dom0 host. I am using the default config for bridging and networking, using the xen-bridge. The data rate I am getting for scp from dom0 to the VMX guest is 2.4MBps.

With the patch I have not seen any scp failures from dom0 to the VMX guest. Without the patch, it was comparatively easy for me to reproduce the VMX guest scp problem on the same setup, and I was also observing packet-loss errors reported by the pcnet device in the VMX guest. By design, those errors should never occur with the patch, and that's what I am observing.

Thanks & Regards,
Nitin
--------------------------------------------------
Open Source Technology Center, Intel Corp

>-----Original Message-----
>From: Daniel Stekloff [mailto:dsteklof@us.ibm.com]
>Sent: Monday, February 27, 2006 2:43 PM
>To: Kamble, Nitin A
>Cc: Li, Xin B; Ian Pratt; xen-devel@lists.xensource.com
>Subject: RE: [Xen-devel] [PATCH] Fox for pcnet device model data corruption
>
>I have now run using FC4 vmx domain and the 2gb failed. Transfer rate
>was between 900kb/sec and 1.6kb/sec - mostly around 1.1mb/sec. What's
>the acceptable rate here?
Kamble, Nitin A
2006-Feb-28 19:02 UTC
RE: [Xen-devel] [PATCH] Fox for pcnet device model data corruption
Hi Dan,

I also tried an i386 RHEL3U5 VMX guest on top of an FC4 x86_64 dom0 host, and I am not seeing any issues with scp from dom0 to the VMX guest. The guest kernel is now 2.4, and the data rate I am getting is now 3.0MBps. I am trying it on an SMP system, but the dom0 is still UP. How about your dom0?

Thanks & Regards,
Nitin
--------------------------------------------------
Open Source Technology Center, Intel Corp

>-----Original Message-----
>From: Daniel Stekloff [mailto:dsteklof@us.ibm.com]
>Sent: Monday, February 27, 2006 8:41 AM
>To: Kamble, Nitin A
>Cc: Li, Xin B; Ian Pratt; xen-devel@lists.xensource.com
>Subject: RE: [Xen-devel] [PATCH] Fox for pcnet device model data corruption
>
>Hi Nitin,
>
>I set up a private bridge on dom0 and then have the VMX domain use that
>bridge. I use the private bridge to limit the error, so no copying off
>system. I'm using a 2GB file for the test. I'm running on 32 and 64bit
>SMP systems. I'm using a rhel3u5 Linux guest image, which is a 2.4
>kernel. If I add any debugging in pcnet_receive(), it slows it down
>enough for the transfer to work.
>
>What environment are you testing in? What is your test case? Are you not
>seeing the error anymore?
>
>Is anyone else still seeing the issue like I am?
>
>Since I've applied your patch, it fails quicker now - a few mbs into the
>transfer rather than hundreds of mbs before.
>
>I will try a linux 2.6 guest today.
>
>Thanks,
>
>Dan