I'm using qemu-xen-traditional.

I start a simple domain:

name="170131-10018"
vif="mac=00:16:3e:00:21:98,ip=62.76.184.141"
disk="phy:/dev/disk/vbd/170131-18,xvda,w"
memory=2048
maxmem=2048
vcpus=4
maxvcpus=4
cpu_cap=400
cpu_weight=2048
vfb="type=vnc,vncpasswd=x0boskQNan"

with kernel 2.6.32-220-el6 (CentOS 6 stock kernel) inside the domU.
After working for 12-16 hours, the domU loses its network.
No packets reach the dom0 vif interface (tcpdump shows nothing).
xl dmesg shows nothing.
qemu-dm shows nothing.

I tried to rmmod the xen_netfront module inside the domU and then
modprobe it again, and got these errors:
net eth0: xennet_release_rx_bufs: fixme for copying receiver.
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!

ip a s shows that I have eth0, but I can't bring the link up
(ip link set up dev eth0 says: Cannot assign requested address).

A reboot fixes this, but after some time it happens again.

P.S. I can't check qemu-xen (upstream qemu) because it can't migrate
(see my previous emails with the subject "xen 4.3 test report").

--
Vasiliy Tolstov,
e-mail: v.tolstov@selfip.ru
jabber: vase@selfip.ru

On Fri, May 31, 2013 at 01:14:05PM +0400, Vasiliy Tolstov wrote:
> I'm using qemu-xen-traditional.
>
> I start a simple domain:
>
> name="170131-10018"
> vif="mac=00:16:3e:00:21:98,ip=62.76.184.141"
> disk="phy:/dev/disk/vbd/170131-18,xvda,w"
> memory=2048
> maxmem=2048
> vcpus=4
> maxvcpus=4
> cpu_cap=400
> cpu_weight=2048
> vfb="type=vnc,vncpasswd=x0boskQNan"
>
> with kernel 2.6.32-220-el6 (CentOS 6 stock kernel) inside the domU.
> After working for 12-16 hours, the domU loses its network.
> No packets reach the dom0 vif interface (tcpdump shows nothing).
> xl dmesg shows nothing.
> qemu-dm shows nothing.
>
> I tried to rmmod the xen_netfront module inside the domU and then
> modprobe it again, and got these errors:
> net eth0: xennet_release_rx_bufs: fixme for copying receiver.
> WARNING: g.e. still in use!
> WARNING: leaking g.e. and page still in use!
>

What's your Dom0 kernel? Could you confirm netback is still running?

Did you try to rmmod xen-netfront when there was no connection? It
looks like the grant table entry is still in use. But if netfront was
still pushing packets to netback, I think that warning message is
expected.

> ip a s shows that I have eth0, but I can't bring the link up
> (ip link set up dev eth0 says: Cannot assign requested address).
>
> A reboot fixes this, but after some time it happens again.
>
> P.S. I can't check qemu-xen (upstream qemu) because it can't migrate
> (see my previous emails with the subject "xen 4.3 test report").
>

Not quite sure if this is related to QEMU. I think the preferred backend
is netback.

Wei.

2013/5/31 Wei Liu <wei.liu2@citrix.com>:
> What's your Dom0 kernel? Could you confirm netback is still running?
>

The dom0 kernel is 3.8.6.

> Did you try to rmmod xen-netfront when there was no connection? It
> looks like the grant table entry is still in use. But if netfront was
> still pushing packets to netback, I think that warning message is
> expected.

Yes, I tried, and after that the domU can't manipulate the eth0 interface
(it can't assign an address or bring the link up).

>> P.S. I can't check qemu-xen (upstream qemu) because it can't migrate
>> (see my previous emails with the subject "xen 4.3 test report").
>>
>
> Not quite sure if this is related to QEMU. I think the preferred backend
> is netback.

Maybe...

--
Vasiliy Tolstov,
e-mail: v.tolstov@selfip.ru
jabber: vase@selfip.ru

2013/5/31 Wei Liu <wei.liu2@citrix.com>:
> What's your Dom0 kernel? Could you confirm netback is still running?

Yes, netback is running, and other servers work fine on this node.

--
Vasiliy Tolstov,
e-mail: v.tolstov@selfip.ru
jabber: vase@selfip.ru

On Fri, 2013-05-31 at 10:23 +0100, Wei Liu wrote:
> On Fri, May 31, 2013 at 01:14:05PM +0400, Vasiliy Tolstov wrote:
> > I'm using qemu-xen-traditional.

Are you sure of this? The default is now qemu-xen and you don't appear
to override it in your domain cfg (although, as discussed below, qemu is
probably not relevant here).

> > I start a simple domain:
> >
> > name="170131-10018"
> > vif="mac=00:16:3e:00:21:98,ip=62.76.184.141"
> > disk="phy:/dev/disk/vbd/170131-18,xvda,w"
> > memory=2048
> > maxmem=2048
> > vcpus=4
> > maxvcpus=4
> > cpu_cap=400
> > cpu_weight=2048
> > vfb="type=vnc,vncpasswd=x0boskQNan"
> >
> > with kernel 2.6.32-220-el6 (CentOS 6 stock kernel) inside the domU.
> > After working for 12-16 hours, the domU loses its network.
> > No packets reach the dom0 vif interface (tcpdump shows nothing).
> > xl dmesg shows nothing.
> > qemu-dm shows nothing.
> >
> > I tried to rmmod the xen_netfront module inside the domU and then
> > modprobe it again, and got these errors:
> > net eth0: xennet_release_rx_bufs: fixme for copying receiver.
> > WARNING: g.e. still in use!
> > WARNING: leaking g.e. and page still in use!
> >
>
> What's your Dom0 kernel? Could you confirm netback is still running?
>
> Did you try to rmmod xen-netfront when there was no connection? It
> looks like the grant table entry is still in use. But if netfront was
> still pushing packets to netback, I think that warning message is
> expected.

Yes, I think the message is expected too.

I'm more concerned about the network dropping out after 12-16 hours
(which I think is what Vasiliy is really reporting).

> > ip a s shows that I have eth0, but I can't bring the link up
> > (ip link set up dev eth0 says: Cannot assign requested address).
> >
> > A reboot fixes this, but after some time it happens again.
> >
> > P.S. I can't check qemu-xen (upstream qemu) because it can't migrate
> > (see my previous emails with the subject "xen 4.3 test report").
> >
>
> Not quite sure if this is related to QEMU. I think the preferred backend
> is netback.

Correct, nothing in the xen toolstack ever uses the qemu netback (which,
if it exists at all, is a xenner thing). Since this is a PV guest, the
only thing qemu will be doing is providing the xenfb backend.

Ian.

2013/5/31 Ian Campbell <Ian.Campbell@citrix.com>:
> Are you sure of this? The default is now qemu-xen and you don't appear
> to override it in your domain cfg (although, as discussed below, qemu is
> probably not relevant here).
>

I'm sure: the xm toolstack with xend can't operate with qemu-xen =).
And I use xm with xend because on 4.3 only this combination provides
live migration =)

> I'm more concerned about the network dropping out after 12-16 hours
> (which I think is what Vasiliy is really reporting).

I'm trying to gather more statistics on which domU kernels are affected
by this. What can I collect from my system to get more debug output when
this error occurs?

--
Vasiliy Tolstov,
e-mail: v.tolstov@selfip.ru
jabber: vase@selfip.ru

On Fri, May 31, 2013 at 10:30:20AM +0100, Ian Campbell wrote:
[...]
> > >
> > What's your Dom0 kernel? Could you confirm netback is still running?
> >
> > Did you try to rmmod xen-netfront when there was no connection? It
> > looks like the grant table entry is still in use. But if netfront was
> > still pushing packets to netback, I think that warning message is
> > expected.
>
> Yes, I think the message is expected too.
>
> I'm more concerned about the network dropping out after 12-16 hours
> (which I think is what Vasiliy is really reporting).
>

Right. I just wanted to know whether there was a stale grant entry.

If, for some reason, netfront runs out of grant table entries, I think
that might cause a loss of connectivity.

Vasiliy, can you grab more logs from DomU and Dom0?

Wei.

On Fri, May 31, 2013 at 01:14:05PM +0400, Vasiliy Tolstov wrote:
> I'm using qemu-xen-traditional.
>
> I start a simple domain:
>
> name="170131-10018"
> vif="mac=00:16:3e:00:21:98,ip=62.76.184.141"
> disk="phy:/dev/disk/vbd/170131-18,xvda,w"
> memory=2048
> maxmem=2048
> vcpus=4
> maxvcpus=4
> cpu_cap=400
> cpu_weight=2048
> vfb="type=vnc,vncpasswd=x0boskQNan"
>
> with kernel 2.6.32-220-el6 (CentOS 6 stock kernel) inside the domU.
> After working for 12-16 hours, the domU loses its network.
>

Did you try with the latest CentOS 6.4 kernel-2.6.32-358.6.2.el6?
The -220 you're using is quite old.

-- Pasi

2013/5/31 Wei Liu <wei.liu2@citrix.com>:
> Right. I just wanted to know whether there was a stale grant entry.
>
> If, for some reason, netfront runs out of grant table entries, I think
> that might cause a loss of connectivity.
>
> Vasiliy, can you grab more logs from DomU and Dom0?

After about 36 hours these messages appeared in xen dmesg:
(XEN) grant_table.c:289:d0 Increased maptrack size to 2 frames
(XEN) mm.c:1695:d0 Bad L1 flags 400000

Inside the domU there is nothing. As far as I can see, the network goes
missing not only on CentOS 6 kernel based domUs but also on 2.6.32.26
(built from Jeremy's xenlinux git tree, about 2 years old).

--
Vasiliy Tolstov,
e-mail: v.tolstov@selfip.ru
jabber: vase@selfip.ru

2013/5/31 Pasi Kärkkäinen <pasik@iki.fi>:
> Did you try with the latest CentOS 6.4 kernel-2.6.32-358.6.2.el6?
> The -220 you're using is quite old.

I can't. I have about 700 VPSes with this kernel. And on xen 4.1.3 with
SUSE patches (dom0 2.6.32 kernel) this problem never happened (the
VPSes work fine for 30-60 days).

--
Vasiliy Tolstov,
e-mail: v.tolstov@selfip.ru
jabber: vase@selfip.ru

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

2013/5/31 Vasiliy Tolstov <v.tolstov@selfip.ru>:
>> Did you try with the latest CentOS 6.4 kernel-2.6.32-358.6.2.el6?
>> The -220 you're using is quite old.

I'll try to create a test server with a newer CentOS kernel and wait
for this bug...

--
Vasiliy Tolstov,
e-mail: v.tolstov@selfip.ru
jabber: vase@selfip.ru

On Fri, May 31, 2013 at 02:22:44PM +0400, Vasiliy Tolstov wrote:
> 2013/5/31 Vasiliy Tolstov <v.tolstov@selfip.ru>:
> >> Did you try with the latest CentOS 6.4 kernel-2.6.32-358.6.2.el6?
> >> The -220 you're using is quite old.
>
>
> I'll try to create a test server with a newer CentOS kernel and wait
> for this bug...
>

Yep. Remember that the old -220 has multiple security bugs etc.

-- Pasi

2013/5/31 Pasi Kärkkäinen <pasik@iki.fi>:
> Yep. Remember that the old -220 has multiple security bugs etc.

That is the clients' problem. We can't force them to update =).
But my problem is that with the new hypervisor a stable kernel works
unstably.

--
Vasiliy Tolstov,
e-mail: v.tolstov@selfip.ru
jabber: vase@selfip.ru

On Fri, May 31, 2013 at 02:26:38PM +0400, Vasiliy Tolstov wrote:
> 2013/5/31 Pasi Kärkkäinen <pasik@iki.fi>:
> > Yep. Remember that the old -220 has multiple security bugs etc.
>
>
> That is the clients' problem. We can't force them to update =).
> But my problem is that with the new hypervisor a stable kernel works
> unstably.
>

The problem might also be in the dom0 kernel (the xen-netback driver,
perhaps?), not necessarily in the hypervisor.

-- Pasi

2013/5/31 Pasi Kärkkäinen <pasik@iki.fi>:
> The problem might also be in the dom0 kernel (the xen-netback driver,
> perhaps?), not necessarily in the hypervisor.

Maybe. Another server with kernel 3.6.9 and xen 4.1.3_04-0.5.1 (from the
openSUSE repo) works fine without these problems.

--
Vasiliy Tolstov,
e-mail: v.tolstov@selfip.ru
jabber: vase@selfip.ru

On Fri, May 31, 2013 at 02:21:03PM +0400, Vasiliy Tolstov wrote:
> 2013/5/31 Wei Liu <wei.liu2@citrix.com>:
> > Right. I just wanted to know whether there was a stale grant entry.
> >
> > If, for some reason, netfront runs out of grant table entries, I think
> > that might cause a loss of connectivity.
> >
> > Vasiliy, can you grab more logs from DomU and Dom0?
>
>
> After about 36 hours these messages appeared in xen dmesg:
> (XEN) grant_table.c:289:d0 Increased maptrack size to 2 frames
> (XEN) mm.c:1695:d0 Bad L1 flags 400000
>
> Inside the domU there is nothing. As far as I can see, the network goes
> missing not only on CentOS 6 kernel based domUs but also on 2.6.32.26
> (built from Jeremy's xenlinux git tree, about 2 years old).
>

How about the dom0 Linux dmesg? Anything there?

-- Pasi

On Fri, May 31, 2013 at 02:21:03PM +0400, Vasiliy Tolstov wrote:
> 2013/5/31 Wei Liu <wei.liu2@citrix.com>:
> > Right. I just wanted to know whether there was a stale grant entry.
> >
> > If, for some reason, netfront runs out of grant table entries, I think
> > that might cause a loss of connectivity.
> >
> > Vasiliy, can you grab more logs from DomU and Dom0?
>
>
> After about 36 hours these messages appeared in xen dmesg:
> (XEN) grant_table.c:289:d0 Increased maptrack size to 2 frames

If you didn't have more DomUs up and running, this looks like a leak
somewhere.

> (XEN) mm.c:1695:d0 Bad L1 flags 400000
>

And this... I have no idea at the moment.

> Inside the domU there is nothing. As far as I can see, the network goes
> missing not only on CentOS 6 kernel based domUs but also on 2.6.32.26
> (built from Jeremy's xenlinux git tree, about 2 years old).
>

Though that looks mostly like a backend problem, I would still ask if
you can spare some time to try the same workload on newer kernels and
see what happens.

And I would also need your Dom0 configuration and the dmesg log from Dom0.

Wei.

>
> --
> Vasiliy Tolstov,
> e-mail: v.tolstov@selfip.ru
> jabber: vase@selfip.ru

2013/5/31 Wei Liu <wei.liu2@citrix.com>:
> If you didn't have more DomUs up and running, this looks like a leak
> somewhere.
>

On this node I have only 3 domains with a bad network. All the other 30
domains work fine.

>> (XEN) mm.c:1695:d0 Bad L1 flags 400000
>>
>
> And this... I have no idea at the moment.
>
>
> And I would also need your Dom0 configuration and the dmesg log from Dom0.

Bingo. On another server with this kernel and xen I have:

[131872.141036] vif vif-139-0 vif139.0: Too many frags
[131872.141042] vif139.0: fatal error; disabling device

What do I need to provide from the dom0 configuration?

--
Vasiliy Tolstov,
e-mail: v.tolstov@selfip.ru
jabber: vase@selfip.ru

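The two netback messages quoted above are easy to miss in a long dom0
log. As a minimal sketch (not part of the original thread), the Python
below scans dmesg text for vif lines of that shape and reports which
interfaces logged a fatal error. The regular expression is an assumption
derived only from the two sample lines in this thread, so other kernels
may format these messages differently; find_disabled_vifs is a
hypothetical helper name.

```python
import re

# Pattern inferred from the two log lines quoted in the thread, e.g.
#   [131872.141036] vif vif-139-0 vif139.0: Too many frags
#   [131872.141042] vif139.0: fatal error; disabling device
# It is an assumption, not a documented kernel log format.
VIF_MSG = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+(?:vif vif-\d+-\d+ )?"
    r"vif(?P<domid>\d+)\.(?P<handle>\d+): (?P<msg>.+)$"
)


def find_disabled_vifs(dmesg_text):
    """Return {(domid, handle): [messages]} for vifs that logged a fatal error."""
    seen = {}
    for line in dmesg_text.splitlines():
        m = VIF_MSG.match(line.strip())
        if not m:
            continue
        key = (int(m.group("domid")), int(m.group("handle")))
        seen.setdefault(key, []).append(m.group("msg"))
    # Keep only interfaces where at least one message reports a fatal error.
    return {k: v for k, v in seen.items()
            if any("fatal error" in msg for msg in v)}


# The sample is copied from the thread.
sample = """\
[131872.141036] vif vif-139-0 vif139.0: Too many frags
[131872.141042] vif139.0: fatal error; disabling device
"""

if __name__ == "__main__":
    for (domid, handle), msgs in find_disabled_vifs(sample).items():
        print(f"domain {domid}, vif {handle}: {'; '.join(msgs)}")
```

In practice the text would come from the dom0 kernel log rather than the
embedded sample; the domain ID in the result tells you which guest lost
its network.
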
On Fri, May 31, 2013 at 03:07:00PM +0400, Vasiliy Tolstov wrote:
> 2013/5/31 Wei Liu <wei.liu2@citrix.com>:
> > If you didn't have more DomUs up and running, this looks like a leak
> > somewhere.
> >
>
> On this node I have only 3 domains with a bad network. All the other 30
> domains work fine.
>
> >> (XEN) mm.c:1695:d0 Bad L1 flags 400000
> >>
> >
> > And this... I have no idea at the moment.
> >
> >
> > And I would also need your Dom0 configuration and the dmesg log from Dom0.
>
> Bingo. On another server with this kernel and xen I have:
>
> [131872.141036] vif vif-139-0 vif139.0: Too many frags
> [131872.141042] vif139.0: fatal error; disabling device
>
> What do I need to provide from the dom0 configuration?
>

Ah, this is a known issue. MAX_SKB_FRAGS in DomU is larger than the one
in Dom0, so Dom0 thinks the packet is malicious and disconnects the vif.
Patches have been upstreamed to fix this, but the backporting to 3.8 has
not been kicked off.

If you don't want to bother applying patches and rebuilding your kernel,
you can probably wait until the backport happens.

If you cannot wait, look for the email sent by Ian Campbell with the title:
"xen-netback stable backports request (regression fixes)".

Wei.

> --
> Vasiliy Tolstov,
> e-mail: v.tolstov@selfip.ru
> jabber: vase@selfip.ru

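The failure mode Wei describes can be illustrated with a toy model. The
Python below is only a sketch of the idea, not netback code: the Vif
class, its method names and the limits 18 and 17 are assumptions chosen
for illustration. It shows why a guest can run for hours and then lose
its network for good: a single packet with more fragments than the
backend accepts is enough to disable the device until it is recreated.

```python
from dataclasses import dataclass

# Hypothetical limits for illustration only; the real values depend on the
# kernel version and page size on each side.
FRONTEND_MAX_FRAGS = 18  # assumed: older DomU kernel allows more fragments
BACKEND_MAX_FRAGS = 17   # assumed: newer Dom0 kernel allows fewer


@dataclass
class Vif:
    """Toy model of a backend interface with a strict fragment limit."""
    name: str
    max_frags: int
    disabled: bool = False

    def receive(self, frags):
        """Handle one guest packet made of `frags` fragments."""
        if self.disabled:
            return "dropped (device disabled)"
        if frags > self.max_frags:
            # Strict behaviour described in the thread: the packet is
            # treated as malicious and the whole interface is shut down.
            self.disabled = True
            return "fatal error; disabling device"
        return "ok"


if __name__ == "__main__":
    vif = Vif("vif139.0", BACKEND_MAX_FRAGS)
    for frags in (3, 17, FRONTEND_MAX_FRAGS, 1):
        print(frags, "->", vif.receive(frags))
```

Most traffic uses few fragments, so the mismatch only shows up when the
guest happens to build a packet at its own maximum, which would be
consistent with the long delay before the network drops.
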
2013/5/31 Wei Liu <wei.liu2@citrix.com>:
> Ah, this is a known issue. MAX_SKB_FRAGS in DomU is larger than the one
> in Dom0, so Dom0 thinks the packet is malicious and disconnects the vif.
> Patches have been upstreamed to fix this, but the backporting to 3.8 has
> not been kicked off.
>
> If you don't want to bother applying patches and rebuilding your kernel,
> you can probably wait until the backport happens.
>
> If you cannot wait, look for the email sent by Ian Campbell with the title:
> "xen-netback stable backports request (regression fixes)".

Thanks, I'll try it. (I'll try to use 3.9.4.)

--
Vasiliy Tolstov,
e-mail: v.tolstov@selfip.ru
jabber: vase@selfip.ru

On Fri, May 31, 2013 at 04:15:31PM +0400, Vasiliy Tolstov wrote:
> 2013/5/31 Wei Liu <wei.liu2@citrix.com>:
> > Ah, this is a known issue. MAX_SKB_FRAGS in DomU is larger than the one
> > in Dom0, so Dom0 thinks the packet is malicious and disconnects the vif.
> > Patches have been upstreamed to fix this, but the backporting to 3.8 has
> > not been kicked off.
> >
> > If you don't want to bother applying patches and rebuilding your kernel,
> > you can probably wait until the backport happens.
> >
> > If you cannot wait, look for the email sent by Ian Campbell with the title:
> > "xen-netback stable backports request (regression fixes)".
>
>
> Thanks, I'll try it. (I'll try to use 3.9.4.)
>

I think you need 3.10, but I'm not sure.

Wei.

> --
> Vasiliy Tolstov,
> e-mail: v.tolstov@selfip.ru
> jabber: vase@selfip.ru

On 31/05/2013 11:10 PM, Wei Liu wrote:
> On Fri, May 31, 2013 at 04:15:31PM +0400, Vasiliy Tolstov wrote:
>> 2013/5/31 Wei Liu <wei.liu2@citrix.com>:
>>> Ah, this is a known issue. MAX_SKB_FRAGS in DomU is larger than the one
>>> in Dom0, so Dom0 thinks the packet is malicious and disconnects the vif.
>>> Patches have been upstreamed to fix this, but the backporting to 3.8 has
>>> not been kicked off.
>>>
>>> If you don't want to bother applying patches and rebuilding your kernel,
>>> you can probably wait until the backport happens.
>>>
>>> If you cannot wait, look for the email sent by Ian Campbell with the title:
>>> "xen-netback stable backports request (regression fixes)".
>>
>>
>> Thanks, I'll try it. (I'll try to use 3.9.4.)
>>
>
> I think you need 3.10, but I'm not sure.

Correct. I package 3.9.x for EL6 and I've had to patch it in manually, as
the fix doesn't exist in the vanilla kernel...

--
Steven Haigh

Email: netwiz@crc.id.au
Web: https://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897
Fax: (03) 8338 0299

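To check a fleet of dom0 hosts against the version discussed above, a
small helper along these lines might do. It is a sketch under the
assumption, taken from this thread only, that vanilla kernels before 3.10
lack the netback fix; later stable backports or distribution patches can
make that untrue, so treat the result as a hint. The function names are
hypothetical.

```python
import re


def kernel_version(release):
    """Parse the leading 'major.minor' of a kernel release string."""
    m = re.match(r"(\d+)\.(\d+)", release)
    if not m:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(m.group(1)), int(m.group(2))


def needs_manual_backport(release, fixed_in=(3, 10)):
    """True if a vanilla dom0 kernel of this release predates `fixed_in`.

    The default threshold (3.10) is the version named in the thread; it is
    an assumption and does not account for patched distribution kernels.
    """
    return kernel_version(release) < fixed_in


if __name__ == "__main__":
    for rel in ("3.8.6", "3.9.4", "3.10.0"):
        print(rel, needs_manual_backport(rel))
```

On a live dom0 the release string could be taken from platform.release()
in the standard library.
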
2013/5/31 Steven Haigh <netwiz@crc.id.au>:
> Correct. I package 3.9.x for EL6 and I've had to patch it in manually, as
> the fix doesn't exist in the vanilla kernel...

Thanks, this saves me time. But in the case of Xen, which is best: build
my own kernel with backported patches (3.9.4) or use 3.10?

--
Vasiliy Tolstov,
e-mail: v.tolstov@selfip.ru
jabber: vase@selfip.ru

On 01/06/2013, at 7:17, Vasiliy Tolstov <v.tolstov@selfip.ru> wrote:
> 2013/5/31 Steven Haigh <netwiz@crc.id.au>:
>> Correct. I package 3.9.x for EL6 and I've had to patch it in manually, as
>> the fix doesn't exist in the vanilla kernel...
>
>
> Thanks, this saves me time. But in the case of Xen, which is best: build
> my own kernel with backported patches (3.9.4) or use 3.10?

I guess this depends on your organisation's needs. I wouldn't use
mainline in production, but that is just my opinion...

--
Steven Haigh

Email: netwiz@crc.id.au
Web: https://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897
Fax: (03) 8338 0299