I have an openSUSE 10.3 host with Xen 3.1.0, and it has been running
fine for a few months.
This past weekend it suddenly started to act up. After some
troubleshooting I can now say that the guests (domU) seem to lose their
outgoing network path: from the console I can see that the TX counter is
stuck at the same value, but there are no errors. It behaves as if
whatever I try to connect to isn't there.
I can reboot the guest but the problem stays; TX stays at 0 while RX
counts up.
Rebooting the host (dom0) solves the problem for a few hours (seems to
be 2-6h).
I tried to find the cause but don't know where to look. The closest I
got was narrowing it down to this: no network traffic goes out from any
domU, and once it happens the domU MAC is nowhere to be found outside
the domU (checked brctl showmacs and on the switch).
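A stuck TX counter like the one described above can also be checked
without watching the console. The following is only a sketch: it assumes
the standard Linux sysfs counters under /sys/class/net, and the function
name tx_stalled and its two arguments are made up for illustration. An
idle interface will also show an unchanged counter, so it only means
something while the guest is actively trying to send (for example during
a ping).

```shell
#!/bin/sh
# Sketch: list interfaces whose TX packet counter did not change between
# two samples. Arguments (both hypothetical, for illustration):
#   $1  directory holding per-interface counters (default /sys/class/net)
#   $2  seconds to wait between the two samples (default 10)
tx_stalled() {
    net_dir=${1:-/sys/class/net}
    interval=${2:-10}

    # First sample: one "name count" line per interface.
    before=$(for d in "$net_dir"/*; do
        [ -r "$d/statistics/tx_packets" ] || continue
        echo "$(basename "$d") $(cat "$d/statistics/tx_packets")"
    done)

    sleep "$interval"

    # Second sample: print the interfaces whose counter is unchanged.
    echo "$before" | while read -r name count; do
        [ -n "$name" ] || continue
        now=$(cat "$net_dir/$name/statistics/tx_packets" 2>/dev/null) || continue
        if [ "$now" = "$count" ]; then
            echo "$name"
        fi
    done
    return 0
}
```

Run inside a domU as "tx_stalled /sys/class/net 30" while a ping is
going; if eth0 is listed, the guest is not getting packets out.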
What bothers me most is that it worked fine up until Sunday. I was even
out of town for a few days before, so I didn't change anything.
Also, why does it work for a while after a reboot?
My setup is not that strange. I have one domU as firewall and another in
two DMZs, so I have my own network-bridge script that calls the stock
openSUSE script:
for i in $(seq 0 4); do
    $dir/network-bridge "$@" vifnum=$i netdev=eth$i bridge=xenbr$i
    /usr/sbin/ethtool -K eth$i tx off
done
and this gives

# brctl show
bridge name     bridge id               STP enabled     interfaces
xenbr0          8000.fefffffff000       no              vif0.0
                                                        peth0
                                                        vif2.0
                                                        vif4.0
xenbr1          8000.fefffffff001       no              vif0.1
                                                        peth1
                                                        vif2.1
                                                        vif3.0
xenbr2          8000.fefffffff002       no              vif0.2
                                                        peth2
                                                        vif1.0
                                                        vif2.2
xenbr3          8000.fefffffff003       no              vif0.3
                                                        peth3
                                                        vif2.3
xenbr4          8000.00508bcfd44d       no              eth4
                                                        vif2.4
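In the listing above, four bridges carry a pethN device and one does
not. A check for that pattern could look roughly like the sketch below.
It assumes the usual "brctl show" layout (one header line, then one line
per bridge followed by interface-only continuation lines); the function
name bridges_without_peth is made up for illustration.

```shell
#!/bin/sh
# Sketch: read "brctl show" output on stdin and print every bridge that
# has no pethN interface attached.
bridges_without_peth() {
    awk '
        NR == 1 { next }                 # skip the header line
        NF >= 3 {                        # this line starts a new bridge
            bridge = $1
            order[++n] = bridge
            has_peth[bridge] = 0
            iface = (NF >= 4) ? $4 : ""  # first interface, if any
        }
        NF == 1 { iface = $1 }           # continuation line: interface only
        iface ~ /^peth[0-9]+$/ { has_peth[bridge] = 1 }
        END {
            for (i = 1; i <= n; i++)
                if (!has_peth[order[i]]) print order[i]
        }
    '
}
```

Usage would be "brctl show | bridges_without_peth"; with the output
above it prints xenbr4.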
The kernel and Xen are the stock openSUSE ones:

# xm info
host                   : enterprise
release                : 2.6.22.17-0.1-xen
version                : #1 SMP 2008/02/10 20:01:04 UTC
machine                : x86_64
nr_cpus                : 2
nr_nodes               : 1
sockets_per_node       : 1
cores_per_socket       : 2
threads_per_core       : 1
cpu_mhz                : 2611
hw_caps                : 178bfbff:ebd3fbff:00000000:00000010:00002001:00000000:0000001f
total_memory           : 4031
free_memory            : 0
max_free_memory        : 1106
max_para_memory        : 1102
max_hvm_memory         : 1091
xen_major              : 3
xen_minor              : 1
xen_extra              : .0_15042-51.3
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          : 15042
cc_compiler            : gcc version 4.2.1 (SUSE Linux)
cc_compile_by          : abuild
cc_compile_domain      : suse.de
cc_compile_date        : Thu Dec 20 19:57:34 UTC 2007
xend_config_format     : 4

So, where should I look for problems?
/ps
_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
Peter Sjoberg
2008-May-23 01:29 UTC
Re: [Xen-users] outgoing domU network dies after 135-194 minutes
Done some more testing and found that
* it seems like it always dies after 2h15m-2h45m
* when it dies, it dies for all domUs and on all ports at the same time
  (or at least within 1 minute)
* external traffic to and from dom0 works fine all the time
* traffic peth0 -> vif1.0 -> domU eth0 works fine (tcpdump in domU shows
  packets)
* traffic domU -> eth0 -> vif1.0 dies directly: the eth0 TX counter
  doesn't change and tcpdump on vif1.0 shows no outgoing traffic (only
  incoming)
* I restarted one domU after an hour but it died at the same time as the
  others, so it seems tied to the uptime of dom0

* the firewall rules don't change over time, and besides, I have
  everything open
* all domUs are paravirtualized

I'm at a loss as to where to look. I have started to move some things
over to a second system, but it can't handle a full failover (not enough
disk and no backup tape), so I need to figure out what's going on here.

What is the common software that can affect all 4 domUs on all 5 network
ports (vif[1-4].*) but not dom0?

/ps

On Wed, 2008-05-21 at 07:56 -0400, Peter Sjoberg wrote:
> I have a OpenSuse 10.3 with xen 3.1.0 running and it's been running
> fine for a few months.
Peter Sjoberg
2008-Jun-02 03:59 UTC
Re: [Xen-users] outgoing domU network dies after 135-194 minutes
For the list/archives: I think I found the issue but can't understand
why it suddenly became an issue.
I have 5 network ports on the host. When I start up I get 4 properly
configured xenbrX bridges, but the 5th, xenbr4, is missing peth4 and
gets the real MAC assigned.

xenbr4          8000.009027bea5ee       no              eth4
                                                        vif1.4

xenbr4 is only used by the firewall, and since it worked I didn't change
anything. When this problem started I looked for different things and
fixed minor issues like this one by one. After doing

echo 'options netloop nloopbacks=16' >/etc/modprobe.d/netloop.local

to get enough loopback devices to properly configure peth4, it stopped
dying. I removed it, rebooted, and it died again, so now I have it in.
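To check whether a host has enough loopback devices for the number of
bridges it sets up, something like the sketch below might help. It
assumes that each netloop loopback shows up in dom0 as a vif0.N
interface, which matches the brctl output earlier in the thread (vif0.0
to vif0.3 on the four working bridges) but is otherwise an assumption.
The function names are made up for illustration.

```shell
#!/bin/sh
# Sketch: count the vif0.N names in a list of interface names on stdin.
count_loopbacks() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c '^vif0\.[0-9][0-9]*$' || true
}

# Sketch: compare the loopbacks found on stdin with the number of
# bridges wanted ($1). Returns 1 if there are too few.
check_loopbacks() {
    needed=$1
    have=$(count_loopbacks)
    if [ "$have" -ge "$needed" ]; then
        echo "ok: $have loopback devices for $needed bridges"
    else
        echo "short: $have loopback devices for $needed bridges"
        return 1
    fi
}
```

On this setup the call would be "ls /sys/class/net | check_loopbacks 5",
since the bridge script loops over eth0 to eth4.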
What I wonder most is why it suddenly decided to start breaking the
system "out of the blue" and why the second server (which took over
firewall duty when this one died) never got the same issue.
/ps