Hardware: X4150, 8 cores, 8GB RAM; 2 off

Dom0: b122, limited to 2 cores, 2GB RAM
DomUs: 4 Solaris 10u7, 2 with 1 core 1GB and 2 with 2 cores 2GB
The single-core DomUs act as DHCP/LDAP servers, the two-core DomUs act as fileservers.

Problem: Under heavy load the network on the DomU stops working. So for example rsync between the servers on different metal will work fine for up to 30 minutes and then stop. Cannot ping out of or in to the DomU, but ifconfig and dladm appear to show everything OK. Reboot the DomU and it is all back to normal. It also happens if the LDAP and DHCP server comes under heavy load (class of 150 trying to log in to workstations simultaneously). When lightly loaded the machines have been OK.
The ports are at 1000Mb, but I don't remember the problem occurring when they were going at 100Mb.

So, several questions.

Has anyone else seen this?

What can I look at to track down the cause? I cannot see anything obvious in the logs.

Two of the DomUs on each machine have dedicated network ports. At present they are bridged through Dom0; I believe that in this situation I can go straight to the port from the DomU. Is how to do this documented anywhere, and can I make the change without having to rebuild the DomU?

Relevant part of xml file:

    <emulator>/usr/lib/xen/bin/qemu-dm</emulator>
    <interface type='bridge'>
      <mac address='52:54:00:11:48:0f'/>
      <source bridge='e1000g2'/>
      <script path='/usr/lib/xen/scripts/vif-vnic'/>
      <target dev='vif11.0'/>
    </interface>

Thanks for any help

John

--
John Landamore
Department of Computer Science
University of Leicester
University Road, LEICESTER, LE1 7RH
J.Landamore@mcs.le.ac.uk
Phone: +44 (0)116 2523410 Fax: +44 (0)116 2523604
J. Landamore wrote:
> problem: Under heavy load the network on the DomU stops working. So for
> example rsync between the servers on different metal will work fine for up
> to 30 minutes and then stop. Cannot ping out of or in to the DomU, but
> ifconfig and dladm appear to show everything OK. Reboot the DomU and it
> is all back to normal. Happens if the LDAP and DHCP server comes under
> heavy load (class of 150 trying to log in to workstations simultaneously)
> When lightly loaded the machines have been Ok
> The ports are at 1000Mb, but I don't remember the problem occurring when
> they were going at 100Mb

Can you do a kstat xnbo on dom0 when this happens?

> So, several questions.
>
> Has anyone else seen this?

I know Dave had run into a network hang problem that he fixed... Not sure if that fix has made it back to the gate. Dave?

> What can I look at to track down the cause, I cannot see anything obvious
> in the logs?

kstats on the interface in dom0 and domU would help.

> Two of the DomUs on each machine have dedicated network ports. At present
> they are bridged through Dom0, I believe that in this situation I can go
> straight to the port from the DomU. Is how to do this documented anywhere
> and can I make the change without having to rebuild the DomU?

Not quite sure what you're asking. Can you give a little more detail on what you want to do?

Thanks,

MRJ
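A minimal sketch of how the kstat output requested above might be captured while a hang is in progress. The helper name snapshot_kstat and the output file naming are illustrative assumptions, not something from the thread; only the xnbo module name comes from Mark's message, and the module to query inside the domU is left as a parameter because the thread does not name it.

```shell
# Hypothetical helper: save a timestamped kstat snapshot for one module.
# Usage: snapshot_kstat MODULE [OUTDIR]
snapshot_kstat() {
    module=$1
    outdir=${2:-.}
    stamp=$(date +%Y%m%d-%H%M%S)
    outfile="$outdir/kstat-$module-$stamp.txt"
    # -p prints parseable name/value pairs; -m selects the kernel module.
    kstat -p -m "$module" > "$outfile" || return 1
    # Print the file name so the caller can attach it to a reply.
    echo "$outfile"
}
```

On dom0 this would be called as `snapshot_kstat xnbo /var/tmp` while the guest's network is stuck; taking a second snapshot a few seconds later makes it easier to see which counters have stopped moving.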
On Oct 9 2009, Mark Johnson wrote:
> Can you do a kstat xnbo on dom0 when this happens?

Might be a day or two as I'm off for a couple of days, but will do.

> kstats on the interface in dom0 and domU would help.

Will do as soon as I'm back.

> Not quite sure what you're asking. Can you give a little more detail
> on what you want to do?

As you know the 4150 has 4 e1000g ports. Our configuration is that Dom0 uses e1000g0, DomU[0] uses e1000g1, DomU[1] uses e1000g2 and DomU[2]&[3] use e1000g3.

At present all the DomUs have the networking defined as per the snippet of xml in my original post which, from my understanding, means network traffic goes through the Dom0 via a vnic before leaving the box on the specified port. I thought I had seen a comment in a post some months ago that in the situation we are in for DomU[0] and DomU[1] the "scripts/vif-vnic" could be replaced by "scripts/vif-dedicated" (or something similar) and with some other reconfiguration network traffic could go directly from DomU[0] to e1000g1, bypassing the vnic through Dom0, similarly for DomU[1] and e1000g2. Have I misunderstood the situation?

Thanks

John
On 9 Oct 2009, at 6:30pm, Mark Johnson wrote:
>> So, several questions.
>> Has anyone else seen this?
>
> I know Dave had run into a network hang problem that he fixed... Not
> sure if that fix has made it back to the gate. Dave?

6855136 made it into build 123. The behaviour would seem to match that described. If it's easily reproducible then some investigation with kmdb should allow us to confirm that it's the same problem.
jal@mcs.le.ac.uk wrote:
> On Oct 9 2009, Mark Johnson wrote:
>> Can you do a kstat xnbo on dom0 when this happens?
>
> Might be a day or two as I'm off for a couple of days, but will do
>
>> kstats on the interface in dom0 and domU would help.
>
> Will do as soon as I'm back

From Dave's reply, it sounds like this is fixed in b123. I would try switching to b124 (seems like a stable build to me) and see if this fixes your problem.

> I thought I had seen a comment in a post some months ago
> that in the situation we are in for DomU[0] and DomU[1] the
> "scripts/vif-vnic" could be replaced by "scripts/vif-dedicated" (or
> something similar) and with some other reconfiguration network traffic
> could go directly from DomU[0] to e1000g1, bypassing the vnic through
> Dom0, similarly for Domu[1] and e1000g2. Have I mis-understood the
> situation?

Ah, yes... There are two hotplug scripts, vif-vnic for using a vnic and /usr/lib/xen/scripts/vif-dedicated for dedicating a NIC to a guest.

    <interface type='ethernet'>
      <mac address='XXX'/>
      <script path='/usr/lib/xen/scripts/vif-vnic'/>
      <target dev='vif-1.0'/>
    </interface>

I haven't done this myself, but I believe you just need to replace the script path. With the guest shut down, back up your guest config, then virsh edit <guest> and change the script path:

    xm list -l <guest> > ./my-guest-backup.sxp
    virsh edit <guest>
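The script-path change described above could be previewed on a saved copy of the guest XML before touching the live definition. This is only a sketch under assumptions: the function name use_dedicated_script is hypothetical, and nobody in the thread had confirmed at this point that swapping the path is sufficient.

```shell
# Hypothetical filter: read guest XML on stdin and write it to stdout
# with the vif-vnic hotplug script path replaced by vif-dedicated.
# All other lines pass through unchanged.
use_dedicated_script() {
    sed 's|/usr/lib/xen/scripts/vif-vnic|/usr/lib/xen/scripts/vif-dedicated|'
}
```

For example, `virsh dumpxml <guest> | use_dedicated_script` prints what the edited definition would look like, so it can be compared with the backup before making the same change by hand in virsh edit.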
On 12 Oct 2009, at 2:30pm, Mark Johnson wrote:
> Ah, yes... There are two hotplug scripts, vif-vnic for using a vnic
> and /usr/lib/xen/scripts/vif-dedicated for dedicating a NIC to a
> guest.
>
>     <interface type='ethernet'>
>       <mac address='XXX'/>
>       <script path='/usr/lib/xen/scripts/vif-vnic'/>
>       <target dev='vif-1.0'/>
>     </interface>
>
> I haven't done this myself, but I believe you just need to replace the
> script path.

...and ensure that the 'bridge' parameter specifies the link that you plan to dedicate to the guest.
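Both conditions mentioned so far (the vif-dedicated script path and a 'bridge' value naming the link) can be checked mechanically before starting the guest. The sketch below assumes that the 'bridge' parameter corresponds to the <source bridge='...'/> element shown in the original post's XML; that mapping is an assumption, and the function name check_dedicated_iface is hypothetical.

```shell
# Hypothetical check: succeeds only if the XML on stdin names
# vif-dedicated as the hotplug script and also has a non-empty
# <source bridge='...'/> value (assumed to be the 'bridge' parameter).
check_dedicated_iface() {
    xml=$(cat)
    printf '%s\n' "$xml" | grep -q 'scripts/vif-dedicated' || {
        echo "script path is not vif-dedicated" >&2; return 1; }
    printf '%s\n' "$xml" | grep -q "<source bridge=['\"][^'\"]" || {
        echo "no <source bridge=...> link specified" >&2; return 1; }
    return 0
}
```

A typical use would be `virsh dumpxml <guest> | check_dedicated_iface && echo looks-ok`. The check only looks at the text of the definition; it does not verify that the named link exists or is free on dom0.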
Thanks to both you and Dave for your replies. I'll upgrade to b124 and change to a dedicated NIC and see what happens.

Thanks again

John
john - did you have any success with this? my attempts at using vif-dedicated have so far failed :/

i keep getting:

    error: POST operation failed: xend_post: error from xen daemon: (xend.err 'Device 1 (vif) could not be connected. error: no NIC specified at backend/vif/16/1/bridge.')

when trying to start the vm.

if you've succeeded i'd love to see your xml fragment

p

--
This message posted from opensolaris.org