Hi,

I have been tracking a bug affecting all my servers running Debian Squeeze for more than a month now, and I desperately need your help :)

I have 10 Sun v20z servers (2 x 66GB SCSI disks in RAID 1, i.e. mirrored). 4 of them are running Debian Squeeze with the latest Debian Xen kernel (2.6.32-5-xen-amd64 == 2.6.32-29). The rest are running Debian Lenny (2.6.26-2-xen-amd64 == 2.6.26-26lenny1).

On a Squeeze box, under very high IO (such as running an IO stress test, e.g. bonnie++), the server starts behaving weirdly and I see messages like these in kernel.log: [see attachment]. Then the server becomes totally unresponsive (but doesn't "freeze") and commands such as "ls" or "reboot" don't work anymore. I have to do a hard reboot. After the server has rebooted, the RAID array seems degraded (I am using the mpt-status command) and starts rebuilding. After several hours, the RAID array is "fine" ("clean").

The RAID controller is an "LSI53C1030" U320, with driver "Fusion MPT SPI Host driver 3.04.06". I have attached the output of "lsmod".

None of my Lenny boxes are affected by this issue; all of my Squeeze boxes are.

What does it have to do with Xen? When I boot my Squeeze boxes without the Xen hypervisor but with the same Xen kernel, bonnie++ runs absolutely fine. The issue appears only with the Xen hypervisor loaded.

There is a Debian bug report for this: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=603727

Any suggestions?

Thanks!

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
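The message above mentions watching the array with mpt-status after each hard reboot. A minimal sketch of pulling just the volume state out of that kind of output could look like this. The line layout it parses (a "vol_id" line containing "state SOMETHING,") is an assumption, the sample text is illustrative and was not captured from these servers, and volume_states is a helper name made up for this example.

```shell
#!/bin/sh
# volume_states: read mpt-status-style text on stdin and print the word that
# follows "state" on each volume line (lines containing "vol_id").
# The line layout is an assumption; adjust the patterns to what your
# mpt-status actually prints.
volume_states() {
    awk '/vol_id/ {
        for (i = 1; i < NF; i++)
            if ($i == "state") { s = $(i + 1); sub(/,$/, "", s); print s }
    }'
}

# Illustrative sample only, not output from the servers in this thread.
sample='ioc0 vol_id 0 type IM, 2 phy, 68 GB, state DEGRADED, flags ENABLED
ioc0 phy 1 scsi_id 1 EXAMPLE DISK 0001, 68 GB, state ONLINE, flags NONE'

# Only the volume line is reported; the per-disk line is skipped.
printf '%s\n' "$sample" | volume_states
```

On a real box the input would come from piping the mpt-status output into volume_states, for instance from a cron job that mails you when the state is anything other than the healthy one.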
On Sat, Jan 29, 2011 at 04:27:25PM +0100, Jordan Pittier wrote:
> Any suggestions?

Did you check whether LSI has a newer driver version available?

Also, you might check which driver version RHEL6 or SLES11SP1 ships with, for example. Both of those distros have 2.6.32 kernels too.

On one of my test boxes I have to upgrade the LSI driver to a newer version to make it work. That one is a SAS-based LSI, though.

Can you try using another disk controller?

Also: did you try the latest kernel (-30)?

-- Pasi
Thanks for your reply. LSI does indeed have a newer driver for the controller, but I can't build it: there's an error when I try to compile it [see attachment]. I will give it another try in the next few days.

What is puzzling is that the IO errors only occur with the Xen hypervisor. I am 100% willing to accept that the problem is the driver, but how come the exact same kernel (the xenified one) works fine without Xen loaded? I am almost a noob with kernels and drivers, but I thought the drivers were entirely in the kernel.

I will try the latest kernel in a few days.

SLES11SP1 ships mptfusion 4.22 (http://www.novell.com/linux/releasenotes/x86_64/SUSE-SLES/11-SP1/#driver-updates-storage). I don't know about RHEL.

On Sat, Jan 29, 2011 at 6:02 PM, Pasi Kärkkäinen <pasik@iki.fi> wrote:
> Did you check if LSI has newer driver version available?
>
> Also you might check which driver version for example RHEL6
> or SLES11SP1 ships with.. both of those distros have 2.6.32 kernels too.
>
> Also: Did you try using the latest kernel (-30) ?
On Sat, Jan 29, 2011 at 07:03:16PM +0100, Jordan Pittier wrote:
> What is puzzling is that the IO errors only occur with the Xen hypervisor. I am
> 100% willing to accept that the problem is the driver, but how come
> the exact same kernel (the xenified one) works fine without Xen
> loaded? I am almost a noob with kernels and drivers, but I thought
> the drivers were entirely in the kernel.

Yep, the driver is entirely in the kernel, but that's not the whole story.

The Xen dom0 kernel does IRQ handling through the Xen hypervisor, so some drivers may behave differently on bare metal vs. in dom0.

Also remember that dom0 is a *vm*, so some timing-related things might happen differently on bare metal vs. in dom0.

> I will try the latest kernel in a few days.
>
> SLES11SP1 ships mptfusion 4.22
> (http://www.novell.com/linux/releasenotes/x86_64/SUSE-SLES/11-SP1/#driver-updates-storage)
> I don't know about RHEL.

What driver version does the Squeeze kernel have?

-- Pasi
On Sat, Jan 29, 2011 at 7:25 PM, Pasi Kärkkäinen <pasik@iki.fi> wrote:
> Xen dom0 kernel does irq handling through Xen hypervisor,
> so that might make some drivers behave in a different way baremetal vs. dom0.

OK, so the driver is a good candidate for being responsible for this SCSI craziness.

> What driver version does the squeeze kernel have?

3.04, which seems to be several years old. There are lots of users complaining about LSI drivers all over the Internet.

I will keep you posted as soon as I manage to build the latest driver.
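Since the thread compares the 3.04.06 driver reported on the Squeeze boxes with the 4.22 that SLES11SP1 ships, a quick way to check which version a box would load, and whether it is older than a reference version, might look like the sketch below. It assumes the SPI driver module is named mptspi (check your own lsmod output), that modinfo reports a version field for it, and that GNU sort with -V is available. version_lt and installed_version are helper names made up for this example.

```shell
#!/bin/sh
# version_lt A B: succeed (exit 0) when version A sorts strictly before B.
# Hypothetical helper for this example; relies on GNU "sort -V".
version_lt() {
    [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# installed_version MODULE: version field reported by modinfo for MODULE.
# Prints "unknown" if the module is absent or carries no version field.
installed_version() {
    v=$(modinfo -F version "${1:-mptspi}" 2>/dev/null) || v=""
    echo "${v:-unknown}"
}

reference="4.22"    # what SLES11SP1 ships, per the thread
current=$(installed_version mptspi)

if [ "$current" = "unknown" ]; then
    echo "could not determine the mptspi driver version"
elif version_lt "$current" "$reference"; then
    echo "mptspi $current is older than $reference"
else
    echo "mptspi $current is at least $reference"
fi
```

A version comparison only tells you that a newer driver exists, not that the newer one fixes the problem; it is a starting point for the kind of check suggested earlier in the thread.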
On Sat, Jan 29, 2011 at 07:32:31PM +0100, Jordan Pittier wrote:
> OK, so the driver is a good candidate for being responsible for this SCSI craziness.

I'm not sure if it is, but it *could* be.

> > What driver version does the squeeze kernel have?
> 3.04, which seems to be several years old. There are lots of users
> complaining about LSI drivers all over the Internet.
> I will keep you posted as soon as I manage to build the latest driver.

See here for tips on how to build an updated megaraid_sas driver:
http://lists.xensource.com/archives/html/xen-devel/2010-11/msg00250.html

Maybe it also helps with your driver.

-- Pasi
Hi,

I finally managed to compile the LSI MPT Fusion 4.22 driver. I took the source from kernel 2.6.34 shipped with SLES, then slightly changed the driver sources to "backport" it to the Debian 2.6.32 kernel.

Now my servers seem 100% stable, so I am very happy :) Thanks for your big hint about a possibly outdated driver.

Jordan

On Sat, Jan 29, 2011 at 7:49 PM, Pasi Kärkkäinen <pasik@iki.fi> wrote:
> See here for tips how to build updated megaraid_sas driver:
> http://lists.xensource.com/archives/html/xen-devel/2010-11/msg00250.html
>
> Maybe it helps also with your driver.
On Wed, Feb 02, 2011 at 11:43:29PM +0100, Jordan Pittier wrote:
> Now my servers seem 100% stable, so I am very happy :) Thanks for your
> big hint about a possibly outdated driver.

Good to hear it helped!

-- Pasi
Guido Hecken
2011-Feb-03 07:26 UTC
[Xen-users] Startup-script changing firewall settings each time domu (re)starts or gets created
Hi list,

does anyone have an idea where to put a custom startup script in addition to the default scripts (network-bridge and vif-bridge)?

I have a Xen bridge setup running fine and want to put some firewall rules in place and have them refreshed every time a particular domU is created or (re)started.

Something like this:

...
INTERFACE=`xm list $NAME | tail -1 | awk '{print $2}'`
iptables -A FORWARD -m physdev --physdev-in vif${INTERFACE}.0 -j $IN
iptables -A $IN -s 192.168.161.82 -p tcp --sport 3389 -d 192.168.161.216 -j ACCEPT
...

The script is working fine and can be executed manually with the desired results.

Any input is highly welcome.

Guido
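One way to make the snippet in this message easier to re-run on every domU start is to wrap it in functions that take the domain ID and the chain name as arguments. The sketch below does only that and does not answer where Xen should call it from. domu_id, domu_rules, run and the DRY_RUN switch are names invented for this example; the addresses and port are the ones from the message; and it assumes, as the original snippet does, that "xm list NAME" prints the domain ID in the second column of its last line and that the target chain already exists.

```shell
#!/bin/sh
# run: execute a command, or only print it when DRY_RUN=1
# (DRY_RUN is a hypothetical switch for this example).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# domu_id NAME: domain ID of a running domU, taken from "xm list" the same
# way as the original snippet (second column of the last output line).
domu_id() {
    xm list "$1" | tail -1 | awk '{print $2}'
}

# domu_rules DOMID CHAIN: add the rules from the message for vif<DOMID>.0.
# CHAIN must already exist, as in the original snippet.
domu_rules() {
    vif="vif$1.0"
    chain="$2"
    run iptables -A FORWARD -m physdev --physdev-in "$vif" -j "$chain"
    run iptables -A "$chain" -s 192.168.161.82 -p tcp --sport 3389 \
        -d 192.168.161.216 -j ACCEPT
}

# Example: only print the commands that would be run for domain ID 7.
DRY_RUN=1
domu_rules 7 IN_EXAMPLE
```

With DRY_RUN unset, a real call would be something like domu_rules "$(domu_id "$NAME")" "$IN". Because -A appends, calling this on every restart would pile up duplicate rules unless the old ones are removed first, for instance with the matching -D commands.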