greno@verizon.net
2010-Jun-09 21:02 UTC
[Xen-devel] Xen pv_ops dom0 2.6.32.13 issues
Ok, I've been running this 2.6.32.13 pv_ops dom0 kernel for several weeks and it has twice killed my domU's. I get numerous CPU soft lockup bug errors, and at times it will freeze, which means a power-cycle boot. This has resulted in things like:

EXT-fs error (device dm-0): ext4_lookup: deleted inode reference
EXT-fs error (device dm-0): ext4_lookup: deleted inode reference

in the domU boots, which has killed two of them. I had to rebuild the images from backups. The domU's are running 2.6.32-16-server domU kernels (Ubuntu).

-Gerry

May 19, 2010 09:01:00 PM, greno@verizon.net wrote:

On 05/19/2010 05:24 PM, Jeremy Fitzhardinge wrote:
> On 05/19/2010 02:09 PM, Gerry Reno wrote:
>> I am using a pv_ops dom0 kernel 2.6.32.12 with xen 4.0.0-rc8. My
>> domU's use 2.6.31-14-server ubuntu.
>>
>> When I try to ping another computer on the network from the domU I
>> still receive this error:
>> Attempting to checksum a non-TCP/UDP packet, dropping a protocol 1 packet
>>
>> I thought this error was fixed somewhere around 2.6.32.10 but
>> apparently it is still in 2.6.32.12.
>>
>> How do I get around this problem?
>
> I applied a patch to fix up a checksum bug in netback, but I realized I
> hadn't applied it to stable-2.6.32. Please try again (it will be
> 2.6.32.13) and tell me how it goes.
>
> Thanks,
> J

Jeremy, that fixed it. Thanks.

-Gerry
Jeremy Fitzhardinge
2010-Jun-09 22:37 UTC
Re: [Xen-devel] Xen pv_ops dom0 2.6.32.13 issues
On 06/09/2010 02:02 PM, greno@verizon.net wrote:
> Ok, I've been running this 2.6.32.13 pv_ops dom0 kernel for several
> weeks and it has twice killed my domU's. I get numerous CPU soft
> lockup bug errors, and at times it will freeze, which means a
> power-cycle boot.

The lockups are in dom0 or domU? Do the backtraces indicate a common subsystem, or are they all over the place?

> This has resulted in things like:
> EXT-fs error (device dm-0): ext4_lookup: deleted inode reference
> EXT-fs error (device dm-0): ext4_lookup: deleted inode reference
> in the domU boots, which has killed two of them.

What's your storage path from guest device to media? Are they using barriers?

J
greno@verizon.net
2010-Jun-09 23:05 UTC
Re: Re: [Xen-devel] Xen pv_ops dom0 2.6.32.13 issues
Jeremy,
The soft lockups seemed to be occurring in different systems, and I could never make sense out of what was triggering them. I have not mounted any file systems with "nobarrier" in guests. The guests are all a single /dev/xvda. The underlying physical hardware is LVM over RAID-1 arrays. I'm attaching dmesg, kern.log, and messages in case these might be useful.

-Gerry

Jun 9, 2010 06:38:00 PM, jeremy@goop.org wrote:

The lockups are in dom0 or domU? Do the backtraces indicate a common subsystem, or are they all over the place?

What's your storage path from guest device to media? Are they using barriers?
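A quick sketch for double-checking the barrier question from inside a guest (standard commands, but the exact log wording varies by kernel version):

    # Show the mount options actually in effect for the root filesystem;
    # ext4 enables barriers by default on 2.6.32, so look for an explicit
    # "nobarrier" or "barrier=0" that would disable them.
    grep xvda /proc/mounts

    # The journal layer logs barrier failures (e.g. "barrier-based sync
    # failed ... disabling barriers"), so grep for them after a crash.
    dmesg | grep -i barrier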
Jeremy Fitzhardinge
2010-Jun-09 23:11 UTC
Re: [Xen-devel] Xen pv_ops dom0 2.6.32.13 issues
On 06/09/2010 04:05 PM, greno@verizon.net wrote:
> Jeremy,
> The soft lockups seemed to be occurring in different systems, and I
> could never make sense out of what was triggering them. I have not
> mounted any file systems with "nobarrier" in guests. The guests are
> all a single /dev/xvda. The underlying physical hardware is LVM over
> RAID-1 arrays. I'm attaching dmesg, kern.log, and messages in case
> these might be useful.

Using what storage backend? blkback? blktap2?

J
greno@verizon.net
2010-Jun-09 23:27 UTC
Re: Re: [Xen-devel] Xen pv_ops dom0 2.6.32.13 issues
blkback.

Jun 9, 2010 07:13:23 PM, jeremy@goop.org wrote:

Using what storage backend? blkback? blktap2?
Jeremy Fitzhardinge
2010-Jun-09 23:37 UTC
Re: [Xen-devel] Xen pv_ops dom0 2.6.32.13 issues
On 06/09/2010 04:27 PM, greno@verizon.net wrote:
> blkback.

Using phy: in your config file? That really isn't recommended because it has poor integrity; the writes are buffered in dom0 so writes can be reordered or lost on crash, and the guest filesystem can't maintain any of its own integrity guarantees.

tap:aio: is more resilient, since the writes go directly to the device without buffering.

That doesn't directly relate to your lockup issues, but it should prevent filesystem corruption when they happen.

J
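In config terms the switch is just the prefix on the domU disk line; a minimal sketch, with illustrative paths (neither device nor image name comes from this thread):

    # phy: exports a dom0 block device to the guest (the form discussed above):
    # disk = [ 'phy:/dev/vg0/guest-root,xvda,w' ]

    # tap:aio: serves a raw image file through blktap using direct IO:
    disk = [ 'tap:aio:/var/lib/xen/images/guest-root.img,xvda,w' ]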
greno@verizon.net
2010-Jun-09 23:52 UTC
Re: Re: [Xen-devel] Xen pv_ops dom0 2.6.32.13 issues
Thanks, I'll check into using tap:aio. I had tried once before and could not get it to work.

Here is my entry from pv-grub:

# pv-grub: tap:aio: will not work for disk, use file:
disk = [ "file:/var/lib/xen/images/CLOUD-CC-1.img,xvda,w" ]

I had difficulty getting tap:aio to work with disk. I can't remember if that problem was just with pv-grub or with dom0 in general. This was about 6 months ago. I guess that is no longer a problem now?

Jun 9, 2010 07:37:46 PM, jeremy@goop.org wrote:

Using phy: in your config file? That really isn't recommended because it has poor integrity; the writes are buffered in dom0 so writes can be reordered or lost on crash, and the guest filesystem can't maintain any of its own integrity guarantees.

tap:aio: is more resilient, since the writes go directly to the device without buffering.
Jeremy Fitzhardinge
2010-Jun-10 00:47 UTC
Re: [Xen-devel] Xen pv_ops dom0 2.6.32.13 issues
On 06/09/2010 04:52 PM, greno@verizon.net wrote:
> Thanks, I'll check into using tap:aio. I had tried once before and
> could not get it to work.
>
> Here is my entry from pv-grub:
> # pv-grub: tap:aio: will not work for disk, use file:
> disk = [ "file:/var/lib/xen/images/CLOUD-CC-1.img,xvda,w" ]

Ah, file: is even worse than using phy:.

> I had difficulty getting tap:aio to work with disk. I can't remember
> if that problem was just with pv-grub or with dom0 in general. This
> was about 6 months ago. I guess that is no longer a problem now?

I use it with no problems.

J
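Applied to the disk line quoted above, that would presumably read as follows (an untested sketch using the same image path; whether pv-grub now accepts it is exactly the open question here):

    disk = [ "tap:aio:/var/lib/xen/images/CLOUD-CC-1.img,xvda,w" ]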
Hello Jeremy,

Jeremy Fitzhardinge wrote:
> Using phy: in your config file? That really isn't recommended because it
> has poor integrity; the writes are buffered in dom0 so writes can be
> reordered or lost on crash, and the guest filesystem can't maintain any
> of its own integrity guarantees.
>
> tap:aio: is more resilient, since the writes go directly to the device
> without buffering.

Do you mean that using tap:aio with a disk image is preferred over using phy: with an LVM device?

Best Regards
Jens Friedrich aka Neobiker (neobiker.de)
Valtteri Kiviniemi
2010-Jun-11 17:42 UTC
Re: [Xen-devel] Which disk backend to use in domU?
Hi,

I am also using phy: with LVM partitions, and I would also like to know if there is a better or more preferred way.

- Valtteri Kiviniemi

Neobiker wrote:
> Do you mean that using tap:aio with a disk image is preferred over
> using phy: with an LVM device?
Valtteri Kiviniemi
2010-Jun-11 17:53 UTC
Re: [Xen-devel] Which disk backend to use in domU?
Hi,

Ah, misunderstanding, sorry - you were talking about disk images :)

- Valtteri Kiviniemi
Hi,

Valtteri Kiviniemi wrote:
> Hi, Ah, misunderstanding, sorry - you were talking about disk images :)

I'm talking about this config:

disk = [
    'phy:/dev/vm/vm01,xvda1,w',
    'phy:/dev/vm/vm01-swap,xvda2,w',
    'phy:/dev/daten/devel_debian_amd64,xvda3,w',
]

BR
neobiker
Jeremy Fitzhardinge
2010-Jun-14 10:49 UTC
Re: [Xen-devel] Which disk backend to use in domU?
On 06/11/2010 07:11 PM, Neobiker wrote:
> I'm talking about this config:
> disk = [
>     'phy:/dev/vm/vm01,xvda1,w',
>     'phy:/dev/vm/vm01-swap,xvda2,w',
>     'phy:/dev/daten/devel_debian_amd64,xvda3,w',
> ]

file: is definitely unsafe; its IO gets buffered in the dom0 pagecache, which means the guest's writes aren't really writes. I believe phy: has similar problems, whereas tap:aio: implements direct IO. But someone more storagey can confirm.

J
On Mon, 2010-06-14 at 06:49 -0400, Jeremy Fitzhardinge wrote:
> file: is definitely unsafe; its IO gets buffered in the dom0 pagecache,
> which means the guest's writes aren't really writes. I believe phy: has
> similar problems, whereas tap:aio: implements direct IO. But someone
> more storagey can confirm.

Unless there's a difference in type names between XCP and .org, 'phy' means a bare LUN plugged into blkback? Those run underneath the entire block cache subsystem, which ironically has caching issues of its own. But your writes are safe.

Daniel
On Mon, Jun 14, 2010 at 11:49:45AM +0100, Jeremy Fitzhardinge wrote:
> file: is definitely unsafe; its IO gets buffered in the dom0 pagecache,
> which means the guest's writes aren't really writes. I believe phy: has
> similar problems, whereas tap:aio: implements direct IO. But someone
> more storagey can confirm.

I thought phy: submits direct bios, bypassing the dom0 pagecache.

-- Pasi
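One rough way to probe this empirically - a hedged heuristic, not from the thread, and it only suggests rather than proves where caching happens - is to watch dom0's pagecache counters while a guest does sustained writes:

    # In dom0, while the guest writes a large file to its backing store:
    # if Cached/Dirty grow roughly in step with guest IO, dom0 is
    # buffering the writes; if they stay flat, the IO is likely direct.
    watch -n1 'grep -E "^(Cached|Dirty|Writeback):" /proc/meminfo'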