Mark Adams
2010-Nov-11 10:24 UTC
[Xen-users] pci-passthrough in pvops causing offline raid
Hi All,

Running xen 4.0.1-rc6, debian squeeze 2.6.32-21.

In a voip setup, where I have forwarded the onboard NIC interfaces through to domU using the following grub config:

module /vmlinuz-2.6.32-5-xen-amd64 placeholder root=UUID=25c3ac79-6850-498d-afcf-ea42970e94fd ro quiet xen-pciback.permissive xen-pciback.hide=(02:00.0)(03:00.0) pci=resource_alignment=02:00.0;03:00.0

I'm having a serious issue where the raid card goes offline after an indefinite period of time. Sometimes it runs fine for a week, other times only 1 day before I get "offline device" errors. Rebooting the machine fixes it straight away, and everything is back online.

What in the Xen pciback is causing the raid card to go offline? The only devices hidden are the 2 onboard NICs.

I know that this issue is with Xen, as I had this running on a different server (same xen setup) and it had the same issues, which I initially thought were to do with the raid card.

Are there known issues in this kernel and xen version with pciback? I'm going to update to the current package versions this evening (4.0.1-1 and 2.6.32-27), however I would appreciate it if anyone has any other insight into this issue, or even just a note to say it is a bug that has been fixed in current versions!

Thanks,
Mark

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
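The hide list on the module line above can be cross-checked against what the running kernel was actually booted with. A minimal sketch in Python, assuming the parenthesised (bus:slot.func) format shown in the post; the function name hidden_devices is illustrative and not part of any Xen tool:

```python
import re

def hidden_devices(cmdline):
    """Return the PCI addresses listed in xen-pciback.hide=... on a kernel command line."""
    match = re.search(r"xen-pciback\.hide=((?:\([^)\s]+\))+)", cmdline)
    if not match:
        return []  # no hide option present
    return re.findall(r"\(([^)]+)\)", match.group(1))

# Command line copied from the grub entry quoted in the post.
cmdline = ("root=UUID=25c3ac79-6850-498d-afcf-ea42970e94fd ro quiet "
           "xen-pciback.permissive xen-pciback.hide=(02:00.0)(03:00.0) "
           "pci=resource_alignment=02:00.0;03:00.0")
print(hidden_devices(cmdline))
```

On a live dom0 the same function could be fed the contents of /proc/cmdline instead of the hard-coded string.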
Olivier Hanesse
2010-Nov-11 11:13 UTC
Re: [Xen-users] pci-passthrough in pvops causing offline raid
Hello,

What is the module of your raid card? If it is "megaraid_sas", please try upgrading the version of that module (see other posts on the ML).

Regards
Mark Adams
2010-Nov-11 12:03 UTC
Re: [Xen-devel] Re: [Xen-users] pci-passthrough in pvops causing offline raid
Hi - It's not megaraid, it's an Areca card (arcmsr).

This is definitely something to do with the pciback. Anyone else got any ideas or views on this?

The domU is an HVM debian-squeeze instance.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Nov-11 16:53 UTC
[Xen-users] Re: [Xen-devel] pci-passthrough in pvops causing offline raid
On Thu, Nov 11, 2010 at 10:24:17AM +0000, Mark Adams wrote:
> What in the Xen pciback is causing the raid card to go offline? The
> only devices hidden are the 2 onboard NICs.

You need to give more details. Is the RAID card a 3Ware? An LSI? Do you run with an IOMMU? When the RAID card goes offline, do you see a stop of IRQs going to the device? Are the IRQs for the RAID card sent to all of your CPUs or just a specific one? Are you pinning your guests to specific CPUs? Does the issue disappear if you don't pass through the NIC interfaces? If so, have you run this setup for "a week" to make sure?

> I know that this issue is with Xen, as I had this running on a different
> server (same xen setup) and it had the same issues, which I initially
> thought were to do with the raid card.

So you never ran this setup on this kernel (2.6.32-5) without the Xen hypervisor?

> Are there known issues in this kernel and xen version with pciback?

No. It all works perfectly :-)

> I'm going to update to the current package versions this evening (4.0.1-1
> and 2.6.32-27), however I would appreciate it if anyone has any other insight
> into this issue, or even just a note to say it is a bug that has been
> fixed in current versions!

Well, there were issues with the LSI cards having a hidden PCI device. But those are pretty obvious as you can't even use it correctly. There is also a problem with the 3Ware 9506 IDE card - which on my box stops sending IRQs on the IOAPIC it has been assigned (28) and instead uses another one (17). Not sure if this is just the PCI card using the wrong PCI interrupt pin on the card and it ends up poking the wrong IOAPIC.
Mark Adams
2010-Nov-11 17:38 UTC
[Xen-users] Re: [Xen-devel] pci-passthrough in pvops causing offline raid
On Thu, Nov 11, 2010 at 11:53:40AM -0500, Konrad Rzeszutek Wilk wrote:
> You need to give more details. Is the RAID card a 3Ware? An LSI? Do you
> run with an IOMMU? When the RAID card goes offline, do you see a stop of
> IRQs going to the device? Are the IRQs for the RAID card sent to all of your
> CPUs or just a specific one? Are you pinning your guests to specific CPUs?
> Does the issue disappear if you don't pass through the NIC interfaces? If so, have
> you run this setup for "a week" to make sure?

It is an Areca 1220. I can't see anything when the device goes offline apart from

[77324.264270] sd 0:0:0:1: rejecting I/O to offline device
[77334.005854] sd 0:0:0:0: rejecting I/O to offline device

Unfortunately nothing gets logged because there is nothing to write to anymore. I'm not sure how I can see the IRQs otherwise. There is no pinning being done at all, and the machine was running for a few months OK before the pciback was added.

Is my kernel module line correct above? Are the xen-pciback.permissive and resource_alignment options required? Also I am passing through the onboard NICs - is this something that should be avoided or is it ok to do?

> So you never ran this setup on this kernel (2.6.32-5) without the Xen hypervisor?

No, it's always had the hypervisor - but it was running ok before the pciback options were added. This week, it seems to have happened approximately every 24 hours.

Thanks,
Mark
Richie
2010-Nov-11 17:40 UTC
Re: [Xen-users] Re: [Xen-devel] pci-passthrough in pvops causing offline raid
On 11/11/2010 11:53 AM, Konrad Rzeszutek Wilk wrote:
> Well, there were issues with the LSI cards having a hidden PCI device. But those
> are pretty obvious as you can't even use it correctly. There is also
> a problem with the 3Ware 9506 IDE card - which on my box stops sending IRQs
> on the IOAPIC it has been assigned (28) and instead uses another one (17).
> Not sure if this is just the PCI card using the wrong PCI interrupt pin on the
> card and it ends up poking the wrong IOAPIC.

Note: I have no idea if this would be related to your issue or whether my assessment is completely accurate.

I had an issue that I feel the debian squeeze kernel running under domU played a part in. My dom0 is 2.6.34.7 Xenified w/Andrew Lyon's patches and I am running Xen 4.0.2-pre (xen testing). I pass through a PCI tuner card but have not considered that this could also contribute.

Sometimes when I shut down the domU, upon halt I started getting libata style DRIVE_NOT_READY errors in my dom0. Either one drive would drop from my mdadm raid (which houses my lvm filesystems including root for dom0 and domU) or perhaps they would drop and cause a panic. A reboot fixes everything, though a rebuild would occur.

I was not able to capture those errors in the few times it happened, but I have since changed to use a 2.6.31 pvops kernel from Jeremy's stable branch in my domU and I have yet to reproduce the issue. I did note that it might take a number of days for the problem to manifest, and so far I've tested a domU shutdown after 24 and 72 hours using the new kernel with no issues. My next test is @ 7 days.

I wish I had more information myself, but I don't. Regardless of the accuracy of this claim, I recommend trying other kernels to see if the problem persists.
Konrad Rzeszutek Wilk
2010-Nov-11 17:58 UTC
Re: [Xen-devel] pci-passthrough in pvops causing offline raid
On Thu, Nov 11, 2010 at 05:38:50PM +0000, Mark Adams wrote:
> It is an Areca 1220. I can't see anything when the device goes offline
> apart from
>
> [77324.264270] sd 0:0:0:1: rejecting I/O to offline device
> [77334.005854] sd 0:0:0:0: rejecting I/O to offline device

That is it? No other details from the driver? Did you poke at the driver (modinfo) to see if there are any options to increase its verbosity?

> Unfortunately nothing gets logged because there is nothing to write to
> anymore. I'm not sure how I can see the IRQs otherwise.

cat /proc/interrupts

> There is no pinning being done at all, and the machine was running for a
> few months OK before the pciback was added.

Ok, what about your NICs? Are they on-board? Are they sharing the IRQ with the card? You should be able to see this by looking at /proc/interrupts. Which NICs are they? lspci can help you there. As a matter of fact, run lspci -vvv and send that.

> Is my kernel module line correct above? Are the xen-pciback.permissive
> and resource_alignment options required?

Not always. The resource_alignment is needed only if the BARs (look at the lspci output) are not page-aligned. If you have no idea what I am talking about then the answer is yes.

> Also I am passing through the onboard NICs - is this something that
> should be avoided or is it ok to do?

It is fine. That is the first thing I test..

> No, it's always had the hypervisor - but it was running ok before the
> pciback options were added. This week, it seems to have happened
> approximately every 24 hours.

When this hang occurs, can you do 'xm debug-key Q', 'xm debug-key i', 'xm debug-key z'. Then run 'xm dmesg' and provide that to me?

Is your boot disk on the same disk as the RAID?
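The page-alignment question can also be answered from sysfs rather than by reading lspci output by eye. A rough sketch in Python, assuming that "page-aligned" means a memory BAR starts on a 4 KiB boundary and spans whole pages, and that the text comes from a device's sysfs `resource` file (one "start end flags" line per BAR, in hex). The function name and the sample values are made up for illustration and are not taken from the poster's hardware:

```python
PAGE_SIZE = 4096
IORESOURCE_MEM = 0x200  # memory-resource flag bit as defined in the kernel's ioport.h

def unaligned_bars(resource_text, page_size=PAGE_SIZE):
    """Return indices of memory BARs that do not start on a page boundary
    or do not cover a whole number of pages."""
    bad = []
    # Only the first six lines describe BAR0..BAR5; later lines are ROM/bridge resources.
    for index, line in enumerate(resource_text.splitlines()[:6]):
        fields = line.split()
        if len(fields) != 3:
            continue
        start, end, flags = (int(field, 16) for field in fields)
        if end == 0 or not flags & IORESOURCE_MEM:
            continue  # unused slot or I/O-port BAR
        size = end - start + 1
        if start % page_size or size % page_size:
            bad.append(index)
    return bad

# Hypothetical contents of /sys/bus/pci/devices/<address>/resource.
sample = """\
0x00000000fbce0000 0x00000000fbcfffff 0x0000000000040200
0x0000000000000000 0x0000000000000000 0x0000000000000000
0x000000000000e000 0x000000000000e01f 0x0000000000040101
0x00000000fbcdc100 0x00000000fbcdc1ff 0x0000000000040200
"""
print(unaligned_bars(sample))
```

A non-empty result would suggest the device is a candidate for pci=resource_alignment; an empty one suggests the option may be unnecessary for that device.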
Mark Adams
2010-Nov-11 18:13 UTC
[Xen-users] Re: [Xen-devel] pci-passthrough in pvops causing offline raid
On Thu, Nov 11, 2010 at 12:58:09PM -0500, Konrad Rzeszutek Wilk wrote:
> That is it? No other details from the driver? Did you poke at the driver (modinfo)
> to see if there are any options to increase its verbosity?

I can't do anything once it's happened, everything is offline so I have no utils...

> Ok, what about your NICs? Are they on-board? Are they sharing the IRQ
> with the card? You should be able to see this by looking at /proc/interrupts.
> Which NICs are they? lspci can help you there. As a matter of fact, run
> lspci -vvv and send that.

It is the onboard NICs, they are Intel 82574L. I can see the arcmsr line, but not anything for the NICs (because they are hidden?)

39: 1126249 0 0 0 0 0 0 0 xen-pirq-ioapic-level arcmsr

Nothing else is on 1126249

See lspci.txt attached.

> When this hang occurs, can you do 'xm debug-key Q', 'xm debug-key i', 'xm debug-key z'.
> Then run 'xm dmesg' and provide that to me?

I can try this, but it probably won't work as the device will not be readable.

> Is your boot disk on the same disk as the RAID?

There are 2 raids, a RAID1 for the OS (/boot / /var /tmp /usr) and a RAID5 for VMs - they both disappear at the same time so it appears the card is disappearing..
Mark Adams
2010-Nov-11 18:47 UTC
Re: [Xen-devel] pci-passthrough in pvops causing offline raid
On Thu, Nov 11, 2010 at 06:13:29PM +0000, Mark Adams wrote:
> It is the onboard NICs, they are Intel 82574L. I can see the arcmsr
> line, but not anything for the NICs (because they are hidden?)
>
> 39: 1126249 0 0 0 0 0 0 0 xen-pirq-ioapic-level arcmsr
>
> Nothing else is on 1126249
>
> See lspci.txt attached.

I've just noticed this at the end of xm dmesg

(XEN) msi.c:715: MSI is already in use on device 02:00.0
(XEN) msi.c:715: MSI is already in use on device 02:00.0
(XEN) msi.c:715: MSI is already in use on device 02:00.0

Something else trying to use the device being exported? (the NICs are 02:00.0 and 03:00.0)
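When the hypervisor log is long, warnings like these are easy to miss. A small sketch in Python that tallies them per device, assuming the message wording shown above; the helper name msi_conflicts is hypothetical and the log text is simply the three lines from the post:

```python
import re
from collections import Counter

# Matches the warning text quoted in the post and captures the PCI address.
MSI_IN_USE = re.compile(r"MSI is already in use on device ([0-9a-fA-F:.]+)")

def msi_conflicts(xm_dmesg_text):
    """Count 'MSI is already in use' warnings per PCI device."""
    return Counter(MSI_IN_USE.findall(xm_dmesg_text))

log = """\
(XEN) msi.c:715: MSI is already in use on device 02:00.0
(XEN) msi.c:715: MSI is already in use on device 02:00.0
(XEN) msi.c:715: MSI is already in use on device 02:00.0
"""
print(msi_conflicts(log))
```

Comparing the reported addresses with the devices hidden for pciback shows whether the warnings concern a passed-through device, as they do here for 02:00.0.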
Konrad Rzeszutek Wilk
2010-Nov-11 18:57 UTC
Re: [Xen-devel] pci-passthrough in pvops causing offline raid
> I can't do anything once it's happened, everything is offline so I have
> no utils...

An easy way is to use netconsole. You can make all of the kernel log output go to a different machine on your network.

> It is the onboard NICs, they are Intel 82574L. I can see the arcmsr
> line, but not anything for the NICs (because they are hidden?)

Your lspci tells me it is on 16 and 17. You should see in /proc/interrupts on that line something about pciback?

> 39: 1126249 0 0 0 0 0 0 0 xen-pirq-ioapic-level arcmsr
>
> Nothing else is on 1126249

You mean IRQ 39.

> See lspci.txt attached.

Thanks.

> I can try this, but it probably won't work as the device will not be
> readable.

Look on Google for 'Wiki PVOPS' and there is a section on how to connect a serial console. With the serial console we can send those commands to the hypervisor even if your box is hung.

http://wiki.xen.org/xenwiki/XenSerialConsole

> There are 2 raids, a RAID1 for the OS (/boot / /var /tmp /usr) and a
> RAID5 for VMs - they both disappear at the same time so it appears the
> card is disappearing..

I wonder if we have your IRQs confused. Can you provide the full cat /proc/interrupts and as well the serial bootup of the console? Or just the 'xm dmesg' and 'dmesg' output if you don't have the serial console hooked up yet.
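The mix-up between the IRQ number and the interrupt count comes from the layout of a /proc/interrupts row: the number before the colon is the IRQ, and the columns after it are per-CPU counts. A minimal sketch in Python that splits the row quoted in this thread; the function names are illustrative, and comparing two snapshots is only one assumed way to answer the earlier question of whether IRQs stop arriving:

```python
def parse_interrupt_line(line):
    """Split one /proc/interrupts row into (irq, per_cpu_counts, description)."""
    irq, _, rest = line.partition(":")
    fields = rest.split()
    counts = []
    # Leading all-digit fields are the per-CPU counters; the rest is the description.
    while fields and fields[0].isdigit():
        counts.append(int(fields.pop(0)))
    return irq.strip(), counts, " ".join(fields)

def irq_advanced(before, after):
    """True if the count summed over all CPUs grew between two snapshots of one row."""
    irq_a, counts_a, _ = parse_interrupt_line(before)
    irq_b, counts_b, _ = parse_interrupt_line(after)
    if irq_a != irq_b:
        raise ValueError("rows describe different IRQs")
    return sum(counts_b) > sum(counts_a)

row = "39: 1126249 0 0 0 0 0 0 0 xen-pirq-ioapic-level arcmsr"
irq, counts, description = parse_interrupt_line(row)
print(irq, counts[0], description)
```

Here 39 is the IRQ assigned to arcmsr and 1126249 is the number of interrupts delivered on the first CPU. Taking the same row a few seconds apart and passing both to irq_advanced would show whether the RAID card is still raising interrupts.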
Konrad Rzeszutek Wilk
2010-Nov-11 19:06 UTC
Re: [Xen-devel] pci-passthrough in pvops causing offline raid
> I've just noticed this at the end of xm dmesg
>
> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> (XEN) msi.c:715: MSI is already in use on device 02:00.0
>
> Something else trying to use the device being exported? (the NICs are
> 02:00.0 and 03:00.0)

Hmm, looks like it, but it should not have happened. Can you attach the output of 'xm dmesg' and also do the 'xm debug-keys ..' that I asked for in the previous e-mail?

Jan, the fixes for the MSI you did, they weren't for 4.0.1 right? Just for unstable?
Mark Adams
2010-Nov-11 19:22 UTC
Re: [Xen-users] Re: [Xen-devel] pci-passthrough in pvops causing offline raid
See attached.
Mark Adams
2010-Nov-11 19:42 UTC
Re: [Xen-users] Re: [Xen-devel] pci-passthrough in pvops causing offline raid
Apologies - also see the plain dmesg attached. This one is from the updated machine (4.0.1), which is still showing the MSI issues.

On Thu, Nov 11, 2010 at 07:22:56PM +0000, Mark Adams wrote:
> See attached
>
> On Thu, Nov 11, 2010 at 02:06:58PM -0500, Konrad Rzeszutek Wilk wrote:
> > > I've just noticed this at the end of xm dmesg
> > >
> > > (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > > (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > > (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > >
> > > Something else trying to use the device being exported? (the nics are
> > > 02:00.0 and 03:00.0)
> >
> > Hmm, looks like it, but it should not have happend. Can you attach
> > the output of 'xm dmesg' and also do the 'xm debug-keys ..' that I
> > asked for in the previous e-mail?
> >
> > Jan, the fixes for the MSI you did, they weren't for 4.0.1 right? Just
> > for unstable?
Mark Adams
2010-Nov-12 17:10 UTC
Re: [Xen-users] Re: [Xen-devel] pci-passthrough in pvops causing offline raid
On Thu, Nov 11, 2010 at 02:06:58PM -0500, Konrad Rzeszutek Wilk wrote:
> > I've just noticed this at the end of xm dmesg
> >
> > (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > (XEN) msi.c:715: MSI is already in use on device 02:00.0
> >
> > Something else trying to use the device being exported? (the nics are
> > 02:00.0 and 03:00.0)
>
> Hmm, looks like it, but it should not have happend. Can you attach
> the output of 'xm dmesg' and also do the 'xm debug-keys ..' that I
> asked for in the previous e-mail?
>
> Jan, the fixes for the MSI you did, they weren't for 4.0.1 right? Just
> for unstable?
>

Any further ideas on this? Is it a Xen bug if the hidden device is being accessed in dom0? Or is there an overlap somewhere? (Not sure how this would work.)

Regards,
Mark
Konrad Rzeszutek Wilk
2010-Nov-12 22:22 UTC
Re: [Xen-users] Re: [Xen-devel] pci-passthrough in pvops causing offline raid
On Fri, Nov 12, 2010 at 05:10:58PM +0000, Mark Adams wrote:
> On Thu, Nov 11, 2010 at 02:06:58PM -0500, Konrad Rzeszutek Wilk wrote:
> > > I've just noticed this at the end of xm dmesg
> > >
> > > (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > > (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > > (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > >
> > > Something else trying to use the device being exported? (the nics are
> > > 02:00.0 and 03:00.0)
> >
> > Hmm, looks like it, but it should not have happend. Can you attach
> > the output of 'xm dmesg' and also do the 'xm debug-keys ..' that I
> > asked for in the previous e-mail?
> >
> > Jan, the fixes for the MSI you did, they weren't for 4.0.1 right? Just
> > for unstable?
> >
>
> Any further idea's on this? Is it a xen bug if the hidden device is being
> accessed in dom0? or is there an overlap somewhere? (not sure how this
> would work)..

I was going to look in the source today to get an idea but never got to it...

You might, as I mentioned in earlier emails, try to set up a serial console or netconsole and log the Linux kernel output when it hangs/fails.
Mark Adams
2010-Nov-14 17:15 UTC
Re: [Xen-users] Re: [Xen-devel] pci-passthrough in pvops causing offline raid
On 12 Nov 2010, at 22:22, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:

> On Fri, Nov 12, 2010 at 05:10:58PM +0000, Mark Adams wrote:
>> On Thu, Nov 11, 2010 at 02:06:58PM -0500, Konrad Rzeszutek Wilk wrote:
>>>> I've just noticed this at the end of xm dmesg
>>>>
>>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
>>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
>>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
>>>>
>>>> Something else trying to use the device being exported? (the nics are
>>>> 02:00.0 and 03:00.0)
>>>
>>> Hmm, looks like it, but it should not have happend. Can you attach
>>> the output of 'xm dmesg' and also do the 'xm debug-keys ..' that I
>>> asked for in the previous e-mail?
>>>
>>> Jan, the fixes for the MSI you did, they weren't for 4.0.1 right? Just
>>> for unstable?
>>>
>>
>> Any further idea's on this? Is it a xen bug if the hidden device is being
>> accessed in dom0? or is there an overlap somewhere? (not sure how this
>> would work)..
>
> I was going to look in the source today to get an idea but never got to it...
>
> You might, as I mentioned in earlier emails, try to setup a serial console
> or netconsole and log the Linux kernel output when it hangs/fails.
>

I can't do this unfortunately; the server is in use, so I'm not able to just let it hang again. The passthrough is not in use for now, until I think there is some possible solution to get rid of the MSI conflict (when it won't hang anymore!).
Mark Adams
2010-Nov-15 17:11 UTC
Re: [Xen-users] Re: [Xen-devel] pci-passthrough in pvops causing offline raid
On Sun, Nov 14, 2010 at 05:15:02PM +0000, Mark Adams wrote:
>
> On 12 Nov 2010, at 22:22, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
>
> > On Fri, Nov 12, 2010 at 05:10:58PM +0000, Mark Adams wrote:
> >> On Thu, Nov 11, 2010 at 02:06:58PM -0500, Konrad Rzeszutek Wilk wrote:
> >>>> I've just noticed this at the end of xm dmesg
> >>>>
> >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> >>>>
> >>>> Something else trying to use the device being exported? (the nics are
> >>>> 02:00.0 and 03:00.0)
> >>>
> >>> Hmm, looks like it, but it should not have happend. Can you attach
> >>> the output of 'xm dmesg' and also do the 'xm debug-keys ..' that I
> >>> asked for in the previous e-mail?
> >>>
> >>> Jan, the fixes for the MSI you did, they weren't for 4.0.1 right? Just
> >>> for unstable?
> >>>
> >>
> >> Any further idea's on this? Is it a xen bug if the hidden device is being
> >> accessed in dom0? or is there an overlap somewhere? (not sure how this
> >> would work)..
> >
> > I was going to look in the source today to get an idea but never got to it...
> >
> > You might, as I mentioned in earlier emails, try to setup a serial console
> > or netconsole and log the Linux kernel output when it hangs/fails.
> >
>
> can't do this unfortunately, the server
> Is in use so not able to just let it hang again... The passthrough is
> not in use now until I think there is some possible solution to get
> rid of the MSI conflict (when it won't hang anymore!)

Is there anything else I can do to help resolve this issue? I see another user is also having a similar problem with (quite possibly) PCI passthrough causing their RAID array to go offline.

Did my logs show anything useful? You mentioned some fix for MSI earlier; has this been corrected in a newer version of Xen?

Regards,
Mark
Konrad Rzeszutek Wilk
2010-Nov-15 17:15 UTC
Re: [Xen-users] Re: [Xen-devel] pci-passthrough in pvops causing offline raid
On Sun, Nov 14, 2010 at 05:15:02PM +0000, Mark Adams wrote:
>
> On 12 Nov 2010, at 22:22, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
>
> > On Fri, Nov 12, 2010 at 05:10:58PM +0000, Mark Adams wrote:
> >> On Thu, Nov 11, 2010 at 02:06:58PM -0500, Konrad Rzeszutek Wilk wrote:
> >>>> I've just noticed this at the end of xm dmesg
> >>>>
> >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0

Looking briefly at the code, it means that somebody enabled MSI on the device already and did not disable it. But I wonder how you got those in the first place. Did you use xen-pciback.hide (for PVOPS kernels) or pciback.hide (for older kernels) to "hide" the devices away from the Linux Dom0 kernel?

> > I was going to look in the source today to get an idea but never got to it...
> >
> > You might, as I mentioned in earlier emails, try to setup a serial console
> > or netconsole and log the Linux kernel output when it hangs/fails.
> >
>
> can't do this unfortunately, the server
> Is in use so not able to just let it hang again... The passthrough is not in use now until I think there is some possible solution to get rid of the MSI conflict (when it won't hang anymore!)

Didn't you say that you had two servers and saw this problem on another box too?

Without more details on the Xen hypervisor line or the kernel line when the failure occurs I sadly can't help you.
Mark Adams
2010-Nov-15 17:23 UTC
Re: [Xen-users] Re: [Xen-devel] pci-passthrough in pvops causing offline raid
On Mon, Nov 15, 2010 at 12:15:44PM -0500, Konrad Rzeszutek Wilk wrote:
> On Sun, Nov 14, 2010 at 05:15:02PM +0000, Mark Adams wrote:
> >
> > On 12 Nov 2010, at 22:22, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> >
> > > On Fri, Nov 12, 2010 at 05:10:58PM +0000, Mark Adams wrote:
> > >> On Thu, Nov 11, 2010 at 02:06:58PM -0500, Konrad Rzeszutek Wilk wrote:
> > >>>> I've just noticed this at the end of xm dmesg
> > >>>>
> > >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
>
> Looking briefly at the code it means that somebody enabled the MSI
> already on the device and did not disable them. But I wonder how
> you got those in the first place. Did you use xen-pciback.hide (for PVOPS kernels)
> or pciback.hide (for older kernels) to "hide" the devices away from the
> Linux Dom0 kernel?

Using xen-pciback.hide, as it's a pvops kernel (debian squeeze 2.6.32-5-27).

> > > I was going to look in the source today to get an idea but never got to it...
> > >
> > > You might, as I mentioned in earlier emails, try to setup a serial console
> > > or netconsole and log the Linux kernel output when it hangs/fails.
> > >
> >
> > can't do this unfortunately, the server
> > Is in use so not able to just let it hang again... The passthrough is not in use now until I think there is some possible solution to get rid of the MSI conflict (when it won't hang anymore!)
>
> Didn't you say that you had two servers and saw this problem on another
> box too?
>
> Without more details on the Xen hypervisor line or the kernel line when
> the failure occurs I sadly can't help you.

Yes, this occurs on both servers that I've tried it on. Doesn't the MSI log above indicate that there is a conflict, which is what ends up causing the device to go offline? Is there no other way to identify the conflict?

Regards,
Mark
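One way to double-check from dom0 that the hidden devices really are held by pciback, rather than by a regular NIC driver, is to read the `driver` symlink in sysfs. This is only a sketch: the helper name `pci_driver_of` is invented here, PCI domain `0000` is assumed, and the expectation that a hidden device reports `pciback` as its driver is an assumption about this kernel, not something confirmed in the thread.

```shell
#!/bin/sh
# pci_driver_of BDF [SYSFS_ROOT]: print the name of the driver bound to a
# PCI device, or "(none)" if no driver is bound. The helper name and the
# optional SYSFS_ROOT argument (useful for dry runs) are hypothetical.
pci_driver_of() {
    bdf=$1
    root=${2:-/sys}
    link="$root/bus/pci/devices/0000:$bdf/driver"
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"
    else
        echo "(none)"
    fi
}

# Usage in dom0, for the two NICs named in the thread:
#   pci_driver_of 02:00.0
#   pci_driver_of 03:00.0
```

If either NIC reported a network driver such as igb here, dom0 would still own the device, which could be one explanation for something other than the guest touching its MSI state.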
Konrad Rzeszutek Wilk
2010-Nov-15 17:44 UTC
Re: [Xen-users] Re: [Xen-devel] pci-passthrough in pvops causing offline raid
On Mon, Nov 15, 2010 at 05:23:09PM +0000, Mark Adams wrote:
> On Mon, Nov 15, 2010 at 12:15:44PM -0500, Konrad Rzeszutek Wilk wrote:
> > On Sun, Nov 14, 2010 at 05:15:02PM +0000, Mark Adams wrote:
> > >
> > > On 12 Nov 2010, at 22:22, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> > >
> > > > On Fri, Nov 12, 2010 at 05:10:58PM +0000, Mark Adams wrote:
> > > >> On Thu, Nov 11, 2010 at 02:06:58PM -0500, Konrad Rzeszutek Wilk wrote:
> > > >>>> I've just noticed this at the end of xm dmesg
> > > >>>>
> > > >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > > >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > > >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> >
> > Looking briefly at the code it means that somebody enabled the MSI
> > already on the device and did not disable them. But I wonder how
> > you got those in the first place. Did you use xen-pciback.hide (for PVOPS kernels)
> > or pciback.hide (for older kernels) to "hide" the devices away from the
> > Linux Dom0 kernel?
>
> using xen-pciback.hide as its a pvops kernel (debian squeeze
> 2.6.32-5-27)

Ok. Then it might be worth looking at when this happens. I think there is an argument on the Xen hypervisor line to include the time-stamp, but I don't remember it :-(

> > Didn't you say that you had two servers and saw this problem on another
> > box too?
> >
> > Without more details on the Xen hypervisor line or the kernel line when
> > the failure occurs I sadly can't help you.
>
> Yes this occurs on both servers that I've tried it on. Doesn't the MSI
> log above indicate that there is a conflict - which is what ends up
> causing the device to go offline? Is there no other way to identify the

Could be, but it is unclear - it depends on when the message pops out.

But that does not help with finding out why your RAID controller goes offline.
Mark Adams
2010-Nov-15 17:56 UTC
Re: [Xen-users] Re: [Xen-devel] pci-passthrough in pvops causing offline raid
On Mon, Nov 15, 2010 at 12:44:13PM -0500, Konrad Rzeszutek Wilk wrote:
> On Mon, Nov 15, 2010 at 05:23:09PM +0000, Mark Adams wrote:
> > On Mon, Nov 15, 2010 at 12:15:44PM -0500, Konrad Rzeszutek Wilk wrote:
> > > On Sun, Nov 14, 2010 at 05:15:02PM +0000, Mark Adams wrote:
> > > >
> > > > On 12 Nov 2010, at 22:22, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> > > >
> > > > > On Fri, Nov 12, 2010 at 05:10:58PM +0000, Mark Adams wrote:
> > > > >> On Thu, Nov 11, 2010 at 02:06:58PM -0500, Konrad Rzeszutek Wilk wrote:
> > > > >>>> I've just noticed this at the end of xm dmesg
> > > > >>>>
> > > > >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > > > >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > > > >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > >
> > > Looking briefly at the code it means that somebody enabled the MSI
> > > already on the device and did not disable them. But I wonder how
> > > you got those in the first place. Did you use xen-pciback.hide (for PVOPS kernels)
> > > or pciback.hide (for older kernels) to "hide" the devices away from the
> > > Linux Dom0 kernel?
> >
> > using xen-pciback.hide as its a pvops kernel (debian squeeze
> > 2.6.32-5-27)
>
> Ok. Then it might be worth looking in when this happens. I think
> there is an argument on the Xen hyperisor line to include the time-stamp, but
> I don't remember it :-(
>
> > > Didn't you say that you had two servers and saw this problem on another
> > > box too?
> > >
> > > Without more details on the Xen hypervisor line or the kernel line when
> > > the failure occurs I sadly can't help you.
> >
> > Yes this occurs on both servers that I've tried it on. Doesn't the MSI
> > log above indicate that there is a conflict - which is what ends up
> > causing the device to go offline? Is there no other way to identify the
>
> Could be, but it is unclear - it depends on when the message pops out.

The message appears immediately on boot.

>
> But that does not help with finding out why your RAID controller goes offline.

Maybe the other user having a similar issue can help with logs if it is still happening to him. I'll ask on that thread...

Regards,
Mark
Pasi Kärkkäinen
2010-Nov-15 19:26 UTC
Re: [Xen-users] Re: [Xen-devel] pci-passthrough in pvops causing offline raid
On Mon, Nov 15, 2010 at 12:44:13PM -0500, Konrad Rzeszutek Wilk wrote:
> On Mon, Nov 15, 2010 at 05:23:09PM +0000, Mark Adams wrote:
> > On Mon, Nov 15, 2010 at 12:15:44PM -0500, Konrad Rzeszutek Wilk wrote:
> > > On Sun, Nov 14, 2010 at 05:15:02PM +0000, Mark Adams wrote:
> > > >
> > > > On 12 Nov 2010, at 22:22, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> > > >
> > > > > On Fri, Nov 12, 2010 at 05:10:58PM +0000, Mark Adams wrote:
> > > > >> On Thu, Nov 11, 2010 at 02:06:58PM -0500, Konrad Rzeszutek Wilk wrote:
> > > > >>>> I've just noticed this at the end of xm dmesg
> > > > >>>>
> > > > >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > > > >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > > > >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > >
> > > Looking briefly at the code it means that somebody enabled the MSI
> > > already on the device and did not disable them. But I wonder how
> > > you got those in the first place. Did you use xen-pciback.hide (for PVOPS kernels)
> > > or pciback.hide (for older kernels) to "hide" the devices away from the
> > > Linux Dom0 kernel?
> >
> > using xen-pciback.hide as its a pvops kernel (debian squeeze
> > 2.6.32-5-27)
>
> Ok. Then it might be worth looking in when this happens. I think
> there is an argument on the Xen hyperisor line to include the time-stamp, but
> I don't remember it :-(
>

http://wiki.xen.org/xenwiki/XenHypervisorBootOptions

So I think it's "console_timestamps"

-- Pasi
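As a rough illustration of where that option would go: with the grub2-style entry shown in the first post (`module ... placeholder ...`), hypervisor options belong on the `multiboot` line rather than on the dom0 kernel's `module` line. The Xen image file name below is a placeholder chosen for the example, and the helper `has_console_timestamps` is a hypothetical convenience, not an existing tool.

```shell
#!/bin/sh
# Example hypervisor line with the option appended (file name is hypothetical):
#   multiboot /xen-4.0-amd64.gz placeholder console_timestamps
#
# has_console_timestamps FILE: succeed if a multiboot line in the given
# grub configuration file carries the console_timestamps option.
has_console_timestamps() {
    grep -E '^[[:space:]]*multiboot[[:space:]]' "$1" |
        grep -q 'console_timestamps'
}

# Usage:
#   has_console_timestamps /boot/grub/grub.cfg && echo "timestamps enabled"
```

After a reboot with the option in place, the lines in `xm dmesg` should carry a time prefix, which would show when the "MSI is already in use" messages are logged relative to boot.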
Mark Adams
2010-Nov-16 10:37 UTC
Re: [Xen-users] Re: [Xen-devel] pci-passthrough in pvops causing offline raid
On Mon, Nov 15, 2010 at 12:44:13PM -0500, Konrad Rzeszutek Wilk wrote:
> On Mon, Nov 15, 2010 at 05:23:09PM +0000, Mark Adams wrote:
> > On Mon, Nov 15, 2010 at 12:15:44PM -0500, Konrad Rzeszutek Wilk wrote:
> > > On Sun, Nov 14, 2010 at 05:15:02PM +0000, Mark Adams wrote:
> > > >
> > > > On 12 Nov 2010, at 22:22, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> > > >
> > > > > On Fri, Nov 12, 2010 at 05:10:58PM +0000, Mark Adams wrote:
> > > > >> On Thu, Nov 11, 2010 at 02:06:58PM -0500, Konrad Rzeszutek Wilk wrote:
> > > > >>>> I've just noticed this at the end of xm dmesg
> > > > >>>>
> > > > >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > > > >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > > > >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > >
> > > Looking briefly at the code it means that somebody enabled the MSI
> > > already on the device and did not disable them. But I wonder how
> > > you got those in the first place. Did you use xen-pciback.hide (for PVOPS kernels)
> > > or pciback.hide (for older kernels) to "hide" the devices away from the
> > > Linux Dom0 kernel?
> >
> > using xen-pciback.hide as its a pvops kernel (debian squeeze
> > 2.6.32-5-27)
>
> Ok. Then it might be worth looking in when this happens. I think
> there is an argument on the Xen hyperisor line to include the time-stamp, but
> I don't remember it :-(
>
> > > Didn't you say that you had two servers and saw this problem on another
> > > box too?
> > >
> > > Without more details on the Xen hypervisor line or the kernel line when
> > > the failure occurs I sadly can't help you.
> >
> > Yes this occurs on both servers that I've tried it on. Doesn't the MSI
> > log above indicate that there is a conflict - which is what ends up
> > causing the device to go offline? Is there no other way to identify the
>
> Could be, but it is unclear - it depends on when the message pops out.
>
> But that does not help with finding out why your RAID controller goes offline.

Stephan Austermuhle advises that nothing is logged via remote syslog when this hang occurs. I'll reply on that thread to see if he can add the additional logging.
Konrad Rzeszutek Wilk
2010-Nov-16 16:04 UTC
Re: [Xen-users] Re: [Xen-devel] pci-passthrough in pvops causing offline raid
> > Could be, but it is unclear - it depends on when the message pops out.
> >
> > But that does not help with finding out why your RAID controller goes offline.
>
> Stephan Austermuhle advises that nothing is logged via remote syslog
> when this hang occurs. I'll reply on that thread to see if he can add

He can also look at http://wiki.xensource.com/xenwiki/XenSerialConsole

> the additional logging

Pasi, is there a Wiki with this?:

When the hang happens, he needs to do the following:

1) In the Linux kernel, hit SysRq-L, SysRq-T

2) Then go in the hypervisor, hit Ctrl-A three times. He should see a prompt saying (XEN) ** Serial ... and hit '*' - that will collect all of the relevant information.

3) Send the full serial log from the start of the machine to us (or in this case, to you Mark - or you can just CC him on this thread).
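Where no keyboard is attached but a root shell in dom0 is still responsive, step 1 can also be triggered by writing the key letters to `/proc/sysrq-trigger`. The sketch below covers only that Linux-side step; the function name `send_sysrq` and the `SYSRQ_TRIGGER` override (there so the function can be dry-run against an ordinary file) are inventions for this example. The hypervisor-side step still needs the serial console as described above.

```shell
#!/bin/sh
# send_sysrq KEY...: write each SysRq key letter to the kernel's trigger file.
# 'l' asks for backtraces of the active CPUs, 't' for a dump of all tasks;
# the output goes to the kernel log and any configured serial console.
# SYSRQ_TRIGGER is a hypothetical override for testing.
send_sysrq() {
    trigger=${SYSRQ_TRIGGER:-/proc/sysrq-trigger}
    for key in "$@"; do
        printf '%s\n' "$key" > "$trigger" || return 1
    done
}

# Usage (as root in dom0, at the time of the hang):
#   send_sysrq l t
```

This only helps if the shell was already open before the RAID went offline, since starting new programs from the missing disks is likely to fail.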
Mark Adams
2010-Nov-16 16:47 UTC
Re: [Xen-users] Re: [Xen-devel] pci-passthrough in pvops causing offline raid
Hi Stephan, please see debugging instructions from Konrad below.

Regards,
Mark

On Tue, Nov 16, 2010 at 11:04:22AM -0500, Konrad Rzeszutek Wilk wrote:
> > > Could be, but it is unclear - it depends on when the message pops out.
> > >
> > > But that does not help with finding out why your RAID controller goes offline.
> >
> > Stephan Austermuhle advises that nothing is logged via remote syslog
> > when this hang occurs. I'll reply on that thread to see if he can add
>
> He can also look at http://wiki.xensource.com/xenwiki/XenSerialConsole
>
> > the additional logging
>
> Pasi, is there a Wiki with this?:
>
> When the hang happens, he needs to do two things:
>
> 1) In the Linux kernel, hit SysRq-L, SysRQ-T
>
> 2). Then go in the hypervisor, hit Ctrl-A three times. He should see a
> prompt saying (XEN) ** Serial ...
> and hit '*' - that will collect all of the relevant information.
>
> 3). Send the full serial log from the start of the machine to us (or in this
> case, to you Mark - or you can just CC him on this thread).
Pasi Kärkkäinen
2010-Nov-16 21:19 UTC
Re: [Xen-users] Re: [Xen-devel] pci-passthrough in pvops causing offline raid
On Tue, Nov 16, 2010 at 11:04:22AM -0500, Konrad Rzeszutek Wilk wrote:
> > > Could be, but it is unclear - it depends on when the message pops out.
> > >
> > > But that does not help with finding out why your RAID controller goes offline.
> >
> > Stephan Austermuhle advises that nothing is logged via remote syslog
> > when this hang occurs. I'll reply on that thread to see if he can add
>
> He can also look at http://wiki.xensource.com/xenwiki/XenSerialConsole
>
> > the additional logging
>
> Pasi, is there a Wiki with this?:
>

I don't think we have this in the wiki.. at least I haven't seen one..

-- Pasi

> When the hang happens, he needs to do two things:
>
> 1) In the Linux kernel, hit SysRq-L, SysRQ-T
>
> 2). Then go in the hypervisor, hit Ctrl-A three times. He should see a
> prompt saying (XEN) ** Serial ...
> and hit '*' - that will collect all of the relevant information.
>
> 3). Send the full serial log from the start of the machine to us (or in this
> case, to you Mark - or you can just CC him on this thread).
Stephan Austermühle
2010-Nov-18 08:42 UTC
Re: [Xen-users] Re: [Xen-devel] pci-passthrough in pvops causing offline raid
Hello Mark!

On 16.11.2010 17:47, Mark Adams wrote:

>>> Stephan Austermuhle advises that nothing is logged via remote syslog
>>> when this hang occurs. I'll reply on that thread to see if he can add
>>
>> He can also look at http://wiki.xensource.com/xenwiki/XenSerialConsole
>>
>>> the additional logging
>>
>> Pasi, is there a Wiki with this?:
>>
>> When the hang happens, he needs to do two things:
>>
>> 1) In the Linux kernel, hit SysRq-L, SysRQ-T
>>
>> 2). Then go in the hypervisor, hit Ctrl-A three times. He should see a
>> prompt saying (XEN) ** Serial ...
>> and hit '*' - that will collect all of the relevant information.
>>
>> 3). Send the full serial log from the start of the machine to us (or in this
>> case, to you Mark - or you can just CC him on this thread).

Thanks for your support.

The server is far away from me (some hundred kilometers) with no chance to connect a serial console. The only thing that I have access to is a network console (kind of iLO). Is that sufficient to collect additional debug data?

Best regards,
Stephan
Pasi Kärkkäinen
2010-Nov-18 08:45 UTC
Re: [Xen-users] Re: [Xen-devel] pci-passthrough in pvops causing offline raid
On Thu, Nov 18, 2010 at 09:42:32AM +0100, Stephan Austermühle wrote:
> Hello Mark!
>
> On 16.11.2010 17:47, Mark Adams wrote:
>
> >>> Stephan Austermuhle advises that nothing is logged via remote syslog
> >>> when this hang occurs. I'll reply on that thread to see if he can add
> >>
> >> He can also look at http://wiki.xensource.com/xenwiki/XenSerialConsole
> >>
> >>> the additional logging
> >>
> >> Pasi, is there a Wiki with this?:
> >>
> >> When the hang happens, he needs to do two things:
> >>
> >> 1) In the Linux kernel, hit SysRq-L, SysRQ-T
> >>
> >> 2). Then go in the hypervisor, hit Ctrl-A three times. He should see a
> >> prompt saying (XEN) ** Serial ...
> >> and hit '*' - that will collect all of the relevant information.
> >>
> >> 3). Send the full serial log from the start of the machine to us (or in this
> >> case, to you Mark - or you can just CC him on this thread).
>
> Thanks for your support.
>
> The server is far away from me (some hundred kilometers) with no chance
> to connect a serial console. The only thing that I have access to is a
> network console (kind of iLO). Is it sufficient to collect additional
> debug data?
>

If it's SOL (Serial Over LAN), then yes, that's enough.

See: http://wiki.xensource.com/xenwiki/XenSerialConsole

-- Pasi
Stephan Austermühle
2010-Nov-18 08:48 UTC
Re: [Xen-users] Re: [Xen-devel] pci-passthrough in pvops causing offline raid
Hi Pasi,

On 18.11.2010 09:45, Pasi Kärkkäinen wrote:

>> The server is far away from me (some hundred kilometers) with no chance
>> to connect a serial console. The only thing that I have access to is a
>> network console (kind of iLO). Is it sufficient to collect additional
>> debug data?
>
> If it's SOL (Serial Over LAN), then yes, that's enough.
>
> See: http://wiki.xensource.com/xenwiki/XenSerialConsole

I'll check.

Stephan
Mark Adams
2010-Nov-24 17:59 UTC
Re: [Xen-users] Re: [Xen-devel] pci-passthrough in pvops causing offline raid
> > > > > >>>>
> > > > > >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > > > > >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > > > > >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > > >
> > > > Looking briefly at the code it means that somebody enabled the MSI
> > > > already on the device and did not disable them. But I wonder how
> > > > you got those in the first place. Did you use xen-pciback.hide (for PVOPS kernels)
> > > > or pciback.hide (for older kernels) to "hide" the devices away from the
> > > > Linux Dom0 kernel?
> > >
> > > using xen-pciback.hide as its a pvops kernel (debian squeeze
> > > 2.6.32-5-27)
> >
> > Ok. Then it might be worth looking in when this happens. I think
> > there is an argument on the Xen hyperisor line to include the time-stamp, but
> > I don't remember it :-(
> >

I've got a test setup in place now, and am trying to reproduce this. I've not connected up serial as yet, but I can see the following in the qemu-dm log file when I get the "MSI is already in use" errors above. Note also that this error -always- shows for the first device specified in the pci= field, and not the 2nd.

pt_msixctrl_reg_write: guest enabling MSI-X, disable MSI-INTx translation
pci_intx: intx=1
pt_msix_update_one: Update msix entry 0 with pirq 4d gvec 59
pt_msix_update_one: Update msix entry 1 with pirq 4c gvec 61
pt_msix_update_one: Update msix entry 2 with pirq 4b gvec 69
pt_msixctrl_reg_write: guest enabling MSI-X, disable MSI-INTx translation
pci_intx: intx=2

I have also seen the following log just once, not sure if it's related:

(XEN) domctl.c:811:d0 XEN_DOMCTL_test_assign_device: 2:0.0 already assigned, or non-existent

Does this help at all with debugging my issues?

Regards,
Mark
Konrad Rzeszutek Wilk
2010-Nov-24 20:28 UTC
Re: [Xen-users] Re: [Xen-devel] pci-passthrough in pvops causing offline raid
On Wed, Nov 24, 2010 at 05:59:26PM +0000, Mark Adams wrote:
> > > > > > >>>>
> > > > > > >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > > > > > >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > > > > > >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > > > >
> > > > > Looking briefly at the code it means that somebody enabled the MSI
> > > > > already on the device and did not disable them. But I wonder how
> > > > > you got those in the first place. Did you use xen-pciback.hide (for PVOPS kernels)
> > > > > or pciback.hide (for older kernels) to "hide" the devices away from the
> > > > > Linux Dom0 kernel?
> > > >
> > > > using xen-pciback.hide as its a pvops kernel (debian squeeze
> > > > 2.6.32-5-27)
> > >
> > > Ok. Then it might be worth looking in when this happens. I think
> > > there is an argument on the Xen hyperisor line to include the time-stamp, but
> > > I don't remember it :-(
> > >
>
> I've got a test setup in place now, and am trying to reproduce this.
> I've not connected up serial as yet, but can see the following logs in
> the qemu-dm log file when I get the "MSI is already in use" errors
> above. Note also that this error -always- shows for the first specified
> device in the pci= field, and not the 2nd.
>
> pt_msixctrl_reg_write: guest enabling MSI-X, disable MSI-INTx translation
> pci_intx: intx=1
> pt_msix_update_one: Update msix entry 0 with pirq 4d gvec 59
> pt_msix_update_one: Update msix entry 1 with pirq 4c gvec 61
> pt_msix_update_one: Update msix entry 2 with pirq 4b gvec 69
> pt_msixctrl_reg_write: guest enabling MSI-X, disable MSI-INTx translation
> pci_intx: intx=2
>
> I have also seen the following log just once, not sure if it's related:
>
> (XEN) domctl.c:811:d0 XEN_DOMCTL_test_assign_device: 2:0.0 already assigned, or non-existent
>
> Does this help at all with debugging my issues?

Not yet. I need the serial log of the Linux kernel and the Xen hypervisor from when your machine is toast. I mentioned the key sequences in the previous email; look on Google for how to pass in SysRq if you are using a serial concentrator.
Mark Adams
2010-Nov-26 11:15 UTC
Re: [Xen-users] Re: [Xen-devel] pci-passthrough in pvops causing offline raid
On Wed, Nov 24, 2010 at 03:28:43PM -0500, Konrad Rzeszutek Wilk wrote:
> On Wed, Nov 24, 2010 at 05:59:26PM +0000, Mark Adams wrote:
> > > > > > > >>>>
> > > > > > > >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > > > > > > >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > > > > > > >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > > > > >
> > > > > > Looking briefly at the code it means that somebody enabled the MSI
> > > > > > already on the device and did not disable them. But I wonder how
> > > > > > you got those in the first place. Did you use xen-pciback.hide (for PVOPS kernels)
> > > > > > or pciback.hide (for older kernels) to "hide" the devices away from the
> > > > > > Linux Dom0 kernel?
> >
> > I've got a test setup in place now, and am trying to reproduce this.
> > I've not connected up serial as yet, but can see the following logs in
> > the qemu-dm log file when I get the "MSI is already in use" errors
> > above. Note also that this error -always- shows for the first specified
> > device in the pci= field, and not the 2nd.
> >

In my new test setup, I have seen some strange behaviour. One of the HVMs (with identical config in dom0 and domU) suddenly would not allow the igb driver to be loaded in domU, even though the device was visible in lspci. Shutting the machine down, removing the power cord, waiting 5 seconds and then plugging it in again corrected that issue. Is this possibly a motherboard bug? I have also disabled the SR-IOV functionality in the BIOS in case this is causing any issues.

In addition, to try to correct the MSI issue noted above, I have changed my pci= line to the following:

pci=[ '08:00.0,msitranslate=0', '08:00.1,msitranslate=0' ]

This has stopped the "already in use on device" log, and the devices appear to show correctly in the domU. Is it safe to disable msitranslate? As I understand it, it's for allowing multifunction devices to be seen as such in domU. Is that correct?

I haven't been able to reproduce the dropped raid issue yet, but I am awaiting delivery of the Red-Fone boxes (ISDN VoIP), which seem to cause this due to their very high interrupt usage (2000 per second).

In the meantime, I can see the following in the qemu-dm logs now with msitranslate=0 set. Is it anything to worry about?

pt_pci_write_config: Warning: Guest attempt to set address to unused Base Address Register. [00:05.0][Offset:14h][Length:4]
pt_ioport_map: e_phys=ffff pio_base=e880 len=32 index=2 first_map=0
pt_ioport_map: e_phys=c220 pio_base=e880 len=32 index=2 first_map=0
pt_pci_write_config: Warning: Guest attempt to set address to unused Base Address Register. [00:06.0][Offset:14h][Length:4]
pt_ioport_map: e_phys=ffff pio_base=ec00 len=32 index=2 first_map=0
pt_ioport_map: e_phys=c240 pio_base=ec00 len=32 index=2 first_map=0
pt_msix_update_one: Update msix entry 0 with pirq 4f gvec 59
pt_msix_update_one: Update msix entry 1 with pirq 4e gvec 61
pt_msix_update_one: Update msix entry 2 with pirq 4d gvec 69
pt_msix_update_one: Update msix entry 3 with pirq 4c gvec 71
pt_msix_update_one: Update msix entry 4 with pirq 4b gvec 79
pci_msix_writel: Error: Can't update msix entry 0 since MSI-X is already function.
pci_msix_writel: Error: Can't update msix entry 0 since MSI-X is already function.
pci_msix_writel: Error: Can't update msix entry 0 since MSI-X is already function.
pci_msix_writel: Error: Can't update msix entry 1 since MSI-X is already function.
pci_msix_writel: Error: Can't update msix entry 1 since MSI-X is already function.
pci_msix_writel: Error: Can't update msix entry 1 since MSI-X is already function.
pci_msix_writel: Error: Can't update msix entry 2 since MSI-X is already function.
pci_msix_writel: Error: Can't update msix entry 2 since MSI-X is already function.
pci_msix_writel: Error: Can't update msix entry 2 since MSI-X is already function.
pci_msix_writel: Error: Can't update msix entry 3 since MSI-X is already function.
pci_msix_writel: Error: Can't update msix entry 3 since MSI-X is already function.
pci_msix_writel: Error: Can't update msix entry 3 since MSI-X is already function.
pci_msix_writel: Error: Can't update msix entry 4 since MSI-X is already function.
pci_msix_writel: Error: Can't update msix entry 4 since MSI-X is already function.
pci_msix_writel: Error: Can't update msix entry 4 since MSI-X is already function.

>
> Not yet. Need to serial log of the Linux kernel and the Xen hypervisor when your
> machine is toast. I mentioned in the previous email the key sequences - look on Google
> on how to pass in SysRQ if you are using a serial concentrator.

I will do this when I can get the machine to crash.

Best Regards,
Mark
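For anyone triaging a qemu-dm log like the one above, here is a minimal sketch that tallies the `pci_msix_writel` errors per MSI-X entry and records the pirq/gvec mappings. The two regular expressions are derived only from the log lines quoted in this thread, and the function name `summarise` is made up for illustration; other qemu-dm versions may word these messages differently.

```python
import re
from collections import Counter

# Patterns derived from the qemu-dm lines quoted in this thread (assumption:
# other qemu-dm builds may phrase them differently). "'+" also tolerates the
# doubled apostrophe that some list archives produce.
UPDATE_RE = re.compile(
    r"pt_msix_update_one: Update msix entry (\d+) with pirq ([0-9a-f]+) gvec ([0-9a-f]+)"
)
ERROR_RE = re.compile(r"pci_msix_writel: Error: Can'+t update msix entry (\d+)")


def summarise(log_text):
    """Return (mappings, errors) for a qemu-dm log.

    mappings: {entry: (pirq_hex, gvec_hex)} from the last update seen per entry.
    errors:   Counter of rejected table writes per MSI-X entry.
    """
    mappings = {}
    errors = Counter()
    for line in log_text.splitlines():
        match = UPDATE_RE.search(line)
        if match:
            mappings[int(match.group(1))] = (match.group(2), match.group(3))
            continue
        match = ERROR_RE.search(line)
        if match:
            errors[int(match.group(1))] += 1
    return mappings, errors
```

Feeding it the log above should report five mapped entries (0 to 4) with three rejected writes each, which makes it easier to compare runs with and without msitranslate=0.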
Mark Adams
2010-Nov-26 15:25 UTC
Re: [Xen-users] Re: [Xen-devel] pci-passthrough in pvops causing offline raid
On Fri, Nov 26, 2010 at 11:15:20AM +0000, Mark Adams wrote:
> On Wed, Nov 24, 2010 at 03:28:43PM -0500, Konrad Rzeszutek Wilk wrote:
> > On Wed, Nov 24, 2010 at 05:59:26PM +0000, Mark Adams wrote:
> > > > > > > > >>>>
> > > > > > > > >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > > > > > > > >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > > > > > > > >>>> (XEN) msi.c:715: MSI is already in use on device 02:00.0
> > > > > > >
> > > > > > > Looking briefly at the code it means that somebody enabled the MSI
> > > > > > > already on the device and did not disable them. But I wonder how
> > > > > > > you got those in the first place. Did you use xen-pciback.hide (for PVOPS kernels)
> > > > > > > or pciback.hide (for older kernels) to "hide" the devices away from the
> > > > > > > Linux Dom0 kernel?
> > >
> > > I've got a test setup in place now, and am trying to reproduce this.
> > > I've not connected up serial as yet, but can see the following logs in
> > > the qemu-dm log file when I get the "MSI is already in use" errors
> > > above. Note also that this error -always- shows for the first specified
> > > device in the pci= field, and not the 2nd.
> > >
>
> In my new test setup, I have seen some strange behaviour. 1 of the HVM's
> (with identical config in dom0 and domU) suddenly would not allow the
> igb driver to be loaded in domU, even though the device was visible in
> lspci. Shutting the machine down, removing the power cord, waiting 5
> seconds then plugging it in again corrected that issue - Is this
> possibly a motherboard bug? I have also disabled the SR-IOV
> functionality in the BIOS incase this is causing any issues.
>
> In addition, to try to correct the MSI issue noted above, I have changed
> my pci= line to the following:
>
> pci=[ '08:00.0,msitranslate=0', '08:00.1,msitranslate=0' ]

Apologies for replying to my own post, but I'm having an issue with this setup: I cannot get a link on the 2nd interface that I'm passing through. I've tried with msitranslate both on and off, with no success. The device shows correctly as an interface in the domU, but even with the link up (the link LEDs are lit on the interface) the domU always shows the eth2 interface as not ready.

[ 7.001784] ADDRCONF(NETDEV_UP): eth1: link is not ready
[ 7.047903] ADDRCONF(NETDEV_UP): eth2: link is not ready
[ 10.108995] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[ 10.109653] ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
[ 16.468102] eth0: no IPv6 routers present
[ 20.404092] eth1: no IPv6 routers present

I've tried using the ET dual port card (igb) as well as the onboard interfaces (e1000e), with exactly the same result. The eth0 interface is a xen bridge device, and if I remove it, both passthrough interfaces work correctly. Can you not have a bridge and pci-passthrough operational at the same time? Is there a limit of 2 NICs in an HVM domU? (This doesn't sound right...)

Regards,
Mark

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Nov-29 16:36 UTC
[Xen-devel] HVM DomU, msi_translate=0, MSI/MSI-X PCI passthrough fails.
>
> In my new test setup, I have seen some strange behaviour. 1 of the HVM's
> (with identical config in dom0 and domU) suddenly would not allow the
> igb driver to be loaded in domU, even though the device was visible in

Let's create a new thread for this other issue.

> lspci. Shutting the machine down, removing the power cord, waiting 5
> seconds then plugging it in again corrected that issue - Is this
> possibly a motherboard bug? I have also disabled the SR-IOV
> functionality in the BIOS incase this is causing any issues.
>
> In addition, to try to correct the MSI issue noted above, I have changed
> my pci= line to the following:
>
> pci=[ '08:00.0,msitranslate=0', '08:00.1,msitranslate=0' ]

With msi_translate=1 turned on, the DomU HVM guests did work, right?

>
> This has stopped the "already in use on device" log, and the devices
> appear to show correctly in the domU. Is it safe to disable
> msitranslate? as I understand it, its for allowing multifunction devices
> to be seen as such in domU. Is that correct?
>
> I haven't been able to reproduce the dropped raid issue yet, but I am
> awaiting delivery of the Red-Fone boxes (ISDN VoIP) which seem to cause
> this due to their very high interrupt usage (2000 per second).

OK.

>
> In the mean time, I can see the following in the qemu-dm logs now with
> the msitranslate=0 enabled. Is it anything to worry about?

Well, the "Error" ones are pretty bad, though I am having a hard time understanding what they mean. Let's copy some of the QEMU folks on this.

> pt_pci_write_config: Warning: Guest attempt to set address to unused Base Address Register. [00:05.0][Offset:14h][Length:4]
> pt_ioport_map: e_phys=ffff pio_base=e880 len=32 index=2 first_map=0
> pt_ioport_map: e_phys=c220 pio_base=e880 len=32 index=2 first_map=0
> pt_pci_write_config: Warning: Guest attempt to set address to unused Base Address Register. [00:06.0][Offset:14h][Length:4]
> pt_ioport_map: e_phys=ffff pio_base=ec00 len=32 index=2 first_map=0
> pt_ioport_map: e_phys=c240 pio_base=ec00 len=32 index=2 first_map=0
> pt_msix_update_one: Update msix entry 0 with pirq 4f gvec 59
> pt_msix_update_one: Update msix entry 1 with pirq 4e gvec 61
> pt_msix_update_one: Update msix entry 2 with pirq 4d gvec 69
> pt_msix_update_one: Update msix entry 3 with pirq 4c gvec 71
> pt_msix_update_one: Update msix entry 4 with pirq 4b gvec 79
> pci_msix_writel: Error: Can't update msix entry 0 since MSI-X is already function.
> pci_msix_writel: Error: Can't update msix entry 0 since MSI-X is already function.
> pci_msix_writel: Error: Can't update msix entry 0 since MSI-X is already function.
> pci_msix_writel: Error: Can't update msix entry 1 since MSI-X is already function.
> pci_msix_writel: Error: Can't update msix entry 1 since MSI-X is already function.
> pci_msix_writel: Error: Can't update msix entry 1 since MSI-X is already function.
> pci_msix_writel: Error: Can't update msix entry 2 since MSI-X is already function.
> pci_msix_writel: Error: Can't update msix entry 2 since MSI-X is already function.
> pci_msix_writel: Error: Can't update msix entry 2 since MSI-X is already function.
> pci_msix_writel: Error: Can't update msix entry 3 since MSI-X is already function.
> pci_msix_writel: Error: Can't update msix entry 3 since MSI-X is already function.
> pci_msix_writel: Error: Can't update msix entry 3 since MSI-X is already function.
> pci_msix_writel: Error: Can't update msix entry 4 since MSI-X is already function.
> pci_msix_writel: Error: Can't update msix entry 4 since MSI-X is already function.
> pci_msix_writel: Error: Can't update msix entry 4 since MSI-X is already function.
>
> >
> > Not yet. Need to serial log of the Linux kernel and the Xen hypervisor when your
> > machine is toast. I mentioned in the previous email the key sequences - look on Google
> > on how to pass in SysRQ if you are using a serial concentrator.
>
> I will do this when I can get the machine to crash.
>
> Best Regards,
> Mark
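When comparing runs with translation on and off, it can help to generate the pci= list rather than edit it by hand. The sketch below builds the same list that is quoted above. It assumes the per-device option is spelled `msitranslate`, as in the pci= line from this thread, and the helper name `pci_entries` is hypothetical, not part of any Xen tool.

```python
def pci_entries(bdfs, msitranslate=None):
    """Build a pci= list in the style quoted in this thread.

    bdfs:         iterable of bus:device.function strings, e.g. "08:00.0".
    msitranslate: None leaves the option out; a truthy or falsy value
                  appends ",msitranslate=1" or ",msitranslate=0".
    """
    entries = []
    for bdf in bdfs:
        entry = bdf
        if msitranslate is not None:
            entry += ",msitranslate=%d" % (1 if msitranslate else 0)
        entries.append(entry)
    return entries


# The list used in the test setup described above.
pci = pci_entries(["08:00.0", "08:00.1"], msitranslate=0)
```

Switching `msitranslate=0` to `msitranslate=1` gives the other configuration asked about here, with everything else held constant.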
Mark Adams
2010-Dec-08 12:58 UTC
[Xen-devel] Re: HVM DomU, msi_translate=0, MSI/MSI-X PCI passthrough fails.
Hi,

Apologies for top-posting, but after a lot of testing I believe there must be an issue with IRQs going missing between domU and dom0. Unfortunately I have no data to prove this!

With msitranslate=0 as detailed below, and pci=nomsi in the guest kernel grub config, all 3 NICs appear OK in the domU; however, I still had issues with the red-fone ISDN box. The interrupts were showing correctly (2000/s) in the domU, but communication to the device via the NIC was still being interrupted (as shown in the asterisk console). Note that to get the igb driver to allow this many interrupts, InterruptThrottleRate was set to 0. The same config (red-fone box, asterisk etc.) works fine with a physical server.

There is also the additional issue that I could not get the passthrough NICs to show correctly when I also had a bridge set up.

Throughout my testing, however, I could not get the machine to crash.

Not sure where to go with this one. For now we are keeping our VoIP servers physical when ISDN connections are required.

Regards,
Mark

On Mon, Nov 29, 2010 at 11:36:35AM -0500, Konrad Rzeszutek Wilk wrote:
> >
> > In my new test setup, I have seen some strange behaviour. 1 of the HVM's
> > (with identical config in dom0 and domU) suddenly would not allow the
> > igb driver to be loaded in domU, even though the device was visible in
>
> Let's create a new thread for this other issue.
>
> > lspci. Shutting the machine down, removing the power cord, waiting 5
> > seconds then plugging it in again corrected that issue - Is this
> > possibly a motherboard bug? I have also disabled the SR-IOV
> > functionality in the BIOS incase this is causing any issues.
> >
> > In addition, to try to correct the MSI issue noted above, I have changed
> > my pci= line to the following:
> >
> > pci=[ '08:00.0,msitranslate=0', '08:00.1,msitranslate=0' ]
>
> With the msi_translate=1 turned on the DomU HVM guests did work, right?
> > > > > This has stopped the "already in use on device" log, and the devices > > appear to show correctly in the domU. Is it safe to disable > > msitranslate? as I understand it, its for allowing multifunction devices > > to be seen as such in domU. Is that correct? > > > > I haven''t been able to reproduce the dropped raid issue yet, but I am > > awaiting delivery of the Red-Fone boxes (ISDN VoIP) which seem to cause > > this due to their very high interrupt usage (2000 per second). > > OK. > > > > In the mean time, I can see the following in the qemu-dm logs now with > > the msitranslate=0 enabled. Is it anything to worry about? > > Well, the "Error" ones are pretty bad, thought I am having a hard time > understanding what it means. Lets copy some of the QEMU folks on this. > > > pt_pci_write_config: Warning: Guest attempt to set address to unused Base Address Register. [00:05.0][Offset:14h][Length:4] > > pt_ioport_map: e_phys=ffff pio_base=e880 len=32 index=2 first_map=0 > > pt_ioport_map: e_phys=c220 pio_base=e880 len=32 index=2 first_map=0 > > pt_pci_write_config: Warning: Guest attempt to set address to unused Base Address Register. [00:06.0][Offset:14h][Length:4] > > pt_ioport_map: e_phys=ffff pio_base=ec00 len=32 index=2 first_map=0 > > pt_ioport_map: e_phys=c240 pio_base=ec00 len=32 index=2 first_map=0 > > pt_msix_update_one: Update msix entry 0 with pirq 4f gvec 59 > > pt_msix_update_one: Update msix entry 1 with pirq 4e gvec 61 > > pt_msix_update_one: Update msix entry 2 with pirq 4d gvec 69 > > pt_msix_update_one: Update msix entry 3 with pirq 4c gvec 71 > > pt_msix_update_one: Update msix entry 4 with pirq 4b gvec 79 > > pci_msix_writel: Error: Can''t update msix entry 0 since MSI-X is already function. > > pci_msix_writel: Error: Can''t update msix entry 0 since MSI-X is already function. > > pci_msix_writel: Error: Can''t update msix entry 0 since MSI-X is already function. 
> > pci_msix_writel: Error: Can''t update msix entry 1 since MSI-X is already function. > > pci_msix_writel: Error: Can''t update msix entry 1 since MSI-X is already function. > > pci_msix_writel: Error: Can''t update msix entry 1 since MSI-X is already function. > > pci_msix_writel: Error: Can''t update msix entry 2 since MSI-X is already function. > > pci_msix_writel: Error: Can''t update msix entry 2 since MSI-X is already function. > > pci_msix_writel: Error: Can''t update msix entry 2 since MSI-X is already function. > > pci_msix_writel: Error: Can''t update msix entry 3 since MSI-X is already function. > > pci_msix_writel: Error: Can''t update msix entry 3 since MSI-X is already function. > > pci_msix_writel: Error: Can''t update msix entry 3 since MSI-X is already function. > > pci_msix_writel: Error: Can''t update msix entry 4 since MSI-X is already function. > > pci_msix_writel: Error: Can''t update msix entry 4 since MSI-X is already function. > > pci_msix_writel: Error: Can''t update msix entry 4 since MSI-X is already function. > > > > > > > > Not yet. Need to serial log of the Linux kernel and the Xen hypervisor when your > > > machine is toast. I mentioned in the previous email the key sequences - look on Google > > > on how to pass in SysRQ if you are using a serial concentrator. > > > > I will do this when I can get the machine to crash. > > > > Best Regards, > > Mark > > > > _______________________________________________ > > Xen-devel mailing list > > Xen-devel@lists.xensource.com > > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Sander Eikelenboom
2010-Dec-08 13:37 UTC
Re: [Xen-devel] Re: HVM DomU, msi_translate=0, MSI/MSI-X PCI passthrough fails.
Hello Mark,

Just a recap: you pass through:
- 3 physical NICs (igb)
- 1 ISDN PCI box
- all using MSI/MSI-X interrupts?

Have you tried using a PV domU instead of an HVM domU?
Have you tried passing through only the ISDN box, and letting the network run over the Xen backend/frontend, to rule out the igb/network side?

--
Sander

Wednesday, December 8, 2010, 1:58:55 PM, you wrote:

> Hi - Apologies to top post this, but after alot of testing, I believe
> there must be an issue with IRQ's going missing between domU and dom0.
> Unfortunately I have no data to prove this!

> With msitranslate=0 as detailed below, and pci=nomsi in the guest kernel
> grub config, all 3 NIC's appear OK in the domU however I still had
> issues with the red-fone ISDN box. The interrupts were showing correctly
> (2000/s) in the domU but communication to the device via the NIC was
> still being interrupted (as shown in the asterisk console)Note that to
> get the igb driver to allow this many interrupts, the
> InterruptThrottleRate was set to 0. The same config (red-fone box,
> asterisk etc) works fine with a physical server.

> There is also the additional issue that I could not get the passthrough
> NIC's to show correctly when I also had a bridge setup.

> Throughout my testing however, I could not get the machine to crash.

> Not sure where to go with this one. For now we are keeping our VoIP
> servers physical when ISDN connections are required.

> Regards,
> Mark

> On Mon, Nov 29, 2010 at 11:36:35AM -0500, Konrad Rzeszutek Wilk wrote:
>> >
>> > In my new test setup, I have seen some strange behaviour. 1 of the HVM's
>> > (with identical config in dom0 and domU) suddenly would not allow the
>> > igb driver to be loaded in domU, even though the device was visible in
>>
>> Let's create a new thread for this other issue.
>>
>> > lspci. Shutting the machine down, removing the power cord, waiting 5
>> > seconds then plugging it in again corrected that issue - Is this
>> > possibly a motherboard bug?
I have also disabled the SR-IOV >> > functionality in the BIOS incase this is causing any issues. >> > >> > In addition, to try to correct the MSI issue noted above, I have changed >> > my pci= line to the following: >> > >> > pci=[ ''08:00.0,msitranslate=0'', ''08:00.1,msitranslate=0'' ] >> >> With the msi_translate=1 turned on the DomU HVM guests did work, right? >> >> > >> > This has stopped the "already in use on device" log, and the devices >> > appear to show correctly in the domU. Is it safe to disable >> > msitranslate? as I understand it, its for allowing multifunction devices >> > to be seen as such in domU. Is that correct? >> > >> > I haven''t been able to reproduce the dropped raid issue yet, but I am >> > awaiting delivery of the Red-Fone boxes (ISDN VoIP) which seem to cause >> > this due to their very high interrupt usage (2000 per second). >> >> OK. >> > >> > In the mean time, I can see the following in the qemu-dm logs now with >> > the msitranslate=0 enabled. Is it anything to worry about? >> >> Well, the "Error" ones are pretty bad, thought I am having a hard time >> understanding what it means. Lets copy some of the QEMU folks on this. >> >> > pt_pci_write_config: Warning: Guest attempt to set address to unused Base Address Register. [00:05.0][Offset:14h][Length:4] >> > pt_ioport_map: e_phys=ffff pio_base=e880 len=32 index=2 first_map=0 >> > pt_ioport_map: e_phys=c220 pio_base=e880 len=32 index=2 first_map=0 >> > pt_pci_write_config: Warning: Guest attempt to set address to unused Base Address Register. 
[00:06.0][Offset:14h][Length:4] >> > pt_ioport_map: e_phys=ffff pio_base=ec00 len=32 index=2 first_map=0 >> > pt_ioport_map: e_phys=c240 pio_base=ec00 len=32 index=2 first_map=0 >> > pt_msix_update_one: Update msix entry 0 with pirq 4f gvec 59 >> > pt_msix_update_one: Update msix entry 1 with pirq 4e gvec 61 >> > pt_msix_update_one: Update msix entry 2 with pirq 4d gvec 69 >> > pt_msix_update_one: Update msix entry 3 with pirq 4c gvec 71 >> > pt_msix_update_one: Update msix entry 4 with pirq 4b gvec 79 >> > pci_msix_writel: Error: Can''t update msix entry 0 since MSI-X is already function. >> > pci_msix_writel: Error: Can''t update msix entry 0 since MSI-X is already function. >> > pci_msix_writel: Error: Can''t update msix entry 0 since MSI-X is already function. >> > pci_msix_writel: Error: Can''t update msix entry 1 since MSI-X is already function. >> > pci_msix_writel: Error: Can''t update msix entry 1 since MSI-X is already function. >> > pci_msix_writel: Error: Can''t update msix entry 1 since MSI-X is already function. >> > pci_msix_writel: Error: Can''t update msix entry 2 since MSI-X is already function. >> > pci_msix_writel: Error: Can''t update msix entry 2 since MSI-X is already function. >> > pci_msix_writel: Error: Can''t update msix entry 2 since MSI-X is already function. >> > pci_msix_writel: Error: Can''t update msix entry 3 since MSI-X is already function. >> > pci_msix_writel: Error: Can''t update msix entry 3 since MSI-X is already function. >> > pci_msix_writel: Error: Can''t update msix entry 3 since MSI-X is already function. >> > pci_msix_writel: Error: Can''t update msix entry 4 since MSI-X is already function. >> > pci_msix_writel: Error: Can''t update msix entry 4 since MSI-X is already function. >> > pci_msix_writel: Error: Can''t update msix entry 4 since MSI-X is already function. >> > >> > > >> > > Not yet. Need to serial log of the Linux kernel and the Xen hypervisor when your >> > > machine is toast. 
I mentioned in the previous email the key sequences - look on Google
>> > > on how to pass in SysRQ if you are using a serial concentrator.
>> >
>> > I will do this when I can get the machine to crash.
>> >
>> > Best Regards,
>> > Mark

--
Best regards,
Sander
mailto:linux@eikelenboom.it
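The 2000 interrupts per second figure quoted above can be checked in the domU and on a physical server in the same way, by diffing two snapshots of /proc/interrupts. The following is a minimal sketch under the assumption that the file has its usual layout (a header row of CPU names, then one row per IRQ with a count per CPU); the function names are made up for illustration.

```python
def parse_interrupts(text):
    """Return {irq_label: (total_count, description)} from /proc/interrupts text."""
    lines = text.strip().splitlines()
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = []
        for field in fields[:ncpu]:
            if not field.isdigit():
                break  # short rows such as ERR/MIS have fewer count columns
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        totals[label.strip()] = (sum(counts), description)
    return totals


def rates(before, after, seconds):
    """Interrupts per second for every IRQ present in both snapshots."""
    result = {}
    for label, (count, description) in after.items():
        if label in before:
            result[label] = ((count - before[label][0]) / seconds, description)
    return result


# Typical use on a live system (not executed here):
#   import time
#   first = parse_interrupts(open("/proc/interrupts").read())
#   time.sleep(10)
#   second = parse_interrupts(open("/proc/interrupts").read())
#   for irq, (rate, name) in sorted(rates(first, second, 10).items()):
#       print(irq, round(rate), name)
```

If the rate seen for the passed-through NIC in the domU matches the physical server while the Red-Fone link still drops, that points at latency or lost interrupts rather than throttling, which is the distinction this thread is trying to pin down.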
Dietmar Hahn
2010-Dec-08 13:48 UTC
Re: [Xen-users] Re: HVM DomU, msi_translate=0, MSI/MSI-X PCI passthrough fails.
Hi,

On 08.12.2010, "Mark Adams <mark@campbell-lange.net>" wrote:
> Hi - Apologies to top post this, but after alot of testing, I believe
> there must be an issue with IRQ's going missing between domU and dom0.
> Unfortunately I have no data to prove this!
>
> With msitranslate=0 as detailed below, and pci=nomsi in the guest kernel
> grub config, all 3 NIC's appear OK in the domU however I still had
> issues with the red-fone ISDN box. The interrupts were showing correctly
> (2000/s) in the domU but communication to the device via the NIC was
> still being interrupted (as shown in the asterisk console)Note that to
> get the igb driver to allow this many interrupts, the
> InterruptThrottleRate was set to 0. The same config (red-fone box,
> asterisk etc) works fine with a physical server.
>
> There is also the additional issue that I could not get the passthrough
> NIC's to show correctly when I also had a bridge setup.
>
> Throughout my testing however, I could not get the machine to crash.
>
> Not sure where to go with this one. For now we are keeping our VoIP
> servers physical when ISDN connections are required.

Today I did some tests with xen-unstable and found these problems too. I tried to pass through 2 PCI cards and got some error messages on the Xen console and in the qemu logs. With msitranslate=0 and pci=nomsi I got the soundcard working in a Linux domU, but it doesn't help on Windows. I attached the logs from the Xen serial console and the qemu logs.

Thanks!
Dietmar.

>
> Regards,
> Mark
>
> On Mon, Nov 29, 2010 at 11:36:35AM -0500, Konrad Rzeszutek Wilk wrote:
> > >
> > > In my new test setup, I have seen some strange behaviour. 1 of the HVM's
> > > (with identical config in dom0 and domU) suddenly would not allow the
> > > igb driver to be loaded in domU, even though the device was visible in
> >
> > Let's create a new thread for this other issue.
> >
> > > lspci.
Shutting the machine down, removing the power cord, waiting 5 > > > seconds then plugging it in again corrected that issue - Is this > > > possibly a motherboard bug? I have also disabled the SR-IOV > > > functionality in the BIOS incase this is causing any issues. > > > > > > In addition, to try to correct the MSI issue noted above, I have changed > > > my pci= line to the following: > > > > > > pci=[ ''08:00.0,msitranslate=0'', ''08:00.1,msitranslate=0'' ] > > > > With the msi_translate=1 turned on the DomU HVM guests did work, right? > > > > > > > > This has stopped the "already in use on device" log, and the devices > > > appear to show correctly in the domU. Is it safe to disable > > > msitranslate? as I understand it, its for allowing multifunction devices > > > to be seen as such in domU. Is that correct? > > > > > > I haven''t been able to reproduce the dropped raid issue yet, but I am > > > awaiting delivery of the Red-Fone boxes (ISDN VoIP) which seem to cause > > > this due to their very high interrupt usage (2000 per second). > > > > OK. > > > > > > In the mean time, I can see the following in the qemu-dm logs now with > > > the msitranslate=0 enabled. Is it anything to worry about? > > > > Well, the "Error" ones are pretty bad, thought I am having a hard time > > understanding what it means. Lets copy some of the QEMU folks on this. > > > > > pt_pci_write_config: Warning: Guest attempt to set address to unused Base Address Register. [00:05.0][Offset:14h][Length:4] > > > pt_ioport_map: e_phys=ffff pio_base=e880 len=32 index=2 first_map=0 > > > pt_ioport_map: e_phys=c220 pio_base=e880 len=32 index=2 first_map=0 > > > pt_pci_write_config: Warning: Guest attempt to set address to unused Base Address Register. 
[00:06.0][Offset:14h][Length:4] > > > pt_ioport_map: e_phys=ffff pio_base=ec00 len=32 index=2 first_map=0 > > > pt_ioport_map: e_phys=c240 pio_base=ec00 len=32 index=2 first_map=0 > > > pt_msix_update_one: Update msix entry 0 with pirq 4f gvec 59 > > > pt_msix_update_one: Update msix entry 1 with pirq 4e gvec 61 > > > pt_msix_update_one: Update msix entry 2 with pirq 4d gvec 69 > > > pt_msix_update_one: Update msix entry 3 with pirq 4c gvec 71 > > > pt_msix_update_one: Update msix entry 4 with pirq 4b gvec 79 > > > pci_msix_writel: Error: Can''t update msix entry 0 since MSI-X is already function. > > > pci_msix_writel: Error: Can''t update msix entry 0 since MSI-X is already function. > > > pci_msix_writel: Error: Can''t update msix entry 0 since MSI-X is already function. > > > pci_msix_writel: Error: Can''t update msix entry 1 since MSI-X is already function. > > > pci_msix_writel: Error: Can''t update msix entry 1 since MSI-X is already function. > > > pci_msix_writel: Error: Can''t update msix entry 1 since MSI-X is already function. > > > pci_msix_writel: Error: Can''t update msix entry 2 since MSI-X is already function. > > > pci_msix_writel: Error: Can''t update msix entry 2 since MSI-X is already function. > > > pci_msix_writel: Error: Can''t update msix entry 2 since MSI-X is already function. > > > pci_msix_writel: Error: Can''t update msix entry 3 since MSI-X is already function. > > > pci_msix_writel: Error: Can''t update msix entry 3 since MSI-X is already function. > > > pci_msix_writel: Error: Can''t update msix entry 3 since MSI-X is already function. > > > pci_msix_writel: Error: Can''t update msix entry 4 since MSI-X is already function. > > > pci_msix_writel: Error: Can''t update msix entry 4 since MSI-X is already function. > > > pci_msix_writel: Error: Can''t update msix entry 4 since MSI-X is already function. > > > > > > > > > > > Not yet. Need to serial log of the Linux kernel and the Xen hypervisor when your > > > > machine is toast. 
I mentioned in the previous email the key sequences - look on Google
> > > > on how to pass in SysRQ if you are using a serial concentrator.
> > >
> > > I will do this when I can get the machine to crash.
> > >
> > > Best Regards,
> > > Mark

--
Company details: http://ts.fujitsu.com/imprint.html
Mark Adams
2010-Dec-08 13:48 UTC
Re: [Xen-devel] Re: HVM DomU, msi_translate=0, MSI/MSI-X PCI passthrough fails.
On Wed, Dec 08, 2010 at 02:37:15PM +0100, Sander Eikelenboom wrote:
> Hello Mark,

Hi

>
> Just a recap:
> you pass through:
> - 3 physical nics/IGB
> - 1 ISDN pci ISDN box

The redfone box runs on one of the NICs; it's not separate. It converts ISDN to TDMoE, see http://www.red-fone.com/

> - all using msi/msi-x interrupts ?

I tried using MSI/MSI-X interrupts, but it caused the raid card to drop off (after some use) and gave seemingly even worse performance than pegging everything back to legacy interrupts.

>
> Have you tried using a PV domU instead of a HVM domU ?

I initially tried PV but had issues with the igb NICs. There was another thread somewhere about my issues with that.

> Have you tried passing through only the ISDN box, and let the network run with the xen backend/frontend to rule out the IGB/network stuff ?
>
>
> --
> Sander
>
>
>
> Wednesday, December 8, 2010, 1:58:55 PM, you wrote:
>
> > Hi - Apologies to top post this, but after alot of testing, I believe
> > there must be an issue with IRQ's going missing between domU and dom0.
> > Unfortunately I have no data to prove this!
>
> > With msitranslate=0 as detailed below, and pci=nomsi in the guest kernel
> > grub config, all 3 NIC's appear OK in the domU however I still had
> > issues with the red-fone ISDN box. The interrupts were showing correctly
> > (2000/s) in the domU but communication to the device via the NIC was
> > still being interrupted (as shown in the asterisk console)Note that to
> > get the igb driver to allow this many interrupts, the
> > InterruptThrottleRate was set to 0. The same config (red-fone box,
> > asterisk etc) works fine with a physical server.
>
> > There is also the additional issue that I could not get the passthrough
> > NIC's to show correctly when I also had a bridge setup.
>
> > Throughout my testing however, I could not get the machine to crash.
>
> > Not sure where to go with this one. For now we are keeping our VoIP
> > servers physical when ISDN connections are required.
> > > Regards, > > Mark > > > On Mon, Nov 29, 2010 at 11:36:35AM -0500, Konrad Rzeszutek Wilk wrote: > >> > > >> > In my new test setup, I have seen some strange behaviour. 1 of the HVM''s > >> > (with identical config in dom0 and domU) suddenly would not allow the > >> > igb driver to be loaded in domU, even though the device was visible in > >> > >> Let''s create a new thread for this other issue. > >> > >> > lspci. Shutting the machine down, removing the power cord, waiting 5 > >> > seconds then plugging it in again corrected that issue - Is this > >> > possibly a motherboard bug? I have also disabled the SR-IOV > >> > functionality in the BIOS incase this is causing any issues. > >> > > >> > In addition, to try to correct the MSI issue noted above, I have changed > >> > my pci= line to the following: > >> > > >> > pci=[ ''08:00.0,msitranslate=0'', ''08:00.1,msitranslate=0'' ] > >> > >> With the msi_translate=1 turned on the DomU HVM guests did work, right? > >> > >> > > >> > This has stopped the "already in use on device" log, and the devices > >> > appear to show correctly in the domU. Is it safe to disable > >> > msitranslate? as I understand it, its for allowing multifunction devices > >> > to be seen as such in domU. Is that correct? > >> > > >> > I haven''t been able to reproduce the dropped raid issue yet, but I am > >> > awaiting delivery of the Red-Fone boxes (ISDN VoIP) which seem to cause > >> > this due to their very high interrupt usage (2000 per second). > >> > >> OK. > >> > > >> > In the mean time, I can see the following in the qemu-dm logs now with > >> > the msitranslate=0 enabled. Is it anything to worry about? > >> > >> Well, the "Error" ones are pretty bad, thought I am having a hard time > >> understanding what it means. Lets copy some of the QEMU folks on this. > >> > >> > pt_pci_write_config: Warning: Guest attempt to set address to unused Base Address Register. 
[00:05.0][Offset:14h][Length:4] > >> > pt_ioport_map: e_phys=ffff pio_base=e880 len=32 index=2 first_map=0 > >> > pt_ioport_map: e_phys=c220 pio_base=e880 len=32 index=2 first_map=0 > >> > pt_pci_write_config: Warning: Guest attempt to set address to unused Base Address Register. [00:06.0][Offset:14h][Length:4] > >> > pt_ioport_map: e_phys=ffff pio_base=ec00 len=32 index=2 first_map=0 > >> > pt_ioport_map: e_phys=c240 pio_base=ec00 len=32 index=2 first_map=0 > >> > pt_msix_update_one: Update msix entry 0 with pirq 4f gvec 59 > >> > pt_msix_update_one: Update msix entry 1 with pirq 4e gvec 61 > >> > pt_msix_update_one: Update msix entry 2 with pirq 4d gvec 69 > >> > pt_msix_update_one: Update msix entry 3 with pirq 4c gvec 71 > >> > pt_msix_update_one: Update msix entry 4 with pirq 4b gvec 79 > >> > pci_msix_writel: Error: Can''t update msix entry 0 since MSI-X is already function. > >> > pci_msix_writel: Error: Can''t update msix entry 0 since MSI-X is already function. > >> > pci_msix_writel: Error: Can''t update msix entry 0 since MSI-X is already function. > >> > pci_msix_writel: Error: Can''t update msix entry 1 since MSI-X is already function. > >> > pci_msix_writel: Error: Can''t update msix entry 1 since MSI-X is already function. > >> > pci_msix_writel: Error: Can''t update msix entry 1 since MSI-X is already function. > >> > pci_msix_writel: Error: Can''t update msix entry 2 since MSI-X is already function. > >> > pci_msix_writel: Error: Can''t update msix entry 2 since MSI-X is already function. > >> > pci_msix_writel: Error: Can''t update msix entry 2 since MSI-X is already function. > >> > pci_msix_writel: Error: Can''t update msix entry 3 since MSI-X is already function. > >> > pci_msix_writel: Error: Can''t update msix entry 3 since MSI-X is already function. > >> > pci_msix_writel: Error: Can''t update msix entry 3 since MSI-X is already function. > >> > pci_msix_writel: Error: Can''t update msix entry 4 since MSI-X is already function. 
> >> > pci_msix_writel: Error: Can''t update msix entry 4 since MSI-X is already function. > >> > pci_msix_writel: Error: Can''t update msix entry 4 since MSI-X is already function. > >> > > >> > > > >> > > Not yet. Need to serial log of the Linux kernel and the Xen hypervisor when your > >> > > machine is toast. I mentioned in the previous email the key sequences - look on Google > >> > > on how to pass in SysRQ if you are using a serial concentrator. > >> > > >> > I will do this when I can get the machine to crash. > >> > > >> > Best Regards, > >> > Mark > >> > > >> > _______________________________________________ > >> > Xen-devel mailing list > >> > Xen-devel@lists.xensource.com > >> > http://lists.xensource.com/xen-devel > > > > > > -- > Best regards, > Sander mailto:linux@eikelenboom.it >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
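As a side note on the pci= line quoted above: the xm domU config file is, as far as I know, evaluated as Python, so the per-device msitranslate=0 entries can be written out by hand or generated. A minimal sketch, assuming the two igb functions 08:00.0 and 08:00.1 named in the message; `passthrough_entries` is a made-up helper for illustration, not something Xen provides.

```python
# Hypothetical helper (not part of Xen): build the entries for the
# pci= list of an xm domU config, one string per passed-through device.
def passthrough_entries(bdfs, msitranslate=0):
    # "%s" formatting is used so the snippet also works on older Pythons.
    return ["%s,msitranslate=%d" % (bdf, msitranslate) for bdf in bdfs]


# Equivalent to the line quoted in the thread:
#   pci=[ '08:00.0,msitranslate=0', '08:00.1,msitranslate=0' ]
pci = passthrough_entries(["08:00.0", "08:00.1"])
```

Note that pci=nomsi, as described in the thread, is a guest kernel command-line option set in the guest's grub config; it does not belong in this file.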
Sander Eikelenboom
2010-Dec-08 14:05 UTC
Re: [Xen-devel] Re: HVM DomU, msi_translate=0, MSI/MSI-X PCI passthrough fails.
Wednesday, December 8, 2010, 2:48:48 PM, you wrote:

> On Wed, Dec 08, 2010 at 02:37:15PM +0100, Sander Eikelenboom wrote:
>> Just a recap:
>> you pass through:
>> - 3 physical nics/IGB
>> - 1 ISDN pci ISDN box

> The redfone box runs on 1 of the nics - it's not separate. It converts
> ISDN to TDMoE see here.. http://www.red-fone.com/

So the problem is probably with the igbs.
Searching showed http://forums.virtualbox.org/viewtopic.php?f=7&t=32171 , perhaps worth a try?

Have you tried with just 1 IGB, and/or another simple 1gb NIC (non-Intel), to see if it's due to any of the special offload features?

>> - all using msi/msi-x interrupts ?

> I tried using msi/msi-x interrupts, but it caused the raid card to drop
> off (after some use) and provided seemingly even worse performance than
> pegging everything back to legacy.

>> Have you tried using a PV domU instead of a HVM domU ?

> I initially tried PV but had issues with the igb NICs. There was
> another thread somewhere about my issues with that.

>> Have you tried passing through only the ISDN box, and let the network run with the xen backend/frontend to rule out the IGB/network stuff ?

--
Best regards,
Sander  mailto:linux@eikelenboom.it
Mark Adams
2010-Dec-08 15:48 UTC
Re: [Xen-users] Re: [Xen-devel] Re: HVM DomU, msi_translate=0, MSI/MSI-X PCI passthrough fails.
On Wed, Dec 08, 2010 at 03:05:50PM +0100, Sander Eikelenboom wrote:
> > The redfone box runs on 1 of the nics - it's not separate. It converts
> > ISDN to TDMoE see here.. http://www.red-fone.com/
>
> So the problem is probably with the igbs.
> Searching showed http://forums.virtualbox.org/viewtopic.php?f=7&t=32171 , perhaps worth a try ?

Tried this - doesn't help.

> Have you tried with just 1 IGB, and/or another simple 1gb NIC (non intel) to see if it's due to any of the special offload features ?

Haven't got any other NICs to try, unfortunately. Even if it did work with 1, it would be no use to me as I need 3.
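The suspicion voiced earlier in this thread, that IRQs go missing between dom0 and domU, came without data. One way to gather some is to sample /proc/interrupts on both sides and compare rates against the roughly 2000 interrupts per second mentioned for the red-fone setup. A rough sketch, assuming the usual /proc/interrupts layout (a header row of CPU names, then one row per IRQ); the function names and the "eth0" label are illustrative only.

```python
import time


def irq_counts(text, needle):
    """Sum the per-CPU counters of every /proc/interrupts row mentioning needle."""
    lines = text.splitlines()
    if not lines:
        return 0
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    total = 0
    for line in lines[1:]:
        if needle not in line:
            continue
        fields = line.split()
        # fields[0] is the IRQ label ("48:"), followed by one counter per CPU.
        counters = fields[1:1 + ncpus]
        total += sum(int(c) for c in counters if c.isdigit())
    return total


def irq_rate(before, after, needle, seconds):
    """Interrupts per second between two snapshots taken `seconds` apart."""
    delta = irq_counts(after, needle) - irq_counts(before, needle)
    return delta / float(seconds)


def measure(needle, seconds=5, path="/proc/interrupts"):
    """Take two live snapshots and return the rate for rows matching needle."""
    with open(path) as f:
        before = f.read()
    time.sleep(seconds)
    with open(path) as f:
        after = f.read()
    return irq_rate(before, after, needle, seconds)
```

Running something like `measure("eth0")` in the domU and against the matching interrupt line in dom0 at the same time would show whether the two sides agree; a persistent gap would at least be something concrete to post.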
Sander Eikelenboom
2010-Dec-08 16:44 UTC
Re: [Xen-users] Re: [Xen-devel] Re: HVM DomU, msi_translate=0, MSI/MSI-X PCI passthrough fails.
Wednesday, December 8, 2010, 4:48:57 PM, you wrote:

>> Have you tried with just 1 IGB, and/or another simple 1gb NIC (non intel) to see if it's due to any of the special offload features ?

> Haven't got any other NICs to try unfortunately. Even if it did work
> with 1, it would be no use to me as I need 3.

I understand, but simplifying the setup and trying to isolate the problem could clarify things.

I also read your previous thread, and I saw you hide 02:00.0 and 03:00.0 with xen-pciback (e1000e driver) there, but now you seem to be passing through 08:00.0 and 08:00.1 (igb)?
So I assume you have already tried 2 different NICs.

http://download.intel.com/design/network/specupdt/82574.pdf does, though, show some errata regarding MSI-X interrupts and timing issues, with workarounds, on the 82574 (02:00.0 and 03:00.0) NICs.

--
Best regards,
Sander  mailto:linux@eikelenboom.it
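Since the question above turns on which driver sits behind 02:00.0/03:00.0 versus 08:00.0/08:00.1, it can help to read that straight from sysfs rather than from memory. A small sketch, assuming the standard Linux layout where each device directory under /sys/bus/pci/devices has a `driver` symlink when a driver is bound; `bound_driver` is a hypothetical name, not an existing tool.

```python
import os


def bound_driver(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Return the name of the kernel driver bound to a PCI device, or None."""
    if bdf.count(":") == 1:
        # Short form "08:00.0" lacks the PCI domain; assume domain 0000.
        bdf = "0000:" + bdf
    link = os.path.join(sysfs_root, bdf, "driver")
    if not os.path.islink(link):
        return None  # no such device, or no driver bound to it
    # The symlink target ends in the driver name, e.g. ".../drivers/igb".
    return os.path.basename(os.readlink(link))
```

Called in dom0 for each address in the thread, this would be expected to report e1000e or igb for a NIC that dom0 still owns, and whatever name the pciback driver registers under for a hidden one.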
Konrad Rzeszutek Wilk
2010-Dec-08 17:01 UTC
[Xen-users] Re: [Xen-devel] Re: HVM DomU, msi_translate=0, MSI/MSI-X PCI passthrough fails.
> > Just a recap:
> > you pass through:
> > - 3 physical nics/IGB
> > - 1 ISDN pci ISDN box
>
> The redfone box runs on 1 of the nics - it's not separate. It converts
> ISDN to TDMoE see here.. http://www.red-fone.com/
>
> > - all using msi/msi-x interrupts ?
>
> I tried using msi/msi-x interrupts, but it caused the raid card to drop
> off (after some use) and provided seemingly even worse performance than
> pegging everything back to legacy.

Were you able to get a serial log and hit all of the different debug options when the RAID card died?

> > Have you tried using a PV domU instead of a HVM domU ?
>
> I initially tried PV but had issues with the igb NICs. There was
> another thread somewhere about my issues with that.

Hmm, the only other thread I see from you is about the RAID.

> > Have you tried passing through only the ISDN box, and let the network run with the xen backend/frontend to rule out the IGB/network stuff ?

Mr. Sander's idea of isolating things piece by piece is the right way.

There is a bunch of warnings in the QEMU output - some of them quite .. troubling.

You seem to have issues with gntdev (as in, not found), but if you are using OpenSuSE then that would work - I think. When you tested xen-unstable did you use the OpenSUSE kernel or the PV-OPS one?
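To make the QEMU warnings easier to compare between runs (for example with and without msitranslate=0), the repeated pci_msix_writel errors can be tallied per MSI-X entry. A minimal sketch, assuming the log text matches the lines quoted earlier in this thread; the function name is illustrative, and the doubled-apostrophe handling only covers the way this list archive renders "Can't".

```python
import re
from collections import Counter

# Matches the error line quoted in the thread and captures the entry number.
MSIX_ERROR = re.compile(r"pci_msix_writel: Error: Can't update msix entry (\d+)")


def msix_error_counts(log_text):
    """Count pci_msix_writel errors per MSI-X table entry."""
    # The archive doubles apostrophes (Can''t); normalise before matching.
    normalised = log_text.replace("''", "'")
    return Counter(int(m.group(1)) for m in MSIX_ERROR.finditer(normalised))
```

Fed the qemu-dm log of the affected domU, this gives one count per entry, which is a compact thing to paste into a follow-up instead of the full log.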
Sander Eikelenboom
2010-Dec-08 17:15 UTC
Re: [Xen-devel] Re: HVM DomU, msi_translate=0, MSI/MSI-X PCI passthrough fails.
It seems Jeremy has had some trouble with the Intel 82574L as well. I don't know if he has got those issues resolved, or has to use workarounds:
http://www.gossamer-threads.com/lists/linux/kernel/1166308

What kernel are you using in your domU? Also the 2.6.32 Debian one?
Perhaps it's also worth a try to test a newer Intel driver/kernel?

--
Sander
Sander Eikelenboom
2010-Dec-08 17:18 UTC
[Xen-users] Re: [Xen-devel] Re: HVM DomU, msi_translate=0, MSI/MSI-X PCI passthrough fails.
And another thread with problems and strange IRQ reports regarding the 82574:
http://www.linuxquestions.org/questions/linux-hardware-18/intel-82574l-gigabit-network-card-issues-and-resolution-831364/

(please Mr. Konrad, no "Mr.", that makes me feel really really old ;-) )

--
Best regards,
Sander mailto:linux@eikelenboom.it
Konrad Rzeszutek Wilk
2010-Dec-08 17:43 UTC
[Xen-users] Re: [Xen-devel] Re: HVM DomU, msi_translate=0, MSI/MSI-X PCI passthrough fails.
On Wed, Dec 08, 2010 at 06:18:06PM +0100, Sander Eikelenboom wrote:
> And another thread with problems and strange irq reports with regard to the 82574
> http://www.linuxquestions.org/questions/linux-hardware-18/intel-82574l-gigabit-network-card-issues-and-resolution-831364/

Ugh, that device sure looks to have some faults.

So I think I've confused myself. There was another person who tried a similar pass-through of a sound card to an HVM guest. It worked, but not very well, and it spat out lots of warnings. Those are quite different from what Mark had, though.

Mark, right now we are all busy trying to get patches ready for 2.6.38, which is why we have been slow to respond to you or to try to reproduce this on our machines.

The RAID problem is troubling, but the neat thing about it is that it hangs your machine, so if you hit all of those debug options via the serial console, it can help us narrow down where the problem is.

The other issues you have - well, there are many possibilities (and it might very well be the same issue you are hitting with the RAID card), and narrowing it down to the exact cause (say, it might be what Sander suggested: one of the NICs is just funky or perhaps needs a firmware update) can help here.

The latest kernel is 2.6.32.26 (I think?), and the latest xen-unstable.hg has some fixes for MSI ownership and for some IRQ migration issues.

> (please Mr. Konrad, no "Mr." that makes me feel really really old ;-) )

Sure thing :-)
Jeremy Fitzhardinge
2010-Dec-08 19:51 UTC
Re: [Xen-devel] Re: HVM DomU, msi_translate=0, MSI/MSI-X PCI passthrough fails.
On 12/08/2010 09:15 AM, Sander Eikelenboom wrote:
> It seems Jeremy has had some troubles with the Intel 82574L as well.
>
> Don't know if he has got those resolved, or has to use workarounds ?

It's basically a bug in that particular chip, I think. The workaround is to disable ASPM for it; I'm not sure if the upstream driver does that yet, but the workaround used to be to completely disable ASPM (pcie_aspm=off on the kernel command line, and in the BIOS).

J
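The workaround above only helps if pcie_aspm=off actually reaches the kernel, so a quick look at the command line can rule out a mistake in the grub entry. What follows is a minimal sketch under stated assumptions, not something posted in the thread: has_kernel_param is a hypothetical helper name, and it only inspects a command line string. It says nothing about the BIOS setting or about what the NIC driver itself does with ASPM.

```shell
#!/bin/sh
# Sketch only: has_kernel_param is a hypothetical helper, not part of Xen
# or of any kernel tool.
#
# Succeeds (exit status 0) if the parameter given in $2 appears as a whole,
# space-separated word in the kernel command line string given in $1.
has_kernel_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;   # exact word found
        *)        return 1 ;;   # missing, or only a partial match
    esac
}

# Report on the running kernel when /proc/cmdline is readable.
if [ -r /proc/cmdline ]; then
    if has_kernel_param "$(cat /proc/cmdline)" "pcie_aspm=off"; then
        echo "pcie_aspm=off is set"
    else
        echo "pcie_aspm=off is not set"
    fi
fi
```

Matching whole words avoids a false positive on a longer value that merely starts with the same text. The same helper could be pointed at other parameters mentioned in this thread, such as pci=nomsi in the guest.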
Mark Adams
2010-Dec-09 10:39 UTC
Re: [Xen-users] Re: [Xen-devel] Re: HVM DomU, msi_translate=0, MSI/MSI-X PCI passthrough fails.
On Wed, Dec 08, 2010 at 05:44:39PM +0100, Sander Eikelenboom wrote:
> Wednesday, December 8, 2010, 4:48:57 PM, you wrote:
>
> > On Wed, Dec 08, 2010 at 03:05:50PM +0100, Sander Eikelenboom wrote:
> >>
> >> Wednesday, December 8, 2010, 2:48:48 PM, you wrote:
> >>
> >> > On Wed, Dec 08, 2010 at 02:37:15PM +0100, Sander Eikelenboom wrote:
> >> >> Hello Mark,
> >>
> >> > Hi
> >>
> >> >>
> >> >> Just a recap:
> >> >> you pass through:
> >> >> - 3 physical nics/IGB
> >> >> - 1 ISDN pci ISDN box
> >>
> >> > The redfone box runs on 1 of the nics - it's not separate. It converts
> >> > ISDN to TDMoE see here.. http://www.red-fone.com/
> >>
> >> So the problem is probably with the igbs.
> >> Searching showed http://forums.virtualbox.org/viewtopic.php?f=7&t=32171 , perhaps worth a try ?
>
> > Tried this - doesn't help.
>
> >>
> >> Have you tried with just 1 IGB, and/or another simple 1gb NIC (non intel) to see if it's due to any of the special offload features ?
>
> > Haven't got any other NICs to try unfortunately. Even if it did work
> > with 1, it would be no use to me as I need 3.
>
> I understand, but simplifying the setup and trying to isolate the problem, could clarify things.
>
> I also read your previous thread, and I saw you hide the 02:00.0 and 03:00.0 with xen-pciback (e1000e driver) there, but now you seem to be passing through 08:00.0 and 08:00.1 (igb) ?
> So I assume you have already tried 2 different NICs
>
> http://download.intel.com/design/network/specupdt/82574.pdf though shows some errata regarding msi-x interrupts and timing issues and workarounds on the 82574 (02:00.0 and 03:00.0) nics.
>

I was initially using the onboard NICs (e1000e) when I had the crashing problem. To try to get around this, I disabled all the MSI-based options I could find, which seemed to correct the crashing issue. To do this I needed 3 NICs, because bridging would not work at the same time as passthrough (it would not show all the devices being passed through?), hence I started using the igb-based NIC card that's also in the machine.

Unfortunately the servers I've been testing on need to go into production now, so I can't test any further (hence sticking the VoIP stuff on a physical box). Xen works really well for me when I don't use pci-passthrough!

Regards,
Mark

> --
> Sander
>
> >>
> >> >> - all using msi/msi-x interrupts ?
> >>
> >> > I tried using msi/msi-x interrupts, but it caused the raid card to drop
> >> > off (after some use) and provided seemingly even worse performance than
> >> > pegging everything back to legacy.
> >>
> >> >>
> >> >> Have you tried using a PV domU instead of a HVM domU ?
> >>
> >> > I initially tried PV but had issues with the igb NICs. There was
> >> > another thread somewhere about my issues with that.
> >>
> >> >> Have you tried passing through only the ISDN box, and let the network run with the xen backend/frontend to rule out the IGB/network stuff ?
> >> >>
> >> >> --
> >> >> Sander
> >> >>
> >> >> Wednesday, December 8, 2010, 1:58:55 PM, you wrote:
> >> >>
> >> >> > Hi - Apologies to top post this, but after a lot of testing, I believe
> >> >> > there must be an issue with IRQs going missing between domU and dom0.
> >> >> > Unfortunately I have no data to prove this!
> >> >>
> >> >> > With msitranslate=0 as detailed below, and pci=nomsi in the guest kernel
> >> >> > grub config, all 3 NICs appear OK in the domU however I still had
> >> >> > issues with the red-fone ISDN box. The interrupts were showing correctly
> >> >> > (2000/s) in the domU but communication to the device via the NIC was
> >> >> > still being interrupted (as shown in the asterisk console). Note that to
> >> >> > get the igb driver to allow this many interrupts, the
> >> >> > InterruptThrottleRate was set to 0. The same config (red-fone box,
> >> >> > asterisk etc) works fine with a physical server.
> >> >>
> >> >> > There is also the additional issue that I could not get the passthrough
> >> >> > NICs to show correctly when I also had a bridge setup.
> >> >>
> >> >> > Throughout my testing however, I could not get the machine to crash.
> >> >>
> >> >> > Not sure where to go with this one. For now we are keeping our VoIP
> >> >> > servers physical when ISDN connections are required.
> >> >>
> >> >> > Regards,
> >> >> > Mark
> >> >>
> >> >> > On Mon, Nov 29, 2010 at 11:36:35AM -0500, Konrad Rzeszutek Wilk wrote:
> >> >> >> >
> >> >> >> > In my new test setup, I have seen some strange behaviour. 1 of the HVMs
> >> >> >> > (with identical config in dom0 and domU) suddenly would not allow the
> >> >> >> > igb driver to be loaded in domU, even though the device was visible in
> >> >> >>
> >> >> >> Let's create a new thread for this other issue.
> >> >> >>
> >> >> >> > lspci. Shutting the machine down, removing the power cord, waiting 5
> >> >> >> > seconds then plugging it in again corrected that issue - Is this
> >> >> >> > possibly a motherboard bug? I have also disabled the SR-IOV
> >> >> >> > functionality in the BIOS in case this is causing any issues.
> >> >> >> >
> >> >> >> > In addition, to try to correct the MSI issue noted above, I have changed
> >> >> >> > my pci= line to the following:
> >> >> >> >
> >> >> >> > pci=[ '08:00.0,msitranslate=0', '08:00.1,msitranslate=0' ]
> >> >> >>
> >> >> >> With the msi_translate=1 turned on the DomU HVM guests did work, right?
> >> >> >>
> >> >> >> >
> >> >> >> > This has stopped the "already in use on device" log, and the devices
> >> >> >> > appear to show correctly in the domU. Is it safe to disable
> >> >> >> > msitranslate? As I understand it, it's for allowing multifunction devices
> >> >> >> > to be seen as such in domU. Is that correct?
> >> >> >> >
> >> >> >> > I haven't been able to reproduce the dropped raid issue yet, but I am
> >> >> >> > awaiting delivery of the Red-Fone boxes (ISDN VoIP) which seem to cause
> >> >> >> > this due to their very high interrupt usage (2000 per second).
> >> >> >>
> >> >> >> OK.
> >> >> >> >
> >> >> >> > In the mean time, I can see the following in the qemu-dm logs now with
> >> >> >> > the msitranslate=0 enabled. Is it anything to worry about?
> >> >> >>
> >> >> >> Well, the "Error" ones are pretty bad, though I am having a hard time
> >> >> >> understanding what it means. Let's copy some of the QEMU folks on this.
> >> >> >>
> >> >> >> > pt_pci_write_config: Warning: Guest attempt to set address to unused Base Address Register. [00:05.0][Offset:14h][Length:4]
> >> >> >> > pt_ioport_map: e_phys=ffff pio_base=e880 len=32 index=2 first_map=0
> >> >> >> > pt_ioport_map: e_phys=c220 pio_base=e880 len=32 index=2 first_map=0
> >> >> >> > pt_pci_write_config: Warning: Guest attempt to set address to unused Base Address Register. [00:06.0][Offset:14h][Length:4]
> >> >> >> > pt_ioport_map: e_phys=ffff pio_base=ec00 len=32 index=2 first_map=0
> >> >> >> > pt_ioport_map: e_phys=c240 pio_base=ec00 len=32 index=2 first_map=0
> >> >> >> > pt_msix_update_one: Update msix entry 0 with pirq 4f gvec 59
> >> >> >> > pt_msix_update_one: Update msix entry 1 with pirq 4e gvec 61
> >> >> >> > pt_msix_update_one: Update msix entry 2 with pirq 4d gvec 69
> >> >> >> > pt_msix_update_one: Update msix entry 3 with pirq 4c gvec 71
> >> >> >> > pt_msix_update_one: Update msix entry 4 with pirq 4b gvec 79
> >> >> >> > pci_msix_writel: Error: Can't update msix entry 0 since MSI-X is already function.
> >> >> >> > pci_msix_writel: Error: Can't update msix entry 0 since MSI-X is already function.
> >> >> >> > pci_msix_writel: Error: Can't update msix entry 0 since MSI-X is already function.
> >> >> >> > pci_msix_writel: Error: Can't update msix entry 1 since MSI-X is already function.
> >> >> >> > pci_msix_writel: Error: Can't update msix entry 1 since MSI-X is already function.
> >> >> >> > pci_msix_writel: Error: Can't update msix entry 1 since MSI-X is already function.
> >> >> >> > pci_msix_writel: Error: Can't update msix entry 2 since MSI-X is already function.
> >> >> >> > pci_msix_writel: Error: Can't update msix entry 2 since MSI-X is already function.
> >> >> >> > pci_msix_writel: Error: Can't update msix entry 2 since MSI-X is already function.
> >> >> >> > pci_msix_writel: Error: Can't update msix entry 3 since MSI-X is already function.
> >> >> >> > pci_msix_writel: Error: Can't update msix entry 3 since MSI-X is already function.
> >> >> >> > pci_msix_writel: Error: Can't update msix entry 3 since MSI-X is already function.
> >> >> >> > pci_msix_writel: Error: Can't update msix entry 4 since MSI-X is already function.
> >> >> >> > pci_msix_writel: Error: Can't update msix entry 4 since MSI-X is already function.
> >> >> >> > pci_msix_writel: Error: Can't update msix entry 4 since MSI-X is already function.
> >> >> >> >
> >> >> >> > >
> >> >> >> > > Not yet. Need a serial log of the Linux kernel and the Xen hypervisor when your
> >> >> >> > > machine is toast. I mentioned in the previous email the key sequences - look on Google
> >> >> >> > > on how to pass in SysRQ if you are using a serial concentrator.
> >> >> >> >
> >> >> >> > I will do this when I can get the machine to crash.
> >> >> >> >
> >> >> >> > Best Regards,
> >> >> >> > Mark
Mark Adams
2010-Dec-09 10:49 UTC
[Xen-users] Re: [Xen-devel] Re: HVM DomU, msi_translate=0, MSI/MSI-X PCI passthrough fails.
On Wed, Dec 08, 2010 at 12:01:25PM -0500, Konrad Rzeszutek Wilk wrote:
> > > Just a recap:
> > > you pass through:
> > > - 3 physical nics/IGB
> > > - 1 ISDN pci ISDN box
> >
> > The redfone box runs on 1 of the nics - it's not separate. It converts
> > ISDN to TDMoE see here.. http://www.red-fone.com/
> >
> > > - all using msi/msi-x interrupts ?
> >
> > I tried using msi/msi-x interrupts, but it caused the raid card to drop
> > off (after some use) and provided seemingly even worse performance than
> > pegging everything back to legacy.
>
> Were you able to get a serial log and hit all of the different debug options
> when the RAID card died?

I didn't get it to crash - I have spent my limited time with the hardware trying to get the setup to work effectively on the current Debian Xen packages (hence disabling all the MSI stuff, which seemed to be the problem). I understand, however, that this isn't helpful in terms of getting whatever bugs may be there fixed for the future!

> > > Have you tried using a PV domU instead of a HVM domU ?
> >
> > I initially tried PV but had issues with the igb NICs. There was
> > another thread somewhere about my issues with that.
>
> Hmm, the only other thread I see from you is about the RAID.
>
> > > Have you tried passing through only the ISDN box, and let the network run with the xen backend/frontend to rule out the IGB/network stuff ?
>
> Mr. Sander's idea of isolating one piece by piece is the right way.
>
> There is a bunch of warnings in the QEMU output - some of them quite .. troubling.
>
> You seem to have issues with gntdev (as in, not found), but if you are using OpenSuSE
> then that would work - I think. When you tested xen-unstable did you use the
> OpenSUSE kernel or the PV-OPS one?

I'm running the Debian Xen packages in squeeze (4.0.1-1 and 2.6.32-28) - I haven't stepped outside the packages at all.

Regards,
Mark
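One way to confirm that disabling the MSI options really left a passed-through NIC on legacy interrupts is to look at the interrupt type the guest reports in /proc/interrupts. The following is a minimal sketch, not something from the thread. It assumes an HVM guest where the type column reads as on a native kernel (for example IO-APIC-fasteoi or PCI-MSI-edge). irq_mode_for is a hypothetical helper name, and the device names in the usage comment are placeholders rather than the names on Mark's machines.

```shell
#!/bin/sh
# Sketch only: irq_mode_for is a hypothetical helper, not part of Xen.
#
# Reads /proc/interrupts-style text on stdin and prints one word for the
# device name given in $1:
#   msi    - at least one matching line is a PCI-MSI interrupt
#   legacy - matching lines exist, but none of them is PCI-MSI
#   none   - no line mentions the device at all
# Matching is a plain substring test, so "eth1" also matches "eth10".
irq_mode_for() {
    awk -v dev="$1" '
        index($0, dev) > 0 {
            found = 1
            if ($0 ~ /PCI-MSI/) msi = 1
        }
        END {
            if (!found)   print "none"
            else if (msi) print "msi"
            else          print "legacy"
        }
    '
}

# Example call inside the guest (eth0 is a placeholder device name):
#   irq_mode_for eth0 < /proc/interrupts
```

A result of "legacy" for all three NICs would be consistent with pci=nomsi having taken effect in the guest. It would not by itself explain the dropped traffic to the ISDN box.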