thr3ads.net - Xen devel - Status of FLR in Xen 4.4 [Sep 2013]

Matthias

2013-Sep-26 16:05 UTC

Status of FLR in Xen 4.4

Hi everyone,

I would like to ask what the current status of FLR, or better of FLR
emulation is in latest Xen and if we can expect better support in the
future.

I'm asking because with xl (latest build and traditional qemu, not
upstream), I have always had problems rebooting domUs that have VGA cards
passed through to them. Apparently the cards don't get reinitialized,
which then causes bluescreens (Windows), black screens (Linux) or a
complete freeze of the dom0. As far as I understand, this is because the
VGA cards lack FLR capability (lspci -vvv shows FLReset-). Lately
rebooting sometimes works on Windows, but it never works on Linux domUs.
It appears that xl is simply not capable of dealing with reboots of domUs
that have non-FLR VGA cards passed through, and I have to reboot the
dom0 to get the VGA cards running again.
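
The FLReset flag mentioned above can be checked with a small helper. This is a minimal sketch under stated assumptions: the name flr_capable is made up for illustration, and it looks only at the PCIe DevCap flag that lspci prints, not at the PCI Advanced Features capability.

```shell
#!/bin/sh
# flr_capable is a hypothetical helper name, not part of lspci or xl.
# It reads `lspci -vvv` text on stdin and succeeds only if the
# PCIe Device Capabilities line advertises Function Level Reset.
flr_capable() {
    grep -q 'FLReset+'
}
```

It could be used as `lspci -vvv -s 01:00.0 | flr_capable && echo "FLR advertised"`, where 01:00.0 is only an example address. lspci usually needs root to show the capability list.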

Is this the current status or is this supposed to work and I only have a
problem on my setup?

Also, I'm specifically referring to xl because back when I used
xm with Xen 4.0 and 4.1, this was never an issue and I could reboot both
Linux and Windows domUs as often as I wanted (with the same
hardware setup I now use with xl). So it seems to me that it is
possible to handle non-FLR VGA cards gracefully, but xl simply
isn't capable of that or does not do it.

It would be great to have a quick roundup of the current situation and
future plans, because I'm planning a project that uses Xen's VGA
passthrough in a cloud / big data setup, and the unreliable reboot
behaviour is currently a deal breaker for me.

Thanks in advance!


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Ian Campbell

2013-Sep-26 16:16 UTC

Re: Status of FLR in Xen 4.4

On Thu, 2013-09-26 at 18:05 +0200, Matthias wrote:
> I would like to ask what the current status of FLR, or better of FLR
> emulation is in latest Xen and if we can expect better support in the
> future.
> 
> Is this the current status or is this supposed to work and I only have
> a problem on my setup?

xl simply asks the dom0 kernel to reset the card, so this is entirely
dependent on the functionality of your dom0 kernel and/or the features
of the particular hardware WRT allowing things to be reset.

> Also, I'm specifically referring to xl because back in the day when I
> used xm with xen 4.0 and 4.1, this never was an issue and i could
> reboot both linux and windows domUs without issues as often as I
> wanted (with the same hardware setup I now use with xl). So to me it
> seems that there is a possibility to handle non-FLR'ed vga cards
> gracefully, but xl simply isn't capable of that / does not do that.

This I'm afraid I don't know enough about to comment much.

tools/python/xen/util/pci.py appears to implement various FLR quirks for
bits of hardware, including some GFX from the looks of things.

These all belong in the upstream Linux kernel these days. You don't say
which kernel you are using but you could try updating it.

You could also check the kernel source for a quirk for your particular
hardware.
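
Going by the patch comment later in this thread, the toolstack's reset request goes through the per-device 'reset' file in sysfs, and that file is missing when the kernel knows no reset method for the function. A minimal sketch for checking this follows; the helper names and the SYSFS_PCI override are assumptions made for illustration, not anything xl provides.

```shell
#!/bin/sh
# SYSFS_PCI can be overridden so the check can be tried on a scratch directory.
SYSFS_PCI="${SYSFS_PCI:-/sys/bus/pci/devices}"

# has_reset_node and describe_reset are hypothetical helper names.
# $1: full PCI address, e.g. 0000:01:00.0 (an example address only)
has_reset_node() {
    [ -e "$SYSFS_PCI/$1/reset" ]
}

describe_reset() {
    if has_reset_node "$1"; then
        echo "$1: reset node present"
    else
        echo "$1: no reset node"
    fi
}
```

Running describe_reset for each function of a passed-through card shows whether dom0 offers anything for the toolstack to write to.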

Ian.

David Vrabel

2013-Sep-26 16:20 UTC

Re: Status of FLR in Xen 4.4

On 26/09/13 17:05, Matthias wrote:
> Hi everyone,
>
> I would like to ask what the current status of FLR, or better of FLR
> emulation is in latest Xen and if we can expect better support in the
> future.

What are these cards, are they multi-function and do they actually
support FLR?  Many graphics cards do not.

I have the following hack to pciback to fall back to a bus reset for
multi-function devices without FLR.  Does it help for your use case?
You will need to ensure that all functions are co-assigned to the same
domain.

David

8<---------------------------------------
diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index 4e8ba38..5a03e63 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -14,6 +14,7 @@
 #include <linux/wait.h>
 #include <linux/sched.h>
 #include <linux/atomic.h>
+#include <linux/delay.h>
 #include <xen/events.h>
 #include <asm/xen/pci.h>
 #include <asm/xen/hypervisor.h>
@@ -43,6 +44,7 @@ struct pcistub_device {
 	struct kref kref;
 	struct list_head dev_list;
 	spinlock_t lock;
+	bool created_reset_file;

 	struct pci_dev *dev;
 	struct xen_pcibk_device *pdev;/* non-NULL if struct pci_dev is in use */
@@ -60,6 +62,114 @@ static LIST_HEAD(pcistub_devices);
 static int initialize_devices;
 static LIST_HEAD(seized_devices);

+/*
+ * pci_reset_function() will only work if there is a mechanism to
+ * reset that single function (e.g., FLR or a D-state transition).
+ * For PCI hardware that has two or more functions but no per-function
+ * reset, we can do a bus reset iff all the functions are co-assigned
+ * to the same domain.
+ *
+ * If a function has no per-function reset mechanism the 'reset' sysfs
+ * file that the toolstack uses to reset a function prior to assigning
+ * the device will be missing.  In this case, pciback adds its own
+ * which will try a bus reset.
+ *
+ * Note: pciback does not check for co-assignment before doing a bus
+ * reset, only that the devices are bound to pciback.  The toolstack
+ * is assumed to have done the right thing.
+ */
+static int __pcistub_reset_function(struct pci_dev *dev)
+{
+	struct pci_dev *pdev;
+	u16 ctrl;
+	int ret;
+
+	ret = __pci_reset_function_locked(dev);
+	if (ret == 0)
+		return 0;
+
+	if (pci_is_root_bus(dev->bus) || dev->subordinate || !dev->bus->self)
+		return -ENOTTY;
+
+	list_for_each_entry(pdev, &dev->bus->devices, bus_list) {
+		if (pdev != dev && (!pdev->driver
+				    || strcmp(pdev->driver->name, "pciback")))
+			return -ENOTTY;
+		pci_save_state(pdev);
+	}
+
+	pci_read_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, &ctrl);
+	ctrl |= PCI_BRIDGE_CTL_BUS_RESET;
+	pci_write_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, ctrl);
+	msleep(200);
+
+	ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
+	pci_write_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, ctrl);
+	msleep(200);
+
+	list_for_each_entry(pdev, &dev->bus->devices, bus_list)
+		pci_restore_state(pdev);
+
+	return 0;
+}
+
+static int pcistub_reset_function(struct pci_dev *dev)
+{
+	int ret;
+
+	device_lock(&dev->dev);
+	ret = __pcistub_reset_function(dev);
+	device_unlock(&dev->dev);
+
+	return ret;
+}
+
+static ssize_t pcistub_reset_store(struct device *dev,
+				   struct device_attribute *attr,
+				   const char *buf, size_t count)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	unsigned long val;
+	ssize_t result = strict_strtoul(buf, 0, &val);
+
+	if (result < 0)
+		return result;
+
+	if (val != 1)
+		return -EINVAL;
+
+	result = pcistub_reset_function(pdev);
+	if (result < 0)
+		return result;
+	return count;
+}
+static DEVICE_ATTR(reset, 0200, NULL, pcistub_reset_store);
+
+static int pcistub_try_create_reset_file(struct pcistub_device *psdev)
+{
+	struct device *dev = &psdev->dev->dev;
+	struct sysfs_dirent *reset_dirent;
+	int ret;
+
+	reset_dirent = sysfs_get_dirent(dev->kobj.sd, NULL, "reset");
+	if (reset_dirent) {
+		sysfs_put(reset_dirent);
+		return 0;
+	}
+
+	ret = device_create_file(dev, &dev_attr_reset);
+	if (ret < 0)
+		return ret;
+	psdev->created_reset_file = true;
+	return 0;
+}
+
+static void pcistub_remove_reset_file(struct pcistub_device *psdev)
+{
+	if (psdev && psdev->created_reset_file)
+		device_remove_file(&psdev->dev->dev, &dev_attr_reset);
+}
+
 static struct pcistub_device *pcistub_device_alloc(struct pci_dev *dev)
 {
 	struct pcistub_device *psdev;
@@ -95,12 +205,15 @@ static void pcistub_device_release(struct kref *kref)

 	dev_dbg(&dev->dev, "pcistub_device_release\n");

+	pcistub_remove_reset_file(psdev);
+
 	xen_unregister_device_domain_owner(dev);

 	/* Call the reset function which does not take lock as this
 	 * is called from "unbind" which takes a device_lock mutex.
 	 */
-	__pci_reset_function_locked(dev);
+	__pcistub_reset_function(psdev->dev);
+
 	if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
 		dev_dbg(&dev->dev, "Could not reload PCI state\n");
 	else
@@ -268,7 +381,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
 	/* This is OK - we are running from workqueue context
 	 * and want to inhibit the user from fiddling with 'reset'
 	 */
-	pci_reset_function(dev);
+	pcistub_reset_function(psdev->dev);
 	pci_restore_state(psdev->dev);

 	/* This disables the device. */
@@ -392,7 +505,7 @@ static int pcistub_init_device(struct pci_dev *dev)
 		dev_err(&dev->dev, "Could not store PCI conf saved state!\n");
 	else {
 		dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
-		__pci_reset_function_locked(dev);
+		__pcistub_reset_function(dev);
 		pci_restore_state(dev);
 	}
 	/* Now disable the device (this also ensures some private device
@@ -467,6 +580,10 @@ static int pcistub_seize(struct pci_dev *dev)
 	if (!psdev)
 		return -ENOMEM;

+	err = pcistub_try_create_reset_file(psdev);
+	if (err < 0)
+		goto out;
+
 	spin_lock_irqsave(&pcistub_devices_lock, flags);

 	if (initialize_devices) {
@@ -485,10 +602,9 @@ static int pcistub_seize(struct pci_dev *dev)
 	}

 	spin_unlock_irqrestore(&pcistub_devices_lock, flags);
-
+out:
 	if (err)
 		pcistub_device_put(psdev);
-
 	return err;
 }

Ross Philipson

2013-Sep-26 17:48 UTC

Re: Status of FLR in Xen 4.4

On 09/26/2013 12:20 PM, David Vrabel wrote:
> On 26/09/13 17:05, Matthias wrote:
>> Hi everyone,
>>
>> I would like to ask what the current status of FLR, or better of FLR
>> emulation is in latest Xen and if we can expect better support in the
>> future.
>
> What are these cards, are they multi-function and do they actually
> support FLR?  Many graphics cards do not.
>
> I have the following hack to pciback to fallback to a bus reset for
> multi-function devices without FLR.  Does it help for your use case?
> You will need to ensure that all functions are co-assigned to the same
> domain.

New kernels (e.g. 3.8) have full support for PCI-e and PCI AF FLRs as
well as fallback support for D0-D3 and secondary bus resets. This
functionality is also in some of the last 2.6 kernels like 2.6.39.
If you are using an older kernel I guess you might need to patch it.

Also depending on your hw there might be a specific quirk you need (e.g. 
the 82599 quirk in pci/quirks.c).

Ross

Matthias

2013-Sep-26 17:59 UTC

Re: Status of FLR in Xen 4.4

I'm currently on a vanilla 3.8.2 kernel because this is the only >3.4
kernel I found which doesn't give me this issue:
http://lists.xen.org/archives/html/xen-users/2013-02/msg00114.html

So I would assume that the kernel should be new enough to handle that. On
the other hand, as far as I understand the whole process, the kernel itself
will only deal with the VGA card if it is actually bound to the dom0 / to
its driver, which it is not. Is there any way to test whether the reset
request from xl is really executed on dom0, or to run this command
manually?

Btw: Hardware is a Radeon HD 5750 and a Radeon HD 5400..


David Vrabel

2013-Sep-26 18:01 UTC

Re: Status of FLR in Xen 4.4

On 26/09/13 18:48, Ross Philipson wrote:
> On 09/26/2013 12:20 PM, David Vrabel wrote:
>> On 26/09/13 17:05, Matthias wrote:
>>> Hi everyone,
>>>
>>> I would like to ask what the current status of FLR, or better of FLR
>>> emulation is in latest Xen and if we can expect better support in the
>>> future.
>>
>> What are these cards, are they multi-function and do they actually
>> support FLR?  Many graphics cards do not.
>>
>> I have the following hack to pciback to fallback to a bus reset for
>> multi-function devices without FLR.  Does it help for your use case?
>> You will need to ensure that all functions are co-assigned to the same
>> domain.
> 
> New kernels (e.g. 3.8) have full support for PCI-e and PCI AF FLRs as
> well as fallback support for D0-D3 and secondary bus resets. This
> functionality is also in the some of the last 2.6 kernels like 2.6.39.
> If you are using an older kernel I guess you might need to patch it.

It will do a secondary bus reset only if the function to be reset is
the only function on that bus.  If you have a multi-function device,
secondary bus reset is not tried.
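
The "only function on that bus" condition can be checked from sysfs before expecting the kernel's bus-reset fallback to apply. This is a minimal sketch, assuming the usual domain:bus:device.function naming under /sys/bus/pci/devices; count_bus_functions and the SYSFS_PCI override are hypothetical names used for illustration.

```shell
#!/bin/sh
# SYSFS_PCI can be overridden so the check can be tried on a scratch directory.
SYSFS_PCI="${SYSFS_PCI:-/sys/bus/pci/devices}"

# count_bus_functions is a hypothetical helper name.
# Print how many PCI functions share domain:bus with the given address.
# $1: full PCI address, e.g. 0000:01:00.0 (an example address only)
count_bus_functions() {
    prefix="${1%:*}:"          # "0000:01:00.0" -> "0000:01:"
    n=0
    for d in "$SYSFS_PCI/$prefix"*; do
        # An unmatched glob stays literal, so confirm the entry exists.
        if [ -e "$d" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}
```

A result greater than 1, as for a GPU with an audio function, is the multi-function case described above.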

David

Matthias

2013-Sep-26 18:41 UTC

Re: Status of FLR in Xen 4.4

Hi,

thanks for your answers. The cards are an AMD HD 5750 and an HD 5400, both
dual-function (due to their audio capabilities), both co-assigned to their
respective domU and both not capable of FLR according to lspci -vvv output.

Also, @Ross, I'm running a 3.8.2 kernel, so this should be fine. But I
assume that the 'official' path, where xl asks the dom0 to do the reset,
does not work (if I have understood David correctly): because the cards are
dual-function, no bus reset is actually executed, which causes the
misbehaviour, whereas xm does a bus reset, so it works in this specific case.

I'm currently recompiling the kernel to see if your patch works, David.

Also, just to understand it better, is the secondary bus reset the thing
which you can manually invoke via /sys/bus/pci/devices/.../reset ?

So as a workaround, would the following work in principle?

xl pci-assignable-remove 0X:00.0
xl pci-assignable-remove 0X:00.1
echo "1" > /sys/bus/pci/devices/0X:00.0/reset
echo "1" > /sys/bus/pci/devices/0X:00.1/reset
xl pci-assignable-add 0X:00.0
xl pci-assignable-add 0X:00.1

Anyway, thanks for your answers and I will report if the patch works!

Gordan Bobic

2013-Sep-26 19:13 UTC

Re: Status of FLR in Xen 4.4

On 09/26/2013 07:41 PM, Matthias wrote:
> Hi,
>
> thanks for your answers, the cards are a AMD HD 5750 and a HD 5400, both
> with dual functions (due to audio capabilities), both co-assigned to
> their respective domU and both not capable of FLR from lspci -vvv output.
>
> also, @Ross, I'm running a 3.8.2 Kernel, so this should be fine, but I
> assume that the 'official' command where xl asks the dom0 about the
> reset do not work (if I have understand david correctly) since it's dual
> function so no dual bus reset is actually executed causing the
> misbehaviour, and on the other side xm doing a bus reset so it works in
> this specific case.
>
> I'm currently recompiling the kernel to see if your patch works David.
>
> Also, just to understand it better, is the secondary bus reset the thing
> which you can manually invoke via /sys/bus/pci/devices/.../reset ?
>
> So as a workaround, would the following work in principle?
>
> xl pci-assignable-remove 0X:00.0
> xl pci-assignable-remove 0X:00.1
> echo "1" > /sys/bus/pci/devices/0X:00.0/reset
> echo "1" > /sys/bus/pci/devices/0X:00.1/reset

This bit is up to the driver to implement. Since pciback is a
placeholder rather than a driver that knows about the hardware, the reset
node won't be there.

You could try to do something with setpci to force the registers between
D0 and D3 power states in the vague hope that it might do something, but I
doubt it.

The reason nvidia cards work OK is that the domU driver knows how to
reinitialize the hardware and acts accordingly. If the manufacturer
won't implement a standard function to reset the hardware, then it is up
to their drivers to handle the situation.

If (on Windows domUs) ejecting the card before shutdown/reboot of the
domU works, you could probably write some PowerShell magic that does that
on shutdown/reboot as a reasonable workaround.

Gordan

Matthias

2013-Sep-27 12:26 UTC

Re: Status of FLR in Xen 4.4

Hi Gordon,

I tried your patch on my dom0 kernel and I think it helped somewhat, in the
sense that I can now reboot the domUs without crashing the whole host.
But the Linux domU still gets a black screen, and the Windows 7 domU only
boots as far as a black screen with a (movable) cursor, but no further.
This might only be a coincidence, though; I have to double-check it.

I tried some other stuff, too:

1) after domU shutdown, rebind both functions to the dom0 drivers, do a
sysfs reset and re-add to assignable devices -> crashes dom0
2) after domU shutdown, rebind both functions to the dom0 drivers and re-add
to assignable devices -> dom0 sometimes crashes when the domU using the
devices comes up, sometimes not, but no success either way
3) sysfs reset of the devices from within the domU seems to be passed
through to dom0 (see commands in qemu-log) but has no effect

Also, I analysed your code and compared it to the stuff in the Python tools
of xm. It is the same approach and I don't see any obvious differences.
Then I tried to replicate the secondary bus reset on the command line for
testing purposes via

printf '\x40' | dd of=/sys/devices/pci0000\:00/0000\:00\:0b.0/config \
    bs=1 seek=$((0x3e)) count=1 conv=notrunc

but I think I got the endianness or the offset slightly wrong, because after
that xl refuses to give the device to the domU (00:0b.0 is the bus of my
2-function VGA card that I have assigned to my domU) and dom0 later crashes.
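
One possible reason the one-byte write above misbehaves, offered as a guess rather than a diagnosis: offset 0x3e is the low byte of the 16-bit Bridge Control register, so the offset and byte order look right, but writing a fixed 0x40 overwrites the other bits in that byte and leaves the Secondary Bus Reset bit set. The patch earlier in the thread instead reads the register, sets the bit, waits, and then clears it again. The sketch below shows only that bit arithmetic; the function names are made up for illustration and nothing here touches real hardware.

```shell
#!/bin/sh
# Secondary Bus Reset is bit 6 (0x40 = 64) of the 16-bit Bridge Control
# register at config offset 0x3e of a PCI-to-PCI bridge.
BUS_RESET_BIT=64

# with_reset_asserted / with_reset_cleared are hypothetical helper names.
# $1: current 16-bit Bridge Control value, e.g. 0x0008
# Both print the value with only the reset bit changed.
with_reset_asserted() {
    printf '0x%04x\n' $(( $1 | BUS_RESET_BIT ))
}

with_reset_cleared() {
    # 65535 - 64 is the 16-bit mask with only the reset bit cleared.
    printf '0x%04x\n' $(( $1 & (65535 - BUS_RESET_BIT) ))
}
```

On real hardware the current value would have to be read from the bridge's config space first and the result written back, with a delay between setting and clearing the bit; that part is deliberately left out here.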

So I'm a little lost at this point and would welcome some suggestions.

Does FLR reset work for any of you with VGA cards?


Gordan Bobic

2013-Sep-27 13:27 UTC

Re: Status of FLR in Xen 4.4

On Fri, 27 Sep 2013 14:26:31 +0200, Matthias
 <matthias.kannenberg@googlemail.com> wrote:
> Hi Gordon,
>
> I tried your patch on my dom0 kernel and I think it somehow helped in
> the sense that now I can reboot the domUs now without crashing the
> whole host, but linux domU still gets a blackscreen and windows7 domU
> only starts till black screen with (actual movable) cursor, but not
> furthor.. this might only be a coincidence, though, have to double
> check this..

 What patch? Nothing I posted to the list is fit for public
 consumption yet. You shouldn't be using it unless you really,
 REALLY know exactly what it does and know exactly what you
 are trying to achieve.

> I tried some other stuff, too:
>
> 1) after domU shutdown rebind both functions to the dom0 drivers, do a
> sysfs reset and re-add to assignable devices -> crashes dom0

 My experience shows that letting dom0 drivers ever touch the hardware
 is a recipe for disaster.

> 2) after domU shutdown rebind both functions to the dom0 drivers and
> readd to assignable devices -> dom0 crashes somtime when domU using
> the devices comes up, sometimes not, but no success either way
> 3) sysfs reset of the devices within domU seems to be passed through
> dom0 (see commands in qemu-log) but no effect

 It's up to the drivers to do the sensible thing. Nvidia drivers
 handle this a little more sanely, but if the drivers cannot handle
 clobbering the device's state into a known state, you are pretty
 much fighting a losing battle.

> Also, I analysed your code and compared it to the stuff in the python
> tools of xm and it is the same approach and i don't see any obvious
> differences..

 I am starting to suspect you aren't actually talking about my code
 but somebody else's...

> Then I tried to replicate the secondary bus reset on
> command lind for testing purposes via
>
> printf '\x40' | dd of=/sys/devices/pci0000:00/0000:00:0b.0/config
> bs=1 seek=$((0x3e)) count=1 conv=notrunc
>
> but I think I got some endians or offset slightly wrong because after
> that xl refuses to give the device (00:0b.0 is the bus of my
> 2-function vga card I have assigned to my domU) to the domU and later
> crashes dom0.
>
> So I'm a little lost at that point and would welcome some
> suggestions.
>
> Does FLR reset works for any of you for vga cards?

 If you are talking about VGA cards with _proper_ FLR implementations
 on PCI level - there is no such thing. In all cases it is down to
 the domU driver to handle the card in whatever state it is. This
 works reasonably well with supported Nvidia cards (i.e.
 Quadro [K][2456]000 and Grid K[12] and equivalent modified GeForce
 cards (Fermi 4xx and Kepler 6xx/7xx series)). I never managed to
 get it working properly on any other GPUs.

 Even with Nvidia cards rebooting can lead to issues. For example,
 I have two GPUs passed to two different domUs. One is a GTX470
 modified to Q5000. The other is a GTX480 modified to Q6000. The
 domU with Q5000 always handled reboots reasonably reliably. The
 one with a Q6000 did not. I since switched the one with a Q6000
 to a QK5000 (modified GTX680), and now the reboots seem to work
 reasonably reliably, but I have found that there is still a
 crash if the monitor on the card changes between shutdown and
 restart - I'm guessing the card remembers its state and if it
 isn't consistent when it returns, the driver gets confused. I have
 other issues (see recent thread about Nvidia passthrough from
 David), but they seem to be specific to my setup.

 It's not perfect, but it's the only workable solution I have
 found.

 Gordan

Konrad Rzeszutek Wilk

2013-Sep-27 13:34 UTC

Re: Status of FLR in Xen 4.4

On Thu, Sep 26, 2013 at 07:59:40PM +0200, Matthias wrote:
> I'm currently on a vanilla 3.8.2 kernel because this is the only >3.4
> kernel I found which doesn't give me this issue:
> http://lists.xen.org/archives/html/xen-users/2013-02/msg00114.html

So v3.12 (or rather the latest and greatest from Linus) has the mechanism
for the NMI - so you can actually see what is causing the stall.

Konrad Rzeszutek Wilk

2013-Sep-27 13:48 UTC

Re: Status of FLR in Xen 4.4

On Fri, Sep 27, 2013 at 02:27:46PM +0100, Gordan Bobic wrote:
> On Fri, 27 Sep 2013 14:26:31 +0200, Matthias
> <matthias.kannenberg@googlemail.com> wrote:
> >Hi Gordon,
> >
> >I tried your patch on my dom0 kernel and I think it somehow helped in
> >the sense that I can now reboot the domUs without crashing the
> >whole host, but linux domU still gets a blackscreen and windows7 domU
> >only starts till black screen with (actual movable) cursor, but not
> >further.. this might only be a coincidence, though, have to double
> >check this..
> 
> What patch? Nothing I posted to the list is fit for public
> consumption yet. You shouldn't be using it unless you really,
> REALLY know exactly what it does and know exactly what you
> are trying to achieve.
> 
> >I tried some other stuff, too:
> >
> >1) after domU shutdown rebind both functions to the dom0 drivers, do a
> >sysfs reset and re-add to assignable devices -> crashes dom0
> 
> My experience shows that letting dom0 drivers ever touch the hardware
> is a recipe for disaster.
> 
> >2) after domU shutdown rebind both functions to the dom0 drivers and
> >re-add to assignable devices -> dom0 sometimes crashes when the domU using
> >the devices comes up, sometimes not, but no success either way
> >3) sysfs reset of the devices within domU seems to be passed through
> >dom0 (see commands in qemu-log) but no effect
> 
> It's up to the drivers to do the sensible thing. Nvidia drivers
> handle this a little more sanely, but if the drivers cannot handle
> clobbering the device's state into a known state, you are pretty
> much fighting a losing battle.
> 
> >Also, I analysed your code and compared it to the stuff in the python
> >tools of xm and it is the same approach and i don't see any obvious
> >differences..
> 
> I am starting to suspect you aren't actually talking about my code
> but somebody else's...
> 
> >Then I tried to replicate the secondary bus reset on the
> >command line for testing purposes via
> >
> > printf '\x40' | dd of=/sys/devices/pci0000:00/0000:00:0b.0/config bs=1 seek=$((0x3e)) count=1 conv=notrunc
> >
> >but I think I got some endians or offset slightly wrong because after
> >that xl refuses to give the device (00:0b.0 is the bus of my
> >2-function vga card I have assigned to my domU) to the domU and later
> >crashes dom0.
> >
> >So I'm a little lost at that point and would welcome some
> >suggestions.
> >
> >Does FLR reset work for any of you for vga cards?
> 
> If you are talking about VGA cards with _proper_ FLR implementations
> on PCI level - there is no such thing. In all cases it is down to
> the domU driver to handle the card in whatever state it is. This
> works reasonably well with supported Nvidia cards (i.e.
> Quadro [K][2456]000 and Grid K[12] and equivalent modified GeForce
> cards (Fermi 4xx and Kepler 6xx/7xx series)). I never managed to
> get it working properly on any other GPUs.
> 
> Even with Nvidia cards rebooting can lead to issues. For example,
> I have two GPUs passed to two different domUs. One is a GTX470
> modified to Q5000. The other is a GTX480 modified to Q6000. The
> domU with Q5000 always handled reboots reasonably reliably. The
> one with a Q6000 did not. I since switched the one with a Q6000
> to a QK5000 (modified GTX680), and now the reboots seem to work
> reasonably reliably, but I have found that there is still a
> crash if the monitor on the card changes between shutdown and
> restart - I'm guessing the card remembers its state and if it
> isn't consistent when it returns, the driver gets confused. I have
> other issues (see recent thread about Nvidia passthrough from
> David), but they seem to be specific to my setup.

This state thing. If one were to capture the card's state before
doing any PCI passthrough and then tried to write it back
exactly, would that eliminate some of these issues?

I know that the pciback does that to the PCI configuration values.
(Or at least it should) whenever a device has been de-assigned
from a guest - or unplugged.
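
As a rough illustration of the save/restore idea for the configuration header only, the sketch below reads the first bytes of a device's sysfs config file and later writes them back. It is not what pciback does internally (that happens in the kernel); the function names and the 64-byte default are assumptions made for this example. A blind write-back like this also touches read-only and write-one-to-clear bits, which a real implementation would handle selectively.

```python
CONFIG_HEADER_LEN = 64  # size of the standard PCI configuration header in bytes


def save_config(config_path, length=CONFIG_HEADER_LEN):
    """Return the first `length` bytes of a PCI config-space file."""
    with open(config_path, "rb") as f:
        return f.read(length)


def restore_config(config_path, saved):
    """Write previously saved bytes back to the start of the config file."""
    # "r+b" updates in place; the file is never truncated or extended.
    with open(config_path, "r+b") as f:
        f.seek(0)
        f.write(saved)
```

On a real host the path would be something like the `config` file under the device's sysfs directory, and the calls need root. None of this covers BAR contents or device-internal registers.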

But I presume that the rest (the BAR contents) are not in any
way saved/restored. What would be the worst that could happen if one
wrote all of the MMIO values back exactly as they were?

(Probably a recipe for disaster, but who knows).

> It's not perfect, but it's the only workable solution I have
> found.
> 
> Gordan
> 

Gordan Bobic

2013-Sep-27 14:00 UTC


Re: Status of FLR in Xen 4.4

On Fri, 27 Sep 2013 09:48:34 -0400, Konrad Rzeszutek Wilk
 <konrad.wilk@oracle.com> wrote:
> On Fri, Sep 27, 2013 at 02:27:46PM +0100, Gordan Bobic wrote:
>
> This state thing. If one were to capture the card's state before
> doing any PCI passthrough and then tried to write it back
> exactly, would that eliminate some of these issues?
>
> I know that the pciback does that to the PCI configuration values.
> (Or at least it should) whenever a device has been de-assigned
> from a guest - or unplugged.
>
> But I presume that the rest (the BAR contents) are not in any
> way saved/restored. What would be the worst if one wrote exactly
> all of the MMIO values back as they were?
>
> (Probably a recipe for disaster, but who knows).
>>
>> It's not perfect, but it's the only workable solution I have
>> found.

 That doesn't cover the entire state of the device.
 What about the rest of the device memory and states of all
 the proprietary registers?

 Since there are open source FB and accelerated drivers
 available for Radeon cards, enough is publicly known about
 them to be able to achieve suitable resetting. How
 difficult that might be to achieve, I have no idea. I
 have seen the open source Radeon Xorg driver successfully
 reset the GPU when the GPU stopped responding without
 taking Xorg or any of the running apps down in the process,
 so something similar to what it does might just be good
 enough.

 Whether it is a good idea to adopt anything but a fully
 hands-off approach to any passthrough hardware is a
 different question entirely.

 Gordan

Matthias

2013-Sep-27 17:07 UTC


Re: Status of FLR in Xen 4.4

Hi Konrad,

good call! I was able to reproduce the error with the 3.12-rc2 kernel and got
a lot of information with the new NMI traces (log attached), but since I'm
not a xen hacker I don't really know how to continue from here. So I might
add this to the original post and maybe someone can help me. After all, the
error has persisted for half a year now and, apart from 2 kernel version /
.config combinations (a 3.8.2 and a 3.6.something), I could never trace this
issue back (even with bisecting the .config, because at some point it seemed
random).

2013/9/27 Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> On Thu, Sep 26, 2013 at 07:59:40PM +0200, Matthias wrote:
> > I'm currently on a vanilla 3.8.2 kernel because this is the only >3.4
> > kernel I found which doesn't give me this issue:
> > http://lists.xen.org/archives/html/xen-users/2013-02/msg00114.html
>
> So v3.12 (or rather the latest and greatest from Linus) has the mechanism
> for the NMI - so you can actually see what is causing the stall.
>


Sander Eikelenboom

2013-Sep-27 17:28 UTC


Re: Status of FLR in Xen 4.4

Hi Matthias,

Have you tried adding "no-cpuidle" to the xen/hypervisor command line
in grub?

--
Sander


Konrad Rzeszutek Wilk

2013-Sep-27 17:53 UTC


Is: RCU callback detects an RCU hang with Linux 3.12+ Was: Re: Status of FLR in Xen 4.4

On Fri, Sep 27, 2013 at 07:07:33PM +0200, Matthias wrote:
> Hi Konrad,
> 
> good call! I was able to reproduce the error with the 3.12-rc2 kernel, got
> a lot of information with the new NMI traces (log attached), but since I'm
> not a xen hacker I don't really know how to continue from here. So I might
> add this to the original post and maybe someone can help me. After all the
> error persists for half a year now and besides 2 kernel version / .config
> Combinations (a 3.8.2 and a 3.6.something) I could never trace this issue
> back (even with bisecting the .config because at some point it seemed
> random).

Can you tell me a bit about how this happens? Is it happening after you
boot the machine? Does it happen after a specific workload?


It looks like something in the RCU is taking far too long and
the RCU callback mechanism starts complaining. CPU0 is where the
RCU mechanism detects that something is off and starts sending NMIs to
all CPUs. CPU2 is the only one that looks to be doing RCU callback:


NMI backtrace for cpu 1
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.12.0-rc2 #2
Hardware name: System manufacturer System Product Name/Crosshair IV Formula, BIOS 3029    10/09/2012
task: ffff8800658da080 ti: ffff880065900000 task.ti: ffff880065900000
RIP: e030:[<ffffffff8125b2b2>]  [<ffffffff8125b2b2>] cfb_imageblit+0x1b3/0x411
RSP: e02b:ffff88007de439f0  EFLAGS: 00000046
RAX: 0000000000000000 RBX: ffff88001e1c2800 RCX: 0000000000000003
RDX: 000000000000003b RSI: ffff88001e00614e RDI: 0000000000000000
RBP: 0000000000000013 R08: 0000000000000001 R09: ffffffff814655f0
R10: ffff88001e006116 R11: ffffc90014875710 R12: 000000000000000d
R13: 0000000000000000 R14: ffffc90014875714 R15: ffffc90014875000
FS:  00007fb294ab4900(0000) GS:ffff88007de40000(0000) knlGS:0000000000000000
CS:  e033 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 00007fb29177a9a0 CR3: 000000000160c000 CR4: 0000000000000660
Stack:
 0000000100aaaaaa 00000000000001d8 0000000000000000 0000000000aaaaaa
 ffff8800532f0a40 ffff88001e1c2800 0000000000000001 ffff88001e1c2800
 0000000000000000 ffff88007d424400 00000000ffff00ff 000000000000003b
Call Trace:
 <IRQ>  [<ffffffff81256ac4>] ? bit_putcs+0x352/0x39d
 [<ffffffff81219825>] ? paravirt_read_tsc+0x5/0x8
 [<ffffffff81256772>] ? bit_cursor+0x45d/0x45d
 [<ffffffff812523a8>] ? fbcon_putcs+0xbd/0xcc
 [<ffffffff812bc6b6>] ? vt_console_print+0x234/0x290
 [<ffffffff810b336f>] ? call_console_drivers.constprop.18+0xb3/0xfc
 [<ffffffff810b3c7d>] ? console_unlock+0x131/0x306
 [<ffffffff810b420e>] ? vprintk_emit+0x3bc/0x3eb
 [<ffffffff812c92f5>] ? paravirt_read_tsc+0x5/0x8
 [<ffffffff812cae43>] ? add_interrupt_randomness+0x3f/0x15d
 [<ffffffff813db9c8>] ? printk+0x4f/0x51
 [<ffffffff810e4433>] ? rcu_check_callbacks+0x195/0x598   <=================
 [<ffffffff810a3b50>] ? irqtime_account_process_tick.isra.2+0xd6/0x239
 [<ffffffff810c232a>] ? tick_sched_do_timer+0x2e/0x2e
 [<ffffffff81084c35>] ? update_process_times+0x30/0x5b
 [<ffffffff810c2237>] ? tick_sched_handle+0x3e/0x4a
 [<ffffffff810c235a>] ? tick_sched_timer+0x30/0x4c
 [<ffffffff81098355>] ? __run_hrtimer+0x93/0x159
 [<ffffffff81098b72>] ? hrtimer_interrupt+0xe3/0x1ca
 [<ffffffff8103d8e4>] ? xen_timer_interrupt+0x31/0x13b
 [<ffffffff81294c4c>] ? HYPERVISOR_event_channel_op+0xd/0x1d
 [<ffffffff8103d79b>] ? xen_force_evtchn_callback+0x9/0xa
 [<ffffffff8103df22>] ? check_events+0x12/0x20
 [<ffffffff810b5b7a>] ? handle_irq_event_percpu+0x4d/0x1c5
 [<ffffffff813e556e>] ? notifier_call_chain+0x32/0x52
 [<ffffffff810b8287>] ? handle_percpu_irq+0x39/0x4c
 [<ffffffff812951c0>] ? __xen_evtchn_do_upcall+0x107/0x2cb
 [<ffffffff81219936>] ? delay_tsc+0x9c/0xc6
 [<ffffffff81093ba0>] ? __rcu_read_unlock+0x33/0x51
 [<ffffffff8129663a>] ? xen_evtchn_do_upcall+0x22/0x32
 [<ffffffff813e897e>] ? xen_do_hypervisor_callback+0x1e/0x30
 <EOI>  [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
 [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
 [<ffffffff8103d768>] ? xen_safe_halt+0xc/0x13
 [<ffffffff8104ae0b>] ? default_idle+0x14/0x3e
 [<ffffffff810b53ee>] ? cpu_startup_entry+0x107/0x160
Code: fb 4c 89 d6 b9 08 00 00 00 ff cd 83 fd ff 74 32 44 0f be 2e 44 29 c1 8b 44 24 18 4d 8d 73 04 41 d3 fd 44 23 6c 24 04 43 23 04 a9 <41> 89 c5 41 31 fd 45 89 2b 85 c9 75 05 48 ff c6 b1 08 4d 89 f3


Which looks to be printing something on the VT console (which is running
in KMS mode as it uses framebuffer calls). So is there something on the
screen scrolling wildly in a loop?
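
When a log contains one NMI backtrace per CPU, picking out the CPUs that were inside a given function can be done mechanically. The sketch below is only an illustration of that reading step: the function name `cpus_with_symbol` is made up for this example, and it assumes each backtrace starts with a line of the form "NMI backtrace for cpu N", as in the trace above.

```python
import re

# Header line that opens each per-CPU backtrace in the kernel log.
_HEADER = re.compile(r"NMI backtrace for cpu (\d+)")


def cpus_with_symbol(log_text, symbol):
    """Return the sorted CPU numbers whose NMI backtrace mentions `symbol`."""
    hits = set()
    current = None  # CPU number of the backtrace being scanned, if any
    for line in log_text.splitlines():
        match = _HEADER.search(line)
        if match:
            current = int(match.group(1))
        elif current is not None and symbol in line:
            hits.add(current)
    return sorted(hits)
```

Calling it with the full attached log and "rcu_check_callbacks" would list the CPUs that were in the RCU callback check when the NMI arrived. Entries marked "?" in a call trace are unreliable stack leftovers, so a match is a hint rather than proof.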

But then there are also complaints about

INFO: NMI handler (arch_trigger_all_cpu_backtrace_handler) took too long to run: 1.115 msecs

this taking too long. I am wondering if there is some timing issue
on your box.

What version of Xen do you have?

Matthias

2013-Sep-27 19:19 UTC


Re: Status of FLR in Xen 4.4

Hi Sander,

thanks for the advice, I actually have no rcu stalls when I use the
no-cpuidle option. Do you have a little more insight into what is actually
causing this behaviour and whether there is a better solution than this
option? I don't want to sacrifice my C-states (I would assume this makes the
overall server more power hungry?).

Does this have something to do with the new tickless-kernel options in the
newer kernel, or is this really only an ACPI incompatibility with xen?

Thanks!




Sander Eikelenboom

2013-Sep-27 19:33 UTC


Re: Status of FLR in Xen 4.4

Friday, September 27, 2013, 9:19:14 PM, you wrote:

> Hi Sander,
>
> thanks for the advice, I have actually no rcu stalls when i use the
> no-cpuidle function. Do you have a little more insight on what is actually
> causing this behaviour and if there is a better solution then this option,
> cause I don't want to sacrifice my C-states (I would assume this makes the
> overall server more power hungry?).
>
> Does this has something to do with the new tickless-kernel options in the
> newer kernel, or is this really only an apci incompatibility with xen?
>
> Thanks!

 Are you running xen-unstable?
 Some patches went in lately.

 You also seem to have a motherboard with an AMD 890FX chipset; I suspect your
 BIOS also has issues around the HPET, as mine had.
 I was also seeing RCU stalls on boot (and only on boot). Hitting any key on
 the console when it appeared to stall during boot made it continue in my case
 (this happened several times).
 It took a while to find the problems; Jan Beulich has made and committed some
 patches that went into xen-unstable recently.

 Are you running xen-unstable?
 If not, could you give it a try and provide the xl dmesg / serial log?

 --
 Sander








Matthias

2013-Sep-27 19:48 UTC


Re: Status of FLR in Xen 4.4

Yes, I'm running the most recent xen-unstable-staging tree, but I have had
these issues at least since February with xen-unstable, so I don't suspect
recent changes to be the issue in my case.

I will do some testing with switching from tickless-idle to non-tickless
and, since you mentioned HPET issues, maybe changing the clocksource; will
see what happens..
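
For the clocksource experiment, the dom0 kernel exposes the current and available clocksources through sysfs. The sketch below is one way to inspect and switch them at runtime; it assumes the dom0 kernel clocksource is what is meant here (the hypervisor has its own separate setting), and the helper names are made up for this example.

```python
from pathlib import Path

# Standard sysfs location of the kernel clocksource controls.
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base=CLOCKSOURCE_DIR):
    """Return (current, available) clocksources as reported by sysfs."""
    base = Path(base)
    current = (base / "current_clocksource").read_text().strip()
    available = (base / "available_clocksource").read_text().split()
    return current, available


def select_clocksource(name, base=CLOCKSOURCE_DIR):
    """Switch the clocksource at runtime; reject names the kernel does not list."""
    base = Path(base)
    _, available = read_clocksources(base)
    if name not in available:
        raise ValueError(f"{name!r} is not one of {available}")
    (base / "current_clocksource").write_text(name + "\n")
```

Switching needs root and only lasts until the next reboot, which makes it convenient for a quick test before changing any boot parameters.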






Sander Eikelenboom

2013-Sep-27 20:06 UTC


Re: Status of FLR in Xen 4.4

Friday, September 27, 2013, 9:48:39 PM, you wrote:

> Yes, running the most recent xen-unstable-staging tree, but I have these
> issues at least since february with xen-unstable, so I don't suspect
> recent changes to be the issue in my case.
>
> I will do some testing with switching from tickless-idle to non-tickless
> and after you mentioned hpet issues maybe changing the clocksource, will see
> what happens..

I'm now running with tickless-idle, so I suspect it will make no
difference.
So I think the best shot would be to try to make it boot by pressing a key on
the keyboard when it doesn't make progress on boot (see if that works) and,
if it does, to provide the output of "xl dmesg".

(BTW there were 2 separate issues, see threads:
http://lists.xen.org/archives/html/xen-devel/2013-03/msg01796.html
http://lists.xen.org/archives/html/xen-devel/2013-08/msg00201.html
)



Matthias

2013-Oct-03 22:20 UTC


Re: Status of FLR in Xen 4.4

Hi David,

with your patch as inspiration, I did various tests in the past days but
didn't manage to succeed in resetting my vga the right way..

With your patch, and later mine, the secondary bus reset is executed, but after
that I can't boot the vm because I get a 'device model is not ready' /
'refused to pass the pci device' error..
I also tried not resetting the secondary function of the vga card after
executing a secondary bus reset when the first function reset is called,
but with the same result.
Do you have any idea if I am missing anything? I tried it both with
load/restore configure and without, but it seems xenstore can't handle
the vga after the parent bus reset.
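
For reference, a secondary bus reset is a pulse: bit 6 (0x40) of the 16-bit Bridge Control register at offset 0x3e of the parent bridge is set, held briefly, and then cleared again. A single one-byte write of 0x40, like the dd one-liner quoted earlier in the thread, sets the bit but never clears it and also overwrites the other bits in that byte, which may be one reason a device behind the bridge becomes unusable afterwards. The sketch below shows the pulse against a bridge's sysfs config file; the function name and the delay values are assumptions for this example, not taken from any of the patches discussed here.

```python
import struct
import time

PCI_BRIDGE_CONTROL = 0x3E        # offset of the 16-bit Bridge Control register
PCI_BRIDGE_CTL_BUS_RESET = 0x40  # bit 6: Secondary Bus Reset


def _read16(f, offset):
    """Read a little-endian 16-bit register at `offset`."""
    f.seek(offset)
    return struct.unpack("<H", f.read(2))[0]


def _write16(f, offset, value):
    """Write a little-endian 16-bit register at `offset`."""
    f.seek(offset)
    f.write(struct.pack("<H", value))


def secondary_bus_reset(bridge_config_path, assert_s=0.002, settle_s=1.0):
    """Pulse Secondary Bus Reset on a bridge; return the original register value."""
    # Unbuffered, so each write reaches the file immediately.
    with open(bridge_config_path, "r+b", buffering=0) as f:
        ctrl = _read16(f, PCI_BRIDGE_CONTROL)
        # Assert reset while preserving every other bit in the register.
        _write16(f, PCI_BRIDGE_CONTROL, ctrl | PCI_BRIDGE_CTL_BUS_RESET)
        time.sleep(assert_s)
        # Deassert reset, then give the devices behind the bridge time to recover.
        _write16(f, PCI_BRIDGE_CONTROL, ctrl & ~PCI_BRIDGE_CTL_BUS_RESET)
        time.sleep(settle_s)
    return ctrl
```

The path must point at the config file of the bridge above the card, not at the card itself. The reset wipes the config space of everything behind the bridge, so BARs and the command register need to be restored before the device is handed to a guest again.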

Something else that is odd is that my vga does in fact have a sysfs/reset file
(both functions have a separate one), but neither doing a normal reset nor
doing it by hand makes any difference. I think it is not executed, because
when I commented out the reset completely, the VM showed the same behaviour
on the second boot as when doing a normal reset.. BTW: I get the same result
when I'm doing a d0->d3 transition via the kernel. FLR and AF_FLR do
not work anyway due to no capability in the card..
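
The two checks mentioned here can be scripted. The sketch below looks for the per-function sysfs reset file, triggers it by writing "1", and scans `lspci -vvv` output for the FLReset flag; the helper names are made up for this example. A reset file only means the kernel found some reset method for the function, not necessarily FLR, so on a card showing FLReset- it may be a weaker method that leaves the GPU's internal state alone.

```python
from pathlib import Path

SYSFS_PCI_DEVICES = Path("/sys/bus/pci/devices")


def has_reset_file(bdf, root=SYSFS_PCI_DEVICES):
    """True if the kernel exposes a reset file for this function."""
    return (Path(root) / bdf / "reset").exists()


def trigger_reset(bdf, root=SYSFS_PCI_DEVICES):
    """Ask the kernel to reset one function by writing "1" to its reset file."""
    reset = Path(root) / bdf / "reset"
    if not reset.exists():
        raise FileNotFoundError(f"no reset method exposed for {bdf}")
    reset.write_text("1")


def flr_advertised(lspci_text):
    """True if `lspci -vvv` output for a device shows FLReset+."""
    return "FLReset+" in lspci_text
```

Here `bdf` is the full device address as it appears under /sys/bus/pci/devices, and each function of a multi-function card has to be checked separately.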

So I compared the xen-pciback reset method with both the pci/pci.c method
and what was done in python/xen/util/pci.py, and the actions are basically
the same (the quirks in pci.py are only for some nvidia and integrated
vgas), and I don't see what I am missing.. can it be that after the parent
bus reset, the vga card somehow loses its entry in xenstore or
something?

Can you elaborate a bit more on what hardware you have and whether your patch
works fine for you? I'm currently testing with an AMD HD5400.

Thanks in advance!



Matthias

2013-Oct-03 22:34 UTC


Re: Is: RCU callback detects an RCU hang with Linux 3.12+ Was: Re: Status of FLR in Xen 4.4

Hi Konrad,

sorry I missed your entry, google mail might not be the best software to
view mailing lists ;)

The RCU stall happens roughly 2 minutes after the machine is fully booted,
and I'm usually working via SSH by then..

I basically have two cases where the stall happens:

1) Without the no-cpuidle option, it happens when I start xencommons
2) With or without no-cpuidle, it happens sometimes and arbitrarily. I
have the feeling that logging in via SSH (or network traffic in general?)
increases the chance of the rcu stall and (this is only a guess) in
most cases it actually happens when I enter a command of more than 16
chars in the ssh command prompt. (I don't really think that this is really
causing the issue, I just noticed that when entering the usual commands to
start all the xen stuff / boot the domUs, it stalls mostly on the same
commands / when ssh freezes I came to the same part of the command). But
more ssh-intensive commands like 'dmesg' or 'htop' don't cause it..


Also, I can't really say what is on the screen because my dom0 does not
have a vga card: both vga cards in the server are passed to different
domUs. When I don't hide the vga cards on boot via xen-pciback.hide, the
RCU usually does not stall and everything is fine.
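
To compare the "hidden" and "not hidden" boots, it can help to record which devices were named in xen-pciback.hide and which driver each one ended up bound to. The following is a minimal Python sketch under stated assumptions: the helper names are hypothetical, the hide parameter is assumed to use the usual (BDF)(BDF) form, and the driver is looked up through the standard sysfs "driver" symlink (for hidden devices this is typically "pciback").

```python
import os
import re


def normalize_bdf(bdf):
    """Lower-case a BDF and add the 0000 PCI domain if it was omitted."""
    bdf = bdf.lower()
    return bdf if bdf.count(":") == 2 else "0000:" + bdf


def parse_pciback_hide(cmdline):
    """Return the BDFs listed in a xen-pciback.hide=(...)(...) parameter."""
    for token in cmdline.split():
        if token.startswith("xen-pciback.hide="):
            value = token.split("=", 1)[1]
            return [normalize_bdf(b)
                    for b in re.findall(r"\(([0-9a-fA-F:.]+)\)", value)]
    return []


def bound_driver(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Return the name of the driver bound to bdf, or None if unbound."""
    link = os.path.join(sysfs_root, bdf, "driver")
    if not os.path.islink(link):
        return None
    return os.path.basename(os.readlink(link))


if __name__ == "__main__":
    with open("/proc/cmdline") as f:
        hidden = parse_pciback_hide(f.read())
    for bdf in hidden:
        print(bdf, bound_driver(bdf))
```

Run in dom0 after boot, this prints one line per hidden device, which makes it easy to diff the two configurations alongside the dmesg output.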





2013/9/27 Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> On Fri, Sep 27, 2013 at 07:07:33PM +0200, Matthias wrote:
> > Hi Konrad,
> >
> > good call! I was able to reproduce the error with the 3.12-rc2 kernel,
> > got a lot of information with the new NMI traces (log attached), but
> > since I'm not a xen hacker I don't really know how to continue from
> > here. So I might add this to the original post and maybe someone can
> > help me. After all the error persists for half a year now and besides 2
> > kernel version / .config Combinations (a 3.8.2 and a 3.6.something) I
> > could never trace this issue back (even with bisecting the .config
> > because at some point it seemed random).
>
> Can you tell me a bit on how this happens? Is it happening after you
> boot the machine? Does it happen after a specific workload?
>
>
> It looks like something in the RCU is taking far too long and
> the RCU callback mechanism starts complaining. The CPU0 is when the
> RCU mechanism detects that something is off and starts sending NMI to
> all CPUs. CPU2 is the only one that looks to be doing RCU callback:
>
>
> NMI backtrace for cpu 1
> CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.12.0-rc2 #2
> Hardware name: System manufacturer System Product Name/Crosshair IV
> Formula, BIOS 3029    10/09/2012
> task: ffff8800658da080 ti: ffff880065900000 task.ti: ffff880065900000
> RIP: e030:[<ffffffff8125b2b2>]  [<ffffffff8125b2b2>]
> cfb_imageblit+0x1b3/0x411
> RSP: e02b:ffff88007de439f0  EFLAGS: 00000046
> RAX: 0000000000000000 RBX: ffff88001e1c2800 RCX: 0000000000000003
> RDX: 000000000000003b RSI: ffff88001e00614e RDI: 0000000000000000
> RBP: 0000000000000013 R08: 0000000000000001 R09: ffffffff814655f0
> R10: ffff88001e006116 R11: ffffc90014875710 R12: 000000000000000d
> R13: 0000000000000000 R14: ffffc90014875714 R15: ffffc90014875000
> FS:  00007fb294ab4900(0000) GS:ffff88007de40000(0000)
> knlGS:0000000000000000
> CS:  e033 DS: 002b ES: 002b CR0: 000000008005003b
> CR2: 00007fb29177a9a0 CR3: 000000000160c000 CR4: 0000000000000660
> Stack:
>  0000000100aaaaaa 00000000000001d8 0000000000000000 0000000000aaaaaa
>  ffff8800532f0a40 ffff88001e1c2800 0000000000000001 ffff88001e1c2800
>  0000000000000000 ffff88007d424400 00000000ffff00ff 000000000000003b
> Call Trace:
>  <IRQ>  [<ffffffff81256ac4>] ? bit_putcs+0x352/0x39d
>  [<ffffffff81219825>] ? paravirt_read_tsc+0x5/0x8
>  [<ffffffff81256772>] ? bit_cursor+0x45d/0x45d
>  [<ffffffff812523a8>] ? fbcon_putcs+0xbd/0xcc
>  [<ffffffff812bc6b6>] ? vt_console_print+0x234/0x290
>  [<ffffffff810b336f>] ? call_console_drivers.constprop.18+0xb3/0xfc
>  [<ffffffff810b3c7d>] ? console_unlock+0x131/0x306
>  [<ffffffff810b420e>] ? vprintk_emit+0x3bc/0x3eb
>  [<ffffffff812c92f5>] ? paravirt_read_tsc+0x5/0x8
>  [<ffffffff812cae43>] ? add_interrupt_randomness+0x3f/0x15d
>  [<ffffffff813db9c8>] ? printk+0x4f/0x51
>  [<ffffffff810e4433>] ? rcu_check_callbacks+0x195/0x598
>        <=================>  [<ffffffff810a3b50>] ? irqtime_account_process_tick.isra.2+0xd6/0x239
>  [<ffffffff810c232a>] ? tick_sched_do_timer+0x2e/0x2e
>  [<ffffffff81084c35>] ? update_process_times+0x30/0x5b
>  [<ffffffff810c2237>] ? tick_sched_handle+0x3e/0x4a
>  [<ffffffff810c235a>] ? tick_sched_timer+0x30/0x4c
>  [<ffffffff81098355>] ? __run_hrtimer+0x93/0x159
>  [<ffffffff81098b72>] ? hrtimer_interrupt+0xe3/0x1ca
>  [<ffffffff8103d8e4>] ? xen_timer_interrupt+0x31/0x13b
>  [<ffffffff81294c4c>] ? HYPERVISOR_event_channel_op+0xd/0x1d
>  [<ffffffff8103d79b>] ? xen_force_evtchn_callback+0x9/0xa
>  [<ffffffff8103df22>] ? check_events+0x12/0x20
>  [<ffffffff810b5b7a>] ? handle_irq_event_percpu+0x4d/0x1c5
>  [<ffffffff813e556e>] ? notifier_call_chain+0x32/0x52
>  [<ffffffff810b8287>] ? handle_percpu_irq+0x39/0x4c
>  [<ffffffff812951c0>] ? __xen_evtchn_do_upcall+0x107/0x2cb
>  [<ffffffff81219936>] ? delay_tsc+0x9c/0xc6
>  [<ffffffff81093ba0>] ? __rcu_read_unlock+0x33/0x51
>  [<ffffffff8129663a>] ? xen_evtchn_do_upcall+0x22/0x32
>  [<ffffffff813e897e>] ? xen_do_hypervisor_callback+0x1e/0x30
>  <EOI>  [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
>  [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
>  [<ffffffff8103d768>] ? xen_safe_halt+0xc/0x13
>  [<ffffffff8104ae0b>] ? default_idle+0x14/0x3e
>  [<ffffffff810b53ee>] ? cpu_startup_entry+0x107/0x160
> Code: fb 4c 89 d6 b9 08 00 00 00 ff cd 83 fd ff 74 32 44 0f be 2e 44 29 c1
> 8b 44 24 18 4d 8d 73 04 41 d3 fd 44 23 6c 24 04 43 23 04 a9 <41> 89 c5 41
> 31 fd 45 89 2b 85 c9 75 05 48 ff c6 b1 08 4d 89 f3
>
>
> Which looks to be printing something on the VT console (which is running
> in KMS mode as it uses framebuffer calls). So is there something on the
> screen scrolling wildly in a loop?
>
> But then there are also complaints about
>
> INFO: NMI handler (arch_trigger_all_cpu_backtrace_handler) took too long
> to run: 1.115 msecs
>
> this taking too long. I am wondering if there is some time issue
> on your box.
>
> What version of Xen do you have?
> >
> >
> > 2013/9/27 Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> >
> > > On Thu, Sep 26, 2013 at 07:59:40PM +0200, Matthias wrote:
> > > > I'm currently on a vanilla 3.8.2 kernel because this is the only >3.4
> > > > kernel I found which doesn't give me this issue:
> > > > http://lists.xen.org/archives/html/xen-users/2013-02/msg00114.html
> > >
> > > So v3.12 (or rather the latest and greatest from Linus) has the
> > > mechanism for the NMI - so you can actually see what is causing the
> > > stall.
> > >
>


Pasi Kärkkäinen

2013-Oct-04 06:07 UTC

Re: Is: RCU callback detects an RCU hang with Linux 3.12+ Was: Re: Status of FLR in Xen 4.4

On Fri, Oct 04, 2013 at 12:34:56AM +0200, Matthias wrote:
>    Hi Konrad,
> 
>    sorry I missed your entry, google mail might not be the best software to
>    view mailing lists ;)
> 
>    The RCU stall happens roughly 2 minutes after the machine is fully
>    booted, and I'm usually working via SSH by then..
> 
>    I basically have two cases where the stall happens:
> 
>    1) Without the no-cpuidle function, It happens when I start xencommons
>    2) With or without no-cpuidle, this happens sometimes and arbitrary and
>    I have the feeling that logging in via SSH (or network traffic in
>    general?) will increase the chance of the rcu stall and (and this is
>    only a guess) in most cases this actually happens when I enter a command
>    of more then 16 chars in the ssh command prompt. (I don't really think
>    that this is really causing the issue, I just noticed that when entering
>    the usual commands to start all the xen stuff / boot the domUs, it
>    stalls mostly on the same commands / when ssh freezes I came to the same
>    part of the command). But more ssh-intensive commands like 'dmesg' or
>    'htop' don't cause it..
> 
>    Also, I can't really say what is on the screen because my dom0 does not
>    have a vga card / both vga cards in the server are passed to different
>    domUs and when I don't hide the vga cards on boot via xen-pciback.hide,
>    the rcu usually does not stall and everything is fine..
> 
For debugging you should have a serial console, so maybe get a PCI serial
card if you don't have a management processor offering SOL (Serial over LAN)?

-- Pasi
