thr3ads.net - Xen devel - [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug. [Jan 2010]

If this information is useful, please help other people find it:
Share via:

Joe Jin

2010-Jan-19 09:52 UTC

[Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

Hi,

At backend driver blkback and blktap, when checking statistics information,
at the time vbd device remove, kernel will crash.

Below patch will fix it, please review and apply.

Signed-off-by: Joe Jin <joe.jin@oracle.com>

 blkback/xenbus.c |    5 ++++-
 blktap/xenbus.c  |    5 ++++-
 2 files changed, 8 insertions(+), 2 deletions(-)
---

diff -r 6061d5615522 drivers/xen/blkback/xenbus.c
--- a/drivers/xen/blkback/xenbus.c	Fri Jan 08 13:07:17 2010 +0000
+++ b/drivers/xen/blkback/xenbus.c	Tue Jan 19 17:37:32 2010 +0800
@@ -104,10 +104,13 @@
 				   struct device_attribute *attr,	\
 				   char *buf)				\
 	{								\
+		ssize_t ret = -ENODEV;					\
 		struct xenbus_device *dev = to_xenbus_device(_dev);	\
 		struct backend_info *be = dev->dev.driver_data;		\
 									\
-		return sprintf(buf, format, ##args);			\
+		if (be && be->blkif)					\
+			ret = sprintf(buf, format, ##args);		\
+		return ret;						\
 	}								\
 	static DEVICE_ATTR(name, S_IRUGO, show_##name, NULL)
 
diff -r 6061d5615522 drivers/xen/blktap/xenbus.c
--- a/drivers/xen/blktap/xenbus.c	Fri Jan 08 13:07:17 2010 +0000
+++ b/drivers/xen/blktap/xenbus.c	Tue Jan 19 17:37:32 2010 +0800
@@ -122,10 +122,13 @@
 				   struct device_attribute *attr,	\
 				   char *buf)				\
 	{								\
+		ssize_t ret = -ENODEV;					\
 		struct xenbus_device *dev = to_xenbus_device(_dev);	\
 		struct backend_info *be = dev->dev.driver_data;		\
 									\
-		return sprintf(buf, format, ##args);			\
+		if (be && be->blkif)					\
+			ret = sprintf(buf, format, ##args);		\
+		return ret;						\
 	}								\
 	static DEVICE_ATTR(name, S_IRUGO, show_##name, NULL)
 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Jan Beulich

2010-Jan-19 10:25 UTC

head link

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

>>> "Joe Jin" <joe.jin@oracle.com> 19.01.10 10:52
>>>
>At backend driver blkback and blktap, when checking statistics information,
>at the time vbd device remove, kernel will crash.
>
>Below patch will fix it, please review and apply.
This isn''t a complete fix if I follow your analysis: There''s
still a race
between blk{back,tap}_remove() freeing be->blkif/be and the sysfs
code. dev->dev.driver_data (and possibly be->blkif) must be cleared
before freeing it (them).

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Joe Jin

2010-Jan-19 11:32 UTC

head link

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

On 2010-01-19 10:25, Jan Beulich wrote:> >>> "Joe Jin" <joe.jin@oracle.com> 19.01.10 10:52
>>>
> >At backend driver blkback and blktap, when checking statistics
information,
> >at the time vbd device remove, kernel will crash.
> >
> >Below patch will fix it, please review and apply.
> 
> This isn''t a complete fix if I follow your analysis:
There''s still a race
> between blk{back,tap}_remove() freeing be->blkif/be and the sysfs
> code. dev->dev.driver_data (and possibly be->blkif) must be cleared
> before freeing it (them).
> 
Thanks for you point.
Anyway, I think need to add some check at VBD_SHOW even cleared 
dev->dev.driver_data before free it, sysfs->fops() have been 
initialized when call open(), and later when read the file, call
trace will fall VBD_SHOW defined function(s), so it should crashed.

The checks may look like below?


diff -r 6061d5615522 drivers/xen/blkback/xenbus.c
--- a/drivers/xen/blkback/xenbus.c	Fri Jan 08 13:07:17 2010 +0000
+++ b/drivers/xen/blkback/xenbus.c	Tue Jan 19 19:30:32 2010 +0800
@@ -104,10 +104,13 @@
 				   struct device_attribute *attr,	\
 				   char *buf)				\
 	{								\
+		ssize_t ret = -ENODEV;					\
 		struct xenbus_device *dev = to_xenbus_device(_dev);	\
-		struct backend_info *be = dev->dev.driver_data;		\
+		struct backend_info *be;				\
 									\
-		return sprintf(buf, format, ##args);			\
+		if (dev && (be = dev->dev.driver_data) && be->blkif)	\
+			ret = sprintf(buf, format, ##args);		\
+		return ret;						\
 	}								\
 	static DEVICE_ATTR(name, S_IRUGO, show_##name, NULL)
 
diff -r 6061d5615522 drivers/xen/blktap/xenbus.c
--- a/drivers/xen/blktap/xenbus.c	Fri Jan 08 13:07:17 2010 +0000
+++ b/drivers/xen/blktap/xenbus.c	Tue Jan 19 19:30:32 2010 +0800
@@ -122,10 +122,13 @@
 				   struct device_attribute *attr,	\
 				   char *buf)				\
 	{								\
+		ssize_t ret = -ENODEV;					\
 		struct xenbus_device *dev = to_xenbus_device(_dev);	\
-		struct backend_info *be = dev->dev.driver_data;		\
+		struct backend_info *be;				\
 									\
-		return sprintf(buf, format, ##args);			\
+		if (dev && (be = dev->dev.driver_data) && be->blkif)	\
+			ret = sprintf(buf, format, ##args);		\
+		return ret;						\
 	}								\
 	static DEVICE_ATTR(name, S_IRUGO, show_##name, NULL)
 


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Jan Beulich

2010-Jan-19 12:06 UTC

head link

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

>>> "Joe Jin" <joe.jin@oracle.com> 19.01.10 12:32
>>>
>On 2010-01-19 10:25, Jan Beulich wrote:
>> >>> "Joe Jin" <joe.jin@oracle.com> 19.01.10
10:52 >>>
>> >At backend driver blkback and blktap, when checking statistics
information,
>> >at the time vbd device remove, kernel will crash.
>> >
>> >Below patch will fix it, please review and apply.
>> 
>> This isn''t a complete fix if I follow your analysis:
There''s still a race
>> between blk{back,tap}_remove() freeing be->blkif/be and the sysfs
>> code. dev->dev.driver_data (and possibly be->blkif) must be
cleared
>> before freeing it (them).
>> 
>
>Thanks for you point.
>Anyway, I think need to add some check at VBD_SHOW even cleared 
>dev->dev.driver_data before free it, sysfs->fops() have been 
>initialized when call open(), and later when read the file, call
>trace will fall VBD_SHOW defined function(s), so it should crashed.
I think I understand what you''re saying. But then there''s only
one
solution - avoid freeing those two data structures from the .remove
handler.
>The checks may look like below?
Checking dev to be non-NULL is certainly pointless in any case.

But wait - how did you see the crash you''re trying to fix occur in the
first place? blkback_remove() calls xenvbd_sysfs_delif(), so the sysfs
entries are gone by the time driver_data gets freed.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Joe Jin

2010-Jan-19 14:13 UTC

head link

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

On 2010-01-19 12:06, Jan Beulich wrote:> >>> "Joe Jin" <joe.jin@oracle.com> 19.01.10 12:32
>>>
> >On 2010-01-19 10:25, Jan Beulich wrote:
> >> >>> "Joe Jin" <joe.jin@oracle.com>
19.01.10 10:52 >>>
> >> >At backend driver blkback and blktap, when checking statistics
information,
> >> >at the time vbd device remove, kernel will crash.
> >> >
> >> >Below patch will fix it, please review and apply.
> >> 
> >> This isn''t a complete fix if I follow your analysis:
There''s still a race
> >> between blk{back,tap}_remove() freeing be->blkif/be and the
sysfs
> >> code. dev->dev.driver_data (and possibly be->blkif) must be
cleared
> >> before freeing it (them).
> >> 
> >
> >Thanks for you point.
> >Anyway, I think need to add some check at VBD_SHOW even cleared 
> >dev->dev.driver_data before free it, sysfs->fops() have been 
> >initialized when call open(), and later when read the file, call
> >trace will fall VBD_SHOW defined function(s), so it should crashed.
> 
> I think I understand what you''re saying. But then there''s
only one
> solution - avoid freeing those two data structures from the .remove
> handler.
locked blk{back,tap}_remove and VBD_SHOW? I did not found any refcnt help 
for this. but even add lock for this, I think still need the checks.
> 
> >The checks may look like below?
> 
> Checking dev to be non-NULL is certainly pointless in any case.
> 
> But wait - how did you see the crash you''re trying to fix occur in
the
> first place? blkback_remove() calls xenvbd_sysfs_delif(), so the sysfs
> entries are gone by the time driver_data gets freed.
When tested continous reboot VM testcase kernel panic. then I write 
a program as blow:
while (1) {
	open("/sys/devices/xen-backend/vbd-x-xxxx/statistics/rd_sect");
	usleep(100);
	read(fd, buff, BUFF_LEN);
}

running it, then reboot the vm, will hit the panic.

As I have menthioned, when open the file, kernel have initialized 
file handler, when call file operation, call the function.
At here, even blkback_remove() have delete sysfs entry, but did not
released sysfs->fops, that caused the panic.


Thanks,
Joe

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Jan Beulich

2010-Jan-19 16:20 UTC

head link

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

>>> "Joe Jin" <joe.jin@oracle.com> 19.01.10 15:13
>>>
>When tested continous reboot VM testcase kernel panic. then I write 
>a program as blow:
>while (1) {
>	open("/sys/devices/xen-backend/vbd-x-xxxx/statistics/rd_sect");
>	usleep(100);
>	read(fd, buff, BUFF_LEN);
>}
>
>running it, then reboot the vm, will hit the panic.
>
>As I have menthioned, when open the file, kernel have initialized 
>file handler, when call file operation, call the function.
>At here, even blkback_remove() have delete sysfs entry, but did not
>released sysfs->fops, that caused the panic.
That would seem like a general sysfs problem then - I can''t imagine
other modules do much to prevent this, though I''m sure there are
other modules which add/remove entries on the fly. Would require
some looking through the sources to see whether there is any
generally usable (and simple) solution to this.

But irrespective of that, I think the race remains even after your
patch - only the window gets smaller.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Joe Jin

2010-Jan-20 02:06 UTC

head link

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

On 2010-01-19 16:20, Jan Beulich wrote:> >>> "Joe Jin" <joe.jin@oracle.com> 19.01.10 15:13
>>>
> >When tested continous reboot VM testcase kernel panic. then I write 
> >a program as blow:
> >while (1) {
> >
open("/sys/devices/xen-backend/vbd-x-xxxx/statistics/rd_sect");
> >	usleep(100);
> >	read(fd, buff, BUFF_LEN);
> >}
> >
> >running it, then reboot the vm, will hit the panic.
> >
> >As I have menthioned, when open the file, kernel have initialized 
> >file handler, when call file operation, call the function.
> >At here, even blkback_remove() have delete sysfs entry, but did not
> >released sysfs->fops, that caused the panic.
> 
> That would seem like a general sysfs problem then - I can''t
imagine
> other modules do much to prevent this, though I''m sure there are
> other modules which add/remove entries on the fly. Would require
> some looking through the sources to see whether there is any
> generally usable (and simple) solution to this.
sysfs did not provide lock to handle this, not sure if developer
think it is not necessary or they''d like to caller to handled it.
> 
> But irrespective of that, I think the race remains even after your
> patch - only the window gets smaller.
Add a rwlock will fix the race like below patch?


diff -r 6061d5615522 drivers/xen/blkback/xenbus.c
--- a/drivers/xen/blkback/xenbus.c	Fri Jan 08 13:07:17 2010 +0000
+++ b/drivers/xen/blkback/xenbus.c	Wed Jan 20 10:00:53 2010 +0800
@@ -26,6 +26,8 @@
 #define DPRINTK(fmt, args...)				\
 	pr_debug("blkback/xenbus (%s:%d) " fmt ".\n",	\
 		 __FUNCTION__, __LINE__, ##args)
+
+static rwlock_t sysfs_read_lock = RW_LOCK_UNLOCKED;
 
 struct backend_info
 {
@@ -104,10 +106,15 @@
 				   struct device_attribute *attr,	\
 				   char *buf)				\
 	{								\
+		ssize_t ret = -ENODEV;					\
 		struct xenbus_device *dev = to_xenbus_device(_dev);	\
-		struct backend_info *be = dev->dev.driver_data;		\
+		struct backend_info *be;				\
 									\
-		return sprintf(buf, format, ##args);			\
+		read_lock(&sysfs_read_lock);				\
+		if ((be = dev->dev.driver_data) && be->blkif)		\
+			ret = sprintf(buf, format, ##args);		\
+		read_unlock(&sysfs_read_lock);				\
+		return ret;						\
 	}								\
 	static DEVICE_ATTR(name, S_IRUGO, show_##name, NULL)
 
@@ -173,6 +180,7 @@
 
 	DPRINTK("");
 
+	write_lock(&sysfs_read_lock);
 	if (be->major || be->minor)
 		xenvbd_sysfs_delif(dev);
 
@@ -191,6 +199,7 @@
 
 	kfree(be);
 	dev->dev.driver_data = NULL;
+	write_unlock(&sysfs_read_lock);
 	return 0;
 }
 
diff -r 6061d5615522 drivers/xen/blktap/xenbus.c
--- a/drivers/xen/blktap/xenbus.c	Fri Jan 08 13:07:17 2010 +0000
+++ b/drivers/xen/blktap/xenbus.c	Wed Jan 20 10:00:53 2010 +0800
@@ -50,6 +50,7 @@
 	int group_added;
 };
 
+static rwlock_t sysfs_read_lock = RW_LOCK_UNLOCKED;
 
 static void connect(struct backend_info *);
 static int connect_ring(struct backend_info *);
@@ -122,10 +123,15 @@
 				   struct device_attribute *attr,	\
 				   char *buf)				\
 	{								\
+		ssize_t ret = -ENODEV;					\
 		struct xenbus_device *dev = to_xenbus_device(_dev);	\
-		struct backend_info *be = dev->dev.driver_data;		\
+		struct backend_info *be;				\
 									\
-		return sprintf(buf, format, ##args);			\
+		read_lock(&sysfs_read_lock);				\
+		if (dev && (be = dev->dev.driver_data) && be->blkif)	\
+			ret = sprintf(buf, format, ##args);		\
+		read_unlock(&sysfs_read_lock);				\
+		return ret;						\
 	}								\
 	static DEVICE_ATTR(name, S_IRUGO, show_##name, NULL)
 
@@ -170,6 +176,7 @@
 {
 	struct backend_info *be = dev->dev.driver_data;
 
+	write_lock(&sysfs_read_lock);
 	if (be->group_added)
 		xentap_sysfs_delif(be->dev);
 	if (be->backend_watch.node) {
@@ -187,6 +194,7 @@
 	}
 	kfree(be);
 	dev->dev.driver_data = NULL;
+	write_unlock(&sysfs_read_lock);
 	return 0;
 }
 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Jan Beulich

2010-Jan-20 07:46 UTC

head link

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

>>> "Joe Jin" <joe.jin@oracle.com> 20.01.10 03:06
>>>
>sysfs did not provide lock to handle this, not sure if developer
>think it is not necessary or they''d like to caller to handled it.
A lock is probably not the usual way to deal with this; ref-counting
would seem more common. Nevertheless I think adding a lock will
take care of the issue here.
>--- a/drivers/xen/blktap/xenbus.c	Fri Jan 08 13:07:17 2010 +0000
>+++ b/drivers/xen/blktap/xenbus.c	Wed Jan 20 10:00:53 2010 +0800
>...
>@@ -122,10 +123,15 @@
> 				   struct device_attribute *attr,	\
> 				   char *buf)				\
> 	{								\
>+		ssize_t ret = -ENODEV;					\
> 		struct xenbus_device *dev = to_xenbus_device(_dev);	\
The use of to_xenbus_device() here makes ...
>-		struct backend_info *be = dev->dev.driver_data;		\
>+		struct backend_info *be;				\
> 									\
>-		return sprintf(buf, format, ##args);			\
>+		read_lock(&sysfs_read_lock);				\
>+		if (dev && (be = dev->dev.driver_data) &&
be->blkif)	\
... the checking of dev here useless (and the blkback part of the patch
doesn''t do the same).
>+			ret = sprintf(buf, format, ##args);		\
>+		read_unlock(&sysfs_read_lock);				\
>+		return ret;						\
> 	}								\
> 	static DEVICE_ATTR(name, S_IRUGO, show_##name, NULL)
> 
And btw., in both cases with the lock added there''s no need to check
both ''be'' and ''be->blkif'', since
be->blkif can''t be NULL when be is
non-NULL.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Joe Jin

2010-Jan-20 10:51 UTC

head link

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

On 2010-01-20 07:46, Jan Beulich wrote:> >>> "Joe Jin" <joe.jin@oracle.com> 20.01.10 03:06
>>>
> >sysfs did not provide lock to handle this, not sure if developer
> >think it is not necessary or they''d like to caller to handled
it.
> 
> A lock is probably not the usual way to deal with this; ref-counting
> would seem more common. Nevertheless I think adding a lock will
> take care of the issue here.
Add refcnt could not fix the race for file open is sysfs file operations
and will not touch xenbus/blk{back,tap} struct, so could not increase 
the refcnt.
The root cause is after open sysfs file, vbd was been removed by other.
For the lock type is read/write lock, so the lock almost will not slowdwon 
performance, isn''t it?
> 
> >--- a/drivers/xen/blktap/xenbus.c	Fri Jan 08 13:07:17 2010 +0000
> >+++ b/drivers/xen/blktap/xenbus.c	Wed Jan 20 10:00:53 2010 +0800
> >...
> >@@ -122,10 +123,15 @@
> > 				   struct device_attribute *attr,	\
> > 				   char *buf)				\
> > 	{								\
> >+		ssize_t ret = -ENODEV;					\
> > 		struct xenbus_device *dev = to_xenbus_device(_dev);	\
> 
> The use of to_xenbus_device() here makes ...
> 
> >-		struct backend_info *be = dev->dev.driver_data;		\
> >+		struct backend_info *be;				\
> > 									\
> >-		return sprintf(buf, format, ##args);			\
> >+		read_lock(&sysfs_read_lock);				\
> >+		if (dev && (be = dev->dev.driver_data) &&
be->blkif)	\
> 
> ... the checking of dev here useless (and the blkback part of the patch
> doesn''t do the same).
Sorry I forgot get rid of it :)
> 
> >+			ret = sprintf(buf, format, ##args);		\
> >+		read_unlock(&sysfs_read_lock);				\
> >+		return ret;						\
> > 	}								\
> > 	static DEVICE_ATTR(name, S_IRUGO, show_##name, NULL)
> > 
> 
> And btw., in both cases with the lock added there''s no need to
check
> both ''be'' and ''be->blkif'', since
be->blkif can''t be NULL when be is
> non-NULL.
No we must check it, as I have methioned, root cause is vbd was been
freed after open sysfs file.


Below is the new patch, please review.


diff -r 0bec29c94ce9 drivers/xen/blkback/xenbus.c
--- a/drivers/xen/blkback/xenbus.c	Mon Jan 18 14:50:43 2010 +0000
+++ b/drivers/xen/blkback/xenbus.c	Wed Jan 20 18:49:50 2010 +0800
@@ -26,6 +26,8 @@
 #define DPRINTK(fmt, args...)				\
 	pr_debug("blkback/xenbus (%s:%d) " fmt ".\n",	\
 		 __FUNCTION__, __LINE__, ##args)
+
+static rwlock_t sysfs_read_lock = RW_LOCK_UNLOCKED;
 
 struct backend_info
 {
@@ -104,10 +106,19 @@
 				   struct device_attribute *attr,	\
 				   char *buf)				\
 	{								\
-		struct xenbus_device *dev = to_xenbus_device(_dev);	\
-		struct backend_info *be = dev->dev.driver_data;		\
+		ssize_t ret = -ENODEV;					\
+		struct xenbus_device *dev;				\
+		struct backend_info *be;				\
 									\
-		return sprintf(buf, format, ##args);			\
+		if (!get_device(_dev))					\
+			return ret;					\
+		dev = to_xenbus_device(_dev);				\
+		read_lock(&sysfs_read_lock);				\
+		if ((be = dev->dev.driver_data) && be->blkif)		\
+			ret = sprintf(buf, format, ##args);		\
+		read_unlock(&sysfs_read_lock);				\
+		put_device(_dev);					\
+		return ret;						\
 	}								\
 	static DEVICE_ATTR(name, S_IRUGO, show_##name, NULL)
 
@@ -173,6 +184,7 @@
 
 	DPRINTK("");
 
+	write_lock(&sysfs_read_lock);
 	if (be->major || be->minor)
 		xenvbd_sysfs_delif(dev);
 
@@ -191,6 +203,7 @@
 
 	kfree(be);
 	dev->dev.driver_data = NULL;
+	write_unlock(&sysfs_read_lock);
 	return 0;
 }
 
diff -r 0bec29c94ce9 drivers/xen/blktap/xenbus.c
--- a/drivers/xen/blktap/xenbus.c	Mon Jan 18 14:50:43 2010 +0000
+++ b/drivers/xen/blktap/xenbus.c	Wed Jan 20 18:49:50 2010 +0800
@@ -50,6 +50,7 @@
 	int group_added;
 };
 
+static rwlock_t sysfs_read_lock = RW_LOCK_UNLOCKED;
 
 static void connect(struct backend_info *);
 static int connect_ring(struct backend_info *);
@@ -122,10 +123,19 @@
 				   struct device_attribute *attr,	\
 				   char *buf)				\
 	{								\
-		struct xenbus_device *dev = to_xenbus_device(_dev);	\
-		struct backend_info *be = dev->dev.driver_data;		\
+		ssize_t ret = -ENODEV;					\
+		struct xenbus_device *dev;				\
+		struct backend_info *be;				\
 									\
-		return sprintf(buf, format, ##args);			\
+		if (!get_device(_dev))					\
+			return ret;					\
+		dev = to_xenbus_device(_dev);				\
+		read_lock(&sysfs_read_lock);				\
+		if ((be = dev->dev.driver_data) && be->blkif)		\
+			ret = sprintf(buf, format, ##args);		\
+		read_unlock(&sysfs_read_lock);				\
+		put_device(_dev);					\
+		return ret;						\
 	}								\
 	static DEVICE_ATTR(name, S_IRUGO, show_##name, NULL)
 
@@ -170,6 +180,7 @@
 {
 	struct backend_info *be = dev->dev.driver_data;
 
+	write_lock(&sysfs_read_lock);
 	if (be->group_added)
 		xentap_sysfs_delif(be->dev);
 	if (be->backend_watch.node) {
@@ -187,6 +198,7 @@
 	}
 	kfree(be);
 	dev->dev.driver_data = NULL;
+	write_unlock(&sysfs_read_lock);
 	return 0;
 }
 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Jan Beulich

2010-Jan-20 11:06 UTC

head link

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

>>> "Joe Jin" <joe.jin@oracle.com> 20.01.10 11:51
>>>
>On 2010-01-20 07:46, Jan Beulich wrote:
>> And btw., in both cases with the lock added there''s no need to
check
>> both ''be'' and ''be->blkif'', since
be->blkif can''t be NULL when be is
>> non-NULL.
>
>No we must check it, as I have methioned, root cause is vbd was been
>freed after open sysfs file.
Why? You''re inside the locked region, so either ''be''
is NULL, or both
''be'' and ''be->blkif'' are non-NULL.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Joe Jin

2010-Jan-20 11:45 UTC

head link

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

On 2010-01-20 11:06, Jan Beulich wrote:> >>> "Joe Jin" <joe.jin@oracle.com> 20.01.10 11:51
>>>
> >On 2010-01-20 07:46, Jan Beulich wrote:
> >> And btw., in both cases with the lock added there''s no
need to check
> >> both ''be'' and ''be->blkif'',
since be->blkif can''t be NULL when be is
> >> non-NULL.
> >
> >No we must check it, as I have methioned, root cause is vbd was been
> >freed after open sysfs file.
> 
> Why? You''re inside the locked region, so either
''be'' is NULL, or both
> ''be'' and ''be->blkif'' are non-NULL.
> 
Sorry for misunstdrstood, your are right, attach new patch.


diff -r 0bec29c94ce9 drivers/xen/blkback/xenbus.c
--- a/drivers/xen/blkback/xenbus.c	Mon Jan 18 14:50:43 2010 +0000
+++ b/drivers/xen/blkback/xenbus.c	Wed Jan 20 19:44:32 2010 +0800
@@ -26,6 +26,8 @@
 #define DPRINTK(fmt, args...)				\
 	pr_debug("blkback/xenbus (%s:%d) " fmt ".\n",	\
 		 __FUNCTION__, __LINE__, ##args)
+
+static rwlock_t sysfs_read_lock = RW_LOCK_UNLOCKED;
 
 struct backend_info
 {
@@ -104,10 +106,19 @@
 				   struct device_attribute *attr,	\
 				   char *buf)				\
 	{								\
-		struct xenbus_device *dev = to_xenbus_device(_dev);	\
-		struct backend_info *be = dev->dev.driver_data;		\
+		ssize_t ret = -ENODEV;					\
+		struct xenbus_device *dev;				\
+		struct backend_info *be;				\
 									\
-		return sprintf(buf, format, ##args);			\
+		if (!get_device(_dev))					\
+			return ret;					\
+		dev = to_xenbus_device(_dev);				\
+		read_lock(&sysfs_read_lock);				\
+		if ((be = dev->dev.driver_data) != NULL)		\
+			ret = sprintf(buf, format, ##args);		\
+		read_unlock(&sysfs_read_lock);				\
+		put_device(_dev);					\
+		return ret;						\
 	}								\
 	static DEVICE_ATTR(name, S_IRUGO, show_##name, NULL)
 
@@ -173,6 +184,7 @@
 
 	DPRINTK("");
 
+	write_lock(&sysfs_read_lock);
 	if (be->major || be->minor)
 		xenvbd_sysfs_delif(dev);
 
@@ -191,6 +203,7 @@
 
 	kfree(be);
 	dev->dev.driver_data = NULL;
+	write_unlock(&sysfs_read_lock);
 	return 0;
 }
 
diff -r 0bec29c94ce9 drivers/xen/blktap/xenbus.c
--- a/drivers/xen/blktap/xenbus.c	Mon Jan 18 14:50:43 2010 +0000
+++ b/drivers/xen/blktap/xenbus.c	Wed Jan 20 19:44:32 2010 +0800
@@ -50,6 +50,7 @@
 	int group_added;
 };
 
+static rwlock_t sysfs_read_lock = RW_LOCK_UNLOCKED;
 
 static void connect(struct backend_info *);
 static int connect_ring(struct backend_info *);
@@ -122,10 +123,19 @@
 				   struct device_attribute *attr,	\
 				   char *buf)				\
 	{								\
-		struct xenbus_device *dev = to_xenbus_device(_dev);	\
-		struct backend_info *be = dev->dev.driver_data;		\
+		ssize_t ret = -ENODEV;					\
+		struct xenbus_device *dev;				\
+		struct backend_info *be;				\
 									\
-		return sprintf(buf, format, ##args);			\
+		if (!get_device(_dev))					\
+			return ret;					\
+		dev = to_xenbus_device(_dev);				\
+		read_lock(&sysfs_read_lock);				\
+		if ((be = dev->dev.driver_data) != NULL)		\
+			ret = sprintf(buf, format, ##args);		\
+		read_unlock(&sysfs_read_lock);				\
+		put_device(_dev);					\
+		return ret;						\
 	}								\
 	static DEVICE_ATTR(name, S_IRUGO, show_##name, NULL)
 
@@ -170,6 +180,7 @@
 {
 	struct backend_info *be = dev->dev.driver_data;
 
+	write_lock(&sysfs_read_lock);
 	if (be->group_added)
 		xentap_sysfs_delif(be->dev);
 	if (be->backend_watch.node) {
@@ -187,6 +198,7 @@
 	}
 	kfree(be);
 	dev->dev.driver_data = NULL;
+	write_unlock(&sysfs_read_lock);
 	return 0;
 }
 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Keir Fraser

2010-Jan-20 20:25 UTC

head link

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

On 20/01/2010 11:45, "Joe Jin" <joe.jin@oracle.com> wrote:
>> Why? You''re inside the locked region, so either
''be'' is NULL, or both
>> ''be'' and ''be->blkif'' are
non-NULL.
>> 
> 
> Sorry for misunstdrstood, your are right, attach new patch.
I''m waiting for final resubmission with changeset comment and sign-off,
and
for it to collect an ack from Jan, by the way.

 -- Keir



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Daniel Stodden

2010-Jan-21 02:16 UTC

head link

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

Hey.

I think reverted a similar patch when moving past 2.6.18 a while ago,
but I don''t remember the first kernel versions affected right now.

On Wed, 2010-01-20 at 06:45 -0500, Joe Jin wrote:
>  static void connect(struct backend_info *);
>  static int connect_ring(struct backend_info *);
> @@ -122,10 +123,19 @@
>  				   struct device_attribute *attr,	\
>  				   char *buf)				\
>  	{								\
> -		struct xenbus_device *dev = to_xenbus_device(_dev);	\
> -		struct backend_info *be = dev->dev.driver_data;		\
> +		ssize_t ret = -ENODEV;					\
> +		struct xenbus_device *dev;				\
> +		struct backend_info *be;				\

								\> -		return sprintf(buf, format, ##args);			\
> +		if (!get_device(_dev))					\
> +			return ret;					\
> +		dev = to_xenbus_device(_dev);				\
I think this deadlocks here:> +		read_lock(&sysfs_read_lock);				\
> +		if ((be = dev->dev.driver_data) != NULL)		\
> +			ret = sprintf(buf, format, ##args);		\
> +		read_unlock(&sysfs_read_lock);				\
> +		put_device(_dev);					\
> +		return ret;						\
>  	}								\
>  	static DEVICE_ATTR(name, S_IRUGO, show_##name, NULL)
>  
> @@ -170,6 +180,7 @@
>  {
>  	struct backend_info *be = dev->dev.driver_data;
>  
> +	write_lock(&sysfs_read_lock);
>  	if (be->group_added)
.. and here:>  		xentap_sysfs_delif(be->dev);
>  	if (be->backend_watch.node) {
> @@ -187,6 +198,7 @@
>  	}
>  	kfree(be);
>  	dev->dev.driver_data = NULL;
> +	write_unlock(&sysfs_read_lock);
>  	return 0;
>  }
Sysfs may refcount these nodes, blocking xentap_sysfs_delif().


Daniel



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Joe Jin

2010-Jan-21 03:13 UTC

head link

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

On 2010-01-20 18:16, Daniel Stodden wrote:> 
> Hey.
> 
> I think reverted a similar patch when moving past 2.6.18 a while ago,
> but I don''t remember the first kernel versions affected right now.
> 
> On Wed, 2010-01-20 at 06:45 -0500, Joe Jin wrote:
> 
> >  static void connect(struct backend_info *);
> >  static int connect_ring(struct backend_info *);
> > @@ -122,10 +123,19 @@
> >  				   struct device_attribute *attr,	\
> >  				   char *buf)				\
> >  	{								\
> > -		struct xenbus_device *dev = to_xenbus_device(_dev);	\
> > -		struct backend_info *be = dev->dev.driver_data;		\
> > +		ssize_t ret = -ENODEV;					\
> > +		struct xenbus_device *dev;				\
> > +		struct backend_info *be;				\
> 
> 
> 								\
> > -		return sprintf(buf, format, ##args);			\
> > +		if (!get_device(_dev))					\
> > +			return ret;					\
> > +		dev = to_xenbus_device(_dev);				\
> 
> I think this deadlocks here:
> > +		read_lock(&sysfs_read_lock);				\
> > +		if ((be = dev->dev.driver_data) != NULL)		\
> > +			ret = sprintf(buf, format, ##args);		\
> > +		read_unlock(&sysfs_read_lock);				\
> > +		put_device(_dev);					\
> > +		return ret;						\
> >  	}								\
> >  	static DEVICE_ATTR(name, S_IRUGO, show_##name, NULL)
> >  
> > @@ -170,6 +180,7 @@
> >  {
> >  	struct backend_info *be = dev->dev.driver_data;
> >  
> > +	write_lock(&sysfs_read_lock);
> >  	if (be->group_added)
> 
> .. and here:
> >  		xentap_sysfs_delif(be->dev);
> >  	if (be->backend_watch.node) {
> > @@ -187,6 +198,7 @@
> >  	}
> >  	kfree(be);
> >  	dev->dev.driver_data = NULL;
> > +	write_unlock(&sysfs_read_lock);
> >  	return 0;
> >  }
> 
> Sysfs may refcount these nodes, blocking xentap_sysfs_delif().
Would you like to give me some info about of blocking and deadlocks?

Checked xentap_sysfs_delif() -> sysfs_remove_group() and I did not found
blocking codes, maybe I lost something?

Thanks,
Joe

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Daniel Stodden

2010-Jan-21 07:26 UTC

head link

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

On Wed, 2010-01-20 at 22:13 -0500, Joe Jin wrote:> On 2010-01-20 18:16, Daniel Stodden wrote:
> > 
> > Hey.
> > 
> > I think reverted a similar patch when moving past 2.6.18 a while ago,
> > but I don''t remember the first kernel versions affected right
now.
> > 
> > Sysfs may refcount these nodes, blocking xentap_sysfs_delif().
> 
> Would you like to give me some info about of blocking and deadlocks?
> 
> Checked xentap_sysfs_delif() -> sysfs_remove_group() and I did not found
> blocking codes, maybe I lost something?
> 
> Thanks,
> Joe
Hi Joe.

Your patch will work okay on 2.6.18.

But collisions will deadlock after 2.6.23

Found an old stack trace:

[2009-07-08 06:15:08 UTC] INFO: task xb.00021.xvdd:30039 blocked for more than
120 seconds.
[2009-07-08 06:15:08 UTC] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
[2009-07-08 06:15:08 UTC]        c7adfe0c 00000246 00000000 00000000 5b88fd4f
00000256 c7ade000 c7adfdc8
[2009-07-08 06:15:08 UTC]        c0107c1b 38c984a4 00000256 c7adfddc ed629578
ed6293f0 ed629578 c16bdb00
[2009-07-08 06:15:08 UTC]        00000000 eea0d500 c16bdb34 002dc05a 00000000
00000005 0024c31c e20a8ff0
[2009-07-08 06:15:08 UTC] Call Trace:
[2009-07-08 06:15:08 UTC]  [<c0107c1b>] ? local_clock+0x3b/0x90
[2009-07-08 06:15:08 UTC]  [<c0344675>] schedule_timeout+0x75/0xc0
[2009-07-08 06:15:08 UTC]  [<c011ed81>] ? pick_next_task_fair+0x91/0xd0
[2009-07-08 06:15:08 UTC]  [<c03442c9>] wait_for_common+0xa9/0x1c0
[2009-07-08 06:15:08 UTC]  [<c0120d40>] ? default_wake_function+0x0/0x10
[2009-07-08 06:15:08 UTC]  [<c0344472>] wait_for_completion+0x12/0x20
[2009-07-08 06:15:08 UTC]  [<c01cf9e7>] sysfs_addrm_finish+0x1e7/0x230
[2009-07-08 06:15:08 UTC]  [<c01ce2e5>] sysfs_hash_and_remove+0x45/0x70
[2009-07-08 06:15:08 UTC]  [<c01d0adb>] remove_files+0x1b/0x30
[2009-07-08 06:15:08 UTC]  [<c01d0b26>] sysfs_remove_group+0x36/0xc0
[2009-07-08 06:15:08 UTC]  [<c01ae02f>] ? __blkdev_put+0x14f/0x160
[2009-07-08 06:15:08 UTC]  [<c02769fc>] xenvbd_sysfs_delif+0x2c/0x60
[2009-07-08 06:15:08 UTC]  [<c0276a76>] blkback_close+0x46/0x70
[2009-07-08 06:15:08 UTC]  [<c0275d33>] blkif_schedule+0x583/0x5b0
[2009-07-08 06:15:08 UTC]  [<c011ed81>] ? pick_next_task_fair+0x91/0xd0
[2009-07-08 06:15:08 UTC]  [<c013dda0>] ?
autoremove_wake_function+0x0/0x50
[2009-07-08 06:15:08 UTC]  [<c02757b0>] ? blkif_schedule+0x0/0x5b0
[2009-07-08 06:15:08 UTC]  [<c013da42>] kthread+0x42/0x70
[2009-07-08 06:15:08 UTC]  [<c013da00>] ? kthread+0x0/0x70
[2009-07-08 06:15:08 UTC]  [<c010561b>] kernel_thread_helper+0x7/0x10

The reason is in sysfs_deactivate(), which will sync callers against any
remaining thread in .show()
- show() hangs on the lock
- the lock holder in sysfs_remove_group(), 
  waiting for show() to complete.

Pardon me -- I''m not entirely sure where/how these patches are
currently
submitted and merged. I suppose yours are only for linux-2.6.18.hg, not
e.g. pvops? Then sorry for any confusion.

Daniel



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Joe Jin

2010-Jan-21 07:49 UTC

head link

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

> Your patch will work okay on 2.6.18.
> 
> But collisions will deadlock after 2.6.23
> 
> Found an old stack trace:
> 
> [2009-07-08 06:15:08 UTC] INFO: task xb.00021.xvdd:30039 blocked for more
than 120 seconds.
> [2009-07-08 06:15:08 UTC] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [2009-07-08 06:15:08 UTC]        c7adfe0c 00000246 00000000 00000000
5b88fd4f 00000256 c7ade000 c7adfdc8
> [2009-07-08 06:15:08 UTC]        c0107c1b 38c984a4 00000256 c7adfddc
ed629578 ed6293f0 ed629578 c16bdb00
> [2009-07-08 06:15:08 UTC]        00000000 eea0d500 c16bdb34 002dc05a
00000000 00000005 0024c31c e20a8ff0
> [2009-07-08 06:15:08 UTC] Call Trace:
> [2009-07-08 06:15:08 UTC]  [<c0107c1b>] ? local_clock+0x3b/0x90
> [2009-07-08 06:15:08 UTC]  [<c0344675>] schedule_timeout+0x75/0xc0
> [2009-07-08 06:15:08 UTC]  [<c011ed81>] ?
pick_next_task_fair+0x91/0xd0
> [2009-07-08 06:15:08 UTC]  [<c03442c9>] wait_for_common+0xa9/0x1c0
> [2009-07-08 06:15:08 UTC]  [<c0120d40>] ?
default_wake_function+0x0/0x10
> [2009-07-08 06:15:08 UTC]  [<c0344472>] wait_for_completion+0x12/0x20
> [2009-07-08 06:15:08 UTC]  [<c01cf9e7>]
sysfs_addrm_finish+0x1e7/0x230
> [2009-07-08 06:15:08 UTC]  [<c01ce2e5>]
sysfs_hash_and_remove+0x45/0x70
> [2009-07-08 06:15:08 UTC]  [<c01d0adb>] remove_files+0x1b/0x30
> [2009-07-08 06:15:08 UTC]  [<c01d0b26>] sysfs_remove_group+0x36/0xc0
> [2009-07-08 06:15:08 UTC]  [<c01ae02f>] ? __blkdev_put+0x14f/0x160
> [2009-07-08 06:15:08 UTC]  [<c02769fc>] xenvbd_sysfs_delif+0x2c/0x60
> [2009-07-08 06:15:08 UTC]  [<c0276a76>] blkback_close+0x46/0x70
> [2009-07-08 06:15:08 UTC]  [<c0275d33>] blkif_schedule+0x583/0x5b0
> [2009-07-08 06:15:08 UTC]  [<c011ed81>] ?
pick_next_task_fair+0x91/0xd0
> [2009-07-08 06:15:08 UTC]  [<c013dda0>] ?
autoremove_wake_function+0x0/0x50
> [2009-07-08 06:15:08 UTC]  [<c02757b0>] ? blkif_schedule+0x0/0x5b0
> [2009-07-08 06:15:08 UTC]  [<c013da42>] kthread+0x42/0x70
> [2009-07-08 06:15:08 UTC]  [<c013da00>] ? kthread+0x0/0x70
> [2009-07-08 06:15:08 UTC]  [<c010561b>] kernel_thread_helper+0x7/0x10
> 
> The reason is in sysfs_deactivate(), which will sync callers against any
> remaining thread in .show()
> - show() hangs on the lock
> - the lock holder in sysfs_remove_group(), 
>   waiting for show() to complete.
> 
> Pardon me -- I''m not entirely sure where/how these patches are
currently
> submitted and merged. I suppose yours are only for linux-2.6.18.hg, not
> e.g. pvops? Then sorry for any confusion.
> 
Daniel, 

Thanks a lot of your comments, it really help for me, yes my patch based 
linux-2.6.18.hg branch. As Jan have pointed out in previous email, it should 
be sysfs''s issue, looked like later kernel sysfs have fixed the issue?

Thanks,
Joe


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Daniel Stodden

2010-Jan-21 18:01 UTC

head link

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

On Thu, 2010-01-21 at 02:49 -0500, Joe Jin wrote:
> > The reason is in sysfs_deactivate(), which will sync callers against
any
> > remaining thread in .show()
> > - show() hangs on the lock
> > - the lock holder in sysfs_remove_group(), 
> >   waiting for show() to complete.
> > 
> > Pardon me -- I''m not entirely sure where/how these patches
are currently
> > submitted and merged. I suppose yours are only for linux-2.6.18.hg,
not
> > e.g. pvops? Then sorry for any confusion.
> > 
> 
> Daniel, 
> 
> Thanks a lot of your comments, it really help for me, yes my patch based 
> linux-2.6.18.hg branch. As Jan have pointed out in previous email, it
should
> be sysfs''s issue, looked like later kernel sysfs have fixed the
issue?
Yes. Later sysfs won''t need the locking. Ordering sysfs removal before
backend release will be sufficient.

Daniel



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Xen devel - Jan 2010 - [PATCH] Fix blkback/blktap sysfs read bug.

[Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.

Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.