Hi, At backend driver blkback and blktap, when checking statistics information, at the time vbd device remove, kernel will crash. Below patch will fix it, please review and apply. Signed-off-by: Joe Jin <joe.jin@oracle.com> blkback/xenbus.c | 5 ++++- blktap/xenbus.c | 5 ++++- 2 files changed, 8 insertions(+), 2 deletions(-) --- diff -r 6061d5615522 drivers/xen/blkback/xenbus.c --- a/drivers/xen/blkback/xenbus.c Fri Jan 08 13:07:17 2010 +0000 +++ b/drivers/xen/blkback/xenbus.c Tue Jan 19 17:37:32 2010 +0800 @@ -104,10 +104,13 @@ struct device_attribute *attr, \ char *buf) \ { \ + ssize_t ret = -ENODEV; \ struct xenbus_device *dev = to_xenbus_device(_dev); \ struct backend_info *be = dev->dev.driver_data; \ \ - return sprintf(buf, format, ##args); \ + if (be && be->blkif) \ + ret = sprintf(buf, format, ##args); \ + return ret; \ } \ static DEVICE_ATTR(name, S_IRUGO, show_##name, NULL) diff -r 6061d5615522 drivers/xen/blktap/xenbus.c --- a/drivers/xen/blktap/xenbus.c Fri Jan 08 13:07:17 2010 +0000 +++ b/drivers/xen/blktap/xenbus.c Tue Jan 19 17:37:32 2010 +0800 @@ -122,10 +122,13 @@ struct device_attribute *attr, \ char *buf) \ { \ + ssize_t ret = -ENODEV; \ struct xenbus_device *dev = to_xenbus_device(_dev); \ struct backend_info *be = dev->dev.driver_data; \ \ - return sprintf(buf, format, ##args); \ + if (be && be->blkif) \ + ret = sprintf(buf, format, ##args); \ + return ret; \ } \ static DEVICE_ATTR(name, S_IRUGO, show_##name, NULL) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2010-Jan-19 10:25 UTC
Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.
>>> "Joe Jin" <joe.jin@oracle.com> 19.01.10 10:52 >>> >At backend driver blkback and blktap, when checking statistics information, >at the time vbd device remove, kernel will crash. > >Below patch will fix it, please review and apply.This isn''t a complete fix if I follow your analysis: There''s still a race between blk{back,tap}_remove() freeing be->blkif/be and the sysfs code. dev->dev.driver_data (and possibly be->blkif) must be cleared before freeing it (them). Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 2010-01-19 10:25, Jan Beulich wrote:> >>> "Joe Jin" <joe.jin@oracle.com> 19.01.10 10:52 >>> > >At backend driver blkback and blktap, when checking statistics information, > >at the time vbd device remove, kernel will crash. > > > >Below patch will fix it, please review and apply. > > This isn''t a complete fix if I follow your analysis: There''s still a race > between blk{back,tap}_remove() freeing be->blkif/be and the sysfs > code. dev->dev.driver_data (and possibly be->blkif) must be cleared > before freeing it (them). >Thanks for you point. Anyway, I think need to add some check at VBD_SHOW even cleared dev->dev.driver_data before free it, sysfs->fops() have been initialized when call open(), and later when read the file, call trace will fall VBD_SHOW defined function(s), so it should crashed. The checks may look like below? diff -r 6061d5615522 drivers/xen/blkback/xenbus.c --- a/drivers/xen/blkback/xenbus.c Fri Jan 08 13:07:17 2010 +0000 +++ b/drivers/xen/blkback/xenbus.c Tue Jan 19 19:30:32 2010 +0800 @@ -104,10 +104,13 @@ struct device_attribute *attr, \ char *buf) \ { \ + ssize_t ret = -ENODEV; \ struct xenbus_device *dev = to_xenbus_device(_dev); \ - struct backend_info *be = dev->dev.driver_data; \ + struct backend_info *be; \ \ - return sprintf(buf, format, ##args); \ + if (dev && (be = dev->dev.driver_data) && be->blkif) \ + ret = sprintf(buf, format, ##args); \ + return ret; \ } \ static DEVICE_ATTR(name, S_IRUGO, show_##name, NULL) diff -r 6061d5615522 drivers/xen/blktap/xenbus.c --- a/drivers/xen/blktap/xenbus.c Fri Jan 08 13:07:17 2010 +0000 +++ b/drivers/xen/blktap/xenbus.c Tue Jan 19 19:30:32 2010 +0800 @@ -122,10 +122,13 @@ struct device_attribute *attr, \ char *buf) \ { \ + ssize_t ret = -ENODEV; \ struct xenbus_device *dev = to_xenbus_device(_dev); \ - struct backend_info *be = dev->dev.driver_data; \ + struct backend_info *be; \ \ - return sprintf(buf, format, ##args); \ + if (dev && (be = dev->dev.driver_data) && be->blkif) \ + ret = sprintf(buf, format, ##args); \ + return ret; \ } \ static DEVICE_ATTR(name, S_IRUGO, show_##name, NULL) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2010-Jan-19 12:06 UTC
Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.
>>> "Joe Jin" <joe.jin@oracle.com> 19.01.10 12:32 >>> >On 2010-01-19 10:25, Jan Beulich wrote: >> >>> "Joe Jin" <joe.jin@oracle.com> 19.01.10 10:52 >>> >> >At backend driver blkback and blktap, when checking statistics information, >> >at the time vbd device remove, kernel will crash. >> > >> >Below patch will fix it, please review and apply. >> >> This isn''t a complete fix if I follow your analysis: There''s still a race >> between blk{back,tap}_remove() freeing be->blkif/be and the sysfs >> code. dev->dev.driver_data (and possibly be->blkif) must be cleared >> before freeing it (them). >> > >Thanks for you point. >Anyway, I think need to add some check at VBD_SHOW even cleared >dev->dev.driver_data before free it, sysfs->fops() have been >initialized when call open(), and later when read the file, call >trace will fall VBD_SHOW defined function(s), so it should crashed.I think I understand what you''re saying. But then there''s only one solution - avoid freeing those two data structures from the .remove handler.>The checks may look like below?Checking dev to be non-NULL is certainly pointless in any case. But wait - how did you see the crash you''re trying to fix occur in the first place? blkback_remove() calls xenvbd_sysfs_delif(), so the sysfs entries are gone by the time driver_data gets freed. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 2010-01-19 12:06, Jan Beulich wrote:> >>> "Joe Jin" <joe.jin@oracle.com> 19.01.10 12:32 >>> > >On 2010-01-19 10:25, Jan Beulich wrote: > >> >>> "Joe Jin" <joe.jin@oracle.com> 19.01.10 10:52 >>> > >> >At backend driver blkback and blktap, when checking statistics information, > >> >at the time vbd device remove, kernel will crash. > >> > > >> >Below patch will fix it, please review and apply. > >> > >> This isn''t a complete fix if I follow your analysis: There''s still a race > >> between blk{back,tap}_remove() freeing be->blkif/be and the sysfs > >> code. dev->dev.driver_data (and possibly be->blkif) must be cleared > >> before freeing it (them). > >> > > > >Thanks for you point. > >Anyway, I think need to add some check at VBD_SHOW even cleared > >dev->dev.driver_data before free it, sysfs->fops() have been > >initialized when call open(), and later when read the file, call > >trace will fall VBD_SHOW defined function(s), so it should crashed. > > I think I understand what you''re saying. But then there''s only one > solution - avoid freeing those two data structures from the .remove > handler.locked blk{back,tap}_remove and VBD_SHOW? I did not found any refcnt help for this. but even add lock for this, I think still need the checks.> > >The checks may look like below? > > Checking dev to be non-NULL is certainly pointless in any case. > > But wait - how did you see the crash you''re trying to fix occur in the > first place? blkback_remove() calls xenvbd_sysfs_delif(), so the sysfs > entries are gone by the time driver_data gets freed.When tested continous reboot VM testcase kernel panic. then I write a program as blow: while (1) { open("/sys/devices/xen-backend/vbd-x-xxxx/statistics/rd_sect"); usleep(100); read(fd, buff, BUFF_LEN); } running it, then reboot the vm, will hit the panic. As I have menthioned, when open the file, kernel have initialized file handler, when call file operation, call the function. At here, even blkback_remove() have delete sysfs entry, but did not released sysfs->fops, that caused the panic. Thanks, Joe _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2010-Jan-19 16:20 UTC
Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.
>>> "Joe Jin" <joe.jin@oracle.com> 19.01.10 15:13 >>> >When tested continous reboot VM testcase kernel panic. then I write >a program as blow: >while (1) { > open("/sys/devices/xen-backend/vbd-x-xxxx/statistics/rd_sect"); > usleep(100); > read(fd, buff, BUFF_LEN); >} > >running it, then reboot the vm, will hit the panic. > >As I have menthioned, when open the file, kernel have initialized >file handler, when call file operation, call the function. >At here, even blkback_remove() have delete sysfs entry, but did not >released sysfs->fops, that caused the panic.That would seem like a general sysfs problem then - I can''t imagine other modules do much to prevent this, though I''m sure there are other modules which add/remove entries on the fly. Would require some looking through the sources to see whether there is any generally usable (and simple) solution to this. But irrespective of that, I think the race remains even after your patch - only the window gets smaller. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 2010-01-19 16:20, Jan Beulich wrote:> >>> "Joe Jin" <joe.jin@oracle.com> 19.01.10 15:13 >>> > >When tested continous reboot VM testcase kernel panic. then I write > >a program as blow: > >while (1) { > > open("/sys/devices/xen-backend/vbd-x-xxxx/statistics/rd_sect"); > > usleep(100); > > read(fd, buff, BUFF_LEN); > >} > > > >running it, then reboot the vm, will hit the panic. > > > >As I have menthioned, when open the file, kernel have initialized > >file handler, when call file operation, call the function. > >At here, even blkback_remove() have delete sysfs entry, but did not > >released sysfs->fops, that caused the panic. > > That would seem like a general sysfs problem then - I can''t imagine > other modules do much to prevent this, though I''m sure there are > other modules which add/remove entries on the fly. Would require > some looking through the sources to see whether there is any > generally usable (and simple) solution to this.sysfs did not provide lock to handle this, not sure if developer think it is not necessary or they''d like to caller to handled it.> > But irrespective of that, I think the race remains even after your > patch - only the window gets smaller.Add a rwlock will fix the race like below patch? diff -r 6061d5615522 drivers/xen/blkback/xenbus.c --- a/drivers/xen/blkback/xenbus.c Fri Jan 08 13:07:17 2010 +0000 +++ b/drivers/xen/blkback/xenbus.c Wed Jan 20 10:00:53 2010 +0800 @@ -26,6 +26,8 @@ #define DPRINTK(fmt, args...) \ pr_debug("blkback/xenbus (%s:%d) " fmt ".\n", \ __FUNCTION__, __LINE__, ##args) + +static rwlock_t sysfs_read_lock = RW_LOCK_UNLOCKED; struct backend_info { @@ -104,10 +106,15 @@ struct device_attribute *attr, \ char *buf) \ { \ + ssize_t ret = -ENODEV; \ struct xenbus_device *dev = to_xenbus_device(_dev); \ - struct backend_info *be = dev->dev.driver_data; \ + struct backend_info *be; \ \ - return sprintf(buf, format, ##args); \ + read_lock(&sysfs_read_lock); \ + if ((be = dev->dev.driver_data) && be->blkif) \ + ret = sprintf(buf, format, ##args); \ + read_unlock(&sysfs_read_lock); \ + return ret; \ } \ static DEVICE_ATTR(name, S_IRUGO, show_##name, NULL) @@ -173,6 +180,7 @@ DPRINTK(""); + write_lock(&sysfs_read_lock); if (be->major || be->minor) xenvbd_sysfs_delif(dev); @@ -191,6 +199,7 @@ kfree(be); dev->dev.driver_data = NULL; + write_unlock(&sysfs_read_lock); return 0; } diff -r 6061d5615522 drivers/xen/blktap/xenbus.c --- a/drivers/xen/blktap/xenbus.c Fri Jan 08 13:07:17 2010 +0000 +++ b/drivers/xen/blktap/xenbus.c Wed Jan 20 10:00:53 2010 +0800 @@ -50,6 +50,7 @@ int group_added; }; +static rwlock_t sysfs_read_lock = RW_LOCK_UNLOCKED; static void connect(struct backend_info *); static int connect_ring(struct backend_info *); @@ -122,10 +123,15 @@ struct device_attribute *attr, \ char *buf) \ { \ + ssize_t ret = -ENODEV; \ struct xenbus_device *dev = to_xenbus_device(_dev); \ - struct backend_info *be = dev->dev.driver_data; \ + struct backend_info *be; \ \ - return sprintf(buf, format, ##args); \ + read_lock(&sysfs_read_lock); \ + if (dev && (be = dev->dev.driver_data) && be->blkif) \ + ret = sprintf(buf, format, ##args); \ + read_unlock(&sysfs_read_lock); \ + return ret; \ } \ static DEVICE_ATTR(name, S_IRUGO, show_##name, NULL) @@ -170,6 +176,7 @@ { struct backend_info *be = dev->dev.driver_data; + write_lock(&sysfs_read_lock); if (be->group_added) xentap_sysfs_delif(be->dev); if (be->backend_watch.node) { @@ -187,6 +194,7 @@ } kfree(be); dev->dev.driver_data = NULL; + write_unlock(&sysfs_read_lock); return 0; } _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2010-Jan-20 07:46 UTC
Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.
>>> "Joe Jin" <joe.jin@oracle.com> 20.01.10 03:06 >>> >sysfs did not provide lock to handle this, not sure if developer >think it is not necessary or they''d like to caller to handled it.A lock is probably not the usual way to deal with this; ref-counting would seem more common. Nevertheless I think adding a lock will take care of the issue here.>--- a/drivers/xen/blktap/xenbus.c Fri Jan 08 13:07:17 2010 +0000 >+++ b/drivers/xen/blktap/xenbus.c Wed Jan 20 10:00:53 2010 +0800 >... >@@ -122,10 +123,15 @@ > struct device_attribute *attr, \ > char *buf) \ > { \ >+ ssize_t ret = -ENODEV; \ > struct xenbus_device *dev = to_xenbus_device(_dev); \The use of to_xenbus_device() here makes ...>- struct backend_info *be = dev->dev.driver_data; \ >+ struct backend_info *be; \ > \ >- return sprintf(buf, format, ##args); \ >+ read_lock(&sysfs_read_lock); \ >+ if (dev && (be = dev->dev.driver_data) && be->blkif) \... the checking of dev here useless (and the blkback part of the patch doesn''t do the same).>+ ret = sprintf(buf, format, ##args); \ >+ read_unlock(&sysfs_read_lock); \ >+ return ret; \ > } \ > static DEVICE_ATTR(name, S_IRUGO, show_##name, NULL) >And btw., in both cases with the lock added there''s no need to check both ''be'' and ''be->blkif'', since be->blkif can''t be NULL when be is non-NULL. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 2010-01-20 07:46, Jan Beulich wrote:> >>> "Joe Jin" <joe.jin@oracle.com> 20.01.10 03:06 >>> > >sysfs did not provide lock to handle this, not sure if developer > >think it is not necessary or they''d like to caller to handled it. > > A lock is probably not the usual way to deal with this; ref-counting > would seem more common. Nevertheless I think adding a lock will > take care of the issue here.Add refcnt could not fix the race for file open is sysfs file operations and will not touch xenbus/blk{back,tap} struct, so could not increase the refcnt. The root cause is after open sysfs file, vbd was been removed by other. For the lock type is read/write lock, so the lock almost will not slowdwon performance, isn''t it?> > >--- a/drivers/xen/blktap/xenbus.c Fri Jan 08 13:07:17 2010 +0000 > >+++ b/drivers/xen/blktap/xenbus.c Wed Jan 20 10:00:53 2010 +0800 > >... > >@@ -122,10 +123,15 @@ > > struct device_attribute *attr, \ > > char *buf) \ > > { \ > >+ ssize_t ret = -ENODEV; \ > > struct xenbus_device *dev = to_xenbus_device(_dev); \ > > The use of to_xenbus_device() here makes ... > > >- struct backend_info *be = dev->dev.driver_data; \ > >+ struct backend_info *be; \ > > \ > >- return sprintf(buf, format, ##args); \ > >+ read_lock(&sysfs_read_lock); \ > >+ if (dev && (be = dev->dev.driver_data) && be->blkif) \ > > ... the checking of dev here useless (and the blkback part of the patch > doesn''t do the same).Sorry I forgot get rid of it :)> > >+ ret = sprintf(buf, format, ##args); \ > >+ read_unlock(&sysfs_read_lock); \ > >+ return ret; \ > > } \ > > static DEVICE_ATTR(name, S_IRUGO, show_##name, NULL) > > > > And btw., in both cases with the lock added there''s no need to check > both ''be'' and ''be->blkif'', since be->blkif can''t be NULL when be is > non-NULL.No we must check it, as I have methioned, root cause is vbd was been freed after open sysfs file. Below is the new patch, please review. diff -r 0bec29c94ce9 drivers/xen/blkback/xenbus.c --- a/drivers/xen/blkback/xenbus.c Mon Jan 18 14:50:43 2010 +0000 +++ b/drivers/xen/blkback/xenbus.c Wed Jan 20 18:49:50 2010 +0800 @@ -26,6 +26,8 @@ #define DPRINTK(fmt, args...) \ pr_debug("blkback/xenbus (%s:%d) " fmt ".\n", \ __FUNCTION__, __LINE__, ##args) + +static rwlock_t sysfs_read_lock = RW_LOCK_UNLOCKED; struct backend_info { @@ -104,10 +106,19 @@ struct device_attribute *attr, \ char *buf) \ { \ - struct xenbus_device *dev = to_xenbus_device(_dev); \ - struct backend_info *be = dev->dev.driver_data; \ + ssize_t ret = -ENODEV; \ + struct xenbus_device *dev; \ + struct backend_info *be; \ \ - return sprintf(buf, format, ##args); \ + if (!get_device(_dev)) \ + return ret; \ + dev = to_xenbus_device(_dev); \ + read_lock(&sysfs_read_lock); \ + if ((be = dev->dev.driver_data) && be->blkif) \ + ret = sprintf(buf, format, ##args); \ + read_unlock(&sysfs_read_lock); \ + put_device(_dev); \ + return ret; \ } \ static DEVICE_ATTR(name, S_IRUGO, show_##name, NULL) @@ -173,6 +184,7 @@ DPRINTK(""); + write_lock(&sysfs_read_lock); if (be->major || be->minor) xenvbd_sysfs_delif(dev); @@ -191,6 +203,7 @@ kfree(be); dev->dev.driver_data = NULL; + write_unlock(&sysfs_read_lock); return 0; } diff -r 0bec29c94ce9 drivers/xen/blktap/xenbus.c --- a/drivers/xen/blktap/xenbus.c Mon Jan 18 14:50:43 2010 +0000 +++ b/drivers/xen/blktap/xenbus.c Wed Jan 20 18:49:50 2010 +0800 @@ -50,6 +50,7 @@ int group_added; }; +static rwlock_t sysfs_read_lock = RW_LOCK_UNLOCKED; static void connect(struct backend_info *); static int connect_ring(struct backend_info *); @@ -122,10 +123,19 @@ struct device_attribute *attr, \ char *buf) \ { \ - struct xenbus_device *dev = to_xenbus_device(_dev); \ - struct backend_info *be = dev->dev.driver_data; \ + ssize_t ret = -ENODEV; \ + struct xenbus_device *dev; \ + struct backend_info *be; \ \ - return sprintf(buf, format, ##args); \ + if (!get_device(_dev)) \ + return ret; \ + dev = to_xenbus_device(_dev); \ + read_lock(&sysfs_read_lock); \ + if ((be = dev->dev.driver_data) && be->blkif) \ + ret = sprintf(buf, format, ##args); \ + read_unlock(&sysfs_read_lock); \ + put_device(_dev); \ + return ret; \ } \ static DEVICE_ATTR(name, S_IRUGO, show_##name, NULL) @@ -170,6 +180,7 @@ { struct backend_info *be = dev->dev.driver_data; + write_lock(&sysfs_read_lock); if (be->group_added) xentap_sysfs_delif(be->dev); if (be->backend_watch.node) { @@ -187,6 +198,7 @@ } kfree(be); dev->dev.driver_data = NULL; + write_unlock(&sysfs_read_lock); return 0; } _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2010-Jan-20 11:06 UTC
Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.
>>> "Joe Jin" <joe.jin@oracle.com> 20.01.10 11:51 >>> >On 2010-01-20 07:46, Jan Beulich wrote: >> And btw., in both cases with the lock added there''s no need to check >> both ''be'' and ''be->blkif'', since be->blkif can''t be NULL when be is >> non-NULL. > >No we must check it, as I have methioned, root cause is vbd was been >freed after open sysfs file.Why? You''re inside the locked region, so either ''be'' is NULL, or both ''be'' and ''be->blkif'' are non-NULL. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 2010-01-20 11:06, Jan Beulich wrote:> >>> "Joe Jin" <joe.jin@oracle.com> 20.01.10 11:51 >>> > >On 2010-01-20 07:46, Jan Beulich wrote: > >> And btw., in both cases with the lock added there''s no need to check > >> both ''be'' and ''be->blkif'', since be->blkif can''t be NULL when be is > >> non-NULL. > > > >No we must check it, as I have methioned, root cause is vbd was been > >freed after open sysfs file. > > Why? You''re inside the locked region, so either ''be'' is NULL, or both > ''be'' and ''be->blkif'' are non-NULL. >Sorry for misunstdrstood, your are right, attach new patch. diff -r 0bec29c94ce9 drivers/xen/blkback/xenbus.c --- a/drivers/xen/blkback/xenbus.c Mon Jan 18 14:50:43 2010 +0000 +++ b/drivers/xen/blkback/xenbus.c Wed Jan 20 19:44:32 2010 +0800 @@ -26,6 +26,8 @@ #define DPRINTK(fmt, args...) \ pr_debug("blkback/xenbus (%s:%d) " fmt ".\n", \ __FUNCTION__, __LINE__, ##args) + +static rwlock_t sysfs_read_lock = RW_LOCK_UNLOCKED; struct backend_info { @@ -104,10 +106,19 @@ struct device_attribute *attr, \ char *buf) \ { \ - struct xenbus_device *dev = to_xenbus_device(_dev); \ - struct backend_info *be = dev->dev.driver_data; \ + ssize_t ret = -ENODEV; \ + struct xenbus_device *dev; \ + struct backend_info *be; \ \ - return sprintf(buf, format, ##args); \ + if (!get_device(_dev)) \ + return ret; \ + dev = to_xenbus_device(_dev); \ + read_lock(&sysfs_read_lock); \ + if ((be = dev->dev.driver_data) != NULL) \ + ret = sprintf(buf, format, ##args); \ + read_unlock(&sysfs_read_lock); \ + put_device(_dev); \ + return ret; \ } \ static DEVICE_ATTR(name, S_IRUGO, show_##name, NULL) @@ -173,6 +184,7 @@ DPRINTK(""); + write_lock(&sysfs_read_lock); if (be->major || be->minor) xenvbd_sysfs_delif(dev); @@ -191,6 +203,7 @@ kfree(be); dev->dev.driver_data = NULL; + write_unlock(&sysfs_read_lock); return 0; } diff -r 0bec29c94ce9 drivers/xen/blktap/xenbus.c --- a/drivers/xen/blktap/xenbus.c Mon Jan 18 14:50:43 2010 +0000 +++ b/drivers/xen/blktap/xenbus.c Wed Jan 20 19:44:32 2010 +0800 @@ -50,6 +50,7 @@ int group_added; }; +static rwlock_t sysfs_read_lock = RW_LOCK_UNLOCKED; static void connect(struct backend_info *); static int connect_ring(struct backend_info *); @@ -122,10 +123,19 @@ struct device_attribute *attr, \ char *buf) \ { \ - struct xenbus_device *dev = to_xenbus_device(_dev); \ - struct backend_info *be = dev->dev.driver_data; \ + ssize_t ret = -ENODEV; \ + struct xenbus_device *dev; \ + struct backend_info *be; \ \ - return sprintf(buf, format, ##args); \ + if (!get_device(_dev)) \ + return ret; \ + dev = to_xenbus_device(_dev); \ + read_lock(&sysfs_read_lock); \ + if ((be = dev->dev.driver_data) != NULL) \ + ret = sprintf(buf, format, ##args); \ + read_unlock(&sysfs_read_lock); \ + put_device(_dev); \ + return ret; \ } \ static DEVICE_ATTR(name, S_IRUGO, show_##name, NULL) @@ -170,6 +180,7 @@ { struct backend_info *be = dev->dev.driver_data; + write_lock(&sysfs_read_lock); if (be->group_added) xentap_sysfs_delif(be->dev); if (be->backend_watch.node) { @@ -187,6 +198,7 @@ } kfree(be); dev->dev.driver_data = NULL; + write_unlock(&sysfs_read_lock); return 0; } _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2010-Jan-20 20:25 UTC
Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.
On 20/01/2010 11:45, "Joe Jin" <joe.jin@oracle.com> wrote:>> Why? You''re inside the locked region, so either ''be'' is NULL, or both >> ''be'' and ''be->blkif'' are non-NULL. >> > > Sorry for misunstdrstood, your are right, attach new patch.I''m waiting for final resubmission with changeset comment and sign-off, and for it to collect an ack from Jan, by the way. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel Stodden
2010-Jan-21 02:16 UTC
Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.
Hey. I think reverted a similar patch when moving past 2.6.18 a while ago, but I don''t remember the first kernel versions affected right now. On Wed, 2010-01-20 at 06:45 -0500, Joe Jin wrote:> static void connect(struct backend_info *); > static int connect_ring(struct backend_info *); > @@ -122,10 +123,19 @@ > struct device_attribute *attr, \ > char *buf) \ > { \ > - struct xenbus_device *dev = to_xenbus_device(_dev); \ > - struct backend_info *be = dev->dev.driver_data; \ > + ssize_t ret = -ENODEV; \ > + struct xenbus_device *dev; \ > + struct backend_info *be; \\> - return sprintf(buf, format, ##args); \ > + if (!get_device(_dev)) \ > + return ret; \ > + dev = to_xenbus_device(_dev); \I think this deadlocks here:> + read_lock(&sysfs_read_lock); \ > + if ((be = dev->dev.driver_data) != NULL) \ > + ret = sprintf(buf, format, ##args); \ > + read_unlock(&sysfs_read_lock); \ > + put_device(_dev); \ > + return ret; \ > } \ > static DEVICE_ATTR(name, S_IRUGO, show_##name, NULL) > > @@ -170,6 +180,7 @@ > { > struct backend_info *be = dev->dev.driver_data; > > + write_lock(&sysfs_read_lock); > if (be->group_added).. and here:> xentap_sysfs_delif(be->dev); > if (be->backend_watch.node) { > @@ -187,6 +198,7 @@ > } > kfree(be); > dev->dev.driver_data = NULL; > + write_unlock(&sysfs_read_lock); > return 0; > }Sysfs may refcount these nodes, blocking xentap_sysfs_delif(). Daniel _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 2010-01-20 18:16, Daniel Stodden wrote:> > Hey. > > I think reverted a similar patch when moving past 2.6.18 a while ago, > but I don''t remember the first kernel versions affected right now. > > On Wed, 2010-01-20 at 06:45 -0500, Joe Jin wrote: > > > static void connect(struct backend_info *); > > static int connect_ring(struct backend_info *); > > @@ -122,10 +123,19 @@ > > struct device_attribute *attr, \ > > char *buf) \ > > { \ > > - struct xenbus_device *dev = to_xenbus_device(_dev); \ > > - struct backend_info *be = dev->dev.driver_data; \ > > + ssize_t ret = -ENODEV; \ > > + struct xenbus_device *dev; \ > > + struct backend_info *be; \ > > > \ > > - return sprintf(buf, format, ##args); \ > > + if (!get_device(_dev)) \ > > + return ret; \ > > + dev = to_xenbus_device(_dev); \ > > I think this deadlocks here: > > + read_lock(&sysfs_read_lock); \ > > + if ((be = dev->dev.driver_data) != NULL) \ > > + ret = sprintf(buf, format, ##args); \ > > + read_unlock(&sysfs_read_lock); \ > > + put_device(_dev); \ > > + return ret; \ > > } \ > > static DEVICE_ATTR(name, S_IRUGO, show_##name, NULL) > > > > @@ -170,6 +180,7 @@ > > { > > struct backend_info *be = dev->dev.driver_data; > > > > + write_lock(&sysfs_read_lock); > > if (be->group_added) > > .. and here: > > xentap_sysfs_delif(be->dev); > > if (be->backend_watch.node) { > > @@ -187,6 +198,7 @@ > > } > > kfree(be); > > dev->dev.driver_data = NULL; > > + write_unlock(&sysfs_read_lock); > > return 0; > > } > > Sysfs may refcount these nodes, blocking xentap_sysfs_delif().Would you like to give me some info about of blocking and deadlocks? Checked xentap_sysfs_delif() -> sysfs_remove_group() and I did not found blocking codes, maybe I lost something? Thanks, Joe _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel Stodden
2010-Jan-21 07:26 UTC
Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.
On Wed, 2010-01-20 at 22:13 -0500, Joe Jin wrote:> On 2010-01-20 18:16, Daniel Stodden wrote: > > > > Hey. > > > > I think reverted a similar patch when moving past 2.6.18 a while ago, > > but I don''t remember the first kernel versions affected right now. > >> > Sysfs may refcount these nodes, blocking xentap_sysfs_delif(). > > Would you like to give me some info about of blocking and deadlocks? > > Checked xentap_sysfs_delif() -> sysfs_remove_group() and I did not found > blocking codes, maybe I lost something? > > Thanks, > JoeHi Joe. Your patch will work okay on 2.6.18. But collisions will deadlock after 2.6.23 Found an old stack trace: [2009-07-08 06:15:08 UTC] INFO: task xb.00021.xvdd:30039 blocked for more than 120 seconds. [2009-07-08 06:15:08 UTC] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [2009-07-08 06:15:08 UTC] c7adfe0c 00000246 00000000 00000000 5b88fd4f 00000256 c7ade000 c7adfdc8 [2009-07-08 06:15:08 UTC] c0107c1b 38c984a4 00000256 c7adfddc ed629578 ed6293f0 ed629578 c16bdb00 [2009-07-08 06:15:08 UTC] 00000000 eea0d500 c16bdb34 002dc05a 00000000 00000005 0024c31c e20a8ff0 [2009-07-08 06:15:08 UTC] Call Trace: [2009-07-08 06:15:08 UTC] [<c0107c1b>] ? local_clock+0x3b/0x90 [2009-07-08 06:15:08 UTC] [<c0344675>] schedule_timeout+0x75/0xc0 [2009-07-08 06:15:08 UTC] [<c011ed81>] ? pick_next_task_fair+0x91/0xd0 [2009-07-08 06:15:08 UTC] [<c03442c9>] wait_for_common+0xa9/0x1c0 [2009-07-08 06:15:08 UTC] [<c0120d40>] ? default_wake_function+0x0/0x10 [2009-07-08 06:15:08 UTC] [<c0344472>] wait_for_completion+0x12/0x20 [2009-07-08 06:15:08 UTC] [<c01cf9e7>] sysfs_addrm_finish+0x1e7/0x230 [2009-07-08 06:15:08 UTC] [<c01ce2e5>] sysfs_hash_and_remove+0x45/0x70 [2009-07-08 06:15:08 UTC] [<c01d0adb>] remove_files+0x1b/0x30 [2009-07-08 06:15:08 UTC] [<c01d0b26>] sysfs_remove_group+0x36/0xc0 [2009-07-08 06:15:08 UTC] [<c01ae02f>] ? __blkdev_put+0x14f/0x160 [2009-07-08 06:15:08 UTC] [<c02769fc>] xenvbd_sysfs_delif+0x2c/0x60 [2009-07-08 06:15:08 UTC] [<c0276a76>] blkback_close+0x46/0x70 [2009-07-08 06:15:08 UTC] [<c0275d33>] blkif_schedule+0x583/0x5b0 [2009-07-08 06:15:08 UTC] [<c011ed81>] ? pick_next_task_fair+0x91/0xd0 [2009-07-08 06:15:08 UTC] [<c013dda0>] ? autoremove_wake_function+0x0/0x50 [2009-07-08 06:15:08 UTC] [<c02757b0>] ? blkif_schedule+0x0/0x5b0 [2009-07-08 06:15:08 UTC] [<c013da42>] kthread+0x42/0x70 [2009-07-08 06:15:08 UTC] [<c013da00>] ? kthread+0x0/0x70 [2009-07-08 06:15:08 UTC] [<c010561b>] kernel_thread_helper+0x7/0x10 The reason is in sysfs_deactivate(), which will sync callers against any remaining thread in .show() - show() hangs on the lock - the lock holder in sysfs_remove_group(), waiting for show() to complete. Pardon me -- I''m not entirely sure where/how these patches are currently submitted and merged. I suppose yours are only for linux-2.6.18.hg, not e.g. pvops? Then sorry for any confusion. Daniel _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
> Your patch will work okay on 2.6.18. > > But collisions will deadlock after 2.6.23 > > Found an old stack trace: > > [2009-07-08 06:15:08 UTC] INFO: task xb.00021.xvdd:30039 blocked for more than 120 seconds. > [2009-07-08 06:15:08 UTC] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > [2009-07-08 06:15:08 UTC] c7adfe0c 00000246 00000000 00000000 5b88fd4f 00000256 c7ade000 c7adfdc8 > [2009-07-08 06:15:08 UTC] c0107c1b 38c984a4 00000256 c7adfddc ed629578 ed6293f0 ed629578 c16bdb00 > [2009-07-08 06:15:08 UTC] 00000000 eea0d500 c16bdb34 002dc05a 00000000 00000005 0024c31c e20a8ff0 > [2009-07-08 06:15:08 UTC] Call Trace: > [2009-07-08 06:15:08 UTC] [<c0107c1b>] ? local_clock+0x3b/0x90 > [2009-07-08 06:15:08 UTC] [<c0344675>] schedule_timeout+0x75/0xc0 > [2009-07-08 06:15:08 UTC] [<c011ed81>] ? pick_next_task_fair+0x91/0xd0 > [2009-07-08 06:15:08 UTC] [<c03442c9>] wait_for_common+0xa9/0x1c0 > [2009-07-08 06:15:08 UTC] [<c0120d40>] ? default_wake_function+0x0/0x10 > [2009-07-08 06:15:08 UTC] [<c0344472>] wait_for_completion+0x12/0x20 > [2009-07-08 06:15:08 UTC] [<c01cf9e7>] sysfs_addrm_finish+0x1e7/0x230 > [2009-07-08 06:15:08 UTC] [<c01ce2e5>] sysfs_hash_and_remove+0x45/0x70 > [2009-07-08 06:15:08 UTC] [<c01d0adb>] remove_files+0x1b/0x30 > [2009-07-08 06:15:08 UTC] [<c01d0b26>] sysfs_remove_group+0x36/0xc0 > [2009-07-08 06:15:08 UTC] [<c01ae02f>] ? __blkdev_put+0x14f/0x160 > [2009-07-08 06:15:08 UTC] [<c02769fc>] xenvbd_sysfs_delif+0x2c/0x60 > [2009-07-08 06:15:08 UTC] [<c0276a76>] blkback_close+0x46/0x70 > [2009-07-08 06:15:08 UTC] [<c0275d33>] blkif_schedule+0x583/0x5b0 > [2009-07-08 06:15:08 UTC] [<c011ed81>] ? pick_next_task_fair+0x91/0xd0 > [2009-07-08 06:15:08 UTC] [<c013dda0>] ? autoremove_wake_function+0x0/0x50 > [2009-07-08 06:15:08 UTC] [<c02757b0>] ? blkif_schedule+0x0/0x5b0 > [2009-07-08 06:15:08 UTC] [<c013da42>] kthread+0x42/0x70 > [2009-07-08 06:15:08 UTC] [<c013da00>] ? kthread+0x0/0x70 > [2009-07-08 06:15:08 UTC] [<c010561b>] kernel_thread_helper+0x7/0x10 > > The reason is in sysfs_deactivate(), which will sync callers against any > remaining thread in .show() > - show() hangs on the lock > - the lock holder in sysfs_remove_group(), > waiting for show() to complete. > > Pardon me -- I''m not entirely sure where/how these patches are currently > submitted and merged. I suppose yours are only for linux-2.6.18.hg, not > e.g. pvops? Then sorry for any confusion. >Daniel, Thanks a lot of your comments, it really help for me, yes my patch based linux-2.6.18.hg branch. As Jan have pointed out in previous email, it should be sysfs''s issue, looked like later kernel sysfs have fixed the issue? Thanks, Joe _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel Stodden
2010-Jan-21 18:01 UTC
Re: [Xen-devel] [PATCH] Fix blkback/blktap sysfs read bug.
On Thu, 2010-01-21 at 02:49 -0500, Joe Jin wrote:> > The reason is in sysfs_deactivate(), which will sync callers against any > > remaining thread in .show() > > - show() hangs on the lock > > - the lock holder in sysfs_remove_group(), > > waiting for show() to complete. > > > > Pardon me -- I''m not entirely sure where/how these patches are currently > > submitted and merged. I suppose yours are only for linux-2.6.18.hg, not > > e.g. pvops? Then sorry for any confusion. > > > > Daniel, > > Thanks a lot of your comments, it really help for me, yes my patch based > linux-2.6.18.hg branch. As Jan have pointed out in previous email, it should > be sysfs''s issue, looked like later kernel sysfs have fixed the issue?Yes. Later sysfs won''t need the locking. Ordering sysfs removal before backend release will be sufficient. Daniel _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel