Hi all,
this small patch fixes gntdev on Linux pvops kernels: gnttab_set_map_op
and gnttab_set_unmap_op shouldn't take unsigned long as parameters for
machine addresses because they are not big enough on PAE systems.
This patch fixes the issue using phys_addr_t instead and enables
XEN_GNTDEV compilation again.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---

diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
index f6828c5..a2833bb 100644
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -170,7 +170,7 @@ config XEN_S3
 
 config XEN_GNTDEV
 	tristate "userspace grant access device driver"
-	depends on XEN && BROKEN
+	depends on XEN
 	select MMU_NOTIFIER
 	help
 	  Allows userspace processes use grants.
diff --git a/include/xen/grant_table.h b/include/xen/grant_table.h
index ef07e91..8552d7e 100644
--- a/include/xen/grant_table.h
+++ b/include/xen/grant_table.h
@@ -146,7 +146,7 @@ void gnttab_grant_foreign_transfer_ref(grant_ref_t, domid_t domid,
 				       unsigned long pfn);
 
 static inline void
-gnttab_set_map_op(struct gnttab_map_grant_ref *map, unsigned long addr,
+gnttab_set_map_op(struct gnttab_map_grant_ref *map, phys_addr_t addr,
 		  uint32_t flags, grant_ref_t ref, domid_t domid)
 {
 	if (flags & GNTMAP_contains_pte)
@@ -162,7 +162,7 @@ gnttab_set_map_op(struct gnttab_map_grant_ref *map, unsigned long addr,
 }
 
 static inline void
-gnttab_set_unmap_op(struct gnttab_unmap_grant_ref *unmap, unsigned long addr,
+gnttab_set_unmap_op(struct gnttab_unmap_grant_ref *unmap, phys_addr_t addr,
 		    uint32_t flags, grant_handle_t handle)
 {
 	if (flags & GNTMAP_contains_pte)

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Stefano Stabellini
2010-Feb-01 15:46 UTC
Re: [Xen-devel] [PATCH] [PVOPS] fix gntdev on PAE
On Mon, 1 Feb 2010, Stefano Stabellini wrote:
> Hi all,
> this small patch fixes gntdev on Linux pvops kernels:
> gnttab_set_map_op and gnttab_set_unmap_op shouldn't take unsigned long
> as parameters for machine addresses because they are not big enough on
> PAE systems.
> This patch fixes the issue using phys_addr_t instead and enables
> XEN_GNTDEV compilation again.
>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

BTW gntdev is used by qemu to provide the console backend to pv guests.
Jeremy Fitzhardinge
2010-Feb-09 21:57 UTC
Re: [Xen-devel] [PATCH] [PVOPS] fix gntdev on PAE
On 02/01/2010 07:46 AM, Stefano Stabellini wrote:
> On Mon, 1 Feb 2010, Stefano Stabellini wrote:
>> [...]
>
> BTW gntdev is used by qemu to provide the console backend to pv guests.

Is that recent? Console had been working before, hadn't it?

The gntdev problems I saw were more locking related than anything to do
with PAE. Did you try testing with lock debugging enabled?

Thanks,
    J
Jeremy Fitzhardinge
2010-Feb-09 22:24 UTC
Re: [Xen-devel] [PATCH] [PVOPS] fix gntdev on PAE
On 02/01/2010 07:46 AM, Stefano Stabellini wrote:
> On Mon, 1 Feb 2010, Stefano Stabellini wrote:
>> [...]
>
> BTW gntdev is used by qemu to provide the console backend to pv guests.

I applied the grant_table.h parts of the patch since it's clearly a fix,
but I think we need to confirm that gntdev is really OK.

    J
Stefano Stabellini
2010-Feb-10 12:19 UTC
Re: [Xen-devel] [PATCH] [PVOPS] fix gntdev on PAE
On Tue, 9 Feb 2010, Jeremy Fitzhardinge wrote:
> [...]
> Is that recent? Console had been working before, hadn't it?
>
> The gntdev problems I saw were more locking related than anything to do
> with PAE. Did you try testing with lock debugging enabled?

Yes, I don't have any problem with locking in gntdev on my testbox.
Jeremy Fitzhardinge
2010-Feb-10 23:12 UTC
Re: [Xen-devel] [PATCH] [PVOPS] fix gntdev on PAE
On 02/10/2010 04:19 AM, Stefano Stabellini wrote:
> On Tue, 9 Feb 2010, Jeremy Fitzhardinge wrote:
>> [...]
>> The gntdev problems I saw were more locking related than anything to do
>> with PAE. Did you try testing with lock debugging enabled?
>
> Yes, I don't have any problem with locking in gntdev on my testbox.

OK, I'll give it another go.

    J
Jeremy Fitzhardinge
2010-May-28 17:29 UTC
Re: [Xen-devel] [PATCH] [PVOPS] fix gntdev on PAE
On 02/10/2010 04:19 AM, Stefano Stabellini wrote:
> On Tue, 9 Feb 2010, Jeremy Fitzhardinge wrote:
>> [...]
>> The gntdev problems I saw were more locking related than anything to do
>> with PAE. Did you try testing with lock debugging enabled?
>
> Yes, I don't have any problem with locking in gntdev on my testbox.

I managed to catch a lockdep problem in gntdev, which may be the same as
before:

BUG: sleeping function called from invalid context at kernel/rwsem.c:21
in_atomic(): 1, irqs_disabled(): 0, pid: 4091, name: qemu-dm
2 locks held by qemu-dm/4091:
 #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff810bb50f>] sys_munmap+0x33/0x58
 #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff810cd63a>] __mmu_notifier_invalidate_range_start+0x0/0xc7
Pid: 4091, comm: qemu-dm Not tainted 2.6.32.13 #23
Call Trace:
 [<ffffffff8106705b>] ? __debug_show_held_locks+0x22/0x24
 [<ffffffff81039522>] __might_sleep+0x123/0x127
 [<ffffffff810a8536>] ? release_pages+0xd2/0x1e7
 [<ffffffff81498849>] down_read+0x1f/0x57
 [<ffffffff81010142>] ? check_events+0x12/0x20
 [<ffffffff810a8536>] ? release_pages+0xd2/0x1e7
 [<ffffffff810cd63a>] ? __mmu_notifier_invalidate_range_start+0x0/0xc7
 [<ffffffff8123e069>] mn_invl_range_start+0x32/0x118
 [<ffffffff810cd69c>] __mmu_notifier_invalidate_range_start+0x62/0xc7
 [<ffffffff810cd63a>] ? __mmu_notifier_invalidate_range_start+0x0/0xc7
 [<ffffffff810b54bc>] unmap_vmas+0x8c/0x91a
 [<ffffffff810ba363>] unmap_region+0xda/0x178
 [<ffffffff810bb472>] do_munmap+0x2ae/0x318
 [<ffffffff810bb51d>] sys_munmap+0x41/0x58
 [<ffffffff81013b82>] system_call_fastpath+0x16/0x1b

The problem is that mn_invl_range_start does a down_read(), but it is
called from __mmu_notifier_invalidate_range_start(), which does an
rcu_read_lock, which has the side-effect of disabling preemption.

The mmu notifier code seems to have always used rcu_read_lock this way,
so I guess this bug has always been there. It's not immediately obvious
how to fix it.

Thoughts?

    J
Stefano Stabellini
2010-Jun-01 09:38 UTC
Re: [Xen-devel] [PATCH] [PVOPS] fix gntdev on PAE
On Fri, 28 May 2010, Jeremy Fitzhardinge wrote:
> I managed to catch a lockdep problem in gntdev, which may be the same as
> before:
>
> BUG: sleeping function called from invalid context at kernel/rwsem.c:21
> [...]
>
> The problem is that mn_invl_range_start does a down_read(), but it is
> called from __mmu_notifier_invalidate_range_start(), which does an
> rcu_read_lock, which has the side-effect of disabling preemption.
>
> The mmu notifier code seems to have always used rcu_read_lock this way,
> so I guess this bug has always been there. It's not immediately obvious
> how to fix it.
>
> Thoughts?

What about turning the semaphore into a rwlock?
Performances shouldn't matter in this case.

Something like this:

---

diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index ddc59cc..91846b9 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -28,7 +28,7 @@
 #include <linux/types.h>
 #include <linux/uaccess.h>
 #include <linux/sched.h>
-#include <linux/rwsem.h>
+#include <linux/spinlock_types.h>
 
 #include <xen/xen.h>
 #include <xen/grant_table.h>
@@ -51,7 +51,7 @@ struct gntdev_priv {
 	struct list_head maps;
 	uint32_t used;
 	uint32_t limit;
-	struct rw_semaphore sem;
+	rwlock_t rwlock;
 	struct mm_struct *mm;
 	struct mmu_notifier mn;
 };
@@ -277,7 +277,7 @@ static void mn_invl_range_start(struct mmu_notifier *mn,
 	unsigned long mstart, mend;
 	int err;
 
-	down_read(&priv->sem);
+	read_lock(&priv->rwlock);
 	list_for_each_entry(map, &priv->maps, next) {
 		if (!map->vma)
 			continue;
@@ -299,7 +299,7 @@ static void mn_invl_range_start(struct mmu_notifier *mn,
 				    (mend - mstart) >> PAGE_SHIFT);
 		WARN_ON(err);
 	}
-	up_read(&priv->sem);
+	read_unlock(&priv->rwlock);
 }
 
 static void mn_invl_page(struct mmu_notifier *mn,
@@ -316,7 +316,7 @@ static void mn_release(struct mmu_notifier *mn,
 	struct grant_map *map;
 	int err;
 
-	down_read(&priv->sem);
+	read_lock(&priv->rwlock);
 	list_for_each_entry(map, &priv->maps, next) {
 		if (!map->vma)
 			continue;
@@ -327,7 +327,7 @@ static void mn_release(struct mmu_notifier *mn,
 		err = unmap_grant_pages(map, 0, map->count);
 		WARN_ON(err);
 	}
-	up_read(&priv->sem);
+	read_unlock(&priv->rwlock);
 }
 
 struct mmu_notifier_ops gntdev_mmu_ops = {
@@ -347,7 +347,7 @@ static int gntdev_open(struct inode *inode, struct file *flip)
 		return -ENOMEM;
 
 	INIT_LIST_HEAD(&priv->maps);
-	init_rwsem(&priv->sem);
+	priv->rwlock = RW_LOCK_UNLOCKED;
 	priv->limit = limit;
 
 	priv->mm = get_task_mm(current);
@@ -375,13 +375,13 @@ static int gntdev_release(struct inode *inode, struct file *flip)
 	if (debug)
 		printk("%s: priv %p\n", __FUNCTION__, priv);
 
-	down_write(&priv->sem);
+	write_lock(&priv->rwlock);
 	while (!list_empty(&priv->maps)) {
 		map = list_entry(priv->maps.next, struct grant_map, next);
 		err = gntdev_del_map(map);
 		WARN_ON(err);
 	}
-	up_write(&priv->sem);
+	write_unlock(&priv->rwlock);
 	mmu_notifier_unregister(&priv->mn, priv->mm);
 	kfree(priv);
 	return 0;
@@ -404,7 +404,7 @@ static long gntdev_ioctl_map_grant_ref(struct gntdev_priv *priv,
 	if (unlikely(op.count > priv->limit))
 		return -EINVAL;
 
-	down_write(&priv->sem);
+	write_lock(&priv->rwlock);
 	err = -ENOMEM;
 	map = gntdev_add_map(priv, op.count);
 	if (!map)
@@ -417,13 +417,13 @@ static long gntdev_ioctl_map_grant_ref(struct gntdev_priv *priv,
 	op.index = map->index << PAGE_SHIFT;
 	if (copy_to_user(u, &op, sizeof(op)) != 0)
 		goto err_free;
-	up_write(&priv->sem);
+	write_unlock(&priv->rwlock);
 	return 0;
 
 err_free:
 	gntdev_del_map(map);
 err_unlock:
-	up_write(&priv->sem);
+	write_unlock(&priv->rwlock);
 	return err;
 }
 
@@ -440,11 +440,11 @@ static long gntdev_ioctl_unmap_grant_ref(struct gntdev_priv *priv,
 		printk("%s: priv %p, del %d+%d\n", __FUNCTION__, priv,
 		       (int)op.index, (int)op.count);
 
-	down_write(&priv->sem);
+	write_lock(&priv->rwlock);
 	map = gntdev_find_map_index(priv, op.index >> PAGE_SHIFT, op.count);
 	if (map)
 		err = gntdev_del_map(map);
-	up_write(&priv->sem);
+	write_unlock(&priv->rwlock);
 	return err;
 }
 
@@ -460,16 +460,16 @@ static long gntdev_ioctl_get_offset_for_vaddr(struct gntdev_priv *priv,
 		printk("%s: priv %p, offset for vaddr %lx\n", __FUNCTION__, priv,
 		       (unsigned long)op.vaddr);
 
-	down_read(&priv->sem);
+	read_lock(&priv->rwlock);
 	map = gntdev_find_map_vaddr(priv, op.vaddr);
 	if (map == NULL ||
 	    map->vma->vm_start != op.vaddr) {
-		up_read(&priv->sem);
+		read_unlock(&priv->rwlock);
 		return -EINVAL;
 	}
 	op.offset = map->index << PAGE_SHIFT;
 	op.count = map->count;
-	up_read(&priv->sem);
+	read_unlock(&priv->rwlock);
 
 	if (copy_to_user(u, &op, sizeof(op)) != 0)
 		return -EFAULT;
@@ -488,9 +488,9 @@ static long gntdev_ioctl_set_max_grants(struct gntdev_priv *priv,
 	if (op.count > limit)
 		return -EINVAL;
 
-	down_write(&priv->sem);
+	write_lock(&priv->rwlock);
 	priv->limit = op.count;
-	up_write(&priv->sem);
+	write_unlock(&priv->rwlock);
 	return 0;
 }
 
@@ -538,7 +538,7 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
 		printk("%s: map %d+%d at %lx (pgoff %lx)\n", __FUNCTION__,
 		       index, count, vma->vm_start, vma->vm_pgoff);
 
-	down_read(&priv->sem);
+	read_lock(&priv->rwlock);
 	map = gntdev_find_map_index(priv, index, count);
 	if (!map)
 		goto unlock_out;
@@ -580,7 +580,7 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
 	map->is_mapped = 1;
 
 unlock_out:
-	up_read(&priv->sem);
+	read_unlock(&priv->rwlock);
 	return err;
 }
Jeremy Fitzhardinge
2010-Jun-01 16:31 UTC
Re: [Xen-devel] [PATCH] [PVOPS] fix gntdev on PAE
On 06/01/2010 02:38 AM, Stefano Stabellini wrote:
> On Fri, 28 May 2010, Jeremy Fitzhardinge wrote:
>> [...]
>> The problem is that mn_invl_range_start does a down_read(), but it is
>> called from __mmu_notifier_invalidate_range_start(), which does an
>> rcu_read_lock, which has the side-effect of disabling preemption.
>> [...]
>
> What about turning the semaphore into a rwlock?
> Performances shouldn't matter in this case.
> Something like this:
> [...]

The problem is that the rcu lock disables preemption, so anything inside
it must be non-scheduling. So it would need to be a spinlock type
thing, I think.

    J
Stefano Stabellini
2010-Jun-01 16:36 UTC
Re: [Xen-devel] [PATCH] [PVOPS] fix gntdev on PAE
On Tue, 1 Jun 2010, Jeremy Fitzhardinge wrote:
> On 06/01/2010 02:38 AM, Stefano Stabellini wrote:
>> [...]
>> What about turning the semaphore into a rwlock?
>> Performances shouldn't matter in this case.
>> Something like this:
>
> The problem is that the rcu lock disables preemption, so anything inside
> it must be non-scheduling. So it would need to be a spinlock type
> thing, I think.

Right, in fact an rwlock is a rw spinlock, if I am not mistaken.
Jeremy Fitzhardinge
2010-Jun-01 16:46 UTC
Re: [Xen-devel] [PATCH] [PVOPS] fix gntdev on PAE
On 06/01/2010 09:36 AM, Stefano Stabellini wrote:
>>> Performances shouldn't matter in this case.
>>> Something like this:
>>
>> The problem is that the rcu lock disables preemption, so anything inside
>> it must be non-scheduling. So it would need to be a spinlock type
>> thing, I think.
>
> right, in fact rwlock is a rw spinlock if I am not mistaken

Ah, yes. The problem with just making it a spinlock is that other parts
of the code do copy_from_user while holding it, so they would need to be
restructured to avoid that.

Also, rwlock is (almost?) never useful compared to a plain spinlock.

    J
Stefano Stabellini
2010-Jun-02 14:11 UTC
Re: [Xen-devel] [PATCH] [PVOPS] fix gntdev on PAE
On Tue, 1 Jun 2010, Jeremy Fitzhardinge wrote:> On 06/01/2010 09:36 AM, Stefano Stabellini wrote: > >>> Performances shouldn''t matter in this case. > >>> Something like this: > >>> > >>> > >> The problem is that the rcu lock disables preemption, so anything inside > >> it must be non-scheduling. So it would need to be a spinlock type > >> thing, I think. > >> > > right, in fact rwlock is a rw spinlock if I am not mistaken > > > > Ah, yes. The problem with just making it a spinlock is that other parts > of the code do copy_from_user while holding it, so they would need to be > restructured to avoid that. >Yes, you are right. I fixed the patch to do that, however I couldn''t actually reproduce the bug you are seeing so I couldn''t test the error path at all... --- diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c index ddc59cc..0e571bd 100644 --- a/drivers/xen/gntdev.c +++ b/drivers/xen/gntdev.c @@ -28,7 +28,7 @@ #include <linux/types.h> #include <linux/uaccess.h> #include <linux/sched.h> -#include <linux/rwsem.h> +#include <linux/spinlock_types.h> #include <xen/xen.h> #include <xen/grant_table.h> @@ -51,7 +51,7 @@ struct gntdev_priv { struct list_head maps; uint32_t used; uint32_t limit; - struct rw_semaphore sem; + rwlock_t rwlock; struct mm_struct *mm; struct mmu_notifier mn; }; @@ -84,9 +84,9 @@ static void gntdev_print_maps(struct gntdev_priv *priv, map->index == text_index && text ? 
text : ""); } -static struct grant_map *gntdev_add_map(struct gntdev_priv *priv, int count) +static struct grant_map *gntdev_alloc_map(struct gntdev_priv *priv, int count) { - struct grant_map *map, *add; + struct grant_map *add; add = kzalloc(sizeof(struct grant_map), GFP_KERNEL); if (NULL == add) @@ -104,8 +104,20 @@ static struct grant_map *gntdev_add_map(struct gntdev_priv *priv, int count) add->count = count; add->priv = priv; - if (add->count + priv->used > priv->limit) - goto err; + if (add->count + priv->used <= priv->limit) + return add; + +err: + kfree(add->grants); + kfree(add->map_ops); + kfree(add->unmap_ops); + kfree(add); + return NULL; +} + +static void gntdev_add_map(struct gntdev_priv *priv, struct grant_map *add) +{ + struct grant_map *map; list_for_each_entry(map, &priv->maps, next) { if (add->index + add->count < map->index) { @@ -120,14 +132,6 @@ done: priv->used += add->count; if (debug) gntdev_print_maps(priv, "[new]", add->index); - return add; - -err: - kfree(add->grants); - kfree(add->map_ops); - kfree(add->unmap_ops); - kfree(add); - return NULL; } static struct grant_map *gntdev_find_map_index(struct gntdev_priv *priv, int index, @@ -174,11 +178,17 @@ static int gntdev_del_map(struct grant_map *map) map->priv->used -= map->count; list_del(&map->next); + return 0; +} + +static void gntdev_free_map(struct grant_map *map) +{ + if (!map) + return; kfree(map->grants); kfree(map->map_ops); kfree(map->unmap_ops); kfree(map); - return 0; } /* ------------------------------------------------------------------ */ @@ -277,7 +287,7 @@ static void mn_invl_range_start(struct mmu_notifier *mn, unsigned long mstart, mend; int err; - down_read(&priv->sem); + read_lock(&priv->rwlock); list_for_each_entry(map, &priv->maps, next) { if (!map->vma) continue; @@ -299,7 +309,7 @@ static void mn_invl_range_start(struct mmu_notifier *mn, (mend - mstart) >> PAGE_SHIFT); WARN_ON(err); } - up_read(&priv->sem); + read_unlock(&priv->rwlock); } static void 
mn_invl_page(struct mmu_notifier *mn, @@ -316,7 +326,7 @@ static void mn_release(struct mmu_notifier *mn, struct grant_map *map; int err; - down_read(&priv->sem); + read_lock(&priv->rwlock); list_for_each_entry(map, &priv->maps, next) { if (!map->vma) continue; @@ -327,7 +337,7 @@ static void mn_release(struct mmu_notifier *mn, err = unmap_grant_pages(map, 0, map->count); WARN_ON(err); } - up_read(&priv->sem); + read_unlock(&priv->rwlock); } struct mmu_notifier_ops gntdev_mmu_ops = { @@ -347,7 +357,7 @@ static int gntdev_open(struct inode *inode, struct file *flip) return -ENOMEM; INIT_LIST_HEAD(&priv->maps); - init_rwsem(&priv->sem); + priv->rwlock = RW_LOCK_UNLOCKED; priv->limit = limit; priv->mm = get_task_mm(current); @@ -375,13 +385,21 @@ static int gntdev_release(struct inode *inode, struct file *flip) if (debug) printk("%s: priv %p\n", __FUNCTION__, priv); - down_write(&priv->sem); - while (!list_empty(&priv->maps)) { - map = list_entry(priv->maps.next, struct grant_map, next); - err = gntdev_del_map(map); - WARN_ON(err); + while (1) { + write_lock(&priv->rwlock); + if (!list_empty(&priv->maps)) { + map = list_entry(priv->maps.next, struct grant_map, next); + err = gntdev_del_map(map); + write_unlock(&priv->rwlock); + if (err) + WARN_ON(err); + else + gntdev_free_map(map); + } else { + write_unlock(&priv->rwlock); + break; + } } - up_write(&priv->sem); mmu_notifier_unregister(&priv->mn, priv->mm); kfree(priv); return 0; @@ -404,27 +422,29 @@ static long gntdev_ioctl_map_grant_ref(struct gntdev_priv *priv, if (unlikely(op.count > priv->limit)) return -EINVAL; - down_write(&priv->sem); err = -ENOMEM; - map = gntdev_add_map(priv, op.count); + map = gntdev_alloc_map(priv, op.count); if (!map) - goto err_unlock; - - err = -ENOMEM; + return err; if (copy_from_user(map->grants, &u->refs, - sizeof(map->grants[0]) * op.count) != 0) - goto err_free; + sizeof(map->grants[0]) * op.count) != 0) { + gntdev_free_map(map); + return err; + } + + write_lock(&priv->rwlock); + 
gntdev_add_map(priv, map); + op.index = map->index << PAGE_SHIFT; - if (copy_to_user(u, &op, sizeof(op)) != 0) - goto err_free; - up_write(&priv->sem); + write_unlock(&priv->rwlock); + if (copy_to_user(u, &op, sizeof(op)) != 0) { + write_lock(&priv->rwlock); + gntdev_del_map(map); + write_unlock(&priv->rwlock); + gntdev_free_map(map); + return err; + } return 0; - -err_free: - gntdev_del_map(map); -err_unlock: - up_write(&priv->sem); - return err; } static long gntdev_ioctl_unmap_grant_ref(struct gntdev_priv *priv, @@ -440,11 +460,13 @@ static long gntdev_ioctl_unmap_grant_ref(struct gntdev_priv *priv, printk("%s: priv %p, del %d+%d\n", __FUNCTION__, priv, (int)op.index, (int)op.count); - down_write(&priv->sem); + write_lock(&priv->rwlock); map = gntdev_find_map_index(priv, op.index >> PAGE_SHIFT, op.count); if (map) err = gntdev_del_map(map); - up_write(&priv->sem); + write_unlock(&priv->rwlock); + if (!err) + gntdev_free_map(map); return err; } @@ -460,16 +482,16 @@ static long gntdev_ioctl_get_offset_for_vaddr(struct gntdev_priv *priv, printk("%s: priv %p, offset for vaddr %lx\n", __FUNCTION__, priv, (unsigned long)op.vaddr); - down_read(&priv->sem); + read_lock(&priv->rwlock); map = gntdev_find_map_vaddr(priv, op.vaddr); if (map == NULL || map->vma->vm_start != op.vaddr) { - up_read(&priv->sem); + read_unlock(&priv->rwlock); return -EINVAL; } op.offset = map->index << PAGE_SHIFT; op.count = map->count; - up_read(&priv->sem); + read_unlock(&priv->rwlock); if (copy_to_user(u, &op, sizeof(op)) != 0) return -EFAULT; @@ -488,9 +510,9 @@ static long gntdev_ioctl_set_max_grants(struct gntdev_priv *priv, if (op.count > limit) return -EINVAL; - down_write(&priv->sem); + write_lock(&priv->rwlock); priv->limit = op.count; - up_write(&priv->sem); + write_unlock(&priv->rwlock); return 0; } @@ -538,7 +560,7 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma) printk("%s: map %d+%d at %lx (pgoff %lx)\n", __FUNCTION__, index, count, vma->vm_start, 
vma->vm_pgoff); - down_read(&priv->sem); + read_lock(&priv->rwlock); map = gntdev_find_map_index(priv, index, count); if (!map) goto unlock_out; @@ -580,7 +602,7 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma) map->is_mapped = 1; unlock_out: - up_read(&priv->sem); + read_unlock(&priv->rwlock); return err; }
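The restructuring in the version above — gntdev_alloc_map() building the entry with no lock held (so the copy_from_user it feeds can still sleep safely), then gntdev_add_map() doing only the sorted-list insertion under the lock — can be illustrated with a small userspace sketch. This is a hypothetical stand-in, not the kernel code: a pthread mutex plays the role of the non-sleeping lock, and range_alloc()/range_add() mimic the alloc/add split with a simplified first-fit index assignment.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Simplified stand-in for struct grant_map: a range of indices kept in
 * a sorted singly-linked list. */
struct map_range {
    int index, count;
    struct map_range *next;
};

struct range_list {
    struct map_range *head;
    pthread_mutex_t lock;       /* plays the role of priv->lock */
};

/* Step 1: allocate and populate with no lock held.  In the driver this
 * is where copy_from_user() runs, which may fault and sleep. */
static struct map_range *range_alloc(int count)
{
    struct map_range *add = calloc(1, sizeof(*add));
    if (!add)
        return NULL;
    add->count = count;
    return add;
}

/* Step 2: take the lock only for the list manipulation: find the first
 * free index gap and splice the entry in, keeping the list sorted. */
static void range_add(struct range_list *list, struct map_range *add)
{
    struct map_range **pp;

    pthread_mutex_lock(&list->lock);
    add->index = 0;
    for (pp = &list->head; *pp; pp = &(*pp)->next) {
        if (add->index + add->count <= (*pp)->index)
            break;              /* gap before *pp is big enough */
        add->index = (*pp)->index + (*pp)->count;
    }
    add->next = *pp;
    *pp = add;
    pthread_mutex_unlock(&list->lock);
}
```

With this split, everything that can block happens before the lock is taken — exactly the property needed once the lock stops being a semaphore.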
Jeremy Fitzhardinge
2010-Jun-02 17:11 UTC
Re: [Xen-devel] [PATCH] [PVOPS] fix gntdev on PAE
On 06/02/2010 07:11 AM, Stefano Stabellini wrote:> On Tue, 1 Jun 2010, Jeremy Fitzhardinge wrote: > >> On 06/01/2010 09:36 AM, Stefano Stabellini wrote: >> >>>>> Performances shouldn't matter in this case. >>>>> Something like this: >>>>> >>>>> >>>>> >>>> The problem is that the rcu lock disables preemption, so anything inside >>>> it must be non-scheduling. So it would need to be a spinlock type >>>> thing, I think. >>>> >>>> >>> right, in fact rwlock is a rw spinlock if I am not mistaken >>> >>> >> Ah, yes. The problem with just making it a spinlock is that other parts >> of the code do copy_from_user while holding it, so they would need to be >> restructured to avoid that. >> >> > Yes, you are right. I fixed the patch to do that, however I couldn't > actually reproduce the bug you are seeing so I couldn't test the error > path at all... >I hit this often, mostly when mucking around with xenstored (maybe when it exits?). Other people have reported it too. Does it really need to be a rwlock? J> --- > > diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c > index ddc59cc..0e571bd 100644 > --- a/drivers/xen/gntdev.c > +++ b/drivers/xen/gntdev.c > @@ -28,7 +28,7 @@ > #include <linux/types.h> > #include <linux/uaccess.h> > #include <linux/sched.h> > -#include <linux/rwsem.h> > +#include <linux/spinlock_types.h> > > #include <xen/xen.h> > #include <xen/grant_table.h> > @@ -51,7 +51,7 @@ struct gntdev_priv { > struct list_head maps; > uint32_t used; > uint32_t limit; > - struct rw_semaphore sem; > + rwlock_t rwlock; > struct mm_struct *mm; > struct mmu_notifier mn; > }; > @@ -84,9 +84,9 @@ static void gntdev_print_maps(struct gntdev_priv *priv, > map->index == text_index && text ? 
text : ""); > } > > -static struct grant_map *gntdev_add_map(struct gntdev_priv *priv, int count) > +static struct grant_map *gntdev_alloc_map(struct gntdev_priv *priv, int count) > { > - struct grant_map *map, *add; > + struct grant_map *add; > > add = kzalloc(sizeof(struct grant_map), GFP_KERNEL); > if (NULL == add) > @@ -104,8 +104,20 @@ static struct grant_map *gntdev_add_map(struct gntdev_priv *priv, int count) > add->count = count; > add->priv = priv; > > - if (add->count + priv->used > priv->limit) > - goto err; > + if (add->count + priv->used <= priv->limit) > + return add; > + > +err: > + kfree(add->grants); > + kfree(add->map_ops); > + kfree(add->unmap_ops); > + kfree(add); > + return NULL; > +} > + > +static void gntdev_add_map(struct gntdev_priv *priv, struct grant_map *add) > +{ > + struct grant_map *map; > > list_for_each_entry(map, &priv->maps, next) { > if (add->index + add->count < map->index) { > @@ -120,14 +132,6 @@ done: > priv->used += add->count; > if (debug) > gntdev_print_maps(priv, "[new]", add->index); > - return add; > - > -err: > - kfree(add->grants); > - kfree(add->map_ops); > - kfree(add->unmap_ops); > - kfree(add); > - return NULL; > } > > static struct grant_map *gntdev_find_map_index(struct gntdev_priv *priv, int index, > @@ -174,11 +178,17 @@ static int gntdev_del_map(struct grant_map *map) > > map->priv->used -= map->count; > list_del(&map->next); > + return 0; > +} > + > +static void gntdev_free_map(struct grant_map *map) > +{ > + if (!map) > + return; > kfree(map->grants); > kfree(map->map_ops); > kfree(map->unmap_ops); > kfree(map); > - return 0; > } > > /* ------------------------------------------------------------------ */ > @@ -277,7 +287,7 @@ static void mn_invl_range_start(struct mmu_notifier *mn, > unsigned long mstart, mend; > int err; > > - down_read(&priv->sem); > + read_lock(&priv->rwlock); > list_for_each_entry(map, &priv->maps, next) { > if (!map->vma) > continue; > @@ -299,7 +309,7 @@ static void 
mn_invl_range_start(struct mmu_notifier *mn, > (mend - mstart) >> PAGE_SHIFT); > WARN_ON(err); > } > - up_read(&priv->sem); > + read_unlock(&priv->rwlock); > } > > static void mn_invl_page(struct mmu_notifier *mn, > @@ -316,7 +326,7 @@ static void mn_release(struct mmu_notifier *mn, > struct grant_map *map; > int err; > > - down_read(&priv->sem); > + read_lock(&priv->rwlock); > list_for_each_entry(map, &priv->maps, next) { > if (!map->vma) > continue; > @@ -327,7 +337,7 @@ static void mn_release(struct mmu_notifier *mn, > err = unmap_grant_pages(map, 0, map->count); > WARN_ON(err); > } > - up_read(&priv->sem); > + read_unlock(&priv->rwlock); > } > > struct mmu_notifier_ops gntdev_mmu_ops = { > @@ -347,7 +357,7 @@ static int gntdev_open(struct inode *inode, struct file *flip) > return -ENOMEM; > > INIT_LIST_HEAD(&priv->maps); > - init_rwsem(&priv->sem); > + priv->rwlock = RW_LOCK_UNLOCKED; > priv->limit = limit; > > priv->mm = get_task_mm(current); > @@ -375,13 +385,21 @@ static int gntdev_release(struct inode *inode, struct file *flip) > if (debug) > printk("%s: priv %p\n", __FUNCTION__, priv); > > - down_write(&priv->sem); > - while (!list_empty(&priv->maps)) { > - map = list_entry(priv->maps.next, struct grant_map, next); > - err = gntdev_del_map(map); > - WARN_ON(err); > + while (1) { > + write_lock(&priv->rwlock); > + if (!list_empty(&priv->maps)) { > + map = list_entry(priv->maps.next, struct grant_map, next); > + err = gntdev_del_map(map); > + write_unlock(&priv->rwlock); > + if (err) > + WARN_ON(err); > + else > + gntdev_free_map(map); > + } else { > + write_unlock(&priv->rwlock); > + break; > + } > } > - up_write(&priv->sem); > mmu_notifier_unregister(&priv->mn, priv->mm); > kfree(priv); > return 0; > @@ -404,27 +422,29 @@ static long gntdev_ioctl_map_grant_ref(struct gntdev_priv *priv, > if (unlikely(op.count > priv->limit)) > return -EINVAL; > > - down_write(&priv->sem); > err = -ENOMEM; > - map = gntdev_add_map(priv, op.count); > + map = 
gntdev_alloc_map(priv, op.count); > if (!map) > - goto err_unlock; > - > - err = -ENOMEM; > + return err; > if (copy_from_user(map->grants, &u->refs, > - sizeof(map->grants[0]) * op.count) != 0) > - goto err_free; > + sizeof(map->grants[0]) * op.count) != 0) { > + gntdev_free_map(map); > + return err; > + } > + > + write_lock(&priv->rwlock); > + gntdev_add_map(priv, map); > + > op.index = map->index << PAGE_SHIFT; > - if (copy_to_user(u, &op, sizeof(op)) != 0) > - goto err_free; > - up_write(&priv->sem); > + write_unlock(&priv->rwlock); > + if (copy_to_user(u, &op, sizeof(op)) != 0) { > + write_lock(&priv->rwlock); > + gntdev_del_map(map); > + write_unlock(&priv->rwlock); > + gntdev_free_map(map); > + return err; > + } > return 0; > - > -err_free: > - gntdev_del_map(map); > -err_unlock: > - up_write(&priv->sem); > - return err; > } > > static long gntdev_ioctl_unmap_grant_ref(struct gntdev_priv *priv, > @@ -440,11 +460,13 @@ static long gntdev_ioctl_unmap_grant_ref(struct gntdev_priv *priv, > printk("%s: priv %p, del %d+%d\n", __FUNCTION__, priv, > (int)op.index, (int)op.count); > > - down_write(&priv->sem); > + write_lock(&priv->rwlock); > map = gntdev_find_map_index(priv, op.index >> PAGE_SHIFT, op.count); > if (map) > err = gntdev_del_map(map); > - up_write(&priv->sem); > + write_unlock(&priv->rwlock); > + if (!err) > + gntdev_free_map(map); > return err; > } > > @@ -460,16 +482,16 @@ static long gntdev_ioctl_get_offset_for_vaddr(struct gntdev_priv *priv, > printk("%s: priv %p, offset for vaddr %lx\n", __FUNCTION__, priv, > (unsigned long)op.vaddr); > > - down_read(&priv->sem); > + read_lock(&priv->rwlock); > map = gntdev_find_map_vaddr(priv, op.vaddr); > if (map == NULL || > map->vma->vm_start != op.vaddr) { > - up_read(&priv->sem); > + read_unlock(&priv->rwlock); > return -EINVAL; > } > op.offset = map->index << PAGE_SHIFT; > op.count = map->count; > - up_read(&priv->sem); > + read_unlock(&priv->rwlock); > > if (copy_to_user(u, &op, sizeof(op)) != 0) > return 
-EFAULT; > @@ -488,9 +510,9 @@ static long gntdev_ioctl_set_max_grants(struct gntdev_priv *priv, > if (op.count > limit) > return -EINVAL; > > - down_write(&priv->sem); > + write_lock(&priv->rwlock); > priv->limit = op.count; > - up_write(&priv->sem); > + write_unlock(&priv->rwlock); > return 0; > } > > @@ -538,7 +560,7 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma) > printk("%s: map %d+%d at %lx (pgoff %lx)\n", __FUNCTION__, > index, count, vma->vm_start, vma->vm_pgoff); > > - down_read(&priv->sem); > + read_lock(&priv->rwlock); > map = gntdev_find_map_index(priv, index, count); > if (!map) > goto unlock_out; > @@ -580,7 +602,7 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma) > map->is_mapped = 1; > > unlock_out: > - up_read(&priv->sem); > + read_unlock(&priv->rwlock); > return err; > } > > >
Stefano Stabellini
2010-Jun-03 09:32 UTC
Re: [Xen-devel] [PATCH] [PVOPS] fix gntdev on PAE
On Wed, 2 Jun 2010, Jeremy Fitzhardinge wrote:> I hit this often, mostly when mucking around with xenstored (maybe when > it exits?). Other people have reported it too. > > Does it really need to be a rwlock? >I see you really don't like rwlocks :) All right, this version uses plain spinlocks instead. --- diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c index ddc59cc..6a3e207 100644 --- a/drivers/xen/gntdev.c +++ b/drivers/xen/gntdev.c @@ -28,7 +28,7 @@ #include <linux/types.h> #include <linux/uaccess.h> #include <linux/sched.h> -#include <linux/rwsem.h> +#include <linux/spinlock.h> #include <xen/xen.h> #include <xen/grant_table.h> @@ -51,7 +51,7 @@ struct gntdev_priv { struct list_head maps; uint32_t used; uint32_t limit; - struct rw_semaphore sem; + spinlock_t lock; struct mm_struct *mm; struct mmu_notifier mn; }; @@ -84,9 +84,9 @@ static void gntdev_print_maps(struct gntdev_priv *priv, map->index == text_index && text ? text : ""); } -static struct grant_map *gntdev_add_map(struct gntdev_priv *priv, int count) +static struct grant_map *gntdev_alloc_map(struct gntdev_priv *priv, int count) { - struct grant_map *map, *add; + struct grant_map *add; add = kzalloc(sizeof(struct grant_map), GFP_KERNEL); if (NULL == add) @@ -104,8 +104,20 @@ static struct grant_map *gntdev_add_map(struct gntdev_priv *priv, int count) add->count = count; add->priv = priv; - if (add->count + priv->used > priv->limit) - goto err; + if (add->count + priv->used <= priv->limit) + return add; + +err: + kfree(add->grants); + kfree(add->map_ops); + kfree(add->unmap_ops); + kfree(add); + return NULL; +} + +static void gntdev_add_map(struct gntdev_priv *priv, struct grant_map *add) +{ + struct grant_map *map; list_for_each_entry(map, &priv->maps, next) { if (add->index + add->count < map->index) { @@ -120,14 +132,6 @@ done: priv->used += add->count; if (debug) gntdev_print_maps(priv, "[new]", add->index); - return add; - -err: - kfree(add->grants); - kfree(add->map_ops); - 
kfree(add->unmap_ops); - kfree(add); - return NULL; } static struct grant_map *gntdev_find_map_index(struct gntdev_priv *priv, int index, @@ -174,11 +178,17 @@ static int gntdev_del_map(struct grant_map *map) map->priv->used -= map->count; list_del(&map->next); + return 0; +} + +static void gntdev_free_map(struct grant_map *map) +{ + if (!map) + return; kfree(map->grants); kfree(map->map_ops); kfree(map->unmap_ops); kfree(map); - return 0; } /* ------------------------------------------------------------------ */ @@ -277,7 +287,7 @@ static void mn_invl_range_start(struct mmu_notifier *mn, unsigned long mstart, mend; int err; - down_read(&priv->sem); + spin_lock(&priv->lock); list_for_each_entry(map, &priv->maps, next) { if (!map->vma) continue; @@ -299,7 +309,7 @@ static void mn_invl_range_start(struct mmu_notifier *mn, (mend - mstart) >> PAGE_SHIFT); WARN_ON(err); } - up_read(&priv->sem); + spin_unlock(&priv->lock); } static void mn_invl_page(struct mmu_notifier *mn, @@ -316,7 +326,7 @@ static void mn_release(struct mmu_notifier *mn, struct grant_map *map; int err; - down_read(&priv->sem); + spin_lock(&priv->lock); list_for_each_entry(map, &priv->maps, next) { if (!map->vma) continue; @@ -327,7 +337,7 @@ static void mn_release(struct mmu_notifier *mn, err = unmap_grant_pages(map, 0, map->count); WARN_ON(err); } - up_read(&priv->sem); + spin_unlock(&priv->lock); } struct mmu_notifier_ops gntdev_mmu_ops = { @@ -347,7 +357,7 @@ static int gntdev_open(struct inode *inode, struct file *flip) return -ENOMEM; INIT_LIST_HEAD(&priv->maps); - init_rwsem(&priv->sem); + spin_lock_init(&priv->lock); priv->limit = limit; priv->mm = get_task_mm(current); @@ -375,13 +385,21 @@ static int gntdev_release(struct inode *inode, struct file *flip) if (debug) printk("%s: priv %p\n", __FUNCTION__, priv); - down_write(&priv->sem); - while (!list_empty(&priv->maps)) { - map = list_entry(priv->maps.next, struct grant_map, next); - err = gntdev_del_map(map); - WARN_ON(err); + while (1) { + 
spin_lock(&priv->lock); + if (!list_empty(&priv->maps)) { + map = list_entry(priv->maps.next, struct grant_map, next); + err = gntdev_del_map(map); + spin_unlock(&priv->lock); + if (err) + WARN_ON(err); + else + gntdev_free_map(map); + } else { + spin_unlock(&priv->lock); + break; + } } - up_write(&priv->sem); mmu_notifier_unregister(&priv->mn, priv->mm); kfree(priv); return 0; @@ -404,27 +422,29 @@ static long gntdev_ioctl_map_grant_ref(struct gntdev_priv *priv, if (unlikely(op.count > priv->limit)) return -EINVAL; - down_write(&priv->sem); err = -ENOMEM; - map = gntdev_add_map(priv, op.count); + map = gntdev_alloc_map(priv, op.count); if (!map) - goto err_unlock; - - err = -ENOMEM; + return err; if (copy_from_user(map->grants, &u->refs, - sizeof(map->grants[0]) * op.count) != 0) - goto err_free; + sizeof(map->grants[0]) * op.count) != 0) { + gntdev_free_map(map); + return err; + } + + spin_lock(&priv->lock); + gntdev_add_map(priv, map); + op.index = map->index << PAGE_SHIFT; - if (copy_to_user(u, &op, sizeof(op)) != 0) - goto err_free; - up_write(&priv->sem); + spin_unlock(&priv->lock); + if (copy_to_user(u, &op, sizeof(op)) != 0) { + spin_lock(&priv->lock); + gntdev_del_map(map); + spin_unlock(&priv->lock); + gntdev_free_map(map); + return err; + } return 0; - -err_free: - gntdev_del_map(map); -err_unlock: - up_write(&priv->sem); - return err; } static long gntdev_ioctl_unmap_grant_ref(struct gntdev_priv *priv, @@ -440,11 +460,13 @@ static long gntdev_ioctl_unmap_grant_ref(struct gntdev_priv *priv, printk("%s: priv %p, del %d+%d\n", __FUNCTION__, priv, (int)op.index, (int)op.count); - down_write(&priv->sem); + spin_lock(&priv->lock); map = gntdev_find_map_index(priv, op.index >> PAGE_SHIFT, op.count); if (map) err = gntdev_del_map(map); - up_write(&priv->sem); + spin_unlock(&priv->lock); + if (!err) + gntdev_free_map(map); return err; } @@ -460,16 +482,16 @@ static long gntdev_ioctl_get_offset_for_vaddr(struct gntdev_priv *priv, printk("%s: priv %p, offset for 
vaddr %lx\n", __FUNCTION__, priv, (unsigned long)op.vaddr); - down_read(&priv->sem); + spin_lock(&priv->lock); map = gntdev_find_map_vaddr(priv, op.vaddr); if (map == NULL || map->vma->vm_start != op.vaddr) { - up_read(&priv->sem); + spin_unlock(&priv->lock); return -EINVAL; } op.offset = map->index << PAGE_SHIFT; op.count = map->count; - up_read(&priv->sem); + spin_unlock(&priv->lock); if (copy_to_user(u, &op, sizeof(op)) != 0) return -EFAULT; @@ -488,9 +510,9 @@ static long gntdev_ioctl_set_max_grants(struct gntdev_priv *priv, if (op.count > limit) return -EINVAL; - down_write(&priv->sem); + spin_lock(&priv->lock); priv->limit = op.count; - up_write(&priv->sem); + spin_unlock(&priv->lock); return 0; } @@ -538,7 +560,7 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma) printk("%s: map %d+%d at %lx (pgoff %lx)\n", __FUNCTION__, index, count, vma->vm_start, vma->vm_pgoff); - down_read(&priv->sem); + spin_lock(&priv->lock); map = gntdev_find_map_index(priv, index, count); if (!map) goto unlock_out; @@ -580,7 +602,7 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma) map->is_mapped = 1; unlock_out: - up_read(&priv->sem); + spin_unlock(&priv->lock); return err; } _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2010-Jun-09 19:38 UTC
Re: [Xen-devel] [PATCH] [PVOPS] fix gntdev on PAE
On 06/03/2010 02:32 AM, Stefano Stabellini wrote:> On Wed, 2 Jun 2010, Jeremy Fitzhardinge wrote: > >> I hit this often, mostly when mucking around with xenstored (maybe when >> it exits?). Other people have reported it too. >> >> Does it really need to be a rwlock? >> >> > I see you really don't like rwlocks :) > > All right, this version uses plain spinlocks instead. >Thanks, that looks good. I tweaked it a little bit (kfree doesn't block, so it can be done under spinlock which simplifies the release loop), and I'm about to give it a test spin. J> --- > > > diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c > index ddc59cc..6a3e207 100644 > --- a/drivers/xen/gntdev.c > +++ b/drivers/xen/gntdev.c > @@ -28,7 +28,7 @@ > #include <linux/types.h> > #include <linux/uaccess.h> > #include <linux/sched.h> > -#include <linux/rwsem.h> > +#include <linux/spinlock.h> > > #include <xen/xen.h> > #include <xen/grant_table.h> > @@ -51,7 +51,7 @@ struct gntdev_priv { > struct list_head maps; > uint32_t used; > uint32_t limit; > - struct rw_semaphore sem; > + spinlock_t lock; > struct mm_struct *mm; > struct mmu_notifier mn; > }; > @@ -84,9 +84,9 @@ static void gntdev_print_maps(struct gntdev_priv *priv, > map->index == text_index && text ? 
text : ""); > } > > -static struct grant_map *gntdev_add_map(struct gntdev_priv *priv, int count) > +static struct grant_map *gntdev_alloc_map(struct gntdev_priv *priv, int count) > { > - struct grant_map *map, *add; > + struct grant_map *add; > > add = kzalloc(sizeof(struct grant_map), GFP_KERNEL); > if (NULL == add) > @@ -104,8 +104,20 @@ static struct grant_map *gntdev_add_map(struct gntdev_priv *priv, int count) > add->count = count; > add->priv = priv; > > - if (add->count + priv->used > priv->limit) > - goto err; > + if (add->count + priv->used <= priv->limit) > + return add; > + > +err: > + kfree(add->grants); > + kfree(add->map_ops); > + kfree(add->unmap_ops); > + kfree(add); > + return NULL; > +} > + > +static void gntdev_add_map(struct gntdev_priv *priv, struct grant_map *add) > +{ > + struct grant_map *map; > > list_for_each_entry(map, &priv->maps, next) { > if (add->index + add->count < map->index) { > @@ -120,14 +132,6 @@ done: > priv->used += add->count; > if (debug) > gntdev_print_maps(priv, "[new]", add->index); > - return add; > - > -err: > - kfree(add->grants); > - kfree(add->map_ops); > - kfree(add->unmap_ops); > - kfree(add); > - return NULL; > } > > static struct grant_map *gntdev_find_map_index(struct gntdev_priv *priv, int index, > @@ -174,11 +178,17 @@ static int gntdev_del_map(struct grant_map *map) > > map->priv->used -= map->count; > list_del(&map->next); > + return 0; > +} > + > +static void gntdev_free_map(struct grant_map *map) > +{ > + if (!map) > + return; > kfree(map->grants); > kfree(map->map_ops); > kfree(map->unmap_ops); > kfree(map); > - return 0; > } > > /* ------------------------------------------------------------------ */ > @@ -277,7 +287,7 @@ static void mn_invl_range_start(struct mmu_notifier *mn, > unsigned long mstart, mend; > int err; > > - down_read(&priv->sem); > + spin_lock(&priv->lock); > list_for_each_entry(map, &priv->maps, next) { > if (!map->vma) > continue; > @@ -299,7 +309,7 @@ static void 
mn_invl_range_start(struct mmu_notifier *mn, > (mend - mstart) >> PAGE_SHIFT); > WARN_ON(err); > } > - up_read(&priv->sem); > + spin_unlock(&priv->lock); > } > > static void mn_invl_page(struct mmu_notifier *mn, > @@ -316,7 +326,7 @@ static void mn_release(struct mmu_notifier *mn, > struct grant_map *map; > int err; > > - down_read(&priv->sem); > + spin_lock(&priv->lock); > list_for_each_entry(map, &priv->maps, next) { > if (!map->vma) > continue; > @@ -327,7 +337,7 @@ static void mn_release(struct mmu_notifier *mn, > err = unmap_grant_pages(map, 0, map->count); > WARN_ON(err); > } > - up_read(&priv->sem); > + spin_unlock(&priv->lock); > } > > struct mmu_notifier_ops gntdev_mmu_ops = { > @@ -347,7 +357,7 @@ static int gntdev_open(struct inode *inode, struct file *flip) > return -ENOMEM; > > INIT_LIST_HEAD(&priv->maps); > - init_rwsem(&priv->sem); > + spin_lock_init(&priv->lock); > priv->limit = limit; > > priv->mm = get_task_mm(current); > @@ -375,13 +385,21 @@ static int gntdev_release(struct inode *inode, struct file *flip) > if (debug) > printk("%s: priv %p\n", __FUNCTION__, priv); > > - down_write(&priv->sem); > - while (!list_empty(&priv->maps)) { > - map = list_entry(priv->maps.next, struct grant_map, next); > - err = gntdev_del_map(map); > - WARN_ON(err); > + while (1) { > + spin_lock(&priv->lock); > + if (!list_empty(&priv->maps)) { > + map = list_entry(priv->maps.next, struct grant_map, next); > + err = gntdev_del_map(map); > + spin_unlock(&priv->lock); > + if (err) > + WARN_ON(err); > + else > + gntdev_free_map(map); > + } else { > + spin_unlock(&priv->lock); > + break; > + } > } > - up_write(&priv->sem); > mmu_notifier_unregister(&priv->mn, priv->mm); > kfree(priv); > return 0; > @@ -404,27 +422,29 @@ static long gntdev_ioctl_map_grant_ref(struct gntdev_priv *priv, > if (unlikely(op.count > priv->limit)) > return -EINVAL; > > - down_write(&priv->sem); > err = -ENOMEM; > - map = gntdev_add_map(priv, op.count); > + map = gntdev_alloc_map(priv, op.count); > 
if (!map) > - goto err_unlock; > - > - err = -ENOMEM; > + return err; > if (copy_from_user(map->grants, &u->refs, > - sizeof(map->grants[0]) * op.count) != 0) > - goto err_free; > + sizeof(map->grants[0]) * op.count) != 0) { > + gntdev_free_map(map); > + return err; > + } > + > + spin_lock(&priv->lock); > + gntdev_add_map(priv, map); > + > op.index = map->index << PAGE_SHIFT; > - if (copy_to_user(u, &op, sizeof(op)) != 0) > - goto err_free; > - up_write(&priv->sem); > + spin_unlock(&priv->lock); > + if (copy_to_user(u, &op, sizeof(op)) != 0) { > + spin_lock(&priv->lock); > + gntdev_del_map(map); > + spin_unlock(&priv->lock); > + gntdev_free_map(map); > + return err; > + } > return 0; > - > -err_free: > - gntdev_del_map(map); > -err_unlock: > - up_write(&priv->sem); > - return err; > } > > static long gntdev_ioctl_unmap_grant_ref(struct gntdev_priv *priv, > @@ -440,11 +460,13 @@ static long gntdev_ioctl_unmap_grant_ref(struct gntdev_priv *priv, > printk("%s: priv %p, del %d+%d\n", __FUNCTION__, priv, > (int)op.index, (int)op.count); > > - down_write(&priv->sem); > + spin_lock(&priv->lock); > map = gntdev_find_map_index(priv, op.index >> PAGE_SHIFT, op.count); > if (map) > err = gntdev_del_map(map); > - up_write(&priv->sem); > + spin_unlock(&priv->lock); > + if (!err) > + gntdev_free_map(map); > return err; > } > > @@ -460,16 +482,16 @@ static long gntdev_ioctl_get_offset_for_vaddr(struct gntdev_priv *priv, > printk("%s: priv %p, offset for vaddr %lx\n", __FUNCTION__, priv, > (unsigned long)op.vaddr); > > - down_read(&priv->sem); > + spin_lock(&priv->lock); > map = gntdev_find_map_vaddr(priv, op.vaddr); > if (map == NULL || > map->vma->vm_start != op.vaddr) { > - up_read(&priv->sem); > + spin_unlock(&priv->lock); > return -EINVAL; > } > op.offset = map->index << PAGE_SHIFT; > op.count = map->count; > - up_read(&priv->sem); > + spin_unlock(&priv->lock); > > if (copy_to_user(u, &op, sizeof(op)) != 0) > return -EFAULT; > @@ -488,9 +510,9 @@ static long 
gntdev_ioctl_set_max_grants(struct gntdev_priv *priv, > if (op.count > limit) > return -EINVAL; > > - down_write(&priv->sem); > + spin_lock(&priv->lock); > priv->limit = op.count; > - up_write(&priv->sem); > + spin_unlock(&priv->lock); > return 0; > } > > @@ -538,7 +560,7 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma) > printk("%s: map %d+%d at %lx (pgoff %lx)\n", __FUNCTION__, > index, count, vma->vm_start, vma->vm_pgoff); > > - down_read(&priv->sem); > + spin_lock(&priv->lock); > map = gntdev_find_map_index(priv, index, count); > if (!map) > goto unlock_out; > @@ -580,7 +602,7 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma) > map->is_mapped = 1; > > unlock_out: > - up_read(&priv->sem); > + spin_unlock(&priv->lock); > return err; > } > > >
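Jeremy's tweak in the message above exploits the fact that kfree() never sleeps, so freeing can stay inside the critical section and the whole release path collapses back into a single lock acquisition, instead of the lock/unlock-per-entry loop in Stefano's version. A hypothetical userspace sketch of that simplified teardown loop (free() standing in for kfree(), a pthread mutex for the spinlock; the kernel-side point being illustrated is only that the free routine is safe to call without dropping the lock):

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct node {
    struct node *next;
};

struct node_list {
    struct node *head;
    pthread_mutex_t lock;
};

/* Tear down the whole list under one lock acquisition: unlink and free
 * each entry without releasing the lock in between, which is legal when
 * the free routine cannot block. */
static void list_release(struct node_list *list)
{
    pthread_mutex_lock(&list->lock);
    while (list->head) {
        struct node *n = list->head;
        list->head = n->next;   /* unlink... */
        free(n);                /* ...and free, still under the lock */
    }
    pthread_mutex_unlock(&list->lock);
}
```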