Jan Kara
2011-Jun-23 20:51 UTC
[Ocfs2-devel] [PATCH] ocfs2: Avoid livelock in ocfs2_readpage()
When someone writes to an inode, readers accessing the same inode via ocfs2_readpage() just busyloop trying to get ip_alloc_sem because do_generic_file_read() looks up the page again and retries ->readpage() when previous attempt failed with AOP_TRUNCATED_PAGE. When there are enough readers, they can occupy all CPUs and in non-preempt kernel the system is deadlocked because writer holding ip_alloc_sem is never run to release the semaphore. Fix the problem by making reader block on ip_alloc_sem to break the busy loop. Signed-off-by: Jan Kara <jack at suse.cz> --- fs/ocfs2/aops.c | 8 ++++++++ 1 files changed, 8 insertions(+), 0 deletions(-) diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c index ac97bca..0919e8f 100644 --- a/fs/ocfs2/aops.c +++ b/fs/ocfs2/aops.c @@ -290,7 +290,15 @@ static int ocfs2_readpage(struct file *file, struct page *page) } if (down_read_trylock(&oi->ip_alloc_sem) == 0) { + /* + * Unlock the page and cycle ip_alloc_sem so that we don't + * busyloop waiting for ip_alloc_sem to unlock + */ ret = AOP_TRUNCATED_PAGE; + unlock_page(page); + unlock = 0; + down_read(&oi->ip_alloc_sem); + up_read(&oi->ip_alloc_sem); goto out_inode_unlock; } -- 1.7.1
Sunil Mushran
2011-Jun-24 20:55 UTC
[Ocfs2-devel] [PATCH] ocfs2: Avoid livelock in ocfs2_readpage()
Looks reasonable. Was this something someone ran into or was this reproduced only via a test workload? On 06/23/2011 01:51 PM, Jan Kara wrote:> When someone writes to an inode, readers accessing the same inode via > ocfs2_readpage() just busyloop trying to get ip_alloc_sem because > do_generic_file_read() looks up the page again and retries ->readpage() > when previous attempt failed with AOP_TRUNCATED_PAGE. When there are enough > readers, they can occupy all CPUs and in non-preempt kernel the system is > deadlocked because writer holding ip_alloc_sem is never run to release the > semaphore. Fix the problem by making reader block on ip_alloc_sem to break > the busy loop. > > Signed-off-by: Jan Kara<jack at suse.cz> > --- > fs/ocfs2/aops.c | 8 ++++++++ > 1 files changed, 8 insertions(+), 0 deletions(-) > > diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c > index ac97bca..0919e8f 100644 > --- a/fs/ocfs2/aops.c > +++ b/fs/ocfs2/aops.c > @@ -290,7 +290,15 @@ static int ocfs2_readpage(struct file *file, struct page *page) > } > > if (down_read_trylock(&oi->ip_alloc_sem) == 0) { > + /* > + * Unlock the page and cycle ip_alloc_sem so that we don't > + * busyloop waiting for ip_alloc_sem to unlock > + */ > ret = AOP_TRUNCATED_PAGE; > + unlock_page(page); > + unlock = 0; > + down_read(&oi->ip_alloc_sem); > + up_read(&oi->ip_alloc_sem); > goto out_inode_unlock; > } >
Joel Becker
2011-Jun-26 07:26 UTC
[Ocfs2-devel] [PATCH] ocfs2: Avoid livelock in ocfs2_readpage()
On Thu, Jun 23, 2011 at 10:51:47PM +0200, Jan Kara wrote:> When someone writes to an inode, readers accessing the same inode via > ocfs2_readpage() just busyloop trying to get ip_alloc_sem because > do_generic_file_read() looks up the page again and retries ->readpage() > when previous attempt failed with AOP_TRUNCATED_PAGE. When there are enough > readers, they can occupy all CPUs and in non-preempt kernel the system is > deadlocked because writer holding ip_alloc_sem is never run to release the > semaphore. Fix the problem by making reader block on ip_alloc_sem to break > the busy loop. > > Signed-off-by: Jan Kara <jack at suse.cz> > --- > fs/ocfs2/aops.c | 8 ++++++++ > 1 files changed, 8 insertions(+), 0 deletions(-) > > diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c > index ac97bca..0919e8f 100644 > --- a/fs/ocfs2/aops.c > +++ b/fs/ocfs2/aops.c > @@ -290,7 +290,15 @@ static int ocfs2_readpage(struct file *file, struct page *page) > } > > if (down_read_trylock(&oi->ip_alloc_sem) == 0) { > + /* > + * Unlock the page and cycle ip_alloc_sem so that we don't > + * busyloop waiting for ip_alloc_sem to unlock > + */ > ret = AOP_TRUNCATED_PAGE; > + unlock_page(page); > + unlock = 0; > + down_read(&oi->ip_alloc_sem); > + up_read(&oi->ip_alloc_sem); > goto out_inode_unlock; > }Question: First, is it safe to drop the page lock here? Can all callers of readpage (not just g_f_a_r()) handle that?> -- "But all my words come back to me In shades of mediocrity. Like emptiness in harmony I need someone to comfort me." http://www.jlbec.org/ jlbec at evilplan.org
Joel Becker
2011-Jul-28 09:08 UTC
[Ocfs2-devel] [PATCH] ocfs2: Avoid livelock in ocfs2_readpage()
On Thu, Jun 23, 2011 at 10:51:47PM +0200, Jan Kara wrote:> When someone writes to an inode, readers accessing the same inode via > ocfs2_readpage() just busyloop trying to get ip_alloc_sem because > do_generic_file_read() looks up the page again and retries ->readpage() > when previous attempt failed with AOP_TRUNCATED_PAGE. When there are enough > readers, they can occupy all CPUs and in non-preempt kernel the system is > deadlocked because writer holding ip_alloc_sem is never run to release the > semaphore. Fix the problem by making reader block on ip_alloc_sem to break > the busy loop. > > Signed-off-by: Jan Kara <jack at suse.cz>This patch is now in the fixes branch of ocfs2.git. Joel -- Life's Little Instruction Book #30 "Never buy a house without a fireplace." http://www.jlbec.org/ jlbec at evilplan.org