thr3ads.net - Ocfs2 devel - [Ocfs2-devel] [Cluster-devel] [PATCH v11 16/25] fs: Convert mpage_readpages to mpage

If this information is useful, please help other people find it:
Share via:

Andreas Gruenbacher

2020-Jun-18 12:46 UTC

[Ocfs2-devel] [Cluster-devel] [PATCH v11 16/25] fs: Convert mpage_readpages to mpage_readahead

On Wed, Jun 17, 2020 at 4:22 AM Matthew Wilcox <willy at infradead.org>
wrote:> On Wed, Jun 17, 2020 at 02:57:14AM +0200, Andreas Gr?nbacher wrote:
> > Am Mi., 17. Juni 2020 um 02:33 Uhr schrieb Matthew Wilcox <willy at
infradead.org>:
> > >
> > > On Wed, Jun 17, 2020 at 12:36:13AM +0200, Andreas Gruenbacher
wrote:
> > > > Am Mi., 15. Apr. 2020 um 23:39 Uhr schrieb Matthew Wilcox
<willy at infradead.org>:
> > > > > From: "Matthew Wilcox (Oracle)" <willy at
infradead.org>
> > > > >
> > > > > Implement the new readahead aop and convert all callers
(block_dev,
> > > > > exfat, ext2, fat, gfs2, hpfs, isofs, jfs, nilfs2,
ocfs2, omfs, qnx6,
> > > > > reiserfs & udf).  The callers are all trivial
except for GFS2 & OCFS2.
> > > >
> > > > This patch leads to an ABBA deadlock in xfstest generic/095
on gfs2.
> > > >
> > > > Our lock hierarchy is such that the inode cluster lock
("inode glock")
> > > > for an inode needs to be taken before any page locks in that
inode's
> > > > address space.
> > >
> > > How does that work for ...
> > >
> > > writepage:              yes, unlocks (see below)
> > > readpage:               yes, unlocks
> > > invalidatepage:         yes
> > > releasepage:            yes
> > > freepage:               yes
> > > isolate_page:           yes
> > > migratepage:            yes (both)
> > > putback_page:           yes
> > > launder_page:           yes
> > > is_partially_uptodate:  yes
> > > error_remove_page:      yes
> > >
> > > Is there a reason that you don't take the glock in the higher
level
> > > ops which are called before readhead gets called?  I'm
looking at XFS,
> > > and it takes the xfs_ilock SHARED in xfs_file_buffered_aio_read()
> > > (called from xfs_file_read_iter).
> >
> > Right, the approach from the following thread might fix this:
> >
> >
https://urldefense.com/v3/__https://lore.kernel.org/linux-fsdevel/20191122235324.17245-1-agruenba
at
redhat.com/T/*t__;Iw!!GqivPVa7Brio!IekEw8uQR1N_S3Wqmk3aWZVcgHRjA1cp2KT7CkfJUM5xxPPXx_Jr5Is0-Z3rCNxzU5XYvQ$
>
> In general, I think this is a sound approach.
>
> Specifically, I think FAULT_FLAG_CACHED can go away.  map_pages()
> will bring in the pages which are in the page cache, so when we get to
> gfs2_fault(), we know there's a reason to acquire the glock.
We'd still be grabbing a glock while holding a dependent page lock.
Another process could be holding the glock and could try to grab the
same page lock (i.e., a concurrent writer), leading to the same kind
of deadlock.

Andreas

Matthew Wilcox

2020-Jun-18 15:03 UTC

head link

[Ocfs2-devel] [Cluster-devel] [PATCH v11 16/25] fs: Convert mpage_readpages to mpage_readahead

On Thu, Jun 18, 2020 at 02:46:03PM +0200, Andreas Gruenbacher
wrote:> On Wed, Jun 17, 2020 at 4:22 AM Matthew Wilcox <willy at
infradead.org> wrote:
> > On Wed, Jun 17, 2020 at 02:57:14AM +0200, Andreas Gr?nbacher wrote:
> > > Right, the approach from the following thread might fix this:
> > >
> > >
https://urldefense.com/v3/__https://lore.kernel.org/linux-fsdevel/20191122235324.17245-1-agruenba
at
redhat.com/T/*t__;Iw!!GqivPVa7Brio!I8hTVCrMoGi6ZPm8rkcsxqxtbzTOw-Hj_usoVmG1GbN7GkmX9sVwdseMcWqpTQWoqXUGRw$
> >
> > In general, I think this is a sound approach.
> >
> > Specifically, I think FAULT_FLAG_CACHED can go away.  map_pages()
> > will bring in the pages which are in the page cache, so when we get to
> > gfs2_fault(), we know there's a reason to acquire the glock.
> 
> We'd still be grabbing a glock while holding a dependent page lock.
> Another process could be holding the glock and could try to grab the
> same page lock (i.e., a concurrent writer), leading to the same kind
> of deadlock.
What I'm saying is that gfs2_fault should just be:

+static vm_fault_t gfs2_fault(struct vm_fault *vmf)
+{
+	struct inode *inode = file_inode(vmf->vma->vm_file);
+	struct gfs2_inode *ip = GFS2_I(inode);
+	struct gfs2_holder gh;
+	vm_fault_t ret;
+	int err;
+
+	gfs2_holder_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
+	err = gfs2_glock_nq(&gh);
+	if (err) {
+		ret = block_page_mkwrite_return(err);
+		goto out_uninit;
+	}
+	ret = filemap_fault(vmf);
+	gfs2_glock_dq(&gh);
+out_uninit:
+	gfs2_holder_uninit(&gh);
+	return ret;
+}

because by the time gfs2_fault() is called, map_pages() has already been
called and has failed to insert the necessary page, so we should just
acquire the glock now instead of trying again to look for the page in
the page cache.

Ocfs2 devel - Jun 2020 - [Cluster-devel] [PATCH v11 16/25] fs: Convert mpage_readpages to mpage_readahead

[Ocfs2-devel] [Cluster-devel] [PATCH v11 16/25] fs: Convert mpage_readpages to mpage_readahead

[Ocfs2-devel] [Cluster-devel] [PATCH v11 16/25] fs: Convert mpage_readpages to mpage_readahead