Andreas Gruenbacher
2020-Jun-18 12:46 UTC
[Ocfs2-devel] [Cluster-devel] [PATCH v11 16/25] fs: Convert mpage_readpages to mpage_readahead
On Wed, Jun 17, 2020 at 4:22 AM Matthew Wilcox <willy at infradead.org> wrote:> On Wed, Jun 17, 2020 at 02:57:14AM +0200, Andreas Gr?nbacher wrote: > > Am Mi., 17. Juni 2020 um 02:33 Uhr schrieb Matthew Wilcox <willy at infradead.org>: > > > > > > On Wed, Jun 17, 2020 at 12:36:13AM +0200, Andreas Gruenbacher wrote: > > > > Am Mi., 15. Apr. 2020 um 23:39 Uhr schrieb Matthew Wilcox <willy at infradead.org>: > > > > > From: "Matthew Wilcox (Oracle)" <willy at infradead.org> > > > > > > > > > > Implement the new readahead aop and convert all callers (block_dev, > > > > > exfat, ext2, fat, gfs2, hpfs, isofs, jfs, nilfs2, ocfs2, omfs, qnx6, > > > > > reiserfs & udf). The callers are all trivial except for GFS2 & OCFS2. > > > > > > > > This patch leads to an ABBA deadlock in xfstest generic/095 on gfs2. > > > > > > > > Our lock hierarchy is such that the inode cluster lock ("inode glock") > > > > for an inode needs to be taken before any page locks in that inode's > > > > address space. > > > > > > How does that work for ... > > > > > > writepage: yes, unlocks (see below) > > > readpage: yes, unlocks > > > invalidatepage: yes > > > releasepage: yes > > > freepage: yes > > > isolate_page: yes > > > migratepage: yes (both) > > > putback_page: yes > > > launder_page: yes > > > is_partially_uptodate: yes > > > error_remove_page: yes > > > > > > Is there a reason that you don't take the glock in the higher level > > > ops which are called before readhead gets called? I'm looking at XFS, > > > and it takes the xfs_ilock SHARED in xfs_file_buffered_aio_read() > > > (called from xfs_file_read_iter). > > > > Right, the approach from the following thread might fix this: > > > > https://urldefense.com/v3/__https://lore.kernel.org/linux-fsdevel/20191122235324.17245-1-agruenba at redhat.com/T/*t__;Iw!!GqivPVa7Brio!IekEw8uQR1N_S3Wqmk3aWZVcgHRjA1cp2KT7CkfJUM5xxPPXx_Jr5Is0-Z3rCNxzU5XYvQ$ > > In general, I think this is a sound approach. > > Specifically, I think FAULT_FLAG_CACHED can go away. map_pages() > will bring in the pages which are in the page cache, so when we get to > gfs2_fault(), we know there's a reason to acquire the glock.We'd still be grabbing a glock while holding a dependent page lock. Another process could be holding the glock and could try to grab the same page lock (i.e., a concurrent writer), leading to the same kind of deadlock. Andreas
Matthew Wilcox
2020-Jun-18 15:03 UTC
[Ocfs2-devel] [Cluster-devel] [PATCH v11 16/25] fs: Convert mpage_readpages to mpage_readahead
On Thu, Jun 18, 2020 at 02:46:03PM +0200, Andreas Gruenbacher wrote:> On Wed, Jun 17, 2020 at 4:22 AM Matthew Wilcox <willy at infradead.org> wrote: > > On Wed, Jun 17, 2020 at 02:57:14AM +0200, Andreas Gr?nbacher wrote: > > > Right, the approach from the following thread might fix this: > > > > > > https://urldefense.com/v3/__https://lore.kernel.org/linux-fsdevel/20191122235324.17245-1-agruenba at redhat.com/T/*t__;Iw!!GqivPVa7Brio!I8hTVCrMoGi6ZPm8rkcsxqxtbzTOw-Hj_usoVmG1GbN7GkmX9sVwdseMcWqpTQWoqXUGRw$ > > > > In general, I think this is a sound approach. > > > > Specifically, I think FAULT_FLAG_CACHED can go away. map_pages() > > will bring in the pages which are in the page cache, so when we get to > > gfs2_fault(), we know there's a reason to acquire the glock. > > We'd still be grabbing a glock while holding a dependent page lock. > Another process could be holding the glock and could try to grab the > same page lock (i.e., a concurrent writer), leading to the same kind > of deadlock.What I'm saying is that gfs2_fault should just be: +static vm_fault_t gfs2_fault(struct vm_fault *vmf) +{ + struct inode *inode = file_inode(vmf->vma->vm_file); + struct gfs2_inode *ip = GFS2_I(inode); + struct gfs2_holder gh; + vm_fault_t ret; + int err; + + gfs2_holder_init(ip->i_gl, LM_ST_SHARED, 0, &gh); + err = gfs2_glock_nq(&gh); + if (err) { + ret = block_page_mkwrite_return(err); + goto out_uninit; + } + ret = filemap_fault(vmf); + gfs2_glock_dq(&gh); +out_uninit: + gfs2_holder_uninit(&gh); + return ret; +} because by the time gfs2_fault() is called, map_pages() has already been called and has failed to insert the necessary page, so we should just acquire the glock now instead of trying again to look for the page in the page cache.