Christoph Hellwig
2023-Jan-16 08:55 UTC
[Ocfs2-devel] remove most callers of write_one_page
Hi all, this series removes most users of the write_one_page API. These helpers internally call ->writepage which we are gradually removing from the kernel. Changes since v1: - drop the btrfs changes (queue up in the btrfs tree) - drop the finaly move to jfs (can't be done without the btrfs patches) - fix the existing minix code to properly propagate errors Diffstat: minix/dir.c | 67 ++++++++++++++++++++++++++++++--------------------- minix/minix.h | 3 +- minix/namei.c | 8 +++--- ocfs2/refcounttree.c | 9 +++--- sysv/dir.c | 30 ++++++++++++++-------- ufs/dir.c | 29 ++++++++++++++-------- 6 files changed, 90 insertions(+), 56 deletions(-)
Christoph Hellwig
2023-Jan-16 08:55 UTC
[Ocfs2-devel] [PATCH 1/6] minix: fix error handling in minix_delete_entry
If minix_prepare_chunk fails, updating c/mtime and marking the dir inode dirty is wrong, as the inode hasn't been modified. Note that this moves the dir_put_page call later, but that matches other uses of this helper in the directory code. Signed-off-by: Christoph Hellwig <hch at lst.de> --- fs/minix/dir.c | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/fs/minix/dir.c b/fs/minix/dir.c index dcfe5b25378b54..a8e76284cb71ec 100644 --- a/fs/minix/dir.c +++ b/fs/minix/dir.c @@ -297,18 +297,19 @@ int minix_delete_entry(struct minix_dir_entry *de, struct page *page) lock_page(page); err = minix_prepare_chunk(page, pos, len); - if (err == 0) { - if (sbi->s_version == MINIX_V3) - ((minix3_dirent *) de)->inode = 0; - else - de->inode = 0; - err = dir_commit_chunk(page, pos, len); - } else { + if (err) { unlock_page(page); + goto out_put_page; } - dir_put_page(page); + if (sbi->s_version == MINIX_V3) + ((minix3_dirent *)de)->inode = 0; + else + de->inode = 0; + err = dir_commit_chunk(page, pos, len); inode->i_ctime = inode->i_mtime = current_time(inode); mark_inode_dirty(inode); +out_put_page: + dir_put_page(page); return err; } -- 2.39.0
Christoph Hellwig
2023-Jan-16 08:55 UTC
[Ocfs2-devel] [PATCH 2/6] minix: fix error handling in minix_set_link
If minix_prepare_chunk fails, updating c/mtime and marking the dir inode dirty is wrong, as the inode hasn't been modified. Also propagate the error to the caller. Note that this moves the dir_put_page call later, but that matches other uses of this helper in the directory code. Signed-off-by: Christoph Hellwig <hch at lst.de> --- fs/minix/dir.c | 23 ++++++++++++----------- fs/minix/minix.h | 3 ++- fs/minix/namei.c | 8 +++++--- 3 files changed, 19 insertions(+), 15 deletions(-) diff --git a/fs/minix/dir.c b/fs/minix/dir.c index a8e76284cb71ec..c9a3d520b72671 100644 --- a/fs/minix/dir.c +++ b/fs/minix/dir.c @@ -410,8 +410,8 @@ int minix_empty_dir(struct inode * inode) } /* Releases the page */ -void minix_set_link(struct minix_dir_entry *de, struct page *page, - struct inode *inode) +int minix_set_link(struct minix_dir_entry *de, struct page *page, + struct inode *inode) { struct inode *dir = page->mapping->host; struct minix_sb_info *sbi = minix_sb(dir->i_sb); @@ -420,20 +420,21 @@ void minix_set_link(struct minix_dir_entry *de, struct page *page, int err; lock_page(page); - err = minix_prepare_chunk(page, pos, sbi->s_dirsize); - if (err == 0) { - if (sbi->s_version == MINIX_V3) - ((minix3_dirent *) de)->inode = inode->i_ino; - else - de->inode = inode->i_ino; - err = dir_commit_chunk(page, pos, sbi->s_dirsize); - } else { + if (err) { unlock_page(page); + goto out_put_page; } - dir_put_page(page); + if (sbi->s_version == MINIX_V3) + ((minix3_dirent *)de)->inode = inode->i_ino; + else + de->inode = inode->i_ino; + err = dir_commit_chunk(page, pos, sbi->s_dirsize); dir->i_mtime = dir->i_ctime = current_time(dir); mark_inode_dirty(dir); +out_put_page: + dir_put_page(page); + return err; } struct minix_dir_entry * minix_dotdot (struct inode *dir, struct page **p) diff --git a/fs/minix/minix.h b/fs/minix/minix.h index 20217336802570..8f7a636bd1b241 100644 --- a/fs/minix/minix.h +++ b/fs/minix/minix.h @@ -69,7 +69,8 @@ extern int minix_add_link(struct dentry*, struct inode*); extern int minix_delete_entry(struct minix_dir_entry*, struct page*); extern int minix_make_empty(struct inode*, struct inode*); extern int minix_empty_dir(struct inode*); -extern void minix_set_link(struct minix_dir_entry*, struct page*, struct inode*); +int minix_set_link(struct minix_dir_entry *de, struct page *page, + struct inode *inode); extern struct minix_dir_entry *minix_dotdot(struct inode*, struct page**); extern ino_t minix_inode_by_name(struct dentry*); diff --git a/fs/minix/namei.c b/fs/minix/namei.c index 8afdc408ca4fd5..82d46c28f01b01 100644 --- a/fs/minix/namei.c +++ b/fs/minix/namei.c @@ -223,7 +223,9 @@ static int minix_rename(struct user_namespace *mnt_userns, new_de = minix_find_entry(new_dentry, &new_page); if (!new_de) goto out_dir; - minix_set_link(new_de, new_page, old_inode); + err = minix_set_link(new_de, new_page, old_inode); + if (err) + goto out_dir; new_inode->i_ctime = current_time(new_inode); if (dir_de) drop_nlink(new_inode); @@ -240,10 +242,10 @@ static int minix_rename(struct user_namespace *mnt_userns, mark_inode_dirty(old_inode); if (dir_de) { - minix_set_link(dir_de, dir_page, new_dir); + err = minix_set_link(dir_de, dir_page, new_dir); inode_dec_link_count(old_dir); } - return 0; + return err; out_dir: if (dir_de) { -- 2.39.0
Christoph Hellwig
2023-Jan-16 08:55 UTC
[Ocfs2-devel] [PATCH 3/6] minix: don't flush page immediately for DIRSYNC directories
We do not need to writeout modified directory blocks immediately when modifying them while the page is locked. It is enough to do the flush somewhat later which has the added benefit that inode times can be flushed as well. It also allows us to stop depending on write_one_page() function. Ported from an ext2 patch by Jan Kara. Signed-off-by: Christoph Hellwig <hch at lst.de> --- fs/minix/dir.c | 31 +++++++++++++++++++++---------- 1 file changed, 21 insertions(+), 10 deletions(-) diff --git a/fs/minix/dir.c b/fs/minix/dir.c index c9a3d520b72671..e9a1dc6bdfb0a1 100644 --- a/fs/minix/dir.c +++ b/fs/minix/dir.c @@ -46,21 +46,27 @@ minix_last_byte(struct inode *inode, unsigned long page_nr) return last_byte; } -static int dir_commit_chunk(struct page *page, loff_t pos, unsigned len) +static void dir_commit_chunk(struct page *page, loff_t pos, unsigned len) { struct address_space *mapping = page->mapping; struct inode *dir = mapping->host; - int err = 0; + block_write_end(NULL, mapping, pos, len, len, page, NULL); if (pos+len > dir->i_size) { i_size_write(dir, pos+len); mark_inode_dirty(dir); } - if (IS_DIRSYNC(dir)) - err = write_one_page(page); - else - unlock_page(page); + unlock_page(page); +} + +static int minix_handle_dirsync(struct inode *dir) +{ + int err; + + err = filemap_write_and_wait(dir->i_mapping); + if (!err) + err = sync_inode_metadata(dir, 1); return err; } @@ -274,9 +280,10 @@ int minix_add_link(struct dentry *dentry, struct inode *inode) memset (namx + namelen, 0, sbi->s_dirsize - namelen - 2); de->inode = inode->i_ino; } - err = dir_commit_chunk(page, pos, sbi->s_dirsize); + dir_commit_chunk(page, pos, sbi->s_dirsize); dir->i_mtime = dir->i_ctime = current_time(dir); mark_inode_dirty(dir); + err = minix_handle_dirsync(dir); out_put: dir_put_page(page); out: @@ -305,9 +312,11 @@ int minix_delete_entry(struct minix_dir_entry *de, struct page *page) ((minix3_dirent *)de)->inode = 0; else de->inode = 0; - err = dir_commit_chunk(page, pos, len); + dir_commit_chunk(page, pos, len); inode->i_ctime = inode->i_mtime = current_time(inode); mark_inode_dirty(inode); + if (!err) + err = minix_handle_dirsync(inode); out_put_page: dir_put_page(page); return err; @@ -350,7 +359,8 @@ int minix_make_empty(struct inode *inode, struct inode *dir) } kunmap_atomic(kaddr); - err = dir_commit_chunk(page, 0, 2 * sbi->s_dirsize); + dir_commit_chunk(page, 0, 2 * sbi->s_dirsize); + err = minix_handle_dirsync(inode); fail: put_page(page); return err; @@ -429,9 +439,10 @@ int minix_set_link(struct minix_dir_entry *de, struct page *page, ((minix3_dirent *)de)->inode = inode->i_ino; else de->inode = inode->i_ino; - err = dir_commit_chunk(page, pos, sbi->s_dirsize); + dir_commit_chunk(page, pos, sbi->s_dirsize); dir->i_mtime = dir->i_ctime = current_time(dir); mark_inode_dirty(dir); + err = minix_handle_dirsync(dir); out_put_page: dir_put_page(page); return err; -- 2.39.0
Christoph Hellwig
2023-Jan-16 08:55 UTC
[Ocfs2-devel] [PATCH 4/6] sysv: don't flush page immediately for DIRSYNC directories
We do not need to writeout modified directory blocks immediately when modifying them while the page is locked. It is enough to do the flush somewhat later which has the added benefit that inode times can be flushed as well. It also allows us to stop depending on write_one_page() function. Ported from an ext2 patch by Jan Kara. Signed-off-by: Christoph Hellwig <hch at lst.de> --- fs/sysv/dir.c | 30 +++++++++++++++++++----------- 1 file changed, 19 insertions(+), 11 deletions(-) diff --git a/fs/sysv/dir.c b/fs/sysv/dir.c index 88e38cd8f5c9ae..16730795a62161 100644 --- a/fs/sysv/dir.c +++ b/fs/sysv/dir.c @@ -34,21 +34,26 @@ static inline void dir_put_page(struct page *page) put_page(page); } -static int dir_commit_chunk(struct page *page, loff_t pos, unsigned len) +static void dir_commit_chunk(struct page *page, loff_t pos, unsigned len) { struct address_space *mapping = page->mapping; struct inode *dir = mapping->host; - int err = 0; block_write_end(NULL, mapping, pos, len, len, page, NULL); if (pos+len > dir->i_size) { i_size_write(dir, pos+len); mark_inode_dirty(dir); } - if (IS_DIRSYNC(dir)) - err = write_one_page(page); - else - unlock_page(page); + unlock_page(page); +} + +static int sysv_handle_dirsync(struct inode *dir) +{ + int err; + + err = filemap_write_and_wait(dir->i_mapping); + if (!err) + err = sync_inode_metadata(dir, 1); return err; } @@ -215,9 +220,10 @@ int sysv_add_link(struct dentry *dentry, struct inode *inode) memcpy (de->name, name, namelen); memset (de->name + namelen, 0, SYSV_DIRSIZE - namelen - 2); de->inode = cpu_to_fs16(SYSV_SB(inode->i_sb), inode->i_ino); - err = dir_commit_chunk(page, pos, SYSV_DIRSIZE); + dir_commit_chunk(page, pos, SYSV_DIRSIZE); dir->i_mtime = dir->i_ctime = current_time(dir); mark_inode_dirty(dir); + err = sysv_handle_dirsync(dir); out_page: dir_put_page(page); out: @@ -238,11 +244,11 @@ int sysv_delete_entry(struct sysv_dir_entry *de, struct page *page) err = sysv_prepare_chunk(page, pos, SYSV_DIRSIZE); BUG_ON(err); de->inode = 0; - err = dir_commit_chunk(page, pos, SYSV_DIRSIZE); + dir_commit_chunk(page, pos, SYSV_DIRSIZE); dir_put_page(page); inode->i_ctime = inode->i_mtime = current_time(inode); mark_inode_dirty(inode); - return err; + return sysv_handle_dirsync(inode); } int sysv_make_empty(struct inode *inode, struct inode *dir) @@ -272,7 +278,8 @@ int sysv_make_empty(struct inode *inode, struct inode *dir) strcpy(de->name,".."); kunmap(page); - err = dir_commit_chunk(page, 0, 2 * SYSV_DIRSIZE); + dir_commit_chunk(page, 0, 2 * SYSV_DIRSIZE); + err = sysv_handle_dirsync(inode); fail: put_page(page); return err; @@ -336,10 +343,11 @@ void sysv_set_link(struct sysv_dir_entry *de, struct page *page, err = sysv_prepare_chunk(page, pos, SYSV_DIRSIZE); BUG_ON(err); de->inode = cpu_to_fs16(SYSV_SB(inode->i_sb), inode->i_ino); - err = dir_commit_chunk(page, pos, SYSV_DIRSIZE); + dir_commit_chunk(page, pos, SYSV_DIRSIZE); dir_put_page(page); dir->i_mtime = dir->i_ctime = current_time(dir); mark_inode_dirty(dir); + sysv_handle_dirsync(inode); } struct sysv_dir_entry * sysv_dotdot (struct inode *dir, struct page **p) -- 2.39.0
Christoph Hellwig
2023-Jan-16 08:55 UTC
[Ocfs2-devel] [PATCH 5/6] ufs: don't flush page immediately for DIRSYNC directories
We do not need to writeout modified directory blocks immediately when modifying them while the page is locked. It is enough to do the flush somewhat later which has the added benefit that inode times can be flushed as well. It also allows us to stop depending on write_one_page() function. Ported from an ext2 patch by Jan Kara. Signed-off-by: Christoph Hellwig <hch at lst.de> --- fs/ufs/dir.c | 29 +++++++++++++++++++---------- 1 file changed, 19 insertions(+), 10 deletions(-) diff --git a/fs/ufs/dir.c b/fs/ufs/dir.c index 391efaf1d52897..379d75796a5ce3 100644 --- a/fs/ufs/dir.c +++ b/fs/ufs/dir.c @@ -42,11 +42,10 @@ static inline int ufs_match(struct super_block *sb, int len, return !memcmp(name, de->d_name, len); } -static int ufs_commit_chunk(struct page *page, loff_t pos, unsigned len) +static void ufs_commit_chunk(struct page *page, loff_t pos, unsigned len) { struct address_space *mapping = page->mapping; struct inode *dir = mapping->host; - int err = 0; inode_inc_iversion(dir); block_write_end(NULL, mapping, pos, len, len, page, NULL); @@ -54,10 +53,16 @@ static int ufs_commit_chunk(struct page *page, loff_t pos, unsigned len) i_size_write(dir, pos+len); mark_inode_dirty(dir); } - if (IS_DIRSYNC(dir)) - err = write_one_page(page); - else - unlock_page(page); + unlock_page(page); +} + +static int ufs_handle_dirsync(struct inode *dir) +{ + int err; + + err = filemap_write_and_wait(dir->i_mapping); + if (!err) + err = sync_inode_metadata(dir, 1); return err; } @@ -99,11 +104,12 @@ void ufs_set_link(struct inode *dir, struct ufs_dir_entry *de, de->d_ino = cpu_to_fs32(dir->i_sb, inode->i_ino); ufs_set_de_type(dir->i_sb, de, inode->i_mode); - err = ufs_commit_chunk(page, pos, len); + ufs_commit_chunk(page, pos, len); ufs_put_page(page); if (update_times) dir->i_mtime = dir->i_ctime = current_time(dir); mark_inode_dirty(dir); + ufs_handle_dirsync(dir); } @@ -390,10 +396,11 @@ int ufs_add_link(struct dentry *dentry, struct inode *inode) de->d_ino = cpu_to_fs32(sb, inode->i_ino); ufs_set_de_type(sb, de, inode->i_mode); - err = ufs_commit_chunk(page, pos, rec_len); + ufs_commit_chunk(page, pos, rec_len); dir->i_mtime = dir->i_ctime = current_time(dir); mark_inode_dirty(dir); + err = ufs_handle_dirsync(dir); /* OFFSET_CACHE */ out_put: ufs_put_page(page); @@ -531,9 +538,10 @@ int ufs_delete_entry(struct inode *inode, struct ufs_dir_entry *dir, if (pde) pde->d_reclen = cpu_to_fs16(sb, to - from); dir->d_ino = 0; - err = ufs_commit_chunk(page, pos, to - from); + ufs_commit_chunk(page, pos, to - from); inode->i_ctime = inode->i_mtime = current_time(inode); mark_inode_dirty(inode); + err = ufs_handle_dirsync(inode); out: ufs_put_page(page); UFSD("EXIT\n"); @@ -579,7 +587,8 @@ int ufs_make_empty(struct inode * inode, struct inode *dir) strcpy (de->d_name, ".."); kunmap(page); - err = ufs_commit_chunk(page, 0, chunk_size); + ufs_commit_chunk(page, 0, chunk_size); + err = ufs_handle_dirsync(inode); fail: put_page(page); return err; -- 2.39.0
Christoph Hellwig
2023-Jan-16 08:55 UTC
[Ocfs2-devel] [PATCH 6/6] ocfs2: don't use write_one_page in ocfs2_duplicate_clusters_by_page
Use filemap_write_and_wait_range to write back the range of the dirty page instead of write_one_page in preparation of removing write_one_page and eventually ->writepage. Signed-off-by: Christoph Hellwig <hch at lst.de> Reviewed-by: Jan Kara <jack at suse.cz> Reviewed-by: Joseph Qi <joseph.qi at linux.alibaba.com> --- fs/ocfs2/refcounttree.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/fs/ocfs2/refcounttree.c b/fs/ocfs2/refcounttree.c index 623db358b1efa8..4a73405962ec4f 100644 --- a/fs/ocfs2/refcounttree.c +++ b/fs/ocfs2/refcounttree.c @@ -2952,10 +2952,11 @@ int ocfs2_duplicate_clusters_by_page(handle_t *handle, */ if (PAGE_SIZE <= OCFS2_SB(sb)->s_clustersize) { if (PageDirty(page)) { - /* - * write_on_page will unlock the page on return - */ - ret = write_one_page(page); + unlock_page(page); + put_page(page); + + ret = filemap_write_and_wait_range(mapping, + offset, map_end - 1); goto retry; } } -- 2.39.0
On Mon, Jan 16, 2023 at 09:55:17AM +0100, Christoph Hellwig wrote:> Hi all, > > this series removes most users of the write_one_page API. These helpers > internally call ->writepage which we are gradually removing from the > kernel. > > Changes since v1: > - drop the btrfs changes (queue up in the btrfs tree) > - drop the finaly move to jfs (can't be done without the btrfs patches) > - fix the existing minix code to properly propagate errors > > Diffstat: > minix/dir.c | 67 ++++++++++++++++++++++++++++++--------------------- > minix/minix.h | 3 +- > minix/namei.c | 8 +++--- > ocfs2/refcounttree.c | 9 +++--- > sysv/dir.c | 30 ++++++++++++++-------- > ufs/dir.c | 29 ++++++++++++++-------- > 6 files changed, 90 insertions(+), 56 deletions(-)OK... Mind if I grab minix/sysv/ufs into a branch in vfs.git? There's a pile of kmap_local() stuff that that would interfere with that and I'd rather have it in one place...