Jan Kara
2009-Aug-19 16:04 UTC
[Ocfs2-devel] [PATCH 03/17] vfs: Remove syncing from generic_file_direct_write() and generic_file_buffered_write()
generic_file_direct_write() and generic_file_buffered_write() called generic_osync_inode() if it was called on O_SYNC file or IS_SYNC inode. But this is superfluous since generic_file_aio_write() does the syncing as well. Also XFS and OCFS2 which call these functions directly handle syncing themselves. So let's have a single place where syncing happens: generic_file_aio_write(). CC: ocfs2-devel at oss.oracle.com CC: Joel Becker <joel.becker at oracle.com> CC: Felix Blyakher <felixb at sgi.com> CC: xfs at oss.sgi.com Signed-off-by: Jan Kara <jack at suse.cz> --- mm/filemap.c | 25 ------------------------- 1 files changed, 0 insertions(+), 25 deletions(-) diff --git a/mm/filemap.c b/mm/filemap.c index 554a396..3bee198 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2187,20 +2187,7 @@ generic_file_direct_write(struct kiocb *iocb, const struct iovec *iov, } *ppos = end; } - - /* - * Sync the fs metadata but not the minor inode changes and - * of course not the data as we did direct DMA for the IO. - * i_mutex is held, which protects generic_osync_inode() from - * livelocking. AIO O_DIRECT ops attempt to sync metadata here. - */ out: - if ((written >= 0 || written == -EIOCBQUEUED) && - ((file->f_flags & O_SYNC) || IS_SYNC(inode))) { - int err = generic_osync_inode(inode, mapping, OSYNC_METADATA); - if (err < 0) - written = err; - } return written; } EXPORT_SYMBOL(generic_file_direct_write); @@ -2332,8 +2319,6 @@ generic_file_buffered_write(struct kiocb *iocb, const struct iovec *iov, { struct file *file = iocb->ki_filp; struct address_space *mapping = file->f_mapping; - const struct address_space_operations *a_ops = mapping->a_ops; - struct inode *inode = mapping->host; ssize_t status; struct iov_iter i; @@ -2343,16 +2328,6 @@ generic_file_buffered_write(struct kiocb *iocb, const struct iovec *iov, if (likely(status >= 0)) { written += status; *ppos = pos + status; - - /* - * For now, when the user asks for O_SYNC, we'll actually give - * O_DSYNC - */ - if (unlikely((file->f_flags & O_SYNC) || IS_SYNC(inode))) { - if (!a_ops->writepage || !is_sync_kiocb(iocb)) - status = generic_osync_inode(inode, mapping, - OSYNC_METADATA|OSYNC_DATA); - } } /* -- 1.6.0.2
Christoph Hellwig
2009-Aug-19 16:18 UTC
[Ocfs2-devel] [PATCH 03/17] vfs: Remove syncing from generic_file_direct_write() and generic_file_buffered_write()
On Wed, Aug 19, 2009 at 06:04:30PM +0200, Jan Kara wrote:> generic_file_direct_write() and generic_file_buffered_write() called > generic_osync_inode() if it was called on O_SYNC file or IS_SYNC inode. But > this is superfluous since generic_file_aio_write() does the syncing as well. > Also XFS and OCFS2 which call these functions directly handle syncing > themselves. So let's have a single place where syncing happens: > generic_file_aio_write().Yeah, this is something that never made any sense to me.> @@ -2187,20 +2187,7 @@ generic_file_direct_write(struct kiocb *iocb, const struct iovec *iov, > } > *ppos = end; > } > - > - /* > - * Sync the fs metadata but not the minor inode changes and > - * of course not the data as we did direct DMA for the IO. > - * i_mutex is held, which protects generic_osync_inode() from > - * livelocking. AIO O_DIRECT ops attempt to sync metadata here. > - */ > out: > - if ((written >= 0 || written == -EIOCBQUEUED) && > - ((file->f_flags & O_SYNC) || IS_SYNC(inode))) { > - int err = generic_osync_inode(inode, mapping, OSYNC_METADATA); > - if (err < 0) > - written = err; > - } > return written;Here we check (written >= 0 || written == -EIOCBQUEUED), but generic_file_aio_write only cares about positive return values. We defintively do have a change here for partial AIO requests. The question is if the previous behaviour made in sense. If do have an O_SYNC aio+dio request we would have to flush out the metadata after the request has completed and not here.> @@ -2343,16 +2328,6 @@ generic_file_buffered_write(struct kiocb *iocb, const struct iovec *iov, > if (likely(status >= 0)) { > written += status; > *ppos = pos + status; > - > - /* > - * For now, when the user asks for O_SYNC, we'll actually give > - * O_DSYNC > - */ > - if (unlikely((file->f_flags & O_SYNC) || IS_SYNC(inode))) { > - if (!a_ops->writepage || !is_sync_kiocb(iocb)) > - status = generic_osync_inode(inode, mapping, > - OSYNC_METADATA|OSYNC_DATA); > - } > }No problem with -EIOCBQUEUED here, but we change from doing generic_osync_inode with OSYNC_DATA which does a full writeout of the data to sync_page_range which only does the range writeout here. That should be fine (as we only need to sync that range), but should probably be documented in the patch description.
Jan Kara
2009-Aug-21 16:59 UTC
[Ocfs2-devel] [PATCH 03/17] vfs: Remove syncing from generic_file_direct_write() and generic_file_buffered_write()
generic_file_direct_write() and generic_file_buffered_write() called generic_osync_inode() if it was called on O_SYNC file or IS_SYNC inode. But this is superfluous since generic_file_aio_write() does the syncing as well. Also XFS and OCFS2 which call these functions directly handle syncing themselves. So let's have a single place where syncing happens: generic_file_aio_write(). We slightly change the behavior by syncing only the range of file to which the write happened for buffered writes but that should be all that is required. CC: ocfs2-devel at oss.oracle.com CC: Joel Becker <joel.becker at oracle.com> CC: Felix Blyakher <felixb at sgi.com> CC: xfs at oss.sgi.com Signed-off-by: Jan Kara <jack at suse.cz> --- mm/filemap.c | 31 ++++--------------------------- 1 files changed, 4 insertions(+), 27 deletions(-) diff --git a/mm/filemap.c b/mm/filemap.c index 554a396..b523e42 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2187,20 +2187,7 @@ generic_file_direct_write(struct kiocb *iocb, const struct iovec *iov, } *ppos = end; } - - /* - * Sync the fs metadata but not the minor inode changes and - * of course not the data as we did direct DMA for the IO. - * i_mutex is held, which protects generic_osync_inode() from - * livelocking. AIO O_DIRECT ops attempt to sync metadata here. - */ out: - if ((written >= 0 || written == -EIOCBQUEUED) && - ((file->f_flags & O_SYNC) || IS_SYNC(inode))) { - int err = generic_osync_inode(inode, mapping, OSYNC_METADATA); - if (err < 0) - written = err; - } return written; } EXPORT_SYMBOL(generic_file_direct_write); @@ -2332,8 +2319,6 @@ generic_file_buffered_write(struct kiocb *iocb, const struct iovec *iov, { struct file *file = iocb->ki_filp; struct address_space *mapping = file->f_mapping; - const struct address_space_operations *a_ops = mapping->a_ops; - struct inode *inode = mapping->host; ssize_t status; struct iov_iter i; @@ -2343,16 +2328,6 @@ generic_file_buffered_write(struct kiocb *iocb, const struct iovec *iov, if (likely(status >= 0)) { written += status; *ppos = pos + status; - - /* - * For now, when the user asks for O_SYNC, we'll actually give - * O_DSYNC - */ - if (unlikely((file->f_flags & O_SYNC) || IS_SYNC(inode))) { - if (!a_ops->writepage || !is_sync_kiocb(iocb)) - status = generic_osync_inode(inode, mapping, - OSYNC_METADATA|OSYNC_DATA); - } } /* @@ -2514,7 +2489,8 @@ ssize_t generic_file_aio_write_nolock(struct kiocb *iocb, ret = __generic_file_aio_write(iocb, iov, nr_segs, &iocb->ki_pos); - if (ret > 0 && ((file->f_flags & O_SYNC) || IS_SYNC(inode))) { + if ((ret > 0 || ret == -EIOCBQUEUED) && + ((file->f_flags & O_SYNC) || IS_SYNC(inode))) { ssize_t err; err = sync_page_range_nolock(inode, mapping, pos, ret); @@ -2550,7 +2526,8 @@ ssize_t generic_file_aio_write(struct kiocb *iocb, const struct iovec *iov, ret = __generic_file_aio_write(iocb, iov, nr_segs, &iocb->ki_pos); mutex_unlock(&inode->i_mutex); - if (ret > 0 && ((file->f_flags & O_SYNC) || IS_SYNC(inode))) { + if ((ret > 0 || ret == -EIOCBQUEUED) && + ((file->f_flags & O_SYNC) || IS_SYNC(inode))) { ssize_t err; err = sync_page_range(inode, mapping, pos, ret); -- 1.6.0.2
Jan Kara
2009-Aug-21 17:23 UTC
[Ocfs2-devel] [PATCH 03/17] vfs: Remove syncing from generic_file_direct_write() and generic_file_buffered_write()
generic_file_direct_write() and generic_file_buffered_write() called generic_osync_inode() if it was called on O_SYNC file or IS_SYNC inode. But this is superfluous since generic_file_aio_write() does the syncing as well. Also XFS and OCFS2 which call these functions directly handle syncing themselves. So let's have a single place where syncing happens: generic_file_aio_write(). We slightly change the behavior by syncing only the range of file to which the write happened for buffered writes but that should be all that is required. CC: ocfs2-devel at oss.oracle.com CC: Joel Becker <joel.becker at oracle.com> CC: Felix Blyakher <felixb at sgi.com> CC: xfs at oss.sgi.com Signed-off-by: Jan Kara <jack at suse.cz> --- mm/filemap.c | 35 ++++++----------------------------- 1 files changed, 6 insertions(+), 29 deletions(-) diff --git a/mm/filemap.c b/mm/filemap.c index 554a396..f863e1d 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2187,20 +2187,7 @@ generic_file_direct_write(struct kiocb *iocb, const struct iovec *iov, } *ppos = end; } - - /* - * Sync the fs metadata but not the minor inode changes and - * of course not the data as we did direct DMA for the IO. - * i_mutex is held, which protects generic_osync_inode() from - * livelocking. AIO O_DIRECT ops attempt to sync metadata here. - */ out: - if ((written >= 0 || written == -EIOCBQUEUED) && - ((file->f_flags & O_SYNC) || IS_SYNC(inode))) { - int err = generic_osync_inode(inode, mapping, OSYNC_METADATA); - if (err < 0) - written = err; - } return written; } EXPORT_SYMBOL(generic_file_direct_write); @@ -2332,8 +2319,6 @@ generic_file_buffered_write(struct kiocb *iocb, const struct iovec *iov, { struct file *file = iocb->ki_filp; struct address_space *mapping = file->f_mapping; - const struct address_space_operations *a_ops = mapping->a_ops; - struct inode *inode = mapping->host; ssize_t status; struct iov_iter i; @@ -2343,16 +2328,6 @@ generic_file_buffered_write(struct kiocb *iocb, const struct iovec *iov, if (likely(status >= 0)) { written += status; *ppos = pos + status; - - /* - * For now, when the user asks for O_SYNC, we'll actually give - * O_DSYNC - */ - if (unlikely((file->f_flags & O_SYNC) || IS_SYNC(inode))) { - if (!a_ops->writepage || !is_sync_kiocb(iocb)) - status = generic_osync_inode(inode, mapping, - OSYNC_METADATA|OSYNC_DATA); - } } /* @@ -2514,11 +2489,12 @@ ssize_t generic_file_aio_write_nolock(struct kiocb *iocb, ret = __generic_file_aio_write(iocb, iov, nr_segs, &iocb->ki_pos); - if (ret > 0 && ((file->f_flags & O_SYNC) || IS_SYNC(inode))) { + if ((ret > 0 || ret == -EIOCBQUEUED) && + ((file->f_flags & O_SYNC) || IS_SYNC(inode))) { ssize_t err; err = sync_page_range_nolock(inode, mapping, pos, ret); - if (err < 0) + if (err < 0 && ret > 0) ret = err; } return ret; @@ -2550,11 +2526,12 @@ ssize_t generic_file_aio_write(struct kiocb *iocb, const struct iovec *iov, ret = __generic_file_aio_write(iocb, iov, nr_segs, &iocb->ki_pos); mutex_unlock(&inode->i_mutex); - if (ret > 0 && ((file->f_flags & O_SYNC) || IS_SYNC(inode))) { + if ((ret > 0 || ret == -EIOCBQUEUED) && + ((file->f_flags & O_SYNC) || IS_SYNC(inode))) { ssize_t err; err = sync_page_range(inode, mapping, pos, ret); - if (err < 0) + if (err < 0 && ret > 0) ret = err; } return ret; -- 1.6.0.2