Jan Kara
2010-Jul-29 12:00 UTC
[Ocfs2-devel] [PATCH 1/2] ocfs2: Flush drive's caches on fdatasync
We have to issue a cache flush during fdatasync even if inode doesn't have I_DIRTY_DATASYNC set because we still have to get written *data* to disk to observe fdatasync() guarantees. Signed-off-by: Jan Kara <jack at suse.cz> --- fs/ocfs2/file.c | 10 +++++++++- 1 files changed, 9 insertions(+), 1 deletions(-) diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c index 2b10b36..5659d85 100644 --- a/fs/ocfs2/file.c +++ b/fs/ocfs2/file.c @@ -36,6 +36,7 @@ #include <linux/writeback.h> #include <linux/falloc.h> #include <linux/quotaops.h> +#include <linux/blkdev.h> #define MLOG_MASK_PREFIX ML_INODE #include <cluster/masklog.h> @@ -190,8 +191,15 @@ static int ocfs2_sync_file(struct file *file, int datasync) if (err) goto bail; - if (datasync && !(inode->i_state & I_DIRTY_DATASYNC)) + if (datasync && !(inode->i_state & I_DIRTY_DATASYNC)) { + /* + * Although inode doesn't need writing, we still have to flush + * drive's caches to get data to the platter + */ + blkdev_issue_flush(inode->i_sb->s_bdev, GFP_KERNEL, NULL, + BLKDEV_IFL_WAIT); goto bail; + } journal = osb->journal->j_journal; err = jbd2_journal_force_commit(journal); -- 1.6.4.2
ocfs2_sync_inode() is used only from ocfs2_sync_file(). But all data has already been written before calling ocfs2_sync_file() and ocfs2 doesn't use inode's private_list for tracking metadata buffers thus sync_mapping_buffers() is superfluous as well. Signed-off-by: Jan Kara <jack at suse.cz> --- fs/ocfs2/file.c | 10 ---------- 1 files changed, 0 insertions(+), 10 deletions(-) diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c index 5659d85..2e341a2 100644 --- a/fs/ocfs2/file.c +++ b/fs/ocfs2/file.c @@ -64,12 +64,6 @@ #include "buffer_head_io.h" -static int ocfs2_sync_inode(struct inode *inode) -{ - filemap_fdatawrite(inode->i_mapping); - return sync_mapping_buffers(inode->i_mapping); -} - static int ocfs2_init_file_private(struct inode *inode, struct file *file) { struct ocfs2_file_private *fp; @@ -187,10 +181,6 @@ static int ocfs2_sync_file(struct file *file, int datasync) mlog_entry("(0x%p, 0x%p, %d, '%.*s')\n", file, dentry, datasync, dentry->d_name.len, dentry->d_name.name); - err = ocfs2_sync_inode(dentry->d_inode); - if (err) - goto bail; - if (datasync && !(inode->i_state & I_DIRTY_DATASYNC)) { /* * We still have to flush drive's caches to get data to the -- 1.6.4.2
Tao Ma
2010-Jul-30 02:07 UTC
[Ocfs2-devel] [PATCH 1/2] ocfs2: Flush drive's caches on fdatasync
Hi Jan, On 07/29/2010 08:00 PM, Jan Kara wrote:> We have to issue a cache flush during fdatasync even if inode doesn't have > I_DIRTY_DATASYNC set because we still have to get written *data* to disk to > observe fdatasync() guarantees.I am fine with the patch from the code's perspective. But I just noticed the discussion in fsdevel with the subject "relaxed barrier semantics", so with barrier there will be a massive slowdowns according to Christoph. And as ocfs2 is mainly used with some SAN, I guess in most cases the storage will have a battery backed cache, so we may not need this? Sunil, Joel and Mark, Did you have any user data that most of the ocfs2 system is used on or can we start a survey in ocfs2-users? Regards, Tao> Signed-off-by: Jan Kara<jack at suse.cz> > --- > fs/ocfs2/file.c | 10 +++++++++- > 1 files changed, 9 insertions(+), 1 deletions(-) > > diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c > index 2b10b36..5659d85 100644 > --- a/fs/ocfs2/file.c > +++ b/fs/ocfs2/file.c > @@ -36,6 +36,7 @@ > #include<linux/writeback.h> > #include<linux/falloc.h> > #include<linux/quotaops.h> > +#include<linux/blkdev.h> > > #define MLOG_MASK_PREFIX ML_INODE > #include<cluster/masklog.h> > @@ -190,8 +191,15 @@ static int ocfs2_sync_file(struct file *file, int datasync) > if (err) > goto bail; > > - if (datasync&& !(inode->i_state& I_DIRTY_DATASYNC)) > + if (datasync&& !(inode->i_state& I_DIRTY_DATASYNC)) { > + /* > + * Although inode doesn't need writing, we still have to flush > + * drive's caches to get data to the platter > + */ > + blkdev_issue_flush(inode->i_sb->s_bdev, GFP_KERNEL, NULL, > + BLKDEV_IFL_WAIT); > goto bail; > + } > > journal = osb->journal->j_journal; > err = jbd2_journal_force_commit(journal);