Goldwyn Rodrigues
2014-Nov-04 15:52 UTC
[Ocfs2-devel] [PATCH] Remove filesize checks for sync I/O journal commit
Filesize is not a good indication that the file needs to be synced. An example where this breaks is: 1. Open the file in O_SYNC|O_RDWR 2. Read a small portion of the file (say 64 bytes) 3. Lseek to starting of the file 4. Write 64 bytes If the node crashes, it is not written out to disk because this was not committed in the journal and the other node which reads the file after recovery reads stale data (even if the write on the other node was successful) --- diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c index 324dc93..69fb9f7 100644 --- a/fs/ocfs2/file.c +++ b/fs/ocfs2/file.c @@ -2381,9 +2381,7 @@ out_dio: if (ret < 0) written = ret; - if (!ret && ((old_size != i_size_read(inode)) || - (old_clusters != OCFS2_I(inode)->ip_clusters) || - has_refcount)) { + if (!ret) { ret = jbd2_journal_force_commit(osb->journal->j_journal); if (ret < 0) written = ret;
Andrew Morton
2014-Nov-04 22:45 UTC
[Ocfs2-devel] [PATCH] Remove filesize checks for sync I/O journal commit
On Tue, 4 Nov 2014 09:52:22 -0600 Goldwyn Rodrigues <rgoldwyn at suse.de> wrote:> Filesize is not a good indication that the file needs to be synced. > An example where this breaks is: > 1. Open the file in O_SYNC|O_RDWR > 2. Read a small portion of the file (say 64 bytes) > 3. Lseek to starting of the file > 4. Write 64 bytes > > If the node crashes, it is not written out to disk because this > was not committed in the journal and the other node which reads > the file after recovery reads stale data (even if the write on > the other node was successful) > > ... > > --- a/fs/ocfs2/file.c > +++ b/fs/ocfs2/file.c > @@ -2381,9 +2381,7 @@ out_dio: > if (ret < 0) > written = ret; > > - if (!ret && ((old_size != i_size_read(inode)) || > - (old_clusters != OCFS2_I(inode)->ip_clusters) || > - has_refcount)) { > + if (!ret) { > ret = jbd2_journal_force_commit(osb->journal->j_journal); > if (ret < 0) > written = ret;Can we have a signed-off-by for this, please?
Mark Fasheh
2014-Nov-05 21:18 UTC
[Ocfs2-devel] [PATCH] Remove filesize checks for sync I/O journal commit
On Tue, Nov 04, 2014 at 09:52:22AM -0600, Goldwyn Rodrigues wrote:> Filesize is not a good indication that the file needs to be synced. > An example where this breaks is: > 1. Open the file in O_SYNC|O_RDWR > 2. Read a small portion of the file (say 64 bytes) > 3. Lseek to starting of the file > 4. Write 64 bytes > > If the node crashes, it is not written out to disk because this > was not committed in the journal and the other node which reads > the file after recovery reads stale data (even if the write on > the other node was successful)This patch looks good, thanks for sending it over. Reviewed-by: Mark Fasheh <mfasheh at suse.de> --Mark -- Mark Fasheh
Xue jiufei
2014-Dec-03 08:20 UTC
[Ocfs2-devel] [PATCH] Remove filesize checks for sync I/O journal commit
Hi Goldwyn, On 2014/11/4 23:52, Goldwyn Rodrigues wrote:> Filesize is not a good indication that the file needs to be synced. > An example where this breaks is: > 1. Open the file in O_SYNC|O_RDWR > 2. Read a small portion of the file (say 64 bytes) > 3. Lseek to starting of the file > 4. Write 64 bytes > > If the node crashes, it is not written out to disk because this > was not committed in the journal and the other node which reads > the file after recovery reads stale data (even if the write on > the other node was successful) >I have a question that if user open the file with O_SYNC, it will call filemap_fdatawait_range() after generic_perform_write() to wait until data reaches the disk, why another node may read the stale data? Thanks, Xuejiufei> --- > diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c > index 324dc93..69fb9f7 100644 > --- a/fs/ocfs2/file.c > +++ b/fs/ocfs2/file.c > @@ -2381,9 +2381,7 @@ out_dio: > if (ret < 0) > written = ret; > > - if (!ret && ((old_size != i_size_read(inode)) || > - (old_clusters != OCFS2_I(inode)->ip_clusters) || > - has_refcount)) { > + if (!ret) { > ret = jbd2_journal_force_commit(osb->journal->j_journal); > if (ret < 0) > written = ret; > > _______________________________________________ > Ocfs2-devel mailing list > Ocfs2-devel at oss.oracle.com > https://oss.oracle.com/mailman/listinfo/ocfs2-devel >