Francesco Prelz
2013-Aug-05  16:46 UTC
[Dovecot] Corrupted mboxes with v2.2.4, posix_fallocate and GFS2
Hi,
on a clustered Dovecot server installation that was recently moved from a
shared GPFS filesystem to GFS2, occasional corruptions in the users'
INBOXes started appearing, where a new incoming message would be appended 
directly after a block of NUL bytes, and be scanned by dovecot as being
glued to the preceding message.
I traced this to the file extension operation performed in
mbox_sync_handle_eof_updates, where the 'file_set_size' call
is used. If available, file_set_size will use the posix_fallocate
call. In GFS2 posix_fallocate increases the file size in 4 kB chunks
(there seems to be no guarantee anyway that posix_fallocate will
extend a file by the exact size requested).
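A quick way to check what a particular filesystem does is to grow a scratch file to a size that is not a multiple of the block size and compare the result with what was asked for. The following is only a minimal sketch: fallocate_overshoot is a hypothetical helper name, not anything in Dovecot.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of Dovecot): grow fd to `wanted` bytes
 * with posix_fallocate() and report how many bytes beyond `wanted` the
 * file ended up with. Returns -1 on error. */
off_t fallocate_overshoot(int fd, off_t wanted)
{
    struct stat st;

    assert(wanted > 0);

    /* posix_fallocate() returns an error number directly and does
     * not set errno. */
    if (posix_fallocate(fd, 0, wanted) != 0)
        return -1;
    if (fstat(fd, &st) < 0)
        return -1;

    /* 0 means the file is exactly as large as requested; a positive
     * value is the size of the unexpected tail. */
    return st.st_size - wanted;
}
```

Calling it with an odd size such as 5000 on a file that lives on the filesystem under test should show whether the size gets rounded up.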
After a successful posix_fallocate call, mbox_sync_handle_eof_updates
currently proceeds to rewrite the mailbox starting from the
originally intended 'file_size':
    if (file_set_size(sync_ctx->write_fd,
                      file_size + -sync_ctx->space_diff) < 0) {
            mbox_set_syscall_error(sync_ctx->mbox,
                                   "file_set_size()");
            if (ftruncate(sync_ctx->write_fd, file_size) < 0) {
                    mbox_set_syscall_error(sync_ctx->mbox,
                                           "ftruncate()");
            }
            return -1;
    }
    mbox_sync_file_updated(sync_ctx, FALSE);

    if (mbox_sync_rewrite(sync_ctx, mail_ctx, file_size,
                          -sync_ctx->space_diff, padding,
                          sync_ctx->need_space_seq,
                          sync_ctx->seq) < 0)
            return -1;
When posix_fallocate extends the mailbox beyond the requested size,
a variable-size block of NUL bytes is left behind at the tail of the
mailbox, with the side effects described above.
I successfully worked around this issue by undefining
HAVE_POSIX_FALLOCATE, as the performance penalty of falling back
to direct block appends seems small.
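For illustration, extending a file by direct block appends can look roughly like the sketch below. This is not Dovecot's actual fallback code; extend_with_zeros and the 4096-byte block size are assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not Dovecot code): grow fd to exactly `size`
 * bytes by writing zero-filled blocks at the end of the file.
 * Does nothing if the file is already at least `size` bytes long.
 * Returns 0 on success, -1 on error with errno set. */
int extend_with_zeros(int fd, off_t size)
{
    char block[4096];
    struct stat st;
    off_t pos;

    if (fstat(fd, &st) < 0)
        return -1;
    memset(block, 0, sizeof(block));

    pos = st.st_size;
    while (pos < size) {
        size_t want = sizeof(block);
        ssize_t ret;

        /* The last block is usually shorter than a full block. */
        if ((off_t)want > size - pos)
            want = (size_t)(size - pos);

        ret = pwrite(fd, block, want, pos);
        if (ret < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (ret == 0) {
            /* No progress was made; treat it as out of space. */
            errno = ENOSPC;
            return -1;
        }
        pos += ret;
    }
    return 0;
}
```

Because every byte is written explicitly, the file ends at exactly the requested size no matter how the filesystem rounds its allocations.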
At least a size check (and possible truncation) after the 
mbox_sync_file_updated call above should probably be added.
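The kind of check I have in mind could be sketched as follows. This is only an illustration of the idea, not a patch against the Dovecot tree; set_size_exact is a hypothetical name.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not Dovecot code): preallocate the file up to
 * `size` bytes, then verify the resulting size and cut off any tail
 * beyond `size`. Returns 0 on success, -1 on error with errno set. */
int set_size_exact(int fd, off_t size)
{
    struct stat st;
    int err;

    /* posix_fallocate() reports failure through its return value. */
    err = posix_fallocate(fd, 0, size);
    if (err != 0) {
        errno = err;
        return -1;
    }

    if (fstat(fd, &st) < 0)
        return -1;

    /* The file came out larger than requested: truncate it back so
     * no stray NUL bytes remain after the intended end. */
    if (st.st_size > size && ftruncate(fd, size) < 0)
        return -1;

    return 0;
}
```

Note that the truncation also shrinks a file that was already larger than `size` before the call, so a real fix would need to make sure that is the intended behaviour at this point in the sync.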
I thought the issue was worth bringing to your attention anyway.
Thanks.
Francesco Prelz
INFN - Sezione di Milano
Timo Sirainen
2013-Aug-05  17:29 UTC
[Dovecot] Corrupted mboxes with v2.2.4, posix_fallocate and GFS2
On 5.8.2013, at 19.46, Francesco Prelz <Francesco.Prelz at mi.infn.it> wrote:

> on a clustered Dovecot server installation that was recently moved from a
> shared GPFS filesystem to GFS2, occasional corruptions in the users'
> INBOXes started appearing, where a new incoming message would be appended
> directly after a block of NUL bytes, and be scanned by dovecot as being
> glued to the preceding message.
>
> I traced this to the file extension operation performed in
> mbox_sync_handle_eof_updates, where the 'file_set_size' call
> is used. If available, file_set_size will use the posix_fallocate
> call. In GFS2 posix_fallocate increases the file size in 4 kB chunks
> (there seems to be no guarantee anyway that posix_fallocate will
> extend a file by the exact size requested).

I think that's a bug in GFS2. I understand the posix_fallocate() man page to clearly say that it grows the file to the specified offset+len, not any higher. So it could be a good idea to report it to their developers if they're not aware of it.

Anyway, I thought I'd just get rid of the whole syscall since it's not very useful anyway:
http://hg.dovecot.org/dovecot-2.2/rev/42b2736f146b