thr3ads.net - CentOS - [Centos] possible data corruption when NFS is used [Mar 2005]

Aleksandar Milivojevic

2005-Mar-18 17:05 UTC

[Centos] possible data corruption when NFS is used

A couple of days ago, I experienced a nasty problem when doing an mmap of a
file located on an NFS-mounted partition (from a Solaris 9 server).  The
problem manifests itself as data corruption.  I've notified the folks at Red
Hat on their bugzilla (after all, the kernel is built from their source), and
I thought I'd share it with folks here.

The problem can be reproduced like this.  Create an empty file using the
open64() system call.  Write a single byte at some position in the file
(basically this will allocate a single block at the end of the file,
with the rest of the file empty).  In my test I wrote a single byte at a
100KB offset using the pwrite64() system call.  Use the mmap() call to map
the entire file into memory.  Use the memset() library function to fill the
entire mmapped region with some pattern.  Do munmap() on the mapping, and
close() the file.
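
The steps above can be sketched roughly as follows.  This is a minimal
illustration and not the original test app: the helper name fill_via_mmap()
is made up for this example, and plain open()/pwrite() stand in for
open64()/pwrite64(), which is assumed to make no difference at a 100KB
offset.

```c
#define _XOPEN_SOURCE 700   /* expose pwrite() under strict -std= modes */
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the original post).
 * Creates `path`, writes one byte at `offset`, then fills the whole file
 * with `pattern` through a shared mapping.
 * Returns 0 on success, -1 on any failure. */
int fill_via_mmap(const char *path, off_t offset, int pattern)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* One byte at `offset` makes the file offset+1 bytes long, with a
     * hole in front of it. */
    if (pwrite(fd, "", 1, offset) != 1) {
        close(fd);
        return -1;
    }

    size_t len = (size_t)offset + 1;
    void *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* Fill the entire mapped region, hole included. */
    memset(map, pattern, len);

    int rc = 0;
    if (munmap(map, len) != 0)
        rc = -1;
    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

Calling fill_via_mmap("/path/on/nfs/testfile", 100 * 1024, 'A') (the path is
a placeholder) on the client and then looking at the file on the server would
correspond to the test described here.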

What happens is that on the Linux NFS client, if you do "less filename",
you'll see the file correctly filled with the pattern.  On the Solaris 9 NFS
server, doing "less filename" will show that the file is empty.  NFS allows
for a 30-second gap before changes are flushed from the client to the
server (in reality, most NFS clients do not wait and will attempt to
flush the changes to the server shortly after they are made).  However,
this never happens; the changes are never sent to the NFS server (they
stay cached on the client side forever).  When the client is rebooted, the
changes are lost.  Doing "du -sk filename" on both client and server
produces the same result: the output indicates that the file is sparse.
This shows an inconsistency on the client (less shows that the file is
filled with the pattern, so it can't be sparse, yet du -sk shows a size that
indicates that the file is sparse).
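
The "du says sparse" observation can also be checked from C by comparing the
file length with the space actually allocated.  A minimal sketch, assuming
st_blocks is counted in 512-byte units (as on Linux); the helper name
looks_sparse() is made up for this example.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper (not from the original post).
 * Returns 1 if fewer bytes are allocated than the file length (sparse),
 * 0 if not, -1 if stat() fails. */
int looks_sparse(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;

    /* Assumption: st_blocks counts 512-byte units, as on Linux. */
    long long allocated = (long long)st.st_blocks * 512;
    long long length = (long long)st.st_size;

    printf("%s: length %lld bytes, allocated %lld bytes\n",
           path, length, allocated);
    return allocated < length;
}
```

Given the du output described above, this would be expected to report the
test file as sparse on the client even though reading it there returns the
pattern.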

The longest I waited for the client to send the updated file blocks to the
NFS server was something like half an hour.  So there is a possibility
that the changes would get flushed eventually, in several hours (or days),
when the kernel attempts to free the pages used to hold the cached copy (I
haven't tested that scenario).

If the file is updated using the write() or pwrite64() system calls (instead
of the mmap()/memset()/munmap() combo), the file is updated on the NFS
server almost instantly.
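
For comparison, the same fill done with plain writes might look like the
sketch below.  The helper name fill_via_pwrite() and the 4096-byte chunk size
are made up for this example, and pwrite() again stands in for pwrite64().

```c
#define _XOPEN_SOURCE 700   /* expose pwrite() under strict -std= modes */
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the original post).
 * Fills the first `len` bytes of `path` with `pattern` using pwrite().
 * Returns 0 on success, -1 on any failure. */
int fill_via_pwrite(const char *path, size_t len, int pattern)
{
    char buf[4096];
    memset(buf, pattern, sizeof buf);

    int fd = open(path, O_WRONLY | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    size_t done = 0;
    while (done < len) {
        size_t chunk = len - done < sizeof buf ? len - done : sizeof buf;
        ssize_t n = pwrite(fd, buf, chunk, (off_t)done);
        if (n <= 0) {           /* treat errors and zero-length writes as fatal */
            close(fd);
            return -1;
        }
        done += (size_t)n;      /* short writes simply continue from here */
    }
    return close(fd);
}
```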

I am able to reproduce it every time with CentOS4 as the NFS client and
Solaris 9 as the NFS server.  I haven't tried other combinations.  RHEL4
as an NFS client should have the same problem, and possibly other Linux
distributions as well (Fedora comes to mind as the most likely candidate,
because of its close connection to RHEL4).  I also have a small app that
demonstrates the problem (it should be labeled as "one of the most stupid
uses of mmap"; it is basically an implementation of the Solaris mkfile
command).

If anybody has experienced hard-to-explain data corruption on NFS-mounted
file systems, this might be the reason behind it.

--
Aleksandar Milivojevic <amilivojevic@pbl.ca>    Pollard Banknote Limited
Systems Administrator                           1499 Buffalo Place
Tel: (204) 474-2323 ext 276                     Winnipeg, MB  R3T 1L7
