Hi Stephen,

I use ext3 with kernel 2.4.14. I'm happy to have verified that nfs+ext3 in journal mode doesn't provide atomic writes from the user's point of view.

My program writes sequential 64KB records to a file through an NFS mount point. The blocks of data are initialized with a series of integers: 1, 2, 3 ...

I kill the nfsd daemons while two instances of the program are each writing their 600 records of 64KB to two distinct files. Then, using 'od', I look at the result and see some blocks of zeros inside the files. The size of these zeroed blocks seems to be a multiple of h'4000 (hex 4000, i.e. 16KB).

NB: My clients and servers are co-located on the same machine. I obtained the same result with the nfsv2 and nfsv3 mount options. I also get the same result using ordered and writeback modes.

NB: When I changed the program to initialize the pattern for each record, I could not reproduce the problem (it seems the timing changed, as more CPU work was done before each write request).

Regards,
eric
Stephen C. Tweedie
2002-Jan-09 19:31 UTC
Re: inconsistent file content after killing nfs daemon
Hi,

On Wed, Jan 09, 2002 at 06:05:30PM +0100, eric chacron wrote:
> Hi Stephen,
>
> I use ext3 with kernel 2.4.14. I'm happy to have verified that nfs+ext3
> in journal mode doesn't provide
> atomic writes from the user's point of view.
>
> My program writes sequential 64KB records to a file through an NFS
> mount point. The blocks of data are
> initialized with a series of integers: 1, 2, 3 ...
> I kill the nfsd daemons while two instances of the program are writing
> their 600 records of
> 64KB to two distinct files.
> Then, using 'od', I look at the result and see some blocks of zeros
> inside the files. The size of these zeroed blocks seems to be a multiple
> of h'4000.

NFS is a stateless protocol. It has absolutely no serialisation or ordering in the protocol, so a given set of application writes can easily get reordered on the wire (especially if you are running over UDP and encounter any dropped packets).

The problem here is not ext3, but NFS. NFS simply does not make any ordering guarantees at all. Is your application using O_SYNC or f[data]sync to impose ordering on the data stream?

Cheers,
Stephen
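
As an illustration of the question above, here is a minimal Python sketch of a test writer that forces each 64KB record out before issuing the next one, either by opening the file with O_SYNC or by calling fsync after every record, plus a helper that looks for zero-filled runs the way the 'od' inspection did. It is not the original test program: the function names, the 32-bit little-endian integer layout, and the 16KB scan granularity are assumptions made for the example. It also only shows how an application can impose ordering on its own writes; it does not make a write atomic if the server dies in the middle of it.

```python
import os
import struct

RECORD_SIZE = 64 * 1024   # 64KB records, as in the test described above
CHUNK = 0x4000            # 16KB, the granularity of the zeroed runs reported


def make_record(start):
    """Return one record of consecutive 32-bit integers (layout is assumed)."""
    count = RECORD_SIZE // 4
    return struct.pack("<%dI" % count, *range(start, start + count))


def write_records(path, n_records, use_o_sync=False):
    """Write records sequentially, flushing each one before the next."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    if use_o_sync:
        flags |= os.O_SYNC          # every write() returns only after the flush
    fd = os.open(path, flags, 0o644)
    try:
        value = 1                   # pattern starts at 1, so no chunk is all zero
        for _ in range(n_records):
            view = memoryview(make_record(value))
            while view:             # os.write may write fewer bytes than asked
                written = os.write(fd, view)
                view = view[written:]
            if not use_o_sync:
                os.fsync(fd)        # explicit flush after each record
            value += RECORD_SIZE // 4
    finally:
        os.close(fd)


def find_zero_chunks(path):
    """Return the offsets of CHUNK-sized blocks that contain only zero bytes."""
    zero = bytes(CHUNK)
    offsets = []
    with open(path, "rb") as f:
        offset = 0
        while True:
            block = f.read(CHUNK)
            if not block:
                break
            if block == zero[:len(block)]:
                offsets.append(offset)
            offset += len(block)
    return offsets
```

A run would look like `write_records("/mnt/nfs/testfile", 600)` followed by `find_zero_chunks("/mnt/nfs/testfile")`, where the mount path is hypothetical. Syncing after every record is slow, which is the usual price of imposing ordering from the application side.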