Hi!

I think I've reported this before, and then I thought it had been fixed;
however, I still get data corruption when writing to NFS volumes.
Now I wonder: is nobody really using NFS, do I have that uncommon
a setup, or is this some kind of local problem?

Client: 8.0-RELEASE i386
Server: 8.0-RELEASE amd64

mount options:
nfs rw,nosuid,noexec,nfsv3,intr,soft,tcp,bg,nolockd

The server has ZFS, but the same thing happens when sharing UFS placed on
md(4).

I've prepared a special 1GB file to determine the details of the corruption:
it's filled with 32-bit numbers, each equal to its own offset in the
file. That is, like this:

00000000  00 00 00 00 04 00 00 00  08 00 00 00 0c 00 00 00  |................|
00000010  10 00 00 00 14 00 00 00  18 00 00 00 1c 00 00 00  |................|
00000020  20 00 00 00 24 00 00 00  28 00 00 00 2c 00 00 00  | ...$...(...,...|
00000030  30 00 00 00 34 00 00 00  38 00 00 00 3c 00 00 00  |0...4...8...<...|

I've copied that file over NFS from client to server around 50
times, and got 3 corruptions, on the 8th, 28th and 36th copies.

Case 1: single corrupted block 3779CF88-3779FFFF (12408 bytes).
Data in the block is shifted 68 bytes up, losing the first 68 bytes and
filling the last 68 bytes with garbage. Interestingly, among that garbage
is my hostname.

Case 2: single corrupted block 2615CFA0-2615FFFF (12384 bytes).
Data is shifted by 44 bytes in the same way.

Case 3: single corrupted block 3AA947A8-3AA97FFF (14424 bytes).
Data is shifted by 36 bytes in the same way.

Any ideas?

PS. Diffs of the corrupted blocks in text format are here:
http://people.freebsd.org/~amdmi3/diff.1.txt
http://people.freebsd.org/~amdmi3/diff.2.txt
http://people.freebsd.org/~amdmi3/diff.3.txt

-- 
Dmitry Marakasov . 55B5 0596 FF1E 8D84 5F56 9510 D35A 80DD F9D2 F77D
amdmi3@amdmi3.ru ..: jabber: amdmi3@jabber.ru http://www.amdmi3.ru
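For anyone who wants to repeat the test, here is a minimal Python sketch of the pattern file described above, plus a checker that lists the mismatching byte ranges and the apparent shift at the start of a range. This is not the script used for the report; the function names (make_pattern, write_pattern_file, find_corrupt_ranges, shift_at) are made up for illustration, and it assumes little-endian 32-bit words, as in the hexdump.

```python
import struct

WORD = 4  # the file is a sequence of 32-bit little-endian integers


def make_pattern(start, length):
    """Return `length` bytes of the pattern beginning at file offset `start`.

    `start` must be a multiple of WORD; each word holds its own file offset.
    """
    words = (struct.pack("<I", off) for off in range(start, start + length, WORD))
    return b"".join(words)[:length]


def write_pattern_file(path, size, chunk=1 << 20):
    """Write `size` bytes of the pattern to `path` (chunk: a multiple of WORD)."""
    with open(path, "wb") as f:
        for start in range(0, size, chunk):
            f.write(make_pattern(start, min(chunk, size - start)))


def find_corrupt_ranges(path, chunk=1 << 20):
    """Return (first_byte, last_byte) for each run of words that do not match."""
    ranges = []
    offset = 0
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk)
            if not data:
                break
            expected = make_pattern(offset, len(data))
            if data != expected:  # only scan word by word inside a bad chunk
                for i in range(0, len(data), WORD):
                    if data[i:i + WORD] != expected[i:i + WORD]:
                        pos = offset + i
                        if ranges and ranges[-1][1] + 1 == pos:
                            ranges[-1][1] = pos + WORD - 1  # extend current run
                        else:
                            ranges.append([pos, pos + WORD - 1])
            offset += len(data)
    return [tuple(r) for r in ranges]


def shift_at(path, start):
    """Read the word stored at `start` and return its distance from `start`."""
    with open(path, "rb") as f:
        f.seek(start)
        (value,) = struct.unpack("<I", f.read(WORD))
    return value - start
```

A possible use: call write_pattern_file("pattern.bin", 1 << 30) on the client, copy the file over NFS, then run find_corrupt_ranges() on the server-side copy and shift_at() on the first byte of each reported range. Ranges are reported at 4-byte granularity.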
Dmitry Marakasov <amdmi3@amdmi3.ru> wrote:
 > I think I've reported this before, and then I thought it had been fixed;
 > however, I still get data corruption when writing to NFS volumes.
 > Now I wonder: is nobody really using NFS, do I have that uncommon
 > a setup, or is this some kind of local problem?

NFS works fine for me. I'm using -stable, not -release, though.

 > Client: 8.0-RELEASE i386
 > Server: 8.0-RELEASE amd64
 >
 > mount options:
 > nfs rw,nosuid,noexec,nfsv3,intr,soft,tcp,bg,nolockd

I recommend not using the "soft" option.
This is an excerpt from Solaris' mount_nfs(1M) manpage:

    File systems that are mounted read-write or that contain
    executable files should always be mounted with the hard
    option. Applications using soft mounted file systems may
    incur unexpected I/O errors, file corruption, and
    unexpected program core dumps. The soft option is not
    recommended.

FreeBSD's manual page doesn't contain such a warning, but
maybe it should. (It contains a warning not to use "soft"
with NFSv4, though, for different reasons.)

Also note that the "nolockd" option means that processes
on different clients won't see each other's locks. That
means that you will get corruption if they rely on
locking.

Best regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606, Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht
München, HRB 125758, Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD-Dienstleistungen, -Produkte und mehr: http://www.secnetix.de/bsd

"Perl will consistently give you what you want,
unless what you want is consistency."
        -- Larry Wall
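As a small illustration of the two points above, here is a Python sketch that flags the options in question in a mount-option string. The helper name risky_mount_options and the RISKY table are hypothetical; the reasons are paraphrased from this message and are not taken from any FreeBSD tool or manual page.

```python
# Options called out in the reply above, with the reason given there.
# This table is an illustration only, not an exhaustive or official list.
RISKY = {
    "soft": "soft mounts may surface I/O errors or corruption to applications",
    "nolockd": "locks are not visible to processes on other clients",
}


def risky_mount_options(options):
    """Return {option: reason} for each flagged option in a comma-separated list."""
    present = [opt.strip() for opt in options.split(",")]
    return {opt: RISKY[opt] for opt in present if opt in RISKY}
```

Called on the option string from the original report, this would flag "soft" and "nolockd" and leave the rest alone.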
On Wednesday 10 February 2010 12:43:38 pm Dmitry Marakasov wrote:
> Hi!
>
> I think I've reported this before, and then I thought it had been fixed;
> however, I still get data corruption when writing to NFS volumes.
> Now I wonder: is nobody really using NFS, do I have that uncommon
> a setup, or is this some kind of local problem?
>
> Client: 8.0-RELEASE i386
> Server: 8.0-RELEASE amd64
>
> mount options:
> nfs rw,nosuid,noexec,nfsv3,intr,soft,tcp,bg,nolockd
>
> The server has ZFS, but the same thing happens when sharing UFS placed on
> md(4).
>
> I've prepared a special 1GB file to determine the details of the corruption:
> it's filled with 32-bit numbers, each equal to its own offset in the
> file. That is, like this:
>
> 00000000  00 00 00 00 04 00 00 00  08 00 00 00 0c 00 00 00  |................|
> 00000010  10 00 00 00 14 00 00 00  18 00 00 00 1c 00 00 00  |................|
> 00000020  20 00 00 00 24 00 00 00  28 00 00 00 2c 00 00 00  | ...$...(...,...|
> 00000030  30 00 00 00 34 00 00 00  38 00 00 00 3c 00 00 00  |0...4...8...<...|
>
> I've copied that file over NFS from client to server around 50
> times, and got 3 corruptions, on the 8th, 28th and 36th copies.
>
> Case 1: single corrupted block 3779CF88-3779FFFF (12408 bytes).
> Data in the block is shifted 68 bytes up, losing the first 68 bytes and
> filling the last 68 bytes with garbage. Interestingly, among that garbage
> is my hostname.

Is it the hostname of the server or the client?

> Case 2: single corrupted block 2615CFA0-2615FFFF (12384 bytes).
> Data is shifted by 44 bytes in the same way.
>
> Case 3: single corrupted block 3AA947A8-3AA97FFF (14424 bytes).
> Data is shifted by 36 bytes in the same way.
>
> Any ideas?
>
> PS. Diffs of the corrupted blocks in text format are here:
> http://people.freebsd.org/~amdmi3/diff.1.txt
> http://people.freebsd.org/~amdmi3/diff.2.txt
> http://people.freebsd.org/~amdmi3/diff.3.txt

Can you reproduce this using a non-FreeBSD server with a FreeBSD client, or a
non-FreeBSD client with a FreeBSD server? That would narrow down the breakage
to either the client or the server.

-- 
John Baldwin