Setup:

server - FreeBSD 8-stable from today. 2 UFS dirs exported via NFS.
client - FreeBSD 8.0-Release. Running a test PHP script that copies around various files to/from 2 separate NFS mounts.

Situation:

The script is started (forked to do 20 simultaneous runs) and 20 1GB files are copied to the NFS dir, which works fine. When it then switches to reading those files back and simultaneously writing to the other NFS mount, I see a hang of 75 seconds. If I do an "ls -l" on the NFS mount it hangs too. After 75 seconds the client has reported:

nfs server 192.168.10.133:/usr/local/export1: not responding
nfs server 192.168.10.133:/usr/local/export1: is alive again
nfs server 192.168.10.133:/usr/local/export1: not responding
nfs server 192.168.10.133:/usr/local/export1: is alive again

and then things start working again. The server was originally FreeBSD 8.0-Release also but was upgraded to the latest stable to see if this issue could be avoided.

# nfsstat -s -W -w 1
 GtAttr Lookup Rdlink   Read  Write Rename Access  Rddir
      0      0      0    222    257      0      0      0
      0      0      0    178    135      0      0      0
      0      0      0     85    127      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0

... for 75 rows of all zeros

      0      0      0    272    266      0      0      0
      0      0      0    167    165      0      0      0

I also tried runs with 15 and with 25 simultaneous processes. 15 processes gave only about a 5 second stall, but 25 again gave the same 75 second stall.

Further, I tested with 2 mounts to the same server but from ZFS filesystems, with the exact same stall/timeout periods. So it doesn't appear to matter what the underlying filesystem is - it's something in the NFS or networking code.

Any ideas on what's going on here? What's causing the complete stall period of zero NFS activity? Any flaws with my testing methods?

Thanks for any and all help/ideas.

--Alan
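
To compare stall lengths between runs (15, 20, 25 processes) without counting zero rows by eye, something like the following could be run over saved `nfsstat -s -W -w 1` output. This is only a sketch: the function names are made up for illustration, it assumes the eight-column layout shown above with Read and Write in the fourth and fifth columns, and it assumes one row per second as given by `-w 1`.

```python
def parse_nfsstat(text):
    """Keep only the numeric sample rows; header and '...' lines are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if fields and all(f.isdigit() for f in fields):
            rows.append([int(f) for f in fields])
    return rows


def longest_stall(rows, read_col=3, write_col=4):
    """Return (start_index, length) of the longest run of rows with no reads or writes.

    Column positions are an assumption based on the header shown in the post.
    Returns (-1, 0) when there is no stalled row at all.
    """
    best_start, best_len = -1, 0
    run_start, run_len = -1, 0
    for i, row in enumerate(rows):
        if len(row) > write_col and row[read_col] == 0 and row[write_col] == 0:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    return best_start, best_len


# Shortened sample in the same layout as the output above.
sample = """\
 GtAttr Lookup Rdlink   Read  Write Rename Access  Rddir
      0      0      0    222    257      0      0      0
      0      0      0    178    135      0      0      0
      0      0      0     85    127      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0    272    266      0      0      0
"""

rows = parse_nfsstat(sample)
start, length = longest_stall(rows)
print(f"longest stall: {length} samples starting at row {start}")
```

With a one-second interval the run length is roughly the stall duration in seconds, so the 75-second hang should show up as a run of about 75.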
On Thu, Jul 1, 2010 at 11:01 AM, alan bryan <alan.bryan@yahoo.com> wrote:
> ...
>
> Any ideas on what's going on here? What's causing the complete stall period of zero NFS activity? Any flaws with my testing methods?
>
> Thanks for any and all help/ideas.

What network driver are you using? Have you tried tcpdumping the packets?

-Garrett
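
For the packet capture, one possible invocation is sketched below. It only builds and prints the command line, it does not run it. The interface name `em0` and the output file name are placeholders, since the thread does not say which NIC is in use. The filter assumes NFS traffic on the standard port 2049, and the server address is the one from the client log messages.

```python
import shlex

NFS_PORT = 2049  # standard NFS port; assumes the server is not using a non-default one


def build_tcpdump_cmd(iface, server_ip, outfile):
    """Build a tcpdump argv that captures full packets to/from the NFS server."""
    return [
        "tcpdump",
        "-i", iface,     # capture interface (placeholder, depends on the NIC driver)
        "-s", "0",       # snaplen 0: capture whole packets, not just headers
        "-w", outfile,   # write raw packets to a file for later analysis
        "host", server_ip, "and", "port", str(NFS_PORT),
    ]


# "em0" and "nfs-stall.pcap" are hypothetical values for illustration only.
cmd = build_tcpdump_cmd("em0", "192.168.10.133", "nfs-stall.pcap")
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Starting the capture shortly before the read/write phase begins and stopping it once the "is alive again" message appears should keep the file to a manageable size while still covering the stall.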
On Thu, Jul 01, 2010 at 11:01:04AM -0700, alan bryan wrote:
> Setup:
>
> server - FreeBSD 8-stable from today. 2 UFS dirs exported via NFS.
> client - FreeBSD 8.0-Release. Running a test php script that copies around various files to/from 2 separate NFS mounts.
>
> Situation:
>
> script is started (forked to do 20 simultaneous runs) and 20 1GB files are copied to the NFS dir which works fine. When it then switches to reading those files back and simultaneously writing to the other NFS mount I see a hang of 75 seconds. If I do an "ls -l" on the NFS mount it hangs too. After 75 seconds the client has reported:
>
> nfs server 192.168.10.133:/usr/local/export1: not responding
> nfs server 192.168.10.133:/usr/local/export1: is alive again
> nfs server 192.168.10.133:/usr/local/export1: not responding
> nfs server 192.168.10.133:/usr/local/export1: is alive again
>
> and then things start working again. The server was originally FreeBSD 8.0-Release also but was upgraded to the latest stable to see if this issue could be avoided.
>
> ...
>
> Any ideas on what's going on here? What's causing the complete stall period of zero NFS activity? Any flaws with my testing methods?

One thing worth asking: are there any firewall stacks (ipfw, ipfilter, or pf) in use on either the client or server?

--
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |