Jeremie Le Hen
2009-Dec-13 23:07 UTC
Cannot list a particular directory through NFS with UDP
Hi,

__ Please Cc: me when replying as I'm not subscribed. Thanks. __

My NFS server is running FreeBSD 8.0 from December 6th. The client is a
NetBSD 5.0. The directory exported is /data/repos on the server
(192.168.1.222) and is mounted on /mnt/repos on the client (192.168.1.1).

The problem exists in /data/repos/netbsd-cvsroot/pkgsrc when using NFS
over UDP: ls(1) stalls. OTOH, for instance, listing another directory
or using NFS over TCP work flawlessly.

"ktruss ls" shows the following:
% 26964 1 ls open(".", 0, 0) = 3
% 26964 1 ls fcntl(0x3, 0x2, 0x1) = 0
% 26964 1 ls fchdir(0x3) = 0
% 26964 1 ls open(".", 0, 0) = 5
% 26964 1 ls open(".", 0x4, 0) = 6
% 26964 1 ls fcntl(0x6, 0x2, 0x1) = 0
% 26964 1 ls __fstat30(0x6, 0xbfbfdef0) = 0
% 26964 1 ls fstatvfs1(0x6, 0xbfbfdf54, 0x2) = 0
% 26964 1 ls lseek(0x6, 0, 0, 0, 0x1) = 0
<---------- stalls here

Here is a trace from the stalling ls(1). Please ask me if you need more
information:

23:58:37.735792 IP (tos 0x0, ttl 64, id 48150, offset 0, flags [none], proto UDP (17), length 168) 192.168.1.1.3819288088 > 192.168.1.222.2049: 140 lookup [|nfs]
23:58:37.736635 IP (tos 0x0, ttl 64, id 62453, offset 0, flags [none], proto UDP (17), length 264) 192.168.1.222.2049 > 192.168.1.1.3819288088: reply ok 236 lookup [|nfs]
23:58:37.736727 IP (tos 0x0, ttl 64, id 48152, offset 0, flags [none], proto UDP (17), length 160) 192.168.1.1.3819288089 > 192.168.1.222.2049: 132 lookup [|nfs]
23:58:37.737232 IP (tos 0x0, ttl 64, id 18881, offset 0, flags [none], proto UDP (17), length 264) 192.168.1.222.2049 > 192.168.1.1.3819288089: reply ok 236 lookup [|nfs]
23:58:37.737411 IP (tos 0x0, ttl 64, id 48153, offset 0, flags [none], proto UDP (17), length 152) 192.168.1.1.3819288090 > 192.168.1.222.2049: 124 access [|nfs]
23:58:37.737783 IP (tos 0x0, ttl 64, id 57308, offset 0, flags [none], proto UDP (17), length 148) 192.168.1.222.2049 > 192.168.1.1.3819288090: reply ok 120 access attr: DIR 755 ids 0/0 [|nfs]
23:58:37.737927 IP (tos 0x0, ttl 64, id 48154, offset 0, flags [none], proto UDP (17), length 152) 192.168.1.1.3819288091 > 192.168.1.222.2049: 124 access [|nfs]
23:58:37.738412 IP (tos 0x0, ttl 64, id 21511, offset 0, flags [none], proto UDP (17), length 148) 192.168.1.222.2049 > 192.168.1.1.3819288091: reply ok 120 access attr: DIR 755 ids 0/0 [|nfs]
23:58:37.738477 IP (tos 0x0, ttl 64, id 48155, offset 0, flags [none], proto UDP (17), length 152) 192.168.1.1.3819288092 > 192.168.1.222.2049: 124 access [|nfs]
23:58:37.738914 IP (tos 0x0, ttl 64, id 33831, offset 0, flags [none], proto UDP (17), length 148) 192.168.1.222.2049 > 192.168.1.1.3819288092: reply ok 120 access attr: DIR 755 ids 0/0 [|nfs]
23:58:37.738990 IP (tos 0x0, ttl 64, id 48156, offset 0, flags [none], proto UDP (17), length 148) 192.168.1.1.3819288093 > 192.168.1.222.2049: 120 getattr [|nfs]
23:58:37.739377 IP (tos 0x0, ttl 64, id 26761, offset 0, flags [none], proto UDP (17), length 140) 192.168.1.222.2049 > 192.168.1.1.3819288093: reply ok 112 getattr DIR 755 ids 0/0 [|nfs]
23:58:37.740301 IP (tos 0x0, ttl 64, id 48158, offset 0, flags [none], proto UDP (17), length 168) 192.168.1.1.3819288094 > 192.168.1.222.2049: 140 readdir [|nfs]
23:58:37.764039 IP (tos 0x0, ttl 64, id 46859, offset 0, flags [+], proto UDP (17), length 1500) 192.168.1.222.2049 > 192.168.1.1.3819288094: reply ok 1472 readdir POST: DIR 755 ids 0/0 [|nfs]
23:58:37.764088 IP (tos 0x0, ttl 64, id 46859, offset 1480, flags [none], proto UDP (17), length 632) 192.168.1.222 > 192.168.1.1: udp
23:58:43.353108 IP (tos 0x0, ttl 64, id 48242, offset 0, flags [none], proto UDP (17), length 168) 192.168.1.1.3819288094 > 192.168.1.222.2049: 140 readdir [|nfs]
23:58:43.353640 IP (tos 0x0, ttl 64, id 35118, offset 0, flags [+], proto UDP (17), length 1500) 192.168.1.222.2049 > 192.168.1.1.3819288094: reply ok 1472 readdir POST: DIR 755 ids 0/0 [|nfs]
23:58:43.353687 IP (tos 0x0, ttl 64, id 35118, offset 1480, flags [none], proto UDP (17), length 632) 192.168.1.222 > 192.168.1.1: udp
23:58:54.587373 IP (tos 0x0, ttl 64, id 48349, offset 0, flags [none], proto UDP (17), length 168) 192.168.1.1.3819288094 > 192.168.1.222.2049: 140 readdir [|nfs]
23:58:54.587822 IP (tos 0x0, ttl 64, id 20689, offset 0, flags [+], proto UDP (17), length 1500) 192.168.1.222.2049 > 192.168.1.1.3819288094: reply ok 1472 readdir POST: DIR 755 ids 0/0 [|nfs]
23:58:54.587875 IP (tos 0x0, ttl 64, id 20689, offset 1480, flags [none], proto UDP (17), length 632) 192.168.1.222 > 192.168.1.1: udp
23:59:17.045978 IP (tos 0x0, ttl 64, id 48635, offset 0, flags [none], proto UDP (17), length 168) 192.168.1.1.3819288094 > 192.168.1.222.2049: 140 readdir [|nfs]
23:59:17.046483 IP (tos 0x0, ttl 64, id 53175, offset 0, flags [+], proto UDP (17), length 1500) 192.168.1.222.2049 > 192.168.1.1.3819288094: reply ok 1472 readdir POST: DIR 755 ids 0/0 [|nfs]
23:59:17.046538 IP (tos 0x0, ttl 64, id 53175, offset 1480, flags [none], proto UDP (17), length 632) 192.168.1.222 > 192.168.1.1: udp
00:00:01.953196 IP (tos 0x0, ttl 64, id 48966, offset 0, flags [none], proto UDP (17), length 168) 192.168.1.1.3819288094 > 192.168.1.222.2049: 140 readdir [|nfs]
00:00:01.953665 IP (tos 0x0, ttl 64, id 27028, offset 0, flags [+], proto UDP (17), length 1500) 192.168.1.222.2049 > 192.168.1.1.3819288094: reply ok 1472 readdir POST: DIR 755 ids 0/0 [|nfs]
00:00:01.953711 IP (tos 0x0, ttl 64, id 27028, offset 1480, flags [none], proto UDP (17), length 632) 192.168.1.222 > 192.168.1.1: udp

Regards,
-- 
Jeremie Le Hen
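In the trace above, the readdir request with xid 3819288094 is sent five times. The following is a minimal Python sketch, not part of the original post, that computes the gaps between those five requests from the timestamps in the trace. The function name `retransmit_gaps` is made up for illustration. The gaps roughly double each time, which is consistent with a client retransmit timer that backs off exponentially, although the trace alone does not prove how the NetBSD client computes its timeout.

```python
from datetime import datetime, timedelta

# Timestamps of the five readdir requests carrying xid 3819288094,
# copied from the tcpdump trace in the message above.
REQUEST_TIMES = [
    "23:58:37.740301",
    "23:58:43.353108",
    "23:58:54.587373",
    "23:59:17.045978",
    "00:00:01.953196",
]


def retransmit_gaps(stamps):
    """Return the gaps, in seconds, between consecutive time-of-day stamps.

    Hypothetical helper: tcpdump prints no date here, so a negative gap is
    assumed to mean the capture crossed midnight exactly once.
    """
    times = [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]
    gaps = []
    for earlier, later in zip(times, times[1:]):
        delta = later - earlier
        if delta < timedelta(0):
            delta += timedelta(days=1)  # capture crossed midnight
        gaps.append(delta.total_seconds())
    return gaps


gaps = retransmit_gaps(REQUEST_TIMES)
# Ratio of each gap to the previous one; values near 2 suggest doubling.
ratios = [later / earlier for earlier, later in zip(gaps, gaps[1:])]

for gap in gaps:
    print(f"{gap:.3f} s")
print("ratios:", [round(r, 3) for r in ratios])
```

Each retransmitted request gets a fresh two-fragment reply from the server within a millisecond, so the server side is answering every time; the open question in the thread is why the client does not accept those replies.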
Rick Macklem
2009-Dec-14 16:05 UTC
Cannot list a particular directory through NFS with UDP
On Mon, 14 Dec 2009, Jeremie Le Hen wrote:

> Hi,
>
> __ Please Cc: me when replying as I'm not subscribed. Thanks. __
>
> My NFS server is running FreeBSD 8.0 from December 6th. The client is a
> NetBSD 5.0. The directory exported is /data/repos on the server
> (192.168.1.222) and is mounted on /mnt/repos on the client (192.168.1.1).
>
> The problem exists in /data/repos/netbsd-cvsroot/pkgsrc when using NFS
> over UDP: ls(1) stalls. OTOH, for instance, listing another directory
> or using NFS over TCP work flawlessly.
>

I'll take a look and let you know if I can think of anything. A couple
of things:
- What arch/net interface is the server running?
- I haven't seen any issues w.r.t. i386, so I'm thinking it might be
  some sort of 64bit/alignment problem. (dfr@ replaced the RPC transport
  code with a new krpc subsystem for FreeBSD8.0 and known issues w.r.t.
  alignment were fixed, but there may be more)

If you wanted to, you could try using the experimental server instead
(-e option for mountd and nfsd), just to see if that makes the problem
go away. (It handles mbuf lists/alignment somewhat differently.)

Good luck with it and I'll let you know if I spot anything, rick
John Baldwin
2009-Dec-14 16:47 UTC
Cannot list a particular directory through NFS with UDP
On Sunday 13 December 2009 6:06:50 pm Jeremie Le Hen wrote:

> Hi,
>
> __ Please Cc: me when replying as I'm not subscribed. Thanks. __
>
> My NFS server is running FreeBSD 8.0 from December 6th. The client is a
> NetBSD 5.0. The directory exported is /data/repos on the server
> (192.168.1.222) and is mounted on /mnt/repos on the client (192.168.1.1).
>
> The problem exists in /data/repos/netbsd-cvsroot/pkgsrc when using NFS
> over UDP: ls(1) stalls. OTOH, for instance, listing another directory
> or using NFS over TCP work flawlessly.
>
> "ktruss ls" shows the following:
> % 26964 1 ls open(".", 0, 0) = 3
> % 26964 1 ls fcntl(0x3, 0x2, 0x1) = 0
> % 26964 1 ls fchdir(0x3) = 0
> % 26964 1 ls open(".", 0, 0) = 5
> % 26964 1 ls open(".", 0x4, 0) = 6
> % 26964 1 ls fcntl(0x6, 0x2, 0x1) = 0
> % 26964 1 ls __fstat30(0x6, 0xbfbfdef0) = 0
> % 26964 1 ls fstatvfs1(0x6, 0xbfbfdf54, 0x2) = 0
> % 26964 1 ls lseek(0x6, 0, 0, 0, 0x1) = 0
> <---------- stalls here
>
>
> Here is a trace from the stalling ls(1). Please ask me if you need more
> information:
>
> 23:58:37.735792 IP (tos 0x0, ttl 64, id 48150, offset 0, flags [none], proto UDP (17), length 168) 192.168.1.1.3819288088 > 192.168.1.222.2049: 140 lookup [|nfs]
> 23:58:37.736635 IP (tos 0x0, ttl 64, id 62453, offset 0, flags [none], proto UDP (17), length 264) 192.168.1.222.2049 > 192.168.1.1.3819288088: reply ok 236 lookup [|nfs]
> 23:58:37.736727 IP (tos 0x0, ttl 64, id 48152, offset 0, flags [none], proto UDP (17), length 160) 192.168.1.1.3819288089 > 192.168.1.222.2049: 132 lookup [|nfs]
> 23:58:37.737232 IP (tos 0x0, ttl 64, id 18881, offset 0, flags [none], proto UDP (17), length 264) 192.168.1.222.2049 > 192.168.1.1.3819288089: reply ok 236 lookup [|nfs]
> 23:58:37.737411 IP (tos 0x0, ttl 64, id 48153, offset 0, flags [none], proto UDP (17), length 152) 192.168.1.1.3819288090 > 192.168.1.222.2049: 124 access [|nfs]
> 23:58:37.737783 IP (tos 0x0, ttl 64, id 57308, offset 0, flags [none], proto UDP (17), length 148) 192.168.1.222.2049 > 192.168.1.1.3819288090: reply ok 120 access attr: DIR 755 ids 0/0 [|nfs]
> 23:58:37.737927 IP (tos 0x0, ttl 64, id 48154, offset 0, flags [none], proto UDP (17), length 152) 192.168.1.1.3819288091 > 192.168.1.222.2049: 124 access [|nfs]
> 23:58:37.738412 IP (tos 0x0, ttl 64, id 21511, offset 0, flags [none], proto UDP (17), length 148) 192.168.1.222.2049 > 192.168.1.1.3819288091: reply ok 120 access attr: DIR 755 ids 0/0 [|nfs]
> 23:58:37.738477 IP (tos 0x0, ttl 64, id 48155, offset 0, flags [none], proto UDP (17), length 152) 192.168.1.1.3819288092 > 192.168.1.222.2049: 124 access [|nfs]
> 23:58:37.738914 IP (tos 0x0, ttl 64, id 33831, offset 0, flags [none], proto UDP (17), length 148) 192.168.1.222.2049 > 192.168.1.1.3819288092: reply ok 120 access attr: DIR 755 ids 0/0 [|nfs]
> 23:58:37.738990 IP (tos 0x0, ttl 64, id 48156, offset 0, flags [none], proto UDP (17), length 148) 192.168.1.1.3819288093 > 192.168.1.222.2049: 120 getattr [|nfs]
> 23:58:37.739377 IP (tos 0x0, ttl 64, id 26761, offset 0, flags [none], proto UDP (17), length 140) 192.168.1.222.2049 > 192.168.1.1.3819288093: reply ok 112 getattr DIR 755 ids 0/0 [|nfs]
> 23:58:37.740301 IP (tos 0x0, ttl 64, id 48158, offset 0, flags [none], proto UDP (17), length 168) 192.168.1.1.3819288094 > 192.168.1.222.2049: 140 readdir [|nfs]
> 23:58:37.764039 IP (tos 0x0, ttl 64, id 46859, offset 0, flags [+], proto UDP (17), length 1500) 192.168.1.222.2049 > 192.168.1.1.3819288094: reply ok 1472 readdir POST: DIR 755 ids 0/0 [|nfs]
> 23:58:37.764088 IP (tos 0x0, ttl 64, id 46859, offset 1480, flags [none], proto UDP (17), length 632) 192.168.1.222 > 192.168.1.1: udp
> 23:58:43.353108 IP (tos 0x0, ttl 64, id 48242, offset 0, flags [none], proto UDP (17), length 168) 192.168.1.1.3819288094 > 192.168.1.222.2049: 140 readdir [|nfs]
> 23:58:43.353640 IP (tos 0x0, ttl 64, id 35118, offset 0, flags [+], proto UDP (17), length 1500) 192.168.1.222.2049 > 192.168.1.1.3819288094: reply ok 1472 readdir POST: DIR 755 ids 0/0 [|nfs]
> 23:58:43.353687 IP (tos 0x0, ttl 64, id 35118, offset 1480, flags [none], proto UDP (17), length 632) 192.168.1.222 > 192.168.1.1: udp
> 23:58:54.587373 IP (tos 0x0, ttl 64, id 48349, offset 0, flags [none], proto UDP (17), length 168) 192.168.1.1.3819288094 > 192.168.1.222.2049: 140 readdir [|nfs]
> 23:58:54.587822 IP (tos 0x0, ttl 64, id 20689, offset 0, flags [+], proto UDP (17), length 1500) 192.168.1.222.2049 > 192.168.1.1.3819288094: reply ok 1472 readdir POST: DIR 755 ids 0/0 [|nfs]
> 23:58:54.587875 IP (tos 0x0, ttl 64, id 20689, offset 1480, flags [none], proto UDP (17), length 632) 192.168.1.222 > 192.168.1.1: udp
> 23:59:17.045978 IP (tos 0x0, ttl 64, id 48635, offset 0, flags [none], proto UDP (17), length 168) 192.168.1.1.3819288094 > 192.168.1.222.2049: 140 readdir [|nfs]
> 23:59:17.046483 IP (tos 0x0, ttl 64, id 53175, offset 0, flags [+], proto UDP (17), length 1500) 192.168.1.222.2049 > 192.168.1.1.3819288094: reply ok 1472 readdir POST: DIR 755 ids 0/0 [|nfs]
> 23:59:17.046538 IP (tos 0x0, ttl 64, id 53175, offset 1480, flags [none], proto UDP (17), length 632) 192.168.1.222 > 192.168.1.1: udp
> 00:00:01.953196 IP (tos 0x0, ttl 64, id 48966, offset 0, flags [none], proto UDP (17), length 168) 192.168.1.1.3819288094 > 192.168.1.222.2049: 140 readdir [|nfs]
> 00:00:01.953665 IP (tos 0x0, ttl 64, id 27028, offset 0, flags [+], proto UDP (17), length 1500) 192.168.1.222.2049 > 192.168.1.1.3819288094: reply ok 1472 readdir POST: DIR 755 ids 0/0 [|nfs]
> 00:00:01.953711 IP (tos 0x0, ttl 64, id 27028, offset 1480, flags [none], proto UDP (17), length 632) 192.168.1.222 > 192.168.1.1: udp

It looks like the NFS client does not like the replies to the 3819288094
request. Can you grab nfsstat output before and after a retransmit of
the request and reply to see which counters are increased? This might
indicate why the reply is not being accepted.

-- 
John Baldwin
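The before/after comparison of nfsstat counters suggested above can be sketched in Python as follows. This is only an illustration: the helper `counter_diff` and the sample counter values are made up, and turning real nfsstat output into dictionaries is deliberately left out because the output layout differs between systems.

```python
def counter_diff(before, after):
    """Return {counter: delta} for every counter whose value changed.

    `before` and `after` are hypothetical snapshots mapping a counter name
    to its integer value; a counter missing from one snapshot counts as 0.
    """
    names = sorted(set(before) | set(after))
    return {
        name: after.get(name, 0) - before.get(name, 0)
        for name in names
        if after.get(name, 0) != before.get(name, 0)
    }


# Made-up sample values, only to show the shape of the result.
snapshot_before = {"getattr": 10, "lookup": 7, "readdir": 4}
snapshot_after = {"getattr": 13, "lookup": 7, "readdir": 5}

for name, delta in counter_diff(snapshot_before, snapshot_after).items():
    print(f"{delta:+d} {name}")
```

Taking one snapshot just before a retransmit and one just after keeps the diff small, so any counter that moves can be attributed to that single request and reply.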
Jeremie Le Hen
2009-Dec-14 18:55 UTC
Cannot list a particular directory through NFS with UDP
On Mon, Dec 14, 2009 at 10:50:40AM -0500, John Baldwin wrote:

>
> It looks like the NFS client does not like the replies to the 3819288094
> request. Can you grab nfsstat output before and after a retransmit of
> the request and reply to see which counters are increased? This might
> indicate why the reply is not being accepted.

Premises (replayed each time to have the exact same cache):

# umount /mnt/repos
# mount -t nfs -o intr,soft obiwan:/data/repos /mnt/repos
# cd /mnt/repos/netbsd-cvsroot
# ls

Running ls(1) on a "good" directory shows the following difference:

# ls src

Server:                 Client:
- +3 getattr            - +3 getattr
- +1 lookup             - +1 lookup
- +1 readdir            - +1 readdir
- +1 access             - +1 access

Client cache:
- +9 attrcache
- +2 lookupcache
- +2 readdir
- +2 direofcache

Running ls(1) on the "bad" directory shows the following difference:

# ls pkgsrc

Server:                 Client:
- +3 getattr            - +3 getattr
- +1 lookup             - +1 lookup
- +1 readdir            - +1 readdir
- +3 access             - +3 access

Client cache:
- +5 attrcache
- +1 lookupcache

Both scenarios show no error.

Regards,
-- 
Jeremie Le Hen

Humans are born free and equal. But some are more equal than others.
Coluche
Rick Macklem
2009-Dec-18 20:14 UTC
Cannot list a particular directory through NFS with UDP
On Mon, 14 Dec 2009, Jeremie Le Hen wrote:

> 00:00:01.953196 IP (tos 0x0, ttl 64, id 48966, offset 0, flags [none], proto UDP (17), length 168) 192.168.1.1.3819288094 > 192.168.1.222.2049: 140 readdir [|nfs]
> 00:00:01.953665 IP (tos 0x0, ttl 64, id 27028, offset 0, flags [+], proto UDP (17), length 1500) 192.168.1.222.2049 > 192.168.1.1.3819288094: reply ok 1472 readdir POST: DIR 755 ids 0/0 [|nfs]
> 00:00:01.953711 IP (tos 0x0, ttl 64, id 27028, offset 1480, flags [none], proto UDP (17), length 632) 192.168.1.222 > 192.168.1.1: udp
>

This appears to be the reply to the nfs readdir request, which is what
would be expected. It could be a problem with the content of the reply
or a NetBSD client issue.

If you were to email me the raw tcpdump capture for the above, I could
take a look at it in wireshark (which knows how to interpret nfs) and
see if there is anything bogus looking in the reply.
("tcpdump -s 0 -w <file> host 192.168.1.1" and then email me <file> as
an attachment, should do it)

rick
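The readdir reply quoted above reaches the client as two IP fragments: the same IP id, the first with the more-fragments flag ([+]) at offset 0, the second at offset 1480. As a rough cross-check of the sizes in the trace, here is a small Python sketch that works out how large the reassembled UDP payload would be. The function name is hypothetical, and the 20-byte IP header is an assumption (no IP options), which the trace does not state explicitly.

```python
IP_HEADER = 20   # bytes; assumes the packets carry no IP options
UDP_HEADER = 8   # bytes; fixed size of a UDP header


def reassembled_udp_payload(fragments):
    """Return the UDP payload size of a fragmented datagram.

    `fragments` is a list of (offset, ip_total_length, more_fragments)
    tuples as printed by tcpdump. Hypothetical helper: it only checks that
    the fragments are contiguous and that the last one clears the
    more-fragments flag.
    """
    expected_offset = 0
    ordered = sorted(fragments)
    for offset, total_length, _more in ordered:
        if offset != expected_offset:
            raise ValueError(f"gap or overlap at offset {offset}")
        expected_offset = offset + total_length - IP_HEADER
    if ordered[-1][2]:
        raise ValueError("last fragment still has more-fragments set")
    return expected_offset - UDP_HEADER


# The two fragments of one readdir reply, taken from the trace:
# (offset 0, length 1500, flags [+]) and (offset 1480, length 632, flags [none]).
reply = [(0, 1500, True), (1480, 632, False)]
print(reassembled_udp_payload(reply), "bytes of UDP payload")
```

Under those assumptions the first fragment carries 1480 bytes of IP payload, of which 1472 are UDP data, matching the "reply ok 1472" that tcpdump prints; the second fragment adds 612 more bytes. Whether the client reassembles the two pieces correctly is exactly what a raw capture opened in wireshark would help confirm.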