Charles Cooke
2013-Jun-05 14:59 UTC
[Gluster-users] GlusterFS (3.3.1) - issues with NLM - RPC Program procedure not available
Hello, We had some issues with context switches using the GlusterFS FUSE client - & we are processing a large number of small files. We decided to switch to NFS to see if that would increase performance, but are having issues with the NLM locking functionality. Using http://wiki.samba.org/index.php/Ping_pong - we are able to verify that a single F_SETLKW call is blocking indefinitely. ping_pong -r glusterfs_test.txt 1 Hangs on this line blocking for a wait lock: return fcntl(fd,F_SETLKW,&lock); The root cause appears to be this in the glusterfs/nfs.log: *RPC Program procedure not available for procedure 5 in NLM4* * * Is there any more information on this error, or how it can be resolved? rpcinfo -p program vers proto port 100000 2 tcp 111 portmapper 100000 2 udp 111 portmapper 100021 3 udp 45739 nlockmgr 100021 3 tcp 58923 nlockmgr 100005 3 tcp 38465 mountd 100005 1 tcp 38466 mountd 100003 3 tcp 38467 nfs 100021 4 tcp 38468 nlockmgr 100024 1 udp 791 status 100024 1 tcp 794 status 100021 1 udp 783 nlockmgr 100021 1 tcp 785 nlockmgr /var/log/messages Jun 5 10:53:48 ip-10-139-8-175 kernel: lockd: nlm_lookup_host(0a8b08af, p=6, v=4) Jun 5 14:53:48 ip-10-139-8-175 kernel: lockd: host garbage collection Jun 5 14:53:48 ip-10-139-8-175 kernel: lockd: nlmsvc_mark_resources Jun 5 10:53:48 ip-10-139-8-175 kernel: lockd: delete host 10.139.8.175 Jun 5 10:53:48 ip-10-139-8-175 kernel: lockd: nsm_unmonitor(10.139.8.175) Jun 5 10:53:48 ip-10-139-8-175 kernel: lockd: creating host entry Jun 5 10:53:48 ip-10-139-8-175 kernel: lockd: get host 10.139.8.175 Jun 5 10:53:48 ip-10-139-8-175 kernel: lockd: nsm_monitor(10.139.8.175) Jun 5 10:53:48 ip-10-139-8-175 kernel: nsm: xdr_decode_stat_res status 0 state 39 Jun 5 10:53:48 ip-10-139-8-175 kernel: lockd: call procedure 2 on 10.139.8.175 Jun 5 10:53:48 ip-10-139-8-175 kernel: lockd: nlm_bind_host(0a8b08af) Jun 5 14:53:48 ip-10-139-8-175 kernel: lockd: server returns status 50331648 /var/log/glusterfs/nfs.log [2013-06-05 14:53:48.147647] T [nfs-fops.c:131:nfs_create_frame] 0-nfs: uid: 0, gid 0, gids: 7 [2013-06-05 14:53:48.147664] T [nfs-fops.c:133:nfs_create_frame] 0-nfs: gid: 0 [2013-06-05 14:53:48.147680] T [nfs-fops.c:133:nfs_create_frame] 0-nfs: gid: 1 [2013-06-05 14:53:48.147695] T [nfs-fops.c:133:nfs_create_frame] 0-nfs: gid: 2 [2013-06-05 14:53:48.147710] T [nfs-fops.c:133:nfs_create_frame] 0-nfs: gid: 3 [2013-06-05 14:53:48.147736] T [nfs-fops.c:133:nfs_create_frame] 0-nfs: gid: 4 [2013-06-05 14:53:48.147752] T [nfs-fops.c:133:nfs_create_frame] 0-nfs: gid: 6 [2013-06-05 14:53:48.147767] T [nfs-fops.c:133:nfs_create_frame] 0-nfs: gid: 10 [2013-06-05 14:53:48.147798] T [rpc-clnt.c:1308:rpc_clnt_record] 0-nfs_test-client-0: Auth Info: pid: 1, uid: 0, gid: 0, owner: 304069702d31302d-3133392d382d3137-35 [2013-06-05 14:53:48.147821] T [rpc-clnt.c:1188:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen 196, payload: 88, rpc hdr: 108 [2013-06-05 14:53:48.147861] T [rpc-clnt.c:1507:rpc_clnt_submit] 0-rpc-clnt: submitted request (XID: 0x27x Program: GlusterFS 3.1, ProgVers: 330, Proc: 26) to rpc-transport (nfs_test-client-0) [2013-06-05 14:53:48.148249] T [rpc-clnt.c:685:rpc_clnt_reply_init] 0-nfs_test-client-0: received rpc message (RPC XID: 0x27x Program: GlusterFS 3.1, ProgVers: 330, Proc: 26) from rpc-transport (nfs_test-client-0) [2013-06-05 14:53:48.148293] T [rpc-clnt.c:1308:rpc_clnt_record] 0-nfs_test-client-1: Auth Info: pid: 1, uid: 0, gid: 0, owner: 304069702d31302d-3133392d382d3137-35 [2013-06-05 14:53:48.148314] T [rpc-clnt.c:1188:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen 196, payload: 88, rpc hdr: 108 [2013-06-05 14:53:48.148349] T [rpc-clnt.c:1507:rpc_clnt_submit] 0-rpc-clnt: submitted request (XID: 0x48x Program: GlusterFS 3.1, ProgVers: 330, Proc: 26) to rpc-transport (nfs_test-client-1) [2013-06-05 14:53:48.149264] T [rpc-clnt.c:685:rpc_clnt_reply_init] 0-nfs_test-client-1: received rpc message (RPC XID: 0x48x Program: GlusterFS 3.1, ProgVers: 330, Proc: 26) from rpc-transport (nfs_test-client-1) [2013-06-05 14:53:48.149300] D [fd-lk.c:456:fd_lk_insert_and_merge] 0-fd-lk: new lock requrest: owner = 304069702d31302d-3133392d382d3137-35, fl_type = F_WRLCK, fs_start = 0, fs_end = 0, user_flock: l_type = F_WRLCK, l_start = 0, l_len = 1 [2013-06-05 14:53:48.149320] D [fd-lk.c:419:print_lock_list] 0-fd-lk: lock list: [2013-06-05 14:53:48.149340] D [fd-lk.c:431:print_lock_list] 0-fd-lk: owner = 304069702d31302d-3133392d382d3137-35, cmd = F_SETLKW fl_type = F_WRLCK, fs_start = 0, fs_end = 0, user_flock: l_type = F_WRLCK, l_start = 0, l_len = 1, [2013-06-05 14:53:48.149371] T [rpc-clnt.c:1308:rpc_clnt_record] 0-NLM-client: Auth Info: pid: 1, uid: 0, gid: 0, owner: 304069702d31302d-3133392d382d3137-35 [2013-06-05 14:53:48.149392] T [rpc-clnt.c:1188:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen 156, payload: 116, rpc hdr: 40 [2013-06-05 14:53:48.149430] T [rpc-clnt.c:1507:rpc_clnt_submit] 0-rpc-clnt: submitted request (XID: 0x19x Program: NLMv4, ProgVers: 4, Proc: 5) to rpc-transport (NLM-client) [2013-06-05 14:53:48.149452] T [nfs3.c:447:nfs3_call_state_wipe] 0-nfs-nfsv3: fd 0x1c9f53fc ref: 4 [2013-06-05 14:53:48.149495] T [rpcsvc.c:470:rpcsvc_handle_rpc_call] 0-rpcsvc: Client port: 1009 [2013-06-05 14:53:48.149516] T [rpcsvc-auth.c:305:rpcsvc_auth_request_init] 0-rpc-service: Auth handler: AUTH_NULL [2013-06-05 14:53:48.149533] T [rpcsvc.c:382:rpcsvc_request_create] 0-rpc-service: received rpc-message (XID: 0x13, Ver: 2, Program: 100021, ProgVers: 4, Proc: 5) from rpc-transport (socket.NLM) [2013-06-05 14:53:48.149551] E [rpcsvc.c:203:rpcsvc_program_actor] 0-rpc-service: RPC Program procedure not available for procedure 5 in NLM4 [2013-06-05 14:53:48.149598] D [rpcsvc.c:1113:rpcsvc_error_reply] (-->/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x27) [0x2b1874ed5e17] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_notify+0x16c) [0x2b1874ed442c] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x143) [0x2b1874ed40b3]))) 0-: sending a RPC error reply -- Thank You, Charles Cooke Senior Cloud Software Engineer Coupa Software | www.coupa.com charles.cooke at coupa.com Cell: 902.452.4858 <http://www.coupa.com/solutions/new-in-coupa-9> -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://supercolony.gluster.org/pipermail/gluster-users/attachments/20130605/27d79a1a/attachment.html>
Michael Brown
2013-Jun-05 15:08 UTC
[Gluster-users] GlusterFS (3.3.1) - issues with NLM - RPC Program procedure not available
I found that when I was exporting NFS from gluster I had to be careful not to load the kernel nfs modules (such as lockd) or things would get messed up. I think that's happened to you. Is gluster lockd in use? Do this: [root at fearless1 ~]# rpcinfo -p | grep 100021 100021 4 tcp 38468 nlockmgr 100021 1 udp 872 nlockmgr 100021 1 tcp 874 nlockmgr [root at fearless1 ~]# ss -lp sport = :38468 State Recv-Q Send-Q Local Address:Port Peer Address:Port LISTEN 0 10 *:38468 *:* users:(("glusterfs",28675,13)) Yes it is! If not: * ensure no nfs mounts are on gluster servers * rmmod nfs * rmmod lockd * 'gluster volume volname start force' should ensure all the daemons are up. On 13-06-05 10:59 AM, Charles Cooke wrote:> Hello, > > We had some issues with context switches using the GlusterFS FUSE > client - & we are processing a large number of small files. > > We decided to switch to NFS to see if that would increase performance, > but are having issues with the NLM locking functionality. > > Using http://wiki.samba.org/index.php/Ping_pong - we are able to > verify that a single F_SETLKW call is blocking indefinitely. > > ping_pong -r glusterfs_test.txt 1 > Hangs on this line blocking for a wait lock: > return fcntl(fd,F_SETLKW,&lock); > > The root cause appears to be this in the glusterfs/nfs.log: > *RPC Program procedure not available for procedure 5 in NLM4* > * > * > Is there any more information on this error, or how it can be resolved? > > rpcinfo -p > program vers proto port > 100000 2 tcp 111 portmapper > 100000 2 udp 111 portmapper > 100021 3 udp 45739 nlockmgr > 100021 3 tcp 58923 nlockmgr > 100005 3 tcp 38465 mountd > 100005 1 tcp 38466 mountd > 100003 3 tcp 38467 nfs > 100021 4 tcp 38468 nlockmgr > 100024 1 udp 791 status > 100024 1 tcp 794 status > 100021 1 udp 783 nlockmgr > 100021 1 tcp 785 nlockmgr > > > /var/log/messages > Jun 5 10:53:48 ip-10-139-8-175 kernel: lockd: > nlm_lookup_host(0a8b08af, p=6, v=4) > Jun 5 14:53:48 ip-10-139-8-175 kernel: lockd: host garbage collection > Jun 5 14:53:48 ip-10-139-8-175 kernel: lockd: nlmsvc_mark_resources > Jun 5 10:53:48 ip-10-139-8-175 kernel: lockd: delete host 10.139.8.175 > Jun 5 10:53:48 ip-10-139-8-175 kernel: lockd: nsm_unmonitor(10.139.8.175) > Jun 5 10:53:48 ip-10-139-8-175 kernel: lockd: creating host entry > Jun 5 10:53:48 ip-10-139-8-175 kernel: lockd: get host 10.139.8.175 > Jun 5 10:53:48 ip-10-139-8-175 kernel: lockd: nsm_monitor(10.139.8.175) > Jun 5 10:53:48 ip-10-139-8-175 kernel: nsm: xdr_decode_stat_res > status 0 state 39 > Jun 5 10:53:48 ip-10-139-8-175 kernel: lockd: call procedure 2 on > 10.139.8.175 > Jun 5 10:53:48 ip-10-139-8-175 kernel: lockd: nlm_bind_host(0a8b08af) > Jun 5 14:53:48 ip-10-139-8-175 kernel: lockd: server returns status > 50331648 > > /var/log/glusterfs/nfs.log > [2013-06-05 14:53:48.147647] T [nfs-fops.c:131:nfs_create_frame] > 0-nfs: uid: 0, gid 0, gids: 7 > [2013-06-05 14:53:48.147664] T [nfs-fops.c:133:nfs_create_frame] > 0-nfs: gid: 0 > [2013-06-05 14:53:48.147680] T [nfs-fops.c:133:nfs_create_frame] > 0-nfs: gid: 1 > [2013-06-05 14:53:48.147695] T [nfs-fops.c:133:nfs_create_frame] > 0-nfs: gid: 2 > [2013-06-05 14:53:48.147710] T [nfs-fops.c:133:nfs_create_frame] > 0-nfs: gid: 3 > [2013-06-05 14:53:48.147736] T [nfs-fops.c:133:nfs_create_frame] > 0-nfs: gid: 4 > [2013-06-05 14:53:48.147752] T [nfs-fops.c:133:nfs_create_frame] > 0-nfs: gid: 6 > [2013-06-05 14:53:48.147767] T [nfs-fops.c:133:nfs_create_frame] > 0-nfs: gid: 10 > [2013-06-05 14:53:48.147798] T [rpc-clnt.c:1308:rpc_clnt_record] > 0-nfs_test-client-0: Auth Info: pid: 1, uid: 0, gid: 0, owner: > 304069702d31302d-3133392d382d3137-35 > [2013-06-05 14:53:48.147821] T > [rpc-clnt.c:1188:rpc_clnt_record_build_header] 0-rpc-clnt: Request > fraglen 196, payload: 88, rpc hdr: 108 > [2013-06-05 14:53:48.147861] T [rpc-clnt.c:1507:rpc_clnt_submit] > 0-rpc-clnt: submitted request (XID: 0x27x Program: GlusterFS 3.1, > ProgVers: 330, Proc: 26) to rpc-transport (nfs_test-client-0) > [2013-06-05 14:53:48.148249] T [rpc-clnt.c:685:rpc_clnt_reply_init] > 0-nfs_test-client-0: received rpc message (RPC XID: 0x27x Program: > GlusterFS 3.1, ProgVers: 330, Proc: 26) from rpc-transport > (nfs_test-client-0) > [2013-06-05 14:53:48.148293] T [rpc-clnt.c:1308:rpc_clnt_record] > 0-nfs_test-client-1: Auth Info: pid: 1, uid: 0, gid: 0, owner: > 304069702d31302d-3133392d382d3137-35 > [2013-06-05 14:53:48.148314] T > [rpc-clnt.c:1188:rpc_clnt_record_build_header] 0-rpc-clnt: Request > fraglen 196, payload: 88, rpc hdr: 108 > [2013-06-05 14:53:48.148349] T [rpc-clnt.c:1507:rpc_clnt_submit] > 0-rpc-clnt: submitted request (XID: 0x48x Program: GlusterFS 3.1, > ProgVers: 330, Proc: 26) to rpc-transport (nfs_test-client-1) > [2013-06-05 14:53:48.149264] T [rpc-clnt.c:685:rpc_clnt_reply_init] > 0-nfs_test-client-1: received rpc message (RPC XID: 0x48x Program: > GlusterFS 3.1, ProgVers: 330, Proc: 26) from rpc-transport > (nfs_test-client-1) > [2013-06-05 14:53:48.149300] D [fd-lk.c:456:fd_lk_insert_and_merge] > 0-fd-lk: new lock requrest: owner > 304069702d31302d-3133392d382d3137-35, fl_type = F_WRLCK, fs_start = 0, > fs_end = 0, user_flock: l_type = F_WRLCK, l_start = 0, l_len = 1 > [2013-06-05 14:53:48.149320] D [fd-lk.c:419:print_lock_list] 0-fd-lk: > lock list: > [2013-06-05 14:53:48.149340] D [fd-lk.c:431:print_lock_list] 0-fd-lk: > owner = 304069702d31302d-3133392d382d3137-35, cmd = F_SETLKW fl_type > F_WRLCK, fs_start = 0, fs_end = 0, user_flock: l_type = F_WRLCK, > l_start = 0, l_len = 1, > [2013-06-05 14:53:48.149371] T [rpc-clnt.c:1308:rpc_clnt_record] > 0-NLM-client: Auth Info: pid: 1, uid: 0, gid: 0, owner: > 304069702d31302d-3133392d382d3137-35 > [2013-06-05 14:53:48.149392] T > [rpc-clnt.c:1188:rpc_clnt_record_build_header] 0-rpc-clnt: Request > fraglen 156, payload: 116, rpc hdr: 40 > [2013-06-05 14:53:48.149430] T [rpc-clnt.c:1507:rpc_clnt_submit] > 0-rpc-clnt: submitted request (XID: 0x19x Program: NLMv4, ProgVers: 4, > Proc: 5) to rpc-transport (NLM-client) > [2013-06-05 14:53:48.149452] T [nfs3.c:447:nfs3_call_state_wipe] > 0-nfs-nfsv3: fd 0x1c9f53fc ref: 4 > [2013-06-05 14:53:48.149495] T [rpcsvc.c:470:rpcsvc_handle_rpc_call] > 0-rpcsvc: Client port: 1009 > [2013-06-05 14:53:48.149516] T > [rpcsvc-auth.c:305:rpcsvc_auth_request_init] 0-rpc-service: Auth > handler: AUTH_NULL > [2013-06-05 14:53:48.149533] T [rpcsvc.c:382:rpcsvc_request_create] > 0-rpc-service: received rpc-message (XID: 0x13, Ver: 2, Program: > 100021, ProgVers: 4, Proc: 5) from rpc-transport (socket.NLM) > [2013-06-05 14:53:48.149551] E [rpcsvc.c:203:rpcsvc_program_actor] > 0-rpc-service: RPC Program procedure not available for procedure 5 in NLM4 > [2013-06-05 14:53:48.149598] D [rpcsvc.c:1113:rpcsvc_error_reply] > (-->/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x27) > [0x2b1874ed5e17] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_notify+0x16c) > [0x2b1874ed442c] > (-->/usr/lib64/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x143) > [0x2b1874ed40b3]))) 0-: sending a RPC error reply > > -- > Thank You, > > Charles Cooke > Senior Cloud Software Engineer > Coupa Software | www.coupa.com <http://www.coupa.com> > charles.cooke at coupa.com <mailto:charles.cooke at coupa.com> > Cell: 902.452.4858 > > <http://www.coupa.com/solutions/new-in-coupa-9> > > > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > http://supercolony.gluster.org/mailman/listinfo/gluster-users-- Michael Brown | `One of the main causes of the fall of Systems Consultant | the Roman Empire was that, lacking zero, Net Direct Inc. | they had no way to indicate successful ?: +1 519 883 1172 x5106 | termination of their C programs.' - Firth -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://supercolony.gluster.org/pipermail/gluster-users/attachments/20130605/42f072a1/attachment.html>