neutron
2008-Nov-30 06:14 UTC
[Lustre-discuss] failed to start lustre: problem with port 988
Hi all,

I'm using Lustre 1.6.4.3 (kernel 2.6.18-53.1.13.el5_lustre.1.6.4.3smp). I sometimes run into a problem when starting the Lustre file system. For example, an OSS often fails to start, with error messages like these in /var/log/messages:

------------------
Nov 30 01:04:31 wci66 kernel: Lustre: Added LNI 172.16.0.67 at o2ib [8/64]
Nov 30 01:04:31 wci66 kernel: Lustre: Added LNI 172.16.0.67 at tcp [8/256]
Nov 30 01:04:31 wci66 kernel: LustreError: 7288:0:(linux-tcpip.c:554:libcfs_sock_listen()) Can't create socket: port 988 already in use
Nov 30 01:04:31 wci66 kernel: LustreError: 122-1: Can't start acceptor on port 988: port already in use
Nov 30 01:04:32 wci66 kernel: Lustre: Removed LNI 172.16.0.67 at o2ib
Nov 30 01:04:33 wci66 kernel: Lustre: Removed LNI 172.16.0.67 at tcp
Nov 30 01:04:33 wci66 kernel: LustreError: 7204:0:(events.c:654:ptlrpc_init_portals()) network initialisation failed
Nov 30 01:04:33 wci66 modprobe: WARNING: Error inserting ptlrpc (/lib/modules/2.6.18-53.1.13.el5_lustre.1.6.4.3smp/kernel/fs/lustre/ptlrpc.ko): Input/output error
Nov 30 01:04:33 wci66 kernel: mdc: Unknown symbol ldlm_prep_enqueue_req
Nov 30 01:04:33 wci66 kernel: mdc: Unknown symbol ldlm_resource_get
------------------------

It seems that the Lustre modules need port 988 but the port is already used by something else. But at that time "netstat -nap" shows no process using that port.

Is Lustre statically bound to port 988? Or is there anywhere I can change the configuration so that Lustre doesn't rely on a statically fixed port?

Thanks.
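One way to cross-check the "netstat -nap" result is to try binding the port directly, since a bind attempt fails with "address already in use" no matter what kind of owner holds the port. The following is only a minimal sketch: the helper name `can_bind` and the constant are made up for illustration, and probing a port below 1024 such as 988 needs root.

```python
import errno
import socket

# Port taken from the log messages above (illustrative constant).
LUSTRE_ACCEPTOR_PORT = 988


def can_bind(port, host="0.0.0.0"):
    """Return True if a TCP socket can be bound to (host, port) right now.

    Hypothetical helper, not part of Lustre. It only probes TCP, which is
    what the acceptor in the log is trying to listen on.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return True
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return False  # something already holds the port
        raise  # e.g. EACCES when probing a privileged port without root
    finally:
        sock.close()
```

Run as root just before loading the Lustre modules, `can_bind(LUSTRE_ACCEPTOR_PORT)` returning False would suggest that the port is still held at that moment, even if no process shows up against it.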
Brian J. Murrell
2008-Nov-30 15:02 UTC
[Lustre-discuss] failed to start lustre: problem with port 988
On Sun, 2008-11-30 at 01:14 -0500, neutron wrote:
> hi all,

Hi.

> Nov 30 01:04:31 wci66 kernel: LustreError:
> 7288:0:(linux-tcpip.c:554:libcfs_sock_listen()
> ) Can't create socket: port 988 already in use
> Nov 30 01:04:31 wci66 kernel: LustreError: 122-1: Can't start acceptor
> on port 988: port
> already in use

This means that something already has port 988 open. The culprit is usually an RPC server and it got that port from the RPC mapper. Typical examples are one of the NFS/ONC services like rpc.mountd, etc.

> It seems that Lustre modules need port 988 but the port is already
> used by others.

Right.

> But at that time "netstat -nap" shows no proc is
> using that port.

Strange.

> Is Lustre statically bound to the port 988?

You can specify the port as a module option. Likely the manual has more details on this than I can recall at the moment.

> Or is there anywhere I
> can change the configuration so that Lustre doesn't rely on a
> statically fixed port?

No. Of course, the port has to be "static" (as you call it) for the same reason that your friend has to have and give you his "static" address if you can visit him. You'd have a difficult time finding him if his address was "some house in New York", right?

That said, it can be any (legal) static value you want to use.

b.
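Since the usual suspect is an RPC service that was handed port 988 by the RPC mapper, the mapper's own registration table is a natural place to look. The sketch below filters the text printed by `rpcinfo -p` for a given port. The function name and the sample table are made up for illustration (they are not output from the poster's machine), and the parser assumes the usual "program vers proto port service" column layout.

```python
def rpc_services_on_port(rpcinfo_output, port):
    """Return (program, version, proto, service) tuples registered on `port`.

    Hypothetical helper that parses the text printed by `rpcinfo -p`.
    """
    matches = []
    for line in rpcinfo_output.splitlines():
        fields = line.split()
        # Data rows look like: program vers proto port [service]
        if len(fields) < 4 or not fields[0].isdigit() or not fields[3].isdigit():
            continue  # skip the header row, blank lines, and anything odd
        program, version, proto, row_port = fields[:4]
        service = fields[4] if len(fields) > 4 else "unknown"
        if int(row_port) == port:
            matches.append((int(program), int(version), proto, service))
    return matches


# Made-up sample in the layout `rpcinfo -p` normally prints.
SAMPLE = """\
   program vers proto   port  service
    100000    2   tcp    111  portmapper
    100000    2   udp    111  portmapper
    100005    3   tcp    988  mountd
    100003    3   tcp   2049  nfs
"""

print(rpc_services_on_port(SAMPLE, 988))
```

On a real node the input could come from `subprocess.run(["rpcinfo", "-p"], capture_output=True, text=True).stdout`. A non-empty result for port 988 would point at the service to restart or pin to a different port before starting Lustre.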