On Sep 20, 2004 14:14 -0600, Sonja Tideman wrote:
> I am setting up Lustre 1.2.5 on a cluster with an elan4 interconnect. I
> get the MDS and the OSTs setup, but, when I try to setup the clients, it
> gets to the mount point and hangs. The kernel message buffer has a lot of
> messages of the form:
>
> > LustreError: 1570:0:(qswnal_cb.c:926:kqswnal_sendmsg()) Can't route to
> > 0xc0a8023c: router error -101
> > LustreError: 1570:0:(lib-move.c:1182:do_PtlPut()) 3232235803: error sending
> > PUT to 3232236092: 27
> > LustreError: 1570:0:(events.c:51:request_out_callback()) @@@ type 4, status 27
> > req@000001007f8eb800 x14/t0 o8->ost125_UUID@NID_n125_UUID:6 lens 168/64 ref 2
> > fl Rpc:/0/0 rc 0/0
> > LustreError: 1570:0:(client.c:823:ptlrpc_expire_one_request()) @@@ timeout
> > (sent 1095711865) req@000001007f8eb800 x14/t0 o8->ost125_UUID@NID_n125_UUID:6
> > lens 168/64 ref 1 fl Rpc:/0/0 rc 0/0
>
> Any ideas what is going on?

The "can't route" message (-101 = -ENETUNREACH) indicates that the clients
cannot locate the MDS and/or OST nodes. How are you trying to mount the
clients? If using "mount -t lustre" you need to add "-o nettype=elan" so
that it will try to use elan to connect. If the elan MDS/OST nid is not of
the form <hostname><nid> then you will also need to add "server_nid" in
order to connect to the MDS.

Also, since the client gets the config logs from the MDS, you need to store
a valid elan config log on the MDS with "--write-conf". You can have both
"generic" elan and TCP connected clients if they have separate node names
(e.g. "client_tcp" and "client_elan") that you can use with the
mount -t lustre/llmount options.

Cheers, Andreas
--
Andreas Dilger

Hi Sonja--

On Mon, 2004-09-20 at 16:14, Sonja Tideman wrote:
>
> I am setting up Lustre 1.2.5 on a cluster with an elan4 interconnect. I
> get the MDS and the OSTs setup, but, when I try to setup the clients, it
> gets to the mount point and hangs. The kernel message buffer has a lot of
> messages of the form:
>
> > LustreError: 1570:0:(qswnal_cb.c:926:kqswnal_sendmsg()) Can't route to
> > 0xc0a8023c: router error -101
> > LustreError: 1570:0:(lib-move.c:1182:do_PtlPut()) 3232235803: error sending
> > PUT to 3232236092: 27
> > LustreError: 1570:0:(events.c:51:request_out_callback()) @@@ type 4, status 27
> > req@000001007f8eb800 x14/t0 o8->ost125_UUID@NID_n125_UUID:6 lens 168/64 ref 2
> > fl Rpc:/0/0 rc 0/0
> > LustreError: 1570:0:(client.c:823:ptlrpc_expire_one_request()) @@@ timeout
> > (sent 1095711865) req@000001007f8eb800 x14/t0 o8->ost125_UUID@NID_n125_UUID:6
> > lens 168/64 ref 1 fl Rpc:/0/0 rc 0/0
>
> Any ideas what is going on?

This looks like you're specifying an IP address or hostname instead of an
Elan NID in the "--nid" portion of your "--add net" lines.

correct:
lmc -m config.xml --add net --node s200 --nettype elan --nid 200

incorrect:
lmc -m config.xml --add net --node s200 --nettype elan --nid s200
lmc -m config.xml --add net --node s200 --nettype elan --nid 10.0.0.200

Hope that helps--

-Phil

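One quick way to check this diagnosis against the quoted log: if the peer identifiers in the error messages decode cleanly as IPv4 addresses, that is consistent with an IP address or hostname having ended up where an Elan NID was expected. The sketch below is in Python and assumes the NIDs in the log are plain 32-bit values in network byte order. The helper name `nid_to_ipv4` is made up for illustration and is not part of Lustre or its tools.

```python
import ipaddress


def nid_to_ipv4(nid: int) -> str:
    """Interpret a 32-bit NID from the log as a dotted-quad IPv4 address.

    Assumption: the NID is an unsigned 32-bit value in network byte order.
    This is a hypothetical helper, not a Lustre API.
    """
    return str(ipaddress.IPv4Address(nid))


# NID values copied from the quoted kernel messages.
log_nids = {
    "can't route to": 0xC0A8023C,
    "PUT sender": 3232235803,
    "PUT target": 3232236092,
}

for label, nid in log_nids.items():
    print(f"{label}: {nid:#010x} -> {nid_to_ipv4(nid)}")
```

All three values fall in 192.168.x.x, and the "can't route to" NID and the PUT target are the same address (192.168.2.60). A small Elan NID such as 200 would instead decode to 0.0.0.200, so values like these suggest the configuration was built from IP addresses.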
Hello,

I am setting up Lustre 1.2.5 on a cluster with an elan4 interconnect. I
get the MDS and the OSTs setup, but, when I try to setup the clients, it
gets to the mount point and hangs. The kernel message buffer has a lot of
messages of the form:

> LustreError: 1570:0:(qswnal_cb.c:926:kqswnal_sendmsg()) Can't route to
> 0xc0a8023c: router error -101
> LustreError: 1570:0:(lib-move.c:1182:do_PtlPut()) 3232235803: error sending
> PUT to 3232236092: 27
> LustreError: 1570:0:(events.c:51:request_out_callback()) @@@ type 4, status 27
> req@000001007f8eb800 x14/t0 o8->ost125_UUID@NID_n125_UUID:6 lens 168/64 ref 2
> fl Rpc:/0/0 rc 0/0
> LustreError: 1570:0:(client.c:823:ptlrpc_expire_one_request()) @@@ timeout
> (sent 1095711865) req@000001007f8eb800 x14/t0 o8->ost125_UUID@NID_n125_UUID:6
> lens 168/64 ref 1 fl Rpc:/0/0 rc 0/0

Any ideas what is going on?

Thanks
Sonja