Chad Kerner
2008-Mar-11 18:27 UTC
[Lustre-discuss] Problems mounting lustre thru an ib2ip gateway
Hello,

I am trying to mount a lustre filesystem thru an ib2ip gateway. The MDS's have infiniband connections. The client nodes are tcp/ip connections. I am able to route between the client nodes and the MDS's.

I have the following in /etc/fstab:

    abe-mds1@o2ib0,abe-mds2@o2ib0:/home/client /abehome lustre _netdev,flock 0 0

I get the following when trying to mount:

    [root@t3honest5 lustre]# mount -v /abehome
    verbose: 1
    arg[0] = /sbin/mount.lustre
    arg[1] = abe-mds1@o2ib0,abe-mds2@o2ib0:/home/client
    arg[2] = /abehome
    arg[3] = -v
    arg[4] = -o
    arg[5] = rw,_netdev,flock
    mds nid 0: 141.142.69.7@o2ib
    mds nid 1: 141.142.69.8@o2ib
    mds name: home
    profile: client
    options: rw,_netdev,flock
    retry: 0
    mount.lustre: mount(abe-mds1@o2ib0,abe-mds2@o2ib0:/home/client, /abehome) failed: Input/output error
    mds nid 0: 141.142.69.7@o2ib
    mds nid 1: 141.142.69.8@o2ib
    mds name: home
    profile: client
    options: rw,_netdev,flock
    retry: 0
    [root@t3honest5 lustre]#

I can see the MDS nodes:

    [root@t3honest5 lustre]# ping -c2 141.142.69.7
    PING 141.142.69.7 (141.142.69.7) 56(84) bytes of data.
    64 bytes from 141.142.69.7: icmp_seq=0 ttl=60 time=0.753 ms
    64 bytes from 141.142.69.7: icmp_seq=1 ttl=60 time=0.271 ms

    --- 141.142.69.7 ping statistics ---
    2 packets transmitted, 2 received, 0% packet loss, time 1000ms
    rtt min/avg/max/mdev = 0.271/0.512/0.753/0.241 ms, pipe 2

    [root@t3honest5 lustre]# ping -c2 141.142.69.8
    PING 141.142.69.8 (141.142.69.8) 56(84) bytes of data.
    64 bytes from 141.142.69.8: icmp_seq=0 ttl=60 time=1.61 ms
    64 bytes from 141.142.69.8: icmp_seq=1 ttl=60 time=0.248 ms

    --- 141.142.69.8 ping statistics ---
    2 packets transmitted, 2 received, 0% packet loss, time 1001ms
    rtt min/avg/max/mdev = 0.248/0.930/1.613/0.683 ms, pipe 2
    [root@t3honest5 lustre]#

I was wondering if anyone had any ideas where to start looking?

Thanks,
Chad
--
Chad Kerner - ckerner at ncsa.uiuc.edu
Systems Engineer, Storage Enabling Technologies
National Center for Supercomputing Applications
http://www.ncsa.uiuc.edu/~ckerner
Isaac Huang
2008-Mar-12 04:41 UTC
[Lustre-discuss] Problems mounting lustre thru an ib2ip gateway
Hi,

When the mount fails, please check 'dmesg' for more error messages. Please also provide your lnet module parameters on both the client and the MDS.

Isaac
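One way to gather the lnet module parameters requested above is to pull every "options lnet ..." line out of the modprobe configuration on each node. The following is only a rough sketch: the function names and the default paths (/etc/modprobe.conf and /etc/modprobe.d) are assumptions for illustration, and the files on a given system may live elsewhere.

```python
import shlex
from pathlib import Path


def parse_lnet_options(text):
    """Return {parameter: value} for every 'options lnet ...' line in text."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        try:
            words = shlex.split(line)  # honours quoted values
        except ValueError:
            continue  # skip lines with unbalanced quotes
        if words[:2] != ["options", "lnet"]:
            continue
        for word in words[2:]:
            key, sep, value = word.partition("=")
            if sep:
                params[key] = value
    return params


def collect_lnet_options(paths=("/etc/modprobe.conf", "/etc/modprobe.d")):
    """Read the given files (or every file in the given directories).

    The default paths are an assumption; adjust them for the local system.
    """
    params = {}
    for name in paths:
        path = Path(name)
        if path.is_dir():
            files = sorted(p for p in path.iterdir() if p.is_file())
        elif path.is_file():
            files = [path]
        else:
            files = []
        for f in files:
            params.update(parse_lnet_options(f.read_text(errors="replace")))
    return params


if __name__ == "__main__":
    for key, value in sorted(collect_lnet_options().items()):
        print(f"{key}={value}")
```

Running it on both the client and the MDS and posting the two outputs side by side makes a mismatch in the configured networks easier to spot.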
Andreas Dilger
2008-Mar-12 09:07 UTC
[Lustre-discuss] Problems mounting lustre thru an ib2ip gateway
On Mar 11, 2008 13:27 -0500, Chad Kerner wrote:
> I am trying to mount a lustre filesystem thru an ib2ip gateway.
> The MDS's have infiniband connections. The client nodes are tcp/ip
> connections. I am able to route between the client nodes and the MDS's.
>
> I have the following in /etc/fstab:
> abe-mds1@o2ib0,abe-mds2@o2ib0:/home/client /abehome lustre
> _netdev,flock 0 0

You shouldn't be using IB (ko2iblnd) in this case at all. Instead, just use TCP (ksocklnd).

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
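To illustrate what switching to TCP might mean for the fstab entry, here is a small sketch that rewrites the network part of each MDS NID in the mount spec. The helper names are hypothetical, and the network name "tcp0" is an assumption: the actual name depends on how lnet is configured at the site, and the MDS would presumably also need a TCP NID on its IP interface for the client to reach it over ksocklnd.

```python
def parse_mount_spec(spec):
    """Split '<host>@<net>[,<host>@<net>...]:/<path>' into ([(host, net)], path)."""
    nid_part, sep, path = spec.partition(":/")
    if not sep:
        raise ValueError(f"no ':/' separator in mount spec: {spec!r}")
    nids = []
    for nid in nid_part.split(","):
        host, at, net = nid.partition("@")
        if not at:
            raise ValueError(f"NID without a network: {nid!r}")
        nids.append((host, net))
    return nids, "/" + path


def with_network(spec, net):
    """Return the same mount spec with every NID moved to network `net`."""
    nids, path = parse_mount_spec(spec)
    return ",".join(f"{host}@{net}" for host, _ in nids) + ":" + path


if __name__ == "__main__":
    original = "abe-mds1@o2ib0,abe-mds2@o2ib0:/home/client"
    # "tcp0" is an assumed network name, not taken from the thread.
    print(with_network(original, "tcp0"))
    # prints abe-mds1@tcp0,abe-mds2@tcp0:/home/client
```

The printed spec would replace the first field of the /etc/fstab line; the mount point and options stay as they were.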