Klaus Steden
2007-Nov-16 01:27 UTC
[Lustre-discuss] Problem mounting Lustre from another network
Hello,

I've got Lustre now set up using two networks. When the file system was built, only one of these networks was present and configured. I have no problems communicating with Lustre from hosts on the network that it was built with, but when I try to mount from the network I've just added, I get this error:

-- client --
mount -t lustre tm0-0@tcp1:tm0-1@tcp1:/lustre /mnt/lustre
mount.lustre: mount tm0-0@tcp1:tm0-1@tcp1:/lustre at /mnt/lustre failed: No such file or directory
Is the MGS specification correct?
Is the filesystem name correct?
If upgrading, is the copied client log valid? (see upgrade docs)
-- client --

I can see all of the various operational units (OSS nodes, primary and secondary MDS nodes) both over regular TCP/IP and via the various 'lctl' commands (peer_list, conn_list), and can see the client there. However, if I run 'lctl dl' on the client, nothing shows up.

Why is this happening, and what can I do to fix it? I've scoured the 'net and list archives, but I haven't found anything that sheds any light on the problem.

thanks,
Klaus
Isaac Huang
2007-Nov-16 03:06 UTC
[Lustre-discuss] Problem mounting Lustre from another network
On Thu, Nov 15, 2007 at 05:27:22PM -0800, Klaus Steden wrote:
> Hello,
>
> I've got Lustre now set up using two networks. When the file system was
> built, only one of these networks was present and configured. I have no
> problems communicating with Lustre from hosts on the network that it was
> built with, but when I try to mount from the network I've just added, I get
> this error:
>
> -- client --
> mount -t lustre tm0-0@tcp1:tm0-1@tcp1:/lustre /mnt/lustre
> mount.lustre: mount tm0-0@tcp1:tm0-1@tcp1:/lustre at /mnt/lustre failed:
> No such file or directory

On the client, can you please run these commands:

lctl list_nids
lctl ping tm0-0@tcp1
lctl ping tm0-1@tcp1

What do they say?

Isaac
Klaus Steden
2007-Nov-16 03:23 UTC
[Lustre-discuss] Problem mounting Lustre from another network
Hiyo,

Here's what I get:

-- lctl --
lctl > list_nids
172.16.128.100@tcp1
lctl > ping tm0-0
Can't parse process id "tm0-0"
lctl > ping tm0-1
Can't parse process id "tm0-1"
lctl > ping 172.16.128.252@tcp1
12345-0@lo
12345-172.16.129.252@tcp
12345-172.16.128.252@tcp1
lctl > ping 172.16.128.249@tcp1
12345-0@lo
12345-172.16.129.249@tcp
12345-172.16.128.249@tcp1
-- lctl --

172.16.128.252 is the IP of tm0-0, and 172.16.128.249 is the IP of tm0-1. Right now, they're only listed in /etc/hosts; the client doesn't have useful DNS service.

Klaus
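
In the lctl session above, the bare host names were rejected while the numeric NIDs answered on tcp1. The following is a minimal sketch, assuming the servers stay listed in /etc/hosts, of turning a host name into a numeric NID for use with lctl ping or in the mount source. The helper name host_to_nid is hypothetical and not part of Lustre, and nothing in this thread confirms that numeric NIDs alone would clear the mount failure.

```shell
# Hypothetical helper (not a Lustre tool): look up NAME in a hosts-format
# file and print it as an LNet NID of the form <ip>@<net>.
# Usage: host_to_nid NAME NET [HOSTS_FILE]   (HOSTS_FILE defaults to /etc/hosts)
host_to_nid() {
    name=$1
    net=$2
    hosts=${3:-/etc/hosts}

    # Strip comments, then match NAME against every alias on the line
    # (fields 2..NF); field 1 is the address.
    ip=$(awk -v n="$name" '
        { sub(/#.*/, "") }
        { for (i = 2; i <= NF; i++) if ($i == n) { print $1; exit } }
    ' "$hosts")

    # Fail without output if the name is not listed.
    [ -n "$ip" ] || return 1
    printf '%s@%s\n' "$ip" "$net"
}
```

With the addresses reported above, `lctl ping "$(host_to_nid tm0-0 tcp1)"` would ping 172.16.128.252@tcp1, and the mount could be retried with numeric NIDs as `mount -t lustre 172.16.128.252@tcp1:172.16.128.249@tcp1:/lustre /mnt/lustre`.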