McKee, Shawn
2010-May-10  18:27 UTC
[Lustre-discuss] Clients getting incorrect network information for one of two MDT servers (active/passive)
Hi Everyone,
We are having a problem with Lustre v1.8.3/x86_64 (the ext4 flavor, if it matters).
We are very new to using Lustre, so our problem may be trivial to those with
experience.
We have set up a separate MGS server, and we have an HA setup for our MDT: two
servers (lmd01.aglt2.org/lmd02.aglt2.org) sharing a backend iSCSI storage area
for the MDT, running active/passive under Red Hat clustering.  All nodes are
dual-homed (private and public networks).  Failover works without a problem,
modulo the issue we are asking about.
The primary problem is that one of the MDT nodes (LMD01) seems to be unreachable
from the clients.  We have configured lnet to use the private network to
mount/access Lustre.  The lnet line in /etc/modprobe.conf looks like this on an
MDT server:
options lnet networks=tcp0(bond0.4010) routes="tcp2 10.10.1.[50-52]@tcp0"
(The routes entry declares 10.10.1.[50-52] as LNET routers to a public tcp2
network so that external clients can mount; we are not sure it is relevant to
our problem.  I can provide details if it is useful.)
The 'bond0.4010' interface is the private network.  The clients on this
private network look similar:
options lnet networks=tcp0(eth0)
The relevant IPs:   lmd01 has 10.10.1.48 (private) and 192.41.230.48 (public)
                    lmd02 has 10.10.1.49 (private) and 192.41.230.49 (public)
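As a quick sanity check, 'lctl list_nids' shows the NIDs a node has actually
brought up.  A sketch of what we would expect with the configs above (the
client's private address is shown as a placeholder):
[root@bl-11-1 ~]# lctl list_nids
10.10.1.x@tcp
root@lmd01 ~# lctl list_nids
10.10.1.48@tcp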
The problem we have is shown in the 'lctl --net tcp0 peer_list' output:
[root@bl-11-1 ~]# lctl --net tcp0  peer_list
12345-10.10.1.26@tcp [1]bl-11-1.local->umfs06.local:988 #3
12345-10.10.1.36@tcp [1]bl-11-1.local->umfs16.local:988 #3
12345-10.10.1.140@tcp [1]bl-11-1.local->mgs.local:988 #3
12345-10.10.1.49@tcp [2]bl-11-1.local->lmd02.local:988 #6
12345-192.41.230.48@tcp [1116]0.0.0.0->lmd01.aglt2.org:988 #0
12345-10.10.1.25@tcp [1]bl-11-1.local->umfs05.local:988 #3
Notice the "public" address 192.41.230.48 showing up on the
''tcp'' (''tcp0'') network?   This seems to be
the problem.  If LMD01 takes over actively serving the MDT we see things like
the following in the logs:
2010-05-10T12:21:01-04:00 lmd01.aglt2.org kernel: [272846.750287] LustreError: 120-3: Refusing connection from 192.41.237.235 for 192.41.230.48@tcp: No matching NI
2010-05-10T12:23:46-04:00 lmd01.aglt2.org kernel: [273011.595403] LustreError: 120-3: Refusing connection from 192.41.237.235 for 192.41.230.48@tcp: No matching NI
2010-05-10T12:29:01-04:00 lmd01.aglt2.org kernel: [273326.290186] LustreError: 120-3: Refusing connection from 192.41.230.203 for 192.41.230.48@tcp: No matching NI
2010-05-10T12:48:11-04:00 lmd01.aglt2.org kernel: [274475.351001] LustreError: 120-3: Refusing connection from 192.41.230.168 for 192.41.230.48@tcp: No matching NI
This makes sense, because LMD01 is NOT supposed to be using its public IP for
Lustre.  The strange thing is that LMD02, set up almost exactly the same way
as LMD01, doesn't have this problem and always works fine on the private
network.  Deleting the "bad" peer address on the client doesn't help, since it
just re-appears as soon as the client tries to access Lustre.  Any ideas about
what could be providing this "bad" IP and how we can remove it?
FYI, I even tried "adding" tcp1 (for the public NIC) to the lnet options on
LMD01/LMD02, but clients still fail since the request is coming in as
'192.41.230.48@tcp' and not as '192.41.230.48@tcp1'.
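For reference, the two-network variant we tried on the servers looked roughly
like this ('eth1' is a placeholder for whichever interface carries the public
192.41.230.x addresses):
options lnet networks="tcp0(bond0.4010),tcp1(eth1)"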
Thanks for any help or pointers to what might be wrong.
Shawn McKee/University of Michigan Physics
McKee, Shawn
2010-May-21  15:54 UTC
[Lustre-discuss] Clients getting incorrect network information for one of two MDT servers (active/passive)
Hi Everyone,
I never got any replies or suggestions on this one, and we are still having
the issue.  Summarizing: the clients get the wrong address for the MDS when
our LMD01 node is running the service.  If LMD02 (the active/passive HA
partner to LMD01) runs as the MDS, things work.
Some further information that may be helpful, showing the 'tunefs.lustre
--print' details of the MDT:
root@lmd01 ~# tunefs.lustre --mdt --print /dev/sdd
checking for existing Lustre data: found CONFIGS/mountdata
Reading CONFIGS/mountdata
    Read previous values:
Target:     umt3-MDT0000
Index:      0
Lustre FS:  umt3
Mount type: ldiskfs
Flags:      0x1
               (MDT )
Persistent mount opts: iopen_nopriv,user_xattr,errors=remount-ro
Parameters: mgsnode=10.10.1.140@tcp,192.41.230.140@tcp1,141.211.101.161@tcp2
failover.node=10.10.1.49@tcp,192.41.230.49@tcp1
    Permanent disk data:
Target:     umt3-MDT0000
Index:      0
Lustre FS:  umt3
Mount type: ldiskfs
Flags:      0x1
               (MDT )
Persistent mount opts: iopen_nopriv,user_xattr,errors=remount-ro
Parameters: mgsnode=10.10.1.140@tcp,192.41.230.140@tcp1,141.211.101.161@tcp2
failover.node=10.10.1.49@tcp,192.41.230.49@tcp1
Notice there is no reference to 192.41.230.48@tcp anywhere here.
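For context, these on-disk parameters match what would have been passed at
format time; a reconstruction from the printed values (not our literal
command, just a sketch using the device from above) would be roughly:
mkfs.lustre --mdt --fsname=umt3 \
    --mgsnode=10.10.1.140@tcp,192.41.230.140@tcp1,141.211.101.161@tcp2 \
    --failnode=10.10.1.49@tcp,192.41.230.49@tcp1 \
    /dev/sdd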
Thanks for any suggestions,
Shawn
Daniel Kobras
2010-May-21  16:26 UTC
[Lustre-discuss] Clients getting incorrect network information for one of two MDT servers (active/passive)
Hi!

On Fri, May 21, 2010 at 11:54:56AM -0400, McKee, Shawn wrote:
> Parameters:
> mgsnode=10.10.1.140@tcp,192.41.230.140@tcp1,141.211.101.161@tcp2
> failover.node=10.10.1.49@tcp,192.41.230.49@tcp1
>
> Notice there is no reference to 192.41.230.48@tcp anywhere here.

Lustre MDS and OSS nodes register themselves with the MGS when they are
started (mounted) for the first time.  In particular, the then-current list of
network ids is recorded and sent off to the MGS, from where it is propagated
to all clients.  This information sticks and will not be updated
automatically, even if the configuration on the server changes.  From your
description, it sounds like you initially started up the MDS with an incorrect
LNET config (and probably fixed it in the meantime, but the MGS and thus the
clients won't know).  Check with "lctl list_nids" on your first MDS that
you're content with the current configuration, then follow the procedure to
change a server NID (the "writeconf" procedure) that is documented in the
manual, and you should get both server nodes operational again.

Regards,

Daniel.
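For anyone who finds this thread later, a minimal sketch of that writeconf
procedure on Lustre 1.8, assuming the MDT device /dev/sdd from earlier in the
thread (OST device names and mount points here are placeholders):

# 1. Stop the file system completely: unmount every client, then the MDT,
#    then all the OSTs.
umount /mnt/lustre                                 # on each client (placeholder mount point)
umount /mnt/mdt                                    # on the active MDS
umount /mnt/ost0                                   # on each OSS (placeholder mount point)

# 2. Regenerate the configuration logs on every target.
root@lmd01 ~# tunefs.lustre --writeconf /dev/sdd   # the MDT
root@umfs05 ~# tunefs.lustre --writeconf /dev/sdX  # each OST (placeholder device)

# 3. Restart in order: MGS first, then the MDT, then the OSTs, then the clients.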
McKee, Shawn
2010-May-21  16:48 UTC
[Lustre-discuss] Clients getting incorrect network information for one of two MDT servers (active/passive)
Thanks Daniel,

That indeed seems to be our problem.  We are following the process documented
in section 4.3.12 of the Lustre 1.8 Operations Manual, and that should fix us
up.

Many thanks for the solution,

Shawn