Jim Albin
2006-Nov-28 11:44 UTC
[Lustre-discuss] install problem: LNET maps nid to wrong interface
I am trying to install Lustre on a single system with multiple Ethernet interfaces. LNET appears to map the NID to eth3 whenever eth3 is up, regardless of the node name or IP address in the XML config file (eth1 and eth2 are also up). This is why the single-system test on the loopback or localhost was not working: lctl list_nids shows the IP address of eth3.

If I use the name that maps to eth3 in the config file, everything works. If I ifconfig eth3 down and use the node IP or name for eth1, it also works. I want to use the eth1 interface without disturbing eth3.

The installation manual mentions, in the Chapter 2 New Schema section, that LNET will try to use all available interfaces. How do I get it to use only a specific interface (eth1 instead of eth3)?

Thanks for any advice.

--
Jim Albin
Sr. Systems Administrator, HPC Systems
Scientific Computing Center
National Renewable Energy Laboratory
Aaron Knister
2006-Nov-28 11:46 UTC
[Lustre-discuss] install problem: LNET maps nid to wrong interface
In your /etc/modprobe.conf, add the following line, replacing X with the number of your eth interface:

  options lnet networks="tcp0(ethX)"

-Aaron

Jim Albin wrote:
> [...]
> How do I get it to only use a specific interface (eth1 instead of eth3)?
> Thanks for any advice.
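For readers who generate this setting from a script, here is a minimal sketch of building the same modprobe.conf line from a mapping of LNET network names to interfaces. The helper name `lnet_options_line` is hypothetical and not part of Lustre; the only thing taken from the thread is the `options lnet networks="..."` syntax itself.

```python
def lnet_options_line(networks):
    """Build an 'options lnet networks=...' line for /etc/modprobe.conf.

    `networks` maps an LNET network name (e.g. "tcp0") to the interface
    it should be bound to (e.g. "eth1"). Hypothetical helper, shown only
    to illustrate the syntax from the reply above.
    """
    if not networks:
        raise ValueError("at least one LNET network is required")
    # Each entry becomes net(iface); entries are comma-separated.
    spec = ",".join(f"{net}({iface})" for net, iface in networks.items())
    return f'options lnet networks="{spec}"'


# Jim's case: bind tcp0 to eth1 and leave eth3 alone.
print(lnet_options_line({"tcp0": "eth1"}))
```

The resulting line still has to be placed in /etc/modprobe.conf by hand (or by your own tooling), and the lnet module has to be reloaded before the change takes effect.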
Jim Albin
2006-Nov-28 12:05 UTC
[Lustre-discuss] install problem: LNET maps nid to wrong interface
That did the trick. Thanks Aaron!

On Tue, 2006-11-28 at 13:46 -0500, Aaron Knister wrote:
> In your /etc/modprobe.conf add the following line and replace X with the
> num of your eth interface.
>
> options lnet networks="tcp0(ethX)"
>
> -Aaron

--
Jim Albin
Robin Humble
2006-Dec-07 21:44 UTC
[Lustre-discuss] install problem: LNET maps nid to wrong interface
On Tue, Nov 28, 2006 at 12:05:41PM -0700, Jim Albin wrote:
>On Tue, 2006-11-28 at 13:46 -0500, Aaron Knister wrote:
>> options lnet networks="tcp0(ethX)"
>That did the trick. Thanks Aaron!

if I have fancy routes set up any time before I create the OST or MGS/MDT then the above doesn't work for me... even if I ifdown and ifup to clear all routes first it still doesn't work.

I need to boot without any fancy routes set at all, then create the Lustre filesystem, otherwise the OSSs and MDS always reject the traffic. this is with Lustre 1.6beta5.

might be intrinsic kernel weirdness, or could be Lustre. no idea. just FYI...

cheers,
robin
Eric Barton
2006-Dec-08 03:16 UTC
[Lustre-discuss] install problem: LNET maps nid to wrong interface
> On Tue, Nov 28, 2006 at 12:05:41PM -0700, Jim Albin wrote:
> >On Tue, 2006-11-28 at 13:46 -0500, Aaron Knister wrote:
> >> options lnet networks="tcp0(ethX)"
> >That did the trick. Thanks Aaron!
>
> if I have fancy routes set up any time before I create the OST or
> MGS/MDT then the above doesn't work for me... even if I ifdown and ifup
> to clear all routes first it still doesn't work.
>
> I need to boot without any fancy routes set at all, then create
> the Lustre filesystem, otherwise the OSSs and MDS always reject the
> traffic. this is with Lustre 1.6beta5.

The Lustre debug logs and/or console messages should help determine why the connection attempts are being rejected. Do you have these?

Cheers,
Eric
Nathaniel Rutman
2006-Dec-08 10:14 UTC
[Lustre-discuss] install problem: LNET maps nid to wrong interface
Eric Barton wrote:
>> On Tue, Nov 28, 2006 at 12:05:41PM -0700, Jim Albin wrote:
>>> On Tue, 2006-11-28 at 13:46 -0500, Aaron Knister wrote:
>>>> options lnet networks="tcp0(ethX)"
>>> That did the trick. Thanks Aaron!
>>
>> if I have fancy routes set up any time before I create the OST or
>> MGS/MDT then the above doesn't work for me... even if I ifdown and ifup
>> to clear all routes first it still doesn't work.
>>
>> I need to boot without any fancy routes set at all, then create
>> the Lustre filesystem, otherwise the OSSs and MDS always reject the
>> traffic. this is with Lustre 1.6beta5.
>
> The lustre debug logs and/or console messages should help determine why
> the connection attempts are being rejected. Do you have these?

When a 1.6 server mounts for the first time, it reports all its local NIDs to the MGS for inclusion in the configuration logs. After the logs are written, the NIDs in the logs are not changed (except by doing a "writeconf" procedure). So if your "fancy routes" setup leads to a weird local NID, this will get stored in the logs.
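The record-once behaviour described above can be pictured with a toy model. This is not Lustre code: the class `ToyConfigLog`, its methods, and the server name "ost1" are all hypothetical, the NIDs are only illustrative, and `writeconf` is reduced to "forget what was recorded", which is a simplification of the real procedure.

```python
class ToyConfigLog:
    """Toy stand-in for the MGS configuration logs (not Lustre code)."""

    def __init__(self):
        self._recorded = {}  # server name -> NIDs stored at first mount

    def mount(self, server, local_nids):
        # First mount: the server's local NIDs are written to the log.
        # Later mounts: the stored NIDs are kept, whatever the server
        # reports now.
        if server not in self._recorded:
            self._recorded[server] = tuple(local_nids)
        return self._recorded[server]

    def writeconf(self):
        # Simplified: discard the recorded NIDs so the next mount
        # records them afresh.
        self._recorded.clear()


log = ToyConfigLog()
log.mount("ost1", ["10.3.1.15@tcp"])           # first mount, unintended NID
stale = log.mount("ost1", ["10.2.8.1@tcp"])    # remount with the intended NID
log.writeconf()
fresh = log.mount("ost1", ["10.2.8.1@tcp"])    # recorded again after writeconf
print(stale, fresh)
```

In this model the second mount still returns the NID from the first one, which is the situation a stray route or interface at first-mount time would leave behind.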
Robin Humble
2006-Dec-11 00:13 UTC
[Lustre-discuss] install problem: LNET maps nid to wrong interface
On Fri, Dec 08, 2006 at 09:14:46AM -0800, Nathaniel Rutman wrote:
>Eric Barton wrote:
>>>On Tue, Nov 28, 2006 at 12:05:41PM -0700, Jim Albin wrote:
>>>>On Tue, 2006-11-28 at 13:46 -0500, Aaron Knister wrote:
>>>>>options lnet networks="tcp0(ethX)"
>>>>That did the trick. Thanks Aaron!
>>>
>>>if I have fancy routes set up any time before I create the OST or
>>>MGS/MDT then the above doesn't work for me... even if I ifdown and ifup
>>>to clear all routes first it still doesn't work.
>>
>>The lustre debug logs and/or console messages should help determine why
>>the connection attempts are being rejected. Do you have these?
>
>When a 1.6 server mounts for the first time, it reports all its local
>NIDs to the MGS for inclusion in the configuration logs. After the logs
>are written, the NIDs in the logs are not changed (except by doing a
>"writeconf" procedure.) So if your "fancy routes" setup leads to a
>weird local NID, this will get stored in the logs.

trying things again now, I can't reproduce the same problem... I must have screwed up something simple :-/ sorry about that.

however, your message makes me wonder if it's possible to get the MDS to respond on both of its gigabit interfaces? this is my best guess as to how to configure it:

  options lnet networks=tcp0(eth0),tcp1(eth1)

and 'lctl list_nids all' then says

  0@lo
  10.2.8.1@tcp    # eth0
  10.3.1.15@tcp1  # eth1

is that close? without any network options at all I get lo and eth0.

cheers,
robin
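To check a multi-interface setup like the one above from a script, the NID listing can be split into address and network parts. The sketch below assumes the output is one `address@network` entry per line and treats anything after `#` as an annotation, as in the listing quoted in the message; the function names `parse_nids` and `nids_by_network` are hypothetical.

```python
def parse_nids(text):
    """Split NID listing text into (address, network) pairs."""
    nids = []
    for raw in text.splitlines():
        # Drop trailing annotations such as "# eth0" and surrounding spaces.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        address, sep, network = line.partition("@")
        if not sep or not address or not network:
            raise ValueError(f"not a NID: {raw!r}")
        nids.append((address, network))
    return nids


def nids_by_network(text):
    """Group the addresses from a NID listing by LNET network name."""
    grouped = {}
    for address, network in parse_nids(text):
        grouped.setdefault(network, []).append(address)
    return grouped


# The listing from the message above.
sample = """0@lo
10.2.8.1@tcp    # eth0
10.3.1.15@tcp1  # eth1"""
print(nids_by_network(sample))
```

With two networks configured, each interface should appear under its own network name; a missing entry would suggest that the corresponding network was not brought up.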