Michael Kluge
2010-Dec-03 13:32 UTC
[Lustre-discuss] lnet router immediately marked as down
Hi list,

we have Lustre 1.6.7.2 running on our (IB SDR) cluster, have added one additional NIC (tcp1) to one node, and would like to use this node as a router. I have added an ip2nets statement and forwarding=enabled to the modprobe files on the router and reloaded the modules. I now see two NIDs and no trouble.

The MDS, which needs to go through the router to reach a handful of additional clients, is in production and I can't take it down. So I added the route to the additional network via lctl --net tcp1 add_route W.X.Y.Z@o2ib, where W.X.Y.Z is the IPoIB address of the router. When I run lctl show_route, this router is marked as "down". Is there a way to bring it to life? I can lctl ping the router node from the MDS, but I can't reload lnet to enable active router tests. Right now the only lnet module option on the MDS is the network config for the IB network interface.

Any ideas how to enable this router?

Regards, Michael

-- 
Michael Kluge, M.Sc.
Technische Universität Dresden
Center for Information Services and High Performance Computing (ZIH)
D-01062 Dresden, Germany
Contact: Willersbau, Room A 208
Phone: (+49) 351 463-34217
Fax: (+49) 351 463-37773
e-mail: michael.kluge at tu-dresden.de
WWW: http://www.tu-dresden.de/zih
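For readers following along, the router-side setup described in this message (an ip2nets statement plus forwarding=enabled) might look roughly like the line printed by the sketch below. The two subnet patterns are hypothetical placeholders, not values from this thread, and the script only prints the line for review; it does not touch any modprobe file or load any module.

```shell
#!/bin/sh
# Sketch only: prints a candidate "options lnet" line for the ROUTER node.
# Both subnet patterns are hypothetical; substitute the real ones.
IB_PATTERN="192.168.10.*"   # assumed IPoIB subnet of the o2ib network
TCP_PATTERN="10.0.1.*"      # assumed Ethernet subnet of the tcp1 network

# Build the options line: one ip2nets entry per network the router is on,
# plus forwarding enabled so the node routes between them.
router_lnet_options() {
    printf 'options lnet ip2nets="o2ib %s; tcp1 %s" forwarding="enabled"\n' \
        "$IB_PATTERN" "$TCP_PATTERN"
}

router_lnet_options
```

After the modules are reloaded with such a line, lctl list_nids on the router should report one NID per network, which matches the "two NIDs" observation above.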
liang Zhen
2010-Dec-03 15:29 UTC
[Lustre-discuss] lnet router immediately marked as down
Hi Michael,

To add a router dynamically, you also have to run "lctl --net o2ib add_route a.b.c.d@tcp1" on all nodes of tcp1. The better choice is therefore a universal modprobe.conf that defines "ip2nets" and "routes". You can see some examples here: http://wiki.lustre.org/manual/LustreManual18_HTML/MoreComplicatedConfigurations.html

Regards
Liang
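The point of this reply is that a route has to exist in both directions: o2ib nodes need a route to tcp1, and tcp1 nodes need a route back to o2ib. A minimal dry-run sketch of the two commands follows. W.X.Y.Z and a.b.c.d are the placeholder addresses used in this thread (the router's IPoIB and Ethernet addresses), and the function names are made up for illustration. The script only prints the commands so they can be checked before being run by hand on the respective nodes.

```shell
#!/bin/sh
# Dry-run sketch: prints the lctl commands instead of executing them.
# Addresses are the placeholders from the thread, not real values.
ROUTER_IB_NID="W.X.Y.Z@o2ib"    # router NID as seen from the o2ib side
ROUTER_TCP_NID="a.b.c.d@tcp1"   # router NID as seen from the tcp1 side

# For every o2ib node (e.g. the MDS): reach tcp1 via the router's o2ib NID.
o2ib_side_cmd() {
    echo "lctl --net tcp1 add_route $ROUTER_IB_NID"
}

# For every tcp1 node: reach o2ib via the router's tcp1 NID.
tcp1_side_cmd() {
    echo "lctl --net o2ib add_route $ROUTER_TCP_NID"
}

o2ib_side_cmd
tcp1_side_cmd
```

Routes added this way do not survive a module reload, which is why the reply recommends putting "ip2nets" and "routes" into a shared modprobe.conf instead.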
Michael Kluge
2010-Dec-03 17:48 UTC
[Lustre-discuss] lnet router immediately marked as down
Hi Liang,

sure, but my current question is: why do the nodes within o2ib consider the router to be down? I add the route on a node within o2ib, and immediately afterwards lctl show_route says the router is down. That does not make much sense to me. And if I try to send a message through the router from this node, I see that it can't send the message because all routers are down.

Regards, Michael