Hi,

is there a way to tell a lustre client to do a failover from one network to another one (e.g. from o2ib to tcp)?

Best regards,
Hans Schnitzer

--
Hans-Juergen Schnitzer
RWTH Aachen University, Center for Computing and Communication
Rechen- und Kommunikationszentrum
Seffenter Weg 23, 52074 Aachen (Germany)
Tel.: +49 (0)241/80-28719 - Fax: +49 (0)241/80-628719
schnitzer at rz.rwth-aachen.de
http://www.rz.rwth-aachen.de
On Tue, 2008-07-01 at 13:57 +0200, Hans-Juergen Schnitzer wrote:
> Hi,
>
> is there a way to tell a lustre client to do a failover from one
> network to another one (e.g. from o2ib to tcp)?

What is your use-case? Lustre doesn't have to be told to fail over. It will fail over automatically if it cannot reach whatever it is configured to reach on o2ib (assuming it was configured to fail over onto tcp).

b.
In my current configuration, when I do a failover from one OSS node to another one, the client reconnects after some time. That works fine. However, when I simply unplug the IB connector on the OSS, the client hangs, waiting for the connection to come back.

One use-case is that I would like to switch to ethernet when I shut down IB for maintenance, for example. How can I do that?

Best regards,
Hans Schnitzer

Brian J. Murrell wrote:
> On Tue, 2008-07-01 at 13:57 +0200, Hans-Juergen Schnitzer wrote:
>> Hi,
>>
>> is there a way to tell a lustre client to do a failover from one
>> network to another one (e.g. from o2ib to tcp)?
>
> What is your use-case? Lustre doesn't have to be told to failover. It
> will just automatically fail over if there is a failure to reach
> whatever it is configured to reach on o2ib (assuming it was configured
> to failover onto tcp).
>
> b.
>
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
On Wed, 2008-07-02 at 19:08 +0200, Hans-Juergen Schnitzer wrote:
> In my current configuration, when I do a failover from one OSS node
> to another one, the client reconnects after some time.

Good.

> However, when I simply unplug the IB connector on the
> OSS, the client hangs, waiting for the connection to come back.

How long did you give it? There should be no functional difference between the two kinds of failure. If a client times out trying to reach an OST on a given OSS, it simply tries the failover partner. A network failure certainly qualifies as the kind of failure that would trigger that.

Did you actually umount the OSTs on the server whose IB connector you pulled and mount them on the failover OSS? Having the failover OSS mount the failed resources (and ONLY after the failed node has unmounted them!!) is a prerequisite for the client to actually perform a failover.

> One use-case is that I would like to switch to ethernet when I
> shutdown IB for maintenance for example. How can I do that?

If the same OSSes have both the IB and TCP NIDs for the target, simply shutting down the IB interface/network should cause a properly configured client to fail over to the TCP LND. The Operations Manual should give pretty good coverage of configuring multiple LNDs.

b.
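
The fallback described in this reply can be illustrated with a small Python model. It is only a sketch under stated assumptions, not Lustre's actual implementation: the `Nid` class, the `pick_reachable_nid` helper, the reachability callbacks and the example addresses are all hypothetical, and the real client notices a failure by timing out rather than by running an explicit probe.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Nid:
    """Hypothetical stand-in for a network identifier of the form address@network."""
    address: str
    network: str

    def __str__(self) -> str:
        return f"{self.address}@{self.network}"


def pick_reachable_nid(
    nids: Sequence[Nid],
    is_reachable: Callable[[Nid], bool],
) -> Optional[Nid]:
    """Return the first NID, in configured order, that the probe reports as reachable.

    Returns None when no configured NID can be reached.
    """
    for nid in nids:
        if is_reachable(nid):
            return nid
    return None


# Assumed example: one OSS that exposes the same target on IB and on TCP.
oss_nids = [Nid("10.0.1.5", "o2ib"), Nid("192.168.1.5", "tcp")]


def all_up(nid: Nid) -> bool:
    # Both networks are healthy.
    return True


def ib_down(nid: Nid) -> bool:
    # Models the IB network being shut down for maintenance.
    return nid.network != "o2ib"


print(pick_reachable_nid(oss_nids, all_up))
print(pick_reachable_nid(oss_nids, ib_down))
```

In this model the preferred network comes first in the list, so the TCP NID is only chosen once the o2ib NID stops answering. If the server has no NID configured on the second network, nothing is left to fall back to and the helper returns `None`.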