Hello,

I'm trying to get OST failover going using failover groups. Currently I am able to stop OSTs by group name (using: lconf --cleanup --group <group name> /etc/lustre/config.xml), but unless I modify lconf, I am unable to start OSTs by group name (using: lconf --group <group name> /etc/lustre/config.xml).

Am I correct in thinking that this is a bug, or am I misunderstanding groups?

If I change lconf at line 1524 from:

    if self.active and config.group and config.group != ost.get_val('group', ost.get_val('name')):
        self.active = 0

to:

    if not self.active and config.group and config.group == ost.get_val('group', ost.get_val('name')):
        self.active = 1
    elif self.active and config.group and config.group != ost.get_val('group', ost.get_val('name')):
        self.active = 0

then it seems to work like I'd expect it to.

Thanks,
Kit Westneat
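To make the difference between the two versions of the check easier to see, here is a minimal standalone sketch. It is not lconf code: the function and parameter names are hypothetical stand-ins for self.active, config.group, and ost.get_val('group', ost.get_val('name')), and the group names are made up for illustration.

```python
# Simplified model of the check described above (around line 1524 of lconf).
# Hypothetical names:
#   active          -> self.active
#   requested_group -> config.group (the value passed to --group)
#   ost_group       -> ost.get_val('group', ost.get_val('name'))

def original_check(active, requested_group, ost_group):
    """Stock check: --group can only switch an already active OST off."""
    if active and requested_group and requested_group != ost_group:
        active = 0
    return active


def patched_check(active, requested_group, ost_group):
    """Modified check: --group also switches a matching inactive OST on."""
    if not active and requested_group and requested_group == ost_group:
        active = 1
    elif active and requested_group and requested_group != ost_group:
        active = 0
    return active


if __name__ == "__main__":
    requested = "groupA"  # hypothetical group name
    for active in (0, 1):
        for ost_group in ("groupA", "groupB"):
            print("active=%d ost_group=%s original=%d patched=%d" % (
                active, ost_group,
                original_check(active, requested, ost_group),
                patched_check(active, requested, ost_group)))
```

The two functions agree in every case except one: an OST that is currently inactive and whose group matches the requested group stays inactive under the original check and becomes active under the modified one.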
Kit,

Are you actually modifying the xml file between the stop and the start using lconf?

In our ldap world, which should be similar, you need to run lactive between the lconf calls. For example, to fail over m973 to m956 (its failover pair), we run 3 commands from a management server:

---------------------

rsh m973 /usr/sbin/lconf --ldapurl ldap://lustre.hpcs2.emsl.pnl.gov --config mpp2fo --cleanup --failover --group pair15_2

/usr/sbin/lactive --ldapurl ldap://lustre.hpcs2.emsl.pnl.gov --config mpp2fo --pwfile /home/mscf/lustre/pw --group pair15_2 --active m956

rsh m956 /usr/sbin/lconf --ldapurl ldap://lustre.hpcs2.emsl.pnl.gov --group pair15_2 --config mpp2fo

---------------------

The lactive command modifies the config in ldap (or the xml file) so that the other server believes it should serve the new targets.

Note: this is Lustre 1.4.2.

Evan

> -----Original Message-----
> From: lustre-discuss-bounces@clusterfs.com
> [mailto:lustre-discuss-bounces@clusterfs.com] On Behalf Of Kit Westneat
> Sent: Wednesday, November 08, 2006 2:24 PM
> To: lustre-discuss@clusterfs.com
> Subject: [Lustre-discuss] failover groups
>
> Am I correct in thinking that this is a bug, or am I
> misunderstanding groups?
>
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss@clusterfs.com
> https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
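The three-step sequence Evan describes (clean up the group on the old node, mark the new node active with lactive, start the group on the new node) could be wrapped in a small helper along these lines. This is only a sketch: the function name and its parameters are hypothetical, it only builds and prints the command lines rather than running them, and it uses only the flags and paths that appear in the commands quoted in this thread.

```python
import shlex


def failover_commands(ldap_url, config, group, old_node, new_node, pwfile):
    """Return the three commands, in order, as argument lists.

    Hypothetical helper: nothing is executed here, and the flags are copied
    from the commands quoted in the thread (Lustre 1.4.2 with an ldap config).
    """
    # 1. Stop the group's targets on the node that currently serves them.
    stop = ["rsh", old_node, "/usr/sbin/lconf",
            "--ldapurl", ldap_url, "--config", config,
            "--cleanup", "--failover", "--group", group]
    # 2. Record in the config that the other node is now active for the group.
    activate = ["/usr/sbin/lactive",
                "--ldapurl", ldap_url, "--config", config,
                "--pwfile", pwfile, "--group", group, "--active", new_node]
    # 3. Start the group's targets on the newly active node.
    start = ["rsh", new_node, "/usr/sbin/lconf",
             "--ldapurl", ldap_url, "--group", group, "--config", config]
    return [stop, activate, start]


if __name__ == "__main__":
    commands = failover_commands(
        ldap_url="ldap://lustre.hpcs2.emsl.pnl.gov",
        config="mpp2fo",
        group="pair15_2",
        old_node="m973",
        new_node="m956",
        pwfile="/home/mscf/lustre/pw",
    )
    for argv in commands:
        # Quote each argument so the printed line can be pasted into a shell.
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Keeping the commands as argument lists makes the ordering explicit: the lactive step has to sit between the two lconf calls, which is the point of Evan's reply.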
Ah, I was operating under the impression that the --group flag would take care of setting the OSTs active. It would be nice if there were a command that would do this, if not the group flag. I know it is possible to use the --select flag to set individual OSTs active, but it'd be nice to be able to set an entire group active. If --group isn't a good choice for implementing this, would it be possible to implement --select <group>=<node>?

It'd be nice to avoid having to select each OST individually when we have these groups defined.

Thanks,
Kit

Felix, Evan J wrote:
> In our ldap world, which should be similar, you need to run lactive
> between the lconf calls.
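Until something like --select <group>=<node> exists, the per-OST selection Kit mentions could be generated from the group definitions rather than typed by hand. The sketch below is hypothetical throughout: it assumes --select takes one <service>=<node> pair per occurrence (by analogy with the syntax Kit proposes), and the OST names, group names, and node name are invented for illustration.

```python
def select_args_for_group(ost_groups, group, node):
    """Expand one group into per-OST --select arguments.

    ost_groups maps an OST name to its failover group. The
    "--select <service>=<node>" form is an assumption, not checked
    against lconf itself.
    """
    args = []
    for ost_name in sorted(ost_groups):
        if ost_groups[ost_name] == group:
            args += ["--select", "%s=%s" % (ost_name, node)]
    return args


if __name__ == "__main__":
    # Hypothetical layout: two OSTs in one failover group, one in another.
    ost_groups = {"ost1": "groupA", "ost2": "groupB", "ost3": "groupA"}
    print(" ".join(select_args_for_group(ost_groups, "groupA", "nodeB")))
```

This only saves typing; it does not change which node the stored config marks as active.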
You should all use 1.6 and forget this whole mess :)

Kit Westneat wrote:
> Ah, I was operating under the impression that the --group flag would
> take care of setting the OSTs active.
That would be great... But there are reasons to keep working on old versions:

Our file systems are in production, which means no playing with them to make upgrades.

A vendor other than CFS (HP) controls what version we use. We are still on 1.4.2, 1.4.5 is 3-6 months away, and 1.4.7 is WAY out there.

And most importantly: Lustre 1.6 is in BETA! So unless you released it and didn't tell anyone, we can't use it yet except for playing.

Personally I like the 1.6 betas, and use them for my play. But I still support 1.4.2 and 1.4.7.1.

Evan

> -----Original Message-----
> From: lustre-discuss-bounces@clusterfs.com
> [mailto:lustre-discuss-bounces@clusterfs.com] On Behalf Of Nathaniel Rutman
> Sent: Thursday, November 09, 2006 1:14 PM
> To: Kit Westneat
> Cc: lustre-discuss@clusterfs.com
> Subject: Re: [Lustre-discuss] failover groups
>
> You should all use 1.6 and forget this whole mess :)