Ok, I feel really stupid. I've done this before without any problem, but I can't seem to get it to work and I can't find my notes from the last time I did it. We have separate MGS and MDTs, and I can't seem to get our MGS to fail over correctly after reformatting it:

mkfs.lustre --mkfsoptions="-O dir_index" --reformat --mgs --failnode=192.168.1.253@o2ib /dev/mapper/ldiskc-part1

We are running this on Debian, using the Lustre 1.6.3 debs from svn on Lenny with 2.6.22.12. I've tried several permutations of the mkfs.lustre command, specifying both nodes as failover, both nodes as MGS, and pretty much every other combination of the above. With the above command, tunefs.lustre shows that failnode and mgsnode are the failover node.

Thanks,
Robert

Robert LeBlanc
College of Life Sciences Computer Support
Brigham Young University
leblanc@byu.edu
(801)422-1882
Robert LeBlanc wrote:
> mkfs.lustre --mkfsoptions="-O dir_index" --reformat --mgs
> --failnode=192.168.1.253@o2ib /dev/mapper/ldiskc-part1

The MGS doesn't actually use the --failnode option (although it won't hurt). You actually have to tell the other nodes in the system (servers and clients) about the failover options for the MGS (use the --mgsnode parameter on servers, and the mount address for clients). The reason is that the servers must contact the MGS for the configuration information, and they can't ask the MGS where its failover partner is if, e.g., the failover partner is the one that's running.
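To make the advice above concrete, here is a minimal sketch of how the MGS failover pair is declared on the other nodes rather than on the MGS itself. The filesystem name (testfs), the OST device path, and the primary MGS NID (192.168.1.252@o2ib) are made up for the example; the thread only names the failover NID, 192.168.1.253@o2ib.

# On an OST (or MDT) server: record both MGS NIDs so the target can reach
# the MGS on whichever node is currently running it.
mkfs.lustre --fsname=testfs --ost \
    --mgsnode=192.168.1.252@o2ib --mgsnode=192.168.1.253@o2ib \
    /dev/mapper/ost0-part1

# On a client: give both MGS NIDs, separated by a colon, in the mount address.
mount -t lustre 192.168.1.252@o2ib:192.168.1.253@o2ib:/testfs /mnt/testfs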
Hi,

What will my tunefs.lustre command line look like if I want to configure a failnode for my MDS? I have two MDTs, and the MGS is on the same block device as one of the MDTs. I also have two servers connected to shared metadata storage.

Thanks,

Wojciech

On 12 Nov 2007, at 20:49, Nathan Rutman wrote:
> The MGS doesn't actually use the --failnode option (although it won't
> hurt). You actually have to tell the other nodes in the system (servers
> and clients) about the failover options for the MGS.

Mr Wojciech Turek
Assistant System Manager
University of Cambridge
High Performance Computing service
email: wjt27@cam.ac.uk
tel. +441223763517
You should just unmount all the clients and all OSTs, and then:

tunefs.lustre --failnode=10.0.0.2@tcp --writeconf /dev/shared/disk

If your volume is already on the shared disk, then mount everything and you should be good to go. You can also do it on a live mounted system by using lctl, but I'm not exactly sure how to do that.

Robert

On 11/12/07 2:24 PM, "Wojciech Turek" <wjt27@cam.ac.uk> wrote:
> What will my tunefs.lustre command line look like if I want to
> configure a failnode for my MDS? I have two MDTs, and the MGS is on the
> same block device as one of the MDTs.

Robert LeBlanc
College of Life Sciences Computer Support
Brigham Young University
leblanc@byu.edu
(801)422-1882
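As a rough outline of the procedure Robert describes, the sequence for a combined MGS/MDT on a shared device would look something like the sketch below. This is only a sketch under the thread's assumptions (Lustre 1.6, shared metadata storage); the device path, NID, and mount points are placeholders.

# 1. Quiesce the filesystem: unmount clients first, then the OSTs, then the MDT/MGS.
umount /mnt/lustre          # on every client
umount /mnt/ost0            # on every OSS, for each OST
umount /mnt/mdt             # on the currently active MDS

# 2. Record the failover NID and force the configuration logs to be rewritten.
tunefs.lustre --failnode=10.0.0.2@tcp --writeconf /dev/shared/disk

# 3. Remount in the reverse order: MDT/MGS first, then the OSTs, then the clients.
mount -t lustre /dev/shared/disk /mnt/mdt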
Hi,

Thanks for that. Actually, I have a slightly more complex situation here. I have two sets of clients: the first set is on the 10.142.10.0/24 network and the second set is on the 10.143.0.0/16 network. Each server has two NICs: NIC1 = eth0 on 10.143.0.0/16 and NIC2 = eth1 on 10.142.10.0/24. LNET configures the networks as follows:

eth0 = <ip>@tcp0
eth1 = <ip>@tcp1

I am going to change the Lustre configuration in order to introduce failover. The MGS is combined with mdt01 = /dev/dm-0.

on mds01:
tunefs.lustre --erase-params --writeconf --failnode=10.143.245.202@tcp0,10.142.10.202@tcp1 /dev/dm-0
tunefs.lustre --erase-params --writeconf --failnode=10.143.245.202@tcp0,10.142.10.202@tcp1 /dev/dm-1

on oss1:
tunefs.lustre --erase-params --writeconf --failnode=10.143.245.8@tcp0,10.142.10.8@tcp1 --mgsnode=10.143.245.201@tcp0,10.142.10.201@tcp1 --mgsnode=10.143.245.202@tcp0,10.142.10.202@tcp1 /dev/dm-0
tunefs.lustre --erase-params --writeconf --failnode=10.143.245.8@tcp0,10.142.10.8@tcp1 --mgsnode=10.143.245.201@tcp0,10.142.10.201@tcp1 --mgsnode=10.143.245.202@tcp0,10.142.10.202@tcp1 /dev/dm-1
tunefs.lustre --erase-params --writeconf --failnode=10.143.245.8@tcp0,10.142.10.8@tcp1 --mgsnode=10.143.245.201@tcp0,10.142.10.201@tcp1 --mgsnode=10.143.245.202@tcp0,10.142.10.202@tcp1 /dev/dm-2
tunefs.lustre --erase-params --writeconf --failnode=10.143.245.8@tcp0,10.142.10.8@tcp1 --mgsnode=10.143.245.201@tcp0,10.142.10.201@tcp1 --mgsnode=10.143.245.202@tcp0,10.142.10.202@tcp1 /dev/dm-3
tunefs.lustre --erase-params --writeconf --failnode=10.143.245.8@tcp0,10.142.10.8@tcp1 --mgsnode=10.143.245.201@tcp0,10.142.10.201@tcp1 --mgsnode=10.143.245.202@tcp0,10.142.10.202@tcp1 /dev/dm-4
tunefs.lustre --erase-params --writeconf --failnode=10.143.245.8@tcp0,10.142.10.8@tcp1 --mgsnode=10.143.245.201@tcp0,10.142.10.201@tcp1 --mgsnode=10.143.245.202@tcp0,10.142.10.202@tcp1 /dev/dm-5

on oss2:
tunefs.lustre --erase-params --writeconf --failnode=10.143.245.7@tcp0,10.142.10.7@tcp1 --mgsnode=10.143.245.201@tcp0,10.142.10.201@tcp1 --mgsnode=10.143.245.202@tcp0,10.142.10.202@tcp1 /dev/dm-6
tunefs.lustre --erase-params --writeconf --failnode=10.143.245.7@tcp0,10.142.10.7@tcp1 --mgsnode=10.143.245.201@tcp0,10.142.10.201@tcp1 --mgsnode=10.143.245.202@tcp0,10.142.10.202@tcp1 /dev/dm-7
tunefs.lustre --erase-params --writeconf --failnode=10.143.245.7@tcp0,10.142.10.7@tcp1 --mgsnode=10.143.245.201@tcp0,10.142.10.201@tcp1 --mgsnode=10.143.245.202@tcp0,10.142.10.202@tcp1 /dev/dm-8
tunefs.lustre --erase-params --writeconf --failnode=10.143.245.7@tcp0,10.142.10.7@tcp1 --mgsnode=10.143.245.201@tcp0,10.142.10.201@tcp1 --mgsnode=10.143.245.202@tcp0,10.142.10.202@tcp1 /dev/dm-9
tunefs.lustre --erase-params --writeconf --failnode=10.143.245.7@tcp0,10.142.10.7@tcp1 --mgsnode=10.143.245.201@tcp0,10.142.10.201@tcp1 --mgsnode=10.143.245.202@tcp0,10.142.10.202@tcp1 /dev/dm-10
tunefs.lustre --erase-params --writeconf --failnode=10.143.245.7@tcp0,10.142.10.7@tcp1 --mgsnode=10.143.245.201@tcp0,10.142.10.201@tcp1 --mgsnode=10.143.245.202@tcp0,10.142.10.202@tcp1 /dev/dm-11

on oss3:
tunefs.lustre --erase-params --writeconf --failnode=10.143.245.10@tcp0,10.142.10.10@tcp1 --mgsnode=10.143.245.201@tcp0,10.142.10.201@tcp1 --mgsnode=10.143.245.202@tcp0,10.142.10.202@tcp1 /dev/dm-0
tunefs.lustre --erase-params --writeconf --failnode=10.143.245.10@tcp0,10.142.10.10@tcp1 --mgsnode=10.143.245.201@tcp0,10.142.10.201@tcp1 --mgsnode=10.143.245.202@tcp0,10.142.10.202@tcp1 /dev/dm-1
tunefs.lustre --erase-params --writeconf --failnode=10.143.245.10@tcp0,10.142.10.10@tcp1 --mgsnode=10.143.245.201@tcp0,10.142.10.201@tcp1 --mgsnode=10.143.245.202@tcp0,10.142.10.202@tcp1 /dev/dm-2
tunefs.lustre --erase-params --writeconf --failnode=10.143.245.10@tcp0,10.142.10.10@tcp1 --mgsnode=10.143.245.201@tcp0,10.142.10.201@tcp1 --mgsnode=10.143.245.202@tcp0,10.142.10.202@tcp1 /dev/dm-3
tunefs.lustre --erase-params --writeconf --failnode=10.143.245.10@tcp0,10.142.10.10@tcp1 --mgsnode=10.143.245.201@tcp0,10.142.10.201@tcp1 --mgsnode=10.143.245.202@tcp0,10.142.10.202@tcp1 /dev/dm-4
tunefs.lustre --erase-params --writeconf --failnode=10.143.245.10@tcp0,10.142.10.10@tcp1 --mgsnode=10.143.245.201@tcp0,10.142.10.201@tcp1 --mgsnode=10.143.245.202@tcp0,10.142.10.202@tcp1 /dev/dm-5

on oss4:
tunefs.lustre --erase-params --writeconf --failnode=10.143.245.9@tcp0,10.142.10.9@tcp1 --mgsnode=10.143.245.201@tcp0,10.142.10.201@tcp1 --mgsnode=10.143.245.202@tcp0,10.142.10.202@tcp1 /dev/dm-6
tunefs.lustre --erase-params --writeconf --failnode=10.143.245.9@tcp0,10.142.10.9@tcp1 --mgsnode=10.143.245.201@tcp0,10.142.10.201@tcp1 --mgsnode=10.143.245.202@tcp0,10.142.10.202@tcp1 /dev/dm-7
tunefs.lustre --erase-params --writeconf --failnode=10.143.245.9@tcp0,10.142.10.9@tcp1 --mgsnode=10.143.245.201@tcp0,10.142.10.201@tcp1 --mgsnode=10.143.245.202@tcp0,10.142.10.202@tcp1 /dev/dm-8
tunefs.lustre --erase-params --writeconf --failnode=10.143.245.9@tcp0,10.142.10.9@tcp1 --mgsnode=10.143.245.201@tcp0,10.142.10.201@tcp1 --mgsnode=10.143.245.202@tcp0,10.142.10.202@tcp1 /dev/dm-9
tunefs.lustre --erase-params --writeconf --failnode=10.143.245.9@tcp0,10.142.10.9@tcp1 --mgsnode=10.143.245.201@tcp0,10.142.10.201@tcp1 --mgsnode=10.143.245.202@tcp0,10.142.10.202@tcp1 /dev/dm-10
tunefs.lustre --erase-params --writeconf --failnode=10.143.245.9@tcp0,10.142.10.9@tcp1 --mgsnode=10.143.245.201@tcp0,10.142.10.201@tcp1 --mgsnode=10.143.245.202@tcp0,10.142.10.202@tcp1 /dev/dm-11

Will the above be correct?

Cheers,

Wojciech Turek

On 12 Nov 2007, at 21:36, Robert LeBlanc wrote:
> You should just unmount all the clients and all OSTs, and then:
> tunefs.lustre --failnode=10.0.0.2@tcp --writeconf /dev/shared/disk
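For reference, the eth0 = <ip>@tcp0 / eth1 = <ip>@tcp1 mapping described above is normally what an LNET module option along the following lines produces. This is a sketch only; the exact file (/etc/modprobe.conf or a modprobe.d fragment) depends on the distribution.

# LNET module option, assuming eth0 carries tcp0 and eth1 carries tcp1 on
# every server and client:
options lnet networks="tcp0(eth0),tcp1(eth1)"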
Since you are only adding parameters, you don't need the --erase-params option. I think.

Robert

On 11/12/07 3:23 PM, "Wojciech Turek" <wjt27@cam.ac.uk> wrote:
> Actually, I have a slightly more complex situation here. [...]
> Will the above be correct?

Robert LeBlanc
College of Life Sciences Computer Support
Brigham Young University
leblanc@byu.edu
(801)422-1882
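One way to check what a target currently has stored before deciding whether --erase-params is needed is a non-destructive dry run; a sketch, reusing a device name from the thread. Keep in mind that --erase-params wipes all previously stored parameters, so anything still wanted (for example the OSTs' --mgsnode entries) has to be respecified on the same command line.

# Print the stored parameters without changing anything on disk.
tunefs.lustre --dryrun /dev/dm-0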