1. First, I start one MDS on one machine and two OSTs on a separate OSS machine. The first OST's data is on a disk shared over iSCSI.

2. Second, I stop (umount) one OST on the OSS machine and deactivate that OST on the MDS machine (lctl conf_param test-OST0000.osc.active=0).

3. Third, I set up a new machine and install Lustre and iSCSI on it. I start (mount) the first OST over iSCSI and activate the OST on the MDS machine (lctl conf_param test-OST0000.osc.active=1). The mount succeeds, but on the CLIENT machine the command 'lfs df -h' shows the OST as an inactive device, and listing files fails.
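The deactivate and reactivate steps above use the same parameter name, which combines the filesystem name with the OST index written as four hexadecimal digits. As a minimal sketch, assuming a POSIX shell, the hypothetical helper `ost_active_param` (not part of Lustre) builds that string. The `lctl` and `lfs` invocations are the ones from the post and are left as comments because they need a live MGS and client.

```shell
#!/bin/sh
# Hypothetical helper (not a Lustre tool): build the conf_param argument
# that toggles an OST's "active" flag.
#   $1 = filesystem name, $2 = OST index (decimal), $3 = 0 or 1
ost_active_param() {
    # Lustre names OSTs <fsname>-OSTxxxx, with the index as 4 hex digits.
    printf '%s-OST%04x.osc.active=%s\n' "$1" "$2" "$3"
}

# Intended use on the MGS/MDS node, mirroring the post (needs a live system):
#   lctl conf_param "$(ost_active_param test 0 0)"   # step 2: deactivate
#   lctl conf_param "$(ost_active_param test 0 1)"   # step 3: reactivate
# Then, on a client, check how the OST is reported:
#   lfs df -h
```

Generating the string rather than typing it avoids a mismatch between the name used to deactivate the OST and the name used to reactivate it.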
On Fri, 2008-05-30 at 03:08 -0700, Johnlya wrote:
> 3. Third, I install new machine and install Lustre, ISCSI. I
> start(mount) the first OST by ISCSI and active the OST on MDS
> machine(lctl conf_param test-OST0000.osc.active=1). It can work(mount
> successfully). But I find the OST is inactive device on CLIENT machine
> by command 'lfs df -h' and list files failed.

What are you trying to accomplish with these steps? It sounds like you are simply trying to move the OST "test-OST0000" from its existing OSS to a new one. Is this correct? If so, you can follow the process for "Changing a server nid" in the mountconf wiki at http://wiki.lustre.org/index.php?title=Mount_Conf#Changing_a_server_nid. I would suggest you read the general Writeconf section that this procedure is part of.

b.
On Tue, 2008-06-03 at 3:14, "Brian J. Murrell" <Brian.Murr... at Sun.COM> wrote:
> What are you trying to accomplish with these steps? It sounds like you
> are simply trying to move the OST "test-OST0000" from its existing OSS
> to a new one, is this correct? If so, you can follow the process for
> "Changing a server nid" in the mountconf wiki.

Thank you! With 'Writeconf', all servers have to be restarted. What I want to test is replacing an OST while that OST is down, without restarting the other servers (e.g. the MDS and the other OSS).
Sorry, the message below is the correct one:

What I want to test is replacing an OSS while that OSS is down, without restarting the other servers (e.g. the MDS and the other OSS). The OST data is on a disk shared over iSCSI.
On Mon, 2008-06-02 at 20:06 -0700, Johnlya wrote:
> I mean that I want to test for changing OSS when OSS is down. And
> other servers (e.g. MDS, other OSS) should not restart. The OST data
> is shared with one disk by iSCSI.

It sounds like you want failover. Please see our operations manual for a discussion of failover and how to implement it.

b.
On Tue, 2008-06-03 at 20:45, "Brian J. Murrell" <Brian.Murr... at Sun.COM> wrote:
> It sounds like you want failover. Please see our operations manual for
> a discussion of failover and how to implement it.

MDS_MASTER> mkfs.lustre --fsname=test --mdt --mgs --reformat --failnode=MDS_SLAVER /dev/sda1
MDS_MASTER> mount.lustre /dev/sda1 /mnt/test/mdt
OSS_MASTER1> mkfs.lustre --fsname=test --ost --reformat --mgsnode=MDS_SLAVE@tcp0 --mgsnode=MDS_MASTER@tcp0 --failnode=OSS_SLAVER1 /dev/sdb1
OSS_MASTER1> mount.lustre /dev/sdb1 /mnt/test/ost1
OSS_MASTER2> mkfs.lustre --fsname=test --ost --reformat --mgsnode=MDS_SLAVE@tcp0 --mgsnode=MDS_MASTER@tcp0 --failnode=OSS_SLAVER2 /dev/sdb2
OSS_MASTER2> mount.lustre /dev/sdb1 /mnt/test/ost2
CLIENT> mount -t lustre MDS_SLAVE@tcp0:MDS_MASTER@tcp0:/test /mnt/test/client/

I tested this and it works. When the OSS_MASTER2 and OSS_SLAVER2 machines are both down (the OST data is on a disk shared over iSCSI), I want to replace them with a new machine that has the same IP address configuration as the old OSS. The client mounts successfully, but it does not work.

Thank you!
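The client mount line above lists both MGS NIDs, separated by colons, ahead of the filesystem name. A minimal sketch of how that argument is assembled, assuming a POSIX shell: `client_mount_spec` is a hypothetical helper, not a Lustre command, and the NIDs are the placeholder host names from the post.

```shell
#!/bin/sh
# Hypothetical helper: join one or more MGS NIDs and a filesystem name into
# the device argument used by "mount -t lustre" on a client.
#   $1 = filesystem name, remaining arguments = MGS NIDs in the order to try
client_mount_spec() {
    fsname=$1
    shift
    spec=
    for nid in "$@"; do
        spec="${spec}${nid}:"     # each NID is followed by a colon
    done
    printf '%s/%s\n' "$spec" "$fsname"
}

# Intended use on the client, mirroring the post (needs a live filesystem):
#   mount -t lustre "$(client_mount_spec test MDS_SLAVE@tcp0 MDS_MASTER@tcp0)" /mnt/test/client
```

Keeping the NID list in one place makes it easier to confirm that the client and the `--mgsnode` options on the OSTs name the same pair of servers.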
On Tue, 2008-06-03 at 22:40 -0700, Johnlya wrote:
> I test it and It can work.

Good.

> when the OSS_MASTER2 and OSS_SLAVER2 machine is all down

So you are working on mitigating a double machine failure? Both the OSS servers go bad at the same time?

> I want to replace a new machine (IP
> address configuration not different from old OSS).

So you are replacing the OSS_MASTER2 machine with a new machine and assigning it the same IP address as OSS_MASTER2? (That is, why don't you just take the "root" disk out of OSS_MASTER2 and put it in the new machine?)

> The Client can
> mount successfully, but it can't work.

Well, if you duplicate the IP address faithfully, it should work. Can you supply the output of dmesg after you have mounted the client and done a "df -h" on it?

b.
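The output requested above can be gathered into a single file for posting to the list. A minimal sketch, assuming a Linux client on which the filesystem has already been mounted: `collect_client_diag` and the 200-line limit are illustrative choices, not part of Lustre.

```shell
#!/bin/sh
# Hypothetical helper: save "df -h" output and the tail of the kernel log
# to one file.
#   $1 = output file
collect_client_diag() {
    out=$1
    {
        echo '== df -h =='
        df -h 2>&1
        echo '== dmesg (last 200 lines) =='
        # Recent kernel messages; errors (e.g. missing privileges) are
        # captured in the file as well.
        dmesg 2>&1 | tail -n 200
    } > "$out"
}

# Example, run right after mounting the client:
#   collect_client_diag /tmp/lustre-client-diag.txt
```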
On Wed, 2008-06-04 at 21:18, "Brian J. Murrell" <Brian.Murr... at Sun.COM> wrote:
> Well, if you duplicate the IP address faithfully, it should work. Can
> you supply the output of dmesg after you have mounted the client and
> done a "df -h" on it?

Sorry, I was in meetings for the past few days, so I could not reply.