Hello everyone,

I have a problem with manual failover on Lustre 1.6.6. I have two MDSes, four OSSes and two clients. The two MDSes form one failover pair, and the four OSSes form two failover pairs.

The two MDSes share storage:

[root@mds1 ~]# mkfs.lustre --fsname=testfs --mdt --mgs --failnode=mds2 /dev/sdb
[root@mds1 ~]# mkdir -p /mnt/mdt
[root@mds1 ~]# mount -t lustre /dev/sdb /mnt/mdt
[root@mds2 ~]# mkdir -p /mnt/mdt
[root@mds2 ~]# mount -t lustre /dev/sdb /mnt/mdt

Two OSSes (oss1 and oss2) share storage:

[root@oss1 ~]# mkfs.lustre --fsname=testfs --ost --failnode=oss2 --mgsnode=mds1 --mgsnode=mds2 /dev/sdb
[root@oss1 ~]# mkdir -p /mnt/ost
[root@oss1 ~]# mount -t lustre /dev/sdb /mnt/ost
[root@oss2 ~]# mkdir -p /mnt/ost
[root@oss2 ~]# mount -t lustre /dev/sdb /mnt/ost

The other two OSSes (oss3 and oss4) share storage:

[root@oss3 ~]# mkfs.lustre --fsname=testfs --ost --failnode=oss4 --mgsnode=mds1 --mgsnode=mds2 /dev/sdb
[root@oss3 ~]# mkdir -p /mnt/ost
[root@oss3 ~]# mount -t lustre /dev/sdb /mnt/ost
[root@oss4 ~]# mkdir -p /mnt/ost
[root@oss4 ~]# mount -t lustre /dev/sdb /mnt/ost

The two Lustre clients:

[root@client1 ~]# mkdir /lustre
[root@client1 ~]# mount -t lustre md1:md2:/testfs /lustre
[root@client2 ~]# mkdir /lustre
[root@client2 ~]# mount -t lustre md1:md2:/testfs /lustre

Now, when I shut down mds1, the clients can no longer use the "df" command to list disk usage. Shutting down oss1 or oss3 causes the same problem.

My question: how do I perform a manual failover under Lustre 1.6.6? I read the Lustre 1.6 manual, which says that failover needs the heartbeat software and so on, but shouldn't manual failover also be possible? How do I do it?

Thank you very much!

WangLin
2008-12-18
Brian J. Murrell
2008-Dec-18 18:29 UTC
[Lustre-discuss] Lustre1.6.6 manual failover problem
On Thu, 2008-12-18 at 14:45 +0800, Lin Wang wrote:
>
> [root@mds1 ~]# mkfs.lustre --fsname=testfs --mdt --mgs --failnode=mds2 /dev/sdb
> [root@mds1 ~]# mkdir -p /mnt/mdt
> [root@mds1 ~]# mount -t lustre /dev/sdb /mnt/mdt
> [root@mds2 ~]# mkdir -p /mnt/mdt
> [root@mds2 ~]# mount -t lustre /dev/sdb /mnt/mdt

You cannot do this. By mounting the MDT on *both* MDSes at the same time, you are corrupting it! Only one MDS can have the MDT mounted at a time. I think you need to review the Failover chapter of the manual again.

I'm surprised in fact that MMP allowed you to do this. It should not have.

> two OSS have share storage
> [root@oss1 ~]mkfs.lustre --fsname=testfs --ost --failnode=oss2 --mgsnode=mds1 --mgsnode=mds2 /dev/sdb
> [root@oss1 ~]# mkdir -p /mnt/ost
> [root@oss1 ~]# mount -t lustre /dev/sdb /mnt/ost
> [root@oss2 ~]# mkdir -p /mnt/ost
> [root@oss2 ~]# mount -t lustre /dev/sdb /mnt/ost

Ditto here. Two OSSes cannot mount the same OST at the same time. Corruption!

> Now, shutdown the mds1 .the client don't use the command "df" list the
> disk use .And
> shutdown the oss1 or oss3 the same problem.
> I want ask : how manual failover under the Lustre1.6.6
> I read the Lustre1.6 manual ,the failover need the heartbeat software
> and so on ,but should have manual failover?

You need to do manually what heartbeat would do, which is first kill the power of a dead node and then mount the resource (MDT or OST(s)).

b.
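
The two manual steps described above (power off the dead node, then mount the resource on its failover partner) could be sketched roughly as follows. This is only an illustration, not a tested procedure: the BMC host name mds1-bmc and the use of ipmitool for power control are assumptions that do not come from this thread, and IPMI credentials are omitted. The device and mount point are the ones from the original post. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch of a manual failover, meant to be run on the surviving node
# (for example mds2 after mds1 has died).
# DRY_RUN=1 (the default here) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Print a command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# manual_failover BMC_HOST DEVICE MOUNTPOINT
manual_failover() {
    bmc_host="$1"
    device="$2"
    mountpoint="$3"

    # Step 1: power off the dead node first, so it can never write to the
    # shared disk again. Assumption: the node has an IPMI BMC reachable as
    # BMC_HOST; any other remote power switch would serve the same purpose.
    run ipmitool -H "$bmc_host" chassis power off || return 1

    # Step 2: only after the power-off succeeded, mount the shared target
    # (MDT or OST) on this node.
    run mkdir -p "$mountpoint" || return 1
    run mount -t lustre "$device" "$mountpoint"
}

# Hypothetical example: take over the MDT after mds1 has failed.
manual_failover mds1-bmc /dev/sdb /mnt/mdt
```

The ordering is the point of the sketch: the mount is attempted only if the power-off step reports success, so the target is never mounted on two nodes at once.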