Hi,

If you refer to my previous message, you will see that I have two multihomed clusters, each with Lustre servers and clients. I have clients mounting Lustre partitions over o2ib and tcp.

I am now implementing failover. I tried it this morning without success, so RTFM. I read:

Note -- If you have an MGS or MDT configured for failover, perform these steps:

1. On the OST, list the NIDs of all MGS nodes at mkfs time.

   OST# mkfs.lustre --fsname sunfs --ost --mgsnode=10.0.0.1 --mgsnode=10.0.0.2 /dev/{device}

2. On the client, mount the file system.

   client# mount -t lustre 10.0.0.1:10.0.0.2:/sunfs /cfs/client/

So I extended the logic from:

mkfs.lustre --mgs --mdt --fsname=sata --failnode=ib3-st02s@o2ib3 --reformat /dev/mpath/emcssd-1
mkfs.lustre --fsname sata --reformat --ost --mgsnode=ib3-st01s@o2ib3 --mgsnode=ib3-st01e@tcp --failnode=ib3-st02s@o2ib3 /dev/mpath/colosse4-lun54-sata

to:

mkfs.lustre --mgs --mdt --fsname=sata --failnode=ib3-st02s@o2ib3,ib3-st02e@tcp --reformat /dev/mpath/emcssd-1
mkfs.lustre --fsname sata --reformat --ost --mgsnode=ib3-st01s@o2ib3,ib3-st01e@tcp --mgsnode=ib3-st02s@o2ib3,ib3-st02e@tcp --failnode=ib3-st02s@o2ib3,ib3-st02e@tcp /dev/mpath/colosse4-lun53-sata

And so on for the other disks.

The partitions mount fine on the MDS/MGS/OSS server, but on the OSS-only server I get:

[root@ib3-st03 ~]# mount -t lustre /dev/mpath/colosse4-lun55-sata /mnt/data/clun55
mount.lustre: mount /dev/mpath/colosse4-lun55-sata at /mnt/data/clun55 failed: Interrupted system call

The messages file contains:

Dec 21 15:18:52 ib3-st03 kernel: Lustre: 9464:0:(client.c:1487:ptlrpc_expire_one_request()) @@@ Request x1388814699331655 sent from MGC10.10.135.115@o2ib3 to NID 10.10.135.116@o2ib3 5s ago has timed out (5s prior to deadline).
Dec 21 15:18:52 ib3-st03 kernel: req@ffff810116fff800 x1388814699331655/t0 o250->MGS@MGC10.10.135.115@o2ib3_1:26/25 lens 368/584 e 0 to 1 dl 1324480732 ref 1 fl Rpc:N/0/0 rc 0/0
Dec 21 15:18:52 ib3-st03 kernel: LustreError: 23519:0:(obd_mount.c:1112:server_start_targets()) Required registration failed for sata-OSTffff: -4
Dec 21 15:18:52 ib3-st03 kernel: LustreError: 23519:0:(obd_mount.c:1670:server_fill_super()) Unable to start targets: -4
Dec 21 15:18:52 ib3-st03 kernel: LustreError: 23519:0:(obd_mount.c:1453:server_put_super()) no obd sata-OSTffff
Dec 21 15:18:52 ib3-st03 kernel: LustreError: 23519:0:(obd_mount.c:147:server_deregister_mount()) sata-OSTffff not registered
Dec 21 15:18:52 ib3-st03 kernel: Lustre: server umount sata-OSTffff complete
Dec 21 15:18:52 ib3-st03 kernel: LustreError: 23519:0:(obd_mount.c:2065:lustre_fill_super()) Unable to mount (-4)

So my question is: what would be the correct syntax to make sure I have failover for the o2ib clients as well as the tcp clients?

Thanks

--
Patrice Hamelin
Spécialiste sénior en systèmes d'exploitation | Senior OS specialist
Environnement Canada | Environment Canada
2121, route Transcanadienne | 2121 Transcanada Highway
Dorval, QC H9P 1J3
Téléphone | Telephone 514-421-5303
Télécopieur | Facsimile 514-421-7231
Gouvernement du Canada | Government of Canada
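
For reference, a minimal sketch of how the client-side mount from the manual excerpt might extend to this multihomed setup. It assumes the Lustre convention that commas separate the NIDs of a single node and colons separate failover nodes. The host names are the ones used in the mkfs.lustre commands above; the mount point /mnt/sata is a placeholder, and the command is only printed, not run. This is untested on these clusters and is not a diagnosis of the registration timeout shown in the log.

```shell
#!/bin/sh
# Sketch only: build a client mount specification for a multihomed MGS pair.
# Assumption: commas join the NIDs of ONE node, colons join failover nodes.

FSNAME=sata
PRIMARY_NIDS="ib3-st01s@o2ib3,ib3-st01e@tcp"    # both NIDs of the primary MGS node
FAILOVER_NIDS="ib3-st02s@o2ib3,ib3-st02e@tcp"   # both NIDs of its failover partner

# <primary NIDs>:<failover NIDs>:/<fsname>
MOUNT_SPEC="${PRIMARY_NIDS}:${FAILOVER_NIDS}:/${FSNAME}"

# Print the command instead of executing it; /mnt/sata is a hypothetical mount point.
MOUNT_CMD="mount -t lustre ${MOUNT_SPEC} /mnt/sata"
echo "$MOUNT_CMD"
```

A client on either network would use the same specification and pick the NID that matches its own LNET network.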