Dear list,

I am a Lustre newbie and would like to know where I can find more information about failover (node A goes down, node B comes up and takes over somehow).

The config consists of 2 nodes, a & b.

--------
#!/bin/sh

# creating 2 nodes
# 1st node
lmc -o jan.xml --add node --node cccbasea
lmc -m jan.xml --add net --node cccbasea --nid cccbasea --nettype tcp

# 2nd node
lmc -m jan.xml --add node --node cccbaseb
lmc -m jan.xml --add net --node cccbaseb --nid cccbaseb --nettype tcp

# creating 1 MDS
lmc -m jan.xml --add mds --node cccbasea --mds mds1 --fstype ldiskfs --dev /dev/sdb1

# creating OST failover
lmc -m jan.xml --add ost --node cccbasea --ost ost1 --failover --fstype ldiskfs --dev /dev/sdb2
lmc -m jan.xml --add ost --node cccbaseb --ost ost1 --failover --fstype ldiskfs --dev /dev/sdb2

# client config
lmc -m jan.xml --add mtpt --node cccbasea --path /mnt/lustre --mds mds1 --ost ost1
-------

/dev/sdb2 is an iSCSI device. Mounting the shared drive (/dev/sdb2) simultaneously from node a and node b works perfectly, but I don't know how to realize the failover scenario, i.e. node A goes down and node B takes over (flawlessly), ideally with the same metadata and the same "user data".

Newbie question: do I also need a second MDS, or is one sufficient? IMHO a single MDS should be sufficient.

I followed your description at https://wiki.clusterfs.com/lustre/LustreFailover (1.4.1 shared Resource), but I don't understand your definition of "the client upcall":

lconf --node nodeB --select ost1=nodeB <config.xml>
lconf --recover --select ost1=nodeB --target_uuid $2 --client_uuid $3 --conn_uuid $4 <config.xml>

Where can I find more information about --target_uuid, --client_uuid and --conn_uuid?

When I tried the above command, I unfortunately got the same error message as described by tad_lake@hotmail.com in https://lists.clusterfs.com/pipermail/lustre-discuss/2004-October/000478.html

cccbaseb:~ # lconf -v --recover --select ost1=cccbaseb --tgt_uuid ost1_UUID --conn_uuid cccbasea_UUID --client_uuid cccbasea_UUID jan.xml
configuring for host: ['cccbaseb', 'localhost']
add_local NET_cccbaseb_tcp_UUID
find_local_routes: []
Reconnecting ost1_UUID to NID_cccbaseb_UUID
+ /usr/sbin/lctl add_uuid NID_cccbaseb_UUID cccbaseb tcp
+ /usr/sbin/lctl network tcp send_mem 8388608 recv_mem 8388608 add_autoconn cccbaseb cccbaseb 988 s quit
+ /usr/sbin/lctl device $cccbasea_UUID recover NID_cccbaseb_UUID
! /usr/sbin/lctl (255):

Can someone, or tad_lake@hotmail.com himself, give me a hint? That would be nice.

Regards,
Jan