Dear list,
I am a Lustre newbie and would like to know where I can find more
information about failover (node A goes down, node B comes up and
takes over).
The config consists of two nodes, a and b.
--------
#!/bin/sh
# creating 2 nodes
# 1st node
lmc -o jan.xml --add node --node cccbasea
lmc -m jan.xml --add net --node cccbasea --nid cccbasea --nettype tcp
# 2nd node
lmc -m jan.xml --add node --node cccbaseb
lmc -m jan.xml --add net --node cccbaseb --nid cccbaseb --nettype tcp
# creating 1 MDS
lmc -m jan.xml --add mds --node cccbasea --mds mds1 --fstype ldiskfs \
    --dev /dev/sdb1
# creating OST failover
lmc -m jan.xml --add ost --node cccbasea --ost ost1 --failover \
    --fstype ldiskfs --dev /dev/sdb2
lmc -m jan.xml --add ost --node cccbaseb --ost ost1 --failover \
    --fstype ldiskfs --dev /dev/sdb2
# client config
lmc -m jan.xml --add mtpt --node cccbasea --path /mnt/lustre --mds mds1 \
    --ost ost1
-------
/dev/sdb2 is an iSCSI device. Mounting the shared drive (/dev/sdb2)
simultaneously from node a and node b works perfectly, but I don't know
how to realize the failover scenario, i.e. node A goes down and node B
takes over flawlessly, ideally with the same metadata and the same
"user data".
Newbie question: do I also need a second MDS, or is one sufficient?
IMHO a single MDS should be sufficient.
I followed your description at
https://wiki.clusterfs.com/lustre/LustreFailover
1.4.1 shared Resource
but I don't understand your definition of "the client upcall":
lconf --node nodeB --select ost1=nodeB <config.xml>
lconf --recover --select ost1=nodeB --target_uuid $2 --client_uuid $3 \
    --conn_uuid $4 <config.xml>
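My current reading, which may well be wrong, is that $2, $3 and $4 are the
positional parameters of a script that gets called on the client. Below is a
minimal sketch of such a wrapper under that assumption. The function name
build_recover_cmd, the variables CONFIG and FAILOVER_NODE, and the meaning
of the first argument are my own guesses and are not taken from the wiki. The
sketch only prints the command instead of running it, and it uses --tgt_uuid
as in my own run further down, whereas the wiki writes --target_uuid.

```shell
#!/bin/sh
# Hypothetical upcall wrapper (dry run): prints the lconf command, does not run it.
# Assumption: the script is called as <event> <target_uuid> <client_uuid> <conn_uuid>,
# which is only inferred from the wiki's use of $2, $3 and $4.
CONFIG=${CONFIG:-jan.xml}                  # placeholder: my config file
FAILOVER_NODE=${FAILOVER_NODE:-cccbaseb}   # placeholder: node that takes over ost1

build_recover_cmd() {
    # $1 = event name (unused here), $2 = target UUID,
    # $3 = client UUID, $4 = connection UUID
    if [ "$#" -ne 4 ]; then
        echo "usage: <event> <target_uuid> <client_uuid> <conn_uuid>" >&2
        return 1
    fi
    echo "lconf --recover --select ost1=$FAILOVER_NODE" \
         "--tgt_uuid $2 --client_uuid $3 --conn_uuid $4 $CONFIG"
}
```

Calling build_recover_cmd with an event name and three UUIDs prints the
lconf --recover command line that the wrapper would run; with any other
number of arguments it prints a usage message and returns 1.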
But where can I find more information about
--target_uuid
--client_uuid
--conn_uuid
When I tried the lconf --recover command, I unfortunately got the same
error message as described by tad_lake@hotmail.com in
https://lists.clusterfs.com/pipermail/lustre-discuss/2004-October/000478.html
cccbaseb:~ # lconf -v --recover --select ost1=cccbaseb --tgt_uuid
ost1_UUID --conn_uuid cccbasea_UUID --client_uuid cccbasea_UUID jan.xml
configuring for host: ['cccbaseb', 'localhost']
add_local NET_cccbaseb_tcp_UUID
find_local_routes: []
Reconnecting ost1_UUID to NID_cccbaseb_UUID
+ /usr/sbin/lctl
add_uuid NID_cccbaseb_UUID cccbaseb tcp
+ /usr/sbin/lctl
network tcp
send_mem 8388608
recv_mem 8388608
add_autoconn cccbaseb cccbaseb 988 s
quit
+ /usr/sbin/lctl
device $cccbasea_UUID
recover NID_cccbaseb_UUID
! /usr/sbin/lctl (255):
Can someone, or tad_lake@hotmail.com himself, give me a hint? That would
be nice.
regards,
Jan