Hi,

I've read everything that I can find about Lustre failover, and I'm still having trouble getting it to work for my MDS. If anyone can spare a few minutes to read this email and suggest what I'm doing wrong, I'd really appreciate it. My boss is on my case to get this working already!

I have 4 nodes:

  roger-ha-1    -- Active MDS
  roger-ha-2    -- Standby MDS
  blade-lustre2 -- OST
  blade-lustre0 -- Client

I use the following script to generate my .xml file:

  lmc -m failoverLustre.xml --add net --node roger-ha-1 --nid roger-ha-1 --nettype tcp
  lmc -m failoverLustre.xml --add net --node roger-ha-2 --nid roger-ha-2 --nettype tcp
  lmc -m failoverLustre.xml --add net --node blade-lustre2 --nid blade-lustre2 --nettype tcp
  lmc -m failoverLustre.xml --add net --node client --nid * --nettype tcp
  lmc -m failoverLustre.xml --add mds --node roger-ha-1 --mds ha-mds --fstype ext3 --dev /dev/md1 --failover
  lmc -m failoverLustre.xml --add mds --node roger-ha-2 --mds ha-mds --fstype ext3 --dev /dev/md1 --failover
  lmc -m failoverLustre.xml --add lov --lov lov-ts --mds ha-mds --stripe_sz 1048576 --stripe_cnt 0 --stripe_pattern 0
  lmc -m failoverLustre.xml --add ost --node blade-lustre2 --lov lov-ts --ost ost1-ts --fstype ext3 --dev /dev/sda1
  lmc -m failoverLustre.xml --add mtpt --node client --path /mnt/lustre --mds ha-mds --lov lov-ts

I do the following, in this order:

1. I bring up Lustre on the OST using:

  lconf -v --reformat --upcall /root/roger/upcall --timeout 30 --node blade-lustre2 failoverLustre.xml

2. I bring up Lustre on the active MDS using:

  lconf -v --reformat --timeout 30 --node roger-ha-1 failoverLustre.xml

3. I bring up Lustre on the client (blade-lustre0) using:

  lconf -v --upcall /root/roger/upcall --timeout 30 --node client failoverLustre.xml

4. I create a few files on the client.

5. I manually halt roger-ha-1, move the disks to roger-ha-2 (the standby MDS), and start Lustre there using:

  lconf -v --reformat --force --select mds=roger-ha-2 --timeout 30 --node roger-ha-2 failoverLustre.xml

I believe that the first of my problems occurs here. Far fewer modules are loaded on roger-ha-2 than were originally loaded on roger-ha-1, and I get the message:

  ha-mds_UUID not active

6. Next, I go back to the client, expecting that my upcall would have been called. I have a very simple upcall script, namely:

  echo `date` $0 "$@" >> /root/roger/upcall.log

However, as the file /root/roger/upcall.log is never created, it is clear that the upcall is not being called (my second problem).

7. So, I look at the /var/log/messages file, and I see that Lustre is trying to call my upcall. I see messages like this:

  LustreError: 3094:0:(client.c:951:ptlrpc_expire_one_request()) @@@ timeout (sent at 1155152056, 30s ago) req@000001007e59d400 x188/t0 o400->ha-mds_UUID@roger-ha-1_UUID:12 lens 64/64 ref 1 fl Rpc:N/0/0 rc 0/0
  Lustre: A connection with 10.200.1.251 timed out; the network or that node may be down.
  LustreError: 3071:0:(socklnd_cb.c:1981:ksocknal_check_peer_timeouts()) Timeout out conn->12345-10.200.1.251@tcp ip 10.200.1.251:988
  Lustre: 3071:0:(router.c:184:lnet_notify()) Upcall: NID 10.200.1.251@tcp is dead
  Lustre: 4:0:(linux-debug.c:96:libcfs_run_upcall()) Invoked portals upcall /root/roger/upcall ROUTER_NOTIFY,10.200.1.251@tcp,down,1155152029
  Lustre: 3073:0:(recover.c:117:ptlrpc_run_failed_import_upcall()) Invoked upcall /root/roger/upcall FAILED_IMPORT ha-mds_UUID MDC_blade-lustre0_ha-mds_MNT_client roger-ha-1_UUID 303a6_MNT_client_b06099ff7d
  LustreError: 3094:0:(client.c:951:ptlrpc_expire_one_request()) @@@ timeout (sent at 1155152063, 31s ago) req@000001007fbd0c00 x190/t0 o400->ha-mds_UUID@roger-ha-1_UUID:12 lens 64/64 ref 1 fl Rpc:N/0/0 rc 0/0
  Lustre: 3073:0:(recover.c:117:ptlrpc_run_failed_import_upcall()) Invoked upcall /root/roger/upcall FAILED_IMPORT ha-mds_UUID MDC_blade-lustre0_ha-mds_MNT_client roger-ha-1_UUID 303a6_MNT_client_b06099ff7d
  Lustre: 3073:0:(recover.c:117:ptlrpc_run_failed_import_upcall()) previously skipped 9 similar messages

I don't know why it can't find my upcall. The file clearly exists and is executable, e.g.:

  blade-lustre0:~/roger# ls -l upcall
  -rwxr-xr-x 1 root root 46 2006-08-09 09:34 upcall
  blade-lustre0:~/roger# ./upcall testing
  blade-lustre0:~/roger# cat upcall.log
  Wed Aug 9 15:55:52 EDT 2006 ./upcall testing
  blade-lustre0:~/roger#

8. So, I enter the following lconf command by hand on blade-lustre0 (the client):

  lconf --node blade-lustre0 --recover --select mds=10.200.1.252 --tgt_uuid ha-mds_UUID --client_uuid MDC_blade-lustre0_ha-mds_MNT_client --conn_uuid roger-ha-1_UUID failoverLustre.xml

and I get the error message:

  No host entry found.

I don't know what causes this. I've even added entries to the /etc/hosts file to help resolve things, but that hasn't helped. Note that 10.200.1.252 is roger-ha-2. I originally said mds=roger-ha-2, but I changed it to the IP address when I started getting those error messages.

If you can help me resolve these problems, I'd really appreciate it.

Thanks,
Roger
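[A side note on the upcall, offered as a guess rather than something the logs above prove: the kernel execs the upcall file directly, so unlike an interactive shell (which falls back to /bin/sh on ENOEXEC) it needs an explicit interpreter line. It may be worth double-checking that the script starts with one; a minimal sketch of the same one-line logger with a shebang added:]

  #!/bin/sh
  # Same logger as in step 6, with an interpreter line so the kernel's
  # usermode helper can exec the file directly.
  echo `date` $0 "$@" >> /root/roger/upcall.log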
On Thu, 10 Aug 2006, RS RS wrote:

> 5. I manually halt roger-ha-1, move the disks to roger-ha-2 (the standby
> MDS), and start Lustre there using:
>
>   lconf -v --reformat --force --select mds=roger-ha-2 --timeout 30 --node roger-ha-2 failoverLustre.xml

Are you sure you want to *reformat* your MDT when moving it to the standby MDS?

> 8. So, I enter the following lconf command by hand on blade-lustre0 (the client):
>
>   lconf --node blade-lustre0 --recover --select mds=10.200.1.252 --tgt_uuid ha-mds_UUID --client_uuid MDC_blade-lustre0_ha-mds_MNT_client --conn_uuid roger-ha-1_UUID failoverLustre.xml
>
> and I get the error message:
>
>   No host entry found.

IIRC this error is caused by lconf not finding, in the XML, a node description matching the supplied node name (here 'blade-lustre0', otherwise the hostname).

-- 
Jean-Marc Saffroy - jean-marc.saffroy@ext.bull.net
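[In other words, the failover start in step 5 would keep the existing MDT instead of wiping it. A sketch of that step with --reformat dropped and the other options left as in the original command:]

  # On roger-ha-2, after the shared /dev/md1 has been moved over.
  # No --reformat here, so the metadata written while roger-ha-1 was
  # active is preserved for recovery.
  lconf -v --force --select mds=roger-ha-2 --timeout 30 --node roger-ha-2 failoverLustre.xml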
> Are you sure you want to *reformat* your MDT when moving it to the
> standby MDS?

Good point. I'll fix this.

> > No host entry found.
>
> IIRC this error is caused by lconf not finding, in the XML, a node
> description matching the supplied node name (here 'blade-lustre0',
> otherwise the hostname).

At Jean-Marc's suggestion, I changed my lmc commands from:

  lmc ... --add net --node client --nid * ...

to:

  lmc ... --add net --node blade-lustre0 --nid blade-lustre0 ...

and from:

  lmc ... --add mtpt --node client ...

to:

  lmc ... --add mtpt --node blade-lustre0 ...

(Does this mean that I'll have to list every client in that file?)

Then, on the client, I run:

  lconf --recover --node blade-lustre0 --tgt_uuid ha-mds_UUID --client_uuid MDC_blade-lustre0_ha-mds_MNT_client --conn_uuid roger-ha-1_UUID /home/roger/lustreConfigs/failoverLustre/failoverLustre.xml

I get the following error from lconf:

  Traceback (most recent call last):
    File "/usr/sbin/lconf", line 2827, in ?
      main()
    File "/usr/sbin/lconf", line 2820, in main
      doHost(lustreDB, node_list)
    File "/usr/sbin/lconf", line 2218, in doHost
      config.conn_uuid)
    File "/usr/sbin/lconf", line 2419, in doRecovery
      srv_list = find_local_servers(get_ost_net(lustreDB, new_uuid))
  NameError: global name 'find_local_servers' is not defined

As far as I can tell, this function does not exist in /usr/sbin/lconf. Should I be getting it from somewhere else?

-Roger
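[A quick way to confirm that, assuming /usr/sbin/lconf is the plain Python script the traceback points at:]

  # Look for a definition of the missing helper anywhere in the script.
  grep -n "def find_local_servers" /usr/sbin/lconf
  # Show the call site the traceback reports (around line 2419).
  sed -n '2414,2424p' /usr/sbin/lconf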
RS RS wrote:

>> Are you sure you want to *reformat* your MDT when moving it to the
>> standby MDS?
>
> Good point. I'll fix this.

The --reformats will erase your disks. Get rid of them all.

>>> No host entry found.
>>
>> IIRC this error is caused by lconf not finding, in the XML, a node
>> description matching the supplied node name (here 'blade-lustre0',
>> otherwise the hostname).
>
> At Jean-Marc's suggestion, I changed my lmc commands from:
>
>   lmc ... --add net --node client --nid * ...
>
> to:
>
>   lmc ... --add net --node blade-lustre0 --nid blade-lustre0 ...
>
> and from:
>
>   lmc ... --add mtpt --node client ...
>
> to:
>
>   lmc ... --add mtpt --node blade-lustre0 ...
>
> (Does this mean that I'll have to list every client in that file?)

No - you should use the *, and specify --node client instead of --node blade-lustre0.

> Then, on the client, I run:
>
>   lconf --recover --node blade-lustre0 --tgt_uuid ha-mds_UUID --client_uuid MDC_blade-lustre0_ha-mds_MNT_client --conn_uuid roger-ha-1_UUID /home/roger/lustreConfigs/failoverLustre/failoverLustre.xml
>
> I get the following error from lconf:
>
>   NameError: global name 'find_local_servers' is not defined

The recovery won't work since you reformatted your MDS. You should also not specify your own recovery upcalls and leave the defaults; the client will automatically attempt to reconnect to the failover server. You shouldn't need to run a lconf --recover.

You can make sure your MDS is running with cat /proc/fs/lustre/devices - that should look the same on the primary and on the failover.
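[Following that advice, the generic client entries from the original script stay as they were, and the client keeps being brought up with --node client. A sketch, with the * quoted only so the shell does not expand it:]

  lmc -m failoverLustre.xml --add net --node client --nid '*' --nettype tcp
  lmc -m failoverLustre.xml --add mtpt --node client --path /mnt/lustre --mds ha-mds --lov lov-ts

  # Client bring-up, as in step 3 of the original procedure, but without
  # the custom --upcall, per the advice above:
  lconf -v --timeout 30 --node client failoverLustre.xml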
Eureka! It worked. I successfully failed over my MDS. Thanks to everyone who offered advice.

Nathaniel wrote:

> The recovery won't work since you reformatted your MDS. You should also
> not specify your own recovery upcalls and leave the defaults; the client
> will automatically attempt to reconnect to the failover server. You
> shouldn't need to run a lconf --recover.

I took both your suggestions. Thanks again.

By the way, the only reason I even considered using the upcall is because it suggests doing so in the manual (section 6.4.2). To quote:

  For example, one way to manage the current active node is to save the
  node name in a shared location that is accessible to the client upcall.
  When the upcall runs, it determines which service has failed, and looks
  up for the current active node to make the file system available. The
  current node and the upcall parameters are then passed to lconf in
  order to complete the recovery.
  . . .
  For example:

    $upcall FAILED_IMPORT ost1_UUID OSC_localhost_ost4_MNT_localhost NET_uml2_UUID ff151_lov1_6d1fce3b45
    lconf --recover --select ost1=nodeB --target_uuid $2 --client_uuid $3 --conn_uuid $4 <config.xml>

(I've CCed doc-request, with the hope that this can be fixed up in a future update.)

-Roger
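[For the archives: if someone does want an upcall along the lines of that manual example, here is a rough sketch adapted to the names used in this thread. The ha-mds=roger-ha-2 selection follows the manual's ost1=nodeB pattern, --tgt_uuid is the spelling used in the commands earlier in the thread, and the config path is an assumption; treat the whole thing as untested.]

  #!/bin/sh
  # Invoked by Lustre as:
  #   $0 FAILED_IMPORT <target_uuid> <client_uuid> <conn_uuid> <extra>
  echo `date` $0 "$@" >> /root/roger/upcall.log
  if [ "$1" = "FAILED_IMPORT" ]; then
      # Point the failed MDS service at the standby node and pass the
      # UUID arguments through, mirroring the manual's example.
      lconf --recover --select ha-mds=roger-ha-2 --tgt_uuid "$2" \
            --client_uuid "$3" --conn_uuid "$4" /root/roger/failoverLustre.xml
  fi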