Dear list,
Our Lustre clients sometimes freeze for some reason. To avoid rebooting
these client nodes, I tried to unmount and remount Lustre. At first I ran
"umount -f", which reported "device is busy", so I then ran "umount -l"
followed by "mount ....". Both commands succeeded and users could see
Lustre again. However, there were now two sets of Lustre client devices:
[root@lxslc09 /]# lctl dl
0 UP mgc MGC192.168.50.32@tcp dee1300b-f204-21c5-7f3c-1be448231e6b 5
1 UP lov besfs-clilov-f7de1a00 f395e597-800a-7595-ea59-0f78f10bd078 4
2 UP mdc besfs-MDT0000-mdc-f7de1a00 f395e597-800a-7595-ea59-0f78f10bd078 5
3 IN osc besfs-OST0000-osc-f7de1a00 f395e597-800a-7595-ea59-0f78f10bd078 5
4 IN osc besfs-OST0001-osc-f7de1a00 f395e597-800a-7595-ea59-0f78f10bd078 5
5 IN osc besfs-OST0002-osc-f7de1a00 f395e597-800a-7595-ea59-0f78f10bd078 5
6 IN osc besfs-OST0003-osc-f7de1a00 f395e597-800a-7595-ea59-0f78f10bd078 5
7 IN osc besfs-OST0004-osc-f7de1a00 f395e597-800a-7595-ea59-0f78f10bd078 5
8 IN osc besfs-OST0005-osc-f7de1a00 f395e597-800a-7595-ea59-0f78f10bd078 5
9 IN osc besfs-OST0006-osc-f7de1a00 f395e597-800a-7595-ea59-0f78f10bd078 5
10 IN osc besfs-OST0007-osc-f7de1a00 f395e597-800a-7595-ea59-0f78f10bd078 5
11 IN osc besfs-OST0008-osc-f7de1a00 f395e597-800a-7595-ea59-0f78f10bd078 5
12 IN osc besfs-OST0009-osc-f7de1a00 f395e597-800a-7595-ea59-0f78f10bd078 5
13 IN osc besfs-OST000a-osc-f7de1a00 f395e597-800a-7595-ea59-0f78f10bd078 5
14 IN osc besfs-OST000b-osc-f7de1a00 f395e597-800a-7595-ea59-0f78f10bd078 5
15 UP lov besfs-clilov-f7e4e800 6896395d-d4c4-ed52-8d93-ffba67b55c3c 4
16 UP mdc besfs-MDT0000-mdc-f7e4e800 6896395d-d4c4-ed52-8d93-ffba67b55c3c 5
17 IN osc besfs-OST0000-osc-f7e4e800 6896395d-d4c4-ed52-8d93-ffba67b55c3c 5
18 IN osc besfs-OST0001-osc-f7e4e800 6896395d-d4c4-ed52-8d93-ffba67b55c3c 5
19 IN osc besfs-OST0002-osc-f7e4e800 6896395d-d4c4-ed52-8d93-ffba67b55c3c 5
20 IN osc besfs-OST0003-osc-f7e4e800 6896395d-d4c4-ed52-8d93-ffba67b55c3c 5
21 IN osc besfs-OST0004-osc-f7e4e800 6896395d-d4c4-ed52-8d93-ffba67b55c3c 5
22 IN osc besfs-OST0005-osc-f7e4e800 6896395d-d4c4-ed52-8d93-ffba67b55c3c 5
23 IN osc besfs-OST0006-osc-f7e4e800 6896395d-d4c4-ed52-8d93-ffba67b55c3c 5
24 IN osc besfs-OST0007-osc-f7e4e800 6896395d-d4c4-ed52-8d93-ffba67b55c3c 5
25 IN osc besfs-OST0008-osc-f7e4e800 6896395d-d4c4-ed52-8d93-ffba67b55c3c 5
26 IN osc besfs-OST0009-osc-f7e4e800 6896395d-d4c4-ed52-8d93-ffba67b55c3c 5
27 IN osc besfs-OST000a-osc-f7e4e800 6896395d-d4c4-ed52-8d93-ffba67b55c3c 5
28 IN osc besfs-OST000b-osc-f7e4e800 6896395d-d4c4-ed52-8d93-ffba67b55c3c 5
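A minimal Python sketch (not part of any Lustre tool) for counting how many sets of client devices appear in a listing like the one above. The function name client_instances is hypothetical. It assumes that the trailing suffix of each lov/mdc/osc device name (f7de1a00 and f7e4e800 here) identifies one mount instance, which is what the output suggests but is not confirmed.

```python
from collections import defaultdict


def client_instances(lctl_dl_output):
    """Group lov/mdc/osc devices by the trailing suffix of their names.

    Assumption: the suffix after the last "-" identifies one mount instance.
    """
    groups = defaultdict(list)
    for line in lctl_dl_output.splitlines():
        fields = line.split()
        # Expected columns: index, state, type, name, uuid, refcount
        if len(fields) < 6 or fields[2] not in ("lov", "mdc", "osc"):
            continue
        state, name = fields[1], fields[3]
        suffix = name.rsplit("-", 1)[-1]
        groups[suffix].append((state, name))
    return dict(groups)


# Short excerpt of the listing above, used as sample input.
sample = """\
1 UP lov besfs-clilov-f7de1a00 f395e597-800a-7595-ea59-0f78f10bd078 4
3 IN osc besfs-OST0000-osc-f7de1a00 f395e597-800a-7595-ea59-0f78f10bd078 5
15 UP lov besfs-clilov-f7e4e800 6896395d-d4c4-ed52-8d93-ffba67b55c3c 4
17 IN osc besfs-OST0000-osc-f7e4e800 6896395d-d4c4-ed52-8d93-ffba67b55c3c 5
"""

for suffix, devices in client_instances(sample).items():
    in_state = sum(1 for state, _ in devices if state == "IN")
    print(suffix, len(devices), "devices,", in_state, "in state IN")
```

More than one key in the returned dictionary means more than one set of client devices is still registered on the node.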
The clients would soon freeze again. When I repeated "umount -l; mount ..."
many times, the MDS server logged messages like:
Dec 21 04:16:57 mds01 kernel: Lustre:
23784:0:(ldlm_lib.c:760:target_handle_connect()) besfs-MDT0000: refuse
reconnection from ca8973af-0055-b204-13d6-ba05e74f6b01@192.168.52.91@tcp to
0xde916000; still busy with 3 active RPCs
My question is: how do I unmount Lustre cleanly on a client? I think that if
Lustre is unmounted cleanly, the use count of the "lustre" module should be 0.
In my case, it is 1:
[root@lxslc09 ~]# lsmod | grep lustre
lustre 643548 2
lov 414696 3 lustre
mdc 144900 3 lustre
osc 224680 25 lustre
ptlrpc 971188 6 mgc,lustre,lov,mdc,lquota,osc
obdclass 677592 9 mgc,lustre,lov,mdc,lquota,osc,ptlrpc
lnet 267292 4 lustre,ksocklnd,ptlrpc,obdclass
lvfs 89208 8 mgc,lustre,lov,mdc,lquota,osc,ptlrpc,obdclass
libcfs 132044 11 mgc,lustre,lov,mdc,lquota,osc,ksocklnd,ptlrpc,obdclass,lnet,lvfs
[root@lxslc09 ~]# umount -l /besfs
[root@lxslc09 ~]# lsmod | grep lustre
lustre 643548 1
lov 414696 2 lustre
mdc 144900 2 lustre
osc 224680 13 lustre
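A minimal Python sketch of the check described above: read the use count (third column of lsmod output) for a module and compare it before and after the unmount. The function name module_use_count is hypothetical, and the sample text is copied from the two listings above.

```python
def module_use_count(lsmod_output, module="lustre"):
    """Return the use count (third lsmod column) for module, or None if absent."""
    for line in lsmod_output.splitlines():
        fields = line.split()
        # Expected columns: module name, size, use count, [used-by list]
        if len(fields) >= 3 and fields[0] == module and fields[2].isdigit():
            return int(fields[2])
    return None


before = """\
lustre 643548 2
lov 414696 3 lustre
"""
after = """\
lustre 643548 1
lov 414696 2 lustre
"""

print("before umount -l:", module_use_count(before))
print("after umount -l:", module_use_count(after))
```

A count that stays above 0 after the unmount matches the situation described here, where something still holds the module.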
Any ideas?
Lu Wang
Institute of High Energy Physics, China
2008-12-23