Here are the notes from my attempt to move the MDT from pool4 to lustre3.
Any ideas on why the MDT transfer didn't succeed, and why all the OSTs
were marked inactive?
Thanks,
Ron
The following contains the commands we executed and the associated log
messages.
Note:
/dev/sda4 on pool4 is a hardware RAID 1 device.
/dev/mapper/lustrevol-lustrelv on lustre3 is a logical volume in the volume
group lustrevol, whose LVM2 physical volume is the software RAID device
/dev/md3.
[root@lustre3 ~]# pvdisplay
--- Physical volume ---
PV Name /dev/md3
VG Name lustrevol
PV Size 592.86 GB / not usable 1.00 MB
Allocatable yes
PE Size (KByte) 4096
Total PE 151772
Free PE 113372
Allocated PE 38400
PV UUID KRY0HY-BhjD-l8qR-14Qw-cQ1T-NNHT-LXT3Bs
[root@lustre3 ~]# vgdisplay
--- Volume group ---
VG Name lustrevol
System ID
Format lvm2
Metadata Areas 1
Metadata Sequence No 2
VG Access read/write
VG Status resizable
MAX LV 0
Cur LV 1
Open LV 0
Max PV 0
Cur PV 1
Act PV 1
VG Size 592.86 GB
PE Size 4.00 MB
Total PE 151772
Alloc PE / Size 38400 / 150.00 GB
Free PE / Size 113372 / 442.86 GB
VG UUID XFvbDk-Ukfg-fTcQ-XcNp-rX0f-hYTl-XbkGKs
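As a sanity check on the numbers above: each physical extent is 4096 KB, so the sizes follow directly from the extent counts. A minimal sketch of that arithmetic (pe_to_gb is a hypothetical helper name, not an LVM tool):

```shell
# Convert a physical-extent count to GB, given the extent size in KB.
# pe_to_gb is a made-up helper for illustration only.
pe_to_gb() {
    awk -v n="$1" -v kb="$2" 'BEGIN { printf "%.2f\n", n * kb / 1024 / 1024 }'
}

# Example calls with the values reported by pvdisplay/vgdisplay:
#   pe_to_gb 38400 4096     (allocated extents)
#   pe_to_gb 113372 4096    (free extents)
#   pe_to_gb 151772 4096    (total extents)
```

The three results should line up with the Alloc, Free and VG Size figures shown by vgdisplay.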
[root@lqcd-pool4 ~]# mount -t ldiskfs /dev/sda4 /mnt/mdt
[root@lqcd-pool4 ~]# cd /mnt/mdt/
[root@lqcd-pool4 mdt]# getfattr -R -d -m '.*' -P . > /root/ea.bak
[root@lqcd-pool4 mdt]# /usr/bin/rcp /root/ea.bak lustre3:/root/ea.bak
[root@lustre3 ~]# mkfs.lustre --fsname=lustre --mdt --mgs --param lov.stripecount=1 --mkfsoptions="-m 0" --reformat /dev/mapper/lustrevol-lustrelv
[root@lustre3 ~]# mount -t ldiskfs /dev/mapper/lustrevol-lustrelv /mnt/mdt
[root@lqcd-pool4 mdt]# export RSYNC_RSH=/usr/bin/rsh
[root@lqcd-pool4 mdt]# rsync -aSvz --ignore-existing --ignore-times /mnt/mdt/ lustre3:/mnt/mdt > /tmp/rsync.log 2>&1
[root@lustre3 ~]# cd /mnt/mdt
[root@lustre3 mdt]# setfattr --restore=/root/ea.bak
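We did not verify the restored attributes at this point. For anyone repeating these steps, a rough check would be to take a second getfattr dump on the new MDT and compare how many files each dump covers. The sketch below assumes that second dump was written to /root/ea.new (a hypothetical path) using the same getfattr options as above; the function names are made up for illustration. Matching counts do not prove the attribute values are identical, only that no files were skipped.

```shell
# Count the per-file records in a getfattr dump; each record begins
# with a "# file: <path>" header line.
ea_file_count() {
    grep -c '^# file: ' "$1"
}

# Succeed only if both dumps describe the same number of files.
same_ea_file_count() {
    [ "$(ea_file_count "$1")" -eq "$(ea_file_count "$2")" ]
}

# Hypothetical usage on lustre3, from /mnt/mdt:
#   getfattr -R -d -m '.*' -P . > /root/ea.new
#   same_ea_file_count /root/ea.bak /root/ea.new || echo "dumps differ"
```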
The following command was executed on all 24 OSTs:
tunefs.lustre --erase-param --mgsnode=lustre3 --writeconf /dev/sde1
[root@lustre3 ~]# mount -t lustre /dev/mapper/lustrevol-lustrelv /mnt/mdt
mount.lustre: mount /dev/mapper/lustrevol-lustrelv at /mnt/mdt failed:
Address already in use
The target service's index is already in use.
(/dev/mapper/lustrevol-lustrelv)
Oct 6 16:39:28 lustre3 kernel: kjournald starting. Commit interval 5
seconds
Oct 6 16:39:28 lustre3 kernel: LDISKFS FS on dm-0, internal journal
Oct 6 16:39:28 lustre3 kernel: LDISKFS-fs: mounted filesystem with ordered
data mode.
Oct 6 16:39:28 lustre3 kernel: kjournald starting. Commit interval 5
seconds
Oct 6 16:39:28 lustre3 kernel: LDISKFS FS on dm-0, internal journal
Oct 6 16:39:28 lustre3 kernel: LDISKFS-fs: mounted filesystem with ordered
data mode.
Oct 6 16:39:28 lustre3 kernel: Lustre: MGS MGS started
Oct 6 16:39:28 lustre3 kernel: LustreError: 13e-c: MDT index must = 0
(until Clustered MetaData feature is ready.)
Oct 6 16:39:28 lustre3 kernel: LustreError: 140-5: Server lustre-MDTffff
requested index 0, but that index is already in use
Oct 6 16:39:28 lustre3 kernel: LustreError:
5026:0:(mgs_llog.c:1672:mgs_write_log_target()) Can't get index (-98)
Oct 6 16:39:28 lustre3 kernel: LustreError:
5026:0:(mgs_handler.c:431:mgs_handle_target_reg()) Failed to write
lustre-MDTffff log (-98)
Oct 6 16:39:29 lustre3 kernel: LustreError:
5026:0:(mgs_handler.c:625:mgs_handle()) MGS handle cmd=253 rc=-98
Oct 6 16:39:29 lustre3 kernel: LustreError:
5026:0:(ldlm_lib.c:1536:target_send_reply_msg()) @@@ processing error (-98)
req@ffff8102152b3450 x117/t0
o253->fb445385-7ef9-62f3-1e58-db8f8dc29917@NET_0x9000000000000_UUID:0/0
lens 4672/4672 e 0 to 0 dl 1223329268 ref 1 fl Interpret:/0/0 rc 0/0
Oct 6 16:39:29 lustre3 kernel: LustreError: 11-0: an error occurred while
communicating with 0@lo. The mgs_target_reg operation failed with -98
Oct 6 16:39:29 lustre3 kernel: LustreError:
4964:0:(obd_mount.c:1062:server_start_targets()) Required registration
failed for lustre-MDTffff: -98
Oct 6 16:39:29 lustre3 kernel: LustreError:
4964:0:(obd_mount.c:1597:server_fill_super()) Unable to start targets: -98
Oct 6 16:39:29 lustre3 kernel: LustreError:
4964:0:(obd_mount.c:1382:server_put_super()) no obd lustre-MDTffff
Oct 6 16:39:29 lustre3 kernel: LustreError:
4964:0:(obd_mount.c:119:server_deregister_mount()) lustre-MDTffff not
registered
Oct 6 16:39:29 lustre3 kernel: Lustre: MGS has stopped.
Oct 6 16:39:29 lustre3 kernel: Lustre: server umount lustre-MDTffff
complete
Oct 6 16:39:29 lustre3 kernel: LustreError:
4964:0:(obd_mount.c:1951:lustre_fill_super()) Unable to mount (-98)
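For readers following the logs: the negative numbers in these messages are Linux errno values, and -98 is EADDRINUSE, which is where the "Address already in use" text from mount.lustre comes from. A small lookup sketch covering only the two codes that appear in this post (rc_name is a hypothetical helper, not a Lustre tool):

```shell
# Map a (possibly negated) Linux errno value to its symbolic name.
# Only the codes seen in these logs are listed; rc_name is a made-up
# helper name for illustration.
rc_name() {
    case "${1#-}" in
        2)  echo "ENOENT (No such file or directory)" ;;
        98) echo "EADDRINUSE (Address already in use)" ;;
        *)  echo "unknown" ;;
    esac
}

# Example: rc_name -98
```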
[root@lustre3 ~]# tunefs.lustre --erase-params --mgs --mdt --writeconf /dev/lustrevol/lustrelv
checking for existing Lustre data: found CONFIGS/mountdata
Reading CONFIGS/mountdata
Read previous values:
Target: lustre-MDTffff
Index: unassigned
Lustre FS: lustre
Mount type: ldiskfs
Flags: 0x75
(MDT MGS needs_index first_time update )
Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
Parameters: lov.stripecount=1 mdt.group_upcall=/usr/sbin/l_getgroups
Permanent disk data:
Target: lustre-MDTffff
Index: unassigned
Lustre FS: lustre
Mount type: ldiskfs
Flags: 0x175
(MDT MGS needs_index first_time update writeconf )
Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
Parameters:
Writing CONFIGS/mountdata
[root@lustre3 ~]# mount -t lustre /dev/mapper/lustrevol-lustrelv /mnt/mdt
[root@lustre3 ~]# mount -v -t lustre /dev/sde1 /mnt/sata1-1-3
arg[0] = /sbin/mount.lustre
arg[1] = -v
arg[2] = -o
arg[3] = rw,noauto,_netdev
arg[4] = /dev/sde1
arg[5] = /mnt/sata1-1-3
source = /dev/sde1 (/dev/sde1), target = /mnt/sata1-1-3
options = rw,noauto,_netdev
mounting device /dev/sde1 at /mnt/sata1-1-3, flags=0
options=device=/dev/sde1
Oct 6 16:40:57 lustre3 kernel: kjournald starting. Commit interval 5
seconds
Oct 6 16:40:57 lustre3 kernel: LDISKFS FS on dm-0, internal journal
Oct 6 16:40:57 lustre3 kernel: LDISKFS-fs: mounted filesystem with ordered
data mode.
Oct 6 16:41:05 lustre3 kernel: kjournald starting. Commit interval 5
seconds
Oct 6 16:41:05 lustre3 kernel: LDISKFS FS on dm-0, internal journal
Oct 6 16:41:05 lustre3 kernel: LDISKFS-fs: mounted filesystem with ordered
data mode.
Oct 6 16:41:05 lustre3 kernel: kjournald starting. Commit interval 5
seconds
Oct 6 16:41:05 lustre3 kernel: LDISKFS FS on dm-0, internal journal
Oct 6 16:41:05 lustre3 kernel: LDISKFS-fs: mounted filesystem with ordered
data mode.
Oct 6 16:41:05 lustre3 kernel: Lustre: MGS MGS started
Oct 6 16:41:05 lustre3 kernel: Lustre: MGS: Logs for fs lustre were removed
by user request. All servers must be restarted in order to regenerate the
logs.
Oct 6 16:41:05 lustre3 kernel: Lustre: Enabling user_xattr
Oct 6 16:41:06 lustre3 kernel: LustreError:
5145:0:(fsfilt-ldiskfs.c:1283:fsfilt_ldiskfs_read_record()) can't read
block: 0
Oct 6 16:41:06 lustre3 kernel: Lustre: MDT lustre-MDT0000 now serving dev
(lustre-MDT0000/9b2d9c21-aeec-b2d2-4d55-b8e6d8a37b4a) with recovery enabled
Oct 6 16:41:06 lustre3 kernel: Lustre: Server lustre-MDT0000 on device
/dev/mapper/lustrevol-lustrelv has started
Oct 6 16:41:55 lustre3 kernel: kjournald starting. Commit interval 5
seconds
Oct 6 16:41:55 lustre3 kernel: LDISKFS FS on sde1, internal journal
Oct 6 16:41:55 lustre3 kernel: LDISKFS-fs: mounted filesystem with ordered
data mode.
Oct 6 16:41:55 lustre3 kernel: kjournald starting. Commit interval 5
seconds
Oct 6 16:41:55 lustre3 kernel: LDISKFS FS on sde1, internal journal
Oct 6 16:41:55 lustre3 kernel: LDISKFS-fs: mounted filesystem with ordered
data mode.
Oct 6 16:41:55 lustre3 kernel: LDISKFS-fs: file extents enabled
Oct 6 16:41:55 lustre3 kernel: LDISKFS-fs: mballoc enabled
Oct 6 16:41:55 lustre3 kernel: Lustre: MGS: Regenerating lustre-OST0012 log
by user request.
Oct 6 16:41:55 lustre3 kernel: Lustre: OST lustre-OST0012 now serving dev
(lustre-OST0012/32f8d0ff-18d9-b05e-491b-477b8558b745) with recovery enabled
Oct 6 16:41:55 lustre3 kernel: Lustre: Server lustre-OST0012 on device
/dev/sde1 has started
Oct 6 16:42:00 lustre3 kernel: Lustre:
5536:0:(quota_master.c:1576:mds_quota_recovery()) Not all osts are active,
abort quota recovery
Oct 6 16:42:00 lustre3 kernel: LustreError:
5539:0:(llog_lvfs.c:597:llog_lvfs_create()) error looking up logfile
0x28c8020:0x3c23cd5e: rc -2
Oct 6 16:42:00 lustre3 kernel: LustreError:
5539:0:(osc_request.c:3586:osc_llog_init()) failed LLOG_MDS_OST_ORIG_CTXT
Oct 6 16:42:00 lustre3 kernel: LustreError:
5539:0:(osc_request.c:3597:osc_llog_init()) osc 'lustre-OST0012-osc' tgt
'lustre-MDT0000' cnt 1 catid ffffc20000a3b240 rc=-2
Oct 6 16:42:00 lustre3 kernel: LustreError:
5539:0:(osc_request.c:3599:osc_llog_init()) logid 0x28c8020:0x3c23cd5e
Oct 6 16:42:00 lustre3 kernel: LustreError:
5539:0:(lov_log.c:214:lov_llog_init()) error osc_llog_init idx 18 osc
'lustre-OST0012-osc' tgt 'lustre-MDT0000' (rc=-2)
Oct 6 16:42:00 lustre3 kernel: LustreError:
5539:0:(mds_log.c:207:mds_llog_init()) lov_llog_init err -2
Oct 6 16:42:00 lustre3 kernel: LustreError:
5539:0:(llog_obd.c:394:llog_cat_initialize()) rc: -2
Oct 6 16:42:00 lustre3 kernel: LustreError:
5539:0:(mds_lov.c:855:__mds_lov_synchronize()) lustre-OST0012_UUID failed at
update_mds: -2
Oct 6 16:42:00 lustre3 kernel: LustreError:
5539:0:(mds_lov.c:898:__mds_lov_synchronize()) lustre-OST0012_UUID sync
failed -2, deactivating
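Note that the OST number in a target name is hexadecimal, which is why lustre-OST0012 shows up as "idx 18" in the lov_llog_init message above. A minimal sketch of the conversion (ost_index is a hypothetical helper name):

```shell
# Print the decimal index encoded in a Lustre OST target name such as
# lustre-OST0012, lustre-OST0012-osc or lustre-OST0012_UUID.
# ost_index is a made-up helper for illustration only.
ost_index() {
    hex=${1##*OST}                 # drop everything through "OST"
    hex=${hex%%[!0-9a-fA-F]*}      # drop any suffix such as "-osc" or "_UUID"
    echo $((0x$hex))               # hex digits to decimal
}

# Example: ost_index lustre-OST0012-osc
```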
[root@lustre3 ~]# lctl dl
0 UP mgs MGS MGS 11
1 UP mgc MGC192.168.241.243@tcp b2bcceae-de69-e1b3-d96f-2971bba2fdfc 5
2 UP mdt MDS MDS_uuid 3
3 UP lov lustre-mdtlov lustre-mdtlov_UUID 4
4 UP mds lustre-MDT0000 lustre-MDT0000_UUID 3
5 UP ost OSS OSS_uuid 3
6 UP obdfilter lustre-OST0012 lustre-OST0012_UUID 5
7 IN osc lustre-OST0012-osc lustre-mdtlov_UUID 5
8 UP obdfilter lustre-OST0013 lustre-OST0013_UUID 5
9 IN osc lustre-OST0013-osc lustre-mdtlov_UUID 5
10 UP obdfilter lustre-OST0014 lustre-OST0014_UUID 5
11 UP obdfilter lustre-OST0015 lustre-OST0015_UUID 5
12 IN osc lustre-OST0014-osc lustre-mdtlov_UUID 5
13 IN osc lustre-OST0015-osc lustre-mdtlov_UUID 5
14 UP obdfilter lustre-OST0016 lustre-OST0016_UUID 5
15 IN osc lustre-OST0016-osc lustre-mdtlov_UUID 5
16 UP obdfilter lustre-OST0017 lustre-OST0017_UUID 5
17 IN osc lustre-OST0017-osc lustre-mdtlov_UUID 5
18 IN osc lustre-OST000c-osc lustre-mdtlov_UUID 5
19 IN osc lustre-OST000d-osc lustre-mdtlov_UUID 5
20 IN osc lustre-OST000e-osc lustre-mdtlov_UUID 5
21 IN osc lustre-OST000f-osc lustre-mdtlov_UUID 5
22 IN osc lustre-OST0010-osc lustre-mdtlov_UUID 5
23 IN osc lustre-OST0011-osc lustre-mdtlov_UUID 5
24 IN osc lustre-OST0000-osc lustre-mdtlov_UUID 5
25 IN osc lustre-OST0001-osc lustre-mdtlov_UUID 5
26 IN osc lustre-OST0002-osc lustre-mdtlov_UUID 5
27 IN osc lustre-OST0003-osc lustre-mdtlov_UUID 5
28 IN osc lustre-OST0004-osc lustre-mdtlov_UUID 5
29 IN osc lustre-OST0005-osc lustre-mdtlov_UUID 5
30 IN osc lustre-OST0006-osc lustre-mdtlov_UUID 5
31 IN osc lustre-OST0007-osc lustre-mdtlov_UUID 5
32 IN osc lustre-OST0008-osc lustre-mdtlov_UUID 5
33 IN osc lustre-OST0009-osc lustre-mdtlov_UUID 5
34 IN osc lustre-OST000a-osc lustre-mdtlov_UUID 5
35 IN osc lustre-OST000b-osc lustre-mdtlov_UUID 5
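To tally the listing above rather than count by eye, the output can be filtered on the state and type columns. A minimal sketch, assuming the column layout shown here (device number, state, type, name, ...); count_inactive_osc is a hypothetical helper name:

```shell
# Count devices of type "osc" whose state column is "IN" (inactive),
# reading "lctl dl"-style output on standard input.
# count_inactive_osc is a made-up helper for illustration only.
count_inactive_osc() {
    awk '$2 == "IN" && $3 == "osc" { n++ } END { print n + 0 }'
}

# Hypothetical usage on the MDS:
#   lctl dl | count_inactive_osc
```

For the listing above this should report 24, that is, every OSC on the MDT's lov is inactive.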