Stu Midgley
2012-Mar-17 09:10 UTC
[Lustre-discuss] Fwd: can't mount our lustre filesystem after tunefs.lustre --writeconf
---------- Forwarded message ----------
From: Stu Midgley <sdm900 at gmail.com>
Date: Sat, Mar 17, 2012 at 5:10 PM
Subject: can't mount our lustre filesystem after tunefs.lustre --writeconf
To: wc-discuss at whamcloud.com

Afternoon

We have a rather severe problem with our lustre file system. We had a full config log and the advice was to rewrite it with a new one. So, we unmounted our lustre file system off all clients, unmounted all the ost's and then unmounted the mds. I then did

mds:
    tunefs.lustre --writeconf --erase-params /dev/md2

oss:
    tunefs.lustre --writeconf --erase-params --mgsnode=mds001 /dev/md2

After the tunefs.lustre on the mds I saw

    Mar 17 14:33:02 mds001 kernel: Lustre: MGS MGS started
    Mar 17 14:33:02 mds001 kernel: Lustre: MGC172.16.0.251@tcp: Reactivating import
    Mar 17 14:33:02 mds001 kernel: Lustre: MGS: Logs for fs p1 were removed by user request. All servers must be restarted in order to regenerate the logs.
    Mar 17 14:33:02 mds001 kernel: Lustre: Enabling user_xattr
    Mar 17 14:33:02 mds001 kernel: Lustre: p1-MDT0000: new disk, initializing
    Mar 17 14:33:02 mds001 kernel: Lustre: p1-MDT0000: Now serving p1-MDT0000 on /dev/md2 with recovery enabled

which scared me a little...

The mds and the oss's mount happily BUT I can't mount the file system on my clients. On the mds I see

    Mar 17 16:42:11 mds001 kernel: LustreError: 137-5: UUID 'prod_mds_001_UUID' is not available for connect (no target)

On the client I see

    Mar 17 16:00:06 host kernel: LustreError: 11-0: an error occurred while communicating with 172.16.0.251@tcp. The mds_connect operation failed with -19

Now, it appears the writeconf renamed the UUID of the mds from

    prod_mds_001_UUID

to

    p1-MDT0000_UUID

but I can't work out how to get it back. For example, I tried

    # tunefs.lustre --mgs --mdt --fsname=p1 /dev/md2
    checking for existing Lustre data: found CONFIGS/mountdata
    Reading CONFIGS/mountdata

      Read previous values:
    Target:     p1-MDT0000
    Index:      0
    UUID:       prod_mds_001_UUID
    Lustre FS:  p1
    Mount type: ldiskfs
    Flags:      0x405
                  (MDT MGS )
    Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
    Parameters:

    tunefs.lustre: cannot change the name of a registered target
    tunefs.lustre: exiting with 1 (Operation not permitted)

I'm now stuck not being able to mount a 1PB file system... which isn't good :(

--
Dr Stuart Midgley
sdm900 at gmail.com
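
The mismatch described in the message can be made explicit with a small script. The following is only a rough sketch: it parses the `tunefs.lustre` output quoted above and compares the stored UUID with a name built from the target. The `<Target>_UUID` pattern is an assumption taken from the message itself (the `p1-MDT0000_UUID` name the author reports), not a documented Lustre rule, and `parse_fields` and `uuid_mismatch` are hypothetical helper names.

```python
import re

# Output of `tunefs.lustre` as quoted in the message (abridged).
SAMPLE = """\
Read previous values:
Target:     p1-MDT0000
Index:      0
UUID:       prod_mds_001_UUID
Lustre FS:  p1
Mount type: ldiskfs
Flags:      0x405
"""


def parse_fields(text):
    """Collect 'Key: value' lines into a dict; lines without a value are skipped."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"^\s*([^:]+):\s+(\S.*)$", line)
        if match:
            fields[match.group(1).strip()] = match.group(2).strip()
    return fields


def uuid_mismatch(fields):
    """Return (stored, derived) when the stored UUID differs from '<Target>_UUID'.

    The '<Target>_UUID' convention is an assumption inferred from the post.
    Returns None when the two names agree.
    """
    derived = fields["Target"] + "_UUID"
    stored = fields["UUID"]
    return (stored, derived) if stored != derived else None


if __name__ == "__main__":
    result = uuid_mismatch(parse_fields(SAMPLE))
    if result:
        print("stored UUID %s, name-derived UUID %s" % result)
```

Run against the quoted output, this reports the stored UUID `prod_mds_001_UUID` next to the derived `p1-MDT0000_UUID`, which is the same pair of names the error on the mds and the author's diagnosis point to. It only reads text and changes nothing on disk.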