Dear list,

Our MDS lost 5 of 7 osc devices after a reconfiguration:

1. umount all OSTs
2. tunefs.lustre --writeconf --mgs --mdt /dev/sda1
3. mount -t ldiskfs mdtdevice mountpoint
4. rm CONFIGS/*
5. mount -t lustre mdtdevice mountpoint
6. mount all OSTs

We can see all 7 osc devices on a mounted client, but only 2 of 7 osc devices on the MDS. When I create a file with a stripe count of 7, the MDS server crashes. Any ideas?

lctl dl
0 UP mgs MGS MGS 707
1 UP mgc MGC192.168.50.50@tcp 43fc0787-3580-ce63-5019-94a7903f2fb0 5
2 UP mdt MDS MDS_uuid 3
3 UP lov besfs2-mdtlov besfs2-mdtlov_UUID 4
4 UP mds besfs2-MDT0000 besfs2-MDT0000_UUID 803
5 UP ost OSS OSS_uuid 3
6 UP obdfilter besfs2-OST0000 besfs2-OST0000_UUID 803
7 UP obdfilter besfs2-OST0001 besfs2-OST0001_UUID 805
8 UP osc besfs2-OST0001-osc besfs2-mdtlov_UUID 5
9 UP obdfilter besfs2-OST0002 besfs2-OST0002_UUID 805
10 UP osc besfs2-OST0002-osc besfs2-mdtlov_UUID 5

This is a small Lustre installation. We have only 2 servers: 3 OSTs and 1 MDT/MGS on one server, 4 OSTs on the other. The Lustre version is 2.6.9-67.0.22.EL_lustre.1.6.6smp on x86-64.

Best Regards
Lu Wang
--------------------------------------------------------------
Computing Center IHEP       Office: Computing Center, 123
19B Yuquan Road             Tel: (+86) 10 88236012-607
P.O. Box 918-7              Fax: (+86) 10 8823 6839
Beijing 100049, China       Email: Lu.Wang at ihep.ac.cn
--------------------------------------------------------------
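As a quick cross-check, the same `lctl dl` listing can be filtered for osc devices on either node; on a healthy MDS there should be one osc per OST (7 here). A sketch using a trimmed copy of the output above:

```shell
# Count the MDS-side osc devices in saved `lctl dl` output.
# The sample below is trimmed from the listing in this message;
# in a real session you would pipe `lctl dl` directly into awk.
sample='0 UP mgs MGS MGS 707
3 UP lov besfs2-mdtlov besfs2-mdtlov_UUID 4
8 UP osc besfs2-OST0001-osc besfs2-mdtlov_UUID 5
10 UP osc besfs2-OST0002-osc besfs2-mdtlov_UUID 5'

# third field of each `lctl dl` line is the device type
printf '%s\n' "$sample" | awk '$3 == "osc" { print $4 }'
osc_count=$(printf '%s\n' "$sample" | awk '$3 == "osc"' | wc -l)
echo "osc devices: $osc_count"   # 2 for this trimmed sample
```

Running the same filter on the client would show 7 osc lines, which makes the mismatch easy to see at a glance.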
I had a similar problem: after a reconfiguration, running tunefs.lustre on an OST I got a kernel panic. I solved it by running fsck on the damaged OST.

Antonio

Lu Wang wrote:
> Dear list,
> Our MDS lost 5 of 7 osc devices after a reconfiguration: [...]
On 2009-11-11, at 05:59, Lu Wang wrote:
> Our MDS lost 5 of 7 osc devices after a reconfiguration:
> 1. umount all OSTs
> 2. tunefs.lustre --writeconf --mgs --mdt /dev/sda1
> 3. mount -t ldiskfs mdtdevice mountpoint
> 4. rm CONFIGS/*
> 5. mount -t lustre mdtdevice mountpoint

Where is it documented to delete all of the files in CONFIGS? This undoes the effect of step 2 above, and isn't a good idea. Presumably there was also a step 4b to unmount the filesystem from type ldiskfs?

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
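Read literally, the objection above suggests a shorter sequence: `--writeconf` alone marks the configuration logs for regeneration at the next mount, so deleting CONFIGS/* by hand should not be needed. A dry-run sketch (the stub only prints commands; the mountpoint name is hypothetical):

```shell
# Dry-run sketch of the writeconf sequence without `rm CONFIGS/*`.
# The run() stub prints each command instead of executing it, so this
# can be read (and tested) without a Lustre system present.
run() { printf 'would run: %s\n' "$*"; }

MDTDEV=/dev/sda1    # from the original post
MDTMNT=/mnt/mdt     # hypothetical mountpoint

run umount "$MDTMNT"                                 # after unmounting all OSTs
run tunefs.lustre --writeconf --mgs --mdt "$MDTDEV"  # flag logs for regeneration
run mount -t lustre "$MDTDEV" "$MDTMNT"              # first mount rewrites CONFIGS
# then writeconf and remount each OST the same way
```

To actually execute the sequence, replace the `run` stub with the real commands on the MGS/MDS node.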
Hi dear list,

We have done steps 1 to 5, but we can still see only 2 of 7 osc devices on the MDS, while we can see all 7 osc devices on a mounted client. We then ran e2fsck on the OSTs, but got the same result.

We also hit a problem after doing steps 1 to 4 and unmounting the filesystem from ldiskfs: when we then ran step 5, we got the logs below. According to the logs, we have no idea why there are 63 clients, or where the MDS got the client information, given that we had removed CONFIGS/* and the links were down.

Nov 12 09:02:37 beshome01 kernel: Lustre: 2474:0:(mds_fs.c:493:mds_init_server_data()) RECOVERY: service besfs2-MDT0000, 63 recoverable clients, last_transno 118077355
Nov 12 09:02:37 beshome01 kernel: Lustre: MDT besfs2-MDT0000 now serving dev (besfs2-MDT0000/5ceb6ad6-e810-9fae-4862-8ed0913bf7e7), but will be in recovery for at least 5:00, or until 63 clients reconnect. During this time new clients will not be allowed to connect. Recovery progress can be monitored by watching /proc/fs/lustre/mds/besfs2-MDT0000/recovery_status.

2009-11-12
huangql
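The recovery_status file named in the second log line can be polled to see how far recovery has progressed. A small sketch (the /proc path is taken from the log; the parser assumes the 1.6-style "status:" line, which is an assumption about the file format):

```shell
# Sketch: report the MDT recovery state from the recovery_status file
# mentioned in the log. STATUS_FILE is kept as a parameter so the
# snippet can be exercised against any file, not only the live /proc one.
STATUS_FILE=${STATUS_FILE:-/proc/fs/lustre/mds/besfs2-MDT0000/recovery_status}

recovery_state() {
    # print the value of the "status:" line, e.g. RECOVERING or COMPLETE
    awk '$1 == "status:" { print $2 }' "$STATUS_FILE"
}

# only attempt a read if the file actually exists on this node
[ -r "$STATUS_FILE" ] && recovery_state
```

On the MDS itself, something like `watch cat /proc/fs/lustre/mds/besfs2-MDT0000/recovery_status` shows the same information interactively, including the client reconnect count.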
_______________________________________________
Lustre-discuss mailing list
Lustre-discuss at lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
Hi list,

We have tried again to recover the system to a consistent state with the following steps:

1. Pulled out the 10 Gbit Ethernet links connecting to the computing cluster, and connected the 2 servers with a direct Ethernet link. This isolated the 2 servers from the computing cluster, to avoid interference from running clients (which may not have unmounted cleanly).
2. Unmounted all the OSTs.
3. Unmounted the MDT.
4. Mounted the MDT as ldiskfs, removed all files under CONFIGS (these files were confirmed to be wrong), and unmounted.
5. Ran tunefs.lustre --erase-params --mgs --mdt --fsname=besfs2 --writeconf /dev/sda1 on the MDT device. This command returned a fatal error: it assumed this was an upgrade from 1.4 to 1.6, tried to copy the client file from /tmp/****/LOG/ but failed, and so made a log file from "last_rcvd".
6. We ignored the error and mounted the MDT successfully. lctl dl showed 5 devices.
7. Mounted every OST as ldiskfs, removed all files under CONFIGS, and unmounted.
8. Ran tunefs.lustre --erase-params --ost --mgsnode=192.168.50.50 --index=<old index> --fsname=besfs2 --writeconf /dev/sd* on each OST. This command also returned the 1.4-to-1.6 upgrade assumption and made a log file from "last_rcvd", but there was no fatal error.
9. Mounted the OSTs one by one.
10. This time we could see an osc for every OST; however, only 2 oscs are UP, and the other 5 are IN.
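Step 8 above has to be repeated per OST with each device's original index. As a dry-run sketch of that loop (the stub only prints; the device names and index pairs below are hypothetical, since each OST must keep the index it was originally formatted with):

```shell
# Dry-run sketch of step 8 as a loop over device:index pairs.
# run() prints instead of executing, so nothing is written to any disk.
run() { printf 'would run: %s\n' "$*"; }

# hypothetical device:index pairs for the OSTs on one OSS
cmds=$(for pair in /dev/sdb:0 /dev/sdc:1 /dev/sdd:2; do
    dev=${pair%%:*}   # device path, before the colon
    idx=${pair##*:}   # original OST index, after the colon
    run tunefs.lustre --erase-params --ost --mgsnode=192.168.50.50 \
        --index="$idx" --fsname=besfs2 --writeconf "$dev"
done)
printf '%s\n' "$cmds"
```

Writing the wrong index to an OST would make the problem worse, which is why the index is carried per device here instead of being reused.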
0 UP mgs MGS MGS 7
1 UP mgc MGC192.168.50.50@tcp 26aae9d0-202e-abf3-3cb0-746eea59d7a4 5
2 UP mdt MDS MDS_uuid 3
3 UP lov besfs2-mdtlov besfs2-mdtlov_UUID 4
4 UP mds besfs2-MDT0000 besfs2-MDT0000_UUID 3
5 IN osc besfs2-OST0000-osc besfs2-mdtlov_UUID 5
6 UP osc besfs2-OST0001-osc besfs2-mdtlov_UUID 5
7 IN osc besfs2-OST0003-osc besfs2-mdtlov_UUID 5
8 UP osc besfs2-OST0002-osc besfs2-mdtlov_UUID 5
9 IN osc besfs2-OST0004-osc besfs2-mdtlov_UUID 5
10 IN osc besfs2-OST0005-osc besfs2-mdtlov_UUID 5
11 IN osc besfs2-OST0006-osc besfs2-mdtlov_UUID 5
12 UP ost OSS OSS_uuid 3
13 UP obdfilter besfs2-OST0000 besfs2-OST0000_UUID 5
14 UP obdfilter besfs2-OST0001 besfs2-OST0001_UUID 5
15 UP obdfilter besfs2-OST0002 besfs2-OST0002_UUID 5

We find these errors in /var/log/messages:

kernel: LustreError: 2407:0:(llog_lvfs.c:612:llog_lvfs_create()) error looking up logfile 0xbc28013:0xf77a298: rc -2
Nov 12 16:57:03 beshome01 kernel: LustreError: 2407:0:(llog_cat.c:176:llog_cat_id2handle()) error opening log id 0xbc28013:f77a298: rc -2
Nov 12 16:57:03 beshome01 kernel: LustreError: 2407:0:(llog_obd.c:262:cat_cancel_cb()) Cannot find handle for log 0xbc28013
Nov 12 16:57:03 beshome01 kernel: LustreError: 2398:0:(llog_obd.c:329:llog_obd_origin_setup()) llog_process with cat_cancel_cb failed: -2
Nov 12 16:57:03 beshome01 kernel: LustreError: 2398:0:(osc_request.c:3664:osc_llog_init()) failed LLOG_MDS_OST_ORIG_CTXT
Nov 12 16:57:03 beshome01 kernel: LustreError: 2398:0:(osc_request.c:3675:osc_llog_init()) osc 'besfs2-OST0000-osc' tgt 'besfs2-MDT0000' cnt 1 catid 00000104110f1ce8 rc=-2
Nov 12 16:57:03 beshome01 kernel: LustreError: 2398:0:(osc_request.c:3677:osc_llog_init()) logid 0xbc28002:0x9a60e39f
Nov 12 16:57:03 beshome01 kernel: LustreError: 2398:0:(lov_log.c:230:lov_llog_init()) error osc_llog_init idx 0 osc 'besfs2-OST0000-osc' tgt 'besfs2-MDT0000' (rc=-2)
Nov 12 16:57:03 beshome01 kernel: LustreError: 2398:0:(mds_log.c:220:mds_llog_init()) lov_llog_init err -2
Nov 12 16:57:03 beshome01 kernel: LustreError: 2398:0:(llog_obd.c:417:llog_cat_initialize()) rc: -2
Nov 12 16:57:03 beshome01 kernel: LustreError: 2398:0:(mds_lov.c:916:__mds_lov_synchronize()) besfs2-OST0000_UUID failed at update_mds: -2
Nov 12 16:57:03 beshome01 kernel: LustreError: 2398:0:(mds_lov.c:959:__mds_lov_synchronize()) besfs2-OST0000_UUID sync failed -2, deactivating

Any ideas?

------------------
Lu Wang
2009-11-12
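For the devices shown as IN (inactive) in a listing like the one above, `lctl --device <n> activate` is the usual way to bring an osc back up. The sketch below only generates those commands from saved `lctl dl` text rather than running them, since activation will likely not stick until the underlying llog errors are resolved:

```shell
# From saved `lctl dl` output, emit the lctl commands that would
# reactivate each osc stuck in the IN state. The sample is trimmed
# from the listing above; pipe live `lctl dl` output in the same way.
sample='5 IN osc besfs2-OST0000-osc besfs2-mdtlov_UUID 5
6 UP osc besfs2-OST0001-osc besfs2-mdtlov_UUID 5
7 IN osc besfs2-OST0003-osc besfs2-mdtlov_UUID 5'

cmds=$(printf '%s\n' "$sample" |
    awk '$2 == "IN" && $3 == "osc" { printf "lctl --device %s activate\n", $1 }')
printf '%s\n' "$cmds"
```

Each emitted line can then be run on the MDS; if an osc immediately drops back to IN, the "sync failed -2, deactivating" llog failure above is still the root cause.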
We took the 2 servers back to the cluster. After 15 hours of running, we get these errors in /var/log/messages:

Nov 13 10:37:04 beshome01 kernel: LustreError: 2359:0:(llog_obd.c:211:llog_add()) No ctxt
Nov 13 10:37:04 beshome01 kernel: LustreError: 2359:0:(llog_obd.c:211:llog_add()) Skipped 2 previous similar messages
Nov 13 10:39:49 beshome01 kernel: LustreError: 2360:0:(llog_obd.c:211:llog_add()) No ctxt
Nov 13 10:39:49 beshome01 kernel: LustreError: 2360:0:(llog_obd.c:211:llog_add()) Skipped 16 previous similar messages
Nov 13 10:52:43 beshome01 kernel: LustreError: 2332:0:(llog_obd.c:211:llog_add()) No ctxt
Nov 13 10:52:43 beshome01 kernel: LustreError: 2332:0:(llog_obd.c:211:llog_add()) Skipped 265 previous similar messages
Nov 13 10:53:46 beshome01 kernel: LustreError: 2346:0:(llog_obd.c:211:llog_add()) No ctxt
Nov 13 10:53:46 beshome01 kernel: LustreError: 2346:0:(llog_obd.c:211:llog_add()) Skipped 105 previous similar messages
Nov 13 10:54:38 beshome01 kernel: LustreError: 2335:0:(llog_obd.c:211:llog_add()) No ctxt
Nov 13 10:54:38 beshome01 kernel: LustreError: 2335:0:(llog_obd.c:211:llog_add()) Skipped 4 previous similar messages
Nov 13 10:56:04 beshome01 kernel: LustreError: 2356:0:(llog_obd.c:211:llog_add()) No ctxt
Nov 13 10:56:04 beshome01 kernel: LustreError: 2356:0:(llog_obd.c:211:llog_add()) Skipped 5 previous similar messages
Nov 13 10:59:26 beshome01 kernel: LustreError: 2357:0:(llog_obd.c:211:llog_add()) No ctxt
Nov 13 10:59:26 beshome01 kernel: LustreError: 2357:0:(llog_obd.c:211:llog_add()) Skipped 3 previous similar messages

Since we still have 2 "UP" oscs on the MDS, and all the oscs are "UP" on the Lustre clients, users feel the system is back to normal. However, new objects can only be created on 2 OSTs.
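Because of the kernel's rate-limiting, the printed "No ctxt" lines understate how often llog_add() is actually failing: each "Skipped N previous similar messages" line hides N more occurrences. A sketch that totals both, using a trimmed copy of the log above:

```shell
# Total the llog_add() failures: count each printed "No ctxt" line,
# plus the N from each "Skipped N previous similar messages" line.
# The sample is the first four lines of the log above.
log='Nov 13 10:37:04 beshome01 kernel: LustreError: 2359:0:(llog_obd.c:211:llog_add()) No ctxt
Nov 13 10:37:04 beshome01 kernel: LustreError: 2359:0:(llog_obd.c:211:llog_add()) Skipped 2 previous similar messages
Nov 13 10:39:49 beshome01 kernel: LustreError: 2360:0:(llog_obd.c:211:llog_add()) No ctxt
Nov 13 10:39:49 beshome01 kernel: LustreError: 2360:0:(llog_obd.c:211:llog_add()) Skipped 16 previous similar messages'

total=$(printf '%s\n' "$log" | awk '
    /llog_add\(\)\) No ctxt/  { n++ }
    /llog_add\(\)\) Skipped/  { for (i = 1; i <= NF; i++)
                                    if ($i == "Skipped") n += $(i + 1) }
    END { print n + 0 }')
echo "total llog_add failures in sample: $total"   # 1 + 2 + 1 + 16 = 20
```

Run against the full /var/log/messages, this gives a truer failure rate than eyeballing the rate-limited lines.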
If the write I/Os increase, we get:

Nov 12 18:50:28 beshome01 kernel: Lustre: 2599:0:(filter_io_26.c:714:filter_commitrw_write()) besfs2-OST0002: slow i_mutex 30s

------------------
Lu Wang
2009-11-13