Hello list,

I want to remove an OST permanently on the MDS:

  lctl --device 11 conf_param foo-OST0006.osc.active=0

In the messages on the MDS I get:

  Jan 22 13:42:12 mds1 Lustre: foo-OST0006-osc.osc: set parameter active=0
  Jan 22 13:42:12 mds1 Lustre: Skipped 1 previous similar message

On a client I can see this:

  client # cat /proc/fs/lustre/osc/foo-OST0006-osc-ffff8100f3ed8800/active
  0

In the messages on the client:

  Jan 22 13:42:11 cluster1 Lustre: setting import foo-OST0006_UUID INACTIVE by administrator request
  Jan 22 13:43:11 cluster1 Lustre: setting import foo-OST0006_UUID INACTIVE by administrator request
  Jan 22 13:43:11 cluster1 LustreError: 4629:0:(lov_obd.c:316:lov_connect_obd()) not connecting OSC foo-OST0006_UUID; administratively disabled
  Jan 22 13:44:20 cluster1 LustreError: 4673:0:(connection.c:155:ptlrpc_put_connection()) NULL connection
  Jan 22 13:44:41 cluster1 Lustre: setting import foo-OST0006_UUID INACTIVE by administrator request
  Jan 22 13:44:41 cluster1 LustreError: 4680:0:(lov_obd.c:316:lov_connect_obd()) not connecting OSC foo-OST0006_UUID; administratively disabled
  Jan 22 13:45:50 cluster1 LustreError: 4722:0:(connection.c:155:ptlrpc_put_connection()) NULL connection

Fine. Now lfs barks out on a 'lfs df -h /lustrefs':

  client # lfs df -h /lustrefs
  UUID               bytes     Used      Available  Use%  Mounted on
  foo-MDT0000_UUID   154.8G    708.8M    145.3G     0%    /misc/data[MDT:0]
  foo-OST0000_UUID   6.4T      5.1T      1002.9G    79%   /misc/data[OST:0]
  foo-OST0001_UUID   6.4T      5.1T      1016.1G    79%   /misc/data[OST:1]
  foo-OST0002_UUID   6.4T      5.0T      1.0T       79%   /misc/data[OST:2]
  foo-OST0003_UUID   6.4T      5.0T      1.0T       78%   /misc/data[OST:3]
  foo-OST0004_UUID   6.4T      5.6T      488.7G     87%   /misc/data[OST:4]
  foo-OST0005_UUID   6.4T      5.6T      497.9G     87%   /misc/data[OST:5]
  error: llapi_obd_statfs failed: Bad address (-14)

But there are OSTs numbered higher than 'OST-0006'.
Getting the stripes located on those OSTs above is ok:

  client # lfs getstripe -O foo-OST000a_UUID -r /lustrefs
  /file1.dat
  /file2.dat
  ...

lustre 1.6.6, vanilla Kernel 2.6.22.19 with lustre patches.
We do not stripe across the OSTs.

What to do to get lfs working again?

Thanks and Regards
Heiko
On Jan 22, 2009 14:17 +0100, Heiko Schroeter wrote:
> I want to remove an OST permanently on the MDS:
> lctl --device 11 conf_param foo-OST0006.osc.active=0
>
> Fine. Now lfs barks out on a 'lfs df -h /lustrefs':
> client # lfs df -h /lustrefs
> UUID               bytes     Used      Available  Use%  Mounted on
> foo-MDT0000_UUID   154.8G    708.8M    145.3G     0%    /misc/data[MDT:0]
> foo-OST0000_UUID   6.4T      5.1T      1002.9G    79%   /misc/data[OST:0]
> foo-OST0001_UUID   6.4T      5.1T      1016.1G    79%   /misc/data[OST:1]
> foo-OST0002_UUID   6.4T      5.0T      1.0T       79%   /misc/data[OST:2]
> foo-OST0003_UUID   6.4T      5.0T      1.0T       78%   /misc/data[OST:3]
> foo-OST0004_UUID   6.4T      5.6T      488.7G     87%   /misc/data[OST:4]
> foo-OST0005_UUID   6.4T      5.6T      497.9G     87%   /misc/data[OST:5]
> error: llapi_obd_statfs failed: Bad address (-14)
>
> But there are OSTs numbered higher than 'OST-0006'.
> Getting the stripes located on those OSTs above is ok:
>
> What to do to get lfs working again?

Please file a bug on this. It looks like there are two issues:
- lustre/utils/lfs.c:mntdf() should continue on this error instead of failing
- the ll_obd_statfs() function should return a better error

You can fix the first problem just by recompiling lfs, which will get
you a working lfs. The second one is the "more correct" fix, but needs
more work and also recompiling/restarting Lustre.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
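
For readers who want to see what "continue on this error instead of failing" could look like, here is a minimal sketch in C. It is not the actual mntdf() code from lustre/utils/lfs.c and it makes no claim about how that function is structured. The names report_targets(), fake_query() and the target_statfs_fn callback type are hypothetical, invented only for this illustration. The fake query returns -EFAULT for index 6 to mimic the "Bad address (-14)" error in the output above.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical per-target query; a stand-in for illustration only,
 * not the real llapi_obd_statfs() signature. Returns 0 on success
 * or a negative errno value on failure. */
typedef int (*target_statfs_fn)(int index, unsigned long long *avail_kb);

/* Walk targets 0..count-1. A target whose query fails is reported and
 * skipped, so one deactivated target no longer hides the ones after it.
 * Returns the number of targets that were reported successfully. */
int report_targets(target_statfs_fn query, int count, FILE *out)
{
    int reported = 0;

    assert(query != NULL && out != NULL);

    for (int i = 0; i < count; i++) {
        unsigned long long avail_kb = 0;
        int rc = query(i, &avail_kb);

        if (rc < 0) {
            /* Keep going instead of returning on the first error. */
            fprintf(out, "OST%04x: unavailable (rc = %d), skipping\n",
                    (unsigned int)i, rc);
            continue;
        }
        fprintf(out, "OST%04x: %llu KB available\n",
                (unsigned int)i, avail_kb);
        reported++;
    }
    return reported;
}

/* Fake query for the demonstration: index 6 plays the deactivated OST
 * and fails with -EFAULT; every other index returns made-up numbers. */
int fake_query(int index, unsigned long long *avail_kb)
{
    if (index == 6)
        return -EFAULT;
    *avail_kb = 1000ULL * (unsigned long long)(index + 1);
    return 0;
}
```

Calling report_targets(fake_query, 11, stdout) prints a line for each of the indices 0 to 10 and returns 10, because only index 6 is skipped. The point of the design is that the error is handled per target inside the loop, so the targets numbered above the deactivated one still get listed.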