List,

lfs find is not finding all OSTs on our system. This may be due to a bug we
triggered by having deactivated OSTs. OST9-28 don't exist, and lustre find
cannot find OST29, which is the first OST after a sequence of missing ones.

Example:

$ lfs getstripe .
OBDS:
0: lustre-OST0000_UUID ACTIVE
1: lustre-OST0001_UUID ACTIVE
2: lustre-OST0002_UUID ACTIVE
3: lustre-OST0003_UUID ACTIVE
4: lustre-OST0004_UUID ACTIVE
5: lustre-OST0005_UUID ACTIVE
6: lustre-OST0006_UUID ACTIVE
7: lustre-OST0007_UUID ACTIVE
8: lustre-OST0008_UUID ACTIVE
41: lustre-OST0029_UUID ACTIVE
42: lustre-OST002a_UUID ACTIVE
43: lustre-OST002b_UUID ACTIVE
44: lustre-OST002c_UUID ACTIVE
45: lustre-OST002d_UUID ACTIVE
46: lustre-OST002e_UUID ACTIVE
47: lustre-OST002f_UUID ACTIVE
48: lustre-OST0030_UUID ACTIVE
49: lustre-OST0031_UUID ACTIVE
50: lustre-OST0032_UUID ACTIVE
51: lustre-OST0033_UUID ACTIVE
52: lustre-OST0034_UUID ACTIVE
53: lustre-OST0035_UUID ACTIVE
54: lustre-OST0036_UUID ACTIVE
55: lustre-OST0037_UUID ACTIVE
56: lustre-OST0038_UUID ACTIVE
57: lustre-OST0039_UUID ACTIVE
58: lustre-OST003a_UUID ACTIVE
59: lustre-OST003b_UUID ACTIVE
60: lustre-OST003c_UUID ACTIVE
61: lustre-OST003d_UUID ACTIVE
62: lustre-OST003e_UUID ACTIVE
63: lustre-OST003f_UUID ACTIVE
64: lustre-OST0040_UUID ACTIVE
65: lustre-OST0041_UUID ACTIVE
66: lustre-OST0042_UUID ACTIVE
67: lustre-OST0043_UUID ACTIVE
68: lustre-OST0044_UUID ACTIVE
.
(Default) stripe_count: 1 stripe_size: 1048576 stripe_offset: -1
./loss12.out
obdidx objid objid group
42 6678168 0x65e698 0

./lustre-OST0029
obdidx objid objid group
41 6953217 0x6a1901 0

./lustre-OST002a
obdidx objid objid group
42 6841347 0x686403 0

$ lfs find -O lustre-OST002a_UUID .
./loss12.out
./lustre-OST002a

$ lfs find -O lustre-OST0029_UUID .
<NOTHING>

-mb

--
+-----------------------------------------------
| Michael Barnes
|
| Thomas Jefferson National Accelerator Facility
| Scientific Computing Group
| 12000 Jefferson Ave.
| Newport News, VA 23606
| (757) 269-7634
+-----------------------------------------------
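
One way to cross-check the empty `lfs find -O` result is to filter the obdidx column of the `lfs getstripe` output directly. The following is only a minimal sketch under stated assumptions: it is not part of lfs, the function names `ost_index_from_uuid` and `files_on_ost` are hypothetical, and it assumes the per-file layout shown in the message above (a path line, an `obdidx` header line, then one row per stripe, with no whitespace in paths). It also shows that the four hex digits in the OST name correspond to the decimal index in the OBDS list, e.g. OST0029 is index 41.

```python
import re

# Sample text in the per-file layout printed by `lfs getstripe` in the message above.
SAMPLE = """\
./loss12.out
obdidx objid objid group
42 6678168 0x65e698 0

./lustre-OST0029
obdidx objid objid group
41 6953217 0x6a1901 0

./lustre-OST002a
obdidx objid objid group
42 6841347 0x686403 0
"""


def ost_index_from_uuid(uuid):
    """Return the decimal OST index encoded in a name like 'lustre-OST0029_UUID'."""
    match = re.search(r"OST([0-9a-fA-F]{4})", uuid)
    if match is None:
        raise ValueError(f"no OST index in {uuid!r}")
    return int(match.group(1), 16)


def files_on_ost(getstripe_text, ost_index):
    """List paths whose stripe table has a row with obdidx == ost_index."""
    hits = []
    current_path = None
    in_table = False
    for raw in getstripe_text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line ends the stripe table of the current file.
            in_table = False
            continue
        if line.startswith("obdidx"):
            # Header row: the lines that follow are stripe rows.
            in_table = True
            continue
        fields = line.split()
        if in_table and len(fields) == 4 and fields[0].isdigit():
            if int(fields[0]) == ost_index and current_path not in hits:
                hits.append(current_path)
            continue
        # Anything else is treated as the path of the next file (assumption).
        current_path = line
        in_table = False
    return hits


if __name__ == "__main__":
    index = ost_index_from_uuid("lustre-OST0029_UUID")
    print(index, files_on_ost(SAMPLE, index))
```

In practice the text would come from running `lfs getstripe` over the directory in question rather than from a hard-coded sample; the real output may include lines (such as the OBDS list or directory defaults) that this sketch simply treats as path candidates and never matches.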
What version of Lustre are you running? I seem to recall a bug in the early
1.8 releases in how lfs find walks through the OST index: if an index was
missing, there were issues.

-cf

On 06/13/2011 09:44 AM, Michael Barnes wrote:
> List,
>
> lfs find is not finding all OSTs on our system. This may be due to a bug we
> triggered by having deactivated OSTs. OST9-28 don't exist, and lustre find
> cannot find OST29, which is the first OST after a sequence of missing ones.
On Jun 13, 2011, at 12:54 PM, Colin Faber wrote:

> What version of lustre are you running. I seem to recall a bug back in
> the early 1.8 releases relating to how lfs find walks through the OST
> index and if one was missing there were issues.

Sorry, I meant to include the version info. The MDS/MGS is at 1.8.5, the OSS
in question is at 1.8.4 (the others are at 1.8.5 and 1.8.4), and the client
versions I tried are 1.8.1.1, 1.8.2, and 1.8.5.

-mb