Hello,

I'm trying to analyze an OST with a few thousand objects and find out which files they belong to. Mounting this OST as ldiskfs and running ll_decode_filter_fid on the objects tells me the following:

- Most of these objects do not have a fid EA back pointer. Does that mean they are not used?
- Some of them give good results, and the man page says that "For objects with MDT parent sequence numbers above 0x200000000, this indicates that the FID needs to be mapped via the MDT Object Index (OI) file on the MDT". How do I do this mapping? I found some iam utilities but they do not seem to work, and I'm afraid the IAM userspace code has been deactivated.

How can I know whether those objects can be removed without risk? I previously checked that "lfs find" did not find any other files with objects on the specific OST I'm working on.

Thanks

Aurélien Degrémont
On 2012-07-24, at 5:04, DEGREMONT Aurelien <aurelien.degremont at cea.fr> wrote:
> I'm trying to analyze an OST with few thousands of object and find where they belong to.
> Mounting this OST with ldiskfs and using ll_decode_filter_fid tells me that.
>
> -Most of these object do not have a fid EA back pointer. Does that means they are not used?

Two possibilities:
- the object is in use but has never been accessed, maybe because it is stripe N in a file < N MB in size
- the object is preallocated and is not in use yet

In the first case (unaccessed object), the new lfsck will add the parent FID back pointer as part of the current Phase II project.

> -Some of them have good results, and the man page says that
> "For objects with MDT parent sequence numbers above 0x200000000, this indicates that the FID needs to be mapped via the
> MDT Object Index (OI) file on the MDT".
> How do I do this mapping? I found some iam utilities but they do not seems to be ok, and I'm afraid IAM userspace code
> has been deactivated.

Use the normal "lfs fid2path {FID}" command.

Cheers, Andreas

> How can I know if those files could be removed without risk.
> I previously checked that "lfs find" did not find any other files with object on this specific OST I'm working on.
>
> Thanks
>
> Aurélien Degrémont
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
Hi Aurelien!

On 24.07.2012 at 14:04, DEGREMONT Aurelien wrote:
> I'm trying to analyze an OST with few thousands of object and find where they belong to.
> Mounting this OST with ldiskfs and using ll_decode_filter_fid tells me that.
>
> -Most of these object do not have a fid EA back pointer. Does that means they are not used?

Is this the troglodyte type of OST that started its life in times of prehistoric versions of Lustre? We see this on old files that were created in the early ages of Lustre 1.6, before the trusted.fid EA was introduced. Other than that, these objects could have been preallocated but never actually used. Do these objects contain any data at all (blockcount != 0)?

> -Some of them have good results, and the man page says that
> "For objects with MDT parent sequence numbers above 0x200000000, this indicates that the FID needs to be mapped via the
> MDT Object Index (OI) file on the MDT".
> How do I do this mapping? I found some iam utilities but they do not seems to be ok, and I'm afraid IAM userspace code
> has been deactivated.

lfs fid2path (on any client) should do what you're looking for.

> How can I know if those files could be removed without risk.
> I previously checked that "lfs find" did not find any other files with object on this specific OST I'm working on.

In my experience, a small amount of object leakage is not too uncommon on real-world systems, so if lfs find doesn't turn up any objects anymore, most likely you're good to take this OST down. (Hey, and you can double-check with rbh-report --dump-ost, of course! ;-)

Regards,

Daniel.
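
The blockcount question above can be answered in bulk on the ldiskfs-mounted OST. What follows is only a minimal Python sketch, not a Lustre tool: the function names `has_data` and `objects_with_data` are made up for illustration, and it assumes the OST objects appear as regular files somewhere below the directory you pass in.

```python
import os
import stat


def has_data(st):
    """Return True when a stat result reports at least one allocated block."""
    return st.st_blocks != 0


def objects_with_data(root):
    """Walk `root` and return sorted (path, st_blocks) pairs for regular
    files that have blocks allocated. st_blocks is in 512-byte units."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: do not follow symlinks
            if stat.S_ISREG(st.st_mode) and has_data(st):
                found.append((path, st.st_blocks))
    return sorted(found)
```

For example, `objects_with_data("/mnt/ost/O")` (a hypothetical mount point) would list only the objects that hold data. An object with zero blocks fits both cases mentioned in this thread (preallocated, or allocated to a file but never written), so a zero count alone does not prove the object is unused.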
Rappleye, Jason (ARC-TN)[Computer Sciences Corporation]
2012-Jul-24 18:18 UTC
[Lustre-discuss] Object index
On 7/24/12 11:10 AM, "Daniel Kobras" <kobras at linux.de> wrote:
>> -Some of them have good results, and the man page says that
>> "For objects with MDT parent sequence numbers above 0x200000000, this
>> indicates that the FID needs to be mapped via the
>> MDT Object Index (OI) file on the MDT".
>> How do I do this mapping? I found some iam utilities but they do not
>> seems to be ok, and I'm afraid IAM userspace code
>> has been deactivated.
>
> lfs fid2path (on any client) should do what you're looking for.

Unfortunately that doesn't work for files created prior to Lustre 2.0, or for files with components of their path created before Lustre 2.0. The link EA is missing from the MDT inode of such files, and the link EA is what fid2path appears to use. This was a real bummer for us, and I'd love for someone to tell me that I'm wrong. Please?

Jason
Hi Jason!

On 24.07.2012 at 20:18, Rappleye, Jason (ARC-TN)[Computer Sciences Corporation] wrote:
> On 7/24/12 11:10 AM, "Daniel Kobras" <kobras at linux.de> wrote:
>> lfs fid2path (on any client) should do what you're looking for.
>
> Unfortunately that doesn't work for files created prior to Lustre 2.0, or
> files with components of their path created before Lustre 2.0, The link
> EA is missing from the MDT inode of such files, which is what fid2path
> appears to use. This was a real bummer for us, and I'd love for someone to
> tell me that I'm wrong. Please?

Pre-2.0, you can extract the inode number of the parent object on the MDT from the object's trusted.fid EA, e.g. with ll_decode_filter_fid. On the MDT, you can then map the inode number to a filesystem path with ncheck in debugfs or with find -inum.

Regards,

Daniel.
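
The two MDT-side lookups mentioned above can be scripted once the parent inode numbers are known. The following Python sketch only builds the command lines and does not run them. The helper names, the device `/dev/mdtdev` and the mount point `/mnt/mdt` are hypothetical. It assumes the stock debugfs options `-c` (catastrophic, read-only open) and `-R` (run a single request), and that you have already read the inode numbers off the ll_decode_filter_fid output by hand.

```python
import shlex


def ncheck_command(mdt_device, inode_numbers):
    """Build a read-only debugfs call that maps inode numbers to paths.
    `mdt_device` is the MDT block device (hypothetical name in the demo)."""
    inodes = " ".join(str(int(n)) for n in inode_numbers)
    return ["debugfs", "-c", "-R", f"ncheck {inodes}", mdt_device]


def find_command(mdt_mountpoint, inode_number):
    """Build the equivalent find call for an MDT mounted as ldiskfs."""
    return ["find", mdt_mountpoint, "-inum", str(int(inode_number))]


if __name__ == "__main__":
    # Print shell-quoted commands for review before running them by hand.
    print(shlex.join(ncheck_command("/dev/mdtdev", [12345, 67890])))
    print(shlex.join(find_command("/mnt/mdt", 12345)))
```

Passing several inode numbers to one ncheck request avoids rescanning the MDT once per object, which matters when a few thousand objects need to be resolved.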
On 24/07/2012 20:10, Daniel Kobras wrote:
> Is this the troglodyte type of OST that started its life in times of prehistoric versions of Lustre? We see this on old files that were created in the early ages of Lustre 1.6, before the trusted.fid EA was introduced.

No, this filesystem was formatted with Lustre 2.0.

By the way, does anyone remember the incompatibility between 2.0 and 2.1 that prevents a target formatted with Lustre 2.1 from being downgraded to Lustre 2.0?

> Other than that, these objects could have been preallocated, but never actually used. Do these objects contain any data at all (blockcount != 0)?

That is rather what I was thinking. But I'm surprised that so many objects are preallocated.

>> -Some of them have good results, and the man page says that
>> "For objects with MDT parent sequence numbers above 0x200000000, this indicates that the FID needs to be mapped via the
>> MDT Object Index (OI) file on the MDT".
>> How do I do this mapping? I found some iam utilities but they do not seems to be ok, and I'm afraid IAM userspace code
>> has been deactivated.
> lfs fid2path (on any client) should do what you're looking for.

It does not. Moreover, lfs does not support this kind of FID:

[0x20a5df05f:0x4874:0x0]
[0x20a6e8d8c:0x27b4:0x1]

The Lustre Manual says "The idx field shows the stripe number of this OST object in the Lustre RAID-0 striped file.", which seems true as I've got several files where idx > 0.

But the lfs FID sanity check is:

    static inline int fid_is_sane(const struct lu_fid *fid)
    {
            return fid != NULL &&
                   ((fid_seq(fid) >= FID_SEQ_START && fid_oid(fid) != 0 &&
                     fid_ver(fid) == 0) ||
                    fid_is_igif(fid));
    }

and so it complains when fid_ver != 0. I'm not sure at all that lfs fid2path expects a FID coming from an OST.

> From my experience, a small amount of object leakage is not too uncommon on real-world systems, so if lfs find doesn't show up any objects anymore, most likely you're good to take this OST down.

I agree on that, but I consider that more than 4k objects is not "a small amount" :)

> (Hey, and you can double-check with rbh-report --dump-ost, of course! ;-)

Sure, but I did not have an rbh DB available for that FS (a pity, as "rbh-report" takes a few minutes in the worst case, while "lfs find" took 15 hours :))
By the way, using a Lustre tool helps me to be sure the remaining objects are not related to a possible robinhood bug :)

Aurélien
On 2012-07-25, at 3:14, DEGREMONT Aurelien <aurelien.degremont at cea.fr> wrote:
> On 24/07/2012 20:10, Daniel Kobras wrote:
>> Is this the troglodyte type of OST that started its life in times of prehistoric versions of Lustre? We see this on old files that were created in the early ages of Lustre 1.6, before the trusted.fid EA was introduced.
> No, this filesystem was formatted with Lustre 2.0
> By the way, does someone remember the incompatibility with 2.0/2.1 which prevent a target, formatted with Lustre 2.1 to
> be downgraded to Lustre 2.0 ?

We never allow filesystems formatted with a new version of Lustre to be "downgraded" to a version earlier than the one they were originally formatted with. This allows us to add new features without somehow having to retroactively be able to support them in older versions of Lustre.

>> Other than that, these objects could have been preallocated, but never actually used. Do these objects contain any data at all (blockcount != 0)?
> I was rather thinking of that. But I'm surprised that so many objects are preallocated.

As previously mentioned, they might also be allocated but never accessed.

>>> -Some of them have good results, and the man page says that
>>> "For objects with MDT parent sequence numbers above 0x200000000, this indicates that the FID needs to be mapped via the
>>> MDT Object Index (OI) file on the MDT".
>>> How do I do this mapping? I found some iam utilities but they do not seems to be ok, and I'm afraid IAM userspace code
>>> has been deactivated.
>> lfs fid2path (on any client) should do what you're looking for.
> It does not. Moreover, lfs does not support this kind of fid
> [0x20a5df05f:0x4874:0x0]
> [0x20a6e8d8c:0x27b4:0x1]
>
> Lustre Manual said "The idx field shows the stripe number of this OST object in the Lustre RAID-0 striped file. "

Try setting the last field of the FID (fid_ver) to 0. In the OST xattr this field is really the LOV stripe index and not part of the FID at all. It just happens to be stored in this location on disk to save space.

> Which seems true as I've got several files where idx > 0.
> But, lfs fid sanity check is :
> static inline int fid_is_sane(const struct lu_fid *fid)
> {
>         return fid != NULL &&
>                ((fid_seq(fid) >= FID_SEQ_START && fid_oid(fid) != 0 &&
>                  fid_ver(fid) == 0) ||
>                 fid_is_igif(fid));
> }
>
> And so complains when fid_ver != 0
>
> I'm not sure at all lfs fid2path expect fid coming from OST.

They are the same FID; it just depends on how you decoded the FID from the OST xattr.

Cheers, Andreas

>> From my experience, a small amount of object leakage is not too uncommon on real-world systems, so if lfs find doesn't show up any objects anymore, most likely you're good to take this OST down.
> I agree on that, but I consider that more than 4k objects is not "a small amount" :)
>
>> (Hey, and you can double-check with rbh-report --dump-ost, of course! ;-)
> Sure, but I did not have an rbh DB for that FS available (a pity as "rbh-report" is few minutes in worst cases, "lfs
> find" was 15 hours :))
> By the way, using a Lustre tool helps me to be sure the remaining objects were not related to a possible robinhood bug :)
>
> Aurélien
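
The advice to zero the last field can be applied mechanically before handing a decoded parent FID to "lfs fid2path". This is a hedged Python sketch rather than Lustre code: the helper names are invented, `FID_SEQ_START = 0x200000000` is an assumption taken from the man page text quoted in this thread, and `looks_sane` mirrors only the non-IGIF branch of the fid_is_sane check quoted above.

```python
import re

# Assumption: matches the 0x200000000 threshold quoted from the man page.
FID_SEQ_START = 0x200000000

_FID_RE = re.compile(
    r"^\[?(0x[0-9a-fA-F]+):(0x[0-9a-fA-F]+):(0x[0-9a-fA-F]+)\]?$"
)


def parse_fid(text):
    """Parse '[0xSEQ:0xOID:0xVER]' into a (seq, oid, ver) tuple of ints."""
    match = _FID_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a FID: {text!r}")
    return tuple(int(group, 16) for group in match.groups())


def split_stripe_index(fid):
    """Return (fid with ver forced to 0, stripe index taken from ver)."""
    seq, oid, ver = fid
    return (seq, oid, 0), ver


def looks_sane(fid):
    """Non-IGIF branch of the quoted fid_is_sane check."""
    seq, oid, ver = fid
    return seq >= FID_SEQ_START and oid != 0 and ver == 0


def format_fid(fid):
    """Format a (seq, oid, ver) tuple back into bracketed hex notation."""
    return "[{:#x}:{:#x}:{:#x}]".format(*fid)
```

With the second FID from this thread, `split_stripe_index(parse_fid("[0x20a6e8d8c:0x27b4:0x1]"))` yields the FID `[0x20a6e8d8c:0x27b4:0x0]` plus stripe index 1. The rewritten FID passes `looks_sane`, while the original does not.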