Hello! While inspecting patches in bug 17560 it struck me as odd that we are fetching the data one entry at a time from the filesystem, so I asked a many-entries at a time approach to be implemented. Now the patch in the bug just fills the osd buffer with entries (after parsing them) and then actual user (e.g. __mdd_readpage) reparses the osd buffer and copies data to its own. This is a lot of wasted processing, copies and then some extra memory usage. So in my opinion the iterator interface should be reworked to accept an iterator and a buffer and then __mdd_readdir will pass in its own parsing of the data (with possible generalized preparsing from osd so that we are fs-neutral) and will directly fill its own buffer to the amount it actually needs, kind of like any kernel filldir functions already work right now. Am I missing something stupid? Are there any objections, would this break anything? Bye, Oleg
Hello, In new MDS stack, osd suppose do not understand Directory (IMHO), and it only provides index API. MDD will use the index API to fill the dir entries. The filldir in current EA index API is just for compatible with old version of MDS. Maybe we need bulk index API for readdir ? But Alex should know more. Thanks WangDi Oleg Drokin wrote:> Hello! > > While inspecting patches in bug 17560 it struck me as odd that we > are fetching the data one entry at a time from the filesystem, so I > asked a many-entries at a time > approach to be implemented. Now the patch in the bug just fills > the osd buffer with entries (after parsing them) and then actual user > (e.g. __mdd_readpage) > reparses the osd buffer and copies data to its own. This is a lot > of wasted processing, copies and then some extra memory usage. > So in my opinion the iterator interface should be reworked to > accept an iterator and a buffer and then __mdd_readdir will pass in > its own > parsing of the data (with possible generalized preparsing from osd > so that we are fs-neutral) and will directly fill its own buffer to > the amount it actually needs, > kind of like any kernel filldir functions already work right now. > > Am I missing something stupid? Are there any objections, would > this break anything? > > Bye, > Oleg > _______________________________________________ > Lustre-devel mailing list > Lustre-devel at lists.lustre.org > http://lists.lustre.org/mailman/listinfo/lustre-devel >
I agree with what Oleg said. I think there are two-stages processing here: 1) osd should be able to clean "value" from compatibility stuff like inodes 2) mdd should be able to store this value in own buffer in, possible, some efficient form>>>>> di wang (dw) writes:dw> Hello, dw> In new MDS stack, osd suppose do not understand Directory (IMHO), and dw> it only provides dw> index API. MDD will use the index API to fill the dir entries. dw> The filldir in current EA index API is just for compatible with old version dw> of MDS. Maybe we need bulk index API for readdir ? But Alex should dw> know more. dw> Thanks dw> WangDi dw> Oleg Drokin wrote: >> Hello! >> >> While inspecting patches in bug 17560 it struck me as odd that >> we are fetching the data one entry at a time from the filesystem, >> so I asked a many-entries at a time >> approach to be implemented. Now the patch in the bug just fills >> the osd buffer with entries (after parsing them) and then actual >> user (e.g. __mdd_readpage) >> reparses the osd buffer and copies data to its own. This is a >> lot of wasted processing, copies and then some extra memory usage. >> So in my opinion the iterator interface should be reworked to >> accept an iterator and a buffer and then __mdd_readdir will pass in >> its own >> parsing of the data (with possible generalized preparsing from >> osd so that we are fs-neutral) and will directly fill its own >> buffer to the amount it actually needs, >> kind of like any kernel filldir functions already work right now. >> >> Am I missing something stupid? Are there any objections, would >> this break anything? >> >> Bye, >> Oleg >> _______________________________________________ >> Lustre-devel mailing list >> Lustre-devel at lists.lustre.org >> http://lists.lustre.org/mailman/listinfo/lustre-devel >> -- thanks, Alex
2009/4/6 Alex Zhuravlev <bzzz at sun.com> Hello,> I agree with what Oleg said. I think there are two-stages processing here: > 1) osd should be able to clean "value" from compatibility stuff like inodes > 2) mdd should be able to store this value in own buffer in, possible, some > efficient form > > >>>>> di wang (dw) writes: >[...]> > >> > >> Am I missing something stupid? Are there any objections, would > >> this break anything? >the reason dt_it_ops have one-entry-at-a-time interface is that originally iterators were able to update indices as well. *E.g.*, it was expected that /orphan index processing would read one entry, handle it, and then delete it from the index (a possibly mistaken belief in the virtue of simplicity and evilness of premature optimization also played its r?le). Eventually all existing use cases for the updating iterators were eliminated and ->set_*() methods were dropped from dt_it_ops.> >> > >> Bye, > >> Oleg >Nikita. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.lustre.org/pipermail/lustre-devel/attachments/20090406/5b040647/attachment.html