tao.peng at emc.com
2012-Jan-11 14:06 UTC
[Lustre-discuss] question about dcache revalidate
Hi, I was reading dcache.c and following comments in ll_revalidate_it() seem confusing. Does it mean llite can hash a positive dentry to dcache without taking inode LOOKUP lock? 589 /* 590 * This part is here to combat evil-evil race in real_lookup on 2.6 591 * kernels. The race details are: We enter do_lookup() looking for some 592 * name, there is nothing in dcache for this name yet and d_lookup() 593 * returns NULL. We proceed to real_lookup(), and while we do this, 594 * another process does open on the same file we looking up (most simple 595 * reproducer), open succeeds and the dentry is added. Now back to 596 * us. In real_lookup() we do d_lookup() again and suddenly find the 597 * dentry, so we call d_revalidate on it, but there is no lock, so 598 * without this code we would return 0, but unpatched real_lookup just 599 * returns -ENOENT in such a case instead of retrying the lookup. Once 600 * this is dealt with in real_lookup(), all of this ugly mess can go and 601 * we can just check locks in ->d_revalidate without doing any RPCs 602 * ever. 603 */ Best Regards, Tao
No, to add a dentry to hash client needs holding LOOKUP lock, but lustre client unhash (see ll_unhash_aliases()) doesn''t really remove dentry from hash, but set LUSTRE_DCACHE_INVALID flag. So in the race you mentioned, another process may add the dentry but later the lock is canceled soon, so at the time of revalidate, it needs to do a real lookup. On Wed, Jan 11, 2012 at 10:06 PM, <tao.peng at emc.com> wrote:> Hi, > > I was reading dcache.c and following comments in ll_revalidate_it() seem > confusing. Does it mean llite can hash a positive dentry to dcache without > taking inode LOOKUP lock? > > 589 /* > 590 * This part is here to combat evil-evil race in real_lookup > on 2.6 > 591 * kernels. The race details are: We enter do_lookup() > looking for some > 592 * name, there is nothing in dcache for this name yet and > d_lookup() > 593 * returns NULL. We proceed to real_lookup(), and while we do > this, > 594 * another process does open on the same file we looking up > (most simple > 595 * reproducer), open succeeds and the dentry is added. Now > back to > 596 * us. In real_lookup() we do d_lookup() again and suddenly > find the > 597 * dentry, so we call d_revalidate on it, but there is no > lock, so > 598 * without this code we would return 0, but unpatched > real_lookup just > 599 * returns -ENOENT in such a case instead of retrying the > lookup. Once > 600 * this is dealt with in real_lookup(), all of this ugly mess > can go and > 601 * we can just check locks in ->d_revalidate without doing any > RPCs > 602 * ever. > 603 */ > > Best Regards, > Tao > > > _______________________________________________ > Lustre-discuss mailing list > Lustre-discuss at lists.lustre.org > http://lists.lustre.org/mailman/listinfo/lustre-discuss >-------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.lustre.org/pipermail/lustre-discuss/attachments/20120112/422914ba/attachment.html
tao.peng at emc.com
2012-Jan-12 10:35 UTC
[Lustre-discuss] question about dcache revalidate
I see. Thanks a lot for the explanation. Cheers, Tao From: Lai Siyao [mailto:laisiyao at whamcloud.com] Sent: Thursday, January 12, 2012 4:53 PM To: Peng, Tao Cc: Lustre-discuss at lists.lustre.org Subject: Re: [Lustre-discuss] question about dcache revalidate No, to add a dentry to hash client needs holding LOOKUP lock, but lustre client unhash (see ll_unhash_aliases()) doesn''t really remove dentry from hash, but set LUSTRE_DCACHE_INVALID flag. So in the race you mentioned, another process may add the dentry but later the lock is canceled soon, so at the time of revalidate, it needs to do a real lookup. On Wed, Jan 11, 2012 at 10:06 PM, <tao.peng at emc.com<mailto:tao.peng at emc.com>> wrote: Hi, I was reading dcache.c and following comments in ll_revalidate_it() seem confusing. Does it mean llite can hash a positive dentry to dcache without taking inode LOOKUP lock? 589 /* 590 * This part is here to combat evil-evil race in real_lookup on 2.6 591 * kernels. The race details are: We enter do_lookup() looking for some 592 * name, there is nothing in dcache for this name yet and d_lookup() 593 * returns NULL. We proceed to real_lookup(), and while we do this, 594 * another process does open on the same file we looking up (most simple 595 * reproducer), open succeeds and the dentry is added. Now back to 596 * us. In real_lookup() we do d_lookup() again and suddenly find the 597 * dentry, so we call d_revalidate on it, but there is no lock, so 598 * without this code we would return 0, but unpatched real_lookup just 599 * returns -ENOENT in such a case instead of retrying the lookup. Once 600 * this is dealt with in real_lookup(), all of this ugly mess can go and 601 * we can just check locks in ->d_revalidate without doing any RPCs 602 * ever. 603 */ Best Regards, Tao _______________________________________________ Lustre-discuss mailing list Lustre-discuss at lists.lustre.org<mailto:Lustre-discuss at lists.lustre.org> http://lists.lustre.org/mailman/listinfo/lustre-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.lustre.org/pipermail/lustre-discuss/attachments/20120112/a2888538/attachment.html
Hello! On Jan 12, 2012, at 3:52 AM, Lai Siyao wrote:> No, to add a dentry to hash client needs holding LOOKUP lock, but lustre client unhash (see ll_unhash_aliases()) doesn''t really remove dentry from hash, but set LUSTRE_DCACHE_INVALID flag. So in the race you mentioned, another process may add the dentry but later the lock is canceled soon, so at the time of revalidate, it needs to do a real lookup.In fact there''s more to it. Certain operations like open, for example, don''t actually get a lock from server so if there was no lock to begin with, we add the "known invalid" entry to the hash and immediately mark it as invalid, no later lock cancel needed to trigger this. Bye, Oleg> > On Wed, Jan 11, 2012 at 10:06 PM, <tao.peng at emc.com> wrote: > Hi, > > I was reading dcache.c and following comments in ll_revalidate_it() seem confusing. Does it mean llite can hash a positive dentry to dcache without taking inode LOOKUP lock? > > 589 /* > 590 * This part is here to combat evil-evil race in real_lookup on 2.6 > 591 * kernels. The race details are: We enter do_lookup() looking for some > 592 * name, there is nothing in dcache for this name yet and d_lookup() > 593 * returns NULL. We proceed to real_lookup(), and while we do this, > 594 * another process does open on the same file we looking up (most simple > 595 * reproducer), open succeeds and the dentry is added. Now back to > 596 * us. In real_lookup() we do d_lookup() again and suddenly find the > 597 * dentry, so we call d_revalidate on it, but there is no lock, so > 598 * without this code we would return 0, but unpatched real_lookup just > 599 * returns -ENOENT in such a case instead of retrying the lookup. Once > 600 * this is dealt with in real_lookup(), all of this ugly mess can go and > 601 * we can just check locks in ->d_revalidate without doing any RPCs > 602 * ever. > 603 */ > > Best Regards, > Tao > > > _______________________________________________ > Lustre-discuss mailing list > Lustre-discuss at lists.lustre.org > http://lists.lustre.org/mailman/listinfo/lustre-discuss > > _______________________________________________ > Lustre-discuss mailing list > Lustre-discuss at lists.lustre.org > http://lists.lustre.org/mailman/listinfo/lustre-discuss-- Oleg Drokin Senior Software Engineer Whamcloud, Inc.
tao.peng at emc.com
2012-Jan-13 11:44 UTC
[Lustre-discuss] question about dcache revalidate
Oleg hi,> -----Original Message----- > From: Oleg Drokin [mailto:green at whamcloud.com] > Sent: Friday, January 13, 2012 12:04 AM > To: Lai Siyao > Cc: Peng, Tao; Lustre-discuss at lists.lustre.org > Subject: Re: [Lustre-discuss] question about dcache revalidate > > Hello! > > On Jan 12, 2012, at 3:52 AM, Lai Siyao wrote: > > > No, to add a dentry to hash client needs holding LOOKUP lock, but lustre client unhash (see > ll_unhash_aliases()) doesn''t really remove dentry from hash, but set LUSTRE_DCACHE_INVALID flag. So in > the race you mentioned, another process may add the dentry but later the lock is canceled soon, so at > the time of revalidate, it needs to do a real lookup. > > In fact there''s more to it. > Certain operations like open, for example, don''t actually get a lock from server so if there was no > lock to begin with, we add > the "known invalid" entry to the hash and immediately mark it as invalid, no later lock cancel needed > to trigger this. >Are you referring to ll_lookup_it_finish() that flag negative dentry DCACHE_LUSTRE_INVALID and hash/unhash it immediately if client doesn''t have UDATE lock? Looks like such dentry is skipped during link path walk as ll_dcompare checks DCACHE_LUSTRE_INVALID. Or am I looking at wrong place? Thanks, Tao> Bye, > Oleg > > > > On Wed, Jan 11, 2012 at 10:06 PM, <tao.peng at emc.com> wrote: > > Hi, > > > > I was reading dcache.c and following comments in ll_revalidate_it() seem confusing. Does it mean > llite can hash a positive dentry to dcache without taking inode LOOKUP lock? > > > > 589 /* > > 590 * This part is here to combat evil-evil race in real_lookup on 2.6 > > 591 * kernels. The race details are: We enter do_lookup() looking for some > > 592 * name, there is nothing in dcache for this name yet and d_lookup() > > 593 * returns NULL. We proceed to real_lookup(), and while we do this, > > 594 * another process does open on the same file we looking up (most simple > > 595 * reproducer), open succeeds and the dentry is added. Now back to > > 596 * us. In real_lookup() we do d_lookup() again and suddenly find the > > 597 * dentry, so we call d_revalidate on it, but there is no lock, so > > 598 * without this code we would return 0, but unpatched real_lookup just > > 599 * returns -ENOENT in such a case instead of retrying the lookup. Once > > 600 * this is dealt with in real_lookup(), all of this ugly mess can go and > > 601 * we can just check locks in ->d_revalidate without doing any RPCs > > 602 * ever. > > 603 */ > > > > Best Regards, > > Tao > > > > > > _______________________________________________ > > Lustre-discuss mailing list > > Lustre-discuss at lists.lustre.org > > http://lists.lustre.org/mailman/listinfo/lustre-discuss > > > > _______________________________________________ > > Lustre-discuss mailing list > > Lustre-discuss at lists.lustre.org > > http://lists.lustre.org/mailman/listinfo/lustre-discuss > > -- > Oleg Drokin > Senior Software Engineer > Whamcloud, Inc. >
> > > On Jan 12, 2012, at 3:52 AM, Lai Siyao wrote: > > > > > No, to add a dentry to hash client needs holding LOOKUP lock, but > lustre client unhash (see > > ll_unhash_aliases()) doesn''t really remove dentry from hash, but set > LUSTRE_DCACHE_INVALID flag. So in > > the race you mentioned, another process may add the dentry but later the > lock is canceled soon, so at > > the time of revalidate, it needs to do a real lookup. > > > > In fact there''s more to it. > > Certain operations like open, for example, don''t actually get a lock > from server so if there was no > > lock to begin with, we add > > the "known invalid" entry to the hash and immediately mark it as > invalid, no later lock cancel needed > > to trigger this. > > > Are you referring to ll_lookup_it_finish() that flag negative dentry > DCACHE_LUSTRE_INVALID and hash/unhash it immediately if client doesn''t have > UDATE lock? Looks like such dentry is skipped during link path walk as > ll_dcompare checks DCACHE_LUSTRE_INVALID. Or am I looking at wrong place? > >Normally lustre client doesn''t fetch any child lock for lookup(IT_OPEN), but a child dentry will be inserted into hash, however this dentry is marked INVALID, and it will be freed after file->release(), that means, no other processes will use the dentry (in this case other processes will create another dentry for this file and add to hash too). So Oleg means a dentry may be inserted into hash without LOOKUP lock, but this dentry won''t be used by others, so it will not cause the race you mentioned. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.lustre.org/pipermail/lustre-discuss/attachments/20120117/6e929efc/attachment.html
tao.peng at emc.com
2012-Jan-17 09:13 UTC
[Lustre-discuss] question about dcache revalidate
> On Jan 12, 2012, at 3:52 AM, Lai Siyao wrote: > > > No, to add a dentry to hash client needs holding LOOKUP lock, but lustre client unhash (see > ll_unhash_aliases()) doesn''t really remove dentry from hash, but set LUSTRE_DCACHE_INVALID flag. So in > the race you mentioned, another process may add the dentry but later the lock is canceled soon, so at > the time of revalidate, it needs to do a real lookup. > > In fact there''s more to it. > Certain operations like open, for example, don''t actually get a lock from server so if there was no > lock to begin with, we add > the "known invalid" entry to the hash and immediately mark it as invalid, no later lock cancel needed > to trigger this. > > Are you referring to ll_lookup_it_finish() that flag negative dentry DCACHE_LUSTRE_INVALID and hash/unhash it immediately if client doesn''t have UDATE lock? Looks like such dentry is skipped during link path walk as ll_dcompare checks DCACHE_LUSTRE_INVALID. Or am I looking at wrong place??> Normally lustre client doesn''t fetch any child lock for lookup(IT_OPEN), but a child dentry will be inserted into hash, however this dentry is marked INVALID, and it will be freed after file->release(), that means, no other processes will use the dentry (in this case other processes will create another dentry for this file and add to hash too). So Oleg means a dentry may be inserted into hash without LOOKUP lock, but this dentry won''t be used by others, so it will not cause the race you mentioned.I see. mdc_intent_open_pack() only asks for child OPEN and parent UPDATE lock. Thanks very much! Best, Tao
On Tue, Jan 17, 2012 at 5:13 PM, <tao.peng at emc.com> wrote:> > On Jan 12, 2012, at 3:52 AM, Lai Siyao wrote: > > > > > No, to add a dentry to hash client needs holding LOOKUP lock, but > lustre client unhash (see > > ll_unhash_aliases()) doesn''t really remove dentry from hash, but set > LUSTRE_DCACHE_INVALID flag. So in > > the race you mentioned, another process may add the dentry but later the > lock is canceled soon, so at > > the time of revalidate, it needs to do a real lookup. > > > > In fact there''s more to it. > > Certain operations like open, for example, don''t actually get a lock > from server so if there was no > > lock to begin with, we add > > the "known invalid" entry to the hash and immediately mark it as > invalid, no later lock cancel needed > > to trigger this. > > > > Are you referring to ll_lookup_it_finish() that flag negative dentry > DCACHE_LUSTRE_INVALID and hash/unhash it immediately if client doesn''t have > UDATE lock? Looks like such dentry is skipped during link path walk as > ll_dcompare checks DCACHE_LUSTRE_INVALID. Or am I looking at wrong place? > > > Normally lustre client doesn''t fetch any child lock for lookup(IT_OPEN), > but a child dentry will be inserted into hash, however this dentry is > marked INVALID, and it will be freed after file->release(), that means, no > other processes will use the dentry (in this case other processes will > create another dentry for this file and add to hash too). So Oleg means a > dentry may be inserted into hash without LOOKUP lock, but this dentry won''t > be used by others, so it will not cause the race you mentioned. > > I see. mdc_intent_open_pack() only asks for child OPEN and parent UPDATE > lock. Thanks very much! >No, generally only nfsd-exported lustre client requests child OPEN lock. Bye, - Lai -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.lustre.org/pipermail/lustre-discuss/attachments/20120117/d0a8a549/attachment.html
tao.peng at emc.com
2012-Jan-17 10:24 UTC
[Lustre-discuss] question about dcache revalidate
Got it. Thanks. Best, Tao From: Lai Siyao [mailto:laisiyao at whamcloud.com] Sent: Tuesday, January 17, 2012 5:44 PM To: Peng, Tao Cc: green at whamcloud.com; Lustre-discuss at lists.lustre.org Subject: Re: [Lustre-discuss] question about dcache revalidate On Tue, Jan 17, 2012 at 5:13 PM, <tao.peng at emc.com<mailto:tao.peng at emc.com>> wrote:> On Jan 12, 2012, at 3:52 AM, Lai Siyao wrote: > > > No, to add a dentry to hash client needs holding LOOKUP lock, but lustre client unhash (see > ll_unhash_aliases()) doesn''t really remove dentry from hash, but set LUSTRE_DCACHE_INVALID flag. So in > the race you mentioned, another process may add the dentry but later the lock is canceled soon, so at > the time of revalidate, it needs to do a real lookup. > > In fact there''s more to it. > Certain operations like open, for example, don''t actually get a lock from server so if there was no > lock to begin with, we add > the "known invalid" entry to the hash and immediately mark it as invalid, no later lock cancel needed > to trigger this. > > Are you referring to ll_lookup_it_finish() that flag negative dentry DCACHE_LUSTRE_INVALID and hash/unhash it immediately if client doesn''t have UDATE lock? Looks like such dentry is skipped during link path walk as ll_dcompare checks DCACHE_LUSTRE_INVALID. Or am I looking at wrong place?> Normally lustre client doesn''t fetch any child lock for lookup(IT_OPEN), but a child dentry will be inserted into hash, however this dentry is marked INVALID, and it will be freed after file->release(), that means, no other processes will use the dentry (in this case other processes will create another dentry for this file and add to hash too). So Oleg means a dentry may be inserted into hash without LOOKUP lock, but this dentry won''t be used by others, so it will not cause the race you mentioned.I see. mdc_intent_open_pack() only asks for child OPEN and parent UPDATE lock. Thanks very much! No, generally only nfsd-exported lustre client requests child OPEN lock. Bye, - Lai -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.lustre.org/pipermail/lustre-discuss/attachments/20120117/0d23a4ad/attachment-0001.html
Hello! On Jan 17, 2012, at 3:26 AM, Lai Siyao wrote:> Are you referring to ll_lookup_it_finish() that flag negative dentry DCACHE_LUSTRE_INVALID and hash/unhash it immediately if client doesn''t have UDATE lock? Looks like such dentry is skipped during link path walk as ll_dcompare checks DCACHE_LUSTRE_INVALID. Or am I looking at wrong place? > Normally lustre client doesn''t fetch any child lock for lookup(IT_OPEN), but a child dentry will be inserted into hash, however this dentry is marked INVALID, and it will be freed after file->release(), that means, no other processes will use the dentry (in this case other processes will create another dentry for this file and add to hash too). So Oleg means a dentry may be inserted into hash without LOOKUP lock, but this dentry won''t be used by others, so it will not cause the race you mentioned.Actually that was not always the case, the check in d_compare was added a few years later and before then it was perfectly possible to find invalid dentries and that''s why we had this d_revalidate check hitting. Bye, Oleg -- Oleg Drokin Senior Software Engineer Whamcloud, Inc.
Hello! On Jan 17, 2012, at 4:43 AM, Lai Siyao wrote:> > Normally lustre client doesn''t fetch any child lock for lookup(IT_OPEN), but a child dentry will be inserted into hash, however this dentry is marked INVALID, and it will be freed after file->release(), that means, no other processes will use the dentry (in this case other processes will create another dentry for this file and add to hash too). So Oleg means a dentry may be inserted into hash without LOOKUP lock, but this dentry won''t be used by others, so it will not cause the race you mentioned. > I see. mdc_intent_open_pack() only asks for child OPEN and parent UPDATE lock. Thanks very much! > No, generally only nfsd-exported lustre client requests child OPEN lock.In fact we only did this for NFS case because that was the biggest abuser of repeat open-close at the same time. And at the time there was a hard limit on number of locks a client can cache and this caused extra slowdowns in mdtest-like workloads where opens were slowed down by need to also cancel locks before doing opens. I think attempting to revive the "always grant open lock on opens" logic might be worthwhile. Bye, Oleg -- Oleg Drokin Senior Software Engineer Whamcloud, Inc.