Phil Schwan
2009-Nov-19 12:06 UTC
[Lustre-discuss] renamed directory retains a dentry under its old name?
Hello old friends!  I return with a gift, like an almost-forgotten uncle visiting from a faraway land.

I have an interesting issue, on 1.6.6:

# cat /proc/fs/lustre/version
lustre: 1.6.6
kernel: patchless
build:  1.6.6-19700101080000-PRISTINE-.usr.src.kernels.2.6.18-128.1.16.el5-x86_64.-2.6.18-128.1.16.el5

Consider this setup:

- one task creates a "foo.working" directory, does all its work inside, then renames it to "foo.done"

- another task polls, waiting for "foo.working" to disappear.  All of this occurs on one node.

- the problem: the rename occurred, but "foo.working" remains as a valid dentry!

Witness:

node1$ ls
tape18            tapeLabel_19.txt  tid64.done     tid67.working  tid70.working
tape19.working    tid62.done        tid65.done     tid68.working
tapeLabel_18.txt  tid63.done        tid66.working  tid69

-- Note the absence of a "tid65.working" directory

node1$ stat tid65.working
  File: `tid65.working'
  Size: 12288           Blocks: 24         IO Block: 4096   directory
Device: 6d48dd40h/1833491776d   Inode: 76317251    Links: 3
Access: (2775/drwxrwsr-x)  Uid: ( 3005/ stuartm)   Gid: ( 2000/    prod)
Access: 2009-11-18 17:14:26.000000000 +0800
Modify: 2009-11-18 16:08:01.000000000 +0800
Change: 2009-11-18 16:08:01.000000000 +0800

This is unique to node1.  On node2:

node2$ stat tid65.working
stat: cannot stat `tid65.working': No such file or directory

Attached is an lnet.debug=-1 log of the stat on node1, in which we can see it revalidating the dentry for a directory that no longer exists.  A snapshot of that lock in the DLM cache reveals no obvious abnormal pathology:

00010000:00010000:7:1258537200.873945:0:30794:0:(ldlm_resource.c:1116:ldlm_resource_dump()) --- Resource: ffff810133749500 (76317251/3438387721/0/0) (rc: 3)
00010000:00010000:7:1258537200.873947:0:30794:0:(ldlm_resource.c:1120:ldlm_resource_dump()) Granted locks:
00010000:00010000:7:1258537200.873948:0:30794:0:(ldlm_lock.c:1729:ldlm_lock_dump()) -- Lock dump: ffff8101ac7d2c00/0x3b4ffa12a31eb406 (rc: 1) (pos: 1) (pid: 28837)
00010000:00010000:7:1258537200.873950:0:30794:0:(ldlm_lock.c:1742:ldlm_lock_dump())   Node: NID 172.16.0.251@tcp (rhandle: 0x3ad2b5ae2b9e570a)
00010000:00010000:7:1258537200.873951:0:30794:0:(ldlm_lock.c:1746:ldlm_lock_dump())   Resource: ffff810133749500 (76317251/3438387721)
00010000:00010000:7:1258537200.873953:0:30794:0:(ldlm_lock.c:1751:ldlm_lock_dump())   Req mode: CR, grant mode: CR, rc: 1, read: 0, write: 0 flags: 0x0
00010000:00010000:7:1258537200.873954:0:30794:0:(ldlm_lock.c:1765:ldlm_lock_dump())   Bits: 0x3

We stopped the job when it became clear that it would never finish.  Eventually that lock did disappear -- likely just due to normal DLM turnover -- and the problem resolved itself.  If the task had been allowed to continue, however, constantly stat()ing that dead directory, the lock would have remained at the bottom of the LRU -- and thus it would be an effectively infinite loop!

In the interest of full disclosure, there were some of these in the dmesg.  But they were from several days prior to the creation even of the parent directory, so I think it's very unlikely that they are related:

BUG: warning at fs/inotify.c:202/set_dentry_child_flags() (Tainted: G     )
Call Trace:
 [<ffffffff800f2777>] set_dentry_child_flags+0xef/0x14d
 [<ffffffff800f280d>] remove_watch_no_event+0x38/0x47
 [<ffffffff800f2834>] inotify_remove_watch_locked+0x18/0x3b
 [<ffffffff800f296f>] inotify_rm_wd+0x8d/0xb6
 [<ffffffff800f2ee5>] sys_inotify_rm_watch+0x46/0x63
 [<ffffffff8005e28d>] tracesys+0xd5/0xe0

Has this cheerful missive induced an "A ha!" moment in anyone that would explain this?  Have I overlooked something important?

Much like those given by your own almost-forgotten uncles, this gift was essentially a pair of itchy wool socks.  I hope you will forgive me.

Cheers,
-p

(nobody handles the moderator queue any more, eh?)

-------------- next part --------------
A non-text attachment was scrubbed...
Name: lustre-debug.log
Type: text/x-log
Size: 12507 bytes
Desc: not available
Url : http://lists.lustre.org/pipermail/lustre-discuss/attachments/20091119/fb242590/attachment.bin
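For readers trying to picture the workload, the two tasks described above amount to something like the following pair of shell snippets (a minimal sketch; the actual production scripts are not part of this thread, and the tid65 naming is purely illustrative):

  # task 1: build the directory under a temporary name, then rename it into place
  mkdir tid65.working
  # ... populate tid65.working ...
  mv tid65.working tid65.done

  # task 2, on the same node: poll until the working name disappears
  while stat tid65.working >/dev/null 2>&1; do
      sleep 1
  done
  echo "tid65.working is gone; tid65.done is ready"

On the affected node the second loop never exits, because stat() keeps succeeding against the stale dentry long after the rename.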
Brian J. Murrell
2009-Nov-19 12:47 UTC
[Lustre-discuss] renamed directory retains a dentry under its old name?
On Thu, 2009-11-19 at 20:06 +0800, Phil Schwan wrote:
> Hello old friends!

Heya phik.

> I return with a gift, like an almost-forgotten
> uncle visiting from a faraway land.

Yay!

> - one task creates a "foo.working" directory, does all its work
> inside, then renames it to "foo.done"
>
> - another task polls, waiting for "foo.working" to disappear.  All of
> this occurs on one node.
>
> - the problem: the rename occurred, but "foo.working" remains as a valid dentry!

So would this be a valid reproducer:

tty1$ mkdir foo.working
tty2$ while stat foo.working; do echo; done
tty2: [ loop continues, stating foo.working ]
tty1$ mv foo.{working,done}

Where the loop on tty2 should continue beyond the mv on tty1?

I tried the above on:

lustre: 1.8.1.50
kernel: patchless_client
build: b1_8-20090921170221-CHANGED-2.6.28-11-generic

and the loop on tty2 terminates promptly on the mv on tty1.  Maybe my reproducer is missing some detail?  Does such a simple reproducer reproduce the problem on your system?

b.
Phil Schwan
2009-Nov-19 13:09 UTC
[Lustre-discuss] renamed directory retains a dentry under its old name?
G'day Brian!  Glad to see you've stuck around!

2009/11/19 Brian J. Murrell <Brian.Murrell at sun.com>:
>
> So would this be a valid reproducer:
>
> tty1$ mkdir foo.working
> tty2$ while stat foo.working; do echo; done
> tty2: [ loop continues, stating foo.working ]
> tty1$ mv foo.{working,done}
>
> Where the loop on tty2 should continue beyond the mv on tty1?

It should be, except it isn't.  Your steps are correct, I believe (Stu may correct me), but we also tried that without any luck.

I'm not much involved in our production ops (just helping out with this bug), but I believe that this is a relatively new idiom for our scripts.  It's common for our scripts to poll waiting for a file to _exist_, but polling for a file to _not_ exist is new.  ...so it's hard to say precisely how rare or common this problem is, yet.  But alas, our simple test didn't work either.

Cheers,
-p
Brian J. Murrell
2009-Nov-19 13:34 UTC
[Lustre-discuss] renamed directory retains a dentry under its old name?
On Thu, 2009-11-19 at 21:09 +0800, Phil Schwan wrote:
> G'day Brian!  Glad to see you've stuck around!

:-)

> It should be, except it isn't.

Don't ya just hate those?

> Your steps are correct, I believe (Stu
> may correct me), but we also tried that without any luck.

Hrm.  Must be more subtle, then.  Perhaps somebody else here might recognize those symptoms.  Quite a few releases have gone by since 1.6.6, so it's entirely possible it's been fixed, as I am sure you realize.  So let's see if this one rings a bell with somebody.

In the meanwhile, if you come up with a reliable reproducer, feel free to post it here and we can see if we can reproduce it -- or not, in an effort at least to see if it's been fixed already.

Cheers,
b.
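Since a single mkdir/rename under a concurrent stat loop did not trip it, a reproducer probably needs to churn through many create/rename cycles on the same client.  A brute-force sketch along those lines (an untested guess at a trigger, not a verified reproducer; the tid naming just mirrors the listing in the first message):

  #!/bin/sh
  # producer: run through many working -> done renames, never reusing a name
  i=0
  while [ "$i" -lt 1000 ]; do
      mkdir "tid$i.working"
      touch "tid$i.working/payload"
      mv "tid$i.working" "tid$i.done"
      i=$((i + 1))
  done

Run concurrently in another shell on the same client:

  #!/bin/sh
  # checker: a .done directory whose .working name still stat()s is a stale dentry
  while :; do
      for d in tid*.done; do
          [ -d "$d" ] || continue
          w="${d%.done}.working"
          stat "$w" >/dev/null 2>&1 && echo "stale: $w still stats while $d exists"
      done
      # also stat the live .working names so their dentries are cached before the rename
      stat tid*.working >/dev/null 2>&1
  done

Because names are never reused, any hit printed by the checker means an old name kept a valid dentry after its rename.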
Oleg Drokin
2009-Nov-20 02:52 UTC
[Lustre-discuss] renamed directory retains a dentry under its old name?
Hello!

On Nov 19, 2009, at 7:06 AM, Phil Schwan wrote:
> Hello old friends!  I return with a gift, like an almost-forgotten
> uncle visiting from a faraway land.

Long time no see! ;)

> I have an interesting issue, on 1.6.6:
>
> # cat /proc/fs/lustre/version
> lustre: 1.6.6
> kernel: patchless
> build:  1.6.6-19700101080000-PRISTINE-.usr.src.kernels.2.6.18-128.1.16.el5-x86_64.-2.6.18-128.1.16.el5
>
> Consider this setup:
>
> - one task creates a "foo.working" directory, does all its work
> inside, then renames it to "foo.done"
>
> - another task polls, waiting for "foo.working" to disappear.  All of
> this occurs on one node.
>
> - the problem: the rename occurred, but "foo.working" remains as a valid dentry!

Well, this might be bug 2969, I would think.  But it depends on how the second task is polling.  There were several things done to avert it in the past; I wonder if 1.8.2 would work better for you.

> Witness:
>
> node1$ ls
> tape18            tapeLabel_19.txt  tid64.done     tid67.working  tid70.working
> tape19.working    tid62.done        tid65.done     tid68.working
> tapeLabel_18.txt  tid63.done        tid66.working  tid69
>
> -- Note the absence of a "tid65.working" directory
>
> node1$ stat tid65.working
>   File: `tid65.working'
>   Size: 12288           Blocks: 24         IO Block: 4096   directory
> Device: 6d48dd40h/1833491776d   Inode: 76317251    Links: 3
> Access: (2775/drwxrwsr-x)  Uid: ( 3005/ stuartm)   Gid: ( 2000/    prod)
> Access: 2009-11-18 17:14:26.000000000 +0800
> Modify: 2009-11-18 16:08:01.000000000 +0800
> Change: 2009-11-18 16:08:01.000000000 +0800
>
> This is unique to node1.  On node2:

Yes, I think this does match bug 2969 behavior.  We add an entry to the dcache without a lock (not visible in the trace you provided).  Then we do the rename, then we do some sort of stat on the renamed entry and reobtain the lock.  Then we do a stat on the old name, and since the lock is on the inode, we find the newly reinstantiated lock and declare the old dentry valid.

> We stopped the job when it became clear that it would never finish.
> Eventually that lock did disappear -- likely just due to normal DLM
> turnover -- and the problem resolved itself.  If the task had been
> allowed to continue, however, constantly stat()ing that dead
> directory, the lock would have remained at the bottom of the LRU --
> and thus it would be an effectively infinite loop!

I am sorry to hear we came back to haunt you after all this time.

I wonder if my patch for bug 20323 would have helped this case (or would have just always returned the lock), though on the other hand this inode came from mkdir and so might never have gone through the open path.  Bug 16417 is what landed in 1.8.2; it is a complete rework of the dcache caching logic for dentries and has a better chance of fixing this, I would say.  If not, it would be great if the log could start earlier in time, definitely before the rename happens.

I hope this problem did not ruin your day in the end.  And we do miss you.  Does your coming with such a question mean you are on your way back to us? ;)

Bye,
    Oleg
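Spelled out as a client-side sequence, the ordering Oleg describes would look roughly like this (a sketch of the hypothesis only; exactly which operation instantiates the lock-less dentry in step 1 is the part that was not visible in the trace):

  # 1. the old name gets a dcache entry without a DLM lock behind it
  #    (e.g. via some lookup path that does not leave a lock -- assumed here)
  ls foo.working

  # 2. the directory is renamed; the inode keeps its identity
  mv foo.working foo.done

  # 3. the new name is stat()ed, re-acquiring a lock on that same inode
  stat foo.done

  # 4. revalidation of the cached old-name dentry finds the lock on the inode
  #    and declares "foo.working" valid, so this stat succeeds on this client
  stat foo.working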
Phil Schwan
2009-Nov-22 09:24 UTC
[Lustre-discuss] renamed directory retains a dentry under its old name?
Hello, my good man!

2009/11/20 Oleg Drokin <Oleg.Drokin at sun.com>:
>
> Yes, I think this does match bug 2969 behavior.  We add an entry to the
> dcache without a lock (not visible in the trace you provided).  Then we do
> the rename, then we do some sort of stat on the renamed entry and reobtain
> the lock.  Then we do a stat on the old name, and since the lock is on the
> inode, we find the newly reinstantiated lock and declare the old dentry valid.

Hmm.  My first instinct was that there shouldn't be dentries without locks, but it's been sufficiently long that I can't remember all the details of the dentry life cycle.  What you wrote certainly sounds like a plausible explanation.

> I wonder if my patch for bug 20323 would have helped this case (or would
> have just always returned the lock), though on the other hand this inode
> came from mkdir and so might never have gone through the open path.
> Bug 16417 is what landed in 1.8.2; it is a complete rework of the dcache
> caching logic for dentries and has a better chance of fixing this, I would
> say.  If not, it would be great if the log could start earlier in time,
> definitely before the rename happens.
>
> I hope this problem did not ruin your day in the end.

No, no, it's fairly minor for us.  I just wanted to report it in case we were the first to experience it.  We will probably upgrade at some point, but we had a less-than-perfect experience with an upgrade this year, so I imagine Stu will wait until it's really necessary.

Thanks for looking into it.  From skimming 16417, I agree that it stands a good chance of fixing it, or at least permutes the system sufficiently that it'd be worth trying to reproduce it again after an upgrade.

> And we do miss you.  Does your coming with such a question mean you are on
> your way back to us? ;)

You all seem to be doing just fine without me. :)  We certainly make intense use of Lustre in our unending quest to find dinosaur blood, and it serves us very well.

Cheers,
-p
Stuart Midgley
2009-Nov-22 09:35 UTC
[Lustre-discuss] renamed directory retains a dentry under its old name?
Lustre serves us very, very well.  As Phil points out, these sorts of bugs are not show-stoppers for us; we report them to improve the code.  We have 30 OSSs and ~300 TB of storage under it.

--
Dr Stuart Midgley
sdm900 at gmail.com