Joel Becker
2005-Sep-21 04:15 UTC
[Ocfs2-devel] Re: [patch] stop inotify from sending random DELETE_SELF event under load
On Wed, Sep 21, 2005 at 09:35:25AM +0100, Christoph Hellwig wrote:> On Wed, Sep 21, 2005 at 03:36:01AM +0100, Al Viro wrote: > > I have no problems with killing ->drop_inode(), but that should be > > a) done for in-tree filesystems > > b) announced on fsdevel, so that out-of-tree folks could deal > > with that > > c) given at least one release to avoid screwing them. > > sure. Note that clusterfs folks (ocfs2 in particular) really want > ->drop_inode because they need additional checks instead of just the > nlink one in there. While hugetlbfs should just go away ->drop_inode > makes some sense for them.My apologies for not having read the inotify thread, I'll go look in the morning. In ->drop_inode(), OCFS2 takes care of noticing that nlink has been changed by a remote node. This is necessary for generic_drop...delete operation to proceed. If OCFS2 had to go back to the 2.4 method of checking i_count==1 in ->put_inode(), I'm not sure we're allowed to modify i_nlink there unlocked, are we? I also think we had some sort of race with inode_lock that ->drop_inode() avoids, but I'm not sure. Mark? Joel -- "For every complex problem there exists a solution that is brief, concise, and totally wrong." -Unknown Joel Becker Principal Software Developer Oracle E-mail: joel.becker@oracle.com Phone: (650) 506-8127
Mark Fasheh
2005-Sep-21 13:08 UTC
[Ocfs2-devel] Re: [patch] stop inotify from sending random DELETE_SELF event under load
On Wed, Sep 21, 2005 at 07:45:38AM -0700, Joel Becker wrote:> On Wed, Sep 21, 2005 at 10:17:38AM +0100, Christoph Hellwig wrote: > > The real fix would be to put an equivalent of OCFS2_INODE_MAYBE_ORPHANED > > into struct inode. That way it could be shared by other clustered > > filesystems aswell, and OCFS2 had no need to implement ->drop_inode. > Or change the VFS patterns to allow lookup and validation of the > inode before choosing the generic_drop/generic_delete path.Right. *Only* setting something akin to OCFS2_INODE_MAYBE_ORPHANED on struct inode would give us essentially what we have today assuming that generic_drop_inode() punted to delete_inode() in that case (as well as nlink == 0). Which is fine as long as we're ok with inodes who might *not* actually orphaned going through delete_inode(). A more involved solution might be to give OCFS2 the chance to do it's cluster querying at or around drop_inode() time on those which have been marked as potentially orphaned. This would give us the benefit of allowing those inodes which wound up not being orphaned (due to remote node error or death) to go through generic_forget_inode(), which if my reading is correct would at least allow it to live longer in the system. I'm not so sure how important this is however, considering the number of inodes which wind up not having been orphaned is typically very small and the only downside for those few as far as i can tell is a small loss of performance. Btw, the other reason a local node might not completely wipe an inode in OCFS2 land is if another node is still using it (if it has open file descriptors), in which case what we do today is sync the inode, truncate it's pages and then exit ocfs2_delete_inode(), thus allowing our local VFS to destroy it secure in the knowledge that the actual wiping of the inode will happen elsewhere in the cluster. --Mark -- Mark Fasheh Senior Software Developer, Oracle mark.fasheh@oracle.com
Joel Becker
2005-Sep-21 14:48 UTC
[Ocfs2-devel] Re: [patch] stop inotify from sending random DELETE_SELF event under load
On Wed, Sep 21, 2005 at 10:17:38AM +0100, Christoph Hellwig wrote:> The real fix would be to put an equivalent of OCFS2_INODE_MAYBE_ORPHANED > into struct inode. That way it could be shared by other clustered > filesystems aswell, and OCFS2 had no need to implement ->drop_inode.Or change the VFS patterns to allow lookup and validation of the inode before choosing the generic_drop/generic_delete path. Joel -- Life's Little Instruction Book #197 "Don't forget, a person's greatest emotional need is to feel appreciated." Joel Becker Principal Software Developer Oracle E-mail: joel.becker@oracle.com Phone: (650) 506-8127 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/