Zhang, Sonic
2004-Jun-22 02:51 UTC
[Ocfs2-devel] The truncate_inode_page call in ocfs_file_release causes the severe throughput drop of file reading in OCFS2.
Hi,

We have investigated the possible causes of the severe throughput drop of file reading in OCFS2 that was found in the iozone benchmark. We find that the major cause is the overly aggressive cleaning of the inode page cache when the last reference to an inode is opened or closed. In routine ocfs_file_release(), if the caller holds the last reference to this inode, truncate_inode_page is called to invalidate all cached pages of this inode after the reference is closed. And in routine ocfs_file_open(), the page cache is also cleaned when this inode is first opened. In this case, the file reading operation always reads data directly from the disk, whose throughput is only 16 Mbytes/sec on our development machine. But if we bypass the call to truncate_inode_page(), the file reading throughput on one node can reach 1300 Mbytes/sec, which is about 75% of that of ext3.

I think it is not a good idea to clean all cached pages of an inode when its last reference is closed. This inode may be reopened very soon and its cached pages may be accessed again.

I guess your intention in calling truncate_inode_page() is to avoid inconsistency of the metadata if a process on another node changes the same inode metadata on disk before the inode is reopened on this node. Am I right? Do you have other concerns?

I think in this case we have 2 options. One is to clean all pages of this inode when the receiver thread receives a file change notification (rename, delete, move, attributes, etc). The other is to invalidate only the pages that contain the metadata of this inode.

What's your opinion?

Thank you.
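The effect described in this message can be illustrated with a small toy model. This is only a sketch under stated assumptions: the names `SharedDisk`, `Node`, and `disk_reads_for_two_passes`, and the path `testfile`, are hypothetical and do not correspond to OCFS2 kernel structures. It counts how many page reads reach the disk when a file is opened, read, and closed twice, with and without dropping the cache on the last close.

```python
# Toy model only: these classes are hypothetical and are not OCFS2 code.

class SharedDisk:
    def __init__(self):
        self.blocks = {}  # (path, page index) -> bytes
        self.reads = 0    # number of page reads that reached the disk

    def read(self, path, idx):
        self.reads += 1
        return self.blocks[(path, idx)]


class Node:
    def __init__(self, disk, truncate_on_last_close):
        self.disk = disk
        self.truncate_on_last_close = truncate_on_last_close
        self.cache = {}  # (path, page index) -> bytes
        self.refs = {}   # path -> number of open references

    def open(self, path):
        self.refs[path] = self.refs.get(path, 0) + 1

    def read_page(self, path, idx):
        key = (path, idx)
        if key not in self.cache:  # cache miss: go to disk
            self.cache[key] = self.disk.read(path, idx)
        return self.cache[key]

    def close(self, path):
        self.refs[path] -= 1
        if self.refs[path] == 0:
            del self.refs[path]
            if self.truncate_on_last_close:
                # Stands in for the cache cleaning done on last close.
                for key in [k for k in self.cache if k[0] == path]:
                    del self.cache[key]


def disk_reads_for_two_passes(truncate_on_last_close, pages=4):
    """Open, read every page, and close the same file twice."""
    disk = SharedDisk()
    for i in range(pages):
        disk.blocks[("testfile", i)] = b"x"
    node = Node(disk, truncate_on_last_close)
    for _ in range(2):
        node.open("testfile")
        for i in range(pages):
            node.read_page("testfile", i)
        node.close("testfile")
    return disk.reads
```

With the cache dropped on the last close, the second pass goes back to disk for every page; without it, the second pass is served entirely from memory. The model ignores other nodes, which is exactly the consistency question raised above.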
Wim Coekaerts
2004-Jun-22 03:10 UTC
[Ocfs2-devel] The truncate_inode_page call in ocfs_file_release causes the severe throughput drop of file reading in OCFS2.
yeah... it's on purpose, for the reason you mentioned: multinode consistency.

I was actually considering testing by taking out truncate_inode_pages. This has been discussed internally for quite a few months; it's a big nightmare I have nightly ;-)

The problem is, how can we notify? I think we don't want to notify every node on every change, otherwise we overload the interconnect, and we don't have a good consistent map, if I remember Kurt's explanation correctly.

This has to be fixed for regular performance for sure; the question is how do we do this in a good way. I'd say, feel free to experiment... just remember that the big problem is multinode consistency.

imagine this:

I open file /ocfs/foo and read it all
cached
close file, no one on this node has it open
on node2 I write some data, either O_DIRECT or regular
close or keep it open, whichever
on node1 I now do an md5sum
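The scenario above can be played out in a toy two-node model. This is a minimal sketch, assuming the concern is that node1's md5sum would be computed from stale cached pages; the `Cluster` class, the `run` helper, and the file contents are hypothetical and are not OCFS2 code. It also counts the messages a notify-on-every-change scheme would send, since interconnect load is the objection raised against that approach.

```python
import hashlib


class Cluster:
    """Toy two-node model; every name here is hypothetical."""

    def __init__(self, notify_on_write):
        self.disk = {}  # path -> bytes on the shared disk
        self.caches = {"node1": {}, "node2": {}}
        self.notify_on_write = notify_on_write
        self.messages = 0  # interconnect messages sent

    def write(self, node, path, data):
        self.disk[path] = data
        self.caches[node][path] = data
        if self.notify_on_write:
            for other, cache in self.caches.items():
                if other != node:
                    self.messages += 1
                    cache.pop(path, None)  # receiver drops its cached copy

    def md5sum(self, node, path):
        cache = self.caches[node]
        if path not in cache:  # cache miss: read from shared disk
            cache[path] = self.disk[path]
        return hashlib.md5(cache[path]).hexdigest()


def run(notify_on_write):
    """Return (node1 saw current data?, messages sent)."""
    c = Cluster(notify_on_write)
    c.write("node2", "/ocfs/foo", b"old contents")  # file exists on disk
    c.md5sum("node1", "/ocfs/foo")                  # node1 reads it all, cached
    c.write("node2", "/ocfs/foo", b"new contents")  # node2 writes some data
    seen = c.md5sum("node1", "/ocfs/foo")           # node1 now does an md5sum
    on_disk = hashlib.md5(c.disk["/ocfs/foo"]).hexdigest()
    return seen == on_disk, c.messages
```

In this model, `run(False)` leaves node1 with a checksum of the old contents, while `run(True)` gives the current checksum at the cost of one message per write per other node. A real design would need something cheaper than a broadcast on every change.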