Ling, Xiaofeng
2004-Jul-02 01:57 UTC
[Ocfs2-devel] [Patch] We resolve the throughput drop problem when reading files in OCFS2 volume in the patch "ocfs2-truncate-pages-1.patch" against svn 1226.
We are also thinking about locking for each read/write, but would its
overhead be too high? We have another idea: extend flock, lockf, and
fcntl to work across the cluster, so that any application that needs
strict data consistency can take a lock on the whole file or on part
of it. For ordinary applications, maybe the current logic is enough.
How about it?

>-----Original Message-----
>From: khackel@ca2.us.oracle.com
>[mailto:khackel@ca2.us.oracle.com] On Behalf Of Kurt Hackel
>Sent: July 2, 2004 0:11
>To: Ling, Xiaofeng
>Cc: Wim Coekaerts; Zhang, Sonic; Fu, Michael; Yang, Elton; Ocfs2-Devel
>Subject: Re: [Ocfs2-devel] [Patch] We resolve the throughput
>drop problem when reading files in OCFS2 volume in the patch
>"ocfs2-truncate-pages-1.patch" against svn 1226.
>
>Hi,
>
>Great work! We had internally discussed something along the lines of
>#4, but figured we would not have time to implement it. Basically, we
>were going to extend the current READONLY cache locks to regular files
>(today it only works for directories) and then take a READONLY lock on
>every buffered read (not direct-io) and a regular lock on every
>buffered write. The writer would have to notify readers to drop the
>READONLY property and flush the inode's data pages and extent map.
>
>In practice, the only differences between this and what you have come
>up with are (a) your method will require a dlm message on every write,
>where the READONLY method would require messages on write only if the
>master of the lock changes or new readers have joined, and (b) yours is
>already done and tested. :)
>
>I think we should go ahead with your patch and optimize it further
>later if we need to.
>
>Thanks!
>-kurt
>
>
>On Thu, Jul 01, 2004 at 06:09:56PM +0800, Ling, Xiaofeng wrote:
>> There are still some improvements we could make to this patch.
>> 1. Move the message from open to close.
>>    Then an open on another node during a write on this node will
>>    not affect the next read.
>> 2. Send the message only when there really was a write before the
>>    close. (Maybe we can use the flag OCFS_OIN_OPEN_FOR_WRITE? It is
>>    now only used in direct io.)
>> 3. When creating a new file, do not send the message. (Need to add
>>    some flags to OCFS_I(inode)?)
>> 4. Send the message only to those nodes that have ever opened this
>>    file. (Maybe similar to the processing of the DROP_READONLY
>>    message for directory operations?)
>>
>> Any more suggestions?
>>
>>
>> >-----Original Message-----
>> >From: ocfs2-devel-bounces@oss.oracle.com
>> >[mailto:ocfs2-devel-bounces@oss.oracle.com] On Behalf Of Wim Coekaerts
>> >Sent: July 1, 2004 10:53
>> >To: Zhang, Sonic
>> >Cc: Fu, Michael; Yang, Elton; Ocfs2-Devel
>> >Subject: Re: [Ocfs2-devel] [Patch] We resolve the throughput
>> >drop problem when reading files in OCFS2 volume in the patch
>> >"ocfs2-truncate-pages-1.patch" against svn 1226.
>> >
>> >Very interesting! I'll have to study this one closely :)
>> >Thanks!
>> >
>> >On Thu, Jul 01, 2004 at 10:39:07AM +0800, Zhang, Sonic wrote:
>> >> Hi,
>> >>
>> >> We have root-caused the problem "The truncate_inode_page call in
>> >> ocfs_file_release causes the severe throughput drop of file
>> >> reading in OCFS2", which we raised in our earlier mails. After
>> >> one week of debugging, we now also have a patch that resolves it.
>> >>
>> >> This patch is against OCFS2 svn 1226.
>> >>
>> >> The average file reading throughput without our patch is 16
>> >> Mbyte/sec.
>> >> The average file reading throughput with our patch is 1600
>> >> Mbyte/sec.
>> >> Our patch gives a 100-fold improvement in file reading throughput.
>> >> We will submit the full benchmark data of izone in another mail
>> >> soon.
>> >>
>> >> In our patch, we remove ocfs_truncate_pages() and
>> >> ocfs_extent_map_destroy() from the routines ocfs_file_open() and
>> >> ocfs_file_release(), which lets file data pages be reused across
>> >> separate, sequential accesses to a file on one node.
>> >>
>> >> In the current OCFS2 design, file data consistency among all nodes
>> >> in the cluster is only ensured if the file is accessed in
>> >> sequence. Our patch keeps the same consistency level by means of a
>> >> new vote request FLAG_TRUNCATE_PAGES and a new vote action
>> >> TRUNCATE_PAGES. This request is broadcast when a file is asked to
>> >> be opened for write. The receivers then truncate all in-memory
>> >> pages and extent maps of this file. The sender truncates part of
>> >> the pages and maps only when the file is truncated (shortened).
>> >>
>> >> Please refer to the attachment.
>> >>
>> >> The throughput drop problem also occurs when creating, changing
>> >> and deleting directories on an OCFS2 volume, but that is not
>> >> covered in this patch. We will work on another patch to solve
>> >> that problem.
>> >>
>> >> Any comments are appreciated.
>> >> Thank you.
>> >>
>> >>
>> >>
>> >> *********************************************
>> >> Sonic Zhang
>> >> Software Engineer
>> >> Intel China Software Lab
>> >> Tel: (086)021-52574545-1667
>> >> iNet: 752-1667
>> >> *********************************************
>> >
>> >
>> >> _______________________________________________
>> >> Ocfs2-devel mailing list
>> >> Ocfs2-devel@oss.oracle.com
>> >> http://oss.oracle.com/mailman/listinfo/ocfs2-devel
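
For readers following the thread, the invalidation scheme described in the original message (cached pages are kept across open and close, and a TRUNCATE_PAGES request is broadcast when a file is opened for write) can be pictured with a toy model. This is only a sketch under stated assumptions, not OCFS2 code: the `Node` class, its methods, and the dictionary standing in for the shared disk are all hypothetical names made up for illustration.

```python
class Node:
    """One cluster node with a private page cache (toy model, not OCFS2 code)."""

    def __init__(self, name, disk):
        self.name = name
        self.disk = disk        # shared storage: dict of file name -> bytes
        self.cache = {}         # this node's cached file contents
        self.peers = []         # the other nodes in the cluster
        self.disk_reads = 0     # how often this node had to go to disk

    def read(self, path):
        # Cached data survives open/close, so repeated reads stay in memory.
        if path not in self.cache:
            self.disk_reads += 1
            self.cache[path] = self.disk[path]
        return self.cache[path]

    def handle_truncate_pages(self, path):
        # Receiver side of the vote: drop everything cached for this file.
        self.cache.pop(path, None)

    def open_for_write(self, path):
        # Sender side: broadcast the invalidation to every other node.
        for peer in self.peers:
            peer.handle_truncate_pages(path)

    def write(self, path, data):
        self.open_for_write(path)
        self.disk[path] = data
        self.cache[path] = data


def make_cluster(names, disk):
    """Create nodes that share one disk and know about each other."""
    nodes = [Node(n, disk) for n in names]
    for node in nodes:
        node.peers = [p for p in nodes if p is not node]
    return nodes


disk = {"f": b"v1"}
node_a, node_b = make_cluster(["a", "b"], disk)
first = node_b.read("f")     # goes to disk
second = node_b.read("f")    # served from node_b's cache
node_a.write("f", b"v2")     # broadcast drops node_b's cached copy
third = node_b.read("f")     # goes to disk again and sees the new data
```

In this model the second read never touches the disk, which is the effect the patch is after, while the write on the other node still forces a fresh read afterwards.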
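
The top reply suggests letting applications that need strict consistency lock the whole file or part of it through flock, lockf, or fcntl. As a point of reference, this is roughly what a byte-range lock looks like on a single node today, using Python's standard `fcntl.lockf` on a Unix system. It is a minimal sketch: the helper name `write_locked` and the temporary file are illustrative assumptions, and nothing here is cluster-aware, which is exactly the part the proposal would add.

```python
import fcntl
import os
import tempfile


def write_locked(path, offset, data):
    """Write data at offset while holding an exclusive lock on that byte range."""
    with open(path, "r+b") as f:
        # Lock only len(data) bytes starting at offset; blocks until granted.
        fcntl.lockf(f, fcntl.LOCK_EX, len(data), offset, os.SEEK_SET)
        try:
            f.seek(offset)
            f.write(data)
            f.flush()
        finally:
            # Release the same byte range.
            fcntl.lockf(f, fcntl.LOCK_UN, len(data), offset, os.SEEK_SET)


# Create a small scratch file, overwrite two bytes under the lock, read it back.
fd, path = tempfile.mkstemp()
os.write(fd, b"0123456789")
os.close(fd)
write_locked(path, 4, b"XY")
with open(path, "rb") as f:
    result = f.read()
os.remove(path)
```

These locks are advisory, so only applications that ask for them pay the cost, which matches the idea that ordinary applications keep the current behavior.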