Yusuf Goolamabbas
2001-Aug-20 14:13 UTC
[tytso@mit.edu: Re: Your ext2 optimisation for readdir+stat]
I asked Ted about his patch for ext3 and ext2 and this is his response.
I am not sure if his post will reach lkml since it's sent to
vger.rutgers.edu.

IMHO, with the amount of work Andrew has done to make ext3 usable for
MTA applications, Ted's work would make ext3 even better for MTA apps
since both Postfix/qmail use 'find' in their control-scripts and queue
statistics program.

/yg

----- Forwarded message from Theodore Tso <tytso@mit.edu> -----

Date: Mon, 20 Aug 2001 08:24:40 -0400
From: Theodore Tso <tytso@mit.edu>
To: Yusuf Goolamabbas <yusufg@outblaze.com>
Cc: tytso@mit.edu, alan@redhat.com, linux-kernel@vger.rutgers.edu
Subject: Re: Your ext2 optimisation for readdir+stat
Message-ID: <20010820082440.C8399@thunk.org>

On Mon, Aug 20, 2001 at 04:29:21AM -0000, Yusuf Goolamabbas wrote:
> Hi Ted, I just came across your email on ext2-devel,debian-devel about
> ext2 optimisations when doing a readdir+stat.
>
> This may be a dumb question but would your patch have any impact on
> Linux client NFS mounts when the underlying filesystem may not be ext2

It will have absolutely no effect if the underlying filesystem is not
ext2; the change is to the ext2 low-level directory lookup routine,
after all.

> Alan, Would you consider this for 2.2.20? Would your patch help with
> the Nautilus optimizations you recently spent some time on?

The patch won't work for the 2.2 kernels or 2.4 ext3, since it requires
the directories-in-page-cache change. It's theoretically possible to
rewrite the change for the old-style ext2/3_find_entry code, but (a) the
ext2_find_entry() function before it was modified to use the page cache
is rather icky, and (b) I don't particularly care about 2.2 at this
point.

The only reason why I might try to do this work is if we really want
this optimization in ext3 before we add support for putting directories
in the page cache (which isn't going to happen before the ext3 1.0
release), but as I said, it would require messing with a complicated bit
of code, and it's not high on my priority list at the moment.

> Ted, Any chance you can generate a patch for -ac kernels, Linus
> kernels suck in VM right now

The patch applies completely cleanly into the -ac kernel tree, without
even any patch offsets --- I just tried it against 2.4.8-ac7.

Alan, I think this is a trivial enough change, which can provide a
significant performance improvement.... "ls -l" on a directory with
50,000 entries and a cold dcache goes down from an elapsed time of 3:03
minutes (with 180 seconds of system cpu time) to an elapsed time of 4.67
seconds (with 1.7 seconds of system cpu time). So I think it'd be
worthwhile to get this into the 2.4 -ac tree, and to feed it to Linus.

Compared to the sorts of much higher-risk changes going into 2.4 at the
moment, this changes no locking semantics, makes no on-disk format
changes, and has no global VM effects. It also makes a huge performance
difference, so it seems reasonable to get this into 2.4 mainline.

						- Ted

Patch generated: on Sat Aug 18 08:41:38 EDT 2001 by tytso@think
against Linux version 2.4.8

==================================================================
RCS file: fs/ext2/RCS/dir.c,v
retrieving revision 1.1
diff -u -r1.1 fs/ext2/dir.c
--- fs/ext2/dir.c	2001/08/18 11:11:30	1.1
+++ fs/ext2/dir.c	2001/08/18 12:41:10
@@ -303,7 +303,7 @@
 	const char *name = dentry->d_name.name;
 	int namelen = dentry->d_name.len;
 	unsigned reclen = EXT2_DIR_REC_LEN(namelen);
-	unsigned long n;
+	unsigned long start, n;
 	unsigned long npages = dir_pages(dir);
 	struct page *page = NULL;
 	ext2_dirent * de;
@@ -311,7 +311,11 @@
 	/* OFFSET_CACHE */
 	*res_page = NULL;
 
-	for (n = 0; n < npages; n++) {
+	start = dir->u.ext2_i.i_dir_start_lookup;
+	if (start >= npages)
+		start = 0;
+	n = start;
+	do {
 		char *kaddr;
 		page = ext2_get_page(dir, n);
 		if (IS_ERR(page))
@@ -324,11 +328,14 @@
 			if (ext2_match (namelen, name, de))
 				goto found;
 		ext2_put_page(page);
-	}
+		if (++n >= npages)
+			n = 0;
+	} while (n != start);
 	return NULL;
 
 found:
 	*res_page = page;
+	dir->u.ext2_i.i_dir_start_lookup = n;
 	return de;
 }
 
==================================================================
RCS file: include/linux/RCS/ext2_fs_i.h,v
retrieving revision 1.1
diff -u -r1.1 include/linux/ext2_fs_i.h
--- include/linux/ext2_fs_i.h	2001/08/18 11:10:48	1.1
+++ include/linux/ext2_fs_i.h	2001/08/18 11:11:18
@@ -34,6 +34,7 @@
 	__u32	i_next_alloc_goal;
 	__u32	i_prealloc_block;
 	__u32	i_prealloc_count;
+	__u32	i_dir_start_lookup;
 	int	i_new_inode:1;	/* Is a freshly allocated inode */
 };
 

----- End forwarded message -----

-- 
Yusuf Goolamabbas
yusufg@outblaze.com
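
[Editorial sketch, not part of the original message.] The idea in the
patch is to remember which directory page satisfied the last lookup and
to start the next search there, wrapping around to page 0 if needed.
With readdir+stat, consecutive lookups tend to hit the same or the next
page, so most searches end after one or two pages instead of rescanning
from the start. The userspace code below is only a rough model of that
loop: struct toy_dir, toy_find_entry(), NAMES_PER_PAGE and the
pages_scanned counter are hypothetical names invented for illustration,
and the empty-directory guard is an addition of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a directory: an array of "pages", each
 * holding a fixed number of names. Not the kernel's data structures. */
#define NAMES_PER_PAGE 4

struct toy_dir {
	const char *(*pages)[NAMES_PER_PAGE]; /* npages rows of names */
	size_t npages;
	size_t start_lookup;  /* plays the role of i_dir_start_lookup */
	size_t pages_scanned; /* instrumentation only, not in the patch */
};

/* Return the index of the page holding `name`, or -1 if it is absent. */
static long toy_find_entry(struct toy_dir *dir, const char *name)
{
	size_t start, n, i;

	assert(dir != NULL && name != NULL);
	if (dir->npages == 0)           /* guard added by this sketch */
		return -1;

	start = dir->start_lookup;
	if (start >= dir->npages)       /* cached hint may be stale */
		start = 0;
	n = start;
	do {
		dir->pages_scanned++;
		for (i = 0; i < NAMES_PER_PAGE; i++) {
			const char *entry = dir->pages[n][i];
			if (entry != NULL && strcmp(entry, name) == 0) {
				dir->start_lookup = n; /* hint for next call */
				return (long)n;
			}
		}
		if (++n >= dir->npages)     /* wrap around to page 0 */
			n = 0;
	} while (n != start);
	return -1;
}
```

Looking up names in directory order with this model touches each page
roughly once overall, whereas a search that always starts at page 0
touches about half the pages per lookup, which is consistent with the
quadratic-to-linear drop in the "ls -l" timings quoted above.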
tytso@snap.thunk.org
2001-Aug-20 23:26 UTC
Re: [tytso@mit.edu: Re: Your ext2 optimisation for readdir+stat]
On Mon, Aug 20, 2001 at 10:13:00PM +0800, Yusuf Goolamabbas wrote:
> I asked Ted about his patch for ext3 and ext2 and this is his response.
> I am not sure if his post will reach lkml since it's sent to
> vger.rutgers.edu

Oops. Sorry about that; old habits die hard...... I'll resend it to
vger.kernel.org. Turns out Stephen pointed out a bug in my patch, in an
error handling case, so I'll end up resubmitting the patch to
linux-kernel anyway.

> IMHO, With the amount of work Andrew has done to make ext3 usable for
> MTA applications, Ted's work would make ext3 even better for MTA apps
> since both Postfix/qmail use 'find' in their control-scripts and queue
> statistics program

Yeah, sigh.... as I said, the problem is the read-ahead code for doing
the directory lookup. It probably is still easier to try to get that
code right and stable before ext3 1.0 than to try to get the page
directory stuff in.

I'm told that Postfix uses a hashed directory scheme, and exim has a
split_spool_directory mode which should help an awful lot, in the
absence of the patch (which only makes a big difference if you have a
large number of files in each directory, and having a hierarchical spool
directory largely avoids this problem).

But if I have time, I'll take a look at it.

						- Ted
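
[Editorial sketch, not part of the original message.] A hierarchical
spool keeps each directory small by spreading files over
subdirectories derived from the file name, so a linear directory search
never has many entries to walk. The function below is a minimal
illustration of that general idea only. It is not the layout Postfix or
exim actually uses; spool_path(), the two-level depth and the example
paths are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "<base>/<c0>/<c1>/<id>", where c0 and c1 are the first two
 * characters of the queue id. Returns 0 on success, -1 if the id is too
 * short or the result does not fit in `out`. Hypothetical helper. */
static int spool_path(char *out, size_t outlen,
		      const char *base, const char *id)
{
	int n;

	assert(out != NULL && base != NULL && id != NULL);
	if (strlen(id) < 2)
		return -1;
	n = snprintf(out, outlen, "%s/%c/%c/%s", base, id[0], id[1], id);
	if (n < 0 || (size_t)n >= outlen)
		return -1;              /* encoding error or truncation */
	return 0;
}
```

For example, with a base of "/var/spool/q" and an id of "A1B2C3" this
yields "/var/spool/q/A/1/A1B2C3", so files whose ids differ in their
first two characters land in different directories.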