Hello,

as far as I understood ext3 will more or less hog a machine when writing
away the journal. A customer is having a slowdown every 5 minutes for about
30 seconds, the machine becomes more or less unusable.

This is an NFS server serving 300 Gigs spread over 2 NFS shares.

I'm wondering what would be the best course of action:
a) make the journal bigger?
b) make the journal smaller?
c) switch from ordered to writeback?

Can somebody give me a hint?

RU
PCFE
--
from the desk of Patrick C. F. Ernzer
Red Hat Europe
On Mon, Nov 12, 2001 at 04:18:59PM -0000, Patrick C. F. Ernzer wrote:
> Hello,
>
> as far as I understood ext3 will more or less hog a machine when writing
> away the journal. A customer is having a slowdown every 5 minutes for about
> 30 seconds, the machine becomes more or less unusable.
>
> This is an NFS server serving 300 Gigs spread over 2 NFS shares.
>
> I'm wondering what would be the best course of action:
> a) make the journal bigger?
> b) make the journal smaller?
> c) switch from ordered to writeback?
>
> Can somebody give me a hint?
>

What was the previous FS? Ext2?

What is the current mode? The default (ordered)?

What are the write patterns for this FS? Large contiguous writes? Random
writes? Read, write, read? (I'm thinking of each individual use, and of
overall usage.)

If you have random writes, data=journal mode might help...

Mike
"Patrick C. F. Ernzer" wrote:
>
> Hello,
>
> as far as I understood ext3 will more or less hog a machine when writing
> away the journal.

It shouldn't. With the default (usual) journal size, the maximum amount of
data which we write to the journal is around eight megs, and it's a single
linear write - it's really quick. We then wait on that write before
releasing the data for writeback, which is also quick.

The slowest part (in the default ordered data mode) is the write of the
data into the main filesystem prior to writing the journal.

But this all happens every five seconds.

> A customer is having a slowdown every 5 minutes for about
> 30 seconds, the machine becomes more or less unusable.
>
> This is an NFS server serving 300 Gigs spread over 2 NFS shares.
>
> I'm wondering what would be the best course of action:
> a) make the journal bigger?
> b) make the journal smaller?
> c) switch from ordered to writeback?
>
> Can somebody give me a hint?

It's possible that a switch to writeback would fix it up. As an
experiment, I'd be interested in the result. But writeback has lower
data safety guarantees - file contents can be corrupted if they were
undergoing write at the time of a crash.

Something unusual is happening here. Is this with kernel 2.4.9? What
is the underlying IO system? We've had problems before with interactions
between ext3, software RAID and the VM.

Are you able to monitor the server during the slowdown? Running
`top', `ps' and `vmstat 1' would be useful.
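One way to capture the monitoring suggested above is to timestamp each line of `vmstat 1' output, so that a stall can be matched against the five-minute pattern afterwards. This is only a sketch for a POSIX shell; the helper name stamp_lines and the log path in the comment are made up for illustration and are not from the thread.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): prefix every line read
# from stdin with the current wall-clock time, as HH:MM:SS.
stamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%H:%M:%S')" "$line"
    done
}

# Example use on the server, left commented out so that sourcing this
# file has no side effects (the log path is an assumption):
#   vmstat 1 | stamp_lines >> /tmp/vmstat.log
```

Left running across a few slowdowns, the log should show whether each stall lines up with a burst of block writes or with the machine swapping.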
"Patrick C. F. Ernzer" wrote:
>
> This is an NFS server serving 300 Gigs spread over 2 NFS shares.
>

Is the server using a synchronous export (see /etc/exports)?

Is this problem associated with any particular workload, such as
intensive write activity?
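A quick way to answer the export question is to list the entries in /etc/exports that do not ask for "sync" explicitly. The following is a rough sketch; the function name exports_without_sync is invented here. Whether an entry with no explicit option ends up synchronous depends on the nfs-utils version in use, so treat the output as a list of lines to inspect rather than a verdict.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): read exports-style lines
# on stdin and print those that do not carry an explicit "sync" option.
# Comment lines and blank lines are skipped.
exports_without_sync() {
    grep -v '^[[:space:]]*#' \
        | grep -v '^[[:space:]]*$' \
        | grep -Ev '[(,]sync[,)]'
}

# Example use, left commented out:
#   exports_without_sync < /etc/exports
```

The pattern only matches "sync" as a whole option, so an entry marked "async" is still reported.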
On Monday November 12, pernzer@redhat.com wrote:
> Hello,
>
> as far as I understood ext3 will more or less hog a machine when writing
> away the journal. A customer is having a slowdown every 5 minutes for about
> 30 seconds, the machine becomes more or less unusable.
>
> This is an NFS server serving 300 Gigs spread over 2 NFS shares.
>
> I'm wondering what would be the best course of action:
> a) make the journal bigger?
> b) make the journal smaller?
> c) switch from ordered to writeback?
>
> Can somebody give me a hint?

I've seen something a lot like this.
I export with "sync" (because it is the safe thing to do) and with
"no_wdelay" (because that is nicer to ext3) and mount with
"data=journal" because that is nicer for sync-writes.

Under heavy NFS load, I get pauses of a few seconds every few minutes.
If I

   echo 40 0 0 0 60 300 60 0 0 > /proc/sys/vm/bdflush

the problem goes away.

What I *think* is happening is that the journal fills up before
bdflush flushes the data to its rightful home. When this happens,
ext3 blocks while it forces this data out to disk so that it can make
more room in the journal.

What ext3 *should* do is start pushing data out when the journal gets
X% full for some value of X like 50 or 75.

This may or may not be related to your problem, depending on what
export and mount options you are using.

The 5 minutes sounds like the journal commit interval.
It's probably a long shot, but might you have a heavily fragmented
journal? Use "debugfs" to find the inode number of the journal, and
then
   stat <inode>
to find where the blocks are. If they are all over the place, then
sequential journal writes might not be fast.

NeilBrown
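To put a number on "all over the place", one option is to count how many contiguous runs the journal's blocks fall into. The sketch below assumes the physical block numbers have already been pulled out of the debugfs "stat" output, one per line and in file order. That extraction step is left out because the layout of the block list differs between e2fsprogs versions. The function name count_extents is invented for this example.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): read physical block
# numbers on stdin, one per line in file order, and print the number of
# contiguous runs. One run means the journal is laid out sequentially.
count_extents() {
    awk 'NF {
             if (!seen || $1 != prev + 1) runs++
             prev = $1
             seen = 1
         }
         END { print runs + 0 }'
}

# Example use with made-up block numbers, left commented out:
#   printf '100\n101\n102\n500\n501\n' | count_extents
```

For the made-up input in the comment the function prints 2, since blocks 100-102 and 500-501 form two separate runs. A journal of a few thousand blocks that splits into many runs would support the fragmentation theory; a single run would rule it out.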
Hi,

On Mon, Nov 12, 2001 at 04:18:59PM -0000, Patrick C. F. Ernzer wrote:
>
> as far as I understood ext3 will more or less hog a machine when writing
> away the journal. A customer is having a slowdown every 5 minutes for about
> 30 seconds, the machine becomes more or less unusable.

Well, ext3 writes the journal about once every 5 seconds on a typical
busy filesystem, so it's not clear that's necessarily what the problem
is.

Cheers,
 Stephen