Christopher Chan
2003-May-01 18:44 UTC
Performance problem with mysql on a 3ware 1+0 raid array
Hi all,

We are observing large sustained writes to disk at a consistent interval of about 4 minutes. They cause mysqld to block and not respond for the entire period.

We are using data=journal with a 128M journal, and the filesystem is 150GB in size.

We get about 300kb/sec in writes, and that jumps to about 2000kb/sec during the periods of large sustained writes. Those periods last around 10-15 secs. We also normally get 2000kb/sec to 4000kb/sec in reads during normal operation and 0-500kb/sec during the sustained write periods.

I wonder if this is an ext3 issue, since we seem to get this jerkiness on all our Linux boxes that run ext3, and if so, how we can eliminate or lower the impact of this behaviour.

Christopher
Neil Brown
2003-May-02 04:55 UTC
Re: Performance problem with mysql on a 3ware 1+0 raid array
On Friday May 2, cchan@outblaze.com wrote:
> Hi all,
>
> We are observing a consistent interval of about 4 minutes at which there
> are large sustained writes to disk that causes mysqld to block and not
> respond for the entire period.
>
> We are using data=journal with a 128M journal and the filesystem is
> 150GB in size.

Yep. This is due to some lazy code in ext3.
If the journal fills up, then it flushes the whole journal before continuing.

There are three ways to get around this problem.
 1/ fix the code :-)
 2/ use a bigger journal. There is a drawback with bigger journals
    though, as replay after a crash will take longer.
 3/ Push the bdflush parameters down so that data in the journal will
    be flushed out more often and the journal will not get full.

    Something like:
      echo 40 0 0 0 60 300 60 0 0 > /proc/sys/vm/bdflush

    The 5th and 6th numbers are significant.
    The defaults are 500 and 3000 (hundredths of a second), so
    every 5 seconds it flushes data older than 30 seconds. If your
    journal cannot hold 30 seconds of data, this is a problem.
    So drop the 3000 to maybe 300 or 500 (3 or 5 seconds) and then
    drop the 500 to a reasonable fraction of that (50 or so, i.e. half a
    second).

2 and 3 are complementary. The bigger the journal, the larger the
age of flushed buffers can be.

NeilBrown
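To make the two significant fields easier to read, here is a minimal Python sketch that pulls the 5th and 6th values out of a bdflush parameter line and converts them from hundredths of a second to seconds. The helper names (`describe_bdflush`, `with_flush_timing`) are hypothetical and made up for illustration. The sketch only manipulates the string; it does not write to /proc/sys/vm/bdflush, and it assumes nothing about the meaning of the other seven fields.

```python
# 1-based positions of the two fields described above as significant.
INTERVAL_FIELD = 5  # how often the flush runs, in hundredths of a second
AGE_FIELD = 6       # data older than this is flushed, in hundredths of a second


def describe_bdflush(line):
    """Return (flush interval, buffer age) in seconds for a bdflush line."""
    fields = [int(value) for value in line.split()]
    interval_cs = fields[INTERVAL_FIELD - 1]
    age_cs = fields[AGE_FIELD - 1]
    return interval_cs / 100.0, age_cs / 100.0


def with_flush_timing(line, interval_cs, age_cs):
    """Return a copy of the line with only the 5th and 6th fields replaced."""
    fields = line.split()
    fields[INTERVAL_FIELD - 1] = str(interval_cs)
    fields[AGE_FIELD - 1] = str(age_cs)
    return " ".join(fields)


# The line from the echo command above.
suggested = "40 0 0 0 60 300 60 0 0"
interval_s, age_s = describe_bdflush(suggested)
print("flush every %.1f s, data older than %.1f s" % (interval_s, age_s))

# Same line with the interval set to 50 (half a second) and the age kept at 300.
print(with_flush_timing(suggested, 50, 300))
```

With the suggested line, this reports a flush interval of 0.6 seconds and a buffer age of 3 seconds, which is the "reasonable fraction" relationship described above.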
Yusuf Goolamabbas
2003-May-02 05:20 UTC
Re: Performance problem with mysql on a 3ware 1+0 raid array
Neil,

Thanks for the info, but I am kind of confused as to why the sudden writeout occurs at 4 minutes if the default ext3 setting is to flush the journal every 5 seconds. Do you know what the correlation between these times would be?

I have seen previous posts from you in which you described your fileservers also having write stalls at around 5-minute intervals. Did you figure out why things would go bad at 5-minute intervals? There seems to be no default writeout every 5 minutes.

Regards, Yusuf

> > Hi all,
> >
> > We are observing a consistent interval of about 4 minutes at which there
> > are large sustained writes to disk that causes mysqld to block and not
> > respond for the entire period.
> >
> > We are using data=journal with a 128M journal and the filesystem is
> > 150GB in size.
>
> Yep. This is due to some lazy code in ext3.
> If the journal fills up, then it flushes the whole journal before
> continuing.
>
> There are three ways to get around this problem.
> 1/ fix the code :-)
> 2/ use a bigger journal. There is a drawback with bigger journals
>    though as replay after a crash will take longer.
> 3/ Push the bdflush parameters down so that data in the journal will
>    be flushed out more often and the journal will not get full.
>
>    Something like:
>      echo 40 0 0 0 60 300 60 0 0 > /proc/sys/vm/bdflush
>
>    The 5th and 6th numbers are significant.
>    The defaults are 500 and 3000 (hundredths of a second) so
>    every 5 seconds it flushes data older than 30 seconds. If your
>    journal cannot hold 30 seconds of data, this is a problem.
>    So drop the 3000 to maybe 300 or 500 (3 or 5 seconds) and then
>    drop the 500 to a reasonable fraction of that (50 or so (half a
>    second)).
>
> 2 and 3 are complementary. The bigger the journal, the larger the
> age of flushed buffers can be.
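For a rough sense of scale on the timing question, the figures reported in the thread can be put side by side. The Python sketch below is only back-of-the-envelope arithmetic under loud assumptions: that "kb/sec" means kilobytes per second, that "128M" means 128 MiB, and that every written byte takes exactly one byte of journal space (with data=journal the real overhead is likely higher, since metadata and journal bookkeeping are written too). It does not establish what actually triggers the stalls.

```python
def seconds_to_write(megabytes, rate_kb_per_s):
    """Time to write the given number of MiB at a steady rate in KiB/s."""
    return megabytes * 1024 / rate_kb_per_s


def megabytes_written(seconds, rate_kb_per_s):
    """MiB written in the given time at a steady rate in KiB/s."""
    return seconds * rate_kb_per_s / 1024


# Figures reported in the first message (units assumed, see above).
JOURNAL_MB = 128           # journal size
WRITE_RATE_KB_S = 300      # normal write rate
STALL_INTERVAL_S = 4 * 60  # observed interval between stalls

fill_s = seconds_to_write(JOURNAL_MB, WRITE_RATE_KB_S)
written_mb = megabytes_written(STALL_INTERVAL_S, WRITE_RATE_KB_S)

print("time to write one journal's worth: %.0f s (%.1f min)" % (fill_s, fill_s / 60))
print("written between stalls: %.1f MB (%.0f%% of the journal)"
      % (written_mb, 100 * written_mb / JOURNAL_MB))
```

Under these assumptions, one journal's worth of data takes a bit over 7 minutes to write at the normal rate, and roughly 70 MB (a little over half the journal) is written in each 4-minute interval. Whether that corresponds to a journal-full threshold is not settled by anything in this thread.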