Hey folks,

CentOS / PostgreSQL shop over here. I'm hitting 3 of my favorite lists with this, so here's hoping that the BCC trick is the right way to do it :-)

We've just discovered, thanks to a new Munin plugin
http://blogs.amd.co.at/robe/2008/12/graphing-linux-disk-io-statistics-with-munin.html
that our production DB is completely maxing out in I/O for about a 3-hour stretch from 6am til 9am. This is "device utilization" as per the last graph at the above link.

Load went down for a while but is now between 70% and 95% sustained. We've only had this plugin going for less than a day, so I don't really have any more data going back further. But we've suspected a disk issue for some time - just have not been able to prove it.

Our system:
IBM 3650 - quad 2Ghz e5405 Xeon
8K SAS RAID Controller
6 x 300G 15K/RPM SAS Drives
/dev/sda - 2 drives configured as a RAID 1 for 300G for the OS
/dev/sdb - 3 drives configured as RAID5 for 600G for the DB
1 drive as a global hot spare

/dev/sdb is the one that is maxing out.

We need to have a very serious look at fixing this situation. But we don't have the money to be experimenting with solutions that won't solve our problem, and our budget is fairly limited.

Is there a public library somewhere of disk subsystems and their performance figures, done with some semblance of a standard benchmark? One benchmark I am partial to is this one:
http://wiki.postgresql.org/wiki/PgCon_2009/Greg_Smith_Hardware_Benchmarking_notes#dd_test

One thing I am thinking of in the immediate term is taking the RAID5 + hot spare and converting it to RAID10 with the same amount of storage. Will that perform much better? In general we are planning to move away from RAID5 toward RAID10.

We also have on order an external IBM array (don't have the exact name on hand, but the model number was 3000) with 12 drive bays. We ordered it with just 4 x SATA II drives, and were going to put it on a different system as a RAID10. These are just 7200 RPM drives - the goal was cheaper storage, because the SAS drives are about twice as much per drive, and only a 300G drive versus the 1T SATA II drives. IIRC the SATA II drives are about $200 each and the 300G SAS drives about $500 each.

So I have 2 thoughts with this 12-disk array. One is to fill it up with 12 x cheap SATA II drives and hope that, even though the spin rate is a lot slower, the fact that it has more drives will make it perform better. But somehow I am doubtful about that. The other thought is to bite the bullet and fill it up with 300G SAS drives.

Any thoughts here? Recommendations on what to do with a tight budget? It could be the answer is that I just have to go back to the bean counters and tell them we have no choice but to start spending some real money. But on what? And how do I prove that this is the only choice?

--
"Don't eat anything you've ever seen advertised on TV"
 - Michael Pollan, author of "In Defense of Food"
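For anyone wanting to spot-check the "device utilization" figure outside of Munin, here is a sketch: `iostat -x` (from the sysstat package, which the Munin plugin wraps) reports it as %util, and the raw counter behind it lives in /proc/diskstats. The device names assume the sda/sdb layout described above.

```shell
#!/bin/sh
# Spot-check per-device utilization. "iostat -x" reports %util, the
# same number the Munin graph plots; guard it in case sysstat is
# not installed.
if command -v iostat >/dev/null 2>&1; then
    iostat -x 5 3        # three 5-second samples, extended stats
fi

# The raw counter behind %util: field 13 of /proc/diskstats is
# milliseconds the device has spent doing I/O since boot.
awk '$3 ~ /^sd[ab]$/ { printf "%s: %s ms busy since boot\n", $3, $13 }' /proc/diskstats
```

Running this a few times during the 6am-9am window and again off-peak would confirm the graph without waiting for more Munin history.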
On Fri, Oct 09, 2009 at 12:45:14PM -0400, Alan McKay wrote:
> We've just discovered thanks to a new Munin plugin
> [...]
> that our production DB is completely maxing out in I/O for about a 3
> hour stretch from 6am til 9am
> [...]
> But we've suspected a disk
> issue for some time - just have not been able to prove it.

Really hard to say what's going on. Does your DB need optimization? Do the applications hitting it? Maybe some indexing? Maybe some more RAM on the machine would help? What exactly is the workload like - especially during the time when you're peaked out?

Is the system swapping? If so, you either need more memory or need to track down a memory leak. 'free' and 'sar' can both help you see what swap usage is like.

It would be interesting to know which processes are running and consuming I/O during this peak period as well. top could probably give you an "OK" picture, but something like iotop or SystemTap could tell you a lot more (unfortunately you'll have to wait for 5.4 to get that functionality, I believe).

Writes are always slower on any parity-based RAID setup, so I imagine you'd get superior performance on RAID10, especially if you're write-heavy. But to begin with, it'd be interesting to know exactly what this server is doing. Does it make sense that the disks are being brought to their knees with the given workload?

Is the disk array you bought an N-series (N3300, N3600)? If so, those are NetApps and should be quite fast thanks to heavy write caching. Even then, you'll be limited by spindles, it sounds like...

Ray
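The swap check Ray suggests can be sketched like this; 'free' and 'sar -W' are the tools he names, and /proc/meminfo always has the raw numbers even where sysstat isn't installed.

```shell
#!/bin/sh
# Is the box swapping? Raw numbers straight from the kernel:
grep -E '^(MemTotal|SwapTotal|SwapFree)' /proc/meminfo

if command -v free >/dev/null 2>&1; then free -m; fi

# Pages swapped in/out per second -- nonzero pswpin/pswpout during
# the 6am-9am window would mean the box really is swapping:
if command -v sar >/dev/null 2>&1; then sar -W 5 3; fi
if command -v vmstat >/dev/null 2>&1; then vmstat 5 3; fi   # watch the si/so columns
```

If SwapFree stays close to SwapTotal and si/so sit at zero during the peak, memory pressure can be ruled out and the disks themselves are the bottleneck.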
Hi Alan,

You will get the best performance on /dev/sdb by making that a RAID 10 device. The write penalty for R5 is a killer. You will lose a bit of space over R5, but the performance is worth it.

As to the external enclosure, 15k 300G SAS drives are dropping in price if you shop around, or ask a salesperson for a break since you are buying more than 2-4. Dell gave us a nice break on our last order, to ~$300 per disk. I would use R10 there as well.

Do make sure to look at the controller attaching the array. It may be best to let it do the RAID on the hardware side. It may be best to do 2 or 3 R10 groups to maximize the channels, or one big R10 to take advantage of I/O spindle distribution. Do spread your data around as much as possible. If you have to use the SATA drives, then definitely fill up the array and spread the load as wide as possible.

Hope this helps.

Tim

On Fri, 2009-10-09 at 12:45 -0400, Alan McKay wrote:
> One thing I am thinking of in the immediate term is taking the RAID5 +
> hot spare and converting it to RAID10 with the same amount of storage.
> Will that perform much better?
> [...]
> any thoughts here? recommendations on what to do with a tight budget?
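The write penalty Tim mentions can be put in rough numbers: a small random write costs four disk operations on RAID5 (read old data and old parity, write new data and new parity) but only two on RAID10 (one write to each mirror half). The ~175 random IOPS per 15k SAS spindle below is an assumed rule of thumb, not a figure from this thread.

```shell
#!/bin/sh
# Back-of-envelope random-write IOPS for the two layouts discussed:
# the existing 3-disk RAID5 vs. a 4-disk RAID10 from the same bays.
# 175 IOPS per 15k spindle is an assumption.
awk 'BEGIN {
    iops = 175
    printf "RAID5,  3 spindles: ~%d random write IOPS\n", 3 * iops / 4
    printf "RAID10, 4 spindles: ~%d random write IOPS\n", 4 * iops / 2
}'
```

Under these assumptions the RAID10 layout sustains roughly 2.5x the random-write rate from the same drive bays, which is why the rebuild is worth considering before buying anything.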
I would second what another poster stated: look at DB/RAM optimizations. If possible, you should have enough RAM in the system to hold your entire DB in memory, and your db software should be set up to take advantage of that memory.

Since you stated that your system only has trouble with writes, during batch inserts, I would look at how you are doing these batch inserts and see if you can optimize them by using techniques to delay key writes or adjust locking mechanisms on the insert. Some databases have special insert-from-file or bulk insert operations that are much faster than running many single-row inserts - be sure you are taking advantage of these methods.

--Blake
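In PostgreSQL the bulk-load path Blake describes is COPY. A minimal sketch, where the database name `mydb`, table `my_table`, and its columns are hypothetical placeholders:

```shell
#!/bin/sh
# Load a batch as one COPY instead of many single-row INSERTs.
# COPY is PostgreSQL's bulk-load path and avoids per-statement
# parse/plan/commit overhead.

# Build a sample tab-separated batch file (contents made up):
printf '1\talpha\n2\tbeta\n3\tgamma\n' > /tmp/batch.tsv

if command -v psql >/dev/null 2>&1; then
    # One transaction, one COPY: "my_table" and "mydb" are placeholders.
    psql -d mydb <<'SQL'
BEGIN;
COPY my_table (id, label) FROM '/tmp/batch.tsv';
COMMIT;
SQL
fi
```

Even without COPY, wrapping a batch of INSERTs in a single BEGIN/COMMIT cuts the per-row commit (fsync) cost dramatically.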
On Fri, Oct 09, 2009 at 12:45:14PM -0400, Alan McKay wrote:
> [...]
> One thing I am thinking of in the immediate term is taking the RAID5 +
> hot spare and converting it to RAID10 with the same amount of storage.
> Will that perform much better?

Does your RAID controller have a battery-backed write cache? That's needed to get good performance from RAID5. Changing to RAID-10 will definitely help.

> In general we are planning to move away from RAID5 toward RAID10.

That's good. RAID5 isn't very good for database use..

> So I have 2 thoughts with this 12 disk array. [...] The other
> thought is to bite the bullet and fill it up with 300G SAS drives.
>
> any thoughts here? recommendations on what to do with a tight budget?

I'd use ltp disktest to measure the performance using various workloads: sequential and random, using different numbers of threads and different block sizes. There's only so many random IOPS each disk can do.. a 15k SAS drive will be 2-3x faster than a 7200 rpm SATA disk, but still, 4 disks total isn't THAT much..

-- Pasi
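For reference, the dd test from the wiki page Alan linked looks roughly like this, scaled down for illustration. On a real run SIZE_MB should be about twice physical RAM so the page cache can't hide the disk, and TESTFILE should live on the array under test (/dev/sdb here).

```shell
#!/bin/sh
# Sequential write/read throughput with dd, in the spirit of the
# PostgreSQL wiki "dd test". Small size here for illustration only.
SIZE_MB=64                 # use ~2x physical RAM for a real benchmark
TESTFILE=/tmp/ddtest.bin   # put this on /dev/sdb's filesystem for real

# Write: conv=fdatasync makes dd flush to disk before reporting a rate.
dd if=/dev/zero of="$TESTFILE" bs=1M count="$SIZE_MB" conv=fdatasync

# Read it back (on a real run, drop caches first as root:
#   echo 3 > /proc/sys/vm/drop_caches).
dd if="$TESTFILE" of=/dev/null bs=1M
rm -f "$TESTFILE"
```

Numbers from this run sequentially; disktest or a tool like bonnie++ is still needed for the random-I/O side, which is what a database mostly cares about.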
Raid10 should be better on your writes. Random reads and writes are the most important for a db, and for random I/O, the more spindles (disks) you have the better. I would use SAS over SATA if possible.

How big is your database? If it is small, you may be able to put it on a few solid state drives; they're really good for random I/O.

I don't know of a website, but one would be nice.

-----Original Message-----
From: centos-bounces at centos.org [mailto:centos-bounces at centos.org] On Behalf Of Alan McKay
Sent: Friday, October 09, 2009 9:45 AM
To: Alan McKay
Subject: [CentOS] disk I/O problems and Solutions

> Hey folks,
>
> CentOS / PostgreSQL shop over here.
> [...full problem description snipped...]

_______________________________________________
CentOS mailing list
CentOS at centos.org
http://lists.centos.org/mailman/listinfo/centos
On Fri, 9 Oct 2009, Alan McKay wrote:
> We've just discovered thanks to a new Munin plugin
> [...]
> that our production DB is completely maxing out in I/O for about a 3
> hour stretch from 6am til 9am
> This is "device utilization" as per the last graph at the above link.

You make no mention of the file system that you are currently using here, either. If you look and see I/O wait rather high, then your file system could also be the culprit. EXT3/4 journalling is terrible under heavy load because kjournald is single-threaded. We have a user here who completely saturates a 4-core, 8GB-memory machine with only a few clients because it can't keep up with the journal writes.

Are you seeing any high I/O waits and lots of kjournald's running?

--
James A. Peltier
Systems Analyst (FASNet), VIVARIUM Technical Director
HPC Coordinator
Simon Fraser University - Burnaby Campus
Phone : 778-782-6573
Fax : 778-782-3045
E-Mail : jpeltier at sfu.ca
Website : http://www.fas.sfu.ca | http://vivarium.cs.sfu.ca
http://blogs.sfu.ca/people/jpeltier
MSN : subatomic_spam at hotmail.com

Does your OS have a man 8 lart?
http://www.xinu.nl/unix/humour/asr-manpages/lart.html
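A quick way to check for the symptoms James describes, as a sketch: "wa" in top/vmstat is the I/O-wait figure, and on ext4 the journal threads are named jbd2 rather than kjournald.

```shell
#!/bin/sh
# Look for high I/O wait and busy journal threads.

# The cpu line of /proc/stat: the 5th value after "cpu" is cumulative
# iowait jiffies; top/vmstat report the same thing as "wa"/"%wa".
awk '/^cpu / { printf "iowait jiffies since boot: %s\n", $6 }' /proc/stat

if command -v vmstat >/dev/null 2>&1; then vmstat 5 3; fi

# Journal threads accumulating CPU time point at journal contention.
if command -v ps >/dev/null 2>&1; then
    ps -eo pid,comm,time | grep -E 'kjournald|jbd2' || echo "no journal threads visible"
fi
```

High "wa" together with kjournald/jbd2 eating CPU during the 6am-9am window would support the journalling theory; high "wa" with idle journal threads points back at the RAID5 write penalty.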