Kelly Lesperance
2016-May-25 18:44 UTC
[CentOS] Slow RAID Check/high %iowait during check after updgrade from CentOS 6.5 -> CentOS 7.2
The HBA is an HP H220.

We haven't really benchmarked individual drives - all 12 drives are utilized in one RAID-10 array, I'm unsure how we would test individual drives without breaking the array.

Trying "hdparm -tT /dev/sda" now - it's been running for 25 minutes so far...

Kelly

On 2016-05-25, 2:12 PM, "centos-bounces at centos.org on behalf of Dennis Jacobfeuerborn" <centos-bounces at centos.org on behalf of dennisml at conversis.de> wrote:

>What is the HBA the drives are attached to?
>Have you done a quick benchmark on a single disk to check if this is a
>raid problem or further down the stack?
>
>Regards,
>  Dennis
>
>On 25.05.2016 19:26, Kelly Lesperance wrote:
>> [merging]
>>
>> The HBA the drives are attached to has no configuration that I'm aware of. We would have had to accidentally change 23 of them...
>>
>> Thanks,
>>
>> Kelly
>>
>> On 2016-05-25, 1:25 PM, "Kelly Lesperance" <klesperance at blackberry.com> wrote:
>>
>>> They are:
>>>
>>> [root@r1k1 ~] # hdparm -I /dev/sda
>>>
>>> /dev/sda:
>>>
>>> ATA device, with non-removable media
>>>         Model Number:       MB4000GCWDC
>>>         Serial Number:      S1Z06RW9
>>>         Firmware Revision:  HPGD
>>>         Transport:          Serial, SATA Rev 3.0
>>>
>>> Thanks,
>>>
>>> Kelly
>>
>> On 2016-05-25, 1:23 PM, "centos-bounces at centos.org on behalf of m.roth at 5-cent.us" <centos-bounces at centos.org on behalf of m.roth at 5-cent.us> wrote:
>>
>>> Kelly Lesperance wrote:
>>>> I've posted this on the forums at
>>>> https://www.centos.org/forums/viewtopic.php?f=47&t=57926&p=244614#p244614
>>>> - posting to the list in the hopes of getting more eyeballs on it.
>>>>
>>>> We have a cluster of 23 HP DL380p Gen8 hosts running Kafka. Basic specs:
>>>>
>>>> 2x E5-2650
>>>> 128 GB RAM
>>>> 12 x 4 TB 7200 RPM SATA drives connected to an HP H220 HBA
>>>> Dual port 10 GB NIC
>>>>
>>>> The drives are configured as one large RAID-10 volume with mdadm,
>>>> filesystem is XFS. The OS is not installed on the drive - we PXE boot a
>>>> CentOS image we've built with minimal packages installed, and do the OS
>>>> configuration via puppet. Originally, the hosts were running CentOS 6.5,
>>>> with Kafka 0.8.1, without issue. We recently upgraded to CentOS 7.2 and
>>>> Kafka 0.9, and that's when the trouble started.
>>> <SNIP>
>>> One more stupid question: could the configuration of the card for how the
>>> drives are accessed been accidentally changed?
>>>
>>> mark
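On the question above of testing one drive without breaking the array: a purely read-only sequential read from a member disk does not write to it, so it should leave the md array intact, although it will compete with the array for I/O while it runs. A minimal sketch, assuming GNU dd is available; the helper name `read_test` and the 1 GiB default are illustrative and not from the thread:

```shell
#!/bin/sh
# read_test DEVICE [MIB]: hypothetical helper. Sequentially reads MIB mebibytes
# (default 1024) from DEVICE and discards them. Nothing is written to DEVICE.
read_test() {
    dev=$1
    mib=${2:-1024}
    # dd prints its throughput summary on stderr; keep only the last line.
    dd if="$dev" of=/dev/null bs=1M count="$mib" 2>&1 | tail -n 1
}

# Example (as root): read_test /dev/sda 1024
```

Repeated runs may be served from the page cache; with GNU dd, adding `iflag=direct` to the dd command bypasses the cache on block devices. Comparing the summary line across several member disks gives a rough idea of whether one drive is much slower than the rest.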
John R Pierce
2016-May-25 18:59 UTC
On 5/25/2016 11:44 AM, Kelly Lesperance wrote:
> The HBA is an HP H220.

For the uninitiated, that's an LSI SAS2308 in IT (initiator-target) mode.

--
john r pierce, recycling bits in santa cruz
John R Pierce
2016-May-25 19:01 UTC
On 5/25/2016 11:44 AM, Kelly Lesperance wrote:
> The HBA is an HP H220.

OH. It's a very good idea to verify the driver is at the same revision level as the firmware. Not 100% sure how you do this under CentOS; my H220 system is running FreeBSD, and is at revision P20, both firmware and driver. HP's firmware, at least what I could find, was a fairly old P14 or something, so I had to re-flash mine with 'generic' LSI firmware. This isn't exactly a recommended thing to do, but it's sure working fine for me.

--
john r pierce, recycling bits in santa cruz
Kelly Lesperance
2016-May-25 19:13 UTC
Hdparm didn't get far:

[root@r1k1 ~] # hdparm -tT /dev/sda

/dev/sda:
 Timing cached reads:   Alarm clock
[root@r1k1 ~] #

On 2016-05-25, 2:44 PM, "Kelly Lesperance" <klesperance at blackberry.com> wrote:

>The HBA is an HP H220.
>
>We haven't really benchmarked individual drives - all 12 drives are utilized in one RAID-10 array, I'm unsure how we would test individual drives without breaking the array.
>
>Trying "hdparm -tT /dev/sda" now - it's been running for 25 minutes so far...
>
>Kelly
Kelly Lesperance
2016-May-25 19:20 UTC
I installed the latest firmware and driver (mpt2sas) from HP on one system. The driver is v20; it appears the firmware may be 15, though:

[   11.128979] mpt2sas version 20.100.00.00 loaded
[   11.513836] mpt2sas0: LSISAS2308: FWVersion(15.10.09.00), ChipRevision(0x05), BiosVersion(07.39.00.00)

On 2016-05-25, 3:01 PM, "centos-bounces at centos.org on behalf of John R Pierce" <centos-bounces at centos.org on behalf of pierce at hogranch.com> wrote:

>On 5/25/2016 11:44 AM, Kelly Lesperance wrote:
>> The HBA is an HP H220.
>
>OH. It's a very good idea to verify the driver is at the same revision
>level as the firmware. Not 100% sure how you do this under CentOS; my
>H220 system is running FreeBSD, and is at revision P20, both firmware
>and driver. HP's firmware, at least what I could find, was a fairly
>old P14 or something, so I had to re-flash mine with 'generic' LSI
>firmware. This isn't exactly a recommended thing to do, but it's sure
>working fine for me.
>
>--
>john r pierce, recycling bits in santa cruz
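The comparison between driver and firmware revisions can be read straight from the two kernel log lines quoted above. A rough sketch, assuming the first field of each version string is the "phase" number (the P20 / P14 notation used earlier in the thread) and that the log lines keep the format shown; the helper name `phase_check` is hypothetical:

```shell
#!/bin/sh
# phase_check: hypothetical helper. Reads kernel log text on stdin and compares
# the major ("phase") number of the mpt2sas driver with that of the firmware.
phase_check() {
    awk '
        # e.g. "[ 11.128979] mpt2sas version 20.100.00.00 loaded"
        /mpt2sas version [0-9]/ {
            for (i = 1; i < NF; i++)
                if ($i == "version") { split($(i + 1), v, "."); drv = v[1] + 0 }
        }
        # e.g. "... FWVersion(15.10.09.00), ChipRevision(0x05), ..."
        /FWVersion\(/ {
            s = $0
            sub(/.*FWVersion\(/, "", s)
            split(s, v, ".")
            fw = v[1] + 0
        }
        END {
            if (drv == "" || fw == "") { print "unknown"; exit 2 }
            printf "driver=P%d firmware=P%d %s\n", drv, fw, (drv == fw ? "match" : "MISMATCH")
        }
    '
}

# Usage on a live system: dmesg | phase_check
```

Fed the two lines from this message, it reports driver phase 20 against firmware phase 15, which is the mismatch described above. Whether that mismatch explains the slow RAID check is not established in the thread.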
m.roth at 5-cent.us
2016-May-25 19:20 UTC
John R Pierce wrote:
> On 5/25/2016 11:44 AM, Kelly Lesperance wrote:
>> The HBA is an HP H220.
>
> OH. It's a very good idea to verify the driver is at the same revision
> level as the firmware. Not 100% sure how you do this under CentOS; my
> H220 system is running FreeBSD, and is at revision P20, both firmware
> and driver. HP's firmware, at least what I could find, was a fairly
> old P14 or something, so I had to re-flash mine with 'generic' LSI
> firmware. This isn't exactly a recommended thing to do, but it's sure
> working fine for me.

Not sure if dmidecode will tell you, but you might see if you can run smartctl -i. Also, on boot, you could go into the card's firmware interface, and that'll tell you, somewhere, what the firmware version is. Not sure if MegaRAID will work with this card - if it does, you really want it, even though it has an actively user-hostile interface.

       mark
cpolish at surewest.net
2016-May-25 21:43 UTC
On 2016-05-25 19:13, Kelly Lesperance wrote:
> Hdparm didn't get far:
>
> [root@r1k1 ~] # hdparm -tT /dev/sda
>
> /dev/sda:
>  Timing cached reads:   Alarm clock
> [root@r1k1 ~] #

Hi Kelly,

Try running 'iostat -xdmc 1'. Look for a single drive that has substantially greater await than ~10 msec. If all the drives except one are taking 6-8 msec, but one is taking very much more, you've got a drive that drags down the whole array's performance.

Ignore the very first output from the command - it's an average of the disk subsystem since boot.

Post a representative output along with the contents of /proc/mdstat.

Good luck,
--
Charles Polisher
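The await check described above can be scripted rather than eyeballed. A minimal sketch, assuming an iostat version whose extended output has a combined `await` column (newer sysstat releases print only `r_await` and `w_await`, in which case this prints nothing) and that the member disks are named `sd*`; the helper name `slow_drives` and its default threshold are illustrative:

```shell
#!/bin/sh
# slow_drives [THRESHOLD_MS]: hypothetical helper. Reads "iostat -x" style
# output on stdin and prints devices whose await exceeds THRESHOLD_MS
# (default 10). The first report is skipped because it is the since-boot
# average. The await column is located by header name, since column order
# differs between sysstat versions.
slow_drives() {
    awk -v limit="${1:-10}" '
        $1 ~ /^Device/ {
            reports++
            col = 0
            for (i = 1; i <= NF; i++) if ($i == "await") col = i
            next
        }
        reports > 1 && col && NF >= col && $1 ~ /^sd/ && ($col + 0) > (limit + 0) {
            print $1, $col
        }
    '
}

# Usage: iostat -xdm 1 5 | slow_drives 10
```

A drive that shows up in every interval while its neighbours stay under the threshold is the kind of outlier worth investigating first.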