Kelly Lesperance
2016-May-27 13:21 UTC
[CentOS] Slow RAID Check/high %iowait during check after upgrade from CentOS 6.5 -> CentOS 7.2
All of our Kafka clusters are fairly write-heavy. The cluster in question is our second-heaviest - we haven't yet upgraded the heaviest, due to the issues we've been experiencing in this one.

Here is an iostat example from a host within the same cluster, but without the RAID check running:

[root@r2k1 ~] # iostat -xdmc 1 10
Linux 3.10.0-327.13.1.el7.x86_64 (r2k1) 05/27/16 _x86_64_ (32 CPU)

avg-cpu: %user %nice %system %iowait %steal %idle
8.87 0.02 1.28 0.21 0.00 89.62

Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdd 0.02 0.55 0.15 27.06 0.03 11.40 859.89 1.02 37.40 36.13 37.41 6.86 18.65
sdf 0.02 0.48 0.15 26.99 0.03 11.40 862.17 0.15 5.56 40.94 5.37 7.27 19.73
sdk 0.03 0.58 0.22 27.10 0.03 11.40 857.01 1.60 58.49 36.20 58.67 7.17 19.58
sdb 0.02 0.52 0.15 27.43 0.03 11.40 848.37 0.02 0.78 42.84 0.55 7.07 19.50
sdj 0.02 0.55 0.15 27.11 0.03 11.40 858.28 0.62 22.70 41.97 22.59 7.43 20.27
sdg 0.03 0.68 0.22 27.76 0.03 11.40 836.98 0.76 27.10 34.36 27.04 7.33 20.51
sde 0.03 0.48 0.22 26.99 0.03 11.40 860.43 0.33 12.07 33.16 11.90 7.34 19.98
sda 0.03 0.52 0.22 27.43 0.03 11.40 846.65 0.57 20.48 36.42 20.35 7.34 20.31
sdh 0.02 0.68 0.15 27.76 0.03 11.40 838.63 0.47 16.66 40.96 16.53 7.20 20.09
sdc 0.03 0.55 0.22 27.06 0.03 11.40 858.19 0.74 27.30 36.96 27.22 7.55 20.58
sdi 0.03 0.53 0.22 27.13 0.03 11.40 856.04 1.60 58.50 27.43 58.75 5.21 14.24
sdl 0.02 0.56 0.15 27.11 0.03 11.40 858.27 1.12 41.09 27.89 41.16 5.00 13.63
md127 0.00 0.00 2.53 161.84 0.36 68.39 856.56 0.00 0.00 0.00 0.00 0.00 0.00

avg-cpu: %user %nice %system %iowait %steal %idle
13.11 0.00 1.82 1.07 0.00 84.01

Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdd 0.00 0.00 0.00 81.00 0.00 38.48 972.95 51.00 219.06 0.00 219.06 6.37 51.60
sdf 0.00 1.00 0.00 73.00 0.00 33.70 945.33 55.02 235.86 0.00 235.86 7.12 52.00
sdk 0.00 1.00 0.00 56.00 0.00 25.70 939.73 60.45 223.79 0.00 223.79 9.29 52.00
sdb 0.00 2.00 0.00 70.00 0.00 34.48 1008.70 58.88 292.81 0.00 292.81 7.37 51.60
sdj 0.00 3.00 0.00 62.00 0.00 29.87 986.60 59.32 243.48 0.00 243.48 8.26 51.20
sdg 0.00 1.00 0.00 49.00 0.00 23.43 979.45 60.37 234.98 0.00 234.98 10.53 51.60
sde 0.00 1.00 0.00 61.00 0.00 27.95 938.38 58.17 239.57 0.00 239.57 8.52 52.00
sda 0.00 2.00 0.00 56.00 0.00 27.48 1004.88 56.27 202.88 0.00 202.88 9.27 51.90
sdh 0.00 1.00 0.00 70.00 0.00 33.57 982.19 59.00 277.84 0.00 277.84 7.43 52.00
sdc 0.00 0.00 0.00 64.00 0.00 30.06 961.89 58.20 268.30 0.00 268.30 8.08 51.70
sdi 0.00 3.00 0.00 116.00 0.00 55.62 981.94 44.54 199.72 0.00 199.72 4.56 52.90
sdl 0.00 1.00 0.00 128.00 0.00 60.31 964.88 43.91 215.94 0.00 215.94 4.11 52.60
md127 0.00 0.00 0.00 1143.00 0.00 538.90 965.59 0.00 0.00 0.00 0.00 0.00 0.00

avg-cpu: %user %nice %system %iowait %steal %idle
15.70 0.00 1.97 0.44 0.00 81.89

Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdd 0.00 0.00 0.00 119.00 0.00 56.39 970.42 42.84 639.45 0.00 639.45 6.66 79.20
sdf 0.00 1.00 0.00 129.00 0.00 61.21 971.84 48.89 672.04 0.00 672.04 6.34 81.80
sdk 0.00 0.00 0.00 152.00 0.00 72.62 978.53 61.02 716.76 0.00 716.76 5.74 87.20
sdb 0.00 1.00 0.00 133.00 0.00 62.86 967.88 54.10 695.35 0.00 695.35 6.45 85.80
sdj 0.00 0.00 0.00 146.00 0.00 68.36 958.85 69.22 767.12 0.00 767.12 6.85 100.00
sdg 0.00 0.00 0.00 146.00 0.00 69.87 980.11 77.99 789.53 0.00 789.53 6.85 100.00
sde 0.00 1.00 0.00 141.00 0.00 66.96 972.60 56.21 707.61 0.00 707.61 6.21 87.60
sda 0.00 1.00 0.00 147.00 0.00 69.86 973.22 62.21 728.76 0.00 728.76 6.32 92.90
sdh 0.00 0.00 0.00 134.00 0.00 62.61 956.90 55.79 711.49 0.00 711.49 6.63 88.90
sdc 0.00 0.00 0.00 136.00 0.00 64.81 975.94 61.46 753.57 0.00 753.57 6.93 94.20
sdi 0.00 0.00 0.00 93.00 0.00 42.67 939.61 17.60 419.10 0.00 419.10 4.63 43.10
sdl 0.00 0.00 0.00 80.00 0.00 38.02 973.20 11.00 340.79 0.00 340.79 4.25 34.00
md127 0.00 0.00 0.00 87.00 0.00 40.99 964.97 0.00 0.00 0.00 0.00 0.00 0.00

avg-cpu: %user %nice %system %iowait %steal %idle
12.11 0.00 1.35 0.00 0.00 86.54

Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdd 0.00 0.00 0.00 1.00 0.00 0.00 1.00 0.01 15.00 0.00 15.00 15.00 1.50
sdf 0.00 0.00 0.00 1.00 0.00 0.00 1.00 0.01 11.00 0.00 11.00 11.00 1.10
sdk 0.00 0.00 0.00 1.00 0.00 0.00 1.00 0.01 11.00 0.00 11.00 11.00 1.10
sdb 0.00 0.00 0.00 1.00 0.00 0.00 1.00 0.01 7.00 0.00 7.00 7.00 0.70
sdj 0.00 0.00 0.00 2.00 0.00 0.06 64.50 0.01 733.50 0.00 733.50 7.50 1.50
sdg 0.00 0.00 0.00 10.00 0.00 2.88 588.90 0.55 1212.80 0.00 1212.80 15.50 15.50
sde 0.00 0.00 0.00 1.00 0.00 0.00 1.00 0.01 12.00 0.00 12.00 12.00 1.20
sda 0.00 0.00 0.00 1.00 0.00 0.00 1.00 0.01 11.00 0.00 11.00 11.00 1.10
sdh 0.00 0.00 0.00 1.00 0.00 0.00 1.00 0.02 20.00 0.00 20.00 20.00 2.00
sdc 0.00 0.00 0.00 1.00 0.00 0.00 1.00 0.02 17.00 0.00 17.00 17.00 1.70
sdi 0.00 0.00 0.00 1.00 0.00 0.00 1.00 0.01 12.00 0.00 12.00 12.00 1.20
sdl 0.00 0.00 0.00 1.00 0.00 0.00 1.00 0.02 17.00 0.00 17.00 17.00 1.70
md127 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

avg-cpu: %user %nice %system %iowait %steal %idle
15.22 0.00 1.50 0.00 0.00 83.28

Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdd 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdf 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdk 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdj 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdg 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sde 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdh 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdc 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdi 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdl 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md127 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

avg-cpu: %user %nice %system %iowait %steal %idle
16.96 0.09 1.63 0.16 0.00 81.16

Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdd 0.00 0.00 0.00 8.00 0.00 0.66 168.25 0.09 11.50 0.00 11.50 8.75 7.00
sdf 0.00 0.00 0.00 5.00 0.00 0.52 213.20 0.08 16.20 0.00 16.20 16.20 8.10
sdk 0.00 0.00 0.00 3.00 0.00 0.50 342.00 0.06 20.33 0.00 20.33 20.33 6.10
sdb 0.00 0.00 0.00 3.00 0.00 0.50 342.00 0.05 16.67 0.00 16.67 16.67 5.00
sdj 0.00 0.00 0.00 4.00 0.00 0.98 500.50 0.06 14.50 0.00 14.50 11.00 4.40
sdg 0.00 1.00 0.00 4.00 0.00 0.63 322.50 0.14 36.00 0.00 36.00 32.75 13.10
sde 0.00 0.00 0.00 5.00 0.00 0.52 213.20 0.07 13.60 0.00 13.60 13.60 6.80
sda 0.00 0.00 0.00 3.00 0.00 0.50 342.00 0.05 15.67 0.00 15.67 15.67 4.70
sdh 0.00 1.00 0.00 4.00 0.00 0.63 322.50 0.06 14.50 0.00 14.50 11.50 4.60
sdc 0.00 0.00 0.00 8.00 0.00 0.66 168.25 0.11 13.25 0.00 13.25 10.62 8.50
sdi 0.00 0.00 0.00 4.00 0.00 0.98 500.50 0.06 15.50 0.00 15.50 12.00 4.80
sdl 0.00 0.00 0.00 3.00 0.00 0.50 342.00 0.04 13.67 0.00 13.67 13.67 4.10
md127 0.00 0.00 0.00 17.00 0.00 3.78 455.53 0.00 0.00 0.00 0.00 0.00 0.00

avg-cpu: %user %nice %system %iowait %steal %idle
14.08 0.00 1.50 0.00 0.00 84.42

Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdd 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdf 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdk 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdj 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdg 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sde 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdh 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdc 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdi 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdl 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md127 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

avg-cpu: %user %nice %system %iowait %steal %idle
14.89 0.00 1.98 0.00 0.00 83.13

Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdd 0.00 0.00 0.00 90.00 0.00 41.31 940.01 27.25 302.80 0.00 302.80 7.07 63.60
sdf 0.00 0.00 0.00 87.00 0.00 41.35 973.44 22.73 261.30 0.00 261.30 6.92 60.20
sdk 0.00 2.00 0.00 97.00 0.00 42.08 888.42 39.86 410.94 0.00 410.94 8.10 78.60
sdb 0.00 0.00 0.00 87.00 0.00 41.07 966.82 24.39 280.30 0.00 280.30 7.14 62.10
sdj 0.00 1.00 0.00 91.00 0.00 41.94 943.92 36.37 399.62 0.00 399.62 8.44 76.80
sdg 0.00 0.00 0.00 86.00 0.00 40.67 968.48 31.76 369.33 0.00 369.33 8.81 75.80
sde 0.00 0.00 0.00 87.00 0.00 41.35 973.44 30.80 354.05 0.00 354.05 9.01 78.40
sda 0.00 0.00 0.00 87.00 0.00 41.07 966.82 32.61 374.80 0.00 374.80 8.57 74.60
sdh 0.00 0.00 0.00 86.00 0.00 40.67 968.48 29.52 343.23 0.00 343.23 8.56 73.60
sdc 0.00 0.00 0.00 89.00 0.00 40.81 939.07 32.80 360.15 0.00 360.15 8.91 79.30
sdi 0.00 1.00 0.00 91.00 0.00 41.94 943.92 19.60 215.34 0.00 215.34 5.62 51.10
sdl 0.00 2.00 0.00 97.00 0.00 42.08 888.42 19.59 201.93 0.00 201.93 4.69 45.50
md127 0.00 0.00 0.00 535.00 0.00 248.42 950.95 0.00 0.00 0.00 0.00 0.00 0.00

avg-cpu: %user %nice %system %iowait %steal %idle
11.08 0.00 1.41 0.00 0.00 87.51

Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdd 0.00 5.00 0.00 42.00 0.00 0.38 18.55 2.25 53.52 0.00 53.52 4.93 20.70
sdf 0.00 0.00 0.00 35.00 0.00 0.21 12.43 1.62 46.17 0.00 46.17 5.29 18.50
sdk 0.00 23.00 0.00 42.00 0.00 0.44 21.40 1.99 47.29 0.00 47.29 4.64 19.50
sdb 0.00 9.00 0.00 58.00 0.00 0.34 12.02 2.77 47.78 0.00 47.78 4.12 23.90
sdj 0.00 1.00 0.00 39.00 0.00 0.24 12.79 1.79 45.97 0.00 45.97 5.21 20.30
sdg 0.00 11.00 0.00 66.00 0.00 0.40 12.45 3.60 54.47 0.00 54.47 3.42 22.60
sde 0.00 0.00 0.00 35.00 0.00 0.21 12.43 2.13 61.00 0.00 61.00 8.89 31.10
sda 0.00 9.00 0.00 58.00 0.00 0.34 12.02 2.48 42.81 0.00 42.81 3.71 21.50
sdh 0.00 11.00 0.00 66.00 0.00 0.40 12.45 4.81 72.83 0.00 72.83 3.80 25.10
sdc 0.00 5.00 0.00 43.00 0.00 0.88 41.93 1.99 63.81 0.00 63.81 5.00 21.50
sdi 0.00 1.00 0.00 39.00 0.00 0.24 12.79 1.31 33.69 0.00 33.69 4.03 15.70
sdl 0.00 23.00 0.00 42.00 0.00 0.44 21.40 1.23 29.33 0.00 29.33 3.71 15.60
md127 0.00 0.00 0.00 313.00 0.00 2.01 13.14 0.00 0.00 0.00 0.00 0.00 0.00

avg-cpu: %user %nice %system %iowait %steal %idle
16.16 0.03 1.66 0.00 0.00 82.15

Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdd 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdf 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdk 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdj 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdg 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sde 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdh 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdc 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdi 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdl 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md127 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

On 2016-05-26, 11:50 PM, "centos-bounces at centos.org on behalf of Gordon Messmer" <centos-bounces at centos.org on behalf of gordon.messmer at gmail.com> wrote:

>On 05/25/2016 09:54 AM, Kelly Lesperance wrote:
>> What we're seeing is that when the weekly raid-check script executes, performance nose dives, and I/O wait skyrockets. The raid check starts out fairly fast (20000K/sec - the limit that's been set), but then quickly drops down to about 4000K/Sec. dev.raid.speed sysctls are at the defaults:
>
>It looks like some pretty heavy writes are going on at the time. I'm not
>sure what you mean by "nose dives", but I'd expect *some* performance
>impact of running a read-intensive process like a RAID check at the same
>time you're running a write-intensive process.
>
>Do the same write-heavy processes run on the other clusters, where you
>aren't seeing performance issues?
>
>> avg-cpu: %user %nice %system %iowait %steal %idle
>> 9.24 0.00 1.32 20.02 0.00 69.42
>>
>> Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
>> sda 50.00 512.00 20408.00 512 20408
>> sdb 50.00 512.00 20408.00 512 20408
>> sdc 48.00 512.00 19984.00 512 19984
>> sdd 48.00 512.00 19984.00 512 19984
>> sdf 50.00 704.00 19968.00 704 19968
>> sdg 47.00 512.00 19968.00 512 19968
>> sdh 47.00 512.00 19968.00 512 19968
>> sde 50.00 704.00 19968.00 704 19968
>> sdj 48.00 512.00 19972.00 512 19972
>> sdi 48.00 512.00 19972.00 512 19972
>> sdk 48.00 512.00 19980.00 512 19980
>> sdl 48.00 512.00 19980.00 512 19980
>> md127 241.00 0.00 120280.00 0 120280
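For anyone wanting to sift through iostat samples like the ones above, here is a minimal Python sketch that parses `iostat -x` device rows and lists the member disks whose write latency is high. It assumes the 13-column layout shown in the header line of this post (other sysstat versions print different columns). The function names and the 500 ms cut-off are hypothetical, picked only for illustration; the sample rows are copied from the third report in the message.

```python
# Column names as printed in the "Device:" header line of the post.
COLUMNS = ["rrqm/s", "wrqm/s", "r/s", "w/s", "rMB/s", "wMB/s", "avgrq-sz",
           "avgqu-sz", "await", "r_await", "w_await", "svctm", "%util"]

# A few rows copied from the third iostat report in the message.
SAMPLE = """\
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdj 0.00 0.00 0.00 146.00 0.00 68.36 958.85 69.22 767.12 0.00 767.12 6.85 100.00
sdg 0.00 0.00 0.00 146.00 0.00 69.87 980.11 77.99 789.53 0.00 789.53 6.85 100.00
sdl 0.00 0.00 0.00 80.00 0.00 38.02 973.20 11.00 340.79 0.00 340.79 4.25 34.00
md127 0.00 0.00 0.00 87.00 0.00 40.99 964.97 0.00 0.00 0.00 0.00 0.00 0.00
"""


def parse_rows(text):
    """Map device name -> {column: value} for every device row in text."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(COLUMNS) + 1:
            continue  # avg-cpu lines, blank lines
        try:
            values = [float(v) for v in fields[1:]]
        except ValueError:
            continue  # the "Device:" header line itself
        rows[fields[0]] = dict(zip(COLUMNS, values))
    return rows


def slow_writers(rows, threshold_ms=500.0):
    """Member disks (md devices skipped) with w_await above threshold_ms.

    threshold_ms is an arbitrary illustrative cut-off, not a recommendation.
    """
    return sorted(dev for dev, row in rows.items()
                  if not dev.startswith("md") and row["w_await"] > threshold_ms)


if __name__ == "__main__":
    print(slow_writers(parse_rows(SAMPLE)))
```

Skipping the md device is deliberate: in every sample in this post md127 reports 0.00 for the await columns, so only the member disks carry latency figures.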
Kelly Lesperance
2016-Jun-01 19:47 UTC
[CentOS] Slow RAID Check/high %iowait during check after upgrade from CentOS 6.5 -> CentOS 7.2
I did some additional testing - I stopped Kafka on the host, and kicked off a disk check, and it ran at the expected speed overnight. I started kafka this morning, and the raid check's speed immediately dropped down to ~2000K/Sec.

I then enabled the write-back cache on the drives (hdparm -W1 /dev/sd*). The raid check is now running between 100000K/Sec and 200000K/Sec, and has been for several hours (it fluctuates, but seems to stay within that range). Write-back cache is NOT enabled for the drives on the hosts we haven't upgraded yet, but the speeds are similar (I kicked off a raid check on one of our CentOS 6 hosts as well, the window seems to be 150000 - 200000K/Sec on that host).

Kelly

On 2016-05-27, 9:21 AM, "Kelly Lesperance" <klesperance at blackberry.com> wrote:

>All of our Kafka clusters are fairly write-heavy. The cluster in question is our second-heaviest - we haven't yet upgraded the heaviest, due to the issues we've been experiencing in this one.
>
>Here is an iostat example from a host within the same cluster, but without the RAID check running:
>
>[root@r2k1 ~] # iostat -xdmc 1 10
<snip>
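The K/sec figures quoted in this message are the kind of rate that md prints as a `speed=` field on the progress line in /proc/mdstat while a check runs. As a rough sketch, assuming that format, the rate could be pulled out in Python as below. The helper name is hypothetical, and the sample progress line is made up for illustration (only the 2000K/sec rate comes from the message).

```python
import re
from pathlib import Path

# Matches the rate field md prints on a check/resync progress line.
SPEED_RE = re.compile(r"speed=(\d+)K/sec")

# Illustrative progress line only: the percentage, block counts and finish
# time are invented; the rate is the ~2000K/sec quoted in the message.
SAMPLE = ("[=>...................]  check =  5.5% (1075200/19533824) "
          "finish=153.8min speed=2000K/sec")


def check_speeds(mdstat_text):
    """Return every speed= value (in K/sec) found in /proc/mdstat text."""
    return [int(value) for value in SPEED_RE.findall(mdstat_text)]


if __name__ == "__main__":
    path = Path("/proc/mdstat")
    # Fall back to the made-up sample on machines without md arrays.
    text = path.read_text() if path.exists() else SAMPLE
    for speed in check_speeds(text):
        print(f"{speed} K/sec")
```

Sampling this before and after a change such as `hdparm -W1` would give comparable numbers without having to watch the file by hand.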
m.roth at 5-cent.us
2016-Jun-01 19:52 UTC
[CentOS] Slow RAID Check/high %iowait during check after upgrade from CentOS 6.5 -> CentOS 7.2
Kelly Lesperance wrote:
> I did some additional testing - I stopped Kafka on the host, and kicked
> off a disk check, and it ran at the expected speed overnight. I started
> kafka this morning, and the raid check's speed immediately dropped down to
> ~2000K/Sec.
>
> I then enabled the write-back cache on the drives (hdparm -W1 /dev/sd*).
> The raid check is now running between 100000K/Sec and 200000K/Sec, and has
> been for several hours (it fluctuates, but seems to stay within that
> range). Write-back cache is NOT enabled for the drives on the hosts we
> haven't upgraded yet, but the speeds are similar (I kicked off a raid
> check on one of our CentOS 6 hosts as well, the window seems to be 150000
> - 200000K/Sec on that host).
<snip>

Perhaps I missed where you answered this: is this software RAID, or
hardware? And I think you said you're upgrading existing boxes?

mark