Timo Schoeler
2010-Mar-02 08:30 UTC
[CentOS] Very unresponsive, sometimes stalling domU (5.4, x86_64)
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi list,

please forgive the cross-posting, but I cannot pin the problem down well enough to say which list it fits best, so I'll ask on both.

I have some machines with the specs listed at the end of this email. They run CentOS 5.4 x86_64 with the latest patches applied, are Xen-enabled and should host one or more domUs. I put the domUs' storage on LVM, as I learnt ages ago (it has never caused any problems) and it is way faster than using file-based 'images'.

However, there is something special about these machines: they have the new WD EARS series drives, which use 4K sectors. So I booted a rescue system and used fdisk to start the partitions at sector 64 instead of 63. (Long story made short: a partition starting at sector 63 is misaligned on these drives, so every write turns into extra, inefficient work and performance collapses; with the 'normal' geometry (start at sector 63) the drive achieves about 25 MiByte/sec on writes, with the partition starting at sector 64 it achieves almost 100 MiByte/sec.)

[root at server2 ~]# fdisk -ul /dev/sda

Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *          64     2097223     1048580   fd  Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sda2         2097224    18876487     8389632   82  Linux swap / Solaris
/dev/sda3        18876488  1953525167   967324340   fd  Linux raid autodetect

On top of those WD EARS drives (two per machine) there is ``md'' providing two RAID1 arrays, /boot and LVM, plus one swap partition per drive (i.e. not RAIDed). LVM provides the / partition as well as the LVs for the Xen domUs.

I have about 60 machines running in that style and never had any problems; they run like a charm. On these machines, however, the domUs are *very* slow, show a steady (!) load of about two -- 50% of it sitting in 'wait' -- and all operations take ages, e.g. a ``yum update'' with the recently released updates.

Now, can that be due to 4K alignment issues I missed, now nested inside LVM?

Help is very much appreciated.
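For reference, the same 4K-aligned layout can also be created and checked non-interactively; a minimal sketch with sfdisk (start sectors and sizes are simply taken from the fdisk listing above, the partitions on these machines were actually made with fdisk from a rescue system, and the command of course overwrites the existing partition table):

# partition table in sector units (-uS); all start sectors are multiples of 8,
# i.e. aligned to the drive's 4 KiB physical sectors
sfdisk -uS /dev/sda << EOF
64,2097160,fd,*
2097224,16779264,82
18876488,,fd
EOF

# verify: a start sector divisible by 8 means the partition is 4 KiB aligned
fdisk -ul /dev/sda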
Cheers,

Timo

- ---

Linux server2.blah.org 2.6.18-164.11.1.el5xen #1 SMP Wed Jan 20 08:06:04 EST 2010 x86_64 x86_64 x86_64 GNU/Linux

- ---

[root at server2 ~]# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 23
model name      : Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz
stepping        : 10
cpu MHz         : 1998.000
cache size      : 3072 KB
physical id     : 0
siblings        : 1
core id         : 0
cpu cores       : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu tsc msr pae mce cx8 apic mtrr mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx smx est tm2 cx16 xtpr lahf_lm
bogomips        : 6668.58
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 23
model name      : Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz
stepping        : 10
cpu MHz         : 1998.000
cache size      : 3072 KB
physical id     : 1
siblings        : 1
core id         : 0
cpu cores       : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu tsc msr pae mce cx8 apic mtrr mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx smx est tm2 cx16 xtpr lahf_lm
bogomips        : 6668.58
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor       : 2
vendor_id       : GenuineIntel
cpu family      : 6
model           : 23
model name      : Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz
stepping        : 10
cpu MHz         : 1998.000
cache size      : 3072 KB
physical id     : 2
siblings        : 1
core id         : 0
cpu cores       : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu tsc msr pae mce cx8 apic mtrr mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx smx est tm2 cx16 xtpr lahf_lm
bogomips        : 6668.58
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor       : 3
vendor_id       : GenuineIntel
cpu family      : 6
model           : 23
model name      : Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz
stepping        : 10
cpu MHz         : 1998.000
cache size      : 3072 KB
physical id     : 3
siblings        : 1
core id         : 0
cpu cores       : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu tsc msr pae mce cx8 apic mtrr mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx smx est tm2 cx16 xtpr lahf_lm
bogomips        : 6668.58
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

- ---

[root at server2 ~]# cat /proc/meminfo
MemTotal:       524288 kB
MemFree:         80620 kB
Buffers:         23352 kB
Cached:         205400 kB
SwapCached:          0 kB
Active:         132448 kB
Inactive:       156424 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       524288 kB
LowFree:         80620 kB
SwapTotal:    16779248 kB
SwapFree:     16779248 kB
Dirty:              32 kB
Writeback:           0 kB
AnonPages:       60112 kB
Mapped:          13348 kB
Slab:            30996 kB
PageTables:       4424 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:  17041392 kB
Committed_AS:   334800 kB
VmallocTotal: 34359738367 kB
VmallocUsed:     12572 kB
VmallocChunk: 34359724607 kB

- ---

[root at server2 ~]# fgrep min-mem /etc/xen/xend-config.sxp
# dom0-min-mem is the lowest memory level (in MB) dom0 will get down to.
# If dom0-min-mem=0, dom0 will never balloon out.
(dom0-min-mem 512) - --- [root at server2 ~]# fgrep dom0 /boot/grub/menu.lst kernel /xen.gz-2.6.18-164.11.1.el5 dom0_mem=512M - --- example of ``dstat'' running while ``yum update'' was done; I think the CPU is in ``wait'' state too much: - ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system-- usr sys idl wai hiq siq| read writ| recv send| in out | int csw 0 0 100 0 0 0| 0 0 | 136B 178B| 0 0 | 11 12 0 0 100 0 0 0| 0 0 | 966B 892B| 0 0 | 35 31 12 5 83 0 0 0| 0 0 |4068B 4521B| 0 0 | 245 150 47 4 50 0 0 0| 0 0 | 126B 178B| 0 0 | 113 11 46 3 51 0 0 0| 0 0 | 328B 470B| 0 0 | 133 22 0 0 73 28 0 0| 48k 0 | 198B 454B| 0 0 | 30 27 41 3 51 5 0 0| 192k 0 | 522B 1246B| 0 0 | 164 61 9 2 89 0 0 0|8192B 968k| 630B 1896B| 0 0 | 62 35 0 0 100 0 0 0| 0 0 | 136B 178B| 0 0 | 15 16 0 0 100 0 0 0| 0 0 | 246B 292B| 0 0 | 14 17 1 0 99 0 0 0| 0 0 |1231k 28k| 0 0 |1004 925 0 0 100 0 0 0| 0 0 |3394k 77k| 0 0 |2871 2943 27 5 48 20 0 0| 968k 0 | 442k 10k| 0 0 | 641 657 19 5 59 17 0 0| 344k 536k|1644B 4064B| 0 0 | 414 339 0 0 50 50 0 0| 56k 320k| 186B 232B| 0 0 | 128 129 0 1 44 54 0 0| 136k 1312k| 278B 220B| 0 0 | 126 107 0 0 55 45 0 0|1552k 11M| 126B 178B| 0 0 | 502 139 0 0 53 48 0 0| 568k 0 | 126B 178B| 0 0 | 41 32 0 0 50 50 0 0| 0 0 | 126B 178B| 0 0 | 16 14 1 1 53 46 0 0|9608k 0 | 258B 566B| 0 0 |1473 2456 12 3 54 32 0 0|1368k 0 |2112B 6064B| 0 0 | 713 603 12 1 36 52 0 0| 888k 1192k| 858B 2426B| 0 0 | 394 429 0 0 52 48 0 0| 0 2472k| 126B 178B| 0 0 | 189 75 - ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system-- usr sys idl wai hiq siq| read writ| recv send| in out | int csw 0 0 54 46 0 0| 0 728k| 66B 178B| 0 0 | 107 30 0 0 41 59 0 0|8192B 448k| 126B 322B| 0 0 | 85 46 0 0 55 45 0 0| 0 1920k| 126B 178B| 0 0 | 185 72 0 0 54 46 0 0| 0 2688k| 126B 178B| 0 0 | 238 78 0 0 41 59 0 0| 0 1576k| 126B 178B| 0 0 | 136 51 0 0 47 53 0 0|8192B 2128k| 66B 178B| 0 0 | 207 53 0 0 50 50 0 0| 40k 2744k| 66B 178B| 0 0 | 277 60 0 0 50 50 0 0| 16k 3536k| 66B 178B| 0 0 | 330 59 0 0 50 50 0 0|8192B 1016k| 66B 178B| 0 0 | 98 16 0 0 59 41 0 0| 80k 4320k| 66B 178B| 0 0 | 108 100 0 0 48 52 0 0| 16k 208k| 126B 178B| 0 0 | 80 89 0 0 46 54 0 0| 56k 0 | 308B 178B| 0 0 | 38 68 0 0 42 58 0 0| 0 0 | 66B 178B| 0 0 | 11 11 0 0 54 45 0 0|1264k 752k| 66B 178B| 0 0 | 282 428 0 0 53 47 0 0| 0 360k| 66B 178B| 0 0 | 81 101 0 0 53 46 0 0| 0 576k| 66B 178B| 0 0 | 141 129 0 0 38 62 0 0| 536k 88k| 126B 178B| 0 0 | 68 62 0 0 50 50 0 0| 0 0 | 66B 178B| 0 0 | 17 15 1 1 52 46 0 0| 336k 392k| 126B 178B| 0 0 | 105 115 0 0 39 61 0 0| 152k 3504k| 126B 178B| 0 0 | 199 63 0 0 49 51 0 0| 40k 992k| 186B 178B| 0 0 | 122 40 0 0 56 44 0 0| 0 216k| 186B 178B| 0 0 | 73 39 0 0 42 58 0 0| 0 224k| 66B 178B| 0 0 | 69 30 - ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system-- usr sys idl wai hiq siq| read writ| recv send| in out | int csw 0 0 50 50 0 0| 0 216k| 66B 178B| 0 0 | 89 36 0 0 51 50 0 0| 0 272k| 126B 322B| 0 0 | 100 34 0 0 50 50 0 0|8192B 312k| 66B 178B| 0 0 | 81 36 0 0 56 44 0 0| 0 560k| 186B 178B| 0 0 | 103 44 0 0 43 57 0 0| 0 488k| 126B 178B| 0 0 | 91 16 0 0 50 50 0 0| 0 408k| 126B 178B| 0 0 | 59 12 0 0 64 36 0 0| 72k 120k| 380B 566B| 0 0 | 140 87 0 0 44 56 0 0| 0 0 | 66B 178B| 0 0 | 9 10 0 0 55 45 0 0| 16k 0 | 126B 178B| 0 0 | 32 38 2 0 21 78 0 0| 72k 744k| 846B 2038B| 0 0 | 243 275 0 0 44 56 0 0| 0 0 | 186B 178B| 0 0 | 15 19 0 0 50 50 0 0|8192B 992k| 66B 178B| 0 0 | 72 16 0 1 51 48 0 0| 440k 568k| 126B 178B| 0 0 | 105 142 0 0 65 34 0 0| 32k 48k| 192B 372B| 0 0 | 41 43 2 1 41 56 0 0| 80k 872k|1254B 3784B| 0 0 | 271 264 
0 0 44 56 0 0| 0 264k| 126B 178B| 0 0 | 66 84 0 0 61 39 0 0| 64k 224k| 126B 178B| 0 0 | 125 162 1 1 18 81 0 0| 0 736k| 132B 372B| 0 0 | 88 35 0 0 59 41 0 0| 104k 912k|1032B 2312B| 0 0 | 176 113 0 0 44 56 0 0| 0 0 |1090B 178B| 0 0 | 15 13 1 0 57 41 0 0| 96k 704k| 528B 1456B| 0 0 | 167 206 0 0 44 56 0 0| 40k 16k|1270B 178B| 0 0 | 39 36 0 0 50 50 0 0| 0 0 | 126B 178B| 0 0 | 13 13 - ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system-- usr sys idl wai hiq siq| read writ| recv send| in out | int csw 0 0 50 50 0 0| 16k 2368k| 66B 178B| 0 0 | 165 50 1 0 54 45 0 0| 72k 304k|1692B 4510B| 0 0 | 248 201 0 0 54 46 0 0|8192B 176k|3898B 178B| 0 0 | 132 83 0 0 51 49 0 0|8192B 240k|7236B 252B| 0 0 | 212 101 0 0 50 50 0 0| 40k 0 | 126B 178B| 0 0 | 27 31 0 0 32 68 0 0| 0 0 |3034B 178B| 0 0 | 61 15 0 3 63 34 0 0| 280k 1840k|6350B 820B| 0 0 | 378 354 0 0 44 56 0 0| 0 336k| 66B 178B| 0 0 | 73 88 0 0 50 50 0 0|8192B 248k| 66B 178B| 0 0 | 62 52 0 0 50 50 0 0| 336k 200k| 126B 178B| 0 0 | 65 71 0 0 55 45 0 0| 72k 368k| 126B 178B| 0 0 | 80 100 0 0 52 48 0 0| 192k 176k| 66B 178B| 0 0 | 54 69 0 0 41 59 0 0| 112k 272k| 66B 178B| 0 0 | 71 64 0 0 40 60 0 0| 0 0 | 126B 178B| 0 0 | 25 24 0 0 57 43 0 0| 240k 216k| 186B 330B| 0 0 | 63 86 0 0 51 49 0 0| 120k 808k| 126B 178B| 0 0 | 131 157 0 0 50 50 0 0| 0 296k| 66B 178B| 0 0 | 65 84 0 0 50 50 0 0| 0 296k| 126B 178B| 0 0 | 72 98 0 0 41 59 0 0| 0 2848k| 126B 178B| 0 0 | 154 112 0 0 50 50 0 0| 0 384k| 188B 178B| 0 0 | 84 32 0 0 53 47 0 0| 0 272k|1812B 178B| 0 0 | 82 17 0 0 47 53 0 0| 0 208k| 196B 178B| 0 0 | 54 17 0 0 50 50 0 0| 0 232k| 128B 178B| 0 0 | 66 30 - ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system-- usr sys idl wai hiq siq| read writ| recv send| in out | int csw 0 0 45 55 0 0| 0 392k| 126B 178B| 0 0 | 85 22 0 0 50 50 0 0| 0 296k| 126B 322B| 0 0 | 64 10 0 0 65 35 0 0| 24k 472k| 448B 486B| 0 0 | 114 99 1 1 20 78 0 0| 120k 0 | 66B 178B| 0 0 | 86 96 0 0 49 51 0 0| 0 0 | 66B 178B| 0 0 | 9 12 0 1 64 36 0 0|2720k 456k| 276B 178B| 0 0 | 151 124 0 0 100 0 0 0| 0 0 | 196B 178B| 0 0 | 32 24 0 0 100 0 0 0| 0 0 | 66B 178B| 0 0 | 9 12 0 0 100 0 0 0| 0 0 | 66B 178B| 0 0 | 10 13 0 0 80 19 0 0|8192B 1248k| 66B 178B| 0 0 | 177 205 0 0 54 46 0 0| 16k 328k| 198B 486B| 0 0 | 99 111 0 0 53 47 0 0| 0 344k| 126B 178B| 0 0 | 82 78 0 0 31 69 0 0| 0 312k| 66B 178B| 0 0 | 79 101 0 0 60 40 0 0| 0 240k| 192B 372B| 0 0 | 76 89 0 0 24 76 0 0| 0 280k| 66B 178B| 0 0 | 71 81 0 0 47 53 0 0| 0 120k| 66B 178B| 0 0 | 36 40 0 0 50 50 0 0| 0 304k| 66B 178B| 0 0 | 79 99 0 0 56 44 0 0| 0 144k| 66B 178B| 0 0 | 42 84 0 0 50 50 0 0| 16k 136k| 186B 178B| 0 0 | 50 64 0 0 59 41 0 0| 0 328k| 192B 372B| 0 0 | 71 57 0 0 22 78 0 0| 0 280k| 66B 178B| 0 0 | 71 88 0 0 50 50 0 0| 0 256k| 66B 178B| 0 0 | 76 88 0 0 56 44 0 0| 24k 232k| 186B 178B| 0 0 | 80 102 - ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system-- usr sys idl wai hiq siq| read writ| recv send| in out | int csw 0 0 52 48 0 0| 0 240k| 132B 372B| 0 0 | 76 84 0 0 60 40 0 0| 0 416k| 186B 322B| 0 0 | 73 81 3 2 55 41 0 0|8192B 184k| 384B 1316B| 0 0 | 97 80 0 0 100 0 0 0| 0 0 | 428B 746B| 0 0 | 40 22 0 0 100 0 0 0| 0 0 | 246B 462B| 0 0 | 24 13 0 0 98 2 0 0| 0 2728k| 66B 178B| 0 0 | 22 21 0 0 100 0 0 0| 0 0 | 308B 462B| 0 0 | 23 11 0 0 100 0 0 0| 0 0 | 66B 178B| 0 0 | 23 17 0 0 100 0 0 0| 0 0 | 126B 178B| 0 0 | 12 11 0 0 100 0 0 0| 0 0 | 246B 462B| 0 0 | 23 15 0 0 99 1 0 0| 0 0 | 66B 178B| 0 0 | 10 14 0 0 99 1 0 0| 0 688k| 126B 178B| 0 0 | 110 15 0 0 100 0 0 0| 0 0 | 248B 178B| 0 0 | 21 15 0 0 100 0 0 0| 0 0 | 
66B 178B| 0 0 | 10 13 0 0 100 0 0 0| 0 0 | 186B 178B| 0 0 | 21 13 0 0 100 0 0 0| 0 0 | 126B 178B| 0 0 | 14 17 0 0 100 0 0 0| 0 0 | 186B 462B| 0 0 | 20 11 0 0 100 0 0 0| 0 0 | 126B 178B| 0 0 | 22 19 0 0 100 0 0 0| 0 0 | 186B 178B| 0 0 | 17 13 0 0 100 0 0 0| 0 0 | 126B 178B| 0 0 | 12 14 0 0 100 0 0 0| 0 0 | 126B 178B| 0 0 | 13 13 0 0 100 0 0 0| 0 0 | 186B 178B| 0 0 | 13 12 0 0 100 0 0 0| 0 0 | 126B 178B| 0 0 | 11 11 - ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system-- usr sys idl wai hiq siq| read writ| recv send| in out | int csw 0 0 78 23 0 0| 0 32k| 126B 178B| 0 0 | 20 21 0 0 100 0 0 0| 0 0 | 126B 322B| 0 0 | 12 13 0 0 100 0 0 0| 0 0 | 66B 178B| 0 0 | 9 12 0 0 100 0 0 0| 0 0 | 66B 178B| 0 0 | 10 13 0 0 100 0 0 0| 0 0 | 126B 178B| 0 0 | 15 15 0 0 100 0 0 0| 0 0 | 126B 178B| 0 0 | 13 11 0 0 100 0 0 0| 0 0 | 126B 314B| 0 0 | 16 15 0 0 100 0 0 0| 0 0 | 186B 326B| 0 0 | 16 11 0 0 100 0 0 0| 0 112k| 126B 178B| 0 0 | 22 15 0 0 100 0 0 0| 0 0 | 186B 178B| 0 0 | 14 14 0 0 100 0 0 0| 0 0 | 66B 178B| 0 0 | 12 12 0 0 100 0 0 0| 0 0 | 126B 178B| 0 0 | 14 13 0 0 100 0 0 0| 0 0 | 66B 178B| 0 0 | 10 13 0 0 99 2 0 0| 0 8192B| 66B 178B| 0 0 | 12 15 0 0 100 0 0 0| 0 0 | 186B 178B| 0 0 | 21 14 0 0 100 0 0 0| 0 0 | 126B 178B| 0 0 | 15 13 0 0 100 0 0 0| 0 0 | 186B 178B| 0 0 | 16 15 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (GNU/Linux) iD8DBQFLjMy5fg746kcGBOwRAs6KAJ9BeSRkvsIwt3/z/KYQcW6fIKxkHgCgjFaH GDjMz//WUy7m2EAeD27HYpw=oK05 -----END PGP SIGNATURE-----
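To narrow down where the ``wait'' time in the dstat output above is actually spent, per-device statistics in dom0 and the hypervisor's own view of the domains can help; a quick check along these lines (both tools ship with CentOS 5, iostat comes from the sysstat package):

# extended per-device statistics every 5 seconds: large await/%util on the md
# members while throughput stays low points at misaligned writes
iostat -x 5

# per-domain CPU and block-device activity as seen by the Xen hypervisor
xentop -d 5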
Pasi Kärkkäinen
2010-Mar-02 09:18 UTC
[CentOS-virt] [CentOS] Very unresponsive, sometimes stalling domU (5.4, x86_64)
On Tue, Mar 02, 2010 at 09:30:50AM +0100, Timo Schoeler wrote:
> On top of those WD EARS drives (two per machine) there is ``md'' providing
> two RAID1 arrays, /boot and LVM, plus one swap partition per drive. LVM
> provides the / partition as well as the LVs for the Xen domUs.
>
> [...]
>
> Now, can that be due to 4K alignment issues I missed, now nested inside LVM?
>
> Help is very much appreciated.

Maybe the default LVM alignment is wrong for these drives -- did you check/verify that?

See:
http://thunk.org/tytso/blog/2009/02/20/aligning-filesystems-to-an-ssds-erase-block-size/

Especially the "--metadatasize" option.

-- Pasi
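A quick way to see whether a PV's data area actually starts on a 4 KiB boundary is to print the offset of the first physical extent in sector units; the device name below is only an example for a PV sitting on an md RAID1:

# pe_start is reported in 512-byte sectors here; a multiple of 8 means 4 KiB aligned
pvs --units s -o pv_name,pe_start /dev/md1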
compdoc
2010-Mar-02 12:48 UTC
[CentOS-virt] [CentOS] Very unresponsive, sometimes stalling domU (5.4, x86_64)
I recently avoided buying a pair of WD Green EARS drives because of the problems people were having with them in RAID arrays. Newegg.com has many poor reviews. Some reviews mentioned that the drives can power down and cause the RAID card to mark them as missing. If you find a solution, it would be interesting to know. I went with Samsung green drives, as I have had good luck using them in servers in the past.
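One thing worth checking on drives suspected of spinning down or parking their heads too aggressively is the SMART load-cycle counter; if it climbs by thousands per day, the drive's idle timer is kicking in. A quick look, assuming smartmontools is installed and /dev/sda is one of the suspect drives:

# attribute 193 (Load_Cycle_Count) growing rapidly points at aggressive head parking
smartctl -A /dev/sda | grep -i load_cycle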
Timo Schoeler
2010-Mar-03 09:20 UTC
[CentOS] [CentOS-virt] Very unresponsive, sometimes stalling domU (5.4, x86_64)
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

thus Pasi Kärkkäinen spake:

> Maybe the default LVM alignment is wrong for these drives -- did you
> check/verify that?
>
> See:
> http://thunk.org/tytso/blog/2009/02/20/aligning-filesystems-to-an-ssds-erase-block-size/
>
> Especially the "--metadatasize" option.

Hi Pasi, hey lists,

thanks for the hint. The following is the most important part of that text:

``So I created a 1 gigabyte /boot partition as /dev/sdb1, and allocated the rest of the SSD for use by LVM as /dev/sdb2. And that's where I ran into my next problem. LVM likes to allocate 192k for its header information, and 192k is not a multiple of 128k. So if you are creating file systems as logical volumes, and you want those volumes to be properly aligned, you have to tell LVM that it should reserve slightly more space for its meta-data, so that the physical extents that it allocates for its logical volumes are properly aligned. Unfortunately, the way this is done is slightly baroque:

# pvcreate --metadatasize 250k /dev/sdb2
  Physical volume "/dev/sdb2" successfully created

Why 250k and not 256k? I can't tell you -- sometimes the LVM tools aren't terribly intuitive.
However, you can test to make sure that physical extents start at the proper offset by using:

# pvs /dev/sdb2 -o+pe_start
  PV         VG   Fmt  Attr PSize  PFree  1st PE
  /dev/sdb2       lvm2 --   73.52G 73.52G 256.00K

If you use a metadata size of 256k, the first PE will be at 320k instead of 256k. There really ought to be a --pe-align option to pvcreate, which would be far more user-friendly, but we have to work with the tools that we have. Maybe in the next version of the LVM support tools....''

So, after taking care of starting at sector 64 *and* making sure ``pvcreate'' gets its 'multiple of 128k', I still have the same problem.

Most interestingly, Debian 'lenny' does *not* have this problem, and there the LVM PV does *not* have to be set up as described above. So, unfortunately, it seems I'm forced to use Debian in this project, at least on a few machines. *shiver*

> -- Pasi

Timo

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)

iD8DBQFLjin0fg746kcGBOwRAvp0AKC7TuCnrK63MOiqI8CK+m+XNgDqFgCfRvq+
DjcZJN8mCweY6jvAvTb90hg=+E/H
-----END PGP SIGNATURE-----
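For a stack like the one described here (4K-sector drive -> partition -> md RAID1 -> LVM), the alignment of every layer can be checked along these lines; the device names are only examples, and with the default 0.90 md metadata the RAID1 superblock sits at the end of the member, so the array itself should not shift the data offset:

# 1) partition start sectors: values divisible by 8 are 4 KiB aligned
fdisk -ul /dev/sda

# 2) LVM physical volume: offset of the first physical extent, in 512-byte sectors
pvs --units s -o pv_name,pe_start /dev/md1

# 3) if pe_start is not a multiple of 8, recreate the PV with a padded metadata area,
#    as in the quoted example (this destroys the existing LVM metadata on the device!)
pvcreate --metadatasize 250k /dev/md1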