Hello list,
I have a system with 2x 1.8 GHz AMD CPUs, 4G of ECC RAM, and a 7T RAID-Z pool
on an Areca controller with about 400 file systems, running OpenSolaris snv_101.
The problem is that it takes a VERY long time to take or delete a snapshot and
to sync incremental snapshots to the backup system.
System load is quite low, I'd say; the CPU is 98% idle:
load average: 0.09, 0.13, 0.26
IOPs are low as well:
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
data        5.62T  2.32T     98    115  7.72M  2.35M
data        5.62T  2.32T    547    227  47.9M   864K
data        5.62T  2.32T    204     58  15.9M   616K
data        5.62T  2.32T      4      0   256K      0
data        5.62T  2.32T     20      0   399K      0
data        5.62T  2.32T     99     47  9.68M   264K
data        5.62T  2.32T      0     11  6.93K  38.1K
data        5.62T  2.32T      0    455    506  1.90M
data        5.62T  2.32T    250     21  17.0M   420K
data        5.62T  2.32T    150    235  10.7M  1.34M
data        5.62T  2.32T    305      0  16.0M      0
data        5.62T  2.32T    137  3.42K  12.9M  16.8M
data        5.62T  2.32T    107      0  13.2M      0
data        5.62T  2.32T     56      0  4.97M      0
data        5.62T  2.32T    200    296  23.6M  1.70M
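For anyone who wants to put numbers on "low", the read and write operation counts in the samples above can be summarised with a short script. This is only a rough sketch: the helper names (`parse_count`, `summarise`) are made up for illustration, and it assumes a "K" suffix means x1000, which may not match how your zpool build rounds its figures.

```python
# Sketch: summarise the read/write operation columns of the iostat samples.
# Assumption: a "K" suffix is treated as x1000 (pass kilo=1024 if preferred).

SAMPLES = """\
data 5.62T 2.32T 98 115 7.72M 2.35M
data 5.62T 2.32T 547 227 47.9M 864K
data 5.62T 2.32T 204 58 15.9M 616K
data 5.62T 2.32T 4 0 256K 0
data 5.62T 2.32T 20 0 399K 0
data 5.62T 2.32T 99 47 9.68M 264K
data 5.62T 2.32T 0 11 6.93K 38.1K
data 5.62T 2.32T 0 455 506 1.90M
data 5.62T 2.32T 250 21 17.0M 420K
data 5.62T 2.32T 150 235 10.7M 1.34M
data 5.62T 2.32T 305 0 16.0M 0
data 5.62T 2.32T 137 3.42K 12.9M 16.8M
data 5.62T 2.32T 107 0 13.2M 0
data 5.62T 2.32T 56 0 4.97M 0
data 5.62T 2.32T 200 296 23.6M 1.70M
"""


def parse_count(field, kilo=1000):
    """Convert a count such as '547' or '3.42K' to a float."""
    multipliers = {"K": kilo, "M": kilo ** 2}
    if field[-1] in multipliers:
        return float(field[:-1]) * multipliers[field[-1]]
    return float(field)


def summarise(text, kilo=1000):
    """Return sample count, mean and peak of the read and write op columns."""
    reads, writes = [], []
    for line in text.splitlines():
        cols = line.split()
        if len(cols) != 7:
            continue  # skip blank lines or anything that is not a data row
        reads.append(parse_count(cols[3], kilo))
        writes.append(parse_count(cols[4], kilo))
    if not reads:
        raise ValueError("no data rows found")
    n = len(reads)
    return {
        "samples": n,
        "mean_read_ops": sum(reads) / n,
        "max_read_ops": max(reads),
        "mean_write_ops": sum(writes) / n,
        "max_write_ops": max(writes),
    }


if __name__ == "__main__":
    for key, value in summarise(SAMPLES).items():
        print(f"{key}: {value:.1f}")
```

The same function can be pointed at a longer capture (header lines are skipped because they do not have seven columns), which makes it easier to compare a quiet period against the time when a snapshot is being taken.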
mpstat output:
CPU minf mjf xcal  intr ithr  csw icsw migr smtx srw syscl usr sys wt idl
  0  160   0  690  1152  568 1133   89   68  144   0  2599   5   8  0  87
  1  154   0  108  4424 3241 1388  102   68  137   0  2481   5   7  0  88
CPU minf mjf xcal  intr ithr  csw icsw migr smtx srw syscl usr sys wt idl
  0    6   0   83   594  365  286    0   31    3   0   616   0   7  0  93
  1    0   0    0   524  141  669    2   27    1   0   321   0   2  0  98
CPU minf mjf xcal  intr ithr  csw icsw migr smtx srw syscl usr sys wt idl
  0    0   0   55   575  353  280    0   15    3   0   483   1   6  0  93
  1    0   0    0   462  142  610    3   17    5   0   450   1   2  0  97
CPU minf mjf xcal  intr ithr  csw icsw migr smtx srw syscl usr sys wt idl
  0    0   0    0   454  210  323    0   19    1   0   763   0   3  0  97
  1    0   0    0   288  166  297    0   15    3   0   338   0   2  0  98
CPU minf mjf xcal  intr ithr  csw icsw migr smtx srw syscl usr sys wt idl
  0    0   0    0   398  172  213    0   13    1   0   626   0   1  0  99
  1    0   0    0   252  154  245    0   15    1   0   249   0   0  0 100
CPU minf mjf xcal  intr ithr  csw icsw migr smtx srw syscl usr sys wt idl
  0    0   0    0   461  229  292    0   17    1   0   501   0   3  0  97
  1    0   0    0   290  149  339    4   12    1   0   402   0   2  0  98
What could be wrong such that ZFS operations like creating a file system or
taking/destroying a snapshot (not to mention listing snapshots, which takes
ages) take minutes to complete?
Is there something I can look at that would help determine where the
bottleneck is or what is wrong?
Thanks in advance for any advice,
Mike