Ray Galvin
2009-Jan-24 03:07 UTC
[zfs-discuss] zfs read performance degrades over a short time
I appear to be seeing the performance of a local ZFS file system degrading over a short period of time.

My system configuration:

32 bit Athlon 1800+ CPU
1 Gbyte of RAM
Solaris 10 U6
SunOS filer 5.10 Generic_137138-09 i86pc i386 i86pc
2x250 GByte Western Digital WD2500JB IDE hard drives
1 zfs pool (striped with the two drives, 449 GBytes total)
1 hard drive has an 8 Gbyte UFS slice holding the OS root filesystem

My test consists of reading a single 2.3 Gbyte file using 'dd'. Immediately after booting the system, I am seeing a read speed of nearly 100 Mbytes/sec. This is wonderful and shows that the ZFS striping is giving a performance boost, as the maximum data rate from each drive is much lower than that - somewhere around 60 Mbytes/sec. Unfortunately, if I rerun the test multiple times, the performance degrades. By the 18th run (about 10 minutes later), the performance is down to 54 Mbytes/sec. It is interesting to note that the read performance of the UFS root filesystem has not degraded during this testing: large file reads from the root UFS filesystem remain constant at 58 Mbytes/sec.

I'm running a fresh install of sol-10-u6-ga1-x86-dvd.iso. The only processes running on the system are the standard ones that come with a fresh install.

Does anyone know if this performance degradation is normal? If not, does anyone have hints on what I should do to track down the problem?

Thanks.

Ray

----------The log of my runs is shown below------------

# time dd if=big2 of=/dev/null bs=1024k
2201+1 records in
2201+1 records out

real    23.8
user     0.0
sys     11.1
# time dd if=big2 of=/dev/null bs=1024k
2201+1 records in
2201+1 records out

real    24.5
user     0.0
sys      8.9
# time dd if=big2 of=/dev/null bs=1024k
2201+1 records in
2201+1 records out

real    27.4
user     0.0
sys      8.8
# time dd if=big2 of=/dev/null bs=1024k
2201+1 records in
2201+1 records out

real    26.9
user     0.0
sys      8.4
# time dd if=big2 of=/dev/null bs=1024k
2201+1 records in
2201+1 records out

real    32.0
user     0.0
sys      8.5
# time dd if=big2 of=/dev/null bs=1024k
2201+1 records in
2201+1 records out

real    31.4
user     0.0
sys      8.6
# time dd if=big2 of=/dev/null bs=1024k
2201+1 records in
2201+1 records out

real    33.2
user     0.0
sys      8.8
# time dd if=big2 of=/dev/null bs=1024k
2201+1 records in
2201+1 records out

real    34.1
user     0.0
sys      8.9
# time dd if=big2 of=/dev/null bs=1024k
2201+1 records in
2201+1 records out

real    35.9
user     0.0
sys      8.9
# time dd if=big2 of=/dev/null bs=1024k
2201+1 records in
2201+1 records out

real    36.8
user     0.0
sys      8.7
# time dd if=big2 of=/dev/null bs=1024k
2201+1 records in
2201+1 records out

real    38.1
user     0.0
sys      8.9
# time dd if=big2 of=/dev/null bs=1024k
2201+1 records in
2201+1 records out

real    38.4
user     0.0
sys      8.9
# time dd if=big2 of=/dev/null bs=1024k
2201+1 records in
2201+1 records out

real    39.4
user     0.0
sys      9.0
# time dd if=big2 of=/dev/null bs=1024k
2201+1 records in
2201+1 records out

real    40.3
user     0.0
sys      8.9
# time dd if=big2 of=/dev/null bs=1024k
2201+1 records in
2201+1 records out

real    40.6
user     0.0
sys      9.0
# time dd if=big2 of=/dev/null bs=1024k
2201+1 records in
2201+1 records out

real    41.3
user     0.0
sys      9.0
# time dd if=big2 of=/dev/null bs=1024k
2201+1 records in
2201+1 records out

real    42.2
user     0.0
sys      9.0
# time dd if=big2 of=/dev/null bs=1024k
2201+1 records in
2201+1 records out

real    42.4
user     0.0
sys      9.2
#
-- This message posted from opensolaris.org
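The repeated runs above could be scripted along the following lines. This is only a rough sketch; the loop count, the use of perl for timestamps, and the Mbytes/sec arithmetic are illustrative choices, not part of the original test.

  #!/bin/sh
  # Rerun the read test repeatedly and print approximate throughput.
  # Assumes 'big2' is the 2.3 Gbyte test file (about 2201 Mbytes per pass).
  i=1
  while [ $i -le 18 ]; do
      start=`perl -e 'print time'`        # epoch seconds, avoids parsing time(1) output
      dd if=big2 of=/dev/null bs=1024k 2>/dev/null
      end=`perl -e 'print time'`
      secs=`expr $end - $start`
      [ $secs -eq 0 ] && secs=1
      echo "run $i: $secs s, approx `expr 2201 / $secs` Mbytes/sec"
      i=`expr $i + 1`
  done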
Richard Elling
2009-Jan-24 15:26 UTC
[zfs-discuss] zfs read performance degrades over a short time
Ray Galvin wrote:
> I appear to be seeing the performance of a local ZFS file system degrading over a short period of time.
>
> My system configuration:
>
> 32 bit Athlon 1800+ CPU
> 1 Gbyte of RAM
>
Since you only have 1 GByte of RAM, I would suspect the ARC is filling and becoming less useful as it works harder to find space. This processor isn't very fast, so you might notice the effect on this sort of machine. Get arcstat and use it to track how the ARC is being used for the different runs.
http://blogs.sun.com/realneel/entry/zfs_arc_statistics
 -- richard

> [rest of the quoted message and dd log snipped]
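A minimal sketch of how the ARC could be watched while the dd runs repeat, assuming arcstat.pl has been fetched from the blog entry above and that the standard Solaris 10 ZFS kstats are available:

  # Print one line of ARC statistics per second while the dd test runs
  # in another window (arcstat.pl takes an interval argument in seconds).
  perl ./arcstat.pl 1

  # Or sample the raw kstats directly: current ARC size and its target size 'c'.
  kstat -p zfs:0:arcstats:size zfs:0:arcstats:c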
Ray Galvin
2009-Jan-24 18:58 UTC
[zfs-discuss] zfs read performance degrades over a short time
Thanks for the information Richard!

The output of running arcstat.pl is included below. A potentially interesting thing I see is that the "Prefetch miss percentage" is 100% during this test. I would have thought that a large sequential read test would be an easy case for prefetch prediction.

When I first start the test and am seeing a 100 Mbytes/sec read rate, arcstat output is:

    Time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c
10:29:22   685   374     54     9    2   365  100     7    2    51M  490M
10:29:23    2K   855     37     7    0   848  100     7    0   163M  490M
10:29:24    2K   774     33     6    0   768  100     6    0   264M  490M
10:29:25    3K    1K     33     8    0    1K  100     8    0   398M  490M
10:29:26    2K   774     30     6    0   768  100     6    0   413M  412M
10:29:27    2K    1K     34     8    0    1K  100     8    0   413M  412M
10:29:28    2K   774     35     6    0   768  100     6    0   413M  412M
10:29:29    2K   774     34     6    0   768  100     6    0   413M  412M
10:29:30    2K   774     36     6    0   768  100     6    0   413M  412M
10:29:31    2K   774     35     6    0   768  100     6    0   413M  412M

13 minutes later, after the test has run about 30 times (and is still going) and the read rate has dropped to 50 Mbytes/sec, the arcstat output is:

    Time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c
10:42:58    1K   523     40    11    1   512  100     4    0   397M  397M
10:42:59   761   266     34    10    1   256  100     2    0   397M  397M
10:43:00    1K   525     41    13    1   512  100     4    0   397M  397M
10:43:01    1K   464     40    15    2   449  100     4    0   397M  397M
10:43:02   932   331     35    12    1   319  100     4    1   397M  397M
10:43:03    1K   534     41    19    2   515   99     8    1   397M  397M
10:43:04   770   266     34    10    1   256  100     2    0   397M  397M
10:43:05    1K   525     41    13    1   512  100     4    0   397M  397M
10:43:06   777   267     34    11    2   256  100     3    1   397M  397M
10:43:08    1K   533     41    18    2   515   99     7    1   397M  397M

Given the information above, is it still likely this degradation would not occur on a faster machine with more memory?
-- This message posted from opensolaris.org
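If prefetch behaviour is the suspect, the file-level prefetch (zfetch) counters can also be sampled directly, assuming this Solaris 10 build exposes the zfetchstats kstat (not verified against U6 specifically); a rough sketch:

  # Dump the file-level prefetch (zfetch) statistics before and after a run;
  # the hit/miss behaviour of a single pass is the difference of the two samples.
  kstat -p zfs:0:zfetchstats > /tmp/zfetch.before
  dd if=big2 of=/dev/null bs=1024k
  kstat -p zfs:0:zfetchstats > /tmp/zfetch.after
  diff /tmp/zfetch.before /tmp/zfetch.after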
Richard Elling
2009-Jan-24 19:30 UTC
[zfs-discuss] zfs read performance degrades over a short time
Ray Galvin wrote:
> Thanks for the information Richard!
>
> The output of running arcstat.pl is included below. A potentially
> interesting thing I see is that the "Prefetch miss percentage" is
> 100% during this test. I would have thought that a large sequential
> read test would be an easy case for prefetch prediction.
>
> [arcstat output snipped]
>
> Given the information above, is it still likely this degradation
> would not occur on a faster machine with more memory?

If this change correlates with an increase in system time, then yes. Memory will likely offer more impact than CPU speed.

Note: this is what we would consider a small-memory machine, and you might want to consider disabling prefetch altogether if your workload does not benefit from prefetching. Details on how to do this are in the Evil Tuning Guide.
 -- richard
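For reference, the prefetch-disable tuning mentioned above is the zfs_prefetch_disable tunable described in the ZFS Evil Tuning Guide of that era. A sketch of both the temporary and the persistent form, assuming the tunable is present in this Solaris 10 U6 kernel:

  # Disable ZFS file-level prefetch on the live system (takes effect
  # immediately, reverts at the next reboot):
  echo zfs_prefetch_disable/W0t1 | mdb -kw

  # Or make the setting persistent by adding this line to /etc/system
  # and rebooting:
  set zfs:zfs_prefetch_disable = 1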