Robert Milkowski
2006-Mar-03 10:23 UTC
[zfs-discuss] Performance degradation while reading sequentially with large block size
Hi.

v440 with S10U2p1 (zfs based on snv_b32). Several 3510 FC JBODs are directly
connected, with two links under MPxIO. I sent the same report as SDR-0163.

I created a mirrored pool from 12 disks across two JBODs, so that each mirror
pair has one disk from each JBOD. Then I created one large file using dd. Now,
if I read that file with any block size above about 800KB (using dd), the
performance is bad (40-60MB/s). With smaller block sizes performance is quite
good (~160MB/s).

bash-3.00# zpool export p-16-32
bash-3.00# zpool import p-16-32
bash-3.00# zpool status p-16-32
  pool: p-16-32
 state: ONLINE
 scrub: none requested
config:

        NAME                       STATE     READ WRITE CKSUM
        p-16-32                    ONLINE       0     0     0
          mirror                   ONLINE       0     0     0
            c5t500000E011909320d0  ONLINE       0     0     0
            c5t500000E011902FB0d0  ONLINE       0     0     0
          mirror                   ONLINE       0     0     0
            c5t500000E011909300d0  ONLINE       0     0     0
            c5t500000E0119030F0d0  ONLINE       0     0     0
          mirror                   ONLINE       0     0     0
            c5t500000E011903030d0  ONLINE       0     0     0
            c5t500000E01190E730d0  ONLINE       0     0     0
          mirror                   ONLINE       0     0     0
            c5t500000E011903300d0  ONLINE       0     0     0
            c5t500000E01190E7F0d0  ONLINE       0     0     0
          mirror                   ONLINE       0     0     0
            c5t500000E011903120d0  ONLINE       0     0     0
            c5t500000E0119091E0d0  ONLINE       0     0     0
          mirror                   ONLINE       0     0     0
            c5t500000E01190E750d0  ONLINE       0     0     0
            c5t500000E0119032D0d0  ONLINE       0     0     0

bash-3.00# dd if=/dev/zero of=/p-16-32/q1 bs=1024k
[keeps running]

bash-3.00# zpool iostat p-16-32 1
              capacity     operations    bandwidth
pool        used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
p-16-32      416M   408G    145    166   628K   830K
p-16-32      416M   408G      0    711      0  89.0M
p-16-32      416M   408G      0    938      0   117M
p-16-32      416M   408G      0    943      0   118M
p-16-32      416M   408G      0    924      0   116M
p-16-32      416M   408G      0    937      0   117M
p-16-32      416M   408G      0    936      0   117M
p-16-32      416M   408G      0    958      0   120M
p-16-32      416M   408G      0    937      0   117M
p-16-32     1.43G   407G      8  1.05K  47.5K   126M
p-16-32     1.43G   407G      0    786      0  92.7M
p-16-32     1.43G   407G      0    911      0   114M
p-16-32     1.43G   407G      0    926      0   116M
p-16-32     1.43G   407G      0    914      0   114M
p-16-32     1.43G   407G      0    920      0   115M
p-16-32     1.43G   407G      0    960      0   120M
p-16-32     1.43G   407G      0    915      0   114M
p-16-32     1.43G   407G      0   1020      0   128M
p-16-32     2.38G   406G      6    795  39.6K  88.8M
p-16-32     2.38G   406G      0    910      0   114M
p-16-32     2.38G   406G      0    942      0   118M
p-16-32     2.38G   406G      0    940      0   118M
p-16-32     2.38G   406G      0    908      0   114M
p-16-32     2.38G   406G      0    922      0   115M
p-16-32     2.38G   406G      0    947      0   118M
p-16-32     2.38G   406G      0    944      0   118M
p-16-32     2.38G   406G      0  1.09K      0   140M
p-16-32     3.38G   405G      9    774  39.6K  85.9M
p-16-32     3.38G   405G      0    927      0   116M
p-16-32     3.38G   405G      0    958      0   120M
p-16-32     3.38G   405G      0    985      0   123M
^C
bash-3.00#

I interrupted dd.

bash-3.00# zpool export p-16-32
bash-3.00# zpool import p-16-32
bash-3.00# ls -lh /p-16-32/q1
-rw-r--r--   1 root     other        27G Mar  3 10:53 /p-16-32/q1
bash-3.00#

OK, the caches are flushed and we've got a 27GB file which was written
sequentially at about 120MB/s (twice that to the disks, as it's a software
mirror).

Now let's try to read this file with a block size of 1MB.
bash-3.00# dd if=/p-16-32/q1 of=/dev/null bs=1024k

bash-3.00# zpool iostat p-16-32 1
              capacity     operations    bandwidth
pool        used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
p-16-32     26.6G   381G    180      0  22.1M      0
p-16-32     26.6G   381G    331      0  41.2M      0
p-16-32     26.6G   381G    336      0  41.8M      0
p-16-32     26.6G   381G    334      0  41.5M      0
p-16-32     26.6G   381G    337      0  42.0M      0
p-16-32     26.6G   381G    336      0  41.8M      0
p-16-32     26.6G   381G    340      0  42.2M      0
p-16-32     26.6G   381G    339      0  42.2M      0
p-16-32     26.6G   381G    329      0  40.9M      0
p-16-32     26.6G   381G    324      0  40.4M      0
p-16-32     26.6G   381G    352      0  43.7M      0
p-16-32     26.6G   381G    371      0  46.1M      0
p-16-32     26.6G   381G    375      0  46.6M      0
p-16-32     26.6G   381G    360      0  44.7M      0
p-16-32     26.6G   381G    370      0  46.0M      0
p-16-32     26.6G   381G    368      0  45.7M      0
p-16-32     26.6G   381G    391      0  48.6M      0
p-16-32     26.6G   381G    363      0  45.1M      0
[...]

Well, only just above 40MB/s - this is bad.

Now again, with a block size of 128k.

bash-3.00# zpool export p-16-32
bash-3.00# zpool import p-16-32
bash-3.00# dd if=/p-16-32/q1 of=/dev/null bs=128k

bash-3.00# zpool iostat p-16-32 1
              capacity     operations    bandwidth
pool        used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
p-16-32     26.6G   381G     46      0  5.37M      0
p-16-32     26.6G   381G  1.14K      0   145M      0
p-16-32     26.6G   381G  1.22K      0   155M      0
p-16-32     26.6G   381G  1.19K      0   151M      0
p-16-32     26.6G   381G  1.27K      0   161M      0
p-16-32     26.6G   381G  1.23K      0   156M      0
p-16-32     26.6G   381G  1.24K      0   158M      0
p-16-32     26.6G   381G  1.26K      0   160M      0
p-16-32     26.6G   381G  1.25K      0   158M      0
p-16-32     26.6G   381G  1.25K      0   159M      0
p-16-32     26.6G   381G  1.27K      0   161M      0
p-16-32     26.6G   381G  1.27K      0   162M      0
p-16-32     26.6G   381G  1.27K      0   162M      0
p-16-32     26.6G   381G  1.27K      0   161M      0
p-16-32     26.6G   381G  1.27K      0   162M      0
[...]

Well, this is much better.

bash-3.00# zpool export p-16-32
bash-3.00# zpool import p-16-32
bash-3.00# dd if=/p-16-32/q1 of=/dev/null bs=256k

bash-3.00# zpool iostat p-16-32 1
              capacity     operations    bandwidth
pool        used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
p-16-32     26.6G   381G    289      2  35.5M  32.0K
p-16-32     26.6G   381G  1.24K      0   158M      0
p-16-32     26.6G   381G  1.25K      0   159M      0
p-16-32     26.6G   381G  1.24K      0   158M      0
p-16-32     26.6G   381G  1.26K      0   161M      0
p-16-32     26.6G   381G  1.24K      0   158M      0
p-16-32     26.6G   381G  1.26K      0   160M      0
p-16-32     26.6G   381G  1.26K      0   160M      0
p-16-32     26.6G   381G  1.26K      0   160M      0
p-16-32     26.6G   381G  1.29K      0   164M      0
p-16-32     26.6G   381G  1.28K      0   163M      0
[...]

This is OK.

With 512k and 768k it's still the same (~160MB/s).
With 800k it is still OK.
With 850k, 900k, 1000k and 1024k it's bad (40-60MB/s).
With 2048k, 4096k and 8192k it's also bad (40-60MB/s).

Now let's see if it improves with many streams.

bash-3.00# zpool export p-16-32
bash-3.00# zpool import p-16-32
bash-3.00# dd if=/p-16-32/q1 of=/dev/null bs=1024k&
[1] 4535
bash-3.00# dd if=/p-16-32/q1 of=/dev/null bs=1024k&
[2] 4536
bash-3.00# dd if=/p-16-32/q1 of=/dev/null bs=1024k&
[3] 4537
bash-3.00# dd if=/p-16-32/q1 of=/dev/null bs=1024k&
[4] 4538

The first 10 lines below were captured while I was still adding streams.
bash-3.00# zpool iostat p-16-32 1
              capacity     operations    bandwidth
pool        used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
p-16-32     26.6G   381G     65      0  7.72M      0
p-16-32     26.6G   381G    337      0  41.9M      0
p-16-32     26.6G   381G    328      0  40.9M      0
p-16-32     26.6G   381G    484      0  60.0M      0
p-16-32     26.6G   381G    575      0  71.5M      0
p-16-32     26.6G   381G    693      0  86.1M      0
p-16-32     26.6G   381G    681      0  84.5M      0
p-16-32     26.6G   381G    682      0  84.7M      0
p-16-32     26.6G   381G    679      0  84.4M      0
p-16-32     26.6G   381G    955      0   118M      0
p-16-32     26.6G   381G    991      0   123M      0
p-16-32     26.6G   381G   1017      0   126M      0
p-16-32     26.6G   381G    793      0  98.5M      0
p-16-32     26.6G   381G    970      0   120M      0
p-16-32     26.6G   381G    874      0   109M      0
p-16-32     26.6G   381G    916      0   114M      0
p-16-32     26.6G   381G    935      0   116M      0
p-16-32     26.6G   381G    984      0   122M      0
p-16-32     26.6G   381G  1.08K      0   138M      0
p-16-32     26.6G   381G  1.22K      0   156M      0
p-16-32     26.6G   381G  1.13K      0   144M      0
p-16-32     26.6G   381G  1.20K      0   154M      0
p-16-32     26.6G   381G    918      0   114M      0
p-16-32     26.6G   381G    848      0   106M      0
p-16-32     26.6G   381G    985      0   123M      0
p-16-32     26.6G   381G    973      0   121M      0
p-16-32     26.6G   381G  1.00K      0   128M      0
p-16-32     26.6G   381G    997      0   124M      0
p-16-32     26.6G   381G    953      0   119M      0
p-16-32     26.6G   381G    784      0  97.5M      0
p-16-32     26.6G   381G  1.04K      0   133M      0
[...]

Well, better.

This message posted from opensolaris.org
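
For anyone who wants to repeat this, here is a minimal sketch of the block-size
sweep I did by hand above. The pool name, file path and file size are the ones
from this report; the loop and the rough SECONDS-based throughput estimate are
only my wrapping, not part of the original runs.

#!/usr/bin/bash
# Sweep dd read block sizes against the 27GB test file and print a rough
# throughput for each size. Exporting and re-importing the pool between runs
# flushes cached data, so every read goes to the disks (same trick as above).
POOL=p-16-32                 # pool name used in this report
FILE=/$POOL/q1               # 27GB file written earlier with dd
FILE_MB=$((27 * 1024))       # file size in MB, for the throughput estimate

for BS in 128k 256k 512k 768k 800k 850k 1024k 2048k 8192k; do
    zpool export "$POOL"
    zpool import "$POOL"
    start=$SECONDS
    dd if="$FILE" of=/dev/null bs="$BS" 2>/dev/null
    elapsed=$((SECONDS - start))
    [ "$elapsed" -eq 0 ] && elapsed=1
    echo "bs=$BS  ~$((FILE_MB / elapsed)) MB/s"
done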
Joe Little
2006-Mar-03 16:51 UTC
[zfs-discuss] Performance degradation while reading sequentially with large block size
I'll bite on this one. What is S10U2P1 -- is there a Solaris 10 Update 2 that
contains ZFS?

On 3/3/06, Robert Milkowski <milek at task.gda.pl> wrote:
> Hi.
>
> v440 with S10U2p1 (zfs based on snv_b32). Several 3510 FC JBODs are directly
> connected, with two links under MPxIO. I sent the same report as SDR-0163.
>
> [... rest of the original report and zpool iostat output trimmed ...]
Roch Bourbonnais - Performance Engineering
2006-Mar-06 16:11 UTC
[zfs-discuss] Performance degradation while reading sequentially with large block size
I can't explain everything you see, but for 1MB reads and up we disable
prefetching, which may account for some of the behavior. The tunable that
controls that is

    uint64_t zfetch_array_rd_sz

-r

Joe Little writes:
 > I'll bite on this one. What is S10U2P1 -- is there a Solaris 10 Update
 > 2 that contains ZFS?
 >
 > On 3/3/06, Robert Milkowski <milek at task.gda.pl> wrote:
 > > [... original report and zpool iostat output trimmed ...]
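
For reference, a minimal way to inspect and raise this tunable on a running
system, assuming zfetch_array_rd_sz is a kernel global reachable with mdb, as
the uint64_t declaration above suggests; the 8MB value below is only an
example, not a recommendation from this thread.

# Print the current cutoff above which prefetch is skipped (8-byte unsigned decimal).
echo "zfetch_array_rd_sz/E" | mdb -k

# Raise it to 8MB on the live kernel; /Z writes an 8-byte value and needs a
# writable kernel target (mdb -kw). Takes effect immediately, does not survive
# a reboot.
echo "zfetch_array_rd_sz/Z 0x800000" | mdb -kw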
Robert Milkowski
2006-Mar-13 19:36 UTC
[zfs-discuss] Performance degradation while reading sequentially with large block size
Hello Roch,

Monday, March 6, 2006, 5:11:26 PM, you wrote:

RBPE> I can't explain everything you see, but for 1MB reads and up we
RBPE> disable prefetching, which may account for some of the behavior.
RBPE> The tunable that controls that is
RBPE>
RBPE>     uint64_t zfetch_array_rd_sz

Yep, that's it. With this parameter increased I get ~170MB/s even with an 8MB
block size; with it decreased I get ~40-60MB/s.

Thank you.

ps. there's a bug filed on it - 6395670

--
Best regards,
 Robert                          mailto:rmilkowski at task.gda.pl
                                 http://milek.blogspot.com
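
For completeness, the check boils down to re-running the earlier large-block
read after the tunable has been raised; this just re-combines commands already
shown in the original report.

# Flush cached data the same way as in the original runs, then repeat the
# 8MB-block read that previously ran at ~40-60MB/s.
zpool export p-16-32
zpool import p-16-32
dd if=/p-16-32/q1 of=/dev/null bs=8192k &
zpool iostat p-16-32 1      # with the tunable raised, ~170MB/s was reported above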
Robert Milkowski
2006-Mar-13 19:48 UTC
[zfs-discuss] Performance degradation while reading sequentially with large block size
Hello Robert,

Monday, March 13, 2006, 8:36:15 PM, you wrote:

RM> Hello Roch,
RM>
RM> Monday, March 6, 2006, 5:11:26 PM, you wrote:
RM>
RBPE>> I can't explain everything you see, but for 1MB reads and up we
RBPE>> disable prefetching, which may account for some of the behavior.
RBPE>> The tunable that controls that is
RBPE>>
RBPE>>     uint64_t zfetch_array_rd_sz
RM>
RM> Yep, that's it. With this parameter increased I get ~170MB/s even with
RM> an 8MB block size; with it decreased I get ~40-60MB/s.

After also increasing zfetch_block_cap from 32 to 256 (with zfetch_array_rd_sz
still increased) I get ~240-260MB/s, which is really good :)

--
Best regards,
 Robert                          mailto:rmilkowski at task.gda.pl
                                 http://milek.blogspot.com
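
To make both changes stick across reboots, the usual route would be
/etc/system. This is only a sketch, assuming both variables can be set as zfs
module globals there; the values are the ones discussed in this thread, with
8MB for zfetch_array_rd_sz as an example.

* /etc/system entries (sketch) - a reboot is required for them to take effect.
* Allow prefetch for reads up to 8MB instead of the 1MB cutoff mentioned above.
set zfs:zfetch_array_rd_sz=0x800000
* Raise zfetch_block_cap from its default of 32 to 256, as tested above.
set zfs:zfetch_block_cap=256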