Hello. I have run into a strange performance problem with a new disk shelf.

We have been using a ZFS system with SATA disks for a while: a Supermicro SC846-E16 chassis with a Supermicro X8DTH-6F motherboard, 96GB RAM, and 24 HITACHI HDS723020BLA642 SATA disks attached to the onboard LSI 2008 controller.

Quite satisfied with it, we bought an additional shelf with SAS disks for VM hosting. The new shelf is a Supermicro SC846-E26 chassis; the disk model is HITACHI HUS156060VLS600 (15K 600GB SAS2). An additional LSI 9205-8e controller was installed in the server and connected to the JBOD. I connected the JBOD with two channels and set up multipath first, but when I noticed the performance problem I disabled multipath and disconnected one cable (to be sure multipath is not the cause).

Problem description follows.

Creating a test pool with 5 pairs of mirrors (new shelf, SAS disks):

# zpool create -o version=28 -O primarycache=none test \
    mirror c9t5000CCA02A138899d0 c9t5000CCA02A102181d0 \
    mirror c9t5000CCA02A13500Dd0 c9t5000CCA02A13316Dd0 \
    mirror c9t5000CCA02A005699d0 c9t5000CCA02A004271d0 \
    mirror c9t5000CCA02A004229d0 c9t5000CCA02A1342CDd0 \
    mirror c9t5000CCA02A1251E5d0 c9t5000CCA02A1151DDd0

(primarycache=none is set to rule out ARC influence.)

Testing sequential write:

# dd if=/dev/zero of=/test/zero bs=1M count=2048
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB) copied, 1.04272 s, 2.1 GB/s

iostat while writing looks like:

   r/s    w/s   kr/s      kw/s wait actv wsvc_t asvc_t  %w  %b device
   0.0 1334.6    0.0  165782.9  0.0  8.4    0.0    6.3   1  86 c9t5000CCA02A1151DDd0
   0.0 1345.5    0.0  169575.3  0.0  8.7    0.0    6.5   1  88 c9t5000CCA02A1342CDd0
   2.0 1359.5    1.0  168969.8  0.0  8.7    0.0    6.4   1  90 c9t5000CCA02A13500Dd0
   0.0 1358.5    0.0  168714.0  0.0  8.7    0.0    6.4   1  90 c9t5000CCA02A13316Dd0
   0.0 1345.5    0.0  166669.3  0.0  9.0    0.0    6.7   1  92 c9t5000CCA02A102181d0
   1.0 1317.5    1.0  164456.9  0.0  8.5    0.0    6.5   1  88 c9t5000CCA02A004271d0
   4.0 1342.5    2.0  166282.2  0.0  8.5    0.0    6.3   1  88 c9t5000CCA02A1251E5d0
   0.0 1377.5    0.0  170515.5  0.0  8.7    0.0    6.3   1  90 c9t5000CCA02A138899d0

Now read:

# dd if=/test/zero of=/dev/null bs=1M
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB) copied, 13.5681 s, 158 MB/s

iostat while reading:

   r/s    w/s    kr/s  kw/s wait actv wsvc_t asvc_t  %w  %b device
 106.0    0.0 11417.4   0.0  0.0  0.2    0.0    2.4   0  14 c9t5000CCA02A004271d0
  80.0    0.0 10239.9   0.0  0.0  0.2    0.0    2.4   0  10 c9t5000CCA02A1251E5d0
 110.0    0.0 12182.4   0.0  0.0  0.1    0.0    1.3   0   9 c9t5000CCA02A138899d0
 102.0    0.0 11664.4   0.0  0.0  0.2    0.0    1.8   0  15 c9t5000CCA02A005699d0
  99.0    0.0 10900.9   0.0  0.0  0.3    0.0    3.0   0  16 c9t5000CCA02A004229d0
 107.0    0.0 11545.4   0.0  0.0  0.2    0.0    1.9   0  13 c9t5000CCA02A1151DDd0
  81.0    0.0 10367.9   0.0  0.0  0.2    0.0    2.2   0  11 c9t5000CCA02A1342CDd0

Unexpectedly low speed! Note the busy column: when writing it is about 90%, when reading only about 15%.

Raw read speed of individual disks (don't be confused by the name change; I connected the JBOD to another HBA channel):

# dd if=/dev/dsk/c8t5000CCA02A13889Ad0 of=/dev/null bs=1M count=2000
2000+0 records in
2000+0 records out
2097152000 bytes (2.1 GB) copied, 10.9685 s, 191 MB/s
# dd if=/dev/dsk/c8t5000CCA02A1342CEd0 of=/dev/null bs=1M count=2000
2000+0 records in
2000+0 records out
2097152000 bytes (2.1 GB) copied, 10.8024 s, 194 MB/s

The 10-disk mirror zpool reads slower than a single disk.

There is no tuning in /etc/system.

I tried the test with a FreeBSD 8.3 live CD; reads were the same (about 150 MB/s). I also tried SmartOS, but it can't see disks behind the LSI 9205-8e controller.
For comparison, this is the speed from the SATA pool (it consists of 4 six-disk raidz2 vdevs):

# dd if=CentOS-6.2-x86_64-bin-DVD1.iso of=/dev/null bs=1M
4218+1 records in
4218+1 records out
4423129088 bytes (4.4 GB) copied, 4.76552 s, 928 MB/s

     r/s    w/s     kr/s  kw/s wait  actv wsvc_t asvc_t  %w  %b device
 13614.4    0.0 800338.5   0.0  0.1  36.0    0.0    2.6   0 914 c6
   459.9    0.0  25761.4   0.0  0.0   0.8    0.0    1.8   0  22 c6t5000CCA369D16860d0
    84.0    0.0   2785.2   0.0  0.0   0.2    0.0    3.0   0  13 c6t5000CCA369D1B1E0d0
   836.9    0.0  50089.5   0.0  0.0   2.6    0.0    3.1   0  60 c6t5000CCA369D1B302d0
   411.0    0.0  24492.6   0.0  0.0   0.8    0.0    2.1   0  25 c6t5000CCA369D16982d0
   821.9    0.0  49385.1   0.0  0.0   3.0    0.0    3.7   0  67 c6t5000CCA369CFBDA3d0
   231.0    0.0  12292.5   0.0  0.0   0.5    0.0    2.3   0  18 c6t5000CCA369D17E73d0
   803.9    0.0  50091.5   0.0  0.0   2.9    0.0    3.6   1  69 c6t5000CCA369D0EA93d0

PS. Before testing I flashed the latest firmware and BIOS to the LSI 9205-8e. It came with factory firmware version 9; I flashed version 13.5. Now I think the hurry was not worth it, so I downgraded it to version 12. Read speed remains the same. Current controller versions:

# ./sas2flash -listall
LSI Corporation SAS2 Flash Utility
Version 12.00.00.00 (2011.11.08)
Copyright (c) 2008-2011 LSI Corporation. All rights reserved

Adapter Selected is a LSI SAS: SAS2008(B1)

Num   Ctlr            FW Ver        NVDATA        x86-BIOS      PCI Addr
----------------------------------------------------------------------------
0     SAS2008(B1)     12.00.00.00   0c.00.00.04   07.23.01.00   00:05:00:00
1     SAS2308_2(B0)   12.00.00.00   0c.00.00.04   07.23.01.00   00:84:00:00

Any suggestions or thoughts?
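For reference, a minimal sketch (not something run in this thread) of reading all ten SAS disks raw in parallel, using the device names from the pool layout above, to check whether the shared HBA/expander path can sustain the expected aggregate throughput:

#!/bin/sh
# Read 2 GB raw from each disk concurrently; the sum of the per-disk MB/s
# figures indicates whether the HBA/expander link itself is a bottleneck.
DISKS="c9t5000CCA02A138899d0 c9t5000CCA02A102181d0 c9t5000CCA02A13500Dd0 \
c9t5000CCA02A13316Dd0 c9t5000CCA02A005699d0 c9t5000CCA02A004271d0 \
c9t5000CCA02A004229d0 c9t5000CCA02A1342CDd0 c9t5000CCA02A1251E5d0 \
c9t5000CCA02A1151DDd0"
for d in $DISKS; do
    dd if=/dev/dsk/$d of=/dev/null bs=1M count=2048 &
done
wait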
Richard Elling
2012-Jul-23 13:39 UTC
[zfs-discuss] slow speed problem with a new SAS shelf
On Jul 22, 2012, at 10:18 PM, Yuri Vorobyev wrote:
> Hello.
>
> I have run into a strange performance problem with a new disk shelf.
> We have been using a ZFS system with SATA disks for a while.

What OS and release?
 -- richard

> [...]
--
ZFS Performance and Training
Richard.Elling at RichardElling.com
+1-760-896-4422
On 23.07.2012 19:39, Richard Elling wrote:
>> I have run into a strange performance problem with a new disk shelf.
>> We have been using a ZFS system with SATA disks for a while.
>
> What OS and release?

Oh, I forgot this important thing. It is OpenIndiana oi_151a5 now.
Hi,

Have you had a look at iostat -E (error counters) to make sure you don't have faulty cabling? I've had bad cables trip me up once in a manner similar to your situation here.

Cheers,
--
Saso

On 07/23/2012 07:18 AM, Yuri Vorobyev wrote:
> Hello.
>
> I have run into a strange performance problem with a new disk shelf.
> We have been using a ZFS system with SATA disks for a while.
> [...]
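A minimal sketch of that check (the grep pattern is only illustrative; iostat -En prints one error-counter line per device on illumos-based systems):

# Per-device error counters; non-zero Transport Errors usually point at
# cabling, backplane, or expander problems rather than the drives themselves.
iostat -En | grep 'Errors:'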
On 23.07.2012 21:59, Yuri Vorobyev wrote:
>>> I have run into a strange performance problem with a new disk shelf.
>>> We have been using a ZFS system with SATA disks for a while.
>>
>> What OS and release?
>
> Oh, I forgot this important thing.
> It is OpenIndiana oi_151a5 now.

New testing data:

I rebooted into the first boot environment with the original oi_151 dev installed (without updates). Read speed remains the same, about 150 MB/s.

Now something interesting. I booted a CentOS 6.3 live CD and created a software RAID10:

# mdadm -C /dev/md0 --level=raid10 --assume-clean --raid-devices=10 /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai /dev/sdaj /dev/sdak /dev/sdal

# dd if=/dev/zero of=zero bs=1M count=5000
5000+0 records in
5000+0 records out
5242880000 bytes (5.2 GB) copied, 9.50402 s, 552 MB/s

Cleaning the file system caches:

# free
             total       used       free     shared    buffers     cached
Mem:      99195228    6800076   92395152          0      22184    5269432
-/+ buffers/cache:    1508460   97686768
Swap:            0          0          0
# echo 3 > /proc/sys/vm/drop_caches
# free
             total       used       free     shared    buffers     cached
Mem:      99195228    1434564   97760664          0       1964      75988
-/+ buffers/cache:    1356612   97838616
Swap:            0          0          0

# dd if=zero of=/dev/null bs=1M
5000+0 records in
5000+0 records out
5242880000 bytes (5.2 GB) copied, 5.65738 s, 927 MB/s

iostat during reading is here: http://pastebin.com/rwd0LWdc
CentOS dmesg is here: https://dl.dropbox.com/u/12915469/centos_dmesg.txt

Sequential speed is 6 times higher than in OpenIndiana. Seems like a driver bug in mpt_sas? A reminder: the HBA is an LSI 9205-8e (2308 chip).

LSI support answered that OpenIndiana is not a supported OS (who would doubt it...).
What should I do? Go back to the time-tested LSI2008 HBA?
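As an aside, on Linux the page cache can also be bypassed per run with O_DIRECT, which avoids the manual drop_caches step; a minimal sketch using GNU dd against the same test file:

# Read back through O_DIRECT so this run bypasses the Linux page cache;
# no need to echo into /proc/sys/vm/drop_caches first.
dd if=zero of=/dev/null bs=1M iflag=direct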
On 25.07.2012 9:29, Yuri Vorobyev wrote:
> New testing data:
>
> I rebooted into the first boot environment with the original oi_151 dev installed (without updates).
> Read speed remains the same, about 150 MB/s.
>
> Now something interesting. I booted a CentOS 6.3 live CD and created a software RAID10.
> [...]
> Sequential speed is 6 times higher than in OpenIndiana. Seems like a driver bug in mpt_sas?
> A reminder: the HBA is an LSI 9205-8e (2308 chip).
>
> LSI support answered that OpenIndiana is not a supported OS (who would doubt it...).
> What should I do? Go back to the time-tested LSI2008 HBA?

We bought an LSI 9200-8e (LSI2008 chip). The card came with FW version 7; I have not upgraded it for now. I reconnected the shelf to it. No success: read speed is much lower than expected.

# dd if=3g of=/dev/null bs=1M
3000+0 records in
3000+0 records out
3145728000 bytes (3.1 GB) copied, 23.5578 s, 134 MB/s

Can someone with a Supermicro JBOD equipped with SAS drives and an LSI HBA run this sequential read test? Don't forget to set primarycache=none on the testing dataset.
On 08/26/2012 07:40 AM, Yuri Vorobyev wrote:
> Can someone with a Supermicro JBOD equipped with SAS drives and an LSI HBA run this sequential read test?

Did that on a SC847 with 45 drives; read speeds around 2GB/s aren't a problem.

> Don't forget to set primarycache=none on the testing dataset.

There's your problem. By disabling the cache you've essentially disabled prefetch. Why are you doing that?

--
Saso
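One way to see whether prefetch is actually doing anything during a read is to watch the zfetch kstats, as in this minimal sketch (it assumes the zfetchstats kstat is exposed on this OpenIndiana build; the exact field names can differ between releases, and "3g" is the test file used above):

# Snapshot ZFS prefetch statistics before and after a sequential read;
# if the counters barely move, prefetch is effectively out of the picture.
kstat -p zfs:0:zfetchstats > /tmp/zfetch.before
dd if=3g of=/dev/null bs=1M
kstat -p zfs:0:zfetchstats > /tmp/zfetch.after
diff /tmp/zfetch.before /tmp/zfetch.after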
On 27.08.2012 14:02, Sašo Kiselkov wrote:
>> Can someone with a Supermicro JBOD equipped with SAS drives and an LSI HBA run this sequential read test?
>
> Did that on a SC847 with 45 drives; read speeds around 2GB/s aren't a problem.

Thanks for the info.

>> Don't forget to set primarycache=none on the testing dataset.
>
> There's your problem. By disabling the cache you've essentially disabled prefetch. Why are you doing that?

Hm. The box has 96GB RAM, so I was trying to exclude the influence of the ARC. I hadn't thought about prefetch...

readspeed is:
readspeed () { dd if=$1 of=/dev/null bs=1M ;}

root@atom:/sas1/test# zfs set primarycache=metadata sas1/test
root@atom:/sas1/test# readspeed 3g
3000+0 records in
3000+0 records out
3145728000 bytes (3.1 GB) copied, 19.2203 s, 164 MB/s

Prefetch still disabled?

root@atom:/sas1/test# zfs set primarycache=all sas1/test
root@atom:/sas1/test# readspeed 3g
3000+0 records in
3000+0 records out
3145728000 bytes (3.1 GB) copied, 3.99195 s, 788 MB/s

This seems to be the disk read speed with prefetch enabled.

root@atom:/sas1/test# readspeed 3g
3000+0 records in
3000+0 records out
3145728000 bytes (3.1 GB) copied, 0.901665 s, 3.5 GB/s
root@atom:/sas1/test# readspeed 3g
3000+0 records in
3000+0 records out
3145728000 bytes (3.1 GB) copied, 1.02127 s, 3.1 GB/s
root@atom:/sas1/test# readspeed 3g
3000+0 records in
3000+0 records out
3145728000 bytes (3.1 GB) copied, 0.86884 s, 3.6 GB/s

These results obviously come from memory.

Is there any way to disable the ARC for testing and leave prefetch enabled?
On 08/27/2012 10:37 AM, Yuri Vorobyev wrote:
> Is there any way to disable the ARC for testing and leave prefetch enabled?

No. The reason is quite simply that prefetch is a mechanism separate from your application's direct read requests. Prefetch runs ahead of your anticipated read requests and places the blocks it expects you'll need in the ARC, so by disabling the ARC you've disabled prefetch as well.

You can get around the problem by exporting and importing the pool between testing runs, which will clear the ARC, so do:

# dd if=/dev/zero of=testfile bs=1024k count=10000
# zpool export sas1
# zpool import sas1
# dd if=testfile of=/dev/null bs=1024k

Cheers,
--
Saso
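A small wrapper in the spirit of the readspeed function above could package this; a sketch only, with the pool name and file path taken from this thread and adjusted as needed:

# Cold-cache read benchmark: clear the ARC by export/import of the pool,
# then time a sequential read of the given file.
coldread () {
    pool=$1; file=$2
    zpool export "$pool" && zpool import "$pool" || return 1
    dd if="$file" of=/dev/null bs=1M
}

# Example: coldread sas1 /sas1/test/3g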
On 27.08.2012 14:43, Sašo Kiselkov wrote:
>> Is there any way to disable the ARC for testing and leave prefetch enabled?
>
> No. The reason is quite simply that prefetch is a mechanism separate from your application's direct read requests. Prefetch runs ahead of your anticipated read requests and places the blocks it expects you'll need in the ARC, so by disabling the ARC you've disabled prefetch as well.
>
> You can get around the problem by exporting and importing the pool between testing runs, which will clear the ARC, so do:
>
> # dd if=/dev/zero of=testfile bs=1024k count=10000
> # zpool export sas1
> # zpool import sas1
> # dd if=testfile of=/dev/null bs=1024k

Thank you very much, Sašo. Now I see the hardware works without problems.

I created another 10-disk zpool of mirror pairs for testing:

root@atom:/# zpool export sas2 ; zpool import sas2
root@atom:/# readspeed /sas2/5g
5120+0 records in
5120+0 records out
5368709120 bytes (5.4 GB) copied, 5.73728 s, 936 MB/s
root@atom:/# zpool export sas2 ; zpool import sas2
root@atom:/# readspeed /sas2/5g
5120+0 records in
5120+0 records out
5368709120 bytes (5.4 GB) copied, 5.63869 s, 952 MB/s
On 08/27/2012 12:58 PM, Yuri Vorobyev wrote:
> Thank you very much, Sašo.

You're very welcome.

> Now I see the hardware works without problems.
>
> I created another 10-disk zpool of mirror pairs for testing:
>
> root@atom:/# zpool export sas2 ; zpool import sas2
> root@atom:/# readspeed /sas2/5g
> 5120+0 records in
> 5120+0 records out
> 5368709120 bytes (5.4 GB) copied, 5.73728 s, 936 MB/s
> root@atom:/# zpool export sas2 ; zpool import sas2
> root@atom:/# readspeed /sas2/5g
> 5120+0 records in
> 5120+0 records out
> 5368709120 bytes (5.4 GB) copied, 5.63869 s, 952 MB/s

Sounds about right, that's ~94MB/s from each drive. Have you tried running multiple dd's in parallel? ZFS likes to have its pipelines fairly saturated, so chances are you'll get higher total performance numbers with multiple parallel readers, like this:

(create multiple files like /sas2/5g_1, /sas2/5g_2, /sas2/5g_3, etc...)

# zpool export sas2 && zpool import sas2
# readspeed /sas2/5g_1 & readspeed /sas2/5g_2 & readspeed /sas2/5g_3

Then simply sum up the MB/s from each dd operation. Also, if possible, use larger files, not just 5GB, something on the order of a few hundred GB. You can then watch your pool's performance via "zpool iostat sas2 5" (that's how I usually do it).

Cheers,
--
Saso
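Put together, the parallel test might look like the sketch below, based on the commands in this message; the file names and the number of readers are arbitrary:

# Create four 5 GB test files, clear the ARC via export/import, then read
# them back in parallel; total throughput is the sum of the dd figures.
for i in 1 2 3 4; do
    dd if=/dev/zero of=/sas2/5g_$i bs=1M count=5120
done
zpool export sas2 && zpool import sas2
for i in 1 2 3 4; do
    dd if=/sas2/5g_$i of=/dev/null bs=1M &
done
wait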