Felix Buenemann
2010-Feb-23 17:16 UTC
[zfs-discuss] controller cache instead of dedicated ZIL device
Hi,

as it turns out to be pretty difficult (or expensive) to find high performance dedicated ZIL devices, I had another thought:

If using a RAID controller with a large cache, e.g. 4GB and battery backup, in JBOD mode and using the on-disk ZIL, wouldn't the controller cache work as a great ZIL accelerator, requiring no dedicated ZIL?

If my understanding is correct, a battery-backed RAID controller will ignore cache flush commands, and thus the controller cache would be a very low latency intermediate cache which is preserved over power failure.

Best Regards,
Felix Buenemann
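One way to sanity-check the idea would be to drive a sync-heavy workload (e.g. an untar over NFS) at a test pool sitting behind such a controller and watch per-vdev activity while it runs. This is only a sketch; the pool name is a placeholder:

# zpool iostat -v tank 1    # "tank" is a placeholder pool name; 1-second samples per vdev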
Richard Elling
2010-Feb-23 17:25 UTC
[zfs-discuss] controller cache instead of dedicated ZIL device
On Feb 23, 2010, at 9:16 AM, Felix Buenemann wrote:

> Hi,
>
> as it turns out to be pretty difficult (or expensive) to find high performance dedicated ZIL devices, I had another thought:
>
> If using a RAID controller with a large cache, e.g. 4GB and battery backup, in JBOD mode and using the on-disk ZIL, wouldn't the controller cache work as a great ZIL accelerator, requiring no dedicated ZIL?

Yes. ZIL is a performance problem for HDD JBODs, not so much on devices with fast, nonvolatile write caches.

> If my understanding is correct, a battery-backed RAID controller will ignore cache flush commands, and thus the controller cache would be a very low latency intermediate cache which is preserved over power failure.

Cache flush latency is orthogonal to slog latency.
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
http://nexenta-atlanta.eventbrite.com (March 16-18, 2010)
Hi all,

I'm currently evaluating the possibility of migrating an NFS server based on Linux (CentOS 5.4 / RHEL 5.4, x64-32) to an OpenSolaris box, and I'm seeing some huge CPU usage on the OpenSolaris box.

The ZFS box is a Dell R710 with 2 quad-core CPUs (Intel E5506 @ 2.13GHz), 16GB RAM, and 2 Sun non-RAID HBAs connected to two J4400 JBODs, while the Linux box is a 2x Xeon 3.0GHz with 8GB RAM and an Areca HBA with 512MB cache. Both servers have an Intel 10GbE card with jumbo frames enabled.

This ZFS box has one pool built from raidz2 vdevs, with multipath enabled (to make use of the 2 HBAs and 2 J4400s) and 20 disks (SATA 7,200rpm Seagate enterprise drives, as supplied by Sun). The pool has 5 raidz2 vdevs with 4 disks each.

The test is made by mounting on the Linux box one NFS share from the ZFS box and copying around 1.1TB of data. This data is users' home directories, so thousands of small files. During the copy from the Linux box to the ZFS box the load on the ZFS box is between 8 and 10, while on the Linux box it never goes over 1.

Could the fact of having a RAIDZ2 configuration be the cause for such a big load on the ZFS box, or maybe am I missing something?

Thanks for all your time,
Bruno

Here are some more specs from the ZFS box:

root at zfsbox01:/var/adm# zpool status -v RAIDZ2
  pool: RAIDZ2
 state: ONLINE
 scrub: none requested
config:

        NAME                       STATE     READ WRITE CKSUM
        RAIDZ2                     ONLINE       0     0     0
          raidz2-0                 ONLINE       0     0     0
            c0t5000C5001A101764d0  ONLINE       0     0     0
            c0t5000C5001A315D0Ad0  ONLINE       0     0     0
            c0t5000C5001A10EC6Bd0  ONLINE       0     0     0
            c0t5000C5001A0FFF4Bd0  ONLINE       0     0     0
          raidz2-1                 ONLINE       0     0     0
            c0t5000C50019C0A04Ed0  ONLINE       0     0     0
            c0t5000C5001A0FA028d0  ONLINE       0     0     0
            c0t5000C50019FCF180d0  ONLINE       0     0     0
            c0t5000C5001A11E657d0  ONLINE       0     0     0
          raidz2-2                 ONLINE       0     0     0
            c0t5000C5001A104A30d0  ONLINE       0     0     0
            c0t5000C5001A316841d0  ONLINE       0     0     0
            c0t5000C5001A0FF92Ed0  ONLINE       0     0     0
            c0t5000C50019EB02FDd0  ONLINE       0     0     0
          raidz2-3                 ONLINE       0     0     0
            c0t5000C5001A0FDBDCd0  ONLINE       0     0     0
            c0t5000C5001A0F2197d0  ONLINE       0     0     0
            c0t5000C50019BDBB8Dd0  ONLINE       0     0     0
            c0t5000C5001A3152A0d0  ONLINE       0     0     0
          raidz2-4                 ONLINE       0     0     0
            c0t5000C5001A100DA0d0  ONLINE       0     0     0
            c0t5000C5001A31544Cd0  ONLINE       0     0     0
            c0t5000C50019F03AF6d0  ONLINE       0     0     0
            c0t5000C50019FC3055d0  ONLINE       0     0     0

###############

root at zfsbox01:~# zpool iostat RAIDZ2 5
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
RAIDZ2      2.29T  15.8T     43    305  3.03M  14.6M
RAIDZ2      2.29T  15.8T    114    663  12.7M  18.6M
RAIDZ2      2.29T  15.8T    129    595  14.0M  11.2M
RAIDZ2      2.29T  15.8T    881    623  13.0M  10.4M
RAIDZ2      2.29T  15.8T    227    449  8.48M  17.5M
RAIDZ2      2.29T  15.8T     39    498  4.55M  29.1M

#######################################

root at zfsbox01:~# top -b | grep CPU | head -n1
CPU states: 35.2% idle, 2.2% user, 62.6% kernel, 0.0% iowait, 0.0% swap

root at zfsbox01:~# mpstat
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0   55   0 16969 18180  102  785   55  127 1779    4   242    1  69   0  30
  1   70   0 18005 16820    4  926   44  142 1889    6   159    1  65   0  35
  2   42   0 16659 18091  262  555   53  113 1757   11   250    2  68   0  31
  3   48   0 18221 17380  246  667   40  122 1929   12   132    1  66   0  33
  4   38   0 16547 19965 1766  517   48  107 1775   10   264    2  70   0  29
  5   42   0 18596 19113 1527  595   35  115 1987    6   156    1  69   0  31
  6   23   0 16284 17921   10 2066   54  109 1763    4   115    1  70   0  29
  7   32   0 17576 16665    3 2233   39  134 1847    5    90    0  64   0  35

top -b | grep Memory
Memory: 16G phys mem, 2181M free mem, 8187M total swap, 8187M free swap

Feb 18 11:42:36 zfsbox01 unix: [ID 378719 kern.info] NOTICE: cpu_acpi: _PSS package evaluation failed for with status 5 for CPU 2.
Feb 18 11:42:36 zfsbox01 unix: [ID 388705 kern.info] NOTICE: cpu_acpi: error parsing _PSS for CPU 2
Feb 18 11:43:12 zfsbox01 ixgbe: [ID 611667 kern.info] NOTICE: ixgbe0: identify 82598 adapter
Feb 18 11:43:12 zfsbox01 ixgbe: [ID 611667 kern.info] NOTICE: ixgbe0: Request 16 handles, 2 available
Feb 18 11:43:12 zfsbox01 pcplusmp: [ID 805372 kern.info] pcplusmp: pciex8086,10c7 (ixgbe) instance 0 irq 0x45 vector 0x66 ioapic 0xff intin 0xff is bound to cpu 3
Feb 18 11:43:12 zfsbox01 pcplusmp: [ID 805372 kern.info] pcplusmp: pciex8086,10c7 (ixgbe) instance 0 irq 0x46 vector 0x67 ioapic 0xff intin 0xff is bound to cpu 4
Feb 18 11:43:12 zfsbox01 mac: [ID 469746 kern.info] NOTICE: ixgbe0 registered
Feb 18 11:43:12 zfsbox01 ixgbe: [ID 611667 kern.info] NOTICE: ixgbe0: Intel 10Gb Ethernet, driver version 1.1.4
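To see where that kernel time actually goes while the copy runs, a quick first step (a minimal sketch; the interval is arbitrary) is per-thread microstate accounting:

# prstat -mL 5    # -m microstate columns, -L one line per LWP, 5-second samples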
On Tue, 23 Feb 2010, Bruno Sousa wrote:

> Could the fact of having a RAIDZ2 configuration be the cause for such a
> big load on the ZFS box, or maybe am I missing something?

Zfs can consume appreciable CPU if compression, sha256 checksums, and/or deduplication is enabled. Otherwise, substantial CPU consumption is unexpected.

Are compression, sha256 checksums, or deduplication enabled for the filesystem you are using?

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
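For reference, those three properties can be checked in one go with zfs get (the dedup property only exists on builds that already have deduplication); the dataset name below is only a placeholder for whatever filesystem is being exported:

# zfs get compression,checksum,dedup RAIDZ2/export    # "RAIDZ2/export" is a placeholder dataset name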
On Tue, Feb 23, 2010 at 01:03:04PM -0600, Bob Friesenhahn wrote:

> Zfs can consume appreciable CPU if compression, sha256 checksums,
> and/or deduplication is enabled. Otherwise, substantial CPU
> consumption is unexpected.

In terms of scaling, does ZFS on OpenSolaris play well on multiple cores? How many disks (assuming 100 MByte/s throughput for each) would be considered pushing it for a current single-socket quadcore?

> Are compression, sha256 checksums, or deduplication enabled for the
> filesystem you are using?

--
Eugen* Leitl <a href="http://leitl.org">leitl</a> http://leitl.org
______________________________________________________________
ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE
Hi,

I don't have compression and deduplication enabled, but checksums are. However disabling checksums gives a 0.5 load reduction only...

Bruno

On 23-2-2010 20:27, Eugen Leitl wrote:
> On Tue, Feb 23, 2010 at 01:03:04PM -0600, Bob Friesenhahn wrote:
>
>> Zfs can consume appreciable CPU if compression, sha256 checksums,
>> and/or deduplication is enabled. Otherwise, substantial CPU
>> consumption is unexpected.
>
> In terms of scaling, does ZFS on OpenSolaris play well on multiple
> cores? How many disks (assuming 100 MByte/s throughput for each)
> would be considered pushing it for a current single-socket quadcore?
>
>> Are compression, sha256 checksums, or deduplication enabled for the
>> filesystem you are using?
Hi Bob,

I have neither deduplication nor compression enabled. Checksums are enabled, but if I try to disable them I gain around 0.5 less load on the box, so it still seems to be too much.

Bruno

On 23-2-2010 20:03, Bob Friesenhahn wrote:
> On Tue, 23 Feb 2010, Bruno Sousa wrote:
>> Could the fact of having a RAIDZ2 configuration be the cause for such a
>> big load on the ZFS box, or maybe am I missing something?
>
> Zfs can consume appreciable CPU if compression, sha256 checksums,
> and/or deduplication is enabled. Otherwise, substantial CPU
> consumption is unexpected.
>
> Are compression, sha256 checksums, or deduplication enabled for the
> filesystem you are using?
>
> Bob
> --
> Bob Friesenhahn
> bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
> GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
On Tue, 23 Feb 2010, Eugen Leitl wrote:

> In terms of scaling, does ZFS on OpenSolaris play well on multiple
> cores? How many disks (assuming 100 MByte/s throughput for each)
> would be considered pushing it for a current single-socket quadcore?

In any large storage system, most disks are relatively unused. It is not normal for all disks to be pumping through their rated throughput at one time. PCIe interfaces are only capable of a certain amount of bandwidth and this will place a hard limit on maximum throughput. There are also limits based on the raw memory bandwidth of the machine.

OpenSolaris is the king of multi-threading and excels on multiple cores. Without this fine level of threading, SPARC CMT hardware would be rendered useless. With this in mind, some older versions of OpenSolaris did experience a thread priority problem when compression was used.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
On Tue, 23 Feb 2010, Bruno Sousa wrote:

> I don't have compression and deduplication enabled, but checksums are.
> However disabling checksums gives a 0.5 load reduction only...

Since high CPU consumption is unusual, I would suspect a device driver issue. Perhaps there is an interrupt conflict such that two devices are using the same interrupt.

On my own system (12 disks), I can run a throughput benchmark and the system remains completely usable as an interactive desktop system, without any large use of CPU or high load factor. The bandwidth bottleneck in my case is the PCIe (4 lane) fiber channel card and its duplex connection to the storage array.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
The system becomes really slow during the data copy over the network, but if I copy data between 2 pools on the box I don't notice that issue, so I'm probably hitting some sort of interrupt conflict in the network cards... This system is configured with a lot of interfaces, namely:

4 internal Broadcom gigabit
1 PCIe 4x, Intel Dual Pro gigabit
1 PCIe 4x, Intel 10GbE card
2 PCIe 8x Sun non-RAID HBA

With all of this, is there any way to check if there is indeed an interrupt conflict or some other type of conflict that leads to this high load? I also noticed some messages about ACPI... can this ACPI also affect the performance of the system?

Regards,
Bruno

On 23-2-2010 20:47, Bob Friesenhahn wrote:
> On Tue, 23 Feb 2010, Bruno Sousa wrote:
>
>> I don't have compression and deduplication enabled, but checksums are.
>> However disabling checksums gives a 0.5 load reduction only...
>
> Since high CPU consumption is unusual, I would suspect a device driver
> issue. Perhaps there is an interrupt conflict such that two devices
> are using the same interrupt.
>
> On my own system (12 disks), I can run a throughput benchmark and the
> system remains completely usable as an interactive desktop system,
> without any large use of CPU or high load factor. The bandwidth
> bottleneck in my case is the PCIe (4 lane) fiber channel card and its
> duplex connection to the storage array.
>
> Bob
On 23 Feb 2010, at 19:53, Bruno Sousa wrote:

> The system becomes really slow during the data copy over the network, but if I copy data between 2 pools on the box I don't notice that issue, so I'm probably hitting some sort of interrupt conflict in the network cards... This system is configured with a lot of interfaces, namely:
>
> 4 internal Broadcom gigabit
> 1 PCIe 4x, Intel Dual Pro gigabit
> 1 PCIe 4x, Intel 10GbE card
> 2 PCIe 8x Sun non-RAID HBA
>
> With all of this, is there any way to check if there is indeed an interrupt conflict or some other type of conflict that leads to this high load? I also noticed some messages about ACPI... can this ACPI also affect the performance of the system?

To see what interrupts are being shared:

# echo "::interrupts -d" | mdb -k

Running intrstat might also be interesting.

Cheers,

Chris
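A minimal intrstat invocation, assuming it is run while the NFS copy is active (interval and sample count are arbitrary), would be something like:

# intrstat 5 6    # 5-second intervals, 6 samples; shows per-device interrupt counts and CPU time per CPU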
On Tue, February 23, 2010 17:20, Chris Ridd wrote:

> To see what interrupts are being shared:
>
> # echo "::interrupts -d" | mdb -k
>
> Running intrstat might also be interesting.

This just caught my attention. I'm not the original poster, but this sparked something I've been wanting to know about for a while.

I know from startup log messages that I've got several interrupts being shared. I've been wondering how serious this is. I don't have any particular performance problems, but then again my cpu and motherboard are from 2006 and I'd like to extend their service life, so using them more efficiently isn't a bad idea. Plus it's all a learning experience :-).

While I see the relevance to diagnosing performance problems, for my case, is there likely to be anything I can do about interrupt assignments? Or is this something that, if it's a problem, is an unfixable problem (short of changing hardware)? I think there's BIOS stuff to shuffle interrupt assignments some, but do changes at that level survive kernel startup, or get overwritten?

If there's nothing I can do, then no real point in my investigating further. However, if there's possibly something to do, what kinds of things should I look for as problems in the mdb or intrstat data?

mdb reports:

# echo "::interrupts -d" | mdb -k
IRQ  Vect IPL Bus  Trg Type   CPU Share APIC/INT# Driver Name(s)
1    0x42 5   ISA  Edg Fixed  1   1     0x0/0x1   i8042#0
4    0xb0 12  ISA  Edg Fixed  1   1     0x0/0x4   asy#0
6    0x44 5   ISA  Edg Fixed  0   1     0x0/0x6   fdc#0
9    0x81 9   PCI  Lvl Fixed  1   1     0x0/0x9   acpi_wrapper_isr
12   0x43 5   ISA  Edg Fixed  0   1     0x0/0xc   i8042#0
14   0x45 5   ISA  Edg Fixed  1   1     0x0/0xe   ata#0
16   0x83 9   PCI  Lvl Fixed  0   1     0x0/0x10  pci-ide#1
19   0x86 9   PCI  Lvl Fixed  1   1     0x0/0x13  hci1394#0
20   0x41 5   PCI  Lvl Fixed  0   2     0x0/0x14  nv_sata#1, nv_sata#0
21   0x84 9   PCI  Lvl Fixed  1   2     0x0/0x15  nv_sata#2, ehci#0
22   0x85 9   PCI  Lvl Fixed  0   2     0x0/0x16  audiohd#0, ohci#0
23   0x60 6   PCI  Lvl Fixed  1   2     0x0/0x17  nge#1, nge#0
24   0x82 7   PCI  Edg MSI    0   1     -         pcie_pci#0
25   0x40 5   PCI  Edg MSI    1   1     -         mpt#0
26   0x30 4   PCI  Edg MSI    1   1     -         pcie_pci#5
27   0x87 7   PCI  Edg MSI    0   1     -         pcie_pci#4
160  0xa0 0        Edg IPI    all 0     -         poke_cpu
192  0xc0 13       Edg IPI    all 1     -         xc_serv
208  0xd0 14       Edg IPI    all 1     -         kcpc_hw_overflow_intr
209  0xd1 14       Edg IPI    all 1     -         cbe_fire
210  0xd3 14       Edg IPI    all 1     -         cbe_fire
240  0xe0 15       Edg IPI    all 1     -         xc_serv
241  0xe1 15       Edg IPI    all 1     -         apic_error_intr

--
David Dyer-Bennet, dd-b at dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
On 02/23/10 15:20, Chris Ridd wrote:
> On 23 Feb 2010, at 19:53, Bruno Sousa wrote:
>
>> The system becomes really slow during the data copy over the network, but if I copy data between 2 pools on the box I don't notice that issue, so I'm probably hitting some sort of interrupt conflict in the network cards... This system is configured with a lot of interfaces, namely:
>>
>> 4 internal Broadcom gigabit
>> 1 PCIe 4x, Intel Dual Pro gigabit
>> 1 PCIe 4x, Intel 10GbE card
>> 2 PCIe 8x Sun non-RAID HBA
>>
>> With all of this, is there any way to check if there is indeed an interrupt conflict or some other type of conflict that leads to this high load? I also noticed some messages about ACPI... can this ACPI also affect the performance of the system?
>
> To see what interrupts are being shared:
>
> # echo "::interrupts -d" | mdb -k
>
> Running intrstat might also be interesting.
>
> Cheers,
>
> Chris

Is this using the mpt driver? There's an issue w/ the fix for 6863127 that causes performance problems on larger memory machines, filed as 6908360.

- Bart

--
Bart Smaalders                  Solaris Kernel Performance
barts at cyber.eng.sun.com         http://blogs.sun.com/barts
"You will contribute more with mercurial than with thunderbird."
Hi Bart,
yep, I got Bruno to run a kernel profile lockstat... it does look like the mpt issue..

andy

-------------------------------------------------------------------------------
Count indv cuml rcnt     nsec Hottest CPU+PIL        Caller
 2861   7%  55% 0.00     4889 cpu[1]+5               do_splx

      nsec ------ Time Distribution ------ count     Stack
      1024 |                               1         xc_common
      2048 |@@                             213       xc_call
      4096 |@@@@@@@@@@@                    1136      hat_tlb_inval
      8192 |@@@@@@@@@@@@                   1237      x86pte_inval
     16384 |@@                             256       hat_pte_unmap
     32768 |                               15        hat_unload_callback
     65536 |                               1         hat_unload
    131072 |                               2         segkmem_free_vn
                                                     segkmem_free
                                                     vmem_xfree
                                                     vmem_free
                                                     kfreea
                                                     i_ddi_mem_free
                                                     rootnex_teardown_copybuf
                                                     rootnex_coredma_unbindhdl
                                                     rootnex_dma_unbindhdl
                                                     ddi_dma_unbind_handle
                                                     scsi_dmafree_attr
                                                     scsi_free_cache_pkt
-------------------------------------------------------------------------------
Count indv cuml rcnt     nsec Hottest CPU+PIL        Caller
 1857   5%  59% 0.00     1907 cpu[0]+5               getctgsz

      nsec ------ Time Distribution ------ count     Stack
      1024 |@@@                            206       kfreea
      2048 |@@@@@@@@@@@@@@@@@@@            1203      i_ddi_mem_free
      4096 |@@@@@@                         387       rootnex_teardown_copybuf
      8192 |                               24        rootnex_coredma_unbindhdl
     16384 |                               25        rootnex_dma_unbindhdl
     32768 |                               12        ddi_dma_unbind_handle
                                                     scsi_dmafree_attr
                                                     scsi_free_cache_pkt
                                                     scsi_destroy_pkt
                                                     vhci_scsi_destroy_pkt
                                                     scsi_destroy_pkt
                                                     sd_destroypkt_for_buf
                                                     sd_return_command
                                                     sdintr
                                                     scsi_hba_pkt_comp
                                                     vhci_intr
                                                     scsi_hba_pkt_comp
                                                     mpt_doneq_empty
                                                     mpt_intr
-------------------------------------------------------------------------------

On 24 Feb 2010, at 10:31, Bart Smaalders wrote:

> Is this using the mpt driver? There's an issue w/ the fix for
> 6863127 that causes performance problems on larger memory
> machines, filed as 6908360.
>
> - Bart
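For anyone wanting to reproduce this kind of profile, the usual kernel-profiling lockstat run looks something along these lines (the 30-second sample window is arbitrary):

# lockstat -kIW -D 20 sleep 30    # -I sample the profiling interrupt, -k coalesce PCs into functions, -W group by caller, -D 20 show the top 20 entries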
Yes, I'm using the mpt driver. In total this system has 3 HBAs: 1 internal (Dell PERC) and 2 Sun non-RAID HBAs.
I'm also using multipath, but if I disable multipath I have pretty much the same results..

Bruno

On 24-2-2010 19:42, Andy Bowers wrote:
> Hi Bart,
> yep, I got Bruno to run a kernel profile lockstat... it does look like the mpt issue..
>
> andy
>
> On 24 Feb 2010, at 10:31, Bart Smaalders wrote:
>
>> Is this using the mpt driver? There's an issue w/ the fix for
>> 6863127 that causes performance problems on larger memory
>> machines, filed as 6908360.
>>
>> - Bart
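To double-check that the mpt module is the one handling the Sun HBAs, and which revision is loaded (the exact version string varies by build), something like this should be enough:

# modinfo | grep -w mpt    # lists the loaded mpt module together with its version/info string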
On 02/24/10 12:57, Bruno Sousa wrote:

> Yes, I'm using the mpt driver. In total this system has 3 HBAs: 1
> internal (Dell PERC) and 2 Sun non-RAID HBAs.
> I'm also using multipath, but if I disable multipath I have pretty much
> the same results..
>
> Bruno

From what I understand, the fix is expected "very soon"; your performance is getting killed by the over-aggressive use of bounce buffers...

- Bart

--
Bart Smaalders                  Solaris Kernel Performance
barts at cyber.eng.sun.com         http://blogs.sun.com/barts
"You will contribute more with mercurial than with thunderbird."
Hi,

Until it's fixed, should the 132 build be used instead of the 133?

Bruno

On 25-2-2010 3:22, Bart Smaalders wrote:
> On 02/24/10 12:57, Bruno Sousa wrote:
>> Yes, I'm using the mpt driver. In total this system has 3 HBAs: 1
>> internal (Dell PERC) and 2 Sun non-RAID HBAs.
>> I'm also using multipath, but if I disable multipath I have pretty much
>> the same results..
>>
>> Bruno
>
> From what I understand, the fix is expected "very soon"; your
> performance is getting killed by the over-aggressive use of
> bounce buffers...
>
> - Bart
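To confirm which build a box is actually running before deciding, the release banner and kernel version string are enough:

# cat /etc/release    # shows the installed OpenSolaris build
# uname -v            # kernel version string, typically snv_<build>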
dd-b at dd-b.net said:

> I know from startup log messages that I've got several interrupts being
> shared. I've been wondering how serious this is. I don't have any
> particular performance problems, but then again my cpu and motherboard are
> from 2006 and I'd like to extend their service life, so using them more
> efficiently isn't a bad idea. Plus it's all a learning experience :-).

Mine's from 2004, and I've been going through the same adjustments here.

> While I see the relevance to diagnosing performance problems, for my case, is
> there likely to be anything I can do about interrupt assignments? Or is this
> something that, if it's a problem, is an unfixable problem (short of changing
> hardware)? I think there's BIOS stuff to shuffle interrupt assignments some,
> but do changes at that level survive kernel startup, or get overwritten?

Experience with my motherboard is that even when you switch the BIOS "Plug-n-Play OS" setting between "No" and "Yes", Solaris 10 doesn't seem to change where it maps any devices. Probably a removal of the /etc/path_to_inst file and a reconfiguration reboot would be required, but even that won't move devices required for booting. Also, the onboard devices (like your nv_sata, ehci, etc.) are not likely to move around at all. Only things that could be moved to different PCI/PCI-X/PCIe slots are likely to move.

Ran across this note:
  http://blogs.sun.com/sming56/entry/interrupts_output_in_mdb

I found it pretty time-consuming just mapping the OS's device instance numbers to the physical devices. Taking the device instance numbers from "intrstat" or "echo '::interrupts -d' | mdb -k" and digging through the output of "prtconf -Dv" and/or boot-up /var/adm/messages stuff was pretty tedious.

Check out what mine looks like, in particular the case where four devices share the same interrupt -- the two onboard SATA ports, onboard ethernet, and one slow-mode USB port (Intel ICH5 chipset). There doesn't appear to be a thing you can do about this sharing. The system's never seemed slow, though I do try to avoid using that particular USB port.

# echo '::interrupts -d' | mdb -k
IRQ  Vector IPL Bus Type  CPU Share APIC/INT# Driver Name(s)
1    0x41   5   ISA Fixed 0   1     0x0/0x1   i8042#0
6    0x43   5   ISA Fixed 0   1     0x0/0x6   fdc#0
9    0x81   9   PCI Fixed 0   1     0x0/0x9   acpi_wrapper_isr
12   0x42   5   ISA Fixed 0   1     0x0/0xc   i8042#0
15   0x44   5   ISA Fixed 0   1     0x0/0xf   ata#1
16   0x82   9   PCI Fixed 0   3     0x0/0x10  uhci#3, uhci#0, nvidia#0
17   0x86   9   PCI Fixed 0   1     0x0/0x11  audio810#0
18   0x85   9   PCI Fixed 0   4     0x0/0x12  pci-ide#1, e1000g#0, uhci#2, pci-ide#1
19   0x84   9   PCI Fixed 0   1     0x0/0x13  uhci#1
22   0x40   5   PCI Fixed 0   1     0x0/0x16  pci-ide#2
23   0x83   9   PCI Fixed 0   1     0x0/0x17  ehci#0
160  0xa0   0       IPI   ALL 0     -         poke_cpu
192  0xc0   13      IPI   ALL 1     -         xc_serv
208  0xd0   14      IPI   ALL 1     -         kcpc_hw_overflow_intr
209  0xd1   14      IPI   ALL 1     -         cbe_fire
210  0xd3   14      IPI   ALL 1     -         cbe_fire
240  0xe0   15      IPI   ALL 1     -         xc_serv
241  0xe1   15      IPI   ALL 1     -         apic_error_intr
#

Regards,

Marion
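A rough way to shorten that correlation step, once the driver#instance of interest is known from the interrupt listing, is to pull the matching node straight out of the device tree; nv_sata below is just an example driver name:

# echo '::interrupts -d' | mdb -k    # note the driver#instance sharing an IRQ
# prtconf -D | grep -i nv_sata       # "nv_sata" is an example; shows the device node(s) bound to that driver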