Hi all. Hope you can still help here.

Solaris 11 Express, x86 platform, E6600 with 6GB of RAM.

I have a fairly new S11E box I'm using as a file server: 3x 1.5TB HDDs in a raidz pool. When I first set it up I was getting 110MB/sec writes across the gigabit network via SMB shares. I noticed recently that the write rate has dropped off, and through testing I am now getting 35MB/sec writes. The pool is around 50-60% full.

I am getting a CONSTANT 30-35% kernel CPU utilisation, even when the machine is idle. I do not know whether this was the case when the write performance was better. I have tried reading from the server to a HDD on a Windows client and I get 50+MB/sec, which is probably the maximum that that HDD can sustain on a write.

It's interesting that when I boot into the S11E live CD, my CPU sits at 100% idle, but as soon as I install to HDD, it shows that 30-35% kernel utilisation. I do not know whether the CPU utilisation and the bad write performance are related.

I did the following while the system was idle:
ian@Nas:~# lockstat -gkIW sleep 60

Profiling interrupt: 11648 events in 60.038 seconds (194 events/sec)

Count genr cuml rcnt     nsec Hottest CPU+PIL  Caller
10534  90% ---- 0.00   165530 cpu[1]           thread_start
 9750  84% ---- 0.00   147359 cpu[1]           idle
 8128  70% ---- 0.00    94807 cpu[1]           cpu_idle_mwait
 7852  67% ---- 0.00    84202 cpu[1]           i86_mwait
 2541  22% ---- 0.00   407347 cpu[1]+11        cpupm_utilization_event
 2541  22% ---- 0.00   407347 cpu[1]+11        cpupm_change_state
 2540  22% ---- 0.00   407412 cpu[1]+11        cpupm_plat_change_state
 2539  22% ---- 0.00   407466 cpu[1]+11        cpupm_state_change
 2530  22% ---- 0.00   408167 cpu[1]+11        pg_ev_thread_swtch
 2512  22% ---- 0.00   401944 cpu[1]+11        swtch
 2511  22% ---- 0.00   407944 cpu[1]+11        cmt_ev_thread_swtch_pwr
 2509  22% ---- 0.00   408047 cpu[1]+11        speedstep_power
 1311  11% ---- 0.00   421383 cpu[0]+11        speedstep_pstate_transition
 1307  11% ---- 0.00   422634 cpu[0]+11        write_ctrl
 1307  11% ---- 0.00   422634 cpu[0]+11        cpu_acpi_write_port
 1306  11% ---- 0.00   422939 cpu[0]+11        outw
 1237  11% ---- 0.00   392091 cpu[1]+11        do_splx
 1197  10% ---- 0.00   393779 cpu[1]+11        xc_call
 1197  10% ---- 0.00   393779 cpu[1]+11        xc_common

Hope that might help.
-- This message posted from opensolaris.org
Markus Kovero
2011-Feb-11 08:55 UTC
[zfs-discuss] Very bad ZFS write performance. Ok Read.
> I noticed recently that write rate has dropped off and through testing now I am getting 35MB/sec writes. The pool is around 50-60% full.
> I am getting a CONSTANT 30-35% kernel cpu utilisation, even if the machine is idle. I do not know if this was the case when the write performance was better. I have tried reading from the server to a HDD on a windows client and I get 50+MB/sec which is probably the max that that HDD can sustain on a write.

Hi, do you have your zfs prefetch turned on or off? Turning prefetch off makes COMSTAR iSCSI shares unusable in Solaris 11 Express, while it might work fine in osol.

Yours
Markus Kovero
Edward Ned Harvey
2011-Feb-11 12:53 UTC
[zfs-discuss] Very bad ZFS write performance. Ok Read.
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-bounces at opensolaris.org] On Behalf Of Markus Kovero
>
>> I noticed recently that write rate has dropped off and through testing now I am getting 35MB/sec writes. The pool is around 50-60% full.
>
> Hi, do you have your zfs prefetch turned on or off? Turning prefetch off makes comstar iscsi shares unusable in Solaris 11 Express while it might work fine in osol.

On the other hand, that will only matter for reads. And the complaint is writes.
Edward Ned Harvey
2011-Feb-11 12:59 UTC
[zfs-discuss] Very bad ZFS write performance. Ok Read.
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-bounces at opensolaris.org] On Behalf Of ian W
>
> Hope you can still help here. Solaris 11 Express.
> x86 platform E6600 with 6GB of RAM
>
> I have a fairly new S11E box Im using as a file server.
> 3x1.5TB HDDs in a raidz pool.

Just so you know, you're calling that "fairly new", but a Core2 Duo processor and 6G of RAM isn't the same class of machine this is meant for. Still, that should be irrelevant, cuz you should be able to fully max out a 1G ethernet easily enough.

> When I first set it up I was getting 110MB/sec writes across gigabit network via SMB shares.
>
> I noticed recently that write rate has dropped off and through testing now I am getting 35MB/sec writes. The pool is around 50-60% full.

How, precisely, are you performing your test? Step by step...

What do you get if you start "iostat 5" a minute before the test and keep it running till a little after the test?

What does "zpool status" tell you?
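For readers following along, those checks can be sketched as below. This is a hypothetical sketch: "tank" is a placeholder pool name, and the commands are wrapped in a dry-run function that prints them instead of executing them, so the sketch is harmless outside Solaris. On the real server you would run them (without the wrapper) while repeating the SMB write test.

```shell
# Dry-run wrapper: prints each command instead of executing it.
# On the actual file server, change the body to: "$@"
run() { echo "+ $*"; }

run iostat -xn 5         # per-disk service times and %busy while the test runs
run zpool status tank    # pool health: errors, scrub/resilver in progress
run zpool iostat tank 5  # pool-level read/write bandwidth during the test
```

The point of watching iostat during the test is to spot a single slow or erroring disk, which in a raidz vdev drags the whole pool's write rate down to that disk's speed.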
Markus Kovero
2011-Feb-11 13:35 UTC
[zfs-discuss] Very bad ZFS write performance. Ok Read.
> On the other hand, that will only matter for reads. And the complaint is writes.

Actually, it also affects writes (due to checksum reads?).

Yours
Markus Kovero
Thanks for the responses.. I found the issue. It was due to power management, and probably a bug with event-driven power management states. Changing

cpupm enable

to

cpupm enable poll-mode

in /etc/power.conf fixed the issue for me. Back up to 110MB/sec+ now..
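For anyone hitting the same symptom, the change ian describes can be sketched like this. It is a minimal sketch, assuming the Solaris 11 Express power.conf syntax; the edit is made on a scratch copy here so the sketch is safe to run anywhere. On the real box you would edit /etc/power.conf itself and then run pmconfig to apply the new policy.

```shell
# Make the edit on a scratch copy; on the real system the file is /etc/power.conf.
conf=$(mktemp)
printf 'autopm enable\ncpupm enable\n' > "$conf"

# Switch cpupm from the (apparently buggy) event-driven mode to poll-mode.
sed 's/^cpupm enable$/cpupm enable poll-mode/' "$conf" > "$conf.new" &&
    mv "$conf.new" "$conf"

cat "$conf"
# On Solaris, re-read /etc/power.conf and apply it with: pmconfig
```

Poll-mode makes the dispatcher sample CPU utilisation periodically instead of reacting to every state-change event, which sidesteps the constant speedstep transitions visible in the lockstat output above.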
Edward Ned Harvey
2011-Feb-12 14:15 UTC
[zfs-discuss] Very bad ZFS write performance. Ok Read.
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-bounces at opensolaris.org] On Behalf Of Edward Ned Harvey
>
> What does "zpool status" tell you?

Also, "zpool iostat 5"
Roy Sigurd Karlsbakk
2011-Feb-12 17:14 UTC
[zfs-discuss] Very bad ZFS write performance. Ok Read.
>> What does "zpool status" tell you?
>
> Also, "zpool iostat 5"

or even iostat -xn

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy at karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. It is an elementary imperative for all pedagogues to avoid excessive use of idioms of foreign origin. In most cases, adequate and relevant synonyms exist in Norwegian.
Roy Sigurd Karlsbakk
2011-Feb-12 17:15 UTC
[zfs-discuss] Very bad ZFS write performance. Ok Read.
>> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-bounces at opensolaris.org] On Behalf Of Edward Ned Harvey
>>
>> What does "zpool status" tell you?
>
> Also, "zpool iostat 5"

Sorry - check iostat -en

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy at karlsbakk.net
http://blogg.karlsbakk.net/
On Sat, Feb 12, 2011 at 3:14 AM, ian W <dropbearsssf at yahoo.com.au> wrote:

> Thanks for the responses.. I found the issue. It was due to power management, and probably a bug with event-driven power management states.
>
> Changing
>
> cpupm enable
>
> to
>
> cpupm enable poll-mode
>
> in /etc/power.conf fixed the issue for me. Back up to 110MB/sec+ now..

Interesting - I have an E6600 also, and I will give this a try. I left 'cpupm enable' in /etc/power.conf because powertop/prtdiag properly reported all the available P/C-states of my CPU, so I assumed that power management was good to go. What do you have cpu-threshold set to?

(This may be a moot point for me, because my CPU is littering fault management with strings of L2 cache errors, so I might be upgrading to Nehalem soon.)
Hello,

my power.conf is as follows; any recommendations for improvement?

device-dependency-property removable-media /dev/fb
autopm          enable
autoS3          enable
cpu-threshold   1s
# Auto-Shutdown Idle(min)   Start/Finish(hh:mm) Behavior
autoshutdown    30          0:00 0:00           noshutdown
S3-support      enable
cpu_deep_idle   enable
cpupm           enable poll-mode
Richard Elling
2011-Feb-15 05:28 UTC
[zfs-discuss] Very bad ZFS write performance. Ok Read.
On Feb 14, 2011, at 4:49 PM, ian W wrote:

> Hello
>
> my power.conf is as follows; any recommendations for improvement?

For best performance, disable power management. For certain processors and BIOSes, some combinations of power management (below the OS) are also known to be toxic. At Nexenta, current best practice is to disable C-states for Nehalems.
 -- richard
Thanks..

Given this box runs 18 hours a day and is idle for maybe 17.5 of those hours, I'd rather have the best power management I can...

I would have loved to have upgraded to an i3 or even SB, but the Solaris 11 Express support for both is marginal (H55 chipset issues, no Sandy Bridge support at all, etc.).
Richard Elling
2011-Feb-16 05:10 UTC
[zfs-discuss] Very bad ZFS write performance. Ok Read.
On Feb 15, 2011, at 7:46 PM, ian W wrote:

> Thanks..
>
> given this box runs 18 hours a day and is idle for maybe 17.5 hrs of that, I'd rather have the best power management I can...
>
> I would have loved to have upgraded to a i3 or even SB but the solaris 11 express support for both is marginal. (h55 chipset issues, no sandybridge support at all etc)

I think there are options here, but there are few who will care enough to spend the time required to optimize... it is less expensive to buy lower-power processors than to spend even one man-hour trying to get savings out of a high-power processor. But if you are up to the challenge :-) try disabling cores entirely and leave the remaining two or three cores running without C-states. You will need to measure the actual power consumption, but you might be surprised at how much better that works for performance and power savings.
 -- richard
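Richard's suggestion can be sketched with psradm(1M). This is a hypothetical sketch: the CPU IDs are placeholders, and the commands are wrapped in a dry-run function that prints them instead of executing them, so it is safe to paste anywhere; on a real Solaris box you would run them as root with the wrapper removed.

```shell
# Dry-run wrapper: prints each command instead of executing it.
# On a real Solaris system, change the body to: "$@" (and run as root).
run() { echo "+ $*"; }

run psradm -f 2 3        # take CPUs 2 and 3 offline (IDs are placeholders)
run psrinfo              # confirm which CPUs are now off-line

# For the CPUs left running, keep C-states out of play via /etc/power.conf:
#   cpu_deep_idle disable
# then re-read the file with:
run pmconfig
```

The idea is that fewer cores held at a fixed, shallow idle state can beat many cores churning through deep C-state transitions, both for latency and (measured at the wall) for power.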
Could you not also pin processes to cores? Preventing switching should help too. I've done this for performance reasons before on a 24-core Linux box.

Sent from my HTC Desire

On 16 Feb 2011 05:12, "Richard Elling" <richard.elling at gmail.com> wrote:

> On Feb 15, 2011, at 7:46 PM, ian W wrote:
>
>> Thanks..
>>
>> given this box runs 18 hours a day and is idle for maybe 17.5 hrs of that, I'd rather have the best power management I can...
>>
>> I would have loved to have upgraded to a i3 or even SB but the solaris 11 express support for both is marginal. (h55 chipset issues, no sandybridge support at all etc)
>
> I think there are options here, but there are few who will care enough to spend the time required to optimize... it is less expensive to buy lower-power processors than to spend even one man-hour trying to get savings out of a high-power processor.
> But if you are up to the challenge :-) try disabling cores entirely and leave the remaining two or three cores running without C-states. You will need to measure the actual power consumption, but you might be surprised at how much better that works for performance and power savings.
> -- richard
>
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Richard Elling
2011-Feb-16 14:43 UTC
[zfs-discuss] Very bad ZFS write performance. Ok Read.
On Feb 15, 2011, at 11:26 PM, Khushil Dep wrote:

> Could you not also pin processes to cores? Preventing switching should help too. I've done this for performance reasons before on a 24-core Linux box.

Yes. More importantly, you could send interrupts to a processor set. There are many ways to implement resource management in Solaris-based systems :-)
 -- richard
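Both ideas - pinning processes and steering interrupts - can be sketched with the Solaris processor-set tools. A hypothetical sketch: the CPU IDs and the PID 1234 are placeholders, and the commands are wrapped in a dry-run function that prints them instead of executing them; on a live system you would run them as root.

```shell
# Dry-run wrapper: prints each command instead of executing it.
# On a real Solaris system, change the body to: "$@" (and run as root).
run() { echo "+ $*"; }

run psrset -c 1          # create a processor set containing CPU 1 (placeholder ID)
run psrset -b 1 1234     # bind PID 1234 (placeholder) to processor set 1
run pbind -b 0 1234      # alternative: bind PID 1234 directly to CPU 0

# Interrupt steering: mark a CPU no-intr so device interrupts land elsewhere;
# undo with: psradm -n -i <cpu>
run psradm -i 1
```

Processes bound to a set only run on its CPUs, and CPUs in a set only run bound processes, so combining a set for the workload with no-intr CPUs for everything else gives fairly strong isolation.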