I have just built an OpenSolaris box (2008.11) as a small fileserver (6x 1TB drives as RAIDZ2, kernel CIFS) for home media use, and I am noticing odd behavior when copying files to the box.

My knowledge of monitoring/analysis tools under Solaris is very limited; so far I have just been using the System Monitor that pops up with ctrl-alt-del, and the numbers I am reporting come from that.

When copying files (a small number of large files from a Mac to the Solaris/CIFS server) I initially see network usage of 40-45 MB/s, which is pretty much what I would expect from single-spindle disks over GigE through a SoHo switch that does not support jumbo frames. However, I only see this performance for perhaps 10 seconds, then it drops to 25-30 MB/s for about 15-20 seconds, and then it drops again to 17-20 MB/s, where it remains for the duration of the transfer.

This is not an occasional issue; it happens this way each and every time, and at each of the three levels the speeds are consistent. There is a brief period of inactivity (0.5 s) when the speeds drop, leading me to believe that *something* is throttling speeds back.

Has anyone else seen this behavior? Any idea where it might be coming from, and what I could do to keep a sustained 40-45 MB/s transfer rate?

Any suggestions as to what tools I might use to help diagnose this would be appreciated. At the moment, I am putting an old Windows box together to see if I can replicate the problem and eliminate the possibility of a cause outside of the Solaris box.
-- 
This message posted from opensolaris.org
Ok, I'm going to reply to my own question here. After a few hours of thinking, I believe I know what is going on.

I am seeing the initial high network throughput as the 4GB of RAM in the server fills up with data. In fact, in this case, I am bound by the speed of the source drive, which tops out at about 40 MB/s -- just what I am seeing as the copy starts. Eventually, the network speed settles down to the write speed of the local pool. Copying files locally (on and off the pool) shows that the sustained write speeds are, in fact, about 17-20 MB/s.

So, this brings up a new question: are these speeds typical? For reference, my pool is built from 6 1TB drives configured as RAIDZ2, driven by an ICH9(R) in AHCI mode. I am aware that RAIDZ2 performance will always be less than the speed of the individual disks, but this is a little more than I was expecting. Individually, these drives benchmark around 60-70 MB/s, so I am looking at a fairly substantial penalty for the reliability of RAIDZ2.

I'll CC this message to the CIFS and Networking lists to prevent anyone else from wasting time writing a reply, as the appropriate place for this thread is now confirmed to be zfs-discuss.

-g.
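For anyone who wants to reproduce the local measurement, a minimal dd sketch looks like the following. The dataset path is an assumption (TESTDIR defaults to a hypothetical /tank/media); substitute a directory on your own pool.

```shell
# Sketch: measure sustained sequential write and read speed of the pool.
# TESTDIR is an assumption -- point it at a dataset on your RAIDZ2 pool.
TESTDIR=${TESTDIR:-/tank/media}

# Write 1 GiB of zeros; time it and divide 1024 MB by the elapsed
# seconds (Solaris dd only reports record counts, not throughput).
# bs=1024k works with both Solaris and GNU dd, unlike bs=1M.
time dd if=/dev/zero of="$TESTDIR/ddtest" bs=1024k count=1024

# Read it back. Note that the ZFS ARC caches aggressively, so
# re-reading a just-written file measures RAM, not disks -- use a file
# larger than RAM, or export/import the pool, for an honest read number.
time dd if="$TESTDIR/ddtest" of=/dev/null bs=1024k

rm "$TESTDIR/ddtest"
```

Writing zeros is fine here because these pools have no compression enabled; on a compressed dataset you would want incompressible data instead.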
On Thu, Jan 8, 2009 at 5:54 PM, gnomad <gnomad at gmail.com> wrote:
> Ok, I'm going to reply to my own question here. After a few hours of
> thinking, I believe I know what is going on.
>
> I am seeing the initial high network throughput as the 4GB of RAM in the
> server fills up with data. In fact, in this case, I am bound by the speed
> of the source drive, which tops out at about 40 MB/s -- just what I am
> seeing as the copy starts. Eventually, the network speed settles down to
> the write speed of the local pool. Copying files locally (on and off the
> pool) shows that the sustained write speeds are, in fact, about 17-20 MB/s.
>
> So, this brings up a new question, are these speeds typical? For
> reference, my pool is built from 6 1TB drives configured as RAIDZ2 driven by
> an ICH9(R) configured in AHCI mode. I am aware that RAIDZ2 performance will
> always be less than the speed of individual disks, but this is a little bit
> more than I was expecting. Individually, these drives benchmark around
> 60-70 MB/s, so I am looking at a fairly substantial penalty for the
> reliability of RAIDZ2.
>
> -g.

That seems really, really low. What are your sustained read speeds?

--Tim
Roch Bourbonnais
2009-Jan-12 13:56 UTC
[zfs-discuss] Odd network performance with ZFS/CIFS
Try setting the cachemode property on the target filesystem. Also verify that the source can pump data through the net at the desired rate if the target is /dev/null.

-r

On 8 Jan 09, at 18:46, gnomad wrote:
> I have just built an opensolaris box (2008.11) as a small fileserver
> (6x 1TB drives as RAIDZ2, kernel CIFS) for home media use and I am
> noticing an odd behavior copying files to the box.
>
> My knowledge of monitoring/analysis tools under Solaris is very
> limited, and so far I have just been using the System Monitor that
> pops up with ctrl-alt-del, and the numbers I am reporting come from
> that.
>
> When copying files (a small number of large files from a Mac to the
> Solaris/CIFS server) I initially see network usage of 40-45 MB/s
> which is pretty much what I would expect from single spindle disks
> over GigE through a SoHo switch that does not support jumbo frames.
> However, I only see this performance for perhaps 10 seconds, then it
> drops to 25-30 MB/s for about 15-20 seconds, and then it drops again
> to 17-20 MB/s where it remains for the duration of file transfer.
>
> This is not an occasional issue, it happens this way each and every
> time. At each of the three levels, the speeds are consistent.
> There is a brief period of inactivity (0.5 s) when the speeds are
> reduced, leading me to believe that *something* is throttling speeds
> back.
>
> Has anyone else seen this behavior? Any idea where it might be
> coming from, and what I could do to keep a sustained 40-45 MB/s
> transfer rate?
>
> Any suggestions as to what tools I might use to help diagnose this
> would be appreciated. At the moment, I am in the process of putting
> an old Windows box together to see if I can replicate the problem
> and eliminate the possibility of a cause outside of the Solaris box.
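One minimal way to run the /dev/null test Roch suggests is with nc (netcat); a sketch, where the host name, port, and source file are placeholders, and nc's listen syntax varies between variants:

```shell
# On the Solaris box: listen on an arbitrary port and discard the data.
# (Some nc variants want "nc -l 5001" instead of "nc -l -p 5001".)
nc -l -p 5001 > /dev/null

# On the Mac (the source): read 1 GiB from the source disk and push it
# over the wire. If this sustains 40-45 MB/s, the network and the
# source disk are fine, and the bottleneck is the pool's write path.
dd if=/path/to/largefile bs=1024k count=1024 | nc solaris-host 5001
```

This takes CIFS and ZFS out of the picture entirely, isolating the network leg of the transfer.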
I'm currently experiencing exactly the same problem and it's been driving me nuts. I tried OpenSolaris and am currently running the latest version of SXCE, both with exactly the same results.

This issue occurs with both CIFS, which shows the speed degrade, and iSCSI, which just starts off at the lowest speed but exhibits the same peaks and troughs.

I have 4x 500GB drives in a RAIDZ1 config on an AMD 780G mobo. Speed tests using dd have shown read rates of ~140 MB/s and write rates of ~120 MB/s (humorously, slightly faster than one of my friends' arrays on Linux and Intel hardware).

Currently the transfer will sit at about 18% GigE network utilisation for 10 seconds, then dip to 0 and come straight back up to 18%; this happens at regular, predictable intervals, with no randomness.

I've tried two different switches, one a consumer-grade switch from Linksys and one a low-end distribution switch from 3Com; both exhibit exactly the same behaviour.

The only computer accessing the Solaris box is a Windows Vista 64 SP1 machine.

Currently I'm guessing that the transfer issues have something to do with the onboard Realtek network card in the Solaris box. Possibly a driver issue? I've got a dual-port Intel server NIC on order to replace it and test with.
2c from Oz:

Windows (at least XP -- I have thus far been lucky enough to avoid running Vista on metal) has packet schedulers, quality of service settings and other crap that can severely impact Windows network performance. I have found that the following made a difference for me:

- Disable jumbo frames (as I have only a very cheap, crappy gig switch, and if I try to drive it hard with jumbos enabled, it falls in a heap)
- Lose the 'deterministic network enhancer' under Windows
- Lose the QoS packet scheduler
- Check the interface properties and go looking for something that sounds like 'optimize for CPU / optimize for speed', and set it to speed
- Depending on workload and packet sizes, it might also be worth looking at disabling the Nagle algorithm on the Solaris box. See http://www.sun.com/servers/coolthreads/tnb/lighttpd.jsp for a quick explanation.

It would be interesting to see if you see the same issues using a Solaris or other OS client.

Hope this helps somewhat. Let us know how it goes.

Nathan.

fredrick phol wrote:
> I'm currently experiencing exactly the same problem and it's been driving me nuts. Tried OpenSolaris and am currently running the latest version of SXCE, both with exactly the same results.
>
> This issue occurs with both CIFS, which shows the speed degrade, and iSCSI, which just starts off at the lowest speed but exhibits the same peaks and troughs.
>
> I have 4x 500GB drives in a RAIDZ1 config on an AMD 780G mobo.
>
> Speed tests using dd have shown read rates of ~140 MB/s and write rates of ~120 MB/s (humorously, slightly faster than one of my friends' arrays on Linux and Intel hardware).
>
> Currently the transfer will sit at about 18% GigE network utilisation for 10 seconds, then dip to 0 and come straight back up to 18%; this happens at regular, predictable intervals, with no randomness.
> I've tried two different switches, one a consumer-grade switch from Linksys and one a low-end distribution switch from 3Com; both exhibit exactly the same behaviour.
>
> The only computer accessing the Solaris box is a Windows Vista 64 SP1 machine.
>
> Currently I'm guessing that the transfer issues have something to do with the onboard Realtek network card in the Solaris box. Possibly a driver issue? I've got a dual-port Intel server NIC on order to replace it and test with.
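For the last bullet above, Nagle can be disabled on the Solaris side with ndd, as described in the article linked. A sketch -- note this changes a live, system-wide TCP tunable and does not persist across reboots:

```shell
# Check the current Nagle coalescing limit (the default is 4095 bytes).
ndd -get /dev/tcp tcp_naglim_def

# Setting the limit to 1 effectively disables Nagle, so small writes go
# out immediately instead of waiting to coalesce. Run as root; the
# setting reverts at the next reboot.
ndd -set /dev/tcp tcp_naglim_def 1
```

Whether this helps depends on the workload; for large sequential CIFS copies the writes are big enough that Nagle rarely matters, but it is cheap to test and easy to revert.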
Turning off Windows quality of service seems to have given me sustained write speeds hitting about 90 MB/s using CIFS.

Writes to the iSCSI device are hitting about 40 MB/s, but the network utilisation graph is very jagged; it's just a constant spike to 60% utilisation, then a drop to 0, and repeat.

I also seem to have seen a slight improvement by turning off flow control (which, if my network knowledge serves me correctly, shouldn't really be on on my NIC anyway).