Hi everyone,

I'm trying to use Samba in a small video post-production house, but we are not getting the performance we expected.

Our setup:

- CentOS 5.6 x86-64
- samba.x86_64 (3.0.33-3.29.el5_6.2 and 3.6.0rc1)
- Intel-based server (one 4-core Xeon E5620 @ 2.40GHz, 8 GB RAM)
- 4 Intel Gigabit Ethernet NIC ports with 802.3ad bonding, connected to a switch configured to use 802.3ad
- 8 2TB 7.2 krpm SATA disks with hardware RAID5 (RAID stripe size 1024 bytes, controller and disk cache enabled, readahead enabled)
- XFS filesystem (created with the following parameters: size=64k -d su=1024k,sw=7)
- Average file size in the share: 8 MByte
- Gigabit network composed of Cat5E certified cabling and a DLink DGS-3427 gigabit switch
- Intel i7-based terminals with Intel gigabit NIC, running Windows 7

Test results:

OS access:

Sequential write (1 x 31 GByte file): 500 MByte/s
Sequential read (1 x 31 GByte file): 780 MByte/s
Write (1000 files 8 MByte each): 249 MByte/s average
Read (1000 files 8 MByte each): 158 MByte/s average
Simultaneous write (4 processes each writing 1000 files of 8 MByte each): 188 MByte/s average
Simultaneous read (4 processes each reading 1000 files of 8 MByte each): 118 MByte/s average

Samba local access (stock CentOS Samba 3.0.33, connecting from the same server with smbclient):

Sequential read (1 x 31 GByte file): 267 MByte/s
Read (1000 files 8 MByte each): 71 MByte/s average
Simultaneous read (4 processes each reading 1000 files of 8 MByte each): 102 MByte/s average

Samba local access (Samba 3.6.0rc1 compiled from the Git repo, connecting from the same server with smbclient):

Read (1000 files 8 MByte each): 95 MByte/s average
Simultaneous read (4 processes each reading 1000 files of 8 MByte each): 103 MByte/s average

Samba server accessed from Windows 7 terminals (Samba 3.6.0rc1):

Read (1 terminal copying 1000 files of 8 MByte each from the Samba file server to local disk): 60 MByte/s average
Simultaneous read (4 terminals each copying 1000 files of 8 MByte each from the Samba file server to local disk): 70 MByte/s average

Note: Simultaneous read speed is measured by adding the sizes of all transferred files and dividing the total by the time taken to transfer these files.

I would appreciate any feedback about the results we are getting and advice on how to improve them.

Thanks in advance

Juan Pablo

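The aggregate figure described in the note can be written down as a small calculation. This is only a sketch: the function name is made up, and it assumes that "the time taken" means the wall-clock time from the start of the first transfer to the end of the last one, which the post does not state explicitly.

```python
def aggregate_throughput_mbyte_s(file_sizes_bytes, elapsed_seconds):
    """Total bytes moved by all clients divided by the wall-clock time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    total_bytes = sum(file_sizes_bytes)
    return total_bytes / elapsed_seconds / (1024 * 1024)


# Hypothetical run: 4 clients x 1000 files x 8 MByte, finished in 320 s.
sizes = [8 * 1024 * 1024] * (4 * 1000)
print(round(aggregate_throughput_mbyte_s(sizes, 320.0), 1))
```

Measured this way, the result is the combined rate of all clients, not the rate seen by each one.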
On Wed, May 25, 2011 at 08:02:56PM -0700, Juan Pablo wrote:
> [...]
> I will appreciate any feedback about the results we are getting and advice on
> how to improve this.

If you're using 3.6.0 and Windows 7 clients, try turning on SMB2 support by setting "max protocol = smb2" in the [global] section of your smb.conf.

Jeremy.

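A minimal sketch of the change described above, assuming Samba 3.6.x. Only the "max protocol" line comes from the reply; any other settings already present in [global] stay as they are.

```ini
[global]
    # Allow Windows 7 clients to negotiate SMB2 (Samba 3.6.x).
    max protocol = smb2
```

After editing, running testparm checks the file for syntax errors. smbd then needs to re-read the configuration, and clients need to reconnect, before the new protocol level is negotiated.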
On 2011-05-26 05:02, Juan Pablo wrote:
> [...]
> I will appreciate any feedback about the results we are getting and advice on
> how to improve this.

Maybe try the ext4 filesystem, with a newer kernel that has stable support for it? Many tests have shown that ext4 is faster than XFS, but remember to tune the parameters when creating the filesystem. You can try several different configurations and compare their performance (performance for the same parameters can differ across hardware and RAID configurations, so options recommended by other people are not always the best for you). Filesystem mount options are also important!

The second thing is the network. Some switches do not do port trunking well; for example, they always use one wire even if there are 2 or more connected in a trunk, so trunking does not improve performance, only reliability. Usually a single data stream also does not go through more than one wire, so the only way to get 4 Gbit speed from your server is to have 4 stations connected to the switch downloading simultaneously.

You can check the bandwidth usage on each interface of the server with the iftop command. For measuring network performance I also recommend the iperf tool. Also search for network and TCP tuning in Linux (parameters like txqueuelen, buffer sizes, etc.).

About tuning Samba performance, you can read for example here: http://www.samba.org/samba/docs/man/Samba-HOWTO-Collection/speed.html
But also in many other places on the Internet.

Best regards,
Daniel

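Why a single client tends to stay on one wire can be illustrated with a simplified model of address-based link selection. The sketch below assumes a hash in the spirit of the Linux bonding driver's layer2 transmit policy (XOR of source and destination MAC addresses, modulo the number of links). The MAC addresses and the function name are made up, and the hash actually used by a given driver version or switch may differ.

```python
def pick_link(src_mac: str, dst_mac: str, n_links: int) -> int:
    """Choose an outgoing link index from the two MAC addresses (simplified)."""
    src = int(src_mac.replace(":", ""), 16)
    dst = int(dst_mac.replace(":", ""), 16)
    return (src ^ dst) % n_links


server = "00:1b:21:00:00:10"  # hypothetical server MAC
clients = ["00:1b:21:00:00:0%d" % i for i in range(1, 5)]  # hypothetical clients

for mac in clients:
    # The same address pair maps to the same link for every frame,
    # so one client cannot exceed the speed of a single link.
    print(mac, "-> link", pick_link(server, mac, 4))
```

Under such a scheme, four clients use four links only if their addresses happen to hash to different links, which is why it is worth checking per-interface traffic during the test.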
Hi Volker,

I've removed the SO_RCVBUF=65536 SO_SNDBUF=65536 options and the 3 other settings, reloaded Samba and repeated the tests, but I am still getting the same results for the local tests and also from Windows.

I am getting the following results in MByte/s:

Test type    Local (dd)    Local (smbclient)    Windows 7
Case1        161           101                  63
Case2        122           119                  68

Case1: Read 1000 files 8 MByte each
Case2: 4 processes each reading 1000 files of 8 MByte each

Any idea how I can debug where the bottleneck is, or why I get such low numbers when reading from Windows?

Thanks
Juan Pablo

________________________________
From: Volker Lendecke <Volker.Lendecke at SerNet.DE>
To: Juan Pablo <jhurcad at yahoo.com>
Cc: Jeremy Allison <jra at samba.org>; samba at lists.samba.org
Sent: Fri, May 27, 2011 11:25:31 AM
Subject: Re: [Samba] Samba performance

On Fri, May 27, 2011 at 06:34:50AM -0700, Juan Pablo wrote:
> Hi Volker,
>
> I am using the following socket options:
>
> socket options = TCP_NODELAY IPTOS_LOWDELAY SO_RCVBUF=65536 SO_SNDBUF=65536

Just remove the SO_RCVBUF=65536 SO_SNDBUF=65536 settings. Unless you're on a very old Linux or other Unix, the kernel is far better off figuring that out itself.

> read raw = yes
> write raw = yes
> max xmit = 65535

Just remove these 3 settings. If it's still slow after that, we need to do more analysis.

Volker

-- 
SerNet GmbH, Bahnhofsallee 1b, 37081 Göttingen
phone: +49-551-370000-0, fax: +49-551-370000-9
AG Göttingen, HRB 2816, GF: Dr. Johannes Loxen

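To see where throughput is lost in the results table from Juan Pablo's message above, the relative drop between consecutive stages can be computed. This is an illustrative sketch only; the stage labels and function name are made up, and the numbers are copied from the table.

```python
# Throughput in MByte/s from the table: local dd, local smbclient, Windows 7.
results = {
    "Case1": [161, 101, 63],
    "Case2": [122, 119, 68],
}
stages = ["dd -> smbclient", "smbclient -> Windows 7"]


def stage_drops(values):
    """Relative drop (in percent) between each pair of consecutive stages."""
    return [100.0 * (a - b) / a for a, b in zip(values, values[1:])]


for case, values in results.items():
    for stage, drop in zip(stages, stage_drops(values)):
        print(f"{case}: {stage}: {drop:.1f}% lower")
```

With these numbers the local Samba overhead in Case2 is small, while the step from local smbclient to Windows 7 costs roughly 40% in both cases. That suggests looking at the network path and the clients as well as the disks.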
On 5/25/2011 10:02 PM, Juan Pablo wrote:

> OS access:
> Simultaneous read (4 processes): 118 MByte/s average

> Samba local access:
> Simultaneous read (4 processes): 102 MByte/s average

> Samba server from Windows 7:
> Simultaneous read (4 terminals): 70 MByte/s average

The first two results above demonstrate a slow disk subsystem not suitable for streaming multiple files to multiple concurrent clients at high data rates. Your spindles are too slow and/or you don't have enough of them to satisfy your test methodology. Four concurrent dd copies yield 118 MB/s per process, only ~15% disk headroom above wire speed GbE.

Your smbd+smbclient local process disk bandwidth overhead appears to be roughly 13 percent. I don't know what the optimal percentage here should be, but 13% above a dd copy process seems reasonable given the additional data movement through smbd and smbclient buffers.

It is clear that you don't have enough head seek performance for 4 or more client streams of 1000 x 8MB files. This doesn't necessarily address the 30% drop in over-the-wire Win7 client performance, but we'll get to that later.

To confirm the disk deficiency issue, I recommend the following test:

Make a 2GB tmpfs ramdisk on the server and run your tests against it, albeit with 200 instead of 1000 8MB files. Instructions:
http://prefetch.net/blog/index.php/2006/11/30/creating-a-ramdisk-with-linux/

This will tell you if your server block storage subsystem is part of the problem, and will give you a maximum throughput per Samba process baseline. You should get something like 5GB/s+ local smbclient throughput from a tmpfs ramdisk on that Xeon platform with its raw 25GB/s memory bandwidth.

Run a single Win7 workstation SMB test copy to a freshly booted machine so most of the memory is free for buffering the inbound files. This will mostly eliminate the slow local disk as a bottleneck. Now run your 4 concurrent Win7 client test and compare to the single client test results. This should tell you whether you have a bonding problem, either in the server NICs or in the switch.

You didn't mention jumbo frames. Enable jumbo frames if they are not already enabled. It may help.

Something else to consider is that the kernel shipped with CentOS 5.6, 2.6.18, the "Pirate" kernel, is now 4.5 years old, released in Sept of 2006 (http://kerneltrap.org/node/7144). There have been just a few performance enhancements between 2.6.18 and 3.0, specifically to the network stack. ;) The CentOS packages are older than dirt as well. If you're not wed to CentOS you should look at more recent distros.

-- 
Stan
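The test data for the ramdisk run could be generated with a short script. This sketch assumes a tmpfs has already been mounted as root (for example with `mount -t tmpfs -o size=2g tmpfs /mnt/ramdisk`, where the mount point is a hypothetical path). It writes 200 files of 8 MByte each, about 1.6 GByte in total, so the set fits in the 2GB tmpfs. The function name and file naming are illustrative only.

```python
import os


def make_test_files(directory, count=200, size_bytes=8 * 1024 * 1024):
    """Write `count` files of `size_bytes` random bytes into `directory`."""
    os.makedirs(directory, exist_ok=True)
    paths = []
    for i in range(count):
        path = os.path.join(directory, "test_%04d.bin" % i)
        with open(path, "wb") as f:
            # Random content, so the files are not trivially compressible.
            f.write(os.urandom(size_bytes))
        paths.append(path)
    return paths


# Example call (hypothetical mount point, must be exported as a Samba share):
# make_test_files("/mnt/ramdisk/testdata")
```

Because the files live in RAM, any throughput measured against this share excludes the RAID array entirely, which is the point of the comparison.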