Hello,

We are running Samba 3.0.33 on a 2-node Linux cluster running RedHat 5.6 ES. Its primary application is to serve out a single network drive to support our business (about 350GB in size). For several years, this solution has been running flawlessly. File access was almost as fast as a local disk, so putting files on the server was never a problem. Our clients are running mostly Windows XP Pro. We have a few Windows 7 clients.

Almost a year ago, that changed. Applications written in VB 6.0 that read files from the server started showing *significant* performance problems. What used to take seconds now takes more than a minute to finish. Moving the file to a local disk brought the speed back up to where it should be. Moving the file to a Windows 2003 or 2008 server also provided good throughput. All clients experience this same problem.

I ran "strace -f" against the smbd process that is assigned to my desktop and then ran the VB application to see what the daemon was up to. I discovered that it went through a process of opening the file several times and reading data from it, using progressively smaller buffer sizes until it settled on using a buffer size of 1, which it used for the remainder of the file I/O session.

I've attached the smb.conf file for your reading pleasure. I can attach the strace output file if that would be helpful.

I suspect that something changed on the Windows desktop side to bring this about, since we made no changes to our VB code at all.

Richard G. Lang
Sr. Software Engineer
LangR at specsensors.com
(330) 659-3312
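For anyone wanting to confirm the shrinking read sizes described above, the strace output can be summarized rather than read by eye. What follows is a minimal Python sketch, not something from the original post. It assumes the trace is available as text and that each successful read appears as a `read`, `pread` or `pread64` line ending in `= <bytes>`. The function name `read_size_histogram` and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Matches read-style syscalls and captures the return value, i.e. the number
# of bytes actually read. The exact line layout is an assumption about the trace.
READ_CALL = re.compile(r"\b(?:read|pread|pread64)\(.*\)\s*=\s*(\d+)\s*$")


def read_size_histogram(lines):
    """Count how often each read size occurs in an iterable of strace lines."""
    sizes = Counter()
    for line in lines:
        match = READ_CALL.search(line)
        if match:
            sizes[int(match.group(1))] += 1
    return sizes


# Hypothetical sample lines, only to show the expected input shape.
sample_trace = [
    '[pid  4242] pread64(27, "..."..., 4096, 0) = 4096',
    '[pid  4242] pread64(27, "."..., 1, 4096) = 1',
    '[pid  4242] pread64(27, "."..., 1, 4097) = 1',
]

for size, count in sorted(read_size_histogram(sample_trace).items()):
    print(f"{size:>8} bytes: {count} calls")
```

Fed the lines of a real trace (for example one saved with strace's `-o` option), a histogram dominated by size 1 would match the behaviour reported here.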
> I've attached the smb.conf file for your reading pleasure. I can attach the strace output file if that would be helpful.

The list automatically throws away all attachments. Can you post that inline, or put it on pastebin.com and link to it here?

John
Sorry - I didn't realize the list wouldn't accept attachments. Here is the smb.conf file:

#Backup Domain Controller
## Global parameters
[global]
    unix charset = LOCALE
    workgroup = IBMPEERS
    netbios name = mustang1
    socket options = SO_KEEPALIVE IPTOS_LOWDELAY TCP_NODELAY SO_RCVBUF=16384 SO_SNDBUF=16384
    passdb backend = ldapsam:"ldap://mustang1.si.lan ldap://mustang2.si.lan"
#   passdb backend = ldapsam:"ldap://mustang1.si.lan"
    username map = /etc/samba/smbusers
#   interfaces = 192.168.2.242/32
#   bind interfaces only = yes
    log level = 0
    syslog = 1
    log file = /var/log/samba/%m
    max log size = 1024
    name resolve order = wins bcast hosts
    guest account = nobody
#   printcap name = CUPS
#   show add printer wizard = No
    logon script = logon.bat
    logon path
    logon drive = C:
    domain logons = Yes
    domain master = No
    local master = no
    preferred master = no
    os level = 0
    wins server = mustang2.si.lan
    ldap suffix = dc=IBMPEERS,dc=lan
    ldap machine suffix = ou=Computers,ou=Users
    ldap user suffix = ou=People,ou=Users
    ldap group suffix = ou=Groups
    ldap idmap suffix = ou=Idmap
    ldap admin dn = cn=sambaadmin,dc=IBMPEERS,dc=lan
    utmp = no
    idmap backend = ldap://mustang2.si.lan
    idmap uid = 10000-20000
    idmap gid = 10000-20000
#   printing = cups
    veto files = /*.eml/*.nws/*.{*}/
    veto oplock files = /*.doc/*.xls/*.mdb/*.pdf/

#========================Share Definitions========================
[si]
    comment = Shared disk service on SI Cluster
    veto files = /.clumanager/.rgmanager/
    browsable = yes
    writable = yes
    public = yes
    path = /mnt/share/si
#
#----- Force all files/dirs to be create group-writeable and world-readable.
#
    create mask = 0664
    force create mode = 0664
    directory mask = 0775
    force directory mode = 0775

[homes]
    comment = Home Directories
    valid users = %S
    read only = No
    browseable = No

#[test]
#   comment = "TEST"
#   browseable = yes
#   writable = yes
#   public = yes
#   path = /tmp/data1
#
[netlogon]
    comment = Network Logon Service
    path = /var/lib/samba/netlogon
    guest ok = Yes
    locking = No

[profiles]
    comment = Profile Share
    path = /var/lib/samba/profiles
    read only = No
    profile acls = Yes

[cdrom]
    oplocks = False
    level2 oplocks = False
    comment = CD-ROM/DVD
    path = /mnt/cdrom
    read only = Yes
    guest ok = Yes
    public = Yes
    browsable = Yes

Richard G. Lang
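Since the thread goes on to discuss oplocks, it may help to pull every oplock-related parameter out of a configuration like the one above. The sketch below is an illustration only, assuming Python's standard `configparser` is close enough to the smb.conf syntax for this purpose (it is not Samba's own parser, and `testparm` remains the authoritative tool). The function name `oplock_settings` is hypothetical.

```python
import configparser


def oplock_settings(smb_conf_text):
    """Return {section: {parameter: value}} for parameters mentioning 'oplock'."""
    parser = configparser.ConfigParser(
        interpolation=None,           # %m, %S etc. are Samba macros, not Python ones
        allow_no_value=True,          # tolerate a parameter written without "= value"
        strict=False,                 # tolerate repeated sections or parameters
        delimiters=("=",),            # smb.conf uses "=" only; ":" appears in values
        comment_prefixes=("#", ";"),
    )
    # Strip indentation so configparser does not treat lines as continuations.
    cleaned = "\n".join(line.strip() for line in smb_conf_text.splitlines())
    parser.read_string(cleaned)

    found = {}
    for section in parser.sections():
        for key, value in parser.items(section):
            if "oplock" in key:
                found.setdefault(section, {})[key] = value
    return found
```

Applied to the file posted here, this should report `veto oplock files` under `[global]` and the two `oplocks` settings under `[cdrom]`, which gives a short list of places to look when asking whether the server is declining oplocks for particular files.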
Lang, Rich wrote:
> Hello,
>
> We are running Samba 3.0.33 on a 2-node Linux cluster running RedHat 5.6 ES. Its primary application is to serve out a single network drive to support our business (about 350GB in size). For several years, this solution has been running flawlessly. File access was almost as fast as a local disk, so putting files on the server was never a problem. Our clients are running mostly Windows XP Pro. We have a few Windows 7 clients.

----

Any difference in performance between the client types? Did the problems coincide with adding win7 machines to the network?

Any new software on the clients (antivirus, firewall, etc.)? Is something using up more memory on them?

On your sockets, I up the SO_RCVBUF and SO_SNDBUF to at least 65536 each (more won't help until full smb2 support is in samba).

Did you get any new Windows servers on your network around the time of the problem? I notice that you have your 'os level = 0'; that means for things like name resolution, your smb server will have lowest priority -- even below a win98 client, as I understand it.

You mention you ran an 'strace -f' on smbd. Have you looked at a wireshark trace? That would tell you more -- for example, if your Windows client keeps reducing the RCV buffer size when negotiating a TCP session, that would tell you why the reads were getting smaller. Maybe you are getting packet drops, or similar.

Reminds me, do you have switches or hubs, and what type of ethernet speed? I take it nothing in the hardware on the clients or the server has changed?

You say you are using RH. Has the SW remained static since installation and through the onset of this problem? (I.e. an auto-update of SW might have changed some setting in the kernel, or some firewall might have been added or modified, etc.)

Are the Windows clients 'paging' more? I.e. was there any change in the VB script or the SW it's using such that now there could be a memory leak, thus increased paging?

Have you set/optimized your TCP/IP params on XP (and what little you can do on Win7, which is less configurable than XP)? Have you added more clients (a significant number)?

On the Win clients, what SP are the XP clients running at? Many people complained when SP2 came out -- especially affected were network applications. SP3 has the best performance of the XP series (even better than the original), while SP1 was slower than 'SP0' (original), and SP2 was slower still.

I don't have any specific theories, just asking for more data at this point, since there are so many possible variables, and just having the information out there would help anyone investigate the problem.

Good luck!
Linda
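The socket buffer suggestion above amounts to editing the `socket options` line from the posted smb.conf. As a small illustration only, the Python sketch below rewrites such an option string so that `SO_RCVBUF` and `SO_SNDBUF` are at least a given minimum. The helper name `raise_socket_buffers` is hypothetical, and whether larger buffers actually help in this case is not established anywhere in the thread.

```python
def raise_socket_buffers(socket_options, minimum=65536):
    """Return the option string with SO_RCVBUF/SO_SNDBUF raised to at least `minimum`."""
    buffer_names = ("SO_RCVBUF", "SO_SNDBUF")
    updated = []
    seen = set()
    for option in socket_options.split():
        name, sep, value = option.partition("=")
        if name in buffer_names and sep:
            seen.add(name)
            # Keep an existing value if it is already large enough.
            option = f"{name}={max(int(value), minimum)}"
        updated.append(option)
    # Append any buffer option that was missing entirely.
    for name in buffer_names:
        if name not in seen:
            updated.append(f"{name}={minimum}")
    return " ".join(updated)


current = "SO_KEEPALIVE IPTOS_LOWDELAY TCP_NODELAY SO_RCVBUF=16384 SO_SNDBUF=16384"
print("socket options =", raise_socket_buffers(current))
```

The resulting line would replace the existing `socket options` entry in `[global]`; smbd only picks up such a change for new connections, so clients would need to reconnect before any difference could be measured.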
On Fri, Jun 17, 2011 at 09:39:16AM -0400, Lang, Rich wrote:
> Hello,
>
> We are running Samba 3.0.33 on a 2-node Linux cluster running RedHat 5.6 ES. Its primary application is to serve out a single network drive to support our business (about 350GB in size). For several years, this solution has been running flawlessly. File access was almost as fast as a local disk, so putting files on the server was never a problem. Our clients are running mostly Windows XP Pro. We have a few Windows 7 clients.
>
> Almost a year ago, that changed. Applications written in VB 6.0 that read files from the server started showing *significant* performance problems. What used to take seconds now takes more than a minute to finish. Moving the file to a local disk brought the speed back up to where it should be. Moving the file to a Windows 2003 or 2008 server also provided good throughput. All clients experience this same problem.
>
> I ran "strace -f" against the smbd process that is assigned to my desktop and then ran the VB application to see what the daemon was up to. I discovered that it went through a process of opening the file several times and reading data from it, using progressively smaller buffer sizes until it settled on using a buffer size of 1, which it used for the remainder of the file I/O session.

This *seems* like clients not using oplocks, when previously they were. Has anything changed in the server system that might be denying oplock requests?

Jeremy.
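The smb.conf posted earlier in this thread sets `veto oplock files = /*.doc/*.xls/*.mdb/*.pdf/`, so one quick check related to the oplock question is whether the slow file's name matches that list. The thread does not say what type of file the VB application reads, so this is only a sketch of how to test it. It assumes that Python's shell-style `fnmatch` matching, applied case-insensitively, approximates Samba's own pattern matching; the function names are hypothetical.

```python
from fnmatch import fnmatchcase


def parse_veto_list(value):
    """Split a Samba-style '/pat1/pat2/' list into its individual patterns."""
    return [pattern for pattern in value.split("/") if pattern]


def oplocks_vetoed(filename, veto_value):
    """True if `filename` matches any pattern in the veto list (case-insensitive)."""
    name = filename.lower()
    return any(
        fnmatchcase(name, pattern.lower())
        for pattern in parse_veto_list(veto_value)
    )


veto = "/*.doc/*.xls/*.mdb/*.pdf/"
# Hypothetical file names, only to show how the check is used.
for candidate in ("Inventory.MDB", "readings.dat"):
    print(candidate, "-> oplocks vetoed:", oplocks_vetoed(candidate, veto))
```

If the troublesome file did match, the server would be declining oplocks for it by configuration, which would not necessarily appear as a denied request in the log.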
The clients may have undergone a change (Windows is always being patched), but the configuration of the server was not changed regarding oplocks. No requests are being denied, since that situation would show up in the log file.

-----Original Message-----
From: Jeremy Allison [mailto:jra at samba.org]
Sent: Monday, June 20, 2011 3:49 PM
To: Lang, Rich
Cc: samba at lists.samba.org
Subject: Re: [Samba] Samba process throttled back?

On Fri, Jun 17, 2011 at 09:39:16AM -0400, Lang, Rich wrote:
> Hello,
>
> We are running Samba 3.0.33 on a 2-node Linux cluster running RedHat 5.6 ES. Its primary application is to serve out a single network drive to support our business (about 350GB in size). For several years, this solution has been running flawlessly. File access was almost as fast as a local disk, so putting files on the server was never a problem. Our clients are running mostly Windows XP Pro. We have a few Windows 7 clients.
>
> Almost a year ago, that changed. Applications written in VB 6.0 that read files from the server started showing *significant* performance problems. What used to take seconds now takes more than a minute to finish. Moving the file to a local disk brought the speed back up to where it should be. Moving the file to a Windows 2003 or 2008 server also provided good throughput. All clients experience this same problem.
>
> I ran "strace -f" against the smbd process that is assigned to my desktop and then ran the VB application to see what the daemon was up to. I discovered that it went through a process of opening the file several times and reading data from it, using progressively smaller buffer sizes until it settled on using a buffer size of 1, which it used for the remainder of the file I/O session.

This *seems* like clients not using oplocks, when previously they were. Has anything changed in the server system that might be denying oplock requests?

Jeremy.

Richard G. Lang
> Lang, Rich wrote:
>> Hello,
>>
>> We are running Samba 3.0.33 on a 2-node Linux cluster running RedHat 5.6 ES. Its primary application is to serve out a single network drive to support our business (about 350GB in size). For several years, this solution has been running flawlessly. File access was almost as fast as a local disk, so putting files on the server was never a problem. Our clients are running mostly Windows XP Pro. We have a few Windows 7 clients.
> ----
> Any difference in performance between the client types?

None whatsoever.

> Did the problems coincide with adding win7 machines to the network?

Nope. Windows 7 didn't appear on the network until 8 months later.

> Any new software on the clients (antivirus, firewall, etc.)? Is something using up more memory on them?

No - we can rule out the clients, since this happens on every client we have, no matter how it is configured.

> On your sockets, I up the SO_RCVBUF and SO_SNDBUF to at least 65536 each (more won't help until full smb2 support is in samba).

I can make this change and see what happens.

> Did you get any new Windows servers on your network around the time of the problem? I notice that you have your 'os level = 0'; that means for things like name resolution, your smb server will have lowest priority -- even below a win98 client, as I understand it.

No new servers were added. Our os level is low because the configuration file is for a BDC, and only the PDC and BDC are part of this Samba-only domain - no Windows servers are a part of it.

> You mention you ran an 'strace -f' on smbd. Have you looked at a wireshark trace? That would tell you more -- for example, if your Windows client keeps reducing the RCV buffer size when negotiating a TCP session, that would tell you why the reads were getting smaller. Maybe you are getting packet drops, or similar.

Good idea. I would expect to see indications of this activity over the wire. I'll let you know ...

> Reminds me, do you have switches or hubs, and what type of ethernet speed? I take it nothing in the hardware on the clients or the server has changed?

We run on all switches - Linksys and Dell. We have 100 Mbps to each desktop.

> You say you are using RH. Has the SW remained static since installation and through the onset of this problem? (I.e. an auto-update of SW might have changed some setting in the kernel, or some firewall might have been added or modified, etc.)

This is a real possibility, although we've booted the servers up using a kernel image from before the problem appeared, and the problem remained. Maybe I need to go back to a kernel module rev prior to the problem.

> Are the Windows clients 'paging' more? I.e. was there any change in the VB script or the SW it's using such that now there could be a memory leak, thus increased paging?

No - nothing like that is happening on the client.

> Have you set/optimized your TCP/IP params on XP (and what little you can do on Win7, which is less configurable than XP)? Have you added more clients (a significant number)?

These are pretty stock XP systems. The problem was so sudden (worked great for years, then slowed to a crawl) that it has to be associated with a change on the server or the client.

> On the Win clients, what SP are the XP clients running at? Many people complained when SP2 came out -- especially affected were network applications. SP3 has the best performance of the XP series (even better than the original), while SP1 was slower than 'SP0' (original), and SP2 was slower still.

We're all running at Windows XP service pack 3.

> I don't have any specific theories, just asking for more data at this point, since there are so many possible variables, and just having the information out there would help anyone investigate the problem.
>
> Good luck!
> Linda

Thanks.

Richard G. Lang
Well, it just gets "curiouser and curiouser".

I downloaded, built and installed the latest stable version of Samba (i.e. 3.5.9) on my "inactive" cluster member, which is running RedHat ES 5.6. In case I didn't show this before, here's the output of `uname -a`:

Linux mustang1 2.6.18-238.9.1.el5 #1 SMP Fri Mar 18 12:42:04 EDT 2011 i686 i686 i386 GNU/Linux

Anyway, I created a share, copied the "troublesome" file to that share, and opened it using the VB application that showed such poor performance. It opened the file and processed it as quickly as if it were on my local hard drive. This is more like it. This is back to how the share used to respond.

When I navigated back to the original copy of the file, the performance went to pieces again. Same file, different versions of Samba, different performance. Looks like I "fixed" it, although I don't know exactly what was wrong.

So, I wanted to take a wireshark snapshot of the "poor performance" to see if the client was negotiating the buffer size down over the wire. In the meantime, the original file and its folder were moved from the Samba share to a M$ share on another server. Oh well - I copied the file back to the Samba share. Guess what? The performance is great - back to where it was before the problem started.

So - it's not the version of Samba. It looks like this is an inode corruption on the disk, although I've run fsck a number of times on the disk and it always comes up clean. Hmmmm... there might be some tools that I need to use to keep my shared disk clean.

We're running the cluster through a pair of HP SmartArray 642 SCSI interfaces, both connected to an MSA 500 G2 disk array with redundant controllers. There are four logical disks defined, each of which is defined as part of a cluster service so it can swing between cluster members in case of a failure. Does anyone use this kind of disk array in a shared configuration like this?

Richard G. Lang
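A comparison like the one described above (same file, different locations, different speed) can be made repeatable with a small timing script run on a client. The Python sketch below is an illustration, not what was actually used in this thread: it reads a file sequentially with a fixed request size and reports bytes and elapsed time, which also shows how costly 1-byte reads are. The function name `time_reads` is hypothetical, and client-side caching may distort repeated runs.

```python
import os
import tempfile
import time


def time_reads(path, chunk_size):
    """Read `path` sequentially in `chunk_size`-byte requests; return (bytes, seconds)."""
    total = 0
    start = time.perf_counter()
    # buffering=0 disables Python's own buffering so each read() is one OS request.
    with open(path, "rb", buffering=0) as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


# Self-contained demo on a temporary local file. To compare shares, call
# time_reads() on a copy of the same file in each location instead.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(64 * 1024))
    demo_path = tmp.name
try:
    for size in (1, 4096, 65536):
        count, seconds = time_reads(demo_path, size)
        print(f"chunk={size:>6}  bytes={count}  seconds={seconds:.4f}")
finally:
    os.remove(demo_path)
```

Running the same measurement against the original copy, a fresh copy on the same share, and a local copy would show whether the slowdown follows the file, the directory, or the server.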
If I'm having oplock problems (i.e. poor performance), then would turning off oplocks altogether bring the performance back up?

Richard G. Lang