Alex Crow
2016-Jul-01 09:00 UTC
[Samba] Winbind process stuck at 100% after changing use_mmap to no
Hi,

We've had a strange issue after following the recommendations at https://wiki.samba.org/index.php/Ping_pong, particularly the part about mmap coherence. We are running CTDB/Samba over a MooseFS clustered FS, and we had not run the ping_pong test before. After finding that the mmap coherence test did not pass, we changed "use mmap" to "no" in smb.conf.

This morning, as users began to access their drives and profiles, performance tanked to such a degree that most people could not finish loading their profiles, and even where they succeeded, Windows drive mappings were taking minutes to complete per share.

Looking at the active CTDB servers, a single winbindd process on each was taking up 100% of CPU. After reverting "use mmap" to "yes", performance returned completely to normal.

We found this very odd, as this is a recommended setting according to the page above.

Here are the package versions:

Installed Packages
Name        : samba
Arch        : x86_64
Version     : 4.2.10
Release     : 6.el7_2
Size        : 1.8 M
Repo        : installed
From repo   : updates
Summary     : Server and Client software to interoperate with Windows machines
URL         : http://www.samba.org/
Licence     : GPLv3+ and LGPLv3+
Description : Samba is the standard Windows interoperability suite of programs for Linux and Unix.

Installed Packages
Name        : ctdb
Arch        : x86_64
Version     : 4.2.10
Release     : 6.el7_2
Size        : 1.2 M
Repo        : installed
From repo   : updates
Summary     : A Clustered Database based on Samba's Trivial Database (TDB)
URL         : http://www.samba.org/
Licence     : GPLv3+ and LGPLv3+
Description : CTDB is a cluster implementation of the TDB database used by Samba and other
            : projects to store temporary data. If an application is already using TDB for
            : temporary data it is very easy to convert that application to be cluster aware
            : and use CTDB instead.
Linux zearing.ifa.net 3.10.0-327.18.2.el7.x86_64 #1 SMP Thu May 12 11:03:55 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Best regards

Alex

--
This message is intended only for the addressee and may contain confidential information. Unless you are that person, you may not disclose its contents or use it in any way and are requested to delete the message along with any attachments and notify us immediately. This email is not intended to, nor should it be taken to, constitute advice. The information provided is correct to our knowledge & belief and must not be used as a substitute for obtaining tax, regulatory, investment, legal or any other appropriate advice. "Transact" is operated by Integrated Financial Arrangements Ltd. 29 Clement's Lane, London EC4N 7AE. Tel: (020) 7608 4900 Fax: (020) 7608 5300. (Registered office: as above; Registered in England and Wales under number: 3727592). Authorised and regulated by the Financial Conduct Authority (entered on the Financial Services Register; no. 190856).
Jeremy Allison
2016-Jul-01 18:28 UTC
[Samba] Winbind process stuck at 100% after changing use_mmap to no
On Fri, Jul 01, 2016 at 10:00:21AM +0100, Alex Crow wrote:
> Hi,
>
> We've had a strange issue after following the recommendations at
> https://wiki.samba.org/index.php/Ping_pong, particularly the part
> about mmap coherence. We are running CTDB/Samba over a MooseFS
> clustered FS, and we'd not done the ping-pong before. After finding
> that the mmap coherence test did not pass, we changed "use mmap" to
> "no" in smb.conf.
>
> This morning as users began to access their drives and profiles,
> performance tanked to such a degree that most people could not
> complete loading profiles and even where successful, windows drive
> mappings were taking minutes to complete per share.
>
> When looking at the active CTDB servers, on each a single Winbindd
> process was taking up 100% of CPU. After reverting the "use_mmap" to
> yes, performance returned completely to normal.
>
> We found this very odd as this is a recommended setting according to
> the page above.

Hmmm. I know very little about MooseFS. Did you get an strace on the winbindd process to try and figure out what it was doing?
Alex Crow
2016-Jul-02 09:29 UTC
[Samba] Winbind process stuck at 100% after changing use_mmap to no
On 01/07/16 19:28, Jeremy Allison wrote:
> On Fri, Jul 01, 2016 at 10:00:21AM +0100, Alex Crow wrote:
>> Hi,
>>
>> We've had a strange issue after following the recommendations at
>> https://wiki.samba.org/index.php/Ping_pong, particularly the part
>> about mmap coherence. We are running CTDB/Samba over a MooseFS
>> clustered FS, and we'd not done the ping-pong before. After finding
>> that the mmap coherence test did not pass, we changed "use mmap" to
>> "no" in smb.conf.
>>
>> This morning as users began to access their drives and profiles,
>> performance tanked to such a degree that most people could not
>> complete loading profiles and even where successful, windows drive
>> mappings were taking minutes to complete per share.
>>
>> When looking at the active CTDB servers, on each a single Winbindd
>> process was taking up 100% of CPU. After reverting the "use_mmap" to
>> yes, performance returned completely to normal.
>>
>> We found this very odd as this is a recommended setting according to
>> the page above.
> Hmmm. I know very little about MooseFS. Did you get an strace
> on the winbindd process to try and figure out what it was doing ?

I'm afraid not; the priority was to get people working, and I was fixing it remotely. I may be able to try it on a test setup.

Thanks

Alex
Volker Lendecke
2016-Jul-03 12:06 UTC
[Samba] Winbind process stuck at 100% after changing use_mmap to no
On Fri, Jul 01, 2016 at 10:00:21AM +0100, Alex Crow wrote:
> We've had a strange issue after following the recommendations at
> https://wiki.samba.org/index.php/Ping_pong, particularly the part
> about mmap coherence. We are running CTDB/Samba over a MooseFS
> clustered FS, and we'd not done the ping-pong before. After finding
> that the mmap coherence test did not pass, we changed "use mmap" to
> "no" in smb.conf.

"use mmap" only affects tdbs, which should not sit on the clustered fs. They should *always* be on a local file system like ext4, assuming you're using Linux. The wiki page above states that it's not absolutely essential to provide mmap coherence, so you should not bother too much.

Or are you putting tdb files on moosefs?

BTW, the straces you're seeing are probably not real spins; those are traverses. With 10000 hash chains, that will be 10000 preads. That's just taking time. If the tdbs sit on moosefs, it might actually be true that it does not like that. But as I said: never put tdbs on a cluster file system. If cluster file systems handled that access pattern fine, ctdb would not exist :-)

Volker
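Editorial sketch: Volker's point that a traverse over 10000 hash chains turns into 10000 preads once mmap is off can be illustrated roughly as follows. This is only a toy model under stated assumptions: the file written here is a flat array of 4-byte values standing in for hash chain heads, not the real tdb on-disk layout, and the function names are made up for illustration.

```python
import mmap
import os
import struct
import tempfile

HASH_CHAINS = 10000          # figure quoted in the thread
ENTRY = struct.Struct("<I")  # hypothetical 4-byte chain head, NOT the real tdb layout


def build_table(path, chains=HASH_CHAINS):
    """Write `chains` little-endian integers to `path` as a stand-in hash table."""
    with open(path, "wb") as f:
        for i in range(chains):
            f.write(ENTRY.pack(i))


def traverse_with_pread(path, chains=HASH_CHAINS):
    """Read each chain head with its own pread(); returns (values, pread_calls)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        values = []
        calls = 0
        for i in range(chains):
            data = os.pread(fd, ENTRY.size, i * ENTRY.size)  # one syscall per chain
            calls += 1
            values.append(ENTRY.unpack(data)[0])
        return values, calls
    finally:
        os.close(fd)


def traverse_with_mmap(path, chains=HASH_CHAINS):
    """Map the file once, then read chain heads from memory; returns (values, map_calls)."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            values = [ENTRY.unpack_from(m, i * ENTRY.size)[0] for i in range(chains)]
    return values, 1


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "toy_table.bin")
    build_table(demo_path)
    _, pread_calls = traverse_with_pread(demo_path)
    _, map_calls = traverse_with_mmap(demo_path)
    print("pread calls:", pread_calls, "mmap calls:", map_calls)
```

On a local file system each of those small reads is cheap; if every one has to go through a cluster file system client, the per-call cost may add up, which would be consistent with the slowdown described in this thread.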
Alex Crow
2016-Jul-03 19:42 UTC
[Samba] Winbind process stuck at 100% after changing use_mmap to no
On 03/07/16 13:06, Volker Lendecke wrote:
> On Fri, Jul 01, 2016 at 10:00:21AM +0100, Alex Crow wrote:
>> We've had a strange issue after following the recommendations at
>> https://wiki.samba.org/index.php/Ping_pong, particularly the part
>> about mmap coherence. We are running CTDB/Samba over a MooseFS
>> clustered FS, and we'd not done the ping-pong before. After finding
>> that the mmap coherence test did not pass, we changed "use mmap" to
>> "no" in smb.conf.
> "use mmap" only affects tdbs, which should not sit on the clustered
> fs, they should *always* be on a local file system like ext4, assuming
> you're using Linux. The wiki page above states that it's not
> absolutely essential to provide mmap coherence, so you should not
> bother too much.
>
> Or are you putting tdb files on moosefs?
>
> BTW, the straces you're seeing are probably not real spins, that's
> traverses. With 10000 hash chains, that will be 10000 preads. That's
> just taking time. If the tdbs sit on moosefs, it might actually be
> true that it does not like that. But as I said: Never put tdbs on a
> cluster file system. If cluster file systems did that access pattern
> fine, ctdb would not exist :-)
>
> Volker

Hi Volker,

I've only put the "private dir" onto MooseFS, as instructed in the CTDB docs. So, in that case, I'm assuming from your comments that there is no need to worry that the mmap test does not pass on the MooseFS mounts?

The "private dir" contains these files:

-rw-------+ 1 root root   24576 Jun 23 19:48 netlogon_creds_cli.tdb
-rw-------+ 1 root root  421888 Jun 23 19:48 passdb.tdb
-rw-------+ 1 root root  430080 Jun 23 19:47 secrets.tdb
drwxr-xr-x+ 3 root root 1001200 Jun 23 19:48 smbd.tmp

It just strikes me as odd that "use mmap = no" causes such a slowdown when these are the only files I have on my clustered FS. The timestamp on all those files is the last time I joined one of the member servers to the domain.

More worrying is that the rw ping_pong test (no mmap) does not work properly even when I mount MooseFS with -o mfscachemode=none, which is not supposed to do any local page/file caching. I still get inconsistent results when adding and removing ping_pong tests on other nodes (i.e. the increment count does not always match the number of nodes running the test). I have raised this with the MooseFS chaps and am waiting for a response.

Many thanks

Alex
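Editorial sketch: one way to answer the question raised earlier in the thread, whether tdb files sit on the cluster file system, is to look up which mount a directory belongs to. The snippet below parses /proc/mounts-style text and returns the file system type of the longest matching mount point. The set of "local" types, the example paths, and the `fuse.mfs` type string in the usage note are assumptions for illustration; check what your own /proc/mounts reports.

```python
import os

# Hypothetical allow-list of local file system types; adjust for your systems.
LOCAL_FS_TYPES = {"ext3", "ext4", "xfs", "btrfs"}


def parse_mounts(mounts_text):
    """Return (mount_point, fs_type) pairs from /proc/mounts-style text.

    Note: escaped characters in mount points (e.g. \\040 for a space)
    are not decoded in this sketch.
    """
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            entries.append((fields[1], fields[2]))
    return entries


def fs_type_for(path, mounts_text):
    """Return the fs type of the longest mount point containing `path`, or None."""
    path = os.path.normpath(path)
    best_mount, best_type = "", None
    for mount_point, fs_type in parse_mounts(mounts_text):
        mp = mount_point.rstrip("/") or "/"
        inside = path == mp or path.startswith(mp.rstrip("/") + "/")
        # ">=" lets a later mount on the same mount point win.
        if inside and len(mp) >= len(best_mount):
            best_mount, best_type = mp, fs_type
    return best_type


def is_local(path, mounts_text):
    """True if `path` sits on one of the assumed local file system types."""
    return fs_type_for(path, mounts_text) in LOCAL_FS_TYPES
```

A possible use, with a hypothetical directory: read the text with `open("/proc/mounts").read()` and call `is_local("/var/lib/samba/private", text)`. A result of False for a directory holding tdb files would match the situation Volker warns about.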