Unfortunately I don’t seem to have saved one of those big gencache.tdb files
(we don’t currently see them). Or at least I can’t seem to find one now…
A while ago when we first discovered this I posted a message:
https://lists.samba.org/archive/samba/2017-December/212577.html
I now notice that I never replied to a question I received for that post - we
build our own Samba package (we do not use the ports-supplied one with FreeBSD).
From that post, taken on a Samba server that had then been running for a number
of weeks (or months), if I remember correctly:
> > # tdbtool gencache.tdb check
> > Database integrity is OK and has 495309 records
>
> > # tdbtool gencache.tdb keys | egrep IDMAP/ | wc -l
> > 88974
> > # tdbtool gencache.tdb keys | egrep RA/ | wc -l
> > 406311
>
> (The number of IDMAP records I sort of can understand - our AD server
> contains some 100k users). But 406k RA records?
>
> > # tdbtool gencache.tdb dump | egrep -A4 RA/ | sed -e 's:/: :' | awk '($1 == "[000]") {print $20}' | sort | uniq -c
> > 2001 OSX
> > 374245 Sam
> > 30080 Vis
>
> (“Vis” = Vista, “Sam” = Samba). 374245 “Samba” records seems a bit
> excessive. Our servers are mostly serving Windows 7 and Windows 10 clients.
> The most active “Samba” clients should be our Nagios monitoring system...
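
The per-prefix counts above were done one prefix at a time with egrep and wc.
A minimal sketch of doing it in one pass follows. It assumes that each line of
"tdbtool gencache.tdb keys" output carries one key and that keys begin with an
upper-case prefix followed by "/" (as IDMAP/ and RA/ do); the function name
count_key_prefixes is made up for this example.

```shell
#!/bin/sh
# Count TDB keys by their leading "PREFIX/" component.
# Reads a key listing on stdin (one key per line, assumed format) and
# prints "count prefix" lines, largest count first.
count_key_prefixes() {
    LC_ALL=C awk '
        # Take the first run of capitals, digits or underscores followed by "/".
        match($0, /[A-Z0-9_]+\//) {
            count[substr($0, RSTART, RLENGTH)]++
        }
        END {
            for (p in count) printf "%d %s\n", count[p], p
        }
    ' | sort -k1,1nr -k2,2
}
```

It would be used as "tdbtool gencache.tdb keys | count_key_prefixes". Lines
without a recognisable prefix are silently skipped, so the total can be lower
than the record count reported by "check".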
On one of our servers right now (gencache.tdb deleted at 7am, it’s now 11am):
> root at filur01:/var/samba/cache # tdbtool gencache.tdb check
> Database integrity is OK and has 14768 records.
> root at filur01:/var/samba/cache # tdbtool gencache.tdb dump | egrep -A4 RA/ | sed -e 's:/: :' | awk '($1 == "[000]") {print $20}' | sort | uniq -c
> 328 OSX
> 1171 Sam
> 84 Vis
The “Sam” count grows every time I rerun that command (by something like 5-10
records per minute right now). The other counters go up and down with the
number of clients connecting. Right now that server has some 390 SMB clients
(mostly Windows 10) connected.
> root at filur01:/var/samba/cache # tdbtool gencache.tdb info
> Size of file/data: 2723840/1118578
> Header offset/logical size: 0/2723840
> Number of records: 14481
> Incompatible hash: yes
> Active/supported feature flags: 0x00000000/0x00000001
> Robust mutexes locking: no
> Smallest/average/largest keys: 14/55/99
> Smallest/average/largest data: 17/21/89
> Smallest/average/largest padding: 13/25/106
> Number of dead records: 1
> Smallest/average/largest dead records: 823272/823272/823272
> Number of free records: 15
> Smallest/average/largest free records: 12/1420/20824
> Number of hash chains: 10000
> Smallest/average/largest hash chains: 0/1/8
> Number of uncoalesced records: 1
> Smallest/average/largest uncoalesced runs: 1/1/1
> Percentage keys/data/padding/free/dead/rechdrs&tailers/hashes: 30/11/14/1/30/15/1
> root at filur01:/var/samba/cache # ls -l gencache.tdb
> -rw-r--r-- 1 root wheel 2723840 Aug 30 10:54 gencache.tdb
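
As a rough aid for the fragmentation question, the "Size of file/data" line
from "tdbtool gencache.tdb info" can be turned into a single percentage. This
is only a sketch: it assumes the second number counts bytes of keys plus data
(in the output above it matches the 30 + 11 in the percentage line), and the
function name tdb_data_percent is made up for this example. A low value only
says that much of the file is padding, free, dead or header space; it does not
prove fragmentation by itself.

```shell
#!/bin/sh
# Print live key+data bytes as a whole percentage of the TDB file size.
# Reads the output of "tdbtool <file> info" on stdin.
tdb_data_percent() {
    LC_ALL=C awk '
        /Size of file\/data:/ {
            # The last field looks like "FILEBYTES/DATABYTES".
            split($NF, n, "/")
            # %d truncates, so 41.07 is printed as 41.
            if (n[1] > 0) printf "%d\n", (100 * n[2]) / n[1]
        }
    '
}
```

It would be used as "tdbtool gencache.tdb info | tdb_data_percent". Logging
that number alongside the file size over a few days should show whether the
file grows mainly through live records or through dead and free space.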
My guess would be that the TDB database doesn’t really scale that well when the
number of records grows beyond some limit, so when our gencache.tdb contained
half a million (or more) records, lookups of the IDMAP records would take
longer and longer to execute, causing various authentication steps to slow
down...
- Peter
> On 29 Aug 2018, at 19:06, Jeremy Allison <jra at samba.org> wrote:
>
> On Wed, Aug 29, 2018 at 03:36:23PM +0200, Peter Eriksson via samba wrote:
>> For what it’s worth you are not alone in seeing similar problems with
>> Samba and gencache.
>>
>> Our site has some 110K users (a university with staff & students,
>> including former ones), and currently around 2000 active (SMB) clients
>> connecting to 5 different Samba servers (around 400-500 clients per server).
>> When we previously just let things “run”, gencache.tdb would grow forever
>> and authentication/login performance would start to deteriorate after a
>> little while (logins would take more than 10 seconds). So we now delete it
>> (and locks/locking.tdb, which also tends to grow forever) and restart our
>> Samba processes every morning at 7 am, which gives us much more stable
>> performance.
>>
>> - Servers with 256GB of RAM, 10Gbps ethernet interfaces and around
>>   110TB of disk per server.
>> - FreeBSD 11.2-p2
>> - Samba 4.7.6 with some local patches to allow (much) bigger socket
>>   listening queues in order to handle the case of many clients connecting
>>   at the same time.
>>
>> (We are trying to upgrade to a more recent Samba, but 4.7.8 and 4.7.9
>> gave us horrible authentication performance every 10th hour, where the
>> servers basically refused client logins for about 2 hours, so we had to
>> back down to 4.7.6 again.)
>
> Hmmm. Can you save off one of the large
> gencache.tdb files and work out if this
> is a fragmentation issue ?
>
> Sounds like it..