Hi list,

I am wondering whether there is a theoretical maximum size for an Active Directory forest, according to Samba or MS. My concern comes from this.

We are piloting AD with Samba 4 at a couple of our schools. My thought was to eventually host the top-level forest at our central office, then set up each school as a "site" with its own AD DC, with each school's subnet mapped to that site so that clients authenticate against their local DC.

I ran this by our working group, and they are concerned that with 2000+ staff and 40,000 students (just an estimate), the AD database would grow too large and logins would take forever. I believe it won't make a large difference, as users would just authenticate against the server in their subnet. We have 50 sites that can talk to each other over a 10.x.x.x network, each with its own subnet.

Is there a concern with capacity in this case? Currently, we have 2 AD servers in each of the pilot sites, running as VMs with 2GB of RAM. Our plan moving forward is likely to keep two AD DCs at each site, but I want to know if we can just set up one large forest, or if each site should remain its own forest.

Thanks
On 21.12.2017 at 18:35, Luke Barone via samba wrote:
> I am wondering if there is a theoretical maximum for an Active Directory
> forest, according to Samba or MS?
>
> I ran this by our working group, and they are concerned that with 2000+
> staff and 40,000 students (just an estimate), that the AD database would
> grow too large, and take forever for the users to log in.

I have no personal experience with such installations, but Google quickly turns up this:

https://technet.microsoft.com/de-de/library/active-directory-maximum-limits-scalability(v=ws.10).aspx

Maybe it answers some of your questions.

The speed of user lookups and logins should depend on the backend of the LDAP database. I can't tell you anything about the Samba implementation, but from other experience with LDAP and databases I would say that nowadays looking something up among 40,000 entries should be a piece of cake for any modern database. Append two zeroes and you might get into a range where speed is a concern...

Andreas
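To put the "piece of cake" estimate in perspective, here is a minimal sketch that times a single indexed lookup among 42,000 entries. It uses Python's built-in sqlite3 module as a stand-in for "any modern database". It is not Samba's LDB/TDB backend, and the table layout and account names are made up for illustration, so treat the timing as an order-of-magnitude hint only.

```python
import sqlite3
import time


def build_directory(n_users):
    """Create an in-memory table of n_users made-up accounts."""
    conn = sqlite3.connect(":memory:")
    # The PRIMARY KEY gives us an index on the account name.
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, uid INTEGER)")
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)",
        ((f"user{i:06d}", 10000 + i) for i in range(n_users)),
    )
    conn.commit()
    return conn


def lookup(conn, name):
    """Return the uid for one account name, or None if it does not exist."""
    row = conn.execute(
        "SELECT uid FROM accounts WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = build_directory(42_000)  # 2000+ staff plus 40,000 students
    start = time.perf_counter()
    uid = lookup(conn, "user041999")
    elapsed = time.perf_counter() - start
    print(f"uid={uid}, lookup took {elapsed * 1e6:.0f} microseconds")
```

Changing the argument to `build_directory` to a figure with two more zeroes is a quick way to see how the same lookup behaves at the larger scale mentioned above.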
On Thu, 2017-12-21 at 09:35 -0800, Luke Barone via samba wrote:
> Is there a concern with capacity in this case? Currently, we have 2 AD
> servers in each of the pilot sites running as VMs, using 2GB of RAM. Our
> plan moving forward is to likely keep two AD DCs at each site, but I want
> to know if we can just setup one large forest, or if each site should
> remain its own forest.

A single forest would work fine, I think, and it avoids trusts, which are still fairly limited in their support in our AD DC.

40,000 is a scale that you can expect Samba to operate correctly at with Samba 4.7, and even more so with Samba 4.8, to be released in March. Make sure to use 4.7.4 (due shortly) to get the fix we just made to the DNS performance regression.

Beyond that, my team at Catalyst is targeting the > 100,000 object scale for a client, and we plan to renovate our TDB-based LDB database engine to use Symas's LMDB (used in OpenLDAP) for even more scale.

Regarding operating scale, Samba 4.7 runs with one process per LDAP connection and so can require significant memory if there are a lot of LDAP connections (you may wish to allocate much more than 2GB).

Samba 4.8 will provide a 'prefork' mode that reduces that by sharing the connections between multiple processes. (Samba 4.6 and below were memory-efficient, using one process, but CPU-inefficient, using just one CPU for all LDAP traffic.)

Finally, do test it out. Add lots of users and groups if you are concerned. Use TDB_NO_FSYNC=1 during the load to make things faster (but of course unsafe) and you can easily add that many users.

Additionally, we now have load testing tools you can use to trial the system with a replay of realistic traffic from your network. These are in Samba's master branch and will be part of Samba 4.8.

Andrew Bartlett

-- 
Andrew Bartlett                           http://samba.org/~abartlet/
Authentication Developer, Samba Team      http://samba.org
Samba Developer, Catalyst IT              http://catalyst.net.nz/services/samba
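The bulk-load advice above could be sketched roughly as follows. This assumes `samba-tool user create` with `--random-password` is available on the test DC. The `loadtest` name prefix and the helper function names are hypothetical. The script only prints the commands unless `dry_run=False` is passed, and TDB_NO_FSYNC=1 should only ever be set on a throwaway test DC, since it trades safety for speed.

```python
import os
import subprocess


def plan_user_creation(count, prefix="loadtest"):
    """Build one `samba-tool user create` command per test account.

    The account naming scheme is made up for this sketch.
    """
    return [
        ["samba-tool", "user", "create", f"{prefix}{i:05d}", "--random-password"]
        for i in range(count)
    ]


def load_env():
    """Copy the current environment and switch off fsync for TDB writes."""
    env = dict(os.environ)
    env["TDB_NO_FSYNC"] = "1"  # faster, but unsafe: test systems only
    return env


def create_users(count, dry_run=True):
    """Print the planned commands, or run them when dry_run is False."""
    commands = plan_user_creation(count)
    env = load_env()
    for cmd in commands:
        if dry_run:
            print("TDB_NO_FSYNC=1", " ".join(cmd))
        else:
            subprocess.run(cmd, env=env, check=True)
    return commands


if __name__ == "__main__":
    create_users(3)  # dry run: only prints the commands
```

Starting with a small count in dry-run mode makes it easy to check the generated account names before loading tens of thousands of users into a test domain.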
Thank you for the guidance.

We are currently using whatever the latest version in the Debian repos is, so we may need to wait for the next Debian release to see 4.7 or 4.8.

How much memory do you suppose would be adequate at each site for 40,000 users to authenticate? We are aiming to use VMs more, but if more memory is required, we'd like to price it out as we build the servers, instead of later.

On Fri, Dec 22, 2017 at 9:16 AM, Andrew Bartlett <abartlet at samba.org> wrote:
> Regarding operating scale, Samba 4.7 runs with one process per LDAP
> connection and so can require significant memory if there are a lot of
> LDAP connections (you may wish to allocate much more than 2GB).
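One way to approach the sizing question, given that Samba 4.7 uses one process per LDAP connection, is a back-of-the-envelope calculation: base memory plus per-connection memory times the peak number of concurrent connections. The sketch below is only that arithmetic. The function name and the figures in the example are placeholders, not measurements or Samba recommendations. You would substitute values observed on your own pilot DCs, for example the resident size of one LDAP worker process and the peak connection count during morning logins.

```python
def estimate_dc_memory_mb(base_mb, per_connection_mb, peak_connections, headroom=0.25):
    """Estimate RAM for a DC that forks one process per LDAP connection.

    All inputs are values you measure or guess yourself; nothing here
    comes from Samba documentation.
    """
    if min(base_mb, per_connection_mb, peak_connections, headroom) < 0:
        raise ValueError("inputs must be non-negative")
    working_set = base_mb + per_connection_mb * peak_connections
    return working_set * (1 + headroom)


if __name__ == "__main__":
    # Placeholder numbers for illustration only.
    needed = estimate_dc_memory_mb(
        base_mb=512, per_connection_mb=20, peak_connections=200
    )
    print(f"Estimated RAM: {needed:.0f} MB")
```

Note that the driver is the number of simultaneous LDAP connections at a site, not the total number of accounts in the forest, so a site-level peak is the figure worth measuring.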