mathias dufresne
2016-Mar-07 15:07 UTC
[Samba] Samba AD/DC crashed again, third time in as many months
Answering the previous mail:
AD is the heart of the infrastructure. That is where all accounts are stored. This implies that soon after you start deploying AD, most of your IT infrastructure depends on it (all applications need accounts, the accounts are in AD: no AD, no accounts, nothing works), and that you have to take security seriously: an attacker with an administrator account can do almost anything, anywhere, on machines joined to AD.

So redundancy, every time.

You could also think about your own issue: is it the whole DB that is broken, or is it only the DB on the broken DC? With one DC, the whole DB lives on that one DC, so you always break the whole DB. With several DCs you have a chance of breaking only one DC while the others keep a coherent DB. That does not mean you will never break the whole DB (a backup and a working restore process are still needed).

Second mail:
You want to remove your FSMO owner. The FSMO owner is SOA. These two are really important notions in AD:
- FSMO is kind of like the PDC in an NT4 domain; these roles must belong to one DC. Seize the roles before demoting the old one.
- SOA is about DNS; it refers to the one server to which clients can push DNS modifications. Change the SOA before you try to add a replacement server for the one you demoted. If you don't, the DC you join to replace the demoted DC won't be able to send DNS updates!

And yes, it is possible to get redundancy with dns-backend=SAMBA_INTERNAL.

How to test that your DNS servers are well configured: samba_dnsupdate gives no error on all DCs (this test relates to the DNS service only).

2016-03-07 11:47 GMT+01:00 IT Admin <it at cliffbells.com>:

> As advised I have begun the process of adding ADCs to this domain and
> currently have a second samba ADC joined to the domain. I would like to
> demote the initial ADC and make this secondary ADC the primary as the
> problematic machine is still crashing and when samba has failed DNS fails
> across the domain.
> I'd also like to know if it is possible to create redundancy when using
> SAMBA_INTERNAL as a DNS backend.
>
> Please advise.
>
> JS
>
> On Mar 6, 2016 6:56 PM, "Andrew Bartlett" <abartlet at samba.org> wrote:
>
> > On Wed, 2016-03-02 at 16:42 -0500, IT Admin wrote:
> > > I built this machine, and while it isn't the most robust box in the
> > > world it has been stable otherwise. The RAID array is configured
> > > RAID1, I can't see how that could cause corruption issues and I
> > > haven't experienced any other data corruption issues apart from
> > > SAMBA collapsing
> >
> > I know it is hard to swallow, but I really think this is hardware, or
> > the OS configuration under it, combined with unexpected shutdown or
> > some other corruption vector.
> >
> > We have at this stage 10,000 or more domains running Samba4, and this
> > is only the second I've heard of with this kind of symptom. The first
> > I blamed on a use of DRBD that I postulated was not preserving 'write
> > barriers' (that is, the thing that makes fsync() work) and a poweroff,
> > but I didn't really have any proof.
> >
> > You do need to run a second DC, as well as run tools like memcheck on
> > this DC. Make sure you regularly run the backup script, so you can
> > work out when the corruption happens, and verify your DB with dbcheck.
> >
> > The second DC has the advantage that this kind of low-level corruption
> > doesn't easily spread across DRS replication (it would instead fail
> > replication).
> >
> > The error shown indicates that for some reason or other, it can't read
> > the schema. This is very odd, as the schema doesn't change!
> >
> > We would love to get to the bottom of this.
> >
> > Unlike others, I don't think this has anything to do with packaging
> > (that would just make us not start at all), but a clean install on a
> > clean machine is my best advice, keeping the rest aside (and off) for
> > forensics if you have the patience.
> >
> > Finally, always keep the steps simple - otherwise we might start
> > confusing admin errors for hardware errors or vice versa. The things
> > we all do in the panic are always the hardest to de-construct in the
> > cold light of day.
> >
> > Thanks,
> >
> > Andrew Bartlett
> >
> > --
> > Andrew Bartlett
> > https://samba.org/~abartlet/
> > Authentication Developer, Samba Team https://samba.org
> > Samba Development and Support, Catalyst IT
> > https://catalyst.net.nz/services/samba
>
> --
> To unsubscribe from this list go to the following URL and read the
> instructions: https://lists.samba.org/mailman/options/samba
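The DNS test suggested in the message above (samba_dnsupdate reporting no error on every DC, and knowing which server answers for the SOA) could be wrapped in a small helper along these lines. This is a minimal sketch under assumptions: `--verbose` and `--all-names` are existing samba_dnsupdate options, but treating a non-zero exit status as "failed" is a simplification, and the function names, the zone and the host names are hypothetical and do not come from this thread.

```python
from __future__ import annotations

import subprocess

# Existing samba_dnsupdate options: check every record it manages, verbosely.
DNSUPDATE_CMD = ["samba_dnsupdate", "--verbose", "--all-names"]


def soa_query(zone: str, server: str) -> list[str]:
    """Build a `host` command asking one DNS server for the zone's SOA record."""
    return ["host", "-t", "SOA", zone, server]


def run_local_dns_check() -> bool:
    """Run samba_dnsupdate on this DC and report whether it exited cleanly.

    Assumption: a non-zero exit status means the DNS update check failed.
    This function is defined here but never called automatically.
    """
    result = subprocess.run(DNSUPDATE_CMD, capture_output=True, text=True)
    return result.returncode == 0


def failing_dcs(results: dict[str, bool]) -> list[str]:
    """Given {dc_name: check_passed}, list the DCs that still need attention."""
    return sorted(name for name, passed in results.items() if not passed)
```

The idea is to run `run_local_dns_check()` on each DC in turn, collect the results by host name, and only go ahead with a demotion once `failing_dcs(...)` comes back empty.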
Rowland penny
2016-Mar-07 15:27 UTC
[Samba] Samba AD/DC crashed again, third time in as many months
On 07/03/16 15:07, mathias dufresne wrote:
> Answering the previous mail:
> AD is the heart of the infrastructure. That is where all accounts are
> stored. This implies that soon after you start deploying AD, most of
> your IT infrastructure depends on it (all applications need accounts,
> the accounts are in AD: no AD, no accounts, nothing works), and that you
> have to take security seriously: an attacker with an administrator
> account can do almost anything, anywhere, on machines joined to AD.
>
> So redundancy, every time.

Totally agree

> You could also think about your own issue: is it the whole DB that is
> broken, or is it only the DB on the broken DC? With one DC, the whole DB
> lives on that one DC, so you always break the whole DB.
> With several DCs you have a chance of breaking only one DC while the
> others keep a coherent DB. That does not mean you will never break the
> whole DB (a backup and a working restore process are still needed).

Again agree

> Second mail:
> You want to remove your FSMO owner. The FSMO owner is SOA.

Not necessarily, there are no FSMO roles on my second DC, but it has a SOA.

> These two are really important notions in AD:
> - FSMO is kind of like the PDC in an NT4 domain,

Well, to a certain extent, and only when you are describing the PDC emulator FSMO role.

> these roles must belong to one DC.

Totally wrong: you can, and probably should, share these about if you have more than one DC.

> Seize the roles before demoting the old one.

Again wrong: you should try to transfer the role first, and only seize it if you have to, i.e. when the FSMO role owner DC is dead.

> - SOA is about DNS; it refers to the one server to which clients can
> push DNS modifications. Change the SOA before you try to add a
> replacement server for the one you demoted. If you don't, the DC you
> join to replace the demoted DC won't be able to send DNS updates!

It would seem that something has changed and I need to do some more testing; I must add it to my todo list.

Rowland

> And yes, it is possible to get redundancy with
> dns-backend=SAMBA_INTERNAL.
>
> How to test that your DNS servers are well configured: samba_dnsupdate
> gives no error on all DCs (this test relates to the DNS service only).
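The rule given in the reply above (try to transfer a FSMO role first, and seize it only when the current owner DC is dead) can be written down as a tiny decision helper. This is a sketch, not an official procedure: `samba-tool fsmo transfer` and `samba-tool fsmo seize` with `--role=` are existing samba-tool subcommands, while the function name and the `owner_is_alive` flag are hypothetical names introduced for illustration.

```python
from __future__ import annotations


def fsmo_move_command(owner_is_alive: bool, role: str = "all") -> list[str]:
    """Pick the samba-tool command for moving FSMO roles to the local DC.

    Transfer is the cooperative path and needs the current owner to be
    reachable; seize is the fallback for an owner that is gone for good.
    """
    action = "transfer" if owner_is_alive else "seize"
    return ["samba-tool", "fsmo", action, f"--role={role}"]


# Example: the old DC still boots, so ask it to hand the roles over.
print(" ".join(fsmo_move_command(owner_is_alive=True)))
```

The resulting command is meant to be run on the DC that should receive the roles, and a DC whose roles were seized should not be brought back onto the network afterwards.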
IT Admin
2016-Mar-09 03:26 UTC
[Samba] Samba AD/DC crashed again, third time in as many months
To address my comments, and the follow-up comments, on redundancy:

I do now understand the importance of building robust infrastructure when implementing AD (or any centralized system like it) in an organization. I suppose that the bulk of my experience with traditional Windows-based AD environments has come from work in small business environments where my clients have had both limited budgets and limited technical knowledge. All of these Windows networks have relied on a single ADC, and in the fifteen years I've been supporting networks I have yet to experience a failure of the magnitude I have on this specific network. At my current employer we rely on a Windows-based AD environment to manage our domains, and we have multiple ADCs deployed for redundancy of both Active Directory and DNS.

When deploying Samba as an AD I've followed the same procedure each time, and currently have it implemented in three locations, all small businesses with the same aforementioned constraints: tight budgets and minimal knowledge of the technologies that support their businesses. For each deployment I've followed the Samba wiki, and with each iteration (or failure lol) I've gained both experience/comfort and knowledge about the process, and am now comfortable building the package from source, backing up the database, and recovering from failures via restores.

The wiki page on deploying Samba as an Active Directory Controller doesn't mention anything about redundancy, and it wasn't until I built the VM and joined a second ADC to this domain that I discovered and digested the wiki content on joining a secondary ADC to a previously existing AD. That guide states clearly the importance of redundancy, and after reading it and battling the repeated failures of this specific machine, I can't see myself deploying Samba again without redundant ADCs.
I think including information on the topic within the wiki page on Deploying Samba as an ADC might be beneficial for newcomers like myself and help others avoid these pitfalls. Just my two cents.

On the issue of recovering from the situation I'm currently in:

I appreciate the guidance given to date but am presently lost as to how I should proceed. Could someone point me to wiki content that will guide me through the process of demoting a PDC and transferring its role to another ADC in the forest? I feel I'm close to resolving this case and don't want to make mistakes if I can avoid them, and this is all new territory to me.

Thanks in advance,

JS

On Mar 7, 2016 10:29 AM, "Rowland penny" <rpenny at samba.org> wrote:

> On 07/03/16 15:07, mathias dufresne wrote:
>
>> Answering the previous mail:
>> AD is the heart of the infrastructure. That is where all accounts are
>> stored. This implies that soon after you start deploying AD, most of
>> your IT infrastructure depends on it (all applications need accounts,
>> the accounts are in AD: no AD, no accounts, nothing works), and that
>> you have to take security seriously: an attacker with an administrator
>> account can do almost anything, anywhere, on machines joined to AD.
>>
>> So redundancy, every time.
>
> Totally agree
>
>> You could also think about your own issue: is it the whole DB that is
>> broken, or is it only the DB on the broken DC? With one DC, the whole
>> DB lives on that one DC, so you always break the whole DB.
>> With several DCs you have a chance of breaking only one DC while the
>> others keep a coherent DB. That does not mean you will never break the
>> whole DB (a backup and a working restore process are still needed).
>
> Again agree
>
>> Second mail:
>> You want to remove your FSMO owner. The FSMO owner is SOA.
>
> Not necessarily, there are no FSMO roles on my second DC, but it has a SOA.
>
>> These two are really important notions in AD:
>> - FSMO is kind of like the PDC in an NT4 domain,
>
> Well, to a certain extent, and only when you are describing the PDC
> emulator FSMO role.
>
>> these roles must belong to one DC.
>
> Totally wrong: you can, and probably should, share these about if you
> have more than one DC.
>
>> Seize the roles before demoting the old one.
>
> Again wrong: you should try to transfer the role first, and only seize it
> if you have to, i.e. when the FSMO role owner DC is dead.
>
>> - SOA is about DNS; it refers to the one server to which clients can
>> push DNS modifications. Change the SOA before you try to add a
>> replacement server for the one you demoted. If you don't, the DC you
>> join to replace the demoted DC won't be able to send DNS updates!
>
> It would seem that something has changed and I need to do some more
> testing; I must add it to my todo list.
>
> Rowland
>
>> And yes, it is possible to get redundancy with
>> dns-backend=SAMBA_INTERNAL.
>>
>> How to test that your DNS servers are well configured: samba_dnsupdate
>> gives no error on all DCs (this test relates to the DNS service only).
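For the question asked in this message (moving the roles off the failing DC before demoting it), one cautious step is to confirm that the old DC no longer owns any FSMO role before running `samba-tool domain demote` on it. The sketch below parses text in the style printed by `samba-tool fsmo show`; the exact output format is an assumption on my part, and the sample text, the DC names `DC1`/`DC2` and the function names are invented for illustration.

```python
from __future__ import annotations

# Illustrative sample only: real output lists more roles and your own DNs.
SAMPLE_OUTPUT = """\
SchemaMasterRole owner: CN=NTDS Settings,CN=DC2,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=samdom,DC=example,DC=com
PdcEmulationMasterRole owner: CN=NTDS Settings,CN=DC1,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=samdom,DC=example,DC=com
"""


def parse_fsmo_show(output: str) -> dict[str, str]:
    """Map each role name to the DC name found in its owner DN."""
    owners: dict[str, str] = {}
    for line in output.splitlines():
        role, sep, dn = line.partition(" owner: ")
        if not sep:
            continue  # not a "<role> owner: <DN>" line
        # Assumed DN layout: CN=NTDS Settings,CN=<DC name>,CN=Servers,...
        parts = [part.strip() for part in dn.split(",")]
        if len(parts) >= 2 and parts[1].upper().startswith("CN="):
            owners[role.strip()] = parts[1][3:]
    return owners


def roles_held_by(dc_name: str, owners: dict[str, str]) -> list[str]:
    """Roles that still point at the given DC (case-insensitive match)."""
    return sorted(r for r, dc in owners.items() if dc.lower() == dc_name.lower())


def safe_to_demote(dc_name: str, owners: dict[str, str]) -> bool:
    """True only if roles were parsed and none of them belong to dc_name."""
    return bool(owners) and not roles_held_by(dc_name, owners)
```

With the sample above, `DC1` still holds the PDC emulator role, so the transfer has to be completed (and checked again) before the demotion is attempted.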