JS
2016-Jan-03 06:00 UTC
[Samba] Samba 4 AD - Samba Fails to Start, hdb_samba4_create_kdc (setup KDC database) failed
L.P.H. van Belle writes:

> Ok,

Hi Louis,

Thank you again for taking the time to help me out. I do appreciate it, and I hope you had a safe and happy New Year's Eve. I'm going to work my way through the questions/comments in your response from top to bottom:

> First things I see.
>
> NTP
> drwxr-x--- 2 root root 4096 Dec 28 21:12 ntp_signd
> should be root:ntp

No idea why the ownership is incorrect for that directory, but I have executed the following to fix it:

sudo chown -R root:ntp /var/lib/samba/ntp_signd

and now the security settings on that dir look like:

sudo ls -la /var/lib/samba/ntp_signd/
total 8
drwxr-x--- 2 root ntp 4096 Dec 28 21:12 .
drwxr-xr-x 8 root root 4096 Dec 13 21:07 ..
srwxrwxrwx 1 root ntp 0 Dec 28 21:12 socket

> SYSVOL
> drwxrwx---+ 3 root BUILTIN\administrators 4096 Apr 28 2015 sysvol
> yours shows 300000 while mine gives: BUILTIN\administrators
> but I have winbind/nsswitch etc. configured on my DC, don't ask why, but I
> need it, and it works well for me.

Regarding the SYSVOL permissions, I checked the permissions of /var/lib/samba/ on another PDC I have deployed on a different network and ntp_signd is owned by root:3000000 as well.

> Can you tell more about the hardware failure?
> Disk problems, power outage etc., what exactly happened?
> Did you see a filesystem check the first time starting up after the failure?

The initial hardware failure was a RAID array failure. I replaced the failed devices, rebuilt the array, and then rebuilt their domain from scratch, provisioning under a new domain.

> I assume it's the only server, so no other DCs.

Yes, that is correct, this machine is the only domain controller on this network.

> Stop all samba processes and back up at least these folders.
> /etc/samba
> /var/lib/samba
> /var/cache/samba

Samba fails at boot. I've already made a couple of safety backups, but for good measure I stopped the smbd, nmbd, and samba services and backed up the directories you listed.

> When you run: samba-tool fsmo show
> You probably get an error...

I do receive an error. Note that I did not start any of the aforementioned services prior to executing the samba-tool command below:

sudo samba-tool fsmo show
ldb_wrap open of secrets.ldb
ERROR(assert): uncaught exception
  File "/usr/lib/python2.7/dist-packages/samba/netcmd/__init__.py", line 175, in _run
    return self.run(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/samba/netcmd/fsmo.py", line 196, in run
    assert len(res) == 1

> , so try the following.
> samba-tool fsmo seize

I receive a second error when executing the seize command:

sudo samba-tool fsmo seize
ldb_wrap open of secrets.ldb
ERROR: Invalid FSMO role.

> (I don't think it will work, but give it a try, any output is most welcome)
>
> These do worry me.
> Failed to find object DC=one,DC=cliffbells,DC=com for attribute
> fsmoRoleOwner - Cannot find DN DC=one,DC=cliffbells,DC=com to get
> attribute fsmoRoleOwner for reference dn: (null)
>
> ./source4/dsdb/common/util.c:1877(samdb_is_pdc)
> Failed to find if we are the PDC for this ldb: Searching for
> fSMORoleOwner in DC=one,DC=cliffbells,DC=com failed: Cannot find DN
> DC=one,DC=cliffbells,DC=com to get attribute fsmoRoleOwner for reference
> dn: (null)
>
> which looks like your samba DB is corrupted, probably due to the hardware
> failure.

If your hunch that the database is corrupt holds true, it couldn't be from the hardware failure, as this domain was provisioned after that incident. I do believe I may have traced where any possible corruption might have originated though... I (apparently foolishly) started backing up /var/lib/samba with CrashPlan after the hardware failure incident... I'm guessing that was a bad idea.

> Do you have a backup, made with samba_backup?
> (shown here: https://wiki.samba.org/index.php/Backup_and_restore_an_Samba_AD_DC )
>
> Because I think your DB is corrupted and beyond recovery.

No, I do not have that backup mechanism implemented, and from reading that wiki page's notes about backing up live databases I have come to the conclusion that CrashPlan backed up /var/lib/samba/ while the databases were live and irreparably damaged them. I don't know what the relationship between /var/lib/samba/ and /var/cache/samba/ is exactly, but I assume that any backup I had created via CrashPlan (if it had worked instead of wreaking havoc) probably wouldn't have been valid without the /var/cache/samba/ directory contents...

I will be implementing the Samba backup script from your wiki link immediately on the other Samba AD DCs I have deployed, and will use it here once I've rebuilt the domain, with CrashPlan providing offsite storage for the archives it creates.

Which leads us to your closing statement:

> If you have backed up:
> /etc/samba
> /var/lib/samba
> /var/cache/samba
>
> You can remove the content of
> /var/lib/samba
> /var/cache/samba
>
> And reprovision, based on the posts here and the things I see.
> If you have any backup which also has the samba databases, that's the
> first thing you can try.
>
> Greetz,
>
> Louis

Other than the Python error I received after running samba-tool fsmo show, I believe I've built a pretty solid case for poor backup strategy being the cause of this failure, and that reprovisioning the domain is my only course of action at this time. If you believe I'm getting ahead of myself, or if you think that Python error could lead to another failure after I've reprovisioned, please let me know. I intend to execute the new domain provisioning tomorrow (Sunday Jan 03 2016) in the late afternoon/early evening (EST), and would hate to go through the process of rebuilding their infrastructure only to have a Python issue trash the domain again.

Thanks again Louis et al. for helping me troubleshoot this issue, I'm still green when it comes to Samba.

Kind Regards,

JS
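The "stop everything, then back up these folders" step discussed above could be sketched roughly as follows. This is only an illustration, not the samba_backup script from the wiki: the function name `archive_dirs` and the archive naming scheme are made up for this example, the directory list is the one quoted in the thread (Ubuntu package locations), and stopping the Samba processes beforehand is left to the operator.

```python
import tarfile
import time
from pathlib import Path

# Directories named in the thread (assumption: Ubuntu package layout;
# a source build will most likely use different locations).
SAMBA_DIRS = ["/etc/samba", "/var/lib/samba", "/var/cache/samba"]


def archive_dirs(dirs, dest_dir, stamp=None):
    """Write one .tar.gz per existing directory and return the archive paths.

    Hypothetical helper. Run it only while every Samba process is stopped,
    since the thread's concern is that live database files may be copied
    in an inconsistent state.
    """
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archives = []
    for d in map(Path, dirs):
        if not d.is_dir():
            continue  # skip paths that do not exist on this host
        name = str(d).strip("/").replace("/", "_")
        target = dest / "{}-{}.tar.gz".format(name, stamp)
        with tarfile.open(str(target), "w:gz") as tar:
            # Store entries relative to the directory's own name.
            tar.add(str(d), arcname=d.name)
        archives.append(target)
    return archives
```

Calling `archive_dirs(SAMBA_DIRS, "/root/samba-backup")` (the destination is an arbitrary example path) as root would leave one timestamped archive per directory, which can then be handed to an offsite tool.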
Rowland penny
2016-Jan-03 08:37 UTC
[Samba] Samba 4 AD - Samba Fails to Start, hdb_samba4_create_kdc (setup KDC database) failed
On 03/01/16 06:00, JS wrote:

> [...]
>> Stop all samba processes and back up at least these folders.
>> /etc/samba
>> /var/lib/samba
>> /var/cache/samba
> Samba fails at boot, I've already made a couple of safety backups but for
> good measure I stopped smbd, nmbd, and samba services and backed up the
> directories you listed.

Just how are you starting Samba? If you are running Samba as an AD DC, you should only start the samba daemon, yet you say that you 'stopped smbd, nmbd, and samba services'. 'nmbd' should not be running on an AD DC; it interferes with the 'nbt' service built into the samba daemon.

>> When you run: samba-tool fsmo show
>> You probably get an error...
> I do receive an error, note I did not start any of the aforementioned
> services prior to executing the samba-tool command below:
>
> sudo samba-tool fsmo show
> ldb_wrap open of secrets.ldb
> ERROR(assert): uncaught exception
>   File "/usr/lib/python2.7/dist-packages/samba/netcmd/__init__.py", line 175, in _run
>     return self.run(*args, **kwargs)
>   File "/usr/lib/python2.7/dist-packages/samba/netcmd/fsmo.py", line 196, in run
>     assert len(res) == 1

This is a known problem that I have fixed in master; mind you, your version of fsmo.py will only show five of the seven roles. Your problem seems to be that at least one of your FSMO roles doesn't have a role owner, so when the Python code asserts that the search returned exactly one result (assert len(res) == 1), it throws an error.

> [...]
> If your hunch that the database is corrupt holds true it couldn't be from
> hardware failure as this domain was provisioned after that incident. I do
> believe I may have traced where any possible corruption might have
> originated though... I (apparently foolishly) started backing up
> /var/lib/samba with CrashPlan after the hardware failure incident... I'm
> guessing that was a bad idea.

As far as I am aware, you cannot back up a running Samba AD DC with anything that doesn't use tdbbackup, unless you stop samba.

> [...]
> Thanks again Louis et al for helping me troubleshoot this issue, I'm still
> green when it comes to Samba.

One of your problems is that you are using the stock Ubuntu samba, which is getting a bit long in the tooth now. Can I suggest you use either the latest freely available Samba from Sernet or, better still, compile it yourself and use the latest version, 4.3.3. This will get you a much improved fsmo.py and will also cover you for several CVEs.

Rowland
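To make the assertion failure described above concrete, here is a minimal sketch of the pattern. It is not the actual fsmo.py code: the role labels, the dictionary layout, and both function names are invented for illustration, and plain lists stand in for LDB search results.

```python
def strict_owner(search_results):
    """Mimic the failing pattern: exactly one result is assumed."""
    assert len(search_results) == 1
    return search_results[0]["fSMORoleOwner"]


def tolerant_owner(search_results):
    """Return the owner if exactly one result exists, otherwise None."""
    if len(search_results) != 1:
        return None
    return search_results[0]["fSMORoleOwner"]


def show_roles(results_by_role):
    """Build one report line per role, flagging roles with no owner."""
    lines = []
    for role, res in results_by_role.items():
        owner = tolerant_owner(res)
        lines.append("{} owner: {}".format(role, owner or "NOT FOUND"))
    return lines
```

With an empty result list for any role, `strict_owner` raises `AssertionError`, which matches the shape of the traceback in the thread, while `show_roles` would report that role as `NOT FOUND` and continue with the others.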
Andrew Bartlett
2016-Jan-03 09:26 UTC
[Samba] Samba 4 AD - Samba Fails to Start, hdb_samba4_create_kdc (setup KDC database) failed
On Sun, 2016-01-03 at 08:37 +0000, Rowland penny wrote:

> As far as I am aware, you cannot backup a running Samba AD DC with
> anything that doesn't use tdbbackup, unless you stop samba.

To be clear, as long as the backup is only making reads, the impact should only be on the backed-up DB, not on Samba.

Andrew Bartlett

--
Andrew Bartlett http://samba.org/~abartlet/
Authentication Developer, Samba Team http://samba.org
Samba Developer, Catalyst IT http://catalyst.net.nz/services/samba
Andrew Bartlett
2016-Jan-03 09:34 UTC
[Samba] Samba 4 AD - Samba Fails to Start, hdb_samba4_create_kdc (setup KDC database) failed
On Sun, 2016-01-03 at 06:00 +0000, JS wrote:

> Other than the python error I received after running samba-tool fsmo
> show, I believe I've built a pretty solid case for poor backup strategy
> being the cause of this failure, and that reprovisioning the domain is my
> only course of action at this time. If you believe I'm getting ahead of
> myself, or if you think that Python error could lead to another failure
> after I've reprovisioned, please let me know. I intend to execute the new
> domain provisioning tomorrow (Sunday Jan 03 2016) in the late
> afternoon/early evening (EST), and would hate to go through the process of
> rebuilding their infrastructure only to have a Python issue trash the
> domain again.

I've not seen an error like yours before. It suggests one of the key objects that the KDC needs to start is not present in the DB. This particular error is pretty damning:

> Failed to find object DC=one,DC=cliffbells,DC=com for attribute
> fsmoRoleOwner - Cannot find DN DC=one,DC=cliffbells,DC=com to get
> attribute fsmoRoleOwner for reference dn: (null)

That is, it can't find the base object for the whole domain.

What does 'samba-tool dbcheck' say? After a backup, does running it with --fix resolve the issue or at least run clear?

If that is fixed (somehow), then what does 'samba-tool domain exportkeytab' or 'pdbedit -L -v' say? Try turning up the debug level to get a failure message if it fails.

But all said and done, it seems unlikely that that domain is in a 'good' enough state to continue.

Andrew Bartlett

--
Andrew Bartlett http://samba.org/~abartlet/
Authentication Developer, Samba Team http://samba.org
Samba Developer, Catalyst IT http://catalyst.net.nz/services/samba
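The order of checks suggested above can be written down as a small checklist generator. This is a sketch only: it prints the commands rather than running them, since 'dbcheck --fix' changes the database and should follow a backup. The function name and the keytab path argument are hypothetical, and the use of '-d' to raise the debug level is an assumption based on the usual Samba common options.

```python
import shlex


def diagnostic_commands(keytab_path, debug_level=None):
    """Return the suggested checks, in order, as shell-ready strings.

    keytab_path is a placeholder for wherever the exported keytab
    should be written; debug_level, if given, is appended as '-d N'.
    """
    dbg = ["-d", str(debug_level)] if debug_level is not None else []
    steps = [
        ["samba-tool", "dbcheck"],
        ["samba-tool", "dbcheck", "--fix"],  # only after taking a backup
        ["samba-tool", "domain", "exportkeytab", keytab_path],
        ["pdbedit", "-L", "-v"],
    ]
    return [" ".join(shlex.quote(part) for part in step + dbg) for step in steps]


if __name__ == "__main__":
    for line in diagnostic_commands("/root/dc.keytab", debug_level=5):
        print(line)
```

Printing rather than executing keeps the operator in control of when the '--fix' step runs and lets each command's output be inspected before moving on.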
JS
2016-Jan-03 10:31 UTC
[Samba] Samba 4 AD - Samba Fails to Start, hdb_samba4_create_kdc (setup KDC database) failed
Andrew Bartlett <abartlet <at> samba.org> writes:

> What does 'samba-tool dbcheck' say?

Running "sudo samba-tool dbcheck" produces the following Python error:

sudo samba-tool dbcheck
ERROR(<type 'exceptions.IndexError'>): uncaught exception - list index out of range
  File "/usr/lib/python2.7/dist-packages/samba/netcmd/__init__.py", line 175, in _run
    return self.run(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/samba/netcmd/dbcheck.py", line 120, in run
    reset_well_known_acls=reset_well_known_acls)
  File "/usr/lib/python2.7/dist-packages/samba/dbchecker.py", line 87, in __init__
    dnsadmins_sid = ndr_unpack(security.dom_sid, res[0]["objectSid"][0])

I appreciate you joining the conversation, Andrew. Do you think CrashPlan corrupted this database? I can't think of anything else I could have done that would've caused such a drastic failure, and I would like to know so I don't repeat the blunder in the future. This has been a royal PITA.

JS
JS
2016-Jan-03 10:38 UTC
[Samba] Samba 4 AD - Samba Fails to Start, hdb_samba4_create_kdc (setup KDC database) failed
Rowland penny <rpenny <at> samba.org> writes:

> One of your problems is that you are using the stock Ubuntu samba, this
> is getting a bit long in the tooth now, can I suggest you use either the
> latest freely available samba from Sernet or better still, compile it
> yourself and use the latest version 4.3.3. This will get you a much
> improved fsmo.py and will also cover you for several CVEs.
>
> Rowland

A couple of questions...

1) I've downloaded the latest samba source files and obtained the backup script. From what I can tell, Ubuntu's distribution of samba installs to different locations than compiling from source does, yet I haven't been able to find any references online that state exactly which directories need to be backed up to create valid archives on Ubuntu. Could you provide some insight in this regard?

2) How would I go about 'upgrading' a deployed Samba4 PDC from the version provided by Canonical to one compiled from source? Is it possible to do an 'in-place' upgrade, or will I need to uninstall, compile the new version, and then redeploy?

Thanks again for all the help.

JS
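On the first question, one way to see where a given build keeps its files is to ask the binary itself. The sketch below assumes that 'smbd -b' prints its build options with path entries in a 'NAME: value' form and that key names such as PRIVATE_DIR and STATEDIR are present; both points are assumptions to verify against the installed version, and the function names are made up for this example.

```python
import subprocess

# Keys assumed to appear in the build-options listing.
WANTED = ("CONFIGFILE", "PRIVATE_DIR", "STATEDIR", "CACHEDIR", "LOCKDIR")


def parse_build_paths(text, wanted=WANTED):
    """Pick 'NAME: value' lines for the wanted keys out of the listing."""
    paths = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key in wanted:
            paths[key] = value.strip()
    return paths


def build_paths(binary="smbd"):
    """Run '<binary> -b' and return the parsed path settings."""
    proc = subprocess.run(
        [binary, "-b"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return parse_build_paths(proc.stdout)
```

Comparing the result of `build_paths()` on the packaged install with the same call against a source-built binary would show which directories differ between the two, and therefore which ones a backup script has to be pointed at.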