bikas gurung
2007-Sep-18 03:50 UTC
[Fedora-directory-users] help....unable to start fedora server
Hi all,
I'm certainly in deep s*&#t now. I just updated my file server with new
updates and patches and tried to reboot it, but it hung with a kernel
panic. I had to shut the system down manually and run 'fsck' by hand
afterwards. Everything seemed to run well after that. But this evening I
found that I could not connect my PC to the file server. When I checked,
it turned out that the 'slapd' daemon had not started at all. I tried to
start the server manually using the scripts (in /rc.d/init.d) but got an
error. Here is what was written to the log file:
Fedora-Directory/1.0.2 B2006.060.1928
isec-file:636 (/opt/fedora-ds/slapd-isec-file)
[17/Sep/2007:20:52:06 -0500] - Fedora-Directory/1.0.2 B2006.060.1928 starting up
[17/Sep/2007:20:52:06 -0500] - Detected Disorderly Shutdown last time
Directory Server was running, recovering database.
[17/Sep/2007:20:52:06 -0500] - libdb: Ignoring log file:
/opt/fedora-ds/slapd-isec-file/db/log.0000000206: magic number 0, not 40988
[17/Sep/2007:20:52:06 -0500] - libdb: Invalid log file: log.0000000206:
Invalid argument
[17/Sep/2007:20:52:06 -0500] - libdb: PANIC: Invalid argument
[17/Sep/2007:20:52:06 -0500] - libdb: PANIC: DB_RUNRECOVERY: Fatal error,
run database recovery
[17/Sep/2007:20:52:06 -0500] - Database Recovery Process FAILED. The
database is not recoverable. err=-30978: DB_RUNRECOVERY: Fatal error, run
database recovery
[17/Sep/2007:20:52:06 -0500] - Please make sure there is enough disk space
for dbcache (10485760 bytes) and db region files
[17/Sep/2007:20:52:06 -0500] - start: Failed to init database, err=-30978
DB_RUNRECOVERY: Fatal error, run database recovery
[17/Sep/2007:20:52:06 -0500] - Failed to start database plugin ldbm database
[17/Sep/2007:20:52:06 -0500] - WARNING: ldbm instance userRoot already
exists
[17/Sep/2007:20:52:06 -0500] - WARNING: ldbm instance NetscapeRoot already
exists
[17/Sep/2007:20:52:06 -0500] binder-based resource limits -
nsLookThroughLimit: parameter error (slapi_reslimit_register() already
registered)
[17/Sep/2007:20:52:06 -0500] - start: Resource limit registration failed
[17/Sep/2007:20:52:06 -0500] - Failed to start database plugin ldbm database
[17/Sep/2007:20:52:06 -0500] - Error: Failed to resolve plugin dependencies
[17/Sep/2007:20:52:06 -0500] - Error: preoperation plugin 7-bit check is not
started
[17/Sep/2007:20:52:06 -0500] - Error: accesscontrol plugin ACL Plugin is not
started
[17/Sep/2007:20:52:06 -0500] - Error: preoperation plugin ACL preoperation
is not started
[17/Sep/2007:20:52:06 -0500] - Error: postoperation plugin Class of Service
is not started
[17/Sep/2007:20:52:06 -0500] - Error: preoperation plugin HTTP Client is not
started
[17/Sep/2007:20:52:06 -0500] - Error: database plugin ldbm database is not
started
[17/Sep/2007:20:52:06 -0500] - Error: object plugin Legacy Replication
Plugin is not started
[17/Sep/2007:20:52:06 -0500] - Error: object plugin Multimaster Replication
Plugin is not started
[17/Sep/2007:20:52:06 -0500] - Error: postoperation plugin Roles Plugin is
not started
[17/Sep/2007:20:52:06 -0500] - Error: object plugin Views is not started
As all the client machines depend on this server for authentication, and
the weekend is still far away, I'm in big trouble now. I'm quite clueless
about what to do and would really appreciate any kind of help. And no,
unfortunately I don't have a backup to fall back on.
Thanking you in advance
bikas
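
The libdb lines in the log above report "magic number 0" for log.0000000206, which is what a file whose beginning has been overwritten with zero bytes would produce. The following is a minimal Python sketch, not an official Fedora DS or Berkeley DB tool, that lists the log.* files in a database directory whose first bytes are all zero. The function name `zero_header_logs` and the 64-byte probe size are assumptions made for illustration; the probe size is not a Berkeley DB constant. The script only reads files and changes nothing.

```python
from pathlib import Path


def zero_header_logs(db_dir, probe_bytes=64):
    """Return the names of log.* files that are empty or start with only zero bytes.

    Heuristic only: a file flagged here cannot carry a non-zero magic
    number at its start, but a file that passes may still be damaged.
    """
    suspects = []
    for path in sorted(Path(db_dir).glob("log.*")):
        if not path.is_file():
            continue
        with path.open("rb") as fh:
            head = fh.read(probe_bytes)
        # any() is False for an empty read and for a run of zero bytes.
        if not any(head):
            suspects.append(path.name)
    return suspects
```

Calling `zero_header_logs("/opt/fedora-ds/slapd-isec-file/db")` (the path shown in the log) would return a list of suspect file names. An empty list does not prove the remaining files are intact.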
Steven Jones
2007-Sep-18 04:21 UTC
RE: [Fedora-directory-users] help....unable to start fedora server
Not knowing a huge amount about FDS/LDAP, I'd start by checking the
OS. E.g.,
[17/Sep/2007:20:52:06 -0500] - Please make sure there is enough disk
space for dbcache (10485760 bytes) and db region files
suggests to me checking the filesystem with df -h to make sure there is
space left. Possibly there is a core dump or something that needs
deleting; that is rare on Linux but not unknown on Solaris.
Or maybe some mount point failed to mount because the OS considered it too
damaged, so make sure all the filesystems are mounted.
Beyond this I cannot help, sorry.
Making no backups, or at least not exporting the database, is hopefully
something you will not do again.
regards
Steven Jones
Senior Linux/Unix/San/Vmware System Administrator
APG -Technology Integration Team
Victoria University of Wellington
Phone: +64 4 463 6272
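
The two OS-level checks suggested in this reply (free space and mounted filesystems) can also be scripted. The sketch below is only an illustration in Python using the standard library; the helper names `has_room` and `mount_point_of` are hypothetical, and the 10485760-byte figure is simply the dbcache size quoted in the error log. It does roughly what reading `df -h` output by hand would tell you.

```python
import os
import shutil

DBCACHE_BYTES = 10485760  # dbcache size quoted in the error log


def has_room(path, needed=DBCACHE_BYTES):
    """Return True if the filesystem holding path has at least `needed` free bytes."""
    return shutil.disk_usage(path).free >= needed


def mount_point_of(path):
    """Walk upward from path until a mount point is reached and return it."""
    path = os.path.realpath(path)
    while not os.path.ismount(path):
        path = os.path.dirname(path)
    return path
```

If `/opt` is supposed to live on its own partition and `mount_point_of("/opt/fedora-ds")` returns `/`, that partition is probably not mounted. Whether `/opt` is a separate partition on this particular server is not stated in the thread.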
G Venkataraman
2007-Sep-18 05:28 UTC
Re: [Fedora-directory-users] help....unable to start fedora server
Hi,
The error:
[17/Sep/2007:20:52:06 -0500] - libdb: Ignoring log file: /opt/fedora-ds/slapd-isec-file/db/log.0000000206: magic number 0, not 40988
indicates that the Berkeley DB backend could not use the log file log.0000000206 because it is not a valid Berkeley DB log file. Since you mentioned that you had to shut down the system manually and run fsck when it came back up, one possibility is that log.0000000206 (and maybe more files) has been corrupted. Have you checked the lost+found directory for any recovered files?
In any case, I would recommend that before you do any more troubleshooting with the server, you take a snapshot (tarball) of the affected directory tree (/opt/fedora-ds and any other directories you can think of as belonging to the directory server) and store the tarball separately (in another directory or even on another machine, for example). This would be useful if you need to go back and start your troubleshooting over with a different approach. Of course, if the files are already corrupt, I am not sure how useful the snapshot will be.
Check whether everything is fine at the system level. Look back in the directory server error log to see what types of errors showed up when the directory server first tried to start after the system reboot. Check the system log to make sure that things are fine.
Finally, see whether, by chance, you took any LDIF dumps of the directory server data at any point in the past. Or maybe the file system (or the whole system) was backed up for some other purpose. Do you have just one directory server instance running (i.e., only one master and no replicas/consumers)?
PS: A couple of things that could have helped in this scenario are regular backups of the system and regular backups of the directory server data (db2ldif.pl <http://www.redhat.com/docs/manuals/dir-server/cli/scripts.htm#pgfId-26364>).
Also, another system (or a virtual machine) that is part of a development or test environment, and that is similar to this production server in setup and operation, would be useful to have so that things can be tested on it first before being deployed into production.
-=Venkat=-
gvenkat@gmail.com
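
The snapshot step recommended in this reply (archive the directory tree before any further troubleshooting) could look roughly like the following. This is a sketch in Python using only the standard `tarfile` module, not a Fedora DS utility. The function name `snapshot` and the timestamped file name are assumptions made for illustration, and the destination should be outside the tree being archived, ideally on another disk or machine.

```python
import tarfile
import time
from pathlib import Path


def snapshot(src_dir, dest_dir):
    """Write a gzip-compressed tar archive of src_dir into dest_dir and return its path.

    dest_dir must not be inside src_dir, or the archive would try to
    include itself.
    """
    src = Path(src_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest / f"{src.name}-{stamp}.tar.gz"
    with tarfile.open(str(archive), "w:gz") as tar:
        # arcname stores paths relative to the directory name, not the full path.
        tar.add(str(src), arcname=src.name)
    return archive
```

For the server in this thread the call would be something like `snapshot("/opt/fedora-ds", "/some/other/disk")`, where the second path is a placeholder. The archive preserves the files exactly as they are now, including any that are already damaged, so it protects against making things worse but does not repair anything.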