thr3ads.net - Ext3 users - Interesting interaction between journal recovery and slow boots [Mar 2001]

If this information is useful, please help other people find it:
Share via:

Brian J. Murrell

2001-Mar-20 21:53 UTC

Interesting interaction between journal recovery and slow boots

For some time now I have been puzzled as to why certain portions
of my system boot were quite slow -- but only after journal
recoveries.

I was fearing that there was some ugly interaction between the
recovery and the use of the journal shortly afterward but alas that is
not the case.  So just in case anybody else is seeing this problem and
decides to try to hunt it down, let me save you some time.

I am using an 0.0.6b ext3 kernel on a Red Hat 7.0 system.

I was finding that if my system did not get shutdown nicely, upon
reboot it would of course do journal recoveries.  When it was
starting the system services after the recoveries, the "portmapper"
and "nfslock" startup scripts were taking for ever to complete -- up
to a minute or two each.

After much investigation (strace and "set -x" in bash scripts) and
many times hitting the power switch) I finally figured it out.  The
portmapper and nfslock scripts are the first ones started after
syslogd and klogd are started.  During the journal recovery there
are copious quantities of messages such as:

Mar 20 12:27:20 brian-laptop fsck: JFS DEBUG: (recovery.c, 366): do_one_pass:  
Mar 20 12:27:20 brian-laptop fsck: Scanning for sequence ID 484035 at 9590/10000
Mar 20 12:27:20 brian-laptop fsck: JFS DEBUG: (recovery.c, 366): do_one_pass:
Scanning for sequence ID 484036 at 9591/10000
Mar 20 12:27:20 brian-laptop fsck: JFS DEBUG: (recovery.c, 366): do_one_pass:
Scanning for sequence ID 484036 at 9593/10000
Mar 20 12:27:20 brian-laptop fsck: JFS DEBUG: (recovery.c, 366): do_one_pass:
Scanning for sequence ID 484037 at 9594/10000
Mar 20 12:27:20 brian-laptop fsck: JFS DEBUG: (recovery.c, 366): do_one_pass:
Scanning for sequence ID 484037 at 9596/10000
Mar 20 12:27:20 brian-laptop fsck: JFS DEBUG: (recovery.c, 366): do_one_pass:
Scanning for sequence ID 484038 at 9597/10000
Mar 20 12:27:20 brian-laptop fsck: JFS DEBUG: (recovery.c, 366): do_one_pass:
Scanning for sequence ID 484038 at 9599/10000
Mar 20 12:27:20 brian-laptop fsck: JFS DEBUG: (recovery.c, 366): do_one_pass:
Scanning for sequence ID 484039 at 9600/10000
Mar 20 12:27:20 brian-laptop fsck: JFS DEBUG: (recovery.c, 366): do_one_pass:
Scanning for sequence ID 484039 at 9602/10000
Mar 20 12:27:20 brian-laptop fsck: JFS DEBUG: (recovery.c, 366): do_one_pass:
Scanning for sequence ID 484040 at 9603/10000
Mar 20 12:27:20 brian-laptop fsck: JFS DEBUG: (recovery.c, 366): do_one_pass:
Scanning for sequence ID 484040 at 9609/10000
Mar 20 12:27:20 brian-laptop fsck: JFS DEBUG: (recovery.c, 366): do_one_pass:
Scanning for sequence ID 484041 at 9610/10000
Mar 20 12:27:20 brian-laptop fsck: JFS DEBUG: (recovery.c, 366): do_one_pass:
Scanning for sequence ID 484041 at 9612/10000
Mar 20 12:27:20 brian-laptop fsck: JFS DEBUG: (recovery.c, 366): do_one_pass:
Scanning for sequence ID 484042 at 9613/10000

diplayed.

On Red Hat systems (and others I am sure) the startup scripts use a
package called initlog to capture all of the console output (including
the above) and "syslog" it (to /var/log/boot.log on Red Hat 7.0
systems).  Well, the copious amounts of JFS records were flooding the
syslog server and causing it to back up, blocking further attemtps to
write to syslog while the back up was being flushed to disk.

Every startup script uses initlog to write the output of the startup
to syslog, so the scripts immediatley following the start of the
syslogger were blocking, presumably due to syslogd's buffer being
full.

b.

Andreas Dilger

2001-Mar-20 22:29 UTC

head link

Re: Interesting interaction between journal recovery and slow boots

Brian Murrell writes:> For some time now I have been puzzled as to why certain portions
> of my system boot were quite slow -- but only after journal
> recoveries.
> 
> After much investigation (strace and "set -x" in bash scripts)
and
> many times hitting the power switch) I finally figured it out.  The
> portmapper and nfslock scripts are the first ones started after
> syslogd and klogd are started.  During the journal recovery there
> are copious quantities of messages such as:
> 
> Mar 20 12:27:20 brian-laptop fsck: JFS DEBUG: (recovery.c, 366):
do_one_pass:
> Mar 20 12:27:20 brian-laptop fsck: Scanning for sequence ID 484035 at
9590/10000
> Mar 20 12:27:20 brian-laptop fsck: JFS DEBUG: (recovery.c, 366):
do_one_pass: Scanning for sequence ID 484036 at 9591/10000
> Mar 20 12:27:20 brian-laptop fsck: JFS DEBUG: (recovery.c, 366):
do_one_pass: Scanning for sequence ID 484036 at 9593/10000
This is purely a user-space issue, coming from e2fsck it appears.  I have
no such copious logs on my system.  Maybe you should undefine the JFS_DEBUG
in e2fsprogs/include/linux/jfs.h?  I seem to recall that there was a problem
with jfs.h, where it enabled JFS_DEBUG whether in kernel mode or not.  There
is a configure option "--enable-jfs-debug" which should set JFS_DEBUG
for us,
so no need to always enable it in jfs.h.

Cheers, Andreas
-- 
Andreas Dilger  \ "If a man ate a pound of pasta and a pound of antipasto,
                 \  would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/               -- Dogbert

Stephen C. Tweedie

2001-Mar-20 23:33 UTC

head link

Re: Interesting interaction between journal recovery and slow boots

Hi,

On Tue, Mar 20, 2001 at 01:53:01PM -0800, Brian J. Murrell
wrote:> 
> I am using an 0.0.6b ext3 kernel on a Red Hat 7.0 system.
Which e2fsprogs?  I think my last e2fsck release should have
jfs_enable_debug set to zero by default already.

Cheers,
 Stephen

Brian J. Murrell

2001-Mar-21 01:32 UTC

head link

Re: Interesting interaction between journal recovery and slow boots

On Tue, Mar 20, 2001 at 11:33:33PM +0000, Stephen C. Tweedie
wrote:> Hi,
> 
> On Tue, Mar 20, 2001 at 01:53:01PM -0800, Brian J. Murrell wrote:
> > 
> > I am using an 0.0.6b ext3 kernel on a Red Hat 7.0 system.
> 
> Which e2fsprogs?
# rpm -q e2fsprogs
e2fsprogs-1.20.WIP.sct-20010216

b.

Reasonably Related Threads

Search for more apparently analagous threads

Ext3 users - Mar 2001 - Interesting interaction between journal recovery and slow boots

Interesting interaction between journal recovery and slow boots

Re: Interesting interaction between journal recovery and slow boots

Re: Interesting interaction between journal recovery and slow boots

Re: Interesting interaction between journal recovery and slow boots

Reasonably Related Threads