thr3ads.net - freebsd stable - [6.1-PRERELEASE/amd64] Kernel panic during heavy UFS traffic [Mar 2006]


Jason Harmening

2006-Mar-16 18:47 UTC

[6.1-PRERELEASE/amd64] Kernel panic during heavy UFS traffic

Last night I ran into a series of kernel panics that seemed to be related to
heavy UFS traffic.  I ran into two consecutive panics when trying to mount a
UFS-formatted DVD-RAM as a regular user (though not when I mounted it as
root).  The system seemed to actually succeed in mounting the disk, as it
was marked dirty after the ensuing panic.  Upon rebooting after the second
panic, I saw another two consecutive panics which happened whenever I tried
to do something fairly disk-intensive (e.g. starting the X server + KDE)
while the bgfsck was still running from the last panic.  Ultimately I
rebooted in single-user mode, ran fsck manually, and have experienced no
further panics.  I suspect these panics may be related to UFS deadlocks, as
in all cases the application that was attempting disk access hung for
several seconds before the panic, followed by a few seconds of total system
hang, followed by the automatic reboot.

I'm running 6.1-PRERELEASE/amd64 from 12 March on an Athlon 64 x2 (SMP) with
SCHED_ULE+PREEMPTION--dangerous combination I know, but it's been rock solid
for months until now.  If anyone is interested, I'll try to reproduce this
panic with a dump/backtrace.  It may be one of the UFS deadlock issues
that's already under investigation for 6.1-RELEASE.

Thanks,
Jason Harmening

Kris Kennaway

2006-Mar-16 19:54 UTC

On Thu, Mar 16, 2006 at 12:45:07PM -0600, Jason Harmening wrote:
> Last night I ran into a series of kernel panics that seemed to be related to
> heavy UFS traffic.  I ran into two consecutive panics when trying to mount a
> UFS-formatted DVD-RAM as a regular user (though not when I mounted it as
> root).  The system seemed to actually succeed in mounting the disk, as it
> was marked dirty after the ensuing panic.  Upon rebooting after the second
> panic, I saw another two consecutive panics which happened whenever I tried
> to do something fairly disk-intensive (e.g. starting the X server + KDE)
> while the bgfsck was still running from the last panic.  Ultimately I
> rebooted in single-user mode, ran fsck manually, and have experienced no
> further panics.  I suspect these panics may be related to UFS deadlocks, as
> in all cases the application that was attempting disk access hung for
> several seconds before the panic, followed by a few seconds of total system
> hang, followed by the automatic reboot.
>
> I'm running 6.1-PRERELEASE/amd64 from 12 March on an Athlon 64 x2 (SMP) with
> SCHED_ULE+PREEMPTION--dangerous combination I know, but it's been rock solid
> for months until now.  If anyone is interested, I'll try to reproduce this
> panic with a dump/backtrace.  It may be one of the UFS deadlock issues
> that's already under investigation for 6.1-RELEASE.

Yeah, we need a trace.

Kris

Jason Harmening

2006-Apr-01 21:49 UTC

On Saturday 18 March 2006 19:39, you wrote:
> On Sat, Mar 18, 2006 at 07:29:25PM -0600, Jason Harmening wrote:
> > I finally managed to reproduce the mount panic on the console:
> >
> > CORONA% mount /dev/acd0 /home/jason/dvdram
> > g_vfs_done():acd0[READ(offset=114688, length=16384)]error = 5
> > panic: mount: lost mount
> > cpuid = 0
> > KDB: stack backtrace:
> > kdb_backtrace() at kdb_backtrace+0x37
> > panic() at panic+0x1d1
> > vfs_domount() at vfs_domount+0x9ae
> > vfs_donmount() at vfs_donmount+0x400
> > kernel_mount() at kernel_mount+0x40
> > ffs_cmount() at ffs_cmount+0x7c
> > mount() at mount+0x1e3
> > syscall() at syscall+0x3a4
> > Xfast_syscall() at Xfast_syscall+0xa8
> > --- syscall (21, FreeBSD ELF64, mount), rip = 0x80067e0dc, rsp = 0x7fffffffdc88, rbp = 0x7fffffffe748 ---
> > Uptime: 1m34s
> > Dumping 1023 MB (2 chunks)
> >
> > I'm starting to worry this may be a hardware issue...
>
> Yes, it could well be (or bad media) - the drive returned an I/O error
> (error 5 = EIO) when you tried to mount the media.
>
> > If it is, would there be
> > a more elegant way for the OS to handle a failed removable drive mount
> > besides panicking?
>
> In principle, yes.  I don't know if there's any hope of getting it
> fixed in time for 6.1, but please file a PR with this trace.
I filed PR 94669 for this issue and finally took some time to do some further 
investigation on my own.  I've found the following:

1.  I can invariably mount the DVD-RAM successfully if I first do some 
operation on the disk that doesn't require it to be mounted (namely, an 
fsck), or if I've previously mounted successfully and haven't since ejected
the media.  I will only see the panic if I try to mount immediately after 
inserting the media, and then not 100% of the time.  This leads me to believe 
there may be some confusion between the drive, the ATAPI CD/DVD driver, and 
the VFS subsystem as to when, exactly, the drive is ready for mounting.

2.  I looked at the VFS sources for RELENG_6 and found the point at which the 
panic seems to be occurring--lines 891-892 of vfs_mount.c:

                 if (VFS_ROOT(mp, LK_EXCLUSIVE, &newdp, td))
                         panic("mount: lost mount");

So essentially the invocation of mp->mnt_op->vfs_root (in this case, I'm
guessing, whatever the vfs_root function for UFS is) returns an error.  Would
it be safe to handle this error by returning an error code instead of
panicking?  Or would this have undesirable ramifications for auto-mounted
filesystems on fixed disks, or could the failed vfs_root possibly induce
side-effects that would leave the kernel in an unstable state?
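
To make the question in point 2 concrete, here is a minimal userland sketch of the "return an error instead of panicking" pattern.  It is not kernel code and not a proposed patch: struct mock_mount, root_ok, root_eio and mock_domount are hypothetical names invented for illustration, and the single "attached" flag stands in for whatever state the real vfs_domount() would have to unwind.  Whether that unwinding is safe in the kernel is exactly the open question above.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a filesystem's vfs_root hook (not the kernel API). */
typedef int (*root_fn)(void **rootp);

struct mock_mount {
    root_fn root;      /* plays the role of the vfs_root operation */
    int     attached;  /* 1 while the mount is linked into the mock mount list */
};

/* Root lookup that succeeds and hands back a dummy "vnode". */
static int root_ok(void **rootp)
{
    static int dummy_vnode;
    *rootp = &dummy_vnode;
    return 0;
}

/* Root lookup that fails the way the DVD-RAM read did (error 5 = EIO). */
static int root_eio(void **rootp)
{
    *rootp = NULL;
    return EIO;
}

/*
 * Attach the mount, then look up its root.  On failure, undo the attach
 * and hand the error back to the caller instead of aborting.
 */
static int mock_domount(struct mock_mount *mp, void **rootp)
{
    int error;

    mp->attached = 1;
    error = mp->root(rootp);
    if (error != 0) {
        mp->attached = 0;   /* roll back the only state this sketch has */
        return error;
    }
    return 0;
}
```

With root_eio plugged in, mock_domount() returns EIO and leaves the mount detached, which is the behaviour a failed removable-media mount would ideally have.  The real code holds vnode locks and has already updated the mount list by this point, so the rollback there would be considerably more involved than one flag.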

I don't know much about the FreeBSD VFS, but I'm willing to take a crack at
fixing/testing this.
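
For testing the readiness theory in point 1 from userland, something like the following sketch might help.  It repeatedly tries to read a region of the device before any mount is attempted.  The helper name probe_read and its retry/delay parameters are my own invention; the offset and length in the usage note are simply the values from the failed g_vfs_done() read in the trace.  Reading the raw device may need root or suitable permissions on /dev/acd0.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/*
 * Try to read `length` bytes at `offset` from `path`, up to `attempts`
 * times, sleeping `delay_ms` milliseconds between failed tries.
 * Returns 0 as soon as one full read succeeds, -1 otherwise.
 */
static int probe_read(const char *path, off_t offset, size_t length,
                      int attempts, long delay_ms)
{
    char *buf = malloc(length);
    int result = -1;

    if (buf == NULL)
        return -1;

    for (int i = 0; i < attempts && result != 0; i++) {
        int fd = open(path, O_RDONLY);
        if (fd >= 0) {
            ssize_t n = pread(fd, buf, length, offset);
            close(fd);
            if (n == (ssize_t)length)
                result = 0;          /* the region is readable */
        }
        if (result != 0 && i + 1 < attempts) {
            struct timespec ts = { delay_ms / 1000,
                                   (delay_ms % 1000) * 1000000L };
            nanosleep(&ts, NULL);    /* give the drive time to settle */
        }
    }

    free(buf);
    return result;
}
```

A call such as probe_read("/dev/acd0", 114688, 16384, 10, 500) right after inserting the media would show whether the same read that failed during the mount starts succeeding after a short wait.  That would only be evidence for the timing theory, not a fix.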

Thanks,
Jason
>
> Kris
