thr3ads.net - freebsd stable - Problem with USB drive errors in recent 7-Stable [Nov 2008]

Kevin Oberman

2008-Nov-07 13:21 UTC

Problem with USB drive errors in recent 7-Stable

I recently started getting errors on a fairly new USB connected SATA
drive. Aside from the errors, the system was locking up as any process
attempting to access the drive would lock up in disk uninterruptible
wait ("D" in ps). I could not shut down the system and had to power it
off. (It's a laptop.) After a reboot, I tried to fsck it and that locked
up, too. I was able to recover by telling fsck to not fix the truncated
inode and fix everything else. Then I ran fsck again and it was
successful in fixing the inode. This happened several times.
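
A minimal sketch of how the stuck processes described above could be listed, assuming a ps that accepts the pid, state, wchan and command output keywords (FreeBSD's ps does). The function names parse_disk_wait and list_disk_wait are illustrative, not part of any existing tool.

```python
import subprocess


def parse_disk_wait(ps_output):
    """Return (pid, state, wchan, command) for every process whose state
    begins with 'D' (uninterruptible disk wait).

    Expects the text produced by `ps -ax -o pid,state,wchan,command`,
    including its header row.
    """
    stuck = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 3)         # command may contain spaces
        if len(fields) < 4:
            continue                         # ignore malformed rows
        pid, state, wchan, command = fields
        if state.startswith("D"):
            stuck.append((int(pid), state, wchan, command))
    return stuck


def list_disk_wait():
    """Run ps and return the processes currently in disk wait."""
    result = subprocess.run(
        ["ps", "-ax", "-o", "pid,state,wchan,command"],
        capture_output=True, text=True, check=True,
    )
    return parse_disk_wait(result.stdout)
```

Calling list_disk_wait() from another terminal while the drive is hung would show which processes are blocked and, via the wchan column, what they are waiting on. That is only a diagnostic aid; it does not unblock anything.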

I then bought a new drive and got the identical behavior! It was not the
drive. I rolled my kernel back to 9/13/08 and tried again. This time it just
worked! No errors or lock up.

I suspect that there are two issues. One results in the lock-up when the
disk had errors and the other caused the purported disk errors. The
latter has been introduced since 9/13/08. The kernel that produced the
errors was from 10/21. I also ran a kernel from 10/8 which did not cause
me problems, but I'm not sure that I used the USB drive with this
kernel.

I'll be building a 10/8 kernel later, after I have backed up some data
from a failing drive (PATA, not USB, and SMART confirms that this
disk is sick). I will try to track down exactly which change triggered
this ugly behavior, but that will take a number of kernel builds, so it
will take a while.
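
A rough sketch of how such a search could be organized as a bisection over source dates, so that the number of kernel builds grows with the logarithm of the date range rather than linearly. It assumes a single regression (every kernel before some date is good, every kernel from that date on is bad). The is_bad callback is hypothetical: it stands in for checking out sources as of a date, building and booting that kernel, and exercising the USB drive by hand.

```python
from datetime import date, timedelta


def first_bad_date(good, bad, is_bad):
    """Bisect between a known-good and a known-bad source date.

    good, bad -- datetime.date values with good < bad
    is_bad    -- hypothetical callback: date -> True if a kernel built
                 from sources of that date shows the problem
    Returns (earliest bad date found, number of kernels tested).
    """
    tested = 0
    while (bad - good).days > 1:
        mid = good + timedelta(days=(bad - good).days // 2)
        tested += 1
        if is_bad(mid):
            bad = mid    # regression is at or before mid
        else:
            good = mid   # regression is after mid
    return bad, tested


if __name__ == "__main__":
    # Simulated run: pretend the regression landed on an arbitrary date.
    pretend = date(2008, 10, 15)
    print(first_bad_date(date(2008, 9, 13), date(2008, 10, 21),
                         lambda d: d >= pretend))
```

With the 9/13 kernel as the good end and the 10/21 kernel as the bad end, the range is 38 days, so at most six builds are needed under these assumptions. If the 10/8 kernel is confirmed good with the USB drive, it can replace 9/13 as the starting point and shorten the search.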

Has anyone else seen this? Any ideas on which changes might be the most
likely cause? Could be USB, CAM, or something else, I guess.
--
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: oberman@es.net Phone: +1 510 486-8634
Key fingerprint: 059B 2DDF 031C 9BA3 14A4 EADA 927D EBB3 987B 3751

Jeremy Chadwick

2008-Nov-07 14:01 UTC

Problem with USB drive errors in recent 7-Stable

On Fri, Nov 07, 2008 at 01:21:48PM -0800, Kevin Oberman wrote:
> I recently started getting errors on a fairly new USB connected SATA
> drive. Aside from the errors, the system was locking up as any process
> attempting to access the drive would lock up in disk uninterruptible
> wait ("D" in ps). I could not shut down the system and had to power it
> off. (It's a laptop.) After a reboot, I tried to fsck it and that locked
> up, too. I was able to recover by telling fsck to not fix the truncated
> inode and fix everything else. Then I ran fsck again and it was
> successful in fixing the inode. This happened several times.
>
> I then bought a new drive and got the identical behavior! It was not the
> drive. I rolled my kernel back to 9/13/08 and tried again. This time it just
> worked! No errors or lock up.
>
> I suspect that there are two issues. One results in the lock-up when the
> disk had errors and the other caused the purported disk errors. The
> latter has been introduced since 9/13/08. The kernel that produced the
> errors was from 10/21. I also ran a kernel from 10/8 which did not cause
> me problems, but I'm not sure that I used the USB drive with this
> kernel.
>
> I'll be building a 10/8 kernel later, after I have backed up some data
> from a failing drive (PATA, not USB, and SMART confirms that this
> disk is sick). I will try to track down exactly which change triggered
> this ugly behavior, but that will take a number of kernel builds, so it
> will take a while.
>
> Has anyone else seen this? Any ideas on which changes might be the most
> likely cause? Could be USB, CAM, or something else, I guess.

Funny you should post this today -- I just spent the past few days
dealing with this problem, specifically the kernel being "stuck" when
writing to a umass/da device (in my case, USB flash drives).

When I say "stuck", I mean the kernel was still responsive: Ctrl-T would
report statuses in processes (the states shown were all different) but
the processes essentially had "hung".  Ctrl-Alt-Esc on the console
dropped me to a db> prompt, so it's not as if the machine had
frozen/locked up; it was as if some part surrounding the storage
subsystem was spinning in a loop.  IP traffic still worked as well, but
of course anything that accessed disks would hang.  Rebooting the box
via Ctrl-Alt-Del wouldn't work, because it would get stuck waiting for a
bunch of PIDs to end.

I switched the box to CURRENT (for a lot of reasons), and one of those
was to try out the new USB4BSD (called "USB2" -- not to be confused with
the USB2.0 protocol) stack.  That simply induced a random kernel panic.
However, HPS is fairly certain he found the issue, and it's with
bus_dma(9) interaction.  Here's the thread:

http://lists.freebsd.org/pipermail/freebsd-current/2008-November/thread.html#235
http://lists.freebsd.org/pipermail/freebsd-current/2008-November/000220.html

I have not yet tried his patches (I just woke up), but I will in a short
while.  So far I have a lot more faith in USB4BSD than I do in the old
stack, simply because there's active work going on in it.

(It's ironic that I encountered this issue while working on a document
describing how to put FreeBSD i386, amd64, and MS-DOS on a USB flash
drive, so one could install FreeBSD from it, or boot MS-DOS for BIOS
upgrades.)

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
