On Monday, October 18, 2010 4:42:05 pm mdf@freebsd.org wrote:
> When we moved to FreeBSD 7 from 6, issuing a kldunload for usb devices
> started causing a kernel panic if a USB device was still plugged in
> (like a keyboard). The kldunload is done as part of an rc.d script
> that unloads usb since it's not generally needed by our product unless
> we mounted the root volume from a USB stick.
>
> The order doesn't matter much, but doing:
>
> kldunload ucom
> kldunload umass
> kldunload usb
>
> panics with this stack:
>
> panic @ time 1287356740.252, thread 0xffffff0016bd64a0: Fatal trap 12:
> page fault while in kernel mode
>
> cpuid = 2
>
> Stack: --------------------------------------------------
> kernel:trap_fatal+0xac
> kernel:trap_pfault+0x24c
> kernel:trap+0x3d9
> kernel:pmap_kextract+0x70
> kernel:free+0xcd
> usb.ko:usb_disconnect_port+0xbd
> usb.ko:uhub_detach+0xd2
> kernel:device_detach+0xb3
> kernel:device_delete_child+0x98
> kernel:device_delete_child+0x66
> usb.ko:uhci_pci_detach+0xac
> kernel:device_detach+0xb3
> kernel:devclass_delete_driver+0xde
> kernel:driver_module_handler+0x11c
> kernel:module_unload+0x41
> kernel:linker_file_unload+0x19a
> kernel:kern_kldunload+0x10a
> kernel:isi_syscall+0x98
> kernel:ia32_syscall+0x1cd
> --------------------------------------------------
>
> cpuid = 2; apic id = 12
> fault virtual address = 0xffff80403037b7a8
> fault code = supervisor read data, page not present
> stack pointer = 0x10:0xffffff8bfe2d0450
> frame pointer = 0x10:0xffffff8bfe2d0470
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
>
>
> The problem is that device_delete_child will recursively call
> device_delete_child before calling device_detach(). When
> device_detach resolves to uhub_detach, it attempts to call
> usb_disconnect_port() which will iterate over the subdevs array. Each
> of the pointers of the subdevs array is pointing into already-free'd
> storage; the free(9) came from the recursive call to
> device_delete_child(). In this case the code is trying to dereference
> 0xdeadc0dedeadc0de since this is INVARIANTS with the malloc poisoning
> on free(9).
>
> So questions:
>
> (1) is there a simple fix, like defining a devclass_t for the port
> device, and having it do a detach method cleanup instead of
> uhub_detach()? I wasn't sure what to put in for the match method,
> though.

I think uhub_detach() should use device_get_children() instead of trying
to maintain its own list of child devices.

If uhub really does need to maintain its own list of children, then a better
fix would be to add a 'bus_device_deleted()' callback from
device_delete_child() to the parent bus device to let it know a device had
been removed, but that would be tricky to use since in many cases a bus
device is what invokes device_delete_child() in the first place.
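
To illustrate the device_get_children() suggestion, here is a rough userland
model, not kernel code and not a patch against usb.ko. Everything in it
(struct dev, dev_add_child(), dev_get_children(), dev_delete_child(),
hub_detach()) is a hypothetical stand-in that only mimics the ordering
described above: children are deleted first, then the parent's detach runs.
Because the detach routine asks the bus for the current list instead of
walking a private array, it never sees a pointer that has already been freed.

```c
/*
 * Userland model only.  All names here are hypothetical stand-ins for the
 * newbus routines discussed above; this is not the kernel implementation.
 */
#include <assert.h>
#include <stdlib.h>

#define MAX_CHILDREN 8

struct dev {
	struct dev *parent;
	struct dev *children[MAX_CHILDREN];	/* owned by the "bus" layer */
	int nchildren;
};

/* Allocate a device and, if a parent is given, link it as a child. */
struct dev *
dev_add_child(struct dev *parent)
{
	struct dev *child = calloc(1, sizeof(*child));

	assert(child != NULL);
	if (parent != NULL) {
		assert(parent->nchildren < MAX_CHILDREN);
		child->parent = parent;
		parent->children[parent->nchildren++] = child;
	}
	return (child);
}

/* Copy out the bus's current view of the children; the caller frees it. */
int
dev_get_children(struct dev *dev, struct dev ***listp, int *countp)
{
	struct dev **list;
	int i;

	/* +1 so that a device with no children still gets a valid buffer. */
	list = calloc((size_t)dev->nchildren + 1, sizeof(*list));
	if (list == NULL)
		return (-1);
	for (i = 0; i < dev->nchildren; i++)
		list[i] = dev->children[i];
	*listp = list;
	*countp = dev->nchildren;
	return (0);
}

/* Models a detach method that queries the bus rather than a private array. */
int
hub_detach(struct dev *hub)
{
	struct dev **list;
	int count, i, seen = 0;

	if (dev_get_children(hub, &list, &count) != 0)
		return (-1);
	for (i = 0; i < count; i++) {
		/* A real driver would tear down the port for list[i] here. */
		seen++;
	}
	free(list);
	return (seen);		/* number of children that still existed */
}

/* Delete grandchildren first, then detach, then unlink and free. */
int
dev_delete_child(struct dev *parent, struct dev *child)
{
	int i, seen;

	while (child->nchildren > 0)
		(void)dev_delete_child(child,
		    child->children[child->nchildren - 1]);

	seen = hub_detach(child);

	/* Unlink so that later queries never return this pointer. */
	for (i = 0; i < parent->nchildren; i++) {
		if (parent->children[i] == child) {
			parent->children[i] =
			    parent->children[--parent->nchildren];
			break;
		}
	}
	free(child);
	return (seen);
}
```

In this model, calling hub_detach() on a hub with two children attached
reports two children, while dev_delete_child() on the same hub reports zero
from the detach step, since the children are already unlinked by then. In the
kernel, the list returned by device_get_children() has to be released with
free(list, M_TEMP).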
--
John Baldwin