thr3ads.net - freebsd stable - Status of PCIe Hotplug? [Sep 2016]

Eric van Gyzen

2016-Sep-27 15:51 UTC

Status of PCIe Hotplug?

On 09/27/2016 08:57, Borja Marcos wrote:
> 
>> On 27 Sep 2016, at 15:48, Jan Henrik Sylvester <me at janh.de> wrote:
>> 
>> On 09/27/2016 12:16, Borja Marcos wrote:
>>> I have noticed that the GENERIC kernel in 11-STABLE includes the
>>> PCI_HP option, and the hotplug bits seem to be present in the
>>> kernel, but I don't see any userland support for it.
>>> 
>>> Is it somewhat complete and in that case am I missing something?
>> 
>> I do not know what kind of userland support you mean. I just tried:
>> 
>> Plugging in my USB 3.0 ExpressCard while 11.0 is running, the
>> controller was detected and I was able to use USB devices with it.
>> Great.
> 
> Thanks :)
> 
> I was hoping (and I assume it's the ultimate goal of the project) to
> be able to hot plug PCIe devices such as NVMe drives.
> 
> On Solaris you can replace them provided you power them off
> previously (there's a command for that, "hotplug").
> 
> On FreeBSD I've tried using devctl but powering off, disabling a
> device and enabling it again has led to a panic.
> 
> Interestingly, I disabled nvme0 using devctl and "nvmecontrol
> devlist" didn't find any nvme controllers despite having 10
> controllers and 10 drives. However, the ZFS pool of 10 NVMe drives
> was working happily. Degraded of course, with one NVMe missing.

To my knowledge, all the necessary PCIe-layer code is present.  However,
that's just one layer:  Many drivers will likely need changes in order
to cope with surprise removal of their devices.

For that reason, HotPlug needs a lot of testing on a variety of
platforms.  The FreeBSD developer base is much smaller than its user
base, of course, so the variety of our testing is rather limited.  You
can help immensely by giving us detailed bug reports, either on a
mailing list or in Bugzilla.  For a panic, the panic messages and stack
trace of the current thread will be very helpful.  Complete crashinfo(8)
output would be great.

The most relevant userland tool is devctl, followed closely by devinfo
and pciconf.

In the case of Jan's USB 3.0 ExpressCard, it's possible that one or all
of the USB controller drivers (xhci, ehci, uhci) didn't cope with the
surprise removal of the controller.  Before removing the card, try
detaching the driver(s) with "devctl detach xhciN".  There might be more
than one device.  Use "pciconf -lc" to find the HotPlug-capable pcib
devices (bridges).  Use devinfo to find which one is your ExpressCard
slot and find all the devices attached to it.  Then use devctl to detach
the devices.  There could be a tree of devices; in that case, you can
usually start at the level immediately under pcibN; you don't need to
detach every device from the bottom up.  Once all the devices are
detached, you should be able to remove the card safely.
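
The detach-before-removal steps above could be scripted roughly as
follows.  This is only a sketch, not something posted in the thread: the
helper names (run, show_bridges, show_tree, detach_all) and the DRY_RUN
variable are made up for illustration, and xhci0 is just an example
device name.  By default it prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: prints the commands by default instead of running them.
# Set DRY_RUN=0 (as root, on FreeBSD) to execute them for real.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: echo the command in dry-run mode, else run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1: list PCI devices with their capabilities; look for the
# HotPlug-capable pcib bridges in the output.
show_bridges() {
    run pciconf -lc
}

# Step 2: show the device tree to see what is attached under pcibN.
show_tree() {
    run devinfo -v
}

# Step 3: detach each device named on the command line, stopping at
# the first failure so the card is not pulled with a driver attached.
detach_all() {
    for dev in "$@"; do
        run devctl detach "$dev" || return 1
    done
}
```

For example, "detach_all xhci0" in the default dry-run mode only prints
the "devctl detach xhci0" command, so you can check the device list
before doing it for real.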

Eric

Jan Henrik Sylvester

2016-Sep-28 07:59 UTC

On 09/27/2016 17:51, Eric van Gyzen wrote:
> In the case of Jan's USB 3.0 ExpressCard, it's possible that one or all
> of the USB controller drivers (xhci, ehci, uhci) didn't cope with the
> surprise removal of the controller.  Before removing the card, try
> detaching the driver(s) with "devctl detach xhciN".  There might be more
> than one device.  Use "pciconf -lc" to find the HotPlug-capable pcib
> devices (bridges).  Use devinfo to find which one is your ExpressCard
> slot and find all the devices attached to it.  Then use devctl to detach
> the devices.  There could be a tree of devices; in that case, you can
> usually start at the level immediately under pcibN; you don't need to
> detach every device from the bottom up.  Once all the devices are
> detached, you should be able to remove the card safely.

If I do "devctl detach xhci0" before removing the USB 3.0 ExpressCard,
there is no panic, the device gets detached properly, and I can
reconnect it later.

However, the mechanism holding the ExpressCard is not completely
reliable: the third time I inserted the card, it did not hold, and I
got a panic because it was immediately ejected without a devctl detach.

Due to the card not holding firmly, I often pulled it out together with
the USB device on 10.3-RELEASE and never got a panic. I guess it is a
regression in how the USB driver deals with sudden loss of the device.

The panic message is below. I guess I should take this discussion to
freebsd-usb@ (CCed).

Thanks,
Jan Henrik


Fatal trap 9: general protection fault while in kernel mode
cpuid = 1; apic id = 01
instruction pointer		= 0x20:0xffffffff80b1549c
stack pointer			= 0x28:0xfffffe022f62ca00
frame pointer			= 0x28:0xfffffe022f62ca70
code segment			= base 0x0, limit 0xfffff, type 0x1b
				= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags		= interrupt enabled, resume, IOPL = 0
current process			= 14 (usbus1)
trap number                     = 9
panic: general protection fault
cpuid = 1
KDB: stack backtrace:
#0 0xffffffff80b24077 at kdb_backtrace+0x67
#1 0xffffffff80ad93e2 at vpanic+0x182
#2 0xffffffff80ad9253 at panic+0x43
#3 0xffffffff80fa0d31 at trap_fatal+0x351
#4 0xffffffff80fa09c8 at trap+0x768
#5 0xffffffff80f84141 at calltrap+0x8
#6 0xffffffff808f2f63 at usb_detach_device+0xf3
#7 0xffffffff808f1d5b at usb_unconfigure+0x2b
#8 0xffffffff808f5623 at usb_free_device+0x103
#9 0xffffffff808f58b1 at usb_bus_detach+0x161
#10 0xffffffff80903e95 at usb_process+0x125
#11 0xffffffff80a90055 at fork_exit+0x85
#12 0xffffffff80f8467e at fork_trampoline+0xe
Uptime: 18m27s
Automatic reboot in 15 seconds - press a key on the console to abort

Borja Marcos

2016-Sep-28 09:41 UTC

> On 27 Sep 2016, at 17:51, Eric van Gyzen <vangyzen at FreeBSD.org> wrote:
> 
> 
> To my knowledge, all the necessary PCIe-layer code is present.  However,
> that's just one layer:  Many drivers will likely need changes in order
> to cope with surprise removal of their devices.

Thank you very much, that's what I needed to know :) I saw that the bits were
indeed present, but I was wondering whether I should expect it to work or not.

> For that reason, HotPlug needs a lot of testing on a variety of
> platforms.  The FreeBSD developer base is much smaller than its user
> base, of course, so the variety of our testing is rather limited.  You
> can help immensely by giving us detailed bug reports, either on a
> mailing list or in Bugzilla.  For a panic, the panic messages and stack
> trace of the current thread will be very helpful.  Complete crashinfo(8)
> output would be great.

Of course. Unfortunately, due to poor timing and a DOA server last month, this
server is in a countdown to get into production tomorrow running Solaris, but
I'll try to get whatever I can today.

Thanks!

Borja.
