On Thu, Feb 13, 2020 at 10:56:00AM -0500, Michael S. Tsirkin wrote:
> On Thu, Feb 13, 2020 at 11:51:54AM -0400, Jason Gunthorpe wrote:
> > > That bus is exactly what Greg KH proposed. There are other ways
> > > to solve this I guess but this bikeshedding is getting tiring.
> >
> > This discussion was for a different goal, IMHO.
>
> Hmm couldn't find it anymore. What was the goal there in your opinion?

I think it was largely talking about how to model things like
ADI/SF/etc, plus stuff got very confused when the discussion tried to
explain what mdev's role was vs the driver core.

The standard driver model is that a 'bus' driver provides the HW access
(think PCI level things), and a 'hw driver' attaches to the bus
device, and instantiates a 'subsystem device' (think netdev, rdma,
etc) using some per-subsystem XXX_register(). The 'hw driver' pulls in
functions from the 'subsystem' using a combination of callbacks and
library-style calls so there is no code duplication.

As a subsystem, vhost&vdpa should expect its 'HW driver' to bind to
devices on busses, for instance I would expect:

- A future SF/ADI/'virtual bus' as a child of a multi-functional PCI
  device. Exactly how this works is still under active discussion and
  is one place where Greg said 'use a bus'.

- An existing PCI, platform, or other bus and device. No need for an
  extra bus here, PCI is the bus.

- No bus, ie for a simulator or binding to a netdev. (existing vhost?)

The point is that the HW driver's job is to adapt from the bus-level
interfaces (eg readl/writel) to the subsystem level (eg something like
the vdpa_ops).

For instance that Intel driver should be a pci_driver to bind to a
struct pci_device for its VF and then call some 'vhost&vdpa'
_register() function to pass its ops to the subsystem, which in turn
creates the struct device of the subsystem, the common char devices,
sysfs, etc, and calls the driver's ops in response to uAPI calls.

This is already almost how things were set up in v2 of the patches,
near as I can see, just that a bus was inserted somehow instead of
having only the vhost class. So it was confusing and the lifetime
model becomes too complicated to implement correctly...

Jason
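To make the layering described above concrete, here is a minimal sketch with
invented names: 'foo' stands in for whatever the vhost/vdpa subsystem would
export, so foo_register(), struct foo_ops, the register offset and the PCI IDs
are all hypothetical. The only point it illustrates is that the pci_driver
adapts bus-level access (readl/writel on a BAR) into the ops it hands to the
subsystem, which then owns the class device, char dev, sysfs and uAPI lifetime.

#include <linux/module.h>
#include <linux/pci.h>
#include <linux/io.h>
#include <linux/err.h>

/* These would come from the (hypothetical) subsystem's header. */
struct foo_device;
struct foo_ops {
	int (*start)(void *priv);
	void (*stop)(void *priv);
};
struct foo_device *foo_register(struct device *parent,
				const struct foo_ops *ops, void *priv);
void foo_unregister(struct foo_device *fdev);

/* The HW driver: adapts PCI/MMIO access into struct foo_ops. */
struct myhw {
	void __iomem *bar;
	struct foo_device *fdev;
};

static int myhw_start(void *priv)
{
	struct myhw *hw = priv;

	writel(1, hw->bar + 0x0);	/* hypothetical "run" register */
	return 0;
}

static void myhw_stop(void *priv)
{
	struct myhw *hw = priv;

	writel(0, hw->bar + 0x0);
}

static const struct foo_ops myhw_foo_ops = {
	.start = myhw_start,
	.stop  = myhw_stop,
};

static int myhw_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
	struct myhw *hw;
	int rc;

	rc = pcim_enable_device(pdev);
	if (rc)
		return rc;

	rc = pcim_iomap_regions(pdev, 0x1, "myhw");	/* BAR 0 */
	if (rc)
		return rc;

	hw = devm_kzalloc(&pdev->dev, sizeof(*hw), GFP_KERNEL);
	if (!hw)
		return -ENOMEM;
	hw->bar = pcim_iomap_table(pdev)[0];

	/*
	 * Hand the device to the subsystem; it creates the subsystem's
	 * struct device, char devs, sysfs, etc, and calls back into
	 * myhw_foo_ops in response to uAPI.
	 */
	hw->fdev = foo_register(&pdev->dev, &myhw_foo_ops, hw);
	if (IS_ERR(hw->fdev))
		return PTR_ERR(hw->fdev);

	pci_set_drvdata(pdev, hw);
	return 0;
}

static void myhw_remove(struct pci_dev *pdev)
{
	struct myhw *hw = pci_get_drvdata(pdev);

	foo_unregister(hw->fdev);
}

static const struct pci_device_id myhw_ids[] = {
	{ PCI_DEVICE(0x1af4, 0x1041) },	/* placeholder IDs */
	{ }
};

static struct pci_driver myhw_driver = {
	.name     = "myhw",
	.id_table = myhw_ids,
	.probe    = myhw_probe,
	.remove   = myhw_remove,
};
module_pci_driver(myhw_driver);
MODULE_LICENSE("GPL");

In this split the HW driver never implements uAPI or lifetime rules itself;
that is exactly the "library-style calls plus callbacks" division of labour
the mail argues for.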
On 2020/2/14 12:24 AM, Jason Gunthorpe wrote:
> On Thu, Feb 13, 2020 at 10:56:00AM -0500, Michael S. Tsirkin wrote:
>> On Thu, Feb 13, 2020 at 11:51:54AM -0400, Jason Gunthorpe wrote:
>>>> That bus is exactly what Greg KH proposed. There are other ways
>>>> to solve this I guess but this bikeshedding is getting tiring.
>>> This discussion was for a different goal, IMHO.
>> Hmm couldn't find it anymore. What was the goal there in your opinion?
> I think it was largely talking about how to model things like
> ADI/SF/etc, plus stuff got very confused when the discussion tried to
> explain what mdev's role was vs the driver core.
>
> The standard driver model is that a 'bus' driver provides the HW access
> (think PCI level things), and a 'hw driver' attaches to the bus
> device,

This is not true; the kernel already has plenty of virtual buses where
virtual devices and drivers can be attached. Besides mdev and virtio,
you can see vop, rpmsg, visorbus, etc.

> and instantiates a 'subsystem device' (think netdev, rdma,
> etc) using some per-subsystem XXX_register().

Well, if you go through the virtio spec, we support ~20 types of
different devices. Classes like netdev and rdma are correct since they
have a clear set of semantics of their own. But grouping network and
scsi into a single class looks wrong; that's the work of a virtual bus.

The class should be done on top of the vDPA device instead of on the
vDPA device itself:

- For kernel drivers, netdev or blk devices could be done on top.
- For userspace drivers, the class could be done by the drivers inside
  the VM or in userspace (dpdk).

> The 'hw driver' pulls in
> functions from the 'subsystem' using a combination of callbacks and
> library-style calls so there is no code duplication.

The point is we want vDPA devices to be used by different subsystems,
not only vhost but also netdev, blk, crypto (every subsystem that can
use virtio devices). That's why we introduce a vDPA bus and introduce
different drivers on top.

> As a subsystem, vhost&vdpa should expect its 'HW driver' to bind to
> devices on busses, for instance I would expect:
>
> - A future SF/ADI/'virtual bus' as a child of a multi-functional PCI
>   device. Exactly how this works is still under active discussion and
>   is one place where Greg said 'use a bus'.

That's ok, but it's something that is not directly related to vDPA; it
can be implemented on top of any kind of device/bus (see the driver-side
sketch below):

struct XXX_device {
	struct vdpa_device vdpa;
	struct pci_dev *lowerdev;	/* or an ADI/SF device */
};

...

> - An existing PCI, platform, or other bus and device. No need for an
>   extra bus here, PCI is the bus.

There are several examples where a bus is needed on top.

A good example is the Mellanox TmFIFO driver, which is a platform device
driver but registers itself as a virtio device in order to be used by
the virtio-console driver on the virtio bus.

But it's a pity that the device can not be used by a userspace driver
due to the limitation of the virtio bus, which is designed for kernel
drivers. That's why the vDPA bus is introduced: it abstracts the common
requirements of both kernel and userspace drivers, which allows a single
HW driver to be used by kernel drivers (and the subsystems on top) and
userspace drivers.

> - No bus, ie for a simulator or binding to a netdev. (existing vhost?)

Note that the simulator can have its own class (sysfs etc.).

> The point is that the HW driver's job is to adapt from the bus-level
> interfaces (eg readl/writel) to the subsystem level (eg something like
> the vdpa_ops).
>
> For instance that Intel driver should be a pci_driver to bind to a
> struct pci_device for its VF and then call some 'vhost&vdpa'
> _register() function to pass its ops to the subsystem, which in turn
> creates the struct device of the subsystem, the common char devices,
> sysfs, etc, and calls the driver's ops in response to uAPI calls.
>
> This is already almost how things were set up in v2 of the patches,
> near as I can see, just that a bus was inserted somehow instead of
> having only the vhost class.

Well, the series (plus the mdev part) has used a bus since day 0. It's
not something new.

Thanks

> So it was confusing and the lifetime
> model becomes too complicated to implement correctly...
>
> Jason
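The split Jason Wang describes, where the HW driver embeds a struct
vdpa_device and separate class drivers bind to it on the vDPA bus, would make
the driver side look roughly like the sketch below. struct vdpa_driver and
module_vdpa_driver() follow the shape of the bus proposed in this series; the
exact names and signatures were still under discussion, so read this as an
illustration of "different drivers on top", not a settled API, and my_vdpa_*
is invented.

#include <linux/module.h>
#include <linux/vdpa.h>

/*
 * A class driver on the vDPA bus: it could bridge the device into
 * virtio for in-kernel users, or expose it to userspace via vhost.
 */
static int my_vdpa_probe(struct vdpa_device *vdev)
{
	/* set up the netdev/blk/vhost frontend for this vDPA device */
	return 0;
}

static void my_vdpa_remove(struct vdpa_device *vdev)
{
	/* tear down whatever probe created */
}

static struct vdpa_driver my_vdpa_driver = {
	.driver = {
		.name = "my_vdpa",
	},
	.probe  = my_vdpa_probe,
	.remove = my_vdpa_remove,
};
module_vdpa_driver(my_vdpa_driver);
MODULE_LICENSE("GPL");

The HW side is the struct XXX_device above: embed the struct vdpa_device,
fill in vdpa_ops that translate to readl/writel or vendor commands, and
register the device on the bus so that either this kind of kernel driver or
the vhost/userspace path can bind to it.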
On Fri, Feb 14, 2020 at 12:05:32PM +0800, Jason Wang wrote:
> > The standard driver model is that a 'bus' driver provides the HW access
> > (think PCI level things), and a 'hw driver' attaches to the bus
> > device,
>
> This is not true; the kernel already has plenty of virtual buses where
> virtual devices and drivers can be attached. Besides mdev and virtio,
> you can see vop, rpmsg, visorbus, etc.

Sure, but those are not connecting HW into the kernel..

> > and instantiates a 'subsystem device' (think netdev, rdma,
> > etc) using some per-subsystem XXX_register().
>
> Well, if you go through the virtio spec, we support ~20 types of
> different devices. Classes like netdev and rdma are correct since they
> have a clear set of semantics of their own. But grouping network and
> scsi into a single class looks wrong; that's the work of a virtual bus.

rdma also has about 20 different types of things it supports on top of
the generic ib_device.

The central point in RDMA is the 'struct ib_device', which is a device
class. You can discover all RDMA devices by looking in
/sys/class/infiniband/

It has an internal bus-like thing (which probably should have been an
actual bus, but this was done 15 years ago) which allows other
subsystems to have drivers to match and bind their own drivers to the
struct ib_device.

So you'd have a chain like:

  struct pci_device -> struct ib_device -> [ib client bus thing] -> struct net_device

And the various char devs are created by clients connecting to the
ib_device and creating char devs on their own classes.

Since ib_devices are multi-queue we can have all 20 devices running
concurrently and there are various schemes to manage when the various
things are created.

> > The 'hw driver' pulls in
> > functions from the 'subsystem' using a combination of callbacks and
> > library-style calls so there is no code duplication.
>
> The point is we want vDPA devices to be used by different subsystems,
> not only vhost but also netdev, blk, crypto (every subsystem that can
> use virtio devices). That's why we introduce a vDPA bus and introduce
> different drivers on top.

See the other mail; it seems struct virtio_device serves this purpose
already, so I'm confused why a struct vdpa_device and another bus is
being introduced.

> There are several examples where a bus is needed on top.
>
> A good example is the Mellanox TmFIFO driver, which is a platform device
> driver but registers itself as a virtio device in order to be used by
> the virtio-console driver on the virtio bus.

How is that another bus? The platform bus is the HW bus, the TmFIFO is
the HW driver, and virtio_device is the subsystem. This seems
reasonable/normal so far..

> But it's a pity that the device can not be used by a userspace driver
> due to the limitation of the virtio bus, which is designed for kernel
> drivers. That's why the vDPA bus is introduced: it abstracts the common
> requirements of both kernel and userspace drivers, which allows a single
> HW driver to be used by kernel drivers (and the subsystems on top) and
> userspace drivers.

Ah! Maybe this is the source of all this strangeness - the userspace
driver is something parallel to the struct virtio_device instead of
being a consumer of it?? That certainly would mess up the driver model
quite a lot.

Then you want to add another bus to switch between vhost and struct
virtio_device? But only for vdpa? But as you point out something like
TmFIFO is left hanging. Seems like the wrong abstraction point..

Jason
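For reference, the RDMA client pattern mentioned above (the "[ib client bus
thing]") looks roughly like the sketch below. struct ib_client,
ib_register_client() and ib_set_client_data() are the real ib_core hooks, but
the callback signatures have changed across kernel versions (add() later
gained an int return, for example), and the my_client_* names and per-device
struct are invented, so treat this as a sketch of the pattern rather than
copy-paste code.

#include <linux/module.h>
#include <linux/slab.h>
#include <rdma/ib_verbs.h>

struct my_client_dev {
	struct ib_device *ibdev;
	/* per-device state, e.g. a netdev or a char dev on our own class */
};

static struct ib_client my_client;

/* Called once for every struct ib_device in the system. */
static void my_client_add(struct ib_device *ibdev)
{
	struct my_client_dev *cdev = kzalloc(sizeof(*cdev), GFP_KERNEL);

	if (!cdev)
		return;
	cdev->ibdev = ibdev;
	/* a real client would register its netdev/char dev here */
	ib_set_client_data(ibdev, &my_client, cdev);
}

static void my_client_remove(struct ib_device *ibdev, void *client_data)
{
	kfree(client_data);
}

static struct ib_client my_client = {
	.name   = "my_client",
	.add    = my_client_add,
	.remove = my_client_remove,
};

static int __init my_client_init(void)
{
	return ib_register_client(&my_client);
}
module_init(my_client_init);

static void __exit my_client_exit(void)
{
	ib_unregister_client(&my_client);
}
module_exit(my_client_exit);

MODULE_LICENSE("GPL");

Every registered client is offered every ib_device, so netdev, char-dev and
other consumers can all sit on the same hardware device concurrently, which
is the multiplexing the mail above describes.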