On Thu, Feb 13, 2020 at 10:56:00AM -0500, Michael S. Tsirkin wrote:
> On Thu, Feb 13, 2020 at 11:51:54AM -0400, Jason Gunthorpe wrote:
> > > That bus is exactly what Greg KH proposed. There are other ways
> > > to solve this I guess but this bikeshedding is getting tiring.
> >
> > This discussion was for a different goal, IMHO.
>
> Hmm couldn't find it anymore. What was the goal there in your opinion?

I think it was largely talking about how to model things like
ADI/SF/etc, plus stuff got very confused when the discussion tried to
explain what mdev's role was vs the driver core.

The standard driver model is that a 'bus' driver provides the HW access
(think PCI level things), and a 'hw driver' attaches to the bus
device, and instantiates a 'subsystem device' (think netdev, rdma,
etc) using some per-subsystem XXX_register(). The 'hw driver' pulls in
functions from the 'subsystem' using a combination of callbacks and
library-style calls so there is no code duplication.

As a subsystem, vhost&vdpa should expect its 'HW driver' to bind to
devices on busses, for instance I would expect:

- A future SF/ADI/'virtual bus' as a child of a multi-functional PCI
  device. Exactly how this works is still under active discussion and
  is one place where Greg said 'use a bus'.

- An existing PCI, platform, or other bus and device. No need for an
  extra bus here, PCI is the bus.

- No bus, ie for a simulator or binding to a netdev. (existing vhost?)

The point is that the HW driver's job is to adapt from the bus-level
interfaces (eg readl/writel) to the subsystem level (eg something like
the vdpa_ops).

For instance that Intel driver should be a pci_driver to bind to a
struct pci_device for its VF and then call some 'vhost&vdpa'
_register() function to pass its ops to the subsystem, which in turn
creates the struct device of the subsystem, the common char devices,
sysfs, etc, and calls the driver's ops in response to uAPI calls.

This is already almost how things were set up in v2 of the patches,
near as I can see, just that a bus was inserted somehow instead of
having only the vhost class. So it was confusing and the lifetime
model becomes too complicated to implement correctly...

Jason
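To make the layering described above concrete, here is a minimal sketch with
invented names: 'foo' stands in for whatever the vhost/vdpa subsystem would
export, so foo_register(), struct foo_ops, the register offset and the PCI IDs
are all hypothetical. The only point it illustrates is that the pci_driver
adapts bus-level access (readl/writel on a BAR) into the ops it hands to the
subsystem, which then owns the class device, char dev, sysfs and uAPI lifetime.

#include <linux/module.h>
#include <linux/pci.h>
#include <linux/io.h>
#include <linux/err.h>

/* These would come from the (hypothetical) subsystem's header. */
struct foo_device;
struct foo_ops {
	int (*start)(void *priv);
	void (*stop)(void *priv);
};
struct foo_device *foo_register(struct device *parent,
				const struct foo_ops *ops, void *priv);
void foo_unregister(struct foo_device *fdev);

/* The HW driver: adapts PCI/MMIO access into struct foo_ops. */
struct myhw {
	void __iomem *bar;
	struct foo_device *fdev;
};

static int myhw_start(void *priv)
{
	struct myhw *hw = priv;

	writel(1, hw->bar + 0x0);	/* hypothetical "run" register */
	return 0;
}

static void myhw_stop(void *priv)
{
	struct myhw *hw = priv;

	writel(0, hw->bar + 0x0);
}

static const struct foo_ops myhw_foo_ops = {
	.start = myhw_start,
	.stop  = myhw_stop,
};

static int myhw_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
	struct myhw *hw;
	int rc;

	rc = pcim_enable_device(pdev);
	if (rc)
		return rc;

	rc = pcim_iomap_regions(pdev, 0x1, "myhw");	/* BAR 0 */
	if (rc)
		return rc;

	hw = devm_kzalloc(&pdev->dev, sizeof(*hw), GFP_KERNEL);
	if (!hw)
		return -ENOMEM;
	hw->bar = pcim_iomap_table(pdev)[0];

	/*
	 * Hand the device to the subsystem; it creates the subsystem's
	 * struct device, char devs, sysfs, etc, and calls back into
	 * myhw_foo_ops in response to uAPI.
	 */
	hw->fdev = foo_register(&pdev->dev, &myhw_foo_ops, hw);
	if (IS_ERR(hw->fdev))
		return PTR_ERR(hw->fdev);

	pci_set_drvdata(pdev, hw);
	return 0;
}

static void myhw_remove(struct pci_dev *pdev)
{
	struct myhw *hw = pci_get_drvdata(pdev);

	foo_unregister(hw->fdev);
}

static const struct pci_device_id myhw_ids[] = {
	{ PCI_DEVICE(0x1af4, 0x1041) },	/* placeholder IDs */
	{ }
};

static struct pci_driver myhw_driver = {
	.name     = "myhw",
	.id_table = myhw_ids,
	.probe    = myhw_probe,
	.remove   = myhw_remove,
};
module_pci_driver(myhw_driver);
MODULE_LICENSE("GPL");

In this split the HW driver never implements uAPI or lifetime rules itself;
that is exactly the "library-style calls plus callbacks" division of labour
the mail argues for.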
On 2020/2/14 12:24 AM, Jason Gunthorpe wrote:
> On Thu, Feb 13, 2020 at 10:56:00AM -0500, Michael S. Tsirkin wrote:
>> On Thu, Feb 13, 2020 at 11:51:54AM -0400, Jason Gunthorpe wrote:
>>>> That bus is exactly what Greg KH proposed. There are other ways
>>>> to solve this I guess but this bikeshedding is getting tiring.
>>> This discussion was for a different goal, IMHO.
>> Hmm couldn't find it anymore. What was the goal there in your opinion?
> I think it was largely talking about how to model things like
> ADI/SF/etc, plus stuff got very confused when the discussion tried to
> explain what mdev's role was vs the driver core.
>
> The standard driver model is that a 'bus' driver provides the HW access
> (think PCI level things), and a 'hw driver' attaches to the bus
> device,

This is not true; the kernel already has plenty of virtual buses where
virtual devices and drivers can be attached. Besides mdev and virtio,
you can see vop, rpmsg, visorbus, etc.

> and instantiates a 'subsystem device' (think netdev, rdma,
> etc) using some per-subsystem XXX_register().

Well, if you go through the virtio spec, we support ~20 types of
different devices. Classes like netdev and rdma are correct since they
have a clear set of semantics of their own. But grouping network and
scsi into a single class looks wrong; that's the work of a virtual bus.

The class should be done on top of the vDPA device instead of on the
vDPA device itself:

- For kernel drivers, netdev or blk devices could be done on top.
- For userspace drivers, the class could be done by the drivers inside
  the VM or in userspace (dpdk).

> The 'hw driver' pulls in
> functions from the 'subsystem' using a combination of callbacks and
> library-style calls so there is no code duplication.

The point is we want vDPA devices to be used by different subsystems,
not only vhost but also netdev, blk, crypto (every subsystem that can
use virtio devices). That's why we introduce a vDPA bus and introduce
different drivers on top.

> As a subsystem, vhost&vdpa should expect its 'HW driver' to bind to
> devices on busses, for instance I would expect:
>
> - A future SF/ADI/'virtual bus' as a child of a multi-functional PCI
>   device. Exactly how this works is still under active discussion and
>   is one place where Greg said 'use a bus'.

That's ok, but it's something that is not directly related to vDPA; it
can be implemented on top of any kind of device/bus (see the driver-side
sketch below):

struct XXX_device {
	struct vdpa_device vdpa;
	struct pci_dev *lowerdev;	/* or an ADI/SF device */
};

...

> - An existing PCI, platform, or other bus and device. No need for an
>   extra bus here, PCI is the bus.

There are several examples where a bus is needed on top.

A good example is the Mellanox TmFIFO driver, which is a platform device
driver but registers itself as a virtio device in order to be used by
the virtio-console driver on the virtio bus.

But it's a pity that the device can not be used by a userspace driver
due to the limitation of the virtio bus, which is designed for kernel
drivers. That's why the vDPA bus is introduced: it abstracts the common
requirements of both kernel and userspace drivers, which allows a single
HW driver to be used by kernel drivers (and the subsystems on top) and
userspace drivers.

> - No bus, ie for a simulator or binding to a netdev. (existing vhost?)

Note that the simulator can have its own class (sysfs etc.).

> The point is that the HW driver's job is to adapt from the bus-level
> interfaces (eg readl/writel) to the subsystem level (eg something like
> the vdpa_ops).
>
> For instance that Intel driver should be a pci_driver to bind to a
> struct pci_device for its VF and then call some 'vhost&vdpa'
> _register() function to pass its ops to the subsystem, which in turn
> creates the struct device of the subsystem, the common char devices,
> sysfs, etc, and calls the driver's ops in response to uAPI calls.
>
> This is already almost how things were set up in v2 of the patches,
> near as I can see, just that a bus was inserted somehow instead of
> having only the vhost class.

Well, the series (plus the mdev part) has used a bus since day 0. It's
not something new.

Thanks

> So it was confusing and the lifetime
> model becomes too complicated to implement correctly...
>
> Jason
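The split Jason Wang describes, where the HW driver embeds a struct
vdpa_device and separate class drivers bind to it on the vDPA bus, would make
the driver side look roughly like the sketch below. struct vdpa_driver and
module_vdpa_driver() follow the shape of the bus proposed in this series; the
exact names and signatures were still under discussion, so read this as an
illustration of "different drivers on top", not a settled API, and my_vdpa_*
is invented.

#include <linux/module.h>
#include <linux/vdpa.h>

/*
 * A class driver on the vDPA bus: it could bridge the device into
 * virtio for in-kernel users, or expose it to userspace via vhost.
 */
static int my_vdpa_probe(struct vdpa_device *vdev)
{
	/* set up the netdev/blk/vhost frontend for this vDPA device */
	return 0;
}

static void my_vdpa_remove(struct vdpa_device *vdev)
{
	/* tear down whatever probe created */
}

static struct vdpa_driver my_vdpa_driver = {
	.driver = {
		.name = "my_vdpa",
	},
	.probe  = my_vdpa_probe,
	.remove = my_vdpa_remove,
};
module_vdpa_driver(my_vdpa_driver);
MODULE_LICENSE("GPL");

The HW side is the struct XXX_device above: embed the struct vdpa_device,
fill in vdpa_ops that translate to readl/writel or vendor commands, and
register the device on the bus so that either this kind of kernel driver or
the vhost/userspace path can bind to it.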
On Fri, Feb 14, 2020 at 12:05:32PM +0800, Jason Wang wrote:
> > The standard driver model is that a 'bus' driver provides the HW access
> > (think PCI level things), and a 'hw driver' attaches to the bus
> > device,
>
> This is not true; the kernel already has plenty of virtual buses where
> virtual devices and drivers can be attached. Besides mdev and virtio,
> you can see vop, rpmsg, visorbus, etc.

Sure, but those are not connecting HW into the kernel..

> > and instantiates a 'subsystem device' (think netdev, rdma,
> > etc) using some per-subsystem XXX_register().
>
> Well, if you go through the virtio spec, we support ~20 types of
> different devices. Classes like netdev and rdma are correct since they
> have a clear set of semantics of their own. But grouping network and
> scsi into a single class looks wrong; that's the work of a virtual bus.

rdma also has about 20 different types of things it supports on top of
the generic ib_device.

The central point in RDMA is the 'struct ib_device', which is a device
class. You can discover all RDMA devices by looking in
/sys/class/infiniband/

It has an internal bus-like thing (which probably should have been an
actual bus, but this was done 15 years ago) which allows other
subsystems to have drivers to match and bind their own drivers to the
struct ib_device.

So you'd have a chain like:

  struct pci_device -> struct ib_device -> [ib client bus thing] -> struct net_device

And the various char devs are created by clients connecting to the
ib_device and creating char devs on their own classes.

Since ib_devices are multi-queue we can have all 20 devices running
concurrently and there are various schemes to manage when the various
things are created.

> > The 'hw driver' pulls in
> > functions from the 'subsystem' using a combination of callbacks and
> > library-style calls so there is no code duplication.
>
> The point is we want vDPA devices to be used by different subsystems,
> not only vhost but also netdev, blk, crypto (every subsystem that can
> use virtio devices). That's why we introduce a vDPA bus and introduce
> different drivers on top.

See the other mail; it seems struct virtio_device serves this purpose
already, so I'm confused why a struct vdpa_device and another bus is
being introduced.

> There are several examples where a bus is needed on top.
>
> A good example is the Mellanox TmFIFO driver, which is a platform device
> driver but registers itself as a virtio device in order to be used by
> the virtio-console driver on the virtio bus.

How is that another bus? The platform bus is the HW bus, the TmFIFO is
the HW driver, and virtio_device is the subsystem. This seems
reasonable/normal so far..

> But it's a pity that the device can not be used by a userspace driver
> due to the limitation of the virtio bus, which is designed for kernel
> drivers. That's why the vDPA bus is introduced: it abstracts the common
> requirements of both kernel and userspace drivers, which allows a single
> HW driver to be used by kernel drivers (and the subsystems on top) and
> userspace drivers.

Ah! Maybe this is the source of all this strangeness - the userspace
driver is something parallel to the struct virtio_device instead of
being a consumer of it?? That certainly would mess up the driver model
quite a lot.

Then you want to add another bus to switch between vhost and struct
virtio_device? But only for vdpa? But as you point out something like
TmFIFO is left hanging. Seems like the wrong abstraction point..

Jason
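For reference, the RDMA client pattern mentioned above (the "[ib client bus
thing]") looks roughly like the sketch below. struct ib_client,
ib_register_client() and ib_set_client_data() are the real ib_core hooks, but
the callback signatures have changed across kernel versions (add() later
gained an int return, for example), and the my_client_* names and per-device
struct are invented, so treat this as a sketch of the pattern rather than
copy-paste code.

#include <linux/module.h>
#include <linux/slab.h>
#include <rdma/ib_verbs.h>

struct my_client_dev {
	struct ib_device *ibdev;
	/* per-device state, e.g. a netdev or a char dev on our own class */
};

static struct ib_client my_client;

/* Called once for every struct ib_device in the system. */
static void my_client_add(struct ib_device *ibdev)
{
	struct my_client_dev *cdev = kzalloc(sizeof(*cdev), GFP_KERNEL);

	if (!cdev)
		return;
	cdev->ibdev = ibdev;
	/* a real client would register its netdev/char dev here */
	ib_set_client_data(ibdev, &my_client, cdev);
}

static void my_client_remove(struct ib_device *ibdev, void *client_data)
{
	kfree(client_data);
}

static struct ib_client my_client = {
	.name   = "my_client",
	.add    = my_client_add,
	.remove = my_client_remove,
};

static int __init my_client_init(void)
{
	return ib_register_client(&my_client);
}
module_init(my_client_init);

static void __exit my_client_exit(void)
{
	ib_unregister_client(&my_client);
}
module_exit(my_client_exit);

MODULE_LICENSE("GPL");

Every registered client is offered every ib_device, so netdev, char-dev and
other consumers can all sit on the same hardware device concurrently, which
is the multiplexing the mail above describes.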