On Thu, Feb 13, 2020 at 10:41:06AM -0500, Michael S. Tsirkin wrote:> On Thu, Feb 13, 2020 at 11:05:42AM -0400, Jason Gunthorpe wrote: > > On Thu, Feb 13, 2020 at 10:58:44PM +0800, Jason Wang wrote: > > > > > > On 2020/2/13 ??9:41, Jason Gunthorpe wrote: > > > > On Thu, Feb 13, 2020 at 11:34:10AM +0800, Jason Wang wrote: > > > > > > > > > > You have dev, type or > > > > > > class to choose from. Type is rarely used and doesn't seem to be used > > > > > > by vdpa, so class seems the right choice > > > > > > > > > > > > Jason > > > > > Yes, but my understanding is class and bus are mutually exclusive. So we > > > > > can't add a class to a device which is already attached on a bus. > > > > While I suppose there are variations, typically 'class' devices are > > > > user facing things and 'bus' devices are internal facing (ie like a > > > > PCI device) > > > > > > > > > Though all vDPA devices have the same programming interface, but the > > > semantic is different. So it looks to me that use bus complies what > > > class.rst said: > > > > > > " > > > > > > Each device class defines a set of semantics and a programming interface > > > that devices of that class adhere to. Device drivers are the > > > implementation of that programming interface for a particular device on > > > a particular bus. > > > > > > " > > > > Here we are talking about the /dev/XX node that provides the > > programming interface. All the vdpa devices have the same basic > > chardev interface and discover any semantic variations 'in band' > > > > > > So why is this using a bus? VDPA is a user facing object, so the > > > > driver should create a class vhost_vdpa device directly, and that > > > > driver should live in the drivers/vhost/ directory. > > > > > > This is because we want vDPA to be generic for being used by different > > > drivers which is not limited to vhost-vdpa. E.g in this series, it allows > > > vDPA to be used by kernel virtio drivers. And in the future, we will > > > probably introduce more drivers in the future. > > > > I don't see how that connects with using a bus. > > > > Every class of virtio traffic is going to need a special HW driver to > > enable VDPA, that special driver can create the correct vhost side > > class device. > > That's just a ton of useless code duplication, and a good chance > to have minor variations in implementations confusing > userspace.What? Why? This is how almost every user of the driver core works. I don't see how you get any duplication unless the subsystem core is badly done wrong. The 'class' is supposed to provide all the library functions to remove this duplication. Instead of plugging the HW driver in via some bus scheme every subsystem has its own 'ops' that the HW driver provides to the subsystem's class via subsystem_register() This is the *standard* pattern to use the driver core. This is almost there, it just has this extra bus part to convey the HW ops instead of directly.> Instead, each device implement the same interface, and then > vhost sits on top.Sure, but plugging in via ops/etc not via a bus and another struct device.> That bus is exactly what Greg KH proposed. There are other ways > to solve this I guess but this bikeshedding is getting tiring.This discussion was for a different goal, IMHO.> Come on it's an internal kernel interface, if we feel > it was a wrong direction to take we can change our minds later. > Main thing is getting UAPI right.This discussion started because the refcounting has been busted up in every version posted so far. It is not bikeshedding when these bugs are actually being caused by trying to abuse the driver core and shoehorn in a bus that isn't needed when a class is the correct thing to use. Jason
On Thu, Feb 13, 2020 at 11:51:54AM -0400, Jason Gunthorpe wrote:> > That bus is exactly what Greg KH proposed. There are other ways > > to solve this I guess but this bikeshedding is getting tiring. > > This discussion was for a different goal, IMHO.Hmm couldn't find it anymore. What was the goal there in your opinion? -- MST
On Thu, Feb 13, 2020 at 11:51:54AM -0400, Jason Gunthorpe wrote:> The 'class' is supposed to provide all the library functions to remove > this duplication. Instead of plugging the HW driver in via some bus > scheme every subsystem has its own 'ops' that the HW driver provides > to the subsystem's class via subsystem_register()Hmm I'm not familiar with subsystem_register. A grep didn't find it in the kernel either ... -- MST
On Thu, Feb 13, 2020 at 10:59:34AM -0500, Michael S. Tsirkin wrote:> On Thu, Feb 13, 2020 at 11:51:54AM -0400, Jason Gunthorpe wrote: > > The 'class' is supposed to provide all the library functions to remove > > this duplication. Instead of plugging the HW driver in via some bus > > scheme every subsystem has its own 'ops' that the HW driver provides > > to the subsystem's class via subsystem_register() > > Hmm I'm not familiar with subsystem_register. A grep didn't find it > in the kernel either ...I mean it is the registration function provided by the subsystem that owns the class, for instance tpm_chip_register(), ib_register_device(), register_netdev(), rtc_register_device() etc So if you have some vhost (vhost net?) class then you'd have some vhost_vdpa_init/alloc(); vhost_vdpa_register(), sequence presumably. (vs trying to do it with a bus matcher) I recommend to look at rtc and tpm for fairly simple easy to follow patterns for creating a subsystem in the kernel. A subsystem owns a class, allows HW drivers to plug in to it, and provides a consistent user API via a cdev/sysfs/etc. The driver model class should revolve around the char dev and sysfs uABI - if you enumerate the devices on the class then they should all follow the char dev and sysfs interfaces contract of that class. Those examples show how to do all the refcounting semi-sanely, introduce sysfs, cdevs, etc. I thought the latest proposal was to use the existing vhost class and largely the existing vhost API, so it probably just needs to make sure the common class-wide stuff is split from the 'driver' stuff of the existing vhost to netdev. Jason
On Thu, Feb 13, 2020 at 10:56:00AM -0500, Michael S. Tsirkin wrote:> On Thu, Feb 13, 2020 at 11:51:54AM -0400, Jason Gunthorpe wrote: > > > That bus is exactly what Greg KH proposed. There are other ways > > > to solve this I guess but this bikeshedding is getting tiring. > > > > This discussion was for a different goal, IMHO. > > Hmm couldn't find it anymore. What was the goal there in your opinion?I think it was largely talking about how to model things like ADI/SF/etc, plus stuff got very confused when the discussion tried to explain what mdev's role was vs the driver core. The standard driver model is a 'bus' driver provides the HW access (think PCI level things), and a 'hw driver' attaches to the bus device, and instantiates a 'subsystem device' (think netdev, rdma, etc) using some per-subsystem XXX_register(). The 'hw driver' pulls in functions from the 'subsystem' using a combination of callbacks and library-style calls so there is no code duplication. As a subsystem, vhost&vdpa should expect its 'HW driver' to bind to devices on busses, for instance I would expect: - A future SF/ADI/'virtual bus' as a child of multi-functional PCI device Exactly how this works is still under active discussion and is one place where Greg said 'use a bus'. - An existing PCI, platform, or other bus and device. No need for an extra bus here, PCI is the bus. - No bus, ie for a simulator or binding to a netdev. (existing vhost?) They point is that the HW driver's job is to adapt from the bus level interfaces (eg readl/writel) to the subsystem level (eg something like the vdpa_ops). For instance that Intel driver should be a pci_driver to bind to a struct pci_device for its VF and then call some 'vhost&vdpa' _register() function to pass its ops to the subsystem which in turn creates the struct device of the subsystem calls, common char devices, sysfs, etc and calls the driver's ops in response to uAPI calls. This is already almost how things were setup in v2 of the patches, near as I can see, just that a bus was inserted somehow instead of having only the vhost class. So it iwas confusing and the lifetime model becomes too complicated to implement correctly... Jason