Hi, The current virtio config_ops do not map very well to the PCI config space. Here they are:> struct virtio_config_ops > { > void *(*find)(struct virtio_device *vdev, u8 type, unsigned *len); > void (*get)(struct virtio_device *vdev, void *token, > void *buf, unsigned len); > void (*set)(struct virtio_device *vdev, void *token, > const void *buf, unsigned len); > u8 (*get_status)(struct virtio_device *vdev); > void (*set_status)(struct virtio_device *vdev, u8 status); > struct virtqueue *(*find_vq)(struct virtio_device *vdev, > bool (*callback)(struct virtqueue *)); > void (*del_vq)(struct virtqueue *vq); > };Semantically, find requires that a field have both a type and a length. With the exception of the VIRTQUEUE field used internally by lguest, type is always a unique identifier. Since virtqueue information is not a required part of the config space, it seems to me that type really should be treated as a unique identifier. find_vq also is curious in that it is stateful in it's enumeration. This adds seemingly unnecessary complexity. The result is that in my PCI virtio implementation, I have to use an opaque blob that's self describing and variable length in the PCI config space. This is not very natural at all for a PCI device! Here is how I propose changing the config_ops: struct virtio_config_ops { bool (*get)(struct virtio_device *vdev, unsigned id, void *buf, unsigned len); bool (*set)(struct virtio_device *vdev, unsigned id, const void *buf, unsigned len); u8 (*get_status)(struct virtio_device *vdev); void (*set_status)(struct virtio_device *vdev, u8 status); struct virtqueue *(*get_vq)(struct virtio_device *vdev, unsigned index, bool (*callback)(struct virtqueue *)); void (*del_vq)(struct virtqueue *vq); }; config_ops->get() returns false if the id is invalid as does ->set(). get_vq() returns NULL if the index is not a valid virtqueue (and we'll mandate that all virtqueues are in order starting at 0). The id space is defined by the driver itself but is guaranteed to be non-overlapping and starting from 0. For instance, the block device would have: /* The capacity (in 512-byte sectors). */ #define VIRTIO_CONFIG_BLK_F_CAPACITY 0x00 /* The maximum segment size. */ #define VIRTIO_CONFIG_BLK_F_SIZE_MAX 0x08 /* The maximum number of segments. */ #define VIRTIO_CONFIG_BLK_F_SEG_MAX 0x12 This maps very well to the PCI config space. For lguest, the actual config items would simply be packed in a single page. Since lguest uses the config space for virtqueues, lguest_device_desc will have to be changed to also contain an array of virtqueue infos. It doesn't prevent an implementation from using the id's as opaque keys though (as one might do for Xen since presumably the configuration data would be represented within xenbus). If there's agreement that this approach is better, I'll start submitting patches. Regards, Anthony Liguori
On Wednesday 07 November 2007 04:48:35 Anthony Liguori wrote:> Semantically, find requires that a field have both a type and a length. > With the exception of the VIRTQUEUE field used internally by lguest, > type is always a unique identifier. Since virtqueue information is not > a required part of the config space, it seems to me that type really > should be treated as a unique identifier.Hi Anthony, Not sure I get this. It is a unique identifier. You need the length to handle unknown fields.> find_vq also is curious in that it is stateful in it's enumeration.Well, they're *all* stateful. This gives a simple method of knowing what fields the guest understands: it marks the fields as it finds them. Then it sets the status, which allows the host to know when it's completed configuration reads. I like enumerating the virtqueues: it's not necessary but it's clearer.> This adds seemingly unnecessary complexity.I'd be happy for a simpler mechanism... Cheers, Rusty.
Rusty Russell wrote:> On Wednesday 07 November 2007 04:48:35 Anthony Liguori wrote: >> Semantically, find requires that a field have both a type and a length. >> With the exception of the VIRTQUEUE field used internally by lguest, >> type is always a unique identifier. Since virtqueue information is not >> a required part of the config space, it seems to me that type really >> should be treated as a unique identifier. > > Hi Anthony, > > Not sure I get this. It is a unique identifier. You need the length > to handle unknown fields.It's not a unique identifier since it can be used for multiple items (like it is for virtqueues configs).>> find_vq also is curious in that it is stateful in it's enumeration. > > Well, they're *all* stateful. This gives a simple method of knowing what > fields the guest understands: it marks the fields as it finds them. Then it > sets the status, which allows the host to know when it's completed > configuration reads.But PCI device configuration is not stateful. If you care about letting the host know what features a guest understands, I think something more explicit and stateful should be used. For instance, a feature register that stores a bitmap. Otherwise, the host has to infer based on what fields that guest has read what features the guest actually supports. That seems error prone to me.> I like enumerating the virtqueues: it's not necessary but it's clearer. > >> This adds seemingly unnecessary complexity. > > I'd be happy for a simpler mechanism...What do you think of what I proposed? It seems simpler to me. Regards, Anthony Liguori> Cheers, > Rusty.
Rusty Russell wrote:> On Thursday 08 November 2007 04:30:50 Anthony Liguori wrote: >> I would prefer that the virtio API not expose a little endian standard. >> I'm currently converting config->get() ops to ioreadXX depending on the >> size which already does the endianness conversion for me so this just >> messes things up. I think it's better to let the backend deal with >> endianness since it's trivial to handle for both the PCI backend and the >> lguest backend (lguest doesn't need to do any endianness conversion). > > -ETOOMUCHMAGIC. We should either expose all the XX interfaces (but this isn't > a high-speed interface, so let's not) or not "sometimes" convert endianness. > Getting surprises because a field happens to be packed into 4 bytes is > counter-intuitive.Then I think it's necessary to expose the XX interfaces. Otherwise, the backend has to deal with doing all register operations at a per-byte granularity which adds a whole lot of complexity on a per-device basis (as opposed to a little complexity once in the transport layer). You really want to be able to rely on multi-byte atomic operations too when setting values. Otherwise, you need another register to just to signal when it's okay for the device to examine any given register. Regards, Anthony Liguori> Since your most trivial implementation is to do a byte at a time, I don't > think you have a good argument on that basis either. > > Cheers, > Rusty.