Rusty Russell
2007-Sep-20 05:04 UTC
[PATCH 0/6] virtio with config abstraction and ring implementation
Hi all,

This patch series attempts to come closer to unifying kvm and lguest's
usage of virtio. As these two are the first implementations we've seen,
I hope making them closer will make future ones closer too.

Drivers now unpack their own configuration: their probe() methods are
uniform. The configuration mechanism is extensible and can be backed by
PCI, a string of bytes, or something else.

I've abstracted out the lguest ring buffer code into a common library.
The format has changed slightly (mainly because I had an epiphany about
inter-guest I/O).

I also implemented a console (lguest needs one).

Finally, there is a working lguest implementation. Unfortunately,
lguest is being refactored for non-i386 ports, so the virtio patches sit
at the end of the (quite long) for-2.6.24 patchqueue. Nonetheless, they
can be found at http://lguest.ozlabs.org/patches (click on bz2 to get
the series).

Cheers!
Rusty.
(Changes:
 - renamed sync to kick as Dor suggested
 - added new_vq and free_vq hooks to create virtqueues
 - define a simple virtio driver, which uses PCI ids
 - provide register/unregister_virtio_driver hooks)

This attempts to implement a "virtual I/O" layer which should allow
common drivers to be efficiently used across most virtual I/O
mechanisms. It will no doubt need further enhancement.

The virtio drivers add and get I/O buffers; as the buffers are consumed
the driver "interrupt" callbacks are invoked.

It also provides driver register and unregister hooks, which are
simply overridden at run time (eg. for a guest kernel which supports
KVM paravirt and lguest).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
---
 drivers/virtio/Kconfig  |    3 +
 drivers/virtio/Makefile |    1
 drivers/virtio/virtio.c |   20 ++++++++
 include/linux/virtio.h  |  114 +++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 138 insertions(+)

==================================================================
--- /dev/null
+++ b/drivers/virtio/Kconfig
@@ -0,0 +1,3 @@
+# Virtio always gets selected by whoever wants it.
+config VIRTIO
+	bool
==================================================================
--- /dev/null
+++ b/drivers/virtio/Makefile
@@ -0,0 +1,1 @@
+obj-$(CONFIG_VIRTIO) += virtio.o
==================================================================
--- /dev/null
+++ b/drivers/virtio/virtio.c
@@ -0,0 +1,20 @@
+#include <linux/virtio.h>
+
+struct virtio_backend_ops virtio_backend_ops;
+EXPORT_SYMBOL_GPL(virtio_backend_ops);
+
+/* register and unregister simply punt through to the specific backend. */
+int register_virtio_driver(struct virtio_driver *drv)
+{
+	if (!virtio_backend_ops.register_driver)
+		return -ENOSYS;
+
+	return virtio_backend_ops.register_driver(drv);
+}
+EXPORT_SYMBOL_GPL(register_virtio_driver);
+
+void unregister_virtio_driver(struct virtio_driver *drv)
+{
+	virtio_backend_ops.unregister_driver(drv);
+}
+EXPORT_SYMBOL_GPL(unregister_virtio_driver);
==================================================================
--- /dev/null
+++ b/include/linux/virtio.h
@@ -0,0 +1,114 @@
+#ifndef _LINUX_VIRTIO_H
+#define _LINUX_VIRTIO_H
+/* Everything a virtio driver needs to work with any particular virtio
+ * implementation. */
+#include <linux/types.h>
+#include <linux/scatterlist.h>
+#include <linux/spinlock.h>
+#include <linux/pci.h>
+
+struct device;
+struct virtio_config_space;
+
+/**
+ * virtqueue_ops - operations for virtqueue abstraction layer
+ * @new_vq: create a new virtqueue
+ *	config: the virtio_config_space field describing the queue
+ *	off: the offset in the config space of the queue configuration
+ *	len: the length of the virtio_config_space field
+ *	callback: the driver callback when the queue is used.
+ *	cb_data: the pointer for the callback.
+ *	Returns a new virtqueue or an ERR_PTR.
+ * @free_vq: free a virtqueue
+ *	vq: the struct virtqueue to free (must be unused or empty).
+ * @add_buf: expose buffer to other end
+ *	vq: the struct virtqueue we're talking about.
+ *	sg: the description of the buffer(s).
+ *	out_num: the number of sg readable by other side
+ *	in_num: the number of sg which are writable (after readable ones)
+ *	data: the token identifying the buffer.
+ *	Returns 0 or an error.
+ * @kick: update after add_buf
+ *	vq: the struct virtqueue
+ *	After one or more add_buf calls, invoke this to kick the virtio layer.
+ * @get_buf: get the next used buffer
+ *	vq: the struct virtqueue we're talking about.
+ *	len: the length written into the buffer
+ *	Returns NULL or the "data" token handed to add_buf.
+ * @restart: restart callbacks after callback returned false.
+ *	vq: the struct virtqueue we're talking about.
+ *	This returns "false" (and doesn't re-enable) if there are pending
+ *	buffers in the queue, to avoid a race.
+ * @shutdown: "unadd" all buffers.
+ *	vq: the struct virtqueue we're talking about.
+ *	Remove everything from the queue.
+ *
+ * Locking rules are straightforward: the driver is responsible for
+ * locking.  No two operations may be invoked simultaneously.
+ *
+ * All operations can be called in any context.
+ */
+struct virtqueue_ops {
+	struct virtqueue *(*new_vq)(struct device *dev,
+				    struct virtio_config_space *config,
+				    int off, unsigned len,
+				    bool (*callback)(void *),
+				    void *cb_data);
+
+	void (*free_vq)(struct virtqueue *vq);
+
+	int (*add_buf)(struct virtqueue *vq,
+		       struct scatterlist sg[],
+		       unsigned int out_num,
+		       unsigned int in_num,
+		       void *data);
+
+	void (*kick)(struct virtqueue *vq);
+
+	void *(*get_buf)(struct virtqueue *vq, unsigned int *len);
+
+	bool (*restart)(struct virtqueue *vq);
+
+	void (*shutdown)(struct virtqueue *vq);
+};
+
+/* FIXME: Get a real PCI vendor ID. */
+#define PCI_VENDOR_ID_VIRTIO 0x5000
+
+#define VIRTIO_DEV_ID(_devid, _class) {			\
+	.vendor = PCI_VENDOR_ID_VIRTIO,			\
+	.device = (_devid),				\
+	.subvendor = PCI_ANY_ID,			\
+	.subdevice = PCI_ANY_ID,			\
+	.class = (_class)<<8,				\
+	.class_mask = 0xFFFF00 }
+
+/**
+ * virtio_driver - operations for a virtio I/O driver
+ * @name: the name of the driver (KBUILD_MODNAME).
+ * @owner: the module which contains these routines (ie. THIS_MODULE).
+ * @id_table: the ids (we re-use PCI ids) serviced by this driver.
+ * @probe: the function to call when a device is found.  Returns a token for
+ *	remove, or PTR_ERR().
+ * @remove: the function when a device is removed.
+ */
+struct virtio_driver {
+	const char *name;
+	struct module *owner;
+	struct pci_device_id *id_table;
+	void *(*probe)(struct device *device,
+		       struct virtio_config_space *config,
+		       struct virtqueue_ops *vqops);
+	void (*remove)(void *dev);
+};
+
+int register_virtio_driver(struct virtio_driver *drv);
+void unregister_virtio_driver(struct virtio_driver *drv);
+
+/* The particular virtio backend supplies these. */
+struct virtio_backend_ops {
+	int (*register_driver)(struct virtio_driver *drv);
+	void (*unregister_driver)(struct virtio_driver *drv);
+};
+extern struct virtio_backend_ops virtio_backend_ops;
+#endif /* _LINUX_VIRTIO_H */
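
To make the add_buf / kick / get_buf cycle documented in virtqueue_ops
concrete, here is a minimal userspace sketch. It is not part of the patch
and not kernel code: every toy_* name is hypothetical, struct toy_sg stands
in for struct scatterlist, and the "other end" is a loopback that consumes
everything at kick time and is assumed to fill every writable buffer
completely.

```c
/* Userspace toy, not kernel code: every toy_* name is hypothetical. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_RING_SIZE 4

struct toy_sg {			/* stand-in for struct scatterlist */
	void *addr;
	unsigned int len;
};

struct toy_vq {
	struct { void *token; unsigned int len; } ring[TOY_RING_SIZE];
	unsigned int added;	/* buffers exposed via add_buf */
	unsigned int used;	/* buffers the other end has finished with */
	unsigned int taken;	/* buffers handed back via get_buf */
};

/* Expose a buffer: out_num readable entries, then in_num writable ones.
 * Returns 0, or -ENOSPC when the ring is full. */
int toy_add_buf(struct toy_vq *vq, const struct toy_sg sg[],
		unsigned int out_num, unsigned int in_num, void *data)
{
	unsigned int i, slot, writable = 0;

	assert(data != NULL);	/* NULL is reserved for "nothing used yet" */
	if (vq->added - vq->taken == TOY_RING_SIZE)
		return -ENOSPC;

	/* Assumption: the loopback fills every writable entry completely. */
	for (i = out_num; i < out_num + in_num; i++)
		writable += sg[i].len;

	slot = vq->added++ % TOY_RING_SIZE;
	vq->ring[slot].token = data;
	vq->ring[slot].len = writable;
	return 0;
}

/* Tell the other end about new buffers; the loopback consumes them all. */
void toy_kick(struct toy_vq *vq)
{
	vq->used = vq->added;
}

/* Return the token of the next used buffer in FIFO order, or NULL. */
void *toy_get_buf(struct toy_vq *vq, unsigned int *len)
{
	unsigned int slot;

	if (vq->taken == vq->used)
		return NULL;
	slot = vq->taken++ % TOY_RING_SIZE;
	*len = vq->ring[slot].len;
	return vq->ring[slot].token;
}
```

In this toy, get_buf returns NULL until kick has been called, which mirrors
the documented contract that buffers only become visible to the other side
after a kick. As in the patch, the caller is responsible for locking.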
Dor Laor
2007-Sep-20 06:44 UTC
[PATCH 0/6] virtio with config abstraction and ring implementation
Rusty Russell wrote:
> Hi all,
>
> This patch series attempts to come closer to unifying kvm and lguest's
> usage of virtio. As these two are the first implementations we've seen,
> I hope making them closer will make future ones closer too.
>
> Drivers now unpack their own configuration: their probe() methods are
> uniform. The configuration mechanism is extensible and can be backed by
> PCI, a string of bytes, or something else.
>
> I've abstracted out the lguest ring buffer code into a common library.
> The format has changed slightly (mainly because I had an epiphany about
> inter-guest I/O).
>
> I also implemented a console (lguest needs one).
>
> Finally, there is a working lguest implementation. Unfortunately,
> lguest is being refactored for non-i386 ports, so the virtio patches sit
> at the end of the (quite long) for-2.6.24 patchqueue. Nonetheless, they
> can be found at http://lguest.ozlabs.org/patches (click on bz2 to get
> the series).
>
> Cheers!
> Rusty.

Superb job, it saved me the burden of trying to merge the in-house
virtio_backend. I like the separation of the ring code, the improved
descriptors and the notify too.

Regarding the pci config space, I'd rather see config_ops type of
operations, to let the 390/xen/other implementations jump on our wagon.
Maybe change the offset/len type into a handle pointer and function
pointers. The best would be to let them comment on that.

I glanced over xen netfront.c and I think that the config space can be
used without too much hassle.

Dor.
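
One way to read the "handle pointer and function pointers" suggestion is
sketched below as a userspace toy. Nothing here comes from the patch
series: the toy_config_ops layout, the find/get names, and the
type/length/value byte layout of the backing store are all assumptions
made for illustration. The point is only that a driver would ask for a
field and receive an opaque handle, so a PCI, s390 or Xen backend could
implement the same two calls differently.

```c
/* Userspace toy: all toy_* names and the TLV layout are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_config_ops {
	/* Look up a field by type; returns an opaque handle or NULL. */
	const void *(*find)(void *priv, unsigned char type, unsigned int *len);
	/* Copy len bytes of the field behind handle into buf. */
	void (*get)(void *priv, const void *handle, void *buf, unsigned int len);
};

/* One possible backend: a plain string of bytes, assumed to be laid out
 * as repeated { type, length, data[length] } records. */
struct toy_bytes {
	const unsigned char *data;
	size_t size;
};

const void *toy_bytes_find(void *priv, unsigned char type, unsigned int *len)
{
	struct toy_bytes *b = priv;
	size_t off = 0;

	while (off + 2 <= b->size) {
		unsigned char t = b->data[off];
		unsigned char l = b->data[off + 1];

		if (off + 2 + l > b->size)
			break;		/* truncated record: stop scanning */
		if (t == type) {
			*len = l;
			return &b->data[off + 2];
		}
		off += 2 + (size_t)l;
	}
	return NULL;
}

void toy_bytes_get(void *priv, const void *handle, void *buf, unsigned int len)
{
	(void)priv;			/* this backend needs no extra state */
	memcpy(buf, handle, len);
}

const struct toy_config_ops toy_bytes_ops = {
	.find = toy_bytes_find,
	.get = toy_bytes_get,
};
```

A driver written against toy_config_ops never sees an offset or a raw
pointer into the backing store, which is what would let a non-PCI
transport supply its own find and get.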
On Thursday 20 September 2007, Rusty Russell wrote:
> + * virtio_driver - operations for a virtio I/O driver
> + * @name: the name of the driver (KBUILD_MODNAME).
> + * @owner: the module which contains these routines (ie. THIS_MODULE).
> + * @id_table: the ids (we re-use PCI ids) serviced by this driver.
> + * @probe: the function to call when a device is found.  Returns a token for
> + *	remove, or PTR_ERR().
> + * @remove: the function when a device is removed.
> + */
> +struct virtio_driver {
> +	const char *name;
> +	struct module *owner;
> +	struct pci_device_id *id_table;
> +	void *(*probe)(struct device *device,
> +		       struct virtio_config_space *config,
> +		       struct virtqueue_ops *vqops);
> +	void (*remove)(void *dev);
> +};
> +
> +int register_virtio_driver(struct virtio_driver *drv);
> +void unregister_virtio_driver(struct virtio_driver *drv);
> +
> +/* The particular virtio backend supplies these. */
> +struct virtio_backend_ops {
> +	int (*register_driver)(struct virtio_driver *drv);
> +	void (*unregister_driver)(struct virtio_driver *drv);
> +};
> +extern struct virtio_backend_ops virtio_backend_ops;

This still seems a little awkward. From what I understand, you register
a virtio_driver, which leads to a pci_driver (or whatever you are based
on) being registered behind the covers, so that the pci_device can be
used directly as the virtio device.

I think there should instead be a pci_driver that automatically binds
to all PCI based virtio implementations and creates a child device for
the actual virtio_device. Then you can have the virtio_driver itself
be based on a device_driver, and you can get rid of the global
virtio_backend_ops.

That will be useful when a virtual machine has two ways to get at the
virtio devices, e.g. a KVM guest that has both hcall based probing
for virtio devices and some other virtio devices that are exported
through PCI.

	Arnd <><
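
The binding model proposed here can be pictured with a small userspace
sketch. It is only a model: the toy_* names are hypothetical and this is
not the kernel's driver-core API. A driver registers once on a shared bus,
each transport (PCI, hcall, ...) adds the devices it discovers to that
same bus, and the bus matches by id, so no global backend ops table is
needed.

```c
/* Userspace model only; toy_* names are hypothetical, not kernel API. */
#include <assert.h>
#include <stddef.h>

#define TOY_MAX 8

struct toy_vdrv;

struct toy_vdev {
	unsigned int id;		/* device type id */
	const char *transport;		/* "pci", "hcall", ... (informational) */
	const struct toy_vdrv *driver;	/* bound driver, or NULL */
};

struct toy_vdrv {
	unsigned int id;		/* device type id this driver serves */
	int (*probe)(struct toy_vdev *dev);
};

struct toy_bus {
	struct toy_vdev *devs[TOY_MAX];
	const struct toy_vdrv *drvs[TOY_MAX];
	unsigned int ndevs, ndrvs;
};

int toy_probe_calls;			/* counts successful example probes */

/* Example probe: accepts every device and counts the call. */
int toy_count_probe(struct toy_vdev *dev)
{
	assert(dev->driver == NULL);
	toy_probe_calls++;
	return 0;
}

/* Bind dev to drv if it is unbound, the ids match and probe succeeds. */
static void toy_try_bind(struct toy_vdev *dev, const struct toy_vdrv *drv)
{
	if (!dev->driver && dev->id == drv->id && drv->probe(dev) == 0)
		dev->driver = drv;
}

/* Called by any transport when it discovers a device. */
int toy_bus_add_device(struct toy_bus *bus, struct toy_vdev *dev)
{
	unsigned int i;

	if (bus->ndevs == TOY_MAX)
		return -1;
	bus->devs[bus->ndevs++] = dev;
	for (i = 0; i < bus->ndrvs; i++)
		toy_try_bind(dev, bus->drvs[i]);
	return 0;
}

/* Called once per driver, regardless of how many transports exist. */
int toy_bus_add_driver(struct toy_bus *bus, const struct toy_vdrv *drv)
{
	unsigned int i;

	if (bus->ndrvs == TOY_MAX)
		return -1;
	bus->drvs[bus->ndrvs++] = drv;
	for (i = 0; i < bus->ndevs; i++)
		toy_try_bind(bus->devs[i], drv);
	return 0;
}
```

In this model a device found through PCI and one found through an hcall
probe end up bound to the same driver, and registration order does not
matter.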