Rusty Russell
2008-Mar-20 05:59 UTC
[RFC PATCH 0/4] Inter-guest virtio I/O example with lguest
Hi all,

Just finished my prototype of inter-guest virtio, using networking as an
example. Each guest mmaps the other's address space and uses a FIFO for
notifications.

There are two issues with this approach. The first is that neither guest
can change its mappings. See patch 1. The second is that our feature
configuration is "host presents, guest chooses" which breaks down when we
don't know the capabilities of each guest. In particular, TSO capability for
networking.

There are three possible solutions:
1) Just offer the lowest common denominator to both sides (ie. no features).
   This is what I do with lguest in these patches.
2) Offer something and handle the case where one Guest accepts and another
   doesn't by emulating it. ie. de-TSO the packets manually.
3) "Hot unplug" the device from the guest which asks for the greater features,
   then re-add it offering less features. Requires hotplug in the guest OS.

I haven't tuned or even benchmarked these patches, but it pings!

Rusty.
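The cover letter says only that a FIFO carries notifications between the guests; it does not show the mechanism. As a minimal sketch, assuming each notification is a single byte and that the file descriptors are already open, the two sides might look like this. The names `notify_peer` and `wait_for_peer` are hypothetical and do not come from the patches.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: wake the peer by writing one byte to the FIFO.
 * Returns 0 on success, -1 on error. */
int notify_peer(int fifo_wfd)
{
	char token = 1;
	ssize_t r;

	/* Retry if a signal interrupts the write. */
	do {
		r = write(fifo_wfd, &token, 1);
	} while (r < 0 && errno == EINTR);

	return r == 1 ? 0 : -1;
}

/* Hypothetical helper: block until the peer notifies us, then consume one
 * notification.  Returns 0 on success, -1 on error or if the writing end
 * has been closed. */
int wait_for_peer(int fifo_rfd)
{
	char token;
	ssize_t r;

	/* Retry if a signal interrupts the read. */
	do {
		r = read(fifo_rfd, &token, 1);
	} while (r < 0 && errno == EINTR);

	return r == 1 ? 0 : -1;
}
```

A caller would open the FIFO (or, for a quick experiment, a `pipe()`) once, call `notify_peer()` after placing buffers in the shared ring, and call `wait_for_peer()` before scanning the ring for new work.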
From: Paul TBBle Hampson <Paul.Hampson at Pobox.com>

This creates a file in $HOME/.lguest/ to directly back the RAM and DMA memory
mappings created by map_zeroed_pages.

Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>
---
 Documentation/lguest/lguest.c |   59 ++++++++++++++++++++++++++++--------------
 1 file changed, 40 insertions(+), 19 deletions(-)

diff --git a/Documentation/lguest/lguest.c b/Documentation/lguest/lguest.c
--- a/Documentation/lguest/lguest.c
+++ b/Documentation/lguest/lguest.c
@@ -236,19 +236,51 @@ static int open_or_die(const char *name,
 	return fd;
 }
 
-/* map_zeroed_pages() takes a number of pages. */
+/* unlink_memfile() removes the backing file for the Guest's memory, if we exit
+ * cleanly. */
+static char memfile_path[PATH_MAX];
+
+static void unlink_memfile(void)
+{
+	unlink(memfile_path);
+}
+
+/* map_zeroed_pages() takes a number of pages, and creates a mapping file where
+ * this Guest's memory lives. */
 static void *map_zeroed_pages(unsigned int num)
 {
-	int fd = open_or_die("/dev/zero", O_RDONLY);
+	int fd;
 	void *addr;
 
-	/* We use a private mapping (ie. if we write to the page, it will be
-	 * copied). */
+	/* We create a .lguest directory in the user's home, to put the memory
+	 * files into. */
+	snprintf(memfile_path, PATH_MAX, "%s/.lguest", getenv("HOME") ?: "");
+	if (mkdir(memfile_path, S_IRWXU) != 0 && errno != EEXIST)
+		err(1, "Creating directory %s", memfile_path);
+
+	/* Name the memfiles by the process ID of this launcher. */
+	snprintf(memfile_path, PATH_MAX, "%s/.lguest/%u",
+		 getenv("HOME") ?: "", getpid());
+	fd = open(memfile_path, O_RDWR | O_CREAT | O_TRUNC, S_IRWXU);
+	if (fd < 0)
+		err(1, "Creating memory backing file %s", memfile_path);
+
+	/* Make sure we remove it when we're finished. */
+	atexit(unlink_memfile);
+
+	/* Now, we opened it with O_TRUNC, so the file is 0 bytes long.  Here
+	 * we expand it to the length we need, and it will be filled with
+	 * zeroes. */
+	if (ftruncate(fd, num * getpagesize()) != 0)
+		err(1, "Truncating file %s %u pages", memfile_path, num);
+
+	/* We use a shared mapping, so others can share with us. */
 	addr = mmap(NULL, getpagesize() * num,
-		    PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE, fd, 0);
+		    PROT_READ|PROT_WRITE|PROT_EXEC, MAP_SHARED, fd, 0);
 	if (addr == MAP_FAILED)
 		err(1, "Mmaping %u pages of /dev/zero", num);
+	verbose("Memory backing file is %s @ %p\n", memfile_path, addr);
 
 	return addr;
 }
@@ -263,23 +295,12 @@ static void *get_pages(unsigned int num)
 	return addr;
 }
 
-/* This routine is used to load the kernel or initrd.  It tries mmap, but if
- * that fails (Plan 9's kernel file isn't nicely aligned on page boundaries),
- * it falls back to reading the memory in. */
+/* This routine is used to load the kernel or initrd.  We used to mmap, but now
+ * we simply read it in, so it will be present in the shared underlying
+ * file. */
 static void map_at(int fd, void *addr, unsigned long offset, unsigned long len)
 {
 	ssize_t r;
-
-	/* We map writable even though for some segments are marked read-only.
-	 * The kernel really wants to be writable: it patches its own
-	 * instructions.
-	 *
-	 * MAP_PRIVATE means that the page won't be copied until a write is
-	 * done to it.  This allows us to share untouched memory between
-	 * Guests. */
-	if (mmap(addr, len, PROT_READ|PROT_WRITE|PROT_EXEC,
-		 MAP_FIXED|MAP_PRIVATE, fd, offset) != MAP_FAILED)
-		return;
 
 	/* pread does a seek and a read in one shot: saves a few lines. */
 	r = pread(fd, addr, len, offset);
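The patch switches the Guest memory mapping from MAP_PRIVATE on /dev/zero to MAP_SHARED on a real file, so that a second mapping of that file sees the same pages. The following is a minimal sketch of that property, not code from the patch: it uses a temporary file under /tmp instead of $HOME/.lguest/<pid>, maps it twice in one process to stand in for two launchers, and the function name `shared_mapping_is_visible` is hypothetical.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if a write through one MAP_SHARED mapping is visible through a
 * second mapping of the same file, 0 if it is not, -1 on setup failure. */
int shared_mapping_is_visible(void)
{
	char path[] = "/tmp/memfile-XXXXXX";	/* assumption: stand-in path */
	size_t len = (size_t)getpagesize();
	char *a, *b;
	int fd, visible;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);	/* the open descriptor keeps the file alive */

	/* As in the patch: grow the empty file so it reads back as zeroes. */
	if (ftruncate(fd, (off_t)len) != 0) {
		close(fd);
		return -1;
	}

	/* Two independent shared mappings of the same backing file. */
	a = mmap(NULL, len, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
	b = mmap(NULL, len, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);
	if (a == MAP_FAILED || b == MAP_FAILED) {
		if (a != MAP_FAILED)
			munmap(a, len);
		if (b != MAP_FAILED)
			munmap(b, len);
		return -1;
	}

	/* Write through the first mapping, read through the second. */
	strcpy(a, "ping");
	visible = (strcmp(b, "ping") == 0);

	munmap(a, len);
	munmap(b, len);
	return visible;
}
```

In the real setup the second mapping would be made by the other Guest's launcher, which would open the backing file by its path rather than share a descriptor. With MAP_PRIVATE, as in the code the patch removes, the write would land in a private copy and the peer would never see it.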
Avi Kivity
2008-Mar-20 06:54 UTC
[kvm-devel] [RFC PATCH 0/4] Inter-guest virtio I/O example with lguest
Rusty Russell wrote:
> Hi all,
>
> Just finished my prototype of inter-guest virtio, using networking as an
> example. Each guest mmaps the other's address space and uses a FIFO for
> notifications.
>

Isn't that a security hole (hole? chasm)? If the two guests can access each other's memory, they might as well be just one guest, and communicate internally.

My feeling is that the host needs to copy the data, using dma if available.

Another option is to have one guest map the other's memory for read and write, while the other guest is unprivileged. This allows one privileged guest to provide services for other, unprivileged guests, like domain 0 or driver domains in Xen.

-- 
Any sufficiently difficult bug is indistinguishable from a feature.
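The host-copy alternative raised in this reply can be sketched as a bounds-checked copy between two guests' memory regions, so that neither guest ever maps the other. This is only an illustration under stated assumptions: `struct guest_mem`, `range_ok` and `host_copy` are hypothetical names, a plain `memcpy` stands in for any DMA engine, and nothing here comes from lguest or KVM.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one guest's memory as the host sees it. */
struct guest_mem {
	unsigned char *base;	/* host address where this guest's RAM starts */
	size_t size;		/* bytes of guest RAM */
};

/* True if [off, off + len) lies inside the guest's memory.  Written so that
 * off + len is never computed and so cannot overflow. */
int range_ok(const struct guest_mem *g, size_t off, size_t len)
{
	return off <= g->size && len <= g->size - off;
}

/* Copy len bytes from the source guest to the destination guest on the
 * host's behalf.  Returns 0 on success, -1 if either range falls outside
 * its guest's memory. */
int host_copy(const struct guest_mem *src, size_t src_off,
	      const struct guest_mem *dst, size_t dst_off, size_t len)
{
	if (!range_ok(src, src_off, len) || !range_ok(dst, dst_off, len))
		return -1;

	/* Distinct guests have distinct memory, so the ranges do not overlap. */
	memcpy(dst->base + dst_off, src->base + src_off, len);
	return 0;
}
```

The point of the design is that the offsets are guest-supplied and untrusted, so every transfer is validated by the host before any byte moves.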
Anthony Liguori
2008-Mar-20 14:11 UTC
[kvm-devel] [RFC PATCH 0/4] Inter-guest virtio I/O example with lguest
Rusty Russell wrote:
> Hi all,
>
> Just finished my prototype of inter-guest virtio, using networking as an
> example. Each guest mmaps the other's address space and uses a FIFO for
> notifications.
>
> There are two issues with this approach. The first is that neither guest
> can change its mappings. See patch 1.

Avi mentioned that with MMU notifiers, it may be possible to introduce a new kernel mechanism whereby you could map an arbitrary region of one process's memory into another process. This would address this problem quite nicely.

> The second is that our feature
> configuration is "host presents, guest chooses" which breaks down when we
> don't know the capabilities of each guest. In particular, TSO capability for
> networking.
> There are three possible solutions:
> 1) Just offer the lowest common denominator to both sides (ie. no features).
>    This is what I do with lguest in these patches.
> 2) Offer something and handle the case where one Guest accepts and another
>    doesn't by emulating it. ie. de-TSO the packets manually.
> 3) "Hot unplug" the device from the guest which asks for the greater features,
>    then re-add it offering less features. Requires hotplug in the guest OS.
>

4) Add a feature negotiation feature. The feature that gets set is the "feature negotiate" feature. If a guest doesn't support feature negotiation, you end up with the least-common denominator (no features). If both guests support feature negotiation, you can then add something new to determine the true common subset.

> I haven't tuned or even benchmarked these patches, but it pings!
>

Very nice! It's particularly cool that it was possible entirely in userspace.

Regards,

Anthony Liguori

> Rusty.
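Option 4 above can be sketched as a bitmask intersection gated on a negotiation bit. This is an illustration only: the feature bits below are hypothetical and are not the real virtio bit assignments, and the thread does not say how the "something new" that determines the common subset would actually work.

```c
#include <stdint.h>

/* Hypothetical feature bits, chosen for illustration only. */
#define FEAT_NEGOTIATE	(1u << 0)	/* guest understands negotiation */
#define FEAT_TSO	(1u << 1)
#define FEAT_CSUM	(1u << 2)

/* Given the feature sets each guest accepted, return the set both may use. */
uint32_t common_features(uint32_t guest_a, uint32_t guest_b)
{
	/* If either side cannot negotiate, fall back to the lowest common
	 * denominator: no features at all. */
	if (!(guest_a & guest_b & FEAT_NEGOTIATE))
		return 0;

	/* Both sides negotiate: the usable set is the intersection, which
	 * still includes FEAT_NEGOTIATE itself. */
	return guest_a & guest_b;
}
```

For example, if one guest accepts negotiation, TSO and checksum offload while the other accepts only negotiation and TSO, the result is negotiation plus TSO. If either guest lacks the negotiation bit, the result is 0, which matches what the lguest patches currently offer.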