Daniel P. Berrange
2006-Nov-21 14:02 UTC
[Xen-devel] HVM guest with file backed disks using loop devices ?
I've been looking at an issue with HVM guests and file backed virtual disks.
A user reported that they were unable to create more than 8 HVM guests as
the hotplug scripts failed to find a free loop device on the 9th guest.

Ignoring the option of increasing max available loop devices in kernel, I'm
confused as to why HVM guests need to allocate loopback devices at all.
The QEMU device model is perfectly happy to access plain files directly - it
has no need for them to be bound to a loop device. Indeed if I look at an
active HVM guest, there is nothing using the loop device at all:

  # grep disk /etc/xen/demo
  disk = [ "file:/xen/demo.img,hda,w", "file:/root/boot.iso,hdc:cdrom,r" ]
  # ps -axuwf | grep loop
  root     18631  0.0  0.0      0     0 ?        S<   13:32   0:00 [loop0]
  root     18673  0.0  0.0      0     0 ?        S<   13:32   0:00 [loop1]

  # lsof /dev/loop0
  # lsof /dev/loop1
  # lsof /root/boot.iso
  COMMAND   PID USER   FD   TYPE DEVICE       SIZE    NODE NAME
  qemu-dm 18551 root    6u   REG  253,0    6711296 1933665 /root/boot.iso
  # lsof /xen/demo.img
  COMMAND   PID USER   FD   TYPE DEVICE       SIZE    NODE NAME
  qemu-dm 18551 root    5u   REG  253,0 4294967297 1277954 /xen/demo.img

So qemu-dm has the two raw files open & nothing is using the loop devices.

The loop devices are getting created for the HVM guest by the hotplug scripts,
in particular /etc/xen/scripts/block. This script simply looks at the block
device type - either phy: or file: - and takes action accordingly - for the
latter it will always create a loop device. There does not appear to be any
logic in the hotplug script to check whether the guest is paravirt or HVM.

If I hack the 'block' hotplug script to rip out creation of the loopback device
for file: disks, then HVM guests can still be created & appear to be able to
access their virtual disks without problems (obviously this isn't an actual
solution since it breaks paravirt file: disks).

I can think of two reasons for the creation of loop devices for HVM:

 - This was needed in the past, but is now obsolete & we simply forgot to
   turn off loop device code for HVM
 - This is an accidental consequence of changing the way HVM disks are
   configured (ie when we dropped the :ioemu tag from disks).

Can anyone who is more familiar with the history of HVM development shed
some light on this behaviour ?

I'd like to update the hotplug scripts to stop the loopback device being
created for HVM, but don't see any obvious data available in the hotplug
scripts which would allow me to distinguish between paravirt & HVM disk
configurations.

Regards,
Dan.
-- 
|=- Red Hat, Engineering, Emerging Technologies, Boston.  +1 978 392 2496 -=|
|=-           Perl modules: http://search.cpan.org/~danberr/              -=|
|=-               Projects: http://freshmeat.net/~danielpb/               -=|
|=-  GnuPG: 7D3B9505   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505  -=|

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
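
The type-only dispatch described in the message above can be sketched roughly as follows. This is not the real /etc/xen/scripts/block: the function name block_action and the action strings it prints are invented for illustration, and the only assumption about the input is the "<type>:<path>,<guest-dev>,<mode>" form shown in the demo config.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the real /etc/xen/scripts/block.
# Prints the action a type-only dispatch would take for one disk spec
# of the form "<type>:<path>,<guest-dev>,<mode>".
block_action() {
    spec=$1
    backend=${spec%%,*}      # e.g. "file:/xen/demo.img"
    type=${backend%%:*}      # "file" or "phy"
    path=${backend#*:}       # e.g. "/xen/demo.img"
    case "$type" in
        phy)  echo "use-device $path" ;;
        file) echo "bind-loop $path" ;;   # always, for PV and HVM alike
        *)    echo "unknown $type"; return 1 ;;
    esac
}

# The two disks of the demo guest: both would be bound to loop devices,
# because nothing in the dispatch looks at the guest type.
block_action "file:/xen/demo.img,hda,w"
block_action "file:/root/boot.iso,hdc:cdrom,r"
```

Both calls print a bind-loop action, so on a host limited to 8 loop devices a guest with one disk and one ISO would use up two of them.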
Ewan Mellor
2006-Nov-22 09:49 UTC
Re: [Xen-devel] HVM guest with file backed disks using loop devices ?
On Tue, Nov 21, 2006 at 02:02:13PM +0000, Daniel P. Berrange wrote:

> I've been looking at an issue with HVM guests and file backed virtual disks.
> A user reported that they were unable to create more than 8 HVM guests as
> the hotplug scripts failed to find a free loop device on the 9th guest.

[snip]

> I can think of two reasons for the creation of loop devices for HVM:
>
>  - This was needed in the past, but is now obsolete & we simply forgot to
>    turn off loop device code for HVM
>  - This is an accidental consequence of changing the way HVM disks are
>    configured (ie when we dropped the :ioemu tag from disks).
>
> Can anyone who is more familiar with the history of HVM development shed
> some light on this behaviour ?

It's there to support paravirtual drivers within HVM domains. In that case,
you will be using blkback, just as with PV domains, and so you need the
backend there.

> I'd like to update the hotplug scripts to stop the loopback device being
> created for HVM, but don't see any obvious data available in the hotplug
> scripts which would allow me to distinguish between paravirt & HVM disk
> configurations.

The best thing would be to add a device option and a corresponding node in the
store that told blkback "this device is not for you, don't bring it up". That
way, you don't pay the cost of the device backend and the loop device, in the
case where you are using QEMU-emulated devices, and not using the PV drivers.

The flag has to be this way around for compatibility -- old configurations
will be expecting the backend to be created unconditionally, to support the PV
drivers.

Ewan.
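
The direction of the flag suggested above can be sketched as below. The flag is hypothetical: no such device option or store node is named in the thread, and the helper backend_wanted is invented here. The function takes the flag value as an argument (empty meaning the node is absent) so that the sketch runs without a Xen host; how a real hotplug script would read it from the store is left open.

```shell
#!/bin/sh
# Hypothetical sketch: the opt-out flag and this helper are invented,
# they are not an existing Xen option or store node.
# $1 is the flag value; an empty string stands for "node not present".
backend_wanted() {
    case "${1:-0}" in
        1) return 1 ;;   # "this device is not for you": no blkback, no loop device
        *) return 0 ;;   # absent or 0: old behaviour, backend always created
    esac
}

for flag in "" 0 1; do
    if backend_wanted "$flag"; then
        echo "flag='$flag': set up backend"
    else
        echo "flag='$flag': skip backend"
    fi
done
```

Because the flag is an opt-out, a configuration that never mentions it takes the first branch of the loop and keeps its backend, which is the compatibility property the message asks for.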
Daniel P. Berrange
2006-Nov-22 17:10 UTC
Re: [Xen-devel] HVM guest with file backed disks using loop devices ?
On Wed, Nov 22, 2006 at 09:49:18AM +0000, Ewan Mellor wrote:
> On Tue, Nov 21, 2006 at 02:02:13PM +0000, Daniel P. Berrange wrote:
> >
> > I've been looking at an issue with HVM guests and file backed virtual disks.
> > A user reported that they were unable to create more than 8 HVM guests as
> > the hotplug scripts failed to find a free loop device on the 9th guest.

[snip]

> > I can think of two reasons for the creation of loop devices for HVM:
> >
> >  - This was needed in the past, but is now obsolete & we simply forgot to
> >    turn off loop device code for HVM
> >  - This is an accidental consequence of changing the way HVM disks are
> >    configured (ie when we dropped the :ioemu tag from disks).
> >
> > Can anyone who is more familiar with the history of HVM development shed
> > some light on this behaviour ?
>
> It's there to support paravirtual drivers within HVM domains. In that case,
> you will be using blkback, just as with PV domains, and so you need the
> backend there.

Ok, that makes sense now. This raises another question though - is it possible
to make the PV-for-HVM drivers work against the blktap backend ? The loop
back driver has bad data integrity issues upon crash & poor performance in
comparison to blktap, so I'd really like to avoid loopback altogether.

> > I'd like to update the hotplug scripts to stop the loopback device being
> > created for HVM, but don't see any obvious data available in the hotplug
> > scripts which would allow me to distinguish between paravirt & HVM disk
> > configurations.
>
> The best thing would be to add a device option and a corresponding node in the
> store that told blkback "this device is not for you, don't bring it up". That
> way, you don't pay the cost of the device backend and the loop device, in the
> case where you are using QEMU-emulated devices, and not using the PV drivers.
>
> The flag has to be this way around for compatibility -- old configurations
> will be expecting the backend to be created unconditionally, to support the PV
> drivers.

Ok, sounds like I'll need to investigate how the PV-HVM drivers integrate in
a little more detail before attempting such a change.

Regards,
Dan.
Andrew Warfield
2006-Nov-22 18:17 UTC
Re: [Xen-devel] HVM guest with file backed disks using loop devices ?
> Ok, that makes sense now. This raises another question though - is it possible
> to make the PV-for-HVM drivers work against the blktap backend ? The loop
> back driver has bad data integrity issues upon crash & poor performance in
> comparison to blktap, so I'd really like to avoid loopback altogether.

So would I. One reasonably straight-forward option here would be to just do
an xm block-attach in the ioemu block connect stage -- this will add extra
bounces through the kernel, but should work. An alternate solution would be
to build a direct bridge from qemu to the tapdisk process or the tapdisk
driver plugins... likely a bit more work though.

patches/ideas welcome... ;)

a.
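
For readers unfamiliar with the command mentioned above, here is a dry-run sketch of turning a file: disk spec into an xm block-attach line that uses blktap instead of a loop device. It only prints the command. The helper name tap_attach_cmd is invented, and the thread does not say which domain or frontend device such an attach would target, so both are placeholders taken from the demo config; the tap:aio: prefix is assumed to select the blktap raw-file driver on the installed Xen version.

```shell
#!/bin/sh
# Dry run only: prints an xm command line, does not execute it.
# Assumptions: "tap:aio:" selects the blktap raw-file driver; the target
# domain and frontend device are placeholders, not taken from the thread.
tap_attach_cmd() {
    domain=$1
    spec=$2                       # e.g. "file:/xen/demo.img,hda,w"
    backend=${spec%%,*}           # "file:/xen/demo.img"
    rest=${spec#*,}               # "hda,w"
    frontdev=${rest%%,*}          # "hda"
    mode=${rest##*,}              # "w"
    path=${backend#file:}         # "/xen/demo.img"
    echo "xm block-attach $domain tap:aio:$path $frontdev $mode"
}

tap_attach_cmd demo "file:/xen/demo.img,hda,w"
```

The argument order follows the usual xm block-attach form of domain, backend device, frontend device and mode.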