I currently have 2 x ESXi boxes that have VMs stored on an NFS/iSCSI Debian Lenny Linux box. I had purchased a new whitebox server as an intended replacement for the Linux box. I had always planned on installing OpenSolaris on the new hardware with this config:

  2 x 640GB mirrored rpool
  6 x 1TB in raidz2

This box was going to be built in parallel with the current server and I would have simply copied the VMDK files across.

The main reason for the move to OpenSolaris is ZFS and all its goodness, like:

  end-to-end checksumming
  expected good NFS and iSCSI performance (I'd be using b124 rather than 2009.06 due to poor iSCSI performance in COMSTAR in 2009.06)
  send/receive for backups
  snapshot cloning, etc.

Now I'm having a change of heart and want to investigate the possibility of moving from three servers to two, both of them OpenSolaris with xVM/zones as the virtualization tech. I'd like to build the two boxes spec'd thus:

  Intel i7 920 + 12GB
  4 x 1TB SATA (raidz)
  xVM on b124
  3 x 1Gb NICs (1 onboard, 2 added PCI-E)

The plan is to have the VM files synced across to the 'secondary' node via ZFS send/receive so that, should one fail, I can simply restart the VMs on the second node. I may even be able to spread the VM load across the two nodes and sync each bunch of VMs to the 'other' node (rough sketch of what I mean at the end of this mail).

My questions are:

1. Is xVM considered stable enough for use? And in particular with ZFS underneath in dom0.
2. Can I run VirtualBox on an OpenSolaris dom0?
3. Will the ZFS send/receive work as I intend?
4. Should I just keep the central 'storage' server and continue with ESXi or XenServer 5.5?

Any thoughts or experiences would be great.
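The sync I have in mind is roughly the following - just a sketch, the pool, dataset and host names are placeholders, and I'm assuming one dataset or zvol per VM:

  # initial full copy of a VM's dataset to the secondary node
  zfs snapshot tank/vm/guest01@sync1
  zfs send tank/vm/guest01@sync1 | ssh secondary zfs receive tank/vm/guest01

  # subsequent runs only send the delta since the last sync
  zfs snapshot tank/vm/guest01@sync2
  zfs send -i @sync1 tank/vm/guest01@sync2 | ssh secondary zfs receive -F tank/vm/guest01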
Question 2: Answer - no, you can't run VirtualBox in dom0. When you try to install the package:

  ## Executing postinstall script.
  ## VirtualBox cannot run under xVM Dom0!
  Fatal Error, Aborting installation!
  pkgadd: ERROR: postinstall script did not complete successfully
On Thu, Oct 15, 2009 at 02:35:21AM -0700, Brian McKerr wrote:

> The plan is to have the VM files synced across to the 'secondary' node
> via ZFS send/receive so that should one fail I can simply restart the
> VMs on the second node. I may even be able to spread the VM load
> across the two nodes and sync each bunch of VMs to the 'other' node.

I'm not clear on your setup. So you have box A that is presumably exporting ZFS volumes for each VM's storage across iSCSI, and you want to sync those volumes with send/receive across to a backup box.

Where are the VMs running? How often do you need this sync to happen? Note that you can't sync while guests are running, as the storage may not be in a stable state filesystem-wise.

regards
john
Brian McKerr wrote:
> ...
> My questions are:
>
> 1. Is xVM considered stable enough for use? And in particular with ZFS underneath in dom0.
> 2. Can I run VirtualBox on an OpenSolaris dom0?
> 3. Will the ZFS send/receive work as I intend?
> 4. Should I just keep the central 'storage' server and continue with ESXi or XenServer 5.5?
>
> Any thoughts or experiences would be great.

What services are you exposing? What is your updating model for those services, e.g. how frequent is it, and are there underlying requirements for the service?

In general, stop thinking about servers and think about services, because that will allow different approaches to service delivery.

rich
John Levon wrote:
> On Thu, Oct 15, 2009 at 02:35:21AM -0700, Brian McKerr wrote:
>
>> The plan is to have the VM files synced across to the 'secondary' node
>> via ZFS send/receive so that should one fail I can simply restart the
>> VMs on the second node. I may even be able to spread the VM load
>> across the two nodes and sync each bunch of VMs to the 'other' node.
>
> I'm not clear on your setup. So you have box A that is presumably
> exporting ZFS volumes for each VM's storage across iSCSI, and you want to
> sync those volumes with send/receive across to a backup box.
>
> Where are the VMs running? How often do you need this sync to happen?
> Note that you can't sync while guests are running, as the storage may not
> be in a stable state filesystem-wise.

I concur with John: you won't be able to guarantee consistency of the filesystem within the ZVOL. Your device/filesystem chain is probably going to look like

  guest FS (ZFS/ext3/etc) -> virtual disk -> iSCSI LU -> ZVOL

While ZFS will take care that anything written to the iSCSI target is consistent, there are cases where the guest may have written only partial filesystem data at the time you snapshot/send/receive. ZFS should reduce that possibility, but may not entirely eliminate it.

One idea that I had been thinking about, but have not had a chance to test, is to export two iSCSI LUs, one from each storage server, and use them to create a mirrored filesystem on the guest (rough sketch below). Theoretically this should provide better redundancy if either storage node goes down, but I'm not sure how the iSCSI initiator will handle a dead node (TCP timeouts, etc...).

--joe
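P.S. On the guest side I'm picturing something like this - just a sketch, assuming the guest itself runs Solaris/ZFS, and the addresses and device names are made up:

  # point the guest's initiator at both storage nodes
  iscsiadm add discovery-address 192.168.1.10:3260
  iscsiadm add discovery-address 192.168.1.11:3260
  iscsiadm modify discovery --sendtargets enable

  # then mirror one LU from each node inside the guest
  zpool create guestpool mirror c2t0d0 c2t1d0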
Whoops, I reread your intended new config. It sounds like you want to do away with a dedicated storage server and just run, say, a primary xVM hypervisor and a standby xVM hypervisor, using snapshot/send/receive to copy the ZVOLs for the virtual disks to the standby node. Is that correct?

I think you still run the risk of an inconsistent filesystem on the standby node unless your guests are configured with filesystems that are guaranteed to be consistent, like ZFS. What will your guests be, anyway?

--joe

Joseph Mocker wrote:
> Concur with John, you won't be able to guarantee consistency of the
> filesystem within the ZVOL. Your device/filesystem chain is probably
> going to look like
>
>   guest FS (ZFS/ext3/etc) -> virtual disk -> iSCSI LU -> ZVOL
> ...
> ...sounds like you want to do away with a dedicated storage server and just run
> say a primary xVM hypervisor and standby xVM hypervisor, using
> snapshot/send/receive to copy the ZVOLs for the virtual disks to the standby
> node. Is that correct?
>
> I think you still run the risk of an inconsistent
> filesystem on the standby node unless your guests are configured with
> filesystems that are guaranteed to be consistent, like ZFS.

Ditto. Some vendors/partners have integrated storage and virtualization products so that a snapshot request to the VM will quiesce the guest file system before the snapshot is taken, allowing for a consistent snapshot of the guest. xVM with ZFS isn't there yet.

Short answer: xVM isn't ready for serious use.
Chris wrote:
>> ...
>> I think you still run the risk of an inconsistent
>> filesystem on the standby node unless your guests are configured with
>> filesystems that are guaranteed to be consistent, like ZFS.
>
> Ditto. Some vendors/partners have integrated storage and virtualization
> products so that a snapshot request to the VM will quiesce the guest file
> system before the snapshot is taken, allowing for a consistent snapshot of
> the guest. xVM with ZFS isn't there yet.

Technically, even quiescing the file system isn't really enough. That will only ensure your file system is consistent; unless you are running ACID-compliant applications, it's still possible that an application has not completed the write(s) needed to make its data files consistent.

> Short answer: xVM isn't ready for serious use.

I don't know if I would go that far. We've been running a cluster of xVM machines hosting various sites for Sun organizations, and we've been quite happy with the stability and performance of xVM Xen. Are there rough edges? Sure. Is xVM Xen stable enough to stay running for months? So far, for us, yes.

--joe
On Fri, Oct 16, 2009 at 08:45:25AM -0700, Joseph Mocker wrote:
> Chris wrote:
...
> > Ditto. Some vendors/partners have integrated storage and virtualization
> > products so that a snapshot request to the VM will quiesce the guest file
> > system before the snapshot is taken, allowing for a consistent snapshot of
> > the guest. xVM with ZFS isn't there yet.
>
> Technically, even quiescing the file system isn't really enough. That
> will only ensure your file system is consistent; unless you are running
> ACID-compliant applications, it's still possible that an application has
> not completed the write(s) needed to make its data files consistent.

Yepp. So we, for example, have a "cron job" on the Windows servers which shuts them down automatically at 4am. On Dom0 a cron job starts at 4am as well, which checks all DomU states for at most an hour. If a DomU is in the shutoff state, a new snapshot of its ZFS vol[s] gets created, and after that the DomU gets restarted (rough sketch of the Dom0 side at the end of this mail). So the WinAdmin still has enough freedom to decide when a snapshot/backup of his DomU should be made ...

> > Short answer: xVM isn't ready for serious use.
>
> I don't know if I would go that far. We've been running a cluster of xVM
> machines hosting various sites for Sun organizations, and we've been
> quite happy with the stability and performance of xVM Xen. Are there
> rough edges? Sure. Is xVM Xen stable enough to stay running for months?
> So far, for us, yes.

No problems wrt. stability for ~1 year, as long as one gives no more than 1 vcpu to a Win DomU (but this is snv_b98 - not sure whether it is fixed in more recent versions ...).

Regards,
jel.
-- 
Otto-von-Guericke University     http://www.cs.uni-magdeburg.de/
Department of Computer Science   Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany         Tel: +49 391 67 12768
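P.S. The Dom0 cron job looks roughly like the sketch below - domain and dataset names are just examples, I'm assuming virsh here (adapt to xm if you prefer), and the real script loops over all DomUs:

  #!/bin/sh
  # wait up to an hour for the DomU to reach the "shut off" state,
  # then snapshot its zvol and start it again
  DOMU=win01
  VOL=tank/xvm/win01-disk0
  tries=0
  while [ $tries -lt 60 ]; do
      if [ "`virsh domstate $DOMU 2>/dev/null`" = "shut off" ]; then
          zfs snapshot $VOL@`date +%Y%m%d`
          virsh start $DOMU
          exit 0
      fi
      sleep 60
      tries=`expr $tries + 1`
  done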
On 16.10.2009 18:07, Jens Elkner wrote:
> ...
> No problems wrt. stability for ~1 year, as long as one gives no more
> than 1 vcpu to a Win DomU (but this is snv_b98 - not sure whether
> it is fixed in more recent versions ...).

It is definitely fixed. I have xVM 3.4 from the gate running Win 2008 and Win 2008 R2 guests (both x64) with up to 4 vCPUs and 4 GB of RAM each, perfectly and fast. The only problem giving me headaches is the memory leak in qemu-dm, which will hopefully be fixed within the next few weeks - Mark, I count on you ;)

Florian
On Fri, Oct 16, 2009 at 07:26:46PM +0200, Florian Manschwetus wrote:
> On 16.10.2009 18:07, Jens Elkner wrote:
...
> > No problems wrt. stability for ~1 year, as long as one gives no more
> > than 1 vcpu to a Win DomU (but this is snv_b98 - not sure whether
> > it is fixed in more recent versions ...).
>
> It is definitely fixed. I have xVM 3.4 from the gate running Win 2008 and
> Win 2008 R2 guests (both x64) with up to 4 vCPUs and 4 GB of RAM each,

Ahhh - good news :)

> perfectly and fast. The only problem giving me headaches is the
> memory leak in qemu-dm, which will hopefully be fixed within the next few
> weeks - Mark, I count on you ;)

Me too ;-)

Thanx for the info,
jel.
-- 
Otto-von-Guericke University     http://www.cs.uni-magdeburg.de/
Department of Computer Science   Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany         Tel: +49 391 67 12768