thr3ads.net - zfs discuss - [zfs-discuss] opensolaris-vmware [Jan 2010]


Greg

2010-Jan-12 00:17 UTC

[zfs-discuss] opensolaris-vmware

Hello all,
I hope this makes sense. I have two OpenSolaris machines with a bunch of hard
disks. One acts as an iSCSI SAN, and the other is identical apart from the hard
disk configuration. The only things being served are VMware ESXi raw disks,
which hold either virtual machines or data that a particular virtual machine
uses. For example, we have Exchange 2007 virtualized, and through its iSCSI
initiator we are mounting two LUNs, one for the database and another for the
logs, all on different arrays of course. We then replicate this data across
the SAN network to the other box using ZFS snapshots and send/recv, so that if
the first box fails, the second can immediately serve all of the iSCSI LUNs.

The problem (and I don't really know if it's a problem) is this: when I
snapshot a running VM, will it come up alive in ESXi, or do I have to
accomplish this in a different way? These snapshots will then be written to
tape with Bacula. I hope I am posting this in the correct place.

Thanks, 
Greg
-- 
This message posted from opensolaris.org

Tim Cook

2010-Jan-12 03:36 UTC

[zfs-discuss] opensolaris-vmware

What you've got are crash-consistent snapshots. The disks are in the same
state they would be in if you pulled the power plug. They may come up just
fine, or they may be in a corrupt state. If you take snapshots frequently
enough, you should have at least one good snapshot. Your other option is
scripting. You can build custom scripts to leverage the VSS providers in
Windows... but it won't be easy.
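
For illustration, taking snapshots frequently and keeping only the most recent
few could be scripted roughly as follows. This is a minimal sketch, not
anything from the setup described in this thread: the dataset name
tank/exchange-db, the "auto-" naming scheme and the retention count are all
assumptions. It only builds the `zfs snapshot` and `zfs destroy` command
lines; running them (for example from cron) is left to the caller.

```python
from datetime import datetime, timezone


def snapshot_name(dataset: str, when: datetime) -> str:
    """Build a snapshot name whose timestamp suffix sorts chronologically.

    `when` is expected to be a timezone-aware datetime; it is converted to UTC.
    The "auto-" prefix is a hypothetical naming convention.
    """
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%dT%H%MZ")
    return f"{dataset}@auto-{stamp}"


def snapshot_command(dataset: str, when: datetime) -> list[str]:
    """Return the argv for `zfs snapshot <dataset>@<name>`."""
    return ["zfs", "snapshot", snapshot_name(dataset, when)]


def prune_commands(existing: list[str], keep: int) -> list[list[str]]:
    """Return `zfs destroy` argv lists for all but the newest `keep` snapshots.

    `existing` holds snapshot names of ONE dataset, all created with
    snapshot_name(), so sorting the strings sorts them by time.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    ordered = sorted(existing)
    return [["zfs", "destroy", name] for name in ordered[:-keep]]
```

Keeping several generations matters here because any single crash-consistent
snapshot may turn out to be unusable.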

Any reason in particular you're using iSCSI? I've found NFS to be much
simpler to manage, and performance to be equivalent if not better (in large
clusters).

-- 
--Tim

Arnaud Brand

2010-Jan-12 09:11 UTC

[zfs-discuss] opensolaris-vmware

Your machines won't come up running; they'll start up from scratch (as if
you had hit the reset button).

If you want your machines to come up running, you have to make VMware
snapshots, which capture the state of the running VM (memory, etc.). Typically
this is automated with solutions like VCB (VMware Consolidated Backup), but
I've just found http://communities.vmware.com/docs/DOC-8760 (not tested,
though, since we are running ESX and have bought VCB licenses).

Bear in mind that VMware won't be able to take a consistent snapshot if some
disks in the VM come from VMDK files while other disks are raw LUNs (or are
otherwise mounted directly in the VM, that is, outside the control of ESX).
In that case you'll have to restart the machine from scratch, with a strong
potential for discrepancies between the VMDK and the raw LUNs.
On the other hand, I understand that you want the Exchange 2007 logs and DB to
live their own lives, so that when you "revert to snapshot" you don't lose all
the mail that was sent/delivered in between.
So this can be a perfectly valid design depending on how you have set it up.

I don't think snapshots (be they VMware or ZFS) are a good tool for failover
or redundancy here. Basically, if your storage is not accessible from your
ESXi hosts, your VMs are toast and you have to restart them from scratch.
Please note that I don't know the specifics of ESXi's iSCSI retry policies.
For ESX we use an SVC cluster (a 2-node FC cluster), so our ESX hosts can
always access the storage.

You could try to set up an iSCSI cluster like this:
http://docs.sun.com/app/docs/doc/820-7821/z40000f557a?a=view (look for the
figure at the bottom). You would obtain a mirrored pool where you could place
the VMware zvols. Then you could share these zvols over iSCSI.
I'm not sure if/how OpenHA could/would fail over if one of your nodes fails,
though (I always wanted to play with OpenHA but have neither the time nor the
hardware at hand to try it).

This setup of course doesn't prevent you from doing VMware snapshots and ZFS
snapshots; it just adds some level of fault tolerance.

Please note I don't know anything about using NFS with ESX/ESXi. Maybe there
are setups that are easier to achieve using NFS and provide the same (or a
better) level of fault tolerance.

Hope this helps,
Arnaud

Gregory Durham

2010-Jan-13 23:24 UTC

[zfs-discuss] opensolaris-vmware

Tim,
iSCSI was a design decision at the time. Performance was key, and I wanted to
be able to hand a LUN on the SAN to ESXi and use it as a raw disk in physical
compatibility mode. The consequence, however, is that I can no longer take
snapshots on the ESXi server and must rely on ZFS snapshots. I also have
multiple *nix virtual machines that I need to back up, and I need to make sure
their file systems are consistent if all else fails.

Thanks,
Greg

Gregory Durham

2010-Jan-13 23:40 UTC

[zfs-discuss] opensolaris-vmware

Arnaud,
The virtual machines coming up as if they were still running is the least of
my worries. My biggest worry is keeping the filesystems of the VMs alive, i.e.
not corrupt. I have all of my virtual machines set up with raw LUNs in
physical compatibility mode. This has increased performance, but sadly at the
cost of VMware snapshots. Is there anything I can do within the virtual
machine itself to keep the filesystem intact?

In the case of Exchange, I have Exchange itself on a raw LUN in physical
compatibility mode, and I have 2 LUNs mounted with the Server 2008 iSCSI
initiator for the logs and the Exchange DB.

This setup is similar to that of several other *nix VMs I have residing on
this SAN, which I am also worried about. Any other ideas?

Thanks,
Greg


Fajar A. Nugraha

2010-Jan-14 04:26 UTC

[zfs-discuss] opensolaris-vmware

On Thu, Jan 14, 2010 at 6:40 AM, Gregory Durham
<gregory.durham at gmail.com> wrote:
> Arnaud,
> The virtual machines coming up as if they were on is the least of my
> worries, my biggest worry is keeping the filesystems of the vms alive i.e.
> not corrupt.

As Tim said, the snapshotted disks are in the same state they would be in
if you pulled the power plug.
This is also what you get, BTW, if you use LVM snapshots (on
Linux) or SAN/NAS-based snapshots (like NetApp).

> In the case of exchange, I have exchange itself on a raw lun in physical
> compatibility mode, and I have 2 LUNs mounted with the Server 2008 iSCSI
> initiator for logs and the exchange DB.

Most modern filesystems and databases have journaling that can recover
from power-failure scenarios, so they should be able to use the
snapshot and provide consistent, non-corrupt information.

So the question now is, have you tried restoring from a snapshot?
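
One low-risk way to try a restore is to clone a snapshot into a writable
copy, present that copy to a test VM, and destroy it afterwards. The sketch
below only builds the `zfs clone` and `zfs destroy` command lines. The
snapshot name and the "-restoretest" suffix are made-up examples, and
exporting the clone as an iSCSI LUN is deliberately left out because it
depends on the target setup.

```python
def restore_test_commands(snapshot: str, suffix: str = "restoretest"):
    """Plan a throwaway clone of `snapshot` for a restore test.

    Returns (clone_name, create_argv, cleanup_argv). The clone is writable,
    so the guest can replay its journal without touching the snapshot itself.
    """
    dataset, sep, snap = snapshot.partition("@")
    if not (dataset and sep and snap):
        raise ValueError(f"not a snapshot name: {snapshot!r}")
    clone_name = f"{dataset}-{suffix}"  # hypothetical naming convention
    create = ["zfs", "clone", snapshot, clone_name]
    cleanup = ["zfs", "destroy", clone_name]
    return clone_name, create, cleanup
```

If the test VM boots from the clone and the application opens its data
cleanly, that particular snapshot is usable. It does not prove that every
future snapshot will be.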

-- 
Fajar

Gregory Durham

2010-Jan-14 05:12 UTC

[zfs-discuss] opensolaris-vmware

Haha, yeah, that's tomorrow. I have a test VM I will be testing on. I shall
report back! Thank you all!

Gregory Durham

2010-Jan-14 17:33 UTC

[zfs-discuss] opensolaris-vmware

Several other users on this mailing list have recommended that I use
snapshots inside the VM, VMware snapshots, and ZFS snapshots together. I
believe I understand the difference between filesystem snapshots and
block-level snapshots. However, I cannot use VMware snapshots, because all
LUNs on the SAN are mapped to ESXi as raw disks in physical compatibility
mode, which disables VMware snapshots. Does this leave me with a weaker
backup strategy? What else can I do? Should I convert the virtual machines
from physical compatibility to virtual compatibility in order to get
snapshotting on the ESXi server?

Thanks for all the helpful information!
Greg

Fajar A. Nugraha

2010-Jan-15 00:45 UTC

[zfs-discuss] opensolaris-vmware

On Fri, Jan 15, 2010 at 12:33 AM, Gregory Durham
<gregory.durham at gmail.com> wrote:
> I have been recommended by several other users on this mailing list to use
> inside the vm snapshots, vmware snapshots, and then use zfs snapshots. I
> believe I understand the difference between filesystem snapshots vs block
> level snapshots, however since I cannot use vmware snapshots (all LUNs on
> the SAN are mapped to ESXi using RAW disk in physical compatibility mode,
> which then disables vmware snapshots) does this cause me to have a weaker
> backup strategy? What else can I do? Should I convert the virtual machines
> from physical compatibility to virtual compatibility in order to get
> snapshotting on the ESXi server?

IMHO using all three is too much. You can pick one and combine it
with another (non-snapshot) backup strategy.
A VMware snapshot is good because it also stores the memory state, but it
also uses more space.

What I recommend you do in your current setup:
- Check whether your application can survive an unclean shutdown/power
outage (it should). If not, then you have to do application-specific
backups.
- Do ZFS snapshots plus send/receive.
- Add regular tape backups if necessary, although they might not need to
be as frequent (you already plan this).
- Regularly exercise restoring from backups, to make sure your backup
system works.

-- 
Fajar

Gregory Durham

2010-Jan-19 20:46 UTC

[zfs-discuss] opensolaris-vmware

Thank you so much, Fajar!
You have been incredibly helpful. I will do as you said. I am just glad I
have not been going down the wrong path!

Thanks,
Greg
