I've been looking to build my own cheap SAN to explore HA scenarios with VMware hosts, though not for a production environment. I'm new to OpenSolaris, but I am familiar with other clustered HA systems. The features of ZFS seem like they would fit right in with building an HA storage platform for VMware hosts on inexpensive hardware.

Here is what I am thinking. I want to have at least two clustered nodes (possibly virtual, running off the local storage of the VMware hosts) that act as the front end of the SAN. These will not have any real storage themselves, but will be initiators for back-end computers with the actual disks in them. I want to be able to add and remove/replace storage at will, so I figure the back ends will just be fairly dumb iSCSI targets that present each disk individually. That way the front ends are close to the hardware, where ZFS works best, but a RAID set is not limited to the capacity of a single enclosure.

I'd like to present a RAIDZ2 array as a block device to VMware; how would that work? Could that then be clustered so the iSCSI target is HA? Am I completely off base, or is there an easier way? My goal is to be able to kill any one box (or several) and still keep the storage available to VMware, while still getting a better total-to-usable storage ratio than a plain mirror (2:1). I also want to be able to add and remove storage dynamically. You know, champagne on a beer budget. :)
On Mon, Aug 31, 2009 at 3:42 PM, Jason <wheelz311 at hotmail.com> wrote:

> I've been looking to build my own cheap SAN to explore HA scenarios with
> VMware hosts, though not for a production environment. [...]
>
> I'd like to present a RAIDZ2 array as a block device to VMware; how would
> that work? Could that then be clustered so the iSCSI target is HA? [...]

Any particular reason you want to present block storage to VMware? It works as well, if not better, over NFS, and saves a LOT of headaches.

--Tim
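As a concrete sketch of the NFS approach Tim describes: export a ZFS filesystem over NFS and mount it on the ESX hosts as an NFS datastore. The pool name, dataset name, and subnet below are hypothetical placeholders.

    # dataset for VM storage on an existing pool (pool name "tank" assumed)
    zfs create tank/vmware

    # share it read/write, with root access for the ESX VMkernel subnet
    # (root= is needed so ESX can write files owned by root over NFS)
    zfs set sharenfs='rw=@192.168.10.0/24,root=@192.168.10.0/24' tank/vmware

    # confirm the export is active
    share

On the VMware side you would then add an NFS datastore pointing at server:/tank/vmware from each host's storage configuration.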
Well, I knew a guy who was involved in a project to do just that for a production environment. Basically they abandoned the approach because there was a huge performance hit using ZFS over NFS. I didn't get the specifics, but his group is usually pretty sharp; I'll have to check back with him. So mainly I want to avoid that, but also VMware tends to roll out storage features on NFS last, after Fibre Channel and iSCSI.

*Sorry if this is a duplicate... still learning the workings of this discussion forum.*
On Mon, Aug 31, 2009 at 4:26 PM, Jason <wheelz311 at hotmail.com> wrote:

> Well, I knew a guy who was involved in a project to do just that for a
> production environment. Basically they abandoned the approach because there
> was a huge performance hit using ZFS over NFS. [...]

That's not true at all. Dynamic grow and shrink has been available on NFS forever. You STILL can't shrink VMFS, and they've JUST added grow capabilities. Not to mention NFS datastores being thin provisioned by default.

As for performance, I have a tough time believing his performance issues were because of NFS and not some other underlying bug. I've got MASSIVE deployments of VMware on NFS over 10g that achieve stellar performance (admittedly, it isn't on ZFS).

--Tim
Specifically, I remember Storage VMotion being supported on NFS last, as well as jumbo frames. That's just the impression I get from past features; perhaps they are doing better with that now. I know the performance problem had specifically to do with ZFS and the way it handled something. I know of lots of implementations with just straight NFS, so I know that works. I'm not opposed to NFS, but I was hoping what he saw was specific to the combination of ZFS over NFS; he said he didn't know if it would happen over iSCSI. So I thought I'd try that first. I'll have to see if I can get the details from him tomorrow.
On Aug 31, 2009, at 17:29, Tim Cook wrote:

> I've got MASSIVE deployments of VMware on NFS over 10g that achieve
> stellar performance (admittedly, it isn't on ZFS).

Without a separate ZIL device, NFS on ZFS would probably be slower; hence why Sun's own appliances use SSDs.
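For context, a minimal sketch of what a separate ZIL (slog) device looks like in practice. The pool and device names are hypothetical; on real hardware you would pick the SSDs from the output of 'format'.

    # add an SSD as a dedicated log device (slog) so synchronous NFS/iSCSI
    # writes are acknowledged from flash instead of spinning disk
    zpool add tank log c4t0d0

    # optionally add a second SSD as an L2ARC read cache
    zpool add tank cache c4t1d0

    # confirm the new layout
    zpool status tank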
On Mon, 2009-08-31 at 18:26 -0400, David Magda wrote:

> Without a separate ZIL device, NFS on ZFS would probably be slower;
> hence why Sun's own appliances use SSDs.

Hmm. On a related note: I'm looking to be using Sun's xVM on our Nehalem (x4170) machines, and was assuming I'd be best off using iSCSI targets exported from my ZFS-based disk machine.

Under xVM (Xen-based, or possibly VirtualBox, too), would I be better off having an iSCSI raw partition mounted on the xVM server, or using NFS? (Assume I would have SSD accelerators on the ZFS disk machine.) I'm looking at performance issues, not things like being able to grow the image under xVM (I'm hosting QA machines in xVM).

--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
So, aside from the NFS debate, would this two-tier approach work? I am a bit fuzzy on how I would get the RAIDZ2 redundancy but still present the volume to the VMware host as a raw device. Is that possible, or is my understanding wrong? Also, could it be defined as a clustered resource?
Richard Elling wrote on 2009-Sep-01 18:53 UTC, re: [zfs-discuss] ZFS iSCSI Clustered for VMware Host use:
On Sep 1, 2009, at 11:45 AM, Jason wrote:

> So, aside from the NFS debate, would this two-tier approach work? I am
> a bit fuzzy on how I would get the RAIDZ2 redundancy but still
> present the volume to the VMware host as a raw device. Is that
> possible, or is my understanding wrong? Also, could it be defined as
> a clustered resource?

The easiest and proven method is to use shared disks, two heads, ZFS, and Open HA Cluster to provide highly available NFS or iSCSI targets. This is the fundamental architecture for most HA implementations. An implementation which does not use Open HA Cluster is available in appliance form as the Sun Storage 7310 or 7410 Cluster System. But if you are building your own, Open HA Cluster may be a better choice than rolling your own cluster software.
 -- richard
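To make the RAIDZ2-as-block-device part concrete, here is a minimal sketch of carving a zvol out of a raidz2 pool and exporting it over iSCSI with COMSTAR on OpenSolaris. Pool, volume, and disk names are placeholders, and the failover piece (Open HA Cluster, or whatever moves the target between heads) is not shown.

    # raidz2 pool across six disks (device names are placeholders)
    zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0

    # a 500 GB zvol to hand to VMware as a raw block device
    zfs create -V 500g tank/vmfs01

    # enable the COMSTAR framework and iSCSI target services
    svcadm enable stmf
    svcadm enable -r svc:/network/iscsi/target:default

    # register the zvol as a logical unit, make it visible, create a target
    sbdadm create-lu /dev/zvol/rdsk/tank/vmfs01
    stmfadm add-view <GUID printed by sbdadm>
    itadm create-target

VMware's software iSCSI initiator then discovers the target, and the LUN can be formatted as a VMFS datastore or handed to a VM as an RDM.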
True, though an enclosure for shared disks is expensive. This isn't for production, but for me to explore what I can do with x86/x64 hardware, the idea being that I can just throw up another x86/x64 box to add more storage. Has anyone tried anything similar?
On Tue, Sep 1, 2009 at 2:17 PM, Jason <wheelz311 at hotmail.com> wrote:

> True, though an enclosure for shared disks is expensive. This isn't for
> production, but for me to explore what I can do with x86/x64 hardware. [...]

I still don't understand why you need this two-layer architecture. Just add a server to the mix and add the new storage to VMware. If you're doing iSCSI, you'll hit the LUN size limitations long before you need a second box.

--Tim
Richard Elling wrote on 2009-Sep-01 19:29 UTC:
On Sep 1, 2009, at 12:17 PM, Jason wrote:

> True, though an enclosure for shared disks is expensive. This isn't
> for production, but for me to explore what I can do with x86/x64
> hardware, the idea being that I can just throw up another x86/x64
> box to add more storage. Has anyone tried anything similar?

You mean something like this?

  disk ---- server ---+
                      +-- server --- network --- client
  disk ---- server ---+

I'm not sure how that can be less expensive in the TCO sense.
 -- richard
I guess I should come at it from the other side:

If you have one iSCSI target box and it goes down, you're dead in the water.

If you have two iSCSI target boxes that replicate and one dies, you are OK, but you then have a 2:1 total storage to usable ratio (excluding expensive shared disks).

If you have two tiers, where n cheap back-end iSCSI targets hold the physical disks and present them to two clustered virtual iSCSI target servers (assuming this can be done with disks over iSCSI), which in turn present iSCSI targets to the VMware hosts, then any one server could go down and everything would keep running. It would create a virtual clustered pair that is basically doing RAID over the network (iSCSI). Since you already have the VMware hosts, the two virtual front ends are "free". None of the back-end servers would need redundant components, because any one can fail, so you should be able to build them with inexpensive parts.

This would also allow you to add/replace storage easily (I hope). Perhaps you'd have to RAIDZ the back-end disks together and then present them to the front end, which would RAIDZ all the back ends together. For example, if you had 5 back-end boxes with 8 drives each, you'd have a 10:7 raw-to-usable ratio (40 drives raw, 28 usable). I'm sure the RAID combinations could be played with to get the balance of redundancy and capacity that you need. I don't know what kind of performance hit you would take doing that over iSCSI, but I thought it might work as long as you have gigabit speeds. Or I could be completely off my rocker. :) Am I?
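A hedged sketch of how the front-end half of this could look on OpenSolaris, assuming each back-end box already exports its raw disks as individual iSCSI LUNs (for example with COMSTAR, one logical unit per disk, as in the earlier sketch). All addresses and device names here are placeholders.

    # on the front-end node: point the iSCSI initiator at each back-end box
    iscsiadm add discovery-address 192.168.20.11:3260
    iscsiadm add discovery-address 192.168.20.12:3260
    iscsiadm modify discovery --sendtargets enable

    # create device nodes for the discovered LUNs
    devfsadm -i iscsi

    # build a raidz2 pool directly on the remote LUNs; the cNtNdN names
    # below stand in for whatever 'format' shows for the iSCSI disks
    zpool create bigtank raidz2 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0

The pool would then be carved into zvols or filesystems and re-exported to the VMware hosts from whichever front-end node currently owns it; making that ownership fail over cleanly is the part that needs cluster software.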
Scott Meilicke wrote on 2009-Sep-01 20:49 UTC:
You are completely off your rocker. :) No, just kidding. Assuming the virtual front-end servers are running on different hosts, and you are doing some sort of RAID, you should be fine. Performance may be poor due to the inexpensive targets on the back end, but you probably know that. A while back I thought of doing similar stuff using local storage on my ESX hosts and abstracting it with an OpenSolaris VM and iSCSI/NFS.

Perhaps consider inexpensive but decent NAS/SAN devices from Synology. They are not expensive, offer NFS and iSCSI, and you can also replicate/backup between two of them using rsync. Yes, you would be 'wasting' storage space by having two, but like I said, they are inexpensive. Then you would not have the two-layer architecture.

I just tested a two-disk model, using ESXi 3.5u4 and a Windows VM. I used IOMeter with a real-world test pattern, and IOs were about what you would expect from mirrored 7200 RPM SATA drives: 138 IOPS, about 1.1 Mbps. The internal CPU was around 20% and RAM usage was 128MB out of the 512MB on board, so it was disk limited. The Dell 2950 that I have 2009.06 installed on (16GB of RAM and an LSI HBA with an external SAS enclosure), with a single mirror using two 7200 RPM drives, gave me about 200 IOPS on the same test, presumably because of the large amount of RAM available to the ARC.

-Scott
Richard Elling wrote on 2009-Sep-01 21:21 UTC:
On Sep 1, 2009, at 1:28 PM, Jason wrote:

> If you have one iSCSI target box and it goes down, you're dead in the
> water.

Yep.

> If you have two iSCSI target boxes that replicate and one dies, you
> are OK, but you then have a 2:1 total storage to usable ratio
> (excluding expensive shared disks).

Servers cost more than storage, especially when you consider power.

> If you have two tiers, where n cheap back-end iSCSI targets hold the
> physical disks and present them to two clustered virtual iSCSI target
> servers, which in turn present iSCSI targets to the VMware hosts, then
> any one server could go down and everything would keep running. [...]
> None of the back-end servers would need redundant components, because
> any one can fail, so you should be able to build them with inexpensive
> parts.

This will certainly work. But it is, IMHO, too complicated to be effective at producing highly available services. Too many parts means too many opportunities for failure (yes, even VMware fails).

The problem with your approach is that you seem to be considering only failures of the type "it's broke, so it is completely dead." Those aren't the kind of failures that dominate real life. When we design highly available systems for the datacenter, we spend a lot of time on rapid recovery. We know things will break, so we try to build systems and processes that can recover as quickly as possible. This leads to the observation that reliability trumps redundancy: though we build fast-recovery systems, it is better to not need to recover at all. Hence we developed dependability benchmarks which expose the cost/dependability trade-offs. More reliable parts tend to cost more, but the best approach is to have fewer reliable parts rather than more unreliable parts.

> This would also allow you to add/replace storage easily (I hope). [...]
> I don't know what kind of performance hit you would take doing that
> over iSCSI, but I thought it might work as long as you have gigabit
> speeds. Or I could be completely off my rocker. :) Am I?

Don't worry about bandwidth. It is the latency that will kill performance. Adding more stuff between your CPU and the media means increasing latency.
 -- richard