Gilberto Mautner
2008-Jan-08 03:27 UTC
[zfs-discuss] Does block allocation for small writes work over iSCSI?
Hello list,

I'm thinking about this topology:

NFS Client <----NFS----> ZFS Host <---iSCSI---> ZFS Node 1, 2, 3, etc.

The idea here is to create a scalable NFS server by plugging in more nodes as more space is needed, striping data across them.

A question: we know from the docs that ZFS optimizes random-write speed by consolidating what would be many random writes into a single sequential operation. I imagine that for ZFS to be able to do that, it has to have some knowledge of the hard disk geometry. Now, if that geometry is being abstracted away by iSCSI, is that optimization still valid?

Thanks,
Gilberto
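[For reference, the storage layer of this topology can be sketched with the Solaris commands of the day. This is an illustrative sketch only, not a tested recipe; the pool/volume names, addresses, sizes, and device names are all made up.]

```shell
# On each storage node: carve out a zvol and export it as an iSCSI target.
# ("tank", "vol0", and 100g are hypothetical.)
zfs create -V 100g tank/vol0
zfs set shareiscsi=on tank/vol0      # legacy ZFS iSCSI target support

# On the ZFS host: discover the nodes' targets...
iscsiadm modify discovery --sendtargets enable
iscsiadm add discovery-address 192.0.2.11
iscsiadm add discovery-address 192.0.2.12
iscsiadm add discovery-address 192.0.2.13

# ...then stripe a pool across the resulting LUNs (device names will differ).
zpool create bigtank c2t1d0 c3t1d0 c4t1d0
```

Note that a plain stripe across nodes has no redundancy: losing any one node loses the whole pool.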
Richard Elling
2008-Jan-08 05:47 UTC
[zfs-discuss] Does block allocation for small writes work over iSCSI?
Gilberto Mautner wrote:
> Hello list,
>
> I'm thinking about this topology:
>
> NFS Client <----NFS----> ZFS Host <---iSCSI---> ZFS Node 1, 2, 3, etc.
>
> The idea here is to create a scalable NFS server by plugging in more
> nodes as more space is needed, striping data across them.

I see people doing this, but, IMHO, it seems like a waste of resources and will generally be slower than having the disks on the NFS server.

> A question: we know from the docs that ZFS optimizes random-write
> speed by consolidating what would be many random writes into a single
> sequential operation.
>
> I imagine that for ZFS to be able to do that, it has to have some
> knowledge of the hard disk geometry. Now, if that geometry is being
> abstracted away by iSCSI, is that optimization still valid?

ZFS doesn't do any optimization for hard disk geometry. Allocations are made starting at the beginning and proceeding according to the slab size. For diversity, redundant copies of metadata are spread further away, so there may be some additional "jumps," but these aren't really based on disk geometry. In other words, I believe the optimization is probably still valid.
 -- richard
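[The slab-based allocation described here can be observed directly: `zdb` will dump per-vdev metaslab (slab) usage, and the same layout applies whether the vdev is a local disk or an iSCSI LUN. A quick sketch, assuming a pool named `tank`:]

```shell
# Dump the metaslab (slab) layout for each vdev in the pool;
# allocations fill from the front, independent of device geometry.
zdb -m tank

# Repeat the flag for more detail (also prints the space maps).
zdb -mm tank
```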
Andre Wenas
2008-Jan-08 08:36 UTC
[zfs-discuss] Does block allocation for small writes work over iSCSI?
Although it looks possible, it is a very complex architecture. If you can wait, please explore pNFS: http://opensolaris.org/os/project/nfsv41/

What is pNFS?

* The pNFS protocol allows us to separate an NFS file system's data and metadata paths. With a separate data path, we are free to lay file data out in interesting ways, like striping it across multiple different file servers. For more information, see the NFSv4.1 specification.

Gilberto Mautner wrote:
> Hello list,
>
> I'm thinking about this topology:
>
> NFS Client <----NFS----> ZFS Host <---iSCSI---> ZFS Node 1, 2, 3, etc.
>
> The idea here is to create a scalable NFS server by plugging in more
> nodes as more space is needed, striping data across them.
>
> A question: we know from the docs that ZFS optimizes random-write
> speed by consolidating what would be many random writes into a single
> sequential operation.
>
> I imagine that for ZFS to be able to do that, it has to have some
> knowledge of the hard disk geometry. Now, if that geometry is being
> abstracted away by iSCSI, is that optimization still valid?
>
> Thanks,
> Gilberto
Ross
2008-Jan-09 08:39 UTC
[zfs-discuss] Does block allocation for small writes work over iSCSI?
That's pretty much exactly what I'm looking to do, except I've been calling it Tiered or Clustered ZFS, and I'm looking to serve out CIFS / iSCSI instead of NFS. Sun set up a demo in their labs before Christmas to have a look at this for me and see how it performs. Latency was their biggest concern; other than that, they seemed reasonably confident it would work OK. I'm hoping to hear back from them in the next few weeks and will be happy to pass on any results I get. This message posted from opensolaris.org
Ross
2008-Jan-09 08:50 UTC
[zfs-discuss] Does block allocation for small writes work over iSCSI?
PS. This is how I drew up the concept; I'm hoping we'll be able to cluster the ZFS Hosts by this time next year:
http://www.opensolaris.org/jive/servlet/JiveServlet/download/94-44970-177042-4435/Clustered%20ZFS.pdf
Richard Elling
2008-Jan-09 16:00 UTC
[zfs-discuss] Does block allocation for small writes work over iSCSI?
Ross wrote:
> PS. This is how I drew up the concept; I'm hoping we'll be able to
> cluster the ZFS Hosts by this time next year:
> http://www.opensolaris.org/jive/servlet/JiveServlet/download/94-44970-177042-4435/Clustered%20ZFS.pdf

A few notes:

Slide 1: striping (aka RAID-0) is not reliable.

Slide 3: I'm not sure why you say this would be available in "late 2008/early 2009" since it is in fact available today with Solaris Cluster, in some form.

But ultimately, your architecture could be described as this:

    disk -- server -<iSCSI>- server -<iSCSI>- client
or
    disk -- server -<iSCSI>- server -<NFS>- client
or
    disk -- server -<iSCSI>- server -<CIFS>- client

This really doesn't make sense, as it is far more complicated than a simpler, and more reliable, architecture which is widely adopted:

    disk -- server -<NFS/iSCSI/CIFS>- client

As we've been watching people attempt to implement these sorts of things, they enter into a new set of failure domains and often do not consider the implications. If you *really* want Enterprise levels of availability, you simplify rather than complicate the architecture, and you reduce the number of failure domains whenever you can.
 -- richard
Ross
2008-Jan-10 09:12 UTC
[zfs-discuss] Does block allocation for small writes work over iSCSI?
For slide 3, HA-ZFS is available now with HA-Storage+ if you're happy with Active/Passive. HA-iSCSI code was released just before Christmas, I believe, but is currently untested, and HA-CIFS is just a thought on the roadmap. The reason for the 2008/2009 timeline is that that's when I've been told it's likely we'll see HA-CIFS.

And yes, it's more complicated than Disk -- Server -- Client, but you could use that same argument for VMware. That goes Server -- OS -- OS -- Client instead of the traditional Server -- OS -- Client, but I think everyone would agree that there are significant advantages to that abstraction, and I see the same here.
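[For context, the HA-Storage+ (HAStoragePlus) piece is configured along these lines under Solaris Cluster 3.2. An illustrative sketch only, with made-up resource-group, resource, and pool names; a real failover NFS/CIFS service needs additional resources on top of this:]

```shell
# Register the HAStoragePlus resource type (once per cluster).
clresourcetype register SUNW.HAStoragePlus

# Create a failover resource group and put a zpool under its control;
# the pool is exported/imported as the group moves between nodes.
clresourcegroup create nfs-rg
clresource create -g nfs-rg -t SUNW.HAStoragePlus -p Zpools=tank hasp-rs

# Bring the group (and therefore the pool) online on one node.
clresourcegroup online -M nfs-rg
```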
Richard Elling
2008-Jan-10 16:05 UTC
[zfs-discuss] Does block allocation for small writes work over iSCSI?
Ross wrote:
> For slide 3, HA-ZFS is available now with HA-Storage+ if you're happy
> with Active/Passive. HA-iSCSI code was released just before Christmas,
> I believe, but is currently untested, and HA-CIFS is just a thought on
> the roadmap.
>
> The reason for the 2008/2009 timeline is that that's when I've been
> told it's likely we'll see HA-CIFS.

HA-Samba has been available for 4+ years. Sharing the file systems is the easy part. Reconciling locks is the hard part. iSCSI has very limited locking capabilities (reservations). For CIFS, the existence of HA-Samba agents sets a precedent. The HA-Samba agent is written in ksh, so it shouldn't be too scary :-)
http://opensolaris.org/os/community/ha-clusters/ohac/Documentation/Agents/open-agents/

Or, if you want to roll your own, the agent builder is relatively easy to use...

> And yes, it's more complicated than Disk -- Server -- Client, but you
> could use that same argument for VMware. That goes Server -- OS -- OS
> -- Client instead of the traditional Server -- OS -- Client, but I
> think everyone would agree that there are significant advantages to
> that abstraction, and I see the same here.

Ah, but you said "Enterprise class," and VMWare is not. VM is enterprise class, but not VMWare (but clever naming always helps make positive associations :-)
 -- richard
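[To give a flavour of what "written in ksh" means here: a cluster agent mostly implements start/stop/probe callbacks. Below is a toy probe in that spirit; the process name, config path, and return codes are illustrative, and the real open-agents code additionally handles PMF supervision, timeouts, and retries:]

```shell
#!/usr/bin/ksh
# Toy Samba probe callback: exit 0 if the service looks healthy,
# a non-zero "degree of failure" otherwise. Not the real agent logic.

probe_smbd() {
    # Hypothetical checks: is an smbd process alive, and is the
    # (assumed) config file readable?
    pgrep -x smbd > /dev/null 2>&1 || return 100   # complete failure
    [ -r /etc/sfw/smb.conf ]       || return 50    # partial failure
    return 0
}

probe_smbd
exit $?
```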