I'm seriously looking at using the SunCluster software in combination with ZFS (either in Sol 10u2, or Nevada). I'm really looking at doing a dual-machine HA setup, probably active-active.

How well does ZFS play in a SunCluster? I've looked at the "zfs [import|export]" stuff, and I'm a little confused as to how it might work in an HA environment for complete hot-failover. Especially if I do something like throw Zones into the mix (as: run 2 machines, each with 2 zones on them, and cluster the zones in 2 clusters, with both machines dual-attached to some JBOD).

-- 
Erik Trimble
Java System Support
Mailstop: usca14-102
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
Alexandre CHARTRE - Solaris Sustaining
2006-May-30 15:05 UTC
[zfs-discuss] ZFS and Sun Cluster....
The current version of Sun Cluster (3.1) has no support for ZFS. You will be able to use ZFS as a failover filesystem with Sun Cluster 3.2, which will be released by the end of this year.

alex.

Erik Trimble wrote:
> I'm seriously looking at using the SunCluster software in combination
> with ZFS (either in Sol 10u2, or Nevada). I'm really looking at doing a
> dual-machine HA setup, probably active-active.
>
> How well does ZFS play in a SunCluster? I've looked at the "zfs
> [import|export]" stuff, and I'm a little confused as to how it might
> work in an HA environment for complete hot-failover. Especially if I do
> something like throw Zones into the mix (as: run 2 machines, each with
> 2 zones on them, and cluster the zones in 2 clusters. both machines
> dual-attached to some JBOD).
SunCluster will support ZFS in our 3.2 release, via the HAStoragePlus resource type. This support will be for failover use only, not scalable or active-active applications. It will use the import|export stuff to do its work. It required code modifications to HAStoragePlus because ZFS makes non-conventional use of vfstab and of the "device" argument of the mount command.

HAStoragePlus will also support zones in SunCluster 3.2. In zones, however, HAStoragePlus will first mount the filesystem in the global zone, then export it to the local zone via lofs.

SunCluster 3.2 will support Sol 10u2, but not Nevada or OpenSolaris.

-Charles

>Date: Fri, 26 May 2006 12:08:38 -0700
>From: Erik Trimble <Erik.Trimble at sun.com>
>Subject: [zfs-discuss] ZFS and Sun Cluster....
>To: ZFS Discussions <zfs-discuss at opensolaris.org>
>
>I'm seriously looking at using the SunCluster software in combination
>with ZFS (either in Sol 10u2, or Nevada). I'm really looking at doing a
>dual-machine HA setup, probably active-active.
>
>How well does ZFS play in a SunCluster? I've looked at the "zfs
>[import|export]" stuff, and I'm a little confused as to how it might
>work in an HA environment for complete hot-failover. Especially if I do
>something like throw Zones into the mix (as: run 2 machines, each with
>2 zones on them, and cluster the zones in 2 clusters. both machines
>dual-attached to some JBOD).
>
>--
>Erik Trimble
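For anyone puzzling over how import/export turns into a failover: at the pool level the commands are `zpool export` and `zpool import` (not `zfs`). The sketch below only builds the command sequence for a manual hand-over between two nodes and never runs anything. It is not a description of what HAStoragePlus does internally, which this thread does not spell out; the pool name `tank`, the node names, and the function name `failover_plan` are all made up for illustration.

```python
# Illustrative sketch only: builds (but never executes) the commands for a
# manual pool hand-over between two nodes. Pool and node names are hypothetical.

def failover_plan(pool, old_node, new_node, old_node_alive=True):
    """Return an ordered list of (node, argv) steps that move `pool`."""
    steps = []
    if old_node_alive:
        # Clean hand-over: the current owner releases the pool first,
        # then the new owner imports it.
        steps.append((old_node, ["zpool", "export", pool]))
        steps.append((new_node, ["zpool", "import", pool]))
    else:
        # The old owner went away without exporting, so the pool still
        # looks in use; -f forces the import. This is only safe if the
        # old node is really down or fenced off from the disks.
        steps.append((new_node, ["zpool", "import", "-f", pool]))
    return steps


if __name__ == "__main__":
    for node, argv in failover_plan("tank", "node1", "node2"):
        print(f"{node}# {' '.join(argv)}")
```

Because a pool can be imported on only one node at a time, this is inherently a failover model, which fits the "failover use only" limitation described above.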
> SunCluster will support ZFS in our 3.2 release of SunCluster,
> via the HAStoragePlus resource type. This support will be for
> failover use only, not scalable or active-active applications.

What about quorum reservation in ZFS storage pools? AFAIK ZFS does not support SCSI-3 persistent group reservation (PGR). Will that be emulated, as is also done on SCSI-2 with PGRE?

Why is there no storage communication via the ORB, as we had in SunCluster 3.1 with SDS or Veritas, given that global storage is actually a virtual layer above the RAID software layer? Why doesn't that work with ZFS? Any information/clues? The limitations mentioned do sound strange to me. Is it planned to have the cluster fs or proxy fs layer between the ZFS layer and the storage pool layer?

This sounds exciting and I'm eager to learn more about this. FEED ME :)

Sorry if these questions sound a bit too detailed; my knowledge base is SunCluster 3.1 and I'm currently proof-reading parts of Rolf's book on SunCluster, which is scheduled to be released Q1/2007.

Tatjana
Re: Quorum and ZFS.

PGR is a property of the SCSI devices in the zpool, not a property of ZFS or the zpool. The same is true for the SCSI-2 reserve/release protocol. The PGRE protocol requires a reserved part of the disk. However, this part is reserved at the "label" level: while it is not part of the label itself, the Solaris label sets aside a few tracks of the disk for use by PGRE. Therefore, it should be possible to use a disk that is in the zpool as a quorum device in both PGR and PGRE situations.

Re: Why doesn't GFS (PxFS) just work on ZFS.

Given that PxFS works at the vfs layer, I would have thought that this would have "just worked". However, it turns out that PxFS has too much knowledge of the interaction of the filesystem and the virtual memory system for this to work. I think the biggest problem was the variable blocksize feature of ZFS. It is possible that PxFS will be made to work with ZFS, but I doubt it. The current direction is more towards making ZFS itself a "cluster filesystem" so that all of the advantages of ZFS can be utilized without being hampered by a layered filesystem.

If you need more details about why PxFS did not work with ZFS, contact me, and I will try to get more details.

-Charles

>Date: Tue, 30 May 2006 10:30:14 -0700 (PDT)
>From: Tatjana S Heuser <theuser at orbit.in-berlin.de>
>Subject: [zfs-discuss] Re: ZFS and Sun Cluster....
>To: zfs-discuss at opensolaris.org
>X-OpenSolaris-URL: http://www.opensolaris.org/jive/message.jspa?messageID=40507&tstart=0#40507
>
>> SunCluster will support ZFS in our 3.2 release of SunCluster,
>> via the HAStoragePlus resource type. This support will be for
>> failover use only, not scalable or active-active applications.
>
>What about quorum reservation in ZFS storage pools.
>AFAIK ZFS does not support SCSI3 persistent group reservation (PGR)
>Will that be emulated? As also done on SCSI2 with PGRE ?
>Why is there no storage communication via the ORB as we had it in
>SunCluster 3.1 with SDS or Veritas since global storage actually is a
>virtual layer above the raid sw layer. Why doesn't that work with ZFS.
>Any information/clues? The limitations mentioned do sound strange to me.
>Is it planned to have the cluster fs or proxy fs layer between the ZFS layer
>and the Storage pool layer?
>
>This sounds exciting and I'm eager to learn more about this. FEED ME :)
>
>Sorry if the questions in question may sound a bit too detailed, my knowledge
>base is SunCluster 3.1 and I'm currently proof-reading parts of Rolfs book on
>SunCluster which is scheduled to be released Q1/2007.
>
>Tatjana
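As background on why a quorum device matters for the dual-machine setup discussed here, the vote arithmetic can be sketched roughly as follows. This assumes the usual Sun Cluster rule, which is not stated anywhere in this thread: each node has one vote, a quorum device gets one vote fewer than the number of nodes attached to it, and a partition may keep running only with a strict majority of all configured votes. The function names are hypothetical.

```python
# Rough sketch of quorum vote counting under the assumptions stated above.
# Nothing here talks to a real cluster; it is arithmetic only.

def total_votes(node_count, quorum_device_ports):
    """Configured votes: one per node, plus (ports - 1) per quorum device."""
    return node_count + sum(max(ports - 1, 0) for ports in quorum_device_ports)


def has_quorum(votes_held, votes_configured):
    """A partition needs a strict majority of the configured votes."""
    return 2 * votes_held > votes_configured


if __name__ == "__main__":
    # Two nodes sharing one dual-ported quorum disk.
    configured = total_votes(2, [2])
    print("configured votes:", configured)
    # Surviving node that also holds the quorum disk: 1 + 1 votes.
    print("survivor with quorum disk:", has_quorum(2, configured))
    # Surviving node that lost the race for the quorum disk: 1 vote.
    print("survivor without quorum disk:", has_quorum(1, configured))
```

With only the two node votes and no quorum device, neither node alone would ever hold a majority, which is why a two-node cluster needs the extra vote.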
On Tue, 2006-05-30 at 10:30 -0700, Tatjana S Heuser wrote:
> > SunCluster will support ZFS in our 3.2 release of SunCluster,
> > via the HAStoragePlus resource type. This support will be for
> > failover use only, not scalable or active-active applications.
>
> What about quorum reservation in ZFS storage pools.
> AFAIK ZFS does not support SCSI3 persistent group reservation (PGR)
> Will that be emulated? As also done on SCSI2 with PGRE ?

Reservations work at a lower level than ZFS. In the case that a LUN was reserved away from ZFS, it would appear as an error. Since we implement failfast in the SCSI driver, the fenced node will panic. For quorum, ZFS will never see the reservation.

> Why is there no storage communication via the ORB as we had it in
> SunCluster 3.1 with SDS or Veritas since global storage actually is a
> virtual layer above the raid sw layer. Why doesn't that work with ZFS.
> Any information/clues? The limitations mentioned do sound strange to me.
> Is it planned to have the cluster fs or proxy fs layer between the ZFS layer
> and the Storage pool layer?

I can't speak for Sun Cluster engineering, but the use of the HAStoragePlus resource type is much more common today than pxfs/GFS. Also, there was a discussion a few months ago on this forum about requirements for a distributed ZFS. If you have requirements, now is the time to make them known.

 -- richard
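The point about reservations sitting below ZFS can be pictured with a toy model: a node that has been fenced gets a reservation conflict from the LUN, and with failfast enabled that conflict becomes a node panic before ZFS has to deal with it. This is a deliberately simplified sketch. The real behaviour lives in the SCSI driver and the kernel, and every class and function name below is invented for illustration.

```python
# Toy model of fencing via reservations plus failfast. Not real driver code;
# all names are hypothetical.

class ReservationConflict(Exception):
    """The LUN is reserved by a different node."""


class NodePanic(Exception):
    """Stands in for the fenced node panicking."""


class Lun:
    def __init__(self):
        self.holder = None  # node currently holding the reservation, if any

    def reserve(self, node):
        # The surviving node takes the reservation, fencing everyone else.
        self.holder = node

    def write(self, node):
        if self.holder is not None and self.holder != node:
            raise ReservationConflict(f"LUN reserved by {self.holder}")
        return "ok"


def driver_write(lun, node, failfast=True):
    """I/O path below the filesystem: with failfast, a conflict is fatal."""
    try:
        return lun.write(node)
    except ReservationConflict as conflict:
        if failfast:
            raise NodePanic(f"{node}: {conflict}, failfast panic") from conflict
        raise  # without failfast the filesystem would just see an I/O error


if __name__ == "__main__":
    lun = Lun()
    print(driver_write(lun, "node1"))   # no reservation yet, write succeeds
    lun.reserve("node2")                # node2 fences node1 off
    try:
        driver_write(lun, "node1")
    except NodePanic as panic:
        print("panic:", panic)
```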
Tatjana S Heuser wrote:
> Is it planned to have the cluster fs or proxy fs layer between the ZFS layer
> and the Storage pool layer?

This, AFAIK, is not the current plan of action. Sun Cluster should be moving towards ZFS as a 'true' cluster filesystem.

IMHO, the decision not to go the 'proxy fs layer' way (PxFS/GFS) is not due to technical infeasibility.

Regards,
Manoj
-- 
Global Data and Devices, Sun Cluster Engineering.