Hi!

I've been testing ZFS, and would like to use it on SAN-attached disks in our production environment, where multiple machines can see the same zpools. I have some concerns about importing/exporting pools in possible failure situations. If a box that was using some zpool crashes (for example, when sending a break to the host while testing this), I would like to import that pool on some other host right away. Of course I'll have to use import -f because the pool was not exported. Now the other host is serving the disks, no problem there, but when I boot the crashed host again, it wants to keep using the pools it previously had and it doesn't realize that the pool is now in use by the other host. That leads to two systems using the same zpool, which is not nice.

Is there any solution to this problem, or do I have to get Sun Cluster 3.2 if I want to serve the same zpools from many hosts? We may try Sun Cluster anyway, but I'd like to know if this can be solved without it.

--
Ari-Pekka Oksavuori <aoksavuo at cs.tut.fi>
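For reference, the manual failover being described boils down to something like the following (a sketch only: the pool name "tank" is made up, and the forced import is exactly the step that makes the double-import possible):

  # On the surviving host, after the original host has crashed:
  zpool import            # scan the SAN LUNs and list importable pools
  zpool import -f tank    # -f is required because the crashed host never exported it

  # On the original host, a clean hand-over would instead be:
  zpool export tank       # only possible while that host is still healthy

Nothing in that sequence stops the crashed host from grabbing the pool again when it comes back up, which is the problem discussed in the rest of the thread.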
Ari-Pekka Oksavuori wrote:
> I've been testing ZFS, and would like to use it on SAN-attached disks in
> our production environment, where multiple machines can see the same
> zpools. I have some concerns about importing/exporting pools in possible
> failure situations. If a box that was using some zpool crashes (for
> example, when sending a break to the host while testing this), I would
> like to import that pool on some other host right away. Of course I'll
> have to use import -f because the pool was not exported. Now the other
> host is serving the disks, no problem there, but when I boot the crashed
> host again, it wants to keep using the pools it previously had and it
> doesn't realize that the pool is now in use by the other host. That leads
> to two systems using the same zpool, which is not nice.

s/not nice/really really bad/ :-)

> Is there any solution to this problem, or do I have to get Sun Cluster
> 3.2 if I want to serve the same zpools from many hosts? We may try Sun
> Cluster anyway, but I'd like to know if this can be solved without it.

You can't do it *safely* without the protection of a high-availability framework such as Sun Cluster.

best regards,
James C. McPherson
--
Solaris kernel software engineer
Sun Microsystems
James C. McPherson wrote:
> You can't do it *safely* without the protection of a high-availability
> framework such as Sun Cluster.

Thanks for the fast reply. :) We'll have a look into the Cluster solution.

--
Ari-Pekka Oksavuori <aoksavuo at cs.tut.fi>
If you _boot_ the original machine then it should see that the pool now is "owned" by the other host and ignore it (you'd have to do a "zpool import -f" again, I think). Not tested though, so don't take my word for it...

However, if you simply type "go" and let it continue from where it was, then things definitely will not be pretty... :-)
On Jan 26, 2007, at 7:17, Peter Eriksson wrote:
> If you _boot_ the original machine then it should see that the pool
> now is "owned" by the other host and ignore it (you'd have to do a
> "zpool import -f" again I think). Not tested though so don't take my
> word for it...

Conceptually, that's about right, but in practice it's not quite as simple as that. We had to do a lot of work in Cluster to ensure that the zpool would never be imported on more than one node at a time.

> However if you simply type "go" and let it continue from where it was
> then things definitely will not be pretty... :-)

Yes, but that's only one of the bad scenarios.

--Ed
Peter Eriksson wrote:
> If you _boot_ the original machine then it should see that the pool now
> is "owned" by the other host and ignore it (you'd have to do a "zpool
> import -f" again I think). Not tested though so don't take my word for it...
>
> However if you simply type "go" and let it continue from where it was
> then things definitely will not be pretty... :-)

I tested this; it's the same thing with a reboot -- the rebooted host still comes up using the pool.

--
Ari-Pekka Oksavuori <aoksavuo at cs.tut.fi>
Ed Gould wrote:
> On Jan 26, 2007, at 7:17, Peter Eriksson wrote:
>> If you _boot_ the original machine then it should see that the pool
>> now is "owned" by the other host and ignore it (you'd have to do a
>> "zpool import -f" again I think). Not tested though so don't take my
>> word for it...
>
> Conceptually, that's about right, but in practice it's not quite as
> simple as that. We had to do a lot of work in Cluster to ensure that
> the zpool would never be imported on more than one node at a time.

Didn't VxVM use the hostid on the disks to check where the disk groups were last used, so that it won't automatically import groups with a different id on disk? Would something like this be hard to implement?

--
Ari-Pekka Oksavuori <aoksavuo at cs.tut.fi>
>> On Jan 26, 2007, at 7:17, Peter Eriksson wrote:
>>> If you _boot_ the original machine then it should see that the pool
>>> now is "owned" by the other host and ignore it (you'd have to do a
>>> "zpool import -f" again I think). Not tested though so don't take my
>>> word for it...
>>
>> Conceptually, that's about right, but in practice it's not quite as
>> simple as that. We had to do a lot of work in Cluster to ensure that
>> the zpool would never be imported on more than one node at a time.
>
> Didn't VxVM use the hostid on the disks to check where the disk groups
> were last used, so that it won't automatically import groups with a
> different id on disk? Would something like this be hard to implement?

Yes, it does. There was a long thread on this not too long ago. Something similar will be added to ZFS. It won't be a full cluster solution, but it would aid in hand-failover situations like this.

--
Darren Dunham                 ddunham at taos.com
Senior Technical Consultant   TAOS   http://www.taos.com/
Got some Dr Pepper?           San Francisco, CA bay area
< This line left intentionally blank to confuse you. >
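For illustration, here is a rough sketch of the VxVM-style check being discussed, written as a shell script an administrator (or an rc script) could run before importing. This is not what ZFS shipped with at the time of this thread; it assumes the pool label carries a hostid field visible via zdb -l, and the device path is a made-up placeholder:

  #!/bin/bash
  # Refuse an automatic import if the on-disk label says the pool was last
  # used by a different host.  Hypothetical: assumes a "hostid" entry in the
  # label, which stock ZFS did not guarantee when this was written.
  dev=/dev/dsk/c1t0d0s0                      # placeholder device in the pool

  mine=$(( 16#$(hostid) ))                   # hostid(1) prints hex; convert to decimal
  ondisk=$(zdb -l "$dev" | awk '/hostid:/ {print $2; exit}')

  if [ -n "$ondisk" ] && [ "$ondisk" -ne "$mine" ]; then
      echo "pool on $dev last used by hostid $ondisk; not importing automatically" >&2
      exit 1
  fi
  # ...otherwise the pool appears to be ours; a plain "zpool import <pool>" follows.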
aoksavuo at cs.tut.fi said:
> . . .
> realize that the pool is now in use by the other host. That leads to two
> systems using the same zpool which is not nice.
>
> Is there any solution to this problem, or do I have to get Sun Cluster 3.2
> if I want to serve same zpools from many hosts? We may try Sun Cluster
> anyway, but I'd like to know if this can be solved without it.

Perhaps I'm stating the obvious, but here goes:

You could use SAN zoning of the affected LUNs to keep multiple hosts from seeing the zpool. When failover time comes, you change the zoning to make the LUNs visible to the new host, then import. When the old host reboots, it won't find any zpool. Better safe than sorry....

Regards,

Marion
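The host-side half of that failover might look roughly like this (a sketch only: the zoning change itself is switch-vendor-specific and not shown, and the pool name "tank" is made up):

  # 1. On the SAN switch: re-zone the LUNs from the failed host's HBA port
  #    to the standby host's port (vendor-specific, done in the switch CLI/GUI).

  # 2. On the standby host, pick up the newly visible LUNs and import:
  cfgadm -al              # confirm the fabric devices are now visible
  devfsadm -c disk        # create /dev/dsk entries for the new LUNs
  zpool import            # the pool should now show up in the scan
  zpool import -f tank    # -f is still needed if the old host crashed without exporting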
On Jan 26, 2007, at 10:52, Marion Hakanson wrote:
> Perhaps I'm stating the obvious, but here goes:
>
> You could use SAN zoning of the affected LUNs to keep multiple hosts
> from seeing the zpool. When failover time comes, you change the zoning
> to make the LUNs visible to the new host, then import. When the old
> host reboots, it won't find any zpool. Better safe than sorry....

Yes, that would work. But what or who does the "change the zoning" step? If it's a human, then failover could take hours; that's not often good enough. If it's automated, then you need the Cluster infrastructure to decide when to do it.

--Ed
On Jan 26, 2007, at 13:52, Marion Hakanson wrote:
> aoksavuo at cs.tut.fi said:
>> . . .
>> realize that the pool is now in use by the other host. That leads to two
>> systems using the same zpool which is not nice.
>>
>> Is there any solution to this problem, or do I have to get Sun Cluster 3.2
>> if I want to serve same zpools from many hosts? We may try Sun Cluster
>> anyway, but I'd like to know if this can be solved without it.
>
> Perhaps I'm stating the obvious, but here goes:
>
> You could use SAN zoning of the affected LUNs to keep multiple hosts
> from seeing the zpool. When failover time comes, you change the zoning
> to make the LUNs visible to the new host, then import. When the old
> host reboots, it won't find any zpool. Better safe than sorry....

Actually, if you use the Sun Leadville stack you can dynamically take the ports offline with cfgadm, and if you don't want everything autoconfigured on boot you might want to flip the "manual_configuration_only" bit in fp.conf. You can do this on the wwpn with something like:

  # cfgadm -c unconfigure c3::510000f010fd92a1

Also, you could simply LUN-mask them in fp.conf to prevent the LUNs from being seen. We put this in last year due to an issue with replicated VxVM volumes that could corrupt a disk group, since they would have the same signature on them. Take a look at the "pwwn-lun-blacklist" property towards the bottom of the configuration file for an example, or take a look at the fp(7d) man page.

If you're careful you could maintain two fp.conf files and flip back and forth between different configurations of visible FC storage. It works nicely to run a standby server as a QA box during the day and flip it to a production server on a reboot - particularly if you've got an ABE setup. (SAN-boot if you're brave.)

---
.je
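For anyone looking for the shape of that blacklist entry, it's along these lines; check fp(7d) and the comments in /kernel/drv/fp.conf for the exact syntax on your release. The WWPNs and LUN numbers below are made-up placeholders, and driver .conf changes generally take effect on the next reboot:

  # /kernel/drv/fp.conf -- hide specific LUNs behind given target ports
  pwwn-lun-blacklist=
  "510000f010fd92a1,0,1,2",
  "510000f010fd92a2,0";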
> You could use SAN zoning of the affected LUNs to keep multiple hosts
> from seeing the zpool. When failover time comes, you change the zoning
> to make the LUNs visible to the new host, then import. When the old
> host reboots, it won't find any zpool. Better safe than sorry....

Or change the LUN masking on the array. Depending on your switch, that can be less disruptive, and depending on your storage array it might be able to be scripted.

Best Regards,
Jason