Anselm Strauss wrote:
> hi.
>
> if i have two lustre servers that are fail-overed and use the same
> storage device on a common SAN, is there some built-in locking
> functionality in lustre to avoid concurrent access to that device? in
> the case of a malfunctioning fail-over it would be very important that
> the servers never access the same storage at the same time.

No, there is nothing in Lustre to prevent concurrent access to a shared device. Future versions may be smart enough to avoid this, but with current Lustre it's up to you. Any time you have shared storage, you must have STONITH or its equivalent to avoid this issue. Concurrent access is very bad and will definitely scramble your data.

Hope this helps,
cliffw

> i read GFS' lock manager does locking on device level, thus preventing
> concurrent access to devices and ensure consistency. even though i
> don't know whether the locking information is exchanged over the SAN
> and stored somewhere on the device, or if it's done over a second network.
>
> have a nice day,
> anselm strauss
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss@clusterfs.com
> https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
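The ordering that STONITH enforces (shoot the other node first, touch the shared device second) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Lustre code and not the API of any HA package: `StandbyNode`, `fence_peer`, `mount_shared`, the node name `oss2`, and the device path `/dev/shared` are all hypothetical names made up for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


class FencingError(RuntimeError):
    """Raised when the peer could not be confirmed powered off."""


@dataclass
class StandbyNode:
    """Hypothetical standby server in a two-node failover pair."""
    name: str
    fence_peer: Callable[[], bool]    # assumed hook: True only if the peer is confirmed off
    mount_shared: Callable[[], None]  # assumed hook: mounts the shared SAN device
    events: List[str] = field(default_factory=list)

    def take_over(self) -> None:
        # Step 1: fence the other node, even if it already looks dead.
        self.events.append("fence")
        if not self.fence_peer():
            # Without confirmation the peer may still be writing, so refuse to mount.
            self.events.append("abort")
            raise FencingError(f"{self.name}: peer not confirmed off, refusing to mount")
        # Step 2: only now is it safe to touch the shared device.
        self.events.append("mount")
        self.mount_shared()


if __name__ == "__main__":
    mounted = []
    node = StandbyNode("oss2",
                       fence_peer=lambda: True,
                       mount_shared=lambda: mounted.append("/dev/shared"))
    node.take_over()
    print(node.events)  # ['fence', 'mount']
```

The point of the design is that a failed or unconfirmed fence aborts the takeover: losing availability for a while is preferable to two servers writing to the same device.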
Anselm Strauss wrote:
> thanks for the answer.
>
> do you know in what version this functionality could be included?
> because this is actually a very important feature for us.
>
> anselm

If it is very important, you should definitely use STONITH. What sort of failover (HA) software are you using?

cliffw
On Mar 24, 2006, at 5:09 PM, Cliff White wrote:
> Anselm Strauss wrote:
>> thanks for the answer.
>> do you know in what version this functionality could be included?
>> because this is actually a very important feature for us.
>
> If it is very important you should definitely use STONITH, what
> sort of failover (HA) software are you using?

actually we don't have a failover installation yet. we are planning one, but what we would really like is redundancy at the operating system level. in our experience linux is mostly stable, but as a storage server for 150 nodes it will crash sooner or later. since there is no raid support in lustre yet, the only other possibility is failover, and that needs shared storage.

i thought about stonith, but wasn't much attracted to it. is it really a proven and safe solution? i also thought about at least a manual failover, switching over by hand, but that doesn't make it really safe and can increase downtime a lot.

does device locking in lustre make sense at all? i thought it was something really missing, but i don't know all the other possibilities.

anselm
Anselm,

You need to use a mechanism like STONITH in conjunction with an appropriate High Availability framework such as Cluster Manager. If you configure them appropriately, you should be able to implement a solution that ensures that only one node in a set of nodes is actually acting as the server for a particular Lustre MDS or OST device.

You will need shared storage in such a case, plus any additional hardware required by the High Availability framework, e.g. heartbeat connectivity and quorum disk devices.

Fergal.

--
Fergal.McCarthy@HP.com

-----Original Message-----
From: lustre-discuss-bounces@clusterfs.com [mailto:lustre-discuss-bounces@clusterfs.com] On Behalf Of Anselm Strauss
Sent: 27 March 2006 15:24
To: lustre-discuss@clusterfs.com
Subject: Re: [Lustre-discuss] sharing devices

On Mar 24, 2006, at 5:09 PM, Cliff White wrote:
> Anselm Strauss wrote:
>> thanks for the answer.
>> do you know in what version this functionality could be included?
>> because this is actually a very important feature for us.
>
> If it is very important you should definitely use STONITH, what
> sort of failover (HA) software are you using?

actually we don't have a failover installation yet. we are planning one, but what we would really like is redundancy at the operating system level. in our experience linux is mostly stable, but as a storage server for 150 nodes it will crash sooner or later. since there is no raid support in lustre yet, the only other possibility is failover, and that needs shared storage.

i thought about stonith, but wasn't much attracted to it. is it really a proven and safe solution? i also thought about at least a manual failover, switching over by hand, but that doesn't make it really safe and can increase downtime a lot.

does device locking in lustre make sense at all? i thought it was something really missing, but i don't know all the other possibilities.

anselm
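As a rough illustration of why an HA framework wants heartbeat connectivity and a quorum disk, here is a minimal Python sketch of a majority-vote check. It assumes a generic scheme (one vote per node plus one vote for the quorum disk, strict majority required) and does not describe how Cluster Manager or any other specific product counts votes; `has_quorum` and its parameters are hypothetical names. The caller is assumed to guarantee that at most one partition can hold the quorum disk at a time.

```python
def has_quorum(visible_nodes: int, total_nodes: int, holds_quorum_disk: bool) -> bool:
    """Return True if this partition holds a strict majority of votes.

    visible_nodes counts this node plus every peer whose heartbeat it still sees.
    Assumed voting scheme: one vote per node, plus one vote for the quorum disk.
    """
    if not 1 <= visible_nodes <= total_nodes:
        raise ValueError("visible_nodes must be between 1 and total_nodes")
    total_votes = total_nodes + 1  # every node plus the quorum disk
    my_votes = visible_nodes + (1 if holds_quorum_disk else 0)
    return 2 * my_votes > total_votes


if __name__ == "__main__":
    # Two-node cluster whose heartbeat link has failed (split brain):
    # each node sees only itself, and only one of them holds the quorum disk.
    node_a = has_quorum(visible_nodes=1, total_nodes=2, holds_quorum_disk=True)
    node_b = has_quorum(visible_nodes=1, total_nodes=2, holds_quorum_disk=False)
    print(node_a, node_b)  # True False
```

In this sketch only the side with quorum would go on to fence its peer and serve the MDS or OST device; the side without quorum must stay passive.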
thanks for the answer.

do you know in what version this functionality could be included? this is actually a very important feature for us.

anselm

On Mar 24, 2006, at 12:00 AM, cliff white wrote:
> Anselm Strauss wrote:
>> hi.
>> if i have two lustre servers that are fail-overed and use the
>> same storage device on a common SAN, is there some built-in
>> locking functionality in lustre to avoid concurrent access to
>> that device? in the case of a malfunctioning fail-over it would
>> be very important that the servers never access the same storage
>> at the same time.
>
> No, there is nothing in Lustre to prevent concurrent access to a
> shared device. Future versions may be smart enough to avoid this,
> but with current Lustre it's up to you. Anytime you have shared
> storage, you must have STONITH or it's equivalent to avoid this
> issue. Concurrent access is very bad, and will definately scramble
> your data.
> hope this helps
> cliffw
hi.

if i have two lustre servers in a failover pair that use the same storage device on a common SAN, is there some built-in locking functionality in lustre to avoid concurrent access to that device? in the case of a malfunctioning failover it would be very important that the servers never access the same storage at the same time.

i read that GFS' lock manager does locking at the device level, thus preventing concurrent access to devices and ensuring consistency, though i don't know whether the locking information is exchanged over the SAN and stored somewhere on the device, or if it's done over a second network.

have a nice day,
anselm strauss