On Mar 10, 2008 09:09 -0600, Colin Faber wrote:
> Is this true even in the case of mounting the OSS as a read only node?

Yes, even a "read only" mount can definitely cause serious corruption.
There are several issues involved; the most dangerous is that even for
a read-only mount the journal is replayed by the kernel, or otherwise
the filesystem may appear to be corrupted.

In addition, any (meta)data cached on the read-only mounting node will
become incorrect as the writing node changes the filesystem.  The ext3
filesystem is not cluster aware.

To prevent situations like this, the newer releases of ldiskfs and
e2fsprogs have an "mmp" (multi-mount protection) feature which will
prevent the filesystem from being mounted on another node while it is
active on one node (either mounted, or being checked by e2fsck).

This is enabled by default on newly formatted filesystems created with
the "--failover" flag, and can also be enabled by hand with
"tune2fs -O mmp /dev/XXXX" (replace with the MDT or OST device name as
appropriate).  Because it prevents the filesystem from being mounted or
e2fsck'd by old kernels/e2fsprogs, it isn't enabled by default on
existing filesystems.

> Andreas Dilger wrote:
>> On Mar 07, 2008 00:04 +0530, Neeladri Bose wrote:
>>> To address the performance hit (whatever the percentage): what if we
>>> set up DRBD in active-passive mode across the 4500s, but have Lustre
>>> point to separate RAID sets across the DRBD pair of 4500s, thus
>>> becoming an active-active solution which may actually increase the
>>> throughput of the Lustre filesystem?
>>>
>>> Is that a possible scenario using DRBD on Linux with ext3 and Lustre?
>>
>> No, Lustre does not support active-active export of backing
>> filesystems.  This doesn't work because the backing filesystems
>> (ext3/ZFS) are not themselves cluster-aware, and mounting them on two
>> nodes will quickly lead to whole-filesystem corruption.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
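As a rough illustration of the manual workflow described above (a
minimal sketch only, assuming a recent ldiskfs/e2fsprogs; the device
name /dev/sdb1 and mount point are placeholders, not values taken from
this thread):

    umount /mnt/ost0            # target must be unmounted on every node
    tune2fs -O mmp /dev/sdb1    # turn on multi-mount protection
    dumpe2fs -h /dev/sdb1 | grep -i 'filesystem features'
    # the reported feature list should now include "mmp"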
Andreas,

Is the mmp feature already in the existing Lustre distribution? If so,
what versions are mmp-aware? If not, which version will be the first to
incorporate it?

thanks,
Klaus

On 3/10/08 2:15 PM, "Andreas Dilger" <adilger at sun.com> did etch on
stone tablets:

> On Mar 10, 2008 09:09 -0600, Colin Faber wrote:
>> Is this true even in the case of mounting the OSS as a read only node?
>
> Yes, definitely even a "read only" mount can cause serious corruption.
> [...]
> In order to prevent situations like this, the newer releases of ldiskfs
> and e2fsprogs have an "mmp" (multi-mount protection) feature which will
> prevent the filesystem from being mounted on another node while it is
> active on one node (either mounted, or running e2fsck).
> [...]
On Mar 10, 2008 15:48 -0700, Klaus Steden wrote:
> Is the mmp feature already in the existing Lustre distribution? If so,
> what versions are mmp-aware? If not, which version will be the first to
> incorporate it?

It's in Lustre 1.6.2+ and 1.4.12, and e2fsprogs-1.40.2+.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
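A quick way to check whether a given server meets those minimums (a
sketch only, assuming an RPM-based install with the Lustre modules
loaded; package names and paths may differ on other setups):

    rpm -q e2fsprogs                # should report 1.40.2 or newer
    cat /proc/fs/lustre/version     # reports the running Lustre version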
Hi Andreas,

For clarity's sake ... 1.6.2+ == 1.6.3 and higher?

I'm using lustre-1.6.2-2.6.9_55.0.2.EL_lustre.1.6.2smp and it doesn't
appear to be available. I haven't run a tunefs.lustre -O mmp, but the
file system was originally formatted with 1.6.2, and I can still mount
my MDT from multiple locations if I really try (and I try not to) ...

cheers,
Klaus

On 3/10/08 7:56 PM, "Andreas Dilger" <adilger at sun.com> did etch on
stone tablets:

> On Mar 10, 2008 15:48 -0700, Klaus Steden wrote:
>> Is the mmp feature already in the existing Lustre distribution? If so,
>> what versions are mmp-aware? If not, which version will be the first
>> to incorporate it?
>
> It's in Lustre 1.6.2+ and 1.4.12, and e2fsprogs-1.40.2+.
On Mar 11, 2008 11:40 -0700, Klaus Steden wrote:
> For clarity's sake ... 1.6.2+ == 1.6.3 and higher?

Yes.

> I'm using lustre-1.6.2-2.6.9_55.0.2.EL_lustre.1.6.2smp and it doesn't
> appear to be available. I haven't run a tunefs.lustre -O mmp, but the
> file system was originally formatted with 1.6.2, and I can still mount
> my MDT from multiple locations if I really try (and I try not to) ...

It wasn't enabled by default until later, but the feature is there, AFAIK.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
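For a target formatted before MMP became the default, one way to
confirm whether the flag is actually set (a sketch only; /dev/sdb1 is a
placeholder for the MDT or OST device, checked from the node that
normally serves it):

    dumpe2fs -h /dev/sdb1 | grep -i 'filesystem features'
    # if "mmp" is absent, it can be added with the tune2fs command
    # quoted earlier in this thread, with the target unmounted on all nodes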