I am having the following issue:

trying to create an ext4 lustre filesystem attached to an OSS.
the disks being used are exported from an external disk enclosure.
I create a raid10 set with mdadm from 16 2TB disks, and this part seems fine.
I am able to format such an array with normal ext4, mount a filesystem etc.
however when I try the same thing, formatting for a lustre filesystem, I am unable to mount the filesystem and lustre does not seem to detect it.
the lustre format completes normally, without errors.

If I arrange to present the disks as a RAID10 set from the external disk enclosure, which has its own internal RAID capability (rather than trying to use mdadm on the OSS), the lustre formatting works fine and I can get a mountable OST.

the kernel log reports the following when a mount is attempted:

LDISKFS-fs (md2): VFS: Can't find ldiskfs filesystem
LustreError: 15241:0:(obd_mount.c:1292:server_kernel_mount()) premount /dev/md2:0x0 ldiskfs failed: -22, ldiskfs2 failed: -19. Is the ldiskfs module available?
LustreError: 15241:0:(obd_mount.c:1618:server_fill_super()) Unable to mount device /dev/md2: -22
LustreError: 15241:0:(obd_mount.c:2050:lustre_fill_super()) Unable to mount (-22)

lsmod reports that all the modules are loaded.

fsck reports the following:
fsck 1.41.10.sun2 (24-Feb-2010)
e2fsck 1.41.10.sun2 (24-Feb-2010)
fsck.ext2: Superblock invalid, trying backup blocks...
fsck.ext2: Bad magic number in super-block while trying to open /dev//md2

It would seem the filesystem has not been written properly, but mkfs reports no errors.

lustre version 1.8.4
kernel 2.6.18-194.3.1.el5_lustre.1.8.4
disk array is a coraid SATA/AOE device which has worked fine in every other context

this seems like an interaction of lustre with software RAID on the OSS?
I wonder if anyone has seen anything like this before. any ideas about this?

Professor Samuel Aparicio BM BCh PhD FRCPath
Nan and Lorraine Robertson Chair UBC/BC Cancer Agency
675 West 10th, Vancouver V5Z 1L3, Canada.
office: +1 604 675 8200  cellphone: +1 604 762 5178  lab website: http://molonc.bccrc.ca
On 2011-01-21, at 13:36, Samuel Aparicio wrote:
> trying to create an ext4 lustre filesystem attached to an OSS.
> the disks being used are exported from an external disk enclosure.
> I create a raid10 set with mdadm from 16 2TB disks, and this part seems fine.
> I am able to format such an array with normal ext4, mount a filesystem etc.
> however when I try the same thing, formatting for a lustre filesystem, I am unable to mount the filesystem and lustre does not seem to detect it.
> the lustre format completes normally, without errors.

You are probably formatting the filesystem with an ext4 feature that is not in the ldiskfs module you are using.

> lustre version 1.8.4
> kernel 2.6.18-194.3.1.el5_lustre.1.8.4
> disk array is a coraid SATA/AOE device which has worked fine in every other context

Do you have the ext4-based ldiskfs RPM installed? It is a separate download on the download page. You can check whether the ldiskfs module installed was based on ext3 or ext4 with the "modinfo" command:

[root]# modinfo ldiskfs
filename:       /lib/modules/2.6.32.20/updates/kernel/fs/lustre-ldiskfs/ldiskfs.ko
license:        GPL
description:    Fourth Extended Filesystem
                ^^^^^^^^^^^^^^^^^^^^^^^^^^
author:         Remy Card, Stephen Tweedie, Andrew Morton, Andreas Dilger, Theodore Ts'o and others
srcversion:     D5D8992C8B3E6FCA6ED4FF2
depends:
vermagic:       2.6.32.20 SMP mod_unload modversions

Cheers, Andreas
--
Andreas Dilger
Principal Engineer
Whamcloud, Inc.
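As a practical note on the check Andreas describes, the installed package can also be confirmed directly (a sketch; treat "lustre-ldiskfs" as an assumption about the package name, which varies between Lustre releases):

[root]# rpm -qa | grep -i ldiskfs                        # lists the installed ldiskfs package, if any
[root]# modinfo ldiskfs | grep -E 'filename|description' # ext4-based builds report "Fourth Extended Filesystem"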
I'm not an expert on lustre, just beginning with it, but...

What is your version of e2fsprogs?

What is your command line to format your raid?

Regards.

From: lustre-discuss-bounces@lists.lustre.org [mailto:lustre-discuss-bounces@lists.lustre.org] On behalf of Samuel Aparicio
Sent: Friday, January 21, 2011 9:37 PM
To: lustre-discuss@lists.lustre.org
Subject: [Lustre-discuss] lustre and software RAID
modinfo reports as follows. seems like the ext4 modules.
the odd thing is that the format works when the disk array is already presented as a raid set, rather than making the raidset with mdadm on the OSS.

--------
filename:       /lib/modules/2.6.18-194.3.1.el5_lustre.1.8.4/updates/kernel/fs/lustre-ldiskfs/ldiskfs.ko
license:        GPL
description:    Fourth Extended Filesystem
author:         Remy Card, Stephen Tweedie, Andrew Morton, Andreas Dilger, Theodore Ts'o and others
srcversion:     B4DBDF5EA1FA02D1D1417AF
depends:        jbd2,crc16
vermagic:       2.6.18-194.3.1.el5_lustre.1.8.4 SMP mod_unload gcc-4.1
parm:           default_mb_history_length:Default number of entries saved for mb_history (int)
---------

On Jan 21, 2011, at 12:59 PM, Andreas Dilger wrote:
> You are probably formatting the filesystem with an ext4 feature that is not in the ldiskfs module you are using.
>
> Do you have the ext4-based ldiskfs RPM installed? It is a separate download on the download page.
e2fsprogs-1.41.10.sun2-0redhat.rhel5.x86_64
mkfs.lustre --ost --fsname=lustre --reformat --mgsnode=11.1.254.3@tcp0 /dev/md2
mdadm -v --create /dev/md2 --chunk=256 --level=raid10 --raid-devices=16
--spare-devices=1 --assume-clean --layout=n2 /dev/etherd/e5.9 /dev/etherd/e5.10
/dev/etherd/e5.11 /dev/etherd/e5.12 /dev/etherd/e5.13 /dev/etherd/e5.14
/dev/etherd/e5.15 /dev/etherd/e5.16 /dev/etherd/e5.17 /dev/etherd/e5.18
/dev/etherd/e5.19 /dev/etherd/e5.20 /dev/etherd/e5.21 /dev/etherd/e5.22
/dev/etherd/e5.23 /dev/etherd/e5.7 /dev/etherd/e5.8
cat /proc/mdstat
md2 : active raid10 etherd/e5.8[16](S) etherd/e5.7[15] etherd/e5.23[14]
etherd/e5.22[13] etherd/e5.21[12] etherd/e5.20[11] etherd/e5.19[10]
etherd/e5.18[9] etherd/e5.17[8] etherd/e5.16[7] etherd/e5.15[6] etherd/e5.14[5]
etherd/e5.13[4] etherd/e5.12[3] etherd/e5.11[2] etherd/e5.10[1] etherd/e5.9[0]
15628113920 blocks 256K chunks 2 near-copies [16/16] [UUUUUUUUUUUUUUUU]
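For completeness, the array geometry can be double-checked before formatting (a minimal sketch, using the same /dev/md2 device as above):

mdadm --detail /dev/md2        # reports level, chunk size, layout and per-device state
blockdev --getsize64 /dev/md2  # total array size in bytes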
On Jan 21, 2011, at 1:01 PM, Eudes PHILIPPE wrote:
> I'm not an expert on lustre, just beginning with it, but...
>
> What is your version of e2fsprogs?
>
> What is your command line to format your raid?
>
> Regards.
On 2011-01-21, at 14:50, Samuel Aparicio wrote:
> modinfo reports as follows. seems like the ext4 modules.
> the odd thing is that the format works when the disk array is already presented as a raid set, rather than making the raidset with mdadm on the OSS.
>
> --------
> filename:       /lib/modules/2.6.18-194.3.1.el5_lustre.1.8.4/updates/kernel/fs/lustre-ldiskfs/ldiskfs.ko
> description:    Fourth Extended Filesystem
> ---------

After the filesystem is formatted with mkfs.lustre, you should be able to mount it directly with "mount -t ext4 /dev/md??? /mnt" and see a few files in it.

If that doesn't work then the format failed for some reason. Providing the output of "mkfs.lustre -v {options}" would help diagnose it.

Cheers, Andreas
--
Andreas Dilger
Principal Engineer
Whamcloud, Inc.
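In concrete terms, the check Andreas suggests would look roughly like this (a sketch; /mnt/test is an arbitrary scratch mount point, and depending on the ldiskfs build the filesystem type may need to be ldiskfs rather than ext4):

mkdir -p /mnt/test
mount -t ext4 /dev/md2 /mnt/test    # or: mount -t ldiskfs /dev/md2 /mnt/test
ls /mnt/test                        # a freshly formatted target normally contains a few files, e.g. a CONFIGS directory
umount /mnt/test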
ok thanks, I will look into this.

On Jan 21, 2011, at 2:00 PM, Andreas Dilger wrote:
> After the filesystem is formatted with mkfs.lustre, you should be able to mount it directly with "mount -t ext4 /dev/md??? /mnt" and see a few files in it.
>
> If that doesn't work then the format failed for some reason. Providing the output of "mkfs.lustre -v {options}" would help diagnose it.
Presumably, unlike the order shown below, you run the mkfs.lustre AFTER the mdadm command?

Cheers, Andreas

On 2011-01-21, at 14:55, Samuel Aparicio <saparicio@bccrc.ca> wrote:
> e2fsprogs-1.41.10.sun2-0redhat.rhel5.x86_64
>
> mkfs.lustre --ost --fsname=lustre --reformat --mgsnode=11.1.254.3@tcp0 /dev/md2
>
> mdadm -v --create /dev/md2 --chunk=256 --level=raid10 --raid-devices=16 --spare-devices=1 --assume-clean --layout=n2 /dev/etherd/e5.9 /dev/etherd/e5.10 /dev/etherd/e5.11 /dev/etherd/e5.12 /dev/etherd/e5.13 /dev/etherd/e5.14 /dev/etherd/e5.15 /dev/etherd/e5.16 /dev/etherd/e5.17 /dev/etherd/e5.18 /dev/etherd/e5.19 /dev/etherd/e5.20 /dev/etherd/e5.21 /dev/etherd/e5.22 /dev/etherd/e5.23 /dev/etherd/e5.7 /dev/etherd/e5.8
yes, mdadm first then format.
I have now managed to get the mkfs to produce a mountable filesystem on this
server. However, attempts to format a filesystem with an external journal (-J
device=/dev/sdc, with /dev/sdc previously created/formatted as a journal device)
seem to fail silently: mkfs completes without errors. Formatting as below (no -J)
and then using tune2fs to make the journal external seems to work fine. I am not
quite sure what was happening, but the process seems to be working now: the
filesystem mounts, can be written to, can be modified with tune2fs, etc. I am
looking into this some more. I suspect the journal device may have been corrupted
before the OST was formatted. Is it possible that formatting an OST with an
external journal would fail silently if the journal was not present, or was
corrupted?
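A rough sketch of the two approaches described above (the 4096-byte block size, the --mkfsoptions string and the journal device name are illustrative assumptions, not the exact commands used):

# external journal specified at format time
mke2fs -b 4096 -O journal_dev /dev/sdc
mkfs.lustre --ost --fsname=lustre --reformat --mgsnode=11.1.254.3@tcp0 \
    --mkfsoptions="-J device=/dev/sdc" /dev/md2

# or: format without -J, then switch the journal to the external device afterwards
tune2fs -O ^has_journal /dev/md2
tune2fs -J device=/dev/sdc /dev/md2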
Interestingly, the reason I was experimenting with this test server was to look
at whether software raid10 on the OSS would do a better job than having the
external disk enclosure present the disks as a RAID10 LUN (our disk enclosures
have this capability on board). Some basic write testing suggests the software
raid10 on the OSS server does a better job with sustained write throughput
(maybe 15-20% better, measured beyond any possible buffering in RAM) than having
the same disks exported as a RAID10 LUN.
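A sustained-write comparison of this kind can be approximated with something like the following (a sketch; the mount point, block size and file size are arbitrary, and oflag=direct is used so the result is not inflated by the page cache):

dd if=/dev/zero of=/mnt/test/ddfile bs=1M count=32768 oflag=direct   # ~32 GB sequential write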
thanks for your input on this.
s.
________________________________________
From: Andreas Dilger [adilger@whamcloud.com]
Sent: Saturday, January 22, 2011 9:26 PM
To: Sam Aparicio
Cc: Eudes PHILIPPE; lustre-discuss@lists.lustre.org
Subject: Re: [Lustre-discuss] lustre and software RAID
Presumably, unlike the order shown below, you run the mkfs.lustre AFTER the
mdadm command?
Cheers, Andreas
Andreas Dilger wrote:
> After the filesystem is formatted with mkfs.lustre, you should be able to mount it directly with "mount -t ext4 /dev/md??? /mnt" and see a few files in it.

Last time I tried it (with Lustre 1.8.5), that didn't work for me (see Bug 24398); although I left out "-t ext4", it tried to mount as ext4.
"mount -t ldiskfs ..." should work.

Kevin
it worked ... but the issue went away as well.

On Jan 25, 2011, at 6:39 AM, Kevin Van Maren wrote:
> "mount -t ldiskfs ..." should work.
>
> Kevin