I am having the following issue:

trying to create an ext4 lustre filesystem attached to an OSS.
the disks being used are exported from an external disk enclosure.
I create a RAID10 set with mdadm from 16 2TB disks; this part seems fine.
I am able to format such an array with normal ext4, mount a filesystem etc.
however when I try the same thing, formatting for a lustre filesystem, I am unable to mount the filesystem and lustre does not seem to detect it.
the lustre format completes normally, without errors.

If I arrange to present the disks as a RAID10 set from the external disk enclosure, which has its own internal RAID capability
(rather than trying to use mdadm on the OSS), the lustre formatting works fine and I can get a mountable OST.

the kernel log reports the following when a mount is attempted:

LDISKFS-fs (md2): VFS: Can't find ldiskfs filesystem
LustreError: 15241:0:(obd_mount.c:1292:server_kernel_mount()) premount /dev/md2:0x0 ldiskfs failed: -22, ldiskfs2 failed: -19. Is the ldiskfs module available?
LustreError: 15241:0:(obd_mount.c:1618:server_fill_super()) Unable to mount device /dev/md2: -22
LustreError: 15241:0:(obd_mount.c:2050:lustre_fill_super()) Unable to mount (-22)

lsmod reports that all the modules are loaded.

fsck reports the following:

fsck 1.41.10.sun2 (24-Feb-2010)
e2fsck 1.41.10.sun2 (24-Feb-2010)
fsck.ext2: Superblock invalid, trying backup blocks...
fsck.ext2: Bad magic number in super-block while trying to open /dev//md2

It would seem the filesystem has not been written properly, but mkfs reports no errors ....

lustre version 1.8.4
kernel 2.6.18-194.3.1.el5_lustre.1.8.4
disk array is a Coraid SATA/AOE device which has worked fine in every other context

this seems like an interaction of lustre with software RAID on the OSS?
I wonder if anyone has seen anything like this before. any ideas about this?

Professor Samuel Aparicio BM BCh PhD FRCPath
Nan and Lorraine Robertson Chair UBC/BC Cancer Agency
675 West 10th, Vancouver V5Z 1L3, Canada.
office: +1 604 675 8200  cellphone: +1 604 762 5178  lab website http://molonc.bccrc.ca
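[A quick low-level sanity check, not part of the original report: the device name is the /dev/md2 above, and the offsets are the standard ext2/3/4 superblock layout. If mkfs really wrote a superblock, its magic should be visible on the raw device:]

    # primary superblock starts at byte 1024; the 0xEF53 magic is 56 bytes into it (little-endian)
    dd if=/dev/md2 bs=1 skip=1080 count=2 2>/dev/null | od -An -t x1
    # expected output if a superblock was written:  53 ef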
On 2011-01-21, at 13:36, Samuel Aparicio wrote:
> trying to create an ext4 lustre filesystem attached to an OSS.
> the disks being used are exported from an external disk enclosure.
> I create a RAID10 set with mdadm from 16 2TB disks; this part seems fine.
> I am able to format such an array with normal ext4, mount a filesystem etc.
> however when I try the same thing, formatting for a lustre filesystem, I am unable to mount the filesystem and lustre does not seem to detect it.
> the lustre format completes normally, without errors.

You are probably formatting the filesystem with an ext4 feature that is not in the ldiskfs module you are using.

> lustre version 1.8.4
> kernel 2.6.18-194.3.1.el5_lustre.1.8.4
> disk array is a coraid SATA/AOE device which has worked fine in every other context

Do you have the ext4-based ldiskfs RPM installed?  It is a separate download on the download page.  You can check whether the ldiskfs module installed was based on ext3 or ext4 with the "modinfo" command:

[root]# modinfo ldiskfs
filename:       /lib/modules/2.6.32.20/updates/kernel/fs/lustre-ldiskfs/ldiskfs.ko
license:        GPL
description:    Fourth Extended Filesystem
                ^^^^^^^^^^^^^^^^^^^^^^^^^^
author:         Remy Card, Stephen Tweedie, Andrew Morton, Andreas Dilger, Theodore Ts'o and others
srcversion:     D5D8992C8B3E6FCA6ED4FF2
depends:
vermagic:       2.6.32.20 SMP mod_unload modversions

Cheers, Andreas
--
Andreas Dilger
Principal Engineer
Whamcloud, Inc.
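[A way to check the first point, not from the thread itself: compare the features mkfs actually enabled against what the installed ldiskfs module supports. A minimal sketch, assuming the /dev/md2 device from this report and the dumpe2fs tool from e2fsprogs:]

    dumpe2fs -h /dev/md2 | grep -i 'features'
    # ext4-only flags such as extents, huge_file or flex_bg in the "Filesystem features:" line
    # would only be mountable with the ext4-based ldiskfs module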
I'm not an expert on lustre, just beginning with it, but:

What is your version of e2fsprogs?
What is your command line to format your RAID?

Regards.
modinfo reports as follows. It seems like the ext4 module. The odd thing is that the format works when the disk array is already presented as a RAID set, rather than making the RAID set with mdadm on the OSS.

--------
filename:       /lib/modules/2.6.18-194.3.1.el5_lustre.1.8.4/updates/kernel/fs/lustre-ldiskfs/ldiskfs.ko
license:        GPL
description:    Fourth Extended Filesystem
author:         Remy Card, Stephen Tweedie, Andrew Morton, Andreas Dilger, Theodore Ts'o and others
srcversion:     B4DBDF5EA1FA02D1D1417AF
depends:        jbd2,crc16
vermagic:       2.6.18-194.3.1.el5_lustre.1.8.4 SMP mod_unload gcc-4.1
parm:           default_mb_history_length:Default number of entries saved for mb_history (int)
---------

On Jan 21, 2011, at 12:59 PM, Andreas Dilger wrote:
> Do you have the ext4-based ldiskfs RPM installed? It is a separate download on the download page. You can check whether the ldiskfs module installed was based on ext3 or ext4 with the "modinfo" command.
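[Another quick check, not from the original exchange, and package names vary between Lustre builds: confirm which ldiskfs package owns that module file:]

    rpm -qa | grep -i ldiskfs
    rpm -ql $(rpm -qa | grep -i ldiskfs) | grep ldiskfs.ko   # which package installed the .ko above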
e2fsprogs-1.41.10.sun2-0redhat.rhel5.x86_64

mkfs.lustre --ost --fsname=lustre --reformat --mgsnode=11.1.254.3@tcp0 /dev/md2

mdadm -v --create /dev/md2 --chunk=256 --level=raid10 --raid-devices=16 --spare-devices=1 --assume-clean --layout=n2 /dev/etherd/e5.9 /dev/etherd/e5.10 /dev/etherd/e5.11 /dev/etherd/e5.12 /dev/etherd/e5.13 /dev/etherd/e5.14 /dev/etherd/e5.15 /dev/etherd/e5.16 /dev/etherd/e5.17 /dev/etherd/e5.18 /dev/etherd/e5.19 /dev/etherd/e5.20 /dev/etherd/e5.21 /dev/etherd/e5.22 /dev/etherd/e5.23 /dev/etherd/e5.7 /dev/etherd/e5.8

cat /proc/mdstat

md2 : active raid10 etherd/e5.8[16](S) etherd/e5.7[15] etherd/e5.23[14] etherd/e5.22[13] etherd/e5.21[12] etherd/e5.20[11] etherd/e5.19[10] etherd/e5.18[9] etherd/e5.17[8] etherd/e5.16[7] etherd/e5.15[6] etherd/e5.14[5] etherd/e5.13[4] etherd/e5.12[3] etherd/e5.11[2] etherd/e5.10[1] etherd/e5.9[0]
      15628113920 blocks 256K chunks 2 near-copies [16/16] [UUUUUUUUUUUUUUUU]

On Jan 21, 2011, at 1:01 PM, Eudes PHILIPPE wrote:
> What is your version of e2fsprogs?
> What is your command line to format your raid?
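[An aside, unrelated to the mount failure itself: when an OST sits on md RAID it is common to pass the RAID geometry through to mke2fs. A sketch, assuming the 256 KiB chunk and 8 data disks (16-disk RAID10, 2 near-copies) from the commands above, and the default 4 KiB block size:]

    # stride = chunk / block size = 256K / 4K = 64
    # stripe-width = stride * data disks = 64 * 8 = 512
    mkfs.lustre --ost --fsname=lustre --reformat --mgsnode=11.1.254.3@tcp0 \
        --mkfsoptions="-E stride=64,stripe-width=512" /dev/md2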
On 2011-01-21, at 14:50, Samuel Aparicio wrote:
> modinfo reports as follows. It seems like the ext4 module.
> The odd thing is that the format works when the disk array is already presented as a RAID set, rather than making the RAID set with mdadm on the OSS.

After the filesystem is formatted with mkfs.lustre, you should be able to mount it directly with "mount -t ext4 /dev/md??? /mnt" and see a few files in it.

If that doesn't work then the format failed for some reason. Providing the output of "mkfs.lustre -v {options}" would help diagnose it.

Cheers, Andreas
--
Andreas Dilger
Principal Engineer
Whamcloud, Inc.
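[A sketch of that diagnostic run, reusing the format options quoted earlier in the thread; the log path and mount point are arbitrary:]

    mkfs.lustre -v --ost --fsname=lustre --reformat --mgsnode=11.1.254.3@tcp0 /dev/md2 2>&1 | tee /tmp/mkfs-md2.log
    mount -t ext4 /dev/md2 /mnt && ls /mnt && umount /mnt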
ok thanks, I will look into this.
Presumably, unlike the order shown below, you run the mkfs.lustre AFTER the mdadm command?

Cheers, Andreas

On 2011-01-21, at 14:55, Samuel Aparicio <saparicio at bccrc.ca> wrote:
> e2fsprogs-1.41.10.sun2-0redhat.rhel5.x86_64
>
> mkfs.lustre --ost --fsname=lustre --reformat --mgsnode=11.1.254.3@tcp0 /dev/md2
>
> mdadm -v --create /dev/md2 --chunk=256 --level=raid10 --raid-devices=16 --spare-devices=1 --assume-clean --layout=n2 /dev/etherd/e5.9 /dev/etherd/e5.10 /dev/etherd/e5.11 /dev/etherd/e5.12 /dev/etherd/e5.13 /dev/etherd/e5.14 /dev/etherd/e5.15 /dev/etherd/e5.16 /dev/etherd/e5.17 /dev/etherd/e5.18 /dev/etherd/e5.19 /dev/etherd/e5.20 /dev/etherd/e5.21 /dev/etherd/e5.22 /dev/etherd/e5.23 /dev/etherd/e5.7 /dev/etherd/e5.8
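[Not something asked in the thread, but one way to confirm the array is fully assembled before running mkfs.lustre; device names are taken from the commands quoted above:]

    mdadm --detail /dev/md2              # state, level, chunk size, member devices
    mdadm --examine /dev/etherd/e5.9     # per-member md superblock, if more detail is needed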
Yes, mdadm first, then format.

I have now managed to get the mkfs to produce a mountable filesystem on this server. However, attempts to format a filesystem with an external journal (-J device=/dev/sdc, with /dev/sdc previously created/formatted as a journal device) seem to fail silently: mkfs completes without errors. Formatting as below (no -J) and then using tune2fs to make the journal external works fine. I am not quite sure what was happening, but the process seems to be working now: the filesystem mounts, can be written to, can be modified with tune2fs, etc. I am looking into this some more. I suspect the journal device may have been corrupted before the OST was formatted. Is it possible that formatting an OST with an external journal would fail silently if the journal was not present, or corrupted?

Interestingly, the reason I was experimenting with this test server was to look at whether software RAID10 on the OSS would do a better job than having the external disk enclosure present the disks as a RAID10 LUN (our disk enclosures have this capability on board). Some basic write testing suggests the software RAID10 on the OSS server is doing a better job with sustained write throughput (beyond any possible buffering in RAM), maybe 15-20% better, than having the same disks exported as a RAID10 LUN.

thanks for your input on this.
s.

________________________________________
From: Andreas Dilger [adilger at whamcloud.com]
Sent: Saturday, January 22, 2011 9:26 PM
To: Sam Aparicio
Subject: Re: [Lustre-discuss] lustre and software RAID

> Presumably, unlike the order shown below, you run the mkfs.lustre AFTER the mdadm command?
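[For reference, a minimal sketch of the external-journal arrangement described above, assuming /dev/sdc is dedicated to the journal and the OST filesystem uses 4 KiB blocks; this illustrates the tune2fs route mentioned, not the exact commands that were run:]

    # format the journal device; its block size must match the filesystem's
    mke2fs -O journal_dev -b 4096 /dev/sdc

    # after a mkfs.lustre without -J: drop the internal journal, then attach the external one
    tune2fs -O ^has_journal /dev/md2
    tune2fs -j -J device=/dev/sdc /dev/md2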
Andreas Dilger wrote:
> After the filesystem is formatted with mkfs.lustre, you should be able to mount it directly with "mount -t ext4 /dev/md??? /mnt" and see a few files in it.

Last time I tried it (with Lustre 1.8.5), that didn't work for me (see Bug 24398): although I left out "-t ext4", it still tried to mount as ext4.  "mount -t ldiskfs ..." should work.

Kevin
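[A sketch of that check, using the device name from this thread and an arbitrary mount point:]

    mkdir -p /mnt/ost_test
    mount -t ldiskfs /dev/md2 /mnt/ost_test
    ls /mnt/ost_test        # a freshly formatted OST should contain at least a CONFIGS directory
    umount /mnt/ost_test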
it worked ... but the issue went away as well.