Greetings,
I've posted before but no one responded.  I'm reposting because I'm
really dead in the water here until I can get this fixed.

The issue is that my OSTs don't survive a reboot of the OSS.
In the below I'm dealing with two OSSs, quad-core Intel Xeon machines
with 8 GB of memory and dual-port QLogic fibre channel cards.  They both
run SLES 10.1 and lustre 1.6.4.3.  My two MDSs (similar, though not
exactly the same hardware) don't have the same problem, though I'm only
accessing a single MDT from them.
I've produced the problem by something as simple as running
umount /mnt/lustre/ost/ost_oss01_lustre0102_01
tune2fs -O +mmp /dev/mapper/ost_oss01_lustre0102_01
mount -t lustre /dev/mapper/ost_oss01_lustre0102_01
/mnt/lustre/ost/ost_oss01_lustre0102_01
This may have multiple causes.
Are the mount options correct?
Check the syslog for more info.
When I look at the partition table with parted I see that it's changed
from loop to gpt (as shown below).
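One way to pin down whether the on-disk label is really being rewritten
(rather than the device mapping shifting underneath) is to checksum the
start of the device before and after a reboot; a rough sketch, with the
scratch file names below being only placeholders:

# snapshot the first MiB (label + superblock area) before the reboot
dd if=/dev/mapper/ost_oss01_lustre0102_01 of=/root/ost01.head.before bs=1M count=1
# ... reboot, then take a second snapshot and compare the checksums
dd if=/dev/mapper/ost_oss01_lustre0102_01 of=/root/ost01.head.after bs=1M count=1
md5sum /root/ost01.head.before /root/ost01.head.after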
But the simplest case is:
oss01:/net/lmd01/space/lustre # mkfs.lustre --reformat --fsname i3_lfs3 --ost
--failnode oss02 --mgsnode mds01 --mgsnode mds02
/dev/mapper/ost_oss01_lustre0102_01
oss01:/net/lmd01/space/lustre # reboot
# log in
oss01:/net/lmd01/space/lustre # mount -t lustre
/dev/mapper/ost_oss01_lustre0102_01 /mnt/lustre/ost/ost_oss01_lustre0102_01
mount.lustre: mount /dev/mapper/ost_oss01_lustre0102_01 at
/mnt/lustre/ost/ost_oss01_lustre0102_01 failed: Invalid argument
This may have multiple causes.
Are the mount options correct?
Check the syslog for more info.
oss01:/net/lmd01/space/lustre # dumpe2fs -h /dev/mapper/ost_oss01_lustre0102_01
|grep feature
dumpe2fs 1.40.4.cfs1 (31-Dec-2007)
dumpe2fs: Bad magic number in super-block while trying to open
/dev/mapper/ost_oss01_lustre0102_01
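Since the label reportedly changes to gpt after the reboot, one quick
sanity check (just a sketch using standard tools) is whether a GPT header
really sits at LBA 1 of the device, where it would start with the ASCII
signature "EFI PART":

# dump the second 512-byte sector; a real GPT header begins with "EFI PART"
dd if=/dev/mapper/ost_oss01_lustre0102_01 bs=512 skip=1 count=1 2>/dev/null | hexdump -C | head -4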
# another example, I re-run mkfs.lustre on the above device and mount
# it and 2 other OSTs on the second OSS
oss02:/net/lmd01/space/lustre # df|egrep 'File|ost'
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/ost_oss01_lustre0102_01
                      5768201600    469544 5474724244   1%
/mnt/lustre/ost/ost_oss01_lustre0102_01
/dev/mapper/ost_oss01_lustre0102_02
                      5768201600    469540 5474724248   1%
/mnt/lustre/ost/ost_oss01_lustre0102_02
/dev/mapper/ost_oss02_lustre0102_01
                      5768201600    479940 5474713848   1%
/mnt/lustre/ost/ost_oss02_lustre0102_01
# I reboot the first machine then
oss02:/net/lmd01/space/lustre # umount -t lustre -a
# then try to mount from first machine and ...
oss01:/net/lmd01/space/lustre # cat a
   mount -t lustre /dev/mapper/ost_oss01_lustre0102_01
/mnt/lustre/ost/ost_oss01_lustre0102_01
   mount -t lustre /dev/mapper/ost_oss01_lustre0102_02
/mnt/lustre/ost/ost_oss01_lustre0102_02
   mount -t lustre /dev/mapper/ost_oss02_lustre0102_01
/mnt/lustre/ost/ost_oss02_lustre0102_01
oss01:/net/lmd01/space/lustre # sh a
mount.lustre: mount /dev/mapper/ost_oss01_lustre0102_01 at
/mnt/lustre/ost/ost_oss01_lustre0102_01 failed: Invalid argument
This may have multiple causes.
Are the mount options correct?
Check the syslog for more info.
oss01:/net/lmd01/space/lustre # df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/cciss/c0d0p3     61022084   6398044  51524300  12% /
udev                   4089220       312   4088908   1% /dev
/dev/cciss/c0d0p1      1241220     48324   1129844   5% /boot
lmd01:/space         470387232   8296256 438196704   2% /net/lmd01/space
/dev/mapper/ost_oss01_lustre0102_02
                      5768201600    469540 5474724248   1%
/mnt/lustre/ost/ost_oss01_lustre0102_02
/dev/mapper/ost_oss02_lustre0102_01
                      5768201600    479940 5474713848   1%
/mnt/lustre/ost/ost_oss02_lustre0102_01
# So the device was up just fine on one machine, I umounted them and tried
# on the other OSS
# and the partition table has changed
oss01:/net/lmd01/space/lustre # /usr/local/sbin/parted
/dev/mapper/ost_oss01_lustre0102_01
GNU Parted 1.8.8
Using /dev/mapper/ost_oss01_lustre0102_01
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) p
Model: Linux device-mapper (dm)
Disk /dev/mapper/ost_oss01_lustre0102_01: 6001GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Number  Start  End  Size  File system  Name  Flags
(parted) quit
# I can't just put another partition table back
(parted) mklabel
Warning: The existing disk label on /dev/mapper/ost_oss01_lustre0102_01 will be
destroyed and all data on this disk will be lost. Do you want
to continue?
Yes/No? yes
New disk label type?  [gpt]? loop
(parted) p
Model: Linux device-mapper (dm)
Disk /dev/mapper/ost_oss01_lustre0102_01: 6001GB
Sector size (logical/physical): 512B/512B
Partition Table: loop
Number  Start  End  Size  File system  Flags
(parted) mkpart
File system type?  [ext2]? ext3
Start? 0
End? 6001GB
(parted) p
Error: /dev/mapper/ost_oss01_lustre0102_01: unrecognised disk label
(parted) quit
# There is nothing unusual about the device; looking at multipath
oss01:/net/lmd01/space/lustre # multipath -l|grep ost_oss01_lustre0102_01
ost_oss01_lustre0102_01 (36000402001fc14596ef496ed00000000) dm-4
NEXSAN,SATABeast
oss02:/net/lmd01/space/lustre # multipath -l|grep ost_oss01_lustre0102_01
ost_oss01_lustre0102_01 (36000402001fc14596ef496ed00000000) dm-4
NEXSAN,SATABeast
Any suggestions would be deeply appreciated.
Thanks much,
JR Smith
jrs wrote:
> Greetings,
>
> I've posted before but no one responded.  I'm reposting because I'm
> really dead in the water here until I can get this fixed.
>
> The issue is that my OSTs don't survive a reboot of the OSS.
>
> I've produced the problem by something as simple as running
> umount /mnt/lustre/ost/ost_oss01_lustre0102_01
> tune2fs -O +mmp /dev/mapper/ost_oss01_lustre0102_01
> mount -t lustre /dev/mapper/ost_oss01_lustre0102_01 /mnt/lustre/ost/ost_oss01_lustre0102_01
> .....
> Any suggestions would be deeply appreciated.

It looks like something is really destroying your disks.  If you try this
with ordinary ext3, does the filesystem survive a reboot?

Otherwise, you could try:
- mkfs.lustre as before.
# tunefs.lustre --print <device>
reboot
# tunefs.lustre --print <device>

tunefs.lustre with --print is read-only; if it doesn't work the second
time, you should be able to compare the results.

cliffw
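A rough way to capture that before/after comparison (the output file names
here are only placeholders):

tunefs.lustre --print /dev/mapper/ost_oss01_lustre0102_01 > /root/tunefs.before
reboot
# after logging back in
tunefs.lustre --print /dev/mapper/ost_oss01_lustre0102_01 > /root/tunefs.after
diff /root/tunefs.before /root/tunefs.after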
I just made an ext3 filesystem, mounted it (on both OSSes - not at the same
time), unmounted, rebooted both servers, and it's still there.  It appears
that this destruction of filesystems is a lustre-only thing.

A difference between this and the lustre filesystem, of course, is that
there is no device name created for the partition, e.g.,

oss01:~ # ls -l /dev/mapper/ost_oss01_lustre0102_01_bad_no_use*
brw------- 1 root root 253,  7 May  2 09:09 /dev/mapper/ost_oss01_lustre0102_01_bad_no_use
brw------- 1 root root 253, 12 May  2 09:09 /dev/mapper/ost_oss01_lustre0102_01_bad_no_use-part1

while a lustre OST uses the whole disk/volume

oss01:~ # ls -l /dev/mapper/ost_oss01_lustre0304_02
brw------- 1 root root 253, 5 May 2 09:45 /dev/mapper/ost_oss01_lustre0304_02

In the mailing lists some time back someone had talked about kpartx (though
I think it was in the context of having a consistent device name - which I
have no trouble with since I'm explicitly naming them in
/etc/multipathd.conf).

Another issue that appears to be a bug, though is probably not related to
my issue: when running mkfs.lustre with --failnode, mmp should be set on
the filesystem.  However, looking at the output of dumpe2fs, that doesn't
appear to be the case:

oss01:/net/lmd01/space/lustre # dumpe2fs -h /dev/mapper/ost_oss01_lustre0304_02|grep -A 1 feat
dumpe2fs 1.40.4.cfs1 (31-Dec-2007)
Filesystem features:      has_journal resize_inode dir_index filetype needs_recovery extents sparse_super large_file
Filesystem flags:         signed directory hash

Of course, I can run tune2fs, but that, in the past, has induced the
disappearance of the filesystem as well.

thanks,
JR

Cliff White wrote:
> jrs wrote:
>> .....
>
> It looks like something is really destroying your disks, if you try this
> with ordinary ext3, does the filesystem survive a reboot?
> .....
On Fri, 2008-05-02 at 12:09 -0400, jrs wrote:
>
> A difference between this and the lustre filesystem, of course, is that
> there is no device name created for the partition, e.g.,
> .....
> while a lustre OST uses the whole disk/volume

So for your ext3 test you partitioned the disk and used a partition, and
for lustre you used the whole disk?  Why not do a more apples-to-apples
comparison and format the whole device with ext3 just like you would with
Lustre?  There is no rule that you have to use partitions with ext3.

Also be sure you are using the exact same disk/device between your two
tests to eliminate the possibility that this is related to only one
specific device.

You also mention partitions in your post.  You need to make sure that if
you are using a whole-disk device (i.e. /dev/sda rather than /dev/sda1) you
cannot use any partitioning tools on that device or you will overwrite the
beginning of your filesystem.

b.
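A whole-device ext3 test along those lines might look like the following;
this is destructive, so only run it on a device that holds no data, and the
mount point and marker file are just examples:

mkfs.ext3 /dev/mapper/ost_oss01_lustre0102_01
mount -t ext3 /dev/mapper/ost_oss01_lustre0102_01 /mnt/test
touch /mnt/test/marker
umount /mnt/test
reboot
# after the reboot: if nothing rewrote the start of the device, this still works
mount -t ext3 /dev/mapper/ost_oss01_lustre0102_01 /mnt/test && ls /mnt/test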
On May 01, 2008 11:52 -0400, jrs wrote:
> oss01:/net/lmd01/space/lustre # mount -t lustre /dev/mapper/ost_oss01_lustre0102_01 /mnt/lustre/ost/ost_oss01_lustre0102_01
> mount.lustre: mount /dev/mapper/ost_oss01_lustre0102_01 at /mnt/lustre/ost/ost_oss01_lustre0102_01 failed: Invalid argument
> This may have multiple causes.
> Are the mount options correct?
> Check the syslog for more info.
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  ???????

> mount.lustre: mount /dev/mapper/ost_oss01_lustre0102_01 at /mnt/lustre/ost/ost_oss01_lustre0102_01 failed: Invalid argument
> This may have multiple causes.
> Are the mount options correct?
> Check the syslog for more info.
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  ????????

> # I can't just put another partition table back
>
> (parted) mklabel
> Warning: The existing disk label on /dev/mapper/ost_oss01_lustre0102_01 will be destroyed and all data on this disk will be lost. Do you want
> to continue?
> Yes/No? yes
> New disk label type?  [gpt]? loop
> (parted) p

What is a "loop" partition table?

> Model: Linux device-mapper (dm)
> Disk /dev/mapper/ost_oss01_lustre0102_01: 6001GB

Note that anything over 2TB (I think, maybe 4TB?) needs a GPT partition
table, or the size of the device is incorrect.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
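One quick way to see the size the kernel believes the device to be
(blockdev is part of util-linux; for reference, 2 TiB is 2199023255552
bytes and 4 TiB is 4398046511104 bytes):

blockdev --getsize64 /dev/mapper/ost_oss01_lustre0102_01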
On Fri, May 02, 2008 at 07:34:25PM -0700, Andreas Dilger wrote:
> On May 01, 2008 11:52 -0400, jrs wrote:
>
> Note that anything over 2TB (I think, maybe 4TB?) needs a GPT partition
> table, or the size of the device is incorrect.

And I have a warning for people who need to use GPT tables - from our
experience the kernel silently ignores GPT.  We ran into this some months
ago.  Everything seemed to be correct, /proc/partitions was what we
expected, but then we noticed very odd data corruption.  First we thought
we had introduced a bug into ldiskfs (you know, we usually need more recent
kernels than you support), but after adding a patch printing the partition
offsets, we recognized it simply used a dos partition table and wrapped
around at 4TiB.

Oh well, I have wanted to investigate this further for a long time.  But
since it can be easily worked around by specifying the "gpt" kernel command
line parameter, other issues have always had higher priority.

Bernd

PS: This gpt bug is in 2.6.20 and 2.6.22; I don't know if it is fixed in
more recent kernel versions.
Hello Bernd,

could you give some details on the data corruption you have seen?
We have been using GPT tables for some time now, but the partitions have
been filled up only on NFS servers, not on Lustre OSSes.  And on these NFS
volumes there were no problems attributed to the partition size.

This would also depend on the actual size limit: I understand GPT tables
are necessary for partitions > 2TB.  The largest partition I have set up so
far was 3.2 TB, so if you are sure about your number of 4 TiB, we might
simply be on the lucky side.

And the workaround is specifying "gpt" on the kernel command line?  For
once an easy solution ;-)

Regards,
Thomas

Bernd Schubert wrote:
> And I have a warning for people who need to use GPT tables - from our
> experience the kernel silently ignores GPT.
> .....

--
Thomas Roth
Department: Informationstechnologie
Gesellschaft für Schwerionenforschung mbH
Planckstraße 1, D-64291 Darmstadt
www.gsi.de
On May 05, 2008 11:57 -0400, jrs wrote:
> mds01:/net/lmd01/space/lustre # mount -t lustre /dev/mapper/mdt_mds01_lustre0102 /mnt/lustre/mdt
> mount.lustre: mount /dev/mapper/mdt_mds01_lustre0102 at /mnt/lustre/mdt failed: Invalid argument
> This may have multiple causes.
> Are the mount options correct?
> Check the syslog for more info.
>
> Which produces this in /var/log/messages:
>
> May  5 09:35:41 mds01 kernel: VFS: Can't find ldiskfs filesystem on dev dm-1.
> May  5 09:35:41 mds01 multipathd: dm-1: umount map (uevent)
> May  5 09:35:41 mds01 kernel: LustreError: 16215:0:(obd_mount.c:1229:server_kernel_mount()) premount /dev/mapper/mdt_mds01_lustre0102:0x0 ldiskfs failed: -22, ldiskfs2 failed: -19. Is the ldiskfs module available?
> May  5 09:35:41 mds01 kernel: LustreError: 16215:0:(obd_mount.c:1533:server_fill_super()) Unable to mount device /dev/mapper/mdt_mds01_lustre0102: -22
> May  5 09:35:41 mds01 kernel: LustreError: 16215:0:(obd_mount.c:1924:lustre_fill_super()) Unable to mount (-22)
>
> If I try to look at the partition table with parted I see:
>
> mds01:/net/oss02/space/parted-1.8.8 # /usr/local/sbin/parted /dev/mapper/mdt_mds01_lustre0102
> GNU Parted 1.8.8
> Using /dev/mapper/mdt_mds01_lustre0102
> Welcome to GNU Parted! Type 'help' to view a list of commands.
> (parted) p
> Error: /dev/mapper/mdt_mds01_lustre0102: unrecognised disk label
> (parted)
>
> A good filesystem looks like:
>
> mds01:/net/oss02/space/parted-1.8.8 # /usr/local/sbin/parted /dev/mapper/ost_oss01_lustre0304_01
> GNU Parted 1.8.8
> Using /dev/mapper/ost_oss01_lustre0304_01
> Welcome to GNU Parted! Type 'help' to view a list of commands.
> (parted) p
> Model: Unknown (unknown)
> Disk /dev/mapper/ost_oss01_lustre0304_01: 6001GB
> Sector size (logical/physical): 512B/512B
> Partition Table: loop
>
> Number  Start  End     Size    File system  Flags
>  1      0.00B  6001GB  6001GB  ext3
>
> NOTE: in another post someone commented on the loop partition type.
> I don't know what it is, but all my lustre partitions are of that
> type.  The fact that a lustre person (I believe this individual was
> employed by Sun) was unfamiliar with it certainly is surprising.

I don't think that being employed by Sun makes everyone suddenly know
and understand everything :-).  That other person was me, and while I've
even contributed a significant amount of code to parted in the past, I
just haven't used it in several years and am not familiar with the
"loop" partition type.

> Perhaps my version of parted has an issue (the one shipped with SLES
> returns:
> mds01:/net/oss02/space/parted-1.8.8 # parted /dev/mapper/mdt_mds01_lustre0102
> Floating point exception

Two things of note:
- there have been ongoing issues with parted and ldiskfs with large disk
  devices, and I tend to avoid parted and fdisk entirely for these reasons.
  I've been using LVM (DM) to manage my storage for some time now, if it is
  needed.
- we generally do NOT recommend using partitions of any kind for production
  Lustre filesystems, because of problems like this, and the fact that in
  RAID setups this can hurt performance due to misaligned IO to the disk.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
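As a sketch only (the volume group and logical volume names are made up,
and pvcreate wipes the device), putting LVM on top of the multipath device
instead of a partition table could look like:

pvcreate /dev/mapper/ost_oss01_lustre0102_01
vgcreate vg_ost01 /dev/mapper/ost_oss01_lustre0102_01
lvcreate -l 100%FREE -n ost01 vg_ost01
mkfs.lustre --fsname i3_lfs3 --ost --failnode oss02 --mgsnode mds01 --mgsnode mds02 /dev/vg_ost01/ost01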
Hello Thomas,

On Monday 05 May 2008 18:53:12 Thomas Roth wrote:
> Hello Bernd,
>
> could you give some details on the data corruption you have seen?
> We have been using GPT tables for some time now, but the partitions have
> been filled up only on NFS servers, not on Lustre OSS.  And on these NFS
> volumes there were no problems attributed to the partition size.

The type of filesystem is not important at all; it is still at the low
device level.

> This would also depend on the actual size limit: I understand GPT tables
> are necessary for partitions > 2TB.  The largest partition I have set up
> so far was 3.2 TB, so if you are sure about your number of 4 TiB, we
> might simply be on the lucky side.

I'm rather sure we saw the problem with >4TiB only.

> And the workaround is specifying "gpt" on the kernel command line?  For
> once an easy solution ;-)

Yes.  Usually the kernel seems to try the dos partition table first and
then tries other tables.  But the first 512B of GPT seems to be compatible
with DOS, and so the kernel thinks the dos table is fine.  When you specify
gpt, it tries gpt first, and if this doesn't fit it switches to dos.

Cheers,
Bernd
--
Bernd Schubert
Q-Leap Networks GmbH
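For reference, on SLES 10 that means appending "gpt" to the kernel line in
the GRUB legacy config; a sketch only, with the kernel version left as a
placeholder and the root device taken from the df output earlier in the
thread:

# /boot/grub/menu.lst
title SLES 10 (Lustre server)
    root (hd0,0)
    kernel /vmlinuz-<existing version> root=/dev/cciss/c0d0p3 gpt
    initrd /initrd-<existing version>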
On Monday 05 May 2008 19:21:43 Andreas Dilger wrote:
>> (parted) p
>> Model: Unknown (unknown)
>> Disk /dev/mapper/ost_oss01_lustre0304_01: 6001GB
>> Sector size (logical/physical): 512B/512B
>> Partition Table: loop
>>
>> Number  Start  End     Size    File system  Flags
>>  1      0.00B  6001GB  6001GB  ext3
>>
>> NOTE: in another post someone commented on the loop partition type.
>> .....
>
> I don't think that being employed by Sun makes everyone suddenly know
> and understand everything :-).  That other person was me, and while
> I've even contributed a significant amount of code to parted in the
> past, I just haven't used it in several years and am not familiar with
> the "loop" partition type.

You definitely know more about filesystems and partitions than I do, but
I'm sure this is a bug.

>> Perhaps my version of parted has an issue (the one shipped with SLES
>> returns:
>> mds01:/net/oss02/space/parted-1.8.8 # parted /dev/mapper/mdt_mds01_lustre0102
>> Floating point exception

Probably this: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=259248

> Two things of note:
> - there have been ongoing issues with parted and ldiskfs with large disk
>   devices, and I tend to avoid parted and fdisk entirely for these
>   reasons.  I've been using LVM (DM) to manage my storage for some time
>   now, if it is needed.
> - we generally do NOT recommend using partitions of any kind for
>   production Lustre filesystems, because of problems like this, and the
>   fact that in RAID setups this can hurt performance due to misaligned
>   IO to the disk.

Well, I wish we didn't need to use partitions, but for some projects we
need to do so:
- ldiskfs is still limited to 8TiB
- linux-md raid6 is not parallelized, and a single CPU becomes the limit
  while 7 other CPUs are idling.  Creating several raid sets is then some
  kind of parallelization.

Cheers,
Bernd
--
Bernd Schubert
Q-Leap Networks GmbH
Well, things have changed again as I'm trying to get back to something
that works, but this time on one of the MDSs, as you'll see.  Below that I
have the output of 'multipath -l'.  I have dual-port HBAs and multiple
paths to the backend storage, so it looks a little complex.  I've modified
the /etc/multipathd.conf file to give the logical names you see, e.g.,
ost_lustre03-04_04_oss01_dm_7_mds01.
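For reference, that naming is done with per-WWID alias stanzas of roughly
this form (shown here in standard /etc/multipath.conf syntax, with the WWID
taken from the multipath -l output below):

multipaths {
    multipath {
        wwid  36000402001fc308260c0ace100000000
        alias ost_lustre03-04_04_oss01_dm_7_mds01
    }
}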
Even though it looks a little scary remember that things work fine and
can even survive a random number of reboots before an OST disappears.
Since the last time I posted I had an MDT go away too.
Does anyone think that I might have better luck running Redhat?
I've looked through the /etc/init.d/* files but can't see anything
that might be destroying the partition.
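A blunt way to search the boot scripts for anything that might touch
partition tables or device-mapper maps (the pattern list here is just a
guess at likely culprits) is something like:

grep -lE 'parted|sfdisk|fdisk|kpartx|mklabel' /etc/init.d/* 2>/dev/null
grep -rlE 'parted|sfdisk|kpartx' /etc/udev/rules.d/ 2>/dev/null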
Thanks
John
$ cat /proc/partitions
major minor  #blocks  name
  104     0   71652960 cciss/c0d0
  104     1    2104483 cciss/c0d0p1
  104     2   69545385 cciss/c0d0p2
    8     0 5860157184 sda
    8    16 5860157184 sdb
    8    32 5860230912 sdc
    8    48 5860156250 sdd
    8    64 5860156250 sde
    8    80 5860156250 sdf
    8    96 5860156250 sdg
    8   112 5860156250 sdh
    8   128 5860156250 sdi
    8   144 5860157184 sdj
    8   160 5860157184 sdk
    8   176 5860230912 sdl
    8   192 5860156250 sdm
    8   193 5860156216 sdm1
    8   208 5860156250 sdn
    8   224 5860156250 sdo
    8   240 5860157184 sdp
   65     0 5860157184 sdq
   65    16 5860230912 sdr
   65    32 5860156250 sds
   65    33 5860156216 sds1
   65    48 5860156250 sdt
   65    64 5860156250 sdu
   65    80 5860157184 sdv
   65    96 5860157184 sdw
   65   112 5860230912 sdx
   65   128 5860156250 sdy
   65   129 5860156216 sdy1
   65   144 5860156250 sdz
   65   160 5860156250 sdaa
   65   176 5860157184 sdab
   65   192 5860157184 sdac
   65   208 5860230912 sdad
   65   224 5860156250 sdae
   65   225 5860156216 sdae1
   65   240 5860156250 sdaf
   66     0 5860156250 sdag
   66    16 5860157184 sdah
   66    32 5860157184 sdai
   66    48 5860230912 sdaj
   66    64 5860157184 sdak
   66    80 5860157184 sdal
   66    96 5860230912 sdam
   66   112 5860156250 sdan
   66   128 5860156250 sdao
   66   144 5860156250 sdap
   66   160 5860156250 sdaq
   66   176 5860156250 sdar
   66   192 5860156250 sdas
   66   208 5860157184 sdat
   66   224 5860157184 sdau
   66   240 5860230912 sdav
  253     0 5860156250 dm-0
  253     1 5860157184 dm-1
  253     2 5860157184 dm-2
  253     3 5860230912 dm-3
  253     4 5860156250 dm-4
  253     5 5860156250 dm-5
  253     6 5860157184 dm-6
  253     7 5860157184 dm-7
  253     8 5860230912 dm-8
  253     9 5860156250 dm-9
  253    10 5860156250 dm-10
  253    11 5860156250 dm-11
  253    12 5860156216 dm-12
$ multipath -l
ost_lustre03-04_04_oss01_dm_7_mds01 (36000402001fc308260c0ace100000000) dm-7
NEXSAN,SATABeast
[size=5.5T][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
  \_ 1:0:3:4 sdai 66:32  [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 1:0:7:4 sdau 66:224 [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 0:0:3:4 sdk  8:160  [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 0:0:7:4 sdw  65:96  [active][undef]
ost_lustre01-02_04_oss01_dm_5_mds01 (36000402001fc14596ef496fd00000000) dm-5
NEXSAN,SATABeast
[size=5.5T][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
  \_ 1:0:2:1 sdaf 65:240 [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 0:0:4:1 sdn  8:208  [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 0:0:6:1 sdt  65:48  [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 1:0:0:1 sdz  65:144 [active][undef]
ost_lustre03-04_02_oss01_dm_3_mds01 (36000402001fc308260c0af3700000000) dm-3
NEXSAN,SATABeast
[size=5.5T][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
  \_ 1:0:1:2 sdad 65:208 [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 1:0:4:2 sdam 66:96  [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 0:0:0:2 sdc  8:32   [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 0:0:5:2 sdr  65:16  [active][undef]
ost_lustre01-02_02_oss01_dm_11_mds01 (36000402001fc14596ef497ee00000000) dm-11
NEXSAN,SATABeast
[size=5.5T][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
  \_ 1:0:5:5 sdap 66:144 [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 1:0:6:5 sdas 66:192 [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 0:0:1:5 sdf  8:80   [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 0:0:2:5 sdi  8:128  [active][undef]
ost_lustre01-02_05_oss02_dm_0_mds01 (36000402001fc14596ef4970e00000000) dm-0
NEXSAN,SATABeast
[size=5.5T][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
  \_ 1:0:0:2 sdaa 65:160 [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 1:0:2:2 sdag 66:0   [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 0:0:4:2 sdo  8:224  [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 0:0:6:2 sdu  65:64  [active][undef]
ost_lustre01-02_01_oss02_dm_10_mds01 (36000402001fc14596ef497dc00000000) dm-10
NEXSAN,SATABeast
[size=5.5T][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
  \_ 1:0:5:4 sdao 66:128 [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 1:0:6:4 sdar 66:176 [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 0:0:1:4 sde  8:64   [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 0:0:2:4 sdh  8:112  [active][undef]
mdt_lustre03-04_00_dm_8_mds01 (36000402001fc308260c0ac9e00000000) dm-8
NEXSAN,SATABeast
[size=5.5T][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
  \_ 1:0:3:5 sdaj 66:48  [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 1:0:7:5 sdav 66:240 [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 0:0:3:5 sdl  8:176  [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 0:0:7:5 sdx  65:112 [active][undef]
ost_lustre03-04_03_oss02_dm_6_mds01 (36000402001fc308260c0acc200000000) dm-6
NEXSAN,SATABeast
[size=5.5T][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
  \_ 1:0:3:3 sdah 66:16  [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 1:0:7:3 sdat 66:208 [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 0:0:3:3 sdj  8:144  [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 0:0:7:3 sdv  65:80  [active][undef]
ost_lustre01-02_03_oss02_dm_4_mds01 (36000402001fc14596ef496ed00000000) dm-4
NEXSAN,SATABeast
[size=5.5T][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
  \_ 1:0:2:0 sdae 65:224 [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 0:0:4:0 sdm  8:192  [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 0:0:6:0 sds  65:32  [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 1:0:0:0 sdy  65:128 [active][undef]
ost_lustre03-04_01_oss02_dm_2_mds01 (36000402001fc308260c0af1600000000) dm-2
NEXSAN,SATABeast
[size=5.5T][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
  \_ 1:0:1:1 sdac 65:192 [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 1:0:4:1 sdal 66:80  [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 0:0:0:1 sdb  8:16   [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 0:0:5:1 sdq  65:0   [active][undef]
ost_lustre01-02_00_oss01_dm_9_mds01 (36000402001fc14596ef497cc00000000) dm-9
NEXSAN,SATABeast
[size=5.5T][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
  \_ 1:0:5:3 sdan 66:112 [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 1:0:6:3 sdaq 66:160 [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 0:0:1:3 sdd  8:48   [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 0:0:2:3 sdg  8:96   [active][undef]
ost_lustre03-04_00_oss01_dm_1_mds01 (36000402001fc308260c0af5b00000000) dm-1
NEXSAN,SATABeast
[size=5.5T][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
  \_ 1:0:1:0 sdab 65:176 [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 1:0:4:0 sdak 66:64  [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 0:0:0:0 sda  8:0    [active][undef]
\_ round-robin 0 [prio=0][enabled]
  \_ 0:0:5:0 sdp  8:240  [active][undef]
Bernd Schubert wrote:
 > On Mon, May 05, 2008 at 12:30:23PM -0400, jrs wrote:
 >> I wonder if I'd have better luck, with the disappearing OST bug, if
 >> I actually explicitly partitioned the device and then used, to take
 >> the example above
 >>
 >>     /dev/mapper/ost_oss01_lustre0304_02-part1
 >>
 >> rather than the whole disk.
 >>
 >
 > What does /proc/partitions say?