Robert Heller
2011-Jul-22 19:08 UTC
[CentOS] Strange problem with LVM, device-mapper, and software RAID...
Running on an up-to-date CentOS 5.6 x86_64 machine:
[heller@ravel ~]$ uname -a
Linux ravel.60villagedrive 2.6.18-238.19.1.el5 #1 SMP Fri Jul 15 07:31:24 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
with a TYAN Computer Corp S4881 motherboard, which has an nVidia
4-channel SATA controller. It also has a Marvell Technology Group Ltd.
88SX7042 PCI-e 4-port SATA-II (rev 02) controller.
The system disk is a 120G SATA drive on the motherboard controller:
[heller@ravel ~]$ sudo /sbin/fdisk -l /dev/sda
Disk /dev/sda: 120.0 GB, 120034123776 bytes
255 heads, 63 sectors/track, 14593 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1         125     1004031   83  Linux
/dev/sda2             126       14593   116214210   8e  Linux LVM
/dev/sda1 is /boot and /dev/sda2 is the physical volume for an LVM
volume group (named "RavelSystem") with two logical volumes (named
"root" and "swap"), containing the root file system and a base 1G swap
area. So far so good.
On the Marvell controller are four 1.5TB disks arranged as a RAID10 array:
[heller@ravel ~]$ cat /proc/mdstat
Personalities : [raid10]
md1 : active raid10 sdg1[3] sdf1[2] sde1[1] sdd1[0]
2930270208 blocks 512K chunks 2 far-copies [4/4] [UUUU]
unused devices: <none>
[heller@ravel ~]$ sudo /sbin/mdadm --detail /dev/md1
/dev/md1:
Version : 0.90
Creation Time : Tue Jun 21 19:04:19 2011
Raid Level : raid10
Array Size : 2930270208 (2794.52 GiB 3000.60 GB)
Used Dev Size : 1465135616 (1397.26 GiB 1500.30 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 1
Persistence : Superblock is persistent
Update Time : Fri Jul 22 14:37:04 2011
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Layout : far=2
Chunk Size : 512K
UUID : 7a257206:a2b7c9d9:c5d004ae:2bdf6faf
Events : 0.578
    Number   Major   Minor   RaidDevice State
       0       8       49        0      active sync   /dev/sdd1
       1       8       65        1      active sync   /dev/sde1
       2       8       81        2      active sync   /dev/sdf1
       3       8       97        3      active sync   /dev/sdg1
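As a sanity check on the figures above, the reported Array Size can be reproduced from the Used Dev Size. The sketch below assumes that, with the far=2 layout, each disk is split into two equal sections and each section is rounded down to a whole number of 512K chunks. That rounding model is an assumption that happens to match the mdadm output; it is not taken from the md documentation.

```shell
#!/bin/sh
# Values copied from the mdadm --detail output (all sizes in KiB).
dev_kib=1465135616   # Used Dev Size
chunk_kib=512        # Chunk Size
devices=4            # Raid Devices
copies=2             # Layout: far=2

# Naive estimate: total raw space divided by the number of copies.
naive_kib=$(( dev_kib * devices / copies ))

# Assumed model: each disk holds $copies sections, each rounded down
# to a whole number of chunks; one section per disk is usable space.
section_kib=$(( dev_kib / copies / chunk_kib * chunk_kib ))
array_kib=$(( section_kib * devices ))

echo "naive:   $naive_kib KiB"
echo "rounded: $array_kib KiB"   # 2930270208, the Array Size mdadm reports
```

The naive estimate comes out 1024 KiB larger than the reported value, which is what half a chunk lost on each of the four disks would account for under this model.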
This RAID10 array has a second LVM volume group (named "RavelData2") on
it, also with two logical volumes (named "data" and "largeswap"): a
large data file system and a 16G swap area.
We are getting a strange message from device mapper on boot up:
[heller@ravel ~]$ dmesg | grep device-mapper -A 5 -B 5
sdg: Write Protect is off
sdg: Mode Sense: 00 3a 00 00
SCSI device sdg: drive cache: write back
sdg: sdg1
sd 7:0:0:0: Attached scsi disk sdg
device-mapper: uevent: version 1.0.3
device-mapper: ioctl: 4.11.5-ioctl (2007-12-12) initialised: dm-devel@redhat.com
device-mapper: dm-raid45: initialized v0.2594l
kjournald starting. Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
SELinux: Disabled at runtime.
SELinux: Unregistering netfilter hooks
type=1404 audit(1311358429.773:2): selinux=0 auid=4294967295 ses=4294967295
--
md: bind<sdg1>
md: running: <sdg1><sdf1><sde1><sdd1>
md: raid10 personality registered for level 10
raid10: raid set md1 active with 4 out of 4 devices
md: ... autorun DONE.
device-mapper: multipath: version 1.0.6 loaded
device-mapper: table: 253:6: linear: dm-linear: Device lookup failed
device-mapper: ioctl: error adding target to table
device-mapper: ioctl: device doesn't appear to be in the dev hash table.
EXT3 FS on dm-5, internal journal
kjournald starting. Commit interval 5 seconds
EXT3 FS on sda1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 180 seconds
'253:6' happens to be the large swap space:
[heller@ravel ~]$ dir -l /dev/mapper/
total 0
crw------- 1 root root 10, 63 Jul 22 14:13 control
brw-rw---- 1 root disk 253, 2 Jul 22 14:13 nvidia_biaabdaf
brw-rw---- 1 root disk 253, 3 Jul 22 14:13 nvidia_biaabdafp1
brw-rw---- 1 root disk 253, 0 Jul 22 14:13 nvidia_efjjfcad
brw-rw---- 1 root disk 253, 1 Jul 22 14:13 nvidia_efjjfcadp1
brw-rw---- 1 root disk 253, 7 Jul 22 14:13 RavelData2-data
brw-rw---- 1 root disk 253, 6 Jul 22 14:13 RavelData2-largeswap
brw-rw---- 1 root disk 253, 5 Jul 22 14:13 RavelSystem-root
brw-rw---- 1 root disk 253, 4 Jul 22 14:13 RavelSystem-swap
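The major:minor pair from the kernel message can be matched against these device nodes mechanically rather than by eye. A minimal sketch, assuming the `ls -l` column layout shown above; `dm_name_for` is a hypothetical helper name, not an existing tool:

```shell
#!/bin/sh
# Hypothetical helper: print the name of the block device node whose
# major/minor numbers match. Reads `ls -l /dev/mapper` style lines on stdin.
#   $1 = major number, $2 = minor number
dm_name_for() {
    awk -v major="$1" -v minor="$2" '
        # Block devices start with "b"; field 5 is "MAJOR," and field 6 is MINOR.
        $1 ~ /^b/ && $5 == (major ",") && $6 == minor { print $NF }
    '
}

# Intended use on the live system:
#   ls -l /dev/mapper/ | dm_name_for 253 6
```

Fed the listing above, this should report RavelData2-largeswap for 253:6. `dmsetup info -c` lists the same name and major/minor columns directly, if parsing `ls` output feels too fragile.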
The weirdness is this:
1) The large swap space is not being activated automatically on boot. It
can be manually activated with the swapon command, and as a stopgap
measure, I've added a swapon command to rc.local.
2) vgscan, vgchange, and vgdisplay are not seeing RavelData2, *even
though* the system has no problem mounting the data disk during boot
and swapon has no problem activating the largeswap swap area (at least
once the system is in full multi-user mode). And (apparently) the
device mapper is perfectly happy to do its thing (more or less) with
these logical volumes.
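The post does not show the exact rc.local line, so the following is only a sketch of what such a stopgap might look like. It assumes the swap volume is reachable as /dev/mapper/RavelData2-largeswap (the node name from the /dev/mapper listing) and that /proc/swaps lists active swap areas by the same path; `swap_is_active` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given path appears in the first column
# of a /proc/swaps-style file (the first line is a header and is skipped).
#   $1 = device path, $2 = file to read (defaults to /proc/swaps)
swap_is_active() {
    awk -v dev="$1" 'NR > 1 && $1 == dev { found = 1 } END { exit !found }' \
        "${2:-/proc/swaps}"
}

# Assumed device node; adjust if the volume appears under another path.
SWAPDEV=/dev/mapper/RavelData2-largeswap

# Only act if the node exists as a block device and is not already in use.
if [ -b "$SWAPDEV" ] && ! swap_is_active "$SWAPDEV"; then
    /sbin/swapon "$SWAPDEV"
fi
```

The check is a plain string comparison, so it will not notice a swap area that the kernel lists under a different name for the same device. Guarding the call this way keeps the workaround harmless if the boot-time activation ever starts working again.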
[heller@ravel ~]$ sudo /sbin/vgscan -v --ignorelockingfailure --mknodes -d
Wiping cache of LVM-capable devices
Wiping internal VG cache
Reading all physical volumes. This may take a while...
Finding all volume groups
Finding volume group "RavelSystem"
Found volume group "RavelSystem" using metadata type lvm2
Finding all logical volumes
[heller@ravel ~]$ sudo /sbin/vgchange -a y
2 logical volume(s) in volume group "RavelSystem" now active
[heller@ravel ~]$ sudo /sbin/vgchange -a y RavelData2
Volume group "RavelData2" not found
[heller@ravel ~]$ sudo /usr/sbin/vgdisplay RavelData2
Volume group "RavelData2" not found
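One check that might narrow this down is to ask device-mapper which block device the RavelData2 volumes are actually mapped onto, and compare that with the physical volumes the LVM tools report. A minimal sketch, assuming the usual `dmsetup table` line format for linear targets (`start length linear major:minor offset`, optionally prefixed by `name:`); `linear_backing_devs` is a hypothetical helper, and what it would print on this machine is not known.

```shell
#!/bin/sh
# Hypothetical helper: read `dmsetup table` output on stdin and print the
# distinct major:minor backing devices used by linear targets.
linear_backing_devs() {
    awk '{
        # The backing device is the field right after the word "linear".
        for (i = 1; i < NF; i++)
            if ($i == "linear") print $(i + 1)
    }' | sort -u
}

# Intended use on the live system (as root):
#   dmsetup table | grep '^RavelData2-' | linear_backing_devs
#   pvs -o pv_name,vg_name
```

Since md devices use major number 9, a result of 9:1 would correspond to /dev/md1. Any other pair would suggest the volumes were set up on a different device than the one the LVM tools are scanning.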
Does anyone have any guesses as to what is going on here?
--
Robert Heller -- 978-544-6933 / heller at deepsoft.com
Deepwoods Software -- http://www.deepsoft.com/
() ascii ribbon campaign -- against html e-mail
/\ www.asciiribbon.org -- against proprietary attachments
