m.roth at 5-cent.us
2016-Feb-18 21:25 UTC
[CentOS] CentOS 7, Xeon CPUs, not booting, [SOLVED], bug filed
Paul Heinlein wrote:
> On Thu, 18 Feb 2016, m.roth at 5-cent.us wrote:
>
>> This is happening on anything other than plain vanilla Dell servers. One
>> R730, with dual Tesla cards, one R420, with a fibre card for a RAID
>> device, it never switches root. All these systems have Xeons, not AMD
>> CPUs.
>>
>> We've had this with every one of the 327 kernels. In addition, it seems
>> to happen also with the 229.20.1; the 229.14.1 has no such problem.
>>
>> From the rdsosreport:
>> starting at line 126:
>> /dev/disk/by-label:
>> total 0
>> lrwxrwxrwx 1 root 0 10 Jan 27 19:03 SWAP -> ../../sda2
>> lrwxrwxrwx 1 root 0 10 Jan 27 19:03 \x2f -> ../../sda3
>> lrwxrwxrwx 1 root 0 10 Jan 27 19:03 \x2fboot -> ../../sda1
>>
>> Then, starting at line 1283:
>> [ 3.317027] <servername> systemd[1]: Found device ST500NM0003-9ZM172 /.
>> [ 3.317974] <servername> systemd[1]: Starting File System Check on /dev/disk/by-label/\x2f...
>> [ 3.320089] <servername> systemd-fsck[590]: Failed to detect device /dev/disk/by-label//
>> [ 3.320567] <servername> systemd[1]: systemd-fsck-root.service: main process exited, code=exited, status=1/FAILURE
>> [ 3.320972] <servername> systemd[1]: Failed to start File System Check on /dev/disk/by-label/\x2f.
>>
>> Does *ANYONE* have any clues as to what's going on?
>>
>> Meanwhile, on a plain vanilla Dell R420, I see:
>> ll /dev/disk/by-label/
>> total 0
>> lrwxrwxrwx. 1 root root 10 Feb 17 10:06 SWAP -> ../../sda2
>> lrwxrwxrwx. 1 root root 10 Feb 17 10:06 boot -> ../../sda1
>> lrwxrwxrwx. 1 root root 10 Feb 17 10:06 root -> ../../sda3
>>
>> So, what is this by-label with the x2f, and why can't it find the
>> drives?
>>
>> Or do I have to file a bug report? This is a true show-stopper.
>
> Here are a few related thoughts:
>
> The 'x2f' looks to me very similar to %2F, the URL encoding for
> the forward slash (/).
>
> If you look in /usr/lib/udev/rules.d, you'll see rules like
>
>   ENV{ID_FS_USAGE}=="filesystem|other", ENV{ID_FS_LABEL_ENC}=="?*",
>   SYMLINK+="disk/by-label/$env{ID_FS_LABEL_ENC}"
>
> where, if ID_FS_LABEL_ENC were equal to "/", then the rule would be
> disk/by-label// -- with two trailing slashes, which (perhaps) gets
> interpreted not as one slash (like cd might do) but as "/x2f".
>
> That's the end of random thought #1.
>
> The second is like it:
>
> A local C7 machine has this root entry in /etc/fstab:
>
>   /dev/mapper/vg00-rootdev / xfs defaults 0 0
>
> When I search my system logs for messages like the ones in your
> original post, I see
>
>   systemd: Found device /dev/mapper/vg00-rootdev.
>   systemd: Starting File System Check on /dev/mapper/vg00-rootdev...
>
> It's only after that's complete that I get device-specific messages
> like
>
>   systemd: Found device ST9600204SS.
>
> So I'm interested to know the content of your /etc/fstab file.
>
> End of thought #2.

I just successfully brought up one that consistently failed, and filed a
bug report, 0010398.

What I did:
1. In /etc/fstab, I changed LABEL= to /dev/sda*.
2. I rebuilt the initramfs with that.
That still didn't do it.

Finally, I did this: from the grub2 boot menu, I edited the kernel line so
that instead of reading ... root=LABEL=/, it read root=/dev/sda3, and it
booted with zero issues.

There is, therefore, a bug in grub2? the handoff to systemd? where it does
not handle LABEL correctly.

      mark
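A note on the \x2f names in /dev/disk/by-label quoted above: they look like the hex escaping that udev applies to ID_FS_LABEL_ENC, where a character that is not safe in a symlink name is written as \xNN. The Python below is only a rough sketch of that idea, not udev's code. The function name encode_label and the exact safe-character set are assumptions made for illustration, and the sketch escapes every non-ASCII byte, which is a simplification.

```python
import string

# Assumed set of characters that pass through unescaped; illustrative only.
SAFE = set(string.ascii_letters + string.digits + "#+-.:=@_")


def encode_label(label):
    """Return label with every byte outside SAFE written as \\xNN (hypothetical helper)."""
    out = []
    for byte in label.encode("utf-8"):
        ch = chr(byte)
        if ch in SAFE:
            out.append(ch)
        else:
            # Two lowercase hex digits, e.g. "/" (0x2f) becomes \x2f.
            out.append("\\x%02x" % byte)
    return "".join(out)


if __name__ == "__main__":
    for label in ("/", "/boot", "SWAP"):
        print(repr(label), "->", encode_label(label))
```

Under these assumptions a label of "/" would yield the symlink name \x2f and "/boot" would yield \x2fboot, which matches the directory listing in the rdsosreport.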
m.roth at 5-cent.us
2016-Feb-18 22:25 UTC
[CentOS] CentOS 7, Xeon CPUs, not booting, [SOLVED], bug filed
<SNIP>
> What I did:
> 1. in /etc/fstab, I changed LABEL= to /dev/sda*
> 2. I did rebuild the initramfs with that.
> That still didn't do it.
>
> Finally, I did this: from the grub2 boot menu, I edited the kernel line so
> that instead of reading ... root=LABEL=/, it read root=/dev/sda3, and it
> booted with zero issues.
>
> There is, therefore, a bug in grub2? the handoff to systemd? where it does
> not handle LABEL correctly.
>
One more bit of information, which I added to the bug report: using
e2label, I relabeled /boot and / to boot and root, and edited /etc/fstab
and /etc/grub2.cfg to reflect that... and it booted with no trouble. I
believe that a month ago, I neglected to edit grub2.cfg.

Note that /dev/sdd1 and /dev/sde1, which both have labels that begin with
a leading slash, mounted correctly. This, to me, indicates the bug is with
grub2's handling of LABEL=.

      mark
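For anyone repeating the relabeling workaround on several machines, the /etc/fstab part of it can be sketched in Python. This is a hypothetical helper, not something used in this thread: the name relabel_fstab, the sample fstab text, and the ext4 filesystem type are all assumptions, and it only transforms text in memory. It does not run e2label and does not touch grub2.cfg.

```python
def relabel_fstab(text, mapping):
    """Rewrite LABEL=<old> device fields using mapping (hypothetical helper)."""
    out = []
    for line in text.splitlines():
        stripped = line.lstrip()
        # Skip blank lines and comments untouched.
        if stripped and not stripped.startswith("#"):
            spec = stripped.split(None, 1)[0]  # first field: the device spec
            if spec.startswith("LABEL="):
                old = spec[len("LABEL="):]
                if old in mapping:
                    # Replace only the first occurrence, i.e. the device field.
                    line = line.replace(spec, "LABEL=" + mapping[old], 1)
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Made-up fstab content for illustration.
    sample = (
        "LABEL=/      /      ext4 defaults 1 1\n"
        "LABEL=/boot  /boot  ext4 defaults 1 2\n"
        "LABEL=SWAP   swap   swap defaults 0 0\n"
    )
    print(relabel_fstab(sample, {"/": "root", "/boot": "boot"}), end="")
```

The mount points in the second column are left alone; only the LABEL= field changes, so the new labels still have to be written to the filesystems themselves and to the kernel line in the grub2 configuration.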
Gordon Messmer
2016-Feb-18 23:36 UTC
[CentOS] CentOS 7, Xeon CPUs, not booting, [SOLVED], bug filed
On Thu, Feb 18, 2016 at 2:25 PM, <m.roth at 5-cent.us> wrote:
>
> Note that /dev/sdd1 and /dev/sde1, which both have labels that begin with
> a leading slash, mounted correctly. This, to me, indicates the bug is with
> grub2's handling of LABEL=.

I'm pretty sure grub2 just passes strings to the kernel. Also, if you're
able to select an older kernel and the system boots, then the signs
probably point to a problem with the kernel's handling of labels.
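One way to check what string actually reached the kernel is to look at the root= argument in /proc/cmdline on a booted system. The sketch below is a minimal illustration with an assumed helper name, root_argument. It splits on whitespace only, so it would not cope with quoted values that contain spaces.

```python
def root_argument(cmdline):
    """Return the value of the first root= token, or None (hypothetical helper)."""
    for token in cmdline.split():
        if token.startswith("root="):
            return token[len("root="):]
    return None


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as fh:
            print(root_argument(fh.read()))
    except OSError:
        # Not on Linux, or /proc is unavailable: fall back to a made-up example.
        print(root_argument("BOOT_IMAGE=/vmlinuz root=LABEL=/ ro"))
```

If this prints LABEL=/ on an affected machine, the label was handed over unchanged, and the failure would have to lie in whatever resolves the label after that point.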