This is happening on anything other than plain vanilla Dell servers. One R730, with dual Tesla cards, one R420, with a fibre card for a RAID device, it never switches root. All these systems have Xeons, not AMD CPUs.

We've had this with every one of the 327 kernels. In addition, it seems to happen also with the 229.20.1; the 229.14.1 has no such problem.

From the rdsosreport:
starting at line 126:
/dev/disk/by-label:
total 0
lrwxrwxrwx 1 root 0 10 Jan 27 19:03 SWAP -> ../../sda2
lrwxrwxrwx 1 root 0 10 Jan 27 19:03 \x2f -> ../../sda3
lrwxrwxrwx 1 root 0 10 Jan 27 19:03 \x2fboot -> ../../sda1

Then, starting at line 1283:
[ 3.317027] <servername> systemd[1]: Found device ST500NM0003-9ZM172 /.
[ 3.317974] <servername> systemd[1]: Starting File System Check on /dev/disk/by-label/\x2f...
[ 3.320089] <servername> systemd-fsck[590]: Failed to detect device /dev/disk/by-label//
[ 3.320567] <servername> systemd[1]: systemd-fsck-root.service: main process exited, code=exited, status=1/FAILURE
[ 3.320972] <servername> systemd[1]: Failed to start File System Check on /dev/disk/by-label/\x2f.

Does *ANYONE* have any clues as to what's going on?

Meanwhile, on a plain vanilla Dell R420, I see:
ll /dev/disk/by-label/
total 0
lrwxrwxrwx. 1 root root 10 Feb 17 10:06 SWAP -> ../../sda2
lrwxrwxrwx. 1 root root 10 Feb 17 10:06 boot -> ../../sda1
lrwxrwxrwx. 1 root root 10 Feb 17 10:06 root -> ../../sda3

So, what is this by-label with the x2f, and why can't it find the drives?

Or do I have to file a bug report? This is a true show-stopper.

      mark
On Thu, 18 Feb 2016, m.roth at 5-cent.us wrote:
> So, what is this by-label with the x2f, and why can't it find the drives?
>
> Or do I have to file a bug report? This is a true show-stopper.

Here are a few related thoughts:

The 'x2f' looks to me very similar to %2F, the URL encoding for the forward slash (/).

If you look in /usr/lib/udev/rules.d, you'll see rules like

  ENV{ID_FS_USAGE}=="filesystem|other", ENV{ID_FS_LABEL_ENC}=="?*", SYMLINK+="disk/by-label/$env{ID_FS_LABEL_ENC}"

where, if ID_FS_LABEL_ENC were equal to "/", then the rule would be disk/by-label// -- with two trailing slashes, which (perhaps) gets interpreted not as one slash (like cd might do) but as "/x2f".

That's the end of random thought #1.

The second is like it:

A local C7 machine has this root entry in /etc/fstab:

  /dev/mapper/vg00-rootdev / xfs defaults 0 0

When I search my system logs for messages like the ones in your original post, I see

  systemd: Found device /dev/mapper/vg00-rootdev.
  systemd: Starting File System Check on /dev/mapper/vg00-rootdev...

It's only after that's complete that I get device-specific messages like

  systemd: Found device ST9600204SS.

So I'm interested to know the content of your /etc/fstab file.

End of thought #2.

-- 
Paul Heinlein <> heinlein at madboa.com <> http://www.madboa.com/
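For what it's worth, 0x2f is the hex value of "/", so the \x2f and \x2fboot symlinks in the rdsosreport look like the labels "/" and "/boot" with the slash hex-escaped, since a slash cannot appear inside a file name. Here is a minimal Python sketch of that kind of escaping. The helper name encode_label and the SAFE_CHARS set are assumptions made for illustration; the set only approximates what udev keeps verbatim, and the sketch escapes every non-ASCII byte, which is a simplification.

```python
import string

# Characters assumed safe to keep as-is. This approximates udev's
# whitelist and is an assumption of this sketch, not a copy of it.
SAFE_CHARS = set(string.ascii_letters + string.digits + "#+-.:=@_")


def encode_label(label: str) -> str:
    """Hex-escape every byte outside SAFE_CHARS as \\xNN (hypothetical helper)."""
    out = []
    for byte in label.encode("utf-8"):
        ch = chr(byte)
        if ch in SAFE_CHARS:
            out.append(ch)
        else:
            # e.g. "/" is byte 0x2f, so it becomes the four characters \x2f
            out.append("\\x{:02x}".format(byte))
    return "".join(out)


# Labels seen in this thread: "/" and "/boot" on the failing machines,
# "root", "boot" and "SWAP" on the working R420.
for label in ("/", "/boot", "SWAP", "root", "boot"):
    print(f"{label!r:8} -> /dev/disk/by-label/{encode_label(label)}")
```

Under these assumptions a label of "/" ends up as the symlink name \x2f, which matches the listing, while the systemd-fsck error in the log names the unescaped path /dev/disk/by-label// instead.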
m.roth at 5-cent.us
2016-Feb-18 21:25 UTC
[CentOS] CentOS 7, Xeon CPUs, not booting, [SOLVED], bug filed
Paul Heinlein wrote:
> On Thu, 18 Feb 2016, m.roth at 5-cent.us wrote:
>> So, what is this by-label with the x2f, and why can't it find the
>> drives?
>
> So I'm interested to know the content of your /etc/fstab file.

I just successfully brought up one that consistently failed. And filed a bug report, 0010398.

What I did:
1. in /etc/fstab, I changed LABEL= to /dev/sda*
2. I did rebuild the initramfs with that.

That still didn't do it. Finally, I did this: from the grub2 boot menu, I edited the kernel line so that instead of reading ... root=LABEL=/, it read root=/dev/sda3, and it booted with zero issues.

There is, therefore, a bug in grub2? the handoff to systemd? where it does not handle LABEL correctly.

      mark
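The two hand edits described above (LABEL= entries in /etc/fstab, then root=LABEL=/ on the kernel line) can be sketched in Python as follows. This is only an illustration: the label-to-device map is read off the by-label listing in the rdsosreport, while the function names, the sample fstab line, and the extra kernel arguments are hypothetical, since the real files were not posted.

```python
# Label-to-device mapping read off the by-label listing in the rdsosreport
# (\x2f taken to mean the label "/", \x2fboot the label "/boot").
LABEL_TO_DEVICE = {
    "SWAP": "/dev/sda2",
    "/": "/dev/sda3",
    "/boot": "/dev/sda1",
}


def resolve_spec(spec: str) -> str:
    """Turn 'LABEL=<name>' into a device path; leave anything else alone."""
    prefix = "LABEL="
    if spec.startswith(prefix):
        return LABEL_TO_DEVICE.get(spec[len(prefix):], spec)
    return spec


def rewrite_fstab_line(line: str) -> str:
    """Replace a LABEL= first field in one fstab line (whitespace is collapsed)."""
    fields = line.split()
    if not fields or fields[0].startswith("#"):
        return line  # blank line or comment: keep unchanged
    fields[0] = resolve_spec(fields[0])
    return " ".join(fields)


def rewrite_cmdline(cmdline: str) -> str:
    """Replace the value of root= on a kernel command line."""
    args = []
    for arg in cmdline.split():
        if arg.startswith("root="):
            arg = "root=" + resolve_spec(arg[len("root="):])
        args.append(arg)
    return " ".join(args)


# Hypothetical inputs; only root=LABEL=/ comes from the thread.
print(rewrite_fstab_line("LABEL=/boot /boot ext4 defaults 1 2"))
print(rewrite_cmdline("ro root=LABEL=/ rhgb quiet"))
```

Note that plain /dev/sdX names can change if disks are added or reordered, so this mirrors the workaround rather than recommending it as a permanent setup.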