thr3ads.net - CentOS - [CentOS] Unable to boot CentOS 6

John Cenile

2016-May-29 00:27 UTC

[CentOS] Unable to boot CentOS 6 - Segmentation Error

Hi all,
I had an issue this morning with one of my virtual machines. It wouldn't
boot (into any runlevel), nor could I chroot into the root partition using
a rescue disk.

Unfortunately I didn't grab a screenshot, however the error(s) when booting
were:


/pre-pivot/50selinux-loadpolicy.sh: 14

<other messages>

init: readahead main process (425) killed by SEGV signal
init: readahead-collector main process (421) killed by SEGV signal
init: rcS pre-start process (425) killed by SEGV signal
init: Error while reading from descriptor: Bad file descriptor
init: readahead-collector post-stop process (424) killed by SEGV signal
init: rcS post-stop process (427) killed by SEGV signal
init: readahead-disable-services main process (428) killed by SEGV signal


When using a rescue CD and chrooting into the root partition, I would get:

Segmentation Fault: Core Dumped


In the end, the fix was to boot into a rescue CD with networking, and SCP
the entire contents of /bin and /sbin from another (working) server to the
broken installation. This finally allowed CentOS 6 to boot correctly.

So I'm left to assume some of the files in /bin *or* /sbin were corrupt.
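
One way to test that assumption would be to compare checksums between the known-good tree and a preserved copy of the suspect one. The following is only a sketch: the function names and the idea of having both trees available as local directories are assumptions, not something done in this thread. Differences are also expected if the two servers run different package versions or if prelink has modified the binaries.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def differing_files(good_root, suspect_root):
    """List (relative_path, reason) for regular files under good_root
    that are missing or have different content under suspect_root.
    Symlinks are skipped; files present only in suspect_root are ignored."""
    good_root, suspect_root = Path(good_root), Path(suspect_root)
    diffs = []
    for good in sorted(good_root.rglob("*")):
        if good.is_symlink() or not good.is_file():
            continue
        rel = good.relative_to(good_root)
        suspect = suspect_root / rel
        if suspect.is_symlink() or not suspect.is_file():
            diffs.append((str(rel), "missing"))
        elif sha256_of(good) != sha256_of(suspect):
            diffs.append((str(rel), "content differs"))
    return diffs
```

For example, `differing_files("/mnt/good/bin", "/mnt/broken/bin")` (both paths hypothetical) would return the files worth a closer look.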

My question is, does anyone have any ideas on how this might have happened?
I did do a quick memory test using the rescue CD (it didn't complete) and
there weren't any errors. The virtual machine is running on VMware with 3
other VMs, which all seem fine. There wasn't any unexpected power loss
either.

Thanks.

John Cenile

2016-May-29 00:42 UTC


Also, the last message in /var/log/messages before the crash was:


^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ [long run of NUL bytes (^@) trimmed] ^@^@^@^@^@^@^@^@May 29 07:30:10 *hostname* kernel: imklog 5.8.10, log source = /proc/kmsg started


Which seems very concerning.
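
The ^@ characters are how NUL (zero) bytes are displayed. To see exactly where such a run starts in a log and how long it is, something like the sketch below could help. The function name and the `min_len` threshold are made up for illustration; it only locates the runs and says nothing about what caused them.

```python
def find_nul_runs(data, min_len=16):
    """Return a list of (offset, length) for each run of NUL bytes
    in `data` that is at least `min_len` bytes long."""
    runs = []
    start = None
    for i, byte in enumerate(data):
        if byte == 0:
            if start is None:
                start = i  # a new run begins here
        elif start is not None:
            if i - start >= min_len:
                runs.append((start, i - start))
            start = None
    # Handle a run that extends to the end of the data.
    if start is not None and len(data) - start >= min_len:
        runs.append((start, len(data) - start))
    return runs
```

Usage would be along the lines of `find_nul_runs(open("/var/log/messages", "rb").read())`, ideally against a copy of the file taken from the read-only mounted disk.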

cpolish at surewest.net

2016-May-29 17:03 UTC

On 2016-05-29 10:42, John Cenile wrote:
> Also, the last message in /var/log/messages before the crash was:
> <snip />
> ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@May 29 07:30:10 *hostname* kernel: imklog 5.8.10, log source = /proc/kmsg started
>
> Which seems very concerning.

Hi John,

TL;DR: prevention.

I can't say what happened, but I've a long-standing dread of
your situation. Here are some ways to prepare for (or prevent)
the next time this happens. Possibly you're already doing all
this, but a recitation here might help someone else too.

- Set up remote logging. I favor rsyslog, but you can also
  use syslog-ng. Have one central logging server. This way you 
  can look for signs of trouble that preceded the crash.
 
- Keep baselines from the guest VMs. You can run rpm --verify 
  and preserve the output off-host (last step in yum update).
  Disable the nightly prelink behavior (was this ever a good
  idea?) to make comparing results more meaningful.
  Post-crash, mount the victim read-only and re-run the verify
  to pin-point what part of the filesystem was clobbered.
  Knowing what was clobbered (and when) can help. Not long ago
  an errant script in production cleared the wrong
  directory but only when transaction volume crested some
  threshold, wiping out a critical monitoring script.

- Treat your hosts like cattle, not pets. Automating creation
  and maintenance of hosts gives you more and better options 
  for recovery when hosts go insane.

- Test and re-test your storage system. There are bugs lurking
  in every storage code base and every HBA's firmware. The
  physical connectors in your data path are built on a mass
  of compromises and contradictory design goals and are just 
  waiting to fail. Flush bugs out before putting gear into
  production.

- Restores, not backups, are your friends.[1] I ran into a
  bug in GNU tar (this year) that left me with silently
  corrupted archives but only for thin-provisioned virtual 
  filesystems >16GB that compressed to <8GB. Only a full 
  restore unearthed the ugly truth.

- Consider ECC RAM. Once you have a few tens of GBs, you've
  essentially got your own cosmic ray detector. If you
  figure your time at $50/hour and it takes ten hours to deal
  with one ephemeral mysterious incident, then springing
  for $500 worth of ECC RAM is a good bet. Figure in the cost
  of downtime and it's a no-brainer.
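
The baseline comparison from the list above could be sketched roughly as follows. This assumes two saved outputs of `rpm --verify --all` (one taken after the last update, one taken post-crash from the read-only mount), and assumes each line has the usual shape: a flags field (or the word `missing`), an optional one-letter attribute marker such as `c`, and an absolute path. The function names are hypothetical, and lines that don't fit that shape (for example dependency messages) are simply skipped.

```python
def parse_verify(text):
    """Map path -> flags for each file line in saved rpm verify output."""
    result = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) != 2:
            continue
        flags, rest = parts
        rest_parts = rest.split(None, 1)
        # Drop the optional one-letter attribute marker (c, d, g, l, r).
        if (len(rest_parts) == 2 and len(rest_parts[0]) == 1
                and rest_parts[1].startswith("/")):
            path = rest_parts[1]
        else:
            path = rest
        if path.startswith("/"):  # ignore non-file lines
            result[path] = flags
    return result


def new_findings(baseline_text, current_text):
    """Return entries in the current output that are absent from,
    or different in, the baseline output."""
    baseline = parse_verify(baseline_text)
    current = parse_verify(current_text)
    return {p: f for p, f in current.items() if baseline.get(p) != f}
```

Anything returned by `new_findings` changed since the baseline was taken, which narrows down what was clobbered and roughly when.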

Best regards,
-- 
Charles Polisher

[1]
http://web.archive.org/web/20070920215346/http://people.qualcomm.com/ggr/GoB.txt
