mitch@bitblock.net
2013-Sep-29 20:59 UTC
Re: fresh xcp 1.6 & xenserver install fails on first boot (kernel not found). [SOLVED]
So I've found the issue... but I don't yet understand it. If anyone has any ideas, feel free to share.

To keep a long story short (because it took days and I don't want to bore anyone), I went through a process of a/b testing to narrow down the conditions that reproduced the failed install. Nearing the end I found that it was only happening on equipment located in one particular cabinet. Then I narrowed it down to being reproducible on only ONE machine, and that one machine could produce the problem with either raid controller. Here's the key - it would only happen when the unit was racked.

To all those who struggle after me, a note on determining the modules in use (from when I thought the issue might be a driver missing from the running kernel): lspci -k shows the kernel driver in use for each PCI device. Opening up / extracting the initrd then lets you look for the missing module. In my case nothing was missing.

So I continued to work on isolating the issue. It turns out it was happening ONLY when a specific serial db9 to rj45 adapter was plugged in. That same adapter (or at least a similar adapter) is in use on ALL our machines - however, if I disconnect or replace the one on this one machine, the problem goes away. I suspect some sort of short in the adapter which is somehow affecting the server, but even then I'm stumped as to how something on a serial port can affect the boot process, and why it doesn't affect windows or vmware booting on the same machine (the machine was once a windows machine, and until the attempted reinstall was a functioning esxi host).

At any rate, I have the adapter labelled, and have looked inside it for a solder blob - can't see anything yet. It is conclusively the issue though. I can crash the boot process by connecting it, and allow it to work properly by disconnecting it.
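For anyone repeating the module check described above (comparing what lspci -k reports against the contents of an extracted initrd), here is a rough sketch of how the comparison could be scripted. It is not what was actually run during this troubleshooting. The function names, the sample lspci text, and the sample file listing are all made up for illustration, and it assumes the driver name matches the module file name, which is not always true (drivers built into the kernel have no .ko file at all).

```python
import re

def drivers_in_use(lspci_output):
    """Collect names from 'Kernel driver in use:' lines of `lspci -k` output."""
    drivers = set()
    for line in lspci_output.splitlines():
        match = re.match(r"\s*Kernel driver in use:\s*(\S+)", line)
        if match:
            drivers.add(match.group(1))
    return drivers

def modules_in_listing(paths):
    """Turn a list of file paths (e.g. from an extracted initrd) into module names."""
    names = set()
    for path in paths:
        base = path.rsplit("/", 1)[-1]
        # Accept plain .ko files and compressed ones such as .ko.gz or .ko.xz.
        match = re.match(r"(.+?)\.ko(\.\w+)?$", base)
        if match:
            # Module files may use '-' where the loaded module name uses '_'.
            names.add(match.group(1).replace("-", "_"))
    return names

def missing_modules(lspci_output, initrd_paths):
    """Return drivers reported as in use that have no matching module file."""
    present = modules_in_listing(initrd_paths)
    wanted = {d.replace("-", "_") for d in drivers_in_use(lspci_output)}
    return sorted(wanted - present)

# Hypothetical sample data, NOT taken from the machine in this thread.
SAMPLE_LSPCI = """\
00:1f.2 SATA controller: Example Vendor AHCI Controller
\tKernel driver in use: ahci
\tKernel modules: ahci
03:00.0 RAID bus controller: Example Vendor RAID Controller
\tKernel driver in use: megaraid_sas
\tKernel modules: megaraid_sas
"""

SAMPLE_INITRD_FILES = [
    "lib/modules/example/kernel/drivers/ata/ahci.ko",
    "lib/modules/example/kernel/drivers/ata/libahci.ko",
]

if __name__ == "__main__":
    print(missing_modules(SAMPLE_LSPCI, SAMPLE_INITRD_FILES))
```

In practice the first argument would be the captured output of lspci -k and the second a file listing of the unpacked initrd. An empty result matches what I saw: nothing was missing.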
Thanks to everyone who helped me as I looked for what Ian correctly identified as a "red herring". I have yet to receive any response on any of the other lists / forums - I guess the issue was overly unusual. I'm very grateful that there are people on this list who can respond, even when it's outside the scope of the list.

Thanks guys!

Mitch

-----Original Message-----
From: Ian Campbell [mailto:Ian.Campbell@citrix.com]
Sent: September 24, 2013 2:40 AM
To: Mitch (BitBlock)
Cc: 'eneal@businessgrade.com'; 'Xen-users@lists.xen.org'; 'jaceksburghardt@gmail.com'
Subject: Re: [Xen-users] fresh xcp 1.6 & xenserver install fails on first boot (kernel not found).

On Tue, 2013-09-24 at 00:02 +0000, mitch@bitblock.net wrote:
> Someone mentioned I should verify the installed kernel had the proper
> driver in it (which the iso obviously has as I can mount the file
> system from it).

I think this is a red-herring. From the description you have given I don't think you are getting anywhere near the point at which the kernel would want a driver for the hardware. You are failing at the bootloader stage to even load the kernel in the first place.

Are you able to get a full log of the boot, gibberish and all, perhaps using a serial console?

Have you validated the content of /boot/extlinux.conf?