Benjamin Karhan
2006-Sep-02 20:51 UTC
[CentOS] CentOS 4.4 LILO Raid (also e2fsprogs in general)
I noticed that the 4.4 upgrade broke LILO on RAID. It gave errors on partitions not on the primary disk, along the lines of "/dev/dis: no such device". Even explicitly installing the boot sector on each device in the array, while it allowed the system to boot, gave errors at boot time and no longer displayed the CentOS boot screen.

I solved the problem by creating a package for the most recent version of LILO and installing that. With a few minor changes to "lilo.conf", everything was kosher for LILO install and boot. I even made a nice new 640x480x256 CentOS boot screen for it. There is one caveat to using the new LILO: the new version (or at least one above 22.x) must be booted once in text mode before LILO can correctly probe the video BIOS. So the boot screen needs to be installed after an initial priming boot.

Afterwards, I did some thorough testing before deploying on any production servers, and noticed only one problem: the new version of LILO broke the "grubby" probe, primarily because "/boot/boot.b" is no longer needed, but also because the comparisons between "boot.b" and the actual boot sector aren't accurate anymore either. Since I'd noticed that "grubby"'s probe for GRUB itself has been broken for a while, I decided a quick patch job for the "mkinitrd" package was in order. My new package correctly identifies whether either LILO or GRUB has been installed at all, but lacks the careful byte comparisons that were the core of the breakage in the original "grubby". It's possibly problematic, but I doubt it, and it solves more problems than it creates.

Anyway, that solved my 4.4 upgrade breakage. And since I was writing to the list, I figured I should mention a rather serious, but extremely unlikely to happen, bug I noticed in "e2fsprogs" (specifically "e2fsck") in the version shipped with CentOS.
"e2fsck" will cause some big problems and potential data loss on directories containing more than 1.2 million inodes... the first 1.2 million inodes will be ok... the next 1.2 million (max) will be saved to /lost+found as unidentified files... before it too fills up... and then everything else will just "vanish"... this problem is due to some incorrect math in the handling of the maximum inode size... and has already been fixed in e2fsprogs 1.38... so, my solution was to take the 1.38 Fedora Core sources and build a package for CentOS... if anyone is particularly interested... i have the SRPMS for my lilo, mkinitrd, and e2fsprogs packages... also, i have the pretty new CentOS boot screen (and a few others i made) and some working lilo configurations for switching over to them... all of which i could probably post somewhere public... but if they are really useful, i'd rather ship them along to someone more closely involved in maintaining CentOS... so as to "contribute" as best i could to everyone's future well-being as well... B. Karhan simon at pop.psu.edu PRI/SSRI Unix Administrator