I thought I'd summarise this with a proper subject line:

1. We used up2date to upgrade the kernel of a 7.2 machine that is doing far too much journalling (kjournald often at 50%+ CPU).

2. It installed fine, but when we reboot, GRUB only shows the old 2.4.7-10 although there are 3 kernels listed in grub.conf.

My question is: "How can we select booting to 2.4.18-24.7 when GRUB only lists the oldest kernel?"

Here are the contents of the directory /boot and of grub.conf:

<SNIP>
[root@ns5 boot]# ls -al
total 10908
drwxr-xr-x    3 root     root         4096 Feb  6 09:27 .
drwxr-xr-x   23 root     root         4096 Feb  6 09:56 ..
-rw-r--r--    1 root     root         5824 Jun 25  2001 boot.b
-rw-r--r--    1 root     root          612 Jun 25  2001 chain.b
-rw-r--r--    1 root     root        42268 Nov 14 01:50 config-2.4.18-18.7.x
-rw-r--r--    1 root     root        42257 Jan 31 12:20 config-2.4.18-24.7.x
drwxr-xr-x    2 root     root         4096 Feb  6 09:16 grub
-rw-r--r--    1 root     root       126906 Dec 16 10:34 initrd-2.4.18-18.7.x.img
-rw-r--r--    1 root     root       127368 Feb  6 09:15 initrd-2.4.18-24.7.x.img
-rw-r--r--    1 root     root       337546 Feb 22  2002 initrd-2.4.7-10.img
lrwxrwxrwx    1 root     root           14 Dec 16 10:33 kernel.h -> kernel.h-2.4.9
-rw-r--r--    1 root     root          405 Dec 16 10:33 kernel.h-2.4.9
-rw-r--r--    1 root     root        23108 Jun 25  2001 message
lrwxrwxrwx    1 root     root           25 Feb  6 09:15 module-info -> module-info-2.4.18-24.7.x
-rw-r--r--    1 root     root        15436 Nov 14 01:50 module-info-2.4.18-18.7.x
-rw-r--r--    1 root     root        15436 Jan 31 12:20 module-info-2.4.18-24.7.x
-rw-r--r--    1 root     root        13598 Sep  6  2001 module-info-2.4.7-10
-rw-r--r--    1 root     root          640 Jun 25  2001 os2_d.b
lrwxrwxrwx    1 root     root           19 Feb  6 09:27 System.map -> System.map-2.4.7-10
-rw-r--r--    1 root     root       490460 Nov 14 01:50 System.map-2.4.18-18.7.x
-rw-r--r--    1 root     root       490643 Jan 31 12:20 System.map-2.4.18-24.7.x
-rw-r--r--    1 root     root       435039 Sep  6  2001 System.map-2.4.7-10
-rwxr-xr-x    1 root     root      2983920 Nov 14 01:50 vmlinux-2.4.18-18.7.x
-rwxr-xr-x    1 root     root      2986554 Jan 31 12:20 vmlinux-2.4.18-24.7.x
lrwxrwxrwx    1 root     root           21 Feb  6 09:15 vmlinuz -> vmlinuz-2.4.18-24.7.x
-rw-r--r--    1 root     root      1064284 Nov 14 01:50 vmlinuz-2.4.18-18.7.x
-rw-r--r--    1 root     root      1064546 Jan 31 12:20 vmlinuz-2.4.18-24.7.x
-rw-r--r--    1 root     root       802068 Sep  6  2001 vmlinuz-2.4.7-10
</SNIP>

<SNIP>
[root@ns5 grub]# cat grub.conf
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE:  You do not have a /boot partition.  This means that
#          all kernel and initrd paths are relative to /, eg.
#          root (hd0,1)
#          kernel /boot/vmlinuz-version ro root=/dev/md0
#          initrd /boot/initrd-version.img
#boot=/dev/hda
default=0
timeout=10
splashimage=(hd0,1)/boot/grub/splash.xpm.gz
title Red Hat Linux (2.4.18-24.7.x)
        root (hd0,1)
        kernel /boot/vmlinuz-2.4.18-24.7.x ro root=/dev/md0
        initrd /boot/initrd-2.4.18-24.7.x.img
title Red Hat Linux (2.4.18-18.7.x)
        root (hd0,1)
        kernel /boot/vmlinuz-2.4.18-18.7.x ro root=/dev/md0
        initrd /boot/initrd-2.4.18-18.7.x.img
title Red Hat Linux (2.4.7-10)
        root (hd0,1)
        kernel /boot/vmlinuz-2.4.7-10 ro root=/dev/md0
        initrd /boot/initrd-2.4.7-10.img
</SNIP>

The ONLY oddity I can see is that this symlink in /boot might be wrong:

lrwxrwxrwx    1 root     root           19 Feb  6 09:27 System.map -> System.map-2.4.7-10

Surely this should point to System.map-2.4.18-24.7.x?

Can anyone suggest why GRUB is not showing all 3 kernels and why we are not booting into the latest?

Many thanks to all.

Regards,
Nico Morrison - Director
nico.morrison@micronicos.com
___________________________________________
Micronicos Limited - London, UK.
Tel: +44 20 8870 8849
Fax: +44 20 8870 5290
Web hosting & domain registrations.
1st Site http://site-registrations.co.uk
1st Domains UK http://1stdomains.co.uk
HeadQuarters http://micronicos.com
Free Dialup http://www.mailaid.co.uk
___________________________________________
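To compare what the file contains with what the GRUB menu actually displays at boot, a small helper along these lines may be handy. It is only a sketch: list_grub_titles is a made-up name, and it assumes the anaconda-style layout shown in the post, with one "title" line per entry. The index it prints is the number that default= counts from.

```shell
#!/bin/sh
# Sketch only: list_grub_titles is a hypothetical helper, not part of GRUB.
# Prints each "title" entry of a GRUB legacy config with its zero-based index,
# i.e. the number that the default= line refers to.
list_grub_titles() {
    awk '$1 == "title" {
        sub(/^[ \t]*title[ \t]+/, "")   # strip the keyword, keep the label
        printf "%d: %s\n", n++, $0
    }' "$1"
}

# Run against the live file only if it is readable on this machine.
if [ -r /boot/grub/grub.conf ]; then
    list_grub_titles /boot/grub/grub.conf
fi
```

If this prints three entries while the boot menu shows one, the menu is presumably being built from something other than this file.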
Juri Haberland
2003-Feb-06 12:33 UTC
Re: Why does old kernel boot when new kernel installed?
Nico Morrison wrote:
> I thought I'd summarise this with a proper subject line:
>
> 1. We used up2date to upgrade the kernel of a 7.2 machine that is doing far
> too much journalling (kjournald at 50% CPU+ often).
>
> 2. It installed fine, but when we reboot - GRUB only shows the old 2.4.7-10
> although there are 3 kernels listed in grub.conf
>
> My Question is "How can we select booting to 2.4.18-24.7 when GRUB only
> lists the oldest kernel?"

> Can anyone suggest why GRUB is not showing all 3 kernels & why we are not
> booting into the latest?

Possibly because grub uses a different grub.conf than you think?

At this stage I'd recommend getting a Linux/Unix professional, as this seems to be standard stuff that you should know how to do; if you don't, then leave it alone and let someone who knows what he is doing handle it.

Sorry if that sounds a bit harsh, but it looks like you are wasting the valuable time of the developers with trivia.

Have you actually verified with 'uname -a' that this old kernel is running?

Cheers,
Juri
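The check Juri asks for can be put next to what the visible grub.conf says should boot. The following is a rough sketch, not a GRUB tool: default_kernel is a hypothetical helper, and it assumes that default= sits above the title entries, as it does in anaconda-generated files. It reads only the file the running system sees, which is exactly the file that may not be the one GRUB uses.

```shell
#!/bin/sh
# Sketch only: default_kernel is a hypothetical helper, not a GRUB tool.
# Prints the kernel path of the entry selected by default= (entry 0 if absent).
default_kernel() {
    awk '
        /^default=/     { want = substr($0, 9) + 0 }  # text after "default="
        $1 == "title"   { idx++ }                     # idx - 1 is the entry number
        $1 == "kernel" && idx - 1 == want { print $2; exit }
    ' "$1"
}

conf=/boot/grub/grub.conf
if [ -r "$conf" ]; then
    echo "running kernel:  $(uname -r)"
    echo "default entry:   $(default_kernel "$conf")"
fi
```

A mismatch between the two lines (for example 2.4.7-10 running while the default entry names vmlinuz-2.4.18-24.7.x) would support the idea that the boot loader is reading a different configuration.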
Nico Morrison
2003-Feb-06 12:41 UTC
RE: Why does old kernel boot when new kernel installed?
Hello,

Yes - verified with uname -a.

Our tech support are Linux/UNIX professionals and are baffled. I am hoping for some help here; I am emailing as they don't have the time (they look after over 100 servers, we only run 12), so I try to dig ....

Apologies if this is trivial and time-wasting, but it is important for us. Surely someone can tell me why GRUB only shows the single kernel when 3 are there?

Last night I did kernel upgrades of 3 other machines and all went through just fine; only this 1 machine has this problem.

In answer - only 1 grub.conf:

<snip>
[root@ns5 nico]# whereis grub.conf
grub: /sbin/grub /etc/grub.conf /usr/share/grub /usr/share/man/man8/grub.8.gz
[root@ns5 nico]#
</snip>

Any help would still be very much appreciated.

Regards,
Nico Morrison
nico.morrison@micronicos.com

_______________________________________________
Ext3-users mailing list
Ext3-users@redhat.com
https://listman.redhat.com/mailman/listinfo/ext3-users
Nico Morrison
2003-Feb-06 13:05 UTC
RE: Why does old kernel boot when new kernel installed?
Hello Juri,

The machine is one large RAID1 partition of 2 IDE drives .......

I think you may have seen something here - am forwarding to our techies & thank you VERY much.

Regards,
Nico Morrison
nico.morrison@micronicos.com

From: Juri Haberland [mailto:juri@koschikode.com]
Sent: 06 February 2003 12:53
To: ext3 users list
Subject: Re: Why does old kernel boot when new kernel installed?

Nico Morrison wrote:
> Hello,
>
> Yes - verified with uname -a

> Last night I did kernel upgrades of 3 other machines & all went through just
> fine ..... only this 1 machine has this problem?
>
> In answer - only 1 grub.conf

Is it possible that you have a /boot partition that gets mounted over the real /boot?

I noticed, looking at your grub.conf, that it says your server does not have a separate /boot partition, so everything is set up to be on the / partition. If, for some unknown reason, you do have a /boot partition that gets mounted, all changes to the kernels and grub.conf will be on that partition, but at boot time grub will look at the / partition, where no changes occurred.

Uhm, is what I just wrote understandable?

Regards,
Juri
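Whether something is mounted on /boot can be checked from the kernel's own mount table. A minimal sketch, assuming a Linux system with /proc/mounts; the helper name mounted_on is invented for illustration:

```shell
#!/bin/sh
# Sketch only: mounted_on is a hypothetical helper.
# Prints the device mounted on a given mount point, or fails if the path is
# not a mount point. The second argument (a mounts table in /proc/mounts
# format) is optional and exists mainly so the function can be tried on a copy.
mounted_on() {
    awk -v mp="$1" '
        $2 == mp { dev = $1 }              # last match wins, like stacked mounts
        END { if (dev == "") exit 1; print dev }
    ' "${2:-/proc/mounts}"
}

if dev=$(mounted_on /boot 2>/dev/null); then
    echo "/boot is a separate filesystem on $dev"
else
    echo "/boot is an ordinary directory on the root filesystem"
fi
```

If a partition does turn out to be mounted there, the files underneath it on the root filesystem are hidden until it is unmounted, which is the situation Juri describes.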
Nico Morrison
2003-Feb-06 13:30 UTC
RE: Why does old kernel boot when new kernel installed?
Hello Juri & Others,

I have found a difference between the setups of the machines that upgraded kernel OK and the 1 that did not.

We thought we had set up all these machines as:

[root@co1 /]# df -k
Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/md0              38480340    505032  36020604   2% /
/dev/hdb1                69973     17976     48384  28% /boot
none                    511552         0    511552   0% /dev/shm

where md0 is a large RAID1 partition for everything but /boot.

But the machine with the problem has:

[root@ns5 boot]# df -k
Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/md0              36463784   5642076  28969420  17% /
none                    510400         0    510400   0% /dev/shm

where /boot is ALSO on the RAID1 partition. This must have been a mistake at setup time, although the machine works fine apart from a LOT of kjournald activity (up to 60% CPU!).

Could this be causing GRUB not to see the other kernels, and if so what can we do?

This is a busy public server with several hundred users; we have to be very careful doing anything.

Regards,
Nico Morrison
nico.morrison@micronicos.com
Nico Morrison
2003-Feb-06 14:30 UTC
RE: Why does old kernel boot when new kernel installed?
Hello Theodore Ts'o,

Thank you for your points, and I take your comments about being professional. We DO have a spare machine all set up & ready to go; the fly in this ointment is the secure server certificate, under which several users are running their small shops. Otherwise I'd be thinking of moving the server & starting again.

I am forwarding this to our techies & will see.

Regards,
Nico Morrison
nico.morrison@micronicos.com

From: Theodore Ts'o [mailto:tytso@mit.edu]
Sent: 06 February 2003 14:25
To: Nico Morrison
Cc: 'Juri Haberland'; ext3 users list
Subject: Re: Why does old kernel boot when new kernel installed?

On Thu, Feb 06, 2003 at 01:30:15PM -0000, Nico Morrison wrote:
> [root@ns5 boot]# df -k
> Filesystem           1k-blocks      Used Available Use% Mounted on
> /dev/md0              36463784   5642076  28969420  17% /
> none                    510400         0    510400   0% /dev/shm
>
> Where /boot is ALSO on the RAID1 partition ( this must have been a mistake
> at setup time ..... although the machine works fine apart from a LOT of
> kjournald activity (up to 60% CPU!).)
>
> Could this be causing GRUB not to see the other kernels & if so what can we
> do?

Um, that would be yes, very likely.

The big question at this point is how GRUB was actually configured at installation time. It is either using a "preset-menu" embedded into it at install time (which it uses if it cannot find the configuration file), or the configuration file, depending on where it was defined to be when GRUB was installed, is somewhere else.

If you are right in assuming that the configuration files on all of your machines are otherwise identical, and your Linux/Unix "professionals" didn't perform other improvisations when they installed that particular server, then creating a /boot filesystem on /dev/hda1 like the other systems, and populating it with the appropriate files, and then rebooting, *may* fix the problem for you.

Or if you're really lucky, /boot already exists in /dev/hda1, but it wasn't mounted, and once you mount it, you can re-install the newer kernel, and update the /boot/grub/menu.lst found in /dev/hda1's filesystem, and you're good to go.

However, a good system administrator, over the years, becomes a paranoid s.o.b. Fortunately, the worst case in performing this particular test would be a reboot; creating or modifying the /boot partition in /dev/hda1 will, in the worst case, simply result in it being ignored by grub.

If that doesn't work, however, the next thing to recommend would be to reinstall grub; or, if at this point your faith that the system was properly installed has been shaken, and you are concerned that there may be other deviations between the "as designed" and "as built" state of your server, to save the data disks and rebuild and reconfigure your server from scratch.

> This is a busy public server with several 100 users ......... we
> have to be very careful doing anything.
>
> Our tech support are Linux/UNIX professionals & are baffled - I am hoping
> for some help here, I am emailing as they don't have the time, look after
> over 100 servers, we only run 12 so I try to dig ....

As professionals, especially if they are maintaining a large scale site with as many machines as you mentioned, I'm sure they designed and implemented installation scripts so that server machines are easily replicable, and can be rebuilt on a moment's notice. So rebuilding the system software on your server machine should be doable very easily.

Better yet, they should be able to have spare machines on which you can rebuild the system software from scratch, and where you can test to make sure the machine boots correctly, etc., and then afterwards, you can schedule downtime, pull the data disks from the suspect server, install them in the replacement server, and restore service with very minimal downtime.

What, you say you aren't using separate disks and filesystems to separate the system software from the user/application data? And you don't have turnkey scripts that allow you to rebuild the system software of your servers in a repeatable and less error-prone fashion? You *did* say you had professionals in your employ, right? :-)

Seriously, there are some really basic, fundamental principles of sound, large-scale system administration that are not being followed, and the fact that you are using a single gigantic root partition and are co-mingling system and user data is just one symptom of the fact that very likely your system administrators are breaking a good number of these fundamentals.

The one good thing about the current state of the economy is there are a lot of really good, experienced system administrators who can understand how to design systems that are robust and which can be easily serviced and maintained. I would seriously suggest that you consider bringing one of them on board as a member of your team.

- Ted
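To see whether another partition holds the configuration GRUB is really reading, the candidate can be mounted read-only somewhere harmless and searched. This is a sketch under assumptions: find_grub_configs is a hypothetical helper, /mnt/probe is an arbitrary mount point, and the paths checked are only the usual GRUB legacy locations (grub.conf and menu.lst, with the partition used either as / or as /boot).

```shell
#!/bin/sh
# Sketch only: find_grub_configs is a hypothetical helper.
# Given a directory where a candidate partition is mounted, print every GRUB
# legacy config file found in the usual places. Both layouts are checked:
# partition used as / (boot/grub/...) and partition used as /boot (grub/...).
find_grub_configs() {
    for rel in boot/grub/grub.conf boot/grub/menu.lst grub/grub.conf grub/menu.lst; do
        if [ -f "$1/$rel" ]; then
            echo "$1/$rel"
        fi
    done
}

# Example (as root; the mount point is an assumption):
#   mkdir -p /mnt/probe && mount -o ro /dev/hda1 /mnt/probe
#   find_grub_configs /mnt/probe
#   umount /mnt/probe
```

A read-only mount keeps the probe from touching the partition, which fits the caution Ted recommends for a busy server.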
Sanjeev "Ghane" Gupta
2003-Feb-07 03:48 UTC
Re: Why does old kernel boot when new kernel installed?
Nico Morrison wrote:
> In answer - only 1 grub.conf
>
> <snip>
> [root@ns5 nico]# whereis grub.conf
> grub: /sbin/grub /etc/grub.conf /usr/share/grub
> /usr/share/man/man8/grub.8.gz
> [root@ns5 nico]#
> </snip>

I have never used grub, but can you try this:

Halt the machine, wait 10 minutes, and boot. Now do an

ls -lu /etc/grub.conf

to see if any process looked at it a minute ago. If not, then I presume grub is not looking at the file at all.

--
Sanjeev
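The access-time check can be wrapped up as below. A sketch only: access_time is an invented name, and -L is added on the assumption that /etc/grub.conf is a symlink into /boot/grub (as I believe it is on Red Hat installs), so that the target's time is shown rather than the link's. As far as I know GRUB reads the disk itself before any kernel is running and does not write access times, so an unchanged time may not prove much either way.

```shell
#!/bin/sh
# Sketch only: access_time is a hypothetical helper.
# Shows the last access time (-u) of a file, following symlinks (-L).
# LC_ALL=C keeps the date format predictable.
access_time() {
    LC_ALL=C ls -luL "$1"
}

if [ -e /etc/grub.conf ]; then
    access_time /etc/grub.conf
fi
```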