Dan Hyatt
2014-May-09 19:25 UTC
[CentOS] Kickstarts failing 30% of time on Dell 620 blades
I have a large set of Dell 620 blades, fully populated with memory and dual-socket CPUs, using a CentOS 6.4 image.

I have a kickstart that I am using to PXE boot 36 blades. I have two internal drives which are raid1 (two disks formed into one, no redundancy), not SAN attached.

In the first set, 9 completed successfully. 7 more built correctly after trying another PXE boot. 2 just won't PXE boot.

In the second set I had 11 fail and 5 succeed, and the two I tried again failed.

When they fail, they go to GRUB. I try booting from disk from the DRAC and still get GRUB. It looks like the complete CentOS kickstart occurs, as it goes through the whole install before rebooting and failing.

Any idea why this would happen with identical hardware and an identical kickstart/image, inside the same blade chassis? Any idea what to test?

--
Dan Hyatt
Division of Statistical Genomics
Washington University School of Medicine
4444 Forest Park Blvd, Campus Box 8506
St. Louis, MO 63108
314 747 4767 (o)
314 473 8713 (c)
dhyatt at dsgmail.wustl.edu
Laurent CREPET
2014-May-09 19:54 UTC
[CentOS] Kickstarts failing 30% of time on Dell 620 blades
Same BIOS/adapter settings? Same firmware versions?
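
One way to act on this question is to collect the settings from every blade and list only the ones that are not identical everywhere. The sketch below is a minimal illustration, not a Dell or DRAC tool: the blade names, setting names, and values are hypothetical placeholders, and how the inventory gets collected is left open.

```python
from collections import defaultdict


def differing_settings(inventory):
    """Return {setting: {value: [blades]}} for settings that differ between blades.

    `inventory` maps a blade name to a dict of setting name -> value.
    A setting missing on a blade is reported with the value None.
    """
    # Union of every setting name seen on any blade.
    names = set()
    for settings in inventory.values():
        names.update(settings)

    by_setting = defaultdict(lambda: defaultdict(list))
    for blade in sorted(inventory):
        for name in names:
            by_setting[name][inventory[blade].get(name)].append(blade)

    # Keep only the settings that have more than one distinct value.
    return {
        name: dict(values)
        for name, values in by_setting.items()
        if len(values) > 1
    }


# Hypothetical inventory; every name and value here is a placeholder.
inventory = {
    "blade01": {"bios_version": "version-A", "boot_mode": "bios"},
    "blade02": {"bios_version": "version-A", "boot_mode": "uefi"},
    "blade03": {"bios_version": "version-A", "boot_mode": "bios"},
}

print(differing_settings(inventory))
```

If the blades that land at GRUB all fall on one side of a reported difference, that setting is a reasonable first suspect.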
m.roth at 5-cent.us
2014-May-09 20:06 UTC
[CentOS] Kickstarts failing 30% of time on Dell 620 blades
Dan Hyatt wrote:
>
> I have a large set of Dell 620 blades fully populated with memory and
> dual socket CPUs, Centos6.4 image.
>
> I have a kickstart that I am using to pxe boot 36 blades.
> I have two internal drives which are raid1 (two disks formed into one,
> no redundancy), not san attached
> In the first set, 9 successfully completed. 7 more built correctly after
> trying another pxe boot. 2 just won't pxeboot
<snip>
> Any idea why this would happen with identical hardware, identical
> kickstart/image, inside the same blade chassis.
> Any idea what to test.

Nasty thoughts: look at one that's gone to grub, and from the grub command line, try root (<tab>). Then try kernel /vm<tab>

I'm just wondering if either they're not pointing to the same UUID, or if they're looking at /dev/sda, and some of them have enumerated it so that it's /dev/sdb, or whatever.

Also, I wonder about the possibility of a race issue, if they're all trying to come up at the same time.

mark
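
To check the UUID versus /dev/sdX theory on a blade that did install, it can help to see how each kernel line in its grub.conf names the root filesystem. The following is a rough sketch that assumes a GRUB legacy style config, as CentOS 6 uses; the sample text, kernel file name, and UUID are made-up placeholders, and the function name is hypothetical.

```python
import re


def root_references(grub_conf_text):
    """List (kind, value) for the root= argument on each kernel line.

    kind is "uuid", "label", "device", or "missing" when a kernel line
    carries no root= argument at all.
    """
    refs = []
    for line in grub_conf_text.splitlines():
        stripped = line.strip()
        if not stripped.startswith("kernel"):
            continue  # skip comments, title lines, "root (hdX,Y)" lines, etc.
        match = re.search(r"(?:^|\s)root=(\S+)", stripped)
        if match is None:
            refs.append(("missing", ""))
            continue
        value = match.group(1)
        if value.startswith("UUID="):
            kind = "uuid"
        elif value.startswith("LABEL="):
            kind = "label"
        else:
            kind = "device"  # e.g. /dev/sda1, which depends on enumeration order
        refs.append((kind, value))
    return refs


# Placeholder config text, not taken from any of the blades in this thread.
sample = """\
default=0
title Example entry A
    root (hd0,0)
    kernel /vmlinuz-example ro root=UUID=0000-EXAMPLE quiet
title Example entry B
    root (hd0,0)
    kernel /vmlinuz-example ro root=/dev/sda1 quiet
"""

for kind, value in root_references(sample):
    print(kind, value)
```

An entry reported as "device" is the case that would break if the disk came up as /dev/sdb on some blades; a "uuid" entry should be compared against what blkid reports on that blade.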