We have encountered a somewhat odd problem. We use pxelinux to boot a Linux cluster, and when we reboot all 103 of the nodes, some of them won't start, because the TFTP server loses the ARP entry _between_ downloading pxelinux.0 and downloading the config file! pxelinux doesn't seem to answer the ARP requests in this phase either.

We run Linux 2.4.22 on the TFTP server, which is also the DHCP server and the NFS server.

Is this a known issue with the Linux kernel or pxelinux?

// Jimmy

--
Jimmy Hedman
South Pole AB
Gelbjutarvägen 5, SE - 17148 Solna
Phone: +46 8 51420420
Fax: +46 8 51420429
e-mail: jimmy.hedman at southpole.se
www.southpole.se

Hi,

Jimmy Hedman <jimmy.hedman at southpole.se> wrote on 15.10.03 15:54:42:
> We have encountered a somewhat odd problem. We use pxelinux to boot a Linux
> cluster, and when we reboot all 103 of the nodes, some of them won't start,
> because the TFTP server loses the ARP entry _between_ downloading
> pxelinux.0 and downloading the config file! pxelinux doesn't seem to answer
> the ARP requests in this phase either.
>
> We run Linux 2.4.22 on the TFTP server, which is also the DHCP server and
> the NFS server.
>
> Is this a known issue with the Linux kernel or pxelinux?

The problem is that the server's ARP table has overflowed. Since pxelinux running on top of the PXE ROM is somewhat limited, I think it doesn't answer ARP requests while downloading a file. (hpa, can you confirm this?)

The solution is to configure the server to have a bigger ARP cache. This is set somewhere under /proc/sys; see the kernel documentation for the details.

Regards,
Josef

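As a rough sketch of the ARP cache tuning suggested above: on Linux kernels that I know of, the size of the neighbour (ARP) table is governed by the `gc_thresh1`, `gc_thresh2` and `gc_thresh3` files under `/proc/sys/net/ipv4/neigh/default/`. I am assuming the 2.4 series exposes the same paths, so check the kernel documentation for your version. The helper names below (`read_thresholds`, `count_arp_entries`, `sysctl_lines`) are made up for illustration. The sketch only reads the current limits, counts entries in text formatted like `/proc/net/arp`, and formats candidate sysctl settings; it does not change anything on the server.

```python
from pathlib import Path

# Assumed location of the neighbour (ARP) table limits; verify on your kernel.
NEIGH_DIR = Path("/proc/sys/net/ipv4/neigh/default")
THRESHOLDS = ("gc_thresh1", "gc_thresh2", "gc_thresh3")


def read_thresholds(base=NEIGH_DIR):
    """Return the current gc_thresh1..3 values as a dict of ints."""
    base = Path(base)
    return {name: int((base / name).read_text().strip()) for name in THRESHOLDS}


def count_arp_entries(arp_table_text):
    """Count entries in text formatted like /proc/net/arp.

    The first non-empty line is the column header, so it is not counted.
    """
    lines = [line for line in arp_table_text.splitlines() if line.strip()]
    return max(len(lines) - 1, 0)


def sysctl_lines(values):
    """Format {threshold name: value} as lines suitable for /etc/sysctl.conf."""
    return [f"net.ipv4.neigh.default.{name} = {value}"
            for name, value in values.items()]
```

On the TFTP server one might call `read_thresholds()` and `count_arp_entries(Path("/proc/net/arp").read_text())` while the nodes are rebooting, to see how close the table gets to its limits. Something like `sysctl_lines({"gc_thresh1": 512, "gc_thresh2": 2048, "gc_thresh3": 4096})` then produces settings to try; those numbers are placeholders, not a recommendation.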
Jimmy Hedman wrote:
> We have encountered a somewhat odd problem. We use pxelinux to boot a Linux
> cluster, and when we reboot all 103 of the nodes, some of them won't start,
> because the TFTP server loses the ARP entry _between_ downloading
> pxelinux.0 and downloading the config file! pxelinux doesn't seem to answer
> the ARP requests in this phase either.

This isn't a very long phase... and PXE will unfortunately only answer ARP while it is either sending or trying to receive a packet. However, I find it extremely odd that you would see failures due to a window that usually amounts to only a few milliseconds. There are other delays which would seem much more severe.

> We run Linux 2.4.22 on the TFTP server, which is also the DHCP server and
> the NFS server.
>
> Is this a known issue with the Linux kernel or pxelinux?

PXELINUX, or perhaps better said, it's a known issue with PXE.

-hpa