Petr Kohts
2007-Aug-09 08:06 UTC
[syslinux] pxelinux doesn't answer ARP requests when it should
Hi there.

Regarding the currently known problem as stated on
http://syslinux.zytor.com/pxe.php:

    we should probably call the UDP receive function in the keyboard
    entry loop, so that we answer ARP requests.

Here is another illustration of the problem, a description of a related pxelinux problem, and a workaround for both.

Configuration:

dhcp/tftp server -- 213.180.194.116 (00:1B:78:05:84:6C)
    debian 4.0, kernel 2.6.18-4-686-bigmem

dhcp/tftp client (the one which boots) -- 213.180.194.120 (00:30:48:34:26:12)
    supermicro x7dbr-8/x7dbr-i
    bios rev 1.3b
    intel boot agent ge v.1.2.36

Both computers are attached to one switch. Tcpdump (tshark actually) is started on a third computer attached to the same switch. The switch is configured to copy all packets from the dhcp/tftp server's port to the port of this third computer.

pxelinux tested:
    syslinux 3.51 (compiled both with HAVE_IDLE 0 and with HAVE_IDLE 1),
    syslinux 3.31 (debian binary package 3.31-4)

Consider the following tcpdump session:

    glavryba:~# tshark -i eth0 ether src 00:30:48:34:26:12 or ether dst 00:30:48:34:26:12
    Capturing on eth0
    0.000000 0.0.0.0 -> 255.255.255.255 DHCP DHCP Discover - Transaction ID 0x4a342612
    1.001464 213.180.194.116 -> 255.255.255.255 DHCP DHCP Offer - Transaction ID 0x4a342612
    2.032308 0.0.0.0 -> 255.255.255.255 DHCP DHCP Request - Transaction ID 0x4a342612
    2.033558 213.180.194.116 -> 255.255.255.255 DHCP DHCP ACK - Transaction ID 0x4a342612
    2.034558 Supermic_34:26:12 -> Broadcast ARP Who has 213.180.194.116? Tell 213.180.194.120
    2.034570 213.180.194.116 -> 213.180.194.120 ICMP Echo (ping) request
    2.034575 00:1b:78:05:84:6c -> Supermic_34:26:12 ARP 213.180.194.116 is at 00:1b:78:05:84:6c
    2.034933 213.180.194.120 -> 213.180.194.116 TFTP Read Request, File: pxelinux/pxelinux.0, Transfer type: octet
    2.035562 213.180.194.116 -> 213.180.194.120 TFTP Option Acknowledgement
    2.036066 213.180.194.120 -> 213.180.194.116 TFTP Error Code, Code: Not defined, Message: TFTP Aborted
    2.036184 213.180.194.120 -> 213.180.194.116 TFTP Read Request, File: pxelinux/pxelinux.0, Transfer type: octet
    2.036808 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 1
    2.037306 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 1
    2.037430 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 2
    2.042927 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 2
    [skipped]
    2.276153 213.180.194.120 -> 213.180.194.116 TFTP Read Request, File: pxelinux/pxelinux.cfg/default, Transfer type: octet
    2.276657 213.180.194.116 -> 213.180.194.120 TFTP Option Acknowledgement
    2.277152 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 0
    2.277280 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 1
    2.277777 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 1
    2.277903 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 2
    2.278403 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 2
    2.278529 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 3
    2.279028 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 3
    2.279154 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 4 (last)
    2.279652 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 4
    2.279658 213.180.194.120 -> 213.180.194.116 TFTP Read Request, File: pxelinux/pxelinux-screens/welcome, Transfer type: octet
    2.280280 213.180.194.116 -> 213.180.194.120 TFTP Option Acknowledgement
    2.280651 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 0
    2.280904 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 1
    2.281277 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 1
    2.281527 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 2 (last)
    2.281900 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 2
    7.033132 00:1b:78:05:84:6c -> Supermic_34:26:12 ARP Who has 213.180.194.120? Tell 213.180.194.116
    8.033121 00:1b:78:05:84:6c -> Supermic_34:26:12 ARP Who has 213.180.194.120? Tell 213.180.194.116
    9.033111 00:1b:78:05:84:6c -> Supermic_34:26:12 ARP Who has 213.180.194.120? Tell 213.180.194.116
    10.031477 213.180.194.120 -> 213.180.194.116 TFTP Read Request, File: pxelinux/k/ds386, Transfer type: octet
    10.031984 213.180.194.116 -> 213.180.194.120 TFTP Option Acknowledgement
    10.032477 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 0
    10.032604 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 1
    10.033101 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 1
    13.291531 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 1
    19.882343 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 1
    33.064094 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 1

At mark 2.281900 the client has finished downloading pxelinux.0 and the default menu, has displayed the menu and is waiting at the user prompt. About 5 seconds later (mark 7.033132) the server starts to check whether the client is alive, and the client does not respond. For several seconds the server keeps using the old, unverified MAC address (marks 10.031984 and 10.032604). Finally the server decides that the client is dead and stops sending data packets. At this point the client assumes that its last acknowledgement packet may have been dropped and starts to resend it (marks 10.033101, 13.291531, 19.882343, 33.064094).
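The pattern in the capture above (ARP requests for 213.180.194.120 that never get an "is at" reply) can be picked out of tshark's text output mechanically. What follows is only a minimal Python sketch, assuming the one-line summary format shown above. The function name `unanswered_arp_requests` and the `SAMPLE` lines are illustrative and not part of syslinux or tshark.

```python
import re

# Patterns for the ARP summary text as it appears in the capture above.
ARP_REQUEST = re.compile(r"ARP Who has (\S+)\? +Tell (\S+)")
ARP_REPLY = re.compile(r"ARP (\S+) is at (\S+)")

# A few lines copied from the first session, used here as sample input.
SAMPLE = [
    "2.034558 Supermic_34:26:12 -> Broadcast ARP Who has 213.180.194.116? Tell 213.180.194.120",
    "2.034575 00:1b:78:05:84:6c -> Supermic_34:26:12 ARP 213.180.194.116 is at 00:1b:78:05:84:6c",
    "7.033132 00:1b:78:05:84:6c -> Supermic_34:26:12 ARP Who has 213.180.194.120? Tell 213.180.194.116",
    "8.033121 00:1b:78:05:84:6c -> Supermic_34:26:12 ARP Who has 213.180.194.120? Tell 213.180.194.116",
    "9.033111 00:1b:78:05:84:6c -> Supermic_34:26:12 ARP Who has 213.180.194.120? Tell 213.180.194.116",
]


def unanswered_arp_requests(lines):
    """Return (target_ip, asker_ip) for every ARP request with no later reply."""
    pending = []  # requests seen so far that nobody has answered yet
    for line in lines:
        request = ARP_REQUEST.search(line)
        if request:
            pending.append((request.group(1), request.group(2)))
            continue
        reply = ARP_REPLY.search(line)
        if reply:
            answered_ip = reply.group(1)
            # One reply for an IP clears every outstanding request for that IP.
            pending = [p for p in pending if p[0] != answered_ip]
    return pending
```

On the sample lines, the client's request for 213.180.194.116 is answered by the server and drops out, while the server's three probes for 213.180.194.120 remain in the returned list.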
Consider another tcpdump session:

    glavryba:~# tshark -r test | less
    1 0.000000 0.0.0.0 -> 255.255.255.255 DHCP DHCP Discover - Transaction ID 0x4a342612
    2 1.000086 213.180.194.116 -> 255.255.255.255 DHCP DHCP Offer - Transaction ID 0x4a342612
    3 2.032179 0.0.0.0 -> 255.255.255.255 DHCP DHCP Request - Transaction ID 0x4a342612
    4 2.033178 213.180.194.116 -> 255.255.255.255 DHCP DHCP ACK - Transaction ID 0x4a342612
    5 2.034178 Supermic_34:26:12 -> Broadcast ARP Who has 213.180.194.116? Tell 213.180.194.120
    6 2.034188 213.180.194.116 -> 213.180.194.120 ICMP Echo (ping) request
    7 2.034192 00:1b:78:05:84:6c -> Supermic_34:26:12 ARP 213.180.194.116 is at 00:1b:78:05:84:6c
    8 2.034678 213.180.194.120 -> 213.180.194.116 TFTP Read Request, File: pxelinux/pxelinux.0, Transfer type: octet
    9 2.035303 213.180.194.116 -> 213.180.194.120 TFTP Option Acknowledgement
    10 2.035801 213.180.194.120 -> 213.180.194.116 TFTP Error Code, Code: Not defined, Message: TFTP Aborted
    11 2.035927 213.180.194.120 -> 213.180.194.116 TFTP Read Request, File: pxelinux/pxelinux.0, Transfer type: octet
    [skipped]
    99 2.279653 213.180.194.120 -> 213.180.194.116 TFTP Read Request, File: pxelinux/pxelinux-screens/welcome, Transfer type: octet
    100 2.280272 213.180.194.116 -> 213.180.194.120 TFTP Option Acknowledgement
    101 2.280772 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 0
    102 2.280898 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 1
    103 2.281396 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 1
    104 2.281521 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 2 (last)
    105 2.282021 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 2
    106 5.359440 213.180.194.120 -> 213.180.194.116 TFTP Read Request, File: pxelinux/k/ds386, Transfer type: octet
    107 5.359940 213.180.194.116 -> 213.180.194.120 TFTP Option Acknowledgement
    108 5.360440 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 0
    109 5.360566 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 1
    110 5.361064 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 1
    111 5.361189 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 2
    112 5.361689 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 2
    [skipped]
    5207 7.030129 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 2550
    5208 7.030629 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 2550
    5209 7.031879 00:1b:78:05:84:6c -> Supermic_34:26:12 ARP Who has 213.180.194.120? Tell 213.180.194.116
    5210 7.032003 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 2551
    5211 7.032503 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 2551
    [skipped]
    7646 8.030992 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 1151
    7647 8.031244 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 1152
    7648 8.031867 00:1b:78:05:84:6c -> Supermic_34:26:12 ARP Who has 213.180.194.120? Tell 213.180.194.116
    7649 8.037488 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 1152
    7650 8.037739 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 1153
    7651 8.038112 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 1153
    [skipped]
    10702 9.031234 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 2679
    10703 9.031607 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 2679
    10704 9.031741 00:1b:78:05:84:6c -> Supermic_34:26:12 ARP Who has 213.180.194.120? Tell 213.180.194.116
    10705 9.031859 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 2680
    10706 9.032232 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 2680
    [skipped]
    13759 10.030598 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 4207
    13760 10.031097 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 4207
    13761 10.031223 213.180.194.116 -> 213.180.194.120 TFTP Data Packet, Block: 4208
    13762 10.031721 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 4208
    13763 13.291526 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 4208
    13764 19.882338 213.180.194.120 -> 213.180.194.116 TFTP Acknowledgement, Block: 4208

At mark 105 (time 2.282021) the client has successfully downloaded the welcome text, displayed it and is waiting at the user prompt. This time I did not wait for the server to start checking the client with ARP requests; I typed the desired menu label quickly and the client started to download the kernel (mark 106, time 5.359440). During the TFTP exchange that follows, the server sends 3 ARP probes to check that the client is alive: marks 5209, 7648, 10704. Unfortunately the client does not respond to these ARP probes, so the server decides that the client is dead and stops sending TFTP data blocks (the last one is sent at mark 13761). Again the client decides that the server has not received its last ACK (mark 13762) and starts to resend it (marks 13763, 13764).

Conclusion:
1) pxelinux does not respond to ARP packets not only
   in the keyboard loop but also in tftp send/receive loop
2) Linux kernel does not use incoming tftp packets
   as a last resort when checking that the host is alive.

Workaround: relax the server's aggressive ARP checking, which sends the first ARP probe to a newly learned client after 5 seconds (this seems to be the Linux 2.6 kernel default).

    echo 60 > /proc/sys/net/ipv4/neigh/eth0/delay_first_probe_time

Regards,
Petya.
ps: I would like to thank all the CCed people who helped me to carry out this little investigation.
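The workaround in the message above is a single write to a per-interface file under /proc. A minimal Python sketch of reading and changing that value is given below, assuming the interface is called eth0 as in the message. The helper names (`probe_delay_path`, `get_probe_delay`, `set_probe_delay`) and the `proc_root` parameter are hypothetical and exist only so the code can be tried against a scratch directory.

```python
from pathlib import Path


def probe_delay_path(interface, proc_root="/proc"):
    """Build the path of the per-interface delay_first_probe_time file."""
    return (Path(proc_root) / "sys" / "net" / "ipv4" / "neigh"
            / interface / "delay_first_probe_time")


def get_probe_delay(interface, proc_root="/proc"):
    """Read the current delay, in seconds, before the first ARP probe."""
    return int(probe_delay_path(interface, proc_root).read_text().strip())


def set_probe_delay(interface, seconds, proc_root="/proc"):
    """Write a new delay; on a real system this needs root privileges
    and the value does not survive a reboot."""
    probe_delay_path(interface, proc_root).write_text(f"{int(seconds)}\n")
```

Calling `set_probe_delay("eth0", 60)` as root should have the same effect as the echo command in the message. To keep the value across reboots, the usual approach would be a line such as `net.ipv4.neigh.eth0.delay_first_probe_time = 60` in /etc/sysctl.conf.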
H. Peter Anvin
2007-Aug-09 16:42 UTC
[syslinux] pxelinux doesn't answer ARP requests when it should
Petr Kohts wrote:
> Hi there.
>
> Regarding currently known problem as stated on
> http://syslinux.zytor.com/pxe.php:
> we should probably call the UDP receive function in the keyboard
> entry loop, so that we answer ARP requests.
>

... which has turned out to be impossible, because too many PXE stacks are buggy and will wait for a packet when UDP READ is called, despite the fact that the call is explicitly documented as nonblocking. *Sigh.*

> Conclusion:
> 1) pxelinux does not respond to ARP packets not only
> in the keyboard loop but also in tftp send/receive loop
> 2) Linux kernel does not use incoming tftp packets
> as a last resort when checking that the host is alive.

Ironically, I first heard about this problem as late as the day before yesterday. I'm not sure how to work around it, other than having PXELINUX carry its own IP stack with it.

THAT is already in the works, however, since I've been working with the Etherboot team to come up with a gPXE-PXELINUX integrated solution. When finalized, this will be a single "pxelinux.0" image which will contain gPXE (which has its own independent IP stack). It will also make it possible to get files over HTTP or other TCP protocols.

We successfully demoed this at the Etherboot.org booth at LinuxWorld this week, but it still needs additional work.

-hpa