H. Peter Anvin
2014-Mar-12 21:00 UTC
[syslinux] Very slow download with pxelinux > 4.07 on specific hardware
On 03/10/2014 04:15 PM, Gene Cumm wrote:
> It's also a balance of time. While working on 4.10-pre*/5.10-pre*, I
> found that some hardware misreports its behavior. "Sure, Interrupts
> work" but they don't is but one that I worked around on specific
> hardware.

The odd part is that people are reporting this even using the legacy PXE implementation (not lpxelinux.0)...

-hpa
Eric PEYREMORTE
2014-Mar-13 11:09 UTC
[syslinux] Very slow download with pxelinux > 4.07 on specific hardware
On 12/03/2014 22:00, H. Peter Anvin wrote:
> On 03/10/2014 04:15 PM, Gene Cumm wrote:
>> It's also a balance of time. While working on 4.10-pre*/5.10-pre*, I
>> found that some hardware misreports its behavior. "Sure, Interrupts
>> work" but they don't is but one that I worked around on specific
>> hardware.
>>
> The odd part is that people are reporting this even using the legacy PXE
> implementation (not lpxelinux.0)...
>
> -hpa

If there is a way to get useful debug traces, let me know.

By the way, everything is slow from the moment the following string appears:

PXELINUX 5.10 0x5321850f

I tried to search through the code and compare different versions to understand what's wrong, but I definitely don't have the required skills...

What I notice from the Wireshark traces is that pxelinux.0 is loaded really quickly. Then it fetches ldlinux.c32 very slowly (and the following files too).

For lpxelinux.0, everything in the trace is slow too, but at some point the client seems stuck in a loop, sending the acknowledgement for the same packet again and again. The server tries to send the next packet, but the client keeps sending the ACK for the previous one.

Eric
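The ACK loop described above can be spotted mechanically in a capture. The following is only a rough sketch, assuming the block numbers of the client's TFTP ACK packets have already been exported from Wireshark into a list; the function name `longest_ack_repeat` and the sample sequence are hypothetical and do not come from the actual trace.

```python
from itertools import groupby


def longest_ack_repeat(ack_blocks):
    """Return (block, count) for the longest run of consecutive identical ACKs.

    ack_blocks: TFTP block numbers from the client's ACK packets, in capture order.
    Returns (None, 0) for an empty sequence.
    """
    best = (None, 0)
    for block, run in groupby(ack_blocks):
        count = sum(1 for _ in run)
        if count > best[1]:
            best = (block, count)
    return best


# Hypothetical ACK sequence: the client keeps acknowledging block 4.
acks = [1, 2, 3, 4, 4, 4, 4, 4]
block, count = longest_ack_repeat(acks)
print(f"block {block} acknowledged {count} times in a row")
```

A single repeated ACK is normal after a lost packet; a long run for one block, while the server keeps retransmitting the next one, matches the stuck behaviour reported for lpxelinux.0.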
H. Peter Anvin
2014-Mar-14 17:40 UTC
[syslinux] Very slow download with pxelinux > 4.07 on specific hardware
On 03/13/2014 04:09 AM, Eric PEYREMORTE wrote:
> On 12/03/2014 22:00, H. Peter Anvin wrote:
>> On 03/10/2014 04:15 PM, Gene Cumm wrote:
>>> It's also a balance of time. While working on 4.10-pre*/5.10-pre*, I
>>> found that some hardware misreports its behavior. "Sure, Interrupts
>>> work" but they don't is but one that I worked around on specific
>>> hardware.
>>>
>> The odd part is that people are reporting this even using the legacy PXE
>> implementation (not lpxelinux.0)...
>>
>> -hpa
> If there is a way to get useful debug traces let me know.
>
> By the way, everything is slow from the moment the following string
> appears :
>
> PXELINUX 5.10 0x5321850f
>

I am *assuming* you are seeing the full copyright banner here, not just the above string (dumb question, I know, but sometimes it really, really matters).

> I tried to search through the code, compare different versions to
> understand what's wrong, but i definitely don't have the required
> skills....

This is very challenging. One of the big problems is that the legacy network code (pxelinux.0 as opposed to lpxelinux.0) was pulled out and then pulled back in, and clearly something changed in the process.

I looked over your wire trace and there is a fixed amount of delay -- just under 20 ms -- between each packet, which strongly implies that it ends up waiting for some kind of timer to expire. *What* timer that is is less clear, because the only *architectural* timer is the 55 ms timer interrupt, which doesn't fit the observed time. That implies this is a timer inside the PXE code. Why that didn't happen before and does now is the real mystery.

> What i notice from the wireshark traces, is that pxelinux.0 is loaded
> really quickly. Then it fetches ldlinux.c32 very slowly (for the next
> files too)
>
> For lpxelinux.0, from the trace, everything is slow too, but at some
> point, the client seems stuck in a loop sending acknowledgement for a
> packet again and again. The server tries to send the next packet but the
> clients keeps sending ack for the previous one.

Right... this implies that the receiver stopped functioning so the machine "went deaf". That is a fairly common failure mode, but why it happens here is again the big question.

Unfortunately, I only have my "spare time" to work on Syslinux these days, which makes hard problems like this difficult to dig into. I *really* appreciate the debugging information you have already given us... it gives us a starting point at least. The 20 ms delay is a very important clue.

-hpa
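The timer reasoning above can be reproduced from any capture. The following is a minimal sketch, assuming the packet timestamps (in seconds) have been exported from the trace. The sample timestamps are made up to resemble the "just under 20 ms" spacing, and the 512-byte figure is the TFTP default block size from RFC 1350, not necessarily what was negotiated in this trace.

```python
from statistics import median

PC_TIMER_TICK_MS = 55.0  # the architectural timer interrupt period cited in the message


def inter_packet_gaps_ms(timestamps_s):
    """Gaps between consecutive packets, in milliseconds."""
    return [(b - a) * 1000.0 for a, b in zip(timestamps_s, timestamps_s[1:])]


def summarize(timestamps_s):
    """Return (median gap, max-min spread) in milliseconds."""
    gaps = inter_packet_gaps_ms(timestamps_s)
    return median(gaps), max(gaps) - min(gaps)


def throughput_kib_s(block_size_bytes, gap_ms):
    """Throughput of a lock-step transfer that moves one block per gap."""
    return block_size_bytes / (gap_ms / 1000.0) / 1024.0


# Hypothetical timestamps, NOT taken from the real trace.
timestamps = [0.0000, 0.0198, 0.0397, 0.0595, 0.0794, 0.0992]
med, spread = summarize(timestamps)
print(f"median gap {med:.1f} ms, spread {spread:.1f} ms")
print(f"that is {med / PC_TIMER_TICK_MS:.2f} of a 55 ms timer tick")
# Assumes the 512-byte default block size; a larger negotiated size scales this up.
print(f"about {throughput_kib_s(512, med):.0f} KiB/s at one block per gap")
```

A small spread around a fixed median suggests a timer expiring rather than network jitter, and a median that is not close to a multiple of 55 ms is what rules out the timer interrupt as the source.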
H. Peter Anvin
2014-Mar-14 17:45 UTC
[syslinux] Very slow download with pxelinux > 4.07 on specific hardware
Here is a thought. Could you try putting "nohalt 1" in the configuration file?

The download of ldlinux.c32 and the config file will still be slow, of course, because this doesn't take effect until the config file is read, but if it speeds up after that I think we might have a plausible explanation.

-hpa
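One way to check the outcome of this experiment is to compare packet spacing per file, since only files fetched after the config file is read should benefit. The following is a rough sketch, assuming the capture has been reduced to (timestamp, filename) pairs; the helper name, the file name `vmlinuz`, and all timings are hypothetical.

```python
from collections import defaultdict
from statistics import median


def median_gap_per_file(packets):
    """Median inter-packet gap in ms for each file.

    packets: iterable of (timestamp_seconds, filename) in capture order.
    Files with fewer than two packets are omitted.
    """
    times = defaultdict(list)
    for ts, name in packets:
        times[name].append(ts)
    result = {}
    for name, ts in times.items():
        gaps = [(b - a) * 1000.0 for a, b in zip(ts, ts[1:])]
        if gaps:
            result[name] = median(gaps)
    return result


# Hypothetical capture: ldlinux.c32 is fetched before the config takes effect,
# vmlinuz afterwards.
capture = [
    (0.000, "ldlinux.c32"), (0.020, "ldlinux.c32"), (0.040, "ldlinux.c32"),
    (1.000, "vmlinuz"), (1.001, "vmlinuz"), (1.002, "vmlinuz"),
]
for name, gap in median_gap_per_file(capture).items():
    print(f"{name}: median gap {gap:.1f} ms")
```

If the files fetched after the config file show a much smaller gap than ldlinux.c32, that would support the idea that idle halting is involved; similar gaps in both phases would point elsewhere.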