> On Fri, Oct 2, 2015 at 4:07 AM, Gene Cumm <gene.cumm at gmail.com> wrote:
>>
>> I have a patch that I think may help your situation of syslinux.efi
>> being unable to load ldlinux.e64/ldlinux.e32 (though I don't know if
>> any of you are using an EFI ia32 platform).
>>
>> The basics are that we try to enable UseDefaultAddress as it helps
>> certain clients with routing and works on numerous other clients. If
>> we timeout on receiving a packet and have never received any packets,
>> disable UseDefaultAddress and set the addresses manually.
>>
>>
>> git://github.com/geneC/syslinux.git
>> https://github.com/geneC/syslinux.git
>>
>> Branch 1efipxe
>>
>>
>> My test x86-64 binaries:
>>
>> https://sites.google.com/site/genecsyslinux/sl604p0g17-x64.tgz?attredirects=0&d=1

On Fri, Oct 2, 2015 at 4:46 PM, Derrick M <derrick.martinez at gmail.com> wrote:
> This works! Fixes my issue I have been having with the DL160s

Further testing, preferably of the above binaries, on machines that
previously had issues loading ldlinux.e64/ldlinux.e32 would be greatly
appreciated as I know you've observed this issue and this seems like
we might have a final resolution.

-- 
-Gene
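The fallback described in the message above can be sketched roughly as follows. This is only an illustration in Python, not the actual patch (Syslinux is written in C): the class names, the `recv` callback, and the example addresses are all hypothetical, and only the UseDefaultAddress idea comes from the message itself.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class UdpConfig:
    """Hypothetical stand-in for the UDP settings discussed above."""
    use_default_address: bool = True
    station_address: Optional[str] = None
    subnet_mask: Optional[str] = None


class FallbackReceiver:
    """Start with UseDefaultAddress on; switch to manual addresses once,
    and only if a receive times out before any packet has ever arrived."""

    def __init__(self, recv: Callable[[UdpConfig], Optional[bytes]],
                 manual_address: str, manual_mask: str) -> None:
        self._recv = recv  # hypothetical callback; returns None on timeout
        self._manual_address = manual_address
        self._manual_mask = manual_mask
        self.config = UdpConfig()
        self.ever_received = False

    def receive(self) -> Optional[bytes]:
        packet = self._recv(self.config)
        if (packet is None and not self.ever_received
                and self.config.use_default_address):
            # Timed out with nothing ever received: set the addresses manually.
            self.config.use_default_address = False
            self.config.station_address = self._manual_address
            self.config.subnet_mask = self._manual_mask
            packet = self._recv(self.config)  # retry with the manual settings
        if packet is not None:
            self.ever_received = True
        return packet


if __name__ == "__main__":
    # Simulated client that only receives once the addresses are set manually.
    def flaky_recv(cfg: UdpConfig) -> Optional[bytes]:
        return None if cfg.use_default_address else b"reply"

    receiver = FallbackReceiver(flaky_recv, "192.0.2.10", "255.255.255.0")
    print(receiver.receive(), receiver.config)
```

The point of tracking `ever_received` is that a timeout after traffic has already flowed is treated as an ordinary timeout, so a client on which UseDefaultAddress works is never reconfigured.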
Geert Stappers
2015-Oct-07 06:33 UTC
[syslinux] UEFI: Failed to load ldlinux.e64/ldlinux.e32
On Tue, Oct 06, 2015 at 10:27:15PM -0400, Gene Cumm via Syslinux wrote:
> > On Fri, Oct 2, 2015 at 4:07 AM, Gene Cumm <gene.cumm at gmail.com> wrote:
> >>
> >> I have a patch that I think may help your situation of syslinux.efi
> >> being unable to load ldlinux.e64/ldlinux.e32 (though I don't know if
> >> any of you are using an EFI ia32 platform).
> >>
> >> The basics are that we try to enable UseDefaultAddress as it helps
> >> certain clients with routing and works on numerous other clients. If
> >> we timeout on receiving a packet and have never received any packets,
> >> disable UseDefaultAddress and set the addresses manually.
> >>
> >>
> >> git://github.com/geneC/syslinux.git
> >> https://github.com/geneC/syslinux.git
> >>
> >> Branch 1efipxe
> >>
> >>
> >> My test x86-64 binaries:
> >>
> >> https://sites.google.com/site/genecsyslinux/sl604p0g17-x64.tgz?attredirects=0&d=1
>
> On Fri, Oct 2, 2015 at 4:46 PM, Derrick M <derrick.martinez at gmail.com> wrote:
> > This works! Fixes my issue I have been having with the DL160s
>
> Further testing, preferably of the above binaries, on machines that
> previously had issues loading ldlinux.e64/ldlinux.e32 would be greatly
> appreciated as I know you've observed this issue and this seems like
> we might have a final resolution.

Please say in your feedback whether you got a message
about disabling UseDefaultAddress on the screen.


Genec: Respect! Respect!
Yes, kudos twice: one for the technical skills and one for the
effort of finding the e-mail addresses of people affected by the issue.

I would have just addressed the mailing list, the Syslinux community as a whole.
But hey, the personal approach might be good for our project, Syslinux.


Regards
Geert Stappers
-- 
Live and let live
On Wed, Oct 7, 2015 at 6:17 AM, Ashish, Shivendra <shivendra.ashish at hpe.com> wrote:
> Gene,
>
> I have tested the binaries provided; they worked for me on a DL380 Gen9.
> I'm going to test this comprehensively on many different hardware and VM platforms.
> I will get back to you in case I face any issue.
>
> Thanks a lot.

I already found a regression that affects a rather unusual scenario.
I hope to have another set shortly after I test this again.

-- 
-Gene

> -----Original Message-----
> From: Geert Stappers [mailto:stappers at stappers.nl]
> Sent: Wednesday, October 07, 2015 12:04 PM
> To: Gene Cumm; Da Shi Cao; S. Schauenburg; Michael Glasgow; Oscar Roozen; jeff_sloan at selinc.com; Cao, Da-Shi (EG-China-Presales-CPC-GZ); holger.baust at freenet.ag; For discussion of Syslinux and tftp-hpa; Ashish, Shivendra; Derrick M
> Subject: Re: [syslinux] UEFI: Failed to load ldlinux.e64/ldlinux.e32
>
> On Tue, Oct 06, 2015 at 10:27:15PM -0400, Gene Cumm via Syslinux wrote:
>> > On Fri, Oct 2, 2015 at 4:07 AM, Gene Cumm <gene.cumm at gmail.com> wrote:
>> >>
>> >> I have a patch that I think may help your situation of syslinux.efi
>> >> being unable to load ldlinux.e64/ldlinux.e32 (though I don't know
>> >> if any of you are using an EFI ia32 platform).
>> >>
>> >> The basics are that we try to enable UseDefaultAddress as it helps
>> >> certain clients with routing and works on numerous other clients.
>> >> If we timeout on receiving a packet and have never received any
>> >> packets, disable UseDefaultAddress and set the addresses manually.
>> >>
>> >>
>> >> git://github.com/geneC/syslinux.git
>> >> https://github.com/geneC/syslinux.git
>> >>
>> >> Branch 1efipxe
>> >>
>> >>
>> >> My test x86-64 binaries:
>> >>
>> >> https://sites.google.com/site/genecsyslinux/sl604p0g17-x64.tgz?attredirects=0&d=1
>>
>> On Fri, Oct 2, 2015 at 4:46 PM, Derrick M <derrick.martinez at gmail.com> wrote:
>> > This works! Fixes my issue I have been having with the DL160s
>>
>> Further testing, preferably of the above binaries, on machines that
>> previously had issues loading ldlinux.e64/ldlinux.e32 would be greatly
>> appreciated as I know you've observed this issue and this seems like
>> we might have a final resolution.
>
> Please say in your feedback whether you got a message
> about disabling UseDefaultAddress on the screen.
>
>
> Genec: Respect! Respect!
> Yes, kudos twice: one for the technical skills and one for the effort of finding the e-mail addresses of people affected by the issue.
>
> I would have just addressed the mailing list, the Syslinux community as a whole.
> But hey, the personal approach might be good for our project, Syslinux.
On Wed, Oct 7, 2015 at 2:33 AM, Geert Stappers <stappers at stappers.nl> wrote:
> On Tue, Oct 06, 2015 at 10:27:15PM -0400, Gene Cumm via Syslinux wrote:
>> > On Fri, Oct 2, 2015 at 4:07 AM, Gene Cumm <gene.cumm at gmail.com> wrote:
>> >>
>> >> I have a patch that I think may help your situation of syslinux.efi
>> >> being unable to load ldlinux.e64/ldlinux.e32 (though I don't know if
>> >> any of you are using an EFI ia32 platform).
>> >>
>> >> The basics are that we try to enable UseDefaultAddress as it helps
>> >> certain clients with routing and works on numerous other clients. If
>> >> we timeout on receiving a packet and have never received any packets,
>> >> disable UseDefaultAddress and set the addresses manually.
>> >>
>> >>
>> >> git://github.com/geneC/syslinux.git
>> >> https://github.com/geneC/syslinux.git
>> >>
>> >> Branch 1efipxe
>> >>
>> >>
>> >> My test x86-64 binaries:
>> >>
>> >> https://sites.google.com/site/genecsyslinux/sl604p0g17-x64.tgz?attredirects=0&d=1
>>
>> On Fri, Oct 2, 2015 at 4:46 PM, Derrick M <derrick.martinez at gmail.com> wrote:
>> > This works! Fixes my issue I have been having with the DL160s
>>
>> Further testing, preferably of the above binaries, on machines that
>> previously had issues loading ldlinux.e64/ldlinux.e32 would be greatly
>> appreciated as I know you've observed this issue and this seems like
>> we might have a final resolution.
>
> Please say in your feedback whether you got a message
> about disabling UseDefaultAddress on the screen.

And please try HTTP transfers too (especially DHCP option 210 set to
an HTTP URL). I have a feeling I need to implement some of the same
logic for TCP too. Reporting what you see on the screen will be
helpful (e.g. "Failed to connect: %d" where %d is a number).

> Genec: Respect! Respect!
> Yes, kudos twice: one for the technical skills and one for the
> effort of finding the e-mail addresses of people affected by the issue.

Fortunately it didn't take much effort.

> I would have just addressed the mailing list, the Syslinux community as a whole.
> But hey, the personal approach might be good for our project, Syslinux.

Then comes the question of whether a given user is currently subscribed.

-- 
-Gene
Michael Glasgow
2015-Oct-11 05:26 UTC
[syslinux] UEFI: Failed to load ldlinux.e64/ldlinux.e32
Gene Cumm wrote:
> >> My test x86-64 binaries:
> >>
> >> https://sites.google.com/site/genecsyslinux/sl604p0g17-x64.tgz?attredirects=0&d=1
>
> On Fri, Oct 2, 2015 at 4:46 PM, Derrick M <derrick.martinez at gmail.com> wrote:
> > This works! Fixes my issue I have been having with the DL160s
>
> Further testing, preferably of the above binaries, on machines that
> previously had issues loading ldlinux.e64/ldlinux.e32 would be greatly
> appreciated as I know you've observed this issue and this seems like
> we might have a final resolution.

I got some time to look at this today. Definitely better, but I
think it's still broken for me on an Oracle X5-2 with latest bios
and ilom firmware. I loaded official binaries for this test and
replaced the two files with your patched versions.

Here's the config file:

    DEFAULT type_INSTALL_to_begin

    LABEL INSTALL_ovm341
    KERNEL mboot.c32
    APPEND media/ovm34_beta/images/pxeboot/xen.gz dom0_mem=max:128G dom0_max_vcpus=20 com1=57600,8n1 console=com1 --- media/ovm34_beta/images/pxeboot/vmlinuz console=ttyS0,57600n8 ks=http://10.196.129.1/ks/ovm341_unmanaged.ks --- media/ovm34_beta/images/pxeboot/initrd.img

Console output:

    >>Checking Media Presence......
    >>Media Present......
    Downloading NBP file...

    Succeed to download NBP file.
    Getting cached packet
    My IP is 10.196.129.123
    Loading type_INSTALL_to_begin... failed: No such file or directory
    boot: INSTALL_ovm341

    [hangs while loading the xen kernel]

In syslog you can see it request the xen kernel, then nothing further:

    Oct 11 06:08:49 oosinf01 in.tftpd[72726]: RRQ from 10.196.129.123 filename efi64/mboot.c32
    Oct 11 06:08:49 oosinf01 in.tftpd[72727]: RRQ from 10.196.129.123 filename efi64/libcom32.c32
    Oct 11 06:08:49 oosinf01 in.tftpd[72728]: RRQ from 10.196.129.123 filename efi64/media/ovm34_beta/images/pxeboot/xen.gz

With tcpdump you can see the pxe client suddenly stops acknowledging
tftp packets, apparently before the server is done sending the kernel:

    06:08:49.645053 IP (tos 0x0, ttl 64, id 37770, offset 0, flags [none], proto UDP (17), length 1440)
        10.196.129.1.43197 > 10.196.129.123.1722: [bad udp cksum 0x1da2 -> 0xc464!] UDP, length 1412
    06:08:49.645129 IP (tos 0x0, ttl 64, id 59240, offset 0, flags [none], proto UDP (17), length 32)
        10.196.129.123.1722 > 10.196.129.1.43197: [udp sum ok] UDP, length 4
    06:08:49.645143 IP (tos 0x0, ttl 64, id 37771, offset 0, flags [none], proto UDP (17), length 1440)
        10.196.129.1.43197 > 10.196.129.123.1722: [bad udp cksum 0x1da2 -> 0x6e43!] UDP, length 1412
    06:08:50.646315 IP (tos 0x0, ttl 64, id 37772, offset 0, flags [none], proto UDP (17), length 1440)
        10.196.129.1.43197 > 10.196.129.123.1722: [bad udp cksum 0x1da2 -> 0x6e43!] UDP, length 1412
    06:08:52.648615 IP (tos 0x0, ttl 64, id 37773, offset 0, flags [none], proto UDP (17), length 1440)
        10.196.129.1.43197 > 10.196.129.123.1722: [bad udp cksum 0x1da2 -> 0x6e43!] UDP, length 1412
    06:08:56.652794 IP (tos 0x0, ttl 64, id 37774, offset 0, flags [none], proto UDP (17), length 1440)
        10.196.129.1.43197 > 10.196.129.123.1722: [bad udp cksum 0x1da2 -> 0x6e43!] UDP, length 1412
    06:09:04.660903 IP (tos 0x0, ttl 64, id 37775, offset 0, flags [none], proto UDP (17), length 1440)
        10.196.129.1.43197 > 10.196.129.123.1722: [bad udp cksum 0x1da2 -> 0x6e43!] UDP, length 1412
    06:09:20.677014 IP (tos 0x0, ttl 64, id 37776, offset 0, flags [none], proto UDP (17), length 1440)
        10.196.129.1.43197 > 10.196.129.123.1722: [bad udp cksum 0x1da2 -> 0x6e43!] UDP, length 1412
    06:09:25.689215 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.196.129.123 tell 10.196.129.1, length 28
    06:09:25.707342 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.196.129.123 is-at 00:10:e0:71:eb:f4, length 46

    [hangs here forever]

-- 
Michael Glasgow <glasgow at beer.net>
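As a side note on reading the capture above: the last six packets from the server all carry the same expected UDP checksum (0x6e43), which suggests they are the same TFTP block being resent. A small sketch, using only the timestamps from the capture, shows that the gaps between those resends roughly double each time. The helper name `gaps_in_seconds` is made up for this illustration.

```python
from datetime import datetime

# Timestamps of the packets with expected checksum 0x6e43, copied from the
# tcpdump output above (server -> client, UDP length 1412).
RESEND_TIMES = [
    "06:08:49.645143",
    "06:08:50.646315",
    "06:08:52.648615",
    "06:08:56.652794",
    "06:09:04.660903",
    "06:09:20.677014",
]


def gaps_in_seconds(stamps):
    """Return the time difference, in seconds, between consecutive stamps."""
    times = [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


gaps = gaps_in_seconds(RESEND_TIMES)
rounded_gaps = [round(g) for g in gaps]
print(rounded_gaps)

# Size check on the same capture: the IP total length minus a 20-byte IPv4
# header (no options assumed) and an 8-byte UDP header gives the UDP payload.
udp_payload = 1440 - 20 - 8
print(udp_payload)
```

The rounded gaps come out as 1, 2, 4, 8 and 16 seconds, which looks like the server backing off exponentially while the client stays silent, and the size check reproduces the 1412-byte UDP payload that tcpdump reports.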
On Oct 11, 2015 1:26 AM, "Michael Glasgow" <glasgow at beer.net> wrote:
>
> Gene Cumm wrote:
> > >> My test x86-64 binaries:
> > >>
> > >> https://sites.google.com/site/genecsyslinux/sl604p0g17-x64.tgz?attredirects=0&d=1
> >
> > On Fri, Oct 2, 2015 at 4:46 PM, Derrick M <derrick.martinez at gmail.com> wrote:
> > > This works! Fixes my issue I have been having with the DL160s
> >
> > Further testing, preferably of the above binaries, on machines that
> > previously had issues loading ldlinux.e64/ldlinux.e32 would be greatly
> > appreciated as I know you've observed this issue and this seems like
> > we might have a final resolution.
>
> I got some time to look at this today. Definitely better, but I
> think it's still broken for me on an Oracle X5-2 with latest bios
> and ilom firmware. I loaded official binaries for this test and
> replaced the two files with your patched versions.

Excellent. We're at the next phase.

> Here's the config file:
>
> DEFAULT type_INSTALL_to_begin

DEFAULT install_ovm341
SAY type install_ovm341 to begin
PROMPT 1
TIMEOUT 3000

> LABEL INSTALL_ovm341

This should be treated case insensitively and tab completion should
show it as typed.

> KERNEL mboot.c32
> APPEND media/ovm34_beta/images/pxeboot/xen.gz dom0_mem=max:128G dom0_max_vcpus=20 com1=57600,8n1 console=com1 --- media/ovm34_beta/images/pxeboot/vmlinuz console=ttyS0,57600n8 ks=http://10.196.129.1/ks/ovm341_unmanaged.ks --- media/ovm34_beta/images/pxeboot/initrd.img

I'm honestly unsure if mboot.c32 works on EFI. Did you try a plain
Linux kernel yet?

> Console output:
>
> >>Checking Media Presence......
> >>Media Present......
> Downloading NBP file...
>
> Succeed to download NBP file.
> Getting cached packet
> My IP is 10.196.129.123
> Loading type_INSTALL_to_begin... failed: No such file or directory
> boot: INSTALL_ovm341
>
> [hangs while loading the xen kernel]

Thanks for the output.

> In syslog you can see it request the xen kernel, then nothing further:
>
> Oct 11 06:08:49 oosinf01 in.tftpd[72726]: RRQ from 10.196.129.123 filename efi64/mboot.c32
> Oct 11 06:08:49 oosinf01 in.tftpd[72727]: RRQ from 10.196.129.123 filename efi64/libcom32.c32
> Oct 11 06:08:49 oosinf01 in.tftpd[72728]: RRQ from 10.196.129.123 filename efi64/media/ovm34_beta/images/pxeboot/xen.gz
>
> With tcpdump you can see the pxe client suddenly stops acknowledging
> tftp packets, apparently before the server is done sending the kernel:
>
> 06:08:49.645053 IP (tos 0x0, ttl 64, id 37770, offset 0, flags [none], proto UDP (17), length 1440)
>     10.196.129.1.43197 > 10.196.129.123.1722: [bad udp cksum 0x1da2 -> 0xc464!] UDP, length 1412
> 06:08:49.645129 IP (tos 0x0, ttl 64, id 59240, offset 0, flags [none], proto UDP (17), length 32)
>     10.196.129.123.1722 > 10.196.129.1.43197: [udp sum ok] UDP, length 4
> 06:08:49.645143 IP (tos 0x0, ttl 64, id 37771, offset 0, flags [none], proto UDP (17), length 1440)
>     10.196.129.1.43197 > 10.196.129.123.1722: [bad udp cksum 0x1da2 -> 0x6e43!] UDP, length 1412
> 06:08:50.646315 IP (tos 0x0, ttl 64, id 37772, offset 0, flags [none], proto UDP (17), length 1440)
>     10.196.129.1.43197 > 10.196.129.123.1722: [bad udp cksum 0x1da2 -> 0x6e43!] UDP, length 1412
> 06:08:52.648615 IP (tos 0x0, ttl 64, id 37773, offset 0, flags [none], proto UDP (17), length 1440)
>     10.196.129.1.43197 > 10.196.129.123.1722: [bad udp cksum 0x1da2 -> 0x6e43!] UDP, length 1412
> 06:08:56.652794 IP (tos 0x0, ttl 64, id 37774, offset 0, flags [none], proto UDP (17), length 1440)
>     10.196.129.1.43197 > 10.196.129.123.1722: [bad udp cksum 0x1da2 -> 0x6e43!] UDP, length 1412
> 06:09:04.660903 IP (tos 0x0, ttl 64, id 37775, offset 0, flags [none], proto UDP (17), length 1440)
>     10.196.129.1.43197 > 10.196.129.123.1722: [bad udp cksum 0x1da2 -> 0x6e43!] UDP, length 1412
> 06:09:20.677014 IP (tos 0x0, ttl 64, id 37776, offset 0, flags [none], proto UDP (17), length 1440)
>     10.196.129.1.43197 > 10.196.129.123.1722: [bad udp cksum 0x1da2 -> 0x6e43!] UDP, length 1412
> 06:09:25.689215 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.196.129.123 tell 10.196.129.1, length 28
> 06:09:25.707342 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.196.129.123 is-at 00:10:e0:71:eb:f4, length 46

Feels like a stall in mboot.c32. I'd typically call it a hang only
when Ctrl-Alt-Del and ARP don't respond. I'd guess that the core
filled a buffer but mboot.c32 isn't emptying it. How much of the
kernel loaded?

Please try a plain Linux kernel to see if the core is flowing nicely
and that mboot.c32 is the issue.

If you try to load a file over 15MB via TFTP, please do a capture to a
file. I'd like to know if your system also exhibits the decaying IO
rate.

- Was this with binaries from sl604p0g17 or sl604p0g18?
- Could you try the other also?

If you have difficulty loading a plain Linux kernel with both, please
report the following:

- Make/model of system
- UEFI firmware revision
- What NIC type and port number?
- UEFI extension agents (struggling to recall the proper term;
  comparable to a BIOS PXE OROM for add-in cards)
- Looks like you copied the console output well enough.
- I see you did a packet capture that seems valid.

--Gene
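For anyone wanting to check a capture for the decaying IO rate mentioned above, one possible approach is to bucket the received payload bytes into fixed time windows and compare the per-window rates. The sketch below is only an illustration: the function names are made up, and the sample packets are synthetic, not taken from any capture in this thread.

```python
from collections import defaultdict


def throughput_per_window(packets, window=1.0):
    """Group (timestamp_seconds, payload_bytes) pairs into fixed windows.

    Returns a list of (window_start_seconds, bytes_per_second), sorted by
    time. Windows in which no packet arrived are simply omitted.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    totals = defaultdict(int)
    for timestamp, size in packets:
        totals[int(timestamp // window)] += size
    return [(index * window, total / window)
            for index, total in sorted(totals.items())]


def is_strictly_decaying(rates):
    """True if every window's rate is lower than the one before it."""
    values = [rate for _start, rate in rates]
    return all(later < earlier for earlier, later in zip(values, values[1:]))


# Synthetic example: 1408-byte payloads arriving less and less often.
sample_packets = [
    (0.1, 1408), (0.3, 1408), (0.6, 1408), (0.9, 1408),
    (1.2, 1408), (1.7, 1408),
    (2.5, 1408),
]

rates = throughput_per_window(sample_packets)
print(rates)
print(is_strictly_decaying(rates))
```

With a real capture the (timestamp, size) pairs would have to be extracted from the capture file first; that step is left out here because it depends on the tool used.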