Alexander Perlis
2014-Jul-02 02:31 UTC
[syslinux] iPXE chain to lpxelinux.0 6.03-pre17 inconsistencies and failures
I believe I'm seeing a bug in lpxelinux.0 6.03-pre17 but I need some advice on how to isolate and troubleshoot this. (I can't try pre18 at the moment, but did try 4.07 and 5.10 and saw similar behavior, also with pxelinux.0, so although I'll give pre18 a try soon, some isolation/troubleshooting advice will be a good education no matter what.)

To get to our PXE-launched tools from hosts on a subnet without proper DHCP support (e.g., on a NAT or in a different building), we're trying to use small iPXE USB thumb drives and/or iPXE CD-ROMs, obtained from rom-o-matic.eu, which then chainload to lpxelinux.0 off our actual PXE server. (We used pxelinux-options to put a "-b pxe.ip.address" into lpxelinux.0, so that it would know the server IP for grabbing the subsequent libxxx and config files.)

On some hosts we successfully get all the way to the graphical vesamenu.c32 under lpxelinux.0, while on other hosts we reach the initial lpxelinux.0 banner line but then the host hangs (and server shows no attempt to grab libxxx or config files), while on other hosts there is a reboot as soon as control is handed to lpxelinux.0 (and unclear whether the banner line is printed, as the reboot blanks the screen too quickly).

Intel Macs: local-ipxe->lpxelinux.0->banner->vesamenu.c32->success
Dell GX620: local-ipxe->lpxelinux.0->banner->hang
Dell 780: local-ipxe->lpxelinux.0->instant-reboot

I'm not sure how to dig deeper. I'm using the precompiled binaries. Is it easy to compile a debug version that spits out verbose progress prior and after the banner and perhaps pauses for user input?

I'm guessing ipxe is somehow setting the stage in a way that is caught by something finicky in lpxelinux.0 on certain hardware, or perhaps there's a bug in how ipxe sets the stage. Just to eliminate that latter variable, any recommendations for a non-ipxe-way to boot off a CD or USB to then PXE-boot to a specific server (not via DHCP)?

Alex
Gene Cumm
2014-Jul-02 02:55 UTC
[syslinux] iPXE chain to lpxelinux.0 6.03-pre17 inconsistencies and failures
On Jul 1, 2014 10:37 PM, "Alexander Perlis" <aperlis at math.lsu.edu> wrote:
>
> I believe I'm seeing a bug in lpxelinux.0 6.03-pre17 but I need some
> advice on how to isolate and troubleshoot this. (I can't try pre18 at
> the moment, but did try 4.07 and 5.10 and saw similar behavior, also
> with pxelinux.0, so although I'll give pre18 a try soon, some
> isolation/troubleshooting advice will be a good education no matter
> what.)

Odd. 4.07 should be good but the 4.10/5.1*/6.0* revisions make sense.

> To get to our PXE-launched tools from hosts on a subnet without proper
> DHCP support (e.g., on a NAT or in a different building), we're trying
> to use small iPXE USB thumb drives and/or iPXE CD-ROMs, obtained from
> rom-o-matic.eu, which then chainload to lpxelinux.0 off our actual PXE
> server. (We used pxelinux-options to put a "-b pxe.ip.address" into
> lpxelinux.0, so that it would know the server IP for grabbing the
> subsequent libxxx and config files.)
>
> On some hosts we successfully get all the way to the graphical
> vesamenu.c32 under lpxelinux.0, while on other hosts we reach the
> initial lpxelinux.0 banner line but then the host hangs (and server
> shows no attempt to grab libxxx or config files), while on other hosts
> there is a reboot as soon as control is handed to lpxelinux.0 (and
> unclear whether the banner line is printed, as the reboot blanks the
> screen too quickly).
>
> Intel Macs: local-ipxe->lpxelinux.0->banner->vesamenu.c32->success
> Dell GX620: local-ipxe->lpxelinux.0->banner->hang
> Dell 780: local-ipxe->lpxelinux.0->instant-reboot
>
> I'm not sure how to dig deeper. I'm using the precompiled binaries. Is
> it easy to compile a debug version that spits out verbose progress
> prior and after the banner and perhaps pauses for user input?
>
> I'm guessing ipxe is somehow setting the stage in a way that is caught
> by something finicky in lpxelinux.0 on certain hardware, or perhaps
> there's a bug in how ipxe sets the stage. Just to eliminate that latter
> variable, any recommendations for a non-ipxe-way to boot off a CD or
> USB to then PXE-boot to a specific server (not via DHCP)?

This is probably related to a bisect I did recently. I found the culprit commit in my case but a blind revert feels wrong.

--Gene
Geert Stappers
2014-Jul-02 04:42 UTC
[syslinux] iPXE chain to lpxelinux.0 6.03-pre17 inconsistencies and failures
On 2014-07-01 at 22:55, Gene Cumm wrote:
> On Jul 1, 2014 10:37 PM, "Alexander Perlis" wrote:
> >
> > I believe I'm seeing a bug in lpxelinux.0 6.03-pre17 but I need some
> > advice on how to isolate and troubleshoot this. (I can't try pre18
> > at the moment, but did try 4.07 and 5.10 and saw similar behavior,
> > also with pxelinux.0, so although I'll give pre18 a try soon, some
> > isolation/troubleshooting advice will be a good education no matter
> > what.)
>
> Odd. 4.07 should be good but the 4.10/5.1*/6.0* revisions make sense.
>
> > To get to our PXE-launched tools from hosts on a subnet without proper
> > DHCP support (e.g., on a NAT or in a different building), we're trying
> > to use small iPXE USB thumb drives and/or iPXE CD-ROMs, obtained from
> > rom-o-matic.eu, which then chainload to lpxelinux.0 off our actual
> > PXE server. (We used pxelinux-options to put a "-b pxe.ip.address"
> > into lpxelinux.0, so that it would know the server IP for grabbing
> > the subsequent libxxx and config files.)
> >
> > On some hosts we successfully get all the way to the graphical
> > vesamenu.c32 under lpxelinux.0, while on other hosts we reach the
> > initial lpxelinux.0 banner line but then the host hangs (and server
> > shows no attempt to grab libxxx or config files), while on other
> > hosts there is a reboot as soon as control is handed to lpxelinux.0
> > (and unclear whether the banner line is printed, as the reboot blanks
> > the screen too quickly).
> >
> > Intel Macs: local-ipxe->lpxelinux.0->banner->vesamenu.c32->success
> > Dell GX620: local-ipxe->lpxelinux.0->banner->hang
> > Dell 780: local-ipxe->lpxelinux.0->instant-reboot
> >
> > I'm not sure how to dig deeper. I'm using the precompiled binaries. Is
> > it easy to compile a debug version that spits out verbose progress
> > prior and after the banner and perhaps pauses for user input?
> >
> > I'm guessing ipxe is somehow setting the stage in a way that is caught
> > by something finicky in lpxelinux.0 on certain hardware, or perhaps
> > there's a bug in how ipxe sets the stage. Just to eliminate that latter
> > variable, any recommendations for a non-ipxe-way to boot off a CD or
> > USB to then PXE-boot to a specific server (not via DHCP)?
>
> This is probably related to a bisect I did recently. I found the culprit
> commit in my case but a blind revert feels wrong.

Please post in this thread what was found with `git bisect`. If it is already in our mailing list archive, then please link to it.

Regards,
Geert Stappers
--
Live and let live
Alexander Perlis
2014-Jul-02 15:13 UTC
[syslinux] iPXE chain to lpxelinux.0 6.03-pre17 inconsistencies and failures
On 07/01/2014 09:55 PM, Gene Cumm wrote:
> On Jul 1, 2014 10:37 PM, "Alexander Perlis" <aperlis at math.lsu.edu> wrote:
> > I believe I'm seeing a bug in lpxelinux.0 6.03-pre17 ...
>
> Odd. 4.07 should be good but the 4.10/5.1*/6.0* revisions make sense.

My bad. I tried again, and in 4.07 we do get further. We couldn't boot all the way because, it seems, using pxelinux-options in 4.07 to force in "-b next-server pxe.ip.address" doesn't seem to have an effect, and so we got stuck after launching 4.07 pxelinux.0, which led to the mistaken report. Indeed, as you expected, the iPXE-to-pxelinux chaining itself does work fine in 4.07.

In fact, just tried something: if we tell iPXE itself the value of next-server before chaining (instead of modifying pxelinux.0 using pxelinux-options), then we make it all the way to our graphical vesamenu.

But now back to 6.0x, as that's the version we're now running and would love to stick with the newer stuff! Just for completeness, I just tried 6.03-pre18 but have the same problems as reported with pre17. (But based on your message, your understanding would have correctly guessed that.)

> This is probably related to a bisect I did recently. I found the culprit
> commit in my case but a blind revert feels wrong.

A revert would toss out your work. Can another patch be made that maintains the goals of your work while also fixing the iPXE->(l)pxelinux.0 chaining? Or is there a different (non-ipxe?) way we can locally boot a machine and get it to chain to our 6.0x pxelinux server?

Alex
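
For reference, the approach described above (telling iPXE the value of next-server before chaining, rather than patching pxelinux.0 with pxelinux-options) might look roughly like the following iPXE script. This is only a sketch under assumptions: 192.0.2.10 is a placeholder documentation address standing in for the real PXE server, lpxelinux.0 is assumed to sit at the root of that server's TFTP tree, and the local subnet is assumed to hand out at least an IP address via DHCP. It illustrates the workaround reported to work with 4.07; it is not a fix for the 6.0x hang or reboot.

```
#!ipxe
# Sketch only. 192.0.2.10 is a placeholder, not the real PXE server address.

# Get an IP address from whatever DHCP the local subnet offers
# (assumed to exist, e.g. a NAT router with no PXE options).
dhcp

# Tell iPXE which server to use, since the local DHCP does not say.
set next-server 192.0.2.10

# Chainload the boot loader from that server over TFTP.
# The file location at the TFTP root is an assumption.
chain tftp://${next-server}/lpxelinux.0
```

A script like this would typically be embedded when the iPXE USB or CD image is built, so that the boot media needs no interaction on the target host.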