On 05-08-15 04:17, Ady via Syslinux wrote:
>> Now... why is vesamenu.c32 crashing like it does now? Why is the version
>> I tried without Gene's latest patches crashing before even beginning to
>> load the first stage: ldlinux.e64?
> I think you were "hinted" about this before.

About what? Did I miss something obvious?

I now know why it did not load ldlinux.e64, but I still don't know why it crashes. The code seems well written with error checking everywhere, so I would expect an error message, or even a silent fail causing the boot process to try the DVD player next, or whatever. Not a crash.

> Unfortunately, different
> people define "boot" in different ways.

In this case I define 'boot' as 'reach a stage where syslinux successfully loads and runs a linux-kernel'. Up until now I think I described reasonably well how far it got into this process.

> The not-so-good "news" is that, considering that vesamenu.c32 has
> several problems (under UEFI), the reasons for your latest crashes
> would need more specific (detailed) investigations and reports (as
> opposed to "failed to boot").

What more can I do? I dived deep into the C code of a product that I don't have intimate knowledge of and spent more time on google than a teenager on facebook, but this is all I can provide. Please help me out here, as time is running out. I don't own the hardware and will probably lose access to it soon.

Unfortunately the debugging process is barely documented at all. The wiki page doesn't really help in this case and I had to read the source to find out how to enable more verbose output. And then there is this problem of stdout not being visible on this EFI system, so I first had to debug the debugging system... ;(

Just describing my experience here.
I'm new to the syslinux source and am having a hard time trying to understand how everything sings along.

> Things that come to mind: the space-like character issues (whether
> SYSAPPEND is being used or not),

Nope.

> additional building interactions (gcc,
> gnu-efi...)

I build using 'make spotless; make efi64', initially without any changes to the code or makefiles. The build system is CentOS 7.1.1503 at the moment, with gcc version 4.8.3 20140911 (Red Hat 4.8.3-9). However, binaries built by Gene crashed in the same way.

> , the output console (length of the command, keyboard
> issues...), screen resolution supported by your / the UEFI firmware,
> and more.

This all only got relevant yesterday, when I found out that replacing vesamenu.c32 with menu.c32 solved the crashes in this case. Are there any specific outputs you want to see? Just telling me you need more doesn't really help... I don't want to spam the list with everything and more I can capture, but if I leave out too much, by all means, please ask for it.
> On 05-08-15 04:17, Ady via Syslinux wrote:
> >> Now... why is vesamenu.c32 crashing like it does now? Why is the version
> >> I tried without Gene's latest patches crashing before even beginning to
> >> load the first stage: ldlinux.e64?
> > I think you were "hinted" about this before.
>
> About what? Did I miss something obvious?

If I mixed up email threads, my apologies. Unfortunately, I don't have the time today to look for what I was referring to (or at least, attempting to).

>
> I now know why it did not load ldlinux.e64, but I still don't know why
> it crashes. The code seems well written with error checking everywhere,
> so I would expect an error message, or even a silent fail causing the
> boot process to try the DVD player next, or whatever. Not a crash.
>
> > Unfortunately, different
> > people define "boot" in different ways.
>
> In this case I define 'boot' as 'reach a stage where syslinux
> successfully loads and runs a linux-kernel'. Up until now I think I
> described reasonably well how far it got into this process.

And there you go :) (and you are definitely not alone). For debugging (and for reporting in general), we should try to be more accurate.

If I may... If I were to say something like "Syslinux doesn't boot", I would mean that I cannot even get to a simple Syslinux "boot:" prompt. In other words, I would mean that _Syslinux_ (in one of its variants / family members) is failing to boot. Whether the problem is in Syslinux itself or in some other area (HW, FW, user error, inadequate configuration...), that would be the first troubleshooting matter.

Next, if _Syslinux_ works, I would try to load a simple LABEL from CLI. If it fails, I would try to load a popular and public kernel (+initrd+whatever). Trying the same on alternative / other HW... Checking BIOS / UEFI version and available updates... If all this works, then I would use menu.c32, perhaps some additional directives (TIMEOUT, multiple LABELs,...), and then vesamenu.c32.
Only then would I try additional fancy thingies like "MENU COLOR", "MENU BACKGROUND", multiple screen resolutions (especially when talking about UEFI, GOP, UGA, "removing VGA hardware dependencies"... arghhh!) with corresponding MENU RESOLUTION values and an adequate background (again, with a matching screen resolution, otherwise I would be cheating).

We know now that the multi-nic branch solved your problem with the first part, the one where _Syslinux_ was really failing to boot (i.e. the bootloader file plus the core module). So the current problem has a different behavior to report than the one you first reported.

Problems with vesamenu.c32 (in BIOS too, but especially in UEFI) are _not_ a surprise :O. Although we don't yet know whether it is related to the multi-nic feature, we should probably not assume such a relation, and the investigation should be focused, IMHO, on narrowing down what does trigger the behavior, and what does not.

Therefore, IMHO, the problem is not exactly defined as a "booting problem in Syslinux" (in the same sense as the multi-nic issue was). Obviously, it is something related to "booting", but it is not the same as "Syslinux fails to boot". Clearly stating which step / feature / component is failing (vesamenu.c32 in this case) is essential, whether you add some debugging "prints" or not.

>
> > The not-so-good "news" is that, considering that vesamenu.c32 has
> > several problems (under UEFI), the reasons for your latest crashes
> > would need more specific (detailed) investigations and reports (as
> > opposed to "failed to boot").
>
> What more can I do? I dived deep into the C code of a product that I
> don't have intimate knowledge of and spent more time on google than a
> teenager on facebook, but this is all I can provide. Please help me out
> here, as time is running out. I don't own the hardware and will probably
> lose access to it soon.
>
> Unfortunately the debugging process is barely documented at all.
> The wiki page doesn't really help in this case and I had to read the source
> to find out how to enable more verbose output. And then there is this
> problem of stdout not being visible on this EFI system, so I first had
> to debug the debugging system... ;(
>
> Just describing my experience here. I'm new to the syslinux source and
> am having a hard time trying to understand how everything sings along.

Let me state this clearly: your efforts / feedback / reports / contributions are useful and appreciated. I was just trying to make an important distinction between something like "my problem is that ldlinux.e64 is not being loaded" and "vesamenu.c32 is making my booting experience less enjoyable".

I sincerely understand the desire to have vesamenu.c32 working "as it should" (hey, I am still waiting for lss16 and the CLI to work as expected, for about 3 years now :), but if we are troubleshooting, I think we need to be clear about the (source of the) problem; right?

>
> > Things that come to mind: the space-like character issues (whether
> > SYSAPPEND is being used or not),
>
> Nope.

What do you mean by "nope"? I think I saw some IPAPPEND somewhere; am I wrong? Am I mixing email threads (again)?

Or perhaps you meant that you have triple-checked and thoroughly tested that strange behavior we saw last month regarding some "space-like characters" and concluded, without a doubt, that such a problem does not exist in your scenario / hardware?

>
> > additional building interactions (gcc,
> > gnu-efi...)
>
> I build using 'make spotless; make efi64', initially without any changes
> to the code or makefiles. The build system is CentOS 7.1.1503 at the
> moment, with gcc version 4.8.3 20140911 (Red Hat 4.8.3-9).
> However, binaries built by Gene crashed in the same way.

Just as a potential example, I would guess that Gene's binaries were produced with some old-ish version of gnu-efi, and not with the latest available commit (just a baseless guess, and certainly I could be very wrong). I am _not_ implying that I know this to be a cause of a problem in this case; it is just a generic example of what I meant. Let's not even attempt the gcc 5+ case, as we want to *solve* Syslinux-related problems, not add new ones.

>
> > , the output console (length of the command, keyboard
> > issues...), screen resolution supported by your / the UEFI firmware,
> > and more.
>
> This all only got relevant yesterday, when I found out that replacing
> vesamenu.c32 with menu.c32 solved the crashes in this case. Are there

As I said, taking vesamenu.c32 out of the scene would have been one of my first steps (before any reading of the code, "standards", "protocols" and what not).

There is not one crash, but multiple different crashes. The multi-nic situation was one thing (and good to know about one additional case that was solved with the new branch code), the vesamenu.c32 is another situation. Other than the in-common hardware, and the people testing / participating, we shouldn't assume they are related.

> any specific outputs you want to see? Just telling me you need more
> doesn't really help... I don't want to spam the list with everything and
> more I can capture, but if I leave out too much, by all means, please
> ask for it.

I think others are much much much (and much much) more qualified to ask for specifics, if needed. I was just trying to make things a little bit clearer (hopefully for other readers / common users too), so as to avoid potential additional confusions / misunderstandings in these email threads. If the result was the opposite, my sincere apologies.
Regards, Ady.
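
The escalation order described in the message above (plain "boot:" prompt first, then a single LABEL, then menu.c32, then vesamenu.c32, then cosmetic MENU directives) could be laid out as one staged configuration file. This is only a sketch: the file paths and the kernel argument are hypothetical placeholders, the resolution is an arbitrary example, and the later stages are left commented out so that they can be enabled one at a time.

```
# syslinux.cfg: staged test configuration (sketch; all paths are hypothetical)

# Stage 1: no UI directive, so Syslinux should stop at the "boot:" prompt.
PROMPT 1
TIMEOUT 0
DEFAULT test

# Stage 2: one simple LABEL, started by typing "test" at the prompt.
LABEL test
    LINUX /boot/vmlinuz
    INITRD /boot/initrd.img
    APPEND console=tty0

# Stage 3: enable the text menu once stages 1 and 2 work.
# UI menu.c32

# Stage 4: switch to the graphical menu only after menu.c32 works.
# UI vesamenu.c32

# Stage 5: add cosmetic directives one at a time.
# MENU RESOLUTION 1024 768
# MENU BACKGROUND /boot/splash.png
```

Enabling one stage per test run makes it possible to report exactly which step or component fails, instead of a generic "fails to boot".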
> If I may... If I were to say something like "Syslinux doesn't boot", I
> would mean that I cannot even get to a simple Syslinux "boot:"
> prompt. In other words, I would mean that _Syslinux_ (in one of its
> variants / family members) is failing to boot. Whether the problem is
> in Syslinux itself or in some other area (HW, FW, user error,
> inadequate configuration...), that would be the first troubleshooting
> matter.

I don't think I bluntly stated it this way. If I did, I apologise. I do admit that there are a few issues being discussed that maybe should have their own thread. Right now I have the following issues discussed in these two threads I started last week:

1) qemu + ipxe.rom + syslinux = 0xC (discussed in the other thread)
2) HP FW + ipxe.efi + syslinux = 0xC (also discussed there)
3) multinic (solved)
4) crashing when multinic fails
5) crashing when vesamenu.c32 fails
6) possibly some stupidity on my part with lib*.c32 so I have to retest 5.
7) suddenly solving 4 after recompiling with extra debugging info.
8) a possible problem with printf not working in EFI mode
9) running out of allocated time (not really a syslinux issue)

(Where crashing == exceptions + stack traces, sudden reboots, etc)

I'm having a hard time keeping track of this myself. I can imagine it's even more difficult to follow if you're just lurking in the list.

I put number 1 and 2 on hold, by the way. The solution (reading three threads and drawing my own conclusions from there) is too time consuming for me. I'll wait for the official ipxe to include the necessary stuff and will try to find some time by then.

Did I already mention that I had to learn more about git than I needed the past few years? ;-)

> Problems with vesamenu.c32 (in BIOS too, but especially in UEFI) are
> _not_ a surprise :O.
> Although we don't yet know whether it is related
> to the multi-nic feature, we should probably not assume such a relation,
> and the investigation should be focused, IMHO, on narrowing down what
> does trigger the behavior, and what does not.

The only relation I could find was that when something fails, the system crashes instead of printing an error and failing gracefully...

> Therefore, IMHO, the problem is not exactly defined as a "booting
> problem in Syslinux" (in the same sense as the multi-nic issue was).
> Obviously, it is something related to "booting", but it is not the same
> as "Syslinux fails to boot". Clearly stating which step / feature /
> component is failing (vesamenu.c32 in this case) is essential, whether
> you add some debugging "prints" or not.

I am trying to find out whether vesamenu.c32 was crashing or the code that was loading/starting vesamenu.c32. I haven't found that out yet.

> Let me state this clearly: your efforts / feedback / reports /
> contributions are useful and appreciated. I was just trying to make an
> important distinction between something like "my problem is that
> ldlinux.e64 is not being loaded" and "vesamenu.c32 is making my booting
> experience less enjoyable".

Thanks! ;-) I think Gene and Patrick were following me when the exchange went from the first case to the next one. Unfortunately the thread spread out like a tree, so if you're reading them by reference instead of chronologically, the confusion is understandable.

> I sincerely understand the desire to have
> vesamenu.c32 working "as it should" (hey, I am still waiting for lss16
> and the CLI to work as expected, for about 3 years now :), but if we
> are troubleshooting, I think we need to be clear about the (source of
> the) problem; right?

> What do you mean by "nope"? I think I saw some IPAPPEND somewhere; am
> I wrong? Am I mixing email threads (again)?

No, you're right. Is IPAPPEND related to SYSAPPEND? I'm not so fluent in syslinux configuration yet.
And I just discovered how loading just menu.c32 instead of vesamenu.c32 stopped the machine from crashing. I really haven't had time to dig deeper into the 'why' since discovering this. I also understand that I need to learn more about the configuration parser now..

> Or perhaps you meant that you have triple-checked and thoroughly tested
> that strange behavior we saw last month regarding some "space-like
> characters" and concluded, without a doubt, that such a problem does not
> exist in your scenario / hardware?

Uh oh... more reading to do for me. I did not fully understand your question there. But the menu entry is working. If syslinux starts and menu.c32 loads, all menu entries work perfectly fine. No crashing there. Is IPAPPEND or "space-like characters" (whatever these are) still relevant?

>> This all only got relevant yesterday, when I found out that replacing
>> vesamenu.c32 with menu.c32 solved the crashes in this case. Are there
> As I said, taking vesamenu.c32 out of the scene would have been one of
> my first steps (before any reading of the code, "standards",
> "protocols" and what not).

You probably have a different background than I do. There was no hint at all that this could be a problem. I was confronted with a machine that was crashing without any helpful hints while running code that I'm barely beginning to understand. From the tftp-logs I could see that the multinic patch worked, because the rest of the files were being downloaded, but then the machine crashed, just like it did before.

> There is not one crash, but multiple
> different crashes. The multi-nic situation was one thing (and good to
> know about one additional case that was solved with the new branch
> code), the vesamenu.c32 is another situation. Other than the in-common
> hardware, and the people testing / participating, we shouldn't assume
> they are related.

I'm not assuming. I'm just asking. What I see is a crash when it should just handle an error situation.
Maybe the longjmp() back to the exit handling code is broken? Who knows? I wanted to find out, so that's why I started to enable debugging and adding extra printf() everywhere, to see how far it got. And then I ran into more problems with that, which is a pity because I want to be as detailed as possible or even find and fix the problem myself.

And now, just before I had to leave the office to return only on Friday, I found that most crashes mysteriously disappeared. Go figure... I *really* want to dive into it now, but I can't..
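
For readers unfamiliar with the mechanism mentioned above, here is a minimal sketch of how a setjmp()/longjmp() pair is commonly used to bail out of a failing step and return to error-handling code instead of crashing. It is a generic standard C illustration, not taken from the Syslinux source: the function names, the `simulate_failure` flag and the module names passed in are all hypothetical.

```c
#include <setjmp.h>
#include <stdio.h>

/* All names here are hypothetical; this is not Syslinux source code. */

static jmp_buf recovery_point;   /* where a failed step jumps back to */

/* Pretend to load a module; on failure, report and unwind via longjmp(). */
void load_module(const char *name, int simulate_failure)
{
    if (simulate_failure) {
        fprintf(stderr, "error: could not load %s\n", name);
        longjmp(recovery_point, 1);   /* does not return to the caller */
    }
    printf("loaded %s\n", name);
}

/* Run one step. Returns 0 on success, -1 if the step bailed out. */
int run_boot_step(const char *name, int simulate_failure)
{
    /* setjmp() returns 0 when called directly, and the value passed
     * to longjmp() (here 1) when control arrives via longjmp(). */
    if (setjmp(recovery_point) != 0)
        return -1;   /* graceful failure: the caller can try something else */

    load_module(name, simulate_failure);
    return 0;
}
```

In this sketch, `run_boot_step("vesamenu.c32", 1)` prints the error and returns -1, and a later `run_boot_step("menu.c32", 0)` still succeeds. If the jmp_buf were stale or clobbered when longjmp() ran, the behavior would be undefined, which is one way a failure path can end in a crash rather than an error message.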