Phil Pokorny
2017-Mar-06 00:23 UTC
[syslinux] PATH directive searches in reverse order with wrong separator
I've been trying to get syslinux.efi working in my environment again... Found what look like a bunch of little bugs that are very frustrating... First, the documentation on the Wiki says that as of 5.11, the list separator is space, not colon. But I can find no evidence that 5.11 was ever officially released or that a commit to git was made to make this change. 6.00 and following still use colon as the separator in the parse routine. This should fix that. diff --git a/com32/elflink/ldlinux/readconfig.c b/com32/elflink/ldlinux/readconfig.c --- a/com32/elflink/ldlinux/readconfig.c +++ b/com32/elflink/ldlinux/readconfig.c @@ -786,7 +786,7 @@ static int parse_path(char *p) char *c = p; /* Find the next directory */ - while (*c && *c != ':') + while (*c && !my_isspace(*c)) c++; str = refstrndup(p, c - p); Speaking of which, why does the code in elflink/ldlinux use "my_isspace" while the rest of SYSLINUX uses isspace()? And why is there not_whitespace() in core/include/fs.h? Next, path_add() adds new entries at the beginning of the list not the end. This causes directories to be search last first not first to last. This should fix that. diff --git a/core/path.c b/core/path.c --- a/core/path.c +++ b/core/path.c @@ -32,7 +32,7 @@ __export struct path_entry *path_add(const char *str) if (!entry->str) goto bail; - list_add(&entry->list, &PATH); + list_add_tail(&entry->list, &PATH); return entry; But that will cause the code that attempts to put the CurrentDirName at the beginning of the list to break because it will end up at the end. Probably better to move the check for current working directory location for a file to findpath() like this: diff --git a/com32/lib/sys/module/common.c b/com32/lib/sys/module/common.c --- a/com32/lib/sys/module/common.c +++ b/com32/lib/sys/module/common.c @@ -67,6 +67,14 @@ FILE *findpath(char *name) if (f) return f; + /* Look in CurrentDirName */ + /* CurrentDirName ends with a / (see efi_setcwd) */ + snprintf(path, sizeof(path), "%s%s", CurrentDirName, name); + dprintf("findpath: trying \"%s\"\n", path); + f = fopen(path, "rb"); /* for full path */ + if (f) + return f; + list_for_each_entry(entry, &PATH, list) { bool slash = false; but I'm worried that the code in the com32/lib/ tree expects to use chdir,(), get_cwd() and friends to manipulate the current working directory. (see core/fs/lib/searchconfig.c) But that leads into a maze of twisty passages of virtual functions, default implementations and other stuff I haven't figured out yet. Also, what's the difference between CurrentDirName[] and config_cwd[] and this_fs->cwd_name? While trying to figure all that out, I found this in generic_mangle_name() which looks wrong to me. When collapsing repeat slashes (should we also be stripping repeat backslashes here?), we decrement i which is the "space left" count, but we haven't actually copied a character. I think that means that later, we'll end up padding with the incorrect number of characters. diff --git a/core/fs/lib/mangle.c b/core/fs/lib/mangle.c --- a/core/fs/lib/mangle.c +++ b/core/fs/lib/mangle.c @@ -23,7 +23,6 @@ void generic_mangle_name(char *dst, const char *src) if (*src == '/') { if (src[1] == '/') { src++; - i--; continue; } } Even with these fixes or workarounds in my testing, the ldlinux searching seems broken. Specifically, I have my DHCP server configure to send "/efi/syslinux.efi" as the "filename" and placed ldlinux.e64 in that same directory under the assumption that the documentation is right and syslinux will look in the same directory as the filename for ldlinux.EXT. The code appears to do that, but what the TFTP server sees is a second request for /efi/syslinux.efi and then ldlinux.e64 searched for in the hard coded search_directories list which doesn't match my network configuration. If I change the DHCP configuration to use pathprefix set to /efi/ then it finds ldlinux.e64 and can find my configfile, but I want to follow the layout recommended in the wiki documentation where I could have all three of bios, e32 and e64 versions of com32 modules in subdirectories of the same parent expecting a PATH directive to find he right version. But that doesn't seem to work either. TLDR: If I could get a second pair of eyes on this, I'll turn them into a series of smaller commits in git and submit a pull request. Phil P. -- Philip Pokorny, RHCE Chief Technology Officer PENGUIN COMPUTING, Inc www.penguincomputing.com Changing the world through technical innovation
Ady Ady
2017-Mar-06 03:51 UTC
[syslinux] PATH directive searches in reverse order with wrong separator
> I've been trying to get syslinux.efi working in my environment again... > > Found what look like a bunch of little bugs that are very frustrating... > > First, the documentation on the Wiki says that as of 5.11, the list > separator is space, not colon. But I can find no evidence that 5.11 > was ever officially released or that a commit to git was made to make > this change. 6.00 and following still use colon as the separator in > the parse routine. This should fix that.(snip) A minor note (for readers), just in case... the current git repo is located at: http://repo.or.cz/syslinux.git Many users/readers still think that kernel.org is always up-to-date, which is not the case. We can never be sure which base (6.03? 6.04-pre1? some commit?) is being used for patches, unless explicitly noted. Now, if I may... I don't recall any user presenting a practical real-world situation in which multiple paths were actually needed as arguments for the PATH directive. On the other hand, I do recall several conversations / reports / problems caused by the supposed capability of having multiple paths available in the PATH directive. The reported problems were not only related to the capability itself (and how it was coded) but also related to the specific selection of path separator (or entry separator). One alternative, in an attempt to reduce potential issues, was/is the recommendation to use multiple PATH directives, each in a different line/row in the configuration file, instead of using the path separator within one same PATH directive. In other words, the PATH directive accepts: PATH first_path PATH second_path and the paths are appended to the list of searched-for paths (looking for c32 (library) modules). I would suggest trying this alternative for a test, independently of how the code might parse / mess up this notation anyway. Yet, again, real-world cases were not presented here. Important reminder: the "Current Working Directory" is supposed to be the first searched-for directory, always; the specific location of the main c32 file does not modify the priority of the searched-for list of its dependencies. (snip)> > While trying to figure all that out, I found this in > generic_mangle_name() which looks wrong to me. When collapsing repeat > slashes (should we also be stripping repeat backslashes here?), we > decrement i which is the "space left" count, but we haven't actually > copied a character. I think that means that later, we'll end up > padding with the incorrect number of characters. > > diff --git a/core/fs/lib/mangle.c b/core/fs/lib/mangle.c > --- a/core/fs/lib/mangle.c > +++ b/core/fs/lib/mangle.c > @@ -23,7 +23,6 @@ void generic_mangle_name(char *dst, const char *src) > if (*src == '/') { > if (src[1] == '/') { > src++; > - i--; > continue; > } > }Hmm, for some reason your words reminded me of: "chrreplace: Don't skip the first character" commit/779a4c85252de59b3134d39961c3f63c99d0ea2e Very possibly it has nothing to do here, but...> > Even with these fixes or workarounds in my testing, the ldlinux > searching seems broken. Specifically, I have my DHCP server configure > to send "/efi/syslinux.efi" as the "filename" and placed ldlinux.e64 > in that same directory under the assumption that the documentation is > right and syslinux will look in the same directory as the filename for > ldlinux.EXT. The code appears to do that, but what the TFTP server > sees is a second request for /efi/syslinux.efi and then ldlinux.e64 > searched for in the hard coded search_directories list which doesn't > match my network configuration. > > If I change the DHCP configuration to use pathprefix set to /efi/ then > it finds ldlinux.e64 and can find my configfile, but I want to follow > the layout recommended in the wiki documentation where I could have > all three of bios, e32 and e64 versions of com32 modules in > subdirectories of the same parent expecting a PATH directive to find > he right version. But that doesn't seem to work either. > > TLDR: > If I could get a second pair of eyes on this, I'll turn them into a > series of smaller commits in git and submit a pull request.In addition, there is a known issue in the PATH directive. The PATH directive parses its own paths as "absolute", and the result is "as expected" when other directives use relative paths. But when there are absolute paths being used by other directives, then the resulting PATH is incongruous. Whether the issue should be considered a "bug" or an expected / desired result, that has never been clarified. See bug #58 in Syslinux's bug tracker. Finally, from the user's perspective, alternative (new) directives have been proposed, reducing the complexity (for users). See: http://www.syslinux.org/archives/2016-June/025224.html Being realistic, if the goal is to make syslinux.efi work in your environment, then probably posting your complete set of configuration files might be relevant, considering the known problems and the frequent misunderstandings we have seen over the years about the multi-platform setup. At any rate, I would certainly welcome active improvement of the source code.> > Phil P. >Regards, Ady.
Geert Stappers
2017-Mar-06 06:15 UTC
[syslinux] PATH directive searches in reverse order with wrong separator
> > At any rate, > I would certainly welcome active improvement of the source code.+1
Phil Pokorny
2017-Mar-06 11:32 UTC
[syslinux] PATH directive searches in reverse order with wrong separator
On Sun, Mar 5, 2017 at 4:23 PM, Phil Pokorny <ppokorny at penguincomputing.com> wrote:> But that will cause the code that attempts to put the CurrentDirName > at the beginning of the list to break because it will end up at the > end. Probably better to move the check for current working directory > location for a file to findpath() like this: > > diff --git a/com32/lib/sys/module/common.c b/com32/lib/sys/module/common.c > --- a/com32/lib/sys/module/common.c > +++ b/com32/lib/sys/module/common.c > @@ -67,6 +67,14 @@ FILE *findpath(char *name) > if (f) > return f; > > + /* Look in CurrentDirName */ > + /* CurrentDirName ends with a / (see efi_setcwd) */ > + snprintf(path, sizeof(path), "%s%s", CurrentDirName, name); > + dprintf("findpath: trying \"%s\"\n", path); > + f = fopen(path, "rb"); /* for full path */ > + if (f) > + return f; > + > list_for_each_entry(entry, &PATH, list) { > bool slash = false; > > > but I'm worried that the code in the com32/lib/ tree expects to use > chdir,(), get_cwd() and friends to manipulate the current working > directory. (see core/fs/lib/searchconfig.c) But that leads into a > maze of twisty passages of virtual functions, default implementations > and other stuff I haven't figured out yet. Also, what's the > difference between CurrentDirName[] and config_cwd[] and > this_fs->cwd_name?No, this is a bad idea. Looking at fopen->open_file->searchdir call path, I think we can assume that the fopen call above this patch already searched the "current directory" and therefore there is no reason to search it again. We can probably kill CurrentDirName entirely now that we have fs->ops with chdir() and get_cwd() and realname(). There aren't that many places it's used.> While trying to figure all that out, I found this in > generic_mangle_name() which looks wrong to me. When collapsing repeat > slashes (should we also be stripping repeat backslashes here?), we > decrement i which is the "space left" count, but we haven't actually > copied a character. I think that means that later, we'll end up > padding with the incorrect number of characters.There's another copy of that incorrect code in the vfat fs implementation of mangle_name(). There is a strreplace() function in the SYSLINUX library that could be used to convert backslash to forward slash and then call the generic_mangle_name implementation which would fix this bug and eliminate the duplicate code. Unfortunately, I'm testing with an AMI Aptio-4 based BIOS (just recently released updated version for this motherboard) and it still has problems with the network stack. Network sniffer traces show that when configured with pathprefix set to "::/efi/" it repeatedly asks for the file "/efi/ldlinux.e64" from the TFTP server but appears to ignore the replies and never fetches the data. After multiple failures, it tries the other "search_directories" in order. Changing pathprefix to "http://server/path/to/efi/" and copying the files into place, it then does repeated TCP connection handshakes (SYN -> SYN/ACK -> ACK) and then immediately (micro seconds later) issues a FIN, then replies with RST when the server tries to FIN/ACK No HTTP headers are exchanged. HTTP fails much faster than TFTP and so it appears to exhaust all search_directories attempts in a few seconds and then aborts back to the EFI BIOS. UseDefault might have been part of this, but that appears to be fixed now (6.04-pre1 + HEAD) and the message "disable UseDefault" is printed. I have an EFI version of GRUB (grubx64.efi) which works on this BIOS and is able to TFTP a config file, kernel, initrd and launch Linux. So there don't appear to be any network connectivity or other hardware issues. Any ideas? -- Philip Pokorny, RHCE Chief Technology Officer PENGUIN COMPUTING, Inc www.penguincomputing.com Changing the world through technical innovation