Christian Hesse
2011-Dec-29 07:55 UTC
[syslinux] tftp with pxelinux.0 from syslinux 4.10-pre17
Hello everybody,

Setting up a netboot server for a really huge network, I decided to go with what will be syslinux 4.10 to get support for http transfers.

The setup works on my notebook, booting another notebook directly connected. However, it fails with a more complex setup: a virtual machine on the second notebook, bridged to the ethernet device, does not boot, and systems from other networks with a router in between do not boot either.

The last message I can see on the clients is "Trying to load: pxelinux.cfg/<mac>". With tcpdump I can confirm there is a request for /pxelinux.cfg/<uid> (which is answered correctly with "no such file"). But there is nothing about /pxelinux.cfg/<mac> in the logs (neither request nor answer).

This used to work perfectly with syslinux 4.05. Any ideas what could go wrong?

BTW, is there a git repo containing the latest code for syslinux pre 4.10? Could not find one so far.

--
Best regards,
Chris
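For reference, the sequence of requests described above (first the UUID file, then the MAC file) follows the configuration search order given in the PXELINUX documentation. The following is a minimal sketch of that order, not code from syslinux itself; the function name and the sample MAC, IP, and UUID values are illustrative assumptions.

```python
def pxelinux_config_candidates(mac, ip, uuid=None):
    """List the config paths a PXELINUX client is documented to try, in order.

    Hypothetical helper for illustration; not part of syslinux.
    """
    paths = []
    if uuid:
        # Client UUID, if the PXE stack provides one, in lowercase.
        paths.append("pxelinux.cfg/" + uuid.lower())
    # ARP hardware type 01 (Ethernet), then the MAC in lowercase hex with dashes.
    paths.append("pxelinux.cfg/01-" + mac.replace(":", "-").lower())
    # IPv4 address as 8 uppercase hex digits, shortened one digit at a time.
    ip_hex = "".join("%02X" % int(octet) for octet in ip.split("."))
    for length in range(8, 0, -1):
        paths.append("pxelinux.cfg/" + ip_hex[:length])
    # Final fallback.
    paths.append("pxelinux.cfg/default")
    return paths


if __name__ == "__main__":
    for path in pxelinux_config_candidates(
        "88:99:AA:BB:CC:DD", "192.0.2.91",
        uuid="b8945908-d6a6-41a9-611d-74a6ab80b83d",
    ):
        print(path)
```

Comparing this list against the TFTP server log shows at which step a client stops issuing requests; in the report above, that is the second entry.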
Shantanu Gadgil
2011-Dec-29 09:39 UTC
[syslinux] tftp with pxelinux.0 from syslinux 4.10-pre17
Hi Christian,

This is exactly the problem I faced with machines located across a few switches/routers ("far away" machines). JFYI, the "far away" machines were DELL 2850 and SunFire V20z grade machines.

When I tried to boot the "far away" machines, things turned ugly real quick. Sometimes the loading of the kernel would time out, sometimes the loading of the initrd would time out, sometimes nothing would load; even the transfer of pxelinux.0 would fail! I would start seeing "failed sending ..." messages on the TFTP server for pxelinux.0 itself. Quite unnerving!

I quickly replaced the 4.10-pre17 (seventeen) files with the ones from 4.05, and things have been working since then. The really scary (and unexplainable) bit was that even after replacing the 4.10-pre17 files with the 4.05 files, all the "far away" machines /still/ behaved "slow" for nearly two days. Now things are back to normal.

To test it in a contained manner, I set up a desktop grade machine right next to my TFTP server to see what happens. Details are explained in a previous post titled "Please test 4.10-pre17".

(May be related.) My search for similar problems led me to a common topic called "TFTP logging strangeness", reported on various DNSMasq forums, which matched what I was seeing on the TFTP server side. All of the answers to those pointed to buggy ROMs on the clients, but then how do things work with 4.05? (I am confused.)

For now, I have NOT been able to retest 4.10-pre17 due to workload/holidays. I plan to retest 4.10-pre17 in the contained two-machine environment in the new year, to capture tcpdump logs.
Regards,
Shantanu
H. Peter Anvin
2011-Dec-29 19:21 UTC
[syslinux] tftp with pxelinux.0 from syslinux 4.10-pre17
On 12/28/2011 11:55 PM, Christian Hesse wrote:
> Hello everybody,
>
> setting up a netboot server for a really huge network I decided to go with
> what will be syslinux 4.10 to get support for http transfers.
>
> The setup works on my notebook, booting another notebook directly connected.
> However it fails with a more complex setup: A virtual machine on the
> second notebook bridged to the ethernet device does not boot, systems from
> other networks with a router in between do not boot either.
>
> The last message I can see on the clients is "Trying to load:
> pxelinux.cfg/<mac>". With tcpdump I can confirm there is a request
> for /pxelinux.cfg/<uid> (which is answered correctly with "no such file").
> But there is nothing about /pxelinux.cfg/<mac> in the logs (neither request
> nor answer).
>
> This used to work perfectly with syslinux 4.05. Any ideas what could go wrong?
>
> BTW, is there a git repo containing the latest code for syslinux pre 4.10?
> Could not find one so far.

Is there any way you could get a packet trace of the failing system?

-hpa
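When reading such a packet trace, it can help to decode the TFTP payloads directly. The sketch below labels a UDP payload using the opcodes defined in RFC 1350 (plus OACK from RFC 2347); the function name is an illustrative assumption, and extracting the payloads from the capture is left out.

```python
import struct

# Opcodes 1-5 are from RFC 1350; 6 (OACK) is from RFC 2347.
OPCODES = {1: "RRQ", 2: "WRQ", 3: "DATA", 4: "ACK", 5: "ERROR", 6: "OACK"}


def describe_tftp(payload: bytes) -> str:
    """Return a one-line description of a TFTP UDP payload (hypothetical helper)."""
    if len(payload) < 2:
        return "short packet"
    (opcode,) = struct.unpack("!H", payload[:2])
    name = OPCODES.get(opcode, "unknown(%d)" % opcode)
    body = payload[2:]
    if opcode in (1, 2):
        # Read/write request: filename NUL mode NUL [options...]
        fields = body.split(b"\x00")
        filename = fields[0].decode("ascii", "replace")
        mode = fields[1].decode("ascii", "replace") if len(fields) > 1 else ""
        return "%s file=%s mode=%s" % (name, filename, mode)
    if opcode in (3, 4) and len(body) >= 2:
        # DATA and ACK both carry a 16-bit block number.
        (block,) = struct.unpack("!H", body[:2])
        return "%s block=%d" % (name, block)
    if opcode == 5 and len(body) >= 2:
        # ERROR: 16-bit error code, then a NUL-terminated message.
        (code,) = struct.unpack("!H", body[:2])
        msg = body[2:].split(b"\x00")[0].decode("ascii", "replace")
        return "ERROR code=%d msg=%s" % (code, msg)
    return name


if __name__ == "__main__":
    print(describe_tftp(b"\x00\x01pxelinux.cfg/default\x00octet\x00"))
    print(describe_tftp(b"\x00\x05\x00\x01File not found\x00"))
```

In a trace of the failure reported in this thread, one would expect an RRQ for the UUID file answered by an ERROR with code 1 (file not found), and then no RRQ for the MAC file.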
Christian Hesse
2011-Dec-29 19:59 UTC
[syslinux] tftp with pxelinux.0 from syslinux 4.10-pre17
"H. Peter Anvin" <hpa at zytor.com> on Thu, 29 Dec 2011 11:21:20 -0800:> On 12/28/2011 11:55 PM, Christian Hesse wrote: > > The setup works on my notebook, booting another notebook directly > > connected. However it fails with a more complex setup: A virtual machine > > on the second notebook bridged to the ethernet device does not boot, > > systems from other networks with a router in between do not boot either. > > [...] > > Is there any way you could get a packet trace of the failing system?Sure. Will post it tomorrow. -- Schoene Gruesse Chris
On Dec 29, 2011 3:39 AM, "Christian Hesse" <list at eworm.de> wrote:
> BTW, is there a git repo containing the latest code for syslinux pre 4.10?
> Could not find one so far.

Branch lwip.

--
-Gene
On Mon, Jan 2, 2012 at 13:16, Christian Hesse <mail at eworm.de> wrote:
> Christian Hesse <list at eworm.de> on Mon, 2 Jan 2012 19:10:54 +0100:
>> Gene Cumm <gene.cumm at gmail.com> on Mon, 2 Jan 2012 12:47:13 -0500:
>> > On Jan 2, 2012 9:34 AM, "Christian Hesse" <list at eworm.de> wrote:
>> > >
>> > > Gene Cumm <gene.cumm at gmail.com> on Mon, 2 Jan 2012 07:40:33 -0500:
>> > > > On Mon, Jan 2, 2012 at 02:53, Christian Hesse <list at eworm.de> wrote:
>> > > > > Gene Cumm <gene.cumm at gmail.com> on Sat, 31 Dec 2011 13:55:00 -0500:
>> > > > >> On Sat, Dec 31, 2011 at 12:19, Christian Hesse <list at eworm.de>
>> > > > >> wrote:
>> > > > >> > Let me know what to do and I will test it.
>> > > > >>
>> > > > >> Thanks in advance. I'll probably send a binary and/or diff.
>> > > > >
>> > > > > Added some debug output myself...
>> > > > > Looks like try_load() is called and system hangs in call16().
>> > > >
>> > > > So core/fs/pxe/pxe.c pxe_load_config() calls try_load() which calls
>> > > > call16() to go back to the real mode core code which eventually gets
>> > > > into the lwip code.
>> > > >
>> > > > git://github.com/geneC1/syslinux.git lwip-undiif-debug
>> > > >
>> > > > This is what I used to debug the VMware platform issue. The diff from
>> > > > 85c4ef2e is attached (which will apply cleanly to the top of the lwip
>> > > > branch) so you can have your choice of using patch or git-remote.
>> > >
>> > > gcc complains about unused variables and some more... Had to remove
>> > > "-Werror" from mk/devel.mk - hope that is ok.
>> > >
>> > > The log from serial is attached.
>> >
>> > The log is about 1/10th the size it should be. Edit line 62 of
>> > core/Makefile which should read "# CFLAGS += -DDEBUG=1" to be
>> > "CFLAGS += -DDEBUG=1" then run 'make spotless; make' which will
>> > clean things up and rebuild in a true debug build. Hopefully we don't
>> > need DEBUG=2 which is even bigger.
>>
>> It's not a lot bigger now...
>> Should I go with DEBUG=2?
>
> Sorry, I booted the wrong image.
>
> Here we go:
> http://www.eworm.de/tmp/serial.log
> --
> Best regards,
> Chris

That looks much better than the first. Line 197, "tcp_input: no PCB match found, resetting RST.", is extremely familiar; I need to look over it more closely.

--
-Gene
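The Makefile edit quoted above (uncommenting the DEBUG flag in core/Makefile) can be sketched as a small text transformation. This is an illustration only: the function name is hypothetical, and the exact whitespace of the real Makefile line is an assumption, so the pattern accepts any run of spaces or tabs.

```python
import re

# Matches a commented-out "CFLAGS += -DDEBUG=1" line, tolerating any spacing.
_DEBUG_LINE = re.compile(
    r"^#[ \t]*(CFLAGS[ \t]*\+=[ \t]*-DDEBUG=1)[ \t]*$", re.MULTILINE
)


def enable_debug_flag(makefile_text: str) -> str:
    """Uncomment the DEBUG=1 CFLAGS line, leaving all other lines untouched."""
    return _DEBUG_LINE.sub(r"\1", makefile_text)


if __name__ == "__main__":
    sample = "OPTFLAGS = -Os\n# CFLAGS   += -DDEBUG=1\nLDFLAGS =\n"
    print(enable_debug_flag(sample))
```

After such an edit, the thread's advice is to run 'make spotless; make' so that everything is rebuilt with the debug flag in effect.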
Christian Hesse
2012-Jan-09 11:03 UTC
[syslinux] tftp with pxelinux.0 from syslinux 4.10-pre17
Gene Cumm <gene.cumm at gmail.com> on Mon, 2 Jan 2012 13:49:25 -0500:
> On Mon, Jan 2, 2012 at 13:16, Christian Hesse <mail at eworm.de> wrote:
> > Here we go:
> > http://www.eworm.de/tmp/serial.log
>
> That looks much better than the first. Line 197 "tcp_input: no PCB
> match found, resetting RST." is extremely familiar I need to look
> over it more closely.

Ok, waiting for some news now. ;)
Let me know if there is anything to test.

--
Best regards,
Chris