On Mon, Aug 31, 2015 at 07:59:06PM -0400, Gene Cumm via Syslinux wrote:
> On Mon, Aug 31, 2015 at 6:42 PM, Derrick M <derrick.martinez at gmail.com> wrote:
> > Thanks Gene!
> >
> > this one is much better
>
> EXCELLENT! That's what I wanted to see. It iterates through 3
> handles, printing the entire MAC buffer and the handle's memory
> address. Looks like it's merely a visual display bug.
>
> > On Mon, Aug 31, 2015 at 3:11 PM, Gene Cumm <gene.cumm at gmail.com> wrote:
> >>
> >> On Mon, Aug 31, 2015 at 2:49 PM, Derrick M <derrick.martinez at gmail.com>
> >> wrote:
> >> > Gene,
> >> >
> >> > Also, I wanted to point out where the code is freezing in main.c.
> >> > It's right at load_env32(NULL); at that point status has a numerical
> >> > value of 6.
> >>
> >> That's excellent to know. Thanks.
>
> And numerical value of 6 is EFI_NOT_READY, indicating the keyboard
> buffer is empty, as expected.

Gene,

Yeah I guess that could be good or bad news. Also I have determined where it
is getting hung at -- core/elflink/load_env32.c in start_ldlinux at rv
spawn_load(LDLINUX, argc, argv);

Do you have any other ideas for troubleshooting?
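For reference on the status value mentioned above: the UEFI specification defines error codes as a small number with the high bit of the native word set, and EFI_NOT_READY is error number 6. The following is only a minimal sketch, not code from Syslinux or gnu-efi. The helper efi_status_name is hypothetical, it assumes a 64-bit EFI_STATUS, and it assumes the printed "6" had its high bit dropped (warning codes, which have the high bit clear, reuse the same small numbers and are not handled here).

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative subset of the UEFI error-code table, indexed by the
 * low bits of an EFI_STATUS (the high bit marks the value as an error). */
static const char *const efi_error_names[] = {
    [1] = "EFI_LOAD_ERROR",
    [2] = "EFI_INVALID_PARAMETER",
    [3] = "EFI_UNSUPPORTED",
    [4] = "EFI_BAD_BUFFER_SIZE",
    [5] = "EFI_BUFFER_TOO_SMALL",
    [6] = "EFI_NOT_READY",
    [7] = "EFI_DEVICE_ERROR",
};

/* Hypothetical helper: map a status value to a name, ignoring the
 * high (error) bit so that both 6 and 0x8000000000000006 resolve. */
const char *efi_status_name(uint64_t status)
{
    const uint64_t high_bit = UINT64_C(1) << 63;
    const uint64_t code = status & ~high_bit;
    const size_t count = sizeof efi_error_names / sizeof efi_error_names[0];

    if (status == 0)
        return "EFI_SUCCESS";
    if (code < count && efi_error_names[code] != NULL)
        return efi_error_names[code];
    return "(unknown)";
}
```

Printing efi_status_name(status) next to the raw number at the load_env32(NULL) call would make debug output like the above easier to read without looking up the table each time.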
On Mon, Aug 31, 2015 at 8:23 PM, Derrick M <derrick.martinez at gmail.com> wrote:
> Gene,
>
> Yeah I guess that could be good or bad news. Also I have determined where it
> is getting hung at -- core/elflink/load_env32.c in start_ldlinux at rv
> spawn_load(LDLINUX, argc, argv);
>
> Do you have any other ideas for troubleshooting?

spawn_load() should be the call to load ldlinux.* and in this case,
use the network to retrieve it, likely the first network IO operation.
When you saw those messages to print the handle address and MAC,
Syslinux should have been in the middle of that spawn_load() call.

My next curiosity is with the configured mapping. Is the firmware
merely saying it's ready when it's really not? I recall a suspicion
some HP system might have been going "configured" when it really
wasn't ready to send.

With my test/debug binaries, do you see any network IO occur from
Syslinux? I'm looking for the TFTP RRQ specifically and how it
responds (including acting deaf).

--
-Gene

> On Mon, Aug 31, 2015 at 4:59 PM, Gene Cumm <gene.cumm at gmail.com> wrote:
>>
>> On Mon, Aug 31, 2015 at 6:42 PM, Derrick M <derrick.martinez at gmail.com>
>> wrote:
>> > Thanks Gene!
>> >
>> > this one is much better
>>
>> EXCELLENT! That's what I wanted to see. It iterates through 3
>> handles, printing the entire MAC buffer and the handle's memory
>> address. Looks like it's merely a visual display bug.
>>
>> > On Mon, Aug 31, 2015 at 3:11 PM, Gene Cumm <gene.cumm at gmail.com> wrote:
>> >>
>> >> On Mon, Aug 31, 2015 at 2:49 PM, Derrick M <derrick.martinez at gmail.com>
>> >> wrote:
>> >> > Gene,
>> >> >
>> >> > Also, I wanted to point out where the code is freezing in main.c.
>> >> > It's right at load_env32(NULL); at that point status has a numerical
>> >> > value of 6.
>> >>
>> >> That's excellent to know. Thanks.
>>
>> And numerical value of 6 is EFI_NOT_READY, indicating the keyboard
>> buffer is empty, as expected.
>>
>> --
>> -Gene
>>
>> >> > On Mon, Aug 31, 2015 at 3:08 AM, Gene Cumm <gene.cumm at gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> On Aug 30, 2015 8:42 PM, "Derrick" <derrick22 at gmail.com> wrote:
>> >> >> >
>> >> >> > Gene thanks, here is the output
>> >> >> >
>> >> >> > My IP is 10.2.49.10
>> >> >> > Img @ 71d89718 = 8cdcd40ca5f0
>> >> >> > Udp @ 71d89718 = 8cdcd40ca5f0
>> >> >> > Udp @ 71d89718 = 8cdcd40ca5f0
>> >> >> > Udp @ 71d89718 = 8cdcd40ca5f0
>> >>
>> >> Figured out I had a bug when thinking about that output.
>> >>
>> >> https://sites.google.com/site/genecsyslinux/150831-efi.tgz?attredirects=0&d=1
>> >> will show the proper iterations over the handles.
>> >>
>> >> --
>> >> -Gene
>> >>
>> >> >> > From that point it is hung
>> >> >>
>> >> >> I find it hard to believe that it'd print that. I think you did a
>> >> >> copy/paste and forgot to update and trimmed some data.
>> >> >>
>> >> >> Does your HP DL160 G9 have an iLO with remote KVM function? If not
>> >> >> and you feel a screenshot from a real camera is too big for the
>> >> >> list, feel free to send it to me directly.
>> >> >>
>> >> >> --Gene
>> >> >>
>> >> >> > On Sun, Aug 30, 2015 at 4:10 AM, Gene Cumm via Syslinux
>> >> >> > <syslinux at zytor.com> wrote:
>> >> >> >>
>> >> >> >> On Fri, Aug 28, 2015 at 3:13 PM, Derrick M
>> >> >> >> <derrick.martinez at gmail.com> wrote:
>> >> >> >> > Gene,
>> >> >> >> >
>> >> >> >> > Your binaries didn't work for me, however I put some code in
>> >> >> >> > to print the bytes of mac1 and mac2. In efi_create_binding()
>> >> >> >> > it does go through all of the macs looking for the correct
>> >> >> >> > one and then finds a 100% match. In this case the mac is
>> >> >> >> > 8c-dc-d4-0d-a5-f0 so
>> >> >> >> > && memcmp(mac_1, mac_2, PXE_MAC_LENGTH) == 0) {
>> >> >> >> > is correct
>> >> >> >>
>> >> >> >> https://sites.google.com/site/genecsyslinux/150829-efi-1136.tgz?attredirects=0&d=1
>> >> >> >>
>> >> >> >> Those should produce debug text listing the original PxeBc handle
>> >> >> >> and its full MAC then proceed to list the UDPv4Sb handles and
>> >> >> >> their MACs. I'm still trying to figure out if there's a way to
>> >> >> >> get the handle numbers that the shells use.
>> >> >> >>
>> >> >> >> > On Fri, Aug 28, 2015 at 3:46 AM, Gene Cumm
>> >> >> >> > <gene.cumm at gmail.com> wrote:
>> >> >> >> >>
>> >> >> >> >> On Fri, Aug 28, 2015 at 6:34 AM, Patrick Masotta
>> >> >> >> >> <masottaus at yahoo.com> wrote:
>> >> >> >> >> > >>>
>> >> >> >> >> > More importantly: look at the actual captured text. It does
>> >> >> >> >> > NOT specify a valid MAC in its entirety and leaves off the
>> >> >> >> >> > leading nibble (11 characters, not 12). Handle 267 shows
>> >> >> >> >> > "065F36E00EE" not "0065F36E00EE".
>> >> >> >> >> > <<<
>> >> >> >> >> >
>> >> >> >> >> > I saw that, they might even be making a mistake when
>> >> >> >> >> > implementing the Device Path protocol.
>> >> >> >> >> >
>> >> >> >> >> > >>>
>> >> >> >> >> > It is possible that this is a visual bug but it DOES give a
>> >> >> >> >> > hint that there may be an issue in the MAC addresses.
>> >> >> >> >> > <<<
>> >> >> >> >> >
>> >> >> >> >> > It has to be more than visual; if not the code would've got
>> >> >> >> >> > a match.
>> >> >> >> >> >
>> >> >> >> >> > Probably they do not change the MAC but they make a mistake
>> >> >> >> >> > with the MAC Address Device Path. Anyway; it's a buggy EFI
>> >> >> >> >> > implementation that breaks the multi-nic approach.
>> >> >> >> >> >
>> >> >> >> >> > >>>
>> >> >> >> >> > Derrick is already running the latest firmware on this
>> >> >> >> >> > machine.
>> >> >> >> >> > <<<
>> >> >> >> >> >
>> >> >> >> >> > sorry didn't know.
>> >> >> >> >> >
>> >> >> >> >> > Derrick,
>> >> >> >> >> > you could try this as a "hack" for probably solving your
>> >> >> >> >> > problem:
>> >> >> >> >> > - && memcmp(mac_1, mac_2, PXE_MAC_LENGTH) == 0) {
>> >> >> >> >> >
>> >> >> >> >> > + && memcmp(mac_1 + 1, mac_2 + 1, 5) == 0) {
>> >> >> >> >> > **or alternatively** (I do not remember now if the 6 bytes
>> >> >> >> >> > of the MAC go at front or back of the 32 bytes string)
>> >> >> >> >> > + && memcmp(mac_1 + PXE_MAC_LENGTH - 5, mac_2 + PXE_MAC_LENGTH - 5, 5) == 0) {
>> >> >> >> >> > at efi/main.c\efi_create_binding()
>> >> >> >> >> >
>> >> >> >> >> > this change will only look after the last 5 bytes of the HP
>> >> >> >> >> > MAC address for a match,
>> >> >> >> >> > considering they really are using a MAC length of
>> >> >> >> >> > PXE_MAC_LENGTH (32 bytes) on the
>> >> >> >> >> > MAC representation of the MAC Address Device Path (who
>> >> >> >> >> > knows..)
>> >> >> >> >>
>> >> >> >> >> I sent Patrick and Derrick a copy of some binaries with a
>> >> >> >> >> quick hack to print the entire MAC on the first attempt to use
>> >> >> >> >> a UDPv4Sb where the PxeBc and UDPv4Sb don't live on the same
>> >> >> >> >> handle.
>> >> >> >> >>
>> >> >> >> >> --
>> >> >> >> >> -Gene
>> >> >> >>
>> >> >> >> --
>> >> >> >> -Gene
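The comparisons discussed in the quoted history (the full 32-byte memcmp versus the two partial variants Patrick suggested) can be illustrated in isolation. This is a minimal sketch under stated assumptions, not the code in efi/main.c: mac_range_equal is a hypothetical helper, and PXE_MAC_LENGTH is taken as 32 only because that is the size mentioned in the thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PXE_MAC_LENGTH 32  /* buffer size as quoted in the thread */

/* Hypothetical helper: compare len bytes of two MAC buffers starting at
 * offset. offset 0 with len PXE_MAC_LENGTH is the strict comparison;
 * (1, 5) and (PXE_MAC_LENGTH - 5, 5) correspond to the two suggested
 * partial comparisons. Out-of-range requests never match. */
bool mac_range_equal(const uint8_t *mac_1, const uint8_t *mac_2,
                     size_t offset, size_t len)
{
    if (offset > PXE_MAC_LENGTH || len > PXE_MAC_LENGTH - offset)
        return false;
    return memcmp(mac_1 + offset, mac_2 + offset, len) == 0;
}
```

With two buffers that differ only in their first byte, mac_range_equal(a, b, 0, PXE_MAC_LENGTH) is false while mac_range_equal(a, b, 1, 5) is true, which is the situation the partial comparison was meant to tolerate. Derrick's report of a 100% match suggests the strict comparison was already succeeding on his system.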
On Tue, Sep 1, 2015 at 11:36 PM, Derrick M <derrick.martinez at gmail.com> wrote:
> Gene
>
> I have tried to sleep 75 seconds right before the spawn_load() call just to
> wait if it wasn't ready yet to send. Also according to tcpdump there are no
> RRQ packets to the tftp server from syslinux. (only the initial
> syslinux.efi)

Where are you doing the capture? I find an inline capture the most
helpful with misbehaving clients (through a tap, port mirror/SPAN,
or a Linux bridge). Feel free to send me a PCAP directly if you want
an extra set of eyes.

The other recent message has me wondering what MAC/IPv4 addresses are
in those packets and what responses come about. I'm suspecting the
way we're trying to let the firmware guide the packet might be
misdirecting it as a result.

> On Tue, Sep 1, 2015 at 3:31 AM, Gene Cumm <gene.cumm at gmail.com> wrote:
>>
>> On Mon, Aug 31, 2015 at 8:23 PM, Derrick M <derrick.martinez at gmail.com>
>> wrote:
>> > Gene,
>> >
>> > Yeah I guess that could be good or bad news. Also I have determined
>> > where it is getting hung at -- core/elflink/load_env32.c in
>> > start_ldlinux at rv spawn_load(LDLINUX, argc, argv);
>> >
>> > Do you have any other ideas for troubleshooting?
>>
>> spawn_load() should be the call to load ldlinux.* and in this case,
>> use the network to retrieve it, likely the first network IO operation.
>> When you saw those messages to print the handle address and MAC,
>> Syslinux should have been in the middle of that spawn_load() call.
>>
>> My next curiosity is with the configured mapping. Is the firmware
>> merely saying it's ready when it's really not? I recall a suspicion
>> some HP system might have been going "configured" when it really
>> wasn't ready to send.
>>
>> With my test/debug binaries, do you see any network IO occur from
>> Syslinux? I'm looking for the TFTP RRQ specifically and how it
>> responds (including acting deaf).
--
-Gene
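For anyone checking a capture for the RRQ discussed above, the basic layout from RFC 1350 is a 2-byte opcode (1 for a read request) followed by the filename and the transfer mode, each NUL-terminated. The following is a minimal sketch of that layout only. tftp_build_rrq is a hypothetical helper and not Syslinux code, and real requests may append option strings (RFC 2347) after the mode.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TFTP_OP_RRQ 1  /* RFC 1350 opcode for a read request */

/* Hypothetical helper: write a basic RRQ into buf.
 * Returns the packet length, or 0 if buf is too small. */
size_t tftp_build_rrq(uint8_t *buf, size_t buflen,
                      const char *filename, const char *mode)
{
    const size_t flen = strlen(filename) + 1;  /* include the NUL */
    const size_t mlen = strlen(mode) + 1;      /* include the NUL */
    const size_t total = 2 + flen + mlen;

    if (total > buflen)
        return 0;

    buf[0] = 0;                 /* opcode is big-endian: 0x00 0x01 */
    buf[1] = TFTP_OP_RRQ;
    memcpy(buf + 2, filename, flen);
    memcpy(buf + 2 + flen, mode, mlen);
    return total;
}
```

A request for "ldlinux.e64" in "octet" mode built this way is 20 bytes long. In a capture, the bytes 00 01 at the start of a UDP payload sent to port 69, followed by the filename in ASCII, are the pattern to look for.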
On Mon, Sep 14, 2015 at 3:08 PM, Derrick M <derrick.martinez at gmail.com> wrote:
> Turns out that it was looking for ldlinux.e64 on the DHCP server not the
> next-server.. but after compiling the latest commit, it is now at least
> correctly trying the next-server.

Excellent news.

> But I'm getting a lot of 'RRQ from 10.1.1.205 filename /efi.x64/ldlinux.e64'
> entries before finally giving up. I'll have to do more debugging

Sounds deaf (transmitting packets but not receiving them at the
application layer).

--
-Gene
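The DHCP server versus next-server distinction above comes down to two different fields in the boot reply: the BOOTP siaddr field (what ISC dhcpd's next-server statement sets) and the DHCP server identifier (option 54). The following is a minimal sketch of preferring siaddr and falling back to the server identifier. It is an illustration only, not the logic of the Syslinux commit Derrick compiled, and struct boot_reply and pick_tftp_server are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical, simplified view of the two fields that matter here. */
struct boot_reply {
    uint32_t siaddr;     /* BOOTP "next server" address, 0 if unset */
    uint32_t server_id;  /* DHCP option 54, the DHCP server itself */
};

/* Prefer the next-server address; fall back to the DHCP server
 * only when no next-server was supplied. */
uint32_t pick_tftp_server(const struct boot_reply *reply)
{
    return reply->siaddr != 0 ? reply->siaddr : reply->server_id;
}
```

A client that always used server_id would send its RRQ to the DHCP server even when a separate TFTP host was configured, which matches the symptom described above.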
On Mon, Sep 14, 2015 at 4:51 PM, Derrick M <derrick.martinez at gmail.com> wrote:
> Yes it does sound deaf. Also using tftpd-hpa with TFTP_OPTIONS="--secure
> -vv"
>
> On Mon, Sep 14, 2015 at 1:39 PM, Gene Cumm <gene.cumm at gmail.com> wrote:
>>
>> On Mon, Sep 14, 2015 at 3:08 PM, Derrick M <derrick.martinez at gmail.com>
>> wrote:
>> > Turns out that it was looking for ldlinux.e64 on the DHCP server not the
>> > next-server.. but after compiling the latest commit, it is now at least
>> > correctly trying the next-server.
>>
>> Excellent news.
>>
>> > But I'm getting a lot of 'RRQ from 10.1.1.205 filename
>> > /efi.x64/ldlinux.e64'
>> > entries before finally giving up. I'll have to do more debugging
>>
>> Sounds deaf (transmitting packets but not receiving them at the
>> application layer).

Jeff's recent reply had me thinking perhaps there's something wrong
with packet filtering that's dropping this packet. It's mildly
possible that avoiding UseDefaultAddress or perhaps examining another
flag/setting may help.

--
-Gene