Patrick Masotta
2016-Mar-07 15:50 UTC
[syslinux] "Tick-counting" vs "Tick-less" timekeeping issues on VMs emulating BIOS PCs
>>>
...
Bottom line: there are timing issues with TFTP transfers on VMs emulating BIOS hardware.
The interrupt-based timer is probably not the culprit; now I'm working on /core/fs/pxe/core.c trying to see if there's something wrong there.

Best,
Patrick
<<<

I've been working on this issue. I have tested the timers, and as you guys mentioned before, they are OK.

I've noticed that if the VMware VM uses the e1000e driver everything is "OK", but if the VM uses vmxnet3 the problem is there. I have attached a small pcap file showing the end of the lpxelinux.0 TFTP transfer and the beginning of the problematic ldlinux.c32 transfer.

I wonder if the vmxnet3 driver is "too fast" for the sequential lwIP API?

Best,
Patrick

-------------- next part --------------
A non-text attachment was scrubbed...
Name: lwIP-vmware-vmxnet3.pcapng
Type: application/octet-stream
Size: 23120 bytes
Desc: not available
URL: <http://www.zytor.com/pipermail/syslinux/attachments/20160307/ca69020c/attachment-0001.obj>
Gene Cumm
2016-Mar-08 11:38 UTC
[syslinux] "Tick-counting" vs "Tick-less" timekeeping issues on VMs emulating BIOS PCs
On Mon, Mar 7, 2016 at 10:50 AM, Patrick Masotta via Syslinux <syslinux at zytor.com> wrote:
> >>>
> ...
> Bottom line; There are timing issues with TFTP transfers
> on VM machines emulating BIOS hardware.
> Probably the interrupt based timer is not the culprit;
> now I'm working on /core/fs/pxe/core.c trying to see if
> there's something wrong there.
>
> Best,
> Patrick
> <<<
>
> I've been working on this issue, I have tested the timers,
> and as you guys mentioned before they are OK.
> I've noticed that if the VMware VM uses the driver e1000e everything is "OK"
> but if the VM uses the "vmxnet3" the problem is there.
> I have attached a small pcap file showing the end of the lpxelinux.0 TFTP transfer and the
> beginning of the problematic ldlinux.c32 transfer.
> I wonder if the "vmxnet3" driver is probably "too fast" for the Sequential lwIP API?

It's been a while since I bothered looking at the lpxelinux.0/TFTP issue in my VMware Workstation VMs. From what I recall, vmxnet3 had the biggest issues; the others (vlance/pcnet32, e1000, and e1000e) still had issues but to a far lesser extent, topping out around 1-2 MB/s.

Since I work in a VMware vSphere environment, VMware Workstation is a better fit for me and has been my primary testing area for years. At the moment, I have at least 6 different VMs on my main PC with different configs whose sole purpose is Syslinux testing.

vmxnet3 is designed to move data with the least CPU overhead possible when used with a typical OS. e1000/e1000e can get into the 2-2.5 Gbps range, both intrahost and across a 10+ Gbps vmnic uplink, but vmxnet3 can push past this with relative ease.

--
-Gene
Patrick Masotta
2016-Mar-08 13:31 UTC
[syslinux] "Tick-counting" vs "Tick-less" timekeeping issues on VMs emulating BIOS PCs
>>>
It's been a while since I bothered looking at the lpxelinux.0/TFTP issue in my VMware Workstation VMs. From what I recall, the vmxnet3 had the biggest issues and the others vlance/pcnet32, e1000, and e1000e, still had issues but to a far lower extent, topping out around 1-2 MB/s.

Working in a VMware vSphere environment, VMware Workstation is a better fit for me and has been my primary testing area for years. At the moment, I have at least 6 different VMs on my main PC with different configs whose sole purpose is Syslinux testing.

vmxnet3 is designed to move data with the least CPU overhead possible when used with a typical OS. e1000/e1000e can get into the 2-2.5 Gbps range when intrahost and across a 10+ Gbps vmnic uplink but vmxnet3 can push past this with relative ease.

-Gene
<<<

I also use VMs under Workstation a lot, both for testing and for Syslinux development. You are right: e1000e is "considered" a 1 Gb/s driver while vmxnet3 is a 10 Gb/s one.

I wonder if it wouldn't be better to implement an lwIP raw API (udp_pcb) approach instead; the netconn interface seems slow. This way we could start receiving packets even before sending the first TFTP request...

I have also detected ARP issues: lwIP asks, it receives an answer, but it asks again... Legacy does not need ARP, since the UNDI stack knows about it, so why should we bother about ARP with lwIP?

BTW, lwIP received a lot of fixes in v1.4.1. I tried a quick update (I understand there are a few changes from the original code in Syslinux, like lwip_init_mem(), the size of the MAC address, etc.), but it failed when running. Have you tried running 1.4.1?

Best,
Patrick
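
One rough way to see why per-packet latency in the stack, rather than link speed, could bound these transfers: classic TFTP (RFC 1350, without a window-size extension) is lock-step, so the sender waits for the ACK of each DATA block before sending the next one, and throughput cannot exceed block size divided by round-trip time. The sketch below is only back-of-the-envelope arithmetic. The 1408-byte block size, the sample round-trip times, and the function names are assumptions chosen for illustration; none of them are taken from the capture or from the Syslinux code.

```python
def lockstep_ceiling(block_size_bytes, round_trip_s):
    """Upper bound (bytes/s) for a lock-step transfer: one block per round trip."""
    if block_size_bytes <= 0 or round_trip_s <= 0:
        raise ValueError("block size and round-trip time must be positive")
    return block_size_bytes / round_trip_s


def round_trip_for(block_size_bytes, throughput_bytes_per_s):
    """Per-block round-trip time (s) implied by an observed lock-step throughput."""
    if block_size_bytes <= 0 or throughput_bytes_per_s <= 0:
        raise ValueError("block size and throughput must be positive")
    return block_size_bytes / throughput_bytes_per_s


if __name__ == "__main__":
    BLOCK = 1408  # assumed TFTP block size in bytes (hypothetical value)

    # Ceiling for a few hypothetical per-block round-trip times.
    for rtt_ms in (0.1, 0.5, 1.0, 2.0):
        rate = lockstep_ceiling(BLOCK, rtt_ms / 1000.0)
        print(f"RTT {rtt_ms:4.1f} ms -> at most {rate / 1e6:6.2f} MB/s")

    # Per-block time implied by the 1-2 MB/s figure mentioned in the thread.
    for mb_per_s in (1, 2):
        rtt = round_trip_for(BLOCK, mb_per_s * 1e6)
        print(f"{mb_per_s} MB/s -> about {rtt * 1000:.3f} ms per block")
```

Under these assumptions, 1-2 MB/s corresponds to roughly 0.7-1.4 ms spent per block, which is far above what a 1 Gb/s or 10 Gb/s virtual link needs to move a single packet. That would be consistent with the time going into the receive path of the stack rather than the wire, but it does not by itself show where the delay comes from.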