On 9/4/20 6:47 PM, daggs wrote:
> Greetings Laine,
>
>>> I would start troubleshooting by making sure that the dhcp server is
>>> running, and that you can communicate between the machine with DHCP
>>> server and the guest once a manual IP is assigned. Then use tcpdump or
>>> wireshark at different places on the path between those two to see how
>>> far the DHCP request is getting out, whether a response is being sent by
>>> the server, and if so how far the response is getting back (i.e. on the
>>> host, run tcpdump on the guest's tap device; if you see the DHCP request
>>> there, then run tcpdump on the bridge, if you see it there, run it on
>>> the tap device for the guest, if you see it there, then run tcpdump
>>> inside the guest; then check the dhcp server logs to see if it's
>>> receiving requests. While you're doing all of this, you can also be
>>> noticing whether or not a DHCP response is arriving at each step (and if
>>> you see the response, you can skip looking further ahead in the packet
>>> path, since you know by inference that it made it all the way to the
>>> DHCP server). Once you find the point that the packet is blocked, you'll
>>> be better able to determine why.
>>
>> alright, I'll try that, thanks.
>
> I've ran tcpdump on the vm's tap device, here is what I see:

When you say "the vm", you mean the one running libreelec, that is
trying to get an IP address, correct?

> 01:42:15.404754 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 52:54:00:5a:4c:8c (oui Unknown), length 548
> 01:42:15.405075 IP Broadcom.Home.bootps > 10.0.0.40.bootpc: BOOTP/DHCP, Reply, length 300

I guess Broadcom.home is the IP of the VM that's running the dhcp
server? (I should have suggested using "tcpdump -n -e -v" :-/)

> 01:42:15.735893 STP 802.1d, Config, Flags [none], bridge-id 8000.52:54:00:6b:1b:92.8003, length 35
> 01:42:17.718941 STP 802.1d, Config, Flags [none], bridge-id 8000.52:54:00:6b:1b:92.8003, length 35
> 01:42:17.846918 IP6 fe80::fc54:ff:fe5a:4c8c > ff02::2: ICMP6, router solicitation, length 16
> 01:42:19.702944 STP 802.1d, Config, Flags [none], bridge-id 8000.52:54:00:6b:1b:92.8003, length 35
> 01:42:20.450441 ARP, Request who-has 10.0.0.40 tell Broadcom.Home, length 28
>
> I think that issue is this:
> 01:42:20.450441 ARP, Request who-has 10.0.0.40 tell Broadcom.Home, length 28
>
> I'm not sure if this is expected but looks like my dhcp server ignores it.
> any thoughts on the matter?

It looks strange, but is normal. What usually happens is this:

1) The guest sends a DHCP Discover Request, suggesting that it would
like to use the address 10.0.0.40. (These details will be revealed once
you add "-v" to your tcpdump commandline.)

2) The DHCP server says to itself "Hmm, this guy wants to use 10.0.0.40,
which is okay with me, but first I should see if someone else is using
it", so it sends out an ARP request for 10.0.0.40. Then just to be sure,
it sends another.

(At this point, if the server is dnsmasq and it hasn't received an ARP
reply, for some reason it sends an ICMP echo request to 10.0.0.40 (the
requested/suggested IP) with the destination MAC address of the client
that just sent the DHCP request. No idea why. It won't be answered
though, unless the client actually still had a lease on that address and
was just renewing; but the DHCP server would know it if that's what was
happening, so...)

3) If the server doesn't receive any response to the ARP request, then
it will send a DHCP response to the requested IP + client MAC saying
"Yes, you can use that IP address."

4) I'm not sure why (because it's been > 20 years since I last read the
DHCP RFC), but in the case I just looked at on my host (which is using
dnsmasq as the server, and dhclient as the client), the same request and
response are sent/received at the same IP+MAC addresses a 2nd time.

5) At this point everybody agrees on the new IP address, the client sets
its IP address, the server updates its leases table, and life carries on.

But to back up for a minute - it's completely normal for the DHCP server
to send out an ARP request and get no response. I think things are going
south sometime after that. Are you seeing a DHCP reply at all? If you
don't see it on the libreelec (client) machine's tap device, check if
you see it going *out* on the DHCP server's tap. If it's not there, then
you'll need to debug inside the guest running the DHCP server.

Before this packet is received, the guest doesn't yet know that's its IP
address, but it does know that's its MAC address, and it's waiting for a
DHCP reply, so it takes the info from the reply, then sends another
request, this time including all the options it received in the first reply.

4) Now
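
A minimal sketch of the hop-by-hop capture suggested above (client tap, then bridge, then DHCP server tap), using the "-n -e -v" flags so that MAC addresses and DHCP options are visible. The interface names vnet0, br0 and vnet1 are placeholders, not taken from this thread; "virsh domiflist <guest>" shows the real tap names. By default the script only prints the commands; set DRY_RUN=0 and run as root to actually capture.

```shell
#!/bin/sh
# Sketch: one tcpdump per hop on the DHCP path.
# vnet0, br0 and vnet1 are assumed names; substitute your own.
DRY_RUN="${DRY_RUN:-1}"

# Build the capture command for one interface: DHCP (UDP 67/68) plus ARP,
# with -n (no name resolution), -e (show MACs), -v (decode DHCP options)
# and -c 20 (stop after 20 matching packets).
capture_cmd() {
    printf "tcpdump -n -e -v -c 20 -i %s 'udp port 67 or udp port 68 or arp'\n" "$1"
}

# Walk the interfaces in path order, printing or running each capture.
trace_path() {
    for ifname in "$@"; do
        cmd=$(capture_cmd "$ifname")
        if [ "$DRY_RUN" = "1" ]; then
            echo "$cmd"
        else
            sh -c "$cmd"   # needs root
        fi
    done
}

# Client tap -> bridge -> DHCP server tap.
trace_path vnet0 br0 vnet1
```

The first hop where the request shows up but the reply does not is the place to look for the block.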
I am very inexperienced with KVM and have a question about networking.

As per my basic understanding, a network in KVM apparently also does the work of a DHCP server for all machines on that network. So, if that perception is correct, how would you go about it if you wanted (for learning purposes) to use a particular VM on such a KVM (internal) network to act as the DHCP server for all machines on that network? By default it would conflict with the built-in DHCP service of the network itself, wouldn't it?
I looked into this a little more myself (a mixture of the "Virtual Machine Manager" GUI and the virsh CLI is my toolset), which provided me with some insights.

On 06.09.20 14:50, gunnar.wagner wrote:
> As per my basic understanding a network in KVM apparently also
> provides the work of an DHCP server for all machines on that network.

Only if DHCP is enabled upon creation of such a network.

> [...] how would you go about whether you wanted (for learning
> purposes) to use a particular VM on such a KVM (internal) network to
> act as the DHCP server for all machines on that network?

By creating a network without DHCP enabled.
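
As a rough sketch of "a network without DHCP enabled" done from the virsh side: a libvirt network definition that simply has no <dhcp> element inside <ip>, so libvirt hands out no leases on it and a guest can take that role. The network name, bridge name and address below are made-up placeholders. The script only writes the XML file unless APPLY=1 is set and virsh is installed.

```shell
#!/bin/sh
# Sketch: define an isolated libvirt network with no <dhcp> element.
# "nodhcp", "virbr10" and 192.168.150.1 are assumed placeholder values.
NET_XML="${NET_XML:-${TMPDIR:-/tmp}/nodhcp-net.xml}"

cat > "$NET_XML" <<'EOF'
<network>
  <name>nodhcp</name>
  <bridge name='virbr10'/>
  <ip address='192.168.150.1' netmask='255.255.255.0'/>
</network>
EOF

# Only touch libvirt when explicitly asked to (APPLY=1) and virsh exists.
if [ "${APPLY:-0}" = "1" ] && command -v virsh >/dev/null 2>&1; then
    virsh net-define "$NET_XML"
    virsh net-start nodhcp
    virsh net-autostart nodhcp
fi
```

Guests attached to this network get no address until the VM acting as DHCP server is up, so that VM itself needs a static IP.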
Greetings Laine,

> When you say "the vm", you mean the one running libreelec, that is
> trying to get and IP address, correct?

yes, you are correct.

> I guess Broadcom.home is the IP of the VM that's running the dhcp
> server? (I should have suggested using "tcpdump -n -e -v" :-/)

frankly, I have no idea who Broadcom.home is.
here is the requested dump: https://dpaste.com/849DMX9ND

> It looks strange, but is normal. What usually happens is this:
>
> 1) The guest sends a DHCP Discover Request, suggesting that it would
> like to use the addres 10.0.0.40 (These details will be revealed once
> you add "-v" to your tcpdump commandline.
>
> 2) The DHCP server says to itself "Hmm, this guy wants to use 10.0.0.40,
> which is okay with me, but first I should see if someone else is using
> it", so it sends out an ARP request for 10.0.0.40. Then just to be sure,
> it sends another.
>
> (at this point, if the server is dnsmasq and it hasn't received an ARP
> request, for some reason it sends an ICMP echo request to 10.0.0.40 (the
> requested/suggested IP) with destination MAC address of the client that
> just sent the DHCP request. No idea why. It won't be answered though
> (unless the client actually still had a lease on that address and was
> just renewing; but the DHCP server would know it if that's what was
> happening, so...)
>
> 3) If the server doesn't receive any response to the ARP request, then
> it will send a DHCP response to the requested IP + client MAC saying
> "Yes, you can use that IP address.
>
> 4) I'm not sure why (because it's been > 20 years since I last read the
> DHCP RFC), but in the case I just looked at on my host (which is using
> dnsmasq as the server, and dhclient as the client), the same request and
> response are sent/received at the same IP+MAC addresses a 2nd time.
>
> 5) at this point everybody agrees on the new IP address, the client sets
> its IP address, the server updates its leases table, and life carries on.
>
> But to back up for a minute - it's completely normal for the DHCP server
> to send out an ARP request and get no response. I think things are going
> south sometime after that. Are you seeing a DHCP reply at all? If you
> don't see it on the libreelec (client) machine's tap device, check if
> you see it going *out* on the DHCP server's tap. If it's not there, then
> you'll need to debug inside the guest running the DHCP server.
>
> Before this packet is receivd, the guest doesn't yet know that's its IP
> address, but it does know that's its MAC address, and it's waiting for a
> DHCP reply, so it takes the info from the reply, then sends another
> request, this time including all the options it received in the first reply.
>
> 4) Now

should I add another nic with static ip and try to trace the pkts from there?
On 9/6/20 12:02 PM, daggs wrote:
> Greetings LAine,
>
>> When you say "the vm", you mean the one running libreelec, that is
>> trying to get and IP address, correct?
>
> yes, you are correct.
>
>> I guess Broadcom.home is the IP of the VM that's running the dhcp
>> server? (I should have suggested using "tcpdump -n -e -v" :-/)
>
> frankly, I have no idea who is Broadcom.home.

It's just some name tcpdump used to replace the IP address of one of
the machines, and since it's the source IP of a DHCP reply packet, it
most likely is the IP of the DHCP server.

> here is the requested dump: https://dpaste.com/849DMX9ND

What I see in that dump is that the DHCP client (MAC address
52:54:00:5a:4c:8c, hostname "streamer") repeatedly sends the exact same
DHCP request (6 times), and the DHCP server responds to each of these
requests (alternating between sending the response to the client's MAC
with a destination IP already set, and to the broadcast MAC + IP
addresses), interspersed with several ARP requests directed at the MAC
address of the client asking who has the IP that the server just
suggested. So it's doing something different from what I described in my
previous message: rather than using ARP to verify that an IP isn't
already in use prior to assigning it, it's assuming it has full
authority over IP addresses in the broadcast domain, assigning that IP
to the client without checking for prior use, and then sending the ARP
request to see if the client actually decided to use it.

Eventually the client gives up (because it hasn't seen any valid DHCP
responses) and gives itself an IP on the 169.254.0.0/16 network, then
goes about the process of looking for other devices to connect to using
that IP.

Was this dump taken on the host, on the tap device of the client
(libreelec aka streamer)? If so, I can only see two options:

1) there is something in iptables or ebtables (or nftables, if you have
that on the host) blocking the DHCP response packets from going out the
tap interface, or

2) there is something in the guest itself blocking the traffic or
preventing the packet from passing.

For (1) you'd need to run "ebtables -L; iptables -S; nft list ruleset"
and look for something suspicious.

For (2) can you try changing both the libreelec and the DHCP server
VM's ethernet device models from virtio to e1000 (or e1000e if they are
q35 machinetypes)? If that works, then change one or the other back and
see if it stops working.

> should I add another nic with static ip and try to trace the pkts from there?

You mean so you can ssh to the client/libreelec and run tcpdump there
against the interface that's doing dhcp? Is tcpdump even available on
libreelec? I know it's very limited, and has no simple facilities for
adding new packages. If it has tcpdump though, then sure. The only
problem is that you would probably not be able to get tcpdump running
via that interface quickly enough to see the initial boot-time dhcp
exchange; instead you'll probably need to go into the UI and bring the
other interface down/up to trigger a new DHCP cycle.

(BTW, if everything works when the client has a static IP address, then
that proves there is no problem related to ARP requests/responses - that
much is required in order for even a static IP to work.)
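
One possible way to narrow down the output of "ebtables -L; iptables -S; nft list ruleset" is sketched below. The filter pattern is only a guess at what "suspicious" might mean here (drop/reject verdicts and anything mentioning the DHCP ports 67/68); it is a starting point for reading, not a substitute for going through the full rulesets. The dump itself runs only as root, since all three tools need it.

```shell
#!/bin/sh
# Sketch: keep only rules that drop/reject traffic or mention DHCP ports.
# The pattern is an assumption about what is worth reading first.
suspicious_rules() {
    grep -E -i 'drop|reject|[sd]port 6[78]([^0-9]|$)|[sd]pt:6[78]([^0-9]|$)|bootp[sc]'
}

# Dump every ruleset present on the host; a missing tool is skipped
# silently because its error output is discarded.
dump_rules() {
    ebtables -L 2>/dev/null
    iptables -S 2>/dev/null
    nft list ruleset 2>/dev/null
}

if [ "$(id -u)" -eq 0 ]; then
    dump_rules | suspicious_rules || echo "no obviously suspicious rules"
fi
```

A default DROP policy on the FORWARD chain will show up here too, which is worth checking when the traffic crosses a bridge on the host.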