Santos, Jose Renato G
2006-Aug-30 20:11 UTC
[Xen-devel] Directly mapping vifs to physical devices in netback - an alternative to bridge
We would like to propose an alternative to the Linux bridge for network virtualization in Xen. We think that the standard Linux bridge makes the network configuration more complex than necessary, increasing the chances of network configuration errors. The bridge itself is an additional entity that needs to be configured and associated with physical interfaces (more things to configure, more opportunities for mistakes). Bridge configuration is not a simple operation: the physical interface is brought down and up, virtual interfaces are created and associated with the bridge, virtual and physical interfaces are renamed, and so on. This complexity has created several problems in the past. Many reports of user mistakes or network script bugs have been posted. Although most issues have been solved, it seems that some still remain.

As an example, we have an unusual network setup and the bridge scripts do not work perfectly for us. Our server has 8 interfaces (eth0 to eth7) connected to an isolated test network (for running network benchmarks) and another interface (eth8) connecting the machine to the outside world, which is also the default interface for IP routing. Until a few weeks ago (maybe 1-2 months), running the command "network-bridge start vifnum=0" would not work as expected: the bridge would be configured with "peth8" (default route) instead of peth0. In the current version of xen-unstable this seems to be fixed. However, now the command "network-bridge start netdev=eth0" does not work properly, as it tries to create veth8 (instead of veth0), which does not exist when the maximum number of loopback devices is 8 (veth0-veth7). This error can be avoided by specifying "vifnum=0", but this is still annoying and confusing to the user.

The claim is that with a simpler network approach such as the one proposed here, both problems in the network configuration scripts and mistakes in user network configuration can be significantly reduced.

Here is a brief summary of the proposed alternative scheme:

- Netback keeps a mapping of vifs to physical network devices. Netback intercepts all packets sent or received on the physical interface and on the I/O channel and forwards them directly to the appropriate domU, dom0 (local network stack) or physical interface (external host) based on the packet MAC address (handling broadcast correctly). A new parameter "pdev" is used in a vif definition in the domain configuration file to indicate the physical interface associated with the vif. This parameter is then used by netback to create the appropriate virtual-to-physical mapping. (An illustrative config entry and a sketch of the switching logic follow the list of advantages below.)

Some advantages of the alternative network approach:

a) Direct association of vifs to physical devices
   A vif is directly associated with a physical device in the domain configuration file, instead of being associated with a bridge which in turn is associated with a device. This reduces the likelihood of user misconfigurations (fewer things to configure, fewer opportunities for mistakes).

b) No "network-script" required
   Only a very simple "vif-script" based on "vif-common.sh" is needed to bring a new virtual interface up at the time of domain creation, with no script needed to set up the network configuration when xend starts. Since there is no script for network configuration and the "vif-script" just has to bring up a virtual interface, the likelihood of any script bug is very small.

c) No loopback interfaces used for dom0 communication
   Simpler network configuration with fewer interfaces visible to the user. The current limitation on the number of physical devices imposed by the number of available loopback interfaces is eliminated. Possibly better performance for dom0 traffic due to fewer stages of packet handling (needs to be measured).

d) No need to bring a physical interface down and up when configuring the network
   The current bridge setup brings the physical interface down and then up when configuring the bridge. This is a problem for configurations that cannot lose network connectivity, such as a system with an NFS root filesystem.

e) Performance
   Previous OProfile results have shown that the default bridge configuration has significant performance overhead. The proposed netback switching approach has much lower overhead. A more careful analysis indicated that most of the bridge overhead was caused by the netfilter code in the bridge. When the netfilter option in the bridge (CONFIG_BRIDGE_NETFILTER) is disabled, both approaches have similar performance, with the proposed netback switching approach performing slightly better. See the results summary below.
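For concreteness, here is roughly what a vif entry using the new "pdev" parameter could look like in a domU configuration file. The "pdev" key comes from the proposal above; the exact string syntax and the MAC value are only illustrative assumptions, modeled on the existing "bridge=" key:

    # Hypothetical domU config excerpt: bind this vif directly to physical eth0.
    # The exact 'pdev=' string syntax shown here is an assumption.
    vif = [ 'mac=00:16:3e:00:00:01, pdev=eth0' ]

And here is a minimal Python model of the MAC-based switching decision described above. The actual patch is C code in netback; this sketch only illustrates the forwarding logic, all names in it are made up, and details such as excluding the sending vif from broadcasts are ignored:

    BROADCAST = "ff:ff:ff:ff:ff:ff"

    class VifDevMap:
        """Toy model of the vif <-> physical-device mapping kept by netback."""

        def __init__(self, pdev, dom0_mac):
            self.pdev = pdev            # physical interface name, e.g. "eth0"
            self.dom0_mac = dom0_mac    # MAC used by dom0 on this interface (assumed)
            self.vif_by_mac = {}        # guest MAC -> vif handle

        def add_vif(self, mac, vif):
            self.vif_by_mac[mac] = vif

        def rx_packet(self, dst_mac, deliver_to_vif, deliver_to_dom0):
            # Packet arriving on the physical interface: deliver to the owning
            # domU if the destination MAC belongs to a known vif, otherwise to
            # dom0's local stack; broadcasts go to every vif and to dom0.
            if dst_mac == BROADCAST:
                for vif in self.vif_by_mac.values():
                    deliver_to_vif(vif)
                deliver_to_dom0()
            elif dst_mac in self.vif_by_mac:
                deliver_to_vif(self.vif_by_mac[dst_mac])
            else:
                deliver_to_dom0()

        def tx_packet(self, dst_mac, deliver_to_vif, deliver_to_dom0, send_on_pdev):
            # Packet from a guest on the I/O channel: switch locally to another
            # vif or to dom0 when the destination MAC matches, otherwise send it
            # out on the physical interface; broadcasts go everywhere.
            if dst_mac == BROADCAST:
                for vif in self.vif_by_mac.values():
                    deliver_to_vif(vif)
                deliver_to_dom0()
                send_on_pdev(self.pdev)
            elif dst_mac in self.vif_by_mac:
                deliver_to_vif(self.vif_by_mac[dst_mac])
            elif dst_mac == self.dom0_mac:
                deliver_to_dom0()
            else:
                send_on_pdev(self.pdev)

In the patch itself, the corresponding lookups show up in the profiles later in this thread as vifdevmap_rx_packet, vifdevmap_tx_packet and vifdevmap_find.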
=======================================================================
Performance Results:

- Machine: 4-way P4 Xeon 2.8 GHz with 4 GB of RAM (dom0 with 512 MB and domU with 256 MB)
- Benchmark: single TCP connection at maximum rate on a gigabit interface (940 Mb/s)

Measurement: CPU utilization on domain0 (99% confidence interval for 8 measurements)

======================================================================
| Experiment | default bridge | bridge with        | netback       |
|            |                | netfilter disabled | switching     |
======================================================================
| receive    | 85.00% ±0.38%  | 73.97% ±0.23%      | 72.17% ±0.56% |
| transmit   | 77.13% ±0.49%  | 68.86% ±0.73%      | 66.34% ±0.52% |
======================================================================

We have attached a patch with the netback direct switching approach. Comments, suggestions and criticisms are welcome.

Thanks for your time and for any feedback.

Renato
Ian Pratt
2006-Aug-31 14:39 UTC
RE: [Xen-devel] Directly mapping vifs to physical devices in netback - an alternative to bridge
> Performance Results:
> - Machine: 4-way P4 Xeon 2.8 GHz with 4 GB of RAM (dom0 with 512 MB and
>   domU with 256 MB)
> - Benchmark: single TCP connection at maximum rate on a gigabit interface
>   (940 Mb/s)
>
> Measurement: CPU utilization on domain0 (99% confidence interval for 8
> measurements)
>
> ======================================================================
> | Experiment | default bridge | bridge with        | netback       |
> |            |                | netfilter disabled | switching     |
> ======================================================================
> | receive    | 85.00% ±0.38%  | 73.97% ±0.23%      | 72.17% ±0.56% |
> | transmit   | 77.13% ±0.49%  | 68.86% ±0.73%      | 66.34% ±0.52% |
> ======================================================================

I'm kinda surprised that it doesn't work better than that. We see bridge fns show up a lot on oprofile results, so I'd have expected to see more than a 1.5% benefit.

How are you measuring CPU utilization? Are the dom0/domU on different CPUs?

Do you get the downgraded bridging performance simply by having CONFIG_BRIDGE_NETFILTER=y in the compiled kernel, or do you need to have modules loaded or rules installed? Does ebtables have the same effect?

Thanks,
Ian
Santos, Jose Renato G
2006-Aug-31 19:21 UTC
RE: [Xen-devel] Directly mapping vifs to physical devices in netback - an alternative to bridge
Ian,

Thanks for taking the time to look at this. Comments are embedded in the text below.

> -----Original Message-----
> From: Ian Pratt [mailto:m+Ian.Pratt@cl.cam.ac.uk]
> Sent: Thursday, August 31, 2006 7:40 AM
> To: Santos, Jose Renato G; Xen Devel
> Cc: Turner, Yoshio; G John Janakiraman; ian.pratt@cl.cam.ac.uk
> Subject: RE: [Xen-devel] Directly mapping vifs to physical
> devices in netback - an alternative to bridge
>
> I'm kinda surprised that it doesn't work better than that. We
> see bridge fns show up a lot on oprofile results, so I'd have

Yes, bridge shows up a lot on oprofile results, as I pointed out in my presentation at the last Xen summit, but this is significantly reduced by disabling netfilter (see results below).

> expected to see more than 1.5% benefit. How are you measuring
> CPU utilization? Are the dom0/domU on different CPUs?

Yes, dom0 and domU were running on different CPUs. The reported CPU utilization was for the dom0 CPU. CPU utilization was computed as the total number of oprofile samples (for unhalted CLOCK cycles) divided by the maximum number of possible samples during a fixed time interval (based on sample rate and CLOCK frequency). (A small worked example with made-up numbers appears after the profiles below.)

> Do you get the downgraded bridging performance simply by
> having CONFIG_BRIDGE_NETFILTER=y in the compiled kernel, or
> do you need to have modules loaded or rules installed? Does
> ebtables have the same effect?

Yes, simply having CONFIG_BRIDGE_NETFILTER=y (the default for xen0 configs), with no rules or modules installed, significantly increases the CPU utilization. Look at the oprofile results below comparing the three approaches (filtered to functions defined in net/bridge/bridge.o, or to the replacement functions for the alternative non-bridge approach). I have not tried ebtables.

Regards,
Renato

======================================================
Filtered OProfile results for the receive case:

Bridge functions from OProfile results (CONFIG_BRIDGE_NETFILTER=y)

Samples     %        function
 2868       1.9956   br_nf_pre_routing
 1768       1.2302   br_nf_post_routing
 1713       1.1920   br_handle_frame
 1389       0.9665   br_nf_forward_ip
 1236       0.8600   br_nf_pre_routing_finish
 1141       0.7939   br_fdb_update
 1138       0.7919   __br_fdb_get
  978       0.6805   ip_sabotage_out
  771       0.5365   br_handle_frame_finish
  593       0.4126   br_nf_forward_finish
  526       0.3660   br_dev_queue_push_xmit
  494       0.3437   __br_forward
  437       0.3041   br_forward
  412       0.2867   ip_sabotage_in
  388       0.2700   br_forward_finish
  299       0.2081   br_nf_dev_queue_xmit
  204       0.1419   setup_pre_routing
    4       0.0028   br_fdb_cleanup
16359      11.3830   TOTAL

======================================================
Bridge functions from OProfile results (# CONFIG_BRIDGE_NETFILTER is not set)

Samples     %        function
 1256       1.0052   br_handle_frame
 1000       0.8003   br_fdb_update
  901       0.7211   __br_fdb_get
  380       0.3041   br_handle_frame_finish
  283       0.2265   br_dev_queue_push_xmit
  146       0.1168   __br_forward
  121       0.0968   br_forward
   93       0.0744   br_forward_finish
    4       0.0032   br_fdb_cleanup
    1       0.0008   br_hold_timer_expired
    1       0.0008   br_send_config_bpdu
 4186       3.3500   TOTAL

======================================================
Alternative forwarding functions replacing the bridge:

Samples     %        function
 2479       2.0423   vifdevmap_rx_packet
  573       0.4721   vifdevmap_guest_packet
  372       0.3065   vifdevmap_find
  297       0.2447   vifdevmap_tx_packet
  159       0.1310   __vifdevmap_find
 3880       3.1966   TOTAL
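As a small worked illustration of the CPU utilization computation described above (the sample rate, interval length and sample count here are made-up values for the example, not numbers from our runs):

    # Worked example of CPU utilization derived from oprofile sample counts.
    # All numbers below are assumptions for illustration, not measured values.
    clock_hz    = 2.8e9      # P4 Xeon clock frequency
    sample_rate = 1000000    # unhalted clock cycles per oprofile sample (assumed)
    interval_s  = 60         # measurement interval in seconds (assumed)
    samples     = 120000     # unhalted-cycle samples seen on the dom0 CPU (assumed)

    # Maximum possible samples if the CPU were busy for 100% of the interval.
    max_samples = clock_hz * interval_s / sample_rate

    cpu_util = 100.0 * samples / max_samples
    print("dom0 CPU utilization: %.2f%%" % cpu_util)   # -> 71.43%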