Yaron Haviv
2009-Aug-07 20:35 UTC
[Bridge] [evb] RE: [PATCH][RFC] net/bridge: add basic VEPA support
Paul,

I also think that the bridge may not be the right place for VEPA, but rather a simpler sw/hw mux, although VEPA support may reside in multiple places (i.e. also in the bridge).

As Arnd pointed out, Or already added an extension to qemu that allows direct guest virtual NIC mapping to an interface device (vs. using tap). This was done specifically to address VEPA, and results in much faster performance and lower CPU overhead (Or and some others are planning additional meaningful performance optimizations).

The interface multiplexing can be achieved using the macvlan driver or using an SR-IOV capable NIC (the preferred option). macvlan may need to be extended to support VEPA multicast handling; this looks like a rather simple task.

It may be counterintuitive for some, but we expect the (completed) qemu VEPA mode + SR-IOV + certain switches with hairpin (VEPA) mode to perform faster than bridge+tap, even for connecting 2 VMs on the same host.

Yaron

Sent from BlackBerry

________________________________
From: evb at yahoogroups.com
To: 'Stephen Hemminger' ; 'Fischer, Anna'
Cc: bridge at lists.linux-foundation.org ; linux-kernel at vger.kernel.org ; netdev at vger.kernel.org ; virtualization at lists.linux-foundation.org ; evb at yahoogroups.com ; davem at davemloft.net ; kaber at trash.net ; adobriyan at gmail.com ; 'Arnd Bergmann'
Sent: Fri Aug 07 21:58:00 2009
Subject: [evb] RE: [PATCH][RFC] net/bridge: add basic VEPA support

> After reading more about this, I am not convinced this should be part
> of the bridge code. The bridge code really consists of two parts:
> forwarding table and optional spanning tree. Well the VEPA code short
> circuits both of these; I can't imagine it working with STP turned
> on. The only part of bridge code that really gets used by this are the
> receive packet hooks and the crufty old API.
>
> So instead of adding more stuff to existing bridge code, why not have
> a new driver for just VEPA. You could do it with a simple version of
> macvlan type driver.

Stephen,

Thanks for your comments and questions. We do believe the bridge code is the right place for this, so I'd like to elaborate on that a bit more to help persuade you. Sorry for the long-winded response, but here are some thoughts:

- First and foremost, VEPA is going to be a standard addition to the IEEE 802.1Q specification. The working group agreed at the last meeting to pursue a project to augment the bridge standard with hairpin mode (aka reflective relay) and a remote filtering service (VEPA). See for details: http://www.ieee802.org/1/files/public/docs2009/new-evb-congdon-evbPar5C-0709-v01.pdf

- The VEPA functionality was really a pretty small change to the code with low risk and wouldn't seem to warrant an entire new driver or module.

- There are good use cases where VMs will want to have some of their interfaces attached to bridges and others to bridges operating in VEPA mode. In other words, we see simultaneous operation of the bridge code and VEPA occurring, so having as much of the underlying code in common as possible would seem to be beneficial.

- By augmenting the bridge code with VEPA there is a great amount of re-use achieved. It works wherever the bridge code works and doesn't need anything special to support KVM, XEN, and all the hooks, etc.

- The hardware vendors building SR-IOV NICs with embedded switches will be adding VEPA mode, so keeping the bridge module in sync would be consistent with this trend and direction. It will be possible to extend the hardware implementations by cascading a software bridge and/or VEPA, so being in sync with the architecture would make this more consistent.

- The forwarding table is still needed and used on inbound traffic to deliver frames to the correct virtual interfaces and to filter any reflected frames. A new driver would basically have to implement an equivalent forwarding table anyway. As I understand the current macvlan type driver, it wouldn't filter multicast frames properly without such a table.

- It seems the hairpin mode would be needed in the bridge module whether VEPA was added to the bridge module or a new driver. Having the associated changes together in the same code could aid in understanding and deployment.

As I understand the macvlan code, it currently doesn't allow two VMs on the same machine to communicate with one another. I could imagine a hairpin mode on the adjacent bridge making this possible, but the macvlan code would need to be updated to filter reflected frames so that a source does not receive its own packet. I could imagine this being done as well, but to also support selective multicast usage, something similar to the bridge forwarding table would be needed. I think putting VEPA into a new driver would cause you to implement many things the bridge code already supports. Given that we expect the bridge standard to ultimately include VEPA, and the new functions are basic forwarding operations, it seems to make most sense to keep this consistent with the bridge module.

Paul
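
For readers following the forwarding-table argument in Paul's reply quoted above, here is a minimal Python sketch of the VEPA behaviour he describes: every frame from a VM goes to the uplink, and frames arriving from the uplink are delivered through a table, with reflected group frames withheld from their own source. It is an illustration only, not the code from the patch; `VepaSketch`, its methods, the port names, and the MAC addresses are all hypothetical.

```python
from dataclasses import dataclass, field

BROADCAST = "ff:ff:ff:ff:ff:ff"
UPLINK = "uplink"  # hypothetical name for the port facing the adjacent bridge


def is_group_address(mac: str) -> bool:
    """True for multicast/broadcast MACs (low bit of the first octet set)."""
    return int(mac.split(":")[0], 16) & 1 == 1


@dataclass
class VepaSketch:
    """Toy model of VEPA forwarding; all names here are illustrative."""
    ports: dict = field(default_factory=dict)   # local VM MAC -> virtual port
    groups: dict = field(default_factory=dict)  # multicast MAC -> set of ports

    def add_port(self, mac: str, port: str) -> None:
        self.ports[mac] = port

    def join(self, group_mac: str, port: str) -> None:
        self.groups.setdefault(group_mac, set()).add(port)

    def from_vm(self, src_mac: str, dst_mac: str) -> list:
        # Egress: never switch locally, always hand the frame to the uplink.
        return [UPLINK]

    def from_uplink(self, src_mac: str, dst_mac: str) -> list:
        # Ingress: the table picks the virtual ports that receive the frame.
        if is_group_address(dst_mac):
            if dst_mac == BROADCAST:
                targets = set(self.ports.values())
            else:
                targets = set(self.groups.get(dst_mac, set()))
            # A reflected frame must not be delivered back to its own source.
            targets.discard(self.ports.get(src_mac))
            return sorted(targets)
        port = self.ports.get(dst_mac)
        return [port] if port is not None else []
```

The per-group membership map is the part that a table-less multiplexer would lack, which is the point of the "selective multicast" remark in the message.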
Fischer, Anna
2009-Aug-07 21:00 UTC
[Bridge] [evb] RE: [PATCH][RFC] net/bridge: add basic VEPA support
Hi Yaron,

Yes, I also believe that VEPA + SR-IOV can potentially, in some deployments, achieve better performance than a bridge/tap configuration, especially when you run multiple VMs and want to enable more sophisticated network processing in the data path.

If you do have an SR-IOV NIC that supports VEPA, then I would think that you no longer have QEMU or macvtap in the setup, simply because in that case the VM can directly access the VF on the physical device. That would be ideal.

I do think that the macvtap driver is a good addition as a simple and fast virtual network I/O interface, in case you do not need full bridge functionality. It does seem to assume, though, that the virtualization software uses QEMU/tap interfaces. How would this work with a Xen para-virtualized network interface? I guess there would need to be yet another driver?

Anna
Paul Congdon (UC Davis)
2009-Aug-07 21:06 UTC
[Bridge] [evb] RE: [PATCH][RFC] net/bridge: add basic VEPA support
Yaron,

> The interface multiplexing can be achieved using macvlan driver or using
> an SR-IOV capable NIC (the preferred option), macvlan may need to be
> extended to support VEPA multicast handling, this looks like a rather
> simple task

Agreed that the hardware solution is preferred, so the macvlan implementation doesn't really matter. If we are talking SR-IOV, then it is direct mapped, regardless of whether there is a VEB or VEPA in the hardware below, so you are bypassing the bridge software code as well.

I disagree that adding the multicast handling is simple. While not conceptually hard, it will basically require you to put an address table into the macvlan implementation; if you have that, then why not just use the one already in the bridge code? If you hook a VEPA up to a non-hairpin mode external bridge, you get the macvlan capability as well.

It also seems to me that the special macvlan interfaces for KVM don't apply to XEN or a non-virtualized environment. Or does more have to be written to make that work? If it is in the bridge code, you get all of this re-use.
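
The remark about hooking a VEPA up to a non-hairpin external bridge can be illustrated with a small Python sketch of the adjacent bridge's forwarding decision. It assumes a simple learning bridge with a per-port hairpin flag; `AdjacentBridgeSketch`, the port names, and the MAC addresses are hypothetical and are not taken from the patch or from any real switch.

```python
class AdjacentBridgeSketch:
    """Toy learning bridge with an optional per-port hairpin flag."""

    def __init__(self):
        self.fdb = {}      # learned MAC -> port it was last seen on
        self.hairpin = {}  # port -> may frames be sent back out this port?

    def add_port(self, port, hairpin=False):
        self.hairpin[port] = hairpin

    def forward(self, in_port, src_mac, dst_mac):
        self.fdb[src_mac] = in_port  # learn the source address
        known = self.fdb.get(dst_mac)
        # Unknown destinations are flooded to every port.
        candidates = [known] if known is not None else list(self.hairpin)
        # A standard bridge never sends a frame back out its ingress port;
        # hairpin (reflective relay) lifts that restriction for the port.
        return [p for p in candidates if p != in_port or self.hairpin[p]]


VM_A, VM_B = "02:00:00:00:00:0a", "02:00:00:00:00:0b"

# Both VMs sit behind the same VEPA, attached to bridge port "p1".
plain = AdjacentBridgeSketch()
plain.add_port("p1")
plain.add_port("p2")
plain.forward("p1", VM_A, VM_B)                   # learns VM_A on p1
blocked = plain.forward("p1", VM_B, VM_A)         # nothing is reflected

reflective = AdjacentBridgeSketch()
reflective.add_port("p1", hairpin=True)
reflective.add_port("p2")
reflective.forward("p1", VM_A, VM_B)
reflected = reflective.forward("p1", VM_B, VM_A)  # sent back out p1
```

Without hairpin, traffic between the two VMs never returns to the host, which matches the isolation described for macvlan earlier in the thread; with hairpin, the same frame is reflected back to the VEPA for local delivery.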
Benny Amorsen
2009-Aug-08 08:50 UTC
[Bridge] [evb] RE: [PATCH][RFC] net/bridge: add basic VEPA support
"Fischer, Anna" <anna.fischer at hp.com> writes:

> If you do have a SRIOV NIC that supports VEPA, then I would think that
> you do not have QEMU or macvtap in the setup any more though. Simply
> because in that case the VM can directly access the VF on the physical
> device. That would be ideal.

I'm just trying to understand how this all works, so I'm probably asking a stupid question:

Would a SRIOV NIC with VEPA support show up as multiple devices? I.e. would I get e.g. eth0-eth7 for a NIC with support for 8 virtual interfaces? Would they have different MAC addresses?

/Benny