Hi Crossbow team, I''m making some progress with the Crossbow/Clearview iptun merge, but am running into a problem receiving packets. When the first IP interface above a tunnel binds, a flow is created, and that flow is inserted into the flow hash table using the flow_l2_hash_fe() hashing function. When a packet is received, it attempts to match a flow by calling flow_l2_hash(). Both of these functions use the HASH_MAC_VID() hashing macro, which is passed the destination MAC address as its first argument. None of this is news to you, every l2 flow behaves this way. This HASH_MAC_VID() macro looks like this: #define HASH_MAC_VID(a, v, s) \ ((((uint32_t)(a)[3] + (a)[4] + (a)[5]) ^ (v)) % (s)) I''m sure you see the problem. It accesses data that may be off of the end of the MAC address is the MAC address is less than 6 bytes, and it also assumes that this 3-byte sample of the MAC address will represent an adequately variable sample. Perhaps this macro is the reason why there''s a check that MAC addresses must be greater than or equal to 6 bytes long... In the interest of removing this MAC address length limitation (because I have to to get iptun to work), this hashing algorithm needs to be more flexible and take into account the length of the MAC address. Perhaps the MAC-type plugins could provide their hashing function? Alternately, the hashing function could take the MAC address length into account and only use a sample of bytes within the actual address. Any thoughts on how you want to see this fixed? Thanks, -Seb
On Thu, 2009-01-08 at 11:40 -0500, Sebastien Roy wrote:> This HASH_MAC_VID() macro looks like this: > > #define HASH_MAC_VID(a, v, s) \ > ((((uint32_t)(a)[3] + (a)[4] + (a)[5]) ^ (v)) % (s))A followup question: Assuming that we keep this macro for Ethernet-based flow hashing (non-Ethernet flows have no business hashing using a VLAN-id anyway), wouldn''t it be simpler to treat the last 4 bytes of the Ethernet address as an integer, xor it with the VLAN id and be done with it? Why do the addition there? -Seb
On Thu, Jan 08, 2009 at 11:40:31AM -0500, Sebastien Roy wrote:> Hi Crossbow team, > > I''m making some progress with the Crossbow/Clearview iptun merge, but am > running into a problem receiving packets. > > When the first IP interface above a tunnel binds, a flow is created, and > that flow is inserted into the flow hash table using the > flow_l2_hash_fe() hashing function. When a packet is received, it > attempts to match a flow by calling flow_l2_hash(). Both of these > functions use the HASH_MAC_VID() hashing macro, which is passed the > destination MAC address as its first argument. None of this is news to > you, every l2 flow behaves this way. > > This HASH_MAC_VID() macro looks like this: > > #define HASH_MAC_VID(a, v, s) \ > ((((uint32_t)(a)[3] + (a)[4] + (a)[5]) ^ (v)) % (s)) > > I''m sure you see the problem. It accesses data that may be off of the > end of the MAC address is the MAC address is less than 6 bytes, and it > also assumes that this 3-byte sample of the MAC address will represent > an adequately variable sample. Perhaps this macro is the reason why > there''s a check that MAC addresses must be greater than or equal to 6 > bytes long... >sorry I forgot about this case when I said the address length doesn''t matter.> In the interest of removing this MAC address length limitation (because > I have to to get iptun to work), this hashing algorithm needs to be more > flexible and take into account the length of the MAC address. Perhaps > the MAC-type plugins could provide their hashing function? Alternately, > the hashing function could take the MAC address length into account and > only use a sample of bytes within the actual address. > > Any thoughts on how you want to see this fixed? >I would prefer to keep the existing macro and use it in flow_ether_hash(). for flow_l2_hash() and flow_l2_hash_fe(), we could do something like this: #define HASH_L2(a, v, l, s, h) { uint32_t i, sum = 0, nbytes = ((l) < 3) ? (l) : 3; \ for (i = (l) - nbytes; i < (l); i++) \ sum += (a)[i]; \ h = (sum ^ (v)) % (s); \ } static uint32_t flow_l2_hash(flow_tab_t *ft, flow_state_t *s) { flow_l2info_t *l2 = &s->fs_l2info; uint32_t h; HASH_l2(l2->l2_daddr, l2->l2_vid, l2->l2_addrlen, ft->ft_size, h); return (h); } note that the new l2_addrlen will have to be filled in at flow_l2_accept. regarding why we do a sum instead of using the last 4 bytes as an integer, I think the reason was we can''t be sure of the alignment of the mac address so we can''t just use *((uint32_t *)&l2->l2_daddr[2]) directly. eric
Hi Eric, On Thu, 2009-01-08 at 14:20 -0800, Eric Cheng wrote:> On Thu, Jan 08, 2009 at 11:40:31AM -0500, Sebastien Roy wrote: > > In the interest of removing this MAC address length limitation (because > > I have to to get iptun to work), this hashing algorithm needs to be more > > flexible and take into account the length of the MAC address. Perhaps > > the MAC-type plugins could provide their hashing function? Alternately, > > the hashing function could take the MAC address length into account and > > only use a sample of bytes within the actual address. > > > > Any thoughts on how you want to see this fixed? > > > > I would prefer to keep the existing macro and use it in flow_ether_hash(). > for flow_l2_hash() and flow_l2_hash_fe(), we could do something like this:... That''s what I ended up doing. See the changes in mac_flow.c in the clearview webrev below. Note that I had to implement a flow_ether_hash_fe() in order to keep the hash functions the same in both directions for Ethernet (i.e., you can''t have flow_ether_hash() use one hash function, but have Ethernet use flow_l2_hash_fe() which uses a different function): http://clearview.east/ws/clearview/clearview/webrev/> #define HASH_L2(a, v, l, s, h) { > uint32_t i, sum = 0, nbytes = ((l) < 3) ? (l) : 3; \ > for (i = (l) - nbytes; i < (l); i++) \ > sum += (a)[i]; \ > h = (sum ^ (v)) % (s); \ > }I ended up doing something similar, except using a function instead of a macro (I guess I could use a macro if need be, but I''m not convinced that the macro helps performance given compiler optimization). Note that there is no point in hashing using a VLAN id, since only Ethernet can have VLANs (right?), and it has it''s own specialized hashing function. In any case, please check out what I came up with, and I''ll be happy to apply any changes based on feedback you may have.> note that the new l2_addrlen will have to be filled in at > flow_l2_accept.We have the MAC address length in the flow table itself, since it''s in the mac_impl_t''s mi_type->mt_address_length.> regarding why we do a sum instead of using the last 4 bytes as an integer, > I think the reason was we can''t be sure of the alignment of the mac address > so we can''t just use *((uint32_t *)&l2->l2_daddr[2]) directly.Okay, that makes sense. Thanks, -Seb
On Thu, Jan 08, 2009 at 05:49:26PM -0500, Sebastien Roy wrote:> Hi Eric, > > On Thu, 2009-01-08 at 14:20 -0800, Eric Cheng wrote: > > On Thu, Jan 08, 2009 at 11:40:31AM -0500, Sebastien Roy wrote: > > > In the interest of removing this MAC address length limitation (because > > > I have to to get iptun to work), this hashing algorithm needs to be more > > > flexible and take into account the length of the MAC address. Perhaps > > > the MAC-type plugins could provide their hashing function? Alternately, > > > the hashing function could take the MAC address length into account and > > > only use a sample of bytes within the actual address. > > > > > > Any thoughts on how you want to see this fixed? > > > > > > > I would prefer to keep the existing macro and use it in flow_ether_hash(). > > for flow_l2_hash() and flow_l2_hash_fe(), we could do something like this: > ... > > That''s what I ended up doing. See the changes in mac_flow.c in the > clearview webrev below. Note that I had to implement a > flow_ether_hash_fe() in order to keep the hash functions the same in > both directions for Ethernet (i.e., you can''t have flow_ether_hash() use > one hash function, but have Ethernet use flow_l2_hash_fe() which uses a > different function): > > http://clearview.east/ws/clearview/clearview/webrev/ >I''m fine with your changes. thanks for fixing this. eric> > #define HASH_L2(a, v, l, s, h) { > > uint32_t i, sum = 0, nbytes = ((l) < 3) ? (l) : 3; \ > > for (i = (l) - nbytes; i < (l); i++) \ > > sum += (a)[i]; \ > > h = (sum ^ (v)) % (s); \ > > } > > I ended up doing something similar, except using a function instead of a > macro (I guess I could use a macro if need be, but I''m not convinced > that the macro helps performance given compiler optimization). Note > that there is no point in hashing using a VLAN id, since only Ethernet > can have VLANs (right?), and it has it''s own specialized hashing > function. In any case, please check out what I came up with, and I''ll > be happy to apply any changes based on feedback you may have. > > > note that the new l2_addrlen will have to be filled in at > > flow_l2_accept. > > We have the MAC address length in the flow table itself, since it''s in > the mac_impl_t''s mi_type->mt_address_length. > > > regarding why we do a sum instead of using the last 4 bytes as an integer, > > I think the reason was we can''t be sure of the alignment of the mac address > > so we can''t just use *((uint32_t *)&l2->l2_daddr[2]) directly. > > Okay, that makes sense. > > Thanks, > -Seb > >