Jason Wang
2013-Dec-27 03:04 UTC
[PATCH net-next 3/3] net: auto-tune mergeable rx buffer size for improved performance
On 12/27/2013 04:06 AM, Michael Dalton wrote:
> On Mon, Dec 23, 2013 at 4:51 AM, Michael S. Tsirkin <mst at redhat.com> wrote:
>> OK so a high level benchmark shows it's worth it,
>> but how well does the logic work?
>> I think we should make the buffer size accessible in sysfs
>> or debugfs, and look at it, otherwise we don't really know.
>
> Exporting the size sounds good to me, it is definitely an
> important metric and would give more visibility to the admin.
>
> Do you have a preference for implementation strategy? I was
> thinking just add a DEVICE_ATTR to create a read-only sysfs file,
> 'mergeable_rx_buffer_size', and return a space-separated list of the
> current buffer size (computed from the average packet size) for each
> receive queue. -EINVAL or a similar error could be returned if the
> netdev was not configured for mergeable rx buffers.
>
>> I don't get the real motivation for this.
>>
>> We have skbs A,B,C sharing a page, with chunk D being unused.
>> This randomly charges chunk D to an skb that ended up last
>> in the page.
>> Correct?
>> Why does this make sense?
>
> The intent of this code is to adjust the SKB truesize for
> the packet. We should completely use each packet buffer except
> for the last buffer. For all buffers except the last buffer, it
> should be the case that 'len' (bytes received) = buffer size. For
> the last buffer, this code adjusts the truesize by comparing the
> approximated buffer size with the bytes received into the buffer,
> and adding the difference to the SKB truesize if the buffer size
> is greater than the number of bytes received.

But it looks like this does not take the 'hole' into account, which may
cause underestimation of truesize.

> We approximate the buffer size by using the last packet buffer size
> from that same page, which as you have correctly noted may be a buffer
> that belongs to a different packet on the same virtio-net device.
> This buffer size should be very close to the actual buffer size because
> our EWMA estimator uses a high weight (so the packet buffer size changes
> very slowly) and there are only a handful of packets on a page (even
> order-3).

We can make this more accurate by using an extra data structure to track
the real buf size and using it as the token.

>> Why head_skb only? Why not full buffer size that comes from host?
>> This is simply len.
>
> Sorry, I believe this code fragment should be clearer. Basically, we
> have a corner case in that for packets with size <= GOOD_COPY_LEN, there
> are no frags because page_to_skb() already unref'd the page and the entire
> packet contents are copied to skb->data. In this case, the SKB truesize
> is already accurate and should not be updated (and it would be unsafe to
> access page->private as the page is already unref'd).
>
> I'll look at the above code again and clean up (please let me know if you
> have a preference) and/or add a comment to clarify.
>
> Best,
>
> Mike
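[Editor's note: the EWMA estimation and truesize adjustment discussed in this exchange can be sketched roughly as below. All names, the EWMA weight, and the scaling scheme here are illustrative only; the actual implementation lives in the drivers/net/virtio_net.c patch under review.]

```c
#include <stddef.h>

/* Illustrative EWMA weight; a high weight makes the estimate move slowly,
 * as described in the discussion above. */
#define EWMA_WEIGHT 64UL

/* The running average is kept scaled by EWMA_WEIGHT for integer precision. */
static unsigned long ewma_add(unsigned long scaled_avg, unsigned long sample)
{
    if (scaled_avg == 0)
        return sample * EWMA_WEIGHT;   /* first sample seeds the average */
    /* new = old + (sample - old) / weight, in scaled form */
    return scaled_avg - scaled_avg / EWMA_WEIGHT + sample;
}

static unsigned long ewma_read(unsigned long scaled_avg)
{
    return scaled_avg / EWMA_WEIGHT;
}

/* For the last buffer on a page: if the estimated buffer size exceeds the
 * bytes actually received, charge the unused tail to the skb's truesize.
 * As Jason notes above, any 'hole' left at the end of the page is not
 * accounted for here, which can under-estimate truesize. */
static unsigned int adjust_truesize(unsigned int truesize, unsigned int len,
                                    unsigned int est_buf_size)
{
    if (est_buf_size > len)
        truesize += est_buf_size - len;
    return truesize;
}
```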
Michael Dalton
2013-Dec-27 21:41 UTC
[PATCH net-next 3/3] net: auto-tune mergeable rx buffer size for improved performance
I'm working on a followup patchset to address current feedback. I think
it will be cleaner to do a debugfs implementation for per-receive queue
packet buffer size exporting, so I'm trying that out.

On Thu, Dec 26, 2013 at 7:04 PM, Jason Wang <jasowang at redhat.com> wrote:
> We can make this more accurate by using extra data structure to track
> the real buf size and using it as token.

I agree -- we can do precise buffer total len tracking. Something like

struct mergeable_packet_buffer_ctx {
    void *buf;
    unsigned int total_len;
};

Each receive queue could have a pointer to an array of N buffer contexts,
where N is the queue size (kzalloc'd in init_vqs or similar). That would
allow us to allocate all of our buffer context data at startup.

Would this be preferred to the current approach, or is there another
approach you would prefer? All other things being equal, having precise
length tracking is advantageous, so I'm inclined to try this out and see
how it goes.

I think this is a big design point - for example, if we have an extra
buffer context structure, then per-receive queue frag allocators are not
required for auto-tuning and we can reduce the number of patches in this
patchset.

I'm happy to implement either way. Thanks!

Best,

Mike
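[Editor's note: the per-queue context-array idea proposed above might look roughly like the sketch below. The allocation helper and queue struct names are hypothetical; in the driver this allocation would happen with kzalloc() in init_vqs() or similar, as Mike says.]

```c
#include <stdlib.h>

struct mergeable_packet_buffer_ctx {
    void *buf;               /* virtqueue token: the buffer itself */
    unsigned int total_len;  /* exact allocated length of the buffer */
};

struct receive_queue_sketch {
    struct mergeable_packet_buffer_ctx *ctx; /* one entry per vq slot */
    unsigned int vq_size;
};

/* Allocate all buffer contexts up front, one per virtqueue entry, so no
 * allocations are needed on the receive path. */
static int alloc_buf_ctx(struct receive_queue_sketch *rq, unsigned int vq_size)
{
    rq->ctx = calloc(vq_size, sizeof(*rq->ctx)); /* kzalloc() in-kernel */
    if (!rq->ctx)
        return -1;           /* -ENOMEM in the kernel */
    rq->vq_size = vq_size;
    return 0;
}
```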
Jason Wang
2013-Dec-30 04:50 UTC
[PATCH net-next 3/3] net: auto-tune mergeable rx buffer size for improved performance
On 12/28/2013 05:41 AM, Michael Dalton wrote:
> I'm working on a followup patchset to address current feedback. I think
> it will be cleaner to do a debugfs implementation for per-receive queue
> packet buffer size exporting, so I'm trying that out.
>
> On Thu, Dec 26, 2013 at 7:04 PM, Jason Wang <jasowang at redhat.com> wrote:
>> We can make this more accurate by using extra data structure to track
>> the real buf size and using it as token.
>
> I agree -- we can do precise buffer total len tracking. Something like
>
> struct mergeable_packet_buffer_ctx {
>     void *buf;
>     unsigned int total_len;
> };
>
> Each receive queue could have a pointer to an array of N buffer contexts,
> where N is queue size (kzalloc'd in init_vqs or similar). That would
> allow us to allocate all of our buffer context data at startup.
>
> Would this be preferred to the current approach or is there another
> approach you would prefer? All other things being equal, having precise
> length tracking is advantageous, so I'm inclined to try this out and
> see how it goes.

I think this is better since it is accurate and easier to implement and
understand. Thanks.

> I think this is a big design point - for example, if we have an extra
> buffer context structure, then per-receive queue frag allocators are not
> required for auto-tuning and we can reduce the number of patches in
> this patchset.
>
> I'm happy to implement either way. Thanks!
>
> Best,
>
> Mike
Jason Wang
2013-Dec-30 05:38 UTC
[PATCH net-next 3/3] net: auto-tune mergeable rx buffer size for improved performance
On 12/28/2013 05:41 AM, Michael Dalton wrote:
> I'm working on a followup patchset to address current feedback. I think
> it will be cleaner to do a debugfs implementation for per-receive queue
> packet buffer size exporting, so I'm trying that out.
>
> On Thu, Dec 26, 2013 at 7:04 PM, Jason Wang <jasowang at redhat.com> wrote:
>> We can make this more accurate by using extra data structure to track
>> the real buf size and using it as token.
>
> I agree -- we can do precise buffer total len tracking. Something like
>
> struct mergeable_packet_buffer_ctx {
>     void *buf;
>     unsigned int total_len;
> };
>
> Each receive queue could have a pointer to an array of N buffer contexts,
> where N is queue size (kzalloc'd in init_vqs or similar). That would
> allow us to allocate all of our buffer context data at startup.
>
> Would this be preferred to the current approach or is there another
> approach you would prefer? All other things being equal, having precise
> length tracking is advantageous, so I'm inclined to try this out and
> see how it goes.
>
> I think this is a big design point - for example, if we have an extra
> buffer context structure, then per-receive queue frag allocators are not
> required for auto-tuning and we can reduce the number of patches in
> this patchset.

Not required, but better to keep it. Considering we may have multiple
virtio-net cards, using per-receive queue frags may increase the
possibility of coalescing.

> I'm happy to implement either way. Thanks!
>
> Best,
>
> Mike
Michael S. Tsirkin
2014-Jan-08 17:37 UTC
[PATCH net-next 3/3] net: auto-tune mergeable rx buffer size for improved performance
On Fri, Dec 27, 2013 at 01:41:28PM -0800, Michael Dalton wrote:
> I'm working on a followup patchset to address current feedback. I think
> it will be cleaner to do a debugfs implementation for per-receive queue
> packet buffer size exporting, so I'm trying that out.
>
> On Thu, Dec 26, 2013 at 7:04 PM, Jason Wang <jasowang at redhat.com> wrote:
>> We can make this more accurate by using extra data structure to track
>> the real buf size and using it as token.
>
> I agree -- we can do precise buffer total len tracking. Something like
>
> struct mergeable_packet_buffer_ctx {
>     void *buf;
>     unsigned int total_len;

Maybe make total_len long so the struct size is a power of 2.

> };

Hmm, this doubles the VQ cache footprint. In the past, when I tried
increasing the cache footprint, it hurt performance measurably. It's just
a suggestion though, YMMV - if the numbers are good we don't need to
argue about this.

> Each receive queue could have a pointer to an array of N buffer contexts,
> where N is queue size (kzalloc'd in init_vqs or similar). That would
> allow us to allocate all of our buffer context data at startup.
>
> Would this be preferred to the current approach or is there another
> approach you would prefer? All other things being equal, having precise
> length tracking is advantageous, so I'm inclined to try this out and
> see how it goes.
>
> I think this is a big design point - for example, if we have an extra
> buffer context structure, then per-receive queue frag allocators are not
> required for auto-tuning and we can reduce the number of patches in
> this patchset.

I'd be careful with adding even more stuff to mergeable_packet_buffer_ctx,
for the above reason.

> I'm happy to implement either way. Thanks!
>
> Best,
>
> Mike
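[Editor's note: the sizing suggestion above can be sketched as follows. With a long total_len, the struct is pointer size plus long size, which is a power of 2 on common 32-bit and LP64 ABIs; the compile-time check below is a userspace stand-in for the kernel's BUILD_BUG_ON.]

```c
#include <stddef.h>

struct mergeable_packet_buffer_ctx {
    void *buf;
    unsigned long total_len;  /* 'long' per the suggestion above */
};

/* Poor man's BUILD_BUG_ON: the array has negative size (a compile error)
 * unless sizeof(struct mergeable_packet_buffer_ctx) is a power of 2. */
typedef char ctx_size_is_pow2[
    (sizeof(struct mergeable_packet_buffer_ctx) &
     (sizeof(struct mergeable_packet_buffer_ctx) - 1)) == 0 ? 1 : -1];
```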