Rusty Russell
2008-Oct-09 00:55 UTC
[PATCH 2/2] virtio_net: Improve the recv buffer allocation scheme
On Thursday 09 October 2008 06:34:59 Mark McLoughlin wrote:
> From: Herbert Xu <herbert.xu at redhat.com>
>
> If segmentation offload is enabled by the host, we currently allocate
> maximum sized packet buffers and pass them to the host. This uses up
> 20 ring entries, allowing us to supply only 20 packet buffers to the
> host with a 256 entry ring. This is a huge overhead when receiving
> small packets, and is most keenly felt when receiving MTU sized
> packets from off-host.

Hi Mark!

There are three approaches we should investigate before adding YA feature.

Obviously, we can simply increase the number of ring entries.

Secondly, we can put the virtio_net_hdr at the head of the skb data (this
is also worth considering for xmit I think if we have headroom) and drop
MAX_SKB_FRAGS which contains a gratuitous +2.

Thirdly, we can try to coalesce contiguous buffers. The page caching scheme
we have might help here, I don't know. Maybe we should be explicitly trying
to allocate higher orders.

Now, that said, we might need this anyway. But let's try the easy things
first? (Or as well...)

> The size of the logical buffer is
> returned to the guest rather than the size of the individual smaller
> buffers.

That's a virtio transport breakage: can you use the standard virtio
mechanism, and just put the extended length or number of extra buffers
inside the virtio_net_hdr? That makes more sense to me.

> Make use of this support by supplying single page receive buffers to
> the host. On receive, we extract the virtio_net_hdr, copy 128 bytes of
> the payload to the skb's linear data buffer and adjust the fragment
> offset to point to the remaining data. This ensures proper alignment
> and allows us to not use any paged data for small packets. If the
> payload occupies multiple pages, we simply append those pages as
> fragments and free the associated skbs.

> +	char *p = page_address(skb_shinfo(skb)->frags[0].page);
...
> +	memcpy(hdr, p, sizeof(*hdr));
> +	p += sizeof(*hdr);

I think you need kmap_atomic() here to access the page. And yes, that will
affect performance :(

A few more comments moved from the patch header into the source wouldn't go
astray, but I'm happy to do that myself (it's been on my TODO for a while).

Thanks!
Rusty.
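For illustration only, the "number of extra buffers inside the virtio_net_hdr"
idea could be expressed as a small extension of the existing receive header.
The struct and field names below are hypothetical and not part of the posted
patch:

	/* Hypothetical extension of the receive header: the host tells the
	 * guest how many descriptors make up one logical packet, instead of
	 * returning a logical length that spans several buffers. */
	struct virtio_net_hdr_mrg {
		struct virtio_net_hdr hdr;	/* existing per-packet header */
		__u16 num_buffers;		/* descriptors used for this packet */
	};

On receive the guest would read num_buffers, pull that many further entries
off the ring, and chain them onto the skb as fragments, rather than relying
on a single logical length that covers multiple buffers.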
Herbert Xu
2008-Oct-09 15:30 UTC
[PATCH 2/2] virtio_net: Improve the recv buffer allocation scheme
On Thu, Oct 09, 2008 at 11:55:59AM +1100, Rusty Russell wrote:
>
> There are three approaches we should investigate before adding YA feature.
> Obviously, we can simply increase the number of ring entries.

That's not going to work so well, as you need to increase the ring size by
a factor of MAX_SKB_FRAGS to achieve the same effect. Basically the current
scheme is either going to suck at non-TSO traffic or it's going to chew up
too many resources.

> Secondly, we can put the virtio_net_hdr at the head of the skb data (this is
> also worth considering for xmit I think if we have headroom) and drop
> MAX_SKB_FRAGS which contains a gratuitous +2.

That's fine, but having skb->data in the ring still means two different
kinds of memory in there, and it sucks when you only have 1500-byte packets.

> Thirdly, we can try to coalesce contiguous buffers. The page caching scheme
> we have might help here, I don't know. Maybe we should be explicitly trying
> to allocate higher orders.

That's not really the key problem here. The problem is that the scheme
we're currently using in virtio-net is simply broken when it comes to
1500-byte packets: most of the entries in the ring buffer go to waste.

We need a scheme that handles 1500-byte packets as well as 64K-byte ones,
without holding down 16M of memory per guest.

> > The size of the logical buffer is
> > returned to the guest rather than the size of the individual smaller
> > buffers.
>
> That's a virtio transport breakage: can you use the standard virtio
> mechanism, and just put the extended length or number of extra buffers
> inside the virtio_net_hdr?

Sure, that sounds reasonable.

> > Make use of this support by supplying single page receive buffers to
> > the host. On receive, we extract the virtio_net_hdr, copy 128 bytes of
> > the payload to the skb's linear data buffer and adjust the fragment
> > offset to point to the remaining data. This ensures proper alignment
> > and allows us to not use any paged data for small packets. If the
> > payload occupies multiple pages, we simply append those pages as
> > fragments and free the associated skbs.
>
> > +	char *p = page_address(skb_shinfo(skb)->frags[0].page);
> ...
> > +	memcpy(hdr, p, sizeof(*hdr));
> > +	p += sizeof(*hdr);
>
> I think you need kmap_atomic() here to access the page. And yes, that will
> affect performance :(

No, we don't. kmap would only be necessary for highmem, which we did not
request.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert at gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
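A minimal sketch of the kmap_atomic() point, assuming the receive pages are
allocated without __GFP_HIGHMEM as in the posted patch: a lowmem page always
has a kernel virtual address, so page_address() can be used directly; only a
highmem page would need a temporary atomic mapping. The helper names here are
illustrative, and the km_type slot is an assumption for softirq receive
context:

	#include <linux/gfp.h>
	#include <linux/highmem.h>
	#include <linux/string.h>

	/* Lowmem page (no __GFP_HIGHMEM): permanently mapped, so its
	 * address can be dereferenced directly, as the patch does. */
	static void copy_hdr_lowmem(struct page *page, void *hdr, size_t len)
	{
		memcpy(hdr, page_address(page), len);
	}

	/* A highmem page would need kmap_atomic() before dereferencing,
	 * which is the cost Rusty was worried about. */
	static void copy_hdr_highmem(struct page *page, void *hdr, size_t len)
	{
		void *va = kmap_atomic(page, KM_SKB_DATA_SOFTIRQ);

		memcpy(hdr, va, len);
		kunmap_atomic(va, KM_SKB_DATA_SOFTIRQ);
	}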
Chris Wright
2008-Oct-09 15:35 UTC
[PATCH 2/2] virtio_net: Improve the recv buffer allocation scheme
* Rusty Russell (rusty at rustcorp.com.au) wrote:
> On Thursday 09 October 2008 06:34:59 Mark McLoughlin wrote:
> > From: Herbert Xu <herbert.xu at redhat.com>
> >
> > If segmentation offload is enabled by the host, we currently allocate
> > maximum sized packet buffers and pass them to the host. This uses up
> > 20 ring entries, allowing us to supply only 20 packet buffers to the
> > host with a 256 entry ring. This is a huge overhead when receiving
> > small packets, and is most keenly felt when receiving MTU sized
> > packets from off-host.
>
> There are three approaches we should investigate before adding YA feature.
> Obviously, we can simply increase the number of ring entries.

Tried that; it didn't help much. I don't have my numbers handy, but it
levelled off at about 512 entries and the boost was modest. It's still
wasteful to preallocate like that on the off-chance it's a large packet.

thanks,
-chris