On Thu, Apr 20, 2023 at 03:59:39PM +0200, Alexander Lobakin wrote:
> Hmm, currently almost all Ethernet drivers map Rx pages once and then
> just recycle them, keeping the original DMA mapping. Which means pages
> can have the same first mapping for a very long time, often even for the
> lifetime of the struct device. Same for XDP sockets, the lifetime of DMA
> mappings equals the lifetime of sockets.
> Does it mean we'd better review that approach and try switching to
> dma_alloc_*() family (non-coherent/caching in our case)?
Yes, exactly. dma_alloc_noncoherent() can be used just like alloc_pages()
+ dma_map_*() by the driver (including the dma_sync_*() calls on reuse), but
has a huge number of advantages.
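
To make the pattern concrete, here is a rough sketch of an Rx buffer that
is allocated once with dma_alloc_noncoherent() and then only synced on
each reuse. struct rx_buf, RX_BUF_SIZE and the rx_buf_*() helpers are
made-up names for illustration and are not taken from any real driver.
The #else branch contains userspace stand-ins, so that the sketch
compiles outside a kernel tree; they are not the kernel implementation.

```c
#ifdef __KERNEL__
#include <linux/dma-mapping.h>
#include <linux/errno.h>
#else
/* Userspace stand-ins (assumption: only here so the sketch builds alone). */
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
typedef uint64_t dma_addr_t;
typedef unsigned int gfp_t;
#define GFP_KERNEL 0u
struct device { int unused; };
enum dma_data_direction { DMA_BIDIRECTIONAL, DMA_TO_DEVICE, DMA_FROM_DEVICE };

static void *dma_alloc_noncoherent(struct device *dev, size_t size,
				   dma_addr_t *handle,
				   enum dma_data_direction dir, gfp_t gfp)
{
	void *p = malloc(size);

	(void)dev; (void)dir; (void)gfp;
	if (p)
		*handle = (dma_addr_t)(uintptr_t)p;
	return p;
}

static void dma_free_noncoherent(struct device *dev, size_t size, void *vaddr,
				 dma_addr_t handle, enum dma_data_direction dir)
{
	(void)dev; (void)size; (void)handle; (void)dir;
	free(vaddr);
}

static void dma_sync_single_for_cpu(struct device *dev, dma_addr_t addr,
				    size_t size, enum dma_data_direction dir)
{
	(void)dev; (void)addr; (void)size; (void)dir;
}

static void dma_sync_single_for_device(struct device *dev, dma_addr_t addr,
				       size_t size, enum dma_data_direction dir)
{
	(void)dev; (void)addr; (void)size; (void)dir;
}
#endif

#define RX_BUF_SIZE 4096	/* hypothetical buffer size */

struct rx_buf {			/* hypothetical per-buffer state */
	void *vaddr;		/* CPU address of the buffer */
	dma_addr_t dma;		/* device address, valid until freed */
};

/* Allocate once; the DMA address stays valid for the buffer's lifetime. */
static int rx_buf_alloc(struct device *dev, struct rx_buf *buf)
{
	buf->vaddr = dma_alloc_noncoherent(dev, RX_BUF_SIZE, &buf->dma,
					   DMA_FROM_DEVICE, GFP_KERNEL);
	return buf->vaddr ? 0 : -ENOMEM;
}

/* The device wrote a frame of len bytes; make it visible to the CPU. */
static void *rx_buf_claim(struct device *dev, struct rx_buf *buf, size_t len)
{
	dma_sync_single_for_cpu(dev, buf->dma, len, DMA_FROM_DEVICE);
	return buf->vaddr;
}

/* Give the same buffer back to the device; no unmap/remap is needed. */
static dma_addr_t rx_buf_recycle(struct device *dev, struct rx_buf *buf)
{
	dma_sync_single_for_device(dev, buf->dma, RX_BUF_SIZE,
				   DMA_FROM_DEVICE);
	return buf->dma;
}

/* Release the buffer when the ring is torn down. */
static void rx_buf_free(struct device *dev, struct rx_buf *buf)
{
	dma_free_noncoherent(dev, RX_BUF_SIZE, buf->vaddr, buf->dma,
			     DMA_FROM_DEVICE);
	buf->vaddr = NULL;
}
```

A driver following this sketch would call rx_buf_alloc() when filling the
ring, rx_buf_claim() and rx_buf_recycle() around each received frame, and
rx_buf_free() only at teardown.
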
> Also, I remember I tried to do that for one of my drivers, but the thing
> that all those functions zero the whole page(s) before returning them to
> the driver ruins the performance -- we don't need to zero buffers for
> receiving packets and spend a ton of cycles on it (esp. in cases when 4k
> gets zeroed each time, but your main body of traffic is 64-byte frames).
Hmm, the single zeroing when doing the initial allocation shows up
in these profiles?