Dan Magenheimer
2009-May-06 00:37 UTC
[Xen-devel] vnif socket buffer mistaken for pagetable page causes major performance problem
A recent posting reminded me of this and, though the information is a bit vague, someone familiar with the shadow code might know just where to look to fix this, hopefully in time for 3.4 (if it's not already fixed).

One of our performance experts discovered a strange major network performance hit seen only under certain circumstances, IIRC mostly in HVM but sometimes in PV when migrating.

With some help from Intel, it was determined that the heuristics used by shadow paging to determine whether a guest is modifying a pagetable page were getting fooled by the access pattern used by the code that copies data into a newly allocated socket buffer. As a result, many unnecessary vmenter/vmexits were happening.

The workaround was to preallocate all socket buffer memory.

That's all I've got, but we can try to answer questions if this isn't already a known fixed problem.

Dan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Gianluca Guida
2009-Jun-08 09:11 UTC
Re: [Xen-devel] vnif socket buffer mistaken for pagetable page causes major performance problem
Hi,

Sorry for the late reply,

Dan Magenheimer wrote:
> A recent posting reminded me of this and, though the
> information is a bit vague, someone familiar with the
> shadow code might know just where to look to fix this,
> hopefully in time for 3.4 (if it's not already fixed).
>
> One of our performance experts discovered a strange
> major network performance hit seen only under certain
> circumstances, IIRC mostly in HVM but sometimes in
> PV when migrating.
>
> With some help from Intel, it was determined that
> the heuristics used by shadow paging to determine
> whether a guest is modifying a pagetable page were
> getting fooled by the access pattern used by the
> code that copies data into a newly allocated socket
> buffer. As a result, many unnecessary vmenter/vmexits
> were happening.
>
> The workaround was to preallocate all socket buffer
> memory.
>
> That's all I've got, but we can try to answer questions
> if this isn't already a known fixed problem.

Can you be a little more specific about what is the performance loss, in what workload, and what heuristic you found to be wrong in this case?

Thanks,
Gianluca
Herbert van den Bergh
2009-Jun-08 16:22 UTC
Re: [Xen-devel] vnif socket buffer mistaken for pagetable page causes major performance problem
Gianluca Guida wrote:
> Hi,
>
> Sorry for the late reply,
>
> Dan Magenheimer wrote:
>> A recent posting reminded me of this and, though the
>> information is a bit vague, someone familiar with the
>> shadow code might know just where to look to fix this,
>> hopefully in time for 3.4 (if it's not already fixed).
>>
>> One of our performance experts discovered a strange
>> major network performance hit seen only under certain
>> circumstances, IIRC mostly in HVM but sometimes in
>> PV when migrating.
>>
>> With some help from Intel, it was determined that
>> the heuristics used by shadow paging to determine
>> whether a guest is modifying a pagetable page were
>> getting fooled by the access pattern used by the
>> code that copies data into a newly allocated socket
>> buffer. As a result, many unnecessary vmenter/vmexits
>> were happening.
>>
>> The workaround was to preallocate all socket buffer
>> memory.
>>
>> That's all I've got, but we can try to answer questions
>> if this isn't already a known fixed problem.
>
> Can you be a little more specific about what is the performance loss,

Network throughput was reduced to 10% of normal.

> in what workload,

A network send throughput test using netperf.

> and what heuristic you found to be wrong in this case?

The access pattern to the memory page that was recognized as a pagetable access was a regular memcpy doing 4 byte aligned writes into a newly allocated page. I'm not that familiar with the shadow pagetable code, so I don't know how "wrong" this is, just that it caused a false positive on this type of memory access.

Thanks,
Herbert.
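For a sense of scale, here is a toy model (Python, not Xen code) of why such a copy becomes expensive if the destination page is write-protected as a suspected pagetable. It assumes the worst case, in which every 4-byte store traps into the hypervisor. The 4 KiB page size and the function name are assumptions made for illustration, and real shadow code has heuristics that may stop shadowing a page well before this point.

```python
PAGE_SIZE = 4096   # assumed x86 page size in bytes
WRITE_SIZE = 4     # store width of the memcpy described in the thread


def traps_for_copy(nbytes, page_is_shadowed, write_size=WRITE_SIZE):
    """Count write traps taken while copying nbytes into a single page.

    Toy model: a page treated as a pagetable is write-protected, so
    every store traps into the hypervisor. Stores to an ordinary page
    go straight to memory and cause no trap.
    """
    if nbytes < 0 or nbytes > PAGE_SIZE:
        raise ValueError("copy must fit in a single page")
    if not page_is_shadowed:
        return 0
    # One trap per store, rounding up for a partial final store.
    return (nbytes + write_size - 1) // write_size


if __name__ == "__main__":
    print(traps_for_copy(PAGE_SIZE, page_is_shadowed=False))  # 0
    print(traps_for_copy(PAGE_SIZE, page_is_shadowed=True))   # 1024
```

Under these assumptions a full-page copy costs up to 1024 traps instead of none, which is at least in line with the order-of-magnitude throughput drop reported above.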
Gianluca Guida
2009-Jun-09 13:25 UTC
Re: [Xen-devel] vnif socket buffer mistaken for pagetable page causes major performance problem
Herbert van den Bergh wrote:
>> Can you be a little more specific about what is the performance loss,
> Network throughput was reduced to 10% of normal.
>> in what workload,
> A network send throughput test using netperf.

This is with a Linux guest (HVM of course), right?

>> and what heuristic you found to be wrong in this case?
> The access pattern to the memory page that was recognized as a pagetable
> access was a regular memcpy doing 4 byte aligned writes into a newly
> allocated page. I'm not that familiar with the shadow pagetable code,
> so I don't know how "wrong" this is, just that it caused a false
> positive on this type of memory access.

Most probably the guest OS is recycling a page that used to be a pagetable and that is still shadowed.

Thanks,
Gianluca
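The recycling hypothesis can be illustrated with a toy model (Python, not Xen or Linux code). Everything here is an assumption for illustration: the class and function names, the LIFO free list, and the simplification that a stale shadow is never dropped. The thread does not confirm that this is the actual cause, nor that it is why preallocating socket buffers helped. The sketch only shows how the two would fit together.

```python
class ToyGuest:
    """Toy guest page allocator plus the hypervisor's set of shadowed frames."""

    def __init__(self, num_pages):
        self.free = list(range(num_pages))  # free page frame numbers
        self.shadowed = set()               # frames still write-protected

    def alloc(self):
        # LIFO: the most recently freed frame is handed out first.
        return self.free.pop()

    def free_page(self, pfn):
        # The guest frees the frame, but the hypervisor is not told,
        # so any stale shadow stays in place (simplifying assumption).
        self.free.append(pfn)

    def use_as_pagetable(self, pfn):
        self.shadowed.add(pfn)

    def write_traps(self, pfn):
        return pfn in self.shadowed


def send_packets(guest, count, pool=None):
    """Return how many packet buffers landed on a still-shadowed frame."""
    slow = 0
    for i in range(count):
        if pool is not None:
            pfn = pool[i % len(pool)]   # reuse a preallocated buffer
        else:
            pfn = guest.alloc()         # fresh allocation for each packet
        if guest.write_traps(pfn):
            slow += 1
        if pool is None:
            guest.free_page(pfn)
    return slow


def demo():
    # Case 1: a pagetable frame is freed, then recycled for every packet.
    g = ToyGuest(8)
    pt = g.alloc()
    g.use_as_pagetable(pt)
    g.free_page(pt)
    slow_fresh = send_packets(g, 10)

    # Case 2: buffers are preallocated before any pagetable is freed.
    g2 = ToyGuest(8)
    pool = [g2.alloc(), g2.alloc()]
    pt2 = g2.alloc()
    g2.use_as_pagetable(pt2)
    g2.free_page(pt2)
    slow_pooled = send_packets(g2, 10, pool)
    return slow_fresh, slow_pooled


if __name__ == "__main__":
    print(demo())  # (10, 0)
```

In this model every freshly allocated buffer lands on the stale pagetable frame, while the preallocated pool never touches it. That is one possible reading of why the preallocation workaround would mask the problem.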
Herbert van den Bergh
2009-Jun-09 15:22 UTC
Re: [Xen-devel] vnif socket buffer mistaken for pagetable page causes major performance problem
Gianluca Guida wrote:
> Herbert van den Bergh wrote:
>>> Can you be a little more specific about what is the performance loss,
>> Network throughput was reduced to 10% of normal.
>>> in what workload,
>> A network send throughput test using netperf.
>
> This is with a Linux guest (HVM of course), right?

Right.

>>> and what heuristic you found to be wrong in this case?
>> The access pattern to the memory page that was recognized as a
>> pagetable access was a regular memcpy doing 4 byte aligned writes
>> into a newly allocated page. I'm not that familiar with the shadow
>> pagetable code, so I don't know how "wrong" this is, just that it
>> caused a false positive on this type of memory access.
>
> Most probably the guest OS is recycling a page that used to be a
> pagetable and that is still shadowed.
>
> Thanks,
> Gianluca