I'd like reviewers for:

  6721983 High CPU utilization @dom0 when doing Tx at one guest domain
          due to a hot lock in the xnb driver
  6730667 High CPU utilization @domU when doing Rx at one guest domain
          due to massive calling of mac_tx_update

The webrev is at http://dme.org/solaris/webrev/on-bugfix.hg.
On 08/18/08 08:06 AM, David Edmondson wrote:
> I'd like reviewers for:
>
>   6721983 High CPU utilization @dom0 when doing Tx at one guest domain
>           due to a hot lock in the xnb driver
>   6730667 High CPU utilization @domU when doing Rx at one guest domain
>           due to massive calling of mac_tx_update
>
> The webrev is at http://dme.org/solaris/webrev/on-bugfix.hg.

In the xnf_intr function, the current logic is to reclaim the Tx ring
whenever xnf gets an interrupt. I suggest we change the logic so that
the code reclaims the Tx ring ONLY when xnf needs to send packets.
This is to avoid grabbing xnf_txlock as much as possible.

Current:

1442	/*
1443	 * Clean tx ring and try to start any blocked xmit streams if
1444	 * there is now some space.
1445	 */
1446	mutex_enter(&xnfp->xnf_txlock);
1447	if (xnf_clean_tx_ring(xnfp) > 0) {
1448		sched = xnfp->xnf_need_sched;
1449		xnfp->xnf_need_sched = B_FALSE;
1450	}
1451	mutex_exit(&xnfp->xnf_txlock);
1452
1453	if (sched)
1454		mac_tx_update(xnfp->xnf_mh);

Suggested change:

	if (xnfp->xnf_need_sched) {
		mutex_enter(&xnfp->xnf_txlock);
		no = xnf_clean_tx_ring(xnfp);
		if (no > 0) {
			mac_tx_update(xnfp->xnf_mh);
			xnfp->xnf_need_sched = B_FALSE;
		}
		mutex_exit(&xnfp->xnf_txlock);
	}
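For context on the flag being tested above, a minimal sketch of the
send side (the helper name xnf_tx_ring_full and its exact placement are
assumptions for illustration, not the driver's actual code):
xnf_need_sched records that a transmit was deferred because the ring
was full, which is why the interrupt path must eventually call
mac_tx_update() to restart the blocked stream.

	/*
	 * Hypothetical sketch: in the transmit path, under
	 * xnf_txlock, a full ring defers the packet and sets the
	 * flag that xnf_intr tests above.
	 */
	mutex_enter(&xnfp->xnf_txlock);
	if (xnf_tx_ring_full(xnfp)) {		/* hypothetical helper */
		xnfp->xnf_need_sched = B_TRUE;	/* ask xnf_intr to kick MAC */
		mutex_exit(&xnfp->xnf_txlock);
		return (mp);			/* hand the chain back to MAC */
	}
	/* ... queue the packet on the ring ... */
	mutex_exit(&xnfp->xnf_txlock);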
On Tue, Sep 02, 2008 at 03:44:08PM -0700, Chaoyue Xiong wrote:
> On 08/18/08 08:06 AM, David Edmondson wrote:
>> I'd like reviewers for:
>>
>>   6721983 High CPU utilization @dom0 when doing Tx at one guest domain
>>           due to a hot lock in the xnb driver
>>   6730667 High CPU utilization @domU when doing Rx at one guest domain
>>           due to massive calling of mac_tx_update
>>
>> The webrev is at http://dme.org/solaris/webrev/on-bugfix.hg.
>
> In the xnf_intr function, the current logic is to reclaim the Tx ring
> whenever xnf gets an interrupt. I suggest we change the logic so that
> the code reclaims the Tx ring ONLY when xnf needs to send packets.
> This is to avoid grabbing xnf_txlock as much as possible.

We need to hold xnf_txlock to check xnf_need_sched. Hence, we might
write:

	mutex_enter(&xnfp->xnf_txlock);
	if (xnfp->xnf_need_sched) {
		no = xnf_clean_tx_ring(xnfp);
		if (no > 0) {
			mac_tx_update(xnfp->xnf_mh);
			xnfp->xnf_need_sched = B_FALSE;
		}
	}
	mutex_exit(&xnfp->xnf_txlock);

This would avoid the call to xnf_clean_tx_ring(), but we would still
end up taking the lock.

Given that xnf_send_one() cleans the ring before sending, I can't see
why the approach above would be bad.

It also occurs to me that xnf_send() might call xnf_clean_tx_ring()
rather than xnf_send_one(), meaning that we could reduce the number of
attempts to clean the ring from "once per packet" to "once per chain".

The fixes that we're reviewing now have an approved RTI (while I was
away on holiday), so I'd like to get those back into the gate and then
pursue this as a separate item. Could you file an RFE please?
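To make the "once per chain" idea concrete, here is a rough sketch
(the xnf_send_one() return convention and the flag handling are
assumptions; the real driver may differ in detail):

	mblk_t *
	xnf_send(void *arg, mblk_t *mp)
	{
		xnf_t *xnfp = arg;
		mblk_t *next;

		mutex_enter(&xnfp->xnf_txlock);

		/* Clean once for the whole chain, not once per packet. */
		(void) xnf_clean_tx_ring(xnfp);

		while (mp != NULL) {
			next = mp->b_next;
			mp->b_next = NULL;
			if (!xnf_send_one(xnfp, mp)) {
				/* Ring full: restore the chain, remember
				   to kick MAC from the interrupt path. */
				mp->b_next = next;
				xnfp->xnf_need_sched = B_TRUE;
				break;
			}
			mp = next;
		}

		mutex_exit(&xnfp->xnf_txlock);
		return (mp);	/* MAC requeues whatever we could not send */
	}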
David Edmondson wrote:
> On Tue, Sep 02, 2008 at 03:44:08PM -0700, Chaoyue Xiong wrote:
>> On 08/18/08 08:06 AM, David Edmondson wrote:
>>> I'd like reviewers for:
>>>
>>>   6721983 High CPU utilization @dom0 when doing Tx at one guest domain
>>>           due to a hot lock in the xnb driver
>>>   6730667 High CPU utilization @domU when doing Rx at one guest domain
>>>           due to massive calling of mac_tx_update
>>>
>>> The webrev is at http://dme.org/solaris/webrev/on-bugfix.hg.
>>
>> In the xnf_intr function, the current logic is to reclaim the Tx ring
>> whenever xnf gets an interrupt. I suggest we change the logic so that
>> the code reclaims the Tx ring ONLY when xnf needs to send packets.
>> This is to avoid grabbing xnf_txlock as much as possible.
>
> We need to hold xnf_txlock to check xnf_need_sched. Hence, we might
> write:
>
> 	mutex_enter(&xnfp->xnf_txlock);
> 	if (xnfp->xnf_need_sched) {
> 		no = xnf_clean_tx_ring(xnfp);
> 		if (no > 0) {
> 			mac_tx_update(xnfp->xnf_mh);
> 			xnfp->xnf_need_sched = B_FALSE;
> 		}
> 	}
> 	mutex_exit(&xnfp->xnf_txlock);

If we don't hold the lock for the check, I feel we won't have a
serious functional problem here. What do you think of the following
change?

	if (xnfp->xnf_need_sched) {
		no = xnf_clean_tx_ring(xnfp);
		if (no > 0) {
			mac_tx_update(xnfp->xnf_mh);
			mutex_enter(&xnfp->xnf_txlock);
			xnfp->xnf_need_sched = B_FALSE;
			mutex_exit(&xnfp->xnf_txlock);
		}
	}

> This would avoid the call to xnf_clean_tx_ring(), but we would still
> end up taking the lock.
>
> Given that xnf_send_one() cleans the ring before sending, I can't see
> why the approach above would be bad.
>
> It also occurs to me that xnf_send() might call xnf_clean_tx_ring()
> rather than xnf_send_one(), meaning that we could reduce the number of
> attempts to clean the ring from "once per packet" to "once per chain".
>
> The fixes that we're reviewing now have an approved RTI (while I was
> away on holiday), so I'd like to get those back into the gate and then
> pursue this as a separate item. Could you file an RFE please?

Sure, I will file an RFE on this.

-Joy
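One possible middle ground between the two suggestions above, sketched
here only for discussion (not tested against the driver): read the
flag without the lock as a cheap hint, then re-check it under
xnf_txlock before cleaning and clearing, so that a transmit setting
the flag concurrently is not lost and xnf_clean_tx_ring() still runs
with the lock held:

	boolean_t sched = B_FALSE;

	if (xnfp->xnf_need_sched) {		/* unlocked read: only a hint */
		mutex_enter(&xnfp->xnf_txlock);
		if (xnfp->xnf_need_sched &&	/* re-check under the lock */
		    xnf_clean_tx_ring(xnfp) > 0) {
			xnfp->xnf_need_sched = B_FALSE;
			sched = B_TRUE;
		}
		mutex_exit(&xnfp->xnf_txlock);
	}

	if (sched)
		mac_tx_update(xnfp->xnf_mh);

Calling mac_tx_update() only after dropping the lock matches the
existing code at lines 1453-1454 and avoids holding xnf_txlock across
the upcall.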