thr3ads.net - Virtualization - [PATCH net] Revert "vhost: cache used event for better performance" [Aug 2017]

If this information is useful, please help other people find it:
Share via:

Michael S. Tsirkin

2017-Jul-26 16:08 UTC

[PATCH net] Revert "vhost: cache used event for better performance"

On Wed, Jul 26, 2017 at 09:37:15PM +0800, Jason Wang
wrote:> 
> 
> On 2017?07?26? 21:18, Jason Wang wrote:
> > 
> > 
> > On 2017?07?26? 20:57, Michael S. Tsirkin wrote:
> > > On Wed, Jul 26, 2017 at 04:03:17PM +0800, Jason Wang wrote:
> > > > This reverts commit
809ecb9bca6a9424ccd392d67e368160f8b76c92. Since it
> > > > was reported to break vhost_net. We want to cache used event
and use
> > > > it to check for notification. We try to valid cached used
event by
> > > > checking whether or not it was ahead of new, but this is not
correct
> > > > all the time, it could be stale and there's no way to
know about this.
> > > > 
> > > > Signed-off-by: Jason Wang<jasowang at redhat.com>
> > > Could you supply a bit more data here please?  How does it get
stale?
> > > What does guest need to do to make it stale?  This will be
helpful if
> > > anyone wants to bring it back, or if we want to extend the
protocol.
> > > 
> > 
> > The problem we don't know whether or not guest has published a new
used
> > event. The check vring_need_event(vq->last_used_event, new +
vq->num,
> > new) is not sufficient to check for this.
> > 
> > Thanks
> 
> More notes, the previous assumption is that we don't move used event
back,
> but this could happen in fact if idx is wrapper around.
You mean if the 16 bit index wraps around after 64K entries.
Makes sense.
> Will repost and add
> this into commit log.
> 
> Thanks

K. Den

2017-Jul-30 06:26 UTC

head link

[PATCH net] Revert "vhost: cache used event for better performance"

On Wed, 2017-07-26 at 19:08 +0300, Michael S. Tsirkin
wrote:> On Wed, Jul 26, 2017 at 09:37:15PM +0800, Jason Wang wrote:
> > 
> > 
> > On 2017?07?26? 21:18, Jason Wang wrote:
> > > 
> > > 
> > > On 2017?07?26? 20:57, Michael S. Tsirkin wrote:
> > > > On Wed, Jul 26, 2017 at 04:03:17PM +0800, Jason Wang wrote:
> > > > > This reverts commit
809ecb9bca6a9424ccd392d67e368160f8b76c92. Since it
> > > > > was reported to break vhost_net. We want to cache used
event and use
> > > > > it to check for notification. We try to valid cached
used event by
> > > > > checking whether or not it was ahead of new, but this
is not correct
> > > > > all the time, it could be stale and there's no way
to know about this.
> > > > > 
> > > > > Signed-off-by: Jason Wang<jasowang at redhat.com>
> > > > 
> > > > Could you supply a bit more data here please???How does it
get stale?
> > > > What does guest need to do to make it stale???This will be
helpful if
> > > > anyone wants to bring it back, or if we want to extend the
protocol.
> > > > 
> > > 
> > > The problem we don't know whether or not guest has published
a new used
> > > event. The check vring_need_event(vq->last_used_event, new +
vq->num,
> > > new) is not sufficient to check for this.
> > > 
> > > Thanks
> > 
> > More notes, the previous assumption is that we don't move used
event back,
> > but this could happen in fact if idx is wrapper around.
> 
> You mean if the 16 bit index wraps around after 64K entries.
> Makes sense.
> 
> > Will repost and add
> > this into commit log.
> > 
> > Thanks
Hi,

I am just curious but I have got a question:
AFAIU, if you wanted to keep the caching mechanism alive in the code base,
the following two changes could clear off the issue, or not?:
(1) Always fetch the latest event value from guest when signalled_used event is
invalid, which includes last_used_idx wraps-around case. Otherwise we might need
changes which would complicate too much the logic to properly decide whether or
not to skip signalling in the next vhost_notify round.
(2) On top of that, split the signal-postponing logic to three cases like:
* if the interval of vq.num is [2^16, UINT_MAX]:
any cached event is in should-postpone-signalling interval, so paradoxically
must always do signalling.
* else if the interval of vq.num is [2^15, 2^16):
the logic in the original patch (809ecb9bca6a9) suffices
* else (= less than 2^15) (optional):
checking only (vring_need_event(vq->last_used_event, new + vq->num, new)
would suffice.

Am I missing something, or is this irrelevant?
I would appreciate if you could elaborate a bit more how the situation where
event idx wraps around and moves back would make trouble.

Thanks.

Jason Wang

2017-Aug-09 02:38 UTC

head link

[PATCH net] Revert "vhost: cache used event for better performance"

On 2017?07?30? 14:26, K. Den wrote:> On Wed, 2017-07-26 at 19:08 +0300, Michael S. Tsirkin wrote:
>> On Wed, Jul 26, 2017 at 09:37:15PM +0800, Jason Wang wrote:
>>>
>>> On 2017?07?26? 21:18, Jason Wang wrote:
>>>>
>>>> On 2017?07?26? 20:57, Michael S. Tsirkin wrote:
>>>>> On Wed, Jul 26, 2017 at 04:03:17PM +0800, Jason Wang wrote:
>>>>>> This reverts commit
809ecb9bca6a9424ccd392d67e368160f8b76c92. Since it
>>>>>> was reported to break vhost_net. We want to cache used
event and use
>>>>>> it to check for notification. We try to valid cached
used event by
>>>>>> checking whether or not it was ahead of new, but this
is not correct
>>>>>> all the time, it could be stale and there's no way
to know about this.
>>>>>>
>>>>>> Signed-off-by: Jason Wang<jasowang at redhat.com>
>>>>> Could you supply a bit more data here please?  How does it
get stale?
>>>>> What does guest need to do to make it stale?  This will be
helpful if
>>>>> anyone wants to bring it back, or if we want to extend the
protocol.
>>>>>
>>>> The problem we don't know whether or not guest has
published a new used
>>>> event. The check vring_need_event(vq->last_used_event, new +
vq->num,
>>>> new) is not sufficient to check for this.
>>>>
>>>> Thanks
>>> More notes, the previous assumption is that we don't move used
event back,
>>> but this could happen in fact if idx is wrapper around.
>> You mean if the 16 bit index wraps around after 64K entries.
>> Makes sense.
>>
>>> Will repost and add
>>> this into commit log.
>>>
>>> Thanks
> Hi,
Hi, sorry for the late reply, was on vacation last week.
>
> I am just curious but I have got a question:
> AFAIU, if you wanted to keep the caching mechanism alive in the code base,
> the following two changes could clear off the issue, or not?:
> (1) Always fetch the latest event value from guest when signalled_used
event is
> invalid, which includes last_used_idx wraps-around case. Otherwise we might
need
> changes which would complicate too much the logic to properly decide
whether or
> not to skip signalling in the next vhost_notify round.
> (2) On top of that, split the signal-postponing logic to three cases like:
> * if the interval of vq.num is [2^16, UINT_MAX]:
> any cached event is in should-postpone-signalling interval, so
paradoxically
> must always do signalling.
I think don't think current code can work well if vq.num is grater than 
2^15. Since all cached idx is u16. This looks like a bug which needs to 
be fixed.
> * else if the interval of vq.num is [2^15, 2^16):
> the logic in the original patch (809ecb9bca6a9) suffices
> * else (= less than 2^15) (optional):
> checking only (vring_need_event(vq->last_used_event, new + vq->num,
new)
> would suffice.
>
> Am I missing something, or is this irrelevant?
Looks not, I think this may work. Let me do some test.

Thanks
> I would appreciate if you could elaborate a bit more how the situation
where
> event idx wraps around and moves back would make trouble.
>
> Thanks.
>

Apparently Analagous Threads

Search for more apparently analagous threads

Virtualization - Aug 2017 - [PATCH net] Revert "vhost: cache used event for better performance"

[PATCH net] Revert "vhost: cache used event for better performance"

[PATCH net] Revert "vhost: cache used event for better performance"

[PATCH net] Revert "vhost: cache used event for better performance"

Apparently Analagous Threads