On Thu, Jun 2, 2022 at 10:59 AM Parav Pandit <parav at nvidia.com> wrote:
>
>
> > From: Jason Wang <jasowang at redhat.com>
> > Sent: Wednesday, June 1, 2022 10:00 PM
> >
> > On Thu, Jun 2, 2022 at 2:58 AM Parav Pandit <parav at nvidia.com> wrote:
> > >
> > >
> > > > From: Jason Wang <jasowang at redhat.com>
> > > > Sent: Tuesday, May 31, 2022 10:42 PM
> > > >
> > > > Well, the ability to query the virtqueue state was proposed as
> > > > another feature (Eugenio, please correct me). This should be
> > > > sufficient for making virtio-net to be live migrated.
> > > >
> > > The device is stopped, so it won't answer this special vq config
> > > done here.
> >
> > This depends on the definition of the stop. Any query to the device
> > state should be allowed; otherwise it's meaningless for us.
> >
> > > Programming all of these using cfg registers doesn't scale for
> > > on-chip memory and for the speed.
> >
> > Well, they are orthogonal and what I want to say is, we should first
> > define the semantics of stop and state of the virtqueue.
> >
> > Such a facility could be accessed by either transport specific method
> > or admin virtqueue, it totally depends on the hardware architecture of
> > the vendor.
> >
> I find it hard to believe that a vendor can implement a CVQ but not AQ
> and choose to expose tens of hundreds of registers.
> But maybe, it fits some specific hw.

You can have a look at the ifcvf dpdk driver as an example.

But another thing that is unrelated to hardware architecture is the
nesting support. Having an admin virtqueue in a nesting environment looks
like overkill. Presenting a register in L1 and mapping it to L0's admin
virtqueue should be good enough.

>
> I'd like to learn the advantages of such a method other than simplicity.
>
> We can clearly see that we are shifting away from such PCI registers
> with SIOV, IMS and other scalable solutions.
> virtio is drifting in the reverse direction by introducing more
> registers as transport.
> I expect it to be an optional transport like AQ.

Actually, I had a proposal of using admin virtqueue as a transport,
it's designed to be SIOV/IMS capable. And it's not hard to extend it
with the state/stop support etc.

>
> > >
> > > Next would be to program hundreds of statistics of the 64 VQs
> > > through a giant PCI config space register in some busy polling
> > > scheme.
> >
> > We don't need giant config space, and this method has been
> > implemented by some vDPA vendors.
> >
> There are tens of 64-bit counters per VQ. These need to be programmed
> on the destination side.
> Programming these via registers requires exposing them on the registers.
> In one of the proposals, I see them being queried via CVQ from the device.

I didn't see a proposal like this. And I don't think querying general
virtio state like idx with a device specific CVQ is a good design.

>
> Programming them via cfg registers requires large cfg space or
> synchronous programming until receiving ACK from it.
> This means one entry at a time...
>
> Programming them via CVQ needs to replicate and align cmd values etc.
> on all device types. All duplicated and hard to maintain.
>
>
> > >
> > > I can clearly see how all these are inefficient for faster LM.
> > > We need an efficient AQ to proceed with at minimum.
> >
> > I'm fine with admin virtqueue, but the stop and state are orthogonal
> > to that.
> > And using admin virtqueue for stop/state will be more natural if we use
> > admin virtqueue as a transport.
> Ok.
> We should have defined it a bit earlier so that all vendors can use it. :(

I agree.

Thanks