thr3ads.net - Xen devel - [Xen-devel] Move some of the PCI device manage/control into pciback? [Jan 2009]

Cui, Dexuan

2009-Jan-15 03:09 UTC

[Xen-devel] Move some of the PCI device manage/control into pciback?

1) Currently in Xen VT-d, the FLR-related logic (can the device(s) be
statically/dynamically assigned to a guest? how should the device(s) be FLR-ed?)
is done in xend. The diff of the Python patch is ~700 lines.
We may consider moving this logic into pciback. Admittedly, with this logic in
pciback, I'm afraid we'll have less flexibility: a small
adjustment (e.g., some people would like to relax the co-assignment constraint)
or a bug fix would require a reload of pciback, or a reboot of the host if
pciback is built into the Dom0 kernel. There are some other issues as well:
a) porting all the Python logic to C in pciback is a big effort, and some people
may not like the large number of lines of code; b) we may need to add
an interface between pciback and the control panel so that xend can invoke these
FLR-related functions of pciback.

2) The PCI config space virtualization for PV and HVM guests is currently not
the same, and there is some duplicated code in pciback and ioemu. At present the
ioemu in Dom0 accesses device config space via libpci (the /sys interface);
maybe ioemu can talk to pciback directly?
In the stubdomain case, it looks like libpci is implemented via pcifront. If
ioemu can talk to pciback directly, I think we can eliminate the duplicated
code in ioemu and get consistency between PV and HVM.
As for the PCI passthrough related hypercalls invoked by the ioemu in the
de-privileged stubdomain, I think ioemu can ask pciback to invoke the
hypercalls on its behalf, but this requires adding an interface to pciback.

All of this requires re-architecting the current code. Will it bring
compatibility issues? I remember it was said that Xen 3.4 will be released in
March; is now a suitable time for us to consider these changes?

PS: in the long run (how long?), will ioemu be removed from Dom0, so that the
stubdomain becomes the only place for ioemu?

Any comment is appreciated!

Thanks,
-- Dexuan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Shohei Fujiwara

2009-Jan-15 10:17 UTC

Re: [Xen-devel] Move some of the PCI device manage/control into pciback?

On Thu, 15 Jan 2009 11:09:21 +0800
"Cui, Dexuan" <dexuan.cui@intel.com> wrote:
> 1) Now in Xen VT-d, the FLR related things (can the device(s) be
> statically/dynamically assigned to a guest? how should the device(s)
> be FLR-ed?) are done in xend. The diff of the python patch is ~700
> lines.  We may consider moving these things to pciback.  Certainly,
> with these things in pciback, I'm afraid we'll have less flexibility
> -- a small adjustment (e.g., some people would like to relax the
> co-assignment constraint) or a bug fix requires a reload of pciback
> or a reboot of host (if pciback is built into Dom0 kernel). And we
> have some other issues: a) moving all the python logic into the
> pciback using C needs a big effort so maybe somebody doesn't like
> the big number of the line of code; b) we may need to add an
> interface between pciback and control panel so that xend can invoke
> these FLR related functions of pciback
>
> 2) Now the pci config space virtualizations of PV and HVM guests are
> not the same and there are some duplicated codes in pciback and
> ioemu. Now the ioemu of Dom0 accesses device config space via libpci
> (the /sys); maybe ioemu can talk to pciback directly?  In the case
> of stubdomain, looks the libpci is implemented via pcifront -- if
> ioemu can talk to pciback directly, I think we can eliminate the
> duplicated codes in ioemu and we'll have a consistency between PV
> and HVM.
I agree with you that there is similar code in both pciback and
ioemu, but I would not be happy to see the code removed from ioemu.

For an HVM domain with a stub domain, I'm considering direct access
from ioemu to configuration space. We can achieve this by mapping the
relevant subset of MMCFG into the stub domain. This would improve the
scalability of PCI pass-through and reduce the responsibility of dom0.
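
To illustrate what a per-device "subset of MMCFG" looks like: under the standard PCI Express enhanced configuration layout, each function owns one 4 KiB window inside the MMCFG region. A minimal sketch follows; the helper name `mmcfg_offset` and the base address are assumptions made for the example, not values from this thread or from Xen code.

```python
ECAM_FUNC_SIZE = 4096  # each PCI function owns one 4 KiB config window


def mmcfg_offset(bus, dev, fn, reg=0):
    """Byte offset of a config register inside the MMCFG region."""
    if not (0 <= bus < 256 and 0 <= dev < 32 and 0 <= fn < 8):
        raise ValueError("bus/device/function out of range")
    if not 0 <= reg < ECAM_FUNC_SIZE:
        raise ValueError("register offset out of range")
    # Bus selects a 1 MiB slice, device a 32 KiB slice, function a 4 KiB page.
    return (bus << 20) | (dev << 15) | (fn << 12) | reg


# Hypothetical MMCFG base; a real system reports it in the ACPI MCFG table.
MMCFG_BASE = 0xE0000000

# The single page that would have to be mapped for device 03:00.1.
page = MMCFG_BASE + mmcfg_offset(bus=3, dev=0, fn=1)
print(hex(page))
```

Because one function's window is exactly one 4 KiB page, access can in principle be granted at page granularity without exposing neighbouring devices.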

My model is the following.

    1. The PCI back driver resets the device and sets it up.
    2. The PCI back driver passes responsibility for the device's
       configuration space to ioemu.
    3. Ioemu reads/writes the device's configuration space in
       response to the guest OS.
    4. When ioemu exits, the PCI back driver takes back responsibility
       for the device's configuration space.
    5. The PCI back driver resets the device (and puts it into the D3hot
       state if possible).
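
The five steps above amount to a small ownership hand-off. The following is a toy model of that hand-off, not pciback or ioemu code; the class and method names are hypothetical and chosen only for the illustration.

```python
from enum import Enum


class Owner(Enum):
    PCIBACK = "pciback"
    IOEMU = "ioemu"


class ConfigSpaceHandoff:
    """Tracks which component may touch a device's config space."""

    def __init__(self):
        self.owner = Owner.PCIBACK
        self.log = []

    def _require(self, owner):
        if self.owner is not owner:
            raise PermissionError(
                "config space is owned by " + self.owner.value)

    def assign(self):
        # Steps 1-2: pciback resets and sets up the device, then hands over.
        self._require(Owner.PCIBACK)
        self.log += ["reset", "setup"]
        self.owner = Owner.IOEMU

    def guest_access(self, reg):
        # Step 3: ioemu services guest accesses only while it is the owner.
        self._require(Owner.IOEMU)
        self.log.append("ioemu access %#x" % reg)

    def release(self):
        # Steps 4-5: ioemu exits; pciback takes over and resets the device.
        self._require(Owner.IOEMU)
        self.owner = Owner.PCIBACK
        self.log += ["reset", "d3hot"]


dev = ConfigSpaceHandoff()
dev.assign()
dev.guest_access(0x04)
dev.release()
print(dev.log)
```

The point of the model is that at any moment exactly one side owns the configuration space, so the two never need to synchronize individual accesses.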

As you know, xend currently reads/writes configuration space. If xend
didn't, the architecture would become simpler.

What do you think about this?

Thanks,
--
Shohei Fujiwara
> And for the pci passthrough related hypercalls invoked by
> the ioemu in the de-privileged stubdomain, I think ioemu can ask
> pciback to help to invoke the hypercall, but this needs us to add an
> interface in pciback.
> All these things need us to re-architect the current codes. Will
> this bring compatibility issues? I remember it's said Xen 3.4 will
> be released in March; now it's the suitable time for us to consider
> the changes?
> 
> PS, in the long run -- how long? -- will ioemu be removed from Dom0
> and stubdomain will be the only place for ioemu?
> 
> Any comment is appreciated!
> 
> Thanks,
> -- Dexuan

Keir Fraser

2009-Jan-15 11:04 UTC

Re: [Xen-devel] Move some of the PCI device manage/control into pciback?

On 15/01/2009 10:17, "Shohei Fujiwara"
<fujiwara-sxa@necst.nec.co.jp> wrote:
> In case of HVM domain with stub domain, I'm considering direct access
> from ioemu to configuration space.  We can achieve this by mapping the
> subset of MMCFG to stub domain. This will improve the scalability of PCI
> pass-through and reduce the responsibility of dom0.
> 
> My model is the following.
> 
>     1. PCI back driver resets the device and setups it.
>     2. PCI back driver passes the responsibility of configuration
>        space of device to ioemu.
>     3. Ioemu reads/writes configuration space of the device,
>        responding guest OS.
>     4. When ioemu exits, pci back driver gets the responsibility of
>        configuration space of device.
>     5. PCI back driver resets device (and put D3hot state if possible)
> 
> As you know, current xend reads/writes configuration space. If xend
> doesn't reads/writes, the architecture becomes simpler.
> 
> What do you think about this?
I'd rather have all accesses mediated through pciback. I don't think PCI
config accesses should be on any data path anyway, and you've already taken
the hit of trapping to qemu in that case.

 -- Keir




Jiang, Yunhong

2009-Jan-16 03:26 UTC

RE: [Xen-devel] Move some of the PCI device manage/control into pciback?

xen-devel-bounces@lists.xensource.com <> wrote:
> On Thu, 15 Jan 2009 11:09:21 +0800
> "Cui, Dexuan" <dexuan.cui@intel.com> wrote:
> 
>> 1) Now in Xen VT-d, the FLR related things (can the device(s) be
>> statically/dynamically assigned to a guest? how should the device(s)
>> be FLR-ed?) are done in xend. The diff of the python patch is ~700
>> lines.  We may consider moving these things to pciback.  Certainly,
>> with these things in pciback, I'm afraid we'll have less flexibility
>> -- a small adjustment (e.g., some people would like to relax the
>> co-assignment constraint) or a bug fix requires a reload of pciback
>> or a reboot of host (if pciback is built into Dom0 kernel). And we
>> have some other issues: a) moving all the python logic into the
>> pciback using C needs a big effort so maybe somebody doesn't like
>> the big number of the line of code; b) we may need to add an
>> interface between pciback and control panel so that xend can invoke
>> these FLR related functions of pciback
I'm still not sure we really need such flexibility in a production
environment.
>> 
>> 2) Now the pci config space virtualizations of PV and HVM guests are
>> not the same and there are some duplicated codes in pciback and
>> ioemu. Now the ioemu of Dom0 accesses device config space via libpci
>> (the /sys); maybe ioemu can talk to pciback directly?  In the case
>> of stubdomain, looks the libpci is implemented via pcifront -- if
>> ioemu can talk to pciback directly, I think we can eliminate the
>> duplicated codes in ioemu and we'll have a consistency between PV
>> and HVM.
So you mean ioemu would issue xen_pci_op requests directly to pciback?
> 
> I agree with you that there are two similar codes in pciback and
> ioemu. But I'm not happy if the code is removed from ioemu.
> 
> In case of HVM domain with stub domain, I'm considering direct access
> from ioemu to configuration space.  We can achieve this by mapping the
> subset of MMCFG to stub domain. This will improve the
> scalability of PCI
> pass-through and reduce the responsibility of dom0.
> 
> My model is the following.
> 
>    1. PCI back driver resets the device and setups it.
>    2. PCI back driver passes the responsibility of configuration
>       space of device to ioemu.
>    3. Ioemu reads/writes configuration space of the device,
>       responding guest OS.
>    4. When ioemu exits, pci back driver gets the responsibility of
>       configuration space of device.
>    5. PCI back driver resets device (and put D3hot state if possible)
> 
> As you know, current xend reads/writes configuration space. If xend
> doesn't reads/writes, the architecture becomes simpler.
> 
> What do you think about this?
Shohei, I think this model may have some issues.
a) The stubdomain/qemu is not trusted, so a user may run a fake stub domain and
try to program some sensitive config space (like MSI).
b) If there is no mmcfg support, synchronizing access to cf8/cfc will be
difficult. So do you mean we would have different implementations for the mmcfg
and cf8 methods?
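
To make point b) concrete: with the legacy port-based mechanism, every config access is a two-step sequence (write the target address to port 0xCF8, then transfer data through port 0xCFC), so two independent users can interleave and corrupt each other's accesses unless the pair is serialized. A minimal sketch, assuming hypothetical `port_write` and `port_read` callables in place of real port I/O and a plain lock in place of whatever the real owner would use:

```python
import threading

CONFIG_ADDRESS = 0xCF8
CONFIG_DATA = 0xCFC

# Stand-in for the lock a single owner (for example dom0) would hold.
_cf8_lock = threading.Lock()


def cf8_address(bus, dev, fn, reg):
    """Value written to port 0xCF8 to select a config register."""
    if not (0 <= bus < 256 and 0 <= dev < 32 and 0 <= fn < 8):
        raise ValueError("bus/device/function out of range")
    if not 0 <= reg < 256:
        raise ValueError("register out of range for the cf8/cfc method")
    # Bit 31 enables the cycle; the register number is dword aligned.
    return 0x80000000 | (bus << 16) | (dev << 11) | (fn << 8) | (reg & 0xFC)


def config_read32(port_write, port_read, bus, dev, fn, reg):
    """Read one dword; the address/data pair must not be interleaved."""
    with _cf8_lock:
        port_write(CONFIG_ADDRESS, cf8_address(bus, dev, fn, reg))
        return port_read(CONFIG_DATA)
```

A lock like this only helps when all users share it, which is exactly what is hard to arrange across domains; memory-mapped access has no shared address register and therefore no such pairing.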
> 
> Thanks,
> --
> Shohei Fujiwara
> 
>> And for the pci passthrough related hypercalls invoked by
>> the ioemu in the de-privileged stubdomain, I think ioemu can ask
>> pciback to help to invoke the hypercall, but this needs us to add an
>> interface in pciback.
> 
>> All these things need us to re-architect the current codes. Will
>> this bring compatibility issues? I remember it's said Xen 3.4 will
>> be released in March; now it's the suitable time for us to consider the
>> changes?
>> 
>> PS, in the long run -- how long? -- will ioemu be removed from Dom0
>> and stubdomain will be the only place for ioemu?
>> 
>> Any comment is appreciated!
>> 
>> Thanks,
>> -- Dexuan

Shohei Fujiwara

2009-Jan-16 05:47 UTC

Re: [Xen-devel] Move some of the PCI device manage/control into pciback?

On Fri, 16 Jan 2009 11:26:10 +0800
"Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
> > I agree with you that there are two similar codes in pciback and
> > ioemu. But I'm not happy if the code is removed from ioemu.
> > 
> > In case of HVM domain with stub domain, I'm considering direct access
> > from ioemu to configuration space.  We can achieve this by mapping the
> > subset of MMCFG to stub domain. This will improve the
> > scalability of PCI
> > pass-through and reduce the responsibility of dom0.
> > 
> > My model is the following.
> > 
> >    1. PCI back driver resets the device and setups it.
> >    2. PCI back driver passes the responsibility of configuration
> >       space of device to ioemu.
> >    3. Ioemu reads/writes configuration space of the device,
> >       responding guest OS.
> >    4. When ioemu exits, pci back driver gets the responsibility of
> >       configuration space of device.
> >    5. PCI back driver resets device (and put D3hot state if possible)
> > 
> > As you know, current xend reads/writes configuration space. If xend
> > doesn't reads/writes, the architecture becomes simpler.
> > 
> > What do you think about this?
> 
> Shohei, I think this model may have some issue. 
> a) The stubdomain/qemu is not trustable, so user may use a fake stub
>  domain and try to program some sensitive config space (like MSI).
My idea is to call XEN_DOMCTL_iomem_permission from domain 0,
so it doesn't open a new hole.

In addition, VT-d interrupt remapping can block invalid MSIs.
> b) If there is no mmcfg support, to sync access to cf8/cfc will be
> difficult. So you mean we have different implementation for
> mmcfg/cf8 method?
If there is no mmcfg support, I'd like to use the existing
mechanism (pciback in dom0 and pcifront in the stub domain).

If there is mmcfg support, I'd like to allow the stub domain to access
configuration space directly.

Thanks,
--
Shohei Fujiwara



Jiang, Yunhong

2009-Jan-16 06:16 UTC

RE: [Xen-devel] Move some of the PCI device manage/control into pciback?

Shohei Fujiwara <mailto:fujiwara-sxa@necst.nec.co.jp> wrote:
> On Fri, 16 Jan 2009 11:26:10 +0800
> "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
> 
>>> I agree with you that there are two similar codes in pciback and
>>> ioemu. But I'm not happy if the code is removed from ioemu.
>>> 
>>> In case of HVM domain with stub domain, I'm considering direct access
>>> from ioemu to configuration space.  We can achieve this by mapping the
>>> subset of MMCFG to stub domain. This will improve the
>>> scalability of PCI
>>> pass-through and reduce the responsibility of dom0.
>>> 
>>> My model is the following.
>>> 
>>>    1. PCI back driver resets the device and setups it.
>>>    2. PCI back driver passes the responsibility of configuration
>>>       space of device to ioemu.
>>>    3. Ioemu reads/writes configuration space of the device,
>>>       responding guest OS.
>>>    4. When ioemu exits, pci back driver gets the responsibility of
>>>       configuration space of device.
>>>    5. PCI back driver resets device (and put D3hot state if possible)
>>> 
>>> As you know, current xend reads/writes configuration space. If xend
>>> doesn't reads/writes, the architecture becomes simpler.
>>> 
>>> What do you think about this?
>> 
>> Shohei, I think this model may have some issue.
>> a) The stubdomain/qemu is not trustable, so user may use a fake stub
>>  domain and try to program some sensitive config space (like MSI).
> 
> My idea is to call XEN_DOMCTL_iomem_permission from domain 0.
> So my idea doesn't open a new hole.
> In addition to this, interrupt remapping of VT-d can block invalid MSI.
I suspect that feature is not enabled on all systems.

Also, what will happen if the guest tries to change a BAR value? Will it be
passed to hardware as well? I'm not sure what happens if two devices under the
same bus have the same BAR value. Maybe it would then be possible for one guest
to write to the MMIO of another device.
> 
>> b) If there is no mmcfg support, to sync access to cf8/cfc will be
>> difficult. So you mean we have different implementation for
>> mmcfg/cf8 method?
> 
> If there is no mmcfg support, I'd like to use existing
> mechanism (pciback in dom0 and pcifront in stub domain).
> 
> If there is mmcfg support, I'd like to allow stub domain to access
> directly.
I'm not sure how much these two implementations would differ, or whether we
really want to keep this implementation. Mostly I think it is OK, since config
access should not be on the data path (or would any special device do that?).
But there is one thing we really need to consider: the mask bit for MSI/MSI-X.
The guest may try to mask/unmask the interrupt, so maybe we need to translate
that operation into a mask/unmask of the virtual interrupt.
> 
> Thanks,
> --
> Shohei Fujiwara

Jiang, Yunhong

2009-Jan-16 06:18 UTC

RE: [Xen-devel] Move some of the PCI device manage/control into pciback?

xen-devel-bounces@lists.xensource.com <> wrote:
> On 15/01/2009 10:17, "Shohei Fujiwara"
> <fujiwara-sxa@necst.nec.co.jp> wrote:
> 
>> In case of HVM domain with stub domain, I'm considering direct access
>> from ioemu to configuration space.  We can achieve this by mapping the
>> subset of MMCFG to stub domain. This will improve the scalability of PCI
>> pass-through and reduce the responsibility of dom0.
>> 
>> My model is the following.
>> 
>>     1. PCI back driver resets the device and setups it.
>>     2. PCI back driver passes the responsibility of configuration
>>        space of device to ioemu.
>>     3. Ioemu reads/writes configuration space of the device,
>>        responding guest OS.
>>     4. When ioemu exits, pci back driver gets the responsibility of
>>        configuration space of device.
>>     5. PCI back driver resets device (and put D3hot state if possible)
>> 
>> As you know, current xend reads/writes configuration space. If xend
>> doesn't reads/writes, the architecture becomes simpler.
>> 
>> What do you think about this?
> 
> I'd rather have all accesses mediated through pciback. I don't think PCI
> config accesses should be on any data path anyway, and you've already taken
> the hit of trapping to qemu in that case.
There is one exception: the mask bit for MSI/MSI-X. Maybe we need to add some
mechanism for an HVM domain to mask/unmask the virtual interrupt directly, like
what DomU does for evtchn. But that will be tricky.

Thanks
Yunhong Jiang
> 
> -- Keir

Keir Fraser

2009-Jan-16 08:07 UTC

Re: [Xen-devel] Move some of the PCI device manage/control into pciback?

On 16/01/2009 06:18, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
>> I'd rather have all accesses mediated through pciback. I don't think PCI
>> config accesses should be on any data path anyway, and you've already taken
>> the hit of trapping to qemu in that case.
> 
> There is one exception: The mask bit for MSI/MSI-X. Maybe we need add some
> mechanism for HVM domain to mask/unmask the virtual interrupt directly, like
> what DomU did for evtchn. But that will be tricky.
Yes, that did occur to me. We already have plenty of special emulation code
for MSI/MSI-X. I guess we may explicitly paravirtualise that aspect in a
different way which would allow ioemu to interact directly with Xen. Actually,
if mask/unmask happens on every IRQ, we may need to push support for the PCI
MSI registers right down into Xen itself to get decent speed, because going
to qemu with any great frequency does not perform well.

 -- Keir




Shohei Fujiwara

2009-Jan-16 08:55 UTC

Re: [Xen-devel] Move some of the PCI device manage/control into pciback?

On Fri, 16 Jan 2009 14:16:08 +0800
"Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
> Shohei Fujiwara <mailto:fujiwara-sxa@necst.nec.co.jp> wrote:
> > On Fri, 16 Jan 2009 11:26:10 +0800
> > "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
> > 
> >>> I agree with you that there are two similar codes in pciback and
> >>> ioemu. But I'm not happy if the code is removed from ioemu.
> >>> 
> >>> In case of HVM domain with stub domain, I'm considering direct access
> >>> from ioemu to configuration space.  We can achieve this by mapping the
> >>> subset of MMCFG to stub domain. This will improve the
> >>> scalability of PCI
> >>> pass-through and reduce the responsibility of dom0.
> >>> 
> >>> My model is the following.
> >>> 
> >>>    1. PCI back driver resets the device and setups it.
> >>>    2. PCI back driver passes the responsibility of configuration
> >>>       space of device to ioemu.
> >>>    3. Ioemu reads/writes configuration space of the device,
> >>>       responding guest OS.
> >>>    4. When ioemu exits, pci back driver gets the responsibility of
> >>>       configuration space of device.
> >>>    5. PCI back driver resets device (and put D3hot state if possible)
> >>> 
> >>> As you know, current xend reads/writes configuration space. If xend
> >>> doesn't reads/writes, the architecture becomes simpler.
> >>> 
> >>> What do you think about this?
> >> 
> >> Shohei, I think this model may have some issue.
> >> a) The stubdomain/qemu is not trustable, so user may use a fake stub
> >>  domain and try to program some sensitive config space (like MSI).
> > 
> > My idea is to call XEN_DOMCTL_iomem_permission from domain 0.
> > So my idea doesn't open a new hole.
> > In addition to this, interrupt remapping of VT-d can block invalid MSI.
> 
> I suspect that feature is not enabled in all system.
> 
> Also what will happen if guest try to change the BAR value? Will be
> passed to hardware also? I'm not sure what will happen if two device
> under the same bus has the same BAR value. Maybe then it is possible
> one guest can write MMIO of another device.
This is a figure of my idea.

If mmcfg and interrupt remapping are both supported:

    guest domain      | stub domain
    ------------------+------------------------------------------
    guest software -> | ioemu -> libpci(pcifront) -> mmcfg(subset)

If mmcfg or interrupt remapping is not supported:

    guest domain      | stub domain                  | domain 0
    ------------------+------------------------------+---------------------
    guest software -> | ioemu -> libpci(pcifront) -> | pciback -> mmcfg/cf8

    * This is the same as the current implementation.
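
The choice between the two paths in the figure can be written as a single decision. This is only a restatement of the figure as a sketch; the function name and the list representation are assumptions made for the illustration.

```python
def config_access_path(mmcfg, interrupt_remapping):
    """Return the hops a guest config access takes, per the figure."""
    common = ["guest software", "ioemu", "libpci(pcifront)"]
    if mmcfg and interrupt_remapping:
        # Direct path: the stub domain touches its mapped MMCFG subset.
        return common + ["mmcfg(subset)"]
    # Fallback path: tunnel through pciback in domain 0, as done today.
    return common + ["pciback", "mmcfg/cf8"]


for caps in [(True, True), (True, False), (False, True), (False, False)]:
    print(caps, " -> ".join(config_access_path(*caps)))
```

Only the combination of both features selects the direct path; every other combination falls back to pciback.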

BARs are virtualized by ioemu. A BAR value written by guest software
isn't passed to hardware.

If the stub domain is hijacked, it is possible to set an invalid BAR value.
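
A rough sketch of what "BAR is virtualized" means: guest writes land in a shadow value, the hardware BAR is left alone, and guest MMIO addresses are translated to the host range. This is a simplified model of one 32-bit memory BAR, not ioemu code; the class name and fields are hypothetical and the low type bits of a real BAR are ignored.

```python
class VirtualBar:
    """Shadow copy of one 32-bit memory BAR."""

    def __init__(self, host_base, size):
        # Real BAR sizes are powers of two, at least 16 bytes for memory.
        if size < 16 or size & (size - 1):
            raise ValueError("size must be a power of two >= 16")
        self.host_base = host_base  # value held by hardware; never changed
        self.size = size
        self.guest_value = 0

    def write(self, value):
        # Address bits below the BAR size are hardwired to zero, which is
        # also what makes the usual write-all-ones sizing probe work.
        self.guest_value = value & ~(self.size - 1) & 0xFFFFFFFF

    def read(self):
        return self.guest_value

    def translate(self, guest_addr):
        """Map a guest MMIO address inside the BAR to the host address."""
        offset = guest_addr - self.guest_value
        if not 0 <= offset < self.size:
            raise ValueError("address is outside the virtual BAR")
        return self.host_base + offset


bar = VirtualBar(host_base=0xFE000000, size=0x1000)
bar.write(0xFFFFFFFF)          # guest sizing probe
print(hex(bar.read()))
bar.write(0xC0000000)          # guest picks its own address
print(hex(bar.translate(0xC0000010)))
```

In this model a guest can choose any BAR value it likes without affecting where the device actually decodes, which is why the concern only arises if the component doing the emulation is itself compromised.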
> >> b) If there is no mmcfg support, to sync access to cf8/cfc will be
> >> difficult. So you mean we have different implementation for
> >> mmcfg/cf8 method?
> > 
> > If there is no mmcfg support, I'd like to use existing
> > mechanism (pciback in dom0 and pcifront in stub domain).
> > 
> > If there is mmcfg support, I'd like to allow stub domain to
> > access directly.
> 
> I'm not sure how difference between these two implementation and if
> we really want keep this implementation. Mostly I think it is ok
> since it should not be on data path (Or any special device will do
> that??)
> But there is really one thing we need consider: The mask bit for
> MSI/MSI-X. Because guest may try to mask/unmask the interrupt. Maybe
> we need translate that operation to the mask/unmask of the virtual
> interrupt.
As mentioned above, my idea keeps PCI configuration virtualization in
ioemu in the stub domain, so the MSI mask bit in config space and the MSI-X
mask bits in memory space will work fine.

Thanks,
--
Shohei Fujiwara



Espen Skoglund

2009-Jan-16 13:54 UTC

Re: [Xen-devel] Move some of the PCI device manage/control into pciback?

[Keir Fraser]
> On 16/01/2009 06:18, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
>>> I'd rather have all accesses mediated through pciback. I don't
>>> think PCI config accesses should be on any data path anyway, and
>>> you've already taken the hit of trapping to qemu in that case.
>> 
>> There is one exception: The mask bit for MSI/MSI-X. Maybe we need
>> add some mechanism for HVM domain to mask/unmask the virtual
>> interrupt directly, like what DomU did for evtchn. But that will be
>> tricky.
> Yes, that did occur to me. We already have plenty of special
> emulation code for MSI/MSI-x. I guess we may explicitly
> paravirtualise that aspect in a different way which would allow
> ioemu to interact direct with Xen. Actually if mask/unmask happens
> on every IRQ, we may need to push support for the PCI MSI registers
> right down into Xen itself to get decent speed? Because going to
> qemu with any great frequency is not very high performance.
Last time I checked, current Linux code does not update the MSI/MSI-X
mask bits frequently (as in on every IRQ).  Doing so would require
device interaction and could result in quite some overhead.  I don't
know how other systems (e.g., Windows) handle the mask bits.

I don't think we need to optimize for frequent mask bit updates.
Updates due to enabling/disabling interrupts, or masking due to
interrupt storms would be ok to channel through a slower code path.

Please do tell if the above assumption is wrong.

	eSk




Espen Skoglund

2009-Jan-16 14:19 UTC

Re: [Xen-devel] Move some of the PCI device manage/control into pciback?

[Shohei Fujiwara]
> On Fri, 16 Jan 2009 14:16:08 +0800
> "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
>> Shohei Fujiwara <mailto:fujiwara-sxa@necst.nec.co.jp> wrote:
>>> My idea is to call XEN_DOMCTL_iomem_permission from domain 0.  So
>>> my idea doesn't open a new hole.  In addition to this, interrupt
>>> remapping of VT-d can block invalid MSI.
>> 
>> I suspect that feature is not enabled in all system.
>> 
>> Also what will happen if guest try to change the BAR value? Will be
>> passed to hardware also? I'm not sure what will happen if two
>> device under the same bus has the same BAR value. Maybe then it is
>> possible one guest can write MMIO of another device.
> This is the figure of my idea.
> If mmcfg and interrupt remapping are supported:
>     guest domain      | stub domain
>     ------------------+------------------------------------------
>     guest software -> | ioemu -> libpci(pcifront) -> mmcfg(subset)
> If mmcfg or interrupt remapping are not supported:
>     guest domain      | stub domain                  | domain 0
>     ------------------+------------------------------+---------------------
>     guest software -> | ioemu -> libpci(pcifront) -> | pciback -> mmcfg/cf8
>     * This is the same with current implementation.
> BAR is virtualized by ioemu. BAR value written by guest software 
> isn't passed to hardware.
> If stub domain is hijacked, it is possible to set invalid BAR value.

I still don't understand what you're trying to achieve by avoiding
going through pciback.  As Keir said, PCI config accesses should not be
taken on the data path.  Nor should config accesses be required
for regular device operation.  It is after all called "configuration
space", not "control space".  PCI config space accesses are kind of
bound to have some overhead.  For example, Itanium requires them to go
through a SAL call.

Is there a real problem you're trying to solve by pushing this to the
stub domain?  Also, if this is to be handled in the stub domain I
would very much like to be able to configure certain devices so that
their config space accesses are still tunneled through pciback.

	eSk


Espen Skoglund

2009-Jan-16 14:41 UTC

Re: [Xen-devel] Move some of the PCI device manage/control into pciback?

[Shohei Fujiwara]
> On Fri, 16 Jan 2009 11:26:10 +0800
> "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
>> Shohei, I think this model may have some issue. 
>> a) The stubdomain/qemu is not trustable, so user may use a fake stub
>>  domain and try to program some sensitive config space (like MSI).
> My idea is to call XEN_DOMCTL_iomem_permission from domain 0.  So my
> idea doesn't open a new hole.
> In addition to this, interrupt remapping of VT-d can block invalid
> MSI.
Except, the MSI entry must be programmed to deliver interrupts in a
special remappable format.  The stub domain can not be allowed to
write arbitrary contents into the MSI entry.

	eSk



Shohei Fujiwara

2009-Jan-19 07:05 UTC

Re: [Xen-devel] Move some of the PCI device manage/control into pciback?

On Fri, 16 Jan 2009 14:19:12 +0000
Espen Skoglund <espen.skoglund@netronome.com> wrote:
> [Shohei Fujiwara]
> > On Fri, 16 Jan 2009 14:16:08 +0800
> > "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
> >> Shohei Fujiwara <mailto:fujiwara-sxa@necst.nec.co.jp> wrote:
> >>> My idea is to call XEN_DOMCTL_iomem_permission from domain 0.  So
> >>> my idea doesn't open a new hole.  In addition to this, interrupt
> >>> remapping of VT-d can block invalid MSI.
> >> 
> >> I suspect that feature is not enabled in all system.
> >> 
> >> Also what will happen if guest try to change the BAR value? Will be
> >> passed to hardware also? I'm not sure what will happen if two
> >> device under the same bus has the same BAR value. Maybe then it is
> >> possible one guest can write MMIO of another device.
> 
> > This is the figure of my idea.
> 
> > If mmcfg and interrupt remapping are supported:
> 
> >     guest domain      | stub domain
> >     ------------------+------------------------------------------
> >     guest software -> | ioemu -> libpci(pcifront) -> mmcfg(subset)
> 
> > If mmcfg or interrupt remapping are not supported:
> 
> >     guest domain      | stub domain                  | domain 0
> >     ------------------+------------------------------+---------------------
> >     guest software -> | ioemu -> libpci(pcifront) -> | pciback -> mmcfg/cf8
> 
> >     * This is the same with current implementation.
> 
> > BAR is virtualized by ioemu. BAR value written by guest software 
> > isn't passed to hardware.
> 
> > If stub domain is hijacked, it is possible to set invalid BAR value.
> 
> 
> I still don't understand what you're trying to achieve by avoiding to
> go through pciback.  As Keir said, PCI config accesses should not be
> taken on the data path.  Config accesses should neither be required
> for regular device operation.  It is after all called "configuration
> space", not "control space".  PCI config space accesses are kind of
> bound to have some overhead.  For example, Itanium requires them to go
> through a SAL call.
Domain 0 is a single point of failure (SPOF). If domain 0 panics, the whole
system stops. So I'd like to remove this function from domain 0, provided we
can maintain security. This reduces the chance of a domain 0 panic.

In the future, it would be great if domain 0 could reboot while guest domains
keep running. This avoids the SPOF.
To achieve this, we have to solve many problems. For networking,
we need to emulate link down during the reboot. For
PCI passthrough, it is difficult to block configuration access during
the reboot. If the stub domain can access configuration space directly,
we don't need to block configuration access.

What do you think?
> Is there a real problem you're trying to solve by pushing this to the
> stub domain?  Also, if this is to be handled in the stub domain I
> would very much like to be able to configure certain devices so that
> their config space acesses are still tunneled through pciback.
There is no real problem with configuration access itself. I also think
config space access should work whether it is tunneled or not.
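
To make two of the points above concrete (BARs are virtualized by ioemu, and
config space access should behave the same whether it is tunneled or not),
here is a minimal Python sketch. All names in it (FakeBackend,
VirtualConfigSpace) are hypothetical and are not taken from ioemu, pcifront or
pciback. It assumes 32-bit memory BARs and ignores the low type bits for
brevity.

```python
# Hypothetical sketch: not Xen, qemu or pciback code.
BAR0_OFFSET = 0x10   # first Base Address Register in a type 0 header
BAR_COUNT = 6        # a type 0 header has six 32-bit BAR slots


class FakeBackend:
    """Stands in for either path: tunneled via pciback or direct MMCFG."""

    def __init__(self):
        self.regs = {}  # offset -> 32-bit value as seen by "hardware"

    def read32(self, offset):
        return self.regs.get(offset, 0)

    def write32(self, offset, value):
        self.regs[offset] = value & 0xFFFFFFFF


class VirtualConfigSpace:
    """Guest-facing config space: BARs are shadowed, the rest is forwarded."""

    def __init__(self, backend, bar_sizes):
        self.backend = backend
        self.bar_sizes = bar_sizes                # BAR index -> size (power of two)
        self.shadow = {i: 0 for i in bar_sizes}   # guest-visible BAR values

    def _bar_index(self, offset):
        in_range = BAR0_OFFSET <= offset < BAR0_OFFSET + 4 * BAR_COUNT
        if in_range and offset % 4 == 0:
            return (offset - BAR0_OFFSET) // 4
        return None

    def write32(self, offset, value):
        idx = self._bar_index(offset)
        if idx in self.shadow:
            # Keep only the address bits the BAR size allows.
            # The value is never passed to the backend.
            mask = ~(self.bar_sizes[idx] - 1) & 0xFFFFFFFF
            self.shadow[idx] = value & mask
        else:
            self.backend.write32(offset, value)

    def read32(self, offset):
        idx = self._bar_index(offset)
        if idx in self.shadow:
            return self.shadow[idx]
        return self.backend.read32(offset)
```

Because the guest-facing class only talks to a backend object, the same code
would work with a tunneled or a direct backend, and a guest BAR write never
reaches the device in either case.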

Thanks,
--
Shohei Fujiwara


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Keir Fraser

2009-Jan-19 08:30 UTC

Re: [Xen-devel] Move some of the PCI device manage/control into pciback?

On 19/01/2009 07:05, "Shohei Fujiwara" <fujiwara-sxa@necst.nec.co.jp> wrote:
>> I still don't understand what you're trying to achieve by avoiding to
>> go through pciback.  As Keir said, PCI config accesses should not be
>> taken on the data path.  Config accesses should neither be required
>> for regular device operation.  It is after all called "configuration
>> space", not "control space".  PCI config space accesses are kind of
>> bound to have some overhead.  For example, Itanium requires them to go
>> through a SAL call.
> 
> Domain 0 is a SPOF (Single Point of Failure). If domain 0 panics, the whole
> system stops. So I'd like to move this function out of domain 0, provided we
> can keep security. This reduces the chance of a domain 0 panic.
> 
> In the future, it would be great if domain 0 could reboot while guest domains
> keep running. This avoids the SPOF.
> To achieve this, we have to solve many problems. For networking, we need to
> emulate link down during the reboot. For PCI passthrough, it is difficult to
> block configuration access during the reboot. If the stub domain can access
> configuration space directly, we don't need to block configuration access.
> 
> What do you think?
I think what you want to do sounds pretty hard. PCI accesses should
definitely go through pciback by default. If you need other modes for more
extensive rearchitecting you are doing, they belong in your dom0-can-reboot
branch, or in the main tree as a configurable option.

 -- Keir




Wei Huang

2009-Jan-19 17:13 UTC

[Xen-devel] Status of SR-IOV for Xen?

Hi,

Any further update on SR-IOV support for Xen? Are we going to include 
this feature soon?

Thanks,

-Wei



Simon Horman

2009-Jan-20 01:15 UTC

Re: [Xen-devel] Status of SR-IOV for Xen?

On Mon, Jan 19, 2009 at 11:13:37AM -0600, Wei Huang wrote:
> Hi,
>
> Any further update on SR-IOV support for Xen? Are we going to include
> this feature soon?
I am also interested in this and would be willing to do some work
on breaking out the patches to address Jan Beulich's concerns[1]
about accreditation, and on bringing the patches up to date with the most
recent Linux patches. The main problem that I see with the latter
is that 2.6.18.8's PCI stack is now quite old, so using that as
a target would be more work than, for instance, using the paravirt ops
work that Jeremy Fitzhardinge has been working on.

[1] http://lists.xensource.com/archives/html/xen-devel/2008-09/msg00965.html

-- 
Simon Horman
  VA Linux Systems Japan K.K., Sydney, Australia Satellite Office
  H: www.vergenet.net/~horms/             W: www.valinux.co.jp/en



Jiang, Yunhong

2009-Jan-20 06:08 UTC

RE: [Xen-devel] Move some of the PCI device manage/control into pciback?

Keir Fraser <mailto:keir.fraser@eu.citrix.com> wrote:
> On 16/01/2009 06:18, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
> 
>>> I'd rather have all accesses mediated through pciback. I don't think PCI
>>> config accesses should be on any data path anyway, and you've already
>>> taken the hit of trapping to qemu in that case.
>> 
>> There is one exception: The mask bit for MSI/MSI-X. Maybe we need add some
>> mechanism for HVM domain to mask/unmask the virtual interrupt directly,
>> like what DomU did for evtchn. But that will be tricky.
> 
> Yes, that did occur to me. We already have plenty of special emulation code
> for MSI/MSI-x. I guess we may explicitly paravirtualise that aspect in a
> different way which would allow ioemu to interact direct with Xen. Actually
> if mask/unmask happens on every IRQ, we may need to push support for the PCI
> MSI registers right down into Xen itself to get decent speed? Because going
> to qemu with any great frequency is not very high performance.
We plan to do this for MSI-X first, since qemu currently does not expose mask
support for MSI interrupts.
And we do see this issue with some OSes (at least those based on kernel
2.6.18).
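
For reference, the per-vector mask discussed here is bit 0 of the Vector
Control dword in each 16-byte MSI-X table entry. The following is a minimal
Python model of what handling that bit next to interrupt delivery could look
like. It is only a sketch: the class and method names are hypothetical, it is
not Xen or qemu code, and it assumes that vectors start out masked and that an
interrupt raised while masked is held pending until unmask.

```python
# Hypothetical model of MSI-X per-vector masking; not Xen or qemu code.
MSIX_ENTRY_SIZE = 16         # bytes per MSI-X table entry
VECTOR_CTRL_OFFSET = 12      # Vector Control dword within an entry
VECTOR_CTRL_MASK_BIT = 0x1   # bit 0 is the per-vector mask


class MsixTable:
    def __init__(self, num_vectors):
        # Assumption: every vector starts out masked.
        self.vector_ctrl = [VECTOR_CTRL_MASK_BIT] * num_vectors
        self.pending = [False] * num_vectors
        self.delivered = []  # log of vectors actually injected

    def is_masked(self, vec):
        return bool(self.vector_ctrl[vec] & VECTOR_CTRL_MASK_BIT)

    def write_vector_ctrl(self, table_offset, value):
        """Handle a guest write to a Vector Control dword."""
        vec, reg = divmod(table_offset, MSIX_ENTRY_SIZE)
        if reg != VECTOR_CTRL_OFFSET:
            raise ValueError("not a Vector Control register")
        was_masked = self.is_masked(vec)
        self.vector_ctrl[vec] = value & 0xFFFFFFFF
        # On unmask, deliver an interrupt that arrived while masked.
        if was_masked and not self.is_masked(vec) and self.pending[vec]:
            self.pending[vec] = False
            self.delivered.append(vec)

    def raise_irq(self, vec):
        """Device signals vector `vec`."""
        if self.is_masked(vec):
            self.pending[vec] = True
        else:
            self.delivered.append(vec)
```

If a guest masks and unmasks on every interrupt, each write_vector_ctrl call
corresponds to one trap, which is why the place where this logic lives matters
for performance.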
> 
> -- Keir

Shohei Fujiwara

2009-Jan-20 09:26 UTC

Re: [Xen-devel] Move some of the PCI device manage/control into pciback?

On Mon, 19 Jan 2009 08:30:18 +0000
Keir Fraser <keir.fraser@eu.citrix.com> wrote:
> On 19/01/2009 07:05, "Shohei Fujiwara" <fujiwara-sxa@necst.nec.co.jp> wrote:
> 
> >> I still don't understand what you're trying to achieve by avoiding to
> >> go through pciback.  As Keir said, PCI config accesses should not be
> >> taken on the data path.  Config accesses should neither be required
> >> for regular device operation.  It is after all called "configuration
> >> space", not "control space".  PCI config space accesses are kind of
> >> bound to have some overhead.  For example, Itanium requires them to go
> >> through a SAL call.
> > 
> > Domain 0 is a SPOF (Single Point of Failure). If domain 0 panics, the whole
> > system stops. So I'd like to move this function out of domain 0, provided we
> > can keep security. This reduces the chance of a domain 0 panic.
> > 
> > In the future, it would be great if domain 0 could reboot while guest domains
> > keep running. This avoids the SPOF.
> > To achieve this, we have to solve many problems. For networking, we need to
> > emulate link down during the reboot. For PCI passthrough, it is difficult to
> > block configuration access during the reboot. If the stub domain can access
> > configuration space directly, we don't need to block configuration access.
> > 
> > What do you think?
> 
> I think what you want to do sounds pretty hard. PCI accesses should
> definitely go through pciback by default. If you need other modes for more
> extensive rearchitecting you are doing, they belong in your dom0-can-reboot
> branch, or in the main tree as a configurable option.
I understand that PCI accesses should go through pciback by default. Direct
access to MMCFG from the stub domain should be a configurable option.
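
As a rough illustration of the two access mechanisms and of such an option,
here is a small Python sketch. The address layouts are the standard ones for
the legacy CF8 port and for memory-mapped config space (MMCFG). The function
names and the direct_mmcfg_enabled option are hypothetical and do not exist in
Xen. The rule that direct access needs both MMCFG and interrupt remapping
follows the figure from my earlier mail.

```python
# Hypothetical sketch; function and option names are assumptions.

def _check(bus, dev, func, reg, reg_limit):
    if not (0 <= bus < 256 and 0 <= dev < 32 and 0 <= func < 8):
        raise ValueError("bad bus/device/function")
    if not (0 <= reg < reg_limit):
        raise ValueError("register offset out of range")


def cf8_address(bus, dev, func, reg):
    """Value written to I/O port 0xCF8 for a legacy config access.

    Only the first 256 bytes are reachable and the register is
    dword-aligned; bit 31 is the enable bit.
    """
    _check(bus, dev, func, reg, 256)
    return 0x80000000 | (bus << 16) | (dev << 11) | (func << 8) | (reg & 0xFC)


def mmcfg_offset(bus, dev, func, reg):
    """Byte offset of a register inside the MMCFG window.

    Each function gets a 4 KiB page, so the full 4096-byte
    extended config space is reachable.
    """
    _check(bus, dev, func, reg, 4096)
    return (bus << 20) | (dev << 15) | (func << 12) | reg


def choose_path(mmcfg_supported, intr_remap_supported, direct_mmcfg_enabled=False):
    """Pick the config access path; pciback stays the default."""
    if direct_mmcfg_enabled and mmcfg_supported and intr_remap_supported:
        return "direct-mmcfg"
    return "pciback"
```

With the option left at its default, every device keeps going through pciback,
which matches the behaviour requested for the main tree.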

I'd like to keep developing in the main tree as long as possible. For now,
I am trying to enable PCI passthrough with a stub domain while keeping it
de-privileged. I hope the new patch can be applied to the main tree, because
it will be useful for other developers and users.

Thanks,
--
Shohei Fujiwara



Wei Huang

2009-Jan-20 14:43 UTC

Re: [Xen-devel] Status of SR-IOV for Xen?

It is better not to duplicate effort if Yu Zhao is working on it.
Otherwise, I would be interested in it too.

-Wei

Simon Horman wrote:
> On Mon, Jan 19, 2009 at 11:13:37AM -0600, Wei Huang wrote:
>  > Hi,
>  >
>  > Any further update on SR-IOV support for Xen? Are we going to include
>  > this feature soon?
> 
> I am also interested in this and would be willing to do some work
> on breaking out the patches to address Jan Beulich's concerns[1]
> about accreditation, and on bringing the patches up to date with the most
> recent Linux patches. The main problem that I see with the latter
> is that 2.6.18.8's PCI stack is now quite old, so using that as
> a target would be more work than, for instance, using the paravirt ops
> work that Jeremy Fitzhardinge has been working on.
> 
> [1] http://lists.xensource.com/archives/html/xen-devel/2008-09/msg00965.html
> 
> --
> Simon Horman
>   VA Linux Systems Japan K.K., Sydney, Australia Satellite Office
>   H: www.vergenet.net/~horms/             W: www.valinux.co.jp/en
> 
> 


Zhao, Yu

2009-Jan-20 15:20 UTC

Re: [Xen-devel] Status of SR-IOV for Xen?

I'll respin those patches and resubmit them later. One caveat is that
SR-IOV depends on MSI/MSI-X, which are currently disabled, as Keir said in
the thread 'x86: Disable MSI as it seems to be triggering ASSERT at irq.c:269'.

Wei Huang wrote:
> It is better not to duplicate effort if Yu Zhao is working on it.
> Otherwise, I would be interested in it too.
> 
> -Wei
> 
> Simon Horman wrote:
>> On Mon, Jan 19, 2009 at 11:13:37AM -0600, Wei Huang wrote:
>>  > Hi,
>>  >
>>  > Any further update on SR-IOV support for Xen? Are we going to include
>>  > this feature soon?
>>
>> I am also interested in this and would be willing to do some work
>> on breaking out the patches to address Jan Beulich's concerns[1]
>> about accreditation, and on bringing the patches up to date with the most
>> recent Linux patches. The main problem that I see with the latter
>> is that 2.6.18.8's PCI stack is now quite old, so using that as
>> a target would be more work than, for instance, using the paravirt ops
>> work that Jeremy Fitzhardinge has been working on.
>>
>> [1] http://lists.xensource.com/archives/html/xen-devel/2008-09/msg00965.html
>>
>> --
>> Simon Horman
>>   VA Linux Systems Japan K.K., Sydney, Australia Satellite Office
>>   H: www.vergenet.net/~horms/             W: www.valinux.co.jp/en
>>
>>
> 


Simon Horman

2009-Jan-20 23:10 UTC

Re: [Xen-devel] Status of SR-IOV for Xen?

Sorry, I wasn't as clear as I should have been.
I was offering to help Yu.

On Tue, Jan 20, 2009 at 08:43:29AM -0600, Wei Huang wrote:
> It is better not to duplicate effort if Yu Zhao is working on it.
> Otherwise, I would be interested in it too.
>
> -Wei
>
> Simon Horman wrote:
>> On Mon, Jan 19, 2009 at 11:13:37AM -0600, Wei Huang wrote:
>>  > Hi,
>>  >
>>  > Any further update on SR-IOV support for Xen? Are we going to include
>>  > this feature soon?
>>
>> I am also interested in this and would be willing to do some work
>> on breaking out the patches to address Jan Beulich's concerns[1]
>> about accreditation, and on bringing the patches up to date with the most
>> recent Linux patches. The main problem that I see with the latter
>> is that 2.6.18.8's PCI stack is now quite old, so using that as
>> a target would be more work than, for instance, using the paravirt ops
>> work that Jeremy Fitzhardinge has been working on.
>>
>> [1] http://lists.xensource.com/archives/html/xen-devel/2008-09/msg00965.html
>>
>> --
>> Simon Horman
>>   VA Linux Systems Japan K.K., Sydney, Australia Satellite Office
>>   H: www.vergenet.net/~horms/             W: www.valinux.co.jp/en
>>
>>
>
>
-- 
Simon Horman
  VA Linux Systems Japan K.K., Sydney, Australia Satellite Office
  H: www.vergenet.net/~horms/             W: www.valinux.co.jp/en



Keir Fraser

2009-Jan-21 14:38 UTC

Re: [Xen-devel] Status of SR-IOV for Xen?

Hopefully I will have MSI support fixed and checked in by tomorrow. The issue
was not with Intel's locking changes to pci/msi, but with Jan Beulich's
changes to ACKTYPE_NONE/ACKTYPE_EOI handling of MSIs. I'm just doing some
testing of my fix now.

 -- Keir

On 20/01/2009 15:20, "Zhao, Yu" <yu.zhao@intel.com> wrote:
> I'll respin those patches and resubmit them later. One caveat is that
> SR-IOV depends on MSI/MSI-X, which are currently disabled, as Keir said in
> the thread 'x86: Disable MSI as it seems to be triggering ASSERT at irq.c:269'.
> 
> Wei Huang wrote:
>> It is better not to duplicate effort if Yu Zhao is working on it.
>> Otherwise, I would be interested in it too.
>> 
>> -Wei
>> 
>> Simon Horman wrote:
>>> On Mon, Jan 19, 2009 at 11:13:37AM -0600, Wei Huang wrote:
>>>> Hi,
>>>> 
>>>> Any further update on SR-IOV support for Xen? Are we going to include
>>>> this feature soon?
>>> 
>>> I am also interested in this and would be willing to do some work
>>> on breaking out the patches to address Jan Beulich's concerns[1]
>>> about accreditation, and on bringing the patches up to date with the most
>>> recent Linux patches. The main problem that I see with the latter
>>> is that 2.6.18.8's PCI stack is now quite old, so using that as
>>> a target would be more work than, for instance, using the paravirt ops
>>> work that Jeremy Fitzhardinge has been working on.
>>> 
>>> [1] http://lists.xensource.com/archives/html/xen-devel/2008-09/msg00965.html
>>> 
>>> --
>>> Simon Horman
>>>   VA Linux Systems Japan K.K., Sydney, Australia Satellite Office
>>>   H: www.vergenet.net/~horms/             W: www.valinux.co.jp/en
>>> 
>>> 
>> 
> 


