Joanna Rutkowska
2010-Jul-06 21:37 UTC
[Xen-devel] pciback: question about the permissive flag
I'm trying to understand the purpose of the permissive flag in the Xen pciback driver. The comments in the code suggest that setting permissive=1 is "potentially unsafe", and I've been wondering why?

My thinking goes this way -- we either:

1) have IOMMU/VT-d in the system, and use it to isolate the device assigned to a DomU, in which case allowing the DomU to fully control the assigned device's config space should not be a problem because VT-d should do its job (we hope at least ;), or

2) we don't have IOMMU/VT-d, in which case assigning a device to anything other than Dom0 is simply insecure, no matter if we try to restrict access to config space (but still allow DMA engine to be programmed by DomU) or not.

So, what am I missing here?

joanna.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Keir Fraser
2010-Jul-07 06:32 UTC
Re: [Xen-devel] pciback: question about the permissive flag
On 06/07/2010 22:37, "Joanna Rutkowska" <joanna@invisiblethingslab.com> wrote:

> So, what am I missing here?

I think the fear was that there could be class- or device-specific config registers that we wouldn't know how to handle, and which could have unexpected effects if they are passed through naively. Concrete examples were never given, and this was all pre-vtd so as you say pass-through of a DMA-capable device was insecure anyway. I've always thought the permissive flag stuff was pretty useless, and I always suggest people to enable the permissive flag.

 -- Keir
Ian Pratt
2010-Jul-07 13:30 UTC
RE: [Xen-devel] pciback: question about the permissive flag
> I think the fear was that there could be class- or device-specific config
> registers that we wouldn't know how to handle, and which could have
> unexpected effects if they are passed through naively. Concrete examples
> were never given, and this was all pre-vtd so as you say pass-through of a
> DMA-capable device was insecure anyway. I've always thought the permissive
> flag stuff was pretty useless, and I always suggest people to enable the
> permissive flag.

There are some devices (typically integrated ones, e.g. igfx) that use PCI config space in nasty ways, such as to describe additional BARs, or to trigger SMIs. Allowing free access to these seems dangerous.

Ian
Joanna Rutkowska
2010-Jul-07 14:05 UTC
Re: [Xen-devel] pciback: question about the permissive flag
On 07/07/10 15:30, Ian Pratt wrote:
>> I think the fear was that there could be class- or device-specific
>> config registers that we wouldn't know how to handle, and which
>> could have unexpected effects if they are passed through naively.
>> Concrete examples were never given, and this was all pre-vtd so as
>> you say pass-through of a DMA-capable device was insecure anyway.
>> I've always thought the permissive flag stuff was pretty useless,
>> and I always suggest people to enable the permissive flag.
>
> There are some devices (typically integrated ones, e.g. igfx) that
> use PCI config space in nasty ways, such as to describe additional
> BARs, or to trigger SMIs. Allowing free access to these seems
> dangerous.
>

So, you're saying that, if we have a device that allows us to set some of its PCI config register (some BAR) to tell where to MMIO-map some of the device's additional config range, and if we "asked it" to map it over, say, some physical addresses belonging to the hypervisor, then the MCH would allow for that? And the CPU would happily redirect access to those addresses over to the device memory? Why would it? That would clearly be a CPU/chipset bug, as we normally would have to mark this memory range as MMIOed in the first place...

And even if we wanted to instruct the device to map its memory over some already MMIOed memory in a hypervisor, shouldn't VT-d prevent the read/write transactions going to this device?

As for the SMI generation: that stinks indeed. But, does it offer any control over the generated #SMI, e.g. what we write into the 0xb2 port, or something like that? If it does, then surely it's an avenue for DomU->SMM escalation, which would mean full system compromise.

I'm trying to figure out why so many drivers do not work well when run in a PV driver domain (specifically net drivers), but work fine when running in Dom0. Clearly this is not a pfn != mfn problem, as this inequality also applies to Dom0, while in Dom0 the same drivers work just fine. So it seems like it could only be caused by either of the following:

1) restricted access to device config space
2) interrupt routing problem

Or maybe something else?

Thanks,
joanna.
Konrad Rzeszutek Wilk
2010-Jul-07 15:18 UTC
Re: [Xen-devel] pciback: question about the permissive flag
On Tue, Jul 06, 2010 at 11:37:27PM +0200, Joanna Rutkowska wrote:
> I'm trying to understand the purpose of the permissive flag in the Xen
> pciback driver. The comments in the code suggest that setting
> permissive=1 is "potentially unsafe", and I've been wondering why?
>
> My thinking goes this way -- we either:
>
> 1) have IOMMU/VT-d in the system, and use it to isolate the device
> assigned to a DomU, in which case allowing the DomU to fully control the
> assigned device's config space should not be a problem because VT-d

But that is not the case. The PCI config writes are actually done by Dom0. The Xen PCI frontend redirects all config space reads/writes to the Xen PCI backend that does them on the guest behalf.

There are some backend-backend config space libs that deal with different regions (power, MSI), and for those that are not present the permissive flag is used to figure out whether the guest is allowed to write to that region.
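
The decision described above can be modelled roughly as follows. This is a simplified sketch in Python, not the driver's code: the field table, the name `filter_config_write` and the returned strings are all hypothetical, and the real backend may split a single write across several fields. Only the offsets of the PCI command register (0x04) and BAR0 (0x10) are taken from the standard config header layout.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Field:
    """A config-space region that the backend knows how to emulate."""
    name: str
    offset: int
    size: int


# Hypothetical table of regions with a dedicated handler.
KNOWN_FIELDS: List[Field] = [
    Field("command", 0x04, 2),  # standard PCI command register
    Field("bar0", 0x10, 4),     # standard first base address register
]


def filter_config_write(offset: int, size: int, permissive: bool,
                        fields: List[Field] = KNOWN_FIELDS) -> str:
    """Say what happens to a guest write of `size` bytes at `offset`.

    Assumed model: a write that overlaps a known field goes to that
    field's handler; any other write reaches the device only when the
    permissive flag is set, and is dropped otherwise.
    """
    for field in fields:
        overlaps = (offset < field.offset + field.size
                    and field.offset < offset + size)
        if overlaps:
            return "handled:" + field.name
    return "passthrough" if permissive else "discarded"


if __name__ == "__main__":
    # 0x40 stands in for some device-specific register with no handler.
    print(filter_config_write(0x04, 2, permissive=False))
    print(filter_config_write(0x40, 4, permissive=False))
    print(filter_config_write(0x40, 4, permissive=True))
```

In this model the permissive flag only matters for registers nobody wrote a handler for, which is exactly the class of device-specific registers the thread is worried about.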
Konrad Rzeszutek Wilk
2010-Jul-07 15:28 UTC
Re: [Xen-devel] pciback: question about the permissive flag
On Wed, Jul 07, 2010 at 04:05:44PM +0200, Joanna Rutkowska wrote:
> On 07/07/10 15:30, Ian Pratt wrote:
> >> I think the fear was that there could be class- or device-specific
> >> config registers that we wouldn't know how to handle, and which
> >> could have unexpected effects if they are passed through naively.
> >> Concrete examples were never given, and this was all pre-vtd so as
> >> you say pass-through of a DMA-capable device was insecure anyway.
> >> I've always thought the permissive flag stuff was pretty useless,
> >> and I always suggest people to enable the permissive flag.
> >
> > There are some devices (typically integrated ones, e.g. igfx) that
> > use PCI config space in nasty ways, such as to describe additional
> > BARs, or to trigger SMIs. Allowing free access to these seems
> > dangerous.
> >
>
> So, you're saying that, if we have a device that allows us to set some
> of its PCI config register (some BAR) to tell where to MMIO-map some of
> the device's additional config range, and if we "asked it" to map it
> over, say, some physical addresses belonging to the hypervisor, then the
> MCH would allow for that? And the CPU would happily redirect access to
> those addresses over to the device memory? Why would it? That would

I would think the VT-d chipset would throw a fit.

> clearly be a CPU/chipset bug, as we normally would have to mark this
> memory range as MMIOed in the first place...
>
> And even if we wanted to instruct the device to map its memory over some
> already MMIOed memory in a hypervisor, shouldn't VT-d prevent the
> read/write transactions going to this device?

That is my feeling too.

>
> As for the SMI generation: that stinks indeed. But, does it offer any
> control over the generated #SMI, e.g. what we write into the 0xb2 port,
> or something like that? If it does, then surely it's an avenue for
> DomU->SMM escalation, which would mean full system compromise.
>
> I'm trying to figure out why so many drivers do not work well when run
> in a PV driver domain (specifically net drivers), but work fine when
> running in Dom0. Clearly this is not a pfn != mfn problem, as this
> inequality also applies to Dom0, while in Dom0 the same drivers work
> just fine. So it seems like it could only be caused by either of the
> following:
> 1) restricted access to device config space

You can track those easily. Turn on xen-pciback.verbose=1 and you should see the writes/reads and see if there are any that touch on the restricted areas.

> 2) interrupt routing problem

Well, that can easily be seen by the /proc/interrupts. If the numbers are increasing the interrupts are getting there. Though if this is MSI/MSI-X make sure you have the latest pv-ops kernel. There were some bugs I introduced earlier on so that turning on MSI/MSI-X interrupts would trash the guest. That has been fixed nowadays.

>
> Or maybe something else?

If you crank up the debug options something should show up. Especially if you have the IOMMU turned on. Are these wireless drivers?
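
The /proc/interrupts check suggested above can be automated by comparing two snapshots taken a few seconds apart. The following is a minimal Python sketch; the function names `parse_interrupts` and `advancing` are made up for illustration, and it assumes the usual layout of a header row naming the CPUs followed by one `label: count count ... description` row per interrupt.

```python
from typing import Dict


def parse_interrupts(text: str) -> Dict[str, int]:
    """Map each interrupt label to its count summed over all CPUs."""
    lines = text.strip("\n").splitlines()
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals: Dict[str, int] = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # not an interrupt row
        total = 0
        for token in rest.split()[:ncpu]:
            if not token.isdigit():
                break  # reached the description columns
            total += int(token)
        totals[label.strip()] = total
    return totals


def advancing(before: str, after: str) -> Dict[str, int]:
    """Return the interrupts whose total count grew, with the increase."""
    old = parse_interrupts(before)
    new = parse_interrupts(after)
    return {irq: count - old.get(irq, 0)
            for irq, count in new.items()
            if count > old.get(irq, 0)}


if __name__ == "__main__":
    import time

    with open("/proc/interrupts") as f:
        first = f.read()
    time.sleep(5)  # generate traffic on the device meanwhile
    with open("/proc/interrupts") as f:
        second = f.read()
    for irq, delta in sorted(advancing(first, second).items()):
        print(irq, "+" + str(delta))
```

If the line belonging to the passed-through device never shows up in the output while the device is in use, interrupts are probably not reaching the guest.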
Ian Pratt
2010-Jul-07 15:44 UTC
RE: [Xen-devel] pciback: question about the permissive flag
> So, you're saying that, if we have a device that allows us to set some of
> its PCI config register (some BAR) to tell where to MMIO-map some of the
> device's additional config range, and if we "asked it" to map it over,
> say, some physical addresses belonging to the hypervisor, then the MCH
> would allow for that? And the CPU would happily redirect access to those
> addresses over to the device memory? Why would it? That would clearly be a
> CPU/chipset bug, as we normally would have to mark this memory range as
> MMIOed in the first place...

Mapping it over memory might be prevented by the MCH (would you want to rely on that?), but mapping it over another device is likely going to create system instability if not a vulnerability.

> And even if we wanted to instruct the device to map its memory over some
> already MMIOed memory in a hypervisor, shouldn't VT-d prevent the
> read/write transactions going to this device?

VT-d only deals with DMAs coming from the device, not CPU MMIOs.

> As for the SMI generation: that stinks indeed. But, does it offer any
> control over the generated #SMI, e.g. what we write into the 0xb2 port, or
> something like that?

No idea. Discarding such config writes just seems like a good default.

Ian
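
To make the "mapping it over another device" hazard concrete, here is a small Python sketch that reports which MMIO windows overlap once a guest has reprogrammed a BAR. It is only an illustration: the function name `overlapping_windows`, the device names and all addresses are invented, and nothing here reflects how Xen or pciback actually validates BARs.

```python
from typing import List, Tuple

Window = Tuple[str, int, int]  # (device name, base address, size in bytes)


def overlapping_windows(windows: List[Window]) -> List[Tuple[str, str]]:
    """Return every pair of devices whose address windows intersect."""
    ordered = sorted(windows, key=lambda w: w[1])
    clashes = []
    for i, (name_a, base_a, size_a) in enumerate(ordered):
        end_a = base_a + size_a
        for name_b, base_b, _size_b in ordered[i + 1:]:
            if base_b >= end_a:
                break  # sorted by base, so nothing later can overlap
            clashes.append((name_a, name_b))
    return clashes


if __name__ == "__main__":
    # Invented layout: the guest moved "guest-nic" on top of "dom0-disk".
    layout = [
        ("dom0-disk", 0xFE000000, 0x20000),
        ("guest-nic", 0xFE010000, 0x1000),
        ("dom0-gfx", 0xFD000000, 0x100000),
    ]
    print(overlapping_windows(layout))
```

A check of this kind only detects the conflict; as noted above, VT-d would not stop CPU-initiated accesses from reaching the wrong device.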
Joanna Rutkowska
2010-Jul-07 21:23 UTC
Re: [Xen-devel] pciback: question about the permissive flag
On 07/07/10 17:18, Konrad Rzeszutek Wilk wrote:
> On Tue, Jul 06, 2010 at 11:37:27PM +0200, Joanna Rutkowska wrote:
>> I'm trying to understand the purpose of the permissive flag in the Xen
>> pciback driver. The comments in the code suggest that setting
>> permissive=1 is "potentially unsafe", and I've been wondering why?
>>
>> My thinking goes this way -- we either:
>>
>> 1) have IOMMU/VT-d in the system, and use it to isolate the device
>> assigned to a DomU, in which case allowing the DomU to fully control the
>> assigned device's config space should not be a problem because VT-d
>
> But that is not the case. The PCI config writes are actually done by
> Dom0. The Xen PCI frontend redirects all config space reads/writes to
> the Xen PCI backend that does them on the guest behalf.
>

Hmm, not sure if I understand why you wrote "this is not the case" above? Of course DomU cannot directly change anything in PCI config space of any device, because its kernel code executes in Ring 3 or 1, and cannot do IO to 0xcf8/cfc. But I was under impression that once we assign a PCI device to the DomU, and once we set permissive=1, then this would effectively allow DomU to fully control the device config space. Is this not correct?

> There are some backend-backend config space libs that deal with
> different regions (power, MSI), and for those that are not present
> the permissive flag is used to figure out whether the guest is allowed
> to write to that region.
>

What do you mean by a "backend-backend" lib?

joanna.
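
For readers unfamiliar with the 0xcf8/0xcfc ports mentioned above: in the legacy PCI configuration mechanism, privileged code writes an address word to port 0xcf8 and then reads or writes the selected register through port 0xcfc. The Python sketch below only computes that address word; the function name `config_address` is an illustrative choice, and no port I/O is performed.

```python
CONFIG_ADDRESS_PORT = 0xCF8  # address word is written here
CONFIG_DATA_PORT = 0xCFC     # selected register is accessed here


def config_address(bus: int, device: int, function: int, register: int) -> int:
    """Build the 32-bit word that selects one config-space dword.

    Layout: bit 31 enable, bits 23-16 bus, bits 15-11 device,
    bits 10-8 function, bits 7-2 register (dword aligned).
    """
    if not (0 <= bus < 256 and 0 <= device < 32
            and 0 <= function < 8 and 0 <= register < 256):
        raise ValueError("bus/device/function/register out of range")
    return (0x80000000
            | (bus << 16)
            | (device << 11)
            | (function << 8)
            | (register & 0xFC))


if __name__ == "__main__":
    # Command register (offset 0x04) of bus 0, device 2, function 0.
    print(hex(config_address(0, 2, 0, 0x04)))
```

Since a PV DomU kernel cannot issue these port accesses itself, every such request has to travel through the frontend to the backend in Dom0, which is where the permissive check applies.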
Joanna Rutkowska
2010-Jul-07 21:41 UTC
Re: [Xen-devel] pciback: question about the permissive flag
On 07/07/10 17:44, Ian Pratt wrote:
>> So, you're saying that, if we have a device that allows us to set
>> some of its PCI config register (some BAR) to tell where to
>> MMIO-map some of the device's additional config range, and if we
>> "asked it" to map it over, say, some physical addresses belonging
>> to the hypervisor, then the MCH would allow for that? And the CPU
>> would happily redirect access to those addresses over to the device
>> memory? Why would it? That would clearly be a CPU/chipset bug, as
>> we normally would have to mark this memory range as MMIOed in the
>> first place...
>
> Mapping it over memory might be prevented by the MCH (would you want
> to rely on that?),

Well, we need to rely on the CPU and MCH anyway, so why not? :)

> but mapping it over another device is likely going
> to create system instability if not a vulnerability.
>
>> And even if we wanted to instruct the device to map its memory over
>> some already MMIOed memory in a hypervisor, shouldn't VT-d prevent
>> the read/write transactions going to this device?
>
> VT-d only deals with DMAs coming from the device, not CPU MMIOs.
>

So, we would have two devices on the PCIe bus that would be willing to respond for a single PCI read request (for some address that both of the devices map some of their memory). I guess which device would actually answer would be implementation/race-condition specific.

Let's assume the "bad" device answers the PCIe read request first, and will send its data back (this is what the attacker hopes to achieve -- to feed unexpected data into the hypervisor/Dom0). Are you saying that VT-d would not prevent this answer coming back to the CPU? Can somebody from Intel comment on this? This is interesting.

>> As for the SMI generation: that stinks indeed. But, does it offer
>> any control over the generated #SMI, e.g. what we write into the
>> 0xb2 port, or something like that?
>
> No idea. Discarding such config writes just seems like a good
> default.
>

So far I've been aware that the southbridge can trigger #SMI in response to certain conditions, e.g. wrong BDF address (which is apparently used by OEMs to emulate PCI devices from within SMM, how crazy is this?!).

But what would be the reason to let IGD device to trigger #SMI?

Can Interrupt Remapping (apparently present in VT-d2) be used to prevent a device from triggering an #SMI?

Thanks,
joanna.
Ian Pratt
2010-Jul-07 22:51 UTC
RE: [Xen-devel] pciback: question about the permissive flag
> >> And even if we wanted to instruct the device to map its memory over
> >> some already MMIOed memory in a hypervisor, shouldn't VT-d prevent
> >> the read/write transactions going to this device?
> >
> > VT-d only deals with DMAs coming from the device, not CPU MMIOs.
> >
>
> So, we would have two devices on the PCIe bus that would be willing to
> respond for a single PCI read request (for some address that both of the
> devices map some of their memory). I guess which device would actually
> answer would be implementation/race-condition specific.

On a PCI bus there's definitely opportunity for races. On a PCIe bus I'm not entirely sure what would happen as the bridge/IOH presumably has an opinion of which addresses should be routed through which ports. [You also have to be careful of multiple devices behind non-ACS capable bridges where creating a new BAR could cause DMAs to go peer-to-peer.]

> Let's assume the "bad" device answers the PCIe read request first, and will
> send its data back (this is what the attacker hopes to achieve -- to feed
> unexpected data into the hypervisor/Dom0). Are you saying that VT-d would
> not prevent this answer coming back to the CPU? Can somebody from Intel
> comment on this? This is interesting.

VT-d only gets involved with transactions initiated by the device (i.e. DMAs). Control/remapping of MMIO transactions initiated by the CPU is handled by the normal CPU MMU.

> >> As for the SMI generation: that stinks indeed. But, does it offer any
> >> control over the generated #SMI, e.g. what we write into the
> >> 0xb2 port, or something like that?
> >
> > No idea. Discarding such config writes just seems like a good default.
> >
>
> So far I've been aware that the southbridge can trigger #SMI in response
> to certain conditions, e.g. wrong BDF address (which is apparently used by
> OEMs to emulate PCI devices from within SMM, how crazy is this?!).
> But what would be the reason to let IGD device to trigger #SMI?

Probably something like OpRegion doorbells.

> Can Interrupt Remapping (apparently present in VT-d2) be used to prevent a
> device from triggering an #SMI?

Er, that's beyond my knowledge...

Ian
Konrad Rzeszutek Wilk
2010-Jul-09 14:09 UTC
Re: [Xen-devel] pciback: question about the permissive flag
On Wed, Jul 07, 2010 at 11:23:38PM +0200, Joanna Rutkowska wrote:
> On 07/07/10 17:18, Konrad Rzeszutek Wilk wrote:
> > On Tue, Jul 06, 2010 at 11:37:27PM +0200, Joanna Rutkowska wrote:
> >> I'm trying to understand the purpose of the permissive flag in the Xen
> >> pciback driver. The comments in the code suggest that setting
> >> permissive=1 is "potentially unsafe", and I've been wondering why?
> >>
> >> My thinking goes this way -- we either:
> >>
> >> 1) have IOMMU/VT-d in the system, and use it to isolate the device
> >> assigned to a DomU, in which case allowing the DomU to fully control the
> >> assigned device's config space should not be a problem because VT-d
> >
> > But that is not the case. The PCI config writes are actually done by
> > Dom0. The Xen PCI frontend redirects all config space reads/writes to
> > the Xen PCI backend that does them on the guest behalf.
> >
>
> Hmm, not sure if I understand why you wrote "this is not the case"
> above? Of course DomU cannot directly change anything in PCI config
> space of any device, because its kernel code executes in Ring 3 or 1,
> and cannot do IO to 0xcf8/cfc. But I was under impression that once we
> assign a PCI device to the DomU, and once we set permissive=1, then this
> would effectively allow DomU to fully control the device config space.
> Is this not correct?

That is correct.

>
> > There are some backend-backend config space libs that deal with
> > different regions (power, MSI), and for those that are not present
> > the permissive flag is used to figure out whether the guest is allowed
> > to write to that region.
> >
>
> What do you mean by a "backend-backend" lib?

drivers/xen/pciback/conf_space_*

>
> joanna.
>