Reiner Sailer
2005-Sep-02 03:26 UTC
[Xen-devel] [PATCH] ACM: adding get_ssid command and cleanup
This patch:

* adds a get_ssid ACM command that allows privileged domains to retrieve types for either a given ssid reference or a given domain id (of a running domain); this command can be used to extend access control into device domains, e.g., to control network traffic currently moving through Domain 0 uncontrolled by the ACM policy
* adds a script getlabel.sh that allows users inside Dom0 to retrieve the label for a given ssid reference or a given domain id (multiple labels might map onto a single ssid reference)
* cleans up label-related code in tools/security by merging common functions into labelfuncs.sh
* cleans up ACM code related to above changes (eventually approximating a common coding style)

Comments welcome.

Thanks
Reiner

Signed-off-by Reiner Sailer <sailer@us.ibm.com>
Signed-off by Stefan Berger <stefanb@us.ibm.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
David Palmer
2005-Sep-02 18:41 UTC
Re: [Xense-devel] [PATCH] ACM: adding get_ssid command and cleanup
Reiner,

I've looked over the code. As input, it takes either an SSID or a DomainID. If given a DomainID, it looks up the domain's SSID. It then returns two arrays of 0's and 1's. One array is a row from the STE-Type matrix and the other is a row from the ChWall-Type matrix corresponding to the given SSID.

My question then: What constitutes a legitimate use vs. a clear abuse of this information?

For example, let's say I create a domain that manages a resource. When another domain connects, the resource domain checks for a specific type using get_ssid() on the subject's DomainID and indexes one of the arrays with the type number. If the type is set, then it provides the "Privileged" interface to the other domain. If it is not set, then it provides the "Unprivileged" interface to the domain.

Is this legitimate or an abuse of the function? Why or why not?

Dave

_______________________________________________
Xense-devel mailing list
Xense-devel@lists.xensource.com
http://lists.xensource.com/xense-devel
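[Editor's sketch] The input/output behaviour Dave describes for get_ssid can be modelled in a few lines. This is only a toy model in Python, not the ACM code from the patch: the matrices, the domain table, the `SsidInfo` type and the single-index ssidref are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical toy policy: one row of 0/1 flags per ssidref, one column per type.
STE_MATRIX: List[List[int]] = [
    [0, 0, 0],  # ssidref 0
    [1, 0, 1],  # ssidref 1
    [0, 1, 0],  # ssidref 2
]
CHWALL_MATRIX: List[List[int]] = [
    [0, 0],  # ssidref 0
    [1, 0],  # ssidref 1
    [0, 1],  # ssidref 2
]
# Hypothetical table of running domains: domain id -> ssidref.
DOMAIN_SSIDREF: Dict[int, int] = {0: 1, 5: 2}


@dataclass(frozen=True)
class SsidInfo:
    ssidref: int
    ste_types: List[int]     # row of the STE-Type matrix
    chwall_types: List[int]  # row of the ChWall-Type matrix


def get_ssid(ssidref: Optional[int] = None, domid: Optional[int] = None) -> SsidInfo:
    """Return both type rows for an ssidref, or for a running domain's ssidref."""
    if (ssidref is None) == (domid is None):
        raise ValueError("pass exactly one of ssidref or domid")
    if domid is not None:
        ssidref = DOMAIN_SSIDREF[domid]  # KeyError if the domain is not running
    return SsidInfo(ssidref, list(STE_MATRIX[ssidref]), list(CHWALL_MATRIX[ssidref]))
```

For example, `get_ssid(domid=5)` first resolves the domain to ssidref 2 and then returns the two rows stored for that ssidref.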
Reiner Sailer
2005-Sep-03 02:53 UTC
[Xen-devel] Re: [Xense-devel] [PATCH] ACM: adding get_ssid command and cleanup
David Palmer <dwpalmer.xense@gmail.com> wrote on 09/02/2005 02:41:28 PM:

> Reiner,
>
> I've looked over the code. As input, it takes either an SSID or a DomainID. If given a DomainID, it looks up the domain's SSID. It then returns two arrays of 0's and 1's. One array is a row from the STE-Type matrix and the other is a row from the ChWall-Type matrix corresponding to the given SSID.

More information explaining legitimate and envisioned use of this function:

The get_ssid (get subject security identifier) command was mainly introduced to allow device domains to retrieve the security-related information they need from the hypervisor. This way, they can enforce access control on the virtual resources they are offering to other domains. To do this, a device domain only needs to know those types of a remote domain that it shares with this domain. In the future, we will restrict domains other than the security management domain (currently dom0) to those types.

We use get_ssid based on the domainID in device domains that need to know the types of their peer domains requiring access (e.g., requesting to mount a logical partition). This usually involves code "behind" backend interfaces in Xen.

We plan to use get_ssid based on the ssidref for resources (once resource labeling is introduced) to control the allocation of physical resources (e.g., peripherals) to domains according to the types of the domain to which a peripheral is being assigned and the types of the peripheral (only domains that share a type with the peripheral can own it).

> My question then: What constitutes a legitimate use vs. a clear abuse of this information?
>
> For example, let's say I create a domain that manages a resource. When another domain connects, the resource domain checks for a specific type using get_ssid() on the subject's DomainID and indexes one of the arrays with the type number. If the type is set, then it provides the "Privileged" interface with the other domain.

Some background for the legitimate use of this function:

The access control decision of a device domain is yes/no to a request of a remote domain to access a resource (e.g., connect a front-end virtual block device driver in a user domain to a back-end virtual block device driver in a device domain). It is not based on any specific operation but only on the security types of the domains. The "privileged" part comes in when a domain tries to use get_ssid on the hypervisor.

Your question seems to go towards operation granularity for access control decisions, which is not what the current policies envision. We leave this granularity to upper layers (inside domains) in Xen. I could re-formulate your latter sentence: "If the type is set, then it allows access (any access), otherwise it denies access to the resource." Denying access in this context means, e.g., that connecting a front-end block device driver to the respective back-end block device driver fails and a domain will not be able to mount a drive (or access the network in case of network front/back ends).

The hypervisor with the help of device domains does NOT control the operation ("mount disk" or "send network traffic") but controls general access of domains to virtual resources (access to the storage domain's virtual disks, access to the network domain's virtual network interfaces). In this context, the hypervisor controls whether a domain can communicate at all with a device domain; the device domain then controls whether a domain can access a certain virtual resource.

> Is this legitimate or an abuse of the function? Why or why not?

Using get_ssid is restricted to privileged domains. If the privileged domain is a device domain, then it MUST enforce the hypervisor policy (here Type Enforcement). To further restrict access in higher layers is legitimate and envisioned in device domains. Offering other domains even an unprivileged interface if they don't share a type is a violation of the hypervisor STE policy; this is illegal in device domains. Using get_ssid in any other privileged domain in the way you describe could / should be considered abuse since:

a) the policy information is not used as intended and inconsistencies are likely to evolve
b) predicting the effect of policy settings on the enforcement becomes increasingly difficult (even the simple STE policy now can define quite complex relationships)

Resolution:

i) The privileged/unprivileged access control interface could be implemented based on a separate policy/enforcement layer above and independent of the hypervisor (inside the privileged domain).
ii) A different hypervisor security policy could be implemented that does not conflict with your privileged/unprivileged interface interpretation. At this point your example use of get_ssid becomes legitimate since it is consistent with the interpretation of the hypervisor policy.

I hope this is helpful.

Thanks
Reiner
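[Editor's sketch] The yes/no decision Reiner describes for a device domain (allow any access if the two sides share a type, deny otherwise) might be expressed as below. This is a minimal Python illustration under stated assumptions: types are given as 0/1 rows of equal length, and the function names `shares_ste_type` and `allow_connect` are hypothetical, not part of the patch.

```python
from typing import Sequence


def shares_ste_type(a: Sequence[int], b: Sequence[int]) -> bool:
    """True if at least one type is set in both 0/1 rows."""
    if len(a) != len(b):
        raise ValueError("type rows must cover the same set of types")
    return any(x and y for x, y in zip(a, b))


def allow_connect(resource_types: Sequence[int], peer_types: Sequence[int]) -> bool:
    """Coarse device-domain decision: all access or none, never per operation."""
    return shares_ste_type(resource_types, peer_types)
```

Note that the decision does not take an operation such as "mount disk" as input; in this reading, finer-grained control belongs to a separate layer inside the domain.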
David Palmer
2005-Sep-03 16:49 UTC
Re: [Xense-devel] [PATCH] ACM: adding get_ssid command and cleanup
Yes, that helps considerably. I had the mistaken impression that you were implementing the Flask architecture. From the papers I've read, it calls for object managers and a security server. Each object manager is only concerned with object-specific knowledge for policy enforcement. Object managers rely on a central security server to make policy decisions. The security server has the sole responsibility of interpreting the policy. This is clearly not the architecture you envision for sHype.

Instead, what I hear you saying sounds like a collection of resource reference monitors that follow a global policy that applies to all reference monitors.

1. Each reference monitor makes policy decisions and enforces them for its resources.
2. A central policy server is used to provide the relevant portions of the global policy to the reference monitors.
3. Each reference monitor faithfully interprets the global policy according to the common policy semantics.
4. Each reference monitor enforces the global policy in that it does not allow any more access than what is permitted by the policy semantics. It may choose to grant less access as long as it does not change the meaning of the global policy.

For example, let's consider the case where I have a domain that provides both a "privileged" and "unprivileged" interface to its resources. The global policy allows a red and a black domain to each connect to the resource domain. The resource domain may choose to provide different levels of access to red and black. It should not interpret the global policy differently, but instead it can honor a local policy that names the red and black domains.

Have you published a paper detailing this architecture and how it compares with other architectures? It would be interesting to go over it in detail and see what you have learned about the approach.

In your messages, you note that it is important that the global policy has a consistent meaning for all reference monitors, and that the architecture supports the ability to change the meaning of the policy in the future.

1. Doesn't optimizing the policy decision logic for each resource monitor increase the risk that there will be differences in how each of them interprets the global policy?

Although we all try to write perfect code, we certainly have to accept that it doesn't generally happen. There is an advantage to having a single golden implementation where defects can be fixed such that all resource managers benefit. Independent optimizations for the policy decision logic in each resource monitor increase the chances for defects that have to be fixed independently. Unfortunately, testing tends not to work well for eliminating security vulnerabilities as it only finds the few that were tested for.

2. How can semantic changes in the global policy be made?

If each resource monitor is responsible for interpreting the policy consistently with each other, aren't they locked into the specific semantics of the policy they understand? In the worst case, won't this lead to needing to rewrite each reference monitor in order to add or alter the policy semantics?

I'm concerned that if I start implementing my own reference monitor with the given get_ssid() function, I'll end up having to rewrite it completely as it won't be consistent with the solution you have in mind for addressing the goals of providing consistent policy semantics and allowing them to be changed in the future.

Dave
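[Editor's sketch] Dave's point 4 and his red/black example describe two layers: the global policy decides whether a domain may connect at all, and a local policy may only narrow what is offered. A minimal Python sketch of that layering follows; the label names, the `LOCAL_INTERFACE` table and `choose_interface` are hypothetical, and the global check is reduced to "shares at least one type".

```python
from typing import Dict, Optional, Sequence

# Hypothetical local policy of the resource domain: label -> interface offered.
LOCAL_INTERFACE: Dict[str, str] = {"red": "privileged", "black": "unprivileged"}


def choose_interface(
    label: str, resource_types: Sequence[int], peer_types: Sequence[int]
) -> Optional[str]:
    """Return the interface to offer, or None if access is denied."""
    # Global policy first: without a shared type nothing may be offered.
    if not any(x and y for x, y in zip(resource_types, peer_types)):
        return None
    # Local policy second: it can only restrict what the global policy permits.
    return LOCAL_INTERFACE.get(label)
```

A domain that the global policy rejects gets `None` no matter what the local table says, so the local layer never grants more than the global policy allows.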
Reiner Sailer
2005-Sep-03 20:16 UTC
[Xen-devel] Re: [Xense-devel] [PATCH] ACM: adding get_ssid command and cleanup
David Palmer <dwpalmer.xense@gmail.com> wrote on 09/03/2005 12:49:01 PM:

> Yes, that helps considerably. I had the mistaken impression that you were implementing the Flask architecture. From the papers I've read, it calls for object managers and a security server. Each object manager is only concerned with object-specific knowledge for policy enforcement. Object managers rely on a central security server to make policy decisions. The security server has the sole responsibility of interpreting the policy. This is clearly not the architecture you envision for sHype.

You are right. On one hand, we are applying a Flask-like architecture inside the hypervisor where we have "hooks" around operations on event channels and grant tables. These hooks actually do not know about policies but call into the ACM (security server). On the other hand, since some objects are located outside the hypervisor (virtual resources based on peripherals), we need to modularly extend this basic access control to allow certain trusted domains (device domains). Such domains are part of the access control infrastructure; they are not user domains.

> Instead, what I hear you saying sounds like a collection of resource reference monitors that follow a global policy that applies to all reference monitors.

Yes.

> 1. Each reference monitor makes policy decisions and enforces them for its resources.
>
> 2. A central policy server is used to provide the relevant portions of the global policy to the reference monitors.
>
> 3. Each reference monitor faithfully interprets the global policy according to the common policy semantics.
>
> 4. Each reference monitor enforces the global policy in that it does not allow any more access than what is permitted by the policy semantics. It may choose to grant less access as long as it does not change the meaning of the global policy.

Yes.

> For example, let's consider the case where I have a domain that provides both a "privileged" and "unprivileged" interface to its resources. The global policy allows a red and a black domain to each connect to the resource domain. The resource domain may choose to provide different levels of access to red and black. It should not interpret the global policy differently, but instead it can honor a local policy that names the red and black domains.
>
> Have you published a paper detailing this architecture and how it compares with other architectures? It would be interesting to go over it in detail and see what you have learned about the approach.

It will be at ACSAC 2005 (December, Tucson). We are working on it ;-)

> In your messages, you note that it is important that the global policy has a consistent meaning for all reference monitors, and that the architecture supports the ability to change the meaning of the policy in the future.
>
> 1. Doesn't optimizing the policy decision logic for each resource monitor increase the risk that there will be differences in how each of them interprets the global policy?

The current policies are simple, so it should be possible to get this right (eventually). The more difficult part is to ensure that the device domains are small and tight enough to keep access to resources of different types safely apart (MAC confinement). Minimal Linux + enforcement (e.g., SELinux), micro-kernels, ... might be interesting experiment candidates and we encourage experiments.

> Although we all try to write perfect code, we certainly have to accept that it doesn't generally happen. There is an advantage to having a single golden implementation where defects can be fixed such that all resource managers benefit. Independent optimizations for the policy decision logic in each resource monitor increases the chances for defects that have to be fix independently. Unfortunately, testing tends not to work well for eliminating security vulnerabilities as it only finds the few that were tested for.

This is one valid side of the argument. The other is that modularity has advantages too. We try to minimize the code intrusiveness by controlling access at the natural points where we have direct control over access and all the information necessary to derive the access decision. Both alternatives should be explored.

> 2. How can semantic changes in the global policy be made?
>
> If each resource monitor is responsible for interpreting the policy consistently with each other, aren't they locked into the specific semantics of the policy they understand? In the worse case, won't this lead to needing to rewrite each reference monitor in order to add or alter the policy semantics?
>
> I'm concerned that if I start implementing my own reference monitor with the given get_ssid() function, I'll end up having to rewrite it completely as it won't be consistent with the solution you have in mind for addressing the goals of providing consistent policy semantics and allowing them to be changed in the future.
>
> Dave

All reference monitors (hypervisor + device domains of multiple types) are part of the policy enforcement; from an access control viewpoint, they are more part of the hypervisor than they are a real domain. If you change the policy, then you need to do this in all the elements. This is no different from any other reference monitor implementation. Of course, the enforcement for device domains should ideally be a pretty small patch to the driver code, leveraging existing OS controls (e.g., SELinux) to take over the confinement/isolation part inside the domain. The hypervisor-level access control is supposed to be simple but strong (small TCB). It is not one of our current goals to optimize for changes in the semantics of a policy, since this is best done by defining a new policy and implementing the respective code. The coarse granularity of control rather suggests that the policy should be a smallest common denominator and a strong safety net.

There is certainly room for choices while we are moving into device domains, and it can be pretty interesting to experiment with different approaches for this integration. I believe that your original suggestion is a good one and that we should think about introducing an additional ACM call that retrieves a policy decision for the current policy based on two ssidrefs or one ssidref and a domain id (whatever is available in the device domain). This would ensure that such experiments can apply to the full range of possible enforcement options in device domains, so that we arrive at the best architecture and not some local maximum.

We have so far used the following criteria when evaluating hypervisor security architecture alternatives:

a) performance (without this it cannot survive in commercial environments; see history)
b) minimize code-intrusiveness (without this, the security architecture code will be subject to endless changes when other hypervisor code is optimized and maintained, or we end up re-writing the hypervisor; see history)
c) simplicity (minimal TCB but considering a) and b))

Regards
Reiner
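[Editor's sketch] The additional ACM call Reiner floats here, one that returns a policy decision rather than raw types, could look roughly like this from the caller's side. No such call is defined in this thread; the name `get_decision`, its parameters and the toy tables below are assumptions made purely for illustration, written in Python.

```python
from typing import Dict, Optional, Set

# Hypothetical toy policy: ssidref -> set of STE type names.
STE_TYPES: Dict[int, Set[str]] = {1: {"red"}, 2: {"black"}, 3: {"red", "black"}}
# Hypothetical table of running domains: domain id -> ssidref.
DOMAIN_SSIDREF: Dict[int, int] = {0: 3, 7: 1}


def get_decision(
    ssidref: int,
    *,
    peer_ssidref: Optional[int] = None,
    peer_domid: Optional[int] = None,
) -> bool:
    """Decide for (ssidref, ssidref) or (ssidref, domain id) under the current policy."""
    if (peer_ssidref is None) == (peer_domid is None):
        raise ValueError("pass exactly one of peer_ssidref or peer_domid")
    if peer_domid is not None:
        peer_ssidref = DOMAIN_SSIDREF[peer_domid]
    # Toy STE rule: permit only if the two ssidrefs share at least one type.
    return bool(STE_TYPES[ssidref] & STE_TYPES[peer_ssidref])
```

With an interface of this shape, the device domain would receive only a yes/no answer, so the interpretation of the policy would stay in one place, which is the concern Dave raised about independent reference monitors.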