Daniel De Graaf
2013-Jul-09 20:28 UTC
[PATCH v2] xen: use domid check in is_hardware_domain
Instead of checking is_privileged to determine if a domain should control the hardware, check that the domain_id is equal to zero (which is currently the only domain for which is_privileged is true). This allows other places where domain_id is checked for zero to be replaced with is_hardware_domain. The distinction between is_hardware_domain, is_control_domain, and domain 0 is based on the following disaggregation model: Domain 0 bootstraps the system. It may remain to perform requested builds of domains that need a minimal trust chain (i.e. vTPM domains). Other than being built by the hypervisor, nothing is special about this domain - although it may be useful to have is_control_domain() return true depending on the toolstack it uses to build other domains. The hardware domain manages devices for PCI pass-through to driver domains or can act as a driver domain itself, depending on the desired degree of disaggregation. It is also the domain managing devices that do not support pass-through: PCI configuration space access, parsing the hardware ACPI tables and system power or machine check events. This is the only domain where is_hardware_domain() is true. The return of is_control_domain() is false for this domain. The control domain manages other domains, controls guest launch and shutdown, and manages resource constraints; is_control_domain() returns true. The functionality guarded by is_control_domain may in the future be adapted to use explicit hypercalls, eliminating the special treatment of this domain. It may be reasonable to have multiple control domains on a multi-tenant system. Guest domains and other service or driver domains are all treated identically by the hypervisor; the security policy may further constrain administrative actions on or communication between these domains. Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Jan Beulich <jbeulich@suse.com> Cc: Keir Fraser <keir@xen.org> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Stefano Stabellini <stefano.stabellini@citrix.com> Cc: Tim Deegan <tim@xen.org> Cc: George Dunlap <George.Dunlap@eu.citrix.com> --- Changes from v1: Collapse patches and improve commit message. xen/arch/arm/domain.c | 2 +- xen/arch/arm/vgic.c | 2 +- xen/arch/arm/vpl011.c | 4 ++-- xen/arch/x86/domain.c | 2 +- xen/arch/x86/hvm/i8254.c | 2 +- xen/arch/x86/time.c | 4 ++-- xen/arch/x86/traps.c | 4 ++-- xen/common/domain.c | 10 +++++----- xen/drivers/passthrough/amd/pci_amd_iommu.c | 2 +- xen/drivers/passthrough/vtd/iommu.c | 8 ++++---- xen/drivers/passthrough/vtd/x86/vtd.c | 2 +- xen/include/xen/sched.h | 4 ++-- 12 files changed, 23 insertions(+), 23 deletions(-) diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c index f465ab7..c7dc69a 100644 --- a/xen/arch/arm/domain.c +++ b/xen/arch/arm/domain.c @@ -504,7 +504,7 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags) goto fail; /* Domain 0 gets a real UART not an emulated one */ - if ( d->domain_id && (rc = domain_uart0_init(d)) != 0 ) + if ( !is_hardware_domain(d) && (rc = domain_uart0_init(d)) != 0 ) goto fail; return 0; diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c index 2e4b11f..d9c73be 100644 --- a/xen/arch/arm/vgic.c +++ b/xen/arch/arm/vgic.c @@ -82,7 +82,7 @@ int domain_vgic_init(struct domain *d) /* Currently nr_lines in vgic and gic doesn''t have the same meanings * Here nr_lines = number of SPIs */ - if ( d->domain_id == 0 ) + if ( is_hardware_domain(d) ) d->arch.vgic.nr_lines = gic_number_lines() - 32; else d->arch.vgic.nr_lines = 0; /* We don''t need SPIs for the guest */ diff --git a/xen/arch/arm/vpl011.c b/xen/arch/arm/vpl011.c index 13ba623..0e9454f 100644 --- a/xen/arch/arm/vpl011.c +++ b/xen/arch/arm/vpl011.c @@ -43,7 +43,7 @@ int domain_uart0_init(struct domain *d) { - ASSERT( d->domain_id ); + ASSERT( !is_hardware_domain(d) ); spin_lock_init(&d->arch.uart0.lock); d->arch.uart0.idx = 0; @@ -87,7 +87,7 @@ static int uart0_mmio_check(struct vcpu *v, paddr_t addr) { struct domain *d = v->domain; - return d->domain_id != 0 && addr >= UART0_START && addr < UART0_END; + return !is_hardware_domain(d) && addr >= UART0_START && addr < UART0_END; } static int uart0_mmio_read(struct vcpu *v, mmio_info_t *info) diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index 874742c..51d0ea6 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -1459,7 +1459,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next) } set_cpuid_faulting(!is_hvm_vcpu(next) && - (next->domain->domain_id != 0)); + !is_control_domain(next->domain)); } if (is_hvm_vcpu(next) && (prev != next) ) diff --git a/xen/arch/x86/hvm/i8254.c b/xen/arch/x86/hvm/i8254.c index c0d6bc2..add61e3 100644 --- a/xen/arch/x86/hvm/i8254.c +++ b/xen/arch/x86/hvm/i8254.c @@ -541,7 +541,7 @@ int pv_pit_handler(int port, int data, int write) .data = data }; - if ( (current->domain->domain_id == 0) && dom0_pit_access(&ioreq) ) + if ( is_hardware_domain(current->domain) && dom0_pit_access(&ioreq) ) { /* nothing to do */; } diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c index cf8bc78..bacc8ef 100644 --- a/xen/arch/x86/time.c +++ b/xen/arch/x86/time.c @@ -1885,7 +1885,7 @@ void tsc_set_info(struct domain *d, uint32_t tsc_mode, uint64_t elapsed_nsec, uint32_t gtsc_khz, uint32_t incarnation) { - if ( is_idle_domain(d) || (d->domain_id == 0) ) + if ( is_idle_domain(d) || is_hardware_domain(d) ) { d->arch.vtsc = 0; return; @@ -1962,7 +1962,7 @@ static void dump_softtsc(unsigned char key) "warp=%lu (count=%lu)\n", tsc_max_warp, tsc_check_count); for_each_domain ( d ) { - if ( d->domain_id == 0 && d->arch.tsc_mode == TSC_MODE_DEFAULT ) + if ( is_hardware_domain(d) && d->arch.tsc_mode == TSC_MODE_DEFAULT ) continue; printk("dom%u%s: mode=%d",d->domain_id, is_hvm_domain(d) ? "(hvm)" : "", d->arch.tsc_mode); diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c index 57dbd0c..8e8d3d1 100644 --- a/xen/arch/x86/traps.c +++ b/xen/arch/x86/traps.c @@ -745,7 +745,7 @@ static void pv_cpuid(struct cpu_user_regs *regs) c = regs->ecx; d = regs->edx; - if ( current->domain->domain_id != 0 ) + if ( !is_control_domain(current->domain) ) { unsigned int cpuid_leaf = a, sub_leaf = c; @@ -1836,7 +1836,7 @@ static inline uint64_t guest_misc_enable(uint64_t val) static int is_cpufreq_controller(struct domain *d) { return ((cpufreq_controller == FREQCTL_dom0_kernel) && - (d->domain_id == 0)); + is_hardware_domain(d)); } #include "x86_64/mmconfig.h" diff --git a/xen/common/domain.c b/xen/common/domain.c index 6c264a5..692372a 100644 --- a/xen/common/domain.c +++ b/xen/common/domain.c @@ -238,7 +238,7 @@ struct domain *domain_create( if ( domcr_flags & DOMCRF_hvm ) d->is_hvm = 1; - if ( domid == 0 ) + if ( is_hardware_domain(d) ) { d->is_pinned = opt_dom0_vcpus_pin; d->disable_migrate = 1; @@ -263,10 +263,10 @@ struct domain *domain_create( d->is_paused_by_controller = 1; atomic_inc(&d->pause_count); - if ( domid ) - d->nr_pirqs = nr_static_irqs + extra_domU_irqs; - else + if ( is_hardware_domain(d) ) d->nr_pirqs = nr_static_irqs + extra_dom0_irqs; + else + d->nr_pirqs = nr_static_irqs + extra_domU_irqs; if ( d->nr_pirqs > nr_irqs ) d->nr_pirqs = nr_irqs; @@ -600,7 +600,7 @@ void domain_shutdown(struct domain *d, u8 reason) d->shutdown_code = reason; reason = d->shutdown_code; - if ( d->domain_id == 0 ) + if ( is_hardware_domain(d) ) dom0_shutdown(reason); if ( d->is_shutting_down ) diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c index 5ea1a1d..e8ec695 100644 --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c @@ -122,7 +122,7 @@ static void amd_iommu_setup_domain_device( BUG_ON( !hd->root_table || !hd->paging_mode || !iommu->dev_table.buffer ); - if ( iommu_passthrough && (domain->domain_id == 0) ) + if ( iommu_passthrough && is_hardware_domain(domain) ) valid = 0; if ( ats_enabled ) diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c index 0fc10de..8d5c43d 100644 --- a/xen/drivers/passthrough/vtd/iommu.c +++ b/xen/drivers/passthrough/vtd/iommu.c @@ -1336,7 +1336,7 @@ int domain_context_mapping_one( return res; } - if ( iommu_passthrough && (domain->domain_id == 0) ) + if ( iommu_passthrough && is_hardware_domain(domain) ) { context_set_translation_type(*context, CONTEXT_TT_PASS_THRU); agaw = level_to_agaw(iommu->nr_pt_levels); @@ -1710,7 +1710,7 @@ static int intel_iommu_map_page( return 0; /* do nothing if dom0 and iommu supports pass thru */ - if ( iommu_passthrough && (d->domain_id == 0) ) + if ( iommu_passthrough && is_hardware_domain(d) ) return 0; spin_lock(&hd->mapping_lock); @@ -1754,7 +1754,7 @@ static int intel_iommu_map_page( static int intel_iommu_unmap_page(struct domain *d, unsigned long gfn) { /* Do nothing if dom0 and iommu supports pass thru. */ - if ( iommu_passthrough && (d->domain_id == 0) ) + if ( iommu_passthrough && is_hardware_domain(d) ) return 0; dma_pte_clear_one(d, (paddr_t)gfn << PAGE_SHIFT_4K); @@ -1927,7 +1927,7 @@ static int intel_iommu_remove_device(u8 devfn, struct pci_dev *pdev) /* If the device belongs to dom0, and it has RMRR, don''t remove it * from dom0, because BIOS may use RMRR at booting time. */ - if ( pdev->domain->domain_id == 0 ) + if ( is_hardware_domain(pdev->domain) ) { for_each_rmrr_device ( rmrr, bdf, i ) { diff --git a/xen/drivers/passthrough/vtd/x86/vtd.c b/xen/drivers/passthrough/vtd/x86/vtd.c index 875b033..eb9e52a 100644 --- a/xen/drivers/passthrough/vtd/x86/vtd.c +++ b/xen/drivers/passthrough/vtd/x86/vtd.c @@ -117,7 +117,7 @@ void __init iommu_set_dom0_mapping(struct domain *d) { unsigned long i, j, tmp, top; - BUG_ON(d->domain_id != 0); + BUG_ON(!is_hardware_domain(d)); top = max(max_pdx, pfn_to_pdx(0xffffffffUL >> PAGE_SHIFT) + 1); diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h index ae6a3b8..4e6edf5 100644 --- a/xen/include/xen/sched.h +++ b/xen/include/xen/sched.h @@ -722,10 +722,10 @@ void watchdog_domain_destroy(struct domain *d); /* * Use this check when the following are both true: * - Using this feature or interface requires full access to the hardware - * (that is, this is would not be suitable for a driver domain) + * (that is, this would not be suitable for a driver domain) * - There is never a reason to deny dom0 access to this */ -#define is_hardware_domain(_d) ((_d)->is_privileged) +#define is_hardware_domain(_d) ((_d)->domain_id == 0) /* This check is for functionality specific to a control domain */ #define is_control_domain(_d) ((_d)->is_privileged) -- 1.8.1.4
Jan Beulich
2013-Jul-10 08:30 UTC
Re: [PATCH v2] xen: use domid check in is_hardware_domain
>>> On 09.07.13 at 22:28, Daniel De Graaf <dgdegra@tycho.nsa.gov> wrote: > Instead of checking is_privileged to determine if a domain should > control the hardware, check that the domain_id is equal to zero (which > is currently the only domain for which is_privileged is true). This > allows other places where domain_id is checked for zero to be replaced > with is_hardware_domain. > > The distinction between is_hardware_domain, is_control_domain, and > domain 0 is based on the following disaggregation model: > > Domain 0 bootstraps the system. It may remain to perform requested > builds of domains that need a minimal trust chain (i.e. vTPM domains). > Other than being built by the hypervisor, nothing is special about this > domain - although it may be useful to have is_control_domain() return > true depending on the toolstack it uses to build other domains. > > The hardware domain manages devices for PCI pass-through to driver > domains or can act as a driver domain itself, depending on the desired > degree of disaggregation. It is also the domain managing devices that > do not support pass-through: PCI configuration space access, parsing the > hardware ACPI tables and system power or machine check events. This is > the only domain where is_hardware_domain() is true. The return of > is_control_domain() is false for this domain. > > The control domain manages other domains, controls guest launch and > shutdown, and manages resource constraints; is_control_domain() returns > true. The functionality guarded by is_control_domain may in the future > be adapted to use explicit hypercalls, eliminating the special treatment > of this domain. It may be reasonable to have multiple control domains > on a multi-tenant system. > > Guest domains and other service or driver domains are all treated > identically by the hypervisor; the security policy may further constrain > administrative actions on or communication between these domains. > > Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> > Acked-by: Jan Beulich <jbeulich@suse.com>This isn''t correct: I gave my Reviewed-by for the full series; the Acked-by was given only for the two patches touching only code I''m maintainer for. The distinction we''re trying to establish is that an ack implies that a maintainer is okay with a certain patch (i.e. a non-maintainer would generally not send ack-s at all), whereas a review means what it says - the patch was reviewed. That said, while I realize that you did this collapsing because you were asked to by George, I''m not certain this was a good move: With one big patch, there''s now no way to apply it step by step, as the necessary ack-s trickle in. But the significantly extended description is perhaps outweighing that. Jan
George Dunlap
2013-Jul-10 09:18 UTC
Re: [PATCH v2] xen: use domid check in is_hardware_domain
On 10/07/13 09:30, Jan Beulich wrote:>>>> On 09.07.13 at 22:28, Daniel De Graaf <dgdegra@tycho.nsa.gov> wrote: >> Instead of checking is_privileged to determine if a domain should >> control the hardware, check that the domain_id is equal to zero (which >> is currently the only domain for which is_privileged is true). This >> allows other places where domain_id is checked for zero to be replaced >> with is_hardware_domain. >> >> The distinction between is_hardware_domain, is_control_domain, and >> domain 0 is based on the following disaggregation model: >> >> Domain 0 bootstraps the system. It may remain to perform requested >> builds of domains that need a minimal trust chain (i.e. vTPM domains). >> Other than being built by the hypervisor, nothing is special about this >> domain - although it may be useful to have is_control_domain() return >> true depending on the toolstack it uses to build other domains. >> >> The hardware domain manages devices for PCI pass-through to driver >> domains or can act as a driver domain itself, depending on the desired >> degree of disaggregation. It is also the domain managing devices that >> do not support pass-through: PCI configuration space access, parsing the >> hardware ACPI tables and system power or machine check events. This is >> the only domain where is_hardware_domain() is true. The return of >> is_control_domain() is false for this domain. >> >> The control domain manages other domains, controls guest launch and >> shutdown, and manages resource constraints; is_control_domain() returns >> true. The functionality guarded by is_control_domain may in the future >> be adapted to use explicit hypercalls, eliminating the special treatment >> of this domain. It may be reasonable to have multiple control domains >> on a multi-tenant system. >> >> Guest domains and other service or driver domains are all treated >> identically by the hypervisor; the security policy may further constrain >> administrative actions on or communication between these domains. >> >> Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> >> Acked-by: Jan Beulich <jbeulich@suse.com> > This isn''t correct: I gave my Reviewed-by for the full series; the > Acked-by was given only for the two patches touching only code > I''m maintainer for. > > The distinction we''re trying to establish is that an ack implies that > a maintainer is okay with a certain patch (i.e. a non-maintainer > would generally not send ack-s at all), whereas a review means > what it says - the patch was reviewed.The definition you''re using for Reviewed-by: is wrong. From Linux''s SubmittingPatches: Reviewed-by:, instead, indicates that the patch has been reviewed and found acceptable according to the Reviewer''s Statement: Reviewer''s statement of oversight By offering my Reviewed-by: tag, I state that: (a) I have carried out a technical review of this patch to evaluate its appropriateness and readiness for inclusion into the mainline kernel. (b) Any problems, concerns, or questions relating to the patch have been communicated back to the submitter. I am satisfied with the submitter''s response to my comments. (c) While there may be things that could be improved with this submission, I believe that it is, at this time, (1) a worthwhile modification to the kernel, and (2) free of known issues which would argue against its inclusion. (d) While I have reviewed the patch and believe it to be sound, I do not (unless explicitly stated elsewhere) make any warranties or guarantees that it will achieve its stated purpose or function properly in any given situation. A Reviewed-by tag is a statement of opinion that the patch is an appropriate modification of the kernel without any remaining serious technical issues. Any interested reviewer (who has done the work) can offer a Reviewed-by tag for a patch. This tag serves to give credit to reviewers and to inform maintainers of the degree of review which has been done on the patch. Reviewed-by: tags, when supplied by reviewers known to understand the subject area and to perform thorough reviews, will normally increase the likelihood of your patch getting into the kernel. (ref: https://github.com/torvalds/linux/blob/master/Documentation/SubmittingPatches) So Reviewed-by is much stronger than Acked-by, and one could be forgiven for thinking that it could be "collapsed down" that way. -George
Jan Beulich
2013-Jul-10 09:38 UTC
Re: [PATCH v2] xen: use domid check in is_hardware_domain
>>> On 10.07.13 at 11:18, George Dunlap <george.dunlap@eu.citrix.com> wrote: > On 10/07/13 09:30, Jan Beulich wrote: >>>>> On 09.07.13 at 22:28, Daniel De Graaf <dgdegra@tycho.nsa.gov> wrote: >>> Instead of checking is_privileged to determine if a domain should >>> control the hardware, check that the domain_id is equal to zero (which >>> is currently the only domain for which is_privileged is true). This >>> allows other places where domain_id is checked for zero to be replaced >>> with is_hardware_domain. >>> >>> The distinction between is_hardware_domain, is_control_domain, and >>> domain 0 is based on the following disaggregation model: >>> >>> Domain 0 bootstraps the system. It may remain to perform requested >>> builds of domains that need a minimal trust chain (i.e. vTPM domains). >>> Other than being built by the hypervisor, nothing is special about this >>> domain - although it may be useful to have is_control_domain() return >>> true depending on the toolstack it uses to build other domains. >>> >>> The hardware domain manages devices for PCI pass-through to driver >>> domains or can act as a driver domain itself, depending on the desired >>> degree of disaggregation. It is also the domain managing devices that >>> do not support pass-through: PCI configuration space access, parsing the >>> hardware ACPI tables and system power or machine check events. This is >>> the only domain where is_hardware_domain() is true. The return of >>> is_control_domain() is false for this domain. >>> >>> The control domain manages other domains, controls guest launch and >>> shutdown, and manages resource constraints; is_control_domain() returns >>> true. The functionality guarded by is_control_domain may in the future >>> be adapted to use explicit hypercalls, eliminating the special treatment >>> of this domain. It may be reasonable to have multiple control domains >>> on a multi-tenant system. >>> >>> Guest domains and other service or driver domains are all treated >>> identically by the hypervisor; the security policy may further constrain >>> administrative actions on or communication between these domains. >>> >>> Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> >>> Acked-by: Jan Beulich <jbeulich@suse.com> >> This isn''t correct: I gave my Reviewed-by for the full series; the >> Acked-by was given only for the two patches touching only code >> I''m maintainer for. >> >> The distinction we''re trying to establish is that an ack implies that >> a maintainer is okay with a certain patch (i.e. a non-maintainer >> would generally not send ack-s at all), whereas a review means >> what it says - the patch was reviewed. > > The definition you''re using for Reviewed-by: is wrong. > > From Linux''s SubmittingPatches: > [...]So what was wrong with my description of Reviewed-by?> So Reviewed-by is much stronger than Acked-by, and one could be forgiven > for thinking that it could be "collapsed down" that way.What I was trying to point out is my current understanding: No matter how Linux understands Acked-by, we aim at it to mean that a maintainer is fine with a given patch being committed by a committer. And then again, having offered my Reviewed-by to the whole series (and explicitly pointed out that an eventual Acked-by would apply only to a subset, in an attempt to make my understanding of the tag''s meaning explicit), I also don''t see the point in weakening the stronger, wider scope tag. Jan
George Dunlap
2013-Jul-10 10:10 UTC
Re: [PATCH v2] xen: use domid check in is_hardware_domain
On 10/07/13 10:38, Jan Beulich wrote:>>>> On 10.07.13 at 11:18, George Dunlap <george.dunlap@eu.citrix.com> wrote: >> On 10/07/13 09:30, Jan Beulich wrote: >>>>>> On 09.07.13 at 22:28, Daniel De Graaf <dgdegra@tycho.nsa.gov> wrote: >>>> Instead of checking is_privileged to determine if a domain should >>>> control the hardware, check that the domain_id is equal to zero (which >>>> is currently the only domain for which is_privileged is true). This >>>> allows other places where domain_id is checked for zero to be replaced >>>> with is_hardware_domain. >>>> >>>> The distinction between is_hardware_domain, is_control_domain, and >>>> domain 0 is based on the following disaggregation model: >>>> >>>> Domain 0 bootstraps the system. It may remain to perform requested >>>> builds of domains that need a minimal trust chain (i.e. vTPM domains). >>>> Other than being built by the hypervisor, nothing is special about this >>>> domain - although it may be useful to have is_control_domain() return >>>> true depending on the toolstack it uses to build other domains. >>>> >>>> The hardware domain manages devices for PCI pass-through to driver >>>> domains or can act as a driver domain itself, depending on the desired >>>> degree of disaggregation. It is also the domain managing devices that >>>> do not support pass-through: PCI configuration space access, parsing the >>>> hardware ACPI tables and system power or machine check events. This is >>>> the only domain where is_hardware_domain() is true. The return of >>>> is_control_domain() is false for this domain. >>>> >>>> The control domain manages other domains, controls guest launch and >>>> shutdown, and manages resource constraints; is_control_domain() returns >>>> true. The functionality guarded by is_control_domain may in the future >>>> be adapted to use explicit hypercalls, eliminating the special treatment >>>> of this domain. It may be reasonable to have multiple control domains >>>> on a multi-tenant system. >>>> >>>> Guest domains and other service or driver domains are all treated >>>> identically by the hypervisor; the security policy may further constrain >>>> administrative actions on or communication between these domains. >>>> >>>> Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> >>>> Acked-by: Jan Beulich <jbeulich@suse.com> >>> This isn''t correct: I gave my Reviewed-by for the full series; the >>> Acked-by was given only for the two patches touching only code >>> I''m maintainer for. >>> >>> The distinction we''re trying to establish is that an ack implies that >>> a maintainer is okay with a certain patch (i.e. a non-maintainer >>> would generally not send ack-s at all), whereas a review means >>> what it says - the patch was reviewed. >> The definition you''re using for Reviewed-by: is wrong. >> >> From Linux''s SubmittingPatches: >> [...] > So what was wrong with my description of Reviewed-by?I think the interpretation of "Ack" is just, "I''m OK with this" / "I don''t object". Reviewed-by includes not only, "I think this patch is sound", but "I think this patch should be accepted". As such, it would subsume and imply an Ack. You said, "Reviewed-by means what it says - the patch was reviewed", which I understood to mean only "I think this patch is sound", and not "I think this patch should be accepted". Otherwise I don''t understand the point you are trying to make.>> So Reviewed-by is much stronger than Acked-by, and one could be forgiven >> for thinking that it could be "collapsed down" that way. > What I was trying to point out is my current understanding: No > matter how Linux understands Acked-by, we aim at it to mean > that a maintainer is fine with a given patch being committed by > a committer. > > And then again, having offered my Reviewed-by to the whole > series (and explicitly pointed out that an eventual Acked-by > would apply only to a subset, in an attempt to make my > understanding of the tag''s meaning explicit),Yes, and so since Reviewed-by implies everything that Acked-by implies, the Acked-by''s are redundant.> I also don''t see > the point in weakening the stronger, wider scope tag.I''m not what you''re talking about here -- which is the stronger scope tag, and how do you perceive it being weakened? -George
Jan Beulich
2013-Jul-10 10:30 UTC
Re: [PATCH v2] xen: use domid check in is_hardware_domain
>>> On 10.07.13 at 12:10, George Dunlap <george.dunlap@eu.citrix.com> wrote: > On 10/07/13 10:38, Jan Beulich wrote: >> I also don''t see >> the point in weakening the stronger, wider scope tag. > > I''m not what you''re talking about here -- which is the stronger scope > tag, and how do you perceive it being weakened?We appear to agree that Reviewed-by is the stronger tag. Hence replacing it by Acked-by weakens things imo. Jan
George Dunlap
2013-Jul-10 10:57 UTC
Re: [PATCH v2] xen: use domid check in is_hardware_domain
On 09/07/13 21:28, Daniel De Graaf wrote:> Instead of checking is_privileged to determine if a domain should > control the hardware, check that the domain_id is equal to zero (which > is currently the only domain for which is_privileged is true). This > allows other places where domain_id is checked for zero to be replaced > with is_hardware_domain. > > The distinction between is_hardware_domain, is_control_domain, and > domain 0 is based on the following disaggregation model: > > Domain 0 bootstraps the system. It may remain to perform requested > builds of domains that need a minimal trust chain (i.e. vTPM domains). > Other than being built by the hypervisor, nothing is special about this > domain - although it may be useful to have is_control_domain() return > true depending on the toolstack it uses to build other domains. > > The hardware domain manages devices for PCI pass-through to driver > domains or can act as a driver domain itself, depending on the desired > degree of disaggregation. It is also the domain managing devices that > do not support pass-through: PCI configuration space access, parsing the > hardware ACPI tables and system power or machine check events. This is > the only domain where is_hardware_domain() is true. The return of > is_control_domain() is false for this domain. > > The control domain manages other domains, controls guest launch and > shutdown, and manages resource constraints; is_control_domain() returns > true. The functionality guarded by is_control_domain may in the future > be adapted to use explicit hypercalls, eliminating the special treatment > of this domain. It may be reasonable to have multiple control domains > on a multi-tenant system. > > Guest domains and other service or driver domains are all treated > identically by the hypervisor; the security policy may further constrain > administrative actions on or communication between these domains. > > Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> > Acked-by: Jan Beulich <jbeulich@suse.com> > Cc: Keir Fraser <keir@xen.org> > Cc: Ian Campbell <ian.campbell@citrix.com> > Cc: Stefano Stabellini <stefano.stabellini@citrix.com> > Cc: Tim Deegan <tim@xen.org> > Cc: George Dunlap <George.Dunlap@eu.citrix.com> > --- > > Changes from v1: Collapse patches and improve commit message. > > xen/arch/arm/domain.c | 2 +- > xen/arch/arm/vgic.c | 2 +- > xen/arch/arm/vpl011.c | 4 ++-- > xen/arch/x86/domain.c | 2 +- > xen/arch/x86/hvm/i8254.c | 2 +- > xen/arch/x86/time.c | 4 ++-- > xen/arch/x86/traps.c | 4 ++-- > xen/common/domain.c | 10 +++++----- > xen/drivers/passthrough/amd/pci_amd_iommu.c | 2 +- > xen/drivers/passthrough/vtd/iommu.c | 8 ++++---- > xen/drivers/passthrough/vtd/x86/vtd.c | 2 +- > xen/include/xen/sched.h | 4 ++-- > 12 files changed, 23 insertions(+), 23 deletions(-) > > diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c > index f465ab7..c7dc69a 100644 > --- a/xen/arch/arm/domain.c > +++ b/xen/arch/arm/domain.c > @@ -504,7 +504,7 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags) > goto fail; > > /* Domain 0 gets a real UART not an emulated one */ > - if ( d->domain_id && (rc = domain_uart0_init(d)) != 0 ) > + if ( !is_hardware_domain(d) && (rc = domain_uart0_init(d)) != 0 ) > goto fail; > > return 0; > diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c > index 2e4b11f..d9c73be 100644 > --- a/xen/arch/arm/vgic.c > +++ b/xen/arch/arm/vgic.c > @@ -82,7 +82,7 @@ int domain_vgic_init(struct domain *d) > /* Currently nr_lines in vgic and gic doesn''t have the same meanings > * Here nr_lines = number of SPIs > */ > - if ( d->domain_id == 0 ) > + if ( is_hardware_domain(d) ) > d->arch.vgic.nr_lines = gic_number_lines() - 32; > else > d->arch.vgic.nr_lines = 0; /* We don''t need SPIs for the guest */ > diff --git a/xen/arch/arm/vpl011.c b/xen/arch/arm/vpl011.c > index 13ba623..0e9454f 100644 > --- a/xen/arch/arm/vpl011.c > +++ b/xen/arch/arm/vpl011.c > @@ -43,7 +43,7 @@ > > int domain_uart0_init(struct domain *d) > { > - ASSERT( d->domain_id ); > + ASSERT( !is_hardware_domain(d) ); > > spin_lock_init(&d->arch.uart0.lock); > d->arch.uart0.idx = 0; > @@ -87,7 +87,7 @@ static int uart0_mmio_check(struct vcpu *v, paddr_t addr) > { > struct domain *d = v->domain; > > - return d->domain_id != 0 && addr >= UART0_START && addr < UART0_END; > + return !is_hardware_domain(d) && addr >= UART0_START && addr < UART0_END; > } > > static int uart0_mmio_read(struct vcpu *v, mmio_info_t *info) > diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c > index 874742c..51d0ea6 100644 > --- a/xen/arch/x86/domain.c > +++ b/xen/arch/x86/domain.c > @@ -1459,7 +1459,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next) > } > > set_cpuid_faulting(!is_hvm_vcpu(next) && > - (next->domain->domain_id != 0)); > + !is_control_domain(next->domain));Won''t this mean that in your separate hardware/control domain model that the hardware domain *will* fault on cpuid? Is this what we want?> } > > if (is_hvm_vcpu(next) && (prev != next) ) > diff --git a/xen/arch/x86/hvm/i8254.c b/xen/arch/x86/hvm/i8254.c > index c0d6bc2..add61e3 100644 > --- a/xen/arch/x86/hvm/i8254.c > +++ b/xen/arch/x86/hvm/i8254.c > @@ -541,7 +541,7 @@ int pv_pit_handler(int port, int data, int write) > .data = data > }; > > - if ( (current->domain->domain_id == 0) && dom0_pit_access(&ioreq) ) > + if ( is_hardware_domain(current->domain) && dom0_pit_access(&ioreq) )Ack.> { > /* nothing to do */; > } > diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c > index cf8bc78..bacc8ef 100644 > --- a/xen/arch/x86/time.c > +++ b/xen/arch/x86/time.c > @@ -1885,7 +1885,7 @@ void tsc_set_info(struct domain *d, > uint32_t tsc_mode, uint64_t elapsed_nsec, > uint32_t gtsc_khz, uint32_t incarnation) > { > - if ( is_idle_domain(d) || (d->domain_id == 0) ) > + if ( is_idle_domain(d) || is_hardware_domain(d) ) > { > d->arch.vtsc = 0; > return; > @@ -1962,7 +1962,7 @@ static void dump_softtsc(unsigned char key) > "warp=%lu (count=%lu)\n", tsc_max_warp, tsc_check_count); > for_each_domain ( d ) > { > - if ( d->domain_id == 0 && d->arch.tsc_mode == TSC_MODE_DEFAULT ) > + if ( is_hardware_domain(d) && d->arch.tsc_mode == TSC_MODE_DEFAULT ) > continue; > printk("dom%u%s: mode=%d",d->domain_id, > is_hvm_domain(d) ? "(hvm)" : "", d->arch.tsc_mode);Why have direct access to the tsc for the hardware domain and not the control domain?> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c > index 57dbd0c..8e8d3d1 100644 > --- a/xen/arch/x86/traps.c > +++ b/xen/arch/x86/traps.c > @@ -745,7 +745,7 @@ static void pv_cpuid(struct cpu_user_regs *regs) > c = regs->ecx; > d = regs->edx; > > - if ( current->domain->domain_id != 0 ) > + if ( !is_control_domain(current->domain) ) > { > unsigned int cpuid_leaf = a, sub_leaf = c; >Same as above re cpuid.> @@ -1836,7 +1836,7 @@ static inline uint64_t guest_misc_enable(uint64_t val) > static int is_cpufreq_controller(struct domain *d) > { > return ((cpufreq_controller == FREQCTL_dom0_kernel) && > - (d->domain_id == 0)); > + is_hardware_domain(d)); > } > > #include "x86_64/mmconfig.h" > diff --git a/xen/common/domain.c b/xen/common/domain.c > index 6c264a5..692372a 100644 > --- a/xen/common/domain.c > +++ b/xen/common/domain.c > @@ -238,7 +238,7 @@ struct domain *domain_create( > if ( domcr_flags & DOMCRF_hvm ) > d->is_hvm = 1; > > - if ( domid == 0 ) > + if ( is_hardware_domain(d) ) > { > d->is_pinned = opt_dom0_vcpus_pin; > d->disable_migrate = 1;At some point this will have to be thought about a bit more -- which of the disaggregated domains do we actually want pinned here? But I think this is fine for now. Everything else looks reasonable to me, but obviously needs acks from various maintainers. -George> @@ -263,10 +263,10 @@ struct domain *domain_create( > d->is_paused_by_controller = 1; > atomic_inc(&d->pause_count); > > - if ( domid ) > - d->nr_pirqs = nr_static_irqs + extra_domU_irqs; > - else > + if ( is_hardware_domain(d) ) > d->nr_pirqs = nr_static_irqs + extra_dom0_irqs; > + else > + d->nr_pirqs = nr_static_irqs + extra_domU_irqs; > if ( d->nr_pirqs > nr_irqs ) > d->nr_pirqs = nr_irqs; > > @@ -600,7 +600,7 @@ void domain_shutdown(struct domain *d, u8 reason) > d->shutdown_code = reason; > reason = d->shutdown_code; > > - if ( d->domain_id == 0 ) > + if ( is_hardware_domain(d) ) > dom0_shutdown(reason); > > if ( d->is_shutting_down ) > diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c > index 5ea1a1d..e8ec695 100644 > --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c > +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c > @@ -122,7 +122,7 @@ static void amd_iommu_setup_domain_device( > > BUG_ON( !hd->root_table || !hd->paging_mode || !iommu->dev_table.buffer ); > > - if ( iommu_passthrough && (domain->domain_id == 0) ) > + if ( iommu_passthrough && is_hardware_domain(domain) ) > valid = 0; > > if ( ats_enabled ) > diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c > index 0fc10de..8d5c43d 100644 > --- a/xen/drivers/passthrough/vtd/iommu.c > +++ b/xen/drivers/passthrough/vtd/iommu.c > @@ -1336,7 +1336,7 @@ int domain_context_mapping_one( > return res; > } > > - if ( iommu_passthrough && (domain->domain_id == 0) ) > + if ( iommu_passthrough && is_hardware_domain(domain) ) > { > context_set_translation_type(*context, CONTEXT_TT_PASS_THRU); > agaw = level_to_agaw(iommu->nr_pt_levels); > @@ -1710,7 +1710,7 @@ static int intel_iommu_map_page( > return 0; > > /* do nothing if dom0 and iommu supports pass thru */ > - if ( iommu_passthrough && (d->domain_id == 0) ) > + if ( iommu_passthrough && is_hardware_domain(d) ) > return 0; > > spin_lock(&hd->mapping_lock); > @@ -1754,7 +1754,7 @@ static int intel_iommu_map_page( > static int intel_iommu_unmap_page(struct domain *d, unsigned long gfn) > { > /* Do nothing if dom0 and iommu supports pass thru. */ > - if ( iommu_passthrough && (d->domain_id == 0) ) > + if ( iommu_passthrough && is_hardware_domain(d) ) > return 0; > > dma_pte_clear_one(d, (paddr_t)gfn << PAGE_SHIFT_4K); > @@ -1927,7 +1927,7 @@ static int intel_iommu_remove_device(u8 devfn, struct pci_dev *pdev) > /* If the device belongs to dom0, and it has RMRR, don''t remove it > * from dom0, because BIOS may use RMRR at booting time. > */ > - if ( pdev->domain->domain_id == 0 ) > + if ( is_hardware_domain(pdev->domain) ) > { > for_each_rmrr_device ( rmrr, bdf, i ) > { > diff --git a/xen/drivers/passthrough/vtd/x86/vtd.c b/xen/drivers/passthrough/vtd/x86/vtd.c > index 875b033..eb9e52a 100644 > --- a/xen/drivers/passthrough/vtd/x86/vtd.c > +++ b/xen/drivers/passthrough/vtd/x86/vtd.c > @@ -117,7 +117,7 @@ void __init iommu_set_dom0_mapping(struct domain *d) > { > unsigned long i, j, tmp, top; > > - BUG_ON(d->domain_id != 0); > + BUG_ON(!is_hardware_domain(d)); > > top = max(max_pdx, pfn_to_pdx(0xffffffffUL >> PAGE_SHIFT) + 1); > > diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h > index ae6a3b8..4e6edf5 100644 > --- a/xen/include/xen/sched.h > +++ b/xen/include/xen/sched.h > @@ -722,10 +722,10 @@ void watchdog_domain_destroy(struct domain *d); > /* > * Use this check when the following are both true: > * - Using this feature or interface requires full access to the hardware > - * (that is, this is would not be suitable for a driver domain) > + * (that is, this would not be suitable for a driver domain) > * - There is never a reason to deny dom0 access to this > */ > -#define is_hardware_domain(_d) ((_d)->is_privileged) > +#define is_hardware_domain(_d) ((_d)->domain_id == 0) > > /* This check is for functionality specific to a control domain */ > #define is_control_domain(_d) ((_d)->is_privileged)
Jan Beulich
2013-Jul-10 11:43 UTC
Re: [PATCH v2] xen: use domid check in is_hardware_domain
>>> On 10.07.13 at 12:57, George Dunlap <george.dunlap@eu.citrix.com> wrote: > On 09/07/13 21:28, Daniel De Graaf wrote: >> --- a/xen/arch/x86/domain.c >> +++ b/xen/arch/x86/domain.c >> @@ -1459,7 +1459,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next) >> } >> >> set_cpuid_faulting(!is_hvm_vcpu(next) && >> - (next->domain->domain_id != 0)); >> + !is_control_domain(next->domain)); > > Won''t this mean that in your separate hardware/control domain model that > the hardware domain *will* fault on cpuid? Is this what we want?I think this is correct, as it''s the control domain that ought to be able to set up CPUID via the tool stack for all other domains. The implication is that it''s the first booted domain. If that''s not generally the case, we''d need another qualification here. And perhaps that qualification would in the end be domid == 0...>> @@ -1962,7 +1962,7 @@ static void dump_softtsc(unsigned char key) >> "warp=%lu (count=%lu)\n", tsc_max_warp, tsc_check_count); >> for_each_domain ( d ) >> { >> - if ( d->domain_id == 0 && d->arch.tsc_mode == TSC_MODE_DEFAULT ) >> + if ( is_hardware_domain(d) && d->arch.tsc_mode == TSC_MODE_DEFAULT ) >> continue; >> printk("dom%u%s: mode=%d",d->domain_id, >> is_hvm_domain(d) ? "(hvm)" : "", d->arch.tsc_mode); > > Why have direct access to the tsc for the hardware domain and not the > control domain?Because, as its name says, it has direct access to the hardware (including the TSC)?>> --- a/xen/arch/x86/traps.c >> +++ b/xen/arch/x86/traps.c >> @@ -745,7 +745,7 @@ static void pv_cpuid(struct cpu_user_regs *regs) >> c = regs->ecx; >> d = regs->edx; >> >> - if ( current->domain->domain_id != 0 ) >> + if ( !is_control_domain(current->domain) ) >> { >> unsigned int cpuid_leaf = a, sub_leaf = c; >> > > Same as above re cpuid.Ditto.>> --- a/xen/common/domain.c >> +++ b/xen/common/domain.c >> @@ -238,7 +238,7 @@ struct domain *domain_create( >> if ( domcr_flags & DOMCRF_hvm ) >> d->is_hvm = 1; >> >> - if ( domid == 0 ) >> + if ( is_hardware_domain(d) ) >> { >> d->is_pinned = opt_dom0_vcpus_pin; >> d->disable_migrate = 1; > > At some point this will have to be thought about a bit more -- which of > the disaggregated domains do we actually want pinned here? But I think > this is fine for now.The pinning really is mainly for the Dom0-controlled CPU frequency scaling to work (and hence validly qualified by is_hardware_domain()). Any other uses, no matter how frequently they are seen, are abusing dom0_vcpus_pin afaict. Jan
George Dunlap
2013-Jul-10 13:00 UTC
Re: [PATCH v2] xen: use domid check in is_hardware_domain
On Wed, Jul 10, 2013 at 12:43 PM, Jan Beulich <JBeulich@suse.com> wrote:>>>> On 10.07.13 at 12:57, George Dunlap <george.dunlap@eu.citrix.com> wrote: >> On 09/07/13 21:28, Daniel De Graaf wrote: >>> --- a/xen/arch/x86/domain.c >>> +++ b/xen/arch/x86/domain.c >>> @@ -1459,7 +1459,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next) >>> } >>> >>> set_cpuid_faulting(!is_hvm_vcpu(next) && >>> - (next->domain->domain_id != 0)); >>> + !is_control_domain(next->domain)); >> >> Won''t this mean that in your separate hardware/control domain model that >> the hardware domain *will* fault on cpuid? Is this what we want? > > I think this is correct, as it''s the control domain that ought to be > able to set up CPUID via the tool stack for all other domains. > > The implication is that it''s the first booted domain. If that''s not > generally the case, we''d need another qualification here. And > perhaps that qualification would in the end be domid == 0...The question isn''t why the control domain has it; the question is why the hardware domain doesn''t have it.> >>> @@ -1962,7 +1962,7 @@ static void dump_softtsc(unsigned char key) >>> "warp=%lu (count=%lu)\n", tsc_max_warp, tsc_check_count); >>> for_each_domain ( d ) >>> { >>> - if ( d->domain_id == 0 && d->arch.tsc_mode == TSC_MODE_DEFAULT ) >>> + if ( is_hardware_domain(d) && d->arch.tsc_mode == TSC_MODE_DEFAULT ) >>> continue; >>> printk("dom%u%s: mode=%d",d->domain_id, >>> is_hvm_domain(d) ? "(hvm)" : "", d->arch.tsc_mode); >> >> Why have direct access to the tsc for the hardware domain and not the >> control domain? > > Because, as its name says, it has direct access to the hardware > (including the TSC)?Again, my question wasn''t so much about why the hardware domain does have it, but why the control domain does *not* have it.>>> --- a/xen/common/domain.c >>> +++ b/xen/common/domain.c >>> @@ -238,7 +238,7 @@ struct domain *domain_create( >>> if ( domcr_flags & DOMCRF_hvm ) >>> d->is_hvm = 1; >>> >>> - if ( domid == 0 ) >>> + if ( is_hardware_domain(d) ) >>> { >>> d->is_pinned = opt_dom0_vcpus_pin; >>> d->disable_migrate = 1; >> >> At some point this will have to be thought about a bit more -- which of >> the disaggregated domains do we actually want pinned here? But I think >> this is fine for now. > > The pinning really is mainly for the Dom0-controlled CPU frequency > scaling to work (and hence validly qualified by is_hardware_domain()). > Any other uses, no matter how frequently they are seen, are abusing > dom0_vcpus_pin afaict.If that feature was meant to be used exclusively for cpu frequency, then it should have been named something related to cpu frequency. Using something originally designed for one purpose for another purpose isn''t an "abuse", it''s a "cleaver hack". :-) -George
Jan Beulich
2013-Jul-10 13:56 UTC
Re: [PATCH v2] xen: use domid check in is_hardware_domain
>>> On 10.07.13 at 15:00, George Dunlap <George.Dunlap@eu.citrix.com> wrote: > On Wed, Jul 10, 2013 at 12:43 PM, Jan Beulich <JBeulich@suse.com> wrote: >>>>> On 10.07.13 at 12:57, George Dunlap <george.dunlap@eu.citrix.com> wrote: >>> On 09/07/13 21:28, Daniel De Graaf wrote: >>>> --- a/xen/arch/x86/domain.c >>>> +++ b/xen/arch/x86/domain.c >>>> @@ -1459,7 +1459,7 @@ void context_switch(struct vcpu *prev, struct vcpu > *next) >>>> } >>>> >>>> set_cpuid_faulting(!is_hvm_vcpu(next) && >>>> - (next->domain->domain_id != 0)); >>>> + !is_control_domain(next->domain)); >>> >>> Won''t this mean that in your separate hardware/control domain model that >>> the hardware domain *will* fault on cpuid? Is this what we want? >> >> I think this is correct, as it''s the control domain that ought to be >> able to set up CPUID via the tool stack for all other domains. >> >> The implication is that it''s the first booted domain. If that''s not >> generally the case, we''d need another qualification here. And >> perhaps that qualification would in the end be domid == 0... > > The question isn''t why the control domain has it; the question is why > the hardware domain doesn''t have it.Because - as said - for _all_ other domains, the control domain can set a CPUID policy via the tool stack.>>>> @@ -1962,7 +1962,7 @@ static void dump_softtsc(unsigned char key) >>>> "warp=%lu (count=%lu)\n", tsc_max_warp, tsc_check_count); >>>> for_each_domain ( d ) >>>> { >>>> - if ( d->domain_id == 0 && d->arch.tsc_mode == TSC_MODE_DEFAULT ) >>>> + if ( is_hardware_domain(d) && d->arch.tsc_mode == TSC_MODE_DEFAULT ) >>>> continue; >>>> printk("dom%u%s: mode=%d",d->domain_id, >>>> is_hvm_domain(d) ? "(hvm)" : "", d->arch.tsc_mode); >>> >>> Why have direct access to the tsc for the hardware domain and not the >>> control domain? >> >> Because, as its name says, it has direct access to the hardware >> (including the TSC)? > > Again, my question wasn''t so much about why the hardware domain does > have it, but why the control domain does *not* have it.And I think I answered it. Or should I return the question and ask: "Why do you think the control domain should be using the TSC directly? I.e. in what way is it different from other non-hardware domains in its interaction with hardware?">>>> --- a/xen/common/domain.c >>>> +++ b/xen/common/domain.c >>>> @@ -238,7 +238,7 @@ struct domain *domain_create( >>>> if ( domcr_flags & DOMCRF_hvm ) >>>> d->is_hvm = 1; >>>> >>>> - if ( domid == 0 ) >>>> + if ( is_hardware_domain(d) ) >>>> { >>>> d->is_pinned = opt_dom0_vcpus_pin; >>>> d->disable_migrate = 1; >>> >>> At some point this will have to be thought about a bit more -- which of >>> the disaggregated domains do we actually want pinned here? But I think >>> this is fine for now. >> >> The pinning really is mainly for the Dom0-controlled CPU frequency >> scaling to work (and hence validly qualified by is_hardware_domain()). >> Any other uses, no matter how frequently they are seen, are abusing >> dom0_vcpus_pin afaict. > > If that feature was meant to be used exclusively for cpu frequency, > then it should have been named something related to cpu frequency.I agree.> Using something originally designed for one purpose for another > purpose isn''t an "abuse", it''s a "cleaver hack". :-)Okay, you can call it that way if you like. Jan
George Dunlap
2013-Jul-10 15:50 UTC
Re: [PATCH v2] xen: use domid check in is_hardware_domain
On Wed, Jul 10, 2013 at 2:56 PM, Jan Beulich <JBeulich@suse.com> wrote:>>>> On 10.07.13 at 15:00, George Dunlap <George.Dunlap@eu.citrix.com> wrote: >> On Wed, Jul 10, 2013 at 12:43 PM, Jan Beulich <JBeulich@suse.com> wrote: >>>>>> On 10.07.13 at 12:57, George Dunlap <george.dunlap@eu.citrix.com> wrote: >>>> On 09/07/13 21:28, Daniel De Graaf wrote: >>>>> --- a/xen/arch/x86/domain.c >>>>> +++ b/xen/arch/x86/domain.c >>>>> @@ -1459,7 +1459,7 @@ void context_switch(struct vcpu *prev, struct vcpu >> *next) >>>>> } >>>>> >>>>> set_cpuid_faulting(!is_hvm_vcpu(next) && >>>>> - (next->domain->domain_id != 0)); >>>>> + !is_control_domain(next->domain)); >>>> >>>> Won''t this mean that in your separate hardware/control domain model that >>>> the hardware domain *will* fault on cpuid? Is this what we want? >>> >>> I think this is correct, as it''s the control domain that ought to be >>> able to set up CPUID via the tool stack for all other domains. >>> >>> The implication is that it''s the first booted domain. If that''s not >>> generally the case, we''d need another qualification here. And >>> perhaps that qualification would in the end be domid == 0... >> >> The question isn''t why the control domain has it; the question is why >> the hardware domain doesn''t have it. > > Because - as said - for _all_ other domains, the control domain can > set a CPUID policy via the tool stack.So you''re saying that the control domain will set the CPUID policy for the hardware domain?> >>>>> @@ -1962,7 +1962,7 @@ static void dump_softtsc(unsigned char key) >>>>> "warp=%lu (count=%lu)\n", tsc_max_warp, tsc_check_count); >>>>> for_each_domain ( d ) >>>>> { >>>>> - if ( d->domain_id == 0 && d->arch.tsc_mode == TSC_MODE_DEFAULT ) >>>>> + if ( is_hardware_domain(d) && d->arch.tsc_mode == TSC_MODE_DEFAULT ) >>>>> continue; >>>>> printk("dom%u%s: mode=%d",d->domain_id, >>>>> is_hvm_domain(d) ? "(hvm)" : "", d->arch.tsc_mode); >>>> >>>> Why have direct access to the tsc for the hardware domain and not the >>>> control domain? >>> >>> Because, as its name says, it has direct access to the hardware >>> (including the TSC)? >> >> Again, my question wasn''t so much about why the hardware domain does >> have it, but why the control domain does *not* have it. > > And I think I answered it. Or should I return the question and ask: > "Why do you think the control domain should be using the TSC > directly? I.e. in what way is it different from other non-hardware > domains in its interaction with hardware?"Well why does the hardware domain, or even domain 0, need it? "It has direct access to the hardware" isn''t really an answer. -George
Daniel De Graaf
2013-Jul-10 18:33 UTC
Re: [PATCH v2] xen: use domid check in is_hardware_domain
On 07/10/2013 06:57 AM, George Dunlap wrote:> On 09/07/13 21:28, Daniel De Graaf wrote:[...]>> index 874742c..51d0ea6 100644 >> --- a/xen/arch/x86/domain.c >> +++ b/xen/arch/x86/domain.c >> @@ -1459,7 +1459,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next) >> } >> set_cpuid_faulting(!is_hvm_vcpu(next) && >> - (next->domain->domain_id != 0)); >> + !is_control_domain(next->domain)); > > Won''t this mean that in your separate hardware/control domain model that the hardware domain *will* fault on cpuid? Is this what we want?I would expect that if CPUID faulting is turned on for the hardware domain that no features would be masked (since it can''t be migrated, there''s no reason to avoid using an existing feature). Jan commented on a prior version of this patch (on April 12) that it makes sense for a control domain to see the full features of the CPU in order to create masks for guests. In theory, we could enable CPUID faulting for all domains after dom0 and force the domain builder to copy out the hardware CPUID to its guests - likely unmodified for hardware and control domains, and then masked as usual for a domU for others. However, this may require too much knowledge of the behavior of CPUID sub-leaves to be encoded in the domain builder - this seems like knowledge best left to the hardware domain (for things like feature bits for power management MSRs) and the control domain (to see an upper bound on features it exposes to the guest). So, I think the best solution is to make this condition !hardware && !control.>> } >> if (is_hvm_vcpu(next) && (prev != next) )[...]>> @@ -1962,7 +1962,7 @@ static void dump_softtsc(unsigned char key) >> "warp=%lu (count=%lu)\n", tsc_max_warp, tsc_check_count); >> for_each_domain ( d ) >> { >> - if ( d->domain_id == 0 && d->arch.tsc_mode == TSC_MODE_DEFAULT ) >> + if ( is_hardware_domain(d) && d->arch.tsc_mode == TSC_MODE_DEFAULT ) >> continue; >> printk("dom%u%s: mode=%d",d->domain_id, >> is_hvm_domain(d) ? "(hvm)" : "", d->arch.tsc_mode); > > Why have direct access to the tsc for the hardware domain and not the control domain?This is just excluding output from a printk. It may make sense to exclude more than the hardware domain, but that''s really a matter of taste. We could also just remove the exclusion, but that should be a separate patch.>> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c >> index 57dbd0c..8e8d3d1 100644 >> --- a/xen/arch/x86/traps.c >> +++ b/xen/arch/x86/traps.c >> @@ -745,7 +745,7 @@ static void pv_cpuid(struct cpu_user_regs *regs) >> c = regs->ecx; >> d = regs->edx; >> - if ( current->domain->domain_id != 0 ) >> + if ( !is_control_domain(current->domain) ) >> { >> unsigned int cpuid_leaf = a, sub_leaf = c; > > Same as above re cpuid.Will need to be handled the same if the above is changed. [...]>> index 6c264a5..692372a 100644 >> --- a/xen/common/domain.c >> +++ b/xen/common/domain.c >> @@ -238,7 +238,7 @@ struct domain *domain_create( >> if ( domcr_flags & DOMCRF_hvm ) >> d->is_hvm = 1; >> - if ( domid == 0 ) >> + if ( is_hardware_domain(d) ) >> { >> d->is_pinned = opt_dom0_vcpus_pin; >> d->disable_migrate = 1; > > At some point this will have to be thought about a bit more -- which of the disaggregated domains do we actually want pinned here? But I think this is fine for now.I would say the hardware domain would be the most likely one to pin, since it would be in charge of things like CPU frequency, microcode, and so forth. Pinning other domains for performance reasons is really better done by a scheduler interface, either at domain creation or at runtime.> Everything else looks reasonable to me, but obviously needs acks from various maintainers. > > -George >-- Daniel De Graaf National Security Agency
Daniel De Graaf
2013-Jul-10 18:33 UTC
Re: [PATCH v2] xen: use domid check in is_hardware_domain
On 07/10/2013 04:30 AM, Jan Beulich wrote:>>>> On 09.07.13 at 22:28, Daniel De Graaf <dgdegra@tycho.nsa.gov> wrote: >> Instead of checking is_privileged to determine if a domain should >> control the hardware, check that the domain_id is equal to zero (which >> is currently the only domain for which is_privileged is true). This >> allows other places where domain_id is checked for zero to be replaced >> with is_hardware_domain. >> >> The distinction between is_hardware_domain, is_control_domain, and >> domain 0 is based on the following disaggregation model: >> >> Domain 0 bootstraps the system. It may remain to perform requested >> builds of domains that need a minimal trust chain (i.e. vTPM domains). >> Other than being built by the hypervisor, nothing is special about this >> domain - although it may be useful to have is_control_domain() return >> true depending on the toolstack it uses to build other domains. >> >> The hardware domain manages devices for PCI pass-through to driver >> domains or can act as a driver domain itself, depending on the desired >> degree of disaggregation. It is also the domain managing devices that >> do not support pass-through: PCI configuration space access, parsing the >> hardware ACPI tables and system power or machine check events. This is >> the only domain where is_hardware_domain() is true. The return of >> is_control_domain() is false for this domain. >> >> The control domain manages other domains, controls guest launch and >> shutdown, and manages resource constraints; is_control_domain() returns >> true. The functionality guarded by is_control_domain may in the future >> be adapted to use explicit hypercalls, eliminating the special treatment >> of this domain. It may be reasonable to have multiple control domains >> on a multi-tenant system. >> >> Guest domains and other service or driver domains are all treated >> identically by the hypervisor; the security policy may further constrain >> administrative actions on or communication between these domains. >> >> Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> >> Acked-by: Jan Beulich <jbeulich@suse.com> > > This isn''t correct: I gave my Reviewed-by for the full series; the > Acked-by was given only for the two patches touching only code > I''m maintainer for. > > The distinction we''re trying to establish is that an ack implies that > a maintainer is okay with a certain patch (i.e. a non-maintainer > would generally not send ack-s at all), whereas a review means > what it says - the patch was reviewed.My logic for applying the Acked-by here was that you had Acked the parts of the patch (in expanded form) that you were listed as maintainer for, so your Ack on the unified patch would actually only be relevant for the x86 and IOMMU parts, requiring additional Acks for the rest of the patch. I had not considered Reviewed-by to be a stronger statement than Acked-by when applying the sign-offs, but after reading the discussion in this thread I agree that Reviewed-by should have been retained (and will on any v3).> That said, while I realize that you did this collapsing because you > were asked to by George, I''m not certain this was a good move: > With one big patch, there''s now no way to apply it step by step, > as the necessary ack-s trickle in. But the significantly extended > description is perhaps outweighing that. > > JanWhile this was my original reason for leaving the patch split up, I don''t think partially applying the series is overly helpful - unless there''s a risk it will miss 4.4 (which seems unlikely), the patch shouldn''t cause issues if it''s delayed waiting for all the Acks. -- Daniel De Graaf National Security Agency
Jan Beulich
2013-Jul-11 08:03 UTC
Re: [PATCH v2] xen: use domid check in is_hardware_domain
>>> On 10.07.13 at 17:50, George Dunlap <George.Dunlap@eu.citrix.com> wrote: > On Wed, Jul 10, 2013 at 2:56 PM, Jan Beulich <JBeulich@suse.com> wrote: >>>>> On 10.07.13 at 15:00, George Dunlap <George.Dunlap@eu.citrix.com> wrote: >>> On Wed, Jul 10, 2013 at 12:43 PM, Jan Beulich <JBeulich@suse.com> wrote: >>>>>>> On 10.07.13 at 12:57, George Dunlap <george.dunlap@eu.citrix.com> wrote: >>>>> On 09/07/13 21:28, Daniel De Graaf wrote: >>>>>> --- a/xen/arch/x86/domain.c >>>>>> +++ b/xen/arch/x86/domain.c >>>>>> @@ -1459,7 +1459,7 @@ void context_switch(struct vcpu *prev, struct vcpu >>> *next) >>>>>> } >>>>>> >>>>>> set_cpuid_faulting(!is_hvm_vcpu(next) && >>>>>> - (next->domain->domain_id != 0)); >>>>>> + !is_control_domain(next->domain)); >>>>> >>>>> Won''t this mean that in your separate hardware/control domain model that >>>>> the hardware domain *will* fault on cpuid? Is this what we want? >>>> >>>> I think this is correct, as it''s the control domain that ought to be >>>> able to set up CPUID via the tool stack for all other domains. >>>> >>>> The implication is that it''s the first booted domain. If that''s not >>>> generally the case, we''d need another qualification here. And >>>> perhaps that qualification would in the end be domid == 0... >>> >>> The question isn''t why the control domain has it; the question is why >>> the hardware domain doesn''t have it. >> >> Because - as said - for _all_ other domains, the control domain can >> set a CPUID policy via the tool stack. > > So you''re saying that the control domain will set the CPUID policy for > the hardware domain?I would think that''s a reasonable model (but certainly not the only one).>>>>>> @@ -1962,7 +1962,7 @@ static void dump_softtsc(unsigned char key) >>>>>> "warp=%lu (count=%lu)\n", tsc_max_warp, tsc_check_count); >>>>>> for_each_domain ( d ) >>>>>> { >>>>>> - if ( d->domain_id == 0 && d->arch.tsc_mode == TSC_MODE_DEFAULT ) >>>>>> + if ( is_hardware_domain(d) && d->arch.tsc_mode == TSC_MODE_DEFAULT ) >>>>>> continue; >>>>>> printk("dom%u%s: mode=%d",d->domain_id, >>>>>> is_hvm_domain(d) ? "(hvm)" : "", d->arch.tsc_mode); >>>>> >>>>> Why have direct access to the tsc for the hardware domain and not the >>>>> control domain? >>>> >>>> Because, as its name says, it has direct access to the hardware >>>> (including the TSC)? >>> >>> Again, my question wasn''t so much about why the hardware domain does >>> have it, but why the control domain does *not* have it. >> >> And I think I answered it. Or should I return the question and ask: >> "Why do you think the control domain should be using the TSC >> directly? I.e. in what way is it different from other non-hardware >> domains in its interaction with hardware?" > > Well why does the hardware domain, or even domain 0, need it? "It has > direct access to the hardware" isn''t really an answer.I can only defer answering this to the one(s) having introduced this logic in the first place. Jan