Jeremy Fitzhardinge
2009-May-12 23:27 UTC
[Xen-devel] [GIT PULL] xen /proc/mtrr implementation
Hi Ingo, This series of patches makes /proc/mtrr fully functional under Xen. Thanks, J The following changes since commit ce791368bb4a53d05e78e1588bac0aacde8db84c: Jeremy Fitzhardinge (1): xen/i386: make sure initial VGA/ISA mappings are not overridden are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen.git for-ingo/xen/dom0/mtrr Jeremy Fitzhardinge (1): xen: set cpu_callout_mask to make mtrr work Mark McLoughlin (5): xen mtrr: Use specific cpu_has_foo macros instead of generic cpu_has() xen mtrr: Use generic_validate_add_page() xen mtrr: Implement xen_get_free_region() xen mtrr: Add xen_{get,set}_mtrr() implementations xen mtrr: Kill some unnecessary includes arch/x86/kernel/cpu/mtrr/mtrr.h | 2 + arch/x86/kernel/cpu/mtrr/xen.c | 99 ++++++++++++++++++++++++++++++++------- arch/x86/xen/smp.c | 2 + 3 files changed, 86 insertions(+), 17 deletions(-) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2009-May-12 23:27 UTC
[Xen-devel] [PATCH 1/6] xen: set cpu_callout_mask to make mtrr work
From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> The mtrr code uses num_booting_cpus() to manage its corrdinated mtrr update, which in turn depends on cpu_callout_mask. [ Impact: bugfix; make /proc/mtrr writes work in Xen ] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> --- arch/x86/xen/smp.c | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c index 429834e..4706af7 100644 --- a/arch/x86/xen/smp.c +++ b/arch/x86/xen/smp.c @@ -297,6 +297,8 @@ static int __cpuinit xen_cpu_up(unsigned int cpu) xen_setup_timer(cpu); xen_init_lock_cpu(cpu); + cpumask_set_cpu(cpu, cpu_callout_mask); + per_cpu(cpu_state, cpu) = CPU_UP_PREPARE; /* make sure interrupts start blocked */ -- 1.6.0.6 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2009-May-12 23:27 UTC
[Xen-devel] [PATCH 2/6] xen mtrr: Use specific cpu_has_foo macros instead of generic cpu_has()
From: Mark McLoughlin <markmc@redhat.com> Use specific cpu_has_foo macros instead of generic cpu_has() [ Impact: cleanup ] Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> --- arch/x86/kernel/cpu/mtrr/xen.c | 10 ++++------ 1 files changed, 4 insertions(+), 6 deletions(-) diff --git a/arch/x86/kernel/cpu/mtrr/xen.c b/arch/x86/kernel/cpu/mtrr/xen.c index 6760907..8ac2ccd 100644 --- a/arch/x86/kernel/cpu/mtrr/xen.c +++ b/arch/x86/kernel/cpu/mtrr/xen.c @@ -41,15 +41,13 @@ static int __init xen_num_var_ranges(void) void __init xen_init_mtrr(void) { - struct cpuinfo_x86 *c = &boot_cpu_data; - if (!xen_initial_domain()) return; - if ((!cpu_has(c, X86_FEATURE_MTRR)) && - (!cpu_has(c, X86_FEATURE_K6_MTRR)) && - (!cpu_has(c, X86_FEATURE_CYRIX_ARR)) && - (!cpu_has(c, X86_FEATURE_CENTAUR_MCR))) + if (!cpu_has_mtrr && + !cpu_has_k6_mtrr && + !cpu_has_cyrix_arr && + !cpu_has_centaur_mcr) return; mtrr_if = &xen_mtrr_ops; -- 1.6.0.6 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2009-May-12 23:27 UTC
[Xen-devel] [PATCH 3/6] xen mtrr: Use generic_validate_add_page()
From: Mark McLoughlin <markmc@redhat.com> The hypervisor already performs the same validation, but better to do it early before getting to the range combining code. [ Impact: cleanup, avoid Xen console noise ] Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> --- arch/x86/kernel/cpu/mtrr/xen.c | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/arch/x86/kernel/cpu/mtrr/xen.c b/arch/x86/kernel/cpu/mtrr/xen.c index 8ac2ccd..3077bb3 100644 --- a/arch/x86/kernel/cpu/mtrr/xen.c +++ b/arch/x86/kernel/cpu/mtrr/xen.c @@ -20,6 +20,7 @@ static int __init xen_num_var_ranges(void); static struct mtrr_ops xen_mtrr_ops = { .vendor = X86_VENDOR_UNKNOWN, .get_free_region = generic_get_free_region, + .validate_add_page = generic_validate_add_page, .have_wrcomb = positive_have_wrcomb, .use_intel_if = 0, .num_var_ranges = xen_num_var_ranges, -- 1.6.0.6 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2009-May-12 23:27 UTC
[Xen-devel] [PATCH 4/6] xen mtrr: Implement xen_get_free_region()
From: Mark McLoughlin <markmc@redhat.com> When an already set MTRR is being changed, we need to first unset, since Xen also maintains a usage count. [ Impact: complete xen mtrr implementation ] Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> --- arch/x86/kernel/cpu/mtrr/mtrr.h | 2 ++ arch/x86/kernel/cpu/mtrr/xen.c | 28 +++++++++++++++++++++++++++- 2 files changed, 29 insertions(+), 1 deletions(-) diff --git a/arch/x86/kernel/cpu/mtrr/mtrr.h b/arch/x86/kernel/cpu/mtrr/mtrr.h index 3502f6c..52bd87d 100644 --- a/arch/x86/kernel/cpu/mtrr/mtrr.h +++ b/arch/x86/kernel/cpu/mtrr/mtrr.h @@ -5,6 +5,8 @@ #include <linux/types.h> #include <linux/stddef.h> +#include <asm/mtrr.h> + #define MTRRcap_MSR 0x0fe #define MTRRdefType_MSR 0x2ff diff --git a/arch/x86/kernel/cpu/mtrr/xen.c b/arch/x86/kernel/cpu/mtrr/xen.c index 3077bb3..f99fa15 100644 --- a/arch/x86/kernel/cpu/mtrr/xen.c +++ b/arch/x86/kernel/cpu/mtrr/xen.c @@ -15,11 +15,37 @@ static int __init xen_num_var_ranges(void); +static int xen_get_free_region(unsigned long base, unsigned long size, + int replace_reg) +{ + struct xen_platform_op op; + int error; + + if (replace_reg < 0) + return generic_get_free_region(base, size, -1); + + /* If we''re replacing the contents of a register, + * we need to first unset it since Xen also keeps + * a usage count. + */ + op.cmd = XENPF_del_memtype; + op.u.del_memtype.handle = 0; + op.u.del_memtype.reg = replace_reg; + + error = HYPERVISOR_dom0_op(&op); + if (error) { + BUG_ON(error > 0); + return error; + } + + return replace_reg; +} + /* DOM0 TODO: Need to fill in the remaining mtrr methods to have full * working userland mtrr support. */ static struct mtrr_ops xen_mtrr_ops = { .vendor = X86_VENDOR_UNKNOWN, - .get_free_region = generic_get_free_region, + .get_free_region = xen_get_free_region, .validate_add_page = generic_validate_add_page, .have_wrcomb = positive_have_wrcomb, .use_intel_if = 0, -- 1.6.0.6 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2009-May-12 23:27 UTC
[Xen-devel] [PATCH 5/6] xen mtrr: Add xen_{get, set}_mtrr() implementations
From: Mark McLoughlin <markmc@redhat.com> Straightforward apart from the hack to turn mtrr_ops->set() into a no-op on all but one CPU. [ Impact: complete Xen mtrr implementation ] Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> --- arch/x86/kernel/cpu/mtrr/xen.c | 50 ++++++++++++++++++++++++++++++++++++++- 1 files changed, 48 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/cpu/mtrr/xen.c b/arch/x86/kernel/cpu/mtrr/xen.c index f99fa15..5e0aa2b 100644 --- a/arch/x86/kernel/cpu/mtrr/xen.c +++ b/arch/x86/kernel/cpu/mtrr/xen.c @@ -15,6 +15,52 @@ static int __init xen_num_var_ranges(void); +static void xen_set_mtrr(unsigned int reg, unsigned long base, + unsigned long size, mtrr_type type) +{ + struct xen_platform_op op; + int error; + + /* mtrr_ops->set() is called once per CPU, + * but Xen''s ops apply to all CPUs. + */ + if (smp_processor_id()) + return; + + if (size == 0) { + op.cmd = XENPF_del_memtype; + op.u.del_memtype.handle = 0; + op.u.del_memtype.reg = reg; + } else { + op.cmd = XENPF_add_memtype; + op.u.add_memtype.mfn = base; + op.u.add_memtype.nr_mfns = size; + op.u.add_memtype.type = type; + } + + error = HYPERVISOR_dom0_op(&op); + BUG_ON(error != 0); +} + +static void xen_get_mtrr(unsigned int reg, unsigned long *base, + unsigned long *size, mtrr_type *type) +{ + struct xen_platform_op op; + + op.cmd = XENPF_read_memtype; + op.u.read_memtype.reg = reg; + if (HYPERVISOR_dom0_op(&op) != 0) { + *base = 0; + *size = 0; + *type = 0; + return; + } + + *size = op.u.read_memtype.nr_mfns; + *base = op.u.read_memtype.mfn; + *type = op.u.read_memtype.type; +} + static int xen_get_free_region(unsigned long base, unsigned long size, int replace_reg) { @@ -41,10 +87,10 @@ static int xen_get_free_region(unsigned long base, unsigned long size, return replace_reg; } -/* DOM0 TODO: Need to fill in the remaining mtrr methods to have full - * working userland mtrr support. */ static struct mtrr_ops xen_mtrr_ops = { .vendor = X86_VENDOR_UNKNOWN, + .set = xen_set_mtrr, + .get = xen_get_mtrr, .get_free_region = xen_get_free_region, .validate_add_page = generic_validate_add_page, .have_wrcomb = positive_have_wrcomb, -- 1.6.0.6 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2009-May-12 23:27 UTC
[Xen-devel] [PATCH 6/6] xen mtrr: Kill some unnecessary includes
From: Mark McLoughlin <markmc@redhat.com> [ Impact: cleanup ] Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> --- arch/x86/kernel/cpu/mtrr/xen.c | 8 +------- 1 files changed, 1 insertions(+), 7 deletions(-) diff --git a/arch/x86/kernel/cpu/mtrr/xen.c b/arch/x86/kernel/cpu/mtrr/xen.c index 5e0aa2b..4286569 100644 --- a/arch/x86/kernel/cpu/mtrr/xen.c +++ b/arch/x86/kernel/cpu/mtrr/xen.c @@ -1,12 +1,6 @@ #include <linux/init.h> -#include <linux/proc_fs.h> -#include <linux/ctype.h> -#include <linux/module.h> -#include <linux/seq_file.h> -#include <asm/uaccess.h> -#include <linux/mutex.h> - -#include <asm/mtrr.h> +#include <linux/mm.h> + #include "mtrr.h" #include <xen/interface/platform.h> -- 1.6.0.6 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ingo Molnar
2009-May-13 13:30 UTC
[Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
* Jeremy Fitzhardinge <jeremy@goop.org> wrote:> Hi Ingo, > > This series of patches makes /proc/mtrr fully functional under Xen. > > Thanks, > J > > The following changes since commit ce791368bb4a53d05e78e1588bac0aacde8db84c: > Jeremy Fitzhardinge (1): > xen/i386: make sure initial VGA/ISA mappings are not overridden > > are available in the git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen.git for-ingo/xen/dom0/mtrr > > Jeremy Fitzhardinge (1): > xen: set cpu_callout_mask to make mtrr work > > Mark McLoughlin (5): > xen mtrr: Use specific cpu_has_foo macros instead of generic cpu_has() > xen mtrr: Use generic_validate_add_page() > xen mtrr: Implement xen_get_free_region() > xen mtrr: Add xen_{get,set}_mtrr() implementations > xen mtrr: Kill some unnecessary includes > > arch/x86/kernel/cpu/mtrr/mtrr.h | 2 + > arch/x86/kernel/cpu/mtrr/xen.c | 99 ++++++++++++++++++++++++++++++++------- > arch/x86/xen/smp.c | 2 + > 3 files changed, 86 insertions(+), 17 deletions(-)i never got a reply to my question for your previous submission: http://lkml.indiana.edu/hypermail/linux/kernel/0905.1/00152.html Ingo _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2009-May-13 14:39 UTC
[Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
Ingo Molnar wrote:> i never got a reply to my question for your previous submission: > > http://lkml.indiana.edu/hypermail/linux/kernel/0905.1/00152.html >That was in response to the mtrr patch in the dom0/core series.> Please dont post patches with ugly TODO items in them.I removed them in the repost.> Also, a more general objection is that /proc/mtrr is a legacy > interface, we dont really want to extend its use.It''s not an extended use; its just making the existing interface work under Xen (ie, not breaking the userspace ABI). The only other alternatives would be to 1) use Kconfig to prevent MTRR and Xen from being set at the same time, or 2) put a runtime hack in to disable MTRR when running under Xen. Neither seems like a good idea when we can just keep the interface working.> The Xen hypervisor > should get proper PAT support instead ...Well, it has PAT support, but there''s an issue that the Xen PAT setup isn''t quite the same as Linux''s (but I thought you were cc:d on the discussion about that). We need to sort out some details about the precise mechanism, but it looks like we''ll be able to support PAT in Linux guests relatively easily (but not immediately). J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ingo Molnar
2009-May-15 18:27 UTC
[Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
* Jeremy Fitzhardinge <jeremy@goop.org> wrote:> Ingo Molnar wrote: >> i never got a reply to my question for your previous submission: >> >> http://lkml.indiana.edu/hypermail/linux/kernel/0905.1/00152.html >> > > That was in response to the mtrr patch in the dom0/core series. > >> Please dont post patches with ugly TODO items in them. > I removed them in the repost. >> Also, a more general objection is that /proc/mtrr is a legacy >> interface, we dont really want to extend its use. > It''s not an extended use; its just making the existing interface work > under Xen (ie, not breaking the userspace ABI). The only other > alternatives would be to 1) use Kconfig to prevent MTRR and Xen from > being set at the same time, or 2) put a runtime hack in to disable MTRR > when running under Xen. Neither seems like a good idea when we can just > keep the interface working.Right now there''s no MTRR support under Xen guests and the Xen hypervisor was able to survive, right? Why should we do it under dom0? The MTRR code is extremely fragile, we dont really need an added layer there. _Especially_ since /proc/mtrr is an obsolete API. If you want to allow a guest to do MTRR ops, you can do it by catching the native kernel accesses to the MTRR space. There''s no guest side support needed for that. Ingo _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2009-May-15 20:09 UTC
[Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
Ingo Molnar wrote:> Right now there''s no MTRR support under Xen guests and the Xen > hypervisor was able to survive, right? Why should we do it under > dom0? >Because dom0 has direct hardware access, and is running real device drivers. domU guests don''t see physical memory, and so MTRR has no relevance for them.> The MTRR code is extremely fragile, we dont really need an added > layer there. _Especially_ since /proc/mtrr is an obsolete API. >There''s no added layer there. I''m just adding another implementation of mtrr_ops. /proc/mtrr is in wide use today. It may be planned for obsolescence, but there''s no way you can claim its obsolete today (my completely up-to-date F10 X server is using it, for example). We don''t break oldish usermode ABIs in new kernels. Besides, the MTRR code is also a kernel-internal API, used by DRM and other drivers to configure the system MTRR state. Those drivers will either perform badly or outright fail if they can''t set the appropriate cachability properties. That is not obsolete in any way.> If you want to allow a guest to do MTRR ops, you can do it by > catching the native kernel accesses to the MTRR space. There''s no > guest side support needed for that. >MTRR can''t be virtualized like that. It can''t be meaningfully multiplexed, and must be set in a uniform way on all physical CPUs. Guests run on virtual CPUs, and don''t have any knowledge of what the mapping of VCPU to PCPU is, or even any visibility of all PCPUs. It is not a piece of per-guest state; it is system-wide property, maintained by Xen. These patches add the mechanism for dom0 (=hardware control domain) to tell Xen what state they should be in. J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Eric W. Biederman
2009-May-15 23:26 UTC
[Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
Jeremy Fitzhardinge <jeremy@goop.org> writes:> Ingo Molnar wrote: >> Right now there''s no MTRR support under Xen guests and the Xen hypervisor was >> able to survive, right? Why should we do it under dom0? >> > > Because dom0 has direct hardware access, and is running real device drivers. > domU guests don''t see physical memory, and so MTRR has no relevance for them.>> The MTRR code is extremely fragile, we dont really need an added layer >> there. _Especially_ since /proc/mtrr is an obsolete API. >> > > There''s no added layer there. I''m just adding another implementation of > mtrr_ops. > > /proc/mtrr is in wide use today. It may be planned for obsolescence, but > there''s no way you can claim its obsolete today (my completely up-to-date F10 X > server is using it, for example). We don''t break oldish usermode ABIs in new > kernels.Sure it is. There is a better newer replacement. It is taking a while to get userspace transitioned but that is different. Honestly I am puzzled why that it but whatever.> Besides, the MTRR code is also a kernel-internal API, used by DRM and other > drivers to configure the system MTRR state. Those drivers will either perform > badly or outright fail if they can''t set the appropriate cachability properties. > That is not obsolete in any way.There are about 5 of them so let''s fix them. With PAT we are in a much better position both for portability and for flexibility.>> If you want to allow a guest to do MTRR ops, you can do it by catching the >> native kernel accesses to the MTRR space. There''s no guest side support needed >> for that. >> > > MTRR can''t be virtualized like that. It can''t be meaningfully multiplexed, and > must be set in a uniform way on all physical CPUs. Guests run on virtual CPUs, > and don''t have any knowledge of what the mapping of VCPU to PCPU is, or even any > visibility of all PCPUs. > > It is not a piece of per-guest state; it is system-wide property, maintained by > Xen. These patches add the mechanism for dom0 (=hardware control domain) to > tell Xen what state they should be in.Is it possible to fix PAT and get that working first. That is very definitely the preferend API. Eric _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2009-May-15 23:49 UTC
[Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
Eric W. Biederman wrote:>> /proc/mtrr is in wide use today. It may be planned for obsolescence, but >> there''s no way you can claim its obsolete today (my completely up-to-date F10 X >> server is using it, for example). We don''t break oldish usermode ABIs in new >> kernels. >> > > Sure it is. There is a better newer replacement. It is taking a while to > get userspace transitioned but that is different. Honestly I am puzzled > why that it but whatever. >There''s no mention in feature-removal-schedule.txt.>> Besides, the MTRR code is also a kernel-internal API, used by DRM and other >> drivers to configure the system MTRR state. Those drivers will either perform >> badly or outright fail if they can''t set the appropriate cachability properties. >> That is not obsolete in any way. >> > > There are about 5 of them so let''s fix them. >Well, I count at least 30+, but anyway.> With PAT we are in a much better position both for portability and for > flexibility. >PAT is relatively recent, and even more recently bug-free. There are many people with processors which can''t or won''t do PAT; what''s the plan to support them? Just hit them with a performance regression? Or wrap MTRR in some other API?> Is it possible to fix PAT and get that working first. That is very definitely > the preferend API. >Sure, when available. We''re sorting out the details for Xen, but even then it may not be available, either because we''re running on an old version of Xen, or because some other guest is using PAT differently. But I honestly don''t understand the hostility towards 120 lines of code to make an interface (albeit legacy/deprecated/whatever) behave in an expected way. J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jesse Barnes
2009-May-16 03:22 UTC
[Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
On Fri, 15 May 2009 16:49:12 -0700 Jeremy Fitzhardinge <jeremy@goop.org> wrote:> Eric W. Biederman wrote: > >> /proc/mtrr is in wide use today. It may be planned for > >> obsolescence, but there''s no way you can claim its obsolete today > >> (my completely up-to-date F10 X server is using it, for example). > >> We don''t break oldish usermode ABIs in new kernels. > >> > > > > Sure it is. There is a better newer replacement. It is taking a > > while to get userspace transitioned but that is different. > > Honestly I am puzzled why that it but whatever. > > > > There''s no mention in feature-removal-schedule.txt. > > >> Besides, the MTRR code is also a kernel-internal API, used by DRM > >> and other drivers to configure the system MTRR state. Those > >> drivers will either perform badly or outright fail if they can''t > >> set the appropriate cachability properties. That is not obsolete > >> in any way. > > > > There are about 5 of them so let''s fix them. > > > > Well, I count at least 30+, but anyway. > > > With PAT we are in a much better position both for portability and > > for flexibility. > > > > PAT is relatively recent, and even more recently bug-free. There are > many people with processors which can''t or won''t do PAT; what''s the > plan to support them? Just hit them with a performance regression? > Or wrap MTRR in some other API? > > > Is it possible to fix PAT and get that working first. That is > > very definitely the preferend API. > > > > Sure, when available. We''re sorting out the details for Xen, but > even then it may not be available, either because we''re running on an > old version of Xen, or because some other guest is using PAT > differently. > > But I honestly don''t understand the hostility towards 120 lines of > code to make an interface (albeit legacy/deprecated/whatever) behave > in an expected way.FWIW I think supporting the MTRR API in Xen makes sense. There''s a lot of old code out there that wants it; would be nice if it mostly worked, especially at such a minimal cost. It''s taken awhile to get PAT going (and there are still issues here and there) so having the MTRR stuffa available is awfully nice. -- Jesse Barnes, Intel Open Source Technology Center _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Eric W. Biederman
2009-May-16 04:26 UTC
[Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
Jesse Barnes <jbarnes@virtuousgeek.org> writes:> On Fri, 15 May 2009 16:49:12 -0700 > Jeremy Fitzhardinge <jeremy@goop.org> wrote: > >> Eric W. Biederman wrote: >> >> /proc/mtrr is in wide use today. It may be planned for >> >> obsolescence, but there''s no way you can claim its obsolete today >> >> (my completely up-to-date F10 X server is using it, for example). >> >> We don''t break oldish usermode ABIs in new kernels. >> >> >> > >> > Sure it is. There is a better newer replacement. It is taking a >> > while to get userspace transitioned but that is different. >> > Honestly I am puzzled why that it but whatever. >> > >> >> There''s no mention in feature-removal-schedule.txt.I don''t know that it makes sense to remove mtrrs but it certainly doesn''t make sense to use them if you can avoid it.>> >> Besides, the MTRR code is also a kernel-internal API, used by DRM >> >> and other drivers to configure the system MTRR state. Those >> >> drivers will either perform badly or outright fail if they can''t >> >> set the appropriate cachability properties. That is not obsolete >> >> in any way. >> > >> > There are about 5 of them so let''s fix them. >> > >> >> Well, I count at least 30+, but anyway.Wow. We had a lot of those slip in. Definitely time to fix the drivers.>> > With PAT we are in a much better position both for portability and >> > for flexibility. >> > >> >> PAT is relatively recent, and even more recently bug-free. There are >> many people with processors which can''t or won''t do PAT; what''s the >> plan to support them? Just hit them with a performance regression? >> Or wrap MTRR in some other API?PPro is roughly when PAT came out. I remember discussing this a while ago and the conclusion was that there are very few systems with MTRRs that don''t have a usable PAT implementation. I expect many of those systems are on their last legs today.>> Sure, when available. We''re sorting out the details for Xen, but >> even then it may not be available, either because we''re running on an >> old version of Xen, or because some other guest is using PAT >> differently.There are only 3 states that are interesting. WB UC and WC. Since Xen controls the page tables anyway. I expect it can even remap it feels like it.>> But I honestly don''t understand the hostility towards 120 lines of >> code to make an interface (albeit legacy/deprecated/whatever) behave >> in an expected way.> FWIW I think supporting the MTRR API in Xen makes sense. There''s a lot > of old code out there that wants it; would be nice if it mostly worked, > especially at such a minimal cost. It''s taken awhile to get PAT going > (and there are still issues here and there) so having the MTRR stuffa > available is awfully nice.I won''t argue that having MTRRs when you can makes sense. It is a bit weird in a vitalized system. At a practical level there are an increasing number of systems for which MTRRs are unusable because the BIOS sets up overlapping mtrrs. With cheap entry level systems shipping with 4G I expect it is becoming a majority of systems. Eric _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jesse Barnes
2009-May-16 18:22 UTC
[Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
On Fri, 15 May 2009 21:26:47 -0700 ebiederm@xmission.com (Eric W. Biederman) wrote:> > FWIW I think supporting the MTRR API in Xen makes sense. There''s a > > lot of old code out there that wants it; would be nice if it mostly > > worked, especially at such a minimal cost. It''s taken awhile to > > get PAT going (and there are still issues here and there) so having > > the MTRR stuffa available is awfully nice. > > I won''t argue that having MTRRs when you can makes sense. It is a bit > weird in a vitalized system. At a practical level there are an > increasing number of systems for which MTRRs are unusable because the > BIOS sets up overlapping mtrrs. With cheap entry level systems > shipping with 4G I expect it is becoming a majority of systems.Yeah, MTRRs definitely have issues too; as you say on many recent machines they''re tougher to use. Either we need to reprogram them to free some up for WC mappings, or use PAT. But that''s a relatively recent development (last year or two I think?). This is really about what software Xen wants to support though. You can say, "it would be easier for you to just support new software that doesn''t use MTRRs," and you might be right, but supporting older stuff doesn''t appear that difficult, and it sounds like something they want to do. -- Jesse Barnes, Intel Open Source Technology Center _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2009-May-18 04:57 UTC
[Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
Eric W. Biederman wrote:> There are only 3 states that are interesting. WB UC and WC. Since > Xen controls the page tables anyway. I expect it can even remap > it feels like it. >It would be awkward. A paravirtualized guest has direct access to the real pagetables, and so would notice if Xen swizzled around the PAT bits when it reads back a pagetable entry. We don''t currently have any paravirtualized hooks for adjusting the PTE flags, because there hasn''t been any need, and it would probably be pretty costly (lots of read+bit-tests would turn into a function call). On the other hand, there''s probably only a few places (if any) in the kernel which actually inspect the PAT status of an established PTE, so we could put in some special case mapping there. It becomes a maintenance burden to 1) track down all the right places, and then 2) make sure any new instances get handled properly. So, not a preferred solution, I think. But our planned approach is to simply make Xen use the same PAT layout as Linux, and go from there. We still need to sort out the details of how to handle other Xen guests which use the existing Xen PAT setup, how to verify that Xen and the guest kernel are really using the same setup, etc. But since we support the last few year''s worth of released versions of Xen, we still need to handle the PAT-not-supported case with reasonable grace.> I won''t argue that having MTRRs when you can makes sense. It is a bit > weird in a vitalized system.It''s not really virtualized. We''re talking about dom0, which is the guest domain which has access to the real machine''s real hardware; the MTRR is part of that.> At a practical level there are an > increasing number of systems for which MTRRs are unusable because the > BIOS sets up overlapping mtrrs. With cheap entry level systems > shipping with 4G I expect it is becoming a majority of systems. >Yes, but that is true irrespective of Xen. J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2009-May-18 05:02 UTC
[Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
Jesse Barnes wrote:> This is really about what software Xen wants to support though. You > can say, "it would be easier for you to just support new software that > doesn''t use MTRRs," and you might be right, but supporting older stuff > doesn''t appear that difficult, and it sounds like something they want > to do. >My rough target is that you should be able to take a pvops-dom0 kernel and use it to replace the 2.6.18-xen (or other patched up -xen) kernel on an existing server installation without having to replace very much (or anything) else. In practise that means making it work in something like RHEL 5 or SLES 10(?) environment. That seems to be what at least some of my testers are doing. Of course, most of the deployments will be with whatever new distro ships with the kernel. But that doesn''t mean we can write-off the old stuff. (I think AKPM still tests current kernels on something like FC1 or 2 to check for general kernel/distro regressions.) J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ingo Molnar
2009-May-18 08:59 UTC
[Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
* Jeremy Fitzhardinge <jeremy@goop.org> wrote:> It''s not really virtualized. We''re talking about dom0, which is > the guest domain which has access to the real machine''s real > hardware; the MTRR is part of that.That is a really broken model and design of virtualization: splitting the hypervisor into Xen and then a separate Linux dom0 entity because reality called home a few years ago and you needed actual working drivers and hardware support and a developer community to pull that off ... Here Xen invades an already fragile piece of upstream code (/proc/mtrr) that is obsolete and on the way out. If you want a solution you should add PAT support to Xen and you should use recent upstream kernels. Or you should emulate /proc/mtrr in _Xen the hypervisor_, if you really care that much - without increasing the amount of crap in Linux. Without a better reason than what you''ve given so far the answer is really: "no thanks" ... My suspicion is that Linus would (rightfully) refuse to pull such a broken approach from me, so why should i pull it? If i''m wrong and if you can get an Acked-by from Linus _before_ sending a pull request we can override my NAK. I''ve Cc:-ed him, in case he wants to express an opinion. Ingo _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2009-May-18 13:17 UTC
[Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
>>> Ingo Molnar <mingo@elte.hu> 18.05.09 10:59 >>> >Here Xen invades an already fragile piece of upstream code >(/proc/mtrr) that is obsolete and on the way out. If you want a >solution you should add PAT support to Xen and you should use recentAs Jeremy pointed out a number of times, Xen *does* have PAT support, perhaps (that''s my personal opinion) even superior to the Linux one, as it doesn''t redefine the 486-inherited caching mode attributes but rather uses the full 3 bits that the hardware provides (and, as an acknowledgement to the various hardware bugs, makes sure not to use any large page mappings when using non-WB mappings).>upstream kernels. Or you should emulate /proc/mtrr in _Xen the >hypervisor_, if you really care that much - without increasing the >amount of crap in Linux.As Jeremy also pointed out previously, emulating the MTRRs in the hypervisor is very undesirable (and technically at least very close to impossible), as we''re talking about the *real* MTTRs that need managing here (whereas dealing with virtualized MTRRs in a fully virtualized guest is a completely different - and very reasonable - thing). I can only support Jeremy in asking that you please reconsider your NAK. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Chris Wright
2009-May-18 17:51 UTC
[Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
* Ingo Molnar (mingo@elte.hu) wrote:> Here Xen invades an already fragile piece of upstream code > (/proc/mtrr) that is obsolete and on the way out. If you want a > solution you should add PAT support to Xen and you should use recent > upstream kernels. Or you should emulate /proc/mtrr in _Xen the > hypervisor_, if you really care that much - without increasing the > amount of crap in Linux.Could you be specific re: technical issues? I see in the general mtrr impact has one oddity: +int __init common_num_var_ranges +static int __init xen_num_var_ranges(void); +.num_var_ranges A bit unusual to have an __init function in an ops table. Albeit safe in this case. Could slightly minimize impact by keeping setup_num_var_ranges and have it do: if (mtrr_if->num_var_ranges) num_var_ranges = mtrr_if->num_var_ranges(); else num_var_ranges = common_num_var_ranges; Similarly, could do a simple inline stub to remove the extra ifdef. +#ifdef CONFIG_XEN_DOM0 + xen_init_mtrr(); +#endif But those are pretty minor. I think the changes proposed are pretty small and reasonable to make existing /proc/mtrr usable in Xen dom0 (different discussion of when to formally deprecate /proc/mtrr). thanks, -chris _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2009-May-18 18:07 UTC
[Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
Ingo Molnar wrote:> Here Xen invades an already fragile piece of upstream code > (/proc/mtrr) that is obsolete and on the way out. If you want a > solution you should add PAT support to Xen and you should use recent > upstream kernels. Or you should emulate /proc/mtrr in _Xen the > hypervisor_, if you really care that much - without increasing the > amount of crap in Linux. >That''s a gross mis-characterisation of what we''re talking about here. arch/x86 already defines an mtrr_ops, which defines how to manipulate the MTRR registers. There are currently several implementations of that interface. In Xen the MTRR registers belong to the hypervisor, but it allows a privileged kernel to modify them via hypercalls. I simply added a new, straightforward mtrr_ops implementation to do that. It adds about 120 lines of new code, in a single mtrr/xen.c file. That''s it. I could add any number of bizarre convolutions to achieve the same effect, but given that there''s an existing interface that is exactly designed for what we want to achieve, I have to admit it didn''t occur to me to do anything else. MTRR may well be on its way out, and PAT is the proper way to achieve the same effect from now on. But MTRR is still a widely used kernel-internal API as well as usermode ABI, and it seems just gratuitous to not support it when doing so is such a low-impact change. And if/when it comes to be time to completely deprecate/remove mtrr support, we can delete it along with everything else. I''m honestly at a loss to explain the vehemence of your objection to these changes given their nature. We''re also working on making PAT support work where possible. But that doesn''t mean we want to do without anything in the meantime. But more generally, we need to work out how to move things forward. That said, we can live without. If these MTRR patches are your biggest concern, drop them, pull the rest so we can get them seen, tested, in linux-next, etc, ready for the next merge window. You know, like you said you wanted to do the last time you blocked the Xen patches. I would prefer to move them through tip.git: I appreciate your (constructive) comments and reviews, the testing infrastructure has proved very valuable in the past, and they''ll generally get wider exposure than my pool of testers. And its just the right way to do it. But if you''re so concerned that they''ll simply give Linus more ammunition to beat you up with, to the extent that you''re second-guessing yourself, then I''m perfectly happy to submit them myself. I don''t think it would be an overall better outcome, but it keeps the heat off you, and we''d be making some progress. J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ingo Molnar
2009-May-19 09:59 UTC
[Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
* Jeremy Fitzhardinge <jeremy@goop.org> wrote:> Ingo Molnar wrote: >> Here Xen invades an already fragile piece of upstream code >> (/proc/mtrr) that is obsolete and on the way out. If you want a >> solution you should add PAT support to Xen and you should use recent >> upstream kernels. Or you should emulate /proc/mtrr in _Xen the >> hypervisor_, if you really care that much - without increasing the >> amount of crap in Linux. >> > > That''s a gross mis-characterisation of what we''re talking about here. > > arch/x86 already defines an mtrr_ops, which defines how to > manipulate the MTRR registers. There are currently several > implementations of that interface. In Xen the MTRR registers > belong to the hypervisor, but it allows a privileged kernel to > modify them via hypercalls. I simply added a new, straightforward > mtrr_ops implementation to do that. It adds about 120 lines of > new code, in a single mtrr/xen.c file. > > That''s it. I could add any number of bizarre convolutions to > achieve the same effect, but given that there''s an existing > interface that is exactly designed for what we want to achieve, I > have to admit it didn''t occur to me to do anything else.Exactly what is ''bizarre'' about using the API defined by the _CPU_ already, without adding any ad-hoc hypecall? Catch the dom0 WRMSRs, filter out the MTRR indices - that''s it. Ingo _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2009-May-19 10:22 UTC
[Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
>>> Ingo Molnar <mingo@elte.hu> 19.05.09 11:59 >>> >Exactly what is ''bizarre'' about using the API defined by the _CPU_ >already, without adding any ad-hoc hypecall? Catch the dom0 WRMSRs, >filter out the MTRR indices - that''s it.But that is *not* the same as using the hypercalls: The hypercall tells Xen "Change all CPUs'' MTRRs with the indicated index to the indicated value", while the MSR write says "Change the MTRR with the given index on the physical CPU the current virtual CPU happens to run on to the given value". A write-base/write-mask pair may happen to get interrupted (preempted) by the hypervisor, and hence the two writes may happen on different pCPU-s. Teaching the hypervisor to (correctly!) guess what the guest meant in that situation isn''t trivial, as then it needs to handle all possible situations (and it can never know whether Dom0 really intended to do something that may look bogus/inconsistent at the first glance). This is even more so considering that Linux is not the only OS capable of running as Dom0. An apparently relatively simple solution - latch the writes and commit them only when the global view became consistent again - isn''t possible, since Dom0 may not have a vCPU running on each individual pCPU. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ingo Molnar
2009-May-19 11:08 UTC
Re: [Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
* Jan Beulich <JBeulich@novell.com> wrote:> >>> Ingo Molnar <mingo@elte.hu> 19.05.09 11:59 >>> > > > Exactly what is ''bizarre'' about using the API defined by the > > _CPU_ already, without adding any ad-hoc hypecall? Catch the > > dom0 WRMSRs, filter out the MTRR indices - that''s it. > > But that is *not* the same as using the hypercalls: The hypercall > tells Xen "Change all CPUs'' MTRRs with the indicated index to the > indicated value", while the MSR write says "Change the MTRR with > the given index on the physical CPU the current virtual CPU > happens to run on to the given value". [...]The change of MTRR''s on _any_ of the guest CPUs in a dom0 context should immediately be refected on all CPUs. Assymetric MTRR settings are madness. ( And the thing is, changing MTRRs is fragile and racy on native Linux no matter what - even without any hypervisors - due to SMM contexts possibly relying on them etc. )> [...] A write-base/write-mask pair may happen to get interrupted > (preempted) by the hypervisor, and hence the two writes may happen > on different pCPU-s. Teaching the hypervisor to (correctly!) guess > what the guest meant in that situation isn''t trivial, as then it > needs to handle all possible situations (and it can never know > whether Dom0 really intended to do something that may look > bogus/inconsistent at the first glance). [...]None of this is a problem really if a sane approach is used: a change to the MTRR state on dom0 is applied symmetrically on all CPUs. Or, alternatively, the hypervisor can expose its own administrative interface to manage MTRRs. There''s no need to fuglify the Linux kernel for that. Ingo _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2009-May-19 12:04 UTC
Re: [Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
On 05/19/09 13:08, Ingo Molnar wrote:> Or, alternatively, the hypervisor can expose its own administrative > interface to manage MTRRs.Guess what? Xen does exactly that. And the xen mtrr_ops implementation uses that interface ... cheers, Gerd _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ingo Molnar
2009-May-19 12:26 UTC
Re: [Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
* Gerd Hoffmann <kraxel@redhat.com> wrote:> On 05/19/09 13:08, Ingo Molnar wrote: >> Or, alternatively, the hypervisor can expose its own administrative >> interface to manage MTRRs. > > Guess what? Xen does exactly that. And the xen mtrr_ops > implementation uses that interface ...No, that is not an ''administrative interface'' - that is a guest kernel level hack that complicates Linux, extends its effective ABI dependencies and which has to be maintained there from that point on. There''s really just three proper technical solutions here: - either catch the lowlevel CPU hw ops (the MSR modifications, which isnt really all that different from the mtrr_ops approach so it shouldnt pose undue difficulties to the Xen hypervisor). That will be maximally transparent and compatible, with zero changes needed to the Linux kernel. - or introduce its own hypercall API based administration API, without bothering the guest kernel with crap. Trivially patch Xorg to make use of it and that''s it. There is absolutely no reason to introduce some intermediate crap into Linux. Ingo _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Alan Cox
2009-May-19 12:32 UTC
Re: [Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
> - or introduce its own hypercall API based administration API, > without bothering the guest kernel with crap. Trivially patch Xorg > to make use of it and that''s it.PCI pass through mixed with that isn''t going to be fun and PCI pass through is one area where a lot of the MTRR manipulation is meaningful and valid (and can be handled for the guest). I really don''t see why rd/wrmsr processing isn''t sufficient for this Alan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ingo Molnar
2009-May-19 12:37 UTC
Re: [Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
* Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:> > - or introduce its own hypercall API based administration API, > > without bothering the guest kernel with crap. Trivially patch Xorg > > to make use of it and that''s it. > > PCI pass through mixed with that isn''t going to be fun and PCI > pass through is one area where a lot of the MTRR manipulation is > meaningful and valid (and can be handled for the guest).virtualizing that aspect of the CPU properly is certainly the highest grade approach.> I really don''t see why rd/wrmsr processing isn''t sufficient for > this/me neither. Ingo _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2009-May-19 13:21 UTC
Re: [Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
On 05/19/09 14:26, Ingo Molnar wrote:> * Gerd Hoffmann<kraxel@redhat.com> wrote: > >> On 05/19/09 13:08, Ingo Molnar wrote: >>> Or, alternatively, the hypervisor can expose its own administrative >>> interface to manage MTRRs. >> Guess what? Xen does exactly that. And the xen mtrr_ops >> implementation uses that interface ... > > No, that is not an ''administrative interface'' - that is a guest > kernel level hack that complicates Linux, extends its effective ABI > dependencies and which has to be maintained there from that point > on. > > There''s really just three proper technical solutions here: > > - either catch the lowlevel CPU hw ops (the MSR modifications, which > isnt really all that different from the mtrr_ops approach so it > shouldnt pose undue difficulties to the Xen hypervisor).Devil is in the details. The dom0 kernel might not see all physical cpus on the system. So Xen can''t leave the job of looping over all cpus to the dom0 kernel, Xen has to apply the changes made by the (priviledged) guest kernel on any (virtual) cpu to all (physical) cpus in the machine. Which in turn means the "lowlevel cpu hw op" would work in a slightly different way on Xen and native. Nasty.> That will > be maximally transparent and compatible, with zero changes needed > to the Linux kernel.No, the linux kernel probably should do the wrmsr on one cpu only then.> - or introduce its own hypercall API based administration API, > without bothering the guest kernel with crap. Trivially patch Xorg > to make use of it and that''s it.I have serious doubts that this is going to fly with KMS. Oops, the third "proper technical solutions" is missing. cheers, Gerd _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ingo Molnar
2009-May-19 13:31 UTC
Re: [Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
* Gerd Hoffmann <kraxel@redhat.com> wrote:> On 05/19/09 14:26, Ingo Molnar wrote: >> * Gerd Hoffmann<kraxel@redhat.com> wrote: >> >>> On 05/19/09 13:08, Ingo Molnar wrote: >>>> Or, alternatively, the hypervisor can expose its own administrative >>>> interface to manage MTRRs. >>> Guess what? Xen does exactly that. And the xen mtrr_ops >>> implementation uses that interface ... >> >> No, that is not an ''administrative interface'' - that is a guest >> kernel level hack that complicates Linux, extends its effective ABI >> dependencies and which has to be maintained there from that point >> on. >> >> There''s really just three proper technical solutions here: >> >> - either catch the lowlevel CPU hw ops (the MSR modifications, which >> isnt really all that different from the mtrr_ops approach so it >> shouldnt pose undue difficulties to the Xen hypervisor). > > Devil is in the details. > > The dom0 kernel might not see all physical cpus on the system. So > Xen can''t leave the job of looping over all cpus to the dom0 > kernel, Xen has to apply the changes made by the (priviledged) > guest kernel on any (virtual) cpu to all (physical) cpus in the > machine.Applying MTRR changes to only part of the CPUs is utter madness.> Which in turn means the "lowlevel cpu hw op" would work in a > slightly different way on Xen and native. Nasty. > >> That will >> be maximally transparent and compatible, with zero changes needed >> to the Linux kernel. > > No, the linux kernel probably should do the wrmsr on one cpu only then.Why?>> - or introduce its own hypercall API based administration API, >> without bothering the guest kernel with crap. Trivially patch Xorg >> to make use of it and that''s it. > > I have serious doubts that this is going to fly with KMS. > > Oops, the third "proper technical solutions" is missing.Yeah, the third one is to not touch MTRRs after bootup and use PAT. Ingo _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2009-May-19 13:51 UTC
Re: [Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
On 05/19/09 15:31, Ingo Molnar wrote:> * Gerd Hoffmann<kraxel@redhat.com> wrote: > >> On 05/19/09 14:26, Ingo Molnar wrote: >>> * Gerd Hoffmann<kraxel@redhat.com> wrote: >>> >>>> On 05/19/09 13:08, Ingo Molnar wrote: >>>>> Or, alternatively, the hypervisor can expose its own administrative >>>>> interface to manage MTRRs. >>>> Guess what? Xen does exactly that. And the xen mtrr_ops >>>> implementation uses that interface ... >>> No, that is not an ''administrative interface'' - that is a guest >>> kernel level hack that complicates Linux, extends its effective ABI >>> dependencies and which has to be maintained there from that point >>> on. >>> >>> There''s really just three proper technical solutions here: >>> >>> - either catch the lowlevel CPU hw ops (the MSR modifications, which >>> isnt really all that different from the mtrr_ops approach so it >>> shouldnt pose undue difficulties to the Xen hypervisor). >> Devil is in the details. >> >> The dom0 kernel might not see all physical cpus on the system. So >> Xen can''t leave the job of looping over all cpus to the dom0 >> kernel, Xen has to apply the changes made by the (priviledged) >> guest kernel on any (virtual) cpu to all (physical) cpus in the >> machine. > > Applying MTRR changes to only part of the CPUs is utter madness.Sure. Do you read what I''m writing?>> Which in turn means the "lowlevel cpu hw op" would work in a >> slightly different way on Xen and native. Nasty. >> >>> That will >>> be maximally transparent and compatible, with zero changes needed >>> to the Linux kernel. >> No, the linux kernel probably should do the wrmsr on one cpu only then. > > Why?See above. Xen has to apply the changes to all cpus anyway.>> Oops, the third "proper technical solutions" is missing. > > Yeah, the third one is to not touch MTRRs after bootup and use PAT.Works only in case the CPU has PAT support. cheers, Gerd _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ingo Molnar
2009-May-19 14:17 UTC
Re: [Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
* Gerd Hoffmann <kraxel@redhat.com> wrote:> On 05/19/09 15:31, Ingo Molnar wrote: >> * Gerd Hoffmann<kraxel@redhat.com> wrote: >> >>> On 05/19/09 14:26, Ingo Molnar wrote: >>>> * Gerd Hoffmann<kraxel@redhat.com> wrote: >>>> >>>>> On 05/19/09 13:08, Ingo Molnar wrote: >>>>>> Or, alternatively, the hypervisor can expose its own administrative >>>>>> interface to manage MTRRs. >>>>> Guess what? Xen does exactly that. And the xen mtrr_ops >>>>> implementation uses that interface ... >>>> No, that is not an ''administrative interface'' - that is a guest >>>> kernel level hack that complicates Linux, extends its effective ABI >>>> dependencies and which has to be maintained there from that point >>>> on. >>>> >>>> There''s really just three proper technical solutions here: >>>> >>>> - either catch the lowlevel CPU hw ops (the MSR modifications, which >>>> isnt really all that different from the mtrr_ops approach so it >>>> shouldnt pose undue difficulties to the Xen hypervisor). >>> Devil is in the details. >>> >>> The dom0 kernel might not see all physical cpus on the system. So >>> Xen can''t leave the job of looping over all cpus to the dom0 >>> kernel, Xen has to apply the changes made by the (priviledged) >>> guest kernel on any (virtual) cpu to all (physical) cpus in the >>> machine. >> >> Applying MTRR changes to only part of the CPUs is utter madness. > > Sure. Do you read what I''m writing? > >>> Which in turn means the "lowlevel cpu hw op" would work in a >>> slightly different way on Xen and native. Nasty. >>> >>>> That will >>>> be maximally transparent and compatible, with zero changes needed >>>> to the Linux kernel. >>> No, the linux kernel probably should do the wrmsr on one cpu only then. >> >> Why? > > See above. Xen has to apply the changes to all cpus anyway.do _you_ read what i wrote, in the thread you are replying to: | | The change of MTRR''s on _any_ of the guest CPUs in a dom0 context | should immediately be refected on all CPUs. Assymetric MTRR | settings are madness. |>>> Oops, the third "proper technical solutions" is missing. >> >> Yeah, the third one is to not touch MTRRs after bootup and use PAT. > > Works only in case the CPU has PAT support.Which specific CPU without PAT support do you worry about? Ingo _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2009-May-19 14:55 UTC
Re: [Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
>>>> No, the linux kernel probably should do the wrmsr on one cpu only then. >>> Why?> | The change of MTRR''s on _any_ of the guest CPUs in a dom0 context > | should immediately be refected on all CPUs. Assymetric MTRR > | settings are madness.Exactly. And thats why it is pointless to let the dom0 kernel write the mtrr msrs on more than one cpu.>>>> Oops, the third "proper technical solutions" is missing. >>> Yeah, the third one is to not touch MTRRs after bootup and use PAT. >> Works only in case the CPU has PAT support. > > Which specific CPU without PAT support do you worry about?For example: I have a Notebook here, with MTRR and without PAT according to the boot log. "Pentium III (Coppermine)" says /proc/cpuinfo. cheers, Gerd _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ingo Molnar
2009-May-19 15:24 UTC
Re: [Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
* Gerd Hoffmann <kraxel@redhat.com> wrote:>>>>> No, the linux kernel probably should do the wrmsr on one cpu only then. >>>> Why? > >> | The change of MTRR''s on _any_ of the guest CPUs in a dom0 context >> | should immediately be refected on all CPUs. Assymetric MTRR >> | settings are madness. > > Exactly. And thats why it is pointless to let the dom0 kernel > write the mtrr msrs on more than one cpu.How does this negate my claim that the Linux kernel needs no modifications? (which i think your point is - let me know if you have some other point here.) the Xen hypervisor can simply repeat all requests (i.e. not care at all about the fact that a guest does these modifications on all CPUs it sees), or realize that the modification has already been done and skip it. This is in no way a performance sensitive codepath - mtrrs are only modified in init sequences - and setting mtrrs is slow and globally serialized to begin with.>>>>> Oops, the third "proper technical solutions" is missing. >>>> Yeah, the third one is to not touch MTRRs after bootup and use PAT. >>> Works only in case the CPU has PAT support. >> >> Which specific CPU without PAT support do you worry about? > > For example: I have a Notebook here, with MTRR and without PAT > according to the boot log. "Pentium III (Coppermine)" says > /proc/cpuinfo.That''s a really old CPU, but even Coppermine has PAT support in the CPU. You need to go back to things like P5 200 MHz CPUs to find PAT-less CPUs. On the Linux side it''s easy to enable it (and _such_ a patch would make sense indeed) - it''s just that nobody has yet gone through the effort of testing and validating the PAT code on that CPU. If you care enough, you can do it, send a patch and ping the Intel folks about it. Once the issues are framed correctly, Linux is helped for real. Ingo _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2009-May-20 08:01 UTC
Re: [Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
On 05/19/09 17:24, Ingo Molnar wrote:> the Xen hypervisor can simply repeat all requests (i.e. not care at > all about the fact that a guest does these modifications on all CPUs > it sees), or realize that the modification has already been done and > skip it.Could be done, yes. It still feels wrong that wrmsr(mtrr) works slightly different on xen and on native. And it wouldn''t work on existing Xen deployments as the Xen hypervisor doesn''t support that today.>>>>> Yeah, the third one is to not touch MTRRs after bootup and use PAT.> That''s a really old CPU, but even Coppermine has PAT support in the > CPU. You need to go back to things like P5 200 MHz CPUs to find > PAT-less CPUs.Linux shouln''t say "PAT not supported by CPU." then. Also it doesn''t make sense to me to handle things differently on native and xen. While it might make sense to deprecate mtrrs in favor of PAT (don''t know enougth about all the different cpus in the wild to justify that) I don''t think it makes sense to do that for xen only. Native should declare mtrrs obsolete as well. cheers, Gerd _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge <jeremy@goop.org> writes:> arch/x86 already defines an mtrr_ops, which defines how to manipulate > the MTRR registers. There are currently several implementations of > that interface. In Xen the MTRR registers belong to the hypervisor, > but it allows a privileged kernel to modify them via hypercalls.One part that''s unclear to me in this discussion. Could you perhaps clarify Jeremy?: Even Dom0 is not continuous in physical memory, but mapped page by page except for swiotlb mappings. But MTRRs are fundamentally a way to change attributes for large physically continous mappings. How do these two meet? After all when you change a MTRR for a given range of memory linux sees as continuous it isn''t necessarily in Xen. Is this new interface only defined for swiotlb or MMIO mappings? If yes did you check the drivers only actually set it on swiotlb or MMIO (that seems dubious to me)? -Andi -- ak@linux.intel.com -- Speaking for myself only. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2009-May-20 16:12 UTC
[Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
Ingo Molnar wrote:>> That''s it. I could add any number of bizarre convolutions to >> achieve the same effect, but given that there''s an existing >> interface that is exactly designed for what we want to achieve, I >> have to admit it didn''t occur to me to do anything else. >> > > Exactly what is ''bizarre'' about using the API defined by the _CPU_ > already, without adding any ad-hoc hypecall? Catch the dom0 WRMSRs, > filter out the MTRR indices - that''s it. >Well, the x86 world can''t seem to decide what the ABI is supposed to be, which is why we have mtrr_ops in the first place. Doing emulation at the MSR level means that I''d need to decide which MTRR interface we''re emulating today and do that. Yes, I realize that almost everyone is using the same Intel-like interface these days, but it does mean there''s a level of fragility that doesn''t exist if we just implement mtrr_ops. There''s some secondary issues which arise. For example, the mtrr trimming test is meaningless in dom0 (the e820 is fake, so it doesn''t make sense to compare it with the mtrrs); we currently avoid that because the test only happens if the mtrr vendor is Intel. We would need to disable that test some other way. J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2009-May-20 16:35 UTC
Re: [Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
Ingo Molnar wrote:> * Jan Beulich <JBeulich@novell.com> wrote: > > >>>>> Ingo Molnar <mingo@elte.hu> 19.05.09 11:59 >>> >>>>> >>> Exactly what is ''bizarre'' about using the API defined by the >>> _CPU_ already, without adding any ad-hoc hypecall? Catch the >>> dom0 WRMSRs, filter out the MTRR indices - that''s it. >>> >> But that is *not* the same as using the hypercalls: The hypercall >> tells Xen "Change all CPUs'' MTRRs with the indicated index to the >> indicated value", while the MSR write says "Change the MTRR with >> the given index on the physical CPU the current virtual CPU >> happens to run on to the given value". [...] >> > > The change of MTRR''s on _any_ of the guest CPUs in a dom0 context > should immediately be refected on all CPUs. Assymetric MTRR settings > are madness. > > ( And the thing is, changing MTRRs is fragile and racy on native > Linux no matter what - even without any hypervisors - due to SMM > contexts possibly relying on them etc. ) > > >> [...] A write-base/write-mask pair may happen to get interrupted >> (preempted) by the hypervisor, and hence the two writes may happen >> on different pCPU-s. Teaching the hypervisor to (correctly!) guess >> what the guest meant in that situation isn''t trivial, as then it >> needs to handle all possible situations (and it can never know >> whether Dom0 really intended to do something that may look >> bogus/inconsistent at the first glance). [...] >> > > None of this is a problem really if a sane approach is used: a > change to the MTRR state on dom0 is applied symmetrically on all > CPUs. > > Or, alternatively, the hypervisor can expose its own administrative > interface to manage MTRRs. > > There''s no need to fuglify the Linux kernel for that.I''m not sure what you mean by that, other than as a description of the current case. The Xen MTRR hypercall: 1. treats MTRR ranges as allocatable resources, and keep track of how many uses there are of each 2. updates all physical cpus synchronously (ie, the MTRR is not presented as a property of dom0''s virtual CPU, but as a system-wide resource) 3. prevents guests from setting inconsistent or conflicting MTRRs Mapping from MSR writes to this interface is moderately complex, because it requires a mapping from a low-semantic-content interface to a high-semantic-content interface. It essentially requires parsing the MSR writes to map them back to the relatively high-level operations at the mtrr_ops interface and then present that to Xen. There are at least a couple of secondary issues which arise from that approach: * mtrr/generic.c also has to do a number of other things like disabling caching, tlb flushes, etc. That adds complexity because Xen guests are never allowed to globally disable caching, so we''d have to add additional filtering to remove those cr0 writes * As we''ve discussed, we''d need to make the mtrr writes implicitly change all cpus atomically, as the dom0 kernel can''t see physical cpus The net effect would be that we would be making a pile of apparently generic CPU operations (MSR writes, control register writes) actually feed a fairly complex parser, increasing the difference between the Xen and native cases even more. mtrr/generic.c about 730 lines of fairly intricate arch-specific code. mtrr/xen.c is 120 lines of straightforward hypercalls. The mtrr_ops interface and the Xen hypercall interface are a close semantic match, so there''s very little glue code in there. But that said, this a huge distraction, an unbelievable amount of noise for a fairly minor point. We can live without these changes, and they''re certainly easy enough to carry out of tree in the meantime. If you can''t live with these changes, then drop them and we''ll work out something else. J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2009-May-20 16:39 UTC
[Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
Andi Kleen wrote:> Jeremy Fitzhardinge <jeremy@goop.org> writes: > > >> arch/x86 already defines an mtrr_ops, which defines how to manipulate >> the MTRR registers. There are currently several implementations of >> that interface. In Xen the MTRR registers belong to the hypervisor, >> but it allows a privileged kernel to modify them via hypercalls. >> > > One part that''s unclear to me in this discussion. Could you perhaps > clarify Jeremy?: > > Even Dom0 is not continuous in physical memory, but mapped page by page > except for swiotlb mappings. But MTRRs are fundamentally a way > to change attributes for large physically continous mappings. How do these > two meet? > > After all when you change a MTRR for a given range of memory > linux sees as continuous it isn''t necessarily in Xen. > > Is this new interface only defined for swiotlb or MMIO mappings? > If yes did you check the drivers only actually set it on > swiotlb or MMIO (that seems dubious to me)?Really? Do we ever set unusual memory types on normal system memory? From what I''ve seen, MTRRs are only ever applied to device mapped memory (framebuffers, etc). I guess it could possibly make sense on system memory which is being prepped for DMA (swiotlb, alloc_coherent, etc), but dom0 would have a pseudo-phys to machine mapping for that memory too (it would be obviously problematic if something tried to program MTRR with pseudo-physical addresses, but Xen would/should probably disallow it anyway). J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2009-May-20 22:49 UTC
[Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
Andi Kleen wrote:> There were drivers at some point that set WC MTRRs on their > transfer rings, not mmio. >To avoid cache pollution effects?> Maybe these all switched to PAT now, maybe not > > It''s hard to say because it could be also done from user space. >How does userspace do the virtual to physical mapping? If the driver can''t also deal with the extra conversion from physical to machine under Xen, then it won''t have much success either way. J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
> >Is this new interface only defined for swiotlb or MMIO mappings? > >If yes did you check the drivers only actually set it on > >swiotlb or MMIO (that seems dubious to me)? > > Really? Do we ever set unusual memory types on normal system memory?There were drivers at some point that set WC MTRRs on their transfer rings, not mmio. Maybe these all switched to PAT now, maybe not It''s hard to say because it could be also done from user space. -Andi -- ak@linux.intel.com -- Speaking for myself only. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
H. Peter Anvin
2009-May-20 23:03 UTC
[Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
Andi Kleen wrote:>>> Is this new interface only defined for swiotlb or MMIO mappings? >>> If yes did you check the drivers only actually set it on >>> swiotlb or MMIO (that seems dubious to me)? >> Really? Do we ever set unusual memory types on normal system memory? > > There were drivers at some point that set WC MTRRs on their > transfer rings, not mmio. > > Maybe these all switched to PAT now, maybe notMost of those that I''ve heard of use WB memory and CLFLUSH. The ones I know of which marked descriptor rings WB were ones which had descriptor rings in MMIO space (which makes sense: MMIO reads are fully synchronizing, but writes can be posted and write combined.) -hpa _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel