Scott Parish
2005-Apr-19 23:03 UTC
[Xen-devel] understanding __linear_l2_table and friends
I was trying to understand the steps behind domain creation, but I'm having trouble getting past this. Would someone mind saying a few words about what these are and (if still needed) why these calculations work?

xen/include/asm-x86/page.h:

#define linear_l1_table \
    ((l1_pgentry_t *)(LINEAR_PT_VIRT_START))
#define __linear_l2_table \
    ((l2_pgentry_t *)(LINEAR_PT_VIRT_START + \
    (LINEAR_PT_VIRT_START >> (PAGETABLE_ORDER<<0))))
#define __linear_l3_table \
    ((l3_pgentry_t *)(LINEAR_PT_VIRT_START + \
    (LINEAR_PT_VIRT_START >> (PAGETABLE_ORDER<<0)) + \
    (LINEAR_PT_VIRT_START >> (PAGETABLE_ORDER<<1))))
#define __linear_l4_table \
    ((l4_pgentry_t *)(LINEAR_PT_VIRT_START + \
    (LINEAR_PT_VIRT_START >> (PAGETABLE_ORDER<<0)) + \
    (LINEAR_PT_VIRT_START >> (PAGETABLE_ORDER<<1)) + \
    (LINEAR_PT_VIRT_START >> (PAGETABLE_ORDER<<2))))

Thanks!
sRp

--
Scott Parish

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Keir Fraser
2005-Apr-20 10:05 UTC
Re: [Xen-devel] understanding __linear_l2_table and friends
They aren't actually used during domain building, but anyway: Xen uses the common trick whereby each page directory maps itself. This means that every page-table entry is mapped into the address space at some virtual address. In fact, page directory entries (and PML3 and PML4 entries on x86/64) are also directly accessible in the virtual address space. The macros below are expressions that evaluate to the correct virtual addresses.

 -- Keir

> I was trying to understand the steps behind domain creation, but I'm
> having trouble getting past this. Would someone mind saying a few
> words about what these are and (if still needed) why these calculations
> work?
>
> xen/include/asm-x86/page.h:
> [macro definitions snipped]
>
> Thanks!
> sRp
Gerd Knorr
2005-Apr-20 16:06 UTC
Re: [Xen-devel] understanding __linear_l2_table and friends
Keir Fraser <Keir.Fraser@cl.cam.ac.uk> writes:

> They aren't actually used during domain building,

Are they used anywhere else? Especially __linear_l2_table and __linear_l3_table?

> Xen uses the common trick whereby each page directory maps
> itself. This means that every page-table entry is mapped into the
> address space at some virtual address.

Well, in PAE mode that trick doesn't fully work. It will do fine for the l1 tables, I think also for l2, but certainly not for l3 due to address space constraints ...

  Gerd

--
#define printk(args...) fprintf(stderr, ## args)
Ian Pratt
2005-Apr-20 16:25 UTC
RE: [Xen-devel] understanding __linear_l2_table and friends
> > Xen uses the common trick whereby each page directory maps itself.
> > This means that every page-table entry is mapped into the address
> > space at some virtual address.
>
> Well, in PAE mode that trick doesn't fully work. It will do
> fine for the l1 tables, I think also for l2, but certainly
> not for l3 due to address space constraints ...

???

The linear tables for PAE will consume 8MB of VA space, and all the current process's L1, L2 and L3 pages will all be contained within the linear table.

You can use the linear table to update any PTE in the domain's current address space by virtual address.

Ian
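The 8MB figure checks out with some back-of-the-envelope arithmetic (toy calculation, not Xen code): PAE covers a 4GB address space with 4KB pages, and each entry is 64-bit wide.

```python
# Sanity-check the 8MB figure for the PAE linear pagetable view.
PAGE_SIZE = 4096            # 4KB pages
PTE_SIZE  = 8               # PAE page-table entries are 64-bit
VA_SPACE  = 1 << 32         # full 32-bit (4GB) virtual address space

num_l1_entries = VA_SPACE // PAGE_SIZE        # one PTE per 4KB page
linear_table_bytes = num_l1_entries * PTE_SIZE

print(num_l1_entries)       # 1048576 PTEs
print(linear_table_bytes)   # 8388608 bytes = 8MB of VA space
```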
Keir Fraser
2005-Apr-20 16:31 UTC
Re: [Xen-devel] understanding __linear_l2_table and friends
On 20 Apr 2005, at 17:25, Ian Pratt wrote:

>> Well, in PAE mode that trick doesn't fully work. It will do
>> fine for the l1 tables, I think also for l2, but certainly
>> not for l3 due to address space constraints ...
>
> ???
>
> The linear tables for PAE will consume 8MB of VA space, and all the
> current process's L1, L2 and L3 pages will all be contained within
> the linear table.
>
> You can use the linear table to update any PTE in the domain's current
> address space by virtual address.

Gerd is correct that it does not fully work for PAE, but not simply because of address-space considerations. The top-level page directory in PAE is not the same format as the lower levels (it contains 4 entries rather than 512), so the trick of it mapping itself doesn't work.

We don't currently use the linear mapping for anything other than L1 entries anyway, except maybe in shadow code, and we can fix it up by other means (separately map the top-level page dir).

 -- Keir
Ian Pratt
2005-Apr-20 18:53 UTC
RE: [Xen-devel] understanding __linear_l2_table and friends
> Gerd is correct that it does not fully work for PAE, but not
> simply because of address-space considerations. The top-level
> page directory in PAE is not the same format as the lower
> levels (it contains 4 entries rather than 512), so the trick
> of it mapping itself doesn't work.

It works at the expense of burning an extra 2MB of VA space in an L2...

We have to take 4 slots in the L2 handling the top of the VA space, and have the four slots point at the 4 L2s. We can use this to access all the L1's and L2's.

We then take another slot in the uppermost L2 and have it point at the L3.

Puke. PAE is utterly disgusting.

Ian
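The VA cost of the workaround Ian describes can be tallied with toy arithmetic (illustration only, not Xen code): each PAE L2 slot spans 2MB, four slots expose all the L1 pages, and the extra slot exposes the L3.

```python
# Toy arithmetic for the PAE self-map workaround: steal L2 slots to map
# the pagetable pages themselves.
L2_SLOT_SPAN = 2 * 1024 * 1024   # each PAE L2 entry maps 2MB of VA

slots_for_l1_view = 4            # 4 slots point at the 4 L2 pages,
                                 # exposing all L1 entries (the 8MB figure)
slots_for_l3_view = 1            # the "extra 2MB": one slot maps the L3

va_cost = (slots_for_l1_view + slots_for_l3_view) * L2_SLOT_SPAN
print(va_cost // (1024 * 1024))  # 10 (MB of VA space burned in total)
```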
Gerd Knorr
2005-Apr-20 19:14 UTC
Re: [Xen-devel] understanding __linear_l2_table and friends
On Wed, Apr 20, 2005 at 07:53:00PM +0100, Ian Pratt wrote:

> > Gerd is correct that it does not fully work for PAE, but not
> > simply because of address-space considerations.

Well, sort of. The trick requires that the linear page table address space is aligned to what the topmost page table level can handle, and it eats one entry. We would have to align the linear page table @ 3GB and waste 1GB of address space; then the self-referencing trick would work even with the 3rd level, I think. Obviously not an option ;)

> We have to take 4 slots in the L2 handling the top of the VA space, and
> have the four slots point at the 4 L2s. We can use this to access all
> the L1's and L2's.

That's exactly what I'm doing at the moment.

> We then take another slot in the uppermost L2 and have it point at the
> L3.

That I don't ;)

While I'm at it: which levels are writable pagetables used for (without shadowing)? Only the first? Or also the other ones?

  Gerd

--
#define printk(args...) fprintf(stderr, ## args)
Scott Parish
2005-Apr-20 19:46 UTC
Re: [Xen-devel] understanding __linear_l2_table and friends
On Wed, Apr 20, 2005 at 11:05:02AM +0100, Keir Fraser wrote:

> Xen uses the common trick whereby each page directory maps
> itself. This means that every page-table entry is mapped into the
> address space at some virtual address.

So this is the same as NetBSD's recursive page table stuff. Thanks for the explanation.

sRp

--
Scott Parish
Ian Pratt
2005-Apr-20 20:27 UTC
RE: [Xen-devel] understanding __linear_l2_table and friends
> > We have to take 4 slots in the L2 handling the top of the VA space,
> > and have the four slots point at the 4 L2s. We can use this to access
> > all the L1's and L2's.
>
> That's exactly what I'm doing at the moment.
>
> > We then take another slot in the uppermost L2 and have it point at
> > the L3.
>
> That I don't ;)

There are three possible solutions for L3 accesses:

 * wrap them in map_domain_mem. This will be very slow
 * burn 2MB of VA space in an L2 to map the L3
 * insist on every pagetable having a reserved L1 in which we can steal
   a 4KB slot

Both 2 and 3 are plausible, though 3 might waste a little physical memory unless we arranged things such that the kernel could make use of the remaining slots. Having a per-pagetable L2 with reserved slots is going to be enough of a pain anyhow.

> While I'm at it: which levels are writable pagetables used
> for (without shadowing)? Only the first? Or also the other ones?

We currently just use them for L1's, as you typically don't see many batch updates to L2s (at least relatively speaking). We currently use mmu_update hypercalls for L2 updates, though it probably wouldn't be much slower if we just used the instruction emulation path. Since it's all hidden in the setpgd macro it's not a big deal either way...

In the first instance, it probably makes sense to get PAE working using hypercalls everywhere, then debug the emulation path, and finally enable full writeable pagetables.

Cheers,
Ian
Gerd Knorr
2005-Apr-20 21:38 UTC
Re: [Xen-devel] understanding __linear_l2_table and friends
> There are three possible solutions for L3 accesses:
> * wrap them in map_domain_mem. This will be very slow
> * burn 2MB of VA space in an L2 to map the L3
> * insist on every pagetable having a reserved L1 in which we can steal
>   a 4KB slot

According to Keir, linear tables are used for L1 access only anyway, so this probably isn't an issue. Besides that, I'd probably go with (1). The l3 in PAE mode is just 4 entries, so access to them is very likely rare, thus I'd rather take the small map/unmap performance hit than try to implement complicated things like (3), which could have unexpected side effects all over the place in the paging code.

> In the first instance, it probably makes sense to get PAE working using
> hypercalls everywhere, then debug the emulation path, and finally
> enable full writeable pagetables.

I'm not that far yet ...

How does the console output of domain 0 work? Is it passed to xen via hypercall? Or does domain 0 manage it itself (very early in boot)?

How far does the boot of the xenolinux kernel in domain 0 get with the initial pagetable setup created by xen's dom0 builder? I think I should see some kernel messages from linux before it actually touches the page tables?

Current state is that xen itself comes up fine, the domain 0 builder completes, but the xenlinux kernel is killed via domain_crash() very early, before the first message appears on the screen, and I'm trying to figure out what is going on ...

  Gerd

--
#define printk(args...) fprintf(stderr, ## args)
Ian Pratt
2005-Apr-20 22:10 UTC
Re: [Xen-devel] understanding __linear_l2_table and friends
> > There are three possible solutions for L3 accesses:
> > * wrap them in map_domain_mem. This will be very slow
> > * burn 2MB of VA space in an L2 to map the L3
> > * insist on every pagetable having a reserved L1 in which we can
> >   steal a 4KB slot
>
> According to Keir, linear tables are used for L1 access only anyway, so
> this probably isn't an issue. Besides that, I'd probably go with (1).
> The l3 in PAE mode is just 4 entries, so access to them is very likely
> rare, thus I'd rather take the small map/unmap performance hit than try
> to implement complicated things like (3), which could have unexpected
> side effects all over the place in the paging code.

That'll be OK to get paravirt mode working, but the shadow modes do perform a fair number of accesses to L2 (L3) pages via linear mappings. Scheme #1 will do for starters, though. Scheme #2 is easy too, but we have to be careful how much lowmem we burn.

> How does the console output of domain 0 work? Is it passed to xen via
> hypercall? Or does domain 0 manage it itself (very early in boot)?

It goes via a hypercall. To get early printk, just hack the following into the obvious place in kernel/printk.c after vscnprintf:

  HYPERVISOR_console_io(CONSOLEIO_write, sizeof(printk_buf), printk_buf);

> How far does the boot of the xenolinux kernel in domain 0 get with the
> initial pagetable setup created by xen's dom0 builder? I think
> I should see some kernel messages from linux before it actually
> touches the page tables?

With the above hack, yes.

Cheers,
Ian
Ian Pratt
2005-Apr-21 13:51 UTC
RE: [Xen-devel] understanding __linear_l2_table and friends
One key design decision with PAE para-virtualized guests is how to handle the per-pagetable (as opposed to per-domain) mappings that exist in the hypervisor reserved area. The only ones of these that spring to mind are in fact the linear pagetable mappings.

PAE Linux currently uses a single L2 for all kernel mappings, shared across all pagetables. Thus, when we do the mmu_ext_op hypercall to switch cr3, we'd need to write new values into the appropriate L2 of the destination pagetable before re-loading cr3 (since in reality there'll only ever be one such L2 for the domain, it makes sense to leave an open map_domain_mem to it). The downside of this scheme is that it will cripple the TLB flush filter on Opteron. Linux used to do this until 2.6.11 anyhow, and no-one really complained much. The far bigger problem is that it won't work for SMP guests, at least without making the L2 per-VCPU and updating the L3 accordingly using mm ref counting, which would be messy but do-able.

The alternative is to hack PAE Linux to force the L2 containing kernel mappings to be per-pagetable rather than shared. The downside of this is that we use an extra 4KB per pagetable, and have the hassle of faulting in kernel L2 mappings on demand (like non-PAE Linux has to). This plays nicely with the TLB flush filter, and is fine for SMP guests.

The simplest thing of all in the first instance is to turn all of the linear pagetable accesses into macros taking (exec_domain, offset) and then just implement them using pagetable walks.

What do you guys think? Implement option #3 in the first instance, then aim for #2.

One completely different approach would be to first implement a PAE guest using the "translate, internal" shadow mode, where we don't have to worry about any of this gory stuff. Once it's working, we could then implement a paravirtualized mode to improve performance and save memory. Getting shadow mode working on PAE shouldn't be too hard, as it's been written with 2, 3 and 4 level pagetables in mind.

The shadow mode approach could be implemented in parallel with the paravirt approach. We could even turn it into a race to the first multiuser boot :-)

Cheers,
Ian
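The "macros implemented as pagetable walks" option can be sketched with a toy model (Python stand-in, not Xen code; the dict-based tables are illustrative only — real code would map and unmap each frame, e.g. via map_domain_mem, at every step). A PAE virtual address resolves through a 2 + 9 + 9 bit index split:

```python
# Toy model of resolving an L1 entry by explicit pagetable walk instead
# of going through the linear mapping.  Tables are plain dicts keyed by
# the per-level index extracted from the VA.
def pae_l1e(l3, va):
    """Walk a PAE-style 3-level tree: 2 + 9 + 9 index bits, 4KB pages."""
    l2 = l3[(va >> 30) & 0x3]        # L3 has only 4 entries
    l1 = l2[(va >> 21) & 0x1ff]      # 512 entries per L2
    return l1[(va >> 12) & 0x1ff]    # 512 entries per L1

# Build a one-page toy mapping: VA 0xC0001000 -> "frame 42".
VA = 0xC0001000
l1 = {(VA >> 12) & 0x1ff: "frame 42"}
l2 = {(VA >> 21) & 0x1ff: l1}
l3 = {(VA >> 30) & 0x3: l2}

print(pae_l1e(l3, VA))               # frame 42
```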
Gerd Knorr
2005-Apr-21 19:42 UTC
Re: [Xen-devel] understanding __linear_l2_table and friends
> The alternative is to hack PAE Linux to force the L2 containing kernel
> mappings to be per-pagetable rather than shared. The downside of this
> is that we use an extra 4KB per pagetable, and have the hassle of
> faulting in kernel L2 mappings on demand (like non-PAE Linux has to).
> This plays nicely with the TLB flush filter, and is fine for SMP
> guests.

I think that one is better. The topmost L2 table with the kernel mappings is a special case anyway because it also has the hypervisor hole and thus differs from the other three L2 tables when it comes to allocation and verification (and maybe other places as well). I'm considering adding a new page type for the topmost L2 in PAE mode to handle this. Comments? Better ideas?

  Gerd

--
#define printk(args...) fprintf(stderr, ## args)
Ian Pratt
2005-Apr-21 21:13 UTC
RE: [Xen-devel] understanding __linear_l2_table and friends
> > The alternative is to hack PAE Linux to force the L2 containing
> > kernel mappings to be per-pagetable rather than shared. The downside
> > of this is that we use an extra 4KB per pagetable, and have the
> > hassle of faulting in kernel L2 mappings on demand (like non-PAE
> > Linux has to). This plays nicely with the TLB flush filter, and is
> > fine for SMP guests.
>
> I think that one is better.

Good. The only hassle is the need for Linux's demand filling of L2 slots pointing to kernel L1's, but seeing as non-PAE Linux has similar code already, this shouldn't be too hard.

> The topmost L2 table with the
> kernel mappings is a special case anyway because it also has
> the hypervisor hole and thus differs from the other three L2
> tables when it comes to allocation and verification (and
> maybe other places as well).
> I'm considering adding a new page type for the topmost L2 in
> PAE mode to handle this. Comments? Better ideas?

You can just maintain the va back-ptr index for L2's as well as L1's (we may want to do this anyway to implement writeable L2 pagetables at some point). If the va back-ptr == 3, you know it's an L2 with hypervisor slots.

Part of validating an L3 will be to check that the top slot is filled in and pointing to a validated L2. When alloc_l2_table is called with a back pointer index of 3 it will install hypervisor entries in the L2. I think this is much neater.

Best,
Ian
Andi Kleen
2005-Apr-22 11:04 UTC
Re: [Xen-devel] understanding __linear_l2_table and friends
On Thu, Apr 21, 2005 at 02:51:34PM +0100, Ian Pratt wrote:

> PAE Linux currently uses a single L2 for all kernel mappings shared
> across all pagetables. Thus, when we do the mmu_ext_op hypercall to
> switch cr3 we'd need to write new values into the appropriate L2 of
> the destination pagetable before re-loading cr3 (since in reality
> there'll only really ever be one such L2 for the domain, it makes sense
> to leave an open map_domain_mem to it.)
>
> The downside of this scheme is that it will cripple the TLB flush
> filter on Opteron.

It also cripples the "adaptive cache" on Intel systems, which assumes that if two HT siblings have the same CR3 then the L1 cache can be shared. If that is false you get L1 cache thrashing in some HT workloads.

> [rest of the proposal snipped]

Since PAE is a temporary crock, I would choose the least intrusive variant to the codebase :)

-Andi
Kip Macy
2005-Apr-22 20:47 UTC
Re: [Xen-devel] understanding __linear_l2_table and friends
> Since PAE is a temporary crock, I would choose the least intrusive
> variant to the codebase :)

A temporary crock that is likely to be 80% of Xen's deployments for the next couple of years.

-Kip
Andi Kleen
2005-Apr-23 15:08 UTC
Re: [Xen-devel] understanding __linear_l2_table and friends
On Fri, Apr 22, 2005 at 01:47:34PM -0700, Kip Macy wrote:

> > Since PAE is a temporary crock, I would choose the least intrusive
> > variant to the codebase :)
>
> A temporary crock that is likely to be 80% of Xen's deployments for
> the next couple of years.

Very unlikely, since you will have a hard time buying non-x86-64-capable servers in the next couple of years. It is already pretty hard with new boxes. Even desktops are becoming more and more 64bit capable (Intel will even enable it on all Celerons a bit later this year). The only 32bit holdouts left are the very lowend boxes from AMD and Intel, laptops, and VIA. And these generally don't need any PAE since they don't support enough RAM (assuming you don't need the NX hype).

That is why the PAE effort seems so pointless to me. I estimate it will take some months at least until it is stable and released, and at that time most of the new x86 world will be x86-64 capable. The only boxes for which PAE is needed are basically some old servers, and these will be quickly replaced with new 64bit capable ones.

-Andi
Wim Coekaerts
2005-Apr-23 15:13 UTC
Re: [Xen-devel] understanding __linear_l2_table and friends
On Sat, Apr 23, 2005 at 05:08:27PM +0200, Andi Kleen wrote:

> That is why the PAE effort seems so pointless to me. I estimate it will
> take some months at least until it is stable and released, and at that
> time most of the new x86 world will be x86-64 capable.
>
> The only boxes for which PAE is needed are basically some old servers,
> and these will be quickly replaced with new 64bit capable ones.

Sorry Andi, I disagree. "Some" is incorrect: there are huge, huge numbers of servers out there, and you don't just replace them. Many potential xen users probably have 100s of relatively recent x86 servers around.

One doesn't just replace servers. Maybe at home, but not companies. If you have a server farm with 4000 systems, you don't just toss it.

I think it's worth the effort.
Andi Kleen
2005-Apr-23 15:20 UTC
Re: [Xen-devel] understanding __linear_l2_table and friends II
Thinking about this a bit more:

On Thu, Apr 21, 2005 at 02:51:34PM +0100, Ian Pratt wrote:

> The downside of this scheme is that it will cripple the TLB flush
> filter on Opteron. Linux used to do this until 2.6.11 anyhow, and
> no-one really complained much. The far bigger problem is that it won't
> work for SMP guests, at least without making the L2 per-VCPU and
> updating the L3 accordingly using mm ref counting, which would be
> messy but do-able.
>
> The alternative is to hack PAE Linux to force the L2 containing kernel
> mappings to be per-pagetable rather than shared. The downside of this
> is that we use an extra 4KB per pagetable, and have the hassle of
> faulting in kernel L2 mappings on demand (like non-PAE Linux has to).
> This plays nicely with the TLB flush filter, and is fine for SMP
> guests.

<without having looked at the Xen code much, but some familiarity with the i386 linux code>

I thought about this a bit more and your second alternative sounds much better. Faulting on the kernel mappings is very infrequent, and usually after some time the PGD is fully set up and only the lower levels of the kernel mappings change, with vmalloc etc. On x86-64 Linux I even initialize it when the PGD is created from a static template page. The remaining cases for very big vmalloc can be handled on demand without too much code. It should be pretty easy to do on i386 too.

> The simplest thing of all in the first instance is to turn all of the
> linear pagetable accesses into macros taking (exec_domain, offset) and
> then just implement them using pagetable walks.
>
> What do you guys think? Implement option #3 in the first instance, then
> aim for #2.

I don't get your numbering; didn't you have only two options? Or does the one below count too?

> One completely different approach would be to first implement a PAE
> guest using the "translate, internal" shadow mode where we don't have
> to worry about any of this gory stuff. Once it's working, we could then
> implement a paravirtualized mode to improve performance and save
> memory. Getting shadow mode working on PAE shouldn't be too hard, as
> it's been written with 2, 3 and 4 level pagetables in mind.

That sounds attractive too, except that duplicated page tables can be a killer on some workloads (databases with many processes and lots of shared memory: you end up with a lot of memory tied up in page tables even with hugetlb). And normally databases are one of the most common workloads for PAE. It might be a good idea to avoid it at least for the para case.

-Andi
Andi Kleen
2005-Apr-23 15:28 UTC
Re: [Xen-devel] understanding __linear_l2_table and friends
On Sat, Apr 23, 2005 at 08:13:08AM -0700, Wim Coekaerts wrote:

> "Some" is incorrect: there are huge, huge numbers of servers out there,
> and you don't just replace them. Many potential xen users probably have
> 100s of relatively recent x86 servers around.
>
> One doesn't just replace servers. Maybe at home, but not companies.
> If you have a server farm with 4000 systems, you don't just toss it.
>
> I think it's worth the effort.

You toss it after 3-4 years at least. Let's say 3 years. If you bought them in the last year you very likely already got them 64bit capable. Assuming it takes a year until PAE Xen is usable, they are at least two years old when PAE Xen runs on them. That gives 1 year of usable runtime. Not too much.

My impression is more that people want PAE Xen because 64bit Xen is not quite ready yet, but I would not be surprised if 64bit Xen works sooner than PAE Xen, and then the latter would be obsolete.

In general, from my experience working on PAE Linux, I can say that the complexity of handling more than 4GB RAM with less than 4GB address space is often greatly underestimated. Linux took years before the many corner cases were flushed out, and now it is somewhat fragile. Of course Xen is simpler than Linux, but in many ways it has much less infrastructure to deal with memory pressure, so I would not be surprised if some stuff were harder to handle. So the 1 year estimate for it running well might be optimistic. Making 64bit Xen run well is probably easier, even if it needs more changes and some hacks now.

-Andi
Gerd Knorr
2005-Apr-24 19:55 UTC
Re: [Xen-devel] understanding __linear_l2_table and friends
On Sat, Apr 23, 2005 at 05:28:26PM +0200, Andi Kleen wrote:

> If you bought them in the last year you very likely already got them
> 64bit capable.

That the machines are 64bit capable doesn't mean that people will actually run 64bit software on them. Note that the very good backward compatibility of x86_64 machines with 32bit software is one of the key features leading to the success of the processors (lesson learned from ia64 ;)

Not everyone will instantly switch over to 64bit software just because the processor is able to run it; there are still way too many issues with 64bit software. Linux is way ahead compared to most other operating systems, and still there are plenty of problems: OpenOffice is still 32bit, and Firefox runs much more stably in 32bit than in 64bit, to name just two prominent examples. And with non-mainstream software it is even more likely you'll run into not-yet-fixed 64bit bugs.

Nevertheless, I don't expect 80% of the installations to be PAE; that's too much. People will start using 64bit software, but I'm sure not everybody will instantly switch over to 64bit just because the hardware can do it. If only to reduce the maintenance work in a data center with both 32 and 64bit capable machines ...

> In general from my experience working on PAE Linux I can say that the
> complexity of handling more than 4GB RAM with less than 4GB address
> space is often greatly underestimated.
>
> Of course Xen is simpler than Linux, but in many ways it has much less
> infrastructure to deal with memory pressure so I would not be
> surprised if some stuff were harder to handle.

Well, after looking into Xen's mm code I'd say this is no problem for Xen. Xen basically delegates all that work to the guest operating system; it simply doesn't have to deal with memory pressure issues.

> So the 1 year estimate for it running well might be optimistic.

I'd say it is pessimistic, but let's see ... At the moment my PAE xenlinux kernel doesn't survive paging_init() yet. It seems to me that this piece of code already triggers almost everything which must be touched for PAE support in xenlinux and xen though, so I expect a dom0 multi-user boot isn't that far away once paging_init() works fine ;)

  Gerd
David Hopwood
2005-Apr-25 00:41 UTC
Re: [Xen-devel] understanding __linear_l2_table and friends
Gerd Knorr wrote:

> On Sat, Apr 23, 2005 at 05:28:26PM +0200, Andi Kleen wrote:
>
>> If you bought them in the last year you very likely already got them
>> 64bit capable.
>
> That the machines are 64bit capable doesn't mean that people will
> actually run 64bit software on them. Note that the very good backward
> compatibility of x86_64 machines with 32bit software is one of the key
> features leading to the success of the processors (lesson learned from
> ia64 ;)

What does that have to do with PAE support in Xen? x86_64 machines do not support PAE, and do not need it to run 32-bit applications. (A good decision by AMD, IMHO. The complexity of supporting PAE along with all the other mode combinations would have been ridiculous.)

--
David Hopwood <david.nospam.hopwood@blueyonder.co.uk>
Mark Williamson
2005-Apr-25 00:46 UTC
Re: [Xen-devel] understanding __linear_l2_table and friends
> What does that have to do with PAE support in Xen? x86_64 machines
> do not support PAE, and do not need it to run 32-bit applications.

OK, but if you don't use the 64-bit mode at all there's nothing to stop you booting in vanilla PAE mode. Owners of x86_64 boxes may then choose to use PAE to run a basically 32-bit system but still access all their RAM.

Cheers,
Mark
David Hopwood
2005-Apr-25 02:53 UTC
Re: [Xen-devel] understanding __linear_l2_table and friends
Mark Williamson wrote:

>> What does that have to do with PAE support in Xen? x86_64 machines
>> do not support PAE, and do not need it to run 32-bit applications.
>
> OK, but if you don't use the 64-bit mode at all there's nothing to stop
> you booting in vanilla PAE mode.

Oh, you're right. I had somehow got the impression that AMD64 boxes didn't support PAE in "legacy mode" either, but I see that I was mistaken (section 5 of volume 2 of the arch manual).

--
David Hopwood <david.nospam.hopwood@blueyonder.co.uk>