Shan, Haitao
2008-Nov-20 09:07 UTC
[Xen-devel] [Question] Why code differs in construct_dom0?
Hi, Keir,

Please see the code shown below. I don't understand why pfn's definitions are different depending on whether NDEBUG is defined or not. Can you tell me why?

    while ( pfn < nr_pages )
    {
        if ( (page = alloc_chunk(d, nr_pages - d->tot_pages)) == NULL )
            panic("Not enough RAM for DOM0 reservation.\n");
        while ( pfn < d->tot_pages )
        {
            mfn = page_to_mfn(page);
    #ifndef NDEBUG
    #define pfn (nr_pages - 1 - (pfn - (alloc_epfn - alloc_spfn)))
    #endif
            if ( !is_pv_32on64_domain(d) )
                ((unsigned long *)vphysmap_start)[pfn] = mfn;
            else
                ((unsigned int *)vphysmap_start)[pfn] = mfn;
            set_gpfn_from_mfn(mfn, pfn);
    #undef pfn
            page++; pfn++;
        }
    }

Best Regards
Haitao Shan
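To make the question concrete, here is a minimal standalone sketch (not the Xen source; all constants are illustrative assumptions) of what the two index definitions compute. In a debug build the macro fills the p2m slots from the top down, so the map is deliberately not an identity map:

    #include <stdio.h>

    int main(void)
    {
        unsigned long nr_pages   = 1024;   /* total dom0 pages (assumed)    */
        unsigned long alloc_spfn = 0x100;  /* initial chunk start (assumed) */
        unsigned long alloc_epfn = 0x200;  /* initial chunk end (assumed)   */

        /* As in construct_dom0, this loop starts where the initial chunk
         * left off: pfn == alloc_epfn - alloc_spfn on entry. */
        for (unsigned long pfn = alloc_epfn - alloc_spfn; pfn < nr_pages;
             pfn += 128) {
            unsigned long release_idx = pfn;   /* NDEBUG: identity index  */
            unsigned long debug_idx =          /* !NDEBUG: reversed index */
                nr_pages - 1 - (pfn - (alloc_epfn - alloc_spfn));
            printf("pfn %4lu -> release slot %4lu, debug slot %4lu\n",
                   pfn, release_idx, debug_idx);
        }
        return 0;
    }

Running it shows pfn 256 landing in slot 1023, pfn 384 in slot 895, and so on: the tail of the pseudophys map receives its pages in reverse order.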
Keir Fraser
2008-Nov-20 09:17 UTC
[Xen-devel] Re: [Question] Why code differs in construct_dom0?
By deliberately making dom0's p2m mapping discontiguous we can detect bugs where dom0 is incorrectly assuming pseudophys-contiguous memory is machine-contiguous. We had nasty bugs of this sort in dom0's block layer many years ago.

 -- Keir

On 20/11/08 09:07, "Shan, Haitao" <haitao.shan@intel.com> wrote:

> Hi, Keir,
>
> Please see the code shown below. I don't understand why pfn's definitions
> are different depending on whether NDEBUG is defined or not. Can you tell
> me why?
>
> [...]
Shan, Haitao
2008-Nov-20 09:41 UTC
[Xen-devel] RE: [Question] Why code differs in construct_dom0?
So you mean in the release build we make the mapping discontiguous to detect possible bugs, while in the debug build it is not discontiguous?

And another question, from a problem we encountered recently: a system with more than 4G of memory installed will crash when the X server shuts down. The reason is:

1> dom0 allocates memory for AGP by calling agp_allocate_memory with GFP_DMA32 set. This implies the pfns come from memory below 4G, while the mfns are likely to be from memory above 4G.

2> dom0 then calls map_pages_to_agp. Since the kernel only handles a 32-bit GART table, dom0 uses a hypercall to change its memory mappings (xen_create_contiguous_region). Xen will pick suitable memory below 4G and free the exchanged pages from the guest (these are likely to be from memory above 4G).

3> As the process goes on, more and more memory below 4G is returned to dom0 while memory above 4G is left in Xen. Finally, Xen's reservation of memory below 4G for DMA is exhausted. This creates severe problems for us.

What are your comments on this? Both increasing the reservation in Xen and using contiguous mappings are helpful in this case. Which one do you prefer?

Best Regards
Haitao Shan

Keir Fraser wrote:

> By deliberately making dom0's p2m mapping discontiguous we can detect
> bugs where dom0 is incorrectly assuming pseudophys-contiguous memory
> is machine-contiguous. We had nasty bugs of this sort in dom0's block
> layer many years ago.
>
> [...]
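To illustrate the arithmetic of step 3, here is a toy model (not Xen's allocator; the pool and demand sizes are illustrative assumptions) of how one-way exchanges drain a fixed below-4G reservation:

    #include <stdio.h>

    int main(void)
    {
        long xen_below_4g = 128 * 256;  /* 128M reservation in 4K pages (assumed) */
        long agp_demand   = 256 * 256;  /* 256M of exchange requests (assumed)    */

        while (agp_demand > 0) {
            if (xen_below_4g == 0) {
                printf("below-4G pool exhausted with %ld pages still needed\n",
                       agp_demand);
                return 1;
            }
            xen_below_4g -= 1;  /* Xen hands dom0 a below-4G page ...      */
            agp_demand   -= 1;  /* ... and the page dom0 frees in return,  */
                                /* being above 4G, never replenishes the   */
                                /* below-4G pool.                          */
        }
        printf("all exchange requests satisfied\n");
        return 0;
    }

Each iteration moves one below-4G page to dom0 and returns one above-4G page to Xen, so once 128M worth of exchanges have happened the next request fails.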
Shan, Haitao
2008-Nov-20 09:44 UTC
[Xen-devel] RE: [Question] Why code differs in construct_dom0?
I forgot to mention that in our configuration we did not specify dom0_mem. Dom0 was allocated a large portion of the total system memory.

Best Regards
Haitao Shan

Shan, Haitao wrote:

> So you mean in the release build we make the mapping discontiguous to
> detect possible bugs, while in the debug build it is not discontiguous?
>
> [...]
Keir Fraser
2008-Nov-20 09:50 UTC
[Xen-devel] Re: [Question] Why code differs in construct_dom0?
On 20/11/08 09:41, "Shan, Haitao" <haitao.shan@intel.com> wrote:

> So you mean in the release build we make the mapping discontiguous to
> detect possible bugs, while in the debug build it is not discontiguous?

It's the other way round.

> And another question, from a problem we encountered recently: a system
> with more than 4G of memory installed will crash when the X server shuts
> down.
>
> [...]
>
> What are your comments on this? Both increasing the reservation in Xen
> and using contiguous mappings are helpful in this case. Which one do you
> prefer?

I'd need more info on the problem. I will point out that 64-bit Xen only allocates memory below 4G when asked, or when there is no memory available above 4G. Actually 32-bit Xen is the same, except the first chunk of dom0 memory allocated has to be below 1GB (because of limitations of Xen's domain_build.c). So I'm not sure what more Xen can do?

 -- Keir
Shan, Haitao
2008-Nov-20 10:00 UTC
[Xen-devel] RE: [Question] Why code differs in construct_dom0?
Keir Fraser wrote:

> I'd need more info on the problem. I will point out that 64-bit Xen
> only allocates memory below 4G when asked, or when there is no memory
> available above 4G. Actually 32-bit Xen is the same, except the first
> chunk of dom0 memory allocated has to be below 1GB (because of
> limitations of Xen's domain_build.c). So I'm not sure what more Xen
> can do?

In our problem, most of the memory is allocated to dom0 because dom0_mem=xxx is not specified in grub. Dom0 actually has nearly 4G of memory. So in this case, Xen only has a little memory below 4G, which comes from the reservation pool.
Shan, Haitao
2008-Nov-20 12:52 UTC
[Xen-devel] RE: [Question] Why code differs in construct_dom0?
I think I may not have described the problem clearly. The system has 4G of memory. From the E820 table, there was nearly 3.5G of usable RAM below 4G and about 0.5G above 4G. Of all the RAM, most was allocated to dom0, leaving only what Xen itself needs, such as the xenheap and Xen's reservations.

We were using an onboard graphics card. When starting X, agpgart allocated memory from the kernel, then asked Xen to exchange these pages for contiguous pages below 4G. Each time the agpgart module did this, some pages in the kernel (which were actually above 4G in physical memory) were replaced with contiguous pages below 4G. This kind of demand was rather high, about 256M on our platform. Finally, Xen's reservation (128M) was not enough to fulfill the requirement.

Why was the reservation exhausted? Because the kernel kept asking for memory below 4G but only returned memory above 4G to Xen. Then why were agpgart's allocations in effect always from above 4G? According to the code I pasted in my first mail, when a pfn in dom0 was small, the mfn was large. The smaller the pfn, the larger the corresponding mfn. Agpgart allocated memory with GFP_DMA32 set, so the pfns allocated were likely to be small, and the mfns were then likely to be quite large (above 4G).

Either increasing the reservation (to, say, 384M) or changing the initial p2m mapping in dom0 can solve the problem, and our tests verified this judgment. We do not know which solution is better. That's why we are seeking your kind help. I am not sure if I have explained clearly enough so far. So, any questions on the problem itself, Keir?

Shan Haitao

-----Original Message-----
From: xen-devel-bounces@lists.xensource.com [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Shan, Haitao
Sent: November 20, 2008 18:00
To: 'Keir Fraser'
Cc: 'xen-devel@lists.xensource.com'
Subject: [Xen-devel] RE: [Question] Why code differs in construct_dom0?

[...]
Keir Fraser
2008-Nov-20 13:03 UTC
[Xen-devel] Re: [Question] Why code differs in construct_dom0?
On 20/11/08 12:52, "Shan, Haitao" <haitao.shan@intel.com> wrote:

> Either increasing the reservation (to, say, 384M) or changing the initial
> p2m mapping in dom0 can solve the problem, and our tests verified this
> judgment. We do not know which solution is better. That's why we are
> seeking your kind help. I am not sure if I have explained clearly enough
> so far. So, any questions on the problem itself, Keir?

I don't think there's an easy answer. Increasing the default reservation won't please everyone, since not everyone will want dom0 to be 'robbed' of 384M! It's also a bit specific to this particular situation. Relying on the p2m being roughly 1:1 is a bit gross but, if it helps, we could change the debug code to swap adjacent pairs of pages, rather than reversing the entire p2m map? Then low pseudophys addresses would still have low machine addresses. It's kind of nasty though. Perhaps really we should have the crash path in Linux print a message advising to specify dom0_mem= to Xen?

 -- Keir
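A minimal sketch of the pair-swap idea (an illustration of the suggestion above, not an actual patch): XOR-ing the low bit of the index swaps each even/odd pair, which still breaks the contiguity assumption for debugging while keeping low pseudophys addresses at low machine addresses:

    #include <stdio.h>

    int main(void)
    {
        unsigned long nr_pages = 8;  /* illustrative p2m size (assumed even) */

        for (unsigned long pfn = 0; pfn < nr_pages; pfn++) {
            /* pfn ^ 1 maps 0<->1, 2<->3, 4<->5, ...: pfn and pfn+1 never
             * map to ascending consecutive machine addresses, yet every
             * pfn stays within one page of its identity slot. */
            unsigned long slot = pfn ^ 1;
            printf("pfn %lu -> p2m slot %lu\n", pfn, slot);
        }
        return 0;
    }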
Jan Beulich
2008-Nov-20 13:12 UTC
[Xen-devel] RE: [Question] Why code differs in construct_dom0?
>>> "Shan, Haitao" <haitao.shan@intel.com> 20.11.08 13:52 >>> >I think I may not have described the problem clearly. The system has 4G >memory. From E820 table, there was near 3.5G usable ram below 4G >and about 0.5G above 4G. Of all the ram, most of the memory was >allocated to dom0, leaving only those for xen >itself such as xenheap >and xen''s reservations. >We were using an onboard graphic card. When starting X, agpgartallocated memory from kernel, then asked xen to exchange these>pages to contiguous pages below 4G. Each time agpgart module did >this job, some pages in kernel (which are actually >above 4G in >physical memory) were replaced with contiguous pages below 4G. >These kind of demands were rather high, about 256M in our platform. >Finally, xen''s reservation (128M) was not enough to fulfill this >requirement. >Why was the reservation exhausted? Because kernel kept asking for >memory below 4G but only returning to xen memory above 4G. Then >why is agpgart''s allocation always in effect from above 4G? According >to the code I pasted in my first mail, when pfn in >dom0 was small in >number, mfn was large. The smaller the pfn was, the larger the >corresponding mfn was. Apggart allocated memory with GFP_DMA32 >set, so the pfns allocated was likely to be small. Then the mfns were >likely to be actually quite large (above 4G). > >Either increasing the reservation (like 384M) or changing the initial p2m >mapping in dom0 can solve the problem, and our tests verified this >judgment. We do not know which solution is better. That''s why we are >seeking your kindly help. I am not sure if I have explained clearly >enough so far. So any questions on the problem itself, Keir?Neither of the suggested solutions seems correct to me - both would only defer the point where the problem occurs. The question really is why agpgart needs so much memory below 4G. And if that really isn''t a bug somewhere else, then requiring a sufficiently large negative value to be passed with dom0_mem= would seem to be the only option on this system (but not as a global default). Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Shan, Haitao
2008-Nov-20 13:14 UTC
[Xen-devel] RE: [Question] Why code differs in construct_dom0?
OK, got it. It seems that kind of crash message would be good. Anyhow, the first two solutions can only lower the likelihood of this kind of problem. Thanks!

Shan Haitao

-----Original Message-----
From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
Sent: November 20, 2008 21:03
To: Shan, Haitao
Cc: 'xen-devel@lists.xensource.com'
Subject: Re: [Question] Why code differs in construct_dom0?

[...]
Shan, Haitao
2008-Nov-20 13:34 UTC
RE: [Xen-devel] RE: [Question] Why code differs in construct_dom0?
I do not know why the kernel does this. But I do know that the GART table on Intel platforms supports 64-bit addresses, so in theory allocating memory above 4G should be OK. The large memory requirement should be a result of the architecture of the onboard graphics card: it has no memory of its own, so its memory requirements must be fulfilled by allocating from system memory.

It seems there is no good solution. As you said, a message asking the user to set dom0_mem may be better.

Shan Haitao

-----Original Message-----
From: Jan Beulich [mailto:jbeulich@novell.com]
Sent: November 20, 2008 21:13
To: 'Keir Fraser'; Shan, Haitao
Cc: 'xen-devel@lists.xensource.com'
Subject: [Xen-devel] RE: [Question] Why code differs in construct_dom0?

[...]