thr3ads.net - Xen devel - [PATCH v2] xen: arm: improve handling of system with non-contiguous RAM regions [Dec 2013]

If this information is useful, please help other people find it:
Share via:

Ian Campbell

2013-Dec-02 14:39 UTC

[PATCH v2] xen: arm: improve handling of system with non-contiguous RAM regions

arm32 currently only makes use of memory which is contiguous with the first
bank. On the Midway platform this means that we only use 4GB of the 8GB
available.

Change things to make use of non-contiguous memory regions with the
restriction that we require that at least half of the total span of the RAM
addresses contain RAM. The frametable is currently not sparse and so this
restriction avoids problems with allocating enormous amounts of memory for the
frametable to cover holes in the address space and exhausting the actual RAM.

50% is arguably too restrictive. 4GB of RAM requires 32MB of frametable on
arm32 and 56M on arm64, so we could probably cope with a lower ratio of actual
RAM. However half is nice and conservative.

arm64 currently uses all banks without regard for the size of the frametable,
which I have observed causing problems on models. Implement that same
restriction as arm32 there.

Long term we should look at moving to a pfn compression based scheme similar
to x86, which removes the holes from the frametable.

There were some bogus/outdated comments scattered around this code which I
have removed.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Tested-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Julien Grall <julien.grall@linaro.org>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
---
v2: Rebase over "avoid truncation in mfn to paddr conversions"

Freeze:

The benefit of this series is that we can use the full 8GB of RAM on the
midway systems, rather than being limited to just the first 4GB (less I/O
holes). I expect it to be common that server class 32-bit arm systems will
have a hole in memory (between a bank <4GB and one above).

The risk is that we regress on some other supported platform but AFAIK the
vexpress and sunxi only have a single memory bank and the arndale has
contiguous memory regions.

The ARM64 portion of this is a bug fix.
---
 xen/arch/arm/setup.c |  185 +++++++++++++++++++++++++++++++++-----------------
 1 file changed, 122 insertions(+), 63 deletions(-)

diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 0795eb9..e640d81 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -360,35 +360,80 @@ static paddr_t __init get_xen_paddr(void)
 #ifdef CONFIG_ARM_32
 static void __init setup_mm(unsigned long dtb_paddr, size_t dtb_size)
 {
-    paddr_t ram_start;
-    paddr_t ram_end;
-    paddr_t ram_size;
+    paddr_t ram_start, ram_end, ram_size;
+    paddr_t contig_start, contig_end;
     paddr_t s, e;
     unsigned long ram_pages;
     unsigned long heap_pages, xenheap_pages, domheap_pages;
     unsigned long dtb_pages;
     unsigned long boot_mfn_start, boot_mfn_end;
-    int i = 0;
+    int i;
     void *fdt;
 
-    /* TODO: Handle non-contiguous memory bank */
     if ( !early_info.mem.nr_banks )
         early_panic("No memory bank\n");
-    ram_start = early_info.mem.bank[0].start;
+
+    /*
+     * We are going to accumulate two regions here.
+     *
+     * The first is the bounds of the initial memory region which is
+     * contiguous with the first bank. For simplicity the xenheap is
+     * always allocated from this region.
+     *
+     * The second is the complete bounds of the regions containing RAM
+     * (ie. from the lowest RAM address to the highest), which
+     * includes any holes.
+     *
+     * We also track the number of actual RAM pages (i.e. not counting
+     * the holes).
+     */
     ram_size  = early_info.mem.bank[0].size;
-    ram_end = ram_start + ram_size;
+
+    contig_start = ram_start = early_info.mem.bank[0].start;
+    contig_end   = ram_end = ram_start + ram_size;
 
     for ( i = 1; i < early_info.mem.nr_banks; i++ )
     {
-        if ( ram_end != early_info.mem.bank[i].start )
+        paddr_t bank_start = early_info.mem.bank[i].start;
+        paddr_t bank_size = early_info.mem.bank[i].size;
+        paddr_t bank_end = bank_start + bank_size;
+
+        paddr_t new_ram_size = ram_size + bank_size;
+        paddr_t new_ram_start = min(ram_start,bank_start);
+        paddr_t new_ram_end = max(ram_end,bank_end);
+
+        /*
+         * If the new bank is contiguous with the initial contiguous
+         * region then incorporate it into the contiguous region.
+         *
+         * Otherwise we allow non-contigious regions so long as at
+         * least half of the total RAM region actually contains
+         * RAM. We actually fudge this slightly and require that
+         * adding the current bank does not cause us to violate this
+         * restriction.
+         *
+         * This restriction ensures that the frametable (which is not
+         * currently sparse) does not consume all available RAM.
+         */
+        if ( bank_start == contig_end )
+            contig_end = bank_end;
+        else if ( bank_end == contig_start )
+            contig_start = bank_start;
+        else if ( 2 * new_ram_size < new_ram_end - new_ram_start )
+            /* Would create memory map which is too sparse, so stop here. */
             break;
 
-        ram_size += early_info.mem.bank[i].size;
-        ram_end += early_info.mem.bank[i].size;
+        ram_size = new_ram_size;
+        ram_start = new_ram_start;
+        ram_end = new_ram_end;
     }
 
     if ( i != early_info.mem.nr_banks )
-        early_printk("WARNING: some memory banks are not used\n");
+    {
+        early_printk("WARNING: only using %d out of %d memory
banks\n",
+                     i, early_info.mem.nr_banks);
+        early_info.mem.nr_banks = i;
+    }
 
     total_pages = ram_pages = ram_size >> PAGE_SHIFT;
 
@@ -403,13 +448,14 @@ static void __init setup_mm(unsigned long dtb_paddr,
size_t dtb_size)
      * We try to allocate the largest xenheap possible within these
      * constraints.
      */
-    heap_pages = (ram_size >> PAGE_SHIFT);
+    heap_pages = ram_pages;
     xenheap_pages = (heap_pages/8 + 0x1fffUL) & ~0x1fffUL;
     xenheap_pages = max(xenheap_pages, 128UL<<(20-PAGE_SHIFT));
 
     do
     {
-        e = consider_modules(ram_start, ram_end,
+        /* xenheap is always in the initial contiguous region */
+        e = consider_modules(contig_start, contig_end,
                              pfn_to_paddr(xenheap_pages),
                              32<<20, 0);
         if ( e )
@@ -433,9 +479,6 @@ static void __init setup_mm(unsigned long dtb_paddr, size_t
dtb_size)
     /*
      * Need a single mapped page for populating bootmem_region_list
      * and enough mapped pages for copying the DTB.
-     *
-     * TODO: The DTB (and other payloads) are assumed to be towards
-     * the start of RAM.
      */
     dtb_pages = (dtb_size + PAGE_SIZE-1) >> PAGE_SHIFT;
     boot_mfn_start = xenheap_mfn_end - dtb_pages - 1;
@@ -443,47 +486,51 @@ static void __init setup_mm(unsigned long dtb_paddr,
size_t dtb_size)
 
     init_boot_pages(pfn_to_paddr(boot_mfn_start), pfn_to_paddr(boot_mfn_end));
 
-    /*
-     * Copy the DTB.
-     *
-     * TODO: handle other payloads too.
-     */
+    /* Copy the DTB. */
     fdt = mfn_to_virt(alloc_boot_pages(dtb_pages, 1));
     copy_from_paddr(fdt, dtb_paddr, dtb_size, BUFFERABLE);
     device_tree_flattened = fdt;
 
     /* Add non-xenheap memory */
-    s = ram_start;
-    while ( s < ram_end )
+    for ( i = 0; i < early_info.mem.nr_banks; i++ )
     {
-        paddr_t n = ram_end;
+        paddr_t bank_start = early_info.mem.bank[i].start;
+        paddr_t bank_end = bank_start + early_info.mem.bank[i].size;
 
-        e = next_module(s, &n);
-
-        if ( e == ~(paddr_t)0 )
+        s = bank_start;
+        while ( s < bank_end )
         {
-            e = n = ram_end;
-        }
+            paddr_t n = bank_end;
 
-        /* Module in RAM which we cannot see here, due to not handling
-         * non-contiguous memory regions yet
-         */
-        if ( e > ram_end )
-            e = ram_end;
+            e = next_module(s, &n);
 
-        /* Avoid the xenheap */
-        if ( s < pfn_to_paddr(xenheap_mfn_start+xenheap_pages)
-             && pfn_to_paddr(xenheap_mfn_start) < e )
-        {
-            e = pfn_to_paddr(xenheap_mfn_start);
-            n = pfn_to_paddr(xenheap_mfn_start+xenheap_pages);
-        }
+            if ( e == ~(paddr_t)0 )
+            {
+                e = n = ram_end;
+            }
+
+            /*
+             * Module in a RAM bank other than the one which we are
+             * not dealing with here.
+             */
+            if ( e > bank_end )
+                e = bank_end;
 
-        dt_unreserved_regions(s, e, init_boot_pages, 0);
+            /* Avoid the xenheap */
+            if ( s < pfn_to_paddr(xenheap_mfn_start+xenheap_pages)
+                 && pfn_to_paddr(xenheap_mfn_start) < e )
+            {
+                e = pfn_to_paddr(xenheap_mfn_start);
+                n = pfn_to_paddr(xenheap_mfn_start+xenheap_pages);
+            }
+
+            dt_unreserved_regions(s, e, init_boot_pages, 0);
 
-        s = n;
+            s = n;
+        }
     }
 
+    /* Frame table covers all of RAM region, including holes */
     setup_frametable_mappings(ram_start, ram_end);
     max_page = PFN_DOWN(ram_end);
 
@@ -499,8 +546,8 @@ static void __init setup_mm(unsigned long dtb_paddr, size_t
dtb_size)
 {
     paddr_t ram_start = ~0;
     paddr_t ram_end = 0;
+    paddr_t ram_size = 0;
     int bank;
-    unsigned long  xenheap_pages = 0;
     unsigned long dtb_pages;
     void *fdt;
 
@@ -508,23 +555,33 @@ static void __init setup_mm(unsigned long dtb_paddr,
size_t dtb_size)
     for ( bank = 0 ; bank < early_info.mem.nr_banks; bank++ )
     {
         paddr_t bank_start = early_info.mem.bank[bank].start;
-        paddr_t bank_size  = early_info.mem.bank[bank].size;
-        paddr_t  bank_end = bank_start + bank_size;
-        unsigned long bank_pages = bank_size >> PAGE_SHIFT;
+        paddr_t bank_size = early_info.mem.bank[bank].size;
+        paddr_t bank_end = bank_start + bank_size;
         paddr_t s, e;
 
-        total_pages += bank_pages;
-
-        if ( bank_start < ram_start )
-            ram_start = bank_start;
-        if ( bank_end > ram_end )
-            ram_end = bank_end;
+        paddr_t new_ram_size = ram_size + bank_size;
+        paddr_t new_ram_start = min(ram_start,bank_start);
+        paddr_t new_ram_end = max(ram_end,bank_end);
+
+        /*
+         * We allow non-contigious regions so long as at least half of
+         * the total RAM region actually contains RAM. We actually
+         * fudge this slightly and require that adding the current
+         * bank does not cause us to violate this restriction.
+         *
+         * This restriction ensures that the frametable (which is not
+         * currently sparse) does not consume all available RAM.
+         */
+        if ( bank > 0 && 2 * new_ram_size < new_ram_end -
new_ram_start )
+            /* Would create memory map which is too sparse, so stop here. */
+            break;
 
-        xenheap_pages += (bank_size >> PAGE_SHIFT);
+        ram_start = new_ram_start;
+        ram_end = new_ram_end;
+        ram_size = new_ram_size;
 
         setup_xenheap_mappings(bank_start>>PAGE_SHIFT,
bank_size>>PAGE_SHIFT);
 
-        /* XXX we assume that the ram regions are ordered */
         s = bank_start;
         while ( s < bank_end )
         {
@@ -547,6 +604,15 @@ static void __init setup_mm(unsigned long dtb_paddr, size_t
dtb_size)
         }
     }
 
+    if ( bank != early_info.mem.nr_banks )
+    {
+        early_printk("WARNING: only using %d out of %d memory
banks\n",
+                     bank, early_info.mem.nr_banks);
+        early_info.mem.nr_banks = bank;
+    }
+
+    total_pages += ram_size >> PAGE_SHIFT;
+
     xenheap_virt_end = XENHEAP_VIRT_START + ram_end - ram_start;
     xenheap_mfn_start = ram_start >> PAGE_SHIFT;
     xenheap_mfn_end = ram_end >> PAGE_SHIFT;
@@ -554,17 +620,10 @@ static void __init setup_mm(unsigned long dtb_paddr,
size_t dtb_size)
 
     /*
      * Need enough mapped pages for copying the DTB.
-     *
-     * TODO: The DTB (and other payloads) are assumed to be towards
-     * the start of RAM.
      */
     dtb_pages = (dtb_size + PAGE_SIZE-1) >> PAGE_SHIFT;
 
-    /*
-     * Copy the DTB.
-     *
-     * TODO: handle other payloads too.
-     */
+    /* Copy the DTB. */
     fdt = mfn_to_virt(alloc_boot_pages(dtb_pages, 1));
     copy_from_paddr(fdt, dtb_paddr, dtb_size, BUFFERABLE);
     device_tree_flattened = fdt;
-- 
1.7.10.4

George Dunlap

2013-Dec-04 17:02 UTC

head link

Re: [PATCH v2] xen: arm: improve handling of system with non-contiguous RAM regions

On Mon, Dec 2, 2013 at 2:39 PM, Ian Campbell <ian.campbell@citrix.com>
wrote:> arm32 currently only makes use of memory which is contiguous with the first
> bank. On the Midway platform this means that we only use 4GB of the 8GB
> available.
>
> Change things to make use of non-contiguous memory regions with the
> restriction that we require that at least half of the total span of the RAM
> addresses contain RAM. The frametable is currently not sparse and so this
> restriction avoids problems with allocating enormous amounts of memory for
the
> frametable to cover holes in the address space and exhausting the actual
RAM.
>
> 50% is arguably too restrictive. 4GB of RAM requires 32MB of frametable on
> arm32 and 56M on arm64, so we could probably cope with a lower ratio of
actual
> RAM. However half is nice and conservative.
>
> arm64 currently uses all banks without regard for the size of the
frametable,
> which I have observed causing problems on models. Implement that same
> restriction as arm32 there.
>
> Long term we should look at moving to a pfn compression based scheme
similar
> to x86, which removes the holes from the frametable.
>
> There were some bogus/outdated comments scattered around this code which I
> have removed.
>
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> Tested-by: Julien Grall <julien.grall@linaro.org>
> Acked-by: Julien Grall <julien.grall@linaro.org>
> Cc: George Dunlap <George.Dunlap@eu.citrix.com>
> ---
> v2: Rebase over "avoid truncation in mfn to paddr conversions"
>
> Freeze:
>
> The benefit of this series is that we can use the full 8GB of RAM on the
> midway systems, rather than being limited to just the first 4GB (less I/O
> holes). I expect it to be common that server class 32-bit arm systems will
> have a hole in memory (between a bank <4GB and one above).
>
> The risk is that we regress on some other supported platform but AFAIK the
> vexpress and sunxi only have a single memory bank and the arndale has
> contiguous memory regions.
If there were a bug in this patch, what would be the likely impact?
Would the host not boot?  Would it only be able to access a part of
its memory?  Or would it just allocate too much memory for a
frametable?

The complexity of the patch looks middle to middle-high, just from the
number of lines -- would you agree with that, or is the resulting code
actually fairly simple and straightforward?

Does the fact that vexpress, sunxi, and arndale have single bank or
contiguous memory regions mean that they will skip the complicated
sections of code?  Or will they go through the complicated code and
possibly trip over some bugs in spite of the fact that their layout is
fairly simple?

 -George

Ian Campbell

2013-Dec-04 17:29 UTC

head link

Re: [PATCH v2] xen: arm: improve handling of system with non-contiguous RAM regions

On Wed, 2013-12-04 at 17:02 +0000, George Dunlap wrote:> On Mon, Dec 2, 2013 at 2:39 PM, Ian Campbell
<ian.campbell@citrix.com> wrote:
> > arm32 currently only makes use of memory which is contiguous with the
first
> > bank. On the Midway platform this means that we only use 4GB of the
8GB
> > available.
> >
> > Change things to make use of non-contiguous memory regions with the
> > restriction that we require that at least half of the total span of
the RAM
> > addresses contain RAM. The frametable is currently not sparse and so
this
> > restriction avoids problems with allocating enormous amounts of memory
for the
> > frametable to cover holes in the address space and exhausting the
actual RAM.
> >
> > 50% is arguably too restrictive. 4GB of RAM requires 32MB of
frametable on
> > arm32 and 56M on arm64, so we could probably cope with a lower ratio
of actual
> > RAM. However half is nice and conservative.
> >
> > arm64 currently uses all banks without regard for the size of the
frametable,
> > which I have observed causing problems on models. Implement that same
> > restriction as arm32 there.
> >
> > Long term we should look at moving to a pfn compression based scheme
similar
> > to x86, which removes the holes from the frametable.
> >
> > There were some bogus/outdated comments scattered around this code
which I
> > have removed.
> >
> > Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> > Tested-by: Julien Grall <julien.grall@linaro.org>
> > Acked-by: Julien Grall <julien.grall@linaro.org>
> > Cc: George Dunlap <George.Dunlap@eu.citrix.com>
> > ---
> > v2: Rebase over "avoid truncation in mfn to paddr
conversions"
> >
> > Freeze:
> >
> > The benefit of this series is that we can use the full 8GB of RAM on
the
> > midway systems, rather than being limited to just the first 4GB (less
I/O
> > holes). I expect it to be common that server class 32-bit arm systems
will
> > have a hole in memory (between a bank <4GB and one above).
> >
> > The risk is that we regress on some other supported platform but AFAIK
the
> > vexpress and sunxi only have a single memory bank and the arndale has
> > contiguous memory regions.
> 
> If there were a bug in this patch, what would be the likely impact?
> Would the host not boot?  Would it only be able to access a part of
> its memory?
These are the two most likely outcomes of a bug. I can''t say which one
is more likely for sure.
> Or would it just allocate too much memory for a frametable?
This is unlikely since the restriction placed on the frametable size
relative to RAM size is deliberately pretty harsh to mitigate the risk
here.
> The complexity of the patch looks middle to middle-high, just from the
> number of lines -- would you agree with that, or is the resulting code
> actually fairly simple and straightforward?
There''s quite a few big comments ;-). (1/4 of the additions are
comments)

The need to reindent the arm32 "/* Add non-xenheap memory */" loop
inside a new outer loop also adds quite a bit to the overall look of
complexity, plus I''ve created lots of temporary variables to try and
make the core logic as clear as possible.

The real meat is the 1/2 dozen lines after the "If the new bank is
contiguous" comment in the arm32 case and the two lines after "We
allow
non-contiguous regions " in the arm64 case. I think those pretty
unambiguously implement what is described in the commit log / comments.
> Does the fact that vexpress, sunxi, and arndale have single bank or
> contiguous memory regions mean that they will skip the complicated
> sections of code?  Or will they go through the complicated code and
> possibly trip over some bugs in spite of the fact that their layout is
> fairly simple?
There is some difference between the current code and the "simple fall
straight through" case after this patch but I don''t think it is
major.

FWIW Julien has tested on Arndale (a multiple contiguous bank
configuration) and it is OK.

Ian.

George Dunlap

2013-Dec-04 17:55 UTC

head link

Re: [PATCH v2] xen: arm: improve handling of system with non-contiguous RAM regions

On 12/04/2013 05:29 PM, Ian Campbell wrote:> On Wed, 2013-12-04 at 17:02 +0000, George Dunlap wrote:
>> On Mon, Dec 2, 2013 at 2:39 PM, Ian Campbell
<ian.campbell@citrix.com> wrote:
>>> arm32 currently only makes use of memory which is contiguous with
the first
>>> bank. On the Midway platform this means that we only use 4GB of the
8GB
>>> available.
>>>
>>> Change things to make use of non-contiguous memory regions with the
>>> restriction that we require that at least half of the total span of
the RAM
>>> addresses contain RAM. The frametable is currently not sparse and
so this
>>> restriction avoids problems with allocating enormous amounts of
memory for the
>>> frametable to cover holes in the address space and exhausting the
actual RAM.
>>>
>>> 50% is arguably too restrictive. 4GB of RAM requires 32MB of
frametable on
>>> arm32 and 56M on arm64, so we could probably cope with a lower
ratio of actual
>>> RAM. However half is nice and conservative.
>>>
>>> arm64 currently uses all banks without regard for the size of the
frametable,
>>> which I have observed causing problems on models. Implement that
same
>>> restriction as arm32 there.
>>>
>>> Long term we should look at moving to a pfn compression based
scheme similar
>>> to x86, which removes the holes from the frametable.
>>>
>>> There were some bogus/outdated comments scattered around this code
which I
>>> have removed.
>>>
>>> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
>>> Tested-by: Julien Grall <julien.grall@linaro.org>
>>> Acked-by: Julien Grall <julien.grall@linaro.org>
>>> Cc: George Dunlap <George.Dunlap@eu.citrix.com>
>>> ---
>>> v2: Rebase over "avoid truncation in mfn to paddr
conversions"
>>>
>>> Freeze:
>>>
>>> The benefit of this series is that we can use the full 8GB of RAM
on the
>>> midway systems, rather than being limited to just the first 4GB
(less I/O
>>> holes). I expect it to be common that server class 32-bit arm
systems will
>>> have a hole in memory (between a bank <4GB and one above).
>>>
>>> The risk is that we regress on some other supported platform but
AFAIK the
>>> vexpress and sunxi only have a single memory bank and the arndale
has
>>> contiguous memory regions.
>> If there were a bug in this patch, what would be the likely impact?
>> Would the host not boot?  Would it only be able to access a part of
>> its memory?
> These are the two most likely outcomes of a bug. I can''t say which
one
> is more likely for sure.
>
>> Or would it just allocate too much memory for a frametable?
> This is unlikely since the restriction placed on the frametable size
> relative to RAM size is deliberately pretty harsh to mitigate the risk
> here.
>
>> The complexity of the patch looks middle to middle-high, just from the
>> number of lines -- would you agree with that, or is the resulting code
>> actually fairly simple and straightforward?
> There''s quite a few big comments ;-). (1/4 of the additions are
> comments)
>
> The need to reindent the arm32 "/* Add non-xenheap memory */"
loop
> inside a new outer loop also adds quite a bit to the overall look of
> complexity, plus I''ve created lots of temporary variables to try
and
> make the core logic as clear as possible.
>
> The real meat is the 1/2 dozen lines after the "If the new bank is
> contiguous" comment in the arm32 case and the two lines after "We
allow
> non-contiguous regions " in the arm64 case. I think those pretty
> unambiguously implement what is described in the commit log / comments.
It sounds like what you''re saying is that the patch is more mid-low 
complexity?
>
>> Does the fact that vexpress, sunxi, and arndale have single bank or
>> contiguous memory regions mean that they will skip the complicated
>> sections of code?  Or will they go through the complicated code and
>> possibly trip over some bugs in spite of the fact that their layout is
>> fairly simple?
> There is some difference between the current code and the "simple fall
> straight through" case after this patch but I don''t think it
is major.
>
> FWIW Julien has tested on Arndale (a multiple contiguous bank
> configuration) and it is OK.
Right.  I''m inclined to classify "Can''t access half of
the ram" as a
kind of bug.  It''s a bit less serious than "can''t
boot", but then again,
even "can''t boot" is not a bad one to have as bugs go -- you
know right
away that there''s a problem, so it should be fairly easy to find a fix.

So the benefit is fixing the "can''t access half the ram" bug,
at the
risk of introducing an easily discoverable and fixable "can''t
boot" bug
(or an equivalent "can''t access all the ram" bug); and that
risk is
fairly low, if your assessment of the complexity of the patch is accurate.

Unless someone has an objection, I think it should probably go in, then.

I guess that''s a,
"Release-ack-if-no-one-has-objected-in-a-day-or-so". :-)

  -George

Ian Campbell

2013-Dec-05 09:21 UTC

head link

Re: [PATCH v2] xen: arm: improve handling of system with non-contiguous RAM regions

On Wed, 2013-12-04 at 17:55 +0000, George Dunlap wrote:> On 12/04/2013 05:29 PM, Ian Campbell wrote:
> > On Wed, 2013-12-04 at 17:02 +0000, George Dunlap wrote:
> >> On Mon, Dec 2, 2013 at 2:39 PM, Ian Campbell
<ian.campbell@citrix.com> wrote:
> >>> arm32 currently only makes use of memory which is contiguous
with the first
> >>> bank. On the Midway platform this means that we only use 4GB
of the 8GB
> >>> available.
> >>>
> >>> Change things to make use of non-contiguous memory regions
with the
> >>> restriction that we require that at least half of the total
span of the RAM
> >>> addresses contain RAM. The frametable is currently not sparse
and so this
> >>> restriction avoids problems with allocating enormous amounts
of memory for the
> >>> frametable to cover holes in the address space and exhausting
the actual RAM.
> >>>
> >>> 50% is arguably too restrictive. 4GB of RAM requires 32MB of
frametable on
> >>> arm32 and 56M on arm64, so we could probably cope with a lower
ratio of actual
> >>> RAM. However half is nice and conservative.
> >>>
> >>> arm64 currently uses all banks without regard for the size of
the frametable,
> >>> which I have observed causing problems on models. Implement
that same
> >>> restriction as arm32 there.
> >>>
> >>> Long term we should look at moving to a pfn compression based
scheme similar
> >>> to x86, which removes the holes from the frametable.
> >>>
> >>> There were some bogus/outdated comments scattered around this
code which I
> >>> have removed.
> >>>
> >>> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> >>> Tested-by: Julien Grall <julien.grall@linaro.org>
> >>> Acked-by: Julien Grall <julien.grall@linaro.org>
> >>> Cc: George Dunlap <George.Dunlap@eu.citrix.com>
> >>> ---
> >>> v2: Rebase over "avoid truncation in mfn to paddr
conversions"
> >>>
> >>> Freeze:
> >>>
> >>> The benefit of this series is that we can use the full 8GB of
RAM on the
> >>> midway systems, rather than being limited to just the first
4GB (less I/O
> >>> holes). I expect it to be common that server class 32-bit arm
systems will
> >>> have a hole in memory (between a bank <4GB and one above).
> >>>
> >>> The risk is that we regress on some other supported platform
but AFAIK the
> >>> vexpress and sunxi only have a single memory bank and the
arndale has
> >>> contiguous memory regions.
> >> If there were a bug in this patch, what would be the likely
impact?
> >> Would the host not boot?  Would it only be able to access a part
of
> >> its memory?
> > These are the two most likely outcomes of a bug. I can''t say
which one
> > is more likely for sure.
> >
> >> Or would it just allocate too much memory for a frametable?
> > This is unlikely since the restriction placed on the frametable size
> > relative to RAM size is deliberately pretty harsh to mitigate the risk
> > here.
> >
> >> The complexity of the patch looks middle to middle-high, just from
the
> >> number of lines -- would you agree with that, or is the resulting
code
> >> actually fairly simple and straightforward?
> > There''s quite a few big comments ;-). (1/4 of the additions
are
> > comments)
> >
> > The need to reindent the arm32 "/* Add non-xenheap memory
*/" loop
> > inside a new outer loop also adds quite a bit to the overall look of
> > complexity, plus I''ve created lots of temporary variables to
try and
> > make the core logic as clear as possible.
> >
> > The real meat is the 1/2 dozen lines after the "If the new bank
is
> > contiguous" comment in the arm32 case and the two lines after
"We allow
> > non-contiguous regions " in the arm64 case. I think those pretty
> > unambiguously implement what is described in the commit log /
comments.
> 
> It sounds like what you''re saying is that the patch is more
mid-low
> complexity?
Yes.
> >> Does the fact that vexpress, sunxi, and arndale have single bank
or
> >> contiguous memory regions mean that they will skip the complicated
> >> sections of code?  Or will they go through the complicated code
and
> >> possibly trip over some bugs in spite of the fact that their
layout is
> >> fairly simple?
> > There is some difference between the current code and the "simple
fall
> > straight through" case after this patch but I don''t
think it is major.
> >
> > FWIW Julien has tested on Arndale (a multiple contiguous bank
> > configuration) and it is OK.
> 
> Right.  I''m inclined to classify "Can''t access half
of the ram" as a
> kind of bug.  It''s a bit less serious than "can''t
boot", but then again,
> even "can''t boot" is not a bad one to have as bugs go --
you know right
> away that there''s a problem, so it should be fairly easy to find a
fix.
Agreed.
> So the benefit is fixing the "can''t access half the ram"
bug, at the
> risk of introducing an easily discoverable and fixable "can''t
boot" bug
> (or an equivalent "can''t access all the ram" bug); and
that risk is
> fairly low, if your assessment of the complexity of the patch is accurate.
> 
> Unless someone has an objection, I think it should probably go in, then.
> 
> I guess that''s a,
"Release-ack-if-no-one-has-objected-in-a-day-or-so". :-)
Thanks, will-apply-tomorrow-unless-someone-objects,
Ian.

Xen devel - Dec 2013 - [PATCH v2] xen: arm: improve handling of system with non-contiguous RAM regions

[PATCH v2] xen: arm: improve handling of system with non-contiguous RAM regions

Re: [PATCH v2] xen: arm: improve handling of system with non-contiguous RAM regions

Re: [PATCH v2] xen: arm: improve handling of system with non-contiguous RAM regions

Re: [PATCH v2] xen: arm: improve handling of system with non-contiguous RAM regions

Re: [PATCH v2] xen: arm: improve handling of system with non-contiguous RAM regions