Hi,
I hope I'm right here. If not then please point me in the right
direction.
My problem in short:
I have problems using (pinning, mmu_update) physical pages, usually
those from 0x90000 to 0xB1000.
I'm writing my own little amd64 64bit toy kernel (based on Mini-OS as
starting point) for xen and I run into problems with the way the start
of day sets up the physical pages.
My kernel is mapped at 0 (due to Mini-OS being there):
_text : 0x0
_etext : 0xcaef
_edata : 0xe8c4
__bss_start : 0x10000
_end : 0x21590
nr_pages : 3072
start_info : 0x27000
pt_base : 0x2a000
nr_pt_frames : 5
machine_to_phys_mapping: 0xffff800000000000
phys_to_machine_mapping: 0x21000
The Mini-OS source says that free pages follow pt_base + nr_pt_frames
+ 3 (store, console, something pages). So far so good. So I reserve
myself 42 pages for initial data structures and remove the rest from
the initial page tables. After some initializing I move over the
unmapped pages, pin them as PIN_L1_TABLE and UNPIN_TABLE before adding
the machine addresses to my list of free pages.
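As a worked example of the layout described above, here is a minimal sketch in C that computes where the free pages would start. It assumes 4 KiB pages and takes the "+ 3" extra pages from the post rather than from the Xen headers; the names first_free_addr, addr_to_pfn and EXTRA_BOOT_PAGES are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (UINT64_C(1) << PAGE_SHIFT)

/* Pages expected right after the page tables (store, console, one more).
 * The count of 3 comes from the post, not from the Xen headers. */
#define EXTRA_BOOT_PAGES UINT64_C(3)

/* Address of the first page after the boot page tables and the extra pages. */
uint64_t first_free_addr(uint64_t pt_base, uint64_t nr_pt_frames)
{
    assert((pt_base & (PAGE_SIZE - 1)) == 0); /* pt_base must be page aligned */
    return pt_base + (nr_pt_frames + EXTRA_BOOT_PAGES) * PAGE_SIZE;
}

/* Page frame number of an address, relative to virt_start. */
uint64_t addr_to_pfn(uint64_t addr, uint64_t virt_start)
{
    assert(addr >= virt_start);
    return (addr - virt_start) >> PAGE_SHIFT;
}
```

With the values printed above (pt_base = 0x2a000, nr_pt_frames = 5) this gives 0x2a000 + 8 * 0x1000 = 0x32000, i.e. page frame 0x32 when the kernel is mapped at 0.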
Now here is an example output of this loop:
ERROR: -22 pinning failed: addr = 90000 [212000], op = 0, mfn = 212
ERROR: -22 pinning failed: addr = 91000 [0], op = 0, mfn = 0
ERROR: -22 pinning failed: addr = 92000 [b3d5000], op = 0, mfn = b3d5
ERROR: -22 pinning failed: addr = 93000 [2000], op = 0, mfn = 2
ERROR: -22 pinning failed: addr = 96000 [2000], op = 0, mfn = 2
ERROR: -22 pinning failed: addr = 97000 [1000000], op = 0, mfn = 1000
ERROR: -22 pinning failed: addr = 98000 [0], op = 0, mfn = 0
ERROR: -22 pinning failed: addr = 99000 [1000], op = 0, mfn = 1
ERROR: -22 pinning failed: addr = 9a000 [0], op = 0, mfn = 0
ERROR: -22 pinning failed: addr = 9b000 [1cb58000], op = 0, mfn = 1cb58
ERROR: -22 pinning failed: addr = 9c000 [707b000], op = 0, mfn = 707b
ERROR: -22 pinning failed: addr = 9d000 [0], op = 0, mfn = 0
ERROR: -22 pinning failed: addr = 9e000 [7753000], op = 0, mfn = 7753
ERROR: -22 pinning failed: addr = a0000 [0], op = 0, mfn = 0
ERROR: -22 pinning failed: addr = a1000 [1cf10000], op = 0, mfn = 1cf10
ERROR: -22 pinning failed: addr = a2000 [1000], op = 0, mfn = 1
ERROR: -22 pinning failed: addr = a3000 [0], op = 0, mfn = 0
ERROR: -22 pinning failed: addr = a4000 [ffffffff000], op = 0, mfn = ffffffff
ERROR: -22 pinning failed: addr = a5000 [0], op = 0, mfn = 0
ERROR: -22 pinning failed: addr = a6000 [781d000], op = 0, mfn = 781d
ERROR: -22 pinning failed: addr = a7000 [1cb58000], op = 0, mfn = 1cb58
ERROR: -22 pinning failed: addr = ac000 [0], op = 0, mfn = 0
ERROR: -22 pinning failed: addr = ad000 [0], op = 0, mfn = 0
ERROR: -22 pinning failed: addr = ae000 [0], op = 0, mfn = 0
ERROR: -22 pinning failed: addr = af000 [0], op = 0, mfn = 0
ERROR: -22 pinning failed: addr = b0000 [1000000], op = 0, mfn = 1000
ERROR: -22 pinning failed: addr = b1000 [21000000], op = 0, mfn = 21000
Each line reads: <return of mmuext_op> <phys> [mach] <contents of mmuext_op_t>,
where I loop over <phys> and [mach] is from the phys_to_machine_mapping.
The number of errors varies between runs but repeats for consecutive
runs and is (all but once that I've seen) one continuous chunk. It also
appears to be limited to the first 512 physical pages and centered around
0xA0000.
I looked at my code and can't find anything wrong. The pages that
don't fail the initial pin/unpin test I can use fine as PageTables or
as normal data pages later on.
Is there something supposed to be mapped at that address range that I
should stay out of? Or am I seeing some bug in xen that causes a
corrupted phys to machine mapping?
MfG
Goswin
Hardware info:
# xm info
host : book
release : 2.6.20.9-xen-1
version : #1 SMP Fri Apr 27 10:45:05 CEST 2007
machine : x86_64
nr_cpus : 2
nr_nodes : 1
sockets_per_node : 1
cores_per_socket : 2
threads_per_core : 1
cpu_mhz : 1995
hw_caps :
178bfbff:ebd3fbff:00000000:00000010:00002001:00000000:0000001f
total_memory : 1919
free_memory : 1024
xen_major : 3
xen_minor : 0
xen_extra : .4-1
xen_caps : xen-3.0-x86_64 hvm-3.0-x86_32 hvm-3.0-x86_32p
hvm-3.0-x86_64
xen_pagesize : 4096
platform_params : virt_start=0xffff800000000000
xen_changeset : unavailable
cc_compiler : gcc version 4.1.2 20061115 (prerelease) (Debian
4.1.1-21)
cc_compile_by : mrvn
cc_compile_domain : book.localnet
cc_compile_date : Sat Sep 22 01:27:47 CEST 2007
xend_config_format : 3
Domain Config:
#----------------------------------------------------------------------------
# Kernel image file.
kernel = "kernel.gz"
# Initial memory allocation (in megabytes) for the new domain.
# Should be at least 12 MB
memory = 12
# A name for your domain. All domains must have different names.
name = "Moose"
on_crash = 'destroy'
vfb = [ 'type=sdl' ]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
On 23/9/07 06:29, "Goswin von Brederlow" <brederlo@informatik.uni-tuebingen.de> wrote:
> Now here is an example output of this loop:
>
> ERROR: -22 pinning failed: addr = 90000 [212000], op = 0, mfn = 212
> ERROR: -22 pinning failed: addr = 91000 [0], op = 0, mfn = 0
> ERROR: -22 pinning failed: addr = 92000 [b3d5000], op = 0, mfn = b3d5
> ERROR: -22 pinning failed: addr = 93000 [2000], op = 0, mfn = 2
> ERROR: -22 pinning failed: addr = 96000 [2000], op = 0, mfn = 2
> ERROR: -22 pinning failed: addr = 97000 [1000000], op = 0, mfn = 1000
> ERROR: -22 pinning failed: addr = 98000 [0], op = 0, mfn = 0
> ERROR: -22 pinning failed: addr = 99000 [1000], op = 0, mfn = 1
> ERROR: -22 pinning failed: addr = 9a000 [0], op = 0, mfn = 0

The [phys] values look screwed. There are duplicates and many are 0! So it
looks rather like your p2m lookup logic is broken somehow.

 -- Keir
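The two symptoms named in this reply (zero entries and duplicate entries) can be checked mechanically before any page is pinned. The following is only a sketch: p2m_check and struct p2m_report are hypothetical names, and treating an MFN of 0 as suspicious is an assumption that seems reasonable for a domU but is not taken from the Xen documentation.

```c
#include <assert.h>
#include <stddef.h>

struct p2m_report {
    size_t zero_entries;      /* entries equal to 0 */
    size_t duplicate_entries; /* non-zero entries whose value appeared earlier */
};

/* Scan a phys-to-machine table for zero and duplicate MFNs.
 * Quadratic in nr_pages, which is acceptable for a few thousand pages. */
struct p2m_report p2m_check(const unsigned long *p2m, size_t nr_pages)
{
    struct p2m_report r = {0, 0};
    assert(p2m != NULL || nr_pages == 0);

    for (size_t i = 0; i < nr_pages; i++) {
        if (p2m[i] == 0) {
            r.zero_entries++;
            continue;
        }
        for (size_t j = 0; j < i; j++) {
            if (p2m[j] == p2m[i]) {
                r.duplicate_entries++;
                break;
            }
        }
    }
    return r;
}
```

Fed the nine mfn values quoted above (212, 0, b3d5, 2, 2, 1000, 0, 1, 0), it reports three zero entries and one duplicate. A healthy table should report none of either.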
Goswin von Brederlow
2007-Sep-24 18:47 UTC
Re: [Xen-devel] Confused about start of day setup
Keir Fraser <Keir.Fraser@cl.cam.ac.uk> writes:
> On 23/9/07 06:29, "Goswin von Brederlow"
> <brederlo@informatik.uni-tuebingen.de> wrote:
>
>> Now here is an example output of this loop:
>>
>> ERROR: -22 pinning failed: addr = 90000 [212000], op = 0, mfn = 212
>> ERROR: -22 pinning failed: addr = 91000 [0], op = 0, mfn = 0
>> ERROR: -22 pinning failed: addr = 92000 [b3d5000], op = 0, mfn = b3d5
>> ERROR: -22 pinning failed: addr = 93000 [2000], op = 0, mfn = 2
>> ERROR: -22 pinning failed: addr = 96000 [2000], op = 0, mfn = 2
>> ERROR: -22 pinning failed: addr = 97000 [1000000], op = 0, mfn = 1000
>> ERROR: -22 pinning failed: addr = 98000 [0], op = 0, mfn = 0
>> ERROR: -22 pinning failed: addr = 99000 [1000], op = 0, mfn = 1
>> ERROR: -22 pinning failed: addr = 9a000 [0], op = 0, mfn = 0
>
> The [phys] values look screwed. There are duplicates and many are 0! So it
> looks rather like your p2m lookup logic is broken somehow.
>
> -- Keir

I can't fathom what could be wrong with this:

    unsigned long *phys_to_machine_mapping;
    phys_to_machine_mapping = (unsigned long *)start_info.mfn_list;

    machine_address = phys_to_machine_mapping[(addr - VIRT_START) >> PAGE_SHIFT] << PAGE_SHIFT;

The code is too simple and it works for all other pages outside that
one range. No, I don't think it is this piece of code. But if there is
nothing supposed to be there then the data itself must be corrupt. Although I
can't think of anything that could be overwriting that
phys_to_machine_mapping array.

Maybe I could hack the domain creator to map the array read-only and
see if I get a segfault?

MfG
        Goswin
Goswin von Brederlow
2007-Sep-24 23:50 UTC
Re: [Xen-devel] Confused about start of day setup
Goswin von Brederlow <brederlo@informatik.uni-tuebingen.de> writes:
> Keir Fraser <Keir.Fraser@cl.cam.ac.uk> writes:
>
>> On 23/9/07 06:29, "Goswin von Brederlow"
>> <brederlo@informatik.uni-tuebingen.de> wrote:
>>
>>> Now here is an example output of this loop:
>>>
>>> ERROR: -22 pinning failed: addr = 90000 [212000], op = 0, mfn = 212
>>> ERROR: -22 pinning failed: addr = 91000 [0], op = 0, mfn = 0
>>> ERROR: -22 pinning failed: addr = 92000 [b3d5000], op = 0, mfn = b3d5
>>> ERROR: -22 pinning failed: addr = 93000 [2000], op = 0, mfn = 2
>>> ERROR: -22 pinning failed: addr = 96000 [2000], op = 0, mfn = 2
>>> ERROR: -22 pinning failed: addr = 97000 [1000000], op = 0, mfn = 1000
>>> ERROR: -22 pinning failed: addr = 98000 [0], op = 0, mfn = 0
>>> ERROR: -22 pinning failed: addr = 99000 [1000], op = 0, mfn = 1
>>> ERROR: -22 pinning failed: addr = 9a000 [0], op = 0, mfn = 0
>>
>> The [phys] values look screwed. There are duplicates and many are 0! So it
>> looks rather like your p2m lookup logic is broken somehow.
>>
>> -- Keir
>
> I can't fathom what could be wrong with this:
>
> unsigned long *phys_to_machine_mapping;
> phys_to_machine_mapping = (unsigned long *)start_info.mfn_list;
>
> machine_address = phys_to_machine_mapping[(addr - VIRT_START) >> PAGE_SHIFT] << PAGE_SHIFT;
>
> The code is too simple and it works for all other pages outside that
> one range. No, I don't think it is this piece of code. But if there is
> nothing supposed to be there then the data itself must be corrupt. Although I
> can't think of anything that could be overwriting that
> phys_to_machine_mapping array.
>
> Maybe I could hack the domain creator to map the array read-only and
> see if I get a segfault?
>
> MfG
> Goswin

It seems that my stack ends up in the middle of the
phys_to_machine_mapping for some reason. I don't know if that is
already broken in Mini-OS or if I broke something though.

MfG
        Goswin
Goswin von Brederlow
2007-Sep-25 00:52 UTC
Re: [Xen-devel] Confused about start of day setup
Goswin von Brederlow <brederlo@informatik.uni-tuebingen.de> writes:
> Goswin von Brederlow <brederlo@informatik.uni-tuebingen.de> writes:
> It seems that my stack ends up in the middle of the
> phys_to_machine_mapping for some reason. I don't know if that is
> already broken in Mini-OS or if I broke something though.
>
> MfG
> Goswin

And once more I answer myself. I thought it was the stack being wrong
because the return value of a function was stored in the
phys_to_machine_mapping. But actually it is the variable this value gets
saved in that is there.

What decides where the phys_to_machine_mapping frames end up?

As it is I get it at [0x121000 - 0x127000], which happens to be
overlapping with my bss section:

objdump -t moose.elf | grep bss | sort
000000000011d520 g     O .bss   0000000000004000 irqstack
0000000000121520 g     O .bss   0000000000000018 __cacheline_aligned
0000000000121540 g     O .bss   0000000000000020 xen_features
0000000000121560 g     O .bss   0000000000000008 HYPERVISOR_shared_info
0000000000121568 g     O .bss   0000000000000008 phys_to_machine_mapping

And those addresses correspond directly to the pages that are later
corrupt in the mapping.

MfG
        Goswin
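The correspondence between the bss symbols and the corrupt pages can be worked out by hand; the sketch below does the same arithmetic in C. It assumes 8-byte p2m entries (unsigned long on amd64), 4 KiB pages, and a table starting at 0x121000 as stated in this message. The function names ranges_overlap and clobbered_phys_addr are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12

/* Returns 1 if [a_start, a_end) and [b_start, b_end) share at least one byte. */
int ranges_overlap(uint64_t a_start, uint64_t a_end,
                   uint64_t b_start, uint64_t b_end)
{
    return a_start < b_end && b_start < a_end;
}

/* Physical address of the page whose p2m entry is stored at byte address
 * 'addr', for a table at p2m_base made of entry_size-byte entries. */
uint64_t clobbered_phys_addr(uint64_t p2m_base, uint64_t addr,
                             uint64_t entry_size)
{
    assert(entry_size > 0 && addr >= p2m_base);
    return ((addr - p2m_base) / entry_size) << PAGE_SHIFT;
}
```

With the numbers above, irqstack covers [0x11d520, 0x121520), which overlaps the table at [0x121000, 0x127000). The symbol at 0x121520 sits on entry (0x520 / 8) = 0xa4, i.e. physical page 0xa4000, and the phys_to_machine_mapping variable at 0x121568 sits on entry 0xad, i.e. page 0xad000. Both addresses appear in the list of failed pins in the first message, provided the same build produced both outputs.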
On 25/9/07 01:52, "Goswin von Brederlow" <brederlo@informatik.uni-tuebingen.de> wrote:
> And once more I answer myself. I thought it was the stack being wrong
> because the return value of a function was stored in the
> phys_to_machine_mapping. But actually it is the variable this value gets
> saved in that is there.
>
> What decides where the phys_to_machine_mapping frames end up?
>
> As it is I get it at [0x121000 - 0x127000], which happens to be
> overlapping with my bss section:

The domain builder accounts for BSS space, so it sounds like you have a
broken domU kernel image that does not specify enough BSS space in its
program header. If you 'objdump -l' your kernel's image file, does the
'MemSiz' plus the load address entirely cover your BSS?

 -- Keir
Goswin von Brederlow
2007-Sep-25 08:55 UTC
Re: [Xen-devel] Confused about start of day setup
Keir Fraser <Keir.Fraser@cl.cam.ac.uk> writes:
> On 25/9/07 01:52, "Goswin von Brederlow"
> <brederlo@informatik.uni-tuebingen.de> wrote:
>
>> And once more I answer myself. I thought it was the stack being wrong
>> because the return value of a function was stored in the
>> phys_to_machine_mapping. But actually it is the variable this value gets
>> saved in that is there.
>>
>> What decides where the phys_to_machine_mapping frames end up?
>>
>> As it is I get it at [0x121000 - 0x127000], which happens to be
>> overlapping with my bss section:
>
> The domain builder accounts for BSS space, so it sounds like you have a
> broken domU kernel image that does not specify enough BSS space in its
> program header. If you 'objdump -l' your kernel's image file, does the
> 'MemSiz' plus the load address entirely cover your BSS?
>
> -- Keir

I think you meant readelf -l, and no. MemSiz says 1200f0 and then
121000 is the next page.

But it is not something I did wrong. The Mini-OS example kernel is
already broken in this way:

mrvn@book:~/src/xen/xen-3-3.1.0/extras/mini-os% readelf -l mini-os

Elf file type is EXEC (Executable file)
Entry point 0x0
There are 2 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x00000000000000c0 0x0000000000000000 0x0000000000000000
                 0x00000000000105d0 0x0000000000023f88  RWE    20
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RWE    8

 Section to Segment mapping:
  Segment Sections...
   00     .text .rodata .data .data.rel.ro.local .data.rel .data.rel.local .got .got.plt
   01

mrvn@book:~/src/xen/xen-3-3.1.0/extras/mini-os% objdump -t mini-os | sort
...
0000000000012000 l    d  .bss   0000000000000000 .bss
...
0000000000024960 g     O .bss   0000000000000020 xen_features
0000000000024980 g     O .bss   0000000000000008 HYPERVISOR_shared_info
00000000000249a0 g     O .bss   0000000000001000 tx_buffers
00000000000259a0 g     O .bss   0000000000000008 phys_to_machine_mapping
00000000000259a8 g       *ABS*  0000000000000000 _end
0000000080000000 l       *ABS*  0000000000000000 NMI_MASK
SYMBOL TABLE:
mini-os:     file format elf64-x86-64

So the MemSiz only covers part of the bss segment. Why would ld do that?

MfG
        Goswin
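The check Keir asked for (does the load address plus MemSiz reach _end?) is a one-line comparison. Here is a small sketch of it in C, using the numbers from the readelf and objdump output above; uncovered_bytes is a hypothetical helper and not part of any Xen or binutils tool.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bytes of [vaddr, end_addr) that lie beyond the segment's
 * memory image [vaddr, vaddr + memsz). Returns 0 when fully covered. */
uint64_t uncovered_bytes(uint64_t vaddr, uint64_t memsz, uint64_t end_addr)
{
    uint64_t seg_end = vaddr + memsz;
    assert(seg_end >= vaddr); /* reject wrap-around */
    return end_addr > seg_end ? end_addr - seg_end : 0;
}
```

For the Mini-OS image shown here, VirtAddr = 0x0 and MemSiz = 0x23f88, while _end is at 0x259a8, so 0x259a8 - 0x23f88 = 0x1a20 bytes of bss fall outside the LOAD segment. Whatever the domain builder places after the segment can therefore land on top of those variables.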
On 25/9/07 09:55, "Goswin von Brederlow" <brederlo@informatik.uni-tuebingen.de> wrote:
> I think you meant readelf -l, and no. MemSiz says 1200f0 and then
> 121000 is the next page.
>
> But it is not something I did wrong. The Mini-OS example kernel is
> already broken in this way:

Some versions of ld get the MemSiz wrong when there are alignment
constraints specified in the linker script. I'm using ld 2.15 and it appears
to get this right. Which version are you using?

The alignment constraints in minios's x86/64 linker script look pointless,
as they add padding for sections that are not actually used. I can fix that.

 -- Keir
Goswin von Brederlow
2007-Sep-25 10:07 UTC
Re: [Xen-devel] Confused about start of day setup
Keir Fraser <Keir.Fraser@cl.cam.ac.uk> writes:
> On 25/9/07 09:55, "Goswin von Brederlow"
> <brederlo@informatik.uni-tuebingen.de> wrote:
>
>> I think you meant readelf -l, and no. MemSiz says 1200f0 and then
>> 121000 is the next page.
>>
>> But it is not something I did wrong. The Mini-OS example kernel is
>> already broken in this way:
>
> Some versions of ld get the MemSiz wrong when there are alignment
> constraints specified in the linker script. I'm using ld 2.15 and it appears
> to get this right. Which version are you using?
>
> The alignment constraints in minios's x86/64 linker script look pointless,
> as they add padding for sections that are not actually used. I can fix that.
>
> -- Keir

I'm using etch so:

GNU ld version 2.17 Debian GNU/Linux

If you work on Mini-OS note that 3.0.4 works but 3.1 fails for me. Do
you see the same or is that a side effect of the wrong MemSiz?

I commented out the align statements and that seems to do the
trick. Not ideal but it is ok for now.

Thanks. I would never have found that fix.

MfG
        Goswin