Hi,

I hope I'm right here. If not then please point me in the right direction.

My problem in short: I have problems using (pinning, mmu_update) physical pages, usually those from 0x90000 to 0xB1000.

I'm writing my own little amd64 64-bit toy kernel (based on Mini-OS as a starting point) for Xen, and I run into problems with the way the start of day sets up the physical pages. My kernel is mapped at 0 (due to Mini-OS being there):

  _text                  : 0x0
  _etext                 : 0xcaef
  _edata                 : 0xe8c4
  __bss_start            : 0x10000
  _end                   : 0x21590
  nr_pages               : 3072
  start_info             : 0x27000
  pt_base                : 0x2a000
  nr_pt_frames           : 5
  machine_to_phys_mapping: 0xffff800000000000
  phys_to_machine_mapping: 0x21000

The Mini-OS source says that free pages follow pt_base + nr_pt_frames + 3 (store, console, something pages). So far so good. So I reserve myself 42 pages for initial data structures and remove the rest from the initial page tables. After some initializing I go over the unmapped pages and pin each one as PIN_L1_TABLE and then UNPIN_TABLE it before adding the machine addresses to my list of free pages.
Now here is an example output of this loop:

  ERROR: -22 pinning failed: addr = 90000 [212000], op = 0, mfn = 212
  ERROR: -22 pinning failed: addr = 91000 [0], op = 0, mfn = 0
  ERROR: -22 pinning failed: addr = 92000 [b3d5000], op = 0, mfn = b3d5
  ERROR: -22 pinning failed: addr = 93000 [2000], op = 0, mfn = 2
  ERROR: -22 pinning failed: addr = 96000 [2000], op = 0, mfn = 2
  ERROR: -22 pinning failed: addr = 97000 [1000000], op = 0, mfn = 1000
  ERROR: -22 pinning failed: addr = 98000 [0], op = 0, mfn = 0
  ERROR: -22 pinning failed: addr = 99000 [1000], op = 0, mfn = 1
  ERROR: -22 pinning failed: addr = 9a000 [0], op = 0, mfn = 0
  ERROR: -22 pinning failed: addr = 9b000 [1cb58000], op = 0, mfn = 1cb58
  ERROR: -22 pinning failed: addr = 9c000 [707b000], op = 0, mfn = 707b
  ERROR: -22 pinning failed: addr = 9d000 [0], op = 0, mfn = 0
  ERROR: -22 pinning failed: addr = 9e000 [7753000], op = 0, mfn = 7753
  ERROR: -22 pinning failed: addr = a0000 [0], op = 0, mfn = 0
  ERROR: -22 pinning failed: addr = a1000 [1cf10000], op = 0, mfn = 1cf10
  ERROR: -22 pinning failed: addr = a2000 [1000], op = 0, mfn = 1
  ERROR: -22 pinning failed: addr = a3000 [0], op = 0, mfn = 0
  ERROR: -22 pinning failed: addr = a4000 [ffffffff000], op = 0, mfn = ffffffff
  ERROR: -22 pinning failed: addr = a5000 [0], op = 0, mfn = 0
  ERROR: -22 pinning failed: addr = a6000 [781d000], op = 0, mfn = 781d
  ERROR: -22 pinning failed: addr = a7000 [1cb58000], op = 0, mfn = 1cb58
  ERROR: -22 pinning failed: addr = ac000 [0], op = 0, mfn = 0
  ERROR: -22 pinning failed: addr = ad000 [0], op = 0, mfn = 0
  ERROR: -22 pinning failed: addr = ae000 [0], op = 0, mfn = 0
  ERROR: -22 pinning failed: addr = af000 [0], op = 0, mfn = 0
  ERROR: -22 pinning failed: addr = b0000 [1000000], op = 0, mfn = 1000
  ERROR: -22 pinning failed: addr = b1000 [21000000], op = 0, mfn = 21000

Each line shows <return of mmuext_op>, <phys>, [mach] and <contents of mmuext_op_t>, where I loop over <phys> and [mach] is from the phys_to_machine_mapping.
The number of errors varies between runs but repeats for consecutive runs and is (all but once that I've seen) one continuous chunk. It also appears to be limited to the first 512 physical pages and centered around 0xA0000.

I looked at my code and can't find anything wrong. The pages that don't fail the initial pin/unpin test I can use fine as page tables or as normal data pages later on.

Is there something supposed to be mapped at that address range that I should stay out of? Or am I seeing some bug in Xen that causes a corrupted phys to machine mapping?

MfG
Goswin

Hardware info:

# xm info
host                   : book
release                : 2.6.20.9-xen-1
version                : #1 SMP Fri Apr 27 10:45:05 CEST 2007
machine                : x86_64
nr_cpus                : 2
nr_nodes               : 1
sockets_per_node       : 1
cores_per_socket       : 2
threads_per_core       : 1
cpu_mhz                : 1995
hw_caps                : 178bfbff:ebd3fbff:00000000:00000010:00002001:00000000:0000001f
total_memory           : 1919
free_memory            : 1024
xen_major              : 3
xen_minor              : 0
xen_extra              : .4-1
xen_caps               : xen-3.0-x86_64 hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          : unavailable
cc_compiler            : gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)
cc_compile_by          : mrvn
cc_compile_domain      : book.localnet
cc_compile_date        : Sat Sep 22 01:27:47 CEST 2007
xend_config_format     : 3

Domain Config:

#----------------------------------------------------------------------------
# Kernel image file.
kernel = "kernel.gz"

# Initial memory allocation (in megabytes) for the new domain.
# Should be at least 12 MB
memory = 12

# A name for your domain. All domains must have different names.
name = "Moose"

on_crash = 'destroy'
vfb = [ 'type=sdl' ]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
On 23/9/07 06:29, "Goswin von Brederlow" <brederlo@informatik.uni-tuebingen.de> wrote:

> Now here is an example output of this loop:
>
> ERROR: -22 pinning failed: addr = 90000 [212000], op = 0, mfn = 212
> ERROR: -22 pinning failed: addr = 91000 [0], op = 0, mfn = 0
> ERROR: -22 pinning failed: addr = 92000 [b3d5000], op = 0, mfn = b3d5
> ERROR: -22 pinning failed: addr = 93000 [2000], op = 0, mfn = 2
> ERROR: -22 pinning failed: addr = 96000 [2000], op = 0, mfn = 2
> ERROR: -22 pinning failed: addr = 97000 [1000000], op = 0, mfn = 1000
> ERROR: -22 pinning failed: addr = 98000 [0], op = 0, mfn = 0
> ERROR: -22 pinning failed: addr = 99000 [1000], op = 0, mfn = 1
> ERROR: -22 pinning failed: addr = 9a000 [0], op = 0, mfn = 0

The [phys] values look screwed. There are duplicates and many are 0! So it looks rather like your p2m lookup logic is broken somehow.

 -- Keir
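The kind of check described in this reply (spotting zero and duplicated entries) can be automated. The following is only a sketch, not code from Mini-OS or Xen: the names `p2m_report` and `p2m_scan` are made up for illustration, and it assumes the p2m table is a flat array of 64-bit MFNs indexed by PFN. Treating an MFN of 0 as suspicious is also an assumption; it is a heuristic, not a rule.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical report type: counts of suspicious p2m entries. */
struct p2m_report {
    size_t zero_entries;      /* entries whose MFN is 0 */
    size_t duplicate_entries; /* non-zero entries repeating an MFN seen at a lower PFN */
};

/*
 * Scan p2m[0..nr_pages) and count zero and duplicated MFNs.
 * The comparison is O(n^2), which needs no allocator and is
 * tolerable for a toy domain with a few thousand pages.
 */
struct p2m_report p2m_scan(const uint64_t *p2m, size_t nr_pages)
{
    struct p2m_report r = { 0, 0 };

    for (size_t i = 0; i < nr_pages; i++) {
        if (p2m[i] == 0) {
            r.zero_entries++;
            continue; /* zeros are counted separately, not as duplicates */
        }
        for (size_t j = 0; j < i; j++) {
            if (p2m[j] == p2m[i]) {
                r.duplicate_entries++;
                break;
            }
        }
    }
    return r;
}
```

Run over the nine MFNs quoted above (212, 0, b3d5, 2, 2, 1000, 0, 1, 0), this would report three zero entries and one duplicate.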
Goswin von Brederlow
2007-Sep-24 18:47 UTC
Re: [Xen-devel] Confused about start of day setup
Keir Fraser <Keir.Fraser@cl.cam.ac.uk> writes:

> On 23/9/07 06:29, "Goswin von Brederlow"
> <brederlo@informatik.uni-tuebingen.de> wrote:
>
>> Now here is an example output of this loop:
>>
>> ERROR: -22 pinning failed: addr = 90000 [212000], op = 0, mfn = 212
>> ERROR: -22 pinning failed: addr = 91000 [0], op = 0, mfn = 0
>> ERROR: -22 pinning failed: addr = 92000 [b3d5000], op = 0, mfn = b3d5
>> ERROR: -22 pinning failed: addr = 93000 [2000], op = 0, mfn = 2
>> ERROR: -22 pinning failed: addr = 96000 [2000], op = 0, mfn = 2
>> ERROR: -22 pinning failed: addr = 97000 [1000000], op = 0, mfn = 1000
>> ERROR: -22 pinning failed: addr = 98000 [0], op = 0, mfn = 0
>> ERROR: -22 pinning failed: addr = 99000 [1000], op = 0, mfn = 1
>> ERROR: -22 pinning failed: addr = 9a000 [0], op = 0, mfn = 0
>
> The [phys] values look screwed. There are duplicates and many are 0! So it
> looks rather like your p2m lookup logic is broken somehow.
>
> -- Keir

I can't fathom what could be wrong with this:

  unsigned long *phys_to_machine_mapping;
  phys_to_machine_mapping = (unsigned long *)start_info.mfn_list;

  machine_address = phys_to_machine_mapping[(addr - VIRT_START) >> PAGE_SHIFT] << PAGE_SHIFT;

The code is too simple and it works for all other pages outside that one range. No, I don't think it is this piece of code. But if there is nothing supposed to be there, then the data itself must be corrupt. Although I can't think of anything that could be overwriting that phys_to_machine_mapping array.

Maybe I could hack the domain creator to map the array read-only and see if I get a segfault?

MfG
Goswin
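A software-only alternative to the read-only mapping idea is to keep a copy of the array from start of day and compare it against the live table at checkpoints. This is a minimal sketch under the assumption that there is room for such a copy; `p2m_first_changed` is a hypothetical helper, not part of Mini-OS.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Compare a live p2m table against a snapshot taken at start of day.
 * Returns the index (PFN) of the first entry that differs, or n if
 * the two tables are identical.
 */
size_t p2m_first_changed(const uint64_t *snapshot, const uint64_t *live,
                         size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (snapshot[i] != live[i])
            return i;
    }
    return n;
}
```

Calling this after each initialization step narrows down which step overwrites the table, and the returned PFN says which entry was hit first.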
Goswin von Brederlow
2007-Sep-24 23:50 UTC
Re: [Xen-devel] Confused about start of day setup
Goswin von Brederlow <brederlo@informatik.uni-tuebingen.de> writes:

> Keir Fraser <Keir.Fraser@cl.cam.ac.uk> writes:
>
>> On 23/9/07 06:29, "Goswin von Brederlow"
>> <brederlo@informatik.uni-tuebingen.de> wrote:
>>
>>> Now here is an example output of this loop:
>>>
>>> ERROR: -22 pinning failed: addr = 90000 [212000], op = 0, mfn = 212
>>> ERROR: -22 pinning failed: addr = 91000 [0], op = 0, mfn = 0
>>> ERROR: -22 pinning failed: addr = 92000 [b3d5000], op = 0, mfn = b3d5
>>> ERROR: -22 pinning failed: addr = 93000 [2000], op = 0, mfn = 2
>>> ERROR: -22 pinning failed: addr = 96000 [2000], op = 0, mfn = 2
>>> ERROR: -22 pinning failed: addr = 97000 [1000000], op = 0, mfn = 1000
>>> ERROR: -22 pinning failed: addr = 98000 [0], op = 0, mfn = 0
>>> ERROR: -22 pinning failed: addr = 99000 [1000], op = 0, mfn = 1
>>> ERROR: -22 pinning failed: addr = 9a000 [0], op = 0, mfn = 0
>>
>> The [phys] values look screwed. There are duplicates and many are 0! So it
>> looks rather like your p2m lookup logic is broken somehow.
>>
>> -- Keir
>
> I can't fathom what could be wrong with this:
>
> unsigned long *phys_to_machine_mapping;
> phys_to_machine_mapping = (unsigned long *)start_info.mfn_list;
>
> machine_address = phys_to_machine_mapping[(addr - VIRT_START) >> PAGE_SHIFT] << PAGE_SHIFT;
>
> The code is too simple and it works for all other pages outside that
> one range. No, I don't think it is this piece of code. But if there is
> nothing supposed to be there, then the data itself must be corrupt.
> Although I can't think of anything that could be overwriting that
> phys_to_machine_mapping array.
>
> Maybe I could hack the domain creator to map the array read-only and
> see if I get a segfault?
>
> MfG
> Goswin

It seems that my stack ends up in the middle of the phys_to_machine_mapping for some reason. I don't know if that is already broken in Mini-OS or if I broke something though.

MfG
Goswin
Goswin von Brederlow
2007-Sep-25 00:52 UTC
Re: [Xen-devel] Confused about start of day setup
Goswin von Brederlow <brederlo@informatik.uni-tuebingen.de> writes:

> Goswin von Brederlow <brederlo@informatik.uni-tuebingen.de> writes:
> It seems that my stack ends up in the middle of the
> phys_to_machine_mapping for some reason. I don't know if that is
> already broken in Mini-OS or if I broke something though.
>
> MfG
> Goswin

And once more I answer myself. I thought it was the stack being wrong because the return value of a function was stored in the phys_to_machine_mapping. But actually it is the variable this value gets saved in that is there.

What decides where the phys_to_machine_mapping frames end up?

As it is I get it at [0x121000 - 0x127000], which happens to be overlapping with my bss section:

objdump -t moose.elf | grep bss | sort
000000000011d520 g     O .bss   0000000000004000 irqstack
0000000000121520 g     O .bss   0000000000000018 __cacheline_aligned
0000000000121540 g     O .bss   0000000000000020 xen_features
0000000000121560 g     O .bss   0000000000000008 HYPERVISOR_shared_info
0000000000121568 g     O .bss   0000000000000008 phys_to_machine_mapping

And those addresses correspond directly to the pages that are later corrupt in the mapping.

MfG
Goswin
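The correspondence between symbol addresses and corrupt entries can be worked out with a little arithmetic. The sketch below assumes the p2m table occupies [0x121000, 0x127000) as stated in the message, and that each entry is 8 bytes (an unsigned long on x86_64); the helper names are invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define P2M_ENTRY_SIZE 8ULL /* assumed: sizeof(unsigned long) on x86_64 */

/* Do the half-open ranges [a, a + alen) and [b, b + blen) overlap? */
bool ranges_overlap(uint64_t a, uint64_t alen, uint64_t b, uint64_t blen)
{
    return a < b + blen && b < a + alen;
}

/*
 * Which PFN's p2m entry is stored at byte address addr, given that the
 * table starts at p2m_base? The caller must ensure addr >= p2m_base.
 */
uint64_t p2m_pfn_at(uint64_t p2m_base, uint64_t addr)
{
    return (addr - p2m_base) / P2M_ENTRY_SIZE;
}
```

With the objdump listing above, `p2m_pfn_at(0x121000, 0x121568)` gives PFN 0xad for the phys_to_machine_mapping variable, and the end of irqstack at 0x121520 gives PFN 0xa4. Both 0xad000 and 0xa4000 appear in the error log from the first message.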
On 25/9/07 01:52, "Goswin von Brederlow" <brederlo@informatik.uni-tuebingen.de> wrote:

> And once more I answer myself. I thought it was the stack being wrong
> because the return value of a function was stored in the
> phys_to_machine_mapping. But actually it is the variable this value gets
> saved in that is there.
>
> What decides where the phys_to_machine_mapping frames end up?
>
> As it is I get it at [0x121000 - 0x127000], which happens to be
> overlapping with my bss section:

The domain builder accounts for BSS space, so it sounds like you have a broken domU kernel image that does not specify enough BSS space in its program header. If you 'objdump -l' your kernel's image file, does the 'MemSiz' plus the load address entirely cover your BSS?

 -- Keir
Goswin von Brederlow
2007-Sep-25 08:55 UTC
Re: [Xen-devel] Confused about start of day setup
Keir Fraser <Keir.Fraser@cl.cam.ac.uk> writes:

> On 25/9/07 01:52, "Goswin von Brederlow"
> <brederlo@informatik.uni-tuebingen.de> wrote:
>
>> And once more I answer myself. I thought it was the stack being wrong
>> because the return value of a function was stored in the
>> phys_to_machine_mapping. But actually it is the variable this value gets
>> saved in that is there.
>>
>> What decides where the phys_to_machine_mapping frames end up?
>>
>> As it is I get it at [0x121000 - 0x127000], which happens to be
>> overlapping with my bss section:
>
> The domain builder accounts for BSS space, so it sounds like you have a
> broken domU kernel image that does not specify enough BSS space in its
> program header. If you 'objdump -l' your kernel's image file, does the
> 'MemSiz' plus the load address entirely cover your BSS?
>
> -- Keir

I think you meant readelf -l, and no. MemSize says 1200f0 and then 121000 is the next page.

But it is not something I did wrong. The Mini-OS example kernel is already broken in this way:

mrvn@book:~/src/xen/xen-3-3.1.0/extras/mini-os% readelf -l mini-os

Elf file type is EXEC (Executable file)
Entry point 0x0
There are 2 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x00000000000000c0 0x0000000000000000 0x0000000000000000
                 0x00000000000105d0 0x0000000000023f88  RWE    20
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RWE    8

 Section to Segment mapping:
  Segment Sections...
   00     .text .rodata .data .data.rel.ro.local .data.rel .data.rel.local .got .got.plt
   01

mrvn@book:~/src/xen/xen-3-3.1.0/extras/mini-os% objdump -t mini-os | sort
...
0000000000012000 l    d  .bss   0000000000000000 .bss
...
0000000000024960 g     O .bss   0000000000000020 xen_features
0000000000024980 g     O .bss   0000000000000008 HYPERVISOR_shared_info
00000000000249a0 g     O .bss   0000000000001000 tx_buffers
00000000000259a0 g     O .bss   0000000000000008 phys_to_machine_mapping
00000000000259a8 g       *ABS*  0000000000000000 _end
0000000080000000 l       *ABS*  0000000000000000 NMI_MASK
SYMBOL TABLE:
mini-os:     file format elf64-x86-64

So the MemSize only covers part of the bss segment. Why would ld do that?

MfG
Goswin
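The coverage check discussed here comes down to comparing the end of the LOAD segment (VirtAddr + MemSiz) with the `_end` symbol. The following sketch uses the numbers from the readelf and objdump output above; the function names are illustrative, and reading 0x1200f0 as the end of the loaded segment of the moose kernel is an assumption based on the wording of the message.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096ULL

/* Round an address up to the next page boundary. */
uint64_t page_round_up(uint64_t addr)
{
    return (addr + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
}

/* Does the LOAD segment [vaddr, vaddr + memsz) reach the _end symbol? */
bool segment_covers_end(uint64_t vaddr, uint64_t memsz, uint64_t end)
{
    return vaddr + memsz >= end;
}

/* Bytes of the image that lie beyond the segment (0 if fully covered). */
uint64_t uncovered_bytes(uint64_t vaddr, uint64_t memsz, uint64_t end)
{
    uint64_t seg_end = vaddr + memsz;
    return end > seg_end ? end - seg_end : 0;
}
```

For the Mini-OS image, `segment_covers_end(0x0, 0x23f88, 0x259a8)` is false and `uncovered_bytes` gives 0x1a20, so the tail of .bss lies outside what the program header declares. For the moose kernel, `page_round_up(0x1200f0)` is 0x121000, which matches the address where the phys_to_machine_mapping frames were placed.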
On 25/9/07 09:55, "Goswin von Brederlow" <brederlo@informatik.uni-tuebingen.de> wrote:

> I think you meant readelf -l, and no. MemSize says 1200f0 and then
> 121000 is the next page.
>
> But it is not something I did wrong. The Mini-OS example kernel is
> already broken in this way:

Some versions of ld get the MemSize wrong when there are alignment constraints specified in the linker script. I'm using ld 2.15 and it appears to get this right. Which version are you using?

The alignment constraints in minios's x86/64 linker script look pointless, as they add padding for sections that are not actually used. I can fix that.

 -- Keir
Goswin von Brederlow
2007-Sep-25 10:07 UTC
Re: [Xen-devel] Confused about start of day setup
Keir Fraser <Keir.Fraser@cl.cam.ac.uk> writes:

> On 25/9/07 09:55, "Goswin von Brederlow"
> <brederlo@informatik.uni-tuebingen.de> wrote:
>
>> I think you meant readelf -l, and no. MemSize says 1200f0 and then
>> 121000 is the next page.
>>
>> But it is not something I did wrong. The Mini-OS example kernel is
>> already broken in this way:
>
> Some versions of ld get the MemSize wrong when there are alignment
> constraints specified in the linker script. I'm using ld 2.15 and it appears
> to get this right. Which version are you using?
>
> The alignment constraints in minios's x86/64 linker script look pointless,
> as they add padding for sections that are not actually used. I can fix that.
>
> -- Keir

I'm using etch so:

GNU ld version 2.17 Debian GNU/Linux

If you work on Mini-OS note that 3.0.4 works but 3.1 fails for me. Do you see the same or is that a side effect of the wrong MemSize?

I commented out the align statements and that seems to do the trick. Not ideal but it is OK for now.

Thanks. I would have never found that fix.

MfG
Goswin