Olaf Hering
2011-Jun-20 15:26 UTC
[Xen-devel] [PATCH 0 of 5] update xenctx to dump pagetables
The following series adds an option -P to xenctx to dump the guest's pagetable.
Another option to work on all vcpus is also added.

The reason for such a change is that on some systems a live migration from a
system with a small amount of memory to a system with a huge amount of memory
(>100GB) fails. The DomU crashes on the target. To get some debug data I'm
trying to dump the pagetable, just to verify that nothing bad happened to it
during transit.

The code to walk the pagetables is/was based on xc_translate_foreign_address(),
but I still think I have some major bugs in there (the last patch). I can't
figure out why some l3/l2/l1 entries cannot be mapped.
Any help with getting that output fixed for at least a 64bit PV guest is much
appreciated.

Olaf

part of xenctx output:
...
0 l4e ff 00000000a1240067: x 0 a1240000 0 0 1 A c t u W P
0 l3e 1ff 00000000bc133067: x 0 bc133000 0 0 s 1 A c t U W P
0 l2e 64 00000000a1471067: x 0 a1471000 0 0 s 1 A c t U W P
0 l1e 46 80000000bd56f167: X 0 bd56f000 0 G p s D A c t U W P 7fffcc846000
0 l1e 47 80000000bd57c167: X 0 bd57c000 0 G p s D A c t U W P 7fffcc847000
0 l1e 48 80000000bd5af165: X 0 bd5af000 0 G p s D A c t U w P 7fffcc848000
0 l1e db 00000000a1972125: x 0 a1972000 0 G p s d A c t U w P 7fffcc8db000
0 l4e 100 000000013fff8067: x 0 13fff8000 0 0 1 A c t u W P
xc_map_foreign_range l3: Invalid argument
0 l4e 101 000000013fff0063: x 0 13fff0000 0 0 1 A c t u W P
xc_map_foreign_range l3: Invalid argument
0 l4e 102 00000000bc5b0063: x 0 bc5b0000 0 0 1 A c t u W P
0 l3e fe 00000000a7ed2067: x 0 a7ed2000 0 0 s 1 A c t U W P
0 l2e 163 00000000a154a067: x 0 a154a000 0 0 s 1 A c t U W P
0 l1e 1eb 00000000bc1a7067: x 0 bc1a7000 0 g p s D A c t U W P 813fac7eb000
0 l1e 1ec 00000000a154f067: x 0 a154f000 0 g p s D A c t U W P 813fac7ec000
...

in dmesg:
...
(XEN) mm.c:880:d0 Error getting mfn 13fff8 (pfn 5555555555555555) from L1 entry 800000013fff8625 for l1e_owner=0, pg_owner=19
(XEN) mm.c:880:d0 Error getting mfn 13fff0 (pfn 5555555555555555) from L1 entry 800000013fff0625 for l1e_owner=0, pg_owner=19
...

Olaf

--
[PATCH 1/5] xenctx: recognize also -S option for --stack-trace
[PATCH 2/5] xenctx: move all globals into struct xenctx
[PATCH 3/5] xenctx: move xc_* access out of dump_ctx
[PATCH 4/5] xenctx: add option -C to dump context for all vcpus
[PATCH 5/5] xenctx: dump pagetable

 tools/xentrace/xenctx.c | 405 ++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 329 insertions(+), 76 deletions(-)

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Olaf Hering
2011-Jun-20 15:26 UTC
[Xen-devel] [PATCH 1 of 5] xenctx: recognize also -S option for --stack-trace
# HG changeset patch
# User Olaf Hering <olaf@aepfle.de>
# Date 1308312572 -7200
# Node ID 434e772168bc2208e1b9219d690b5e4909b552a7
# Parent  eca057e4475ca455ec36f962b9179fd2c9674196
xenctx: recognize also -S option for --stack-trace

Update help text.

Signed-off-by: Olaf Hering <olaf@aepfle.de>

diff -r eca057e4475c -r 434e772168bc tools/xentrace/xenctx.c
--- a/tools/xentrace/xenctx.c	Fri Jun 17 08:08:13 2011 +0100
+++ b/tools/xentrace/xenctx.c	Fri Jun 17 14:09:32 2011 +0200
@@ -924,7 +924,7 @@ static void usage(void)
     printf(" frame pointers.\n");
     printf(" -s SYMTAB, --symbol-table=SYMTAB\n");
     printf(" read symbol table from SYMTAB.\n");
-    printf(" --stack-trace print a complete stack trace.\n");
+    printf(" -S --stack-trace print a complete stack trace.\n");
     printf(" -k, --kernel-start\n");
     printf(" set user/kernel split. (default 0xc0000000)\n");
 #ifdef __ia64__
@@ -938,7 +938,7 @@ static void usage(void)
 int main(int argc, char **argv)
 {
     int ch;
-    static const char *sopts = "fs:hak:"
+    static const char *sopts = "fs:hak:S"
 #ifdef __ia64__
         "r:"
 #endif
Olaf Hering
2011-Jun-20 15:26 UTC
[Xen-devel] [PATCH 2 of 5] xenctx: move all globals into struct xenctx
# HG changeset patch
# User Olaf Hering <olaf@aepfle.de>
# Date 1308312618 -7200
# Node ID 132480af62e4cb35d24350cf6ea1c2769f1cd1b5
# Parent  434e772168bc2208e1b9219d690b5e4909b552a7
xenctx: move all globals into struct xenctx

Move all globals used for options and libxc data into a new struct xenctx.
This is used in subsequent changes.

Signed-off-by: Olaf Hering <olaf@aepfle.de>

diff -r 434e772168bc -r 132480af62e4 tools/xentrace/xenctx.c
--- a/tools/xentrace/xenctx.c	Fri Jun 17 14:09:32 2011 +0200
+++ b/tools/xentrace/xenctx.c	Fri Jun 17 14:10:18 2011 +0200
@@ -29,11 +29,14 @@
 #include <xen/foreign/x86_64.h>
 #include <xen/hvm/save.h>
 
-xc_interface *xc_handle = 0;
-int domid = 0;
-int frame_ptrs = 0;
-int stack_trace = 0;
-int disp_all = 0;
+static struct xenctx {
+    xc_interface *xc_handle;
+    int domid;
+    int frame_ptrs;
+    int stack_trace;
+    int disp_all;
+    xc_dominfo_t dominfo;
+} xenctx;
 
 #if defined (__i386__) || defined (__x86_64__)
 typedef unsigned long long guest_word_t;
@@ -300,7 +303,7 @@ static void print_ctx_32(vcpu_guest_cont
     printf(" fs: %04x\t", regs->fs);
     printf(" gs: %04x\n", regs->gs);
 
-    if (disp_all) {
+    if (xenctx.disp_all) {
         print_special(ctx->ctrlreg, "cr", 0x1d, 4);
         print_special(ctx->debugreg, "dr", 0xcf, 4);
     }
@@ -329,7 +332,7 @@ static void print_ctx_32on64(vcpu_guest_
     printf(" fs: %04x\t", regs->fs);
     printf(" gs: %04x\n", regs->gs);
 
-    if (disp_all) {
+    if (xenctx.disp_all) {
         print_special(ctx->ctrlreg, "cr", 0x1d, 4);
         print_special(ctx->debugreg, "dr", 0xcf, 4);
     }
@@ -373,7 +376,7 @@ static void print_ctx_64(vcpu_guest_cont
     printf(" gs: %04x @ %016"PRIx64"/%016"PRIx64"\n", regs->gs,
            ctx->gs_base_kernel, ctx->gs_base_user);
 
-    if (disp_all) {
+    if (xenctx.disp_all) {
         print_special(ctx->ctrlreg, "cr", 0x1d, 8);
         print_special(ctx->debugreg, "dr", 0xcf, 8);
     }
@@ -681,7 +684,7 @@ static void *map_page(vcpu_guest_context
     static unsigned long previous_mfn = 0;
     static void *mapped = NULL;
 
-    unsigned long mfn = xc_translate_foreign_address(xc_handle, domid, vcpu, virt);
+    unsigned long mfn = xc_translate_foreign_address(xenctx.xc_handle, xenctx.domid, vcpu, virt);
     unsigned long offset = virt & ~XC_PAGE_MASK;
 
     if (mapped && mfn == previous_mfn)
@@ -692,7 +695,7 @@ static void *map_page(vcpu_guest_context
 
     previous_mfn = mfn;
 
-    mapped = xc_map_foreign_range(xc_handle, domid, XC_PAGE_SIZE, PROT_READ, mfn);
+    mapped = xc_map_foreign_range(xenctx.xc_handle, xenctx.domid, XC_PAGE_SIZE, PROT_READ, mfn);
 
     if (mapped == NULL) {
         fprintf(stderr, "failed to map page.\n");
@@ -764,21 +767,21 @@ static void print_stack(vcpu_guest_conte
     }
     printf("\n");
 
-    if(stack_trace)
+    if(xenctx.stack_trace)
         printf("Stack Trace:\n");
     else
         printf("Call Trace:\n");
-    printf("%c [<", stack_trace ? '*' : ' ');
+    printf("%c [<", xenctx.stack_trace ? '*' : ' ');
     print_stack_word(instr_pointer(ctx), width);
     printf(">] ");
 
     print_symbol(instr_pointer(ctx));
     printf(" <--\n");
-    if (frame_ptrs) {
+    if (xenctx.frame_ptrs) {
         stack = stack_pointer(ctx);
         frame = frame_pointer(ctx);
         while(frame && stack < stack_limit) {
-            if (stack_trace) {
+            if (xenctx.stack_trace) {
                 while (stack < frame) {
                     p = map_page(ctx, vcpu, stack);
                     printf("| ");
@@ -792,7 +795,7 @@ static void print_stack(vcpu_guest_conte
 
             p = map_page(ctx, vcpu, stack);
             frame = read_stack_word(p, width);
-            if (stack_trace) {
+            if (xenctx.stack_trace) {
                 printf("|-- ");
                 print_stack_word(read_stack_word(p, width), width);
                 printf("\n");
@@ -802,7 +805,7 @@ static void print_stack(vcpu_guest_conte
             if (frame) {
                 p = map_page(ctx, vcpu, stack);
                 word = read_stack_word(p, width);
-                printf("%c [<", stack_trace ? '|' : ' ');
+                printf("%c [<", xenctx.stack_trace ? '|' : ' ');
                 print_stack_word(word, width);
                 printf(">] ");
                 print_symbol(word);
@@ -821,7 +824,7 @@ static void print_stack(vcpu_guest_conte
                 printf(">] ");
                 print_symbol(word);
                 printf("\n");
-            } else if (stack_trace) {
+            } else if (xenctx.stack_trace) {
                 printf(" ");
                 print_stack_word(word, width);
                 printf("\n");
@@ -836,37 +839,36 @@ static void dump_ctx(int vcpu)
 {
     int ret;
     vcpu_guest_context_any_t ctx;
-    xc_dominfo_t dominfo;
 
-    xc_handle = xc_interface_open(0,0,0); /* for accessing control interface */
+    xenctx.xc_handle = xc_interface_open(0,0,0); /* for accessing control interface */
 
-    ret = xc_domain_getinfo(xc_handle, domid, 1, &dominfo);
+    ret = xc_domain_getinfo(xenctx.xc_handle, xenctx.domid, 1, &xenctx.dominfo);
     if (ret < 0) {
         perror("xc_domain_getinfo");
         exit(-1);
     }
 
-    ret = xc_domain_pause(xc_handle, domid);
+    ret = xc_domain_pause(xenctx.xc_handle, xenctx.domid);
     if (ret < 0) {
         perror("xc_domain_pause");
         exit(-1);
     }
 
-    ret = xc_vcpu_getcontext(xc_handle, domid, vcpu, &ctx);
+    ret = xc_vcpu_getcontext(xenctx.xc_handle, xenctx.domid, vcpu, &ctx);
     if (ret < 0) {
-        if (!dominfo.paused)
-            xc_domain_unpause(xc_handle, domid);
+        if (!xenctx.dominfo.paused)
+            xc_domain_unpause(xenctx.xc_handle, xenctx.domid);
         perror("xc_vcpu_getcontext");
         exit(-1);
     }
 
 #if defined(__i386__) || defined(__x86_64__)
     {
-        if (dominfo.hvm) {
+        if (xenctx.dominfo.hvm) {
             struct hvm_hw_cpu cpuctx;
             xen_capabilities_info_t xen_caps = "";
             if (xc_domain_hvm_getcontext_partial(
-                    xc_handle, domid, HVM_SAVE_CODE(CPU),
+                    xenctx.xc_handle, xenctx.domid, HVM_SAVE_CODE(CPU),
                     vcpu, &cpuctx, sizeof cpuctx) != 0) {
                 perror("xc_domain_hvm_getcontext_partial");
                 exit(-1);
@@ -874,7 +876,7 @@ static void dump_ctx(int vcpu)
             guest_word_size = (cpuctx.msr_efer & 0x400) ? 8 : 4;
             guest_protected_mode = (cpuctx.cr0 & CR0_PE);
             /* HVM guest context records are always host-sized */
-            if (xc_version(xc_handle, XENVER_capabilities, &xen_caps) != 0) {
+            if (xc_version(xenctx.xc_handle, XENVER_capabilities, &xen_caps) != 0) {
                 perror("xc_version");
                 exit(-1);
             }
@@ -882,9 +884,9 @@ static void dump_ctx(int vcpu)
         } else {
             struct xen_domctl domctl;
             memset(&domctl, 0, sizeof domctl);
-            domctl.domain = domid;
+            domctl.domain = xenctx.domid;
             domctl.cmd = XEN_DOMCTL_get_address_size;
-            if (xc_domctl(xc_handle, &domctl) == 0)
+            if (xc_domctl(xenctx.xc_handle, &domctl) == 0)
                 ctxt_word_size = guest_word_size = domctl.u.address_size.size / 8;
         }
     }
@@ -897,15 +899,15 @@ static void dump_ctx(int vcpu)
         print_stack(&ctx, vcpu, guest_word_size);
 #endif
 
-    if (!dominfo.paused) {
-        ret = xc_domain_unpause(xc_handle, domid);
+    if (!xenctx.dominfo.paused) {
+        ret = xc_domain_unpause(xenctx.xc_handle, xenctx.domid);
         if (ret < 0) {
             perror("xc_domain_unpause");
             exit(-1);
         }
     }
 
-    ret = xc_interface_close(xc_handle);
+    ret = xc_interface_close(xenctx.xc_handle);
     if (ret < 0) {
         perror("xc_interface_close");
         exit(-1);
@@ -962,13 +964,13 @@ int main(int argc, char **argv)
     while ((ch = getopt_long(argc, argv, sopts, lopts, NULL)) != -1) {
         switch(ch) {
         case 'f':
-            frame_ptrs = 1;
+            xenctx.frame_ptrs = 1;
             break;
         case 's':
             symbol_table = optarg;
             break;
         case 'S':
-            stack_trace = 1;
+            xenctx.stack_trace = 1;
             break;
 #ifdef __ia64__
         case 'r':
@@ -1004,7 +1006,7 @@ int main(int argc, char **argv)
             break;
 #else
         case 'a':
-            disp_all = 1;
+            xenctx.disp_all = 1;
             break;
 #endif
         case 'k':
@@ -1026,8 +1028,8 @@ int main(int argc, char **argv)
         exit(-1);
     }
 
-    domid = atoi(argv[0]);
-    if (domid==0) {
+    xenctx.domid = atoi(argv[0]);
+    if (xenctx.domid==0) {
         fprintf(stderr, "cannot trace dom0\n");
         exit(-1);
     }
Olaf Hering
2011-Jun-20 15:26 UTC
[Xen-devel] [PATCH 3 of 5] xenctx: move xc_* access out of dump_ctx
# HG changeset patch
# User Olaf Hering <olaf@aepfle.de>
# Date 1308312712 -7200
# Node ID 0e0c9fbedc2e29e2d507e2e1dce9536d06af442d
# Parent  132480af62e4cb35d24350cf6ea1c2769f1cd1b5
xenctx: move xc_* access out of dump_ctx

Move xc_* access out of dump_ctx.
Update code paths to return an error instead of calling exit().
On error unpause the guest if it was paused by xenctx.

Signed-off-by: Olaf Hering <olaf@aepfle.de>

diff -r 132480af62e4 -r 0e0c9fbedc2e tools/xentrace/xenctx.c
--- a/tools/xentrace/xenctx.c	Fri Jun 17 14:10:18 2011 +0200
+++ b/tools/xentrace/xenctx.c	Fri Jun 17 14:11:52 2011 +0200
@@ -35,6 +35,7 @@ static struct xenctx {
     int frame_ptrs;
     int stack_trace;
     int disp_all;
+    int self_paused;
     xc_dominfo_t dominfo;
 } xenctx;
 
@@ -699,7 +700,7 @@ static void *map_page(vcpu_guest_context
 
     if (mapped == NULL) {
         fprintf(stderr, "failed to map page.\n");
-        exit(-1);
+        return NULL;
     }
 
  out:
@@ -722,7 +723,7 @@ static void print_stack_word(guest_word_
         printf(FMT_64B_WORD, word);
 }
 
-static void print_code(vcpu_guest_context_any_t *ctx, int vcpu)
+static int print_code(vcpu_guest_context_any_t *ctx, int vcpu)
 {
     guest_word_t instr;
     int i;
@@ -732,6 +733,8 @@ static void print_code(vcpu_guest_contex
     instr -= 21;
     for(i=0; i<32; i++) {
         unsigned char *c = map_page(ctx, vcpu, instr+i);
+        if (!c)
+            return -1;
         if (instr+i == instr_pointer(ctx))
             printf("<%02x> ", *c);
         else
@@ -740,9 +743,10 @@ static void print_code(vcpu_guest_contex
     printf("\n");
 
     printf("\n");
+    return 0;
 }
 
-static void print_stack(vcpu_guest_context_any_t *ctx, int vcpu, int width)
+static int print_stack(vcpu_guest_context_any_t *ctx, int vcpu, int width)
 {
     guest_word_t stack = stack_pointer(ctx);
     guest_word_t stack_limit;
@@ -758,6 +762,8 @@ static void print_stack(vcpu_guest_conte
     for (i=1; i<5 && stack < stack_limit; i++) {
         while(stack < stack_limit && stack < stack_pointer(ctx) + i*32) {
             p = map_page(ctx, vcpu, stack);
+            if (!p)
+                return -1;
             word = read_stack_word(p, width);
             printf(" ");
             print_stack_word(word, width);
@@ -784,6 +790,8 @@ static void print_stack(vcpu_guest_conte
             if (xenctx.stack_trace) {
                 while (stack < frame) {
                     p = map_page(ctx, vcpu, stack);
+                    if (!p)
+                        return -1;
                     printf("| ");
                     print_stack_word(read_stack_word(p, width), width);
                     printf(" \n");
@@ -794,6 +802,8 @@ static void print_stack(vcpu_guest_conte
             }
 
             p = map_page(ctx, vcpu, stack);
+            if (!p)
+                return -1;
             frame = read_stack_word(p, width);
             if (xenctx.stack_trace) {
                 printf("|-- ");
@@ -804,6 +814,8 @@ static void print_stack(vcpu_guest_conte
 
             if (frame) {
                 p = map_page(ctx, vcpu, stack);
+                if (!p)
+                    return -1;
                 word = read_stack_word(p, width);
                 printf("%c [<", xenctx.stack_trace ? '|' : ' ');
                 print_stack_word(word, width);
@@ -817,6 +829,8 @@ static void print_stack(vcpu_guest_conte
         stack = stack_pointer(ctx);
         while(stack < stack_limit) {
             p = map_page(ctx, vcpu, stack);
+            if (!p)
+                return -1;
             word = read_stack_word(p, width);
             if (is_kernel_text(word)) {
                 printf(" [<");
@@ -832,34 +846,17 @@ static void print_stack(vcpu_guest_conte
             stack += width;
         }
     }
+    return 0;
 }
 #endif
 
 static void dump_ctx(int vcpu)
 {
-    int ret;
     vcpu_guest_context_any_t ctx;
 
-    xenctx.xc_handle = xc_interface_open(0,0,0); /* for accessing control interface */
-
-    ret = xc_domain_getinfo(xenctx.xc_handle, xenctx.domid, 1, &xenctx.dominfo);
-    if (ret < 0) {
-        perror("xc_domain_getinfo");
-        exit(-1);
-    }
-
-    ret = xc_domain_pause(xenctx.xc_handle, xenctx.domid);
-    if (ret < 0) {
-        perror("xc_domain_pause");
-        exit(-1);
-    }
-
-    ret = xc_vcpu_getcontext(xenctx.xc_handle, xenctx.domid, vcpu, &ctx);
-    if (ret < 0) {
-        if (!xenctx.dominfo.paused)
-            xc_domain_unpause(xenctx.xc_handle, xenctx.domid);
+    if (xc_vcpu_getcontext(xenctx.xc_handle, xenctx.domid, vcpu, &ctx) < 0) {
         perror("xc_vcpu_getcontext");
-        exit(-1);
+        return;
     }
 
 #if defined(__i386__) || defined(__x86_64__)
@@ -871,14 +868,14 @@ static void dump_ctx(int vcpu)
                     xenctx.xc_handle, xenctx.domid, HVM_SAVE_CODE(CPU),
                     vcpu, &cpuctx, sizeof cpuctx) != 0) {
                 perror("xc_domain_hvm_getcontext_partial");
-                exit(-1);
+                return;
             }
             guest_word_size = (cpuctx.msr_efer & 0x400) ? 8 : 4;
             guest_protected_mode = (cpuctx.cr0 & CR0_PE);
             /* HVM guest context records are always host-sized */
             if (xc_version(xenctx.xc_handle, XENVER_capabilities, &xen_caps) != 0) {
                 perror("xc_version");
-                exit(-1);
+                return;
             }
             ctxt_word_size = (strstr(xen_caps, "xen-3.0-x86_64")) ? 8 : 4;
         } else {
@@ -894,24 +891,12 @@ static void dump_ctx(int vcpu)
 
     print_ctx(&ctx);
 #ifndef NO_TRANSLATION
-    print_code(&ctx, vcpu);
+    if (print_code(&ctx, vcpu))
+        return;
     if (is_kernel_text(instr_pointer(&ctx)))
-        print_stack(&ctx, vcpu, guest_word_size);
+        if (print_stack(&ctx, vcpu, guest_word_size))
+            return;
 #endif
-
-    if (!xenctx.dominfo.paused) {
-        ret = xc_domain_unpause(xenctx.xc_handle, xenctx.domid);
-        if (ret < 0) {
-            perror("xc_domain_unpause");
-            exit(-1);
-        }
-    }
-
-    ret = xc_interface_close(xenctx.xc_handle);
-    if (ret < 0) {
-        perror("xc_interface_close");
-        exit(-1);
-    }
 }
 
 static void usage(void)
@@ -940,6 +925,7 @@ static void usage(void)
 int main(int argc, char **argv)
 {
     int ch;
+    int ret;
     static const char *sopts = "fs:hak:S"
 #ifdef __ia64__
         "r:"
@@ -1040,8 +1026,43 @@ int main(int argc, char **argv)
     if (symbol_table)
         read_symbol_table(symbol_table);
 
+    xenctx.xc_handle = xc_interface_open(0,0,0); /* for accessing control interface */
+    if (xenctx.xc_handle < 0) {
+        perror("xc_interface_open");
+        exit(-1);
+    }
+
+    ret = xc_domain_getinfo(xenctx.xc_handle, xenctx.domid, 1, &xenctx.dominfo);
+    if (ret < 0) {
+        perror("xc_domain_getinfo");
+        exit(-1);
+    }
+
+    if (!xenctx.dominfo.paused) {
+        ret = xc_domain_pause(xenctx.xc_handle, xenctx.domid);
+        if (ret < 0) {
+            perror("xc_domain_pause");
+            exit(-1);
+        }
+        xenctx.self_paused = 1;
+    }
+
     dump_ctx(vcpu);
 
+    if (xenctx.self_paused) {
+        ret = xc_domain_unpause(xenctx.xc_handle, xenctx.domid);
+        if (ret < 0) {
+            perror("xc_domain_unpause");
+            exit(-1);
+        }
+    }
+
+    ret = xc_interface_close(xenctx.xc_handle);
+    if (ret < 0) {
+        perror("xc_interface_close");
+        exit(-1);
+    }
+
     return 0;
 }
Olaf Hering
2011-Jun-20 15:27 UTC
[Xen-devel] [PATCH 4 of 5] xenctx: add option -C to dump context for all vcpus
# HG changeset patch
# User Olaf Hering <olaf@aepfle.de>
# Date 1308313376 -7200
# Node ID ddf16ea954876d8af371f9751fc06eb4c9e78b36
# Parent  0e0c9fbedc2e29e2d507e2e1dce9536d06af442d
xenctx: add option -C to dump context for all vcpus

Signed-off-by: Olaf Hering <olaf@aepfle.de>

diff -r 0e0c9fbedc2e -r ddf16ea95487 tools/xentrace/xenctx.c
--- a/tools/xentrace/xenctx.c	Fri Jun 17 14:11:52 2011 +0200
+++ b/tools/xentrace/xenctx.c	Fri Jun 17 14:22:56 2011 +0200
@@ -35,6 +35,7 @@ static struct xenctx {
     int frame_ptrs;
     int stack_trace;
     int disp_all;
+    int all_vcpus;
     int self_paused;
     xc_dominfo_t dominfo;
 } xenctx;
@@ -899,6 +900,19 @@ static void dump_ctx(int vcpu)
 #endif
 }
 
+static void dump_all_vcpus(void)
+{
+    xc_vcpuinfo_t vinfo;
+    int vcpu;
+    for (vcpu = 0; vcpu <= xenctx.dominfo.max_vcpu_id; vcpu++)
+    {
+        if ( xc_vcpu_getinfo(xenctx.xc_handle, xenctx.domid, vcpu, &vinfo) )
+            continue;
+        if ( vinfo.online )
+            dump_ctx(vcpu);
+    }
+}
+
 static void usage(void)
 {
     printf("usage:\n\n");
@@ -920,13 +934,14 @@ static void usage(void)
 #else
     printf(" -a --all display more registers\n");
 #endif
+    printf(" -C --all-vcpus print info for all vcpus\n");
 }
 
 int main(int argc, char **argv)
 {
     int ch;
     int ret;
-    static const char *sopts = "fs:hak:S"
+    static const char *sopts = "fs:hak:SC"
 #ifdef __ia64__
         "r:"
 #endif
@@ -940,6 +955,7 @@ int main(int argc, char **argv)
         {"regs", 1, NULL, 'r'},
 #endif
         {"all", 0, NULL, 'a'},
+        {"all-vcpus", 0, NULL, 'C'},
         {"help", 0, NULL, 'h'},
         {0, 0, 0, 0}
     };
@@ -995,6 +1011,9 @@ int main(int argc, char **argv)
             xenctx.disp_all = 1;
             break;
 #endif
+        case 'C':
+            xenctx.all_vcpus = 1;
+            break;
         case 'k':
             kernel_start = strtoull(optarg, NULL, 0);
             break;
@@ -1047,7 +1066,10 @@ int main(int argc, char **argv)
         xenctx.self_paused = 1;
     }
 
-    dump_ctx(vcpu);
+    if (xenctx.all_vcpus)
+        dump_all_vcpus();
+    else
+        dump_ctx(vcpu);
 
     if (xenctx.self_paused) {
         ret = xc_domain_unpause(xenctx.xc_handle, xenctx.domid);
# HG changeset patch
# User Olaf Hering <olaf@aepfle.de>
# Date 1308582353 -7200
# Node ID 0443de5faea9b904b92219b3b804dec147697366
# Parent  ddf16ea954876d8af371f9751fc06eb4c9e78b36
xenctx: dump pagetable

This change is buggy...

diff -r ddf16ea95487 -r 0443de5faea9 tools/xentrace/xenctx.c
--- a/tools/xentrace/xenctx.c	Fri Jun 17 14:22:56 2011 +0200
+++ b/tools/xentrace/xenctx.c	Mon Jun 20 17:05:53 2011 +0200
@@ -25,21 +25,34 @@
 #include <getopt.h>
 
 #include "xenctrl.h"
+#include "xg_private.h"
+#include "xc_private.h"
 #include <xen/foreign/x86_32.h>
 #include <xen/foreign/x86_64.h>
 #include <xen/hvm/save.h>
 
+#define CR4_PAE 0x20
+#define EFER_LMA 0x400
+
 static struct xenctx {
     xc_interface *xc_handle;
     int domid;
     int frame_ptrs;
     int stack_trace;
+    int dump_pagetable;
     int disp_all;
     int all_vcpus;
     int self_paused;
     xc_dominfo_t dominfo;
 } xenctx;
 
+struct cpuctx {
+    vcpu_guest_context_any_t any;
+#if defined(__i386__) || defined(__x86_64__)
+    struct hvm_hw_cpu hvm;
+#endif
+} cpuctx;
+
 #if defined (__i386__) || defined (__x86_64__)
 typedef unsigned long long guest_word_t;
 #define FMT_32B_WORD "%08llx"
@@ -436,6 +449,184 @@ static guest_word_t frame_pointer(vcpu_g
         return ctx->x64.user_regs.rbp;
 }
 
+static void walk_l1(int vcpu, uint64_t l1, uint64_t virt)
+{
+    uint64_t *map;
+    unsigned long long e, l1e, phys_mask = ~(-1ull << 52) & (-1ull << 12);
+    char buf[123];
+    fflush(stdout);
+    map = xc_map_foreign_range(xenctx.xc_handle, xenctx.domid, PAGE_SIZE, PROT_READ, l1 >> PAGE_SHIFT);
+    if (!map) {
+        perror("xc_map_foreign_range l1");
+        return;
+    }
+    for (e = 0; e < L1_PAGETABLE_ENTRIES_X86_64; e++) {
+        l1e = map[e];
+        if (!l1e)
+            continue;
+        buf[0] = '\0';
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l1e & ( 1ull << 63) ? 'X' : 'x'); /* NX */
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %3llx", (l1e >> 52) & ~( 1ull << 11)); /* Available */
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %12llx", l1e & phys_mask); /* phys base adress */
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %llx", (l1e >> 9) & 0x7); /* Available */
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l1e & _PAGE_GLOBAL ? 'G' : 'g');
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l1e & _PAGE_PAT ? 'P' : 'p');
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l1e & _PAGE_PSE ? 'S' : 's');
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l1e & _PAGE_DIRTY ? 'D' : 'd');
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l1e & _PAGE_ACCESSED ? 'A' : 'a');
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l1e & _PAGE_PCD ? 'C' : 'c');
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l1e & _PAGE_PWT ? 'T' : 't');
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l1e & _PAGE_USER ? 'U' : 'u');
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l1e & _PAGE_RW ? 'W' : 'w');
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l1e & _PAGE_PRESENT ? 'P' : 'p');
+        printf("%3d l1e %3llx %016llx: %s %16llx\n", vcpu, e, l1e, buf, (unsigned long long)virt | (e << L1_PAGETABLE_SHIFT_X86_64));
+    }
+    munmap(map, PAGE_SIZE);
+}
+
+static void walk_l2(int vcpu, uint64_t l2, uint64_t virt)
+{
+    uint64_t *map;
+    unsigned long long e, l2e, l1_mask = ~(-1ull << 52) & (-1ull << 12);
+    char buf[123];
+    fflush(stdout);
+    map = xc_map_foreign_range(xenctx.xc_handle, xenctx.domid, PAGE_SIZE, PROT_READ, l2 >> PAGE_SHIFT);
+    if (!map) {
+        perror("xc_map_foreign_range l2");
+        return;
+    }
+    for (e = 0; e < L2_PAGETABLE_ENTRIES_X86_64; e++) {
+        l2e = map[e];
+        if (!l2e)
+            continue;
+        buf[0] = '\0';
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l2e & ( 1ull << 63) ? 'X' : 'x'); /* NX */
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %3llx", (l2e >> 52) & ~( 1ull << 11)); /* Available */
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %12llx", l2e & l1_mask); /* l1 base adress */
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %llx", (l2e >> 9) & 0x7); /* Available */
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %llx", (l2e >> 8) & 0x1); /* Ignored */
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", (l2e >> 7) & 0x1 ? 'S' : 's'); /* SP */
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %llx", (l2e >> 6) & 0x1); /* Ignored */
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l2e & _PAGE_ACCESSED ? 'A' : 'a');
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l2e & _PAGE_PCD ? 'C' : 'c');
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l2e & _PAGE_PWT ? 'T' : 't');
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l2e & _PAGE_USER ? 'U' : 'u');
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l2e & _PAGE_RW ? 'W' : 'w');
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l2e & _PAGE_PRESENT ? 'P' : 'p');
+        printf("%3d l2e %3llx %016llx: %s\n", vcpu, e, l2e, buf);
+        if (l2e & _PAGE_PRESENT)
+            walk_l1(vcpu, l2e & l1_mask, virt | (e << L2_PAGETABLE_SHIFT_X86_64));
+    }
+    munmap(map, PAGE_SIZE);
+}
+
+static void walk_l3(int vcpu, uint64_t l3, uint64_t virt)
+{
+    uint64_t *map;
+    unsigned long long e, l3e, l2_mask = ~(-1ull << 52) & (-1ull << 12);
+    char buf[123];
+    fflush(stdout);
+    map = xc_map_foreign_range(xenctx.xc_handle, xenctx.domid, PAGE_SIZE, PROT_READ, l3 >> PAGE_SHIFT);
+    if (!map) {
+        perror("xc_map_foreign_range l3");
+        return;
+    }
+    for (e = 0; e < L3_PAGETABLE_ENTRIES_X86_64; e++) {
+        l3e = map[e];
+        if (!l3e)
+            continue;
+        buf[0] = '\0';
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l3e & ( 1ull << 63) ? 'X' : 'x'); /* NX */
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %3llx", (l3e >> 52) & ~( 1ull << 11)); /* Available */
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %12llx", l3e & l2_mask); /* l2 base adress */
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %llx", (l3e >> 9) & 0x7); /* Available */
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %llx", (l3e >> 8) & 0x1); /* MBZ */
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", (l3e >> 7) & 0x1 ? 'S' : 's'); /* SP */
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %llx", (l3e >> 6) & 0x1); /* Ignored */
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l3e & _PAGE_ACCESSED ? 'A' : 'a');
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l3e & _PAGE_PCD ? 'C' : 'c');
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l3e & _PAGE_PWT ? 'T' : 't');
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l3e & _PAGE_USER ? 'U' : 'u');
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l3e & _PAGE_RW ? 'W' : 'w');
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l3e & _PAGE_PRESENT ? 'P' : 'p');
+        printf("%3d l3e %3llx %016llx: %s\n", vcpu, e, l3e, buf);
+        if (l3e & _PAGE_PRESENT)
+            walk_l2(vcpu, l3e & l2_mask, virt | (e << L3_PAGETABLE_SHIFT_X86_64));
+    }
+    munmap(map, PAGE_SIZE);
+}
+
+static void walk_l4(int vcpu, uint64_t l4)
+{
+    uint64_t *map;
+    unsigned long long e, l4e, l3_mask = ~(-1ull << 52) & (-1ull << 12), virt = 0;
+    char buf[123];
+    fflush(stdout);
+    map = xc_map_foreign_range(xenctx.xc_handle, xenctx.domid, PAGE_SIZE, PROT_READ, l4 >> PAGE_SHIFT);
+    if (!map) {
+        perror("xc_map_foreign_range l4");
+        return;
+    }
+    for (e = 0; e < L4_PAGETABLE_ENTRIES_X86_64; e++) {
+        l4e = map[e];
+        if (!l4e)
+            continue;
+        buf[0] = '\0';
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l4e & ( 1ull << 63) ? 'X' : 'x'); /* NX */
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %3llx", (l4e >> 52) & ~( 1ull << 11)); /* Available */
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %12llx", l4e & l3_mask); /* l3 base adress */
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %llx", (l4e >> 9) & 0x7); /* Available */
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %llx", (l4e >> 7) & 0x3); /* MBZ */
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %llx", (l4e >> 6) & 0x1); /* Ignored */
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l4e & _PAGE_ACCESSED ? 'A' : 'a');
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l4e & _PAGE_PCD ? 'C' : 'c');
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l4e & _PAGE_PWT ? 'T' : 't');
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l4e & _PAGE_USER ? 'u' : 'u');
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l4e & _PAGE_RW ? 'W' : 'w');
+        snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), " %c", l4e & _PAGE_PRESENT ? 'P' : 'p');
+        printf("%3d l4e %3llx %016llx: %s\n", vcpu, e, l4e, buf);
+        if (l4e & _PAGE_PRESENT)
+            walk_l3(vcpu, l4e & l3_mask, virt | (e << L4_PAGETABLE_SHIFT_X86_64));
+    }
+    munmap(map, PAGE_SIZE);
+}
+
+static void dump_pagetable(struct cpuctx *cpuctx, int vcpu)
+{
+    uint64_t paddr;
+    int pt_levels, size;
+
+    printf("\npagetable for vcpu %d\n", vcpu);
+    /* What kind of paging are we dealing with? */
+    if (xenctx.dominfo.hvm) {
+        if (!guest_protected_mode) {
+            printf("protected mode disabled\n");
+            return;
+        }
+        pt_levels = (cpuctx->hvm.msr_efer & EFER_LMA) ? 4 : (cpuctx->hvm.cr4 & CR4_PAE) ? 3 : 2;
+        paddr = cpuctx->hvm.cr3 & ((pt_levels == 3) ? ~0x1full : ~0xfffull);
+    } else {
+        if (guest_word_size == 8) {
+            pt_levels = 4;
+            paddr = cpuctx->any.x64.ctrlreg[3] & ~0xfffull;
+        } else {
+            pt_levels = 3;
+            paddr = ((uint64_t) xen_cr3_to_pfn(cpuctx->any.x32.ctrlreg[3])) << PAGE_SHIFT;
+        }
+    }
+
+    size = (pt_levels == 2 ? 4 : 8);
+    if (pt_levels == 4) {
+        walk_l4(vcpu, paddr);
+    } else if (pt_levels == 3) {
+        walk_l3(vcpu, paddr, 0);
+    } else {
+        walk_l2(vcpu, paddr, 0);
+    }
+    printf("\n");
+    fflush(stdout);
+}
+
 #elif defined(__ia64__)
 
 #define PTE_ED_SHIFT 52
@@ -853,9 +1044,11 @@ static int print_stack(vcpu_guest_contex
 
 static void dump_ctx(int vcpu)
 {
-    vcpu_guest_context_any_t ctx;
+    struct cpuctx cpuctx;
 
-    if (xc_vcpu_getcontext(xenctx.xc_handle, xenctx.domid, vcpu, &ctx) < 0) {
+    memset(&cpuctx, 0, sizeof(cpuctx));
+
+    if (xc_vcpu_getcontext(xenctx.xc_handle, xenctx.domid, vcpu, &cpuctx.any) < 0) {
         perror("xc_vcpu_getcontext");
         return;
     }
@@ -863,16 +1056,15 @@ static void dump_ctx(int vcpu)
 #if defined(__i386__) || defined(__x86_64__)
     {
         if (xenctx.dominfo.hvm) {
-            struct hvm_hw_cpu cpuctx;
             xen_capabilities_info_t xen_caps = "";
             if (xc_domain_hvm_getcontext_partial(
                     xenctx.xc_handle, xenctx.domid, HVM_SAVE_CODE(CPU),
-                    vcpu, &cpuctx, sizeof cpuctx) != 0) {
+                    vcpu, &cpuctx.hvm, sizeof (cpuctx.hvm)) != 0) {
                 perror("xc_domain_hvm_getcontext_partial");
                 return;
             }
-            guest_word_size = (cpuctx.msr_efer & 0x400) ? 8 : 4;
-            guest_protected_mode = (cpuctx.cr0 & CR0_PE);
+            guest_word_size = (cpuctx.hvm.msr_efer & EFER_LMA) ? 8 : 4;
+            guest_protected_mode = (cpuctx.hvm.cr0 & CR0_PE);
             /* HVM guest context records are always host-sized */
             if (xc_version(xenctx.xc_handle, XENVER_capabilities, &xen_caps) != 0) {
                 perror("xc_version");
@@ -890,14 +1082,18 @@ static void dump_ctx(int vcpu)
     }
 #endif
 
-    print_ctx(&ctx);
+    print_ctx(&cpuctx.any);
 #ifndef NO_TRANSLATION
-    if (print_code(&ctx, vcpu))
+    if (print_code(&cpuctx.any, vcpu))
         return;
-    if (is_kernel_text(instr_pointer(&ctx)))
-        if (print_stack(&ctx, vcpu, guest_word_size))
+    if (is_kernel_text(instr_pointer(&cpuctx.any)))
+        if (print_stack(&cpuctx.any, vcpu, guest_word_size))
             return;
 #endif
+#if defined(__i386__) || defined(__x86_64__)
+    if (xenctx.dump_pagetable)
+        dump_pagetable(&cpuctx, vcpu);
+#endif
 }
 
 static void dump_all_vcpus(void)
@@ -926,6 +1122,10 @@ static void usage(void)
     printf(" -s SYMTAB, --symbol-table=SYMTAB\n");
     printf(" read symbol table from SYMTAB.\n");
     printf(" -S --stack-trace print a complete stack trace.\n");
+#if defined(__i386__) || defined(__x86_64__)
+    printf(" -P --dump-pagetable\n");
+    printf(" print a complete pagetable for the given vcpu.\n");
+#endif
     printf(" -k, --kernel-start\n");
     printf(" set user/kernel split. (default 0xc0000000)\n");
 #ifdef __ia64__
@@ -941,7 +1141,7 @@ int main(int argc, char **argv)
 {
     int ch;
     int ret;
-    static const char *sopts = "fs:hak:SC"
+    static const char *sopts = "fs:hak:SPC"
 #ifdef __ia64__
         "r:"
 #endif
@@ -951,6 +1151,9 @@ int main(int argc, char **argv)
         {"symbol-table", 1, NULL, 's'},
         {"frame-pointers", 0, NULL, 'f'},
         {"kernel-start", 1, NULL, 'k'},
+#if defined(__i386__) || defined(__x86_64__)
+        {"dump-pagetable", 0, NULL, 'P'},
+#endif
 #ifdef __ia64__
         {"regs", 1, NULL, 'r'},
 #endif
@@ -974,6 +1177,11 @@ int main(int argc, char **argv)
         case 'S':
             xenctx.stack_trace = 1;
             break;
+#if defined(__i386__) || defined(__x86_64__)
+        case 'P':
+            xenctx.dump_pagetable = 1;
+            break;
+#endif
 #ifdef __ia64__
         case 'r':
             {
Tim Deegan
2011-Jun-20 17:33 UTC
Re: [Xen-devel] [PATCH 0 of 5] update xenctx to dump pagetables
Hi,

At 17:26 +0200 on 20 Jun (1308590816), Olaf Hering wrote:
> The code to walk the pagetables is/was based on
> xc_translate_foreign_address(), but I still think I have some major bugs in
> there (the last patch). I can't figure out why some l3/l2/l1 entries can not be mapped.
> Any help with getting that output fixed for at least a 64bit PV guest is much
> appreciated.

I didn't spot anything broken in the walker (though it will need a bunch
of cleaning up) and the output looks very plausible. I take it this is
a PV guest? Is it a well-behaved one or a broken post-migrate one?

One thing that might be causing trouble is if the domain's not paused
while you do the walk, then you might see inconsistent tables, though
I'd expect them to be garbage rather than looking like this. Your patch
3/5 does seem to make the pausing conditional where before it always
happened.

Tim.

> part of xenctx output:
> ...
> 0 l4e ff 00000000a1240067: x 0 a1240000 0 0 1 A c t u W P
> 0 l3e 1ff 00000000bc133067: x 0 bc133000 0 0 s 1 A c t U W P
> 0 l2e 64 00000000a1471067: x 0 a1471000 0 0 s 1 A c t U W P
> 0 l1e 46 80000000bd56f167: X 0 bd56f000 0 G p s D A c t U W P 7fffcc846000
> 0 l1e 47 80000000bd57c167: X 0 bd57c000 0 G p s D A c t U W P 7fffcc847000
> 0 l1e 48 80000000bd5af165: X 0 bd5af000 0 G p s D A c t U w P 7fffcc848000
> 0 l1e db 00000000a1972125: x 0 a1972000 0 G p s d A c t U w P 7fffcc8db000
> 0 l4e 100 000000013fff8067: x 0 13fff8000 0 0 1 A c t u W P
> xc_map_foreign_range l3: Invalid argument
> 0 l4e 101 000000013fff0063: x 0 13fff0000 0 0 1 A c t u W P
> xc_map_foreign_range l3: Invalid argument
> 0 l4e 102 00000000bc5b0063: x 0 bc5b0000 0 0 1 A c t u W P
> 0 l3e fe 00000000a7ed2067: x 0 a7ed2000 0 0 s 1 A c t U W P
> 0 l2e 163 00000000a154a067: x 0 a154a000 0 0 s 1 A c t U W P
> 0 l1e 1eb 00000000bc1a7067: x 0 bc1a7000 0 g p s D A c t U W P 813fac7eb000
> 0 l1e 1ec 00000000a154f067: x 0 a154f000 0 g p s D A c t U W P 813fac7ec000
> ...
>
> in dmesg:
> ...
> (XEN) mm.c:880:d0 Error getting mfn 13fff8 (pfn 5555555555555555) from L1 entry 800000013fff8625 for l1e_owner=0, pg_owner=19
> (XEN) mm.c:880:d0 Error getting mfn 13fff0 (pfn 5555555555555555) from L1 entry 800000013fff0625 for l1e_owner=0, pg_owner=19

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd. (Company #02937203, SL9 0BG)
Olaf Hering
2011-Jun-20 17:51 UTC
Re: [Xen-devel] [PATCH 0 of 5] update xenctx to dump pagetables
On Mon, Jun 20, Tim Deegan wrote:

> Hi,
>
> At 17:26 +0200 on 20 Jun (1308590816), Olaf Hering wrote:
> > The code to walk the pagetables is/was based on
> > xc_translate_foreign_address(), but I still think I have some major bugs in
> > there (the last patch). I can't figure out why some l3/l2/l1 entries can not be mapped.
> > Any help with getting that output fixed for at least a 64bit PV guest is much
> > appreciated.
>
> I didn't spot anything broken in the walker (though it will need a bunch
> of cleaning up) and the output looks very plausible. I take it this is
> a PV guest? Is it a well-behaved one or a broken post-migrate one?

Yes, it's a PV guest. I can't reproduce the migrate crashes; they happen
only on very few systems or on a certain configuration. The logs I
posted are from my test system.
Any ideas why some mfns are not accessible?
Are there any other paging states maintained outside of the guest's memory?

> One thing that might be causing trouble is if the domain's not paused
> while you do the walk, then you might see inconsistent tables, though
> I'd expect them to be garbage rather than looking like this. Your patch
> 3/5 does seem to make the pausing conditional where before it always
> happened.

It tries to preserve the state: if the guest was already paused, it probably
should not be unpaused. Is the dominfo.paused flag somehow unreliable? I
thought it comes right from struct domain.
Konrad Rzeszutek Wilk
2011-Jun-20 21:56 UTC
Re: [Xen-devel] [PATCH 0 of 5] update xenctx to dump pagetables
On Mon, Jun 20, 2011 at 07:51:58PM +0200, Olaf Hering wrote:
> On Mon, Jun 20, Tim Deegan wrote:
>
> > Hi,
> >
> > At 17:26 +0200 on 20 Jun (1308590816), Olaf Hering wrote:
> > > The code to walk the pagetables is/was based on
> > > xc_translate_foreign_address(), but I still think I have some major bugs in
> > > there (the last patch). I can't figure out why some l3/l2/l1 entries can not be mapped.
> > > Any help with getting that output fixed for at least a 64bit PV guest is much
> > > appreciated.
> >
> > I didn't spot anything broken in the walker (though it will need a bunch
> > of cleaning up) and the output looks very plausible. I take it this is
> > a PV guest? Is it a well-behaved one or a broken post-migrate one?
>
> Yes, it's a PV guest. I can't reproduce the migrate crashes, it happens
> only on very few systems or on a certain configuration. The logs I
> posted are from my test system.
> Any ideas why some mfns are not accessible?

They look to be the special I/O PFNs. The ones that cover ACPI, framebuffer,
PCI IO bars, MP table.

>
> Are there any other paging states maintained outside of the guest's
> memory?

They look to be I/O pages.

But not sure why they are mapped to your guest?

>
> > One thing that might be causing trouble is if the domain's not paused
> > while you do the walk, then you might see inconsistent tables, though
> > I'd expect them to be garbage rather than looking like this. Your patch
> > 3/5 does seem to make the pausing conditional where before it always
> > happened.
>
> It tries to preserve the state, if the guest was paused it probably
> should not be unpaused. Is the dominfo.paused flag somehow unreliable, I
> thought it comes right from struct domain?
Tim Deegan
2011-Jun-21 09:59 UTC
Re: [Xen-devel] [PATCH 0 of 5] update xenctx to dump pagetables
Hi,

At 17:56 -0400 on 20 Jun (1308592583), Konrad Rzeszutek Wilk wrote:
> > Any ideas why some mfns are not accessible?
>
> They look to be the special I/O PFNs. The ones that cover ACPI, framebuffer,
> PCI IO bars, MP table.

I think they're too high for that, but if you can post the e820 map of
the system this happened on then we'll know.

> > Are there any other paging states maintained outside of the guest's
> > memory?
>
> They look to be I/O pages.
>
> But not sure why they are mapped to your guest?

But they're not mapped into the guest - from the look of them they're
not mapped anywhere. You could add some extra printouts around that
warning in mm.c to show whether the MFNs are valid and if so which
domain owns them.

Also, interesting that it's the addresses just above 0xffff800000000000
that are different - what lives at that address in the PV kernel you're
running?

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd. (Company #02937203, SL9 0BG)
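A note on the addresses: the xenctx output quoted earlier in the thread prints virtual addresses such as 813fac7eb000, while Tim refers to 0xffff800000000000. The difference is sign extension of bit 47. The walker builds the address with `virt | (e << shift)` and never fills in the upper 16 bits. The following is only a sketch of that arithmetic; the helper names `virt_from_indices` and `canonical48` are made up for illustration and are not part of xenctx.

```c
#include <assert.h>
#include <stdint.h>

/* Shift of each page-table level for 4-level x86-64 paging. */
#define L1_SHIFT   12
#define L2_SHIFT   21
#define L3_SHIFT   30
#define L4_SHIFT   39
#define PT_ENTRIES 512   /* entries per table, so indices are 9 bits wide */

/* Hypothetical helper: combine the four table indices into a 48-bit
 * virtual address, the same way the walker ORs shifted indices together. */
static uint64_t virt_from_indices(unsigned int l4, unsigned int l3,
                                  unsigned int l2, unsigned int l1)
{
    assert(l4 < PT_ENTRIES && l3 < PT_ENTRIES);
    assert(l2 < PT_ENTRIES && l1 < PT_ENTRIES);
    return ((uint64_t)l4 << L4_SHIFT) | ((uint64_t)l3 << L3_SHIFT) |
           ((uint64_t)l2 << L2_SHIFT) | ((uint64_t)l1 << L1_SHIFT);
}

/* Hypothetical helper: make a 48-bit address canonical by copying
 * bit 47 into bits 63..48. */
static uint64_t canonical48(uint64_t virt)
{
    virt &= 0x0000FFFFFFFFFFFFULL;
    if (virt & (1ULL << 47))
        virt |= 0xFFFF000000000000ULL;
    return virt;
}
```

With the indices from the quoted output (l4e 102, l3e fe, l2e 163, l1e 1eb) the first helper reproduces 813fac7eb000, and the canonical form is ffff813fac7eb000, which lies in the region Tim is asking about.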
Jan Beulich
2011-Jun-21 10:31 UTC
Re: [Xen-devel] [PATCH 0 of 5] update xenctx to dump pagetables
>>> On 21.06.11 at 11:59, Tim Deegan <Tim.Deegan@citrix.com> wrote:
> Hi,
>
> At 17:56 -0400 on 20 Jun (1308592583), Konrad Rzeszutek Wilk wrote:
>> > Any ideas why some mfns are not accessible?
>>
>> They look to be the special I/O PFNs. The ones that cover ACPI, framebuffer,
>> PCI IO bars, MP table.
>
> I think they're too high for that, but if you can post the e820 map of
> the system this happened on then we'll know.
>
>> > Are there any other paging states maintained outside of the guest's
>> > memory?
>>
>> They look to be I/O pages.
>>
>> But not sure why they are mapped to your guest?
>
> But they're not mapped into the guest - from the look of them they're
> not mapped anywhere. You could add some extra printouts around that
> warning in mm.c to show whether the MFNs are valid and if so which
> domain owns them.
>
> Also, interesting that it's the addresses just above 0xffff800000000000
> that are different - what lives at that address in the PV kernel you're
> running?

That's Xen's space, isn't it. Clearly any non-hypervisor based page
table walking code has to ignore this range for PV guests' page
tables.

Jan
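Jan's point can be sketched as a filter on L4 slots. The bounds below are assumed to match `__HYPERVISOR_VIRT_START` and `__HYPERVISOR_VIRT_END` for 64-bit guests in Xen's public headers (xen/include/public/arch-x86/xen-x86_64.h); please check them against your tree before relying on them. The names `l4_slot_of` and `pv64_l4_slot_is_xen` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed copies of the 64-bit PV ABI bounds of Xen's reserved range;
 * verify against xen/include/public/arch-x86/xen-x86_64.h. */
#define PV64_XEN_VIRT_START 0xFFFF800000000000ULL
#define PV64_XEN_VIRT_END   0xFFFF880000000000ULL

#define L4_SHIFT   39    /* each L4 entry covers 2^39 bytes (512 GiB) */
#define L4_ENTRIES 512

/* L4 slot index (address bits 47..39) of a virtual address. */
static unsigned int l4_slot_of(uint64_t virt)
{
    return (unsigned int)((virt >> L4_SHIFT) & (L4_ENTRIES - 1));
}

/* True if this L4 slot lies inside the range reserved for Xen, in which
 * case a tool-side walker for a 64-bit PV guest should skip it. */
static bool pv64_l4_slot_is_xen(unsigned int slot)
{
    assert(slot < L4_ENTRIES);
    return slot >= l4_slot_of(PV64_XEN_VIRT_START) &&
           slot <  l4_slot_of(PV64_XEN_VIRT_END);
}
```

Under these assumed bounds the reserved range covers L4 slots 0x100 through 0x10f, which includes the `l4e 100` and `l4e 101` entries that fail with "Invalid argument" in the quoted xenctx output. A walker could call such a check at the top of its L4 loop and `continue` on a match.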
Olaf Hering writes ("[Xen-devel] [PATCH 5 of 5] xenctx: dump pagetable"):
> # HG changeset patch
> # User Olaf Hering <olaf@aepfle.de>
> # Date 1308582353 -7200
> # Node ID 0443de5faea9b904b92219b3b804dec147697366
> # Parent ddf16ea954876d8af371f9751fc06eb4c9e78b36
> xenctx: dump pagetable
>
> This change is buggy...

Uh... ? Did this slip through by mistake somehow ?

Ian.
Ian Jackson
2011-Jun-21 12:36 UTC
Re: [Xen-devel] [PATCH 2 of 5] xenctx: move all globals into struct xenctx
Olaf Hering writes ("[Xen-devel] [PATCH 2 of 5] xenctx: move all globals into struct xenctx"):
> xenctx: move all globals into struct xenctx

Having read 1-4 I think they look reasonable and I'm happy to apply
them when you give the word you think they're ready. It wasn't
entirely clear from your messages whether you want me to do so.

Thanks,
Ian.
On Tue, Jun 21, Ian Jackson wrote:

> Olaf Hering writes ("[Xen-devel] [PATCH 5 of 5] xenctx: dump pagetable"):
> > # HG changeset patch
> > # User Olaf Hering <olaf@aepfle.de>
> > # Date 1308582353 -7200
> > # Node ID 0443de5faea9b904b92219b3b804dec147697366
> > # Parent ddf16ea954876d8af371f9751fc06eb4c9e78b36
> > xenctx: dump pagetable
> >
> > This change is buggy...
>
> Uh... ? Did this slip through by mistake somehow ?

It's here for review and discussion; I forgot to add a reasonable
description. The change itself needs PAE and i386 support at least.
I will work on a better version.

Olaf
Olaf Hering
2011-Jun-21 15:27 UTC
Re: [Xen-devel] [PATCH 2 of 5] xenctx: move all globals into struct xenctx
On Tue, Jun 21, Ian Jackson wrote:

> Olaf Hering writes ("[Xen-devel] [PATCH 2 of 5] xenctx: move all globals into struct xenctx"):
> > xenctx: move all globals into struct xenctx
>
> Having read 1-4 I think they look reasonable and I'm happy to apply
> them when you give the word you think they're ready. It wasn't
> entirely clear from your messages whether you want me to do so.

1-4 are ready, just the last one (the pagetable change) needs more work.
If the concept is ok, please apply these changes.

Olaf
Ian Jackson
2011-Jun-21 15:31 UTC
Re: [Xen-devel] [PATCH 2 of 5] xenctx: move all globals into struct xenctx
Olaf Hering writes ("Re: [Xen-devel] [PATCH 2 of 5] xenctx: move all globals into struct xenctx"):
> 1-4 are ready, just the last one (the pagetable change) needs more work.
> If the concept is ok, please apply these changes.

Done, thanks.

Ian.
Olaf Hering
2011-Jun-22 13:50 UTC
Re: [Xen-devel] [PATCH 0 of 5] update xenctx to dump pagetables
On Tue, Jun 21, Jan Beulich wrote:

> >>> On 21.06.11 at 11:59, Tim Deegan <Tim.Deegan@citrix.com> wrote:
> > Hi,
> >
> > At 17:56 -0400 on 20 Jun (1308592583), Konrad Rzeszutek Wilk wrote:
> >> > Any ideas why some mfns are not accessible?
> >>
> >> They look to be the special I/O PFNs. The ones that cover ACPI, framebuffer,
> >> PCI IO bars, MP table.
> >
> > I think they're too high for that, but if you can post the e820 map of
> > the system this happened on then we'll know.
> >
> >> > Are there any other paging states maintained outside of the guest's
> >> > memory?
> >>
> >> They look to be I/O pages.
> >>
> >> But not sure why they are mapped to your guest?
> >
> > But they're not mapped into the guest - from the look of them they're
> > not mapped anywhere. You could add some extra printouts around that
> > warning in mm.c to show whether the MFNs are valid and if so which
> > domain owns them.
> >
> > Also, interesting that it's the addresses just above 0xffff800000000000
> > that are different - what lives at that address in the PV kernel you're
> > running?
>
> That's Xen's space, isn't it. Clearly any non-hypervisor based page
> table walking code has to ignore this range for PV guests' page
> tables.

Is there a way to detect that? I can't seem to match any of these ranges to
something in the guest's dmesg or /proc.

After adding some debug to my xenctx from sles11 4.0, I get this on vcpu 0:

xc_map_foreign_range: walk_l3: 0 virt 0000800000000000 mfn 13fff8: Invalid argument
xc_map_foreign_range: walk_l3: 0 virt 0000808000000000 mfn 13fff0: Invalid argument
xc_map_foreign_range: walk_l2: 0 virt 0000814000000000 mfn 13fff8: Invalid argument
xc_map_foreign_range: walk_l2: 0 virt 0000814040000000 mfn 13fff0: Invalid argument
xc_map_foreign_range: walk_l1: 0 virt 00008140a0000000 mfn 13fff8: Invalid argument
xc_map_foreign_range: walk_l1: 0 virt 00008140a0200000 mfn 13fff0: Invalid argument
xc_map_foreign_range: walk_l1: 0 virt 00008140a0800000 mfn 137ff8: Invalid argument
xc_map_foreign_range: walk_l1: 0 virt 00008140a0a00000 mfn bf49a: Invalid argument
xc_map_foreign_range: walk_l1: 0 virt 00008140a0c00000 mfn bf495: Invalid argument
xc_map_foreign_range: walk_l2: 0 virt 0000814100000000 mfn 137ff8: Invalid argument
xc_map_foreign_range: walk_l2: 0 virt 0000814140000000 mfn bf49a: Invalid argument
xc_map_foreign_range: walk_l2: 0 virt 0000814180000000 mfn bf495: Invalid argument
xc_map_foreign_range: walk_l3: 0 virt 0000820000000000 mfn 137ff8: Invalid argument
xc_map_foreign_range: walk_l3: 0 virt 0000828000000000 mfn bf49a: Invalid argument
xc_map_foreign_range: walk_l3: 0 virt 0000830000000000 mfn bf495: Invalid argument

And dmesg has:

(XEN) mm.c:880:d0 Error getting mfn 13fff8 (pfn 5555555555555555) from L1 entry 800000013fff8625 for l1e_owner=0, pg_owner=1
(XEN) mm.c:880:d0 Error getting mfn 13fff0 (pfn 5555555555555555) from L1 entry 800000013fff0625 for l1e_owner=0, pg_owner=1
(XEN) mm.c:880:d0 Error getting mfn 13fff8 (pfn 5555555555555555) from L1 entry 800000013fff8625 for l1e_owner=0, pg_owner=1
(XEN) mm.c:880:d0 Error getting mfn 13fff0 (pfn 5555555555555555) from L1 entry 800000013fff0625 for l1e_owner=0, pg_owner=1
(XEN) mm.c:880:d0 Error getting mfn 13fff8 (pfn 5555555555555555) from L1 entry 800000013fff8625 for l1e_owner=0, pg_owner=1
(XEN) mm.c:880:d0 Error getting mfn 13fff0 (pfn 5555555555555555) from L1 entry 800000013fff0625 for l1e_owner=0, pg_owner=1
(XEN) mm.c:880:d0 Error getting mfn 137ff8 (pfn f180d) from L1 entry 8000000137ff8625 for l1e_owner=0, pg_owner=1
(XEN) mm.c:880:d0 Error getting mfn bf49a (pfn 5555555555555555) from L1 entry 80000000bf49a625 for l1e_owner=0, pg_owner=1
(XEN) mm.c:880:d0 Error getting mfn bf495 (pfn 5555555555555555) from L1 entry 80000000bf495625 for l1e_owner=0, pg_owner=1
(XEN) mm.c:880:d0 Error getting mfn 137ff8 (pfn f180d) from L1 entry 8000000137ff8625 for l1e_owner=0, pg_owner=1
(XEN) mm.c:880:d0 Error getting mfn bf49a (pfn 5555555555555555) from L1 entry 80000000bf49a625 for l1e_owner=0, pg_owner=1
(XEN) mm.c:880:d0 Error getting mfn bf495 (pfn 5555555555555555) from L1 entry 80000000bf495625 for l1e_owner=0, pg_owner=1
(XEN) mm.c:880:d0 Error getting mfn 137ff8 (pfn f180d) from L1 entry 8000000137ff8625 for l1e_owner=0, pg_owner=1
(XEN) mm.c:880:d0 Error getting mfn bf49a (pfn 5555555555555555) from L1 entry 80000000bf49a625 for l1e_owner=0, pg_owner=1
(XEN) mm.c:880:d0 Error getting mfn bf495 (pfn 5555555555555555) from L1 entry 80000000bf495625 for l1e_owner=0, pg_owner=1

Olaf
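For readers following the dmesg lines above, the relation between an "L1 entry" value and the reported mfn can be sketched as below. The bit positions are the architectural x86-64 page-table entry layout; the struct and function names (`pte_info`, `decode_pte`) are invented for this illustration and do not come from xenctx or Xen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural x86-64 page-table entry bits. */
#define PTE_PRESENT   (1ULL << 0)
#define PTE_RW        (1ULL << 1)
#define PTE_USER      (1ULL << 2)
#define PTE_ACCESSED  (1ULL << 5)
#define PTE_DIRTY     (1ULL << 6)
#define PTE_NX        (1ULL << 63)
#define PTE_ADDR_MASK 0x000FFFFFFFFFF000ULL   /* frame address, bits 51..12 */
#define PAGE_SHIFT_4K 12

/* Hypothetical container for the decoded fields. */
struct pte_info {
    uint64_t mfn;       /* frame number; a machine frame for a PV guest */
    bool present, writable, user, accessed, dirty, nx;
};

/* Split a 64-bit entry into its frame number and common flag bits. */
static void decode_pte(uint64_t entry, struct pte_info *out)
{
    assert(out != NULL);
    out->mfn      = (entry & PTE_ADDR_MASK) >> PAGE_SHIFT_4K;
    out->present  = (entry & PTE_PRESENT)  != 0;
    out->writable = (entry & PTE_RW)       != 0;
    out->user     = (entry & PTE_USER)     != 0;
    out->accessed = (entry & PTE_ACCESSED) != 0;
    out->dirty    = (entry & PTE_DIRTY)    != 0;
    out->nx       = (entry & PTE_NX)       != 0;
}
```

Applied to 800000013fff8625 from the log, this yields mfn 13fff8 with the present, user, accessed and NX bits set and the writable bit clear, i.e. a read-only, non-executable mapping request, consistent with the mfn printed in the same dmesg line.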
Jan Beulich
2011-Jun-22 14:34 UTC
Re: [Xen-devel] [PATCH 0 of 5] update xenctx to dump pagetables
>>> On 22.06.11 at 15:50, Olaf Hering <olaf@aepfle.de> wrote:
> On Tue, Jun 21, Jan Beulich wrote:
>
>> >>> On 21.06.11 at 11:59, Tim Deegan <Tim.Deegan@citrix.com> wrote:
>> > Hi,
>> >
>> > At 17:56 -0400 on 20 Jun (1308592583), Konrad Rzeszutek Wilk wrote:
>> >> > Any ideas why some mfns are not accessible?
>> >>
>> >> They look to be the special I/O PFNs. The ones that cover ACPI, framebuffer,
>> >> PCI IO bars, MP table.
>> >
>> > I think they're too high for that, but if you can post the e820 map of
>> > the system this happened on then we'll know.
>> >
>> >> > Are there any other paging states maintained outside of the guest's
>> >> > memory?
>> >>
>> >> They look to be I/O pages.
>> >>
>> >> But not sure why they are mapped to your guest?
>> >
>> > But they're not mapped into the guest - from the look of them they're
>> > not mapped anywhere. You could add some extra printouts around that
>> > warning in mm.c to show whether the MFNs are valid and if so which
>> > domain owns them.
>> >
>> > Also, interesting that it's the addresses just above 0xffff800000000000
>> > that are different - what lives at that address in the PV kernel you're
>> > running?
>>
>> That's Xen's space, isn't it. Clearly any non-hypervisor based page
>> table walking code has to ignore this range for PV guests' page
>> tables.
>
> Is there a way to detect that? I can't seem to match any of these ranges to
> something in the guest's dmesg or /proc.

No need for detection - the hole is part of the ABI (minus its dynamic
size in 32-bit pv guests running on 64-bit hypervisor - that may require
use of heuristics).

Jan