Hi Konrad, here is another one from the hm-what? department: Colin discovered that running the attached program with the fork active (e.g. "./mmap-example -f 0x10000", the address can be that or iomem) this triggers the following weird messages: [ 6824.453724] mmap-example:3481 map pfn expected mapping type write-back for [mem 0x00010000-0x00010fff], got uncached-minus [ 6824.453776] ------------[ cut here ]------------ [ 6824.453796] WARNING: at /build/buildd/linux-3.8.0/arch/x86/mm/pat.c:774 untrack_pfn+0xb8/0xd0() ... [ 6824.453920] Pid: 3481, comm: mmap-example Tainted: GF 3.8.0-6-generic #13-Ubuntu [ 6824.453926] Call Trace: [ 6824.453944] [<ffffffff8105879f>] warn_slowpath_common+0x7f/0xc0 [ 6824.453954] [<ffffffff810587fa>] warn_slowpath_null+0x1a/0x20 [ 6824.453963] [<ffffffff8104bcc8>] untrack_pfn+0xb8/0xd0 [ 6824.453975] [<ffffffff81156c1c>] unmap_single_vma+0xac/0x100 [ 6824.453985] [<ffffffff81157459>] unmap_vmas+0x49/0x90 [ 6824.453995] [<ffffffff8115f808>] exit_mmap+0x98/0x170 [ 6824.454007] [<ffffffff810559a4>] mmput+0x64/0x100 [ 6824.454017] [<ffffffff810560f5>] dup_mm+0x445/0x660 [ 6824.454027] [<ffffffff81056d9f>] copy_process.part.22+0xa5f/0x1510 [ 6824.454038] [<ffffffff81057931>] do_fork+0x91/0x350 [ 6824.454048] [<ffffffff81057c76>] sys_clone+0x16/0x20 [ 6824.454060] [<ffffffff816ccbf9>] stub_clone+0x69/0x90 [ 6824.454069] [<ffffffff816cc89d>] ? system_call_fastpath+0x1a/0x1f [ 6824.454076] ---[ end trace 4918cdd0a4c9fea4 ]--- I found that this is related to your bandaid patch commit 8eaffa67b43e99ae581622c5133e20b0f48bcef1 Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Date: Fri Feb 10 09:16:27 2012 -0500 xen/pat: Disable PAT support for now. I just do not understand how this happens. From the trace it seems the fork fails when duplicating the VMAs (dup_mm calls mmput on failure). So maybe the warning is just related to this. So primarily the question is how on fork the _PAGE_PCD bit can become set? That and _PAGE_PWT are cleared from the supported mask by the patch, so somehow I would think nothing should be able to set it... But apparently not so. Not sure it is a big deal since I never saw this in normal operation and it seems to be ok when unapping before doing the fork. It is just plain odd. -Stefan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
On Fri, Feb 22, 2013 at 02:54:16PM +0100, Stefan Bader wrote:> Hi Konrad,Hey Stefan,> > here is another one from the hm-what? department:Heh - the really good-bug-hunting one. Lets also include Jinsong as he has been tracking a similar one with mcelog.> > Colin discovered that running the attached program with the fork active (e.g. > "./mmap-example -f 0x10000", the address can be that or iomem) this triggers the > following weird messages: > > [ 6824.453724] mmap-example:3481 map pfn expected mapping type write-back for > [mem 0x00010000-0x00010fff], got uncached-minus > [ 6824.453776] ------------[ cut here ]------------ > [ 6824.453796] WARNING: at /build/buildd/linux-3.8.0/arch/x86/mm/pat.c:774 > untrack_pfn+0xb8/0xd0() > ... > [ 6824.453920] Pid: 3481, comm: mmap-example Tainted: GF > 3.8.0-6-generic #13-Ubuntu > [ 6824.453926] Call Trace: > [ 6824.453944] [<ffffffff8105879f>] warn_slowpath_common+0x7f/0xc0 > [ 6824.453954] [<ffffffff810587fa>] warn_slowpath_null+0x1a/0x20 > [ 6824.453963] [<ffffffff8104bcc8>] untrack_pfn+0xb8/0xd0 > [ 6824.453975] [<ffffffff81156c1c>] unmap_single_vma+0xac/0x100 > [ 6824.453985] [<ffffffff81157459>] unmap_vmas+0x49/0x90 > [ 6824.453995] [<ffffffff8115f808>] exit_mmap+0x98/0x170 > [ 6824.454007] [<ffffffff810559a4>] mmput+0x64/0x100 > [ 6824.454017] [<ffffffff810560f5>] dup_mm+0x445/0x660 > [ 6824.454027] [<ffffffff81056d9f>] copy_process.part.22+0xa5f/0x1510 > [ 6824.454038] [<ffffffff81057931>] do_fork+0x91/0x350 > [ 6824.454048] [<ffffffff81057c76>] sys_clone+0x16/0x20 > [ 6824.454060] [<ffffffff816ccbf9>] stub_clone+0x69/0x90 > [ 6824.454069] [<ffffffff816cc89d>] ? system_call_fastpath+0x1a/0x1f > [ 6824.454076] ---[ end trace 4918cdd0a4c9fea4 ]--- > > I found that this is related to your bandaid patch > > commit 8eaffa67b43e99ae581622c5133e20b0f48bcef1 > Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > Date: Fri Feb 10 09:16:27 2012 -0500 > > xen/pat: Disable PAT support for now. > > I just do not understand how this happens. From the trace it seems the fork > fails when duplicating the VMAs (dup_mm calls mmput on failure). So maybe the > warning is just related to this. So primarily the question is how on fork the > _PAGE_PCD bit can become set? That and _PAGE_PWT are cleared from the supported > mask by the patch, so somehow I would think nothing should be able to set it... > But apparently not so. > Not sure it is a big deal since I never saw this in normal operation and it > seems to be ok when unapping before doing the fork. It is just plain odd.Jinsong mentioned that there is some oddity with the MTRR. Somehow the ranges are swapped or not correct. Jinsong, could you shed some light on what you have found so far?> > -Stefan> #include <stdio.h> > #include <stdlib.h> > #include <stdint.h> > #include <stdbool.h> > #include <unistd.h> > #include <sys/mman.h> > #include <sys/types.h> > #include <sys/stat.h> > #include <sys/types.h> > #include <sys/wait.h> > #include <fcntl.h> > > int main(int argc, char **argv) > { > uint8_t *data; > int fd; > unsigned long long offset; > pid_t pid; > int status; > int opt; > bool opt_fork = false; > > while ((opt = getopt(argc, argv, "f")) != -1) { > switch (opt) { > case ''f'': > opt_fork = true; > break; > } > } > > if (argc <= optind) { > fprintf(stderr, "%s: [-f] address\n", argv[0]); > fprintf(stderr, "\t-f specifices if we should fork with the mmap\n"); > exit(EXIT_FAILURE); > } > if (sscanf(argv[optind], "%lli", &offset) != 1) { > fprintf(stderr, "Cannot determine mmap address from %s\n", argv[optind]); > exit(EXIT_FAILURE); > } > > if ((fd = open("/dev/mem", O_RDONLY)) < 0) { > fprintf(stderr, "Cannot open /dev/mem\n"); > exit(EXIT_FAILURE); > } > > printf("mmap: 0x%llx..0x%llx\n", offset, offset + 4095); > > if ((data = mmap(NULL, 4096, PROT_READ, MAP_PRIVATE, fd, (off_t)offset)) == MAP_FAILED) { > fprintf(stderr, "Cannot mmap 0x%llx\n", offset); > exit(EXIT_FAILURE); > } > > close(fd); > > if (opt_fork) { > pid = fork(); > if (pid == 0) { > /* child */ > _exit(0); > } else { > /* parent */ > waitpid(pid, &status, 0); > } > } > > if (munmap(data, 4096) < 0) { > fprintf(stderr, "Cannot munmap %p\n", data); > exit(EXIT_FAILURE); > } > exit(EXIT_SUCCESS); > } >> _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel
Konrad Rzeszutek Wilk wrote:> On Fri, Feb 22, 2013 at 02:54:16PM +0100, Stefan Bader wrote: >> Hi Konrad, > > Hey Stefan, >> >> here is another one from the hm-what? department: > > Heh - the really good-bug-hunting one. Lets also include Jinsong as > he has been tracking a similar one with mcelog. >> >> Colin discovered that running the attached program with the fork >> active (e.g. "./mmap-example -f 0x10000", the address can be that or >> iomem) this triggers the following weird messages: >> >> [ 6824.453724] mmap-example:3481 map pfn expected mapping type >> write-back for [mem 0x00010000-0x00010fff], got uncached-minus >> [ 6824.453776] ------------[ cut here ]------------ >> [ 6824.453796] WARNING: at >> /build/buildd/linux-3.8.0/arch/x86/mm/pat.c:774 >> untrack_pfn+0xb8/0xd0() ... [ 6824.453920] Pid: 3481, comm: >> mmap-example Tainted: GF >> 3.8.0-6-generic #13-Ubuntu >> [ 6824.453926] Call Trace: >> [ 6824.453944] [<ffffffff8105879f>] warn_slowpath_common+0x7f/0xc0 >> [ 6824.453954] [<ffffffff810587fa>] warn_slowpath_null+0x1a/0x20 >> [ 6824.453963] [<ffffffff8104bcc8>] untrack_pfn+0xb8/0xd0 >> [ 6824.453975] [<ffffffff81156c1c>] unmap_single_vma+0xac/0x100 >> [ 6824.453985] [<ffffffff81157459>] unmap_vmas+0x49/0x90 >> [ 6824.453995] [<ffffffff8115f808>] exit_mmap+0x98/0x170 >> [ 6824.454007] [<ffffffff810559a4>] mmput+0x64/0x100 >> [ 6824.454017] [<ffffffff810560f5>] dup_mm+0x445/0x660 >> [ 6824.454027] [<ffffffff81056d9f>] >> copy_process.part.22+0xa5f/0x1510 [ 6824.454038] >> [<ffffffff81057931>] do_fork+0x91/0x350 [ 6824.454048] >> [<ffffffff81057c76>] sys_clone+0x16/0x20 [ 6824.454060] >> [<ffffffff816ccbf9>] stub_clone+0x69/0x90 [ 6824.454069] >> [<ffffffff816cc89d>] ? system_call_fastpath+0x1a/0x1f [ 6824.454076] >> ---[ end trace 4918cdd0a4c9fea4 ]--- >> >> I found that this is related to your bandaid patch >> >> commit 8eaffa67b43e99ae581622c5133e20b0f48bcef1 >> Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> >> Date: Fri Feb 10 09:16:27 2012 -0500 >> >> xen/pat: Disable PAT support for now. >> >> I just do not understand how this happens. From the trace it seems >> the fork >> fails when duplicating the VMAs (dup_mm calls mmput on failure). So >> maybe the >> warning is just related to this. So primarily the question is how on >> fork the _PAGE_PCD bit can become set? That and _PAGE_PWT are >> cleared from the supported >> mask by the patch, so somehow I would think nothing should be able >> to set it... >> But apparently not so. >> Not sure it is a big deal since I never saw this in normal operation >> and it >> seems to be ok when unapping before doing the fork. It is just plain >> odd. > > Jinsong mentioned that there is some oddity with the MTRR. Somehow the > ranges are swapped or not correct. Jinsong, could you shed some light > on what you have found so far? >Yes, Sander once also reported a similar weird warning when start mcelog daemon, as attached. Basically, it occurs when mcelog user daemon start, do_fork --> copy_process --> dup_mm --> dup_mmap --> copy_page_range --> track_pfn_copy --> reserve_pfn_range --> line 624: flags != want_flags It comes from different memory types of page table (_PAGE_CACHE_WB) and mtrr (_PAGE_CACHE_UC_MINUS). However, why it get different memory types from page table and mtrr is still unclear, reproducing the bug is difficult and unstable. Thanks, Jinsong>> >> -Stefan > >> #include <stdio.h> >> #include <stdlib.h> >> #include <stdint.h> >> #include <stdbool.h> >> #include <unistd.h> >> #include <sys/mman.h> >> #include <sys/types.h> >> #include <sys/stat.h> >> #include <sys/types.h> >> #include <sys/wait.h> >> #include <fcntl.h> >> >> int main(int argc, char **argv) >> { >> uint8_t *data; >> int fd; >> unsigned long long offset; >> pid_t pid; >> int status; >> int opt; >> bool opt_fork = false; >> >> while ((opt = getopt(argc, argv, "f")) != -1) { >> switch (opt) { >> case ''f'': >> opt_fork = true; >> break; >> } >> } >> >> if (argc <= optind) { >> fprintf(stderr, "%s: [-f] address\n", argv[0]); >> fprintf(stderr, "\t-f specifices if we should fork with the >> mmap\n"); exit(EXIT_FAILURE); } >> if (sscanf(argv[optind], "%lli", &offset) != 1) { >> fprintf(stderr, "Cannot determine mmap address from %s\n", >> argv[optind]); exit(EXIT_FAILURE); } >> >> if ((fd = open("/dev/mem", O_RDONLY)) < 0) { >> fprintf(stderr, "Cannot open /dev/mem\n"); >> exit(EXIT_FAILURE); >> } >> >> printf("mmap: 0x%llx..0x%llx\n", offset, offset + 4095); >> >> if ((data = mmap(NULL, 4096, PROT_READ, MAP_PRIVATE, fd, >> (off_t)offset)) == MAP_FAILED) { fprintf(stderr, "Cannot mmap >> 0x%llx\n", offset); exit(EXIT_FAILURE); >> } >> >> close(fd); >> >> if (opt_fork) { >> pid = fork(); >> if (pid == 0) { >> /* child */ >> _exit(0); >> } else { >> /* parent */ >> waitpid(pid, &status, 0); >> } >> } >> >> if (munmap(data, 4096) < 0) { >> fprintf(stderr, "Cannot munmap %p\n", data); >> exit(EXIT_FAILURE); >> } >> exit(EXIT_SUCCESS); >> } >> > > > > >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xen.org >> http://lists.xen.org/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
On 25.02.2013 04:15, Liu, Jinsong wrote:> Konrad Rzeszutek Wilk wrote: >> On Fri, Feb 22, 2013 at 02:54:16PM +0100, Stefan Bader wrote: >>> Hi Konrad, >> >> Hey Stefan, >>> >>> here is another one from the hm-what? department: >> >> Heh - the really good-bug-hunting one. Lets also include Jinsong as >> he has been tracking a similar one with mcelog. >>> >>> Colin discovered that running the attached program with the fork >>> active (e.g. "./mmap-example -f 0x10000", the address can be that or >>> iomem) this triggers the following weird messages: >>> >>> [ 6824.453724] mmap-example:3481 map pfn expected mapping type >>> write-back for [mem 0x00010000-0x00010fff], got uncached-minus >>> [ 6824.453776] ------------[ cut here ]------------ >>> [ 6824.453796] WARNING: at >>> /build/buildd/linux-3.8.0/arch/x86/mm/pat.c:774 >>> untrack_pfn+0xb8/0xd0() ... [ 6824.453920] Pid: 3481, comm: >>> mmap-example Tainted: GF >>> 3.8.0-6-generic #13-Ubuntu >>> [ 6824.453926] Call Trace: >>> [ 6824.453944] [<ffffffff8105879f>] warn_slowpath_common+0x7f/0xc0 >>> [ 6824.453954] [<ffffffff810587fa>] warn_slowpath_null+0x1a/0x20 >>> [ 6824.453963] [<ffffffff8104bcc8>] untrack_pfn+0xb8/0xd0 >>> [ 6824.453975] [<ffffffff81156c1c>] unmap_single_vma+0xac/0x100 >>> [ 6824.453985] [<ffffffff81157459>] unmap_vmas+0x49/0x90 >>> [ 6824.453995] [<ffffffff8115f808>] exit_mmap+0x98/0x170 >>> [ 6824.454007] [<ffffffff810559a4>] mmput+0x64/0x100 >>> [ 6824.454017] [<ffffffff810560f5>] dup_mm+0x445/0x660 >>> [ 6824.454027] [<ffffffff81056d9f>] >>> copy_process.part.22+0xa5f/0x1510 [ 6824.454038] >>> [<ffffffff81057931>] do_fork+0x91/0x350 [ 6824.454048] >>> [<ffffffff81057c76>] sys_clone+0x16/0x20 [ 6824.454060] >>> [<ffffffff816ccbf9>] stub_clone+0x69/0x90 [ 6824.454069] >>> [<ffffffff816cc89d>] ? system_call_fastpath+0x1a/0x1f [ 6824.454076] >>> ---[ end trace 4918cdd0a4c9fea4 ]--- >>> >>> I found that this is related to your bandaid patch >>> >>> commit 8eaffa67b43e99ae581622c5133e20b0f48bcef1 >>> Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> >>> Date: Fri Feb 10 09:16:27 2012 -0500 >>> >>> xen/pat: Disable PAT support for now. >>> >>> I just do not understand how this happens. From the trace it seems >>> the fork >>> fails when duplicating the VMAs (dup_mm calls mmput on failure). So >>> maybe the >>> warning is just related to this. So primarily the question is how on >>> fork the _PAGE_PCD bit can become set? That and _PAGE_PWT are >>> cleared from the supported >>> mask by the patch, so somehow I would think nothing should be able >>> to set it... >>> But apparently not so. >>> Not sure it is a big deal since I never saw this in normal operation >>> and it >>> seems to be ok when unapping before doing the fork. It is just plain >>> odd. >> >> Jinsong mentioned that there is some oddity with the MTRR. Somehow the >> ranges are swapped or not correct. Jinsong, could you shed some light >> on what you have found so far? >> > > Yes, Sander once also reported a similar weird warning when start mcelog daemon, as attached. > > Basically, it occurs when mcelog user daemon start, > do_fork > --> copy_process > --> dup_mm > --> dup_mmap > --> copy_page_range > --> track_pfn_copy > --> reserve_pfn_range > --> line 624: flags != want_flags > It comes from different memory types of page table (_PAGE_CACHE_WB) and mtrr (_PAGE_CACHE_UC_MINUS). > > However, why it get different memory types from page table and mtrr is still unclear, reproducing the bug is difficult and unstable. > > Thanks,Ok, so this seems to take the same code paths. As for the test program, it fails on duplicating some mmap on a fork. The test program does this all the time (except the backtrace warning which is warn_once). So you say, the UC- comes from the MTRR side... Hm, have to look at that. Thanks, Stefan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
On 25.02.2013 10:10, Stefan Bader wrote:> On 25.02.2013 04:15, Liu, Jinsong wrote: >> Konrad Rzeszutek Wilk wrote: >>> On Fri, Feb 22, 2013 at 02:54:16PM +0100, Stefan Bader wrote: >>>> Hi Konrad, >>> >>> Hey Stefan, >>>> >>>> here is another one from the hm-what? department: >>> >>> Heh - the really good-bug-hunting one. Lets also include Jinsong as >>> he has been tracking a similar one with mcelog. >>>> >>>> Colin discovered that running the attached program with the fork >>>> active (e.g. "./mmap-example -f 0x10000", the address can be that or >>>> iomem) this triggers the following weird messages: >>>> >>>> [ 6824.453724] mmap-example:3481 map pfn expected mapping type >>>> write-back for [mem 0x00010000-0x00010fff], got uncached-minus >>>> [ 6824.453776] ------------[ cut here ]------------ >>>> [ 6824.453796] WARNING: at >>>> /build/buildd/linux-3.8.0/arch/x86/mm/pat.c:774 >>>> untrack_pfn+0xb8/0xd0() ... [ 6824.453920] Pid: 3481, comm: >>>> mmap-example Tainted: GF >>>> 3.8.0-6-generic #13-Ubuntu >>>> [ 6824.453926] Call Trace: >>>> [ 6824.453944] [<ffffffff8105879f>] warn_slowpath_common+0x7f/0xc0 >>>> [ 6824.453954] [<ffffffff810587fa>] warn_slowpath_null+0x1a/0x20 >>>> [ 6824.453963] [<ffffffff8104bcc8>] untrack_pfn+0xb8/0xd0 >>>> [ 6824.453975] [<ffffffff81156c1c>] unmap_single_vma+0xac/0x100 >>>> [ 6824.453985] [<ffffffff81157459>] unmap_vmas+0x49/0x90 >>>> [ 6824.453995] [<ffffffff8115f808>] exit_mmap+0x98/0x170 >>>> [ 6824.454007] [<ffffffff810559a4>] mmput+0x64/0x100 >>>> [ 6824.454017] [<ffffffff810560f5>] dup_mm+0x445/0x660 >>>> [ 6824.454027] [<ffffffff81056d9f>] >>>> copy_process.part.22+0xa5f/0x1510 [ 6824.454038] >>>> [<ffffffff81057931>] do_fork+0x91/0x350 [ 6824.454048] >>>> [<ffffffff81057c76>] sys_clone+0x16/0x20 [ 6824.454060] >>>> [<ffffffff816ccbf9>] stub_clone+0x69/0x90 [ 6824.454069] >>>> [<ffffffff816cc89d>] ? system_call_fastpath+0x1a/0x1f [ 6824.454076] >>>> ---[ end trace 4918cdd0a4c9fea4 ]--- >>>> >>>> I found that this is related to your bandaid patch >>>> >>>> commit 8eaffa67b43e99ae581622c5133e20b0f48bcef1 >>>> Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> >>>> Date: Fri Feb 10 09:16:27 2012 -0500 >>>> >>>> xen/pat: Disable PAT support for now. >>>> >>>> I just do not understand how this happens. From the trace it seems >>>> the fork >>>> fails when duplicating the VMAs (dup_mm calls mmput on failure). So >>>> maybe the >>>> warning is just related to this. So primarily the question is how on >>>> fork the _PAGE_PCD bit can become set? That and _PAGE_PWT are >>>> cleared from the supported >>>> mask by the patch, so somehow I would think nothing should be able >>>> to set it... >>>> But apparently not so. >>>> Not sure it is a big deal since I never saw this in normal operation >>>> and it >>>> seems to be ok when unapping before doing the fork. It is just plain >>>> odd. >>> >>> Jinsong mentioned that there is some oddity with the MTRR. Somehow the >>> ranges are swapped or not correct. Jinsong, could you shed some light >>> on what you have found so far? >>> >> >> Yes, Sander once also reported a similar weird warning when start mcelog daemon, as attached. >> >> Basically, it occurs when mcelog user daemon start, >> do_fork >> --> copy_process >> --> dup_mm >> --> dup_mmap >> --> copy_page_range >> --> track_pfn_copy >> --> reserve_pfn_rangeSo that makes it clearer as this will do reserve_memtype(...) --> pat_x_mtrr_type --> mtrr_type_lookup --> __mtrr_type_lookup And that can return -1/0xff in case of mtrr not being enabled/initialized. Which is not the case (given there are no messages for it in dmesg). This is not equal to MTRR_TYPE_WRBACK and thus becomes _PAGE_CACHE_UC_MINUS. It looks like the problem starts early in reserve_memtype: if (!pat_enabled) { /* This is identical to page table setting without PAT */ if (new_type) { if (req_type == _PAGE_CACHE_WC) *new_type = _PAGE_CACHE_UC_MINUS; else *new_type = req_type & _PAGE_CACHE_MASK; } return 0; } This would be what we want, but only clearing the PWT and PCD flags from the supported flags is not changing pat_enabled (which is 1 when PAT support is compiled into the kernel). Unfortunately the variable is local and since there are not any messages about PAT in dmesg I would say pat_init() is not called either. Which might be used to disable PAT support by clearing the CPU feature flag. Right now it seems the only work-around that message appearing is to user "nopat" on the kernel command line. -Stefan>> --> line 624: flags != want_flags >> It comes from different memory types of page table (_PAGE_CACHE_WB) and mtrr (_PAGE_CACHE_UC_MINUS). >> >> However, why it get different memory types from page table and mtrr is still unclear, reproducing the bug is difficult and unstable. >> >> Thanks, > > Ok, so this seems to take the same code paths. As for the test program, it fails > on duplicating some mmap on a fork. The test program does this all the time > (except the backtrace warning which is warn_once). > So you say, the UC- comes from the MTRR side... Hm, have to look at that. >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
On Tue, Feb 26, 2013 at 04:32:37PM +0100, Stefan Bader wrote:> On 25.02.2013 10:10, Stefan Bader wrote: > > On 25.02.2013 04:15, Liu, Jinsong wrote: > >> Konrad Rzeszutek Wilk wrote: > >>> On Fri, Feb 22, 2013 at 02:54:16PM +0100, Stefan Bader wrote: > >>>> Hi Konrad, > >>> > >>> Hey Stefan, > >>>> > >>>> here is another one from the hm-what? department: > >>> > >>> Heh - the really good-bug-hunting one. Lets also include Jinsong as > >>> he has been tracking a similar one with mcelog. > >>>> > >>>> Colin discovered that running the attached program with the fork > >>>> active (e.g. "./mmap-example -f 0x10000", the address can be that or > >>>> iomem) this triggers the following weird messages: > >>>> > >>>> [ 6824.453724] mmap-example:3481 map pfn expected mapping type > >>>> write-back for [mem 0x00010000-0x00010fff], got uncached-minus > >>>> [ 6824.453776] ------------[ cut here ]------------ > >>>> [ 6824.453796] WARNING: at > >>>> /build/buildd/linux-3.8.0/arch/x86/mm/pat.c:774 > >>>> untrack_pfn+0xb8/0xd0() ... [ 6824.453920] Pid: 3481, comm: > >>>> mmap-example Tainted: GF > >>>> 3.8.0-6-generic #13-Ubuntu > >>>> [ 6824.453926] Call Trace: > >>>> [ 6824.453944] [<ffffffff8105879f>] warn_slowpath_common+0x7f/0xc0 > >>>> [ 6824.453954] [<ffffffff810587fa>] warn_slowpath_null+0x1a/0x20 > >>>> [ 6824.453963] [<ffffffff8104bcc8>] untrack_pfn+0xb8/0xd0 > >>>> [ 6824.453975] [<ffffffff81156c1c>] unmap_single_vma+0xac/0x100 > >>>> [ 6824.453985] [<ffffffff81157459>] unmap_vmas+0x49/0x90 > >>>> [ 6824.453995] [<ffffffff8115f808>] exit_mmap+0x98/0x170 > >>>> [ 6824.454007] [<ffffffff810559a4>] mmput+0x64/0x100 > >>>> [ 6824.454017] [<ffffffff810560f5>] dup_mm+0x445/0x660 > >>>> [ 6824.454027] [<ffffffff81056d9f>] > >>>> copy_process.part.22+0xa5f/0x1510 [ 6824.454038] > >>>> [<ffffffff81057931>] do_fork+0x91/0x350 [ 6824.454048] > >>>> [<ffffffff81057c76>] sys_clone+0x16/0x20 [ 6824.454060] > >>>> [<ffffffff816ccbf9>] stub_clone+0x69/0x90 [ 6824.454069] > >>>> [<ffffffff816cc89d>] ? system_call_fastpath+0x1a/0x1f [ 6824.454076] > >>>> ---[ end trace 4918cdd0a4c9fea4 ]--- > >>>> > >>>> I found that this is related to your bandaid patch > >>>> > >>>> commit 8eaffa67b43e99ae581622c5133e20b0f48bcef1 > >>>> Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > >>>> Date: Fri Feb 10 09:16:27 2012 -0500 > >>>> > >>>> xen/pat: Disable PAT support for now. > >>>> > >>>> I just do not understand how this happens. From the trace it seems > >>>> the fork > >>>> fails when duplicating the VMAs (dup_mm calls mmput on failure). So > >>>> maybe the > >>>> warning is just related to this. So primarily the question is how on > >>>> fork the _PAGE_PCD bit can become set? That and _PAGE_PWT are > >>>> cleared from the supported > >>>> mask by the patch, so somehow I would think nothing should be able > >>>> to set it... > >>>> But apparently not so. > >>>> Not sure it is a big deal since I never saw this in normal operation > >>>> and it > >>>> seems to be ok when unapping before doing the fork. It is just plain > >>>> odd. > >>> > >>> Jinsong mentioned that there is some oddity with the MTRR. Somehow the > >>> ranges are swapped or not correct. Jinsong, could you shed some light > >>> on what you have found so far? > >>> > >> > >> Yes, Sander once also reported a similar weird warning when start mcelog daemon, as attached. > >> > >> Basically, it occurs when mcelog user daemon start, > >> do_fork > >> --> copy_process > >> --> dup_mm > >> --> dup_mmap > >> --> copy_page_range > >> --> track_pfn_copy > >> --> reserve_pfn_range > > So that makes it clearer as this will do > > reserve_memtype(...) > --> pat_x_mtrr_type > --> mtrr_type_lookup > --> __mtrr_type_lookup > > And that can return -1/0xff in case of mtrr not being enabled/initialized. Which > is not the case (given there are no messages for it in dmesg). This is not equal > to MTRR_TYPE_WRBACK and thus becomes _PAGE_CACHE_UC_MINUS. > > It looks like the problem starts early in reserve_memtype: > > if (!pat_enabled) { > /* This is identical to page table setting without PAT */ > if (new_type) { > if (req_type == _PAGE_CACHE_WC) > *new_type = _PAGE_CACHE_UC_MINUS; > else > *new_type = req_type & _PAGE_CACHE_MASK; > } > return 0; > } > > This would be what we want, but only clearing the PWT and PCD flags from the > supported flags is not changing pat_enabled (which is 1 when PAT support is > compiled into the kernel). Unfortunately the variable is local and since there > are not any messages about PAT in dmesg I would say pat_init() is not called > either. Which might be used to disable PAT support by clearing the CPU feature > flag.Hm, so: diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index 39928d1..9ac70c5 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -379,6 +379,7 @@ static void __init xen_init_cpuid_mask(void) cpuid_leaf1_edx_mask ~((1 << X86_FEATURE_MTRR) | /* disable MTRR */ + (1 << X86_FEATURE_PAT) | /* disable PAT */ (1 << X86_FEATURE_ACC)); /* thermal monitoring */ if (!xen_initial_domain()) should do it right? Let me check on the troublesome machine I saw it.
On 26.02.2013 18:03, Konrad Rzeszutek Wilk wrote:> On Tue, Feb 26, 2013 at 04:32:37PM +0100, Stefan Bader wrote: >> On 25.02.2013 10:10, Stefan Bader wrote: >>> On 25.02.2013 04:15, Liu, Jinsong wrote: >>>> Konrad Rzeszutek Wilk wrote: >>>>> On Fri, Feb 22, 2013 at 02:54:16PM +0100, Stefan Bader wrote: >>>>>> Hi Konrad, >>>>> >>>>> Hey Stefan, >>>>>> >>>>>> here is another one from the hm-what? department: >>>>> >>>>> Heh - the really good-bug-hunting one. Lets also include Jinsong as >>>>> he has been tracking a similar one with mcelog. >>>>>> >>>>>> Colin discovered that running the attached program with the fork >>>>>> active (e.g. "./mmap-example -f 0x10000", the address can be that or >>>>>> iomem) this triggers the following weird messages: >>>>>> >>>>>> [ 6824.453724] mmap-example:3481 map pfn expected mapping type >>>>>> write-back for [mem 0x00010000-0x00010fff], got uncached-minus >>>>>> [ 6824.453776] ------------[ cut here ]------------ >>>>>> [ 6824.453796] WARNING: at >>>>>> /build/buildd/linux-3.8.0/arch/x86/mm/pat.c:774 >>>>>> untrack_pfn+0xb8/0xd0() ... [ 6824.453920] Pid: 3481, comm: >>>>>> mmap-example Tainted: GF >>>>>> 3.8.0-6-generic #13-Ubuntu >>>>>> [ 6824.453926] Call Trace: >>>>>> [ 6824.453944] [<ffffffff8105879f>] warn_slowpath_common+0x7f/0xc0 >>>>>> [ 6824.453954] [<ffffffff810587fa>] warn_slowpath_null+0x1a/0x20 >>>>>> [ 6824.453963] [<ffffffff8104bcc8>] untrack_pfn+0xb8/0xd0 >>>>>> [ 6824.453975] [<ffffffff81156c1c>] unmap_single_vma+0xac/0x100 >>>>>> [ 6824.453985] [<ffffffff81157459>] unmap_vmas+0x49/0x90 >>>>>> [ 6824.453995] [<ffffffff8115f808>] exit_mmap+0x98/0x170 >>>>>> [ 6824.454007] [<ffffffff810559a4>] mmput+0x64/0x100 >>>>>> [ 6824.454017] [<ffffffff810560f5>] dup_mm+0x445/0x660 >>>>>> [ 6824.454027] [<ffffffff81056d9f>] >>>>>> copy_process.part.22+0xa5f/0x1510 [ 6824.454038] >>>>>> [<ffffffff81057931>] do_fork+0x91/0x350 [ 6824.454048] >>>>>> [<ffffffff81057c76>] sys_clone+0x16/0x20 [ 6824.454060] >>>>>> [<ffffffff816ccbf9>] stub_clone+0x69/0x90 [ 6824.454069] >>>>>> [<ffffffff816cc89d>] ? system_call_fastpath+0x1a/0x1f [ 6824.454076] >>>>>> ---[ end trace 4918cdd0a4c9fea4 ]--- >>>>>> >>>>>> I found that this is related to your bandaid patch >>>>>> >>>>>> commit 8eaffa67b43e99ae581622c5133e20b0f48bcef1 >>>>>> Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> >>>>>> Date: Fri Feb 10 09:16:27 2012 -0500 >>>>>> >>>>>> xen/pat: Disable PAT support for now. >>>>>> >>>>>> I just do not understand how this happens. From the trace it seems >>>>>> the fork >>>>>> fails when duplicating the VMAs (dup_mm calls mmput on failure). So >>>>>> maybe the >>>>>> warning is just related to this. So primarily the question is how on >>>>>> fork the _PAGE_PCD bit can become set? That and _PAGE_PWT are >>>>>> cleared from the supported >>>>>> mask by the patch, so somehow I would think nothing should be able >>>>>> to set it... >>>>>> But apparently not so. >>>>>> Not sure it is a big deal since I never saw this in normal operation >>>>>> and it >>>>>> seems to be ok when unapping before doing the fork. It is just plain >>>>>> odd. >>>>> >>>>> Jinsong mentioned that there is some oddity with the MTRR. Somehow the >>>>> ranges are swapped or not correct. Jinsong, could you shed some light >>>>> on what you have found so far? >>>>> >>>> >>>> Yes, Sander once also reported a similar weird warning when start mcelog daemon, as attached. >>>> >>>> Basically, it occurs when mcelog user daemon start, >>>> do_fork >>>> --> copy_process >>>> --> dup_mm >>>> --> dup_mmap >>>> --> copy_page_range >>>> --> track_pfn_copy >>>> --> reserve_pfn_range >> >> So that makes it clearer as this will do >> >> reserve_memtype(...) >> --> pat_x_mtrr_type >> --> mtrr_type_lookup >> --> __mtrr_type_lookup >> >> And that can return -1/0xff in case of mtrr not being enabled/initialized. Which >> is not the case (given there are no messages for it in dmesg). This is not equal >> to MTRR_TYPE_WRBACK and thus becomes _PAGE_CACHE_UC_MINUS. >> >> It looks like the problem starts early in reserve_memtype: >> >> if (!pat_enabled) { >> /* This is identical to page table setting without PAT */ >> if (new_type) { >> if (req_type == _PAGE_CACHE_WC) >> *new_type = _PAGE_CACHE_UC_MINUS; >> else >> *new_type = req_type & _PAGE_CACHE_MASK; >> } >> return 0; >> } >> >> This would be what we want, but only clearing the PWT and PCD flags from the >> supported flags is not changing pat_enabled (which is 1 when PAT support is >> compiled into the kernel). Unfortunately the variable is local and since there >> are not any messages about PAT in dmesg I would say pat_init() is not called >> either. Which might be used to disable PAT support by clearing the CPU feature >> flag. > > Hm, so: > > diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c > index 39928d1..9ac70c5 100644 > --- a/arch/x86/xen/enlighten.c > +++ b/arch/x86/xen/enlighten.c > @@ -379,6 +379,7 @@ static void __init xen_init_cpuid_mask(void) > > cpuid_leaf1_edx_mask > ~((1 << X86_FEATURE_MTRR) | /* disable MTRR */ > + (1 << X86_FEATURE_PAT) | /* disable PAT */ > (1 << X86_FEATURE_ACC)); /* thermal monitoring */ > > if (!xen_initial_domain()) > > > > should do it right? Let me check on the troublesome machine I saw > it. >I could try it as well but somehow my reading of pat_init() is that if that would have been called at all, there should be some message about PAT in dmesg. And normal dom0 boots do not show anything. It looks like pat_init only gets called from two places in mtrr code, and that probably is not done in dom0 either. So clearing the feature is one step, but I would think that also there needs to be a call to pat_init... -Stefan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
On Tue, Feb 26, 2013 at 06:11:42PM +0100, Stefan Bader wrote:> On 26.02.2013 18:03, Konrad Rzeszutek Wilk wrote: > > On Tue, Feb 26, 2013 at 04:32:37PM +0100, Stefan Bader wrote: > >> On 25.02.2013 10:10, Stefan Bader wrote: > >>> On 25.02.2013 04:15, Liu, Jinsong wrote: > >>>> Konrad Rzeszutek Wilk wrote: > >>>>> On Fri, Feb 22, 2013 at 02:54:16PM +0100, Stefan Bader wrote: > >>>>>> Hi Konrad, > >>>>> > >>>>> Hey Stefan, > >>>>>> > >>>>>> here is another one from the hm-what? department: > >>>>> > >>>>> Heh - the really good-bug-hunting one. Lets also include Jinsong as > >>>>> he has been tracking a similar one with mcelog. > >>>>>> > >>>>>> Colin discovered that running the attached program with the fork > >>>>>> active (e.g. "./mmap-example -f 0x10000", the address can be that or > >>>>>> iomem) this triggers the following weird messages: > >>>>>> > >>>>>> [ 6824.453724] mmap-example:3481 map pfn expected mapping type > >>>>>> write-back for [mem 0x00010000-0x00010fff], got uncached-minus > >>>>>> [ 6824.453776] ------------[ cut here ]------------ > >>>>>> [ 6824.453796] WARNING: at > >>>>>> /build/buildd/linux-3.8.0/arch/x86/mm/pat.c:774 > >>>>>> untrack_pfn+0xb8/0xd0() ... [ 6824.453920] Pid: 3481, comm: > >>>>>> mmap-example Tainted: GF > >>>>>> 3.8.0-6-generic #13-Ubuntu > >>>>>> [ 6824.453926] Call Trace: > >>>>>> [ 6824.453944] [<ffffffff8105879f>] warn_slowpath_common+0x7f/0xc0 > >>>>>> [ 6824.453954] [<ffffffff810587fa>] warn_slowpath_null+0x1a/0x20 > >>>>>> [ 6824.453963] [<ffffffff8104bcc8>] untrack_pfn+0xb8/0xd0 > >>>>>> [ 6824.453975] [<ffffffff81156c1c>] unmap_single_vma+0xac/0x100 > >>>>>> [ 6824.453985] [<ffffffff81157459>] unmap_vmas+0x49/0x90 > >>>>>> [ 6824.453995] [<ffffffff8115f808>] exit_mmap+0x98/0x170 > >>>>>> [ 6824.454007] [<ffffffff810559a4>] mmput+0x64/0x100 > >>>>>> [ 6824.454017] [<ffffffff810560f5>] dup_mm+0x445/0x660 > >>>>>> [ 6824.454027] [<ffffffff81056d9f>] > >>>>>> copy_process.part.22+0xa5f/0x1510 [ 6824.454038] > >>>>>> [<ffffffff81057931>] do_fork+0x91/0x350 [ 6824.454048] > >>>>>> [<ffffffff81057c76>] sys_clone+0x16/0x20 [ 6824.454060] > >>>>>> [<ffffffff816ccbf9>] stub_clone+0x69/0x90 [ 6824.454069] > >>>>>> [<ffffffff816cc89d>] ? system_call_fastpath+0x1a/0x1f [ 6824.454076] > >>>>>> ---[ end trace 4918cdd0a4c9fea4 ]--- > >>>>>> > >>>>>> I found that this is related to your bandaid patch > >>>>>> > >>>>>> commit 8eaffa67b43e99ae581622c5133e20b0f48bcef1 > >>>>>> Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > >>>>>> Date: Fri Feb 10 09:16:27 2012 -0500 > >>>>>> > >>>>>> xen/pat: Disable PAT support for now. > >>>>>> > >>>>>> I just do not understand how this happens. From the trace it seems > >>>>>> the fork > >>>>>> fails when duplicating the VMAs (dup_mm calls mmput on failure). So > >>>>>> maybe the > >>>>>> warning is just related to this. So primarily the question is how on > >>>>>> fork the _PAGE_PCD bit can become set? That and _PAGE_PWT are > >>>>>> cleared from the supported > >>>>>> mask by the patch, so somehow I would think nothing should be able > >>>>>> to set it... > >>>>>> But apparently not so. > >>>>>> Not sure it is a big deal since I never saw this in normal operation > >>>>>> and it > >>>>>> seems to be ok when unapping before doing the fork. It is just plain > >>>>>> odd. > >>>>> > >>>>> Jinsong mentioned that there is some oddity with the MTRR. Somehow the > >>>>> ranges are swapped or not correct. Jinsong, could you shed some light > >>>>> on what you have found so far? > >>>>> > >>>> > >>>> Yes, Sander once also reported a similar weird warning when start mcelog daemon, as attached. > >>>> > >>>> Basically, it occurs when mcelog user daemon start, > >>>> do_fork > >>>> --> copy_process > >>>> --> dup_mm > >>>> --> dup_mmap > >>>> --> copy_page_range > >>>> --> track_pfn_copy > >>>> --> reserve_pfn_range > >> > >> So that makes it clearer as this will do > >> > >> reserve_memtype(...) > >> --> pat_x_mtrr_type > >> --> mtrr_type_lookup > >> --> __mtrr_type_lookup > >> > >> And that can return -1/0xff in case of mtrr not being enabled/initialized. Which > >> is not the case (given there are no messages for it in dmesg). This is not equal > >> to MTRR_TYPE_WRBACK and thus becomes _PAGE_CACHE_UC_MINUS. > >> > >> It looks like the problem starts early in reserve_memtype: > >> > >> if (!pat_enabled) { > >> /* This is identical to page table setting without PAT */ > >> if (new_type) { > >> if (req_type == _PAGE_CACHE_WC) > >> *new_type = _PAGE_CACHE_UC_MINUS; > >> else > >> *new_type = req_type & _PAGE_CACHE_MASK; > >> } > >> return 0; > >> } > >> > >> This would be what we want, but only clearing the PWT and PCD flags from the > >> supported flags is not changing pat_enabled (which is 1 when PAT support is > >> compiled into the kernel). Unfortunately the variable is local and since there > >> are not any messages about PAT in dmesg I would say pat_init() is not called > >> either. Which might be used to disable PAT support by clearing the CPU feature > >> flag. > > > > Hm, so: > > > > diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c > > index 39928d1..9ac70c5 100644 > > --- a/arch/x86/xen/enlighten.c > > +++ b/arch/x86/xen/enlighten.c > > @@ -379,6 +379,7 @@ static void __init xen_init_cpuid_mask(void) > > > > cpuid_leaf1_edx_mask > > ~((1 << X86_FEATURE_MTRR) | /* disable MTRR */ > > + (1 << X86_FEATURE_PAT) | /* disable PAT */ > > (1 << X86_FEATURE_ACC)); /* thermal monitoring */ > > > > if (!xen_initial_domain()) > > > > > > > > should do it right? Let me check on the troublesome machine I saw > > it. > > > I could try it as well but somehow my reading of pat_init() is that if that > would have been called at all, there should be some message about PAT in dmesg. > And normal dom0 boots do not show anything. > > It looks like pat_init only gets called from two places in mtrr code, and that > probably is not done in dom0 either. So clearing the feature is one step, but I > would think that also there needs to be a call to pat_init...So how about this super-complex patch: diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index 39928d1..c8e1c7b 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -67,6 +67,7 @@ #include <asm/hypervisor.h> #include <asm/mwait.h> #include <asm/pci_x86.h> +#include <asm/pat.h> #ifdef CONFIG_ACPI #include <linux/acpi.h> @@ -1417,7 +1418,14 @@ asmlinkage void __init xen_start_kernel(void) */ acpi_numa = -1; #endif - +#ifdef CONFIG_X86_PAT + /* + * For right now disable the PAT. We should remove this once + * git commit 8eaffa67b43e99ae581622c5133e20b0f48bcef1 + * (xen/pat: Disable PAT support for now) is reverted. + */ + pat_enabled = 0; +#endif /* Don''t do the full vcpu_info placement stuff until we have a possible map and a non-dummy shared_info. */ per_cpu(xen_vcpu, 0) = &HYPERVISOR_shared_info->vcpu_info[0]; trying it out now.> > -Stefan >
On Tue, Feb 26, 2013 at 12:43:13PM -0500, Konrad Rzeszutek Wilk wrote:> On Tue, Feb 26, 2013 at 06:11:42PM +0100, Stefan Bader wrote: > > On 26.02.2013 18:03, Konrad Rzeszutek Wilk wrote: > > > On Tue, Feb 26, 2013 at 04:32:37PM +0100, Stefan Bader wrote: > > >> On 25.02.2013 10:10, Stefan Bader wrote: > > >>> On 25.02.2013 04:15, Liu, Jinsong wrote: > > >>>> Konrad Rzeszutek Wilk wrote: > > >>>>> On Fri, Feb 22, 2013 at 02:54:16PM +0100, Stefan Bader wrote: > > >>>>>> Hi Konrad, > > >>>>> > > >>>>> Hey Stefan, > > >>>>>> > > >>>>>> here is another one from the hm-what? department: > > >>>>> > > >>>>> Heh - the really good-bug-hunting one. Lets also include Jinsong as > > >>>>> he has been tracking a similar one with mcelog. > > >>>>>> > > >>>>>> Colin discovered that running the attached program with the fork > > >>>>>> active (e.g. "./mmap-example -f 0x10000", the address can be that or > > >>>>>> iomem) this triggers the following weird messages: > > >>>>>> > > >>>>>> [ 6824.453724] mmap-example:3481 map pfn expected mapping type > > >>>>>> write-back for [mem 0x00010000-0x00010fff], got uncached-minus > > >>>>>> [ 6824.453776] ------------[ cut here ]------------ > > >>>>>> [ 6824.453796] WARNING: at > > >>>>>> /build/buildd/linux-3.8.0/arch/x86/mm/pat.c:774 > > >>>>>> untrack_pfn+0xb8/0xd0() ... [ 6824.453920] Pid: 3481, comm: > > >>>>>> mmap-example Tainted: GF > > >>>>>> 3.8.0-6-generic #13-Ubuntu > > >>>>>> [ 6824.453926] Call Trace: > > >>>>>> [ 6824.453944] [<ffffffff8105879f>] warn_slowpath_common+0x7f/0xc0 > > >>>>>> [ 6824.453954] [<ffffffff810587fa>] warn_slowpath_null+0x1a/0x20 > > >>>>>> [ 6824.453963] [<ffffffff8104bcc8>] untrack_pfn+0xb8/0xd0 > > >>>>>> [ 6824.453975] [<ffffffff81156c1c>] unmap_single_vma+0xac/0x100 > > >>>>>> [ 6824.453985] [<ffffffff81157459>] unmap_vmas+0x49/0x90 > > >>>>>> [ 6824.453995] [<ffffffff8115f808>] exit_mmap+0x98/0x170 > > >>>>>> [ 6824.454007] [<ffffffff810559a4>] mmput+0x64/0x100 > > >>>>>> [ 6824.454017] [<ffffffff810560f5>] dup_mm+0x445/0x660 > > >>>>>> [ 6824.454027] [<ffffffff81056d9f>] > > >>>>>> copy_process.part.22+0xa5f/0x1510 [ 6824.454038] > > >>>>>> [<ffffffff81057931>] do_fork+0x91/0x350 [ 6824.454048] > > >>>>>> [<ffffffff81057c76>] sys_clone+0x16/0x20 [ 6824.454060] > > >>>>>> [<ffffffff816ccbf9>] stub_clone+0x69/0x90 [ 6824.454069] > > >>>>>> [<ffffffff816cc89d>] ? system_call_fastpath+0x1a/0x1f [ 6824.454076] > > >>>>>> ---[ end trace 4918cdd0a4c9fea4 ]--- > > >>>>>> > > >>>>>> I found that this is related to your bandaid patch > > >>>>>> > > >>>>>> commit 8eaffa67b43e99ae581622c5133e20b0f48bcef1 > > >>>>>> Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > >>>>>> Date: Fri Feb 10 09:16:27 2012 -0500 > > >>>>>> > > >>>>>> xen/pat: Disable PAT support for now. > > >>>>>> > > >>>>>> I just do not understand how this happens. From the trace it seems > > >>>>>> the fork > > >>>>>> fails when duplicating the VMAs (dup_mm calls mmput on failure). So > > >>>>>> maybe the > > >>>>>> warning is just related to this. So primarily the question is how on > > >>>>>> fork the _PAGE_PCD bit can become set? That and _PAGE_PWT are > > >>>>>> cleared from the supported > > >>>>>> mask by the patch, so somehow I would think nothing should be able > > >>>>>> to set it... > > >>>>>> But apparently not so. > > >>>>>> Not sure it is a big deal since I never saw this in normal operation > > >>>>>> and it > > >>>>>> seems to be ok when unapping before doing the fork. It is just plain > > >>>>>> odd. > > >>>>> > > >>>>> Jinsong mentioned that there is some oddity with the MTRR. Somehow the > > >>>>> ranges are swapped or not correct. Jinsong, could you shed some light > > >>>>> on what you have found so far? > > >>>>> > > >>>> > > >>>> Yes, Sander once also reported a similar weird warning when start mcelog daemon, as attached. > > >>>> > > >>>> Basically, it occurs when mcelog user daemon start, > > >>>> do_fork > > >>>> --> copy_process > > >>>> --> dup_mm > > >>>> --> dup_mmap > > >>>> --> copy_page_range > > >>>> --> track_pfn_copy > > >>>> --> reserve_pfn_range > > >> > > >> So that makes it clearer as this will do > > >> > > >> reserve_memtype(...) > > >> --> pat_x_mtrr_type > > >> --> mtrr_type_lookup > > >> --> __mtrr_type_lookup > > >> > > >> And that can return -1/0xff in case of mtrr not being enabled/initialized. Which > > >> is not the case (given there are no messages for it in dmesg). This is not equal > > >> to MTRR_TYPE_WRBACK and thus becomes _PAGE_CACHE_UC_MINUS. > > >> > > >> It looks like the problem starts early in reserve_memtype: > > >> > > >> if (!pat_enabled) { > > >> /* This is identical to page table setting without PAT */ > > >> if (new_type) { > > >> if (req_type == _PAGE_CACHE_WC) > > >> *new_type = _PAGE_CACHE_UC_MINUS; > > >> else > > >> *new_type = req_type & _PAGE_CACHE_MASK; > > >> } > > >> return 0; > > >> } > > >> > > >> This would be what we want, but only clearing the PWT and PCD flags from the > > >> supported flags is not changing pat_enabled (which is 1 when PAT support is > > >> compiled into the kernel). Unfortunately the variable is local and since there > > >> are not any messages about PAT in dmesg I would say pat_init() is not called > > >> either. Which might be used to disable PAT support by clearing the CPU feature > > >> flag. > > > > > > Hm, so: > > > > > > diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c > > > index 39928d1..9ac70c5 100644 > > > --- a/arch/x86/xen/enlighten.c > > > +++ b/arch/x86/xen/enlighten.c > > > @@ -379,6 +379,7 @@ static void __init xen_init_cpuid_mask(void) > > > > > > cpuid_leaf1_edx_mask > > > ~((1 << X86_FEATURE_MTRR) | /* disable MTRR */ > > > + (1 << X86_FEATURE_PAT) | /* disable PAT */ > > > (1 << X86_FEATURE_ACC)); /* thermal monitoring */ > > > > > > if (!xen_initial_domain()) > > > > > > > > > > > > should do it right? Let me check on the troublesome machine I saw > > > it. > > > > > I could try it as well but somehow my reading of pat_init() is that if that > > would have been called at all, there should be some message about PAT in dmesg. > > And normal dom0 boots do not show anything. > > > > It looks like pat_init only gets called from two places in mtrr code, and that > > probably is not done in dom0 either. So clearing the feature is one step, but I > > would think that also there needs to be a call to pat_init... > > So how about this super-complex patch: > > diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c > index 39928d1..c8e1c7b 100644 > --- a/arch/x86/xen/enlighten.c > +++ b/arch/x86/xen/enlighten.c > @@ -67,6 +67,7 @@ > #include <asm/hypervisor.h> > #include <asm/mwait.h> > #include <asm/pci_x86.h> > +#include <asm/pat.h> > > #ifdef CONFIG_ACPI > #include <linux/acpi.h> > @@ -1417,7 +1418,14 @@ asmlinkage void __init xen_start_kernel(void) > */ > acpi_numa = -1; > #endif > - > +#ifdef CONFIG_X86_PAT > + /* > + * For right now disable the PAT. We should remove this once > + * git commit 8eaffa67b43e99ae581622c5133e20b0f48bcef1 > + * (xen/pat: Disable PAT support for now) is reverted. > + */ > + pat_enabled = 0; > +#endif > /* Don''t do the full vcpu_info placement stuff until we have a > possible map and a non-dummy shared_info. */ > per_cpu(xen_vcpu, 0) = &HYPERVISOR_shared_info->vcpu_info[0]; > > trying it out now.Seems to work for me with the mcelog that kept on failing.
On 26.02.2013 18:53, Konrad Rzeszutek Wilk wrote:> On Tue, Feb 26, 2013 at 12:43:13PM -0500, Konrad Rzeszutek Wilk wrote: >> On Tue, Feb 26, 2013 at 06:11:42PM +0100, Stefan Bader wrote: >>> On 26.02.2013 18:03, Konrad Rzeszutek Wilk wrote: >>>> On Tue, Feb 26, 2013 at 04:32:37PM +0100, Stefan Bader wrote: >>>>> On 25.02.2013 10:10, Stefan Bader wrote: >>>>>> On 25.02.2013 04:15, Liu, Jinsong wrote: >>>>>>> Konrad Rzeszutek Wilk wrote: >>>>>>>> On Fri, Feb 22, 2013 at 02:54:16PM +0100, Stefan Bader wrote: >>>>>>>>> Hi Konrad, >>>>>>>> >>>>>>>> Hey Stefan, >>>>>>>>> >>>>>>>>> here is another one from the hm-what? department: >>>>>>>> >>>>>>>> Heh - the really good-bug-hunting one. Lets also include Jinsong as >>>>>>>> he has been tracking a similar one with mcelog. >>>>>>>>> >>>>>>>>> Colin discovered that running the attached program with the fork >>>>>>>>> active (e.g. "./mmap-example -f 0x10000", the address can be that or >>>>>>>>> iomem) this triggers the following weird messages: >>>>>>>>> >>>>>>>>> [ 6824.453724] mmap-example:3481 map pfn expected mapping type >>>>>>>>> write-back for [mem 0x00010000-0x00010fff], got uncached-minus >>>>>>>>> [ 6824.453776] ------------[ cut here ]------------ >>>>>>>>> [ 6824.453796] WARNING: at >>>>>>>>> /build/buildd/linux-3.8.0/arch/x86/mm/pat.c:774 >>>>>>>>> untrack_pfn+0xb8/0xd0() ... [ 6824.453920] Pid: 3481, comm: >>>>>>>>> mmap-example Tainted: GF >>>>>>>>> 3.8.0-6-generic #13-Ubuntu >>>>>>>>> [ 6824.453926] Call Trace: >>>>>>>>> [ 6824.453944] [<ffffffff8105879f>] warn_slowpath_common+0x7f/0xc0 >>>>>>>>> [ 6824.453954] [<ffffffff810587fa>] warn_slowpath_null+0x1a/0x20 >>>>>>>>> [ 6824.453963] [<ffffffff8104bcc8>] untrack_pfn+0xb8/0xd0 >>>>>>>>> [ 6824.453975] [<ffffffff81156c1c>] unmap_single_vma+0xac/0x100 >>>>>>>>> [ 6824.453985] [<ffffffff81157459>] unmap_vmas+0x49/0x90 >>>>>>>>> [ 6824.453995] [<ffffffff8115f808>] exit_mmap+0x98/0x170 >>>>>>>>> [ 6824.454007] [<ffffffff810559a4>] mmput+0x64/0x100 >>>>>>>>> [ 6824.454017] [<ffffffff810560f5>] dup_mm+0x445/0x660 >>>>>>>>> [ 6824.454027] [<ffffffff81056d9f>] >>>>>>>>> copy_process.part.22+0xa5f/0x1510 [ 6824.454038] >>>>>>>>> [<ffffffff81057931>] do_fork+0x91/0x350 [ 6824.454048] >>>>>>>>> [<ffffffff81057c76>] sys_clone+0x16/0x20 [ 6824.454060] >>>>>>>>> [<ffffffff816ccbf9>] stub_clone+0x69/0x90 [ 6824.454069] >>>>>>>>> [<ffffffff816cc89d>] ? system_call_fastpath+0x1a/0x1f [ 6824.454076] >>>>>>>>> ---[ end trace 4918cdd0a4c9fea4 ]--- >>>>>>>>> >>>>>>>>> I found that this is related to your bandaid patch >>>>>>>>> >>>>>>>>> commit 8eaffa67b43e99ae581622c5133e20b0f48bcef1 >>>>>>>>> Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> >>>>>>>>> Date: Fri Feb 10 09:16:27 2012 -0500 >>>>>>>>> >>>>>>>>> xen/pat: Disable PAT support for now. >>>>>>>>> >>>>>>>>> I just do not understand how this happens. From the trace it seems >>>>>>>>> the fork >>>>>>>>> fails when duplicating the VMAs (dup_mm calls mmput on failure). So >>>>>>>>> maybe the >>>>>>>>> warning is just related to this. So primarily the question is how on >>>>>>>>> fork the _PAGE_PCD bit can become set? That and _PAGE_PWT are >>>>>>>>> cleared from the supported >>>>>>>>> mask by the patch, so somehow I would think nothing should be able >>>>>>>>> to set it... >>>>>>>>> But apparently not so. >>>>>>>>> Not sure it is a big deal since I never saw this in normal operation >>>>>>>>> and it >>>>>>>>> seems to be ok when unapping before doing the fork. It is just plain >>>>>>>>> odd. >>>>>>>> >>>>>>>> Jinsong mentioned that there is some oddity with the MTRR. Somehow the >>>>>>>> ranges are swapped or not correct. Jinsong, could you shed some light >>>>>>>> on what you have found so far? >>>>>>>> >>>>>>> >>>>>>> Yes, Sander once also reported a similar weird warning when start mcelog daemon, as attached. >>>>>>> >>>>>>> Basically, it occurs when mcelog user daemon start, >>>>>>> do_fork >>>>>>> --> copy_process >>>>>>> --> dup_mm >>>>>>> --> dup_mmap >>>>>>> --> copy_page_range >>>>>>> --> track_pfn_copy >>>>>>> --> reserve_pfn_range >>>>> >>>>> So that makes it clearer as this will do >>>>> >>>>> reserve_memtype(...) >>>>> --> pat_x_mtrr_type >>>>> --> mtrr_type_lookup >>>>> --> __mtrr_type_lookup >>>>> >>>>> And that can return -1/0xff in case of mtrr not being enabled/initialized. Which >>>>> is not the case (given there are no messages for it in dmesg). This is not equal >>>>> to MTRR_TYPE_WRBACK and thus becomes _PAGE_CACHE_UC_MINUS. >>>>> >>>>> It looks like the problem starts early in reserve_memtype: >>>>> >>>>> if (!pat_enabled) { >>>>> /* This is identical to page table setting without PAT */ >>>>> if (new_type) { >>>>> if (req_type == _PAGE_CACHE_WC) >>>>> *new_type = _PAGE_CACHE_UC_MINUS; >>>>> else >>>>> *new_type = req_type & _PAGE_CACHE_MASK; >>>>> } >>>>> return 0; >>>>> } >>>>> >>>>> This would be what we want, but only clearing the PWT and PCD flags from the >>>>> supported flags is not changing pat_enabled (which is 1 when PAT support is >>>>> compiled into the kernel). Unfortunately the variable is local and since there >>>>> are not any messages about PAT in dmesg I would say pat_init() is not called >>>>> either. Which might be used to disable PAT support by clearing the CPU feature >>>>> flag. >>>> >>>> Hm, so: >>>> >>>> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c >>>> index 39928d1..9ac70c5 100644 >>>> --- a/arch/x86/xen/enlighten.c >>>> +++ b/arch/x86/xen/enlighten.c >>>> @@ -379,6 +379,7 @@ static void __init xen_init_cpuid_mask(void) >>>> >>>> cpuid_leaf1_edx_mask >>>> ~((1 << X86_FEATURE_MTRR) | /* disable MTRR */ >>>> + (1 << X86_FEATURE_PAT) | /* disable PAT */ >>>> (1 << X86_FEATURE_ACC)); /* thermal monitoring */ >>>> >>>> if (!xen_initial_domain()) >>>> >>>> >>>> >>>> should do it right? Let me check on the troublesome machine I saw >>>> it. >>>> >>> I could try it as well but somehow my reading of pat_init() is that if that >>> would have been called at all, there should be some message about PAT in dmesg. >>> And normal dom0 boots do not show anything. >>> >>> It looks like pat_init only gets called from two places in mtrr code, and that >>> probably is not done in dom0 either. So clearing the feature is one step, but I >>> would think that also there needs to be a call to pat_init... >> >> So how about this super-complex patch: >> >> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c >> index 39928d1..c8e1c7b 100644 >> --- a/arch/x86/xen/enlighten.c >> +++ b/arch/x86/xen/enlighten.c >> @@ -67,6 +67,7 @@ >> #include <asm/hypervisor.h> >> #include <asm/mwait.h> >> #include <asm/pci_x86.h> >> +#include <asm/pat.h> >> >> #ifdef CONFIG_ACPI >> #include <linux/acpi.h> >> @@ -1417,7 +1418,14 @@ asmlinkage void __init xen_start_kernel(void) >> */ >> acpi_numa = -1; >> #endif >> - >> +#ifdef CONFIG_X86_PAT >> + /* >> + * For right now disable the PAT. We should remove this once >> + * git commit 8eaffa67b43e99ae581622c5133e20b0f48bcef1 >> + * (xen/pat: Disable PAT support for now) is reverted. >> + */ >> + pat_enabled = 0; >> +#endif >> /* Don''t do the full vcpu_info placement stuff until we have a >> possible map and a non-dummy shared_info. */ >> per_cpu(xen_vcpu, 0) = &HYPERVISOR_shared_info->vcpu_info[0]; >> >> trying it out now. > > Seems to work for me with the mcelog that kept on failing. >Yeah, would need to compile it but that looks to be a good solution. Somewhat missed the fact that pat_enabled is actually exported in pat.h. Sometimes the obvious is so blanked out. :-P -Stefan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel