I know that xen-unstable is on a feature freeze, and this is not strictly a bugfix (yet; see below), but as it is safe and designed to help clarity in the case of a crash, so I request that it be considered for inclusion. I have constructed an artificial case where the information reported in 1 per-cpu crash note was stale by using low_crashinfo mode, crashing Xen, allowing it to reboot, offlining a CPU then re-crashing Xen. This leaves stale register state written into the offlined CPU crash information. In this case, the information was stale but correct, due to the predictable nature of the Xen crash path from the ''C'' debug key, but there is no guarantee that in the case of a real crash, the same will still be true. -- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
>>> On 04.05.12 at 13:42, Andrew Cooper <andrew.cooper3@citrix.com> wrote: > I know that xen-unstable is on a feature freeze, and this is not > strictly a bugfix (yet; see below), but as it is safe and designed to > help clarity in the case of a crash, so I request that it be considered > for inclusion. > > I have constructed an artificial case where the information reported in > 1 per-cpu crash note was stale by using low_crashinfo mode, crashing > Xen, allowing it to reboot, offlining a CPU then re-crashing Xen. This > leaves stale register state written into the offlined CPU crash > information. In this case, the information was stale but correct, due > to the predictable nature of the Xen crash path from the ''C'' debug key, > but there is no guarantee that in the case of a real crash, the same > will still be true.Apart from the missing blanks in the for() statement (which could as well be a simple memset() afaict), Acked-by: Jan Beulich <jbeulich@suse.com> I personally would think that this can go in as a bug fix. Jan
On 04/05/12 13:51, Jan Beulich wrote:>>>> On 04.05.12 at 13:42, Andrew Cooper <andrew.cooper3@citrix.com> wrote: >> I know that xen-unstable is on a feature freeze, and this is not >> strictly a bugfix (yet; see below), but as it is safe and designed to >> help clarity in the case of a crash, so I request that it be considered >> for inclusion. >> >> I have constructed an artificial case where the information reported in >> 1 per-cpu crash note was stale by using low_crashinfo mode, crashing >> Xen, allowing it to reboot, offlining a CPU then re-crashing Xen. This >> leaves stale register state written into the offlined CPU crash >> information. In this case, the information was stale but correct, due >> to the predictable nature of the Xen crash path from the ''C'' debug key, >> but there is no guarantee that in the case of a real crash, the same >> will still be true. > Apart from the missing blanks in the for() statement (which could as > well be a simple memset() afaict), > Acked-by: Jan Beulich <jbeulich@suse.com> > > I personally would think that this can go in as a bug fix. > > JanHow would you format the for loop differently? (Not that I mind - just so I know for next time) As for clear_page vs memset - clear_page is faster, and liable to be conditionally tuned more in the future. -- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com
>>> On 04.05.12 at 17:19, Andrew Cooper <andrew.cooper3@citrix.com> wrote: > On 04/05/12 13:51, Jan Beulich wrote: >>>>> On 04.05.12 at 13:42, Andrew Cooper <andrew.cooper3@citrix.com> wrote: >>> I know that xen-unstable is on a feature freeze, and this is not >>> strictly a bugfix (yet; see below), but as it is safe and designed to >>> help clarity in the case of a crash, so I request that it be considered >>> for inclusion. >>> >>> I have constructed an artificial case where the information reported in >>> 1 per-cpu crash note was stale by using low_crashinfo mode, crashing >>> Xen, allowing it to reboot, offlining a CPU then re-crashing Xen. This >>> leaves stale register state written into the offlined CPU crash >>> information. In this case, the information was stale but correct, due >>> to the predictable nature of the Xen crash path from the ''C'' debug key, >>> but there is no guarantee that in the case of a real crash, the same >>> will still be true. >> Apart from the missing blanks in the for() statement (which could as >> well be a simple memset() afaict), >> Acked-by: Jan Beulich <jbeulich@suse.com> >> >> I personally would think that this can go in as a bug fix. >> >> Jan > > How would you format the for loop differently? (Not that I mind - just > so I know for next time)for ( i = 0; i < (crash_heap_size >> PAGE_SHIFT); ++i )> As for clear_page vs memset - clear_page is faster, and liable to be > conditionally tuned more in the future.Certainly, but does this matter here? Jan
On 04/05/12 16:25, Jan Beulich wrote:>>>> On 04.05.12 at 17:19, Andrew Cooper <andrew.cooper3@citrix.com> wrote: >> On 04/05/12 13:51, Jan Beulich wrote: >>>>>> On 04.05.12 at 13:42, Andrew Cooper <andrew.cooper3@citrix.com> wrote: >>>> I know that xen-unstable is on a feature freeze, and this is not >>>> strictly a bugfix (yet; see below), but as it is safe and designed to >>>> help clarity in the case of a crash, so I request that it be considered >>>> for inclusion. >>>> >>>> I have constructed an artificial case where the information reported in >>>> 1 per-cpu crash note was stale by using low_crashinfo mode, crashing >>>> Xen, allowing it to reboot, offlining a CPU then re-crashing Xen. This >>>> leaves stale register state written into the offlined CPU crash >>>> information. In this case, the information was stale but correct, due >>>> to the predictable nature of the Xen crash path from the ''C'' debug key, >>>> but there is no guarantee that in the case of a real crash, the same >>>> will still be true. >>> Apart from the missing blanks in the for() statement (which could as >>> well be a simple memset() afaict), >>> Acked-by: Jan Beulich <jbeulich@suse.com> >>> >>> I personally would think that this can go in as a bug fix. >>> >>> Jan >> How would you format the for loop differently? (Not that I mind - just >> so I know for next time) > for ( i = 0; i < (crash_heap_size >> PAGE_SHIFT); ++i )Ok - refreshed the patch as such> >> As for clear_page vs memset - clear_page is faster, and liable to be >> conditionally tuned more in the future. > Certainly, but does this matter here? > > Jan >crash_heap_size scales linearly with the number of PCPUs on the system, so very large boxes might start noticing a difference in boot speed. (Probably not in the grand scheme of things, but as we are explicitly allocating pages, it makes sense to clear them as pages) -- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
On 04/05/12 16:39, Andrew Cooper wrote:> On 04/05/12 16:25, Jan Beulich wrote: >>>>> On 04.05.12 at 17:19, Andrew Cooper <andrew.cooper3@citrix.com> wrote: >>> On 04/05/12 13:51, Jan Beulich wrote: >>>>>>> On 04.05.12 at 13:42, Andrew Cooper <andrew.cooper3@citrix.com> wrote: >>>>> I know that xen-unstable is on a feature freeze, and this is not >>>>> strictly a bugfix (yet; see below), but as it is safe and designed to >>>>> help clarity in the case of a crash, so I request that it be considered >>>>> for inclusion. >>>>> >>>>> I have constructed an artificial case where the information reported in >>>>> 1 per-cpu crash note was stale by using low_crashinfo mode, crashing >>>>> Xen, allowing it to reboot, offlining a CPU then re-crashing Xen. This >>>>> leaves stale register state written into the offlined CPU crash >>>>> information. In this case, the information was stale but correct, due >>>>> to the predictable nature of the Xen crash path from the ''C'' debug key, >>>>> but there is no guarantee that in the case of a real crash, the same >>>>> will still be true. >>>> Apart from the missing blanks in the for() statement (which could as >>>> well be a simple memset() afaict), >>>> Acked-by: Jan Beulich <jbeulich@suse.com> >>>> >>>> I personally would think that this can go in as a bug fix. >>>> >>>> Jan >>> How would you format the for loop differently? (Not that I mind - just >>> so I know for next time) >> for ( i = 0; i < (crash_heap_size >> PAGE_SHIFT); ++i ) > > Ok - refreshed the patch as suchSee also CODING_STYLE in the top level of the Xen source tree.>>> As for clear_page vs memset - clear_page is faster, and liable to be >>> conditionally tuned more in the future. >> Certainly, but does this matter here? >> >> Jan >> > > crash_heap_size scales linearly with the number of PCPUs on the system, > so very large boxes might start noticing a difference in boot speed. > (Probably not in the grand scheme of things, but as we are explicitly > allocating pages, it makes sense to clear them as pages)If this is important then there should be a alloc_zeroed_xenheap_pages() function for this and not an open-coded loop. I''d just use a memset() here. David