This is on a 4-CPU AMD box with 64G RAM. CPU 0 receives an external NMI and calls kdb_nmi() from do_nmi():

    asmlinkage void do_nmi(struct cpu_user_regs *regs)
    {
        unsigned int cpu = smp_processor_id();
        unsigned char reason;

        ++nmi_count(cpu);

        if ( nmi_callback(regs, cpu) )
            return;

        if ( nmi_watchdog )
            nmi_watchdog_tick(regs);

    #ifdef XEN_KDB_CONFIG
        kdb_nmi(TRAP_nmi, regs);
    #endif
        ....
    }

    kdb_nmi(..):
    {
        watchdog_disable();
        set_nmi_callback(kdb_nmi_receive);
        smp_send_nmi_allbutself();
        ......
    }

However, in do_nmi() on the receiving CPUs, nmi_callback still points to dummy. What's interesting is that if I put two print lines back to back, with nothing in between, right at the beginning, then the first prints dummy but the second prints kdb_nmi_receive. I'm at a complete loss. Does NMI change the cache protocol? I've been looking through the Intel/AMD manuals, but found nothing....

Thanks,
Mukesh

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Mukesh Rathor <mukesh.rathor@oracle.com> writes:

.......
> However, in do_nmi(), nmi_callback still points to dummy (receiving cpus).
> What's interesting is, if I put two print lines back to back with nothing
> in between right at the beginning, then the first prints dummy but the
> second prints kdb_nmi_receive. I'm at a complete loss. Does NMI change
> cache protocol? I've been looking thru Intel/AMD manuals, but nothing....

I should clarify that I made nmi_callback volatile before adding the prints. Moreover, I also put wbinvd() after set_nmi_callback(), and the behaviour is still the same.

Thanks,
Mukesh
On 15/8/08 03:20, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:

> However, in do_nmi(), nmi_callback still points to dummy (receiving cpus).
> What's interesting is, if I put two print lines back to back with nothing
> in between right at the beginning, then the first prints dummy but the
> second prints kdb_nmi_receive. I'm at a complete loss. Does NMI change
> cache protocol? I've been looking thru Intel/AMD manuals, but nothing....

What you describe is indeed impossible. My guess is that the NMI executing on the other CPUs is not the one triggered by smp_send_nmi_allbutself() immediately after set_nmi_callback(). For example, it could be a watchdog NMI or something like that.

smp_send_nmi_allbutself() is not safe to call from within NMI context. send_IPI_mask() is not atomic, and it would be possible for an NMI handler to interrupt it, reenter it, and corrupt the IPI state being set up by the context that got interrupted. You can make it safe by saving/restoring the top half of the APIC ICR register, as that's what would get corrupted.

 -- Keir
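
The save/restore suggested above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Xen code: struct fake_apic, fake_set_dest(), fake_fire(), nmi_send() and interrupted_send() are all hypothetical names, and the APIC is modelled as plain memory rather than memory-mapped registers. It assumes the xAPIC layout in which the destination sits in the top byte of the high ICR word and the write to the low word is what sends the IPI.

```c
#include <stdint.h>

/* Hypothetical stand-in for the local APIC; real hardware exposes these
 * as memory-mapped registers rather than struct fields. */
struct fake_apic {
    uint32_t icr_lo;    /* command word; writing it sends the IPI */
    uint32_t icr_hi;    /* destination APIC ID in bits 24..31 */
    uint8_t  last_dest; /* simulation only: where the last IPI went */
};

#define FAKE_DM_FIXED 0x000u /* fixed delivery mode */
#define FAKE_DM_NMI   0x400u /* NMI delivery mode */

/* Step 1 of sending an IPI: program the destination (top half of ICR). */
static void fake_set_dest(struct fake_apic *a, uint8_t dest)
{
    a->icr_hi = (uint32_t)dest << 24;
}

/* Step 2: write the command word; the IPI goes to whatever
 * destination is in the top half at that moment. */
static void fake_fire(struct fake_apic *a, uint32_t cmd)
{
    a->icr_lo = cmd;
    a->last_dest = (uint8_t)(a->icr_hi >> 24);
}

/* IPI sent from NMI context. With save_restore set, the top half of the
 * ICR is put back afterwards so an interrupted sender is not disturbed. */
static void nmi_send(struct fake_apic *a, uint8_t dest, int save_restore)
{
    uint32_t saved_hi = a->icr_hi;

    fake_set_dest(a, dest);
    fake_fire(a, FAKE_DM_NMI);

    if (save_restore)
        a->icr_hi = saved_hi;
}

/* An ordinary sender targets CPU 2, is interrupted by an NMI between its
 * two steps, and the NMI handler sends its own IPI to CPU 3. Returns the
 * CPU that the ordinary sender's IPI finally reached. */
static uint8_t interrupted_send(int save_restore)
{
    struct fake_apic a = {0};

    fake_set_dest(&a, 2);                 /* interrupted context, step 1 */
    nmi_send(&a, 3, save_restore);        /* NMI arrives here */
    fake_fire(&a, FAKE_DM_FIXED | 0x30u); /* interrupted context, step 2 */

    return a.last_dest;
}
```

In this model interrupted_send(0) delivers the interrupted IPI to CPU 3 instead of CPU 2, while interrupted_send(1) delivers it to CPU 2 as intended, which is the corruption and the fix described above.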
You are right! What's happening is that when the NMI button is pressed, this hardware generates the NMI on all CPUs. During my previous experiments on different systems, it was always only one CPU, the BSP, that received the external NMI. I was totally not expecting this, and given that the watchdog was disabled and there was nothing else to generate an NMI, I was baffled. Anyway, your response got me thinking about the possibility of a different source.

Thanks,
Mukesh
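
The two back-to-back prints can be explained by one possible interleaving once every CPU takes the external NMI at about the same time. The following is a minimal single-threaded replay of that ordering, not a reproduction of the real concurrency: callback_name() and replay_interleaving() are hypothetical helpers, and the callback signatures are simplified to take no arguments.

```c
/* Simplified callback type; the real one takes the register frame and CPU. */
typedef int (*nmi_callback_t)(void);

static int dummy_nmi_callback(void) { return 0; }
static int kdb_nmi_receive(void)    { return 1; }

/* Shared pointer that every CPU reads on entry to do_nmi(). */
static nmi_callback_t nmi_callback = dummy_nmi_callback;

static const char *callback_name(nmi_callback_t cb)
{
    return cb == kdb_nmi_receive ? "kdb_nmi_receive" : "dummy";
}

/* Replays the ordering: CPU 1 is already inside do_nmi() for the external
 * NMI when CPU 0, handling the same external NMI, installs the callback. */
static void replay_interleaving(const char **first, const char **second)
{
    nmi_callback = dummy_nmi_callback;

    *first = callback_name(nmi_callback);  /* CPU 1: first print */
    nmi_callback = kdb_nmi_receive;        /* CPU 0: set_nmi_callback() */
    *second = callback_name(nmi_callback); /* CPU 1: second print */
}
```

Under this ordering the first read sees dummy and the second sees kdb_nmi_receive, with no cache effect involved: the receiving CPU simply entered do_nmi() before the IPI was ever sent.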