Hey folks,

Yesterday I committed a workaround for a deadlock condition in RELENG_5 and RELENG_5_4 caused by errata in AMD Opteron and Athlon64 processors. (-CURRENT after April 8 is not affected due to changes in critical sections.) The deadlock appeared under heavy load, particularly with lots of I/O interrupts, in SMP environments on both FreeBSD/amd64 and FreeBSD/i386.

Specifically, a spinwait in the inter-processor TLB flushing code triggered Errata 106, which causes the cache on the spinning processor to stop updating. To work around the issue, we now enable interrupts during the spinwait; when an interrupt occurs, it breaks the lock on the cache. AMD also offers a workaround that the BIOS can implement, but not all systems have applied it. The FreeBSD workaround will appear in 5.4-RELEASE.

In addition, I committed a KDB feature written by Stephen Uphoff (ups) to assist in debugging SMP situations where one processor is stuck with interrupts disabled. Since the regular cpustop IPI won't get through in that case, he implemented an NMI handler to perform the stop. This feature, compiled in with the kernel option KDB_STOP_NMI and activated by debug.kdb.stop_cpus_with_nmi, is available on all current branches and will ship with 5.4-RELEASE.

If you have had problems with deadlock conditions on AMD processors in an SMP environment, we urge you to update to the latest RELENG_5, or try the upcoming 5.4-RC4.

I want to thank the following individuals for their help in debugging and fixing the deadlock issue:

Paul Vixie and Peter Losher at ISC
Stephen Uphoff (ups)
Alan Cox (alc)
John Baldwin (jhb)
The rest of the FreeBSD RE team

-- 
Doug White                    |  FreeBSD: The Power to Serve
dwhite@gumbysoft.com          |  www.FreeBSD.org