Chris Worley
2008-Apr-06 00:11 UTC
[Lustre-discuss] 1.6.4.3 changes CPU mapping in RHEL, and 10x performance loss in one application
Two issues with the RHEL 2.6.9-67.0.4 Lustre kernel vs. the kernel of
the same rev from RH...

1) Need to understand how the patches caused a change in the CPU
mapping in the kernel.

I was running RHEL's 2.6.9-67.0.4 kernel w/o Lustre patches, and the
core to logical processor allocation was (as shown by /proc/cpuinfo):

 =============    =============     Socket
 ======  ======   ======  ======    L2 cache domain
  0  4    1  5     2  6    3  7     logical processor

After installing the Lustre version of the kernel, the allocation is:

 =============    =============     Socket
 ======  ======   ======  ======    L2 cache domain
  0  1    2  3     4  5    6  7     logical processor

Why and where was this changed?  I have a user that seems to really care.

2) One user has an application whose performance took a 10x nosedive
after the change in kernel from RHEL's 2.6.9-67.0.4 to Lustre's kernel
of the same rev.

The application can run w/ and w/o MPI; it uses HP-MPI.  In a
single-node 8-processor case, running both with and without MPI shows
the difference... so it's something in HP-MPI, but it's happening even
on a single node (no IB), and it's also peculiar to this app (another
HP-MPI user saw only an 8% degradation in performance after this
kernel change).

In looking at a histogram of the system calls in the app having
issues, there were lots of calls to sched_yield, the rt_sig*
functions, getppid, and select.  It also calls close a lot with
invalid handles!  This app also has >600 threads running (on a single
node!) during its lifespan.

Any idea what happened to this app's performance in the kernel change?

Note that I didn't patch the kernel myself; I'm using the kernel from
the lustre.org web site.

Thanks,

Chris
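A minimal sketch of how the mapping and the syscall histogram above
can be gathered (the grep pattern and strace invocation here are
illustrative assumptions, not commands taken from this thread):

# Print each logical processor together with its socket ("physical id")
# and core ("core id") as reported by /proc/cpuinfo.
grep -E '^processor|^physical id|^core id' /proc/cpuinfo

# Count system calls made by a running application; <pid> is the
# application's pid, -f follows children/threads created after
# attaching, and -c prints a per-call summary when strace detaches.
strace -c -f -p <pid>

On the kernel described as "correct" later in this thread, the
"core id" column does not simply mirror the "processor" number.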
Brian J. Murrell
2008-Apr-08 14:19 UTC
[Lustre-discuss] 1.6.4.3 changes CPU mapping in RHEL, and 10x performance loss in one application
On Sat, 2008-04-05 at 18:11 -0600, Chris Worley wrote:
> Two issues with the RHEL 2.6.9-67.0.4 Lustre kernel vs. the kernel of
> the same rev from RH...

Chris,

This looks like it needs investigating.  We do strive to make as few
changes to the vendor supplied kernel and configuration as we can.
Perhaps something has diverged or been overlooked.

Can you file a bugzilla bug for this?

b.
Johann Lombardi
2008-Apr-10 18:05 UTC
[Lustre-discuss] 1.6.4.3 changes CPU mapping in RHEL, and 10x performance loss in one application
Chris,

On Sat, Apr 05, 2008 at 06:11:32PM -0600, Chris Worley wrote:
> I was running RHEL's 2.6.9-67.0.4 kernel w/o Lustre patches, and the

What is the CPU architecture? x86_64 or IA64?

> core to logical processor allocation was (as shown by /proc/cpuinfo):
>
>  =============    =============     Socket
>  ======  ======   ======  ======    L2 cache domain
>   0  4    1  5     2  6    3  7     logical processor
>
> After installing the Lustre version of the kernel, the allocation is:
>
>  =============    =============     Socket
>  ======  ======   ======  ======    L2 cache domain
>   0  1    2  3     4  5    6  7     logical processor

Hard to believe that one of our patches could cause this.
Have you compared the kernel config files?

Cheers,
Johann
Chris Worley
2008-Apr-10 18:13 UTC
[Lustre-discuss] 1.6.4.3 changes CPU mapping in RHEL, and 10x performance loss in one application
On Thu, Apr 10, 2008 at 12:05 PM, Johann Lombardi <johann at sun.com> wrote:
> On Sat, Apr 05, 2008 at 06:11:32PM -0600, Chris Worley wrote:
> > I was running RHEL's 2.6.9-67.0.4 kernel w/o Lustre patches, and the
>
> What is the CPU architecture? x86_64 or IA64?

x86_64.

> Hard to believe that one of our patches could cause this.
> Have you compared the kernel config files?

This is the default from RedHat vs. the default from
downloads.lustre.org.  We didn't rebuild either from scratch.

Chris
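A sketch of the comparison Johann suggests, assuming both kernels are
installed on the same node (the config file names under /boot are
examples only; the real names depend on the installed kernel packages):

# Diff the config shipped with the stock RHEL kernel against the one
# shipped with the Lustre-patched kernel (example file names).
diff -u /boot/config-2.6.9-67.0.4.ELsmp \
        /boot/config-2.6.9-67.0.4.EL_lustre.1.6.4.3smp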
Aaron Knister
2008-Apr-23 04:25 UTC
[Lustre-discuss] 1.6.4.3 changes CPU mapping in RHEL, and 10x performance loss in one application
Did you ever find a resolution?  And out of curiosity, how did you
determine that the core to logical processor allocation had changed?
I'm trying to figure it out in my own setup.

-Aaron

Aaron Knister
Associate Systems Analyst
Center for Ocean-Land-Atmosphere Studies

(301) 595-7000
aaron at iges.org
Chris Worley
2008-Apr-23 15:34 UTC
[Lustre-discuss] 1.6.4.3 changes CPU mapping in RHEL, and 10x performance loss in one application
On Tue, Apr 22, 2008 at 10:25 PM, Aaron Knister <aaron at iges.org> wrote:
> Did you ever find a resolution?

The core mapping change had to do with "noacpi" on my kernel command
line (ACPI, not APIC).  It seems ACPI has a lot to do with core
mapping (not just power).  It also affected interrupt
distribution/balancing (/proc/interrupts was showing all timer
interrupts handled by CPU0, for example).  ACPI had to both be defined
in the kernel and not disabled on the kernel command line.

This did not solve the 2x to 10x performance issue with 1.6.4.3, but I
don't have that problem with a manually patched RHEL 2.6.9-67.0.4
kernel in 1.6.4.2.  My best guess is that I omitted the Quadrics
patches from my manual patching... maybe they have something to do
with the slowdown.  I have a list of system calls that I believe are
associated with the slowdown... but in looking at CPU counters, the
application takes no more CPU time, the walltime just increases...
like the kernel is forgetting to schedule the app.  More on this at:

https://bugzilla.lustre.org/show_bug.cgi?id=15478

> And out of curiosity, how did you determine that the core to logical
> processor allocation had changed?  I'm trying to figure it out in my
> own setup.

A quick glance at /proc/cpuinfo shows the difference.  The "correct"
case looks like:

# cat /proc/cpuinfo | grep -e processor -e "core id"
processor       : 0
core id         : 0
processor       : 1
core id         : 2
processor       : 2
core id         : 4
processor       : 3
core id         : 6
processor       : 4
core id         : 1
processor       : 5
core id         : 3
processor       : 6
core id         : 5
processor       : 7
core id         : 7

The "incorrect" mapping shows "processor" == "core id" (as it does
above for CPUs 0 and 7... but for all processors).

I work w/ benchmark clusters (they are only used for benchmarking and
tuning applications), and many users immediately saw the differences
in codes they'd been benchmarking.  Some folks run on fewer cores than
are available per node (i.e. to not share cache between MPI processes,
or, in some cases of multithreaded apps, because they do want to share
cache), and the optimal MPI CPU mapping for an 8-core system (at least
for this vendor's CPUs) puts logical cores 0 and 1 on different
sockets, while 2 and 3 share sockets with 0 and 1 but use different L2
caches.

With ACPI disabled, the logical and physical mappings were the same.
In those cases where the MPI does process pinning, the apps were
(mostly) okay... but other apps don't specifically pin, and, where
logical==physical, all four processes were running on the same socket
and their performance went down.  You could argue that apps should pin
if needed, but also argue that it's nice to have a CPU mapping that
helps apps that don't pin.

Furthermore, others noticed that even w/ proper processor pinning,
their results using physical processors 0, 2, 4, 6 were worse than
using 1, 3, 5, 7... this turned out again to be ACPI-related:
interrupts weren't being balanced across the CPUs (look at the "timer"
line in /proc/interrupts and see if all go to CPU0... that imbalance
will affect performance on MPI apps that use all cores too).

Hope that helps.

Chris
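A minimal sketch of the checks described above (the config path, grep
patterns, "./my_app", and CPU 3 are illustrative assumptions, not
commands from this thread; /proc/interrupts column layout varies with
the number of CPUs):

# Confirm whether ACPI is disabled on the kernel command line and
# whether it is built into the kernel config (example config path).
cat /proc/cmdline
grep '^CONFIG_ACPI' /boot/config-$(uname -r)

# Look at the "timer" line of /proc/interrupts; if nearly all of the
# counts sit in the CPU0 column, timer interrupts are not balanced.
grep -E 'CPU|timer' /proc/interrupts

# For apps that don't pin themselves, an explicit pin to a chosen
# logical CPU can be done with taskset.
taskset -c 3 ./my_app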