Hi List,

I was wondering if anyone here has looked at the performance characteristics of Lustre OSSes on dual-Tylersburg motherboards with the RAID controllers split across separate IO hubs. I imagine that without proper pinning of service threads to the right CPUs/IOH and memory pools this could cause some nasty QPI contention. Is this actually a problem in practice? Is it possible to pin service threads in a reasonable way based on which OST is involved? Is anyone doing this on purpose to try to gain more overall PCIe bandwidth?

I imagine that in general it's probably best to stick with a single-socket, single-IOH OSS: no pinning to worry about, a very direct QPI setup, consistent performance characteristics, etc.

Thanks,
Mark
Look for the Bull NUMIOA presentation from the recent LUG. The short story is that OST thread pinning is critical to getting good performance. The numbers are something like 3.6 GB/s without, and 6.0 GB/s with, thread affinity.

Cheers, Andreas

On 2011-06-02, at 7:23 PM, Mark Nelson <mark at msi.umn.edu> wrote:
> Is it possible to pin service threads in a reasonable way
> based on which OST is involved?
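For readers wondering what "thread pinning" can look like in practice, here is a minimal user-space sketch (not the approach from the Bull presentation). It pins existing OSS service threads to the CPUs local to a given RAID controller, assuming a Linux sysfs layout that exposes a numa_node attribute per PCI device, that the OSS I/O threads appear in /proc with names starting with "ll_ost" (thread naming varies across Lustre versions), and a hypothetical PCI address for the controller. It must run as root.

#!/usr/bin/env python3
"""Minimal sketch: pin Lustre OSS service threads to the CPUs of the
NUMA node closest to a given RAID controller (assumptions noted above)."""

import os
import re

def cpus_of_pci_device(pci_addr):
    """Return the set of CPUs local to a PCI device (e.g. a RAID HBA)."""
    node = int(open(f"/sys/bus/pci/devices/{pci_addr}/numa_node").read())
    if node < 0:
        # Firmware did not report locality; fall back to all CPUs.
        return set(range(os.cpu_count()))
    cpulist = open(f"/sys/devices/system/node/node{node}/cpulist").read()
    cpus = set()
    for part in cpulist.strip().split(","):
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus

def pin_ost_threads(cpus, name_pattern=r"^ll_ost"):
    """Set the affinity of every kernel thread whose name matches."""
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            comm = open(f"/proc/{pid}/comm").read().strip()
        except OSError:
            continue  # thread exited while we were scanning
        if re.match(name_pattern, comm):
            os.sched_setaffinity(int(pid), cpus)

if __name__ == "__main__":
    # Hypothetical PCI address of one of the RAID controllers.
    pin_ost_threads(cpus_of_pci_device("0000:81:00.0"))

Note that this only affects threads that already exist, and it does not by itself route requests for a particular OST to a particular thread, since (as Kevin notes below) the service threads normally sit in a shared pool.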
Mark,

In addition to thread pinning, see also Bug 22078, which allows a different network interface to be used for different OSTs on the same server: a single IB interface is not enough to saturate one IOH, let alone multiple.

Normally all the threads are in a shared pool, where any thread can service any incoming request for any OST.

The most common server configuration is probably still dual-socket, single-IOH.

Kevin

Andreas Dilger wrote:
> The short story is that OST thread pinning is critical to getting
> good performance.
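Before splitting OSTs across network interfaces as Kevin describes, it helps to know which IOH/NUMA node each HCA and NIC actually hangs off. A small sketch, assuming the standard sysfs numa_node attribute; the same attribute can be read under /sys/bus/pci/devices/<addr>/ for the RAID controllers themselves.

#!/usr/bin/env python3
"""Minimal sketch: report the NUMA node each network interface and
InfiniBand HCA is attached to, so OSTs can be grouped with the
interface on the same IOH."""

import glob
import os

def numa_node(device_link):
    """Read the numa_node attribute behind a sysfs 'device' symlink."""
    try:
        return int(open(os.path.join(device_link, "numa_node")).read())
    except (OSError, ValueError):
        return -1  # locality not reported

for path in sorted(glob.glob("/sys/class/net/*") +
                   glob.glob("/sys/class/infiniband/*")):
    dev = os.path.join(path, "device")
    if os.path.exists(dev):  # skips virtual interfaces such as lo
        print(f"{os.path.basename(path):12s} -> NUMA node {numa_node(dev)}")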
Thanks Andreas and Kevin for the excellent information. It sounds like the Bull presentation is exactly what I'm after.

I wonder whether, in really bad situations with lots of bidirectional traffic, you could even end up dropping below 3 GB/s (i.e. the single-QDR limit). It seems like this could become a pretty important topic as network interconnect speeds improve.

Mark

On 6/2/11 8:49 PM, Kevin Van Maren wrote:
> In addition to thread pinning, see also Bug 22078, which allows a
> different network interface to be used for different OSTs on the same
> server: a single IB interface is not enough to saturate one IOH, let
> alone multiple.
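For reference, the "single QDR limit" Mark mentions works out roughly as follows; the numbers below are nominal assumptions, not measurements.

# 4x QDR signals at 4 lanes x 10 Gb/s; 8b/10b encoding leaves 32 Gb/s
# of data, i.e. 4.0 GB/s theoretical, with ~3.2 GB/s typically achievable.
qdr_signal_gbps = 4 * 10.0
qdr_data_gbps = qdr_signal_gbps * 8 / 10
qdr_peak_gbyte = qdr_data_gbps / 8          # 4.0 GB/s theoretical
qdr_practical_gbyte = 3.2                   # assumed achievable per link

unpinned, pinned = 3.6, 6.0                 # figures from Andreas' mail
print(f"QDR 4x theoretical data rate: {qdr_peak_gbyte:.1f} GB/s")
print(f"Unpinned OSS throughput ({unpinned} GB/s) is already above one "
      f"practical QDR link ({qdr_practical_gbyte} GB/s);")
print(f"the pinned figure ({pinned} GB/s) needs roughly "
      f"{pinned / qdr_practical_gbyte:.1f} links' worth of LNET bandwidth.")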