Hello,

I want to do some tests injecting artificial latency into LNET using netem over an InfiniBand network. This doesn't work out of the box, since LNET's RDMA transfers aren't affected by the latency injected in the kernel. Does anyone know if there is a way to disable LNET's RDMA support and force its packets through the kernel? Other ideas are also welcome.

Thanks,
Alvaro.
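P.S. For reference, the kind of netem setup I mean is roughly the following (ib0 and the 50 ms delay are just example values; netem only acts on traffic that goes through the kernel IP stack, which is exactly why it doesn't touch native RDMA):

# add 50 ms of artificial delay on the (IPoIB) interface
tc qdisc add dev ib0 root netem delay 50ms
# remove it again when the test is done
tc qdisc del dev ib0 root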
Christopher J. Morrone
2011-Feb-23 19:41 UTC
[Lustre-discuss] Disabling RDMA on an IB interface
You are using IP traffic control, right? So you need to make Lustre use IP over IB instead of native IB. You need to set up LNET to use the socklnd instead of the o2iblnd, which means using LNET network names like "tcp0" instead of "o2ib0" (a minimal sketch of the modprobe change is below the quoted message). I don't know what it is you hope to investigate in LNET, but the behavior of the socklnd and the o2iblnd can be quite different.

On 02/23/2011 07:44 AM, Alvaro Aguilera wrote:
> Hello,
>
> I want to do some tests injecting artificial latency into LNET using netem over an InfiniBand network. This doesn't work out of the box, since LNET's RDMA transfers aren't affected by the latency injected in the kernel. Does anyone know if there is a way to disable LNET's RDMA support and force its packets through the kernel? Other ideas are also welcome.
>
> Thanks,
> Alvaro.
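A minimal sketch of that modprobe change, assuming a single ib0 interface with an IPoIB address configured on it (the line goes in /etc/modprobe.conf or a file under /etc/modprobe.d):

# native IB via the o2iblnd, presumably what you are running now:
# options lnet networks=o2ib0(ib0)
# socklnd over IPoIB instead, so all LNET traffic goes through the kernel IP stack:
options lnet networks=tcp0(ib0)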
On 2011-02-23, at 8:44 AM, Alvaro Aguilera wrote:
> I want to do some tests injecting artificial latency into LNET using netem over an InfiniBand network. This doesn't work out of the box, since LNET's RDMA transfers aren't affected by the latency injected in the kernel. Does anyone know if there is a way to disable LNET's RDMA support and force its packets through the kernel? Other ideas are also welcome.

Depending on what you are trying to measure, you may want to contact some of the folks at NRL, who have done a bunch of IB WAN latency testing. They inject latency into the network using special hardware, so that probably won't help you directly, but maybe they have already measured what you are looking at :-).

Cheers, Andreas
--
Andreas Dilger
Principal Engineer
Whamcloud, Inc.
As Chris mentioned, you're talking about two very different methods. I think you can use netem with IPoIB, but I have never tried it. If you use connected mode I think you're still technically doing RDMA, but the maximum message size (MTU) is around 64k, which isn't sufficient for higher latencies. In the first few slides of my LUG presentation last year I have some graphs that show how RDMA performance is affected by latency and how the amount of data in flight has to be increased to compensate for the bandwidth-delay product (BDP); a rough example of that arithmetic is at the bottom of this mail. If you do want to use IPoIB, you can add a line similar to the following in your /etc/modprobe.conf or a file in the /etc/modprobe.d directory:

options lnet networks=tcp(ib0)

If you want to use RC QPs as ko2iblnd does, we use the following kernel module parameters:

options lnet networks=o2ib(ib0)
options ko2iblnd map_on_demand=2 peer_credits=128 credits=256 concurrent_sends=256 ntx=512 fmr_pool_size=2048 fmr_flush_trigger=512 fmr_cache=1

I'm not aware of any patches that add delay to the OFED stack. At NRL we use real delay between sites and hardware-simulated delay using Obsidian Longbow XRs (http://www.obsidianresearch.com). They support a single SDR connection and can add delay to the IB channel up to 1 second.

Jeremy

On Wed, Feb 23, 2011 at 5:19 PM, Andreas Dilger <adilger at whamcloud.com> wrote:
> On 2011-02-23, at 8:44 AM, Alvaro Aguilera wrote:
> > I want to do some tests injecting artificial latency into LNET using netem over an InfiniBand network. This doesn't work out of the box, since LNET's RDMA transfers aren't affected by the latency injected in the kernel. Does anyone know if there is a way to disable LNET's RDMA support and force its packets through the kernel? Other ideas are also welcome.
>
> Depending on what you are trying to measure, you may want to contact some of the folks at NRL, who have done a bunch of IB WAN latency testing. They inject latency into the network using special hardware, so that probably won't help you directly, but maybe they have already measured what you are looking at :-).
>
> Cheers, Andreas
> --
> Andreas Dilger
> Principal Engineer
> Whamcloud, Inc.
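A rough illustration of the BDP arithmetic, with made-up example numbers (an SDR link carrying roughly 1 GB/s of payload, 20 ms round-trip time, 1 MB Lustre bulk RPCs):

BDP = bandwidth x round-trip time
    = 1 GB/s x 0.020 s = 20 MB that must be in flight to keep the link full

With 1 MB bulk RPCs that is on the order of 20 RPCs outstanding per peer, so peer_credits (and the per-interface credits/ntx) have to grow with the latency; the peer_credits=128 above would cover roughly 128 ms of RTT at that rate.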
On Feb 24, 2011, at 10:45 AM, Jeremy Filizetti wrote:
> As Chris mentioned, you're talking about two very different methods. I think you can use netem with IPoIB, but I have never tried it. If you use connected mode I think you're still technically doing RDMA, but the maximum message size (MTU) is around 64k, which isn't sufficient for higher latencies. In the first few slides of my LUG presentation last year I have some graphs that show how RDMA performance is affected by latency and how the amount of data in flight has to be increased to compensate for the bandwidth-delay product (BDP). If you do want to use IPoIB, you can add a line similar to the following in your /etc/modprobe.conf or a file in the /etc/modprobe.d directory:
>
> options lnet networks=tcp(ib0)
>
> If you want to use RC QPs as ko2iblnd does, we use the following kernel module parameters:
>
> options lnet networks=o2ib(ib0)
> options ko2iblnd map_on_demand=2 peer_credits=128 credits=256 concurrent_sends=256 ntx=512 fmr_pool_size=2048 fmr_flush_trigger=512 fmr_cache=1

If you have peer_credits=128, then I would suggest increasing credits=1024 and ntx=2048, otherwise a couple of clients could consume all the NI credits. concurrent_sends is not necessary here because o2iblnd will estimate a proper value for it.

Thanks
Liang
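P.S. In other words, something like the following (keeping Jeremy's other values, which are of course site-specific, and dropping concurrent_sends since o2iblnd works that out itself):

options lnet networks=o2ib(ib0)
options ko2iblnd map_on_demand=2 peer_credits=128 credits=1024 ntx=2048 fmr_pool_size=2048 fmr_flush_trigger=512 fmr_cache=1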
Thanks for all your suggestions. TCP over IB seems to be doing the job.

Regards,
Alvaro.

On Thu, Feb 24, 2011 at 4:16 AM, Liang Zhen <liang at whamcloud.com> wrote:
> On Feb 24, 2011, at 10:45 AM, Jeremy Filizetti wrote:
> > As Chris mentioned, you're talking about two very different methods. I think you can use netem with IPoIB, but I have never tried it. If you use connected mode I think you're still technically doing RDMA, but the maximum message size (MTU) is around 64k, which isn't sufficient for higher latencies. In the first few slides of my LUG presentation last year I have some graphs that show how RDMA performance is affected by latency and how the amount of data in flight has to be increased to compensate for the bandwidth-delay product (BDP). If you do want to use IPoIB, you can add a line similar to the following in your /etc/modprobe.conf or a file in the /etc/modprobe.d directory:
> >
> > options lnet networks=tcp(ib0)
> >
> > If you want to use RC QPs as ko2iblnd does, we use the following kernel module parameters:
> >
> > options lnet networks=o2ib(ib0)
> > options ko2iblnd map_on_demand=2 peer_credits=128 credits=256 concurrent_sends=256 ntx=512 fmr_pool_size=2048 fmr_flush_trigger=512 fmr_cache=1
>
> If you have peer_credits=128, then I would suggest increasing credits=1024 and ntx=2048, otherwise a couple of clients could consume all the NI credits. concurrent_sends is not necessary here because o2iblnd will estimate a proper value for it.
>
> Thanks
> Liang