philippe.bernadat@hp.com
2006-Dec-19 07:29 UTC
[Lustre-devel] [Bug 11245] Significant Performance degradation with OFED
Please don't reply to lustre-devel. Instead, comment in Bugzilla by using the following link: https://bugzilla.lustre.org/show_bug.cgi?id=11245

I forgot to update this one. With help from the openib-general@openib.org list we have identified that the performance degradation was due to the use of a 2K MTU on the 23108 Tavor HCAs. The OFED tavor_quirk to force a 1K MTU is broken. So in the meantime I simply added a

    cmid->route.path_rec->mtu = IB_MTU_1024;

statement just prior to the rdma_create_qp() call in kiblnd_create_conn().
philippe.bernadat@hp.com
2006-Dec-20 02:26 UTC
[Lustre-devel] [Bug 11245] Significant Performance degradation with OFED
So here are the new numbers when using a 1K IB MTU.

              __1MB_MTU_!FMR__      __1MB_MTU_+FMR__
              VIB   O2IB  Ratio     VIB   O2IB  Ratio
Writes MB/s   679   680   100 %     579   585   101 %
Reads  MB/s   659   663   101 %     567   579   102 %

I redid all measurements (on 2 different nodes, which may explain the slight differences from the initial figures).

So OFED-1.1 now always performs at least as good as VIB. Good news.

I am still not happy with the way to fix this IB MTU issue by the way. Right now I have a modparam to force the MTU. But since it may be either the active or the passive node that has a 23108 board, there should be a better way to negotiate the MTU. Using the SM is a problem, since there are multiple flavors of SMs.

We also still have a fair performance drop when going to FMR, but this hasn't changed as compared to VIB and is a different story.

              !FMR   +FMR   Ratio
Writes MB/s   680    585    86 %
Reads  MB/s   663    579    87 %
philippe.bernadat@hp.com
2006-Dec-20 03:55 UTC
[Lustre-devel] [Bug 11245] Significant Performance degradation with OFED
Created an attachment (id=9188)
 --> (https://bugzilla.lustre.org/attachment.cgi?id=9188&action=view)
Patch to add an ib_mtu modparam to o2ibln

So here is the patch I used to force the MTU. To activate it, add a line such as the following to the modprobe.conf file:

    options ko2iblnd ib_mtu=1024

This must be set on the active host side (OSC/MDS) if either end (active/passive, MDS/OST/OSC) has a 23108 Tavor HCA. It can't hurt to set it on the passive side (OST) as well.
philippe.bernadat@hp.com
2006-Dec-21 00:43 UTC
[Lustre-devel] [Bug 11245] Significant Performance degradation with OFED
Created an attachment (id=9197)
 --> (https://bugzilla.lustre.org/attachment.cgi?id=9197&action=view)
Patch to add an ib_mtu modparam to o2ibln

More checks but still not ideal.
ogerlitz@voltaire.com
2006-Dec-21 02:03 UTC
[Lustre-devel] [Bug 11245] Significant Performance degradation with OFED
(In reply to comment #24)
> So OFED-1.1 now always performs at least as good as VIB. Good news.

Good job, thanks for working on this.

> I am still not happy with the way to fix this IB MTU issue by the way.
> Right now I have a modparam to force the MTU.

Sure, this will be solved in the community (and in our SM/SA if the solution requires this).

> We also still have a fair performance drop when going to FMR, but this hasn't
> changed as compared to VIB and is a different story.

This is not acceptable. Eric should raise this on the list and the issue will be worked on there. Basically you need to send the code so your FMR usage can be reviewed.
philippe.bernadat@hp.com
2006-Dec-21 02:30 UTC
[Lustre-devel] [Bug 11245] Significant Performance degradation with OFED
(In reply to comment #28)
> the conversion from the mtu mod param numeric value to the ib mtu define value
> belongs to the o2iblnd startup flow and not to conn establishment.

Agree.

> ??? why you need to change the global value ???

I just wanted one to be able to check the current value with

    cat /sys/module/ko2iblnd/ib_mtu

Maybe I should have two params, a requested and a current value.
philippe.bernadat@hp.com
2006-Dec-21 02:31 UTC
[Lustre-devel] [Bug 11245] Significant Performance degradation with OFED
Created an attachment (id=9199)
 --> (https://bugzilla.lustre.org/attachment.cgi?id=9199&action=view)
Patch to add an ib_mtu modparam to o2ibln

With -pu
philippe.bernadat@hp.com
2006-Dec-21 02:59 UTC
[Lustre-devel] [Bug 11245] Significant Performance degradation with OFED
(In reply to comment #31)
> so *kiblnd_tunables.kib_ib_mtu gets back its initial value...

*kiblnd_tunables.kib_ib_mtu gets the MTU in use, not necessarily the initial value. On the active side it gets back the requested one (if it was OK) or the default. On the passive side it gets the MTU in use.
philippe.bernadat@hp.com
2006-Dec-21 04:45 UTC
[Lustre-devel] [Bug 11245] Significant Performance degradation with OFED
Created an attachment (id=9203)
 --> (https://bugzilla.lustre.org/attachment.cgi?id=9203&action=view)
Patch to add an ib_mtu modparam to o2ibln

New attempt. It now checks that the MTU is valid at module load time and doesn't overwrite the initially requested MTU size. I still left the int_to_enum conversion in conn_param, since I need to compare the MTU int values (not the enums). But feel free to change that.

What I'd really like is for the passive node to be able to force the MTU value as well. Is there no way to do that?
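The load-time check described above could look roughly like the following. This is a minimal userspace sketch, not the code from the attachment: every `sketch_` name is a hypothetical stand-in, the enum values mirror the kernel's `enum ib_mtu`, and treating 0 as "parameter not set" is an assumption.

```c
/* Stand-in for the kernel's enum ib_mtu; the values mirror
 * include/rdma/ib_verbs.h. All sketch_ names are hypothetical. */
enum sketch_ib_mtu {
	SKETCH_IB_MTU_256  = 1,
	SKETCH_IB_MTU_512  = 2,
	SKETCH_IB_MTU_1024 = 3,
	SKETCH_IB_MTU_2048 = 4,
	SKETCH_IB_MTU_4096 = 5
};

/* Map an MTU in bytes to its enum value.
 * Returns 0 when the parameter is unset (assumed convention),
 * or -1 when the value is not a legal IB MTU. */
int sketch_mtu_int_to_enum(int mtu_bytes)
{
	switch (mtu_bytes) {
	case 0:    return 0;
	case 256:  return SKETCH_IB_MTU_256;
	case 512:  return SKETCH_IB_MTU_512;
	case 1024: return SKETCH_IB_MTU_1024;
	case 2048: return SKETCH_IB_MTU_2048;
	case 4096: return SKETCH_IB_MTU_4096;
	default:   return -1;
	}
}

/* Load-time validation: a nonzero return would make module
 * initialisation fail instead of silently using a bad value. */
int sketch_check_ib_mtu(int requested_bytes)
{
	return sketch_mtu_int_to_enum(requested_bytes) < 0 ? -1 : 0;
}
```

Rejecting a bad value once at load time means the connection path never has to deal with an invalid request.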
ogerlitz@voltaire.com
2006-Dec-21 23:55 UTC
[Lustre-devel] [Bug 11245] Significant Performance degradation with OFED
(In reply to comment #29)
>> ??? why you need to change the global value ???
> I just wanted one to be able to check the current value with
> cat /sys/module/ko2iblnd/ib_mtu

But the code says

    kiblnd_tunables.kib_ib_mtu = &ib_mtu

and later

    mtu = compute/ib/mtu/from/the/numeric/value/of/*kiblnd_tunables.kib_ib_mtu

and later

    cmid->route.path_rec->mtu = mtu

and later

    *kiblnd_tunables.kib_ib_mtu = ib_mtu_enum_to_int(cmid->route.path_rec->mtu);

so *kiblnd_tunables.kib_ib_mtu gets back its initial value...
ogerlitz@voltaire.com
2006-Dec-21 23:55 UTC
[Lustre-devel] [Bug 11245] Significant Performance degradation with OFED
(In reply to comment #32)
> *kiblnd_tunables.kib_ib_mtu gets the mtu in use. Not necessarily the initial value.
> On the active side it gets back the requested one (if it was OK) or the default.
> On the passive side it gets the mtu in use.

OK, got it. Still, the giant switch statement/logic can be moved to the startup code and be under if(mtu/mod/param/is/not/zero), so that you only either set cmid->route.path_rec->mtu if the mod param is not zero and you are in the active/connect state, or set the kiblnd_tunables.kib_ib_mtu value if you are in the IBLND_CONN_PASSIVE_WAIT state.
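The split suggested above can be sketched as follows. This is only an illustration under stated assumptions, not the patch itself: the `sketch_` types and functions are hypothetical, `struct sketch_path` stands in for `cmid->route.path_rec`, and the requested MTU is assumed to have been converted to its enum value once at startup (0 meaning "not set").

```c
/* Hypothetical stand-ins; none of these names come from the patch. */
enum sketch_conn_state {
	SKETCH_CONN_ACTIVE_CONNECT,
	SKETCH_CONN_PASSIVE_WAIT
};

struct sketch_path {
	int mtu_enum;	/* plays the role of cmid->route.path_rec->mtu */
};

/* IB MTU enum values 1..5 correspond to 256..4096 bytes. */
int sketch_mtu_enum_to_int(int mtu_enum)
{
	return 128 << mtu_enum;
}

/* requested_enum: enum computed once at startup, 0 if the
 *                 module parameter was left unset (assumption).
 * mtu_in_use:     value reported back to the administrator. */
void sketch_apply_mtu(enum sketch_conn_state state, int requested_enum,
		      struct sketch_path *path, int *mtu_in_use)
{
	if (state == SKETCH_CONN_ACTIVE_CONNECT) {
		/* Active side: force the path MTU only when asked to. */
		if (requested_enum != 0)
			path->mtu_enum = requested_enum;
	} else if (state == SKETCH_CONN_PASSIVE_WAIT) {
		/* Passive side: just record what the connection uses. */
		*mtu_in_use = sketch_mtu_enum_to_int(path->mtu_enum);
	}
}
```

With the conversion done at startup, the per-connection path reduces to one assignment on either side.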
ogerlitz@voltaire.com
2006-Dec-21 23:55 UTC
[Lustre-devel] [Bug 11245] Significant Performance degradation with OFED
(In reply to comment #34)
> What I'd really like is the passive node to be able to force the MTU value as
> well. No way to do that ?

This would require o2iblnd to negotiate the MTU using the IB CM private data, and possibly some changes to the rdma cm. I would recommend avoiding that unless the solution agreed upon at openib turns out to be insufficient for your taste/needs.
eeb@clusterfs.com
2007-Jan-04 09:57 UTC
[Lustre-devel] [Bug 11245] Significant Performance degradation with OFED
           What    |Removed                     |Added
----------------------------------------------------------------------------
Attachment #9203 is|0                           |1
           obsolete|                            |

Created an attachment (id=9270)
 --> (https://bugzilla.lustre.org/attachment.cgi?id=9270&action=view)
updated patch

I've changed the patch a little to ensure the update to *kiblnd_tunables.kib_ib_mtu serialises properly. I tested on a couple of nodes at LLNL and saw the expected bandwidth improvement.