Jiffin Tony Thottan
2017-Jan-09 05:02 UTC
[Gluster-users] Ganesha with Gluster transport RDMA does not work
Hi Andreas,

From reading the code, I believe this is currently a limitation within FSAL_GLUSTER: it tries to establish its connection to the glusterfs servers using "tcp" only. It should be easy to fix as well.

You can raise a bug at https://bugzilla.redhat.com/enter_bug.cgi?product=nfs-ganesha under FSAL_GLUSTER. I don't have any hardware to test the fix, so I can either help you write a fix for the issue or provide test RPMs with the fix.

Also, thanks for trying out nfs-ganesha with rdma and finding this issue.

For the time being, if possible, you can use a tcp,rdma volume to work around the problem.

Regards,
Jiffin

On 06/01/17 22:56, Andreas Kurzac wrote:
> Dear All,
>
> I have a glusterfs pool of 3 servers running CentOS 7.3 and GlusterFS 3.8.5; the network is InfiniBand.
>
> Pacemaker/Corosync and Ganesha-NFS are installed and all seems to be OK; no errors are logged.
>
> I created a replica 3 volume with transport rdma (without tcp!).
>
> When I mount this volume via glusterfs and do some IO, no errors are logged and everything seems to go pretty well.
>
> When I mount the volume via nfs and do some IO, nfs freezes immediately and the following logs are written to ganesha-gfapi.log:
>
> [2017-01-05 23:23:53.536526] W [MSGID: 103004] [rdma.c:452:gf_rdma_register_arena] 0-rdma: allocation of mr failed
> [2017-01-05 23:23:53.541519] W [MSGID: 103004] [rdma.c:1463:__gf_rdma_create_read_chunks_from_vector] 0-rpc-transport/rdma: memory registration failed (peer:10.40.1.1:49152) [Keine Berechtigung]
> [2017-01-05 23:23:53.541547] W [MSGID: 103029] [rdma.c:1558:__gf_rdma_create_read_chunks] 0-rpc-transport/rdma: cannot create read chunks from vector entry->prog_payload
> [2017-01-05 23:23:53.541553] W [MSGID: 103033] [rdma.c:2063:__gf_rdma_ioq_churn_request] 0-rpc-transport/rdma: creation of read chunks failed
> [2017-01-05 23:23:53.541557] W [MSGID: 103040] [rdma.c:2775:__gf_rdma_ioq_churn_entry] 0-rpc-transport/rdma: failed to process request ioq entry to peer(10.40.1.1:49152)
> [2017-01-05 23:23:53.541562] W [MSGID: 103040] [rdma.c:2859:gf_rdma_writev] 0-vmstor1-client-0: processing ioq entry destined to (10.40.1.1:49152) failed
> [2017-01-05 23:23:53.541569] W [MSGID: 103037] [rdma.c:3016:gf_rdma_submit_request] 0-rpc-transport/rdma: sending request to peer (10.40.1.1:49152) failed
> [...]
>
> Some additional info:
>
> Firewall is disabled, SELinux is disabled.
>
> Different hardware with CentOS 7.1 and the Mellanox OFED 3.4 packages instead of the CentOS InfiniBand packages leads to the same results.
>
> Just to mention: I am not trying to do NFS over RDMA; the Ganesha FSAL is just configured to "glusterfs".
>
> I hope someone can help me, I am running out of ideas...
>
> Kind regards,
>
> Andreas
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
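For anyone hitting the same freeze, it can help to reduce the ganesha-gfapi.log warnings quoted above to the first failing call and a per-function count. The following is only a minimal sketch, not part of Gluster or nfs-ganesha: the regular expression is an assumption derived from the seven lines quoted in this thread, and the names `parse_line` and `summarize` are made up for illustration. As far as I can tell, "Keine Berechtigung" in the second line is the German-locale text for "Permission denied".

```python
import re
from collections import Counter

# Layout assumed from the lines quoted in this thread:
# [timestamp] LEVEL [MSGID: n] [file:line:function] domain: message
# The leading "[" is optional because it can get lost when logs are pasted.
LINE_RE = re.compile(
    r"\[?(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\]\s+"
    r"(?P<level>[A-Z])\s+"
    r"\[MSGID:\s*(?P<msgid>\d+)\]\s+"
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\]\s+"
    r"(?P<domain>\S+):\s+(?P<msg>.*)"
)

# Three of the lines from the report, copied verbatim.
SAMPLE = [
    "[2017-01-05 23:23:53.536526] W [MSGID: 103004] "
    "[rdma.c:452:gf_rdma_register_arena] 0-rdma: allocation of mr failed",
    "[2017-01-05 23:23:53.541519] W [MSGID: 103004] "
    "[rdma.c:1463:__gf_rdma_create_read_chunks_from_vector] "
    "0-rpc-transport/rdma: memory registration failed "
    "(peer:10.40.1.1:49152) [Keine Berechtigung]",
    "[2017-01-05 23:23:53.541562] W [MSGID: 103040] "
    "[rdma.c:2859:gf_rdma_writev] 0-vmstor1-client-0: processing ioq entry "
    "destined to (10.40.1.1:49152) failed",
]


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def summarize(lines):
    """Return (first rdma.c entry or None, Counter of rdma.c function names)."""
    entries = [e for e in map(parse_line, lines) if e and e["file"] == "rdma.c"]
    first = entries[0] if entries else None
    return first, Counter(e["func"] for e in entries)


if __name__ == "__main__":
    first, counts = summarize(SAMPLE)
    if first:
        print("first failure:", first["func"], "-", first["msg"])
    for func, n in counts.items():
        print(n, func)
```

In the quoted log the first entry is the memory-region allocation in `gf_rdma_register_arena`; the later warnings look like the same failure propagating up through the request path.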
Andreas Kurzac
2017-Jan-09 10:57 UTC
[Gluster-users] Ganesha with Gluster transport RDMA does not work
Hi Jiffin,

I raised bug 1411281.

If you could provide test RPMs, I would be very happy to test them in our environment.

In the meantime I will switch to tcp,rdma and continue working on our setup; we can then switch back to pure rdma at any time for testing.

Thanks for your help!

Regards,
Andreas
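The tcp,rdma workaround discussed in this thread comes down to a few gluster CLI invocations. The sketch below only builds the argument lists and prints them; it does not run anything. Several things here are assumptions rather than facts from the thread: the volume name `vmstor1` is inferred from the `0-vmstor1-client-0` log domain, the host names and brick paths are hypothetical, and the stop / `config.transport` / start sequence is how I recall the Gluster administration guide describing a transport change, so please check it against the documentation for your Gluster version (and unmount clients first) before using it.

```python
VALID_TRANSPORTS = ("tcp", "rdma", "tcp,rdma")


def create_volume_cmd(volname, bricks, replica=3, transport="tcp,rdma"):
    """Build the argv for creating a replicated volume with the given transport."""
    if transport not in VALID_TRANSPORTS:
        raise ValueError(f"unsupported transport: {transport!r}")
    return ["gluster", "volume", "create", volname,
            "replica", str(replica), "transport", transport, *bricks]


def change_transport_cmds(volname, transport="tcp,rdma"):
    """Build the argv lists for switching an existing volume's transport.

    Assumed sequence: stop the volume, set config.transport, start it again.
    """
    if transport not in VALID_TRANSPORTS:
        raise ValueError(f"unsupported transport: {transport!r}")
    return [
        ["gluster", "volume", "stop", volname],
        ["gluster", "volume", "set", volname, "config.transport", transport],
        ["gluster", "volume", "start", volname],
    ]


if __name__ == "__main__":
    # Hypothetical hosts and brick paths, for illustration only.
    bricks = [f"server{i}:/bricks/vmstor1" for i in (1, 2, 3)]
    print(" ".join(create_volume_cmd("vmstor1", bricks)))
    for cmd in change_transport_cmds("vmstor1"):
        print(" ".join(cmd))
```

Passing `transport="rdma"` to the same helpers would give the commands for switching back to pure rdma once a fixed FSAL_GLUSTER is available for testing.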
Jiffin Tony Thottan
2017-Jan-10 11:38 UTC
[Gluster-users] Ganesha with Gluster transport RDMA does not work
On 09/01/17 16:27, Andreas Kurzac wrote:
> Hi Jiffin,
>
> I raised bug 1411281.
>
> If you could provide test RPMs, I would be very happy to test them in our environment.

Sorry if I missed it in a previous mail: which nfs-ganesha version (2.3 or 2.4) are you using?

--
Jiffin