Alessandro De Salvo
2015-Jun-12 07:35 UTC
[Gluster-users] [Nfs-ganesha-devel] Questions on ganesha HA and shared storage size
Hi Malahal,

> On 12 Jun 2015, at 01:23, Malahal Naineni <malahal at us.ibm.com> wrote:
>
> The logs indicate that ganesha was started successfully without any
> exports. gstack output seemed normal as well -- threads were waiting to
> serve requests.

Yes, there were no exports, as that was the default config before enabling Ganesha on any gluster volume.

> Assuming that you are running "showmount -e" on the same system, there
> shouldn't be any firewall coming into the picture.

Yes, that was the case in my last attempt: I ran it from the same machine. I also tried from another machine, but the result was the same. The firewall (firewalld, as it's CentOS 7.1) is disabled anyway.

> If you are running "showmount" from some other system, make sure there
> is no firewall dropping the packets.
>
> I think you need a tcpdump trace to figure out the problem. My wireshark
> trace showed two requests from the client to complete the "showmount -e"
> command:
>
> 1. The client sent a "GETPORT" call to port 111 (rpcbind) to get the port
>    number of MOUNT.
> 2. Then it sent an "EXPORT" call to the mountd port (the port it got in
>    response to #1).

Yes, I did that already, and indeed it showed the two requests, so the portmapper works fine, but it hangs on the second request.
Also, "rpcinfo -t localhost portmapper" returns successfully, while "rpcinfo -t localhost nfs" hangs.
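(The two requests behind "showmount -e" can be made concrete. Below is a minimal, illustrative Python sketch of the first one, hand-packing the XDR body of a PMAPPROC_GETPORT call as specified in RFC 1833 / RFC 5531; the names and layout are my own shorthand, not showmount's actual source.)

```python
import struct

# ONC RPC constants (RFC 5531 / RFC 1833)
PMAP_PROG, PMAP_VERS, PMAPPROC_GETPORT = 100000, 2, 3
MOUNT_PROG = 100005   # the program whose port showmount needs
IPPROTO_TCP = 6

def getport_call(xid, prog, vers, proto=IPPROTO_TCP):
    """Build the XDR body of a PMAPPROC_GETPORT call (no TCP record mark)."""
    header = struct.pack(">6I",
                         xid, 0,      # xid, msg_type = CALL(0)
                         2,           # RPC protocol version
                         PMAP_PROG, PMAP_VERS, PMAPPROC_GETPORT)
    auth = struct.pack(">4I", 0, 0, 0, 0)            # AUTH_NONE cred + verf
    args = struct.pack(">4I", prog, vers, proto, 0)  # mapping we ask about
    return header + auth + args

# Step 1 of showmount: ask rpcbind (port 111) where MOUNT v3 lives;
# step 2 then sends MOUNTPROC_EXPORT to the port rpcbind returns.
msg = getport_call(1, MOUNT_PROG, 3)
```

In the trace above, step 1 succeeds (the portmapper answers) and the hang is on step 2, i.e. after mountd's port has already been resolved.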
The output of rpcinfo -p is the following:

   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    3   tcp    111  portmapper
    100000    2   tcp    111  portmapper
    100000    4   udp    111  portmapper
    100000    3   udp    111  portmapper
    100000    2   udp    111  portmapper
    100024    1   udp  56082  status
    100024    1   tcp  41858  status
    100003    3   udp   2049  nfs
    100003    3   tcp   2049  nfs
    100003    4   udp   2049  nfs
    100003    4   tcp   2049  nfs
    100005    1   udp  45611  mountd
    100005    1   tcp  55915  mountd
    100005    3   udp  45611  mountd
    100005    3   tcp  55915  mountd
    100021    4   udp  48775  nlockmgr
    100021    4   tcp  51621  nlockmgr
    100011    1   udp   4501  rquotad
    100011    1   tcp   4501  rquotad
    100011    2   udp   4501  rquotad
    100011    2   tcp   4501  rquotad

> What does "rpcinfo -p <server-ip>" show?
>
> Do you have selinux enabled? I am not sure if that is playing any role
> here...

Nope, it's disabled:

# uname -a
Linux node2 3.10.0-229.4.2.el7.x86_64 #1 SMP Wed May 13 10:06:09 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

Thanks for the help,

	Alessandro

> Regards, Malahal.
>
> Alessandro De Salvo [Alessandro.DeSalvo at roma1.infn.it] wrote:
>> Hi,
>> this was an extract from the old logs, before Soumya's suggestion of
>> changing the rquota port in the conf file. The new logs are attached
>> (ganesha-20150611.log.gz), as well as the gstack of the ganesha process
>> while I was executing the hanging showmount
>> (ganesha-20150611.gstack.gz).
>> Thanks,
>>
>> Alessandro
>>
>>> On Thu, 2015-06-11 at 11:37 -0500, Malahal Naineni wrote:
>>> Soumya Koduri [skoduri at redhat.com] wrote:
>>>> CCing ganesha-devel to get more input.
>>>>
>>>> In case of ipv6 enabled, only v6 interfaces are used by NFS-Ganesha.
>>>
>>> I am not a network expert, but I have seen IPv4 traffic over an IPv6
>>> interface while fixing a few things before. This may be normal.
>>>
>>>> commit - git show 'd7e8f255', which got added in v2.2, has more details.
>>>>
>>>>> # netstat -ltaupn | grep 2049
>>>>> tcp6   4  0 :::2049          :::*             LISTEN      32080/ganesha.nfsd
>>>>> tcp6   1  0 x.x.x.2:2049     x.x.x.2:33285    CLOSE_WAIT  -
>>>>> tcp6   1  0 127.0.0.1:2049   127.0.0.1:39555  CLOSE_WAIT  -
>>>>> udp6   0  0 :::2049          :::*                         32080/ganesha.nfsd
>>>>
>>>>>>> I have enabled full debug already, but I see nothing special. Before exporting any volume the log shows no error, even when I do a showmount (the log is attached, ganesha.log.gz). If I do the same after exporting a volume, nfs-ganesha does not even start, complaining that it cannot bind the IPv6 rquota socket; but in fact there is nothing listening on that port on IPv6, so this should not happen:
>>>>>>>
>>>>>>> tcp6  0  0 :::111                   :::*  LISTEN  7433/rpcbind
>>>>>>> tcp6  0  0 :::2224                  :::*  LISTEN  9054/ruby
>>>>>>> tcp6  0  0 :::22                    :::*  LISTEN  1248/sshd
>>>>>>> udp6  0  0 :::111                   :::*          7433/rpcbind
>>>>>>> udp6  0  0 fe80::8c2:27ff:fef2:123  :::*          31238/ntpd
>>>>>>> udp6  0  0 fe80::230:48ff:fed2:123  :::*          31238/ntpd
>>>>>>> udp6  0  0 fe80::230:48ff:fed2:123  :::*          31238/ntpd
>>>>>>> udp6  0  0 fe80::230:48ff:fed2:123  :::*          31238/ntpd
>>>>>>> udp6  0  0 ::1:123                  :::*          31238/ntpd
>>>>>>> udp6  0  0 fe80::5484:7aff:fef:123  :::*          31238/ntpd
>>>>>>> udp6  0  0 :::123                   :::*          31238/ntpd
>>>>>>> udp6  0  0 :::824                   :::*          7433/rpcbind
>>>>>>>
>>>>>>> The error, as shown in the attached ganesha-after-export.log.gz logfile, is the following:
>>>>>>>
>>>>>>> 10/06/2015 02:07:47 : epoch 55777fb5 : node2 : ganesha.nfsd-26195[main] Bind_sockets_V6 :DISP :WARN :Cannot bind RQUOTA tcp6 socket, error 98 (Address already in use)
>>>>>>> 10/06/2015 02:07:47 : epoch 55777fb5 : node2 : ganesha.nfsd-26195[main] Bind_sockets :DISP :FATAL :Error binding to V6 interface. Cannot continue.
>>>>>>> 10/06/2015 02:07:48 : epoch 55777fb5 : node2 : ganesha.nfsd-26195[main] glusterfs_unload :FSAL :DEBUG :FSAL Gluster unloaded
>>>
>>> The above messages indicate that someone tried to restart ganesha. But
>>> ganesha failed to come up because the RQUOTA port (default is 875) is
>>> already in use by an old ganesha instance or some other program holding
>>> it. The new instance of ganesha will die, but if you are using systemd,
>>> it will try to restart automatically. We have disabled systemd auto
>>> restart in our environment as it was causing issues for debugging.
>>>
>>> What version of ganesha is this?
>>>
>>> Regards, Malahal.
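(The "error 98" in that log is easy to reproduce in isolation. A small illustrative Python sketch, nothing ganesha-specific, showing that a second bind to an already-bound IPv6 port fails with EADDRINUSE; it assumes the host has an IPv6 loopback, as the tcp6 output above implies.)

```python
import errno
import socket

def try_bind(port):
    """Bind an IPv6 TCP socket to ::1; return (socket, 0) or (None, errno)."""
    s = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    try:
        s.bind(("::1", port))
        return s, 0
    except OSError as e:
        s.close()
        return None, e.errno

holder, err1 = try_bind(0)            # port 0: kernel picks a free port
port = holder.getsockname()[1]
loser, err2 = try_bind(port)          # same port again -> EADDRINUSE
# err2 is errno.EADDRINUSE (98 on Linux), exactly the error in the
# Bind_sockets_V6 warning: the old instance (or another daemon, e.g. a
# kernel rquotad) still holds the port, the new ganesha dies, and systemd
# restarts it in a loop.
holder.close()
```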
Alessandro De Salvo
2015-Jun-12 12:34 UTC
[Gluster-users] [Nfs-ganesha-devel] Questions on ganesha HA and shared storage size
Hi,
looking at the code, and having recompiled it with some extra debug added, I might be wrong, but it seems that in nfs_rpc_dispatcher_thread.c, function nfs_rpc_dequeue_req, the threads enter the while (!(wqe->flags & Wqe_LFlag_SyncDone)) loop and never exit from there. I do not know whether that is normal; I should read the code more carefully.
Cheers,

	Alessandro

On Fri, 2015-06-12 at 09:35 +0200, Alessandro De Salvo wrote:
> [full quote of the previous message snipped]

_______________________________________________
Gluster-users mailing list
Gluster-users at gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
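(The loop described above, while (!(wqe->flags & Wqe_LFlag_SyncDone)), is a classic condition-wait: it only exits if some other thread both sets the flag and delivers the wakeup. A toy Python model of the two outcomes -- the flag value and structure here are hypothetical, only the names mirror the mail, and this is not ganesha's actual code.)

```python
import threading

Wqe_LFlag_SyncDone = 0x1   # hypothetical value; only the name comes from the mail

class Wqe:
    """Toy stand-in for a wait-queue entry, not ganesha's real struct."""
    def __init__(self):
        self.flags = 0
        self.cond = threading.Condition()

def dequeue_wait(wqe, timeout=2.0):
    """The suspect pattern: loop until SyncDone is set in wqe.flags."""
    with wqe.cond:
        while not (wqe.flags & Wqe_LFlag_SyncDone):
            if not wqe.cond.wait(timeout=timeout):
                return False   # timed out: the stuck state gstack would show
    return True

def sync_done(wqe):
    """Producer side: set the flag and wake waiters, both under the lock."""
    with wqe.cond:
        wqe.flags |= Wqe_LFlag_SyncDone
        wqe.cond.notify_all()

ok_wqe = Wqe()
t = threading.Thread(target=sync_done, args=(ok_wqe,))
t.start()
done = dequeue_wait(ok_wqe)    # completes: flag set and wakeup delivered
t.join()

stuck_wqe = Wqe()              # nobody ever calls sync_done() on this one
stuck = dequeue_wait(stuck_wqe, timeout=0.2)   # the loop never exits
```

If the producer side never runs (or signals without setting the flag), every dequeuing thread sits in that while loop forever, which would match threads "waiting to serve requests" while showmount hangs.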