Hi Soumya,

When I originally did the tests I ran tcpdump on the client.

I have rerun the tests, running tcpdump on the server:

tcpdump -i any -nnSs 0 host 172.16.1.121 -w /root/capture_nfsfail.pcap

The results are in the same place:

http://mseas.mit.edu/download/phaley/GlusterUsers/TestNFSmount/

capture_nfsfail.pcap has the results from the failed touch experiment
capture_nfssucceed.pcap has the results from the successful touch experiment

The brick log files are there too.

I believe we are using kernel-NFS exporting a fuse-mounted gluster volume. I am having Steve confirm this. I tried to find the fuse-mnt logs but failed. Where should I look for them?

Thanks

Pat

On 07/03/2017 07:58 AM, Soumya Koduri wrote:
>
> On 06/30/2017 07:56 PM, Pat Haley wrote:
>>
>> Hi,
>>
>> I was wondering if there were any additional tests we could perform to
>> help debug the group write-permissions issue?
>
> Sorry for the delay. Please find response inline --
>
>>
>> Thanks
>>
>> Pat
>>
>> On 06/27/2017 12:29 PM, Pat Haley wrote:
>>>
>>> Hi Soumya,
>>>
>>> One example, we have a common working directory dri_fleat in the
>>> gluster volume
>>>
>>> drwxrwsr-x 22 root dri_fleat 4.0K May 1 15:14 dri_fleat
>>>
>>> my user (phaley) does not own that directory but is a member of the
>>> group dri_fleat and should have write permissions. When I go to the
>>> nfs-mounted version and try to use the touch command I get the
>>> following
>>>
>>> ibfdr-compute-0-4(dri_fleat)% touch dum
>>> touch: cannot touch `dum': Permission denied
>>>
>>> One of the sub-directories under dri_fleat is "test" which phaley owns
>>>
>>> drwxrwsr-x 2 phaley dri_fleat 4.0K May 1 15:16 test
>>>
>>> Under this directory (mounted via nfs) user phaley can write
>>>
>>> ibfdr-compute-0-4(test)% touch dum
>>> ibfdr-compute-0-4(test)%
>>>
>>> I have put the packet captures in
>>>
>>> http://mseas.mit.edu/download/phaley/GlusterUsers/TestNFSmount/
>>>
>>> capture_nfsfail.pcap has the results from the failed touch experiment
>>> capture_nfssucceed.pcap has the results from the successful touch
>>> experiment
>>>
>>> The command I used for these was
>>>
>>> tcpdump -i ib0 -nnSs 0 host 172.16.1.119 -w /root/capture_nfstest.pcap
>
> I hope these pkts were captured on the node where the NFS server is
> running. Could you please use '-i any', as I do not see glusterfs
> traffic in the tcpdump.
>
> Also, it looks like NFS v4 is used between the client & nfs server. Are
> you using kernel-NFS here (i.e., kernel-NFS exporting a fuse-mounted
> gluster volume)?
> If that is the case, please capture the fuse-mnt logs as well. This
> error may well be coming from kernel-NFS itself before the request is
> sent to the fuse-mnt process.
>
> FWIW, we have the below option -
>
> Option: server.manage-gids
> Default Value: off
> Description: Resolve groups on the server-side.
>
> I haven't looked into what this option exactly does. But it may be
> worth testing with this option on.
>
> Thanks,
> Soumya
>
>>>
>>> The brick log files are also in the above link. If I read them
>>> correctly they both have funny times. Specifically I see entries from
>>> around 2017-06-27 14:02:37.404865 even though the system time was
>>> 2017-06-27 12:00:00.
>>>
>>> One final item, another reply to my post had a link for possible
>>> problems that could arise from users belonging to too many groups. We
>>> have seen the above problem even with a user belonging to only 4
>>> groups.
>>>
>>> Let me know what additional information I can provide.
>>>
>>> Thanks
>>>
>>> Pat
>>>
>>> On 06/27/2017 02:45 AM, Soumya Koduri wrote:
>>>>
>>>> On 06/27/2017 10:17 AM, Pranith Kumar Karampuri wrote:
>>>>> The only problem with using gluster mounted via NFS is that it
>>>>> does not respect the group write permissions which we need.
>>>>>
>>>>> We have an exercise coming up in a couple of weeks. It seems to me
>>>>> that in order to improve our write times before then, it would be
>>>>> good to solve the group write permissions for gluster mounted via
>>>>> NFS now. We can then revisit gluster mounted via FUSE afterwards.
>>>>>
>>>>> What information would you need to help us force gluster mounted
>>>>> via NFS to respect the group write permissions?
>>>>
>>>> Is this the owning group or one of the auxiliary groups whose write
>>>> permissions are not considered? AFAIK, there are no special
>>>> permission checks done by the gNFS server when compared to the
>>>> gluster native client.
>>>>
>>>> Could you please provide simple steps to reproduce the issue and
>>>> collect the pkt trace and nfs/brick logs as well.
>>>>
>>>> Thanks,
>>>> Soumya

--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley                          Email: phaley at mit.edu
Center for Ocean Engineering       Phone: (617) 253-6824
Dept. of Mechanical Engineering    Fax:   (617) 253-8125
MIT, Room 5-213                    http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA 02139-4301
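The expectation described in the quoted 06/27 message (group write bit set on dri_fleat, and phaley a member of that group) can be written down as a small check. This is only a sketch and is not from the thread: the helper name expect_group_write is hypothetical, and the group list in the example is assumed from the description (phaley's other groups are not given). It looks only at the mode string and the membership list, so it says what the client-side view implies, not how kNFS or the fuse mount resolve groups.

```shell
#!/bin/sh
# Hypothetical helper: given an `ls -l` style mode string, the directory's
# group, and a space-separated list of the user's groups (as printed by
# `id -Gn`), succeed if group write access would normally be expected.
expect_group_write() {
    mode="$1"
    dir_group="$2"
    user_groups="$3"

    # Character 6 of the mode string is the group write bit.
    gw=$(printf '%s' "$mode" | cut -c6)
    [ "$gw" = "w" ] || return 1

    # The directory's group must appear in the user's group list.
    for g in $user_groups; do
        if [ "$g" = "$dir_group" ]; then
            return 0
        fi
    done
    return 1
}

# Values taken from the directory listing in the thread; the group list
# "phaley dri_fleat" is an assumption.
if expect_group_write "drwxrwsr-x" "dri_fleat" "phaley dri_fleat"; then
    echo "group write expected"
else
    echo "group write not expected"
fi
```

On a live system with GNU coreutils the three arguments could be filled in with `stat -c '%A' dri_fleat`, `stat -c '%G' dri_fleat` and `id -Gn phaley`. Running the same comparison on the NFS client and on the server exporting the fuse mount would show whether the two sides agree on the group list.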
On 07/03/2017 09:01 PM, Pat Haley wrote:
>
> Hi Soumya,
>
> When I originally did the tests I ran tcpdump on the client.
>
> I have rerun the tests, doing tcpdump on the server
>
> tcpdump -i any -nnSs 0 host 172.16.1.121 -w /root/capture_nfsfail.pcap
>
> The results are in the same place
>
> http://mseas.mit.edu/download/phaley/GlusterUsers/TestNFSmount/
>
> capture_nfsfail.pcap has the results from the failed touch experiment
> capture_nfssucceed.pcap has the results from the successful touch
> experiment
>
> The brick log files are there too.

Thanks for sharing. It looks like the error is not generated on the gluster server side. The permission-denied error was caused by kNFS, by the fuse-mnt process, or probably by the combination of the two.

To check the fuse-mnt logs, please look at /var/log/glusterfs/<fuse_mnt_directory>.log

For example, if you have fuse-mounted the gluster volume at /mnt/fuse-mnt and exported it via kNFS, the log for that fuse mount will be at /var/log/glusterfs/mnt-fuse-mnt.log

Also, why not switch to either the gluster-NFS native server or NFS-Ganesha instead of using kNFS? They are the recommended NFS servers to use with gluster.

Thanks,
Soumya

>
> I believe we are using kernel-NFS exporting a fuse mounted gluster
> volume. I am having Steve confirm this. I tried to find the fuse-mnt
> logs but failed. Where should I look for them?
>
> Thanks
>
> Pat
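The naming rule in Soumya's example (mount point /mnt/fuse-mnt, log file mnt-fuse-mnt.log) can be sketched as a small shell helper. The function name fuse_log_path is hypothetical, and the rule is inferred only from that one example: drop the leading slash, replace the remaining slashes with hyphens, and append .log under /var/log/glusterfs.

```shell
#!/bin/sh
# Hypothetical helper: derive the expected fuse client log path from a
# mount point, following the /mnt/fuse-mnt -> mnt-fuse-mnt.log example.
fuse_log_path() {
    mnt="${1%/}"                                  # drop a trailing slash, if any
    name=$(printf '%s' "${mnt#/}" | tr '/' '-')   # drop leading slash, '/' -> '-'
    printf '/var/log/glusterfs/%s.log\n' "$name"
}

# The mount point used in the example from the thread.
fuse_log_path /mnt/fuse-mnt
```

If the log was redirected at mount time with an explicit log-file option, the file will be wherever that option points instead, so this only covers the default case.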
Hi Soumya,

(1) In

http://mseas.mit.edu/download/phaley/GlusterUsers/TestNFSmount/

I've placed the following two log files:

etc-glusterfs-glusterd.vol.log
gdata.log

The first has repeated messages about nfs disconnects. The second is the <fuse_mnt_directory>.log file (it does not contain much information).

(2) About the gluster-NFS native server: do you know where we can find documentation on how to install and use it? We haven't had success in our searches.

Thanks

Pat

On 07/04/2017 05:01 AM, Soumya Koduri wrote:
>
> Thanks for sharing. Looks like the error is not generated
> @gluster-server side. The permission denied error was caused by either
> kNFS or by fuse-mnt process or probably by the combination.
>
> To check fuse-mnt logs, please look at
> /var/log/glusterfs/<fuse_mnt_directory>.log
>
> For eg.: if you have fuse mounted the gluster volume at /mnt/fuse-mnt
> and exported it via kNFS, the log location for that fuse_mnt shall be
> at /var/log/glusterfs/mnt-fuse-mnt.log
>
> Also why not switch to either gluster-NFS native server or NFS-Ganesha
> instead of using kNFS, as they are recommended NFS servers to use with
> gluster?
>
> Thanks,
> Soumya
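Soumya's earlier suggestion to test with server.manage-gids turned on could be tried roughly as follows. This is a sketch under assumptions: the volume name myvol is hypothetical (the thread never states the volume name), and the helper only prints the commands so that nothing is changed by accident. `gluster volume set` is the usual way to change a volume option, and `gluster volume info` lists reconfigured options, which is one way to confirm the setting took effect.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the gluster CLI commands that would
# enable server.manage-gids on a volume and then show its options.
# Nothing is executed against the cluster.
manage_gids_commands() {
    vol="$1"
    printf 'gluster volume set %s server.manage-gids on\n' "$vol"
    printf 'gluster volume info %s\n' "$vol"
}

# "myvol" is a placeholder; substitute the real volume name.
manage_gids_commands myvol
```

After running the printed commands on a server node, repeating the failed touch test from the NFS client would show whether the option makes any difference in this kNFS-over-fuse setup; the thread does not say that it will.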