When I use the packages from "centos-gluster37" to set up Gluster with a ZFS backend, ganesha-nfsd aborts with an ABRT signal whenever I copy, or simply rsync, a directory to the share exported from nfs-ganesha.

Environment:
CentOS 7 - kernel 3.10.0-327.22.2.el7.x86_64
ZFS Version: 0.6.5.7 Release : 1.el7.centos
Gluster 3.7.13.1.el7 from centos-gluster37
nfs-ganesha 2.3.0.1.el7 from centos-gluster37

This is the only line I got from strace when attached to the PID of ganesha-nfsd:

futex(0x7f623ffff9d0, FUTEX_WAIT, 38303, NULL <detached ...>

In ganesha-gfapi.log, the following entries appear whenever I try to copy files. They keep complaining about split-brain and stale-file-handle issues in the Gluster volume.

[2016-08-01 23:06:09.423901] W [MSGID: 108008] [afr-read-txn.c:244:afr_read_txn] 0-nfsvol1-replicate-0: Unreadable subvolume -1 found with event generation 2 for gfid 3f713211-7573-45b1-aed8-503c8e17714b. (Possible split-brain)

[2016-08-01 23:06:09.425664] E [MSGID: 109040] [dht-helper.c:1190:dht_migration_complete_check_task] 0-nfsvol1-dht: <gfid:3f713211-7573-45b1-aed8-503c8e17714b>: failed to lookup the file on nfsvol1-dht [Stale file handle]

I tried both Gluster 3.7.13 and 3.7.12, and both versions give me the same problem. Only after I downgraded Gluster to 3.7.11 did nfs-ganesha play nicely with Gluster.

I first wondered whether this was related to my ZFS backend, so I set up a quick test on my laptop using XFS as the backend, with 3 nodes running Gluster 3.7.13 and nfs-ganesha 2.3.0, and I got the same result: rsync/copy of files to the NFS shares exported from Gluster+Ganesha aborted after a few files were copied. For your reference, pacemaker and corosync both keep running in the background even when this happens.

I am wondering whether something introduced since 3.7.12 somehow breaks the interface between nfs-ganesha and Gluster. Any pointers will be appreciated.
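To see how often each of these messages shows up and which files they concern, the entries can be tallied by severity, MSGID and gfid. The following is only a minimal Python sketch: the regular expression is inferred from the two log entries quoted above, it assumes each entry sits on a single line in ganesha-gfapi.log, and the names `ENTRY_RE`, `GFID_RE`, `summarize` and `SAMPLE` are made up for illustration.

```python
import re
from collections import Counter

# Entry layout inferred from the two quoted log lines (an assumption, not a
# documented format): [timestamp] LEVEL [MSGID: n] [file:line:func] xlator: message
ENTRY_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\]\s+"
    r"(?P<level>[A-Z])\s+"
    r"\[MSGID:\s*(?P<msgid>\d+)\]\s+"
    r"\[(?P<src>[^\]]+)\]\s+"
    r"(?P<xlator>\S+):\s+"
    r"(?P<msg>.*)"
)
# A gfid is printed as a lowercase UUID.
GFID_RE = re.compile(r"[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}")


def summarize(lines):
    """Count log entries per (level, msgid, gfid); unparsable lines are skipped."""
    counts = Counter()
    for line in lines:
        m = ENTRY_RE.match(line.strip())
        if m is None:
            continue
        gfid = GFID_RE.search(m.group("msg"))
        key = (m.group("level"), int(m.group("msgid")),
               gfid.group(0) if gfid else None)
        counts[key] += 1
    return counts


# The two entries from this post, each joined onto one line.
SAMPLE = [
    "[2016-08-01 23:06:09.423901] W [MSGID: 108008] "
    "[afr-read-txn.c:244:afr_read_txn] 0-nfsvol1-replicate-0: Unreadable "
    "subvolume -1 found with event generation 2 for gfid "
    "3f713211-7573-45b1-aed8-503c8e17714b. (Possible split-brain)",
    "[2016-08-01 23:06:09.425664] E [MSGID: 109040] "
    "[dht-helper.c:1190:dht_migration_complete_check_task] 0-nfsvol1-dht: "
    "<gfid:3f713211-7573-45b1-aed8-503c8e17714b>: failed to lookup the file "
    "on nfsvol1-dht [Stale file handle]",
]

if __name__ == "__main__":
    for (level, msgid, gfid), n in summarize(SAMPLE).items():
        print(level, msgid, gfid, n)
```

To run it against a real log, pass an open file instead of `SAMPLE`, for example `summarize(open(path))`, where `path` is wherever ganesha-gfapi.log lives on your system.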
Soumya Koduri
2016-Aug-02 18:31 UTC
[Gluster-users] Gluster 3.7.13 with nfs-ganesha 2.3.0.1
Hi,

Does your test involve multiple ganesha servers or removing files?

http://review.gluster.org/#/c/14522/ (merged in 3.7.12) caused a regression in upcall processing of nfs-ganesha. It is being fixed as part of http://review.gluster.org/14701 .

Could you please turn off upcalls (using the command below) and retry the tests?

cmd: gluster v set <volname> features.cache-invalidation off

Thanks,
Soumya

On 08/02/2016 11:48 PM, ML Wong wrote:
> When i have used the packages from "centos-gluster37" to setup my
> Gluster with ZFS backend, ganesha-nfsd will throw me a ABRT signal when
> i tried to copy, or simply rsync a directory to the share exported from
> nfs-ganesha.
> [...]
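If the workaround above has to be applied on several test clusters, it can be scripted. What follows is only a sketch in Python: the helper names `cache_invalidation_cmd` and `set_cache_invalidation` are hypothetical, the volume name `nfsvol1` is a guess taken from the `0-nfsvol1-replicate-0` prefix in the quoted log, and `gluster volume set` is used as the long form of the `gluster v set` command given above. By default nothing is executed; the command line is only returned so it can be reviewed first.

```python
import subprocess


def cache_invalidation_cmd(volname, enable):
    """Build argv for: gluster volume set <volname> features.cache-invalidation on|off"""
    # Reject empty names and anything the CLI could mistake for an option.
    if not volname or volname.startswith("-"):
        raise ValueError("invalid volume name: %r" % (volname,))
    return ["gluster", "volume", "set", volname,
            "features.cache-invalidation", "on" if enable else "off"]


def set_cache_invalidation(volname, enable, dry_run=True):
    """Return the command line; run it only when dry_run is False."""
    argv = cache_invalidation_cmd(volname, enable)
    if not dry_run:
        # Needs the gluster CLI and enough privileges on a cluster node.
        subprocess.run(argv, check=True)
    return " ".join(argv)


if __name__ == "__main__":
    # "nfsvol1" is an assumed volume name; substitute your own.
    print(set_cache_invalidation("nfsvol1", enable=False))
```

Calling `set_cache_invalidation("nfsvol1", enable=True, dry_run=False)` later would switch upcalls back on, for instance once a build containing the fix is installed.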