莊尚豪
2015-Jun-24 09:09 UTC
[Gluster-users] pnfs(glusterfs-3.7.1 + ganesha-2.2) problem for clients layout commit
Hi all,

I am testing the performance of pNFS (gluster-3.7.1 + ganesha-2.2) on Fedora 22.
There are 4 glusterfs nodes, each running ganesha.
I followed https://gluster.readthedocs.org/en/latest/Features/mount_gluster_volume_using_pnfs/

The clients (Fedora 21) can mount the volume and commit small files to gluster without problems.
However, when I dd a bigger file (600 MB), the client hangs during layout commit.

Here is some tshark output from the client:
--------------------------------------------------
331036 68.713945 192.168.100.12 -> 192.168.100.16 NFS 182 V4 Reply (Call In 331012) WRITE
331037 68.718067 192.168.100.16 -> 192.168.100.12 NFS 286 V4 Call COMMIT FH: 0x67571bfb Offset: 0 Len: 0
331038 68.718999 192.168.100.12 -> 192.168.100.16 NFS 174 V4 Reply (Call In 331037) COMMIT
331039 68.740898 192.168.100.16 -> 192.168.100.10 NFS 334 V4 Call LAYOUTCOMMIT
331040 68.741619 192.168.100.10 -> 192.168.100.16 NFS 114 V4 Reply (Call In 331039) SEQUENCE Status: NFS4ERR_BADSESSION
331041 68.741684 192.168.100.16 -> 192.168.100.10 TCP 66 908→2049 [ACK] Seq=8561 Ack=6417 Win=942 Len=0 TSval=8509746 TSecr=108629619
331042 68.742060 192.168.100.16 -> 192.168.100.10 NFS 186 V4 Call DESTROY_SESSION
331043 68.742686 192.168.100.10 -> 192.168.100.16 NFS 114 V4 Reply (Call In 331042) DESTROY_SESSION Status: NFS4ERR_BADSESSION
331044 68.742819 192.168.100.16 -> 192.168.100.10 NFS 298 V4 Call CREATE_SESSION
331045 68.743371 192.168.100.10 -> 192.168.100.16 NFS 114 V4 Reply (Call In 331044) CREATE_SESSION Status: NFS4ERR_STALE_CLIENTID
331046 68.743529 192.168.100.16 -> 192.168.100.10 NFS 334 V4 Call EXCHANGE_ID
331047 68.744174 192.168.100.10 -> 192.168.100.16 NFS 182 V4 Reply (Call In 331046) EXCHANGE_ID
331048 68.744317 192.168.100.16 -> 192.168.100.10 NFS 298 V4 Call CREATE_SESSION
331049 68.756698 192.168.100.10 -> 192.168.100.16 NFS 154 V1 CB_NULL Call
331050 68.756825 192.168.100.16 -> 192.168.100.10 NFS 94 V1 CB_NULL Reply (Call In 331049)
331051 68.757543 192.168.100.10 -> 192.168.100.16 NFS 194 V4 Reply (Call In 331048) CREATE_SESSION
331052 68.757655 192.168.100.16 -> 192.168.100.10 NFS 218 V4 Call PUTROOTFH | GETATTR
331053 68.758289 192.168.100.10 -> 192.168.100.16 NFS 182 V4 Reply (Call In 331052) PUTROOTFH | GETATTR
331054 68.758329 192.168.100.16 -> 192.168.100.12 TCP 66 980→2049 [ACK] Seq=1574831169 Ack=697393 Win=31360 Len=0 TSval=8509763 TSecr=109327597
331055 68.761509 192.168.100.16 -> 192.168.100.10 NFS 338 V4 Call OPEN DH: 0x1dfddbb4/
331056 68.762148 192.168.100.10 -> 192.168.100.16 NFS 166 V4 Reply (Call In 331055) OPEN Status: NFS4ERR_NO_GRACE
331057 68.762324 192.168.100.16 -> 192.168.100.10 NFS 210 V4 Call RECLAIM_COMPLETE
331058 68.762969 192.168.100.10 -> 192.168.100.16 NFS 158 V4 Reply (Call In 331057) RECLAIM_COMPLETE
331059 68.763135 192.168.100.16 -> 192.168.100.10 NFS 338 V4 Call OPEN DH: 0x1dfddbb4/
331060 68.763844 192.168.100.10 -> 192.168.100.16 NFS 398 V4 Reply (Call In 331059) OPEN StateID: 0x9d75
331061 68.764080 192.168.100.16 -> 192.168.100.10 NFS 334 V4 Call LAYOUTCOMMIT
331062 68.764720 192.168.100.10 -> 192.168.100.16 NFS 166 V4 Reply (Call In 331061) LAYOUTCOMMIT Status: NFS4ERR_EXPIRED
331063 68.765075 192.168.100.16 -> 192.168.100.10 NFS 202 V4 Call SEQUENCE
331064 68.765699 192.168.100.10 -> 192.168.100.16 NFS 150 V4 Reply (Call In 331063) SEQUENCE
331065 68.765906 192.168.100.16 -> 192.168.100.10 NFS 334 V4 Call LAYOUTCOMMIT
331066 68.766555 192.168.100.10 -> 192.168.100.16 NFS 166 V4 Reply (Call In 331065) LAYOUTCOMMIT Status: NFS4ERR_EXPIRED
331067 68.766855 192.168.100.16 -> 192.168.100.10 NFS 202 V4 Call SEQUENCE
331068 68.767493 192.168.100.10 -> 192.168.100.16 NFS 150 V4 Reply (Call In 331067) SEQUENCE
331069 68.767697 192.168.100.16 -> 192.168.100.10 NFS 334 V4 Call LAYOUTCOMMIT
--------------------------------------------------

Meanwhile, the ganesha server logs the following when the client fails to commit:
----------------------------------------------------------------------------------------------------
24/06/2015 16:08:49 : epoch 558a6555 : gluster1 : nfs-ganesha-20876[reaper] nfs_in_grace :STATE :EVENT :NFS Server Now NOT IN GRACE
24/06/2015 16:09:56 : epoch 558a6555 : gluster1 : nfs-ganesha-20876[work-10] nfs4_op_lookup :EXPORT :MAJ :PSEUDO FS JUNCTION TRAVERSAL: Failed to get FSAL credentials for /ganesha, id=1
24/06/2015 16:09:56 : epoch 558a6555 : gluster1 : nfs-ganesha-20876[work-13] nfs4_op_lookup :EXPORT :MAJ :PSEUDO FS JUNCTION TRAVERSAL: Failed to get FSAL credentials for /ganesha, id=1
----------------------------------------------------------------------------------------------------

The following is my glusterfs node configuration. The configuration is the same on all nodes.

/etc/ganesha/gluster.conf
--------------------------------------------------
EXPORT{
    Export_Id = 1;
    Path = /ganesha; #Is the path attribute useless in the configuration?
    FSAL {
        name = GLUSTER;
        hostname = "localhost";
        volume = "gluster";
    }
    Access_type = RW;
    Squash = No_Root_Squash;
    Disable_ACL = TRUE;
    Pseudo = /ganesha;
    Protocols = "4";
    Transports = "TCP";
    SecType = sys, krb5, krb5i, krb5p;
}
--------------------------------------------------

gluster volume info
--------------------------------------------------
Volume Name: gluster
Type: Distribute
Volume ID: 8a5afe82-41fe-456e-935f-3361edce1995
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: 192.168.100.10:/volume1/brick1
Brick2: 192.168.100.11:/volume1/brick1
Brick3: 192.168.100.12:/volume1/brick1
Brick4: 192.168.100.13:/volume1/brick1
Options Reconfigured:
nfs.disable: ON
performance.readdir-ahead: on
--------------------------------------------------

Is this a problem on the ganesha server side or on the client side?
Or will it be fixed in ganesha 2.3?

Many thanks,
Ben
Jiffin Tony Thottan
2015-Jun-24 12:59 UTC
[Gluster-users] pnfs(glusterfs-3.7.1 + ganesha-2.2) problem for clients layout commit
Hi,

Comments inline.

On 24/06/15 14:39, 莊尚豪 wrote:
> Hi all,
>
> I am testing the performance of pNFS (gluster-3.7.1 + ganesha-2.2) on Fedora 22.
> There are 4 glusterfs nodes, each running ganesha.
> I followed https://gluster.readthedocs.org/en/latest/Features/mount_gluster_volume_using_pnfs/

Can you check whether ganesha is running on every node (the MDS and the DSes)?
# service nfs-ganesha status
or try
# ps ax | grep ganesha
Also check whether the volume is exported on every node:
# showmount -e localhost

> The clients (Fedora 21) can mount the volume and commit small files to gluster without problems.
> However, when I dd a bigger file (600 MB), the client hangs during layout commit.

I tested dd with file sizes up to 10 GB (two to three months back) and did not notice this issue.

> Here is some tshark output from the client:
> --------------------------------------------------
> 331036 68.713945 192.168.100.12 -> 192.168.100.16 NFS 182 V4 Reply (Call In 331012) WRITE
> 331037 68.718067 192.168.100.16 -> 192.168.100.12 NFS 286 V4 Call COMMIT FH: 0x67571bfb Offset: 0 Len: 0
> 331038 68.718999 192.168.100.12 -> 192.168.100.16 NFS 174 V4 Reply (Call In 331037) COMMIT
> 331039 68.740898 192.168.100.16 -> 192.168.100.10 NFS 334 V4 Call LAYOUTCOMMIT
> 331040 68.741619 192.168.100.10 -> 192.168.100.16 NFS 114 V4 Reply (Call In 331039) SEQUENCE Status: NFS4ERR_BADSESSION
> 331041 68.741684 192.168.100.16 -> 192.168.100.10 TCP 66 908→2049 [ACK] Seq=8561 Ack=6417 Win=942 Len=0 TSval=8509746 TSecr=108629619
> 331042 68.742060 192.168.100.16 -> 192.168.100.10 NFS 186 V4 Call DESTROY_SESSION
> 331043 68.742686 192.168.100.10 -> 192.168.100.16 NFS 114 V4 Reply (Call In 331042) DESTROY_SESSION Status: NFS4ERR_BADSESSION
> 331044 68.742819 192.168.100.16 -> 192.168.100.10 NFS 298 V4 Call CREATE_SESSION
> 331045 68.743371 192.168.100.10 -> 192.168.100.16 NFS 114 V4 Reply (Call In 331044) CREATE_SESSION Status: NFS4ERR_STALE_CLIENTID
> 331046 68.743529 192.168.100.16 -> 192.168.100.10 NFS 334 V4 Call EXCHANGE_ID
> 331047 68.744174 192.168.100.10 -> 192.168.100.16 NFS 182 V4 Reply (Call In 331046) EXCHANGE_ID
> 331048 68.744317 192.168.100.16 -> 192.168.100.10 NFS 298 V4 Call CREATE_SESSION
> 331049 68.756698 192.168.100.10 -> 192.168.100.16 NFS 154 V1 CB_NULL Call
> 331050 68.756825 192.168.100.16 -> 192.168.100.10 NFS 94 V1 CB_NULL Reply (Call In 331049)
> 331051 68.757543 192.168.100.10 -> 192.168.100.16 NFS 194 V4 Reply (Call In 331048) CREATE_SESSION
> 331052 68.757655 192.168.100.16 -> 192.168.100.10 NFS 218 V4 Call PUTROOTFH | GETATTR
> 331053 68.758289 192.168.100.10 -> 192.168.100.16 NFS 182 V4 Reply (Call In 331052) PUTROOTFH | GETATTR
> 331054 68.758329 192.168.100.16 -> 192.168.100.12 TCP 66 980→2049 [ACK] Seq=1574831169 Ack=697393 Win=31360 Len=0 TSval=8509763 TSecr=109327597
> 331055 68.761509 192.168.100.16 -> 192.168.100.10 NFS 338 V4 Call OPEN DH: 0x1dfddbb4/
> 331056 68.762148 192.168.100.10 -> 192.168.100.16 NFS 166 V4 Reply (Call In 331055) OPEN Status: NFS4ERR_NO_GRACE
> 331057 68.762324 192.168.100.16 -> 192.168.100.10 NFS 210 V4 Call RECLAIM_COMPLETE
> 331058 68.762969 192.168.100.10 -> 192.168.100.16 NFS 158 V4 Reply (Call In 331057) RECLAIM_COMPLETE
> 331059 68.763135 192.168.100.16 -> 192.168.100.10 NFS 338 V4 Call OPEN DH: 0x1dfddbb4/
> 331060 68.763844 192.168.100.10 -> 192.168.100.16 NFS 398 V4 Reply (Call In 331059) OPEN StateID: 0x9d75
> 331061 68.764080 192.168.100.16 -> 192.168.100.10 NFS 334 V4 Call LAYOUTCOMMIT
> 331062 68.764720 192.168.100.10 -> 192.168.100.16 NFS 166 V4 Reply (Call In 331061) LAYOUTCOMMIT Status: NFS4ERR_EXPIRED
> 331063 68.765075 192.168.100.16 -> 192.168.100.10 NFS 202 V4 Call SEQUENCE
> 331064 68.765699 192.168.100.10 -> 192.168.100.16 NFS 150 V4 Reply (Call In 331063) SEQUENCE
> 331065 68.765906 192.168.100.16 -> 192.168.100.10 NFS 334 V4 Call LAYOUTCOMMIT
> 331066 68.766555 192.168.100.10 -> 192.168.100.16 NFS 166 V4 Reply (Call In 331065) LAYOUTCOMMIT Status: NFS4ERR_EXPIRED
> 331067 68.766855 192.168.100.16 -> 192.168.100.10 NFS 202 V4 Call SEQUENCE
> 331068 68.767493 192.168.100.10 -> 192.168.100.16 NFS 150 V4 Reply (Call In 331067) SEQUENCE
> 331069 68.767697 192.168.100.16 -> 192.168.100.10 NFS 334 V4 Call LAYOUTCOMMIT
> --------------------------------------------------
>
> Meanwhile, the ganesha server logs the following when the client fails to commit:
> ----------------------------------------------------------------------------------------------------
> 24/06/2015 16:08:49 : epoch 558a6555 : gluster1 : nfs-ganesha-20876[reaper] nfs_in_grace :STATE :EVENT :NFS Server Now NOT IN GRACE
> 24/06/2015 16:09:56 : epoch 558a6555 : gluster1 : nfs-ganesha-20876[work-10] nfs4_op_lookup :EXPORT :MAJ :PSEUDO FS JUNCTION TRAVERSAL: Failed to get FSAL credentials for /ganesha, id=1
> 24/06/2015 16:09:56 : epoch 558a6555 : gluster1 : nfs-ganesha-20876[work-13] nfs4_op_lookup :EXPORT :MAJ :PSEUDO FS JUNCTION TRAVERSAL: Failed to get FSAL credentials for /ganesha, id=1
> ----------------------------------------------------------------------------------------------------

I suspect the logs indicate that the lookup fails to find the export entry. Some changes may be needed in the configuration:

> The following is my glusterfs node configuration. The configuration is the same on all nodes.
>
> /etc/ganesha/gluster.conf
> --------------------------------------------------
> EXPORT{
>     Export_Id = 1;

It is better to use an export_id other than one or zero. As far as I know, export_id = 1 is used by the pseudo_fs ('/').

>     Path = /ganesha; #Is the path attribute useless in the configuration?

It should be Path = /<volname>; in your case that is /gluster.

>     FSAL {
>         name = GLUSTER;
>         hostname = "localhost";
>         volume = "gluster";
>     }
>     Access_type = RW;
>     Squash = No_Root_Squash;
>     Disable_ACL = TRUE;
>     Pseudo = /ganesha;
>     Protocols = "4";
>     Transports = "TCP";
>     SecType = sys, krb5, krb5i, krb5p;
> }
> --------------------------------------------------

You should include the following block in the conf file of the MDS (due to the latest changes):

GLUSTER {
    PNFS_MDS = true;
}

> gluster volume info
> --------------------------------------------------
> Volume Name: gluster
> Type: Distribute
> Volume ID: 8a5afe82-41fe-456e-935f-3361edce1995
> Status: Started
> Number of Bricks: 4
> Transport-type: tcp
> Bricks:
> Brick1: 192.168.100.10:/volume1/brick1
> Brick2: 192.168.100.11:/volume1/brick1
> Brick3: 192.168.100.12:/volume1/brick1
> Brick4: 192.168.100.13:/volume1/brick1
> Options Reconfigured:
> nfs.disable: ON
> performance.readdir-ahead: on

Also turn on the cache-invalidation feature for the volume:
# gluster v set <volname> features.cache-invalidation on

> --------------------------------------------------
>
> Is this a problem on the ganesha server side or on the client side?
> Or will it be fixed in ganesha 2.3?
>
> Many thanks,
> Ben

Regards,
Jiffin
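
Putting the suggestions in this reply together, the export file on the MDS might end up looking roughly like the sketch below. This is an untested illustration, not a verified setup: the Export_Id value of 2 is only an assumed example (any value other than 0 or 1, per the advice above), Pseudo is left exactly as in the original post, and every other setting is copied unchanged from the original configuration.

```
# /etc/ganesha/gluster.conf (sketch; Export_Id = 2 is an assumed example value)
EXPORT {
    Export_Id = 2;          # avoid 0 and 1, as suggested in the reply
    Path = /gluster;        # /<volname>, here the volume is named "gluster"
    FSAL {
        name = GLUSTER;
        hostname = "localhost";
        volume = "gluster";
    }
    Access_type = RW;
    Squash = No_Root_Squash;
    Disable_ACL = TRUE;
    Pseudo = /ganesha;      # unchanged from the original post
    Protocols = "4";
    Transports = "TCP";
    SecType = sys, krb5, krb5i, krb5p;
}

# Only in the conf file of the MDS, per the reply
GLUSTER {
    PNFS_MDS = true;
}
```

The cache-invalidation option is a volume setting rather than part of this file, so it would still be enabled separately with the gluster v set command mentioned above.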