Avra Sengupta
2016-Jun-28 10:17 UTC
[Gluster-users] Fedora upgrade to f24 installed 3.8.0 client and broke mounting
Hi,

The patch (http://review.gluster.org/#/c/14811/) passed all regressions. If any of you could merge it, I would backport it to 3.8.

Regards,
Avra

On 06/27/2016 12:04 PM, Avra Sengupta wrote:
> On 06/25/2016 01:19 AM, Vijay Bellur wrote:
>> On 06/24/2016 02:12 PM, Alastair Neil wrote:
>>> I upgraded my fedora 23 system to f24 a couple of days ago, now I am
>>> unable to mount my gluster cluster.
>>>
>>> The update installed:
>>>
>>> glusterfs-3.8.0-1.fc24.x86_64
>>> glusterfs-libs-3.8.0-1.fc24.x86_64
>>> glusterfs-fuse-3.8.0-1.fc24.x86_64
>>> glusterfs-client-xlators-3.8.0-1.fc24.x86_64
>>>
>>> the gluster is running 3.7.11
>>>
>>> The volume is replica 3
>>>
>>> I see these errors in the mount log:
>>>
>>> [2016-06-24 17:55:34.016462] I [MSGID: 100030]
>>> [glusterfsd.c:2408:main] 0-/usr/sbin/glusterfs: Started running
>>> /usr/sbin/glusterfs version 3.8.0 (args: /usr/sbin/glusterfs
>>> --volfile-server=gluster1 --volfile-id=homes /mnt/homes)
>>> [2016-06-24 17:55:34.094345] I [MSGID: 101190]
>>> [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
>>> thread with index 1
>>> [2016-06-24 17:55:34.240135] I [MSGID: 101190]
>>> [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
>>> thread with index 2
>>> [2016-06-24 17:55:34.240130] I [MSGID: 101190]
>>> [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
>>> thread with index 4
>>> [2016-06-24 17:55:34.240130] I [MSGID: 101190]
>>> [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
>>> thread with index 3
>>> [2016-06-24 17:55:34.241499] I [MSGID: 114020]
>>> [client.c:2356:notify] 0-homes-client-2: parent translators are
>>> ready, attempting connect on transport
>>> [2016-06-24 17:55:34.249172] I [MSGID: 114020]
>>> [client.c:2356:notify] 0-homes-client-5: parent translators are
>>> ready, attempting connect on transport
>>> [2016-06-24 17:55:34.250186] I [rpc-clnt.c:1855:rpc_clnt_reconfig]
>>> 0-homes-client-2: changing port to 49171 (from 0)
>>> [2016-06-24 17:55:34.253347] I [MSGID: 114020]
>>> [client.c:2356:notify] 0-homes-client-6: parent translators are
>>> ready, attempting connect on transport
>>> [2016-06-24 17:55:34.254213] I [rpc-clnt.c:1855:rpc_clnt_reconfig]
>>> 0-homes-client-5: changing port to 49154 (from 0)
>>> [2016-06-24 17:55:34.255115] I [MSGID: 114057]
>>> [client-handshake.c:1441:select_server_supported_programs]
>>> 0-homes-client-2: Using Program GlusterFS 3.3, Num (1298437),
>>> Version (330)
>>> [2016-06-24 17:55:34.255861] W [MSGID: 114007]
>>> [client-handshake.c:1176:client_setvolume_cbk] 0-homes-client-2:
>>> failed to find key 'child_up' in the options
>>> [2016-06-24 17:55:34.259097] I [MSGID: 114057]
>>> [client-handshake.c:1441:select_server_supported_programs]
>>> 0-homes-client-5: Using Program GlusterFS 3.3, Num (1298437),
>>> Version (330)
>>> Final graph:
>>> +------------------------------------------------------------------------------+
>>>
>>> 1: volume homes-client-2
>>> 2:     type protocol/client
>>> 3:     option clnt-lk-version 1
>>> 4:     option volfile-checksum 0
>>> 5:     option volfile-key homes
>>> 6:     option client-version 3.8.0
>>> 7:     option process-uuid
>>> Island-29185-2016/06/24-17:55:34:10054-homes-client-2-0-0
>>> 8:     option fops-version 1298437
>>> 9:     option ping-timeout 20
>>> 10:     option remote-host gluster-2
>>> 11:     option remote-subvolume /export/brick2/home
>>> 12:     option transport-type socket
>>> 13:     option event-threads 4
>>> 14:     option send-gids true
>>> 15: end-volume
>>> 16:
>>> 17: volume homes-client-5
>>> 18:     type protocol/client
>>> 19:     option clnt-lk-version 1
>>> 20:     option volfile-checksum 0
>>> 21:     option volfile-key homes
>>> 22:     option client-version 3.8.0
>>> 23:     option process-uuid
>>> Island-29185-2016/06/24-17:55:34:10054-homes-client-5-0-0
>>> 24:     option fops-version 1298437
>>> 25:     option ping-timeout 20
>>> 26:     option remote-host gluster1.vsnet.gmu.edu
>>> 27:     option remote-subvolume /export/brick2/home
>>> 28:     option transport-type socket
>>> 29:     option event-threads 4
>>> 30:     option send-gids true
>>> 31: end-volume
>>> 32:
>>> 33: volume homes-client-6
>>> 34:     type protocol/client
>>> 35:     option ping-timeout 20
>>> 36:     option remote-host gluster0
>>> 37:     option remote-subvolume /export/brick2/home
>>> 38:     option transport-type socket
>>> 39:     option event-threads 4
>>> 40:     option send-gids true
>>> 41: end-volume
>>> 42:
>>> 43: volume homes-replicate-0
>>> 44:     type cluster/replicate
>>> 45:     option background-self-heal-count 20
>>> 46:     option metadata-self-heal on
>>> 47:     option data-self-heal off
>>> 48:     option entry-self-heal on
>>> 49:     option data-self-heal-window-size 8
>>> 50:     option data-self-heal-algorithm diff
>>> 51:     option eager-lock on
>>> 52:     option quorum-type auto
>>> 53:     option self-heal-readdir-size 64KB
>>> 54:     subvolumes homes-client-2 homes-client-5 homes-client-6
>>> 55: end-volume
>>> 56:
>>> 57: volume homes-dht
>>> 58:     type cluster/distribute
>>> 59:     option min-free-disk 5%
>>> 60:     option rebalance-stats on
>>> 61:     option readdir-optimize on
>>> 62:     subvolumes homes-replicate-0
>>> 63: end-volume
>>> 64:
>>> 65: volume homes-read-ahead
>>> 66:     type performance/read-ahead
>>> 67:     subvolumes homes-dht
>>> 68: end-volume
>>> 69:
>>> 70: volume homes-io-cache
>>> 71:     type performance/io-cache
>>> 72:     subvolumes homes-read-ahead
>>> 73: end-volume
>>> 74:
>>> 75: volume homes-quick-read
>>> 76:     type performance/quick-read
>>> 77:     subvolumes homes-io-cache
>>> 78: end-volume
>>> 79:
>>> 80: volume homes-open-behind
>>> 81:     type performance/open-behind
>>> 82:     subvolumes homes-quick-read
>>> 83: end-volume
>>> 84:
>>> 85: volume homes-md-cache
>>> 86:     type performance/md-cache
>>> 87:     subvolumes homes-open-behind
>>> 88: end-volume
>>> 89:
>>> 90: volume homes
>>> 91:     type debug/io-stats
>>> 92:     option log-level INFO
>>> 93:     option latency-measurement off
>>> 94:     option count-fop-hits on
>>> 95:     subvolumes homes-md-cache
>>> 96: end-volume
>>> 97:
>>> 98: volume meta-autoload
>>> 99:     type meta
>>> 100:     subvolumes homes
>>> 101: end-volume
>>> 102:
>>> +------------------------------------------------------------------------------+
>>>
>>> [2016-06-24 17:55:34.261219] I [rpc-clnt.c:1855:rpc_clnt_reconfig]
>>> 0-homes-client-6: changing port to 49153 (from 0)
>>> [2016-06-24 17:55:34.266096] I [MSGID: 114057]
>>> [client-handshake.c:1441:select_server_supported_programs]
>>> 0-homes-client-6: Using Program GlusterFS 3.3, Num (1298437),
>>> Version (330)
>>> [2016-06-24 17:55:34.266905] W [MSGID: 114007]
>>> [client-handshake.c:1176:client_setvolume_cbk] 0-homes-client-6:
>>> failed to find key 'child_up' in the options
>>> [2016-06-24 17:55:34.273618] W [MSGID: 114007]
>>> [client-handshake.c:1176:client_setvolume_cbk] 0-homes-client-5:
>>> failed to find key 'child_up' in the options
>>>
>>> I checked the release notes for 3.8.0 but I did not see any caveats or
>>> compatibility warnings.
>>>
>>> Anyone else seeing issues with 3.8 clients mounting 3.7 volumes?
>>>
>>
>> Seems like it is due to this commit:
>>
>> commit 2bfdc30e0e7fba6f97d8829b2618a1c5907dc404
>> Author: Avra Sengupta
>> Date: Mon Feb 29 14:43:58 2016 +0530
>>
>>     protocol client/server: Fix client-server handshake
>>
>> This commit introduced a new check to determine the existence of a
>> key in the dictionary that gets exchanged between clients and servers
>> during a handshake. Upon not finding the key, the clients bail out.
>>
>> Avra - would it be possible to avoid a hard check of 'child_up'
>> during a handshake?
> Yes Vijay, This particular failure is because the client is expecting
> a 'child_up' from the server during a handshake, to determine if all
> children in the server are up and it's not just a handshake. Although
> this is the ideal behaviour in which the handshake should work, it is
> currently breaking backward compatibility with 3.7 volumes, as those
> servers are not sending the appropriate key which the newer client is
> expecting.
>
> I would prefer not to bypass this check in the client, but rather
> enforce this check only for connections coming from servers running 3.8.
>
> + Adding Raghavendra Gowdappa
>
> Raghavendra,
>
> Would it be possible to keep this check in the client specific to
> servers running on 3.8 and beyond?
>>
>> Note that if servers are upgraded ahead of the clients, this problem
>> should not be seen.
>>
>> Thanks,
>> Vijay
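
To make the proposal quoted above concrete (enforce the 'child_up' check only against servers new enough to send the key), here is a minimal Python sketch. It is not GlusterFS source code and not the content of the patch under review: the function names, the use of a plain dict for the handshake reply, and the assumption that the client knows the server's version string are all hypothetical, chosen only to illustrate the compatibility rule.

```python
# Hypothetical model of the handshake check discussed in this thread.
# Nothing here is a real GlusterFS API; names are illustrative only.

def parse_version(text):
    """Turn a dotted version such as '3.7.11' into (3, 7, 11)."""
    return tuple(int(part) for part in text.split("."))


# Per the thread, 3.7 servers do not send 'child_up'; 3.8 servers do.
CHILD_UP_INTRODUCED = parse_version("3.8.0")


def handshake_ok(reply, server_version):
    """Decide whether the client should treat the brick as up.

    reply          -- dict standing in for the options the server sends back
    server_version -- server version string (assumed to be known to the client)
    """
    if "child_up" in reply:
        # Newer server: honour what it reports.
        return bool(reply["child_up"])
    if parse_version(server_version) < CHILD_UP_INTRODUCED:
        # Older server: the key is legitimately absent, so do not bail out.
        return True
    # Server should have sent the key; keep the hard check.
    return False
```

With this rule a 3.8.0 client would still mount a 3.7.11 volume (`handshake_ok({}, "3.7.11")` is true), while a 3.8 server that omits the key is still rejected.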
Avra Sengupta
2016-Jun-29 07:15 UTC
[Gluster-users] Fedora upgrade to f24 installed 3.8.0 client and broke mounting
Thanks Jeff for merging the patch.

I have backported it to 3.8 (http://review.gluster.org/#/c/14810). I will notify once the regressions have passed.

Regards,
Avra
Avra Sengupta
2016-Jul-05 11:48 UTC
[Gluster-users] Fedora upgrade to f24 installed 3.8.0 client and broke mounting
The 3.8 patch (http://review.gluster.org/#/c/14810/) has passed all regressions. Can someone please merge it?

Regards,
Avra
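
Until a fixed client is installed, the symptom reported at the start of this thread can be recognised from the mount log. The following Python sketch is only a convenience for that: the function name and regular expression are my own, and it assumes the warning text matches the lines quoted in the original report ("failed to find key 'child_up' in the options"). It collapses whitespace first so that it works on both raw logs and mail-wrapped excerpts.

```python
import re

# Matches the warning quoted in the original report, capturing the client
# translator name that follows the "0-" prefix (e.g. "homes-client-2").
CHILD_UP_WARNING = re.compile(
    r"\b0-(?P<xlator>\S+?):\s+failed to find key 'child_up'"
)


def affected_clients(log_text):
    """Return the sorted client translators that logged the child_up warning."""
    # Collapse newlines and repeated spaces so wrapped log entries still match.
    joined = " ".join(log_text.split())
    return sorted({m.group("xlator") for m in CHILD_UP_WARNING.finditer(joined)})
```

Pass it the full text of the client mount log; an empty list means the warning was not found.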