Vijay Bellur
2016-Jun-24 19:49 UTC
[Gluster-users] Fedora upgrade to f24 installed 3.8.0 client and broke mounting
On 06/24/2016 02:12 PM, Alastair Neil wrote:
> I upgraded my fedora 23 system to f24 a couple of days ago, now I am
> unable to mount my gluster cluster.
>
> The update installed:
>
> glusterfs-3.8.0-1.fc24.x86_64
> glusterfs-libs-3.8.0-1.fc24.x86_64
> glusterfs-fuse-3.8.0-1.fc24.x86_64
> glusterfs-client-xlators-3.8.0-1.fc24.x86_64
>
> the gluster is running 3.7.11
>
> The volume is replica 3
>
> I see these errors in the mount log:
>
> [2016-06-24 17:55:34.016462] I [MSGID: 100030]
> [glusterfsd.c:2408:main] 0-/usr/sbin/glusterfs: Started running
> /usr/sbin/glusterfs version 3.8.0 (args: /usr/sbin/glusterfs
> --volfile-server=gluster1 --volfile-id=homes /mnt/homes)
> [2016-06-24 17:55:34.094345] I [MSGID: 101190]
> [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
> thread with index 1
> [2016-06-24 17:55:34.240135] I [MSGID: 101190]
> [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
> thread with index 2
> [2016-06-24 17:55:34.240130] I [MSGID: 101190]
> [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
> thread with index 4
> [2016-06-24 17:55:34.240130] I [MSGID: 101190]
> [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
> thread with index 3
> [2016-06-24 17:55:34.241499] I [MSGID: 114020]
> [client.c:2356:notify] 0-homes-client-2: parent translators are
> ready, attempting connect on transport
> [2016-06-24 17:55:34.249172] I [MSGID: 114020]
> [client.c:2356:notify] 0-homes-client-5: parent translators are
> ready, attempting connect on transport
> [2016-06-24 17:55:34.250186] I [rpc-clnt.c:1855:rpc_clnt_reconfig]
> 0-homes-client-2: changing port to 49171 (from 0)
> [2016-06-24 17:55:34.253347] I [MSGID: 114020]
> [client.c:2356:notify] 0-homes-client-6: parent translators are
> ready, attempting connect on transport
> [2016-06-24 17:55:34.254213] I [rpc-clnt.c:1855:rpc_clnt_reconfig]
> 0-homes-client-5: changing port to 49154 (from 0)
> [2016-06-24 17:55:34.255115] I [MSGID: 114057]
> [client-handshake.c:1441:select_server_supported_programs]
> 0-homes-client-2: Using Program GlusterFS 3.3, Num (1298437),
> Version (330)
> [2016-06-24 17:55:34.255861] W [MSGID: 114007]
> [client-handshake.c:1176:client_setvolume_cbk] 0-homes-client-2:
> failed to find key 'child_up' in the options
> [2016-06-24 17:55:34.259097] I [MSGID: 114057]
> [client-handshake.c:1441:select_server_supported_programs]
> 0-homes-client-5: Using Program GlusterFS 3.3, Num (1298437),
> Version (330)
> Final graph:
> +------------------------------------------------------------------------------+
>   1: volume homes-client-2
>   2:     type protocol/client
>   3:     option clnt-lk-version 1
>   4:     option volfile-checksum 0
>   5:     option volfile-key homes
>   6:     option client-version 3.8.0
>   7:     option process-uuid
> Island-29185-2016/06/24-17:55:34:10054-homes-client-2-0-0
>   8:     option fops-version 1298437
>   9:     option ping-timeout 20
>  10:     option remote-host gluster-2
>  11:     option remote-subvolume /export/brick2/home
>  12:     option transport-type socket
>  13:     option event-threads 4
>  14:     option send-gids true
>  15: end-volume
>  16:
>  17: volume homes-client-5
>  18:     type protocol/client
>  19:     option clnt-lk-version 1
>  20:     option volfile-checksum 0
>  21:     option volfile-key homes
>  22:     option client-version 3.8.0
>  23:     option process-uuid
> Island-29185-2016/06/24-17:55:34:10054-homes-client-5-0-0
>  24:     option fops-version 1298437
>  25:     option ping-timeout 20
>  26:     option remote-host gluster1.vsnet.gmu.edu
>  27:     option remote-subvolume /export/brick2/home
>  28:     option transport-type socket
>  29:     option event-threads 4
>  30:     option send-gids true
>  31: end-volume
>  32:
>  33: volume homes-client-6
>  34:     type protocol/client
>  35:     option ping-timeout 20
>  36:     option remote-host gluster0
>  37:     option remote-subvolume /export/brick2/home
>  38:     option transport-type socket
>  39:     option event-threads 4
>  40:     option send-gids true
>  41: end-volume
>  42:
>  43: volume homes-replicate-0
>  44:     type cluster/replicate
>  45:     option background-self-heal-count 20
>  46:     option metadata-self-heal on
>  47:     option data-self-heal off
>  48:     option entry-self-heal on
>  49:     option data-self-heal-window-size 8
>  50:     option data-self-heal-algorithm diff
>  51:     option eager-lock on
>  52:     option quorum-type auto
>  53:     option self-heal-readdir-size 64KB
>  54:     subvolumes homes-client-2 homes-client-5 homes-client-6
>  55: end-volume
>  56:
>  57: volume homes-dht
>  58:     type cluster/distribute
>  59:     option min-free-disk 5%
>  60:     option rebalance-stats on
>  61:     option readdir-optimize on
>  62:     subvolumes homes-replicate-0
>  63: end-volume
>  64:
>  65: volume homes-read-ahead
>  66:     type performance/read-ahead
>  67:     subvolumes homes-dht
>  68: end-volume
>  69:
>  70: volume homes-io-cache
>  71:     type performance/io-cache
>  72:     subvolumes homes-read-ahead
>  73: end-volume
>  74:
>  75: volume homes-quick-read
>  76:     type performance/quick-read
>  77:     subvolumes homes-io-cache
>  78: end-volume
>  79:
>  80: volume homes-open-behind
>  81:     type performance/open-behind
>  82:     subvolumes homes-quick-read
>  83: end-volume
>  84:
>  85: volume homes-md-cache
>  86:     type performance/md-cache
>  87:     subvolumes homes-open-behind
>  88: end-volume
>  89:
>  90: volume homes
>  91:     type debug/io-stats
>  92:     option log-level INFO
>  93:     option latency-measurement off
>  94:     option count-fop-hits on
>  95:     subvolumes homes-md-cache
>  96: end-volume
>  97:
>  98: volume meta-autoload
>  99:     type meta
> 100:     subvolumes homes
> 101: end-volume
> 102:
> +------------------------------------------------------------------------------+
> [2016-06-24 17:55:34.261219] I [rpc-clnt.c:1855:rpc_clnt_reconfig]
> 0-homes-client-6: changing port to 49153 (from 0)
> [2016-06-24 17:55:34.266096] I [MSGID: 114057]
> [client-handshake.c:1441:select_server_supported_programs]
> 0-homes-client-6: Using Program GlusterFS 3.3, Num (1298437),
> Version (330)
> [2016-06-24 17:55:34.266905] W [MSGID: 114007]
> [client-handshake.c:1176:client_setvolume_cbk] 0-homes-client-6:
> failed to find key 'child_up' in the options
> [2016-06-24 17:55:34.273618] W [MSGID: 114007]
> [client-handshake.c:1176:client_setvolume_cbk] 0-homes-client-5:
> failed to find key 'child_up' in the options
>
> I checked the release notes for 3.8.0 but I did not see any caveats or
> compatibility warnings.
>
> Anyone else seeing issues with 3.8 clients mounting 3.7 volumes?
>

Seems like it is due to this commit:

commit 2bfdc30e0e7fba6f97d8829b2618a1c5907dc404
Author: Avra Sengupta
Date: Mon Feb 29 14:43:58 2016 +0530

    protocol client/server: Fix client-server handshake

This commit introduced a new check to determine the existence of a key in the dictionary that gets exchanged between clients and servers during a handshake. Upon not finding the key, the clients bail out.

Avra - would it be possible to avoid a hard check of 'child_up' during a handshake?

Note that if servers are upgraded ahead of the clients, this problem should not be seen.

Thanks,
Vijay
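
The failure mode described above can be illustrated with a minimal sketch. This is Python, not the GlusterFS C code: the function name, the exception, and the shape of the reply dictionary are all hypothetical, and only the key name 'child_up' and the warning text are taken from the mount log. It shows how a hard existence check on a handshake key makes a newer client give up on an older server that never sends that key.

```python
# Hypothetical model of a client-side handshake reply handler.
# Not GlusterFS code; only the key name 'child_up' comes from the log.


class HandshakeError(Exception):
    """Raised when the client gives up on the handshake."""


def strict_setvolume_reply(options: dict) -> bool:
    """Hard check: a reply that lacks 'child_up' aborts the handshake."""
    if "child_up" not in options:
        raise HandshakeError("failed to find key 'child_up' in the options")
    return bool(options["child_up"])


# Assumed reply from a newer server, which includes the key.
new_server_reply = {"child_up": 1}
# Assumed reply from an older server, which never sets the key.
old_server_reply = {}

print(strict_setvolume_reply(new_server_reply))
try:
    strict_setvolume_reply(old_server_reply)
except HandshakeError as err:
    print("handshake aborted:", err)
```

With a check like this, upgrading the servers first avoids the problem, because every reply the newer client sees then carries the key.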
Avra Sengupta
2016-Jun-27 06:34 UTC
[Gluster-users] Fedora upgrade to f24 installed 3.8.0 client and broke mounting
On 06/25/2016 01:19 AM, Vijay Bellur wrote:
> Seems like it is due to this commit:
>
> commit 2bfdc30e0e7fba6f97d8829b2618a1c5907dc404
> Author: Avra Sengupta
> Date: Mon Feb 29 14:43:58 2016 +0530
>
>     protocol client/server: Fix client-server handshake
>
> This commit introduced a new check to determine the existence of a key
> in the dictionary that gets exchanged between clients and servers
> during a handshake. Upon not finding the key, the clients bail out.
>
> Avra - would it be possible to avoid a hard check of 'child_up' during
> a handshake?

Yes Vijay,

This particular failure occurs because the client expects a 'child_up' key from the server during the handshake, so that it can determine whether all children on the server are up, and not merely that the handshake itself went through. Although this is how the handshake should ideally work, it currently breaks backward compatibility with 3.7 volumes, because those servers do not send the key that the newer client expects. I would prefer not to bypass this check in the client, but rather to enforce it only for connections coming from servers running 3.8.

+ Adding Raghavendra Gowdappa

Raghavendra,

Would it be possible to keep this check in the client specific to servers running 3.8 and beyond?

> Note that if servers are upgraded ahead of the clients, this problem
> should not be seen.
>
> Thanks,
> Vijay
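
The idea of enforcing the check only for 3.8 servers could look roughly like the sketch below. It is Python, not GlusterFS code, and it rests on two assumptions that this thread does not confirm: that the client can learn the server's version during the handshake, and that a missing key from an older server can safely be treated as "children are up". The names `parse_version` and `gated_setvolume_reply` are hypothetical.

```python
# Hypothetical sketch of a version-gated handshake check; not GlusterFS code.


def parse_version(text: str) -> tuple:
    """Turn '3.7.11' into (3, 7, 11) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def gated_setvolume_reply(options: dict, server_version: str) -> bool:
    """Require 'child_up' only from servers at 3.8 or later.

    Assumption: the server version is known to the client at this point.
    Assumption: an older server that omits the key is treated as up.
    """
    if "child_up" in options:
        return bool(options["child_up"])
    if parse_version(server_version) >= (3, 8):
        # A 3.8+ server is expected to send the key, so its absence is an error.
        raise ValueError("server >= 3.8 did not send 'child_up'")
    # Pre-3.8 servers never send the key; fall back to the old behaviour.
    return True
```

Under these assumptions, `gated_setvolume_reply({}, "3.7.11")` accepts the handshake, while the same empty reply from a "3.8.0" server is still rejected, so the stricter behaviour is kept where the server can honour it.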