Atin Mukherjee
2017-Jul-05 15:22 UTC
[Gluster-users] op-version for reset-brick (Was: Re: [ovirt-users] Upgrading HC from 4.0 to 4.1)
And what does glusterd log indicate for these failures?

On Wed, Jul 5, 2017 at 8:43 PM, Gianluca Cecchi <gianluca.cecchi at gmail.com> wrote:

> On Wed, Jul 5, 2017 at 5:02 PM, Sahina Bose <sabose at redhat.com> wrote:
>
>> On Wed, Jul 5, 2017 at 8:16 PM, Gianluca Cecchi <gianluca.cecchi at gmail.com> wrote:
>>
>>> On Wed, Jul 5, 2017 at 7:42 AM, Sahina Bose <sabose at redhat.com> wrote:
>>>
>>>>> ...
>>>>>
>>>>> then the commands I need to run would be:
>>>>>
>>>>> gluster volume reset-brick export ovirt01.localdomain.local:/gluster/brick3/export start
>>>>> gluster volume reset-brick export ovirt01.localdomain.local:/gluster/brick3/export gl01.localdomain.local:/gluster/brick3/export commit force
>>>>>
>>>>> Correct?
>>>>
>>>> Yes, correct. gl01.localdomain.local should resolve correctly on all 3 nodes.
>>>
>>> It fails at the first step:
>>>
>>> [root at ovirt01 ~]# gluster volume reset-brick export ovirt01.localdomain.local:/gluster/brick3/export start
>>> volume reset-brick: failed: Cannot execute command. The cluster is operating at version 30712. reset-brick command reset-brick start is unavailable in this version.
>>> [root at ovirt01 ~]#
>>>
>>> It seems somehow related to the upgrade procedure described here, although that document is for the commercial solution Red Hat Gluster Storage:
>>> https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3.1/html/Installation_Guide/chap-Upgrading_Red_Hat_Storage.html
>>>
>>> So it seems I have to run a command of the form:
>>>
>>> gluster volume set all cluster.op-version XXXXX
>>>
>>> with XXXXX > 30712.
>>>
>>> It seems that the latest version of the commercial Red Hat Gluster Storage is 3.1 and its op-version is indeed 30712.
>>>
>>> So the question is which particular op-version I have to set, and whether the command can be run online without generating disruption...
>>
>> It should have worked with the glusterfs 3.10 version from the CentOS repo.
>> Adding gluster-users for help on the op-version.
>>
>>> Thanks,
>>> Gianluca
>
> It seems op-version is not updated automatically by default, so that it
> can manage mixed versions while you update nodes one by one...
> I followed what is described here:
> https://gluster.readthedocs.io/en/latest/Upgrade-Guide/op_version/
>
> - Get the current version:
>
> [root at ovirt01 ~]# gluster volume get all cluster.op-version
> Option                                  Value
> ------                                  -----
> cluster.op-version                      30712
> [root at ovirt01 ~]#
>
> - Get the maximum version I can set for the current setup:
>
> [root at ovirt01 ~]# gluster volume get all cluster.max-op-version
> Option                                  Value
> ------                                  -----
> cluster.max-op-version                  31000
> [root at ovirt01 ~]#
>
> - Get op-version information for all the connected clients:
>
> [root at ovirt01 ~]# gluster volume status all clients | grep ":49" | awk '{print $4}' | sort | uniq -c
>      72 31000
> [root at ovirt01 ~]#
>
> --> ok
>
> - Update the op-version:
>
> [root at ovirt01 ~]# gluster volume set all cluster.op-version 31000
> volume set: success
> [root at ovirt01 ~]#
>
> - Verify:
>
> [root at ovirt01 ~]# gluster volume get all cluster.op-version
> Option                                  Value
> ------                                  -----
> cluster.op-version                      31000
> [root at ovirt01 ~]#
>
> --> ok
>
> [root at ovirt01 ~]# gluster volume reset-brick export ovirt01.localdomain.local:/gluster/brick3/export start
> volume reset-brick: success: reset-brick start operation successful
>
> [root at ovirt01 ~]# gluster volume reset-brick export ovirt01.localdomain.local:/gluster/brick3/export gl01.localdomain.local:/gluster/brick3/export commit force
> volume reset-brick: failed: Commit failed on ovirt02.localdomain.local. Please check log file for details.
> Commit failed on ovirt03.localdomain.local. Please check log file for details.
> [root at ovirt01 ~]#
>
> [root at ovirt01 bricks]# gluster volume info export
>
> Volume Name: export
> Type: Replicate
> Volume ID: b00e5839-becb-47e7-844f-6ce6ce1b7153
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: gl01.localdomain.local:/gluster/brick3/export
> Brick2: ovirt02.localdomain.local:/gluster/brick3/export
> Brick3: ovirt03.localdomain.local:/gluster/brick3/export (arbiter)
> Options Reconfigured:
> transport.address-family: inet
> performance.readdir-ahead: on
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> network.remote-dio: off
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> storage.owner-uid: 36
> storage.owner-gid: 36
> features.shard: on
> features.shard-block-size: 512MB
> performance.low-prio-threads: 32
> cluster.data-self-heal-algorithm: full
> cluster.locking-scheme: granular
> cluster.shd-wait-qlength: 10000
> cluster.shd-max-threads: 6
> network.ping-timeout: 30
> user.cifs: off
> nfs.disable: on
> performance.strict-o-direct: on
>
> [root at ovirt01 bricks]# gluster volume reset-brick export ovirt02.localdomain.local:/gluster/brick3/export start
> volume reset-brick: success: reset-brick start operation successful
> [root at ovirt01 bricks]# gluster volume reset-brick export ovirt02.localdomain.local:/gluster/brick3/export gl02.localdomain.local:/gluster/brick3/export commit force
> volume reset-brick: failed: Commit failed on localhost. Please check log file for details.
> [root at ovirt01 bricks]#
>
> I proceed anyway (I actually have nothing on the export volume...)
> [root at ovirt01 bricks]# gluster volume reset-brick export ovirt02.localdomain.local:/gluster/brick3/export start
> volume reset-brick: success: reset-brick start operation successful
> [root at ovirt01 bricks]# gluster volume reset-brick export ovirt02.localdomain.local:/gluster/brick3/export gl02.localdomain.local:/gluster/brick3/export commit force
> volume reset-brick: failed: Commit failed on localhost. Please check log file for details.
> [root at ovirt01 bricks]#
>
> Again an error.
>
> [root at ovirt01 bricks]# gluster volume info export
>
> Volume Name: export
> Type: Replicate
> Volume ID: b00e5839-becb-47e7-844f-6ce6ce1b7153
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 0 x (2 + 1) = 2
> Transport-type: tcp
> Bricks:
> Brick1: gl01.localdomain.local:/gluster/brick3/export
> Brick2: ovirt03.localdomain.local:/gluster/brick3/export
> Options Reconfigured:
> transport.address-family: inet
> performance.readdir-ahead: on
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> network.remote-dio: off
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> storage.owner-uid: 36
> storage.owner-gid: 36
> features.shard: on
> features.shard-block-size: 512MB
> performance.low-prio-threads: 32
> cluster.data-self-heal-algorithm: full
> cluster.locking-scheme: granular
> cluster.shd-wait-qlength: 10000
> cluster.shd-max-threads: 6
> network.ping-timeout: 30
> user.cifs: off
> nfs.disable: on
> performance.strict-o-direct: on
> [root at ovirt01 bricks]#
>
> The last one:
>
> [root at ovirt01 bricks]# gluster volume reset-brick export ovirt03.localdomain.local:/gluster/brick3/export start
> volume reset-brick: success: reset-brick start operation successful
> [root at ovirt01 bricks]# gluster volume reset-brick export ovirt03.localdomain.local:/gluster/brick3/export gl03.localdomain.local:/gluster/brick3/export commit force
> volume reset-brick: failed: Commit failed on localhost. Please check log file for details.
> [root at ovirt01 bricks]#
>
> Again an error.
>
> [root at ovirt01 bricks]# gluster volume info export
>
> Volume Name: export
> Type: Replicate
> Volume ID: b00e5839-becb-47e7-844f-6ce6ce1b7153
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 0 x (2 + 1) = 1
> Transport-type: tcp
> Bricks:
> Brick1: gl01.localdomain.local:/gluster/brick3/export
> Options Reconfigured:
> transport.address-family: inet
> performance.readdir-ahead: on
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> network.remote-dio: off
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> storage.owner-uid: 36
> storage.owner-gid: 36
> features.shard: on
> features.shard-block-size: 512MB
> performance.low-prio-threads: 32
> cluster.data-self-heal-algorithm: full
> cluster.locking-scheme: granular
> cluster.shd-wait-qlength: 10000
> cluster.shd-max-threads: 6
> network.ping-timeout: 30
> user.cifs: off
> nfs.disable: on
> performance.strict-o-direct: on
> [root at ovirt01 bricks]#
>
> See here for the gluster log in gzip format:
> https://drive.google.com/file/d/0BwoPbcrMv8mvQmlYZjAySTZKTzQ/view?usp=sharing
>
> The first command was executed at 14:57 and the other two at 15:04.
>
> This is what oVirt shows right now for the volume:
> https://drive.google.com/file/d/0BwoPbcrMv8mvNFAyd043TnNwSEU/view?usp=sharing
>
> (After the first command I saw 2 of 3 bricks up)
>
> Gianluca
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
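For anyone trying to answer Atin's question on their own setup, a minimal sketch of how the relevant glusterd log lines could be pulled from all three nodes follows. It assumes the default log path /var/log/glusterfs/glusterd.log, the 14:57-15:04 window Gianluca mentions above, and this thread's hostnames; it is not part of the original exchange, so adjust it to your environment.

# Sketch: collect warning/error lines from glusterd.log on each node around
# the time of the failed reset-brick commits (assumes the default log location
# and that the log timestamps match the times quoted in the mail).
for h in ovirt01 ovirt02 ovirt03; do
    echo "==== ${h} ===="
    ssh root@"${h}".localdomain.local \
        "awk '/2017-07-05 1(4:5[7-9]|5:0[0-9])/' /var/log/glusterfs/glusterd.log | grep -E '\] (E|W) \['"
done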
Gianluca Cecchi
2017-Jul-05 15:42 UTC
[Gluster-users] op-version for reset-brick (Was: Re: [ovirt-users] Upgrading HC from 4.0 to 4.1)
On Wed, Jul 5, 2017 at 5:22 PM, Atin Mukherjee <amukherj at redhat.com> wrote:

> And what does glusterd log indicate for these failures?

See here, in gzip format:
https://drive.google.com/file/d/0BwoPbcrMv8mvYmlRLUgyV0pFN0k/view?usp=sharing

It seems that on each host the peer files have been updated with a new "hostname2" entry:

[root at ovirt01 ~]# cat /var/lib/glusterd/peers/*
uuid=b89311fe-257f-4e44-8e15-9bff6245d689
state=3
hostname1=ovirt02.localdomain.local
hostname2=10.10.2.103
uuid=ec81a04c-a19c-4d31-9d82-7543cefe79f3
state=3
hostname1=ovirt03.localdomain.local
hostname2=10.10.2.104
[root at ovirt01 ~]#

[root at ovirt02 ~]# cat /var/lib/glusterd/peers/*
uuid=e9717281-a356-42aa-a579-a4647a29a0bc
state=3
hostname1=ovirt01.localdomain.local
hostname2=10.10.2.102
uuid=ec81a04c-a19c-4d31-9d82-7543cefe79f3
state=3
hostname1=ovirt03.localdomain.local
hostname2=10.10.2.104
[root at ovirt02 ~]#

[root at ovirt03 ~]# cat /var/lib/glusterd/peers/*
uuid=b89311fe-257f-4e44-8e15-9bff6245d689
state=3
hostname1=ovirt02.localdomain.local
hostname2=10.10.2.103
uuid=e9717281-a356-42aa-a579-a4647a29a0bc
state=3
hostname1=ovirt01.localdomain.local
hostname2=10.10.2.102
[root at ovirt03 ~]#

But the volume information was not updated the same way: the second and third nodes have lost the ovirt01/gl01 host brick information...

E.g. on ovirt02:

[root at ovirt02 peers]# gluster volume info export

Volume Name: export
Type: Replicate
Volume ID: b00e5839-becb-47e7-844f-6ce6ce1b7153
Status: Started
Snapshot Count: 0
Number of Bricks: 0 x (2 + 1) = 2
Transport-type: tcp
Bricks:
Brick1: ovirt02.localdomain.local:/gluster/brick3/export
Brick2: ovirt03.localdomain.local:/gluster/brick3/export
Options Reconfigured:
transport.address-family: inet
performance.readdir-ahead: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: off
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
storage.owner-gid: 36
features.shard: on
features.shard-block-size: 512MB
performance.low-prio-threads: 32
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 6
network.ping-timeout: 30
user.cifs: off
nfs.disable: on
performance.strict-o-direct: on
[root at ovirt02 peers]#

And on ovirt03:

[root at ovirt03 ~]# gluster volume info export

Volume Name: export
Type: Replicate
Volume ID: b00e5839-becb-47e7-844f-6ce6ce1b7153
Status: Started
Snapshot Count: 0
Number of Bricks: 0 x (2 + 1) = 2
Transport-type: tcp
Bricks:
Brick1: ovirt02.localdomain.local:/gluster/brick3/export
Brick2: ovirt03.localdomain.local:/gluster/brick3/export
Options Reconfigured:
transport.address-family: inet
performance.readdir-ahead: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: off
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
storage.owner-gid: 36
features.shard: on
features.shard-block-size: 512MB
performance.low-prio-threads: 32
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 6
network.ping-timeout: 30
user.cifs: off
nfs.disable: on
performance.strict-o-direct: on
[root at ovirt03 ~]#

While on ovirt01 the volume seems isolated, with only its own brick left...
[root at ovirt01 ~]# gluster volume info export

Volume Name: export
Type: Replicate
Volume ID: b00e5839-becb-47e7-844f-6ce6ce1b7153
Status: Started
Snapshot Count: 0
Number of Bricks: 0 x (2 + 1) = 1
Transport-type: tcp
Bricks:
Brick1: gl01.localdomain.local:/gluster/brick3/export
Options Reconfigured:
transport.address-family: inet
performance.readdir-ahead: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: off
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
storage.owner-gid: 36
features.shard: on
features.shard-block-size: 512MB
performance.low-prio-threads: 32
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 6
network.ping-timeout: 30
user.cifs: off
nfs.disable: on
performance.strict-o-direct: on
[root at ovirt01 ~]#
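Since `gluster volume info` now disagrees between the peers, one quick way to confirm how far the stored volume definitions have diverged is to compare what glusterd keeps on disk for the volume on each node. The sketch below is a diagnostic aid, not a documented recovery procedure; it assumes glusterd's default working directory /var/lib/glusterd (so the "export" definition lives under /var/lib/glusterd/vols/export/) and this thread's hostnames.

# Compare the stored volume checksum and the per-brick definition files that
# glusterd keeps for "export" on each peer (paths assume the default
# /var/lib/glusterd working directory).
for h in ovirt01 ovirt02 ovirt03; do
    echo "==== ${h} ===="
    ssh root@"${h}".localdomain.local \
        "cat /var/lib/glusterd/vols/export/cksum; ls /var/lib/glusterd/vols/export/bricks/"
done

If the checksums or brick lists differ between nodes, that would be consistent with the "Commit failed" messages above: the new brick path was committed on only some of the peers.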