Atin Mukherjee
2017-Jul-05 15:22 UTC
[Gluster-users] op-version for reset-brick (Was: Re: [ovirt-users] Upgrading HC from 4.0 to 4.1)
And what does glusterd log indicate for these failures?

On Wed, Jul 5, 2017 at 8:43 PM, Gianluca Cecchi <gianluca.cecchi at gmail.com> wrote:

> On Wed, Jul 5, 2017 at 5:02 PM, Sahina Bose <sabose at redhat.com> wrote:
>
>> On Wed, Jul 5, 2017 at 8:16 PM, Gianluca Cecchi <gianluca.cecchi at gmail.com> wrote:
>>
>>> On Wed, Jul 5, 2017 at 7:42 AM, Sahina Bose <sabose at redhat.com> wrote:
>>>
>>>>> ...
>>>>>
>>>>> then the commands I need to run would be:
>>>>>
>>>>> gluster volume reset-brick export ovirt01.localdomain.local:/gluster/brick3/export start
>>>>> gluster volume reset-brick export ovirt01.localdomain.local:/gluster/brick3/export gl01.localdomain.local:/gluster/brick3/export commit force
>>>>>
>>>>> Correct?
>>>>
>>>> Yes, correct. gl01.localdomain.local should resolve correctly on all 3 nodes.
>>>
>>> It fails at the first step:
>>>
>>> [root at ovirt01 ~]# gluster volume reset-brick export ovirt01.localdomain.local:/gluster/brick3/export start
>>> volume reset-brick: failed: Cannot execute command. The cluster is operating at version 30712. reset-brick command reset-brick start is unavailable in this version.
>>> [root at ovirt01 ~]#
>>>
>>> It seems somehow related to the upgrade procedure described here, although that document is for the commercial solution Red Hat Gluster Storage:
>>> https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3.1/html/Installation_Guide/chap-Upgrading_Red_Hat_Storage.html
>>>
>>> So it seems I have to run a command of the form:
>>>
>>> gluster volume set all cluster.op-version XXXXX
>>>
>>> with XXXXX > 30712.
>>>
>>> It seems that the latest version of the commercial Red Hat Gluster Storage is 3.1 and its op-version is indeed 30712.
>>>
>>> So the question is which particular op-version I have to set, and whether the command can be run online without generating disruption...
>>
>> It should have worked with the glusterfs 3.10 version from the CentOS repo.
>> Adding gluster-users for help on the op-version.
>>
>>> Thanks,
>>> Gianluca
>
> It seems op-version is not updated automatically by default, so that it
> can manage mixed versions while you update nodes one by one...
> I followed what is described here:
> https://gluster.readthedocs.io/en/latest/Upgrade-Guide/op_version/
>
> - Get the current version:
>
> [root at ovirt01 ~]# gluster volume get all cluster.op-version
> Option                                  Value
> ------                                  -----
> cluster.op-version                      30712
> [root at ovirt01 ~]#
>
> - Get the maximum version I can set for the current setup:
>
> [root at ovirt01 ~]# gluster volume get all cluster.max-op-version
> Option                                  Value
> ------                                  -----
> cluster.max-op-version                  31000
> [root at ovirt01 ~]#
>
> - Get op-version information for all the connected clients:
>
> [root at ovirt01 ~]# gluster volume status all clients | grep ":49" | awk '{print $4}' | sort | uniq -c
>      72 31000
> [root at ovirt01 ~]#
>
> --> ok
>
> - Update the op-version:
>
> [root at ovirt01 ~]# gluster volume set all cluster.op-version 31000
> volume set: success
> [root at ovirt01 ~]#
>
> - Verify:
>
> [root at ovirt01 ~]# gluster volume get all cluster.op-version
> Option                                  Value
> ------                                  -----
> cluster.op-version                      31000
> [root at ovirt01 ~]#
>
> --> ok
>
> [root at ovirt01 ~]# gluster volume reset-brick export ovirt01.localdomain.local:/gluster/brick3/export start
> volume reset-brick: success: reset-brick start operation successful
>
> [root at ovirt01 ~]# gluster volume reset-brick export ovirt01.localdomain.local:/gluster/brick3/export gl01.localdomain.local:/gluster/brick3/export commit force
> volume reset-brick: failed: Commit failed on ovirt02.localdomain.local. Please check log file for details.
> Commit failed on ovirt03.localdomain.local. Please check log file for details.
> [root at ovirt01 ~]#
>
> [root at ovirt01 bricks]# gluster volume info export
>
> Volume Name: export
> Type: Replicate
> Volume ID: b00e5839-becb-47e7-844f-6ce6ce1b7153
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: gl01.localdomain.local:/gluster/brick3/export
> Brick2: ovirt02.localdomain.local:/gluster/brick3/export
> Brick3: ovirt03.localdomain.local:/gluster/brick3/export (arbiter)
> Options Reconfigured:
> transport.address-family: inet
> performance.readdir-ahead: on
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> network.remote-dio: off
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> storage.owner-uid: 36
> storage.owner-gid: 36
> features.shard: on
> features.shard-block-size: 512MB
> performance.low-prio-threads: 32
> cluster.data-self-heal-algorithm: full
> cluster.locking-scheme: granular
> cluster.shd-wait-qlength: 10000
> cluster.shd-max-threads: 6
> network.ping-timeout: 30
> user.cifs: off
> nfs.disable: on
> performance.strict-o-direct: on
>
> [root at ovirt01 bricks]# gluster volume reset-brick export ovirt02.localdomain.local:/gluster/brick3/export start
> volume reset-brick: success: reset-brick start operation successful
> [root at ovirt01 bricks]# gluster volume reset-brick export ovirt02.localdomain.local:/gluster/brick3/export gl02.localdomain.local:/gluster/brick3/export commit force
> volume reset-brick: failed: Commit failed on localhost. Please check log file for details.
> [root at ovirt01 bricks]#
>
> I proceed anyway (I actually have nothing on the export volume...)
> [root at ovirt01 bricks]# gluster volume reset-brick export ovirt02.localdomain.local:/gluster/brick3/export start
> volume reset-brick: success: reset-brick start operation successful
> [root at ovirt01 bricks]# gluster volume reset-brick export ovirt02.localdomain.local:/gluster/brick3/export gl02.localdomain.local:/gluster/brick3/export commit force
> volume reset-brick: failed: Commit failed on localhost. Please check log file for details.
> [root at ovirt01 bricks]#
>
> Again an error.
>
> [root at ovirt01 bricks]# gluster volume info export
>
> Volume Name: export
> Type: Replicate
> Volume ID: b00e5839-becb-47e7-844f-6ce6ce1b7153
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 0 x (2 + 1) = 2
> Transport-type: tcp
> Bricks:
> Brick1: gl01.localdomain.local:/gluster/brick3/export
> Brick2: ovirt03.localdomain.local:/gluster/brick3/export
> Options Reconfigured:
> transport.address-family: inet
> performance.readdir-ahead: on
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> network.remote-dio: off
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> storage.owner-uid: 36
> storage.owner-gid: 36
> features.shard: on
> features.shard-block-size: 512MB
> performance.low-prio-threads: 32
> cluster.data-self-heal-algorithm: full
> cluster.locking-scheme: granular
> cluster.shd-wait-qlength: 10000
> cluster.shd-max-threads: 6
> network.ping-timeout: 30
> user.cifs: off
> nfs.disable: on
> performance.strict-o-direct: on
> [root at ovirt01 bricks]#
>
> The last one:
>
> [root at ovirt01 bricks]# gluster volume reset-brick export ovirt03.localdomain.local:/gluster/brick3/export start
> volume reset-brick: success: reset-brick start operation successful
> [root at ovirt01 bricks]# gluster volume reset-brick export ovirt03.localdomain.local:/gluster/brick3/export gl03.localdomain.local:/gluster/brick3/export commit force
> volume reset-brick: failed: Commit failed on localhost. Please check log file for details.
> [root at ovirt01 bricks]#
>
> Again an error.
>
> [root at ovirt01 bricks]# gluster volume info export
>
> Volume Name: export
> Type: Replicate
> Volume ID: b00e5839-becb-47e7-844f-6ce6ce1b7153
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 0 x (2 + 1) = 1
> Transport-type: tcp
> Bricks:
> Brick1: gl01.localdomain.local:/gluster/brick3/export
> Options Reconfigured:
> transport.address-family: inet
> performance.readdir-ahead: on
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> network.remote-dio: off
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> storage.owner-uid: 36
> storage.owner-gid: 36
> features.shard: on
> features.shard-block-size: 512MB
> performance.low-prio-threads: 32
> cluster.data-self-heal-algorithm: full
> cluster.locking-scheme: granular
> cluster.shd-wait-qlength: 10000
> cluster.shd-max-threads: 6
> network.ping-timeout: 30
> user.cifs: off
> nfs.disable: on
> performance.strict-o-direct: on
> [root at ovirt01 bricks]#
>
> See here for the gluster log in gzip format:
> https://drive.google.com/file/d/0BwoPbcrMv8mvQmlYZjAySTZKTzQ/view?usp=sharing
>
> The first command was executed at 14:57 and the other two at 15:04.
>
> This is what oVirt shows right now for the volume:
> https://drive.google.com/file/d/0BwoPbcrMv8mvNFAyd043TnNwSEU/view?usp=sharing
>
> (After the first command I saw 2 of 3 bricks up)
>
> Gianluca
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
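For anyone trying to answer Atin's question on their own setup, a minimal sketch of how the relevant glusterd log lines could be pulled from all three nodes follows. It assumes the default log path /var/log/glusterfs/glusterd.log, the 14:57-15:04 window Gianluca mentions above, and this thread's hostnames; it is not part of the original exchange, so adjust it to your environment.

# Sketch: collect warning/error lines from glusterd.log on each node around
# the time of the failed reset-brick commits (assumes the default log location
# and that the log timestamps match the times quoted in the mail).
for h in ovirt01 ovirt02 ovirt03; do
    echo "==== ${h} ===="
    ssh root@"${h}".localdomain.local \
        "awk '/2017-07-05 1(4:5[7-9]|5:0[0-9])/' /var/log/glusterfs/glusterd.log | grep -E '\] (E|W) \['"
done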
Gianluca Cecchi
2017-Jul-05 15:42 UTC
[Gluster-users] op-version for reset-brick (Was: Re: [ovirt-users] Upgrading HC from 4.0 to 4.1)
On Wed, Jul 5, 2017 at 5:22 PM, Atin Mukherjee <amukherj at redhat.com> wrote:

> And what does glusterd log indicate for these failures?

See here, in gzip format:
https://drive.google.com/file/d/0BwoPbcrMv8mvYmlRLUgyV0pFN0k/view?usp=sharing

It seems that on each host the peer files have been updated with a new "hostname2" entry:

[root at ovirt01 ~]# cat /var/lib/glusterd/peers/*
uuid=b89311fe-257f-4e44-8e15-9bff6245d689
state=3
hostname1=ovirt02.localdomain.local
hostname2=10.10.2.103
uuid=ec81a04c-a19c-4d31-9d82-7543cefe79f3
state=3
hostname1=ovirt03.localdomain.local
hostname2=10.10.2.104
[root at ovirt01 ~]#

[root at ovirt02 ~]# cat /var/lib/glusterd/peers/*
uuid=e9717281-a356-42aa-a579-a4647a29a0bc
state=3
hostname1=ovirt01.localdomain.local
hostname2=10.10.2.102
uuid=ec81a04c-a19c-4d31-9d82-7543cefe79f3
state=3
hostname1=ovirt03.localdomain.local
hostname2=10.10.2.104
[root at ovirt02 ~]#

[root at ovirt03 ~]# cat /var/lib/glusterd/peers/*
uuid=b89311fe-257f-4e44-8e15-9bff6245d689
state=3
hostname1=ovirt02.localdomain.local
hostname2=10.10.2.103
uuid=e9717281-a356-42aa-a579-a4647a29a0bc
state=3
hostname1=ovirt01.localdomain.local
hostname2=10.10.2.102
[root at ovirt03 ~]#

But the volume information was not updated the same way: the second and third nodes have lost the ovirt01/gl01 host brick information...

E.g. on ovirt02:

[root at ovirt02 peers]# gluster volume info export

Volume Name: export
Type: Replicate
Volume ID: b00e5839-becb-47e7-844f-6ce6ce1b7153
Status: Started
Snapshot Count: 0
Number of Bricks: 0 x (2 + 1) = 2
Transport-type: tcp
Bricks:
Brick1: ovirt02.localdomain.local:/gluster/brick3/export
Brick2: ovirt03.localdomain.local:/gluster/brick3/export
Options Reconfigured:
transport.address-family: inet
performance.readdir-ahead: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: off
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
storage.owner-gid: 36
features.shard: on
features.shard-block-size: 512MB
performance.low-prio-threads: 32
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 6
network.ping-timeout: 30
user.cifs: off
nfs.disable: on
performance.strict-o-direct: on
[root at ovirt02 peers]#

And on ovirt03:

[root at ovirt03 ~]# gluster volume info export

Volume Name: export
Type: Replicate
Volume ID: b00e5839-becb-47e7-844f-6ce6ce1b7153
Status: Started
Snapshot Count: 0
Number of Bricks: 0 x (2 + 1) = 2
Transport-type: tcp
Bricks:
Brick1: ovirt02.localdomain.local:/gluster/brick3/export
Brick2: ovirt03.localdomain.local:/gluster/brick3/export
Options Reconfigured:
transport.address-family: inet
performance.readdir-ahead: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: off
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
storage.owner-gid: 36
features.shard: on
features.shard-block-size: 512MB
performance.low-prio-threads: 32
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 6
network.ping-timeout: 30
user.cifs: off
nfs.disable: on
performance.strict-o-direct: on
[root at ovirt03 ~]#

While on ovirt01 the volume seems isolated, with only its own brick left...
[root at ovirt01 ~]# gluster volume info export

Volume Name: export
Type: Replicate
Volume ID: b00e5839-becb-47e7-844f-6ce6ce1b7153
Status: Started
Snapshot Count: 0
Number of Bricks: 0 x (2 + 1) = 1
Transport-type: tcp
Bricks:
Brick1: gl01.localdomain.local:/gluster/brick3/export
Options Reconfigured:
transport.address-family: inet
performance.readdir-ahead: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: off
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
storage.owner-gid: 36
features.shard: on
features.shard-block-size: 512MB
performance.low-prio-threads: 32
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 6
network.ping-timeout: 30
user.cifs: off
nfs.disable: on
performance.strict-o-direct: on
[root at ovirt01 ~]#
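Since `gluster volume info` now disagrees between the peers, one quick way to confirm how far the stored volume definitions have diverged is to compare what glusterd keeps on disk for the volume on each node. The sketch below is a diagnostic aid, not a documented recovery procedure; it assumes glusterd's default working directory /var/lib/glusterd (so the "export" definition lives under /var/lib/glusterd/vols/export/) and this thread's hostnames.

# Compare the stored volume checksum and the per-brick definition files that
# glusterd keeps for "export" on each peer (paths assume the default
# /var/lib/glusterd working directory).
for h in ovirt01 ovirt02 ovirt03; do
    echo "==== ${h} ===="
    ssh root@"${h}".localdomain.local \
        "cat /var/lib/glusterd/vols/export/cksum; ls /var/lib/glusterd/vols/export/bricks/"
done

If the checksums or brick lists differ between nodes, that would be consistent with the "Commit failed" messages above: the new brick path was committed on only some of the peers.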