Steve Dainard
2016-Aug-22  22:47 UTC
[Gluster-users] Upgrade 3.7.6 -> 3.7.13 one gluster server disconnected 1 of 3 volumes
About 5 hours after upgrading gluster 3.7.6 -> 3.7.13 on CentOS 7, one of
my gluster servers disconnected one of its volumes (the brick process went
offline). The other two volumes this host serves were not affected.
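The upgrade itself was just a package update plus a glusterd restart on each
node, roughly:

# systemctl stop glusterd
# yum update 'glusterfs*'
# systemctl start glusterd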
# gluster volume status storage
Status of volume: storage
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.0.231.50:/mnt/raid6-storage/storage
                                            49159     0          Y       30743
Brick 10.0.231.51:/mnt/raid6-storage/storage
                                            49159     0          Y       676
Brick 10.0.231.52:/mnt/raid6-storage/storage
                                            N/A       N/A        N       N/A
Brick 10.0.231.53:/mnt/raid6-storage/storage
                                            49154     0          Y       10253
Brick 10.0.231.54:/mnt/raid6-storage/storage
                                            49153     0          Y       2792
Brick 10.0.231.55:/mnt/raid6-storage/storage
                                            49153     0          Y       13590
Brick 10.0.231.56:/mnt/raid6-storage/storage
                                            49152     0          Y       9281
NFS Server on localhost                     2049      0          Y       30775
Quota Daemon on localhost                   N/A       N/A        Y       30781
NFS Server on 10.0.231.54                   2049      0          Y       2817
Quota Daemon on 10.0.231.54                 N/A       N/A        Y       2824
NFS Server on 10.0.231.51                   2049      0          Y       710
Quota Daemon on 10.0.231.51                 N/A       N/A        Y       719
NFS Server on 10.0.231.52                   2049      0          Y       9090
Quota Daemon on 10.0.231.52                 N/A       N/A        Y       9098
NFS Server on 10.0.231.55                   2049      0          Y       13611
Quota Daemon on 10.0.231.55                 N/A       N/A        Y       13619
NFS Server on 10.0.231.56                   2049      0          Y       9303
Quota Daemon on 10.0.231.56                 N/A       N/A        Y       9310
NFS Server on 10.0.231.53                   2049      0          Y       26304
Quota Daemon on 10.0.231.53                 N/A       N/A        Y       26320
Task Status of Volume storage
------------------------------------------------------------------------------
There are no active volume tasks
I see lots of entries in the brick logs related to the trashcan (failed
[file exists]), setting xattrs (failed [no such file or directory]), and
quota (invalid arguments), which I enabled as a feature after the upgrade
this morning.
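In case it's useful, those messages can be pulled out of the brick log with
something like this (assuming the default log location; this is the same
file attached below as mnt-raid6-storage-storage.log):

# grep -E 'File exists|No such file or directory|Invalid argument' \
    /var/log/glusterfs/bricks/mnt-raid6-storage-storage.log | tail -n 50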
After restarting glusterd on that host, the volume came back online.
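That was just the stock restart plus a status check to confirm the brick pid
came back:

# systemctl restart glusterd
# gluster volume status storage

(I gather 'gluster volume start storage force' would also respawn just the
offline brick without restarting glusterd.)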
I've attached logs from that host in case someone can take a look.
# gluster volume info storage
Volume Name: storage
Type: Distribute
Volume ID: 6f95525a-94d7-4174-bac4-e1a18fe010a2
Status: Started
Number of Bricks: 7
Transport-type: tcp
Bricks:
Brick1: 10.0.231.50:/mnt/raid6-storage/storage
Brick2: 10.0.231.51:/mnt/raid6-storage/storage
Brick3: 10.0.231.52:/mnt/raid6-storage/storage
Brick4: 10.0.231.53:/mnt/raid6-storage/storage
Brick5: 10.0.231.54:/mnt/raid6-storage/storage
Brick6: 10.0.231.55:/mnt/raid6-storage/storage
Brick7: 10.0.231.56:/mnt/raid6-storage/storage
Options Reconfigured:
nfs.disable: no
features.trash-max-filesize: 1GB
features.trash: on
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
performance.readdir-ahead: on
# rpm -qa | grep glusterfs
glusterfs-fuse-3.7.13-1.el7.x86_64
glusterfs-cli-3.7.13-1.el7.x86_64
glusterfs-3.7.13-1.el7.x86_64
glusterfs-server-3.7.13-1.el7.x86_64
glusterfs-api-3.7.13-1.el7.x86_64
glusterfs-libs-3.7.13-1.el7.x86_64
glusterfs-client-xlators-3.7.13-1.el7.x86_64
Thanks,
Steve
-------------- next part --------------
A non-text attachment was scrubbed...
Name: etc-glusterfs-glusterd.vol.log
Type: text/x-log
Size: 161664 bytes
Desc: not available
URL:
<http://www.gluster.org/pipermail/gluster-users/attachments/20160822/bfb24427/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mnt-raid6-storage-storage.log
Type: text/x-log
Size: 644099 bytes
Desc: not available
URL:
<http://www.gluster.org/pipermail/gluster-users/attachments/20160822/bfb24427/attachment-0003.bin>
Atin Mukherjee
2016-Aug-23  04:15 UTC
[Gluster-users] Upgrade 3.7.6 -> 3.7.13 one gluster server disconnected 1 of 3 volumes
On Tue, Aug 23, 2016 at 4:17 AM, Steve Dainard <sdainard at spd1.com> wrote:

> About 5 hours after upgrading gluster 3.7.6 -> 3.7.13 on CentOS 7, one of
> my gluster servers disconnected one of its volumes (the brick process went
> offline). The other two volumes this host serves were not affected.
> [...]
> I see lots of entries in the brick logs related to the trashcan (failed
> [file exists]), setting xattrs (failed [no such file or directory]), and
> quota (invalid arguments), which I enabled as a feature after the upgrade
> this morning.

Could you let us know the time (in UTC) around which this issue was seen, so
that we can look at the logs around that time and see if something went
wrong?

--Atin