2018 Feb 04
1
Fwd: Troubleshooting glusterfs
Please help troubleshoot glusterfs with the following setup:
Distributed volume without replication. Sharding enabled.
# cat /etc/centos-release
CentOS release 6.9 (Final)
# glusterfs --version
glusterfs 3.12.3
[root@master-5f81bad0054a11e8bf7d0671029ed6b8 uploads]# gluster volume info
Volume Name: gv0
Type: Distribute
Volume ID: 1a7e05f6-4aa8-48d3-b8e3-300637031925
Status:
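The excerpt describes a plain distribute volume with sharding enabled. A minimal sketch of that configuration (gv0 is from the excerpt; the block size shown is an assumption, not from the thread - sharding is normally enabled before any data is written):
# gluster volume set gv0 features.shard on
# gluster volume set gv0 features.shard-block-size 64MB
# gluster volume info gv0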
2018 Feb 05
2
Fwd: Troubleshooting glusterfs
Hi,
I see a lot of the following messages in the logs:
[2018-02-04 03:22:01.544446] I [glusterfsd-mgmt.c:1821:mgmt_getspec_cbk]
0-glusterfs: No change in volfile,continuing
[2018-02-04 07:41:16.189349] W [MSGID: 109011]
[dht-layout.c:186:dht_layout_search]
48-gv0-dht: no subvolume for hash (value) = 122440868
[2018-02-04 07:41:16.244261] W [fuse-bridge.c:2398:fuse_writev_cbk]
0-glusterfs-fuse:
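The "no subvolume for hash" warning from DHT means no brick's layout range covers that hash value, typically because a directory layout has a gap after bricks were added or removed. A hedged sketch of the usual remedy, a fix-layout rebalance (volume name taken from the log prefix):
# gluster volume rebalance gv0 fix-layout start
# gluster volume rebalance gv0 status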
2018 Feb 05
0
Fwd: Troubleshooting glusterfs
On 5 February 2018 at 15:40, Nithya Balachandran <nbalacha@redhat.com>
wrote:
> Hi,
>
>
> I see a lot of the following messages in the logs:
> [2018-02-04 03:22:01.544446] I [glusterfsd-mgmt.c:1821:mgmt_getspec_cbk]
> 0-glusterfs: No change in volfile,continuing
> [2018-02-04 07:41:16.189349] W [MSGID: 109011]
> [dht-layout.c:186:dht_layout_search] 48-gv0-dht: no
2018 Feb 05
2
Fwd: Troubleshooting glusterfs
Hello Nithya!
Thank you so much, I think we are close to building a stable storage solution
according to your recommendations. Here's our rebalance log - please don't
pay attention to the error messages after 9AM - that is when we manually
destroyed the volume to recreate it for further testing. Also, all remove-brick
operations you see in the log were executed manually while recreating the
volume.
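For reference, a sketch of how such a rebalance is typically run and where its log normally lands (the log path follows common packaging defaults and is an assumption):
# gluster volume rebalance gv0 start
# gluster volume rebalance gv0 status
# less /var/log/glusterfs/gv0-rebalance.log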
2018 Feb 07
0
Fwd: Troubleshooting glusterfs
Hello Nithya! Thank you for your help in figuring this out!
We changed our configuration, and after a successful test yesterday
we have run into a new issue today.
The test, which included moderate read/write load (~20-30 Mb/s) and scaling
the storage, ran for about 3 hours, and at some point the system got stuck.
At the user level, errors like the following appear when trying to work with the filesystem:
OSError:
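When applications start seeing OSError on a FUSE mount, the client-side log is usually the first thing to check. A sketch, assuming a hypothetical mount point /mnt/uploads (the FUSE log file name mirrors the mount path with slashes turned into hyphens):
# tail -f /var/log/glusterfs/mnt-uploads.log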
2017 Aug 16
0
Is transport=rdma tested with "stripe"?
> Note that "stripe" is not tested much and practically unmaintained.
Ah, this was what I suspected. Understood. I'll be happy with "shard".
Having said that, "stripe" works fine with transport=tcp. The failure reproduces with just 2 RDMA servers (with InfiniBand), one of which also acts as a client.
I looked into the logs. I paste lengthy logs below with
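For context, a hedged sketch of creating a volume on the RDMA transport being discussed (host names and brick paths are hypothetical):
# gluster volume create gv0 transport rdma \
    server1:/bricks/brick1/gv0 server2:/bricks/brick1/gv0
# gluster volume start gv0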
2017 Sep 29
1
Gluster geo replication volume is faulty
I am trying to set up geo-replication between two gluster volumes.
I have set up two replica 2 arbiter 1 volumes with 9 bricks
[root@gfs1 ~]# gluster volume info
Volume Name: gfsvol
Type: Distributed-Replicate
Volume ID: c2fb4365-480b-4d37-8c7d-c3046bca7306
Status: Started
Snapshot Count: 0
Number of Bricks: 3 x (2 + 1) = 9
Transport-type: tcp
Bricks:
Brick1: gfs2:/gfs/brick1/gv0
Brick2:
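A sketch of how the health of such a geo-replication session is typically checked (gfsvol is from the excerpt; the slave host and slave volume names are hypothetical):
# gluster volume geo-replication gfsvol slavehost::gfsvol-slave status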
2018 May 10
0
broken gluster config
Trying to read up on this, I can't understand what is wrong:
[root@glusterp1 gv0]# gluster volume heal gv0 info
Brick glusterp1:/bricks/brick1/gv0
<gfid:eafb8799-4e7a-4264-9213-26997c5a4693> - Is in split-brain
Status: Connected
Number of entries: 1
Brick glusterp2:/bricks/brick1/gv0
<gfid:eafb8799-4e7a-4264-9213-26997c5a4693> - Is in split-brain
Status: Connected
Number of entries: 1
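Both bricks report the same gfid in split-brain. One common way to resolve this from the CLI, assuming the copy with the newest modification time should win, is a policy-based heal (a sketch; latest-mtime is only one of several policies, and the right choice depends on the data):
# gluster volume heal gv0 split-brain latest-mtime \
    gfid:eafb8799-4e7a-4264-9213-26997c5a4693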
2018 Mar 12
2
Can't heal a volume: "Please check if all brick processes are running."
Hello,
We have a very fresh gluster 3.10.10 installation.
Our volume is created as a distributed volume, 9 bricks, 96TB in total
(87TB after the 10% gluster disk-space reservation).
For some reason I can't "heal" the volume:
# gluster volume heal gv0
Launching heal operation to perform index self heal on volume gv0 has
been unsuccessful on bricks that are down. Please check if all brick
processes
2017 Sep 20
1
"Input/output error" on mkdir for PPC64 based client
I put the share into debug mode and then repeated the process from a ppc64
client and an x86 client. Weirdly the client logs were almost identical.
Here's the ppc64 gluster client log of attempting to create a folder...
-------------
[2017-09-20 13:34:23.344321] D
[rpc-clnt-ping.c:93:rpc_clnt_remove_ping_timer_locked] (-->
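"Debug mode" here presumably means raising the client log level. A sketch of how that is usually toggled (the volume name gv0 is an assumption; the thread never names it):
# gluster volume set gv0 diagnostics.client-log-level DEBUG
  (reproduce the failing mkdir, then revert)
# gluster volume set gv0 diagnostics.client-log-level INFO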
2018 Mar 13
0
Can't heal a volume: "Please check if all brick processes are running."
Hi,
Maybe someone can point me to documentation or explain this? I can't
find it myself.
Do we have any other useful resources besides doc.gluster.org? As far as
I can see, many gluster options are not described there, or there is no
explanation of what they do...
On 2018-03-12 15:58, Anatoliy Dmytriyev wrote:
> Hello,
>
> We have a very fresh gluster 3.10.10 installation.
> Our volume
2018 May 22
1
split brain? but where?
I tried looking for a file of the same size, and the gfid doesn't show up,
8><---
[root@glusterp2 fb]# pwd
/bricks/brick1/gv0/.glusterfs/ea/fb
[root@glusterp2 fb]# ls -al
total 3130892
drwx------. 2 root root 64 May 22 13:01 .
drwx------. 4 root root 24 May 8 14:27 ..
-rw-------. 1 root root 3294887936 May 4 11:07
eafb8799-4e7a-4264-9213-26997c5a4693
-rw-r--r--. 1 root
2017 Oct 18
1
gfid entries in volume heal info that do not heal
Hey Matt,
From the xattr output, it looks like the files are not present on the
arbiter brick and need healing. But the parent does not have the
pending markers set for those entries.
The workaround for this is to do a lookup, from the mount, on the file which
needs healing; this will create the entry on the arbiter brick, and then you
can run the volume heal to do the healing.
Follow
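When only the gfid is known, one way to perform that lookup is an aux-gfid mount, which exposes files by gfid under a virtual .gfid directory (a sketch; the server name and mount point are hypothetical):
# mount -t glusterfs -o aux-gfid-mount server:/gv0 /mnt/gfid
# stat /mnt/gfid/.gfid/<gfid>
# gluster volume heal gv0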
2018 May 10
2
broken gluster config
Whatever repair happened has now finished, but I still have this, and
I can't find anything so far telling me how to fix it. Looking at
http://staged-gluster-docs.readthedocs.io/en/release3.7.0beta1/Features/heal-info-and-split-brain-resolution/
I can't determine which file or dir in gv0 is actually the issue.
[root@glusterp1 gv0]# gluster volume heal gv0 info split-brain
Brick
2017 Oct 17
0
gfid entries in volume heal info that do not heal
Attached is the heal log for the volume as well as the shd log.
>> Run these commands on all the bricks of the replica pair to get the attrs set on the backend.
[root@tpc-cent-glus1-081017 ~]# getfattr -d -e hex -m . /exp/b1/gv0/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2
getfattr: Removing leading '/' from absolute path names
# file:
2018 Mar 13
0
Can't heal a volume: "Please check if all brick processes are running."
Can we add a smarter error message for this situation by checking volume
type first?
Cheers,
Laura B
On Wednesday, March 14, 2018, Karthik Subrahmanya <ksubrahm@redhat.com>
wrote:
> Hi Anatoliy,
>
> The heal command is basically used to heal any mismatching contents
> between replica copies of the files.
> For the command "gluster volume heal <volname>"
2013 Nov 29
1
Self heal problem
Hi,
I have a glusterfs volume replicated on three nodes. I am planning to use
the volume as storage for VMware ESXi machines using NFS. The reason for
using three nodes is to be able to configure quorum and avoid
split-brain. However, during my initial testing, when I intentionally and
gracefully restarted the node "ned", a split-brain/self-heal error
occurred.
The log on "todd"
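For reference, a sketch of the quorum options commonly set on a three-node replica to reduce the chance of split-brain (the volume name vmvol is hypothetical; these are standard client and server quorum settings):
# gluster volume set vmvol cluster.quorum-type auto
# gluster volume set vmvol cluster.server-quorum-type server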
2018 Mar 13
4
Can't heal a volume: "Please check if all brick processes are running."
Hi Anatoliy,
The heal command is basically used to heal any mismatching contents between
replica copies of the files.
For the command "gluster volume heal <volname>" to succeed, you should have
the self-heal-daemon running,
which is true only if your volume is of type replicate/disperse.
In your case you have a plain distribute volume where you do not store the
replica of any
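In other words, on a plain distribute volume there is no self-heal daemon to run the heal, so the command fails. A sketch of how to see this (gv0 from the thread): "gluster volume status" lists Self-heal Daemon processes for replicate/disperse volumes but only brick processes for a pure distribute volume:
# gluster volume status gv0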
2018 Mar 14
0
Can't heal a volume: "Please check if all brick processes are running."
Hi Karthik,
Thanks a lot for the explanation.
Does this mean that a distributed volume's health can be checked only with
the "gluster volume status" command?
And one more question: cluster.min-free-disk is 10% by default. What
kind of "side effects" can we face if this option is reduced to,
for example, 5%? Could you point us to any best-practice document(s)?
Regards,
Anatoliy
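On the min-free-disk question: once a brick crosses the threshold, DHT steers new file creation to other bricks, so lowering it to 5% lets bricks fill closer to full and risks ENOSPC when existing files grow in place (a hedged summary, not from the thread). Changing it is a single set:
# gluster volume set gv0 cluster.min-free-disk 5%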
2018 May 22
0
split brain? but where?
I tried this already.
8><---
[root@glusterp2 fb]# find /bricks/brick1/gv0 -samefile
/bricks/brick1/gv0/.glusterfs/ea/fb/eafb8799-4e7a-4264-9213-26997c5a4693
/bricks/brick1/gv0/.glusterfs/ea/fb/eafb8799-4e7a-4264-9213-26997c5a4693
[root@glusterp2 fb]#
8><---
gluster 4
CentOS 7.4
8><---
df -h
[root@glusterp2 fb]# df -h
Filesystem