2018 Feb 04
1
Fwd: Troubleshooting glusterfs
Please help troubleshoot glusterfs with the following setup:
Distributed volume without replication. Sharding enabled.
# cat /etc/centos-release
CentOS release 6.9 (Final)
# glusterfs --version
glusterfs 3.12.3
[root@master-5f81bad0054a11e8bf7d0671029ed6b8 uploads]# gluster volume info
Volume Name: gv0
Type: Distribute
Volume ID: 1a7e05f6-4aa8-48d3-b8e3-300637031925
Status:
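The excerpt describes a plain distribute volume with sharding enabled. A minimal sketch of that configuration (gv0 is from the excerpt; the block size shown is an assumption, not from the thread - sharding is normally enabled before any data is written):
# gluster volume set gv0 features.shard on
# gluster volume set gv0 features.shard-block-size 64MB
# gluster volume info gv0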
2018 Feb 05
2
Fwd: Troubleshooting glusterfs
Hi,
I see a lot of the following messages in the logs:
[2018-02-04 03:22:01.544446] I [glusterfsd-mgmt.c:1821:mgmt_getspec_cbk]
0-glusterfs: No change in volfile,continuing
[2018-02-04 07:41:16.189349] W [MSGID: 109011]
[dht-layout.c:186:dht_layout_search]
48-gv0-dht: no subvolume for hash (value) = 122440868
[2018-02-04 07:41:16.244261] W [fuse-bridge.c:2398:fuse_writev_cbk]
0-glusterfs-fuse:
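The "no subvolume for hash" warning from DHT means no brick's layout range covers that hash value, typically because a directory layout has a gap after bricks were added or removed. A hedged sketch of the usual remedy, a fix-layout rebalance (volume name taken from the log prefix):
# gluster volume rebalance gv0 fix-layout start
# gluster volume rebalance gv0 status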
2018 Feb 05
0
Fwd: Troubleshooting glusterfs
On 5 February 2018 at 15:40, Nithya Balachandran <nbalacha@redhat.com>
wrote:
> Hi,
>
>
> I see a lot of the following messages in the logs:
> [2018-02-04 03:22:01.544446] I [glusterfsd-mgmt.c:1821:mgmt_getspec_cbk]
> 0-glusterfs: No change in volfile,continuing
> [2018-02-04 07:41:16.189349] W [MSGID: 109011]
> [dht-layout.c:186:dht_layout_search] 48-gv0-dht: no
2018 Feb 05
2
Fwd: Troubleshooting glusterfs
Hello Nithya!
Thank you so much, I think we are close to building a stable storage solution
according to your recommendations. Here's our rebalance log - please don't
pay attention to the error messages after 9AM - that is when we manually
destroyed the volume to recreate it for further testing. Also, all remove-brick
operations you see in the log were executed manually while recreating the
volume.
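For reference, a sketch of how such a rebalance is typically run and where its log normally lands (the log path follows common packaging defaults and is an assumption):
# gluster volume rebalance gv0 start
# gluster volume rebalance gv0 status
# less /var/log/glusterfs/gv0-rebalance.log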
2018 Feb 07
0
Fwd: Troubleshooting glusterfs
Hello Nithya! Thank you for your help in figuring this out!
We changed our configuration, and after a successful test yesterday
we have run into a new issue today.
The test, which included moderate read/write load (~20-30 Mb/s) and scaling
the storage, ran for about 3 hours, and at some point the system got stuck.
At the user level, errors like the following appear when trying to work with the filesystem:
OSError:
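When applications start seeing OSError on a FUSE mount, the client-side log is usually the first thing to check. A sketch, assuming a hypothetical mount point /mnt/uploads (the FUSE log file name mirrors the mount path with slashes turned into hyphens):
# tail -f /var/log/glusterfs/mnt-uploads.log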
2017 Aug 16
0
Is transport=rdma tested with "stripe"?
> Note that "stripe" is not tested much and practically unmaintained.
Ah, this was what I suspected. Understood. I'll be happy with "shard".
Having said that, "stripe" works fine with transport=tcp. The failure reproduces with just 2 RDMA servers (with InfiniBand), one of which also acts as a client.
I looked into the logs. I paste lengthy logs below with
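For context, a hedged sketch of creating a volume on the RDMA transport being discussed (host names and brick paths are hypothetical):
# gluster volume create gv0 transport rdma \
    server1:/bricks/brick1/gv0 server2:/bricks/brick1/gv0
# gluster volume start gv0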
2017 Sep 29
1
Gluster geo replication volume is faulty
I am trying to set up geo-replication between two gluster volumes.
I have set up two replica 2 arbiter 1 volumes with 9 bricks
[root@gfs1 ~]# gluster volume info
Volume Name: gfsvol
Type: Distributed-Replicate
Volume ID: c2fb4365-480b-4d37-8c7d-c3046bca7306
Status: Started
Snapshot Count: 0
Number of Bricks: 3 x (2 + 1) = 9
Transport-type: tcp
Bricks:
Brick1: gfs2:/gfs/brick1/gv0
Brick2:
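A sketch of how the health of such a geo-replication session is typically checked (gfsvol is from the excerpt; the slave host and slave volume names are hypothetical):
# gluster volume geo-replication gfsvol slavehost::gfsvol-slave status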
2018 May 10
0
broken gluster config
Trying to read up on this, I can't understand what is wrong:
[root@glusterp1 gv0]# gluster volume heal gv0 info
Brick glusterp1:/bricks/brick1/gv0
<gfid:eafb8799-4e7a-4264-9213-26997c5a4693> - Is in split-brain
Status: Connected
Number of entries: 1
Brick glusterp2:/bricks/brick1/gv0
<gfid:eafb8799-4e7a-4264-9213-26997c5a4693> - Is in split-brain
Status: Connected
Number of entries: 1
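Both bricks report the same gfid in split-brain. One common way to resolve this from the CLI, assuming the copy with the newest modification time should win, is a policy-based heal (a sketch; latest-mtime is only one of several policies, and the right choice depends on the data):
# gluster volume heal gv0 split-brain latest-mtime \
    gfid:eafb8799-4e7a-4264-9213-26997c5a4693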
2018 Mar 12
2
Can't heal a volume: "Please check if all brick processes are running."
Hello,
We have a very fresh gluster 3.10.10 installation.
Our volume is created as a distributed volume, 9 bricks, 96TB in total
(87TB after the 10% gluster disk-space reservation).
For some reason I can't "heal" the volume:
# gluster volume heal gv0
Launching heal operation to perform index self heal on volume gv0 has
been unsuccessful on bricks that are down. Please check if all brick
processes
2017 Sep 20
1
"Input/output error" on mkdir for PPC64 based client
I put the share into debug mode and then repeated the process from a ppc64
client and an x86 client. Weirdly the client logs were almost identical.
Here's the ppc64 gluster client log of attempting to create a folder...
-------------
[2017-09-20 13:34:23.344321] D
[rpc-clnt-ping.c:93:rpc_clnt_remove_ping_timer_locked] (-->
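"Debug mode" here presumably means raising the client log level. A sketch of how that is usually toggled (the volume name gv0 is an assumption; the thread never names it):
# gluster volume set gv0 diagnostics.client-log-level DEBUG
  (reproduce the failing mkdir, then revert)
# gluster volume set gv0 diagnostics.client-log-level INFO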
2018 Mar 13
0
Can't heal a volume: "Please check if all brick processes are running."
Hi,
Maybe someone can point me to documentation or explain this? I can't
find it myself.
Do we have any other useful resources besides doc.gluster.org? As far as
I can see, many gluster options are not described there, or there is no
explanation of what they do...
On 2018-03-12 15:58, Anatoliy Dmytriyev wrote:
> Hello,
>
> We have a very fresh gluster 3.10.10 installation.
> Our volume
2018 May 22
1
split brain? but where?
I tried looking for a file of the same size, and the gfid doesn't show up,
8><---
[root@glusterp2 fb]# pwd
/bricks/brick1/gv0/.glusterfs/ea/fb
[root@glusterp2 fb]# ls -al
total 3130892
drwx------. 2 root root 64 May 22 13:01 .
drwx------. 4 root root 24 May 8 14:27 ..
-rw-------. 1 root root 3294887936 May 4 11:07
eafb8799-4e7a-4264-9213-26997c5a4693
-rw-r--r--. 1 root
2017 Oct 18
1
gfid entries in volume heal info that do not heal
Hey Matt,
From the xattr output, it looks like the files are not present on the
arbiter brick and need healing. But the parent does not have the
pending markers set for those entries.
The workaround for this is to do a lookup, from the mount, on the file which
needs healing; this will create the entry on the arbiter brick, and then you
can run the volume heal to do the healing.
Follow
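When only the gfid is known, one way to perform that lookup is an aux-gfid mount, which exposes files by gfid under a virtual .gfid directory (a sketch; the server name and mount point are hypothetical):
# mount -t glusterfs -o aux-gfid-mount server:/gv0 /mnt/gfid
# stat /mnt/gfid/.gfid/<gfid>
# gluster volume heal gv0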
2018 May 10
2
broken gluster config
Whatever repair happened has now finished, but I still have this, and
I can't find anything so far telling me how to fix it. Looking at
http://staged-gluster-docs.readthedocs.io/en/release3.7.0beta1/Features/heal-info-and-split-brain-resolution/
I can't determine which file or dir in gv0 is actually the issue.
[root@glusterp1 gv0]# gluster volume heal gv0 info split-brain
Brick
2017 Oct 17
0
gfid entries in volume heal info that do not heal
Attached is the heal log for the volume as well as the shd log.
>> Run these commands on all the bricks of the replica pair to get the attrs set on the backend.
[root@tpc-cent-glus1-081017 ~]# getfattr -d -e hex -m . /exp/b1/gv0/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2
getfattr: Removing leading '/' from absolute path names
# file:
2018 Mar 13
0
Can't heal a volume: "Please check if all brick processes are running."
Can we add a smarter error message for this situation by checking volume
type first?
Cheers,
Laura B
On Wednesday, March 14, 2018, Karthik Subrahmanya <ksubrahm@redhat.com>
wrote:
> Hi Anatoliy,
>
> The heal command is basically used to heal any mismatching contents
> between replica copies of the files.
> For the command "gluster volume heal <volname>"
2013 Nov 29
1
Self heal problem
Hi,
I have a glusterfs volume replicated on three nodes. I am planning to use
the volume as storage for VMware ESXi machines using NFS. The reason for
using three nodes is to be able to configure quorum and avoid
split-brain. However, during my initial testing, when I intentionally and
gracefully restarted the node "ned", a split-brain/self-heal error
occurred.
The log on "todd"
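For reference, a sketch of the quorum options commonly set on a three-node replica to reduce the chance of split-brain (the volume name vmvol is hypothetical; these are standard client and server quorum settings):
# gluster volume set vmvol cluster.quorum-type auto
# gluster volume set vmvol cluster.server-quorum-type server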
2018 Mar 13
4
Can't heal a volume: "Please check if all brick processes are running."
Hi Anatoliy,
The heal command is basically used to heal any mismatching contents between
replica copies of the files.
For the command "gluster volume heal <volname>" to succeed, you should have
the self-heal-daemon running,
which is true only if your volume is of type replicate/disperse.
In your case you have a plain distribute volume where you do not store the
replica of any
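In other words, on a plain distribute volume there is no self-heal daemon to run the heal, so the command fails. A sketch of how to see this (gv0 from the thread): "gluster volume status" lists Self-heal Daemon processes for replicate/disperse volumes but only brick processes for a pure distribute volume:
# gluster volume status gv0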
2018 Mar 14
0
Can't heal a volume: "Please check if all brick processes are running."
Hi Karthik,
Thanks a lot for the explanation.
Does this mean that a distributed volume's health can be checked only with
the "gluster volume status" command?
And one more question: cluster.min-free-disk is 10% by default. What
kind of "side effects" can we face if this option is reduced to,
for example, 5%? Could you point us to any best-practice document(s)?
Regards,
Anatoliy
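On the min-free-disk question: once a brick crosses the threshold, DHT steers new file creation to other bricks, so lowering it to 5% lets bricks fill closer to full and risks ENOSPC when existing files grow in place (a hedged summary, not from the thread). Changing it is a single set:
# gluster volume set gv0 cluster.min-free-disk 5%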
2018 May 22
0
split brain? but where?
I tried this already.
8><---
[root@glusterp2 fb]# find /bricks/brick1/gv0 -samefile
/bricks/brick1/gv0/.glusterfs/ea/fb/eafb8799-4e7a-4264-9213-26997c5a4693
/bricks/brick1/gv0/.glusterfs/ea/fb/eafb8799-4e7a-4264-9213-26997c5a4693
[root@glusterp2 fb]#
8><---
gluster 4
CentOS 7.4
8><---
df -h
[root@glusterp2 fb]# df -h
Filesystem