thr3ads.net - similar to: "Clients can't connect after a server reboot (need to use volume force start)"

Clients can't connect after a server reboot (need to use volume force start)

2017 Oct 02

On Thu, Sep 28, 2017 at 8:49 AM, Frizz <frizzthecat at googlemail.com> wrote:

> After I rebooted my GlusterFS servers I can't connect from clients any
> more.
>
> The volume is running, but I have to do a volume start FORCE on all server
> hosts to make it work again.
>
> I am running glusterfs 3.12.1 on Ubuntu 16.04.
>
> Is this a bug?
>

Have you been able to

Access from multiple hosts where users have different uid/gid

2017 Oct 05

I have a setup with multiple hosts, each of which is administered separately, so there is no unified uid/gid for the users. When mounting a GlusterFS volume, a file owned by user1 on host1 might become owned by user2 on host2. I was looking into POSIX ACL or bindfs, but that won't help me much. What did other people do with this kind of problem?

Wrong volume size with df

2018 Jan 02

For what it's worth here, after I added a hot tier to the pool, the brick sizes are now reporting the correct size of all bricks combined instead of just one brick. Not sure if that gives you any clues for this... maybe adding another brick to the pool would have a similar effect?

On Thu, Dec 21, 2017 at 11:44 AM, Tom Fite <tomfite at gmail.com> wrote:

> Sure!
>
> 1 -

Wrong volume size with df

2017 Dec 21

Sure!

> 1 - output of gluster volume heal <volname> info

Brick pod-sjc1-gluster1:/data/brick1/gv0
Status: Connected
Number of entries: 0

Brick pod-sjc1-gluster2:/data/brick1/gv0
Status: Connected
Number of entries: 0

Brick pod-sjc1-gluster1:/data/brick2/gv0
Status: Connected
Number of entries: 0

Brick pod-sjc1-gluster2:/data/brick2/gv0
Status: Connected
Number of entries: 0

Brick
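The heal info output quoted above repeats one record per brick (a "Brick" line, a "Status:" line, and a "Number of entries:" line). As a rough sketch, assuming the output always follows that three-line shape, it could be summarized in Python like this. The function names parse_heal_info and bricks_needing_attention are hypothetical helpers, not part of the gluster tooling, and any other lines in the output are simply ignored.

```python
from typing import Dict, List, Optional


def parse_heal_info(text: str) -> List[Dict[str, Optional[object]]]:
    """Parse 'Brick / Status / Number of entries' records from heal info text.

    Assumption: the text looks like the output quoted in the thread.
    Lines that do not match one of the three prefixes are skipped.
    """
    bricks: List[Dict[str, Optional[object]]] = []
    current: Optional[Dict[str, Optional[object]]] = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Brick "):
            # Start a new record for each brick header.
            current = {"brick": line[len("Brick "):], "status": None, "entries": None}
            bricks.append(current)
        elif current is not None and line.startswith("Status:"):
            current["status"] = line.split(":", 1)[1].strip()
        elif current is not None and line.startswith("Number of entries:"):
            value = line.split(":", 1)[1].strip()
            # Keep None when the count is not a plain number.
            current["entries"] = int(value) if value.isdigit() else None
    return bricks


def bricks_needing_attention(bricks: List[Dict[str, Optional[object]]]) -> List[str]:
    """Return brick names that are not connected or report a non-zero count."""
    return [
        str(b["brick"])
        for b in bricks
        if b["status"] != "Connected" or b["entries"] != 0
    ]
```

For the four bricks shown in the thread, all connected with zero entries, bricks_needing_attention would return an empty list.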

How to fix an out-of-sync node?

2018 Feb 08

I have a setup with 3 nodes running GlusterFS.

gluster volume create myBrick replica 3 node01:/mnt/data/myBrick node02:/mnt/data/myBrick node03:/mnt/data/myBrick

Unfortunately node1 seemed to stop syncing with the other nodes, but this was undetected for weeks! When I noticed it, I did a "service glusterd restart" on node1, hoping the three nodes would sync again. But this did not

Can't heal a volume: "Please check if all brick processes are running."

2018 Mar 13

Can we add a smarter error message for this situation by checking volume type first?

Cheers,
Laura B

On Wednesday, March 14, 2018, Karthik Subrahmanya <ksubrahm at redhat.com> wrote:

> Hi Anatoliy,
>
> The heal command is basically used to heal any mismatching contents
> between replica copies of the files.
> For the command "gluster volume heal <volname>"

Can't heal a volume: "Please check if all brick processes are running."

2018 Mar 14

Hi Karthik,

Thanks a lot for the explanation.

Does it mean that the health of a distributed volume can be checked only with the "gluster volume status" command?

And one more question: cluster.min-free-disk is 10% by default. What kind of "side effects" can we face if this option is reduced to, for example, 5%? Could you point to any best practice document(s)?

Regards,
Anatoliy

gfid entries in volume heal info that do not heal

2017 Nov 06

That took a while! I have the following stats:

4085169 files in both bricks
3162940 files only have a single hard link.

All of the files exist on both servers. bmidata2 (below) WAS running when bmidata1 died.

gluster volume heal clifford statistics heal-count
Gathering count of entries to be healed on volume clifford has been successful

Brick bmidata1:/data/glusterfs/clifford/brick/brick
Number of

Can't heal a volume: "Please check if all brick processes are running."

2018 Mar 14

On Wed, Mar 14, 2018 at 5:42 PM, Karthik Subrahmanya <ksubrahm at redhat.com> wrote:

> On Wed, Mar 14, 2018 at 3:36 PM, Anatoliy Dmytriyev <tolid at tolid.eu.org>
> wrote:
>
>> Hi Karthik,
>>
>> Thanks a lot for the explanation.
>>
>> Does it mean a distributed volume health can be checked only by "gluster
>> volume

Can't heal a volume: "Please check if all brick processes are running."

2018 Mar 13

Hi,

Maybe someone can point me to documentation or explain this? I can't find it myself. Do we have any other useful resources besides doc.gluster.org? As far as I can see, many gluster options are not described there, or there is no explanation of what they do...

On 2018-03-12 15:58, Anatoliy Dmytriyev wrote:

> Hello,
>
> We have a very fresh gluster 3.10.10 installation.
> Our volume

Healing completely loss file on replica 3 volume

2019 Nov 29

I'm trying to manually garbage data on bricks (when the volume is stopped) and then check whether healing is possible. For example:

Start:
# glusterd --debug

Bricks (on EXT4 mounted with 'rw,realtime'):
# mkdir /root/data0
# mkdir /root/data1
# mkdir /root/data2

Volume:
# gluster volume create gv0 replica 3 [local-ip]:/root/data0 [local-ip]:/root/data1 [local-ip]:/root/data2

gfid entries in volume heal info that do not heal

2017 Oct 18

Hey Matt,

From the xattr output, it looks like the files are not present on the arbiter brick and need healing. But the parent does not have the pending markers set for those entries. The workaround is to do a lookup from the mount on the file that needs healing, so that the entry is created on the arbiter brick, and then run the volume heal to do the healing. Follow

Can't heal a volume: "Please check if all brick processes are running."

2018 Mar 14

On Wed, Mar 14, 2018 at 3:36 PM, Anatoliy Dmytriyev <tolid at tolid.eu.org> wrote:

> Hi Karthik,
>
> Thanks a lot for the explanation.
>
> Does it mean a distributed volume health can be checked only by "gluster
> volume status " command?
>

Yes. I am not aware of any other command which can give the status of plain distribute volume which is similar to

Can't heal a volume: "Please check if all brick processes are running."

2018 Mar 12

Hello,

We have a very fresh gluster 3.10.10 installation. Our volume is created as a distributed volume, 9 bricks, 96TB in total (87TB after 10% of gluster disk space reservation).

For some reason I can't "heal" the volume:

# gluster volume heal gv0
Launching heal operation to perform index self heal on volume gv0 has been unsuccessful on bricks that are down. Please check if all brick processes

gfid entries in volume heal info that do not heal

2017 Oct 16

OK, so here's my output of the volume info and the heal info. I have not yet tracked down physical location of these files, any tips to finding them would be appreciated, but I'm definitely just wanting them gone. I forgot to mention earlier that the cluster is running 3.12 and was upgraded from 3.10; these files were likely stuck like this when it was on 3.10.

[root at tpc-cent-glus1-081017 ~]#

gfid entries in volume heal info that do not heal

2017 Oct 23

Hi Jim & Matt,

Can you also check the link count in the stat output of those hardlink entries in the .glusterfs folder on the bricks? If the link count is 1 on all the bricks for those entries, then they are orphaned entries and you can delete those hardlinks. To be on the safer side, have a backup before deleting any of the entries.

Regards,
Karthik

On Fri, Oct 20, 2017 at 3:18 AM, Jim
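The link-count check described above can be sketched in Python. This is only an illustration under stated assumptions: it treats any regular file under .glusterfs whose name looks like a GFID (UUID format) and whose link count is 1 as a candidate, and it only lists candidates, it never deletes anything. The function name find_single_link_entries is hypothetical, and the result would still need to be compared across all bricks, as the message says.

```python
import os
import re
import stat
from typing import List

# GFIDs appear as UUID-style names in the paths quoted in the thread.
GFID_NAME = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)


def find_single_link_entries(brick_root: str) -> List[str]:
    """List GFID-named regular files under <brick_root>/.glusterfs with st_nlink == 1.

    Read-only sketch: reports candidates for manual review only.
    """
    gfid_dir = os.path.join(brick_root, ".glusterfs")
    candidates: List[str] = []
    for dirpath, _dirnames, filenames in os.walk(gfid_dir):
        for name in filenames:
            if not GFID_NAME.match(name.lower()):
                continue  # skip anything that is not named like a GFID
            path = os.path.join(dirpath, name)
            info = os.lstat(path)  # lstat: do not follow symlinks
            if stat.S_ISREG(info.st_mode) and info.st_nlink == 1:
                candidates.append(path)
    return sorted(candidates)
```

Running the same listing on every brick and intersecting the results would correspond to the "link count is 1 on all the bricks" condition; taking a backup first remains the advice from the thread.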

Blocking IO when hot tier promotion daemon runs

2018 Jan 10

I should add that additional testing has shown that only accessing files is held up; IO is not interrupted for existing transfers. I think this points to the heat metadata in the sqlite DB for the tier. Is it possible that a table is temporarily locked while the promotion daemon runs, so the calls to update the access count on files are blocked?

On Wed, Jan 10, 2018 at 10:17 AM, Tom Fite

gfid entries in volume heal info that do not heal

2017 Oct 24

I have 14,734 GFIDs that are different. All the different ones are only on the brick that was live during the outage and concurrent file copy-in. The brick that was down at that time has no GFIDs that are not also on the up brick. As the bricks are 10TB, the find is going to be a long-running process. I'm running several finds at once with GNU parallel, but it will still take some time.
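The comparison described above amounts to a set difference between the GFID lists of the two bricks. A minimal Python sketch, assuming each brick's GFIDs have already been collected as strings (for example from the find runs mentioned in the message), could look like this. The function name compare_gfids is hypothetical.

```python
from typing import Iterable, Set, Tuple


def compare_gfids(brick_a: Iterable[str], brick_b: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Return (GFIDs only on brick A, GFIDs only on brick B).

    Values are trimmed and lower-cased so formatting differences between
    the two listings do not show up as mismatches.
    """
    a = {g.strip().lower() for g in brick_a if g.strip()}
    b = {g.strip().lower() for g in brick_b if g.strip()}
    return a - b, b - a
```

In the situation reported here, the first set (the brick that stayed up) would hold the 14,734 differing GFIDs and the second set (the brick that was down) would be empty.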

Can't heal a volume: "Please check if all brick processes are running."

2018 Mar 13

Hi Anatoliy, The heal command is basically used to heal any mismatching contents between replica copies of the files. For the command "gluster volume heal <volname>" to succeed, you should have the self-heal-daemon running, which is true only if your volume is of type replicate/disperse. In your case you have a plain distribute volume where you do not store the replica of any
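The rule stated above (heal needs the self-heal daemon, which runs only for replicate/disperse volumes) is what a type check before launching heal would test. The following Python sketch shows the idea only. It assumes the volume type is available as a string such as "Distribute", "Replicate" or "Distributed-Disperse"; those exact spellings, and the function names heal_is_applicable and explain_heal, are assumptions for illustration and not gluster's own code or messages.

```python
def heal_is_applicable(volume_type: str) -> bool:
    """True if the volume type keeps redundant copies that heal can reconcile.

    Assumption: the type string contains 'replicate' or 'disperse' for such
    volumes, and a plain distribute volume contains neither.
    """
    kind = volume_type.strip().lower()
    return "replicate" in kind or "disperse" in kind


def explain_heal(volume_type: str) -> str:
    """Build a hypothetical, friendlier message based on the volume type."""
    if heal_is_applicable(volume_type):
        return f"Volume type '{volume_type}': heal can be launched."
    return (
        f"Volume type '{volume_type}': heal does not apply, "
        "because no replica or disperse copies exist to reconcile."
    )


print(explain_heal("Distribute"))
```

A check like this would let the tool report that heal is not meaningful for a plain distribute volume instead of suggesting that brick processes are down.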

gfid entries in volume heal info that do not heal

2017 Oct 17

Attached is the heal log for the volume as well as the shd log.

>> Run these commands on all the bricks of the replica pair to get the attrs set on the backend.

[root at tpc-cent-glus1-081017 ~]# getfattr -d -e hex -m . /exp/b1/gv0/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2
getfattr: Removing leading '/' from absolute path names
# file:
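The path in the getfattr command above shows how a GFID maps to a backend location: the first two hex characters and the next two become nested directories under .glusterfs, followed by the full GFID. A small Python sketch of that mapping follows. The function name gfid_backend_path is hypothetical, and accepting a "0x"-prefixed hex string is an added convenience assumed here for values copied from hex xattr dumps.

```python
import posixpath
import uuid


def gfid_backend_path(brick_root: str, gfid: str) -> str:
    """Map a GFID to <brick_root>/.glusterfs/<aa>/<bb>/<gfid>.

    Accepts the dashed UUID form or a bare/0x-prefixed 32-digit hex string.
    Raises ValueError if the input is not a valid 128-bit identifier.
    """
    text = gfid.strip()
    if text.lower().startswith("0x"):
        text = text[2:]  # drop the hex prefix before parsing
    canonical = str(uuid.UUID(text))  # normalises to lower-case dashed form
    return posixpath.join(
        brick_root, ".glusterfs", canonical[0:2], canonical[2:4], canonical
    )


print(gfid_backend_path("/exp/b1/gv0", "108694db-c039-4b7c-bd3d-ad6a15d811a2"))
# prints /exp/b1/gv0/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2
```

The printed path matches the one used in the thread, which is the only confirmation available here that the layout assumption holds for this brick.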
