similar to: GlusterFS healing questions

Displaying 20 results from an estimated 7000 matches similar to: "GlusterFS healing questions"

2017 Nov 09
0
GlusterFS healing questions
Hi Rolf, answers follow inline... On Thu, Nov 9, 2017 at 3:20 PM, Rolf Larsen <rolf at jotta.no> wrote: > Hi, > > We ran a test on GlusterFS 3.12.1 with erasure-coded volumes 8+2 with 10 > bricks (default config, tested with 100gb, 200gb, 400gb brick sizes, 10gbit > nics) > > 1. > Tests show that healing takes about double the time on healing 200gb vs > 100, and
2017 Nov 09
2
GlusterFS healing questions
Hi, You can set disperse.shd-max-threads to 2 or 4 in order to make heal faster. This makes my heal times 2-3x faster. Also you can play with disperse.self-heal-window-size to read more bytes at one time, but I did not test it. On Thu, Nov 9, 2017 at 4:47 PM, Xavi Hernandez <jahernan at redhat.com> wrote: > Hi Rolf, > > answers follow inline... > > On Thu, Nov 9, 2017 at
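A minimal sketch of the tuning described above, assuming a disperse volume named testvol (the volume name and the values shown are placeholders; confirm the defaults on your own version):

# gluster volume set testvol disperse.shd-max-threads 4
# gluster volume set testvol disperse.self-heal-window-size 8
# gluster volume get testvol disperse.shd-max-threads

Raising shd-max-threads lets the self-heal daemon work on more files in parallel, at the cost of additional CPU and disk load during heals.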
2017 Nov 09
0
GlusterFS healing questions
Someone on the #gluster-users irc channel said the following : "Decreasing features.locks-revocation-max-blocked to an absurdly low number is letting our distributed-disperse set heal again." Is this something to consider? Does anyone else have experience with tweaking this to speed up healing? Sent from my iPhone > On 9 Nov 2017, at 18:00, Serkan Çoban <cobanserkan at
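For reference, a sketch of how such a change could be applied and later reverted (the volume name is a placeholder; the value is illustrative and untested here):

# gluster volume set testvol features.locks-revocation-max-blocked 4
# gluster volume reset testvol features.locks-revocation-max-blocked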
2018 Mar 13
4
Can't heal a volume: "Please check if all brick processes are running."
Hi Anatoliy, The heal command is basically used to heal any mismatching contents between replica copies of the files. For the command "gluster volume heal <volname>" to succeed, you should have the self-heal-daemon running, which is true only if your volume is of type replicate/disperse. In your case you have a plain distribute volume where you do not store the replica of any
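A quick way to confirm the volume type and whether a self-heal daemon is associated with the volume (the volume name is a placeholder):

# gluster volume info myvol | grep Type
# gluster volume status myvol | grep -i "self-heal"

On a plain distribute volume the second command should return nothing, which is why the heal command reports failure.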
2018 Mar 14
2
Can't heal a volume: "Please check if all brick processes are running."
On Wed, Mar 14, 2018 at 3:36 PM, Anatoliy Dmytriyev <tolid at tolid.eu.org> wrote: > Hi Karthik, > > > Thanks a lot for the explanation. > > Does it mean a distributed volume health can be checked only by "gluster > volume status " command? > Yes. I am not aware of any other command which can give the status of plain distribute volume which is similar to
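For example, checking a plain distribute volume such as the gv0 from this thread could look like this (output omitted):

# gluster volume status gv0
# gluster volume status gv0 detail

The detail view additionally reports free disk space and inode counts per brick.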
2018 Mar 26
3
Sharding problem - multiple shard copies with mismatching gfids
On Mon, Mar 26, 2018 at 12:40 PM, Krutika Dhananjay <kdhananj at redhat.com> wrote: > The gfid mismatch here is between the shard and its "link-to" file, the > creation of which happens at a layer below that of shard translator on the > stack. > > Adding DHT devs to take a look. > Thanks Krutika. I assume shard doesn't do any dentry operations like rename,
2018 Mar 25
2
Sharding problem - multiple shard copies with mismatching gfids
Hello all, We are having a rather interesting problem with one of our VM storage systems. The GlusterFS client is throwing errors relating to GFID mismatches. We traced this down to multiple shards being present on the gluster nodes, with different gfids. Hypervisor gluster mount log: [2018-03-25 18:54:19.261733] E [MSGID: 133010] [shard.c:1724:shard_common_lookup_shards_cbk]
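One way to confirm such a mismatch by hand is to compare the gfid xattr of the conflicting shard copies directly on the bricks of each node; the brick path and shard name below are placeholders:

# getfattr -n trusted.gfid -e hex /bricks/brick1/.shard/<base-gfid>.1

If the hex values differ between nodes, the copies really do carry different gfids.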
2018 Mar 12
2
Can't heal a volume: "Please check if all brick processes are running."
Hello, We have a very fresh gluster 3.10.10 installation. Our volume is created as distributed volume, 9 bricks 96TB in total (87TB after 10% of gluster disk space reservation) For some reasons I can't "heal" the volume: # gluster volume heal gv0 Launching heal operation to perform index self heal on volume gv0 has been unsuccessful on bricks that are down. Please check if all brick processes
2018 Mar 14
0
Can't heal a volume: "Please check if all brick processes are running."
Hi Karthik, Thanks a lot for the explanation. Does it mean a distributed volume health can be checked only by "gluster volume status " command? And one more question: cluster.min-free-disk is 10% by default. What kind of "side effects" can we face if this option is reduced to, for example, 5%? Could you point to any best practice document(s)? Regards, Anatoliy
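For what it's worth, the current value could be inspected and a new one applied like this (gv0 taken from the thread; whether 5% is safe for a given workload is not assessed here):

# gluster volume get gv0 cluster.min-free-disk
# gluster volume set gv0 cluster.min-free-disk 5%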
2017 Jul 08
2
[Gluster-devel] gfid and volume-id extended attributes lost
Ram, As per the code, self-heal was the only candidate which *can* do it. Could you check the logs of the self-heal daemon and the mount to see if there are any metadata heals on root? +Sanoj Sanoj, Is there any systemtap script we can use to detect which process is removing these xattrs? On Sat, Jul 8, 2017 at 2:58 AM, Ankireddypalle Reddy <areddy at commvault.com> wrote: >
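A rough SystemTap sketch for this kind of tracing, assuming the syscall tapset is available (untested here, and it only catches removals made through the removexattr family of syscalls):

# stap -e 'probe syscall.removexattr, syscall.lremovexattr, syscall.fremovexattr { printf("%s[%d] %s\n", execname(), pid(), argstr) }'

Running it on a brick server while the problem reproduces should show which process issues the removal.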
2018 Mar 13
0
Can't heal a volume: "Please check if all brick processes are running."
Hi, Maybe someone can point me to documentation or explain this? I can't find it myself. Do we have any other useful resources besides doc.gluster.org? As far as I can see, many gluster options are not described there, or there is no explanation of what they do... On 2018-03-12 15:58, Anatoliy Dmytriyev wrote: > Hello, > > We have a very fresh gluster 3.10.10 installation. > Our volume
2018 Mar 26
0
Sharding problem - multiple shard copies with mismatching gfids
The gfid mismatch here is between the shard and its "link-to" file, the creation of which happens at a layer below that of shard translator on the stack. Adding DHT devs to take a look. -Krutika On Mon, Mar 26, 2018 at 1:09 AM, Ian Halliday <ihalliday at ndevix.com> wrote: > Hello all, > > We are having a rather interesting problem with one of our VM storage >
2017 Jul 07
0
[Gluster-devel] gfid and volume-id extended attributes lost
We lost the attributes on all the bricks on servers glusterfs2 and glusterfs3 again. [root at glusterfs2 Log_Files]# gluster volume info Volume Name: StoragePool Type: Distributed-Disperse Volume ID: 149e976f-4e21-451c-bf0f-f5691208531f Status: Started Number of Bricks: 20 x (2 + 1) = 60 Transport-type: tcp Bricks: Brick1: glusterfs1sds:/ws/disk1/ws_brick Brick2: glusterfs2sds:/ws/disk1/ws_brick
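As a sanity check, the xattrs in question can be listed directly on a brick root (path taken from the volume info above); a healthy brick should show both trusted.gfid and trusted.glusterfs.volume-id in the output:

# getfattr -d -m . -e hex /ws/disk1/ws_brick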
2018 Mar 14
0
Can't heal a volume: "Please check if all brick processes are running."
On Wed, Mar 14, 2018 at 5:42 PM, Karthik Subrahmanya <ksubrahm at redhat.com> wrote: > > > On Wed, Mar 14, 2018 at 3:36 PM, Anatoliy Dmytriyev <tolid at tolid.eu.org> > wrote: > >> Hi Karthik, >> >> >> Thanks a lot for the explanation. >> >> Does it mean a distributed volume health can be checked only by "gluster >> volume
2018 Mar 13
0
Can't heal a volume: "Please check if all brick processes are running."
Can we add a smarter error message for this situation by checking volume type first? Cheers, Laura B On Wednesday, March 14, 2018, Karthik Subrahmanya <ksubrahm at redhat.com> wrote: > Hi Anatoliy, > > The heal command is basically used to heal any mismatching contents > between replica copies of the files. > For the command "gluster volume heal <volname>"
2017 Jul 07
3
[Gluster-devel] gfid and volume-id extended attributes lost
On Fri, Jul 7, 2017 at 9:25 PM, Ankireddypalle Reddy <areddy at commvault.com> wrote: > 3.7.19 > These are the only callers for removexattr, and only _posix_remove_xattr has the potential to do removexattr, as posix_removexattr already makes sure that it is not gfid/volume-id. And, surprise surprise, _posix_remove_xattr happens only from the healing code of afr/ec. And this can only happen
2018 Jan 09
2
Bricks to sub-volume mapping
Hi Team, Please let me know how I can tell which bricks are part of which sub-volumes in the case of a disperse volume; for example, the volume below has two sub-volumes: Type: Distributed-Disperse Volume ID: 6dc8ced8-27aa-4481-bfe8-057133c31d0b Status: Started Snapshot Count: 0 Number of Bricks: 2 x (4 + 2) = 12 Transport-type: tcp Bricks: Brick1: pdchyperscale1sds:/ws/disk1/ws_brick Brick2:
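A hedged sketch of how to read the mapping, assuming the volume is named myvol: bricks are grouped into disperse sets in the order "gluster volume info" lists them, so with 2 x (4 + 2) = 12 the first six bricks form the first sub-volume and the remaining six form the second.

# gluster volume info myvol | grep -E '^Brick[0-9]+:'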
2018 Mar 26
1
Sharding problem - multiple shard copies with mismatching gfids
Ian, Do you have a reproducer for this bug? If not a specific one, a general outline of what operations were done on the file will help. regards, Raghavendra On Mon, Mar 26, 2018 at 12:55 PM, Raghavendra Gowdappa <rgowdapp at redhat.com> wrote: > > > On Mon, Mar 26, 2018 at 12:40 PM, Krutika Dhananjay <kdhananj at redhat.com> > wrote: > >> The gfid mismatch
2018 Apr 03
0
Sharding problem - multiple shard copies with mismatching gfids
Raghavendra, Sorry for the late follow up. I have some more data on the issue. The issue tends to happen when the shards are created. The easiest time to reproduce this is during an initial VM disk format. This is a log from a test VM that was launched, and then partitioned and formatted with LVM / XFS: [2018-04-03 02:05:00.838440] W [MSGID: 109048]
2018 Feb 08
5
self-heal trouble after changing arbiter brick
Hi folks, I'm having trouble moving an arbiter brick to another server because of I/O load issues. My setup is as follows: # gluster volume info Volume Name: myvol Type: Distributed-Replicate Volume ID: 43ba517a-ac09-461e-99da-a197759a7dc8 Status: Started Snapshot Count: 0 Number of Bricks: 3 x (2 + 1) = 9 Transport-type: tcp Bricks: Brick1: gv0:/data/glusterfs Brick2: gv1:/data/glusterfs Brick3:
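While such a heal runs, progress can be watched with the standard heal status commands (volume name taken from the info above):

# gluster volume heal myvol info
# gluster volume heal myvol statistics heal-count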