thr3ads.net - similar to: "modifying data via fues causes heal problem"

Displaying 20 results from an estimated 6000 matches similar to: "modifying data via fues causes heal problem"

2017 Sep 04

heal info OK but statistics not working

Ravi/Karthick, If one of the self-heal processes is down, will the statistics heal-count command work? On Mon, Sep 4, 2017 at 7:24 PM, lejeczek <peljasz at yahoo.co.uk> wrote: > 1) one peer, out of four, got separated from the network, from the rest of > the cluster. > 2) that unavailable (while it was unavailable) peer got detached with > "gluster peer detach" command

2017 Sep 04

heal info OK but statistics not working

1) one peer, out of four, got separated from the network, from the rest of the cluster. 2) that unavailable (while it was unavailable) peer got detached with "gluster peer detach" command which succeeded, so now the cluster comprises three peers 3) Self-heal daemon (for some reason) does not start (with an attempt to restart glusterd) on the peer which probed that fourth peer. 4) fourth

2017 Sep 28

one brick one volume process dies?

On 13/09/17 20:47, Ben Werthmann wrote: > These symptoms appear to be the same as I've recorded in > this post: > > http://lists.gluster.org/pipermail/gluster-users/2017-September/032435.html > > On Wed, Sep 13, 2017 at 7:01 AM, Atin Mukherjee > <atin.mukherjee83 at gmail.com> wrote: > > Additionally the

2017 Sep 13

one brick one volume process dies?

These symptoms appear to be the same as I've recorded in this post: http://lists.gluster.org/pipermail/gluster-users/2017-September/032435.html On Wed, Sep 13, 2017 at 7:01 AM, Atin Mukherjee <atin.mukherjee83 at gmail.com> wrote: > Additionally the brick log file of the same brick would be required. > Please check whether the brick process went down or crashed. Doing a volume start

2017 Sep 13

one brick one volume process dies?

Please send me the logs as well, i.e. glusterd.logs and cmd_history.log. On Wed, Sep 13, 2017 at 1:45 PM, lejeczek <peljasz at yahoo.co.uk> wrote: > > > On 13/09/17 06:21, Gaurav Yadav wrote: > >> Please provide the output of gluster volume info, gluster volume status >> and gluster peer status. >> >> Apart from above info, please provide glusterd logs,

2017 Sep 13

one brick one volume process dies?

On 13/09/17 06:21, Gaurav Yadav wrote: > Please provide the output of gluster volume info, gluster > volume status and gluster peer status. > > Apart from above info, please provide glusterd logs, > cmd_history.log. > > Thanks > Gaurav > > On Tue, Sep 12, 2017 at 2:22 PM, lejeczek > <peljasz at yahoo.co.uk> wrote:

2017 Sep 13

one brick one volume process dies?

Additionally the brick log file of the same brick would be required. Please check whether the brick process went down or crashed. Doing a volume start force should resolve the issue. On Wed, 13 Sep 2017 at 16:28, Gaurav Yadav <gyadav at redhat.com> wrote: > Please send me the logs as well, i.e. glusterd.logs and cmd_history.log. > > > On Wed, Sep 13, 2017 at 1:45 PM, lejeczek

2017 Aug 02

connection to 10.5.6.32:49155 failed (Connection refused); disconnecting socket

This means the shd client is not able to establish the connection with the brick on port 49155. Now this could happen if glusterd has ended up providing a stale port back which is not what the brick is listening on. If you had killed any brick process using SIGKILL instead of SIGTERM this is expected, as portmap_signout is not received by glusterd in the former case and the old portmap entry is
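The failure described in this reply comes down to whether anything is listening on the port that glusterd handed out. A minimal Python sketch of that check follows. It is not part of GlusterFS, the helper name `port_accepts_connections` is hypothetical, and it only tests TCP reachability, not whether the listener is the expected brick process.

```python
import socket


def port_accepts_connections(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper for illustration only. A refused connection,
    a timeout, or an unreachable host all count as False.
    """
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError)
        # when nothing accepts the connection on that port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For the address in this thread, `port_accepts_connections("10.5.6.32", 49155)` returning `False` would be consistent with the "Connection refused" message, though a firewall or an unreachable host would give the same result.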

2017 Aug 01

connection to 10.5.6.32:49155 failed (Connection refused); disconnecting socket

How critical is the above? I get plenty of these on all three peers. hi guys I've recently upgraded from 3.8 to 3.10 and I'm seeing weird behavior. I see: $ gluster vol status $_vol detail; takes a long time and mostly times out. I do: $ gluster vol heal $_vol info and I see: Brick 10.5.6.32:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-CYTO-DATA Status: Transport endpoint is not connected Number

2017 Sep 04

heal info OK but statistics not working

Please provide the output of gluster volume info, gluster volume status and gluster peer status. On Mon, Sep 4, 2017 at 4:07 PM, lejeczek <peljasz at yahoo.co.uk> wrote: > hi all > > this: > $ vol heal $_vol info > outputs ok and exit code is 0 > But if I want to see statistics: > $ gluster vol heal $_vol statistics > Gathering crawl statistics on volume GROUP-WORK

2017 Sep 04

heal info OK but statistics not working

hi all this: $ vol heal $_vol info outputs ok and exit code is 0 But if I want to see statistics: $ gluster vol heal $_vol statistics Gathering crawl statistics on volume GROUP-WORK has been unsuccessful on bricks that are down. Please check if all brick processes are running. I suspect - gluster's inability to cope with a situation where one peer (which is not even a brick for a single vol on

2017 Sep 13

one brick one volume process dies?

Please provide the output of gluster volume info, gluster volume status and gluster peer status. Apart from above info, please provide glusterd logs, cmd_history.log. Thanks Gaurav On Tue, Sep 12, 2017 at 2:22 PM, lejeczek <peljasz at yahoo.co.uk> wrote: > hi everyone > > I have 3-peer cluster with all vols in replica mode, 9 vols. > What I see, unfortunately, is one brick

2017 Sep 12

one brick one volume process dies?

hi everyone I have a 3-peer cluster with all vols in replica mode, 9 vols. What I see, unfortunately, is one brick fails in one vol, when it happens it's always the same vol on the same brick. Command: gluster vol status $vol - would show brick not online. Restarting glusterd with systemctl does not help, only a system reboot seems to help, until it happens next time. How to troubleshoot this

2017 Jul 25

stuck heal process

Good Morning! We are running RedHat 7.3 with glusterfs-server-3.8.4-18.el7rhgs.x86_64. Not sure if you're able to help with this version or not. I have a 5 node setup with 1 node having no storage and only acting as a quorum node. We have a mix of direct attached storage and iSCSI SAN storage. We have distributed replica volumes created across all 4 nodes. At some point last week one of the

2017 Sep 21

Heal Info Shows Split Brain, but "file not in split brain" when attempted heal

Hello I am using Glusterfs 3.10.5 on CentOS7. A replicated distributed volume with a dist-rep hot tier. During data migration, we noticed the tierd.log on one of the nodes was huge. Upon review it seemed to be stuck on a certain set of files. Running a "gluster vol heal VOL info" showed that the same files that caused problems in the tier were in split brain. So we went to fix split

2017 Jun 02

Heal operation detail of EC volumes

Hi Serkan, On Thursday, June 01, 2017 21:31 CEST, Serkan Çoban <cobanserkan at gmail.com> wrote: >Is it possible that this matches your observations ? Yes that matches what I see. So 19 files are being healed in parallel by 19 SHD processes. I thought only one file is being healed at a time. Then what is the meaning of the disperse.shd-max-threads parameter? If I set it to 2 then each SHD thread

2017 Jun 08

Heal operation detail of EC volumes

On Fri, Jun 2, 2017 at 1:01 AM, Serkan Çoban <cobanserkan at gmail.com> wrote: > >Is it possible that this matches your observations ? > Yes that matches what I see. So 19 files are being healed in parallel by 19 > SHD processes. I thought only one file is being healed at a time. > Then what is the meaning of the disperse.shd-max-threads parameter? If I > set it to 2 then each SHD

2017 Jun 01

Heal operation detail of EC volumes

>Is it possible that this matches your observations ? Yes that matches what I see. So 19 files are being healed in parallel by 19 SHD processes. I thought only one file is being healed at a time. Then what is the meaning of the disperse.shd-max-threads parameter? If I set it to 2 then each SHD thread will heal two files at the same time? >How many IOPS can handle your bricks ? Bricks are 7200RPM NL-SAS

2017 May 29

Heal operation detail of EC volumes

Hi, When a brick fails in EC, what is the healing read/write data path? Which processes do the operations? Assume a 2GB file is being healed in a 16+4 EC configuration. I was thinking that the SHD daemon on the failed brick host will read 2GB from the network, reconstruct its 100MB chunk, and write it onto the brick. Is this right?
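As a rough check of the numbers in this question, here is a small Python sketch. It assumes the usual erasure-coding layout in which file data is split across the data bricks only (16 of the 20 bricks in a 16+4 volume), reads "2GB" as 2 GiB, and ignores stripe padding and metadata. The function name `ec_heal_traffic` is hypothetical, and the sketch says nothing about which process performs the heal.

```python
MiB = 1024 ** 2


def ec_heal_traffic(file_size_bytes: int, data_bricks: int, redundancy_bricks: int) -> dict:
    """Estimate the bytes moved to rebuild one lost fragment of one file.

    Illustrative arithmetic only: each brick is assumed to hold
    file_size / data_bricks bytes, and rebuilding one fragment is
    assumed to need data_bricks surviving fragments.
    """
    # Ceiling division so a file that does not divide evenly is not undercounted.
    fragment = -(-file_size_bytes // data_bricks)
    return {
        "total_bricks": data_bricks + redundancy_bricks,
        "fragment_bytes": fragment,
        "read_bytes": fragment * data_bricks,  # fragments read from healthy bricks
        "write_bytes": fragment,               # fragment written to the healed brick
    }


result = ec_heal_traffic(2 * 1024 * MiB, data_bricks=16, redundancy_bricks=4)
print(result["fragment_bytes"] // MiB)  # 128
print(result["read_bytes"] // MiB)      # 2048
```

Under those assumptions the rebuilt fragment is 128 MiB rather than 100 MB (the file is divided by 16, not by 20), while the read side is about 2 GiB, which agrees with the estimate in the question.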

2017 Oct 16

gfid entries in volume heal info that do not heal

Hi all, I have a volume where the output of volume heal info shows several gfid entries to be healed, but they've been there for weeks and have not healed. Any normal file that shows up on the heal info does get healed as expected, but these gfid entries do not. Is there any way to remove these orphaned entries from the volume so they are no longer stuck in the heal process? Thank you!
