thr3ads.net - Gluster users - [Gluster-users] Problem with self-heal [Jul 2014]

Miloš Kozák

2014-Jul-01 20:58 UTC

[Gluster-users] Problem with self-heal

Hi,
I am running some tests on v3.5.1 in a two-node configuration with one
disk per node, in replica 2 mode.

I have two servers connected by a cable, and glusterd communicates over
this cable. I start dd to create a relatively large file. In the middle
of the write I disconnect the cable, so when writing has finished I can
see all the data on one server (node1) but only part of the file on the
other (node2). No surprise so far.

Then I put the cable back. After a while the peers are discovered and
the self-heal daemons start to communicate, so I can see:

gluster volume heal vg0 info
Brick node1:/dist1/brick/fs/
/node-middle - Possibly undergoing heal
Number of entries: 1

Brick node2:/dist1/brick/fs/
/node-middle - Possibly undergoing heal
Number of entries: 1

But no data is moving over the network, which I verified with df.

Any help? In my opinion the nodes should be synchronized after a while,
but after 20 minutes of waiting there is still nothing (the file was
2G).

Thanks Milos
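
To tell whether a heal like the one above is making progress, the entry counts printed by "gluster volume heal vg0 info" can be polled from a script. A minimal sketch, assuming the output format shown in the post (one "Number of entries:" line per brick); the function name heal_pending_total is hypothetical and not part of the gluster CLI:

```shell
#!/bin/sh
# heal_pending_total VOLNAME
# Sums the "Number of entries:" lines printed by
# "gluster volume heal VOLNAME info" across all bricks.
# Hypothetical helper; assumes the per-brick output format shown in the post.
heal_pending_total() {
    gluster volume heal "$1" info |
        awk -F': *' '/^Number of entries:/ { total += $2 } END { print total + 0 }'
}
```

Called as "heal_pending_total vg0", it would print 2 for the output quoted above. A total that stays above zero across repeated polls may indicate a stuck heal rather than a slow one, but it is only a hint and not a diagnosis.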

Ravishankar N

2014-Jul-02 05:38 UTC

On 07/02/2014 02:28 AM, Miloš Kozák wrote:
> Hi,
> I am running some test on top of v3.5.1 in my 2 nodes configuration
> with one disk each and replica 2 mode.
>
> I have two servers connected by a cable. Through this cable I let
> glusterd communicate. I start dd to create a relatively large file. In
> the middle of writing process I disconnect the cable, so on one server
> (node1) I can see all data and on the other one (node2) I can see just
> a split of the file when writing is finished

Does this mean your client (mount point) is also on node1?

> .. no surprise so far.
>
> Then I put the cable back. After a while peers are discovered,
> self-healing daemons start to communicate, so I can see:
>
> gluster volume heal vg0 info
> Brick node1:/dist1/brick/fs/
> /node-middle - Possibly undergoing heal
> Number of entries: 1
>
> Brick node2:/dist1/brick/fs/
> /node-middle - Possibly undergoing heal
> Number of entries: 1
>
> But on the network there are no data moving, which I verify by df..

When you get "Possibly undergoing heal" and no I/O is going on from the
client, it means the self-heal daemon is healing the file. Can you check
if there are messages in glustershd.log of node1 about self-heal
completion?

> Any help? In my opinion after a while I should get my nodes
> synchronized, but after 20minuts of waiting still nothing (the file
> was 2G big)

Does gluster volume status show all processes being online?

> Thanks Milos
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
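
The two checks asked for in this reply can be run together on node1. A minimal sketch, assuming the usual log location /var/log/glusterfs/glustershd.log; the function name check_heal_progress and the GLUSTERSHD_LOG override are hypothetical, and the grep pattern is deliberately broad because the exact wording of self-heal completion messages may differ between releases:

```shell
#!/bin/sh
# check_heal_progress VOLNAME
# 1. Prints "gluster volume status VOLNAME" so that any offline brick or
#    self-heal daemon stands out.
# 2. Prints the most recent self-heal lines from the self-heal daemon log.
# GLUSTERSHD_LOG is a hypothetical override of the assumed default path.
check_heal_progress() {
    log="${GLUSTERSHD_LOG:-/var/log/glusterfs/glustershd.log}"
    gluster volume status "$1"
    if [ -r "$log" ]; then
        # Broad, case-insensitive match on "selfheal" or "self-heal";
        # narrow it once the real message text is known.
        grep -i 'self-\{0,1\}heal' "$log" | tail -n 20
    else
        echo "cannot read $log" >&2
        return 1
    fi
}
```

Called as "check_heal_progress vg0", it prints the volume status followed by up to 20 matching log lines.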

Tiziano Müller

2014-Jul-02 13:09 UTC

Hi there

Not sure whether this is related, but we see the same problem with
glusterfs-3.4(.2). Several files are listed as being healed, but the
heal never finishes and the checksums are identical.
We had some problems with NTP, meaning that the clocks on the nodes
diverged by a couple of seconds. I suspect this may be the root cause,
but I could not do any further tests and the files are still in the
same state (self-healing).

Interestingly, there are other threads describing this sort of problem,
but nothing has come of them so far.

Best,
Tiziano

On 01.07.2014 22:58, Miloš Kozák wrote:
> Hi,
> I am running some test on top of v3.5.1 in my 2 nodes configuration
> with one disk each and replica 2 mode.
>
> I have two servers connected by a cable. Through this cable I let
> glusterd communicate. I start dd to create a relatively large file. In
> the middle of writing process I disconnect the cable, so on one server
> (node1) I can see all data and on the other one (node2) I can see just
> a split of the file when writing is finished.. no surprise so far.
>
> Then I put the cable back. After a while peers are discovered,
> self-healing daemons start to communicate, so I can see:
>
> gluster volume heal vg0 info
> Brick node1:/dist1/brick/fs/
> /node-middle - Possibly undergoing heal
> Number of entries: 1
>
> Brick node2:/dist1/brick/fs/
> /node-middle - Possibly undergoing heal
> Number of entries: 1
>
> But on the network there are no data moving, which I verify by df..
>
> Any help? In my opinion after a while I should get my nodes
> synchronized, but after 20minuts of waiting still nothing (the file
> was 2G big)
>
> Thanks Milos
-- 
stepping stone GmbH
Neufeldstrasse 9
CH-3012 Bern

Telefon: +41 31 332 53 63
www.stepping-stone.ch
tiziano.mueller@stepping-stone.ch

Pranith Kumar Karampuri

2014-Jul-03 05:00 UTC

On 07/02/2014 02:28 AM, Miloš Kozák wrote:
> Hi,
> I am running some test on top of v3.5.1 in my 2 nodes configuration
> with one disk each and replica 2 mode.
>
> I have two servers connected by a cable. Through this cable I let
> glusterd communicate. I start dd to create a relatively large file. In
> the middle of writing process I disconnect the cable, so on one server
> (node1) I can see all data and on the other one (node2) I can see just
> a split of the file when writing is finished.. no surprise so far.
>
> Then I put the cable back. After a while peers are discovered,
> self-healing daemons start to communicate, so I can see:
>
> gluster volume heal vg0 info
> Brick node1:/dist1/brick/fs/
> /node-middle - Possibly undergoing heal
> Number of entries: 1
>
> Brick node2:/dist1/brick/fs/
> /node-middle - Possibly undergoing heal
> Number of entries: 1
>
> But on the network there are no data moving, which I verify by df..

Could you execute "gluster volume statedump vg0" 2 times, 2 minutes
apart, and attach the files in /var/run/gluster to the bug you raised?
We need to verify if it is running into the bug fixed by
http://review.gluster.com/8187 for 3.5.2.

Pranith

> Any help? In my opinion after a while I should get my nodes
> synchronized, but after 20minuts of waiting still nothing (the file
> was 2G big)
>
> Thanks Milos
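
The statedump request in this reply can be scripted so that the two dumps are taken a fixed interval apart. A minimal sketch; the function name statedump_twice, its interval argument, and the STATEDUMP_DIR override are hypothetical, while the statedump command and the /var/run/gluster location are taken from the message itself:

```shell
#!/bin/sh
# statedump_twice VOLNAME [INTERVAL_SECONDS]
# Takes two statedumps of VOLNAME, INTERVAL_SECONDS apart (default 120),
# then lists the dump directory so the new files can be attached to the bug.
# STATEDUMP_DIR is a hypothetical override of the directory named in the thread.
statedump_twice() {
    vol="$1"
    interval="${2:-120}"
    dir="${STATEDUMP_DIR:-/var/run/gluster}"
    gluster volume statedump "$vol" || return 1
    sleep "$interval"
    gluster volume statedump "$vol" || return 1
    # Newest files first, so the dumps just written appear at the top.
    ls -lt "$dir"
}
```

Called as "statedump_twice vg0", it waits the two minutes requested above between the dumps. The listing is sorted newest first so that the fresh files are easy to pick out.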
