Hi!

I have a trivial problem with self-healing. Maybe somebody will be able
to tell me what I am doing wrong, and why the files do not heal as I expect.

Configuration:

Servers: two nodes A, B
---------
volume posix
  type storage/posix
  option directory /ext3/glusterfs13/brick
end-volume

volume brick
  type features/posix-locks
  option mandatory on
  subvolumes posix
end-volume

volume server
  type protocol/server
  option transport-type tcp/server
  option auth.ip.brick.allow *
  option auth.ip.brick-ns.allow *
  subvolumes brick
end-volume
--------

Client: C
-------
volume brick1
  type protocol/client
  option transport-type tcp/client
  option remote-host A
  option remote-subvolume brick
end-volume

volume brick2
  type protocol/client
  option transport-type tcp/client
  option remote-host B
  option remote-subvolume brick
end-volume

volume afr
  type cluster/afr
  subvolumes brick1 brick2
end-volume

volume iot
  type performance/io-threads
  subvolumes afr
  option thread-count 8
end-volume
-------

Scenario:

1. mount the remote afr brick on C
2. do some ops
3. stop server A (to simulate a machine failure)
4. wait some time so clock skew between A and B is not an issue
5. write file X to the gluster mount on C
6. start server A
7. wait for C to reconnect to A
8. wait some time so clock skew between A and B is not an issue
9. touch, read, stat, write to file X, ls the directory containing X
   (all on the gluster mount on C)

And here is the problem: whatever I do, I cannot make file X appear on
the backend fs of brick A, which was down when file X was created. Help
is really appreciated.

PS. I discussed a similar auto-healing problem on gluster-devel some time
ago, and then it magically worked once, so I stopped thinking about it.
Today I see it again, and as we are planning to use glusterfs in
production soon, the auto-heal functionality is crucial.

Regards, Lukasz Osipiuk.

--
Łukasz Osipiuk
mailto: lukasz at osipiuk.net
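PS2. To make the scenario concrete, this is roughly what I do from the
shell. The spec file paths, mount point and the way the server is stopped
below are simplified examples, not my exact commands:

# on client C: mount the afr volume using the client spec shown above
glusterfs -f /etc/glusterfs/client.vol /mnt/gluster

# on server A: stop the glusterfs server process to simulate a failure
# (however you normally stop it; killall is just for illustration)
killall glusterfsd

# on client C, while A is down: create file X on the gluster mount
echo "some data" > /mnt/gluster/X

# on server A: start the server again with the same spec file
glusterfsd -f /etc/glusterfs/server.vol

# on client C, after it has reconnected to A: try to trigger self-heal
touch /mnt/gluster/X
cat /mnt/gluster/X > /dev/null
stat /mnt/gluster/X
ls -l /mnt/gluster/

# on server A: X still does not appear on the backend filesystem
ls -l /ext3/glusterfs13/brick/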
I forgot one thing,

Software version is 1.3.12

glusterfs --version
glusterfs 1.3.12 built on Nov 7 2008 18:57:06
Repository revision: glusterfs--mainline--2.5--patch-797
Copyright (c) 2006, 2007, 2008 Z RESEARCH Inc. <http://www.zresearch.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU
General Public License.

--
Łukasz Osipiuk
mailto: lukasz at osipiuk.net
Krishna Srinivas
2008-Nov-09 20:21 UTC
[Gluster-users] Still problem with trivial self heal
Lukasz,

Do you remain in the same "pwd" when A goes down and comes back up? If
so, can you "cd" out of the glusterfs mount point and "cd" back in again,
and see whether that fixes the problem?

Thanks
Krishna
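P.S. In concrete terms, something like this on client C (the mount point
below is only an example, use whatever you mount on):

cd /                    # step out of the glusterfs mount point entirely
cd /mnt/gluster         # step back in, which should force a fresh lookup
                        # of the directory that contains X
ls -l                   # list the directory, then access the file itself
stat X

# afterwards, on server A, check whether X has been created on the
# backend filesystem of the brick:
ls -l /ext3/glusterfs13/brick/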
Krishna Srinivas
2008-Nov-10 12:15 UTC
[Gluster-users] Still problem with trivial self heal
Lukasz,

That behavior is a bug. It will be fixed in coming releases.

Regards
Krishna

On Mon, Nov 10, 2008 at 5:18 PM, Łukasz Osipiuk <lukasz at osipiuk.net> wrote:
> 2008/11/9 Krishna Srinivas <krishna at zresearch.com>:
>> Lukasz,
>> Do you remain in the same "pwd" when A goes down and comes back up? If
>> so, can you "cd" out of the glusterfs mount point and "cd" back in
>> again, and see whether that fixes the problem?
>
> Hmm - it helped :)
>
> Can I rely on such behaviour? Would the directory be healed if some
> other process on the same or another machine still used it as its
> working directory?
>
> Thanks a lot, Lukasz
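As for relying on it in the meantime: the "cd" trick appears to help
because leaving and re-entering the mount point forces a fresh lookup of
the directory, so you should not need to depend on the working directory
of any particular process. Walking the tree from the mount point should
trigger the same lookups from anywhere. A rough sketch (the mount point
below is only an example; the backend path is the one from your spec file):

cd /                              # make sure this shell is not inside the mount
ls -lR /mnt/gluster > /dev/null   # walk the tree so every directory and file
                                  # is looked up afresh after A comes back

# then check the backend filesystem on server A:
ls -lR /ext3/glusterfs13/brick/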