Daniel Maher
2008-Dec-08 10:26 UTC
[Gluster-users] simple AFR setup, one server crashes, entire cluster becomes unusable ?
Hello all,

I have been running a four-node (two servers, two clients) server-based AFR cluster for some time now, the architecture of which is described fairly accurately by the following wiki page:

http://www.gluster.org/docs/index.php/High-availability_storage_using_server-side_AFR

In summary, there are two servers and two clients; the clients are set up to connect to a single hostname, which is a round-robin DNS entry for both of the servers.

Last night, glusterfsd on one of the servers crashed (w/ coredump), and instead of the remaining server being used automatically, the entire cluster became unusable. The logs of both the remaining functional server and the clients are littered with tens of thousands of error messages, and the mounted shares were not accessible.

It is (was?) my understanding that Gluster is tolerant of faults wherein one of the nodes becomes inaccessible. Is this or is this not the case?

Particulars...

Both servers:

[root at server glusterfs]# uname -s -r -o -i
Linux 2.6.25.10-86.fc9.i686 i386 GNU/Linux
[root at server glusterfs]# cat /etc/redhat-release
Fedora release 9 (Sulphur)

GLUSTER CONFIG: http://glusterfs.pastebin.com/m45feb982

Both clients:

[root at client glusterfs]# uname -s -r -o -i
Linux 2.6.24.4 x86_64 GNU/Linux
[root at client glusterfs]# cat /etc/redhat-release
Fedora release 8 (Werewolf)

GLUSTER CONFIG: http://glusterfs.pastebin.com/m48b7dd28

LOGS FROM THE INCIDENT: http://glusterfs.pastebin.com/m72cbc8f5
(excerpts from all four machines)

(note the following backtrace from the server that crashed...)

[0x110400]
/usr/lib/libglusterfs.so.0(dict_del+0x2d)[0x808e7d]
/usr/lib/glusterfs/1.3.12/xlator/protocol/client.so(notify+0x21b)[0x126a4b]
/usr/lib/libglusterfs.so.0(transport_notify+0x3d)[0x81374d]
/usr/lib/libglusterfs.so.0(sys_epoll_iteration+0xf9)[0x814779]
/usr/lib/libglusterfs.so.0(poll_iteration+0xa0)[0x8138f0]
[glusterfs](main+0x786)[0x804a156]
/lib/libc.so.6(__libc_start_main+0xe6)[0xb655d6]
[glusterfs][0x8049431]
---------

What could have caused Gluster to crash? Should the cluster have continued to function or not? What, if anything, can be done to prevent this from happening in the future?

Thank you, all.
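For reference, the client-side fragment in this kind of setup looks roughly like the following. This is a sketch from memory, not my actual pastebin config, and the hostname is a placeholder for our real round-robin entry:

volume remote
  type protocol/client
  option transport-type tcp/client
  option remote-host gluster.example.com  # round-robin DNS entry for both servers
  option remote-subvolume home            # the AFR'd volume exported by the servers
end-volume

--
Daniel Maher <dma+gluster AT witbe DOT net>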
Daniel Maher
2008-Dec-08 14:17 UTC
[Gluster-users] simple AFR setup, one server crashes, entire cluster becomes unusable ?
Stas Oskin wrote:
> Based on my limited knowledge of GlusterFS, the most reliable and
> recommended way (in wiki) is client-side AFR, where the clients are
> aware of the servers' status, and replicate the files accordingly.

I've reviewed the AFR-related sections of the documentation on the wiki...

http://www.gluster.org/docs/index.php/GlusterFS_Translators_v1.3#Automatic_File_Replication_Translator_.28AFR.29
http://www.gluster.org/docs/index.php/Understanding_AFR_Translator

Nowhere in those sections is it stated, either directly or implicitly, that client-side AFR is more reliable than server-side AFR. I'm not saying that the statement is incorrect, but rather that the documentation noted above doesn't seem to suggest that this is the case.

How, exactly, does relying on the clients to perform the AFR logic become more reliable than allowing the servers to do so? In either case, Gluster is responsible for all of the transactions, and for determining how to deal with node failure...

I am also curious about the network traffic implications of such a change. In the current setup, the overhead of replication is restricted to two nodes - the servers. Perhaps I misunderstand client-based AFR (which is entirely possible!), but I suspect that my replication overhead would increase with each client, since each client would send its writes to both servers. Currently this isn't a problem, but as the number of clients increases, so would the overhead - correct? We intend to double the number of servers as well (remote site) - wouldn't this in turn double the replication overhead for each client? This would get out of hand fairly quickly...

Don't get me wrong, I am more than happy to try a client-based AFR config if it truly is superior; however, as of right now I don't know how or why this would be the case.

Thank you all for your continued suggestions and discourse.
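For what it's worth, my understanding of a client-side AFR client spec is roughly the following sketch (placeholder IPs and names, not a tested config) - it's the two protocol/client connections per client that make me think every write crosses the wire twice:

volume server1
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.0.0.1       # first storage server (placeholder)
  option remote-subvolume brick
end-volume

volume server2
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.0.0.2       # second storage server (placeholder)
  option remote-subvolume brick
end-volume

# The AFR translator runs on the client, so each write is sent over the
# network to both servers - replication traffic scales with client count.
volume afr
  type cluster/afr
  subvolumes server1 server2
end-volume

--
Daniel Maher <dma+gluster AT witbe DOT net>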
Keith Freedman
2008-Dec-08 21:32 UTC
[Gluster-users] simple AFR setup, one server crashes, entire cluster becomes unusable ?
At 06:17 AM 12/8/2008, Daniel Maher wrote:
> Stas Oskin wrote:
> > Based on my limited knowledge of GlusterFS, the most reliable and
> > recommended way (in wiki) is client-side AFR, where the clients are
> > aware of the servers' status, and replicate the files accordingly.
>
> I've reviewed the AFR-related sections of the documentation on the wiki...
>
> http://www.gluster.org/docs/index.php/GlusterFS_Translators_v1.3#Automatic_File_Replication_Translator_.28AFR.29
> http://www.gluster.org/docs/index.php/Understanding_AFR_Translator
>
> Nowhere in those sections is it stated, either directly or implicitly,
> that client-side AFR is more reliable than server-side AFR. I'm not
> saying that the statement is incorrect, but rather that the
> documentation noted above doesn't seem to suggest that this is the case.

The issue isn't reliability, it's availability.

If a client only talks to one server and that server goes down, then the client has nothing to 'fail over' to. However, if the client talks to both servers, then if one goes down it'll keep talking to the other one.

There are costs and benefits to each approach. Server-side AFR is handy for ensuring that the filesystems are in sync, so no matter which server a client connects to, it'll have the correct data. With client-side AFR you lend yourself to more configuration problems. For example: if client 1 only knows about server 1, it will update files happily and no AFR takes place. If client 2 is doing client-side AFR between server 1 and server 2, then it keeps both servers in sync, and occasionally, when it accesses a file that client 1 updated on server 1, client 2 takes on the responsibility of replicating that file to server 2.

I really think a better approach would be to always have server-side AFR, and then, when a gluster client connects to a server, have it be given the AFR config, so that it has a 'failover pool' it can use in case its connection to its primary server gets interrupted.

Hopefully this will make it into a future version of gluster, because I think it will really simplify administration and increase availability. There could be an option to make the client responsible for the replication, but the control and config should be centralized at the server, to eliminate cases where some clients are replicating to certain servers and not others.
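To make that mismatch concrete, a hypothetical sketch (placeholder IPs, nobody's real config):

# Client 1's entire spec is just this one block, so nothing it writes
# is ever replicated anywhere:
volume server1
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.0.0.1
  option remote-subvolume brick
end-volume

# Client 2 additionally defines a server2 volume the same way, then AFRs
# the pair - so it quietly ends up doing the replication work for
# client 1's files whenever it touches them:
volume afr
  type cluster/afr
  subvolumes server1 server2
end-volume

My .02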
Daniel Maher
2008-Dec-09 09:29 UTC
[Gluster-users] AFR w/ RRDNS failover - does it work or not ? (WAS: simple AFR setup, one server crashes, entire cluster becomes unusable ?)
Keith Freedman wrote:
> The issue isn't reliability, it's availability.
>
> If a client only talks to one server and that server goes down, then
> the client has nothing to 'fail over' to. However, if the client talks
> to both servers, then if one goes down it'll keep talking to the other
> one.

Either the clients will honour the RRDNS and pick another server, or they won't - unfortunately, we now have a case where two opposing possibilities are being presented. To wit:

From the "Gotcha" page:

http://www.gluster.org/docs/index.php/AFR_(Automatic_File_Replication)_-_Things_to_keep_in_mind_and_gotchas

"Applies to server side [...] The clients connect only to 1 server. You would need to implement some kind of load balancing or something either with round robin DNS [...]"

"If you have client1 connected to server1 and client2 connected to server2, and then server2 goes down, so does client2. The cluster also becomes unavailable."

OK, that seems like a straightforward enough statement. However, if we take a look back through the mailing list archives, we find a statement from Mr. Anand Avati which suggests exactly the opposite:

http://lists.nongnu.org/archive/html/gluster-devel/2008-04/msg00007.html

[...] "Or, put another way, if ClientA (by chance) resolves roundrobin.gluster.local to 192.168.252.1, but .1 is currently down - what happens? it will attempt on .2, and if that fails (or disconnects after a while), it will attempt on .3, and once all the entries are used 'once', it will do a fresh dns query. it does not honor dns refresh timeouts (yet)."

The remaining basic question, then, is this: does AFR w/ RRDNS failover work or not?

If it does, then the "Gotcha" page should be updated, /and/ further investigation is required to determine why it failed to operate as advertised in my environment. If it does /not/, then the "Gotcha" page should be updated, and the wiki page I wrote (based largely on the suggestions of the developers) should likely be scrapped. :P

As always, thank you all for your continued discourse!

--
Daniel Maher <dma+gluster AT witbe DOT net>
Stas Oskin
2008-Dec-09 10:47 UTC
[Gluster-users] Fwd: simple AFR setup, one server crashes, entire cluster becomes unusable ?
Hi.

What about using Wackamole and server-side AFR?

Wackamole (http://www.backhand.org/wackamole/) allows one to set up a P2P kind of fault tolerance, where a remaining server takes over the IP of the crashed one. The client could then continue working with the remaining server.

What do you think about this?

Also, can someone provide more info about server-side AFR? I remember having only seen some config examples, but never any info on how it actually works.

Regards.

2008/12/8 Keith Freedman <freedman at freeformit.com>:
> The issue isn't reliability, it's availability.
> [...]
> My .02
Keith Freedman
2008-Dec-09 11:11 UTC
[Gluster-users] simple AFR setup, one server crashes, entire cluster becomes unusable ?
At 02:47 AM 12/9/2008, Stas Oskin wrote:
> Hi.
>
> What about using Wackamole and server-side AFR?
>
> Wackamole (http://www.backhand.org/wackamole/) allows one to set up a
> P2P kind of fault tolerance, where a remaining server takes over the IP
> of the crashed one. The client could then continue working with the
> remaining server.
>
> What do you think about this?

I think this would likely be fine. The client would time out, then try to reconnect, at which point it would connect to the other server. Server-side AFR also keeps the clients out of the replication process, which seems better to me.

> Also, can someone provide more info about server-side AFR? I remember
> having only seen some config examples, but never any info on how it
> actually works.

Here's my server config:

volume home1
  type storage/posix              # POSIX FS translator
  option directory /gluster/home  # Export this directory
end-volume

volume posix-locks-home1
  type features/posix-locks
  option mandatory on
  subvolumes home1
end-volume

## Reference volume "home2" from remote server
volume home2
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.2.2             # IP address of remote host
  option remote-subvolume posix-locks-home1  # use home1 on remote host
  option transport-timeout 10
end-volume

### Create automatic file replication
volume home
  type cluster/afr
  option read-subvolume posix-locks-home1
  subvolumes posix-locks-home1 home2
end-volume

### Add network serving capability to above home.
volume server
  type protocol/server
  option transport-type tcp/server  # For TCP/IP transport
  subvolumes posix-locks-home1
  option auth.addr.posix-locks-home1.allow 192.168.2.2,127.0.0.1
  ### I believe the following will do what you want; it's not exactly the
  ### same as mine, since I added the auth option for the clients
  ### (192.168.1.x) to mount home - the AFR volume:
  option auth.addr.home.allow 192.168.1.1,192.168.1.2,127.0.0.1
end-volume
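On the second server, the spec is the mirror image. Roughly the following sketch - note that server 1's IP doesn't appear above, so 192.168.2.1 is an assumption on my part; the local home1/posix-locks-home1 bricks are defined exactly as before:

volume home2
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.2.1             # assumed IP - points back at server 1
  option remote-subvolume posix-locks-home1
  option transport-timeout 10
end-volume

volume home
  type cluster/afr
  option read-subvolume posix-locks-home1    # still prefer the local brick
  subvolumes posix-locks-home1 home2
end-volume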
Stas Oskin
2008-Dec-09 17:28 UTC
[Gluster-users] simple AFR setup, one server crashes, entire cluster becomes unusable ?
Hi.

Thanks for the example, but how does server-side AFR actually work?

I mean, when you put a file on one server, does it write it to the second one? And vice-versa?

Regards.

> Here's my server config:
>
> volume home1
>   type storage/posix              # POSIX FS translator
>   option directory /gluster/home  # Export this directory
> end-volume
> [...]
Keith Freedman
2008-Dec-09 17:36 UTC
[Gluster-users] simple AFR setup, one server crashes, entire cluster becomes unusable ?
At 09:28 AM 12/9/2008, Stas Oskin wrote:
> Hi.
>
> Thanks for the example, but how does server-side AFR actually work?
>
> I mean, when you put a file on one server, does it write it to the
> second one? And vice-versa?

Yes.

Client 1 updates a file on server 1. Server 1 and server 2 (if AFR'ed) communicate, and server 1 pushes the file to server 2.

Client 1 reads a file from server 1. Server 1 and server 2 coordinate to see if they are already in sync; if so, server 1 sends the file to client 1. If not, server 1 fetches the newer version first, then sends it to client 1.
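Mapping that onto the config I posted earlier, the piece doing the work is the AFR volume on each server. Same config as before; the comments are just my gloss on the flow:

volume home
  type cluster/afr
  # Writes landing here go to the local brick (posix-locks-home1) and are
  # pushed to the remote brick (home2, i.e. the other server) by this
  # translator; reads are served from the local brick once both sides
  # agree they are in sync.
  option read-subvolume posix-locks-home1
  subvolumes posix-locks-home1 home2
end-volume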