Daniel Maher
2008-Dec-08 10:26 UTC
[Gluster-users] simple AFR setup, one server crashes, entire cluster becomes unusable ?
Hello all,

I have been running a four-node (two servers, two clients) server-based AFR cluster for some time now, the architecture of which is described fairly accurately by the following wiki page:

http://www.gluster.org/docs/index.php/High-availability_storage_using_server-side_AFR

In summary, there are two servers and two clients; the clients are set up to connect to a single hostname, which is a round-robin DNS entry for both of the servers.

Last night, glusterfsd on one of the servers crashed (w/ coredump), and instead of the remaining server being used automatically, the entire cluster became unusable. The logs of both the remaining functional server and the clients are littered with tens of thousands of error messages, and the mounted shares were not accessible.

It is (was?) my understanding that Gluster is tolerant of faults wherein one of the nodes becomes inaccessible. Is this or is this not the case?

Particulars...

Both servers:

[root at server glusterfs]# uname -s -r -o -i
Linux 2.6.25.10-86.fc9.i686 i386 GNU/Linux
[root at server glusterfs]# cat /etc/redhat-release
Fedora release 9 (Sulphur)

GLUSTER CONFIG: http://glusterfs.pastebin.com/m45feb982

Both clients:

[root at client glusterfs]# uname -s -r -o -i
Linux 2.6.24.4 x86_64 GNU/Linux
[root at client glusterfs]# cat /etc/redhat-release
Fedora release 8 (Werewolf)

GLUSTER CONFIG: http://glusterfs.pastebin.com/m48b7dd28

LOGS FROM THE INCIDENT: http://glusterfs.pastebin.com/m72cbc8f5
(excerpts from all four machines)

(note the following backtrace from the server that crashed...)

[0x110400]
/usr/lib/libglusterfs.so.0(dict_del+0x2d)[0x808e7d]
/usr/lib/glusterfs/1.3.12/xlator/protocol/client.so(notify+0x21b)[0x126a4b]
/usr/lib/libglusterfs.so.0(transport_notify+0x3d)[0x81374d]
/usr/lib/libglusterfs.so.0(sys_epoll_iteration+0xf9)[0x814779]
/usr/lib/libglusterfs.so.0(poll_iteration+0xa0)[0x8138f0]
[glusterfs](main+0x786)[0x804a156]
/lib/libc.so.6(__libc_start_main+0xe6)[0xb655d6]
[glusterfs][0x8049431]
---------

What could have caused Gluster to crash? Should the cluster have continued to function or not? What, if anything, can be done to prevent this from happening in the future?

Thank you, all.
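For reference, the client-side fragment in this kind of setup looks roughly like the following. This is a sketch from memory, not my actual pastebin config, and the hostname is a placeholder for our real round-robin entry:

volume remote
  type protocol/client
  option transport-type tcp/client
  option remote-host gluster.example.com  # round-robin DNS entry for both servers
  option remote-subvolume home            # the AFR'd volume exported by the servers
end-volume

--
Daniel Maher <dma+gluster AT witbe DOT net>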
Daniel Maher
2008-Dec-08 14:17 UTC
[Gluster-users] simple AFR setup, one server crashes, entire cluster becomes unusable ?
Stas Oskin wrote:
> Based on my limited knowledge of GlusterFS, the most reliable and
> recommended way (in wiki) is client-side AFR, where the clients are
> aware of the servers' status, and replicate the files accordingly.

I've reviewed the AFR-related sections of the documentation on the wiki...

http://www.gluster.org/docs/index.php/GlusterFS_Translators_v1.3#Automatic_File_Replication_Translator_.28AFR.29
http://www.gluster.org/docs/index.php/Understanding_AFR_Translator

Nowhere in those sections is it stated, either directly or implicitly, that client-side AFR is more reliable than server-side AFR. I'm not saying that the statement is incorrect, but rather that the documentation noted above doesn't seem to suggest that this is the case.

How, exactly, does relying on the clients to perform the AFR logic become more reliable than allowing the servers to do so? In either case, Gluster is responsible for all of the transactions, and for determining how to deal with node failure...

I am also curious about the network traffic implications of such a change. In the current setup, the overhead of replication is restricted to two nodes - the servers. Perhaps I misunderstand client-based AFR (which is entirely possible!), but I suspect that my replication overhead would increase with each client, since each client would send its writes to both servers. Currently this isn't a problem, but as the number of clients increases, so would the overhead - correct? We intend to double the number of servers as well (remote site) - wouldn't this in turn double the replication overhead for each client? This would get out of hand fairly quickly...

Don't get me wrong, I am more than happy to try a client-based AFR config if it truly is superior; however, as of right now I don't know how or why this would be the case.

Thank you all for your continued suggestions and discourse.
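For what it's worth, my understanding of a client-side AFR client spec is roughly the following sketch (placeholder IPs and names, not a tested config) - it's the two protocol/client connections per client that make me think every write crosses the wire twice:

volume server1
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.0.0.1       # first storage server (placeholder)
  option remote-subvolume brick
end-volume

volume server2
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.0.0.2       # second storage server (placeholder)
  option remote-subvolume brick
end-volume

# The AFR translator runs on the client, so each write is sent over the
# network to both servers - replication traffic scales with client count.
volume afr
  type cluster/afr
  subvolumes server1 server2
end-volume

--
Daniel Maher <dma+gluster AT witbe DOT net>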
Keith Freedman
2008-Dec-08 21:32 UTC
[Gluster-users] simple AFR setup, one server crashes, entire cluster becomes unusable ?
At 06:17 AM 12/8/2008, Daniel Maher wrote:
> Stas Oskin wrote:
> > Based on my limited knowledge of GlusterFS, the most reliable and
> > recommended way (in wiki) is client-side AFR, where the clients are
> > aware of the servers' status, and replicate the files accordingly.
>
> I've reviewed the AFR-related sections of the documentation on the wiki...
>
> http://www.gluster.org/docs/index.php/GlusterFS_Translators_v1.3#Automatic_File_Replication_Translator_.28AFR.29
> http://www.gluster.org/docs/index.php/Understanding_AFR_Translator
>
> Nowhere in those sections is it stated, either directly or implicitly,
> that client-side AFR is more reliable than server-side AFR. I'm not
> saying that the statement is incorrect, but rather that the
> documentation noted above doesn't seem to suggest that this is the case.

The issue isn't reliability, it's availability.

If a client only talks to one server and that server goes down, then the client has nothing to 'fail over' to. However, if the client talks to both servers, then if one goes down it'll keep talking to the other one.

There are costs and benefits to each approach. Server-side AFR is handy for ensuring that the filesystems are in sync, so no matter which server a client connects to, it'll have the correct data. With client-side AFR you lend yourself to more configuration problems. For example: if client 1 only knows about server 1, it will update files happily and no AFR takes place. If client 2 is doing client-side AFR between server 1 and server 2, then it keeps both servers in sync, and occasionally, when it accesses a file that client 1 updated on server 1, client 2 takes on the responsibility of replicating that file to server 2.

I really think a better approach would be to always have server-side AFR, and then, when a gluster client connects to a server, have it be given the AFR config, so that it has a 'failover pool' it can use in case its connection to its primary server gets interrupted.

Hopefully this will make it into a future version of gluster, because I think it will really simplify administration and increase availability. There could be an option to make the client responsible for the replication, but the control and config should be centralized at the server, to eliminate cases where some clients are replicating to certain servers and not others.
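To make that mismatch concrete, a hypothetical sketch (placeholder IPs, nobody's real config):

# Client 1's entire spec is just this one block, so nothing it writes
# is ever replicated anywhere:
volume server1
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.0.0.1
  option remote-subvolume brick
end-volume

# Client 2 additionally defines a server2 volume the same way, then AFRs
# the pair - so it quietly ends up doing the replication work for
# client 1's files whenever it touches them:
volume afr
  type cluster/afr
  subvolumes server1 server2
end-volume

My .02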
Daniel Maher
2008-Dec-09 09:29 UTC
[Gluster-users] AFR w/ RRDNS failover - does it work or not ? (WAS: simple AFR setup, one server crashes, entire cluster becomes unusable ?)
Keith Freedman wrote:
> The issue isn't reliability, it's availability.
>
> If a client only talks to one server and that server goes down, then
> the client has nothing to 'fail over' to. However, if the client talks
> to both servers, then if one goes down it'll keep talking to the other
> one.

Either the clients will honour the RRDNS and pick another server, or they won't - unfortunately, we now have a case where two opposing possibilities are being presented. To wit:

From the "Gotcha" page:

http://www.gluster.org/docs/index.php/AFR_(Automatic_File_Replication)_-_Things_to_keep_in_mind_and_gotchas

"Applies to server side [...] The clients connect only to 1 server. You would need to implement some kind of load balancing or something either with round robin DNS [...]"

"If you have client1 connected to server1 and client2 connected to server2, and then server2 goes down, so does client2. The cluster also becomes unavailable."

OK, that seems like a straightforward enough statement. However, if we take a look back through the mailing list archives, we find a statement from Mr. Anand Avati which suggests exactly the opposite:

http://lists.nongnu.org/archive/html/gluster-devel/2008-04/msg00007.html

[...] "Or, put another way, if ClientA (by chance) resolves roundrobin.gluster.local to 192.168.252.1, but .1 is currently down - what happens? it will attempt on .2, and if that fails (or disconnects after a while), it will attempt on .3, and once all the entries are used 'once', it will do a fresh dns query. it does not honor dns refresh timeouts (yet)."

The remaining basic question, then, is this: does AFR w/ RRDNS failover work or not?

If it does, then the "Gotcha" page should be updated, /and/ further investigation is required to determine why it failed to operate as advertised in my environment. If it does /not/, then the "Gotcha" page should be updated, and the wiki page I wrote (based largely on the suggestions of the developers) should likely be scrapped. :P

As always, thank you all for your continued discourse!

--
Daniel Maher <dma+gluster AT witbe DOT net>
Stas Oskin
2008-Dec-09 10:47 UTC
[Gluster-users] Fwd: simple AFR setup, one server crashes, entire cluster becomes unusable ?
Hi.

What about using Wackamole and server-side AFR?

Wackamole (http://www.backhand.org/wackamole/) allows one to set up a P2P kind of fault tolerance, where a remaining server takes over the IP of the crashed one. The client could then continue working with the remaining server.

What do you think about this?

Also, can someone provide more info about server-side AFR? I remember having only seen some config examples, but never any info on how it actually works.

Regards.

2008/12/8 Keith Freedman <freedman at freeformit.com>:
> The issue isn't reliability, it's availability.
> [...]
> My .02
Keith Freedman
2008-Dec-09 11:11 UTC
[Gluster-users] simple AFR setup, one server crashes, entire cluster becomes unusable ?
At 02:47 AM 12/9/2008, Stas Oskin wrote:
> Hi.
>
> What about using Wackamole and server-side AFR?
>
> Wackamole (http://www.backhand.org/wackamole/) allows one to set up a
> P2P kind of fault tolerance, where a remaining server takes over the IP
> of the crashed one. The client could then continue working with the
> remaining server.
>
> What do you think about this?

I think this would likely be fine. The client would time out, then try to reconnect, at which point it would connect to the other server. Server-side AFR also keeps the clients out of the replication process, which seems better to me.

> Also, can someone provide more info about server-side AFR? I remember
> having only seen some config examples, but never any info on how it
> actually works.

Here's my server config:

volume home1
  type storage/posix              # POSIX FS translator
  option directory /gluster/home  # Export this directory
end-volume

volume posix-locks-home1
  type features/posix-locks
  option mandatory on
  subvolumes home1
end-volume

## Reference volume "home2" from remote server
volume home2
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.2.2             # IP address of remote host
  option remote-subvolume posix-locks-home1  # use home1 on remote host
  option transport-timeout 10
end-volume

### Create automatic file replication
volume home
  type cluster/afr
  option read-subvolume posix-locks-home1
  subvolumes posix-locks-home1 home2
end-volume

### Add network serving capability to above home.
volume server
  type protocol/server
  option transport-type tcp/server  # For TCP/IP transport
  subvolumes posix-locks-home1
  option auth.addr.posix-locks-home1.allow 192.168.2.2,127.0.0.1
  ### I believe the following will do what you want; it's not exactly the
  ### same as mine, since I added the auth option for the clients
  ### (192.168.1.x) to mount home - the AFR volume:
  option auth.addr.home.allow 192.168.1.1,192.168.1.2,127.0.0.1
end-volume
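On the second server, the spec is the mirror image. Roughly the following sketch - note that server 1's IP doesn't appear above, so 192.168.2.1 is an assumption on my part; the local home1/posix-locks-home1 bricks are defined exactly as before:

volume home2
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.2.1             # assumed IP - points back at server 1
  option remote-subvolume posix-locks-home1
  option transport-timeout 10
end-volume

volume home
  type cluster/afr
  option read-subvolume posix-locks-home1    # still prefer the local brick
  subvolumes posix-locks-home1 home2
end-volume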
Stas Oskin
2008-Dec-09 17:28 UTC
[Gluster-users] simple AFR setup, one server crashes, entire cluster becomes unusable ?
Hi.

Thanks for the example, but how does server-side AFR actually work?

I mean, when you put a file on one server, does it write it to the second one? And vice-versa?

Regards.

> Here's my server config:
>
> volume home1
>   type storage/posix              # POSIX FS translator
>   option directory /gluster/home  # Export this directory
> end-volume
> [...]
Keith Freedman
2008-Dec-09 17:36 UTC
[Gluster-users] simple AFR setup, one server crashes, entire cluster becomes unusable ?
At 09:28 AM 12/9/2008, Stas Oskin wrote:
> Hi.
>
> Thanks for the example, but how does server-side AFR actually work?
>
> I mean, when you put a file on one server, does it write it to the
> second one? And vice-versa?

Yes.

Client 1 updates a file on server 1. Server 1 and server 2 (if AFR'ed) communicate, and server 1 pushes the file to server 2.

Client 1 reads a file from server 1. Server 1 and server 2 coordinate to see if they are already in sync; if so, server 1 sends the file to client 1. If not, server 1 fetches the newer version first, then sends it to client 1.
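Mapping that onto the config I posted earlier, the piece doing the work is the AFR volume on each server. Same config as before; the comments are just my gloss on the flow:

volume home
  type cluster/afr
  # Writes landing here go to the local brick (posix-locks-home1) and are
  # pushed to the remote brick (home2, i.e. the other server) by this
  # translator; reads are served from the local brick once both sides
  # agree they are in sync.
  option read-subvolume posix-locks-home1
  subvolumes posix-locks-home1 home2
end-volume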