Stas Oskin
2009-Mar-08 08:58 UTC
[Gluster-users] GlusterFS running, but not syncing is done
Hi. I'm trying to run my first GlusterFS setup, basically 2 servers running in AFR mode. While the servers find and connect to each other, unfortunately the files are not being synchronized between them. I mean, when I place a file on one of the servers, the other one does not receive it.

Here is what I see on each of the servers:

2009-03-08 02:41:43 N [server-protocol.c:7186:mop_setvolume] server: accepted client from 192.168.253.41:1020
2009-03-08 02:41:48 D [client-protocol.c:5924:client_protocol_reconnect] home2: breaking reconnect chain
2009-03-08 02:41:48 D [client-protocol.c:5924:client_protocol_reconnect] home2: breaking reconnect chain

and

2009-03-08 02:41:43 D [client-protocol.c:6557:notify] home2: got GF_EVENT_CHILD_UP
2009-03-08 02:41:43 D [socket.c:951:socket_connect] home2: connect () called on transport already connected
2009-03-08 02:41:43 N [client-protocol.c:5853:client_setvolume_cbk] home2: connection and handshake succeeded
2009-03-08 02:41:53 D [client-protocol.c:5924:client_protocol_reconnect] home2: breaking reconnect chain
2009-03-08 02:41:53 D [client-protocol.c:5924:client_protocol_reconnect] home2: breaking reconnect chain

Any idea why the files are not synchronized, and how it can be diagnosed?

Thanks.
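One thing worth ruling out first: AFR replicates only operations that pass through the GlusterFS mount point, so a file copied directly into the backend export directory is not propagated to the other server. A minimal check, sketched here with /mnt/glusterfs and /media/storage as assumed paths (the real paths are in the volfiles later in the thread):

    # Write through the GlusterFS mount point, not the backend directory:
    echo hello > /mnt/glusterfs/afr-test.txt

    # The file should then appear in the backend export directory on
    # BOTH servers:
    ls -l /media/storage/afr-test.txt

If the file shows up on the server it was written to but not on the other, the replication side of the configuration is the place to look; if it never reaches either backend, the mount itself is suspect.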
Krishna Srinivas
2009-Mar-08 19:59 UTC
[Gluster-users] GlusterFS running, but not syncing is done
Stas,

Was it working for you previously? Any other error logs on the machine with afr? What version are you using? If it was working previously, what changed in your setup recently? Can you paste your vol files (just to be sure)?

Krishna

On Sun, Mar 8, 2009 at 2:28 PM, Stas Oskin <stas.oskin at gmail.com> wrote:
> Hi.
>
> I'm trying to run my first GlusterFS setup, basically 2 servers running in
> AFR mode.
>
> While the servers find and connect to each other, unfortunately the files
> are not being synchronized between them. I mean, when I place a file on one
> of the servers, the other one does not receive it.
>
> Here is what I receive on each of the servers:
>
> 2009-03-08 02:41:43 N [server-protocol.c:7186:mop_setvolume] server: accepted client from 192.168.253.41:1020
> 2009-03-08 02:41:48 D [client-protocol.c:5924:client_protocol_reconnect] home2: breaking reconnect chain
> 2009-03-08 02:41:48 D [client-protocol.c:5924:client_protocol_reconnect] home2: breaking reconnect chain
>
> and
>
> 2009-03-08 02:41:43 D [client-protocol.c:6557:notify] home2: got GF_EVENT_CHILD_UP
> 2009-03-08 02:41:43 D [socket.c:951:socket_connect] home2: connect () called on transport already connected
> 2009-03-08 02:41:43 N [client-protocol.c:5853:client_setvolume_cbk] home2: connection and handshake succeeded
> 2009-03-08 02:41:53 D [client-protocol.c:5924:client_protocol_reconnect] home2: breaking reconnect chain
> 2009-03-08 02:41:53 D [client-protocol.c:5924:client_protocol_reconnect] home2: breaking reconnect chain
>
> Any idea why the files are not synchronized, and how it can be diagnosed?
>
> Thanks.
Stas Oskin
2009-Mar-09 20:02 UTC
[Gluster-users] Fwd: GlusterFS running, but not syncing is done
---------- Forwarded message ----------
From: Stas Oskin <stas.oskin at gmail.com>
Date: 2009/3/9
Subject: Re: [Gluster-users] GlusterFS running, but not syncing is done
To: Krishna Srinivas <krishna at zresearch.com>

Hi.

The boxes participating in AFR are running OpenVZ host kernels - can it be related in any way to the issue?

Regards.

2009/3/9 Stas Oskin <stas.oskin at gmail.com>
> Hi.
>
> These are my new 2 vol files, one for the client and one for the server.
> Can you advise if they are correct?
>
> Thanks in advance.
>
> glusterfs.vol (client)
>
> ## Reference volume "home2" from remote server
> volume home2
>   type protocol/client
>   option transport-type tcp/client
>   option remote-host 192.168.253.41          # IP address of remote host
>   option remote-subvolume posix-locks-home1  # use home1 on remote host
>   option transport-timeout 10                # value in seconds; it should be set relatively low
> end-volume
>
> ### Create automatic file replication
> volume home
>   type cluster/afr
>   option metadata-self-heal on
>   option read-subvolume posix-locks-home1
>   # option favorite-child home2
>   subvolumes posix-locks-home1 home2
> end-volume
>
> glusterfsd.vol (server)
>
> volume home1
>   type storage/posix               # POSIX FS translator
>   option directory /media/storage  # Export this directory
> end-volume
>
> volume posix-locks-home1
>   type features/posix-locks
>   option mandatory-locks on
>   subvolumes home1
> end-volume
>
> ### Add network serving capability to above home.
> volume server
>   type protocol/server
>   option transport-type tcp
>   subvolumes posix-locks-home1
>   option auth.addr.posix-locks-home1.allow 192.168.253.41,127.0.0.1  # Allow access to "home1" volume
> end-volume
>
> 2009/3/9 Krishna Srinivas <krishna at zresearch.com>
>> Stas,
>>
>> I think there was nothing changed between rc2 and rc4 that could
>> affect this functionality.
>>
>> Your vol files look fine, I will look into why it is not working.
>>
>> Do not use a single process as both server and client, as we saw issues
>> related to locking. Can you see if using different processes for
>> server and client works fine w.r.t. replication?
>>
>> Also, the subvolumes list of all AFRs should be in the same order (in your
>> case it is interchanged).
>>
>> Regards
>> Krishna
>>
>> On Mon, Mar 9, 2009 at 5:44 PM, Stas Oskin <stas.oskin at gmail.com> wrote:
>>> Actually, I see a new version came out, rc4.
>>> Any idea if anything related was fixed?
>>> Regards.
>>> 2009/3/9 Stas Oskin <stas.oskin at gmail.com>
>>>> Hi.
>>>>
>>>>> Was it working for you previously? Any other error logs on the machine
>>>>> with afr? What version are you using? If it was working previously,
>>>>> what changed in your setup recently? Can you paste your vol files
>>>>> (just to be sure)?
>>>>
>>>> Nope, it is actually my first setup in the lab. No errors - it just seems
>>>> to not synchronize anything. The version I'm using is the latest one,
>>>> 2.0rc2.
>>>> Perhaps I need to modify anything else in addition to the GlusterFS
>>>> installation - like file-system attributes or something?
>>>> The approach I'm using is the one that was recommended by Keith over
>>>> direct emails (Keith, hope you don't mind me posting them :) ).
>>>> The idea is basically to have a single vol file both for the client and
>>>> for the server, and to have one glusterfs process doing the job both as
>>>> client and as server.
>>>> Thanks for the help.
>>>> Server 1:
>>>>
>>>> volume home1
>>>>   type storage/posix               # POSIX FS translator
>>>>   option directory /media/storage  # Export this directory
>>>> end-volume
>>>>
>>>> volume posix-locks-home1
>>>>   type features/posix-locks
>>>>   option mandatory-locks on
>>>>   subvolumes home1
>>>> end-volume
>>>>
>>>> ## Reference volume "home2" from remote server
>>>> volume home2
>>>>   type protocol/client
>>>>   option transport-type tcp/client
>>>>   option remote-host 192.168.253.42          # IP address of remote host
>>>>   option remote-subvolume posix-locks-home1  # use home1 on remote host
>>>>   option transport-timeout 10                # value in seconds; it should be set relatively low
>>>> end-volume
>>>>
>>>> ### Add network serving capability to above home.
>>>> volume server
>>>>   type protocol/server
>>>>   option transport-type tcp
>>>>   subvolumes posix-locks-home1
>>>>   option auth.addr.posix-locks-home1.allow 192.168.253.42,127.0.0.1  # Allow access to "home1" volume
>>>> end-volume
>>>>
>>>> ### Create automatic file replication
>>>> volume home
>>>>   type cluster/afr
>>>>   option metadata-self-heal on
>>>>   option read-subvolume posix-locks-home1
>>>>   # option favorite-child home2
>>>>   subvolumes home2 posix-locks-home1
>>>> end-volume
>>>>
>>>> Server 2:
>>>>
>>>> volume home1
>>>>   type storage/posix               # POSIX FS translator
>>>>   option directory /media/storage  # Export this directory
>>>> end-volume
>>>>
>>>> volume posix-locks-home1
>>>>   type features/posix-locks
>>>>   option mandatory-locks on
>>>>   subvolumes home1
>>>> end-volume
>>>>
>>>> ## Reference volume "home2" from remote server
>>>> volume home2
>>>>   type protocol/client
>>>>   option transport-type tcp/client
>>>>   option remote-host 192.168.253.41          # IP address of remote host
>>>>   option remote-subvolume posix-locks-home1  # use home1 on remote host
>>>>   option transport-timeout 10                # value in seconds; it should be set relatively low
>>>> end-volume
>>>>
>>>> ### Add network serving capability to above home.
>>>> volume server
>>>>   type protocol/server
>>>>   option transport-type tcp
>>>>   subvolumes posix-locks-home1
>>>>   option auth.addr.posix-locks-home1.allow 192.168.253.41,127.0.0.1  # Allow access to "home1" volume
>>>> end-volume
>>>>
>>>> ### Create automatic file replication
>>>> volume home
>>>>   type cluster/afr
>>>>   option metadata-self-heal on
>>>>   option read-subvolume posix-locks-home1
>>>>   # option favorite-child home2
>>>>   subvolumes home2 posix-locks-home1
>>>> end-volume
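For reference, a sketch of how such a combined volfile would typically be started with the 2.0-era binaries; the volfile path, mount point, and log options here are assumptions rather than details from the thread:

    # One glusterfs process per machine acts as both server and client,
    # mounting the AFR volume ("home") defined at the bottom of the volfile:
    glusterfs -f /etc/glusterfs/glusterfs.vol \
              -l /var/log/glusterfs.log -L DEBUG \
              /mnt/glusterfs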
Krishna Srinivas
2009-Mar-12 13:57 UTC
[Gluster-users] GlusterFS running, but not syncing is done
Hi Stas,

Excuse me for missing out on this mail. Your vol files for having 2 servers and 2 clients are incorrect.

In the server vol (on both machines) you need to have:

  protocol/server -> features/locks -> storage/posix

In the client vol (on both machines) you need to have:

  cluster/afr -> (two protocol/clients)

Each of the protocol/clients connects to one of the servers, and you would use the client vol to mount the glusterfs. A sketch of this layout follows the quoted message below. Let us know if you still face problems.

Krishna

On Tue, Mar 10, 2009 at 1:32 AM, Stas Oskin <stas.oskin at gmail.com> wrote:
> Hi.
>
> The boxes participating in AFR are running OpenVZ host kernels - can it be
> related in any way to the issue?
>
> Regards.
>
> 2009/3/9 Stas Oskin <stas.oskin at gmail.com>
>> Hi.
>>
>> These are my new 2 vol files, one for the client and one for the server.
>> Can you advise if they are correct?
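To make Krishna's layout concrete, a minimal sketch of the two volfiles, reusing the addresses, export directory, and translator options already in this thread (the volume names are illustrative, and both machines would use the same pair of files):

    # server.vol: protocol/server -> features/locks -> storage/posix
    volume home1
      type storage/posix
      option directory /media/storage
    end-volume

    volume locks
      type features/posix-locks
      option mandatory-locks on
      subvolumes home1
    end-volume

    volume server
      type protocol/server
      option transport-type tcp
      option auth.addr.locks.allow 192.168.253.41,192.168.253.42,127.0.0.1
      subvolumes locks
    end-volume

    # client.vol: cluster/afr -> two protocol/clients, one per server
    volume remote1
      type protocol/client
      option transport-type tcp/client
      option remote-host 192.168.253.41
      option remote-subvolume locks
    end-volume

    volume remote2
      type protocol/client
      option transport-type tcp/client
      option remote-host 192.168.253.42
      option remote-subvolume locks
    end-volume

    volume home
      type cluster/afr
      option metadata-self-heal on
      subvolumes remote1 remote2   # keep this order identical on both machines
    end-volume

Each machine runs glusterfsd with server.vol, and the mount uses client.vol; assuming the 2.0-era invocation, that is roughly "glusterfsd -f server.vol" followed by "glusterfs -f client.vol /mnt/glusterfs".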
Krishna Srinivas
2009-Mar-12 14:08 UTC
[Gluster-users] GlusterFS running, but not syncing is done
On Tue, Mar 10, 2009 at 2:23 AM, Keith Freedman <freedman at freeformit.com> wrote:
> At 05:34 AM 3/9/2009, Krishna Srinivas wrote:
>> Do not use a single process as both server and client, as we saw issues
>> related to locking. Can you see if using different processes for
>> server and client works fine w.r.t. replication?
>
> This is news to me. When will this be fixed?
> It used to be that single process was recommended for performance reasons?

Yes, it was recommended, but later one of the users reported a bug in file record locking.

To explain technically: fuse and protocol/server both maintain an inode table (a hashed list of inode structures). Hence, when client and server are used in the same process, a file will have two disparate inode structures, one in fuse's inode table and one in protocol/server's inode table. When two processes, one on the same machine and one on a different machine, acquire a lock on that file, each acts on a different inode structure, and hence both lock calls succeed, which is a problem. So it is not advisable to use a single process for client and server.

We have no plans of fixing this now, as things work fine when they are separate processes, and technically it is difficult to fix.

Regards
Krishna
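The failure mode Krishna describes can be illustrated with a toy model, written here in C since that is the language of the translators named in the logs. This is not GlusterFS code, just a sketch of two independent inode tables each holding their own lock state for the same path:

    /* Toy model of the single-process problem: the fuse side and the
     * protocol/server side each keep their own inode table, so the same
     * file is represented by two unrelated structures, and a lock taken
     * through one table is invisible through the other. */
    #include <stdio.h>
    #include <string.h>

    struct toy_inode {
        char path[64];
        int  locked;            /* lock state lives on the inode structure */
    };

    /* Two disparate tables, as in one process acting as client and server. */
    static struct toy_inode fuse_itable[4];
    static struct toy_inode server_itable[4];

    /* Find or create the entry for a path in the given table. */
    static struct toy_inode *lookup(struct toy_inode *table, const char *path)
    {
        for (int i = 0; i < 4; i++) {
            if (table[i].path[0] == '\0') {
                strncpy(table[i].path, path, sizeof(table[i].path) - 1);
                return &table[i];
            }
            if (strcmp(table[i].path, path) == 0)
                return &table[i];
        }
        return NULL;
    }

    /* Grant the lock unless THIS table already records one. */
    static int try_lock(struct toy_inode *table, const char *path)
    {
        struct toy_inode *ino = lookup(table, path);
        if (ino == NULL || ino->locked)
            return -1;
        ino->locked = 1;
        return 0;
    }

    int main(void)
    {
        /* A local application locks via the fuse-side table ...      */
        int local  = try_lock(fuse_itable, "/media/storage/file");
        /* ... while a remote client locks via the server-side table. */
        int remote = try_lock(server_itable, "/media/storage/file");

        /* Both are granted, because each call saw a different inode. */
        printf("local: %s, remote: %s\n",
               local  == 0 ? "granted" : "denied",
               remote == 0 ? "granted" : "denied");
        return 0;
    }

Running it prints "local: granted, remote: granted", which is exactly the double grant described above; with separate client and server processes, both lock requests funnel through the single server-side inode table and the second one is denied.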