Richard de Vries
2010-Oct-14 20:51 UTC
[Gluster-users] mysql replication between two nodes
Hello All,

I have a two node setup with replication between the nodes. The mysql
database runs on only one of the two nodes, and its files are replicated.
The files are located on both nodes at /export/database and accessed via
glusterfs at /opt/test/database.

Gluster version 3.0.5 on RHEL5.5.

In normal operation the database files are nicely replicated to the other
node. When you disconnect the network cable, wait some time and reconnect,
the standby node comes back and some database files are out of sync. If at
that point you stat the file /opt/test/database/data.MYD several times, the
stat returns different information: sometimes the information of the file
on node 1 and sometimes the information of the file on node 2.

I also notice that before the disconnection of the cable the files on both
nodes under /export/database are nicely updated, but after the
disconnect/reconnect only the files on node 1 under /export/database are
updated (seen with the stat command). An ls -lR of /opt/test/database on
node 1 does not force an update of the files on node 2.

Could you help me find out why this occurs? As a workaround I now have to
stop the database, rsync the data and restart the database. After that,
the replication is fine again.

Thanks!
Richard
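[For reference, a minimal sketch of the workaround described above,
assuming node 1 holds the up-to-date copy, mysqld is managed by the
standard RHEL init script, and passwordless ssh from node 1 to node 2;
the init script name and the ssh setup are assumptions, not from the
original post:]

    # Run on node 1 (the node with the current copy of the database).
    service mysqld stop                           # quiesce the database, close the open fds
    rsync -a --delete /export/database/ node2:/export/database/
    service mysqld start                          # reopen the files; both bricks start in sync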
Hello!

Quoting <rdevries1000 at gmail.com> (14.10.10 22:51):

> As a solution I have now to stop the database, rsync the data and
> restart the database. After that, the replication goes again fine.

It looks like this is the way to go. In a replicate setup an open file
descriptor has a connection to each brick. When a connection breaks
(networking problems, crash / reboot of a server) it is never
reestablished. To regain the sync between the bricks the descriptor must
be closed and reopened - that is, the application restarted.

Beat

--
 \|/                 Beat Rubischon <beat at 0x1b.ch>
( 0^0 )                     http://www.0x1b.ch/~beat/
 oOO--(_)--OOo---------------------------------------------------
 My experiences, thoughts and dreams: http://www.0x1b.ch/blog/
Richard de Vries
2010-Oct-15 10:03 UTC
[Gluster-users] mysql replication between two nodes
Hello Beat,

This is a pity, because stopping the service only to resync the standby
node is not so nice...

After a reboot of node 2, a stat of the database file in
/opt/test/database shows different output: one time from node 1 and
another time from node 2. What is the role of self heal in this? It is
noticed (via stat) that the files are not equal.

Would you see the same behaviour with, for example, qemu-kvm, which also
keeps files open?

Regards,
Richard

> Hello!
>
> Quoting <rdevries1000 at gmail.com> (14.10.10 22:51):
>
>> As a solution I have now to stop the database, rsync the data and
>> restart the database. After that, the replication goes again fine.
>
> It looks like this is the way to go. In a replicate setup an open file
> descriptor has a connection to each brick. When a connection breaks
> (networking problems, crash / reboot of a server) it is never
> reestablished. To regain the sync between the bricks the descriptor
> must be closed and reopened - that is, the application restarted.
>
> Beat
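[To make the flip-flopping visible, a small diagnostic sketch; ssh access
from the client to both nodes is an assumption. Comparing stat output on
the glusterfs mount against the brick copies shows which replica each
answer comes from:]

    # Through the glusterfs mount - repeated calls return differing metadata:
    stat /opt/test/database/data.MYD
    stat /opt/test/database/data.MYD

    # Directly on each brick, bypassing glusterfs - compare size and mtime:
    ssh node1 stat /export/database/data.MYD
    ssh node2 stat /export/database/data.MYD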
Richard de Vries
2010-Oct-18 12:59 UTC
[Gluster-users] mysql replication between two nodes
When disabling the stat-prefetch translator, the issue with different stat
information is gone. That's good.

The only issue that remains is that if the standby node fails, the files
are only synchronized (self healed) when the database is restarted or an
update is done to the database (mysql internally does a write to the open
file). An ls -lRt does not trigger a self heal...

Regards,
Richard

> You need to look at the 3.1 release. It is supposed to get rid of this
> problem.

> On Fri, Oct 15, 2010 at 7:26 AM, Richard de Vries
> <rdevries1000 at gmail.com> wrote:
>
> The mysql database only runs on one node at a time.
> I still find it hard to understand why you need to restart the service
> if a brick goes down and comes back again.
>
> This is the volume file I'm using.
>
> glusterfsd.vol server file:
>
> volume posix1
>   type storage/posix
>   option directory /export/database
> end-volume
>
> volume locks1
>   type features/locks
>   subvolumes posix1
> end-volume
>
> volume database
>   type performance/io-threads
>   option thread-count 8
>   subvolumes locks1
> end-volume
>
> volume server
>   type protocol/server
>   option transport-type tcp/server
>   option auth.addr.database.allow *
>   option transport.socket.listen-port 6996
>   option transport.socket.nodelay on
>   subvolumes database
> end-volume
>
> database.vol file:
>
> volume databasenode1
>   type protocol/client
>   option transport-type tcp
>   option transport.socket.nodelay on
>   option remote-port 6996
>   option ping-timeout 2
>   option remote-host node1
>   option remote-subvolume database
> end-volume
>
> volume databasenode2
>   type protocol/client
>   option transport-type tcp
>   option transport.socket.nodelay on
>   option remote-port 6996
>   option ping-timeout 2
>   option remote-host node2
>   option remote-subvolume database
> end-volume
>
> volume replicate
>   type cluster/replicate
>   subvolumes databasenode1 databasenode2
> end-volume
>
> volume stat-performance
>   type performance/stat-prefetch
>   subvolumes replicate
> end-volume
>
> Maybe the stat-prefetch translator has an influence on this stat output.
> I'll try to disable it and test again.
>
> Regards,
> Richard
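[For completeness, a sketch of the client volfile with the stat-prefetch
translator taken out - the replicate volume simply becomes the topmost
(mounted) volume; everything else is unchanged from the file quoted
above:]

    # database.vol with performance/stat-prefetch removed;
    # "replicate" is now the top-level volume handed to the mount.
    volume databasenode1
      type protocol/client
      option transport-type tcp
      option transport.socket.nodelay on
      option remote-port 6996
      option ping-timeout 2
      option remote-host node1
      option remote-subvolume database
    end-volume

    volume databasenode2
      type protocol/client
      option transport-type tcp
      option transport.socket.nodelay on
      option remote-port 6996
      option ping-timeout 2
      option remote-host node2
      option remote-subvolume database
    end-volume

    volume replicate
      type cluster/replicate
      subvolumes databasenode1 databasenode2
    end-volume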
Richard de Vries
2010-Oct-20 09:55 UTC
[Gluster-users] mysql replication between two nodes
The changelog for version 3.1 contains the following entry:

  Author: Anand Avati <avati at gluster.com>
  Date:   Wed Sep 29 06:53:35 2010 +0000

      replicate: remove checks which prevented self-heal when open fds
      were present

      this check is not needed anymore since the introduction of
      changelog piggybacking as the optimization technique instead of
      first-write-to-flush technique

Does this tackle the problem seen with self-heal of the open files of the
mysql database? Will there be a 3.0.6 version that contains these fixes?

Regards,
Richard

> The only issue that remains is that if the standby node fails, the
> files are only synchronized (self healed) when the database is
> restarted or an update is done to the database (mysql internally does
> a write to the open file). An ls -lRt does not trigger a self heal...
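[For readers hitting the same issue: on 3.0.x the usual way to force
self-heal is a recursive walk that stats every file through the glusterfs
mount - a sketch; as noted above, this does not help for files a process
still holds open:]

    # Walk the whole mount and stat each entry; the lookups trigger
    # self-heal for out-of-sync files that are not held open.
    find /opt/test/database -noleaf -print0 | xargs --null stat >/dev/null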
Richard de Vries
2010-Oct-22 06:12 UTC
[Gluster-users] mysql replication between two nodes
Hello Amar,

Many thanks, this solves the problem! Tested with rc2.

Regards,
Richard

> Yes.. You can try with the 3.0.6rc1 tarball available in our
> qa-releases directory (
> http://download.gluster.com/pub/gluster/glusterfs/qa-releases/glusterfs-3.0.6rc1.tar.gz
> ) and see if this fixes the issues.
>
> Thanks,
> Amar
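[For anyone wanting to try the qa release, a minimal build-from-source
sketch using the tarball URL quoted above; the standard autotools steps
and the advice to install on both nodes are assumptions, not from the
thread:]

    # Fetch, build and install the 3.0.6 release candidate from source.
    wget http://download.gluster.com/pub/gluster/glusterfs/qa-releases/glusterfs-3.0.6rc1.tar.gz
    tar xzf glusterfs-3.0.6rc1.tar.gz
    cd glusterfs-3.0.6rc1
    ./configure && make && make install   # repeat on both nodes, then restart glusterfsd and remount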