Hi all,
I use gluster 3.12 on CentOS 7.
I am writing a snapshot program for my geo-replicated cluster.
When I started to run tests with my application I found
some very strange behavior regarding geo-replication in gluster.

I have set up my geo-replication according to the docs:
http://docs.gluster.org/en/latest/Administrator%20Guide/Geo%20Replication/

Both the master and slave clusters are replicated with just two
machines (VMs) and no arbiter.

I have set up a geo-user (called geouser) and do not use
root as the geo user, as specified in the docs.

Both my master and slave volumes are named: vol

If I pause the geo-replication with:
gluster volume geo-replication vol geouser@ggluster1-geo::vol pause
Pausing geo-replication session between vol & geouser@ggluster1-geo::vol has been successful

Create a snapshot:
gluster snapshot create my_snap_no_1000 vol
snapshot create: success: Snap my_snap_no_1000-2018.02.21-07.45.32 created successfully

Resume geo-replication:
gluster volume geo-replication vol geouser@ggluster1-geo::vol resume
Resuming geo-replication session between vol & geouser@ggluster1-geo::vol has been successful

Everything works fine!

But here comes the problem:
If I accidentally misspell my slave user, or leave the user out
entirely (as if I were using root), pause/resume do NOT report
any errors, no matter what user I write.
The answer is always that pausing/resuming was successful.
The problem shows up after such a "successful" pause, when I try to
create a snapshot. It fails with:
snapshot create: failed: geo-replication session is running for the volume vol. Session needs to be stopped before taking a snapshot.

gluster volume geo-replication status
MASTER NODE    MASTER VOL    MASTER BRICK    SLAVE USER    SLAVE                               SLAVE NODE    STATUS    CRAWL STATUS    LAST_SYNCED
-------------------------------------------------------------------------------------------------------------------------------------------------
ggluster1      vol           /gluster        geouser       ssh://geouser@ggluster1-geo::vol    N/A           Paused    N/A             N/A
ggluster2      vol           /gluster        geouser       ssh://geouser@ggluster1-geo::vol    N/A           Paused    N/A             N/A

After this, snapshots fail all the time!
If I use the correct user again and pause, there is no error (paused), but the snapshot
still fails.
If I resume with the correct user, there are no errors (active).
Geo-replication still works fine, but somehow something has
gone wrong so that snapshots fail.
After a restart of glusterd on all machines it starts to work again.

Here is a complete run-through:

gluster volume geo-replication status

MASTER NODE    MASTER VOL    MASTER BRICK    SLAVE USER    SLAVE                               SLAVE NODE       STATUS     CRAWL STATUS       LAST_SYNCED
----------------------------------------------------------------------------------------------------------------------------------------------------------------
ggluster1      vol           /gluster        geouser       ssh://geouser@ggluster1-geo::vol    ggluster1-geo    Active     Changelog Crawl    2018-02-12 15:49:57
ggluster2      vol           /gluster        geouser       ssh://geouser@ggluster1-geo::vol    ggluster2-geo    Passive    N/A                N/A

# Using wrong user: abc
gluster volume geo-replication vol abc@ggluster1-geo::vol pause
Pausing geo-replication session between vol & abc@ggluster1-geo::vol has been successful

gluster volume geo-replication status

MASTER NODE    MASTER VOL    MASTER BRICK    SLAVE USER    SLAVE                               SLAVE NODE    STATUS    CRAWL STATUS    LAST_SYNCED
-------------------------------------------------------------------------------------------------------------------------------------------------
ggluster1      vol           /gluster        geouser       ssh://geouser@ggluster1-geo::vol    N/A           Paused    N/A             N/A
ggluster2      vol           /gluster        geouser       ssh://geouser@ggluster1-geo::vol    N/A           Paused    N/A             N/A

gluster snapshot create snap_vol_1000 vol
snapshot create: failed: geo-replication session is running for the volume vol. Session needs to be stopped before taking a snapshot.

# Using wrong user: abc
gluster volume geo-replication vol abc@ggluster1-geo::vol resume
Resuming geo-replication session between vol & ggluster1-geo::vol has been successful

gluster volume geo-replication status

MASTER NODE    MASTER VOL    MASTER BRICK    SLAVE USER    SLAVE                               SLAVE NODE       STATUS     CRAWL STATUS       LAST_SYNCED
----------------------------------------------------------------------------------------------------------------------------------------------------------------
ggluster1      vol           /gluster        geouser       ssh://geouser@ggluster1-geo::vol    ggluster1-geo    Active     Changelog Crawl    2018-02-12 15:49:57
ggluster2      vol           /gluster        geouser       ssh://geouser@ggluster1-geo::vol    ggluster2-geo    Passive    N/A                N/A

Many thanks in advance!

Best regards
Marcus

--
**************************************************
* Marcus Pedersén                                *
* System administrator                           *
**************************************************
* Interbull Centre                               *
* ================                               *
* Department of Animal Breeding & Genetics - SLU *
* Box 7023, SE-750 07                            *
* Uppsala, Sweden                                *
**************************************************
* Visiting address:                              *
* Room 55614, Ulls väg 26, Ultuna                *
* Uppsala                                        *
* Sweden                                         *
*                                                *
* Tel: +46-(0)18-67 1962                         *
*                                                *
**************************************************
* ISO 9001 Bureau Veritas No SE004561-1          *
**************************************************
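
Since pause/resume report success even for a slave user that has no session, a snapshot program might want to check the requested session against the status output before pausing. The following is only a minimal sketch of that idea, assuming the plain-text table layout shown in this message (the first seven whitespace-separated fields of each data row). The function names parse_georep_status and session_exists are hypothetical and are not part of gluster.

```python
def parse_georep_status(text):
    """Parse data rows of `gluster volume geo-replication status` output.

    Assumption: the plain-text layout shown in this thread, where the first
    seven whitespace-separated fields of a row are master node, master
    volume, master brick, slave user, slave URL, slave node and status.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows are recognised by the ssh:// slave URL in the fifth
        # field; the header and the dashed separator are skipped.
        if len(fields) >= 7 and fields[4].startswith("ssh://"):
            rows.append({
                "master_node": fields[0],
                "master_vol": fields[1],
                "master_brick": fields[2],
                "slave_user": fields[3],
                "slave": fields[4],
                "slave_node": fields[5],
                "status": fields[6],
            })
    return rows


def session_exists(rows, master_vol, slave_user, slave_host, slave_vol):
    """Return True if a parsed row matches the requested session exactly."""
    target = f"{slave_host}::{slave_vol}"
    for row in rows:
        # Drop the scheme and any "user@" prefix, leaving "host::volume".
        endpoint = row["slave"].split("://", 1)[-1].rsplit("@", 1)[-1]
        if (row["master_vol"] == master_vol
                and row["slave_user"] == slave_user
                and endpoint == target):
            return True
    return False
```

With the status output above, session_exists(rows, "vol", "geouser", "ggluster1-geo", "vol") would match, while the mistyped user "abc" would not, so the program could refuse to issue the pause at all.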
Kotresh Hiremath Ravishankar
2018-Feb-21 09:14 UTC
[Gluster-users] Geo replication snapshot error
Hi,

Thanks for reporting the issue. This seems to be a bug.
Could you please raise a bug at https://bugzilla.redhat.com/ under
community/glusterfs?
We will take a look at it and fix it.

Thanks,
Kotresh HR

Bug reported: Bug 1547446

Thanks!
Marcus
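
For anyone scripting the same pause, snapshot, resume sequence in the meantime, here is a rough sketch of one way to wrap it defensively. It uses only the gluster commands quoted in this thread. The helper names run_gluster and snapshot_with_pause are hypothetical, and the check that the slave appears as "ssh://<slave>" in the status output is an assumption based on the status tables posted in the original report.

```python
import subprocess


def run_gluster(args):
    """Run the gluster CLI with the given arguments and return its stdout."""
    done = subprocess.run(["gluster", *args], check=True,
                          capture_output=True, text=True)
    return done.stdout


def snapshot_with_pause(snap_name, volume, slave, run=run_gluster):
    """Pause geo-replication, snapshot `volume`, then resume.

    `slave` is the session spec as typed on the command line,
    e.g. "geouser@ggluster1-geo::vol". `run` is injectable for testing.
    """
    # Refuse to continue unless status lists exactly this slave: per the
    # report, pause claims success even for a user that has no session.
    status = run(["volume", "geo-replication", "status"])
    if f"ssh://{slave}" not in status.split():
        raise ValueError(f"no geo-replication session for {slave!r}")

    run(["volume", "geo-replication", volume, slave, "pause"])
    try:
        return run(["snapshot", "create", snap_name, volume])
    finally:
        # Resume even if the snapshot failed, so that replication is not
        # left paused by accident.
        run(["volume", "geo-replication", volume, slave, "resume"])
```

The up-front check is the important part: it avoids ever sending a pause for a mistyped user, which according to the report is what leaves snapshots failing until glusterd is restarted.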