Kotresh Hiremath Ravishankar
2015-May-20 12:17 UTC
[Gluster-users] Geo-Replication - Changelog socket is not present - Falling back to xsync
Hi Cyril,

From the brick logs, it seems the changelog-notifier thread has been killed for some reason,
as notify is failing with EPIPE.

Try the following. It should probably help:
1. Stop geo-replication.
2. Disable changelog: gluster vol set <master-vol-name> changelog.changelog off
3. Enable changelog: gluster vol set <master-vol-name> changelog.changelog on
4. Start geo-replication.

Let me know if it works.

Thanks and Regards,
Kotresh H R

----- Original Message -----
> From: "Cyril N PEPONNET (Cyril)" <cyril.peponnet at alcatel-lucent.com>
> To: "gluster-users" <gluster-users at gluster.org>
> Sent: Tuesday, May 19, 2015 3:16:22 AM
> Subject: [Gluster-users] Geo-Replication - Changelog socket is not present - Falling back to xsync
>
> Hi Gluster Community,
>
> I have a 3-node setup at location A and a two-node setup at location B.
>
> All running 3.5.2 under CentOS 7.
>
> I have one volume I sync through the geo-replication process.
>
> So far so good, the first step of geo-replication is done (hybrid crawl).
>
> Now I'd like to use the changelog change detector in order to delete files
> on the slave when they are gone on the master.
>
> But it always falls back to the xsync mechanism (even when I force it using
> config changelog_detector changelog):
>
> [2015-05-18 12:29:49.543922] I [monitor(monitor):129:monitor] Monitor: ------------------------------------------------------------
> [2015-05-18 12:29:49.544018] I [monitor(monitor):130:monitor] Monitor: starting gsyncd worker
> [2015-05-18 12:29:49.614002] I [gsyncd(/export/raid/vol):532:main_i] <top>: syncing: gluster://localhost:vol -> ssh://root at x.x.x.x:gluster://localhost:vol
> [2015-05-18 12:29:54.696532] I [master(/export/raid/vol):58:gmaster_builder] <top>: setting up xsync change detection mode
> [2015-05-18 12:29:54.696888] I [master(/export/raid/vol):357:__init__] _GMaster: using 'rsync' as the sync engine
> [2015-05-18 12:29:54.697930] I [master(/export/raid/vol):58:gmaster_builder] <top>: setting up changelog change detection mode
> [2015-05-18 12:29:54.698160] I [master(/export/raid/vol):357:__init__] _GMaster: using 'rsync' as the sync engine
> [2015-05-18 12:29:54.699239] I [master(/export/raid/vol):1104:register] _GMaster: xsync temp directory: /var/run/gluster/vol/ssh%3A%2F%2Froot%40x.x.x.x%3Agluster%3A%2F%2F127.0.0.1%3Avol/ce749a38ba30d4171cd674ec00ab24f9/xsync
> [2015-05-18 12:30:04.707216] I [master(/export/raid/vol):682:fallback_xsync] _GMaster: falling back to xsync mode
> [2015-05-18 12:30:04.742422] I [syncdutils(/export/raid/vol):192:finalize] <top>: exiting.
> [2015-05-18 12:30:05.708123] I [monitor(monitor):157:monitor] Monitor: worker(/export/raid/vol) died in startup phase
> [2015-05-18 12:30:05.708369] I [monitor(monitor):81:set_state] Monitor: new state: faulty
> [201
>
> After some Python debugging and stack-trace printing, I found the
> following in:
>
> /var/run/gluster/vol/ssh%3A%2F%2Froot%40x.x.x.x%3Agluster%3A%2F%2F127.0.0.1%3Avol/ce749a38ba30d4171cd674ec00ab24f9/changes.log
>
> [2015-05-18 19:41:24.511423] I [gf-changelog.c:179:gf_changelog_notification_init] 0-glusterfs: connecting to changelog socket: /var/run/gluster/changelog-ce749a38ba30d4171cd674ec00ab24f9.sock (brick: /export/raid/vol)
> [2015-05-18 19:41:24.511445] W [gf-changelog.c:189:gf_changelog_notification_init] 0-glusterfs: connection attempt 1/5...
> [2015-05-18 19:41:26.511556] W [gf-changelog.c:189:gf_changelog_notification_init] 0-glusterfs: connection attempt 2/5...
> [2015-05-18 19:41:28.511670] W [gf-changelog.c:189:gf_changelog_notification_init] 0-glusterfs: connection attempt 3/5...
> [2015-05-18 19:41:30.511790] W [gf-changelog.c:189:gf_changelog_notification_init] 0-glusterfs: connection attempt 4/5...
> [2015-05-18 19:41:32.511890] W [gf-changelog.c:189:gf_changelog_notification_init] 0-glusterfs: connection attempt 5/5...
> [2015-05-18 19:41:34.512016] E [gf-changelog.c:204:gf_changelog_notification_init] 0-glusterfs: could not connect to changelog socket! bailing out...
>
> /var/run/gluster/changelog-ce749a38ba30d4171cd674ec00ab24f9.sock doesn't
> exist, so
> https://github.com/gluster/glusterfs/blob/release-3.5/xlators/features/changelog/lib/src/gf-changelog.c#L431
> is failing because
> https://github.com/gluster/glusterfs/blob/release-3.5/xlators/features/changelog/lib/src/gf-changelog.c#L153
> cannot open the socket file.
>
> And I don't find any error related to changelog in the log files, except in
> the brick logs on node 2 (site A):
>
> bricks/export-raid-vol.log-20150517:[2015-05-14 17:06:52.636908] E [changelog-helpers.c:168:changelog_rollover_changelog] 0-vol-changelog: Failed to send file name to notify thread (reason: Broken pipe)
> bricks/export-raid-vol.log-20150517:[2015-05-14 17:06:52.636949] E [changelog-helpers.c:280:changelog_handle_change] 0-vol-changelog: Problem rolling over changelog(s)
>
> gluster vol status is all fine, and the changelog options are enabled in the
> volfile:
>
> volume vol-changelog
>     type features/changelog
>     option changelog on
>     option changelog-dir /export/raid/vol/.glusterfs/changelogs
>     option changelog-brick /export/raid/vol
>     subvolumes vol-posix
> end-volume
>
> Any help will be appreciated :)
>
> Oh, btw: it's hard to stop/restart the volume as I have around 4k clients
> connected.
>
> Thanks!
>
> --
> Cyril Peponnet
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
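[Editor's note: the stop / toggle / start cycle suggested above can be sketched as a small shell helper. This is only an illustration: the volume name and slave URL below are placeholders, and the gluster commands are echoed by default (RUN=echo) so the sequence can be reviewed without touching a real cluster.]

```shell
#!/bin/sh
# Sketch of the four steps above. VOL and SLAVE values are placeholders;
# RUN defaults to "echo" so the gluster commands are only printed.
# Set RUN= (empty) on a gluster node to actually execute them.
RUN="${RUN-echo}"

geo_rep_changelog_reset() {
    vol=$1; slave=$2
    $RUN gluster volume geo-replication "$vol" "$slave" stop
    $RUN gluster volume set "$vol" changelog.changelog off
    $RUN gluster volume set "$vol" changelog.changelog on
    $RUN gluster volume geo-replication "$vol" "$slave" start
}

# Dry run with placeholder names: prints the four commands in order.
geo_rep_changelog_reset vol "root@slavehost::vol"
```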
PEPONNET, Cyril N (Cyril)
2015-May-21 15:39 UTC
[Gluster-users] Geo-Replication - Changelog socket is not present - Falling back to xsync
Hi,

Unfortunately:

# gluster vol set usr_global changelog.changelog off
volume set: failed: Staging failed on mvdcgluster01.us.alcatel-lucent.com. Error: One or more connected clients cannot support the feature being set. These clients need to be upgraded or disconnected before running this command again

I don't really know why; I have some clients using 3.6 as the FUSE client,
others are running 3.5.2.

Any advice?

--
Cyril Peponnet

> On May 20, 2015, at 5:17 AM, Kotresh Hiremath Ravishankar <khiremat at redhat.com> wrote:
>
> Hi Cyril,
>
> From the brick logs, it seems the changelog-notifier thread has been killed for some reason,
> as notify is failing with EPIPE.
>
> Try the following. It should probably help:
> 1. Stop geo-replication.
> 2. Disable changelog: gluster vol set <master-vol-name> changelog.changelog off
> 3. Enable changelog: gluster vol set <master-vol-name> changelog.changelog on
> 4. Start geo-replication.
>
> Let me know if it works.
>
> Thanks and Regards,
> Kotresh H R
>
> ----- Original Message -----
>> From: "Cyril N PEPONNET (Cyril)" <cyril.peponnet at alcatel-lucent.com>
>> To: "gluster-users" <gluster-users at gluster.org>
>> Sent: Tuesday, May 19, 2015 3:16:22 AM
>> Subject: [Gluster-users] Geo-Replication - Changelog socket is not present - Falling back to xsync
>>
>> Hi Gluster Community,
>>
>> I have a 3-node setup at location A and a two-node setup at location B.
>>
>> All running 3.5.2 under CentOS 7.
>>
>> I have one volume I sync through the geo-replication process.
>>
>> So far so good, the first step of geo-replication is done (hybrid crawl).
>>
>> Now I'd like to use the changelog change detector in order to delete files
>> on the slave when they are gone on the master.
>> [snip -- remainder of quoted message identical to the original post above]
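[Editor's note: for the missing-socket symptom in the original post, the 5-attempt connect loop in gf_changelog_notification_init can be approximated with a small polling function for use while diagnosing. This is only a sketch: it checks that the socket file exists (-S) rather than actually connecting to it, and the attempt count and interval are parameters (the 3.5 source uses 5 attempts, 2 seconds apart).]

```shell
#!/bin/sh
# Poll for the changelog socket, mimicking the retry/log pattern shown in the
# gf-changelog.c excerpt above. Note: this only tests that the socket file
# exists; libgfchangelog actually connect()s to it.
wait_for_changelog_socket() {
    sock=$1; attempts=$2; interval=$3
    i=1
    while [ "$i" -le "$attempts" ]; do
        if [ -S "$sock" ]; then
            echo "socket $sock present after attempt $i"
            return 0
        fi
        echo "connection attempt $i/$attempts..."
        i=$((i + 1))
        sleep "$interval"
    done
    echo "could not find changelog socket $sock! bailing out..."
    return 1
}

# Interval 0 here so the demo is instant; use 2 to match the source.
wait_for_changelog_socket \
    /var/run/gluster/changelog-ce749a38ba30d4171cd674ec00ab24f9.sock 5 0 || :
```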