Sorry for not replying to the original thread, but I just joined this list.

On Tue, 13 May 2008, Rich Megginson wrote:

> Has anyone seen these errors with 1.1? We fixed a few 64-bit issues in 1.1.

I am running two 32-bit FDS 1.1 (fedora-ds-1.1.0-3.fc6) servers, on RHEL 5.1, in an MMR configuration. These servers, which are configured behind a load balancer, act as the University's central authentication service. We are using the password policy plugin and have the "passwordisglobalpolicy" setting enabled, so there is a substantial amount of write activity due to replication of password-policy-related attributes (e.g., passwordRetryCount, retryCountResetTime, etc.). Time on both systems is synchronized via NTP; clocks are in sync.

We have the same situation as Reinhard Nappert reported on 5/13/2008: MMR will work fine for a while (usually a few weeks; the longest period we've gone is a month, the shortest a few hours). Eventually replication will fail with the following sequence of messages in the errors log:

[24/May/2008:05:18:54 -0700] - csngen_adjust_time: adjustment limit exceeded; value - 86401, limit - 86400
[24/May/2008:05:18:54 -0700] NSMMReplicationPlugin - conn=1800 op=60262 replica="<suffix>": Unable to acquire replica: error: excessive clock skew
[24/May/2008:05:20:05 -0700] - csngen_adjust_time: adjustment limit exceeded; value - 86401, limit - 86400
[24/May/2008:05:20:05 -0700] NSMMReplicationPlugin - agmt="cn=kif2zapp" (zapp:389): Incremental protocol: fatal error - too much time skew between replicas!
[24/May/2008:05:20:05 -0700] NSMMReplicationPlugin - agmt="cn=kif2zapp" (zapp:389): Incremental update failed and requires administrator action

The "csngen_adjust_time" error message always reports the same value when this occurs (86401).

We have also employed the workaround described by Chris St. Pierre in https://bugzilla.redhat.com/show_bug.cgi?id=233642#c3. This resolves the problem for a short while, but it always reappears. BTW, I was in contact with Chris recently about his experiences with MMR, and he said that, in addition to moving to FDS 1.1, he moved a lot of "frequently updated" data out of FDS and into MySQL, and that his problem disappeared afterward; obviously this isn't a solution for us, as we are using FDS as an authentication engine.

We are desperately trying to find a solution to this issue that will allow us to continue using MMR...we could resort to a traditional passive/active + shared storage HA design, but we want to keep that as a last resort. If there is any additional information I should provide, please let me know.

--
Gary Windham
Senior Enterprise Systems Architect
The University of Arizona, UITS
+1 520 626 5981
Rich Megginson
2008-Jun-11 18:03 UTC
Re: [Fedora-directory-users] Re: MMR: excessive clock skew
Gary Windham wrote:
> We are desperately trying to find a solution to this issue that will
> allow us to continue using MMR...we could resort to a traditional
> passive/active + shared storage HA design, but we want to keep that as
> a last resort. If there is any additional information I should
> provide, please let me know.

I've attached a script to https://bugzilla.redhat.com/show_bug.cgi?id=233642 to help diagnose this problem.
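
For anyone who wants to check how often the adjustment limit is being hit, here is a minimal sketch of scanning an errors log for the csngen_adjust_time messages quoted in this thread. It is not the script attached to the bug report. The names find_skew_events and SkewEvent are hypothetical, and the line format is assumed to match the excerpt above exactly. Note that the reported limit of 86400 seconds is one day, so a value of 86401 exceeds it by one second.

```python
import re
from datetime import datetime
from typing import Iterable, List, NamedTuple

# Pattern assumed from the log excerpt quoted in this thread, e.g.:
# [24/May/2008:05:18:54 -0700] - csngen_adjust_time: adjustment limit
# exceeded; value - 86401, limit - 86400   (one physical line in the log)
SKEW_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] - csngen_adjust_time: adjustment limit exceeded; "
    r"value - (?P<value>\d+), limit - (?P<limit>\d+)"
)


class SkewEvent(NamedTuple):
    """One 'adjustment limit exceeded' entry (hypothetical helper type)."""
    when: datetime
    value: int
    limit: int

    @property
    def excess(self) -> int:
        # Seconds by which the requested adjustment exceeds the limit.
        return self.value - self.limit


def find_skew_events(lines: Iterable[str]) -> List[SkewEvent]:
    """Return every csngen_adjust_time limit violation found in `lines`."""
    events = []
    for line in lines:
        m = SKEW_RE.match(line.strip())
        if m is None:
            continue  # unrelated log line
        # Timestamp format as shown in the excerpt: 24/May/2008:05:18:54 -0700
        when = datetime.strptime(m.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        events.append(SkewEvent(when, int(m.group("value")), int(m.group("limit"))))
    return events
```

To use it, open the server's errors log and pass the file object to find_skew_events; each returned event carries the timestamp, the reported value, the limit, and the excess in seconds, which makes it easy to see whether the value is always the same, as reported above.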