Paul Kudla (Scom.ca Internet Services Inc.)
2022-Mar-05 23:38 UTC
Replicator Issues on Large Mail Boxes
OK, running:
freebsd-12.1
dovecot-2.3.18.current
dovecot-2.3-pigeonhole-0.5.18.current
Simply put, replication works fine on smaller mailboxes without issues.
dsync also works when run manually; the longest sync takes 60 seconds or so
when using dsync, so does the replicator timeout need to be bumped up?
I get that the file locking issue is relative, since a larger
mailbox will take longer to replicate when an email comes in?
mail18 03-05 18:05:51 {dovecot} [15799] (872623573)
doveadm(keith at elirpa.com)<32249><GnS3M53sI2L5fQAAz1jc/w>: Error:
Couldn't lock
/data/dovecot/users/elirpa.com/keith at elirpa.com//tmp/.dovecot-sync.lock:
fcntl(/data/dovecot/users/elirpa.com/keith at elirpa.com//tmp/.dovecot-sync.lock, write-lock, F_SETLKW) locking failed:
Timed out after 30 seconds (WRITE lock held by pid 31519)
So what I need to find is where the 30 second fcntl timeout for
replication is set (I have adjusted all the others in the source code).
Simply put, the replicator fails, retries, and keeps failing, which is
understandable as it probably needs a little more time?
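The "(WRITE lock held by pid 31519)" part of the log message names the process that currently holds the lock. With fcntl locks, the holder of a conflicting lock can be queried through F_GETLK. The following is a minimal sketch of that query, not Dovecot's own code; the helper name lock_holder_pid() is hypothetical and assumes the caller already has the lock file open.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not a Dovecot function.
 * Asks the kernel whether a whole-file write lock could be placed on fd.
 * Returns the pid of a process holding a conflicting lock,
 * 0 if nothing conflicts, or -1 if fcntl() itself failed. */
static pid_t lock_holder_pid(int fd)
{
    struct flock fl;

    assert(fd >= 0);

    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;     /* the lock we would like to take */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;            /* 0 means "to end of file", i.e. the whole file */

    /* F_GETLK does not take the lock; it only reports a conflict. */
    if (fcntl(fd, F_GETLK, &fl) < 0)
        return -1;

    /* The kernel sets l_type to F_UNLCK when no conflicting lock exists. */
    return fl.l_type == F_UNLCK ? 0 : fl.l_pid;
}
```

Locks held by the calling process itself are never reported as conflicts, so the helper is only meaningful when called from a different process than the one holding the lock.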
# sync.users
carol at scom.ca           none  00:02:51  08:16:34  -  y
nick at elirpa.com         low   02:45:17  09:28:32  -  y
keith at elirpa.com        none  02:30:13  09:28:32  -  y
paul at scom.ca            high  02:45:17  09:28:32  -  y
ed at scom.ca              none  02:34:34  09:28:32  -  y
ed.hanna at dssmgmt.com    high  02:45:17  09:28:32  -  y
I found this in file-lock.c under
/programs/src/mail/dovecot-2.3.18.current/src/lib:
    struct flock fl;
    fl.l_type = lock_type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;
    ret = fcntl(fd, timeout_secs != 0 ? F_SETLKW : F_SETLK, &fl);
    if (timeout_secs != 0) {
      alarm(0);
      file_lock_wait_end(path);
    }
    if (ret == 0)
      break;
    if (timeout_secs == 0 &&
        (errno == EACCES || errno == EAGAIN)) {
      /* locked by another process */
      *error_r = t_strdup_printf(
        "fcntl(%s, %s, F_SETLK) locking failed: %m "
        "(File is already locked)", path, lock_type_str);
      return 0;
    }
    if (err_is_lock_timeout(started, timeout_secs)) {
      errno = EAGAIN;
      *error_r = t_strdup_printf(
        "fcntl(%s, %s, F_SETLKW) locking failed: "
        "Timed out after %u seconds%s",
        path, lock_type_str, timeout_secs,
        file_lock_find(fd, set->lock_method,
                       lock_type));
      return 0;
    }
    *error_r = t_strdup_printf("fcntl(%s, %s, %s) locking failed: %m",
      path, lock_type_str, timeout_secs == 0 ? "F_SETLK" : "F_SETLKW");
    if (errno == EDEADLK && !set->allow_deadlock) {
      i_panic("%s%s", *error_r,
        file_lock_find(fd, set->lock_method,
                       lock_type));
    }
    return -1;
#endif
--
Happy Saturday !!!
Thanks - paul
Paul Kudla
Scom.ca Internet Services <http://www.scom.ca>
004-1009 Byron Street South
Whitby, Ontario - Canada
L1N 4S3
Toronto 416.642.7266
Main 1.866.411.7266
Fax 1.888.892.7266

Hi,

> simply put replication works fine on smaller email boxes without issues.
>
> dsync also works when run manually, longest sync is 60 seconds or so
> when using dsync so need replicator bumped up ?

You can configure dsync parameters for the replicator to adjust the time
limit, e.g.:

replication_dsync_parameters = -d -N -l 120 -U

best regards,
Carsten