Dovecot 2.2.18 on CentOS 6

I have a pair of servers setup with MySQL, Postfix, and Dovecot. Replication is setup and working between the two dovecot instances.

The problem I'm running into is that a single mailbox receives a lot of messages, at times the rate is multiple messages per second. Delivery from Postfix to Dovecot is via tcp based LMTP. When I do 'ps -aef|grep lmtp|wc -l' I get 62 and does not appear to go higher than that. At the moment I have 4500 and 8300 messages queued on two Postfix instances waiting to deliver via LMTP to the same dovecot instance. Deliveries only happen via LMTP and only one of the two nodes actually gets the deliveries.

What I'm seeing is very high load on the system (40) and queues building on the Postfix side. Replication is keeping up. Looking at the logs now I see anywhere from 4-7 messages per second delivered to this single mailbox. I would like to increase that rate a lot.

These machines are VMs hosted on Xenserver 6.x. I have them setup with 8 vCPUs (2 sockets with 4 cores per socket), the dom0 machines have dual HBA connectors back to a SAN and have 128 CPUs and 256GB of RAM and are not taxed. I added a 2nd virtual disk that is used for storing mailbox data. It is ext4 and has noatime set during mount. /var is also mounted with noatime.

The performance graphs in XenCenter show nearly all 8 vCPUs at about 50%, and the writes on the mailbox data disk are about 20%. iostat is showing mostly <5 for await times for the disks, though I do see a 10 now and again.

I'm guessing that maybe I'm hitting a mailbox locking issue and not sure how to reduce the contention and thereby increase the delivery rate to this mailbox.

-Chad
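For a sense of scale, the figures in the post above can be turned into a rough drain-time estimate. This is only a back-of-the-envelope sketch: it assumes no new mail arrives while the queues drain, which is clearly optimistic for this mailbox, and the helper name `drain_minutes` is made up for illustration.

```python
def drain_minutes(backlog_msgs, rate_msgs_per_s):
    """Minutes needed to deliver a backlog at a constant delivery rate,
    assuming (optimistically) that no new messages arrive meanwhile."""
    return backlog_msgs / rate_msgs_per_s / 60.0

# Queue sizes reported for the two Postfix instances.
backlog = 4500 + 8300

# Observed delivery rates to the single mailbox: 4 to 7 messages per second.
for rate in (4, 7):
    print(f"{rate} msg/s: about {drain_minutes(backlog, rate):.0f} minutes")
```

So even with inbound mail stopped, clearing the reported queues at the observed rate would take somewhere between half an hour and an hour.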
Chad M Stewart <cms at balius.com> wrote:
> <..snip..>
> I'm guessing that maybe I'm hitting a mailbox locking issue and not sure how
> to reduce the contention and thereby increase the delivery rate to this
> mailbox.

Could you provide the following info:
a) mailbox type (maildir/mbox/dbox/...)
   [mail_location in dovecot's config]
b) file system type (ext2/ext3/ext4/fat32/...)
   [provided by "df -T" command on my system]

--
A. Filip
On Aug 12, 2015, at 11:04 AM, Andrzej A. Filip <andrzej.filip at gmail.com> wrote:
> <..snip..>
> Could you provide the following info:
> a) mailbox type (maildir/mbox/dbox/...)

maildir

> [mail_location in dovecot's config]

/srv/mail/<domain>/<user-mailbox>/

> b) file system type (ext2/ext3/ext4/fat32/...)
> [provided by "df -T" command on my system]

As I said, ext4.

Since I posted I've changed a couple of things: ulimit -n 8192, and disabled fsync as in mail_fsync = never. I'm not sure if I'll put it back in the LMTP section or not, given all the hardware abstraction layers.

-Chad
On 08/12/2015 17:19, Chad M Stewart wrote:
> What I'm seeing is very high load on the system (40) and queues building on the Postfix side.

High load means that there are a lot of processes waiting to run. The most likely cause for this is not CPU consumption, but I/O wait.

Please run vmstat and iostat and post their output.

Greetings
Daniel
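To illustrate the point about load: on Linux the load average counts tasks that are runnable (state R) as well as tasks in uninterruptible sleep (state D, typically waiting on I/O), which is also what the "r" and "b" columns of vmstat report. The following is a minimal sketch, not a replacement for vmstat or iostat, that tallies process states from /proc. The function names are made up for illustration, and it counts processes rather than individual threads, so it is only an approximation.

```python
import os
from collections import Counter

def state_from_stat(stat_text):
    """Extract the one-letter state from the contents of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the LAST ')' before reading fields.
    """
    return stat_text.rsplit(")", 1)[1].split()[0]

def count_states(proc_root="/proc"):
    """Return a Counter of process states (R, D, S, ...) found under /proc.

    Returns an empty Counter on systems without a Linux-style /proc.
    """
    counts = Counter()
    if not os.path.isdir(proc_root):
        return counts
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            path = os.path.join(proc_root, entry, "stat")
            with open(path, errors="replace") as f:
                counts[state_from_stat(f.read())] += 1
        except (OSError, IndexError):
            continue  # process exited, or unexpected file contents
    return counts

if __name__ == "__main__":
    states = count_states()
    print("runnable (R):", states.get("R", 0))
    print("uninterruptible sleep (D):", states.get("D", 0))
```

If most of the waiting processes are in state D rather than R, that points toward I/O wait rather than CPU consumption as the source of the load.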
On Aug 12, 2015, at 11:58 AM, Daniel Tröder <troeder at univention.de> wrote:
> On 08/12/2015 17:19, Chad M Stewart wrote:
>> What I'm seeing is very high load on the system (40) and queues building on the Postfix side.
> High load means, that there are a lot of processes waiting to run. The
> most likely cause for this is not CPU consumption, but I/O wait.
>
> Please run vmstat and iostat and post their output.

I was watching iostat and avg service times, and maybe once every 30-45 seconds I'd see times of 10ms, but otherwise it was below that.

I achieved the biggest impact by limiting the number of outbound connections from Postfix to Dovecot. I limited each Postfix instance to 5 connections, which means a total of 10 inbound LMTP connections to Dovecot. Then I saw near 500 msgs per LMTP connection.

I suspect the problem was a locking issue on the mailbox in question: too many simultaneous delivery attempts via too many LMTP sessions.

The backlog has cleared so I'm done troubleshooting for now. If this happens again I'll resume looking into it more. These are new servers so I'm tuning for the load, etc.

-Chad
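The locking suspicion above can be illustrated with a simple upper bound. This is a toy model under stated assumptions, not a description of how Dovecot actually locks a mailbox: it assumes each delivery spends some time holding an exclusive per-mailbox lock and the rest of its time outside the lock. The timings below are hypothetical and were not measured on the system in this thread, and `max_delivery_rate` is a made-up name.

```python
def max_delivery_rate(workers, locked_ms, unlocked_ms):
    """Upper bound on deliveries per second into ONE mailbox.

    workers     -- concurrent LMTP deliveries targeting that mailbox
    locked_ms   -- time per delivery spent holding the mailbox lock
                   (serialized: only one delivery at a time)
    unlocked_ms -- time per delivery spent outside the lock
                   (can overlap across workers)
    """
    # If the lock were never contended, each worker finishes one
    # delivery every (locked_ms + unlocked_ms) milliseconds.
    uncontended = workers * 1000.0 / (locked_ms + unlocked_ms)
    # The lock admits one holder at a time, so it caps the total rate.
    lock_ceiling = 1000.0 / locked_ms
    return min(uncontended, lock_ceiling)

# Hypothetical timings, NOT measured on the poster's system.
LOCKED_MS, UNLOCKED_MS = 125, 375

for n in (1, 2, 4, 10, 62):
    rate = max_delivery_rate(n, LOCKED_MS, UNLOCKED_MS)
    print(f"{n:2d} workers: at most {rate:.1f} msg/s")
```

With these made-up numbers the bound stops growing after 4 workers, so 62 concurrent deliveries can do no better than 4. The model is optimistic in that it ignores the cost of the waiting itself (lock retries, context switches, extra processes), which may be part of why fewer connections worked better here.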