Adrian Saul
2006-Jul-14 14:38 UTC
[zfs-discuss] S10U2: zfs instant hang with dovecot imap server using mmap
Hi,

I just upgraded my home box to run Solaris 10 6/06 and converted my previous filesystems over to ZFS, including /var/mail. Previously, on S10 FCS, I was running the dovecot mail server from blastwave.org without issue. Since upgrading to Update 2 I have found that the mail server hangs frequently. The imap process cannot be killed, dtraced, pstacked or trussed.

After a few goes at dtrace I took a core dump and had a look at that. The stack for the imap process was simply:

> 0t1550::pid2proc|::walk thread|::findstack -v
stack pointer for thread d77f8600: d41e8e2c
  d41e8e78 0xd41e8e44(20000, d41e8f44)
  d41e8ed8 zfs_write+0x59f(d6a273c0, d41e8f44, 0, d8adee10, 0)
  d41e8f0c fop_write+0x2d(d6a273c0, d41e8f44, 0, d8adee10, 0)
  d41e8f8c write+0x29a()
  d41e8fb4 sys_sysenter+0xdc()

After some digging in SunSolve I found a few references to mmap and zfs_write locks, so on a hunch (because fuser had previously shown my mail file as mmapped) I disabled mmap in the dovecot configuration, and I no longer get the deadlocks.

I could not find an exact bug in SunSolve. Is this a known one, or is more work needed? I can provide the core file or SSH access to the box if more analysis is needed.

Cheers,
Adrian

This message posted from opensolaris.org
Adrian Saul
2006-Jul-14 14:39 UTC
[zfs-discuss] Re: S10U2: zfs instant hang with dovecot imap server using mmap
Oh, and by instant hang I mean that I can reproduce it simply by rebooting, enabling the dovecot service and then connecting to dovecot with Thunderbird.
Adrian Saul
2006-Jul-14 14:49 UTC
[zfs-discuss] Re: S10U2: zfs instant hang with dovecot imap server using mmap
There are also a number of procmail processes with the following thread stacks:

> 0t1611::pid2proc|::walk thread|::findstack -v
stack pointer for thread d8ae1a00: d3bcbbac
[ d3bcbbac 0xfe826b37() ]
  d3bcbbc4 swtch+0x13e()
  d3bcbbe8 cv_wait_sig+0x119(da58fb4c, d46a8680)
  d3bcbc00 wait_for_lock+0x30(da58fac8)
  d3bcbc20 flk_wait_execute_request+0x156(da58fac8)
  d3bcbc64 flk_process_request+0x4c7(da58fac8)
  d3bcbd38 reclock+0x3a9(d6a273c0, d3bcbe64, 6, 10a, 202e92c, 0)
  d3bcbd8c fs_frlock+0x252(d6a273c0, 7, d3bcbe64, 10a, 202e92c, 0)
  d3bcbdc0 zfs_frlock+0x73(d6a273c0, 7, d3bcbe64, 10a, 202e92c, 0)
  d3bcbdf8 fop_frlock+0x2c(d6a273c0, 7, d3bcbe64, 10a, 202e92c, 0)
  d3bcbf8c fcntl+0x95d()
  d3bcbfb4 sys_sysenter+0xdc()
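
For readers unfamiliar with the fcntl path in the procmail stacks: the threads are sleeping in wait_for_lock underneath fcntl(), which is what a blocking record-lock request looks like while another process holds a conflicting lock on the same file. The sketch below is only an illustration of that behaviour from userland, not procmail's or dovecot's code. It is written in Python rather than C for brevity, the names lock_is_contended and demo are hypothetical, and it assumes a POSIX system where os.fork() is available.

```python
import fcntl
import os
import tempfile


def lock_is_contended(path):
    """Return True if another process holds a conflicting fcntl lock on path."""
    with open(path, "r+b") as f:
        try:
            # LOCK_NB requests the non-blocking form of the lock. Without it,
            # lockf() would sleep until the holder releases the lock, which is
            # the kind of wait the procmail stacks above appear to show.
            fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return True
        fcntl.lockf(f, fcntl.LOCK_UN)
        return False


def demo():
    """Hold a lock in a child process and probe it from the parent."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    ready_r, ready_w = os.pipe()   # child -> parent: "lock is held"
    done_r, done_w = os.pipe()     # parent -> child: "you may exit"

    pid = os.fork()
    if pid == 0:
        # Child: stands in for the stuck lock holder.
        os.close(ready_r)
        os.close(done_w)
        f = open(path, "r+b")
        fcntl.lockf(f, fcntl.LOCK_EX)
        os.write(ready_w, b"x")
        os.read(done_r, 1)         # keep the lock until the parent says so
        os._exit(0)                # exiting releases the lock

    # Parent: stands in for a process that wants the same lock.
    os.close(ready_w)
    os.close(done_r)
    os.read(ready_r, 1)            # wait until the child holds the lock
    while_held = lock_is_contended(path)

    os.write(done_w, b"x")
    os.waitpid(pid, 0)
    after_release = lock_is_contended(path)

    os.close(ready_r)
    os.close(done_w)
    os.unlink(path)
    return while_held, after_release


if __name__ == "__main__":
    print(demo())
```

If the lock holder never returns from the kernel, as with the unkillable imap process described earlier in the thread, a blocking request of this kind would simply keep waiting. Whether that is what ties the procmail processes to the imap hang here is an inference, not something the stacks alone prove.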