I've been running dovecot 1.1 on FreeBSD 7.x for a while with a bare
minimum of NFS problems, but it got worse with 8.x. I have 2-4 servers
(usually just 2) accessing mail on a Netapp over NFSv3 via imapd.
Delivery is via procmail, which doesn't touch the dovecot metadata, and
webmail uses imapd. Client connections to imapd go to random servers,
and I don't yet have a solid means to keep certain users on certain
servers.

I upgraded some of the servers to 8.x and dovecot 1.2 and ran into
Stale NFS file handles causing index/uidlist corruption, which in turn
caused inboxes to appear empty when they were not. In some situations
the corrupt index had to be deleted manually. I first suspected dovecot
1.2 since it was upgraded at the same time, but I downgraded to 1.1 and
it's doing the same thing.

I don't really have a wealth of details to go on yet, and I usually
stay quiet until I do. Half the time it is difficult to reproduce
myself, so I've had to put it in production to get a feel for progress.
This only happens a dozen or so times per weekday, but I feel the need
to start taking bigger steps. I'll probably do what I can to get IMAP
back on a stable base (7.x?) and also try to debug 8.x on the remaining
servers. A binary search is possible if I can reproduce the symptoms
often enough, even if I have to put a test server in production for a
few hours.

Any tips on where we could start looking, or alterations I could try
making, such as sysctls to return to older behavior? It might be worth
noting that I've seen a considerable increase in traffic from my mail
servers since the 8.x upgrade timeframe, on the order of 5-10x as much
traffic to the NFS server. dovecot tries its hardest to flush out the
access cache when needed, and it was working well enough since about
1.0.16 (years ago). It seems like FreeBSD is what regressed in this
scenario. dovecot 2.x is going in a different direction from my
situation, and I'm not ready to start testing that immediately if I can
avoid it, as it will involve some restructuring.

Thanks for any input. For now the following errors are about all I
have to go on:

Nov 29 11:07:54 server1 dovecot: IMAP(user1): o_stream_send(/home/user1/Maildir/dovecot/private/control/.INBOX/dovecot-uidlist) failed: Stale NFS file handle
Nov 29 13:19:51 server1 dovecot: IMAP(user1): o_stream_send(/home/user1/Maildir/dovecot/private/control/.INBOX/dovecot-uidlist) failed: Stale NFS file handle
Nov 29 14:35:41 server1 dovecot: IMAP(user2): o_stream_send(/home/user2/Maildir/dovecot/private/control/.INBOX/dovecot-uidlist) failed: Stale NFS file handle
Nov 29 15:07:05 server1 dovecot: IMAP(user3): read(mail, uid=128990) failed: Stale NFS file handle

Nov 29 11:57:22 server2 dovecot: IMAP(user4): open(/egr/mail/shared/vprgs/dovecot-acl-list) failed: Stale NFS file handle
Nov 29 14:04:22 server2 dovecot: IMAP(user5): o_stream_send(/home/user5/Maildir/dovecot/private/control/.INBOX/dovecot-uidlist) failed: Stale NFS file handle
Nov 29 14:27:21 server2 dovecot: IMAP(user6): o_stream_send(/home/user6/Maildir/dovecot/private/control/.INBOX/dovecot-uidlist) failed: Stale NFS file handle
Nov 29 15:44:38 server2 dovecot: IMAP(user7): open(/egr/mail/shared/decs/dovecot-acl-list) failed: Stale NFS file handle
Nov 29 19:04:54 server2 dovecot: IMAP(user8): o_stream_send(/home/user8/Maildir/dovecot/private/control/.INBOX/dovecot-uidlist) failed: Stale NFS file handle

Nov 29 06:32:11 server3 dovecot: IMAP(user9): open(/egr/mail/shared/cmsc/dovecot-acl-list) failed: Stale NFS file handle
Nov 29 10:03:58 server3 dovecot: IMAP(user10): o_stream_send(/home/user10/Maildir/dovecot/private/control/.INBOX/dovecot-uidlist) failed: Stale NFS file handle
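To see which files and operations are hit most often, errors like the ones above can be tallied per server. The following is only a rough sketch in Python: the regular expression is an assumption derived purely from the log lines quoted in this message, and the names LINE_RE, tally_estale and SAMPLE are made up for illustration.

```python
import re
from collections import Counter

# Assumed line format, taken only from the excerpts quoted in this thread:
# "<Mon> <day> <time> <host> dovecot: IMAP(<user>): <op>(<arg>) failed: Stale NFS file handle"
LINE_RE = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]+\s+(?P<host>\S+)\s+dovecot:\s+IMAP\((?P<user>[^)]+)\):\s+"
    r"(?P<op>\w+)\((?P<arg>.*)\)\s+failed:\s+Stale NFS file handle\s*$"
)


def tally_estale(lines):
    """Count stale-handle errors per (host, operation, kind of file)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # not a stale-handle line in the assumed format
        arg = m.group("arg")
        # Paths are reduced to their basename; anything else
        # (e.g. "mail, uid=...") is counted as a message read.
        target = arg.rsplit("/", 1)[-1] if arg.startswith("/") else "message"
        counts[(m.group("host"), m.group("op"), target)] += 1
    return counts


# Three of the lines from the report, one per kind of failure.
SAMPLE = [
    "Nov 29 11:07:54 server1 dovecot: IMAP(user1): o_stream_send(/home/user1/Maildir/dovecot/private/control/.INBOX/dovecot-uidlist) failed: Stale NFS file handle",
    "Nov 29 15:07:05 server1 dovecot: IMAP(user3): read(mail, uid=128990) failed: Stale NFS file handle",
    "Nov 29 11:57:22 server2 dovecot: IMAP(user4): open(/egr/mail/shared/vprgs/dovecot-acl-list) failed: Stale NFS file handle",
]

if __name__ == "__main__":
    for key, n in sorted(tally_estale(SAMPLE).items()):
        print(key, n)
```

Feeding it a full day of syslog from each server would show whether the failures cluster on dovecot-uidlist writes, on dovecot-acl-list opens, or on message reads, which may help narrow down where to look.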
On Mon, Nov 29, 2010 at 08:06:54PM -0500, Adam McDougall wrote:
> [...]
> Any tips on where we could start looking, or alterations I could try
> making such as sysctls to return to older behavior?

http://wiki1.dovecot.org/NFS is a good start, especially if this
problem is only seen with Dovecot. I would start there, specifically
adjusting your dovecot.conf to include the necessary directives.

> It might be
> worth noting that I've seen a considerable increase in traffic from
> my mail servers since the 8.x upgrade timeframe, on the order of
> 5-10x as much traffic to the NFS server.
> [...]

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
On 11/29/10 20:35, Chuck Swiger wrote:
> Hi, Adam--
>
> On Nov 29, 2010, at 5:06 PM, Adam McDougall wrote:
>> I've been running dovecot 1.1 on FreeBSD 7.x for a while with a bare
>> minimum of NFS problems, but it got worse with 8.x. I have 2-4
>> servers (usually just 2) accessing mail on a Netapp over NFSv3 via
>> imapd. delivery is via procmail which doesn't touch the dovecot
>> metadata and webmail uses imapd. Client connections to imapd go to
>> random servers and I don't yet have solid means to keep certain
>> users on certain servers.
>
> Are you familiar with:
>
> http://wiki1.dovecot.org/NFS
>
> Basically, you're running a "try to avoid doing this" configuration,
> but it does discuss some options to improve the situation. If you can
> tolerate the performance hit, try disabling NFS attribute cache...
>
> Regards,

I am familiar with that page and have taken it into account. I have
worked closely with Timo, the author of Dovecot, and my mail servers
have been running close enough to perfect on 7.x for years. The FreeBSD
version is the only major change that I can think of at this point,
other than the versions of other ports. I'm planning to revert some
servers to 7.x to make sure.
Hi, Adam--

On Nov 29, 2010, at 5:06 PM, Adam McDougall wrote:
> I've been running dovecot 1.1 on FreeBSD 7.x for a while with a bare
> minimum of NFS problems, but it got worse with 8.x. I have 2-4
> servers (usually just 2) accessing mail on a Netapp over NFSv3 via
> imapd. delivery is via procmail which doesn't touch the dovecot
> metadata and webmail uses imapd. Client connections to imapd go to
> random servers and I don't yet have solid means to keep certain
> users on certain servers.

Are you familiar with:

http://wiki1.dovecot.org/NFS

Basically, you're running a "try to avoid doing this" configuration,
but it does discuss some options to improve the situation. If you can
tolerate the performance hit, try disabling NFS attribute cache...

Regards,
-- 
-Chuck
On 11/29/2010 17:06, Adam McDougall wrote:
> I've been running dovecot 1.1 on FreeBSD 7.x for a while with a bare
> minimum of NFS problems, but it got worse with 8.x. I have 2-4 servers
> (usually just 2) accessing mail on a Netapp over NFSv3 via imapd.

There are a whole lot more variables that I haven't seen covered yet.
Are you using TCP mounts or UDP mounts? Try toggling that setting and
see if your performance increases. Are you using rpc.lockd, or not? Try
toggling that. What mount options are you using other than TCP/UDP?
What does the network topology look like?

It's very likely that we can help you here, but more information is
needed.

Doug

-- 
Nothin' ever doesn't change, but nothin' changes much.
    -- OK Go

Breadth of IT experience, and depth of knowledge in the DNS.
Yours for the right price. :) http://SupersetSolutions.com/
> [...]
> Any tips on where we could start looking, or alterations I could try
> making such as sysctls to return to older behavior? It might be worth
> noting that I've seen a considerable increase in traffic from my mail
> servers since the 8.x upgrade timeframe, on the order of 5-10x as much
> traffic to the NFS server.
> [...]

Others have made good suggestions. One more you could try is disabling
the negative name caching by setting the option "negnametimeo=0". The
addition of negative name caching is also in FreeBSD7, but it is a
fairly recent change, so your FreeBSD7 boxes may not have had it.

I also think trying the "dot-locking" and running without statd and
lockd (you can mount with the "nolock" option) would be worth trying.
And, of course, disabling attribute caching is mentioned on the web
page others cited.

Good luck with it, rick

ps: Unfortunately the NFS protocol cannot support full POSIX file
    system semantics, so some apps can never run correctly on NFS
    mounted volumes. NFSv4 comes closer, but it still can't provide
    full POSIX semantics.
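One POSIX guarantee that an NFSv3 server cannot give to other clients is "a file stays readable for as long as somebody has it open, even after it is renamed over or unlinked". The sketch below (Python, local filesystem only, Unix assumed) shows the local behaviour that applications tend to rely on. The function name replace_while_open and the file name uidlist-demo are made up for illustration, and nothing here is taken from dovecot itself.

```python
import os
import tempfile


def replace_while_open(directory):
    """Show local POSIX behaviour: a reader keeps its file after it is replaced."""
    path = os.path.join(directory, "uidlist-demo")  # hypothetical file name
    with open(path, "w") as f:
        f.write("old contents\n")

    reader = open(path)                 # the reader holds the original inode open
    old_inode = os.fstat(reader.fileno()).st_ino

    tmp = path + ".tmp"
    with open(tmp, "w") as f:           # a writer prepares a new version...
        f.write("new contents\n")
    os.replace(tmp, path)               # ...and renames it over the old one

    new_inode = os.stat(path).st_ino
    seen_by_reader = reader.read()      # local filesystem: still the old data
    reader.close()

    with open(path) as f:               # a fresh lookup by name sees the new file
        seen_by_new_open = f.read()
    return old_inode != new_inode, seen_by_reader, seen_by_new_open


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(replace_while_open(d))
```

On a local filesystem the old reader is never disturbed. Over NFSv3 the server does not know that a client on another machine still has the old file open, so once the replaced file is gone on the server, that client's next access to the old handle can fail with ESTALE.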
> [...]
> Thanks for any input. For now the following errors are about all I
> have to go on:
>
> Nov 29 11:07:54 server1 dovecot: IMAP(user1):
> o_stream_send(/home/user1/Maildir/dovecot/private/control/.INBOX/dovecot-uidlist)
> failed: Stale NFS file handle
> [...]
> Nov 29 15:07:05 server1 dovecot: IMAP(user3): read(mail, uid=128990)
> failed: Stale NFS file handle
>
> Nov 29 11:57:22 server2 dovecot: IMAP(user4):
> open(/egr/mail/shared/vprgs/dovecot-acl-list) failed: Stale NFS file
> handle
> [...]

Oh, and just in case you weren't aware of it, ESTALE means that the
client is trying to access a file that has already been deleted on the
server, but I don't think that tells you much w.r.t. how to work around
it. The author of dovecot might have more insight into when the above
files are being deleted, which might hint at a workaround?

rick
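Since ESTALE on a file that other hosts replace by rename just means "the name now points at a different file", one generic way for an application to cope is to look the name up again and retry. The following Python sketch illustrates that idea only. It is not dovecot's code and not the FreeBSD open(2) change, and read_with_estale_retry and make_flaky_opener are hypothetical names, the latter existing only to simulate a stale handle without an NFS mount.

```python
import errno
import io


def read_with_estale_retry(path, opener=open, retries=3):
    """Read a whole file, reopening it by name if the handle goes stale."""
    last_error = None
    for _ in range(retries):
        try:
            with opener(path, "rb") as f:
                return f.read()
        except OSError as e:
            if e.errno != errno.ESTALE:
                raise            # unrelated errors are not retried
            last_error = e       # file was replaced under us; look it up again
    raise last_error


def make_flaky_opener(failures, payload):
    """Stand-in for open(): fails with ESTALE `failures` times, then succeeds."""
    state = {"calls": 0}

    def opener(path, mode):
        state["calls"] += 1
        if state["calls"] <= failures:
            raise OSError(errno.ESTALE, "Stale NFS file handle")
        return io.BytesIO(payload)

    return opener, state


if __name__ == "__main__":
    opener, state = make_flaky_opener(1, b"uidlist data")
    print(read_with_estale_retry("dovecot-uidlist", opener=opener), state["calls"])
```

A retry like this only helps reads that can be restarted from the beginning. A write that fails halfway through, as in the o_stream_send() errors, needs the application to decide whether its partial output is still valid.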
On Monday, November 29, 2010 8:06:54 pm Adam McDougall wrote:
> [...]
> servers. I upgraded some of the servers to 8.x and dovecot 1.2 and ran
> into Stale NFS file handles causing index/uidlist corruption causing
> inboxes to appear as empty when they were not.
> [...]

There were some changes to allow more concurrency in the NFS client in
8 (and 7.2+) that caused ESTALE errors to occur on open(2) more
frequently. You can try setting 'vfs.lookup_shared=0' to disable the
extra concurrency (but at a performance cost) as a workaround. The most
recent 7.x and 8.x have some changes to open(2) to minimize ESTALE
errors that I think get it back to the same level as when lookup_shared
is set to 0.

-- 
John Baldwin
On 11/30/10 09:33, John Baldwin wrote:
> On Monday, November 29, 2010 8:06:54 pm Adam McDougall wrote:
>> [...]
>> servers. I upgraded some of the servers to 8.x and dovecot 1.2 and ran
>> into Stale NFS file handles causing index/uidlist corruption causing
>> inboxes to appear as empty when they were not.
>> [...]
>
> There were some changes to allow more concurrency in the NFS client in 8 (and
> 7.2+) that caused ESTALE errors to occur on open(2) more frequently. You can
> try setting 'vfs.lookup_shared=0' to disable the extra concurrency (but at a
> performance cost) as a workaround. The most recent 7.x and 8.x have some
> changes to open(2) to minimize ESTALE errors that I think get it back to the
> same level as when lookup_shared is set to 0.
>

I already tried vfs.lookup_shared=0 on two of the three servers, with
no improvement (I forgot what it was called or I would have mentioned
it). On a guess, I also tried vfs.nfs.prime_access_cache=1 on all
three, but that didn't help either. I'll go through the other
suggestions and see where it gets me. Thanks all for the input.
On 11/30/10 08:33, Rick Macklem wrote:
>> [...]
>> Any tips on where we could start looking, or alterations I could try
>> making such as sysctls to return to older behavior?
>> [...]
>
> Others have made good suggestions. One more you could try is disabling the negative
> name caching by setting the option "negnametimeo=0". The addition of negative name
> caching is also in FreeBSD7, but it is a fairly recent change, so your FreeBSD7 boxes
> may not have had it. I also think trying the "dot-locking" and running without statd
> and lockd (you can mount with the "nolock" option) would be worth trying. And, of course,
> disabling attribute caching is mentioned on the web page others cited.
>
> Good luck with it, rick
> ps: Unfortunately the NFS protocol cannot support full POSIX file system semantics, so
> some apps can never run correctly on NFS mounted volumes. NFSv4 comes closer, but
> it still can't provide full POSIX semantics.
>

I'll give negnametimeo=0 a try on one server starting tonight; I'll be
busy tomorrow and don't want to risk making anything worse than it
already is.

I can't figure out how to disable the attr cache in FreeBSD. Neither
suggestion seems to be valid, and years ago when I looked into it I got
the impression that you can't, but I'd love to be proven wrong.

I'll try dotlock when I can. Would disabling statd and lockd be the
same as using nolock on all mounts? The vacation binary is the only
thing I can think of that might use it; I'm not sure how well it would
cope without it, which is how I discovered I needed it in the first
place. Also, if disabling lockd shows an improvement, could it lead to
further investigation, or is it just a workaround? Just trying to
understand the possibilities better.

I know ESTALE means the file vanished, but for the files I had an error
on, it is expected that multiple systems are going to spontaneously
replace the file.

Thanks.