[Originally sent to geom, but I am throwing it open to a wider audience as I didn't get any replies there.]

I am using 7.2-STABLE from October 7th on all machines, but this has been going on a while.

Very simply, I am mirroring together a pair of discs, one local, one remote. The remote disc is accessed using ggate. If the remote disc is actually on a very close machine - e.g. a server plugged into the same ethernet - then all works fine. If I put the remote disc somewhere substantially further away on the network, however, then when I attach the disc it starts to rebuild the mirror but fails a fraction of a second later, thus:

GEOM_MIRROR: Device mysql0: rebuilding provider ggate1a.
GEOM_MIRROR: Synchronization request failed (error=5). ggate1a[WRITE(offset=1310720, length=131072)]
GEOM_MIRROR: Device mysql0: provider ggate1a disconnected.
GEOM_MIRROR: Device mysql0: rebuilding provider ggate1a stopped.

The interesting thing is that the problem is only with gmirror, not with the underlying ggate disc, which remains attached and accessible. I tested this by adding a second partition (ggate1b in the example above) and mounting a UFS filesystem on that.

I've looked at the kernel code briefly, but it is not clear to me what is causing that write to fail. My conjecture would be that a buffer somewhere is filling up, causing a write to fail, and that instead of waiting and retrying, gmirror just fails the synchronisation.

Any ideas? Is this actually a bug? I am wondering if it would also happen when mirroring a very fast disc against a very slow one (i.e. maybe it is independent of ggate).

-pete.
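As a quick sanity check on the log above, the failing request can be located within the device using only the two numbers GEOM_MIRROR reports. The sketch below is illustrative arithmetic, not anything taken from gmirror itself. It assumes (this is not stated in the thread) that synchronisation requests are issued in order from offset 0 with a fixed size equal to the reported length. The function name `describe_failed_request` is made up for this example.

```python
# Values copied from the GEOM_MIRROR log line in the report.
FAIL_OFFSET = 1310720   # byte offset of the failed WRITE
REQ_LENGTH = 131072     # length of the failed WRITE, in bytes


def describe_failed_request(offset: int, length: int) -> dict:
    """Locate a failed synchronisation request within the device.

    Assumption (not confirmed by the thread): requests are fixed-size
    and issued sequentially starting at offset 0.
    """
    index, remainder = divmod(offset, length)
    return {
        "aligned": remainder == 0,            # offset is a whole multiple of length
        "requests_before": index,             # requests preceding the failed one
        "kib_per_request": length // 1024,    # request size in KiB
        "mib_before": offset / (1024 * 1024)  # data preceding the failed request
    }


info = describe_failed_request(FAIL_OFFSET, REQ_LENGTH)
print(info)
```

Under that assumption the failed write is the eleventh 128 KiB request, 1.25 MiB into the provider. A failure after a small, fixed amount of data would be consistent with the buffer-filling conjecture, but it does not prove it.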
Hi,

On Fri, Oct 23, 2009 at 11:56:24AM +0100, Pete French wrote:
> If the remote disc is actually on a very close machine - e.g. a server
> plugged into the same ethernet - then all works fine. If I make
> the remote disc somewhere actually substantially further away on the
> network, however, then when I attach the disc it starts to rebuild the
> mirror but then fails a fraction of a second later thus:
>
> GEOM_MIRROR: Device mysql0: rebuilding provider ggate1a.
> GEOM_MIRROR: Synchronization request failed (error=5). ggate1a[WRITE(offset=1310720, length=131072)]
> GEOM_MIRROR: Device mysql0: provider ggate1a disconnected.
> GEOM_MIRROR: Device mysql0: rebuilding provider ggate1a stopped.
>
> The interesting thing is that the problem is only with gmirror, not with
> the underlying ggate disc which remains attached and accessible. I tested
> this by adding a second partition (ggate1b in the example above) and
> mounting a UFS filesystem on that.

Just a wild guess, have you tried to set kern.geom.mirror.timeout to a higher value?

- Olli

--
| Oliver Brandmueller http://sysadm.in/ ob@sysadm.in |
| Ich bin das Internet. Sowahr ich Gott helfe. |
> Just a wild guess, have you tried to set kern.geom.mirror.timeout to a
> higher value?

Yes, I tried values all the way up to 600, with no effect at all. Besides, the failure comes way before that timeout value (which is in seconds, I assume).

-pete.
I think you hit the same bug as I did a while ago.

http://www.freebsd.org/cgi/query-pr.cgi?pr=bin/132798

You can get the patch from the PR and give it a try. Make sure you update both server and client; otherwise, it will cause a panic or so.

Hiro

On Tue, 03 Nov 2009 16:23:24 +0000 Pete French <petefrench@ticketswitch.com> wrote:
> > Have you done any sockets tuning?
> > In an older posting the following values were recommended:
>
> Yes, I need that to get the speed out of it for normal use
> to a disc on a machine on the same ether - but even so, surely
> it should block on a slow disc, not just abandon the mirroring ?
>
> -pete.