Jeremy Chadwick
2010-Apr-18 21:37 UTC
rc(8) script -- waiting for the network to become usable
I'd like to discuss the possibility of introducing a new script into
the base system's /etc/rc.d which, when enabled, would provide a way
to wait until the IP networking layer is up and usable (verified using
ping(8)) before continuing with daemon startup.
I've written a script that's in use on all of our RELENG_8 systems (I
have not tested RELENG_7) which works reliably; I'll include that script
at the bottom of my mail, and also a link to it[1].
Let's discuss. :-)
HISTORY
========
The situation which brought this debacle to my attention:
I found that on reboot of some of our systems, ntpdate (used to sync the
clock initially before ntpd would be started) wouldn't work. The daemon
would report that it couldn't resolve any of the FQDNs within ntp.conf,
and would therefore act as a no-op before continuing on.
This failure had dire consequences -- Dovecot (at least with older
versions; newer seems to behave better[2]) would refuse to start up,
citing "time moved backwards". Dovecot not starting had a trickle-down
effect on Postfix (which was compiled to use Dovecot for SMTP AUTH),
where Postfix would start but all inbound mail would fail due to
Dovecot's SMTP AUTH mech not listening on a domain socket. Ouch.
Since DNS failure was the root issue, I dug around rc.d/named and found
that Doug had introduced a feature to rc.d/named called "named_wait"
which calls "host $named_wait_host" repetitively (sleeping 1 second
between calls), waiting until successful resolution before continuing
onwards. This worked (e.g. set named_wait_host to "www.google.com" or
something Internet-bound).
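
For illustration, the retry behaviour described above amounts to roughly
the following. This is only a sketch and not the actual rc.d/named code:
the function name wait_for_dns, the LOOKUP_CMD variable, and the cap on
the number of attempts are all hypothetical additions.

```sh
#!/bin/sh
# Hypothetical sketch; NOT the code used by rc.d/named.
# LOOKUP_CMD is an assumption so the lookup tool can be swapped out.
: "${LOOKUP_CMD:=host}"

# wait_for_dns HOST [MAX_TRIES]
# Retry a lookup once per second until it succeeds (return 0) or
# MAX_TRIES attempts have failed (return 1). MAX_TRIES defaults to 30.
wait_for_dns()
{
	_host=$1
	_max=${2:-30}
	_n=0
	while [ "$_n" -lt "$_max" ]; do
		if $LOOKUP_CMD "$_host" >/dev/null 2>&1; then
			return 0
		fi
		_n=$((_n + 1))
		sleep 1
	done
	return 1
}
```

It could be called as: wait_for_dns www.google.com || echo "DNS not usable yet".
The attempt cap is there so that a dead resolver cannot stall the boot
forever.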
However, named itself still complained during startup, logging "host
unreachable resolving XXX" messages with regard to the root servers.
These errors were visible in the logs and could cause confusion or
unnecessary worry (they did in my case).
The root cause should be fairly obvious: the physical networking layer
hadn't fully come up by the time named had started. In other cases, the
physical network was available but layer 2 (ARP) hadn't finished.
So I wrote this.
USE
====
1) Install script as /usr/local/etc/rc.d/waitnetwork
2) chmod 755 /usr/local/etc/rc.d/waitnetwork
3) Set the following in rc.conf:
waitnetwork_enable="yes"
waitnetwork_ip="some_ip_addr_to_ping"
Note that this does need to be an IP address and *not* an FQDN. I've
discussed this reasoning with some others and they agree. Don't pick
something like 127.0.0.1 either (meaning don't be silly). :-)
Other parameters you can adjust:
waitnetwork_count -- passed as ping(8) -c flag (default 5)
waitnetwork_timeout -- passed as ping(8) -t flag (default 60)
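
Put together, a complete rc.conf fragment might look like the one below.
The address 192.0.2.1 is only a placeholder from the documentation range
and the count/timeout values are arbitrary examples; substitute whatever
suits your network.

```sh
# /etc/rc.conf fragment (example values only)
waitnetwork_enable="YES"
waitnetwork_ip="192.0.2.1"     # placeholder; e.g. your default gateway
waitnetwork_count="3"          # passed to ping(8) as -c 3
waitnetwork_timeout="30"       # passed to ping(8) as -t 30
```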
CAVEATS / POINTS OF INTEREST
=============================
1) This script requires the $waitnetwork_ip box/router/whatever respond
to ICMP ECHO requests. Please do not bikeshed on this point; we need
something that works, and this requirement shouldn't be that bad to deal
with (firewall/ACL-wise). For most folks (co-located in particular),
this could be your default gateway, but you can use whatever you want.
2) The needs of some folks may vary depending upon configuration; "we
have two NICs, dual-homed, so what exactly do I put in waitnetwork_ip?"
Yes, I understand the confusion -- hopefully these folks, given their
topologies, can figure out a way to make this work reliably for them.
3) Other stuff I probably haven't thought of.
For those considering arguing that "we should just wait for the NIC to
come up", that won't work -- what's needed is a way to verify layer 3/4
is usable, not layer 1.
I admit there's no universal way to cover every single person's needs,
but providing a simple framework to at least wait until something is
pingable would be a good starting point; it's better than nothing!
NOTES BEFORE COMMITTING
========================
The script also contains some XXX comments which should be reviewed by
anyone willing to commit this into the base system.
REFERENCES
===========
[1]: http://jdc.parodius.com/freebsd/waitnetwork
[2]: http://wiki.dovecot.org/TimeMovedBackwards
--
| Jeremy Chadwick jdc@parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |
#!/bin/sh
#
# $FreeBSD: $
#
# PROVIDE: waitnetwork
# REQUIRE: NETWORKING
# BEFORE: mountcritremote
# KEYWORD: nojail
# XXX - once/if committed to base, it's better to have mountcritremote
# XXX - REQUIRE waitnetwork, rather than use the above BEFORE line.
. /etc/rc.subr
name="waitnetwork"
rcvar=`set_rcvar`
start_cmd="waitnetwork_start"
stop_cmd=":"
# XXX - once/if committed to base, the following defaults should
# XXX - be placed into src/etc/defaults/rc.conf instead of here
waitnetwork_enable="NO" # Wait for network availability before
# continuing with NETWORKING rc scripts
waitnetwork_ip="" # IP address to ping
waitnetwork_count="5" # ping count (see ping(8) -c flag)
waitnetwork_timeout="60" # ping timeout (see ping(8) -t flag)
waitnetwork_start()
{
	local rc

	# Without a target address there is nothing to wait for.
	if [ -z "${waitnetwork_ip}" ]; then
		warn "You must define an IP address in waitnetwork_ip"
		return
	fi

	echo "Waiting for ${waitnetwork_ip} to respond to ICMP..."

	# ping(8) exits 0 if at least one reply was heard from the host.
	if [ -z "${waitnetwork_timeout}" ]; then
		/sbin/ping -c ${waitnetwork_count} ${waitnetwork_ip} >/dev/null 2>&1
		rc=$?
	else
		info "Using timeout of ${waitnetwork_timeout} seconds"
		/sbin/ping -t ${waitnetwork_timeout} -c ${waitnetwork_count} ${waitnetwork_ip} >/dev/null 2>&1
		rc=$?
	fi

	if [ $rc -eq 0 ]; then
		echo "Host reachable; network considered available."
	else
		echo "No response from IP. Continuing, but be aware you may not"
		echo "have a fully functional networking layer at this point."
	fi
}
load_rc_config $name
run_rc_command "$1"
Andrew Reilly
2010-Apr-18 23:24 UTC
rc(8) script -- waiting for the network to become usable
On Sun, Apr 18, 2010 at 02:37:27PM -0700, Jeremy Chadwick wrote:
> I'd like to discuss the possibility of introduction of a new script into
> /etc/rc.d base system a script, which when enabled, would provide a way
> to wait until the IP networking layer (using ping(8)) is up and usable
> before continuing with daemon startup.
>
> Let's discuss. :-)
>
>
> HISTORY
> ========
> The situation which brought this debacle to my attention:
>
> I found that on reboot of some of our systems, ntpdate (used to sync the
> clock initially before ntpd would be started) wouldn't work. The daemon
> would report that it couldn't resolve any of the FQDNs within ntp.conf,
> and would therefore act as a no-op before continuing on.

By way of discussion, I'd just like to re-iterate what I said the first
time around: it must be understood that this sort of thing is a
(necessary) hacky-workaround that should ultimately be unnecessary. In
preference, we should work on the failing daemons or hassle up-stream
daemon authors so that the daemons in question either (a) retry until
they *do* get the information they're after or (b) fail properly, so
that they can be restarted by an external process monitoring framework
like sysutils/daemontools or launchd.

The reasoning is simple: network outage is something that can happen
even after startup, and when network connectivity returns, the routing
and addresses that are visible won't necessarily be the same. Consider
laptops that suspend, as a particular example. Or mobile devices that
switch from wi-fi to cellular networking to no connectivity on a regular
basis. The "get it right at boot time" model is important and
traditional, but (I think) a fragile and diminishing fraction of use
cases.

Our rc-ng framework favours solution (a). I'm more a fan of approach
(b), myself: I use daemontools for many services, and I like the way
that launchd works on my Mac laptops.

Cheers,

--
Andrew
Jeremy,

A good proposal to improve start-up robustness. If I may suggest,
waitnetwork_ip should accept a short list of alternate IPs in the event
of a local network outage, DoS, etc. Something like:

waitnetwork_ip="IP1 IP2 IP3"

Having multiple target IPs will improve the likelihood of timely booting
when silly/nasty things happen on the wider network. It would be a good
idea to have this incorporated into the base system.

Andrew,

I agree that the problems should be corrected at the source, and my
preference is to fail properly (b) so that other mitigation may occur.
Done in parallel, the two would eventually provide a belt-and-braces
start-up: wait for the network, and fail properly for network-dependent
processes. (I can't speak to desktops that resume from a suspend when
the network has changed state.)

Regards,
Phil
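
The multiple-address suggestion above could be sketched roughly as
follows. This is not part of the posted script: the function name
first_reachable and the PING_CMD variable are hypothetical, and the
sketch assumes a ping that accepts -t and -c and exits 0 when at least
one reply is heard, as FreeBSD's ping(8) does.

```sh
#!/bin/sh
# Hypothetical sketch of probing several addresses in turn.
# PING_CMD is an assumption so the probe command can be replaced.
: "${PING_CMD:=/sbin/ping}"

# first_reachable COUNT TIMEOUT IP...
# Prints the first address that answers and returns 0;
# returns 1 if none of the addresses respond.
first_reachable()
{
	_count=$1
	_timeout=$2
	shift 2
	for _ip in "$@"; do
		if $PING_CMD -t "$_timeout" -c "$_count" "$_ip" >/dev/null 2>&1; then
			echo "$_ip"
			return 0
		fi
	done
	return 1
}
```

Inside waitnetwork_start this might be called as
first_reachable "${waitnetwork_count}" "${waitnetwork_timeout}" ${waitnetwork_ip}
with waitnetwork_ip left unquoted so that it splits on whitespace. Note
that the worst-case delay grows to the timeout multiplied by the number
of addresses, so a shorter per-address timeout may be preferable.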