PGNd
2014-Sep-14 05:38 UTC
shorewall-init fails & 1 provider (vpn) interface unusable on boot -- but sys ends up fully up & running after boot anyway. why?
I'm attempting to troubleshoot boot-time failures of shorewall-init.service start, and of the interface of one of my providers (my vpn).

I'm currently stymied by the fact that *despite* the failures -- for which I can't yet find the reason -- once *fully* booted, the system heals itself and everything's running OK.

I'll certainly take the 'up' state, but would prefer to fix the boot-time problem; I'd appreciate a fresh set of eyes ...

On my edge router/firewall I've installed

    shorewall-init
    shorewall-lite
    shorewall6-lite
    openvpn

I've configured two providers, prov1 (the 'net @ eth0) & prov2 (vpn @ tun1).

*During* startup, journalctl shows that shorewall-init.service fails to start and 'tun1' is not usable.

    journalctl -b | grep -i shorewall | grep -iv shorewall6
!!!     Sep 13 21:52:25 core shorewall-init[932]: Initializing "Shorewall-based firewalls":
        Sep 13 21:52:25 core systemd[1]: shorewall-init.service: main process exited, code=exited, status=1/FAILURE
        Sep 13 21:52:25 core systemd[1]: Unit shorewall-init.service entered failed state.
        Sep 13 21:53:29 core systemd[1]: Starting shorewall-lite...
        Sep 13 21:53:30 core shorewall-lite[3280]: Starting Shorewall Lite....
        Sep 13 21:53:31 core shorewall-lite[3280]: OK ping @ INTFC=eth0
        Sep 13 21:53:31 core shorewall-lite[3280]: Initializing...
        Sep 13 21:53:33 core shorewall-lite[3280]: Processing init user exit ...
        Sep 13 21:53:33 core shorewall-lite[3280]: Processing tcclear user exit ...
        Sep 13 21:53:33 core shorewall-lite[3280]: Setting up Route Filtering...
        Sep 13 21:53:33 core shorewall-lite[3280]: Setting up Martian Logging...
        Sep 13 21:53:33 core shorewall-lite[3280]: Setting up Accept Source Routing...
        Sep 13 21:53:33 core shorewall-lite[3280]: Setting up Proxy ARP...
        Sep 13 21:53:33 core shorewall-lite[3280]: Adding Providers...
!!!     Sep 13 21:53:34 core shorewall-lite[3280]: WARNING: Interface tun1 is not usable -- Provider prov2 (2) not Started
        Sep 13 21:53:34 core shorewall-lite[3280]: Preparing iptables-restore input...
        Sep 13 21:53:34 core shorewall-lite[3280]: Running /usr/sbin/iptables-restore...
        Sep 13 21:53:34 core shorewall-lite[3280]: IPv4 Forwarding Enabled
        Sep 13 21:53:34 core shorewall-lite[3280]: Processing start user exit ...
        Sep 13 21:53:34 core shorewall-lite[3280]: Processing started user exit ...
        Sep 13 21:53:34 core logger[3821]: Shorewall Lite started
        Sep 13 21:53:34 core shorewall-lite[3280]: done.

Once the system's fully booted, the shorewall-init service is NOT running,

    systemctl status shorewall-init
        shorewall-init.service - Shorewall IPv4 firewall
           Loaded: loaded (/etc/systemd/system/shorewall-init.service; enabled)
           Active: failed (Result: exit-code) since Sat 2014-09-13 21:52:25 PDT; 29min ago
          Process: 932 ExecStart=/usr/sbin/shorewall-init $OPTIONS start (code=exited, status=1/FAILURE)
         Main PID: 932 (code=exited, status=1/FAILURE)

        Sep 13 21:52:25 core shorewall-init[932]: Initializing "Shorewall-based firewalls":
        Sep 13 21:52:25 core systemd[1]: shorewall-init.service: main process exited, code=exited, status=1/FAILURE
        Sep 13 21:52:25 core systemd[1]: Unit shorewall-init.service entered failed state.

BUT shorewall-lite shows that all the routes are actually set for BOTH prov1 & prov2 -- only possible (iiuc) if both interfaces are 'usable'.

Checking

    shorewall-lite show routing
        Shorewall Lite 4.6.3.3 Routing at core - Sat Sep 13 21:57:59 PDT 2014

        Routing Rules

        0:      from all lookup local
        10000:  from all fwmark 0x100/0xff00 lookup prov1
        10001:  from all fwmark 0x200/0xff00 lookup prov2
        20000:  from xx.xx.xx.xx lookup prov1
        20000:  from 10.0.0.2 lookup prov2
        32766:  from all lookup main
        32767:  from all lookup default

        Table default:
        ...
        Table local:
        ...
        Table main:
        ...
        Table prov1:
        ...
        Table prov2:
        ...

At this point, the vpn's also fully up & running.
Everything appears to be working -- as intended.

Somewhere between the initial fail @ boot, and a running-system state, things appear to straighten themselves out. I *suspect* it's systemd dependencies among shorewall-init, shorewall-lite, network & openvpn ... Is it?

With this interface/provider config

    /params
        THIS_EXT_IF=eth0
        THIS_INT_IF=eth1
        THIS_VPN_IF=tun1

    /interfaces
        ?FORMAT 2
        net   EXT_IF  optional,physical=$THIS_EXT_IF,...
        vpn1  VPN_IF  optional,physical=$THIS_VPN_IF,...
        -     INT_IF  physical=$THIS_INT_IF,...

    /providers
        prov1  1  0x100  main  EXT_IF  detect    track,balance   INT_IF
        prov2  2  0x200  main  VPN_IF  10.0.0.1  track,fallback  INT_IF

I've got the following systemd units in place; I think (?) these are all that are relevant here ...

    cat /etc/systemd/system/shorewall-init.service
        [Unit]
        Description=Shorewall IPv4 firewall
        After=syslog.target
        Before=network.target

        [Service]
        Type=oneshot
        RemainAfterExit=yes
        EnvironmentFile=-/etc/sysconfig/shorewall-init
        StandardOutput=syslog
        ExecStart=/usr/sbin/shorewall-init $OPTIONS start
        ExecStop=/usr/sbin/shorewall-init $OPTIONS stop

        [Install]
        WantedBy=multi-user.target

    cat /etc/systemd/system/shorewall-lite.service
        [Unit]
        Description=shorewall-lite
        After=syslog.target network.target
        Before=shorewall-lite.target
        Requires=network.target

        [Service]
        Type=oneshot
        RemainAfterExit=yes
        StandardOutput=syslog
        ExecStartPre=/usr/local/etc/shorewall/scripts/launch4.sh
        ExecStart=/usr/sbin/shorewall-lite start
        ExecStop=/usr/sbin/shorewall-lite stop

        [Install]
        WantedBy=multi-user.target

    cat /etc/systemd/system/openvpn-custom.service
        [Unit]
        Description=OpenVPN Server
        After=syslog.target network.target shorewall-lite.target
        Before=openvpn-custom.target
        Requires=shorewall-lite.target
        Requires=network.target

        [Service]
        PrivateTmp=true
        Environment=PATH="/usr/local/scripts:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin"
        Type=forking
        ExecStartPre=/usr/local/etc/openvpn/up.script
        ExecStart=/usr/local/openvpn/sbin/openvpn \
          --daemon \
          --cd /usr/local/etc/openvpn/ \
          --config client.conf \
          --writepid /var/run/openvpn/openvpn.pid
        ExecStopPost=/usr/local/etc/openvpn/down.script
        Restart=always
        RestartSec=30

        [Install]
        WantedBy=multi-user.target

I've not yet been successful figuring out exactly WHY I'm seeing those initial fails, and am unclear why/how it seems to end up working.

Have I screwed up the After/Before/Requires dependencies? Something in my shorewall config? Or am I looking in the completely wrong place for the problem?

Any ideas? If more info from my end is needed, happy to provide -- just not sure what's useful, yet.
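
One way to test the dependency theory would be to put the ordering directives of the three units side by side and compare them with what systemd actually resolved at boot. A minimal sketch, assuming a POSIX shell; the helper names unit_deps and show_boot_order are made up for illustration and are not part of Shorewall or systemd:

```shell
#!/bin/sh

# unit_deps FILE
# Hypothetical helper: print only the ordering/requirement directives
# (After=, Before=, Requires=, Wants=) found in a unit file.
unit_deps() {
    grep -E '^(After|Before|Requires|Wants)=' "$1"
}

# show_boot_order UNIT...
# Hypothetical helper: for each unit, show the dependencies systemd resolved,
# the chain of units it waited on, and its log lines from the current boot.
# Needs a running systemd, so it is only defined here, not called.
show_boot_order() {
    for unit in "$@"; do
        systemctl show -p After -p Before -p Requires "$unit"
        systemd-analyze critical-chain "$unit"
        journalctl -b -u "$unit"
    done
}
```

Usage would be something like

    unit_deps /etc/systemd/system/openvpn-custom.service
    show_boot_order shorewall-init.service shorewall-lite.service openvpn-custom.service

The first call only reads the file; the second needs to run on the booted router itself.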