thr3ads.net - freebsd stable - Networking panic on 12

Andrey V. Elsukov

2019-Feb-06 12:16 UTC

More CARP issues under 12

On 05.02.2019 18:06, Pete French wrote:
> The branch and revision is 12.0-STABLE r343538 GENERIC
> 
>> # kgdb
>>
>> (kgdb) list *ether_output+0x6b6
> 
> trying to do this on the actual box is hard, as it panics, but on another
> machine running the same build I get this, which should suffice if you
> are just interested in seeing the line in the source code ?
> 
> (kgdb)  list *ether_output+0x6b6
> 0xffffffff80ca1526 is in ether_output (/usr/src/sys/net/if_ethersubr.c:435).
> 430                     if (m == NULL)
> 431                             return (0);
> 432             }
> 433
> 434             /* Continue with link-layer output */
> 435             return ether_output_frame(ifp, m);
> 436     }
> 437
> 438     static bool
> 439     ether_set_pcp(struct mbuf **mp, struct ifnet *ifp, uint8_t pcp)
Hi,

this doesn't look very useful.
Is there anything specific about this host besides CARP? For example,
modifications to the kernel config, lagg, jails, etc.

-- 
WBR, Andrey V. Elsukov

Pete French

2019-Feb-06 12:30 UTC

More CARP issues under 12

On 06/02/2019 12:16, Andrey V. Elsukov wrote:
> Hi,
> 
> this doesn't look very useful.
> Do you have some specificity with this host except carp? Some
> modifications to kernel config, lagg, jails, etc.
No, none of those. It's a Supermicro motherboard, runs FreeBSD
GENERIC and mysql+redis on top, that's it. The only oddity is
carp (used to fail over the redis), but the panic happens even when
I disable carp and have removed all the ports too. My only customisation
to the build is to disable sendmail and lpr.

We do use geli for the drives, and load aesni as a module as well to
speed that up.

loader.conf below:

	kern.geom.label.disk_ident.enable=0
	kern.geom.label.gptid.enable=0

	ahci_load="YES"
	console="comconsole"

	aesni_load="YES"
	cryptodev_load="YES"
	geom_eli_load="YES"
	carp_load="YES"

	zfs_load="YES"
	vfs.zfs.arc_max="1G"
	vfs.zfs.prefetch_disable="1"
	vfs.zfs.txg.timeout="5"
	vfs.zfs.vdev.cache.size="10M"
	vfs.zfs.vdev.cache.max="10M"

rc.conf below:

	geli_enable="YES"
	geli_autodetach="NO"
	geli_devices="ada0p4 ada1p4"
	
	hostname="serpentine-passive.telehouse-internal.ingresso.co.uk"
	
	ifconfig_igb0="inet 10.32.10.4/16"
	ifconfig_igb0_ipv6="inet6 2a02:1658:1:2:e550::4/64"
	ifconfig_igb0_alias0="inet 10.32.10.8/16 vhid 80 advskew 160 pass redacted"
	
	defaultrouter="10.32.10.6"
	ipv6_defaultrouter="2a02:1658:1:2:e550::6"
	
	ifconfig_igb1="down"
	
	pf_enable="NO"
	pf_rules="/usr/local/etc/pf.conf"
	
	redis_enable="YES"
	stunnel_enable="YES"
	
	mysql_enable="YES"
	mysql_dbdir="/usr/home/mysql/data"
	
	tsw_redis_capture_enable="YES"
	tsw_redis_capture_if="igb0"
	
	datadog_enable="YES"
	datadog_user="root"
	datadog_chdir="/usr/local/datadog"
	
	sshd_enable="YES"
	named_enable="YES"
	zfs_enable="YES"
	ntpd_enable="YES"
	
	syslogd_enable="NO"
	syslog_ng_enable="YES"
	
	exim_enable="YES"
	sendmail_enable="NO"
	sendmail_submit_enable="NO"
	sendmail_outbound_enable="NO"
	sendmail_msp_queue_enable="NO"
	
	nfs_server_enable="NO"
	nfs_client_enable="YES"
	nfsv4_server_enable="NO"
	nfsuserd_enable="YES"
	rpcbind_enable="YES"
	rpc_lockd_enable="YES"
	rpc_lockd_flags="-p 819"
	rpc_statd_enable="YES"
	rpc_statd_flags="-p 823"
	mountd_enable="NO"
	
	fluentd_enable="YES"

The tsw_redis_capture script just sets carp to MASTER if redis is
enabled, which means that if the machine boots without redis running
then carp won't grab the address anyway.

Pete French

2019-Feb-08 11:01 UTC

More CARP issues under 12

So, another datapoint on this - I just PXE booted the 12.0-RELEASE image
downloaded from https://mfsbsd.vx.sk/ and that works fine. Which means
that it's either something which has crept in since 12.0-RELEASE or it's
something to do with my config on that machine.

I did try to build an mfsroot image of the kernel I am trying to
deploy, but that failed, which is a bit of a shame, as that's easier to
try than the full upgrade (because rolling that back after a crash
is tricky!). The alternative is to build 12.0-RELEASE and see if that boots
up. Not sure when I will get around to trying either of those though.

-pete.

Pete French

2019-Feb-12 14:53 UTC

Networking panic on 12 - found the cause

I found my panic. If I take everything out of rc.conf and loader.conf 
and sysctl.conf and boot the system it works fine when I add an IP 
address. If I add this one line to sysctl.conf

	net.link.ether.inet.garp_rexmit_count=2

Then I get a panic when I configure the interface:

root@serpentine-passive:~ # ifconfig igb0 inet 10.32.10.4/16 up

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x28
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80c987f1
stack pointer           = 0x28:0xfffffe00004d5730
frame pointer           = 0x28:0xfffffe00004d5750
code segment            = base 0x0, limit 0xfffff, type 0x1b
                         = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 12 (swi4: clock (0))
trap number             = 12
panic: page fault
cpuid = 0
time = 1549981620
KDB: stack backtrace:
#0 0xffffffff80bdfdc7 at kdb_backtrace+0x67
#1 0xffffffff80b93fa3 at vpanic+0x1a3
#2 0xffffffff80b93df3 at panic+0x43
#3 0xffffffff8106a7bf at trap_fatal+0x35f
#4 0xffffffff8106a819 at trap_pfault+0x49
#5 0xffffffff81069e3e at trap+0x29e
#6 0xffffffff810450c5 at calltrap+0x8
#7 0xffffffff80c986f6 at ether_output+0x6b6
#8 0xffffffff80d03354 at arprequest+0x4c4
#9 0xffffffff80d0515c at garp_rexmit+0xbc
#10 0xffffffff80bade19 at softclock_call_cc+0x129
#11 0xffffffff80bae2f9 at softclock+0x79
#12 0xffffffff80b57c57 at ithread_loop+0x1a7
#13 0xffffffff80b54da2 at fork_exit+0x82
#14 0xffffffff810460be at fork_trampoline+0xe
Uptime: 2m6s
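
For anyone wanting to look up the source lines behind these frames, here is a rough sketch that turns the KDB backtrace lines into the same kind of `list *function+offset` command used with kgdb earlier in the thread. The helper names (`parse_frames`, `kgdb_commands`) and the list of skipped trap-handling frames are my own choices for illustration, not part of any FreeBSD tool, and the resulting commands only give meaningful line numbers when run against the exact kernel build that panicked, with debug symbols available.

```python
import re

# Matches KDB backtrace frames such as:
#   #7 0xffffffff80c986f6 at ether_output+0x6b6
FRAME_RE = re.compile(
    r"^#(\d+)\s+(0x[0-9a-fA-F]+)\s+at\s+(\w+)\+(0x[0-9a-fA-F]+)\s*$"
)

# Frames belonging to the panic/trap machinery itself (assumed list,
# taken from the names that appear in the trace above).
TRAP_FRAMES = {
    "kdb_backtrace", "vpanic", "panic",
    "trap_fatal", "trap_pfault", "trap", "calltrap",
}


def parse_frames(text):
    """Return (frame_no, address, function, offset) for each frame line."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            frames.append((int(m.group(1)), m.group(2), m.group(3), m.group(4)))
    return frames


def kgdb_commands(frames, skip=TRAP_FRAMES):
    """Build one kgdb 'list' command per frame, skipping trap handling."""
    return [f"list *{func}+{off}" for _, _, func, off in frames if func not in skip]


# A few frames copied from the panic above.
BACKTRACE = """\
#5 0xffffffff81069e3e at trap+0x29e
#6 0xffffffff810450c5 at calltrap+0x8
#7 0xffffffff80c986f6 at ether_output+0x6b6
#8 0xffffffff80d03354 at arprequest+0x4c4
#9 0xffffffff80d0515c at garp_rexmit+0xbc
"""

if __name__ == "__main__":
    for cmd in kgdb_commands(parse_frames(BACKTRACE)):
        print(cmd)
```

The printed commands can be pasted at the `(kgdb)` prompt one at a time; the first non-trap frame (`ether_output+0x6b6` here) is usually the most interesting one to start with.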
