On 05.02.2019 18:06, Pete French wrote:
> The branch and revision is 12.0-STABLE r343538 GENERIC
>
>> # kgdb
>>
>> (kgdb) list *ether_output+0x6b6
>
> trying to do this on the actual box is hard, as it panics, but on another
> machine running the same build I get this, which should suffice if you
> are just interested in seeing the line in the source code ?
>
> (kgdb) list *ether_output+0x6b6
> 0xffffffff80ca1526 is in ether_output (/usr/src/sys/net/if_ethersubr.c:435).
> 430             if (m == NULL)
> 431                     return (0);
> 432     }
> 433
> 434     /* Continue with link-layer output */
> 435     return ether_output_frame(ifp, m);
> 436 }
> 437
> 438 static bool
> 439 ether_set_pcp(struct mbuf **mp, struct ifnet *ifp, uint8_t pcp)

Hi,

this doesn't look very useful.
Do you have some specificity with this host except carp? Some
modifications to kernel config, lagg, jails, etc.

--
WBR, Andrey V. Elsukov
On 06/02/2019 12:16, Andrey V. Elsukov wrote:
> Hi,
>
> this doesn't look very useful.
> Do you have some specificity with this host except carp? Some
> modifications to kernel config, lagg, jails, etc.

No, none of those. It's a Supermicro motherboard, runs FreeBSD GENERIC and
mysql+redis on top, that's it. The only oddity is carp (used to fail over
the redis), but the panic still happens with carp disabled and all the
ports removed. My only customisation to the build is to disable sendmail
and lpr. We do use geli for the drives, and load aesni as a module as well
to speed that up.

loader.conf below:

kern.geom.label.disk_ident.enable=0
kern.geom.label.gptid.enable=0
ahci_load="YES"
console="comconsole"
aesni_load="YES"
cryptodev_load="YES"
geom_eli_load="YES"
carp_load="YES"
zfs_load="YES"
vfs.zfs.arc_max="1G"
vfs.zfs.prefetch_disable="1"
vfs.zfs.txg.timeout="5"
vfs.zfs.vdev.cache.size="10M"
vfs.zfs.vdev.cache.max="10M"

rc.conf below:

geli_enable="YES"
geli_autodetach="NO"
geli_devices="ada0p4 ada1p4"
hostname="serpentine-passive.telehouse-internal.ingresso.co.uk"
ifconfig_igb0="inet 10.32.10.4/16"
ifconfig_igb0_ipv6="inet6 2a02:1658:1:2:e550::4/64"
ifconfig_igb0_alias0="inet 10.32.10.8/16 vhid 80 advskew 160 pass redacted"
defaultrouter="10.32.10.6"
ipv6_defaultrouter="2a02:1658:1:2:e550::6"
ifconfig_igb1="down"
pf_enable="NO"
pf_rules="/usr/local/etc/pf.conf"
redis_enable="YES"
stunnel_enable="YES"
mysql_enable="YES"
mysql_dbdir="/usr/home/mysql/data"
tsw_redis_capture_enable="YES"
tsw_redis_capture_if="igb0"
datadog_enable="YES"
datadog_user="root"
datadog_chdir="/usr/local/datadog"
sshd_enable="YES"
named_enable="YES"
zfs_enable="YES"
ntpd_enable="YES"
syslogd_enable="NO"
syslog_ng_enable="YES"
exim_enable="YES"
sendmail_enable="NO"
sendmail_submit_enable="NO"
sendmail_outbound_enable="NO"
sendmail_msp_queue_enable="NO"
nfs_server_enable="NO"
nfs_client_enable="YES"
nfsv4_server_enable="NO"
nfsuserd_enable="YES"
rpcbind_enable="YES"
rpc_lockd_enable="YES"
rpc_lockd_flags="-p 819"
rpc_statd_enable="YES"
rpc_statd_flags="-p 823"
mountd_enable="NO"
fluentd_enable="YES"

The tsw_redis_capture script just sets carp to MASTER if redis is enabled,
which means that if the machine boots without redis running then carp
won't grab the address anyway.
So, another datapoint on this - I just PXE booted the 12.0-RELEASE image
downloaded from https://mfsbsd.vx.sk/ and that works fine. Which means
that it is either something which has crept in since 12.0-RELEASE or it's
something to do with my config on that machine.

I did try to build an mfsroot image of the kernel I am trying to deploy,
but that failed, which is a bit of a shame, as that's easier to try than
the full upgrade (because rolling that back after a crash is tricky!).

The alternative is to build 12.0-RELEASE and see if that boots up. Not
sure when I will get around to trying either of those though.

-pete.
I found my panic. If I take everything out of rc.conf and loader.conf
and sysctl.conf and boot the system it works fine when I add an IP address.

If I add this one line to sysctl.conf

net.link.ether.inet.garp_rexmit_count=2

then I get a panic when I configure the interface:

root@serpentine-passive:~ # ifconfig igb0 inet 10.32.10.4/16 up

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x28
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80c987f1
stack pointer           = 0x28:0xfffffe00004d5730
frame pointer           = 0x28:0xfffffe00004d5750
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 12 (swi4: clock (0))
trap number             = 12
panic: page fault
cpuid = 0
time = 1549981620
KDB: stack backtrace:
#0 0xffffffff80bdfdc7 at kdb_backtrace+0x67
#1 0xffffffff80b93fa3 at vpanic+0x1a3
#2 0xffffffff80b93df3 at panic+0x43
#3 0xffffffff8106a7bf at trap_fatal+0x35f
#4 0xffffffff8106a819 at trap_pfault+0x49
#5 0xffffffff81069e3e at trap+0x29e
#6 0xffffffff810450c5 at calltrap+0x8
#7 0xffffffff80c986f6 at ether_output+0x6b6
#8 0xffffffff80d03354 at arprequest+0x4c4
#9 0xffffffff80d0515c at garp_rexmit+0xbc
#10 0xffffffff80bade19 at softclock_call_cc+0x129
#11 0xffffffff80bae2f9 at softclock+0x79
#12 0xffffffff80b57c57 at ithread_loop+0x1a7
#13 0xffffffff80b54da2 at fork_exit+0x82
#14 0xffffffff810460be at fork_trampoline+0xe
Uptime: 2m6s
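For anyone who wants to look up more than one frame of a backtrace like this in kgdb, the following is a minimal sketch (not part of the original report) that turns KDB frames into `list *symbol+0xoffset` commands of the kind used earlier in the thread. The function names `parse_backtrace`, `kgdb_list_commands` and `symbol_base` are made up for illustration, and the regular expression assumes the `#N 0xADDR at symbol+0xOFF` layout shown above.

```python
import re

# Matches KDB frames such as "#7 0xffffffff80c986f6 at ether_output+0x6b6".
# Assumption: every frame follows this "#N 0xADDR at symbol+0xOFF" layout.
FRAME_RE = re.compile(
    r"#(\d+)\s+(0x[0-9a-fA-F]+)\s+at\s+([A-Za-z_]\w*)\+(0x[0-9a-fA-F]+)"
)


def parse_backtrace(text):
    """Return a list of (frame, address, symbol, offset) tuples."""
    return [
        (int(n), int(addr, 16), sym, int(off, 16))
        for n, addr, sym, off in FRAME_RE.findall(text)
    ]


def kgdb_list_commands(frames):
    """Build one 'list *symbol+0xoffset' command per parsed frame."""
    return [f"list *{sym}+{off:#x}" for _, _, sym, off in frames]


def symbol_base(address, offset):
    """Start address of the function: frame address minus the offset."""
    return address - offset


if __name__ == "__main__":
    sample = """
    #7 0xffffffff80c986f6 at ether_output+0x6b6
    #8 0xffffffff80d03354 at arprequest+0x4c4
    #9 0xffffffff80d0515c at garp_rexmit+0xbc
    """
    frames = parse_backtrace(sample)
    for cmd in kgdb_list_commands(frames):
        print(cmd)
    for _, addr, sym, off in frames:
        print(f"{sym} starts at {symbol_base(addr, off):#x}")
```

The generated commands can be pasted into a kgdb session. `symbol_base` is included because a `symbol+offset` pair only maps to the right source line when kgdb is run against the same kernel binary that panicked; comparing the computed start address with what kgdb reports for the symbol is one way to check that.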