On 8 Dec 2020, at 19:49, Peter wrote:
> On Tue, Dec 08, 2020 at 04:50:00PM +0100, Kristof Provost wrote:
> ! Yeah, the bug is not exclusive to epair but that's where it's most easily
> ! seen.
>
> Ack.
>
> ! Try
> ! http://people.freebsd.org/~kp/0001-if-Fix-panic-when-destroying-vnet-and-epair-simultan.patch
>
> Great, thanks a lot.
>
> Now I have bad news: when playing yoyo with the next-best three
> application jails (with all their installed stuff), it took about
> ten ups and downs before I got this one:
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 1; apic id = 02
> fault virtual address = 0x10
> fault code = supervisor read data, page not present
> instruction pointer = 0x20:0xffffffff80aad73c
> stack pointer = 0x28:0xfffffe003f80e810
> frame pointer = 0x28:0xfffffe003f80e810
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 15486 (ifconfig)
> trap number = 12
> panic: page fault
> cpuid = 1
> time = 1607450838
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe003f80e4d0
> vpanic() at vpanic+0x17b/frame 0xfffffe003f80e520
> panic() at panic+0x43/frame 0xfffffe003f80e580
> trap_fatal() at trap_fatal+0x391/frame 0xfffffe003f80e5e0
> trap_pfault() at trap_pfault+0x4f/frame 0xfffffe003f80e630
> trap() at trap+0x4cf/frame 0xfffffe003f80e740
> calltrap() at calltrap+0x8/frame 0xfffffe003f80e740
> --- trap 0xc, rip = 0xffffffff80aad73c, rsp = 0xfffffe003f80e810, rbp = 0xfffffe003f80e810 ---
> ng_eiface_mediastatus() at ng_eiface_mediastatus+0xc/frame 0xfffffe003f80e810
> ifmedia_ioctl() at ifmedia_ioctl+0x174/frame 0xfffffe003f80e850
> ifhwioctl() at ifhwioctl+0x639/frame 0xfffffe003f80e8d0
> ifioctl() at ifioctl+0x448/frame 0xfffffe003f80e990
> kern_ioctl() at kern_ioctl+0x275/frame 0xfffffe003f80e9f0
> sys_ioctl() at sys_ioctl+0x101/frame 0xfffffe003f80eac0
> amd64_syscall() at amd64_syscall+0x380/frame 0xfffffe003f80ebf0
> fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe003f80ebf0
> --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x800475b2a, rsp = 0x7fffffffe358, rbp = 0x7fffffffe450 ---
> Uptime: 9m51s
> Dumping 899 out of 3959 MB:
>
> I decided to give it a second try, and this is what I did:
>
> root@edge:/var/crash # jls
>    JID  IP Address      Hostname                      Path
>      1  1***********    gate.***********.org          /j/gate
>      3  1***********    raix.***********.org          /j/raix
>      4                  oper.***********.org          /j/oper
>      5                  admn.***********.org          /j/admn
>      6                  data.***********.org          /j/data
>      7                  conn.***********.org          /j/conn
>      8                  kerb.***********.org          /j/kerb
>      9                  tele.***********.org          /j/tele
>     10                  rail.***********.org          /j/rail
> root@edge:/var/crash # service jail stop rail
> Stopping jails: rail.
> root@edge:/var/crash # service jail stop tele
> Stopping jails: tele.
> root@edge:/var/crash # service jail stop kerb
> Stopping jails: kerb.
> root@edge:/var/crash # jls
>    JID  IP Address      Hostname                      Path
>      1  1***********    gate.***********.org          /j/gate
>      3  1***********    raix.***********.org          /j/raix
>      4                  oper.***********.org          /j/oper
>      5                  admn.***********.org          /j/admn
>      6                  data.***********.org          /j/data
>      7                  conn.***********.org          /j/conn
> root@edge:/var/crash # jls -d
>    JID  IP Address      Hostname                      Path
>      1  1***********    gate.***********.org          /j/gate
>      3  1***********    raix.***********.org          /j/raix
>      4                  oper.***********.org          /j/oper
>      5                  admn.***********.org          /j/admn
>      6                  data.***********.org          /j/data
>      7                  conn.***********.org          /j/conn
>      9                  tele.***********.org          /j/tele
>     10                  rail.***********.org          /j/rail
> root@edge:/var/crash # service jail start kerb
> Starting jails:Fssh_packet_write_wait: Connection to 1*********** port 22: Broken pipe
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 1; apic id = 02
> fault virtual address = 0x0
> fault code = supervisor read instruction, page not present
> instruction pointer = 0x20:0x0
> stack pointer = 0x28:0xfffffe00540ea658
> frame pointer = 0x28:0xfffffe00540ea670
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 13420 (ifconfig)
> trap number = 12
> panic: page fault
> cpuid = 1
> time = 1607451910
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00540ea310
> vpanic() at vpanic+0x17b/frame 0xfffffe00540ea360
> panic() at panic+0x43/frame 0xfffffe00540ea3c0
> trap_fatal() at trap_fatal+0x391/frame 0xfffffe00540ea420
> trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00540ea470
> trap() at trap+0x4cf/frame 0xfffffe00540ea580
> calltrap() at calltrap+0x8/frame 0xfffffe00540ea580
> --- trap 0xc, rip = 0, rsp = 0xfffffe00540ea658, rbp = 0xfffffe00540ea670 ---
> ??() at 0/frame 0xfffffe00540ea670
> sysctl_rtsock() at sysctl_rtsock+0x3d5/frame 0xfffffe00540ea8a0
> sysctl_root_handler_locked() at sysctl_root_handler_locked+0x90/frame 0xfffffe00540ea8e0
> sysctl_root() at sysctl_root+0x248/frame 0xfffffe00540ea960
> userland_sysctl() at userland_sysctl+0x178/frame 0xfffffe00540eaa10
> sys___sysctl() at sys___sysctl+0x5f/frame 0xfffffe00540eaac0
> amd64_syscall() at amd64_syscall+0x380/frame 0xfffffe00540eabf0
> fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00540eabf0
> --- syscall (202, FreeBSD ELF64, sys___sysctl), rip = 0x80047646a, rsp = 0x7fffffffe378, rbp = 0x7fffffffe3b0 ---
> Uptime: 16m48s
> Dumping 938 out of 3959 MB:
>
>
> Sorry for the bad news.
>
You appear to be triggering two or three different bugs there.
Can you reduce your netgraph use case to a small test case that can
trigger the problem? I'm not likely to be able to do anything unless I
can reproduce the problem(s).

Best regards,
Kristof