Hello,
I am trying the 1.1 branch and I get a segmentation fault when tincd receives an ALRM signal.
This looks like a race condition.
I start my tincd daemon manually from if-up.d/jmuchemb (without
IF_TINC_NET), and when if-up.d/tinc runs, it sends an ALRM signal that makes
tincd crash.
It fails here:
Core was generated by `tincd -D -n jmuchemb -d -o ConnectTo srv -o srv.Address 81.x.y.z -o Connect'.
Program terminated with signal 11, Segmentation fault.
#0 0x000000000040a685 in retry () at net.c:349
349 if(c->outgoing && !c->node) {
(gdb) p c
$1 = (connection_t *) 0x0
Here is the end of the strace output:
> read(24, "-----BEGIN RSA PUBLIC KEY-----\nM"..., 4096) = 426
> close(24) = 0
> munmap(0x7f9f1978d000, 4096) = 0
> epoll_wait(14, {{EPOLLOUT, {u32=23, u64=23}}}, 32, 4907) = 1
> sendto(23, "0 tecra 17.0\n1 94 64 0 0 A2B583B"..., 538, 0, NULL, 0) = 538
> epoll_ctl(14, EPOLL_CTL_MOD, 23, {EPOLLIN, {u32=23, u64=23}}) = 0
> epoll_wait(14, 69d660, 32, 4904) = -1 EINTR (Interrupted system call)
> --- SIGALRM (Alarm clock) @ 0 (0) ---
> sendto(15, "\16", 1, 0, NULL, 0) = 1
> rt_sigreturn(0xf) = -1 EINTR (Interrupted system call)
> epoll_wait(14, {{EPOLLIN, {u32=16, u64=16}}}, 32, 4886) = 1
> recvfrom(16, "\16", 1024, 0, NULL, NULL) = 1
> recvfrom(16, 0x7f9f191aa4e0, 1024, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
> futex(0x7f9f18926840, FUTEX_WAKE_PRIVATE, 2147483647) = 0
> write(2, "Got Alarm clock signal\n", 23) = 23
> write(2, "Could not set up a meta connecti"..., 42) = 42
> write(2, "Trying to re-establish outgoing "..., 56) = 56
> epoll_ctl(14, EPOLL_CTL_DEL, 22, {EPOLLIN, {u32=22, u64=22}}) = 0
> close(22) = 0
> --- SIGSEGV (Segmentation fault) @ 0 (0) ---
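The close(22) immediately before the SIGSEGV would be consistent with a
connection being torn down while retry() is still walking the connection list,
though I have not verified this against the source. Below is a minimal sketch
of iterating safely under that assumption. It is not tinc's code: conn_t,
add_conn, terminate and retry_sketch are hypothetical names, and only the two
fields tested at net.c:349 are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for connection_t (assumption, not tinc's layout). */
typedef struct conn {
    struct conn *next;
    bool outgoing;   /* connection was started by us */
    void *node;      /* NULL until the peer is known */
} conn_t;

/* Prepend a new connection to the list and return it. */
static conn_t *add_conn(conn_t **head, bool outgoing, void *node) {
    conn_t *c = calloc(1, sizeof *c);
    assert(c != NULL);
    c->outgoing = outgoing;
    c->node = node;
    c->next = *head;
    *head = c;
    return c;
}

/* Unlink c from the list and free it. */
static void terminate(conn_t **head, conn_t *c) {
    for (conn_t **p = head; *p; p = &(*p)->next) {
        if (*p == c) {
            *p = c->next;
            free(c);
            return;
        }
    }
}

/* Close every outgoing connection that has no node yet.
 * The next pointer is saved before terminate() can free c, so the loop
 * never reads a freed or NULL element. Returns the number closed. */
static int retry_sketch(conn_t **head) {
    int closed = 0;
    for (conn_t *c = *head, *next; c; c = next) {
        next = c->next;
        if (c->outgoing && !c->node) {
            terminate(head, c);
            closed++;
        }
    }
    return closed;
}
```

The point of the sketch is only the saved next pointer: if the loop instead
advanced through c after terminate() had freed it, the following iteration
could end up dereferencing an invalid pointer, as in the backtrace above.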
I can reproduce it easily, so I can send more information if you want,
including a core dump, the binaries (Debian) and the full strace log.
Regards,
Julien