thr3ads.net - dovecot - dovecot: master: Panic: kevent() failed: Invalid argument [Feb 2018]

If this information is useful, please help other people find it:
Share via:

Adrian Gonzalez

2018-Feb-13 01:48 UTC

dovecot: master: Panic: kevent() failed: Invalid argument

Hello

I've been seeing these panics with dovecot 2.2.31 (and older versions) 
under FreeBSD about once every 2-3 months or so.? I know it's been 
discussed here in the list, but I didn't see a definitive fix.? If it's 
been fixed, please ignore this post (sorry!).

I finally managed to observe the bug on a server that has the patch that 
prints out the kevent() timespec values and got:

Nov 29 16:46:27 mail dovecot: master: Panic: kevent(events=74, 
ts=59.1000000000) failed: Invalid argument

Looking through the code, the panic is triggered in 
io_loop_handler_run_internal().? The timeout values for the kevent() 
call are obtained by calling io_loop_get_wait_time(), which in turn 
calls timeout_get_wait_time() (both in ioloop.c).

The timeout is computed by subtracting the value returned by 
gettimeofday() from timeout->next_run, and it looks like in very rare 
cases the result in tv_r->tv_usec is 1000000uS (1 second).

So, it seems that if gettimeofday() returns exactly 0 microseconds, and 
timeout->next_run has exactly 1000000 microseconds, the kevent() call 
will fail (1 in 1e12 chance?).

I'm not entirely sure where timeout->next_run is computed, but looking 
at timeout_update_next(), it adjusts the seconds only if the 
microseconds are greater than 1000000:? (ioloop.c, line 235)

if (timeout->next_run.tv_usec > 1000000) {
 ??????????????? timeout->next_run.tv_sec++;
 ??????????????? timeout->next_run.tv_usec -= 1000000;
}

It seems like the comparison here should be greater or equal, but I'm 
not familiar enough with the code to say with absolute certainty.

I added more debug error messages to the code in timeout_update_next(), 
and although I do see next_run->tv_usec at 1000000 several times a day, 
I haven't seen a panic since early December on any of our servers, so I 
decided to post this to see if this analysis makes sense.

Aki Tuomi

2018-Feb-13 07:17 UTC

head link

dovecot: master: Panic: kevent() failed: Invalid argument

On 13.02.2018 03:48, Adrian Gonzalez wrote:>
> Hello
>
> I've been seeing these panics with dovecot 2.2.31 (and older versions)
> under FreeBSD about once every 2-3 months or so.? I know it's been
> discussed here in the list, but I didn't see a definitive fix.? If
> it's been fixed, please ignore this post (sorry!).
>
> I finally managed to observe the bug on a server that has the patch
> that prints out the kevent() timespec values and got:
>
> Nov 29 16:46:27 mail dovecot: master: Panic: kevent(events=74,
> ts=59.1000000000) failed: Invalid argument
>
> Looking through the code, the panic is triggered in
> io_loop_handler_run_internal().? The timeout values for the kevent()
> call are obtained by calling io_loop_get_wait_time(), which in turn
> calls timeout_get_wait_time() (both in ioloop.c).
>
> The timeout is computed by subtracting the value returned by
> gettimeofday() from timeout->next_run, and it looks like in very rare
> cases the result in tv_r->tv_usec is 1000000uS (1 second).
>
> So, it seems that if gettimeofday() returns exactly 0 microseconds,
> and timeout->next_run has exactly 1000000 microseconds, the kevent()
> call will fail (1 in 1e12 chance?).
>
> I'm not entirely sure where timeout->next_run is computed, but
looking
> at timeout_update_next(), it adjusts the seconds only if the
> microseconds are greater than 1000000:? (ioloop.c, line 235)
>
> if (timeout->next_run.tv_usec > 1000000) {
> ??????????????? timeout->next_run.tv_sec++;
> ??????????????? timeout->next_run.tv_usec -= 1000000;
> }
>
> It seems like the comparison here should be greater or equal, but I'm
> not familiar enough with the code to say with absolute certainty.
>
> I added more debug error messages to the code in
> timeout_update_next(), and although I do see next_run->tv_usec at
> 1000000 several times a day, I haven't seen a panic since early
> December on any of our servers, so I decided to post this to see if
> this analysis makes sense.
>
>
>
Hi!

I noticed that further down we actually use >= instead of >, so I'm
guessing it was just missed here. Thanks.

Aki

Reasonably Related Threads

Search for more reasonably related threads

dovecot - Feb 2018 - dovecot: master: Panic: kevent() failed: Invalid argument

dovecot: master: Panic: kevent() failed: Invalid argument

dovecot: master: Panic: kevent() failed: Invalid argument

Reasonably Related Threads