Hi,

Our icecast server (version 2.2.0) locked up whilst I was out this evening.

Looking at the error log, I find a lot of the following:

[2005-04-28 08:21:16] WARN connection/_accept_connection accept() failed with error 24: Too many open files
[2005-04-28 08:21:17] WARN connection/_accept_connection accept() failed with error 24: Too many open files
[2005-04-28 08:21:17] WARN connection/_accept_connection accept() failed with error 24: Too many open files
[2005-04-28 08:21:18] WARN connection/_accept_connection accept() failed with error 24: Too many open files

There's really no other helpful info I'm afraid. I've still got the logs if there's something that might help.

Anyone know what might be the cause or how to prevent it? Presumably some file handle(s) aren't being closed when they should.

Geoff.

-- 
Geoff Shang <geoff@hitsandpieces.net>
Phone: +61-418-96-5590
MSN: geoff@acbradio.org

Make sure your E-mail can be read by everyone!
http://www.betips.net/etc/evilmail.html
Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html

On 4/28/05, Geoff Shang <geoff@hitsandpieces.net> wrote:
> Hi,
> Our icecast server (version 2.2.0) locked up whilst I was out this evening.
>
> Looking at the error log, I find a lot of the following:
>
> [2005-04-28 08:21:16] WARN connection/_accept_connection accept() failed
> with error 24: Too many open files

How many clients were connected? (An estimate would be ok).

Generally speaking, this sort of thing happens for one of three reasons:

1) file descriptor leak. If it's this, that's a pretty serious icecast
bug, probably exploitable as a DoS attack.

2) Hitting a compiled-in limit. Usually, this should only happen if
you have quite a lot of clients connected (at least several hundred),
AND you've compiled it using select() rather than poll() - which
should only happen on systems that don't have poll(). You use linux,
right? If so, it's pretty unlikely that this is your problem.

3) Hitting a kernel limit (either per-user or global). Unlikely unless
you have a lot of clients (hundreds or thousands). If it's a per-user
limit, it should be easy to change. If it's global... well, you'd need
at least a few thousand clients for that, so it's unlikely.

All in all, I'd say the first one (fd leak) is the most likely, but
there's not enough info here for me to guess at where (and if there
was, I couldn't do anything about it anyway - don't have my computer
with me here in Berlin :-).

Mike

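To illustrate reason 1 above, here is a minimal Python sketch of how a descriptor leak ends in "error 24". It is not icecast code. It lowers the process's own soft RLIMIT_NOFILE, then opens /dev/null repeatedly without closing anything until the open fails with EMFILE (errno 24 on Linux). The function name `leak_until_emfile` and the limit of 64 are assumptions chosen for the demonstration.

```python
import errno
import os
import resource


def leak_until_emfile(soft_limit=64):
    """Open descriptors without closing them until the kernel refuses.

    Hypothetical helper for illustration only. Returns (errno, leaked_count).
    """
    old_soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # The soft limit may not exceed the hard limit (unless the hard limit is unlimited).
    if hard == resource.RLIM_INFINITY:
        new_soft = soft_limit
    else:
        new_soft = min(soft_limit, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))

    leaked = []
    try:
        while True:
            # The "leak": every descriptor is kept open and never closed.
            leaked.append(os.open(os.devnull, os.O_RDONLY))
    except OSError as exc:
        return exc.errno, len(leaked)
    finally:
        # Clean up so the demonstration does not affect the rest of the process.
        for fd in leaked:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, hard))


if __name__ == "__main__":
    err, count = leak_until_emfile()
    print(f"stopped after {count} leaked descriptors: "
          f"errno {err} ({os.strerror(err)})")
```

A server in this state cannot accept() new connections even with few clients connected, which would be consistent with the log lines quoted above.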
Just to add to what Mike said

On Thu, 2005-04-28 at 18:30, Michael Smith wrote:
> > [2005-04-28 08:21:16] WARN connection/_accept_connection accept() failed
> > with error 24: Too many open files

...

> 1) file descriptor leak. If it's this, that's a pretty serious icecast
> bug, probably exploitable as a DoS attack.

Agreed. On Linux, check the /proc/<pid>/fd directory; utilities like
lsof may also show excessive file descriptor usage.

> 2) Hitting a compiled-in limit. Usually, this should only happen if
> you have quite a lot of clients connected (at least several hundred),
> AND you've compiled it using select() rather than poll() - which
> should only happen on systems that don't have poll(). You use linux,
> right? If so, it's pretty unlikely that this is your problem.

Most systems have poll now, but select has a limit that was typically
1024, IIRC.

> 3) Hitting a kernel limit (either per-user or global). Unlikely unless
> you have a lot of clients (hundreds or thousands). If it's a per-user
> limit, it should be easy to change. If it's global... well, you'd need
> at least a few thousand clients for that, so it's unlikely.

Check ulimit -n for the open files limit; the default is typically 1024.

> All in all, I'd say the first one (fd leak) is the most likely, but
> there's not enough info here for me to guess at where (and if there
> was, I couldn't do anything about it anyway - don't have my computer
> with me here in Berlin :-).

Agreed; the logs may show an unusual pattern.

karl.

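The /proc/<pid>/fd check suggested above can be scripted. The following is a rough Python sketch, assuming a Linux /proc filesystem. It tallies a process's open descriptors by kind, much as a quick look at lsof output would. The function name `summarize_fds` is made up for this example. Note that `resource.getrlimit` reports the limit of the process running the script, which matches the server's limit only if both were started under the same ulimit settings.

```python
import os
import resource
from collections import Counter


def summarize_fds(pid="self"):
    """Tally open descriptors of a process by kind, read from /proc/<pid>/fd.

    Hypothetical helper; Linux only. Reading another user's process
    usually requires root.
    """
    fd_dir = f"/proc/{pid}/fd"
    kinds = Counter()
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        if target.startswith(("socket:", "pipe:", "anon_inode:")):
            kinds[target.split(":", 1)[0]] += 1
        else:
            kinds["file"] += 1
    return kinds


if __name__ == "__main__":
    kinds = summarize_fds()  # pass the icecast pid instead of "self"
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"open descriptors: {sum(kinds.values())} {dict(kinds)}")
    print(f"RLIMIT_NOFILE for this process: soft={soft} hard={hard}")
```

Running this against the server's pid at intervals would show whether the socket count keeps climbing while the listener count stays flat, which is the pattern a leak would produce.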
Michael Smith wrote:
> Generally speaking, this sort of thing happens for one of three reasons:
>
> 1) file descriptor leak. If it's this, that's a pretty serious icecast
> bug, probably exploitable as a DoS attack.
>
> 2) Hitting a compiled-in limit. Usually, this should only happen if
> you have quite a lot of clients connected (at least several hundred),

Presumably you mean concurrently. I don't know the exact figure, but it
was almost certainly fewer than 100, and quite probably fewer than 50.

> AND you've compiled it using select() rather than poll() - which
> should only happen on systems that don't have poll(). You use linux,
> right? If so, it's pretty unlikely that this is your problem.

configure:19034: checking for poll
configure:19084: gcc -o conftest -g -O2 conftest.c >&5
configure:19087: $? = 0
configure:19090: test -s conftest
configure:19093: $? = 0
configure:19104: result: yes

> 3) Hitting a kernel limit (either per-user or global). Unlikely unless
> you have a lot of clients (hundreds or thousands). If it's a per-user
> limit, it should be easy to change. If it's global... well, you'd need
> at least a few thousand clients for that, so it's unlikely.

Like I said, definitely not a lot of concurrent users, particularly at
the time of day it occurred (08:21 GMT).

> All in all, I'd say the first one (fd leak) is the most likely, but
> there's not enough info here for me to guess at where (and if there
> was, I couldn't do anything about it anyway - don't have my computer
> with me here in Berlin :-).

Yeah, I realise the scarcity of information here, but there's really
nothing to see. Nothing unusual happened before it started, and
obviously not a lot could happen after.

Apart from keeping an eye on things, is there anything else I can do?

Geoff.