Stefan Viljoen
2015-Aug-11 09:00 UTC
[asterisk-users] 786 000 files limit Centos 7 - Asterisk keep complaining
>> Anybody else ran into this?

> No, but I would ask myself why so many file descriptors are being used.
> It sounds like you have a file descriptor leak (not being closed when
> finished with).

Hi Tony

Thanks for replying.

I suspected something like that, though repeatedly running

lsof | wc -l

always stays quite low - around 100 000 open files, which is still 8 times less than the system maximum as confirmed by running ulimit -n

I also note that this number will increase to about 125 000 but never go higher than that, then, as calls hang up, decrease again - during times when the CLI is spammed with 100s of "broken pipe" errors due to insufficient file descriptors, this number never goes beyond 125 000 out of the available 800 000 open files.

> You might also want to look at the output of lsof (or at least some of it)
> to see what all these file descriptors are pointing to, and whether it is
> indeed Asterisk that is consuming them.

If I grep for asterisk in the output of lsof, the few thousand lines I have looked at all seem to indicate legitimate uses - there are at least two files for each conversation in progress (I assume for inward and outward RTP) plus one for each file being mixmonitored (which also seems logical) and also number-of-active-calls connections to res_timing_dahdi - which all looks correct...

> If it is Asterisk, it's quite possible, even probable, that such a leak
> has been found and fixed, even in the 1.8 series. 1.8.11.0 is rather old -
> the latest is 1.8.32.3, so it would be best to update to that version and
> see if the problem persists.

Ok, I will have to consider that. The thing is the problem is not consistent - I can (for example) run 60 calls with no problems and no reported failures in opening files, then calls will -decrease- to about 40 and then later spike to 70, but around 50 calls I get the errors coming up thousands of times in the CLI, then suddenly stopping as the calls -increase- which doesn't make sense.
But this kind of behaviour does seem consistent with a possible leak.

SOMETHING NEW

I have now run

/usr/bin/prlimit --pid `pidof asterisk`

and I have noticed that even though I have 800 000 files specified, the ACTUAL limit in place on Asterisk for the number of open files is only 1024?!

# prlimit --pid `pidof asterisk`
RESOURCE   DESCRIPTION                          SOFT       HARD       UNITS
AS         address space limit                  unlimited  unlimited  bytes
CORE       max core file size                   unlimited  unlimited  blocks
CPU        CPU time                             unlimited  unlimited  seconds
DATA       max data size                        unlimited  unlimited  bytes
FSIZE      max file size                        unlimited  unlimited  blocks
LOCKS      max number of file locks held        unlimited  unlimited
MEMLOCK    max locked-in-memory address space   65536      65536      bytes
MSGQUEUE   max bytes in POSIX mqueues           819200     819200     bytes
NICE       max nice prio allowed to raise       0          0
NOFILE     max number of open files             1024       4096
NPROC      max number of processes              30861      30861
RSS        max resident set size                unlimited  unlimited  pages
RTPRIO     max real-time priority               0          0
RTTIME     timeout for real-time tasks          unlimited  unlimited  microsecs
SIGPENDING max number of pending signals        30861      30861
STACK      max stack size                       8388608    unlimited  bytes

Accordingly I have put this into a cronjob run each minute:

prlimit --pid `pidof asterisk` --nofile=786000:786000

to try and force the running binary to keep a high file limit (sources say to keep it less than the ACTUAL system file limit, in my case 800 000 files) on the live Asterisk process.

I'll see if this maybe helps - the above runs via cron each minute.

So it appears that for some reason the live running asterisk process "loses track" of how many open files it may have, or when it starts it somehow does not start with the correct maximum number of open files, as set in the system / kernel config?

Anyway, thank you for replying, I'll monitor this new "cronjob fixup" I'm trying and see if it helps.

No wonder it is complaining about running out of file handles if it was ACTUALLY only allowed 1024!

Kind regards

Stefan
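The limit the running process is actually under can also be read straight out of /proc. A minimal sketch (using the shell's own PID as a stand-in for `pidof asterisk`, which assumes a running daemon):

```shell
#!/bin/sh
# Extract the soft and hard "Max open files" limits for a process from
# /proc/<pid>/limits. $$ (this shell) stands in for `pidof asterisk`.
PID=$$
awk '/^Max open files/ { print "soft=" $4 " hard=" $5 }' "/proc/$PID/limits"
```

This shows the same NOFILE pair that prlimit reports, without needing the util-linux prlimit binary at all.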
Markus Weiler
2015-Aug-11 16:29 UTC
[asterisk-users] 786 000 files limit Centos 7 - Asterisk keep complaining
Hi Stefan,

we ran into a similar problem using Debian. There we are able to check the current limits using:

pidof asterisk
-> 23351
cat /proc/23351/limits

Output:

Limit           Soft Limit  Hard Limit  Units
Max open files  1024        1024        files

I think that in the end

/etc/security/limits.conf

* hard nofile 500000
* soft nofile 500000
root hard nofile 500000
root soft nofile 500000

did the trick. We also tried

vi /etc/sysctl.conf
fs.file-max = 500000

not sure what the solution in the end was. But I remember rebooting was important.

Markus

On 11.08.2015 at 11:00, Stefan Viljoen wrote:
> [...]
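Whether a reboot actually picked up those values can be checked by comparing the kernel-wide ceiling with the per-process limits. A small sketch (the values printed are whatever the current machine has, not necessarily the 500000 configured above):

```shell
#!/bin/sh
# Kernel-wide cap on open files (this is what fs.file-max in sysctl.conf sets):
cat /proc/sys/fs/file-max
# Per-process limits for this shell (this is what the nofile entries in
# /etc/security/limits.conf set, applied via PAM at login):
ulimit -Sn   # soft limit
ulimit -Hn   # hard limit
```

If the ulimit numbers still read 1024 after editing limits.conf, the session (or the whole box, as Markus found) has not picked the change up yet.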
Tony Mountifield
2015-Aug-11 16:50 UTC
[asterisk-users] 786 000 files limit Centos 7 - Asterisk keep complaining
In article <002b01d0d414$36af31b0$a40d9510$@verishare.co.za>,
Stefan Viljoen <viljoens at verishare.co.za> wrote:

>>> Anybody else ran into this?
>
>> No, but I would ask myself why so many file descriptors are being used.
>> It sounds like you have a file descriptor leak (not being closed when
>> finished with).
>
> Hi Tony
>
> Thanks for replying.
>
> I suspected something like that, though repeatedly running
>
> lsof | wc -l
>
> always stays quite low - 100 000 open files, which is still 8 times less
> than the system maximum as confirmed by running ulimit -n

From what you said below, the above is probably not relevant...

> SOMETHING NEW
>
> I have now run
>
> /usr/bin/prlimit --pid `pidof asterisk`
>
> and I have noticed that even though I have 800 000 files specified, the
> ACTUAL limit in place on Asterisk for the number of open files is only 1024?!

Yes, this is likely. Have a look in /usr/sbin/safe_asterisk, at the commented-out settings for SYSMAXFILES and MAXFILES, and try setting those.

Cheers
Tony
--
Tony Mountifield
Work: tony at softins.co.uk - http://www.softins.co.uk
Play: tony at mountifield.org - http://tony.mountifield.org
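For reference, the idea behind those safe_asterisk variables is that the wrapper shell raises its own per-process limit before launching the daemon, and the asterisk process inherits it. A hedged sketch of that mechanism (the MAXFILES value here is a small example, and a subshell is used so the demonstration is harmless; raising fs.file-max itself would additionally need root):

```shell
#!/bin/sh
# Sketch of what enabling MAXFILES in safe_asterisk effectively does: the
# wrapper raises its own open-file limit, and the asterisk it exec's
# inherits that limit. Demonstrated in a subshell.
MAXFILES=2048   # example value; Stefan's case would want something near 786000
(
  ulimit -n "$MAXFILES" 2>/dev/null || echo "could not raise soft limit past hard limit"
  echo "limit in wrapper shell: $(ulimit -n)"
)
```

This also explains the 1024 Stefan saw: with the settings commented out, the daemon simply inherits whatever limit the environment that started it happened to have.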
Steve Edwards
2015-Aug-11 16:57 UTC
[asterisk-users] 786 000 files limit Centos 7 - Asterisk keep complaining
On Tue, 11 Aug 2015, Stefan Viljoen wrote:

> I suspected something like that, though repeatedly running
>
> lsof | wc -l
>
> always stays quite low - 100 000 open files, which is still 8 times less
> than the system maximum as confirmed by running ulimit -n

What the 'h' are you doing that takes x00,000 open files?

I'm running Asterisk 11.17.0 on CentOS 6.7 and my 'numbers' seem insignificant by comparison.

sudo /usr/sbin/asterisk -r -x 'core show channels' | grep active
347 active channels
344 active calls

sudo lsof | wc -l
3945

sudo lsof | grep asterisk | wc -l
2161

--
Thanks in advance,
-------------------------------------------------------------------------
Steve Edwards       sedwards at sedwards.com      Voice: +1-760-468-3867 PST
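As an aside, a per-process count is much cheaper to obtain than piping all of lsof through wc, since each process's open descriptors are listed under /proc/<pid>/fd. A small sketch (again using the shell's own PID where `pidof asterisk` would go):

```shell
#!/bin/sh
# Count open descriptors for a single process by listing /proc/<pid>/fd.
# $$ stands in for `pidof asterisk`; reading another user's process needs
# the right permissions (hence the sudo in the lsof examples above).
PID=$$
ls "/proc/$PID/fd" | wc -l
```

This also sidesteps lsof's habit of emitting one line per task per descriptor, which can inflate system-wide counts.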