We have recently install an asterisk solution with about 60 physical extensions. While the system is running it runs reasonably well (Still have a few teething problems) but twice now they have experienced a degradation in voice quality and dropped calls and then finally asterisk completely crashes out. Restarting asterisk will work for a little while and it will crash again, each time less time will pass before a crash out. The first time I didn't have much logging so I didn't get anything to work with. I have since turned on debugging and following is the logs from the time of the last crash. Can anyone point out where the problem may lay, suggested updates or changes? Jul 4 11:56:51 DEBUG[20042] chan_sip.c: Auto destroying call 'aca7e8d7fc914018 at 192.168.12.164' Jul 4 11:56:54 DEBUG[20042] chan_sip.c: Stopping retransmission on '6eeb52b53a414a6975facbc22ca10686 at 192.168.10.12' of Request 102: Match Found Jul 4 11:56:55 VERBOSE[20050] logger.c: -- Accepting voice call from '' to '40312688' on channel 0/2, span 5 Jul 4 11:56:55 DEBUG[20050] chan_zap.c: Enabled echo cancellation on channel 14 Jul 4 11:56:55 DEBUG[20295] pbx.c: Function result is 'CID withheld' Jul 4 11:56:55 VERBOSE[20295] logger.c: -- Executing Set("Zap/14-1", "CALLERID(name)=Old Main Line:CID withheld") in new stack Jul 4 11:56:55 VERBOSE[20295] logger.c: -- Executing Goto("Zap/14-1", "mainq|q|1") in new stack Jul 4 11:56:55 VERBOSE[20295] logger.c: -- Goto (mainq,q,1) Jul 4 11:56:55 DEBUG[20295] pbx.c: Function result is 'false' Jul 4 11:56:55 VERBOSE[20295] logger.c: -- Executing Set("Zap/14-1", "NightMode=false") in new stack Jul 4 11:56:55 DEBUG[20295] pbx.c: Expression result is '0' Jul 4 11:56:55 VERBOSE[20295] logger.c: -- Executing GotoIf("Zap/14-1", "0?afterhoursq|q|1") in new stack Jul 4 11:56:55 DEBUG[20295] pbx.c: Not taking any branch Jul 4 11:56:55 VERBOSE[20295] logger.c: -- Executing GotoIfTime("Zap/14-1", "8:00-17:30|mon-fri|*|*|?businesshours") in new stack Jul 4 11:56:55 VERBOSE[20295] logger.c: -- Goto (mainq,q,5) Jul 4 11:56:55 VERBOSE[20295] logger.c: -- Executing Set("Zap/14-1", "__ALERT_INFO=<http://www.example.com>;info=MainQ") in new stack Jul 4 11:56:55 VERBOSE[20295] logger.c: -- Executing Queue("Zap/14-1", "mainq1|twr|||10") in new stack Jul 4 11:56:55 DEBUG[20295] chan_zap.c: Requested indication 3 on channel Zap/14-1 Jul 4 11:56:55 VERBOSE[20295] logger.c: -- Called Local/700 at callagents Jul 4 11:56:55 VERBOSE[20298] logger.c: -- Executing Set("Local/700 at callagents-bc5a,2", "Extension=700") in new stack Jul 4 11:56:55 VERBOSE[20298] logger.c: -- Executing Set("Local/700 at callagents-bc5a,2", "__ALERT_INFO=<http://www.example.com>;info=MainQ") in new stack Jul 4 11:56:55 VERBOSE[20298] logger.c: -- Executing Dial("Local/700 at callagents-bc5a,2", "SIP/700||tw") in new stack Jul 4 11:56:55 DEBUG[20298] chan_sip.c: Setting NAT on RTP to 524288 Jul 4 11:56:55 DEBUG[20298] chan_sip.c: Setting NAT on VRTP to 524288 Jul 4 11:56:55 DEBUG[20298] chan_sip.c: Outgoing Call for 700 Jul 4 11:56:55 VERBOSE[20298] logger.c: -- Called 700 Jul 4 11:56:55 DEBUG[20042] chan_sip.c: (Provisional) Stopping retransmission (but retaining packet) on '1dc479bd153590ff41b7f77817106b18 at 192.168.0.12' Request 102: Found Jul 4 11:56:55 DEBUG[20042] chan_sip.c: (Provisional) Stopping retransmission (but retaining packet) on '1dc479bd153590ff41b7f77817106b18 at 192.168.0.12' Request 102: Found Jul 4 11:56:55 VERBOSE[20298] logger.c: -- SIP/700-09530a90 is ringing Jul 4 11:56:55 VERBOSE[20295] logger.c: -- Local/700 at callagents-bc5a,1 is ringing Jul 4 11:56:56 DEBUG[20042] chan_sip.c: Auto destroying call '4c2e1191399110af402dcc5b2c48303a at 192.168.10.12' Jul 4 11:56:59 DEBUG[20042] chan_sip.c: Acked pending invite 102 Jul 4 11:56:59 DEBUG[20042] chan_sip.c: Stopping retransmission on '1dc479bd153590ff41b7f77817106b18 at 192.168.0.12' of Request 102: Match Found Jul 4 11:56:59 DEBUG[20042] chan_sip.c: build_route: Contact hop: <sip:700 at 192.168.0.64:5060;transport=udp;user=phone> Jul 4 11:56:59 DEBUG[20027] channel.c: Avoiding initial deadlock for 'SIP/700-09530a90' Jul 4 11:56:59 VERBOSE[20298] logger.c: -- SIP/700-09530a90 answered Local/700 at callagents-bc5a,2 Jul 4 11:56:59 DEBUG[20295] app_queue.c: Dunno what to do with control type -1 Jul 4 11:56:59 VERBOSE[20295] logger.c: -- Local/700 at callagents-bc5a,1 answered Zap/14-1 Jul 4 11:56:59 DEBUG[20295] chan_zap.c: Set option TONE VERIFY, mode: MUTECONF(1) on Zap/14-1 Jul 4 11:56:59 DEBUG[20295] chan_zap.c: No echo training requested Jul 4 11:56:59 DEBUG[20298] channel.c: Planning to masquerade channel SIP/700-09530a90 into the structure of Local/700 at callagents-bc5a,1 Jul 4 11:56:59 DEBUG[20298] channel.c: Done planning to masquerade channel SIP/700-09530a90 into the structure of Local/700 at callagents-bc5a,1 Jul 4 11:56:59 DEBUG[20295] channel.c: Got clone lock for masquerade on 'SIP/700-09530a90' at 0x952ab64 Jul 4 11:56:59 DEBUG[20298] chan_local.c: Not posting to queue since already masked on 'Local/700 at callagents-bc5a,2' Jul 4 11:56:59 DEBUG[20295] channel.c: Putting channel SIP/700-09530a90 in 64/64 formats Jul 4 11:56:59 DEBUG[20295] channel.c: Released clone lock on 'Local/700 at callagents-bc5a,1<ZOMBIE>' Jul 4 11:56:59 DEBUG[20295] channel.c: Done Masquerading SIP/700-09530a90 (6) The last entry was just before the crash. We also have this in dmesg (Not sure if its related) NOTICE-xpd_bri: XBUS-01/XPD-10: D-Chan RX Bad checksum: [FE:28=FC] (252) NOTICE-xpd_bri: XBUS-01/XPD-10: Multibyte Drop: errno=-71 BUG: warning at kernel/softirq.c:138/local_bh_enable() (Not tainted) [<c042b0cf>] local_bh_enable+0x45/0x92 [<c06002bd>] cond_resched_softirq+0x2c/0x42 [<c059adf3>] release_sock+0x4f/0x9d [<c05c670d>] tcp_sendmsg+0x90b/0x9f9 [<c059adb6>] release_sock+0x12/0x9d [<c05c7755>] tcp_recvmsg+0x8d2/0x9de [<c04808dc>] core_sys_select+0x218/0x2f3 [<c05dec95>] inet_sendmsg+0x3b/0x45 [<c0598731>] sock_aio_write+0xf6/0x102 [<c045da36>] get_page_from_freelist+0x23b/0x2a5 [<c04754ee>] do_sync_write+0xc7/0x10a [<c0436e71>] autoremove_wake_function+0x0/0x35 [<c0475d47>] vfs_write+0xbc/0x154 [<c0476342>] sys_write+0x41/0x67 [<c0404f70>] syscall_call+0x7/0xb ======================NOTICE-xpd_bri: XBUS-01/XPD-08: D-Chan RX Bad checksum: [FA:FA=FC] (252) NOTICE-xpd_bri: XBUS-01/XPD-08: Multibyte Drop: errno=-71 NOTICE-xpd_bri: XBUS-00/XPD-00: D-Chan RX Bad checksum: [78:3A=FD] (253) NOTICE-xpd_bri: XBUS-00/XPD-00: Multibyte Drop: errno=-71 NOTICE-xpd_bri: XBUS-00/XPD-00: D-Chan RX Bad checksum: [55:55=FD] (253) NOTICE-xpd_bri: XBUS-00/XPD-00: Multibyte Drop: errno=-71 NOTICE-xpd_bri: XBUS-00/XPD-00: D-Chan RX Bad checksum: [55:55=FD] (253) NOTICE-xpd_bri: XBUS-00/XPD-00: Multibyte Drop: errno=-71 NOTICE-xpd_bri: XBUS-00/XPD-00: D-Chan RX Bad checksum: [55:55=FD] (253) NOTICE-xpd_bri: XBUS-00/XPD-00: Multibyte Drop: errno=-71 NOTICE-xpd_bri: XBUS-00/XPD-00: D-Chan RX Bad checksum: [AA:AA=FD] (253) NOTICE-xpd_bri: XBUS-00/XPD-00: Multibyte Drop: errno=-71 NOTICE-xpd_bri: XBUS-00/XPD-10: D-Chan RX Bad checksum: [35:22=FC] (252) NOTICE-xpd_bri: XBUS-00/XPD-10: Multibyte Drop: errno=-71 NOTICE-xpd_bri: XBUS-00/XPD-10: D-Chan RX short frame (idx=3) NOTICE-xpd_bri: XBUS-00/XPD-10: Multibyte Drop: errno=-71 NOTICE-xpd_bri: XBUS-00/XPD-10: D-Chan RX Bad checksum: [35:22=FC] (252) NOTICE-xpd_bri: XBUS-00/XPD-10: Multibyte Drop: errno=-71 NOTICE-xpd_bri: XBUS-00/XPD-10: D-Chan RX Bad checksum: [22:22=FC] (252) NOTICE-xpd_bri: XBUS-00/XPD-10: Multibyte Drop: errno=-71 NOTICE-xpd_bri: XBUS-00/XPD-10: D-Chan RX Bad checksum: [55:02=FC] (252) NOTICE-xpd_bri: XBUS-00/XPD-10: Multibyte Drop: errno=-71 zaptel Disabled echo canceller because of tone (rx) on channel 4 NOTICE-xpd_bri: XBUS-00/XPD-00: D-Chan RX Bad checksum: [94:2D=FD] (253) NOTICE-xpd_bri: XBUS-00/XPD-00: Multibyte Drop: errno=-71 Once the system becomes unstable the only way to get it up again is to shutdown (not restart) pull the power and USB on the Xorcom Bri 4 devices. Then plug them back in and start the system up. If the power and USB is not disconnected a the devices may look like they are working fine but zaptel will not start stating that it can not find span ** what ever it happens to fail on. We are running the following versions Asterisk - Asterisk 1.2.18-BRIstuffed-0.3.0-PRE-1y-g Zaptel - zaptel-1.2.18 XPP - version: trunk-r3965 srcversion: 723B8A27A7E9750BB039D00 If you need any more information please let me know. Nathan Dennis __________________________________________________________ Integrated Solutions (QLD)P/L Phone: +61 (7) 4044 0300 Direct: +61 (7) 4044 0302 124 Spence Street Fax: +61 (7) 4041 6600 CAIRNS QLD 4868 Mobile: 0418 608609 Australia E-mail: nathan.dennis at i-solutions.net.au <mailto:nathan.dennis at i-solutions.net.au> Web Site: www.i-solutions.net.au <http://www.i-solutions.net.au/> Offices and agents in Cairns - Brisbane - Melbourne -- Adelaide -- Sydney __________________________________________________________ The information transmitted is intended only for the person or entity to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipient is prohibited. If you received this in error, please contact the sender and delete the material from any computer. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.digium.com/pipermail/asterisk-users/attachments/20070705/8870aa44/attachment.htm
On Thu, Jul 05, 2007 at 09:14:12AM +1000, Nathan Dennis wrote:> We have recently install an asterisk solution with about 60 physical > extensions. While the system is running it runs reasonably well (Still > have a few teething problems) but twice now they have experienced a > degradation in voice quality and dropped calls and then finally asterisk > completely crashes out. Restarting asterisk will work for a little while > and it will crash again, each time less time will pass before a crash > out. The first time I didn't have much logging so I didn't get anything > to work with. I have since turned on debugging and following is the logs > from the time of the last crash. Can anyone point out where the problem > may lay, suggested updates or changes? > > > Jul 4 11:56:51 DEBUG[20042] chan_sip.c: Auto destroying call > 'aca7e8d7fc914018 at 192.168.12.164' > Jul 4 11:56:54 DEBUG[20042] chan_sip.c: Stopping retransmission on > '6eeb52b53a414a6975facbc22ca10686 at 192.168.10.12' of Request 102: Match > Found > Jul 4 11:56:55 VERBOSE[20050] logger.c: -- Accepting voice call > from '' to '40312688' on channel 0/2, span 5 > Jul 4 11:56:55 DEBUG[20050] chan_zap.c: Enabled echo cancellation on > channel 14 > Jul 4 11:56:55 DEBUG[20295] pbx.c: Function result is 'CID withheld' > Jul 4 11:56:55 VERBOSE[20295] logger.c: -- Executing > Set("Zap/14-1", "CALLERID(name)=Old Main Line:CID withheld") in new > stack > Jul 4 11:56:55 VERBOSE[20295] logger.c: -- Executing > Goto("Zap/14-1", "mainq|q|1") in new stack > Jul 4 11:56:55 VERBOSE[20295] logger.c: -- Goto (mainq,q,1) > Jul 4 11:56:55 DEBUG[20295] pbx.c: Function result is 'false' > Jul 4 11:56:55 VERBOSE[20295] logger.c: -- Executing > Set("Zap/14-1", "NightMode=false") in new stack > Jul 4 11:56:55 DEBUG[20295] pbx.c: Expression result is '0' > Jul 4 11:56:55 VERBOSE[20295] logger.c: -- Executing > GotoIf("Zap/14-1", "0?afterhoursq|q|1") in new stack > Jul 4 11:56:55 DEBUG[20295] pbx.c: Not taking any branch > Jul 4 11:56:55 VERBOSE[20295] logger.c: -- Executing > GotoIfTime("Zap/14-1", "8:00-17:30|mon-fri|*|*|?businesshours") in new > stack > Jul 4 11:56:55 VERBOSE[20295] logger.c: -- Goto (mainq,q,5) > Jul 4 11:56:55 VERBOSE[20295] logger.c: -- Executing > Set("Zap/14-1", "__ALERT_INFO=<http://www.example.com>;info=MainQ") in > new stack > Jul 4 11:56:55 VERBOSE[20295] logger.c: -- Executing > Queue("Zap/14-1", "mainq1|twr|||10") in new stack > Jul 4 11:56:55 DEBUG[20295] chan_zap.c: Requested indication 3 on > channel Zap/14-1 > Jul 4 11:56:55 VERBOSE[20295] logger.c: -- Called > Local/700 at callagents > Jul 4 11:56:55 VERBOSE[20298] logger.c: -- Executing > Set("Local/700 at callagents-bc5a,2", "Extension=700") in new stack > Jul 4 11:56:55 VERBOSE[20298] logger.c: -- Executing > Set("Local/700 at callagents-bc5a,2", > "__ALERT_INFO=<http://www.example.com>;info=MainQ") in new stack > Jul 4 11:56:55 VERBOSE[20298] logger.c: -- Executing > Dial("Local/700 at callagents-bc5a,2", "SIP/700||tw") in new stack > Jul 4 11:56:55 DEBUG[20298] chan_sip.c: Setting NAT on RTP to 524288 > Jul 4 11:56:55 DEBUG[20298] chan_sip.c: Setting NAT on VRTP to 524288 > Jul 4 11:56:55 DEBUG[20298] chan_sip.c: Outgoing Call for 700 > Jul 4 11:56:55 VERBOSE[20298] logger.c: -- Called 700 > Jul 4 11:56:55 DEBUG[20042] chan_sip.c: (Provisional) Stopping > retransmission (but retaining packet) on > '1dc479bd153590ff41b7f77817106b18 at 192.168.0.12' Request 102: Found > Jul 4 11:56:55 DEBUG[20042] chan_sip.c: (Provisional) Stopping > retransmission (but retaining packet) on > '1dc479bd153590ff41b7f77817106b18 at 192.168.0.12' Request 102: Found > Jul 4 11:56:55 VERBOSE[20298] logger.c: -- SIP/700-09530a90 is > ringing > Jul 4 11:56:55 VERBOSE[20295] logger.c: -- > Local/700 at callagents-bc5a,1 is ringing > Jul 4 11:56:56 DEBUG[20042] chan_sip.c: Auto destroying call > '4c2e1191399110af402dcc5b2c48303a at 192.168.10.12' > Jul 4 11:56:59 DEBUG[20042] chan_sip.c: Acked pending invite 102 > Jul 4 11:56:59 DEBUG[20042] chan_sip.c: Stopping retransmission on > '1dc479bd153590ff41b7f77817106b18 at 192.168.0.12' of Request 102: Match > Found > Jul 4 11:56:59 DEBUG[20042] chan_sip.c: build_route: Contact hop: > <sip:700 at 192.168.0.64:5060;transport=udp;user=phone> > Jul 4 11:56:59 DEBUG[20027] channel.c: Avoiding initial deadlock for > 'SIP/700-09530a90' > Jul 4 11:56:59 VERBOSE[20298] logger.c: -- SIP/700-09530a90 > answered Local/700 at callagents-bc5a,2 > Jul 4 11:56:59 DEBUG[20295] app_queue.c: Dunno what to do with control > type -1 > Jul 4 11:56:59 VERBOSE[20295] logger.c: -- > Local/700 at callagents-bc5a,1 answered Zap/14-1 > Jul 4 11:56:59 DEBUG[20295] chan_zap.c: Set option TONE VERIFY, mode: > MUTECONF(1) on Zap/14-1 > Jul 4 11:56:59 DEBUG[20295] chan_zap.c: No echo training requested > Jul 4 11:56:59 DEBUG[20298] channel.c: Planning to masquerade channel > SIP/700-09530a90 into the structure of Local/700 at callagents-bc5a,1 > Jul 4 11:56:59 DEBUG[20298] channel.c: Done planning to masquerade > channel SIP/700-09530a90 into the structure of > Local/700 at callagents-bc5a,1 > Jul 4 11:56:59 DEBUG[20295] channel.c: Got clone lock for masquerade on > 'SIP/700-09530a90' at 0x952ab64 > Jul 4 11:56:59 DEBUG[20298] chan_local.c: Not posting to queue since > already masked on 'Local/700 at callagents-bc5a,2' > Jul 4 11:56:59 DEBUG[20295] channel.c: Putting channel SIP/700-09530a90 > in 64/64 formats > Jul 4 11:56:59 DEBUG[20295] channel.c: Released clone lock on > 'Local/700 at callagents-bc5a,1<ZOMBIE>' > Jul 4 11:56:59 DEBUG[20295] channel.c: Done Masquerading > SIP/700-09530a90 (6) > > The last entry was just before the crash. > > We also have this in dmesg (Not sure if its related) > > NOTICE-xpd_bri: XBUS-01/XPD-10: D-Chan RX Bad checksum: [FE:28=FC] (252) > NOTICE-xpd_bri: XBUS-01/XPD-10: Multibyte Drop: errno=-71 > BUG: warning at kernel/softirq.c:138/local_bh_enable() (Not tainted) > [<c042b0cf>] local_bh_enable+0x45/0x92 > [<c06002bd>] cond_resched_softirq+0x2c/0x42 > [<c059adf3>] release_sock+0x4f/0x9d > [<c05c670d>] tcp_sendmsg+0x90b/0x9f9 > [<c059adb6>] release_sock+0x12/0x9d > [<c05c7755>] tcp_recvmsg+0x8d2/0x9de > [<c04808dc>] core_sys_select+0x218/0x2f3 > [<c05dec95>] inet_sendmsg+0x3b/0x45 > [<c0598731>] sock_aio_write+0xf6/0x102 > [<c045da36>] get_page_from_freelist+0x23b/0x2a5 > [<c04754ee>] do_sync_write+0xc7/0x10a > [<c0436e71>] autoremove_wake_function+0x0/0x35 > [<c0475d47>] vfs_write+0xbc/0x154 > [<c0476342>] sys_write+0x41/0x67 > [<c0404f70>] syscall_call+0x7/0xbThis stack trace is from the networking area. Any chance you could get a more complete trace? A quick search brings out the following Fedora bug: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=240982 So if this is a Fedora system, maybe try latest kernel. That bug is marked "closed" there.> ======================> NOTICE-xpd_bri: XBUS-01/XPD-08: D-Chan RX Bad checksum: [FA:FA=FC] (252) > NOTICE-xpd_bri: XBUS-01/XPD-08: Multibyte Drop: errno=-71 > NOTICE-xpd_bri: XBUS-00/XPD-00: D-Chan RX Bad checksum: [78:3A=FD] (253) > NOTICE-xpd_bri: XBUS-00/XPD-00: Multibyte Drop: errno=-71 > NOTICE-xpd_bri: XBUS-00/XPD-00: D-Chan RX Bad checksum: [55:55=FD] (253) > NOTICE-xpd_bri: XBUS-00/XPD-00: Multibyte Drop: errno=-71 > NOTICE-xpd_bri: XBUS-00/XPD-00: D-Chan RX Bad checksum: [55:55=FD] (253) > NOTICE-xpd_bri: XBUS-00/XPD-00: Multibyte Drop: errno=-71 > NOTICE-xpd_bri: XBUS-00/XPD-00: D-Chan RX Bad checksum: [55:55=FD] (253) > NOTICE-xpd_bri: XBUS-00/XPD-00: Multibyte Drop: errno=-71 > NOTICE-xpd_bri: XBUS-00/XPD-00: D-Chan RX Bad checksum: [AA:AA=FD] (253) > NOTICE-xpd_bri: XBUS-00/XPD-00: Multibyte Drop: errno=-71 > NOTICE-xpd_bri: XBUS-00/XPD-10: D-Chan RX Bad checksum: [35:22=FC] (252) > NOTICE-xpd_bri: XBUS-00/XPD-10: Multibyte Drop: errno=-71 > NOTICE-xpd_bri: XBUS-00/XPD-10: D-Chan RX short frame (idx=3) > NOTICE-xpd_bri: XBUS-00/XPD-10: Multibyte Drop: errno=-71 > NOTICE-xpd_bri: XBUS-00/XPD-10: D-Chan RX Bad checksum: [35:22=FC] (252) > NOTICE-xpd_bri: XBUS-00/XPD-10: Multibyte Drop: errno=-71 > NOTICE-xpd_bri: XBUS-00/XPD-10: D-Chan RX Bad checksum: [22:22=FC] (252) > NOTICE-xpd_bri: XBUS-00/XPD-10: Multibyte Drop: errno=-71 > NOTICE-xpd_bri: XBUS-00/XPD-10: D-Chan RX Bad checksum: [55:02=FC] (252) > NOTICE-xpd_bri: XBUS-00/XPD-10: Multibyte Drop: errno=-71 > zaptel Disabled echo canceller because of tone (rx) on channel 4 > NOTICE-xpd_bri: XBUS-00/XPD-00: D-Chan RX Bad checksum: [94:2D=FD] (253) > NOTICE-xpd_bri: XBUS-00/XPD-00: Multibyte Drop: errno=-71I believe that those messages can also be caused by Asterisk suddenly dropping out with parts of the driver still working. But I'm not sure.> > > Once the system becomes unstable the only way to get it up again is to > shutdown (not restart) pull the power and USB on the Xorcom Bri 4 > devices.Are you sure a reset of the Astribank is needed? And if so: can you try: rmmod xpp_usb; /usr/share/zaptel/xpp_fxloader reset instead?> Then plug them back in and start the system up. > If the power and USB is not disconnected a the devices may look like > they are working fine but zaptel will not start stating that it can not > find span ** what ever it happens to fail on. > > > We are running the following versions > Asterisk - Asterisk 1.2.18-BRIstuffed-0.3.0-PRE-1y-g > Zaptel - zaptel-1.2.18 > XPP - version: trunk-r3965 > srcversion: 723B8A27A7E9750BB039D00What distribution? What kernel version?> > If you need any more information please let me know. > > > > > Nathan Dennis > __________________________________________________________ > Integrated Solutions (QLD)P/L Phone: +61 (7) 4044 0300 > Direct: +61 (7) 4044 > 0302 > 124 Spence Street Fax: +61 (7) 4041 6600 > CAIRNS QLD 4868 Mobile: 0418 608609 > > Australia > > E-mail: nathan.dennis at i-solutions.net.au > <mailto:nathan.dennis at i-solutions.net.au> > Web Site: www.i-solutions.net.au <http://www.i-solutions.net.au/> > > Offices and agents in Cairns - Brisbane - Melbourne -- Adelaide -- > Sydney > __________________________________________________________ > The information transmitted is intended only for the person or entity to > which > it is addressed and may contain confidential and/or privileged material. > > Any review, retransmission, dissemination or other use of, or taking of > any > action in reliance upon, this information by persons or entities other > than the > intended recipient is prohibited. If you received this in error, please > contact > the sender and delete the material from any computer. >> _______________________________________________ > --Bandwidth and Colocation Provided by http://www.api-digital.com-- > > asterisk-users mailing list > To UNSUBSCRIBE or update options visit: > http://lists.digium.com/mailman/listinfo/asterisk-users-- Tzafrir Cohen icq#16849755 jabber:tzafrir at jabber.org +972-50-7952406 mailto:tzafrir.cohen at xorcom.com http://www.xorcom.com iax:guest at local.xorcom.com/tzafrir
Thanks for the input, but I still don't seem to have any luck with the devices locking up. I've even rebuilt a new system on new hardware and a new xorcom device but still no good. Once the device locks up that's it the only way to get zaptel and asterisk back up is to turn them off and restart the server. The command you have me rmmod xpp_usb; /usr/share/zaptel/xpp_fxloader reset Works great and will reload the firmware as long as the devices are frozen. Once they lock up this command will not reload the firmware and brings up the following errors. 'xpp_fxloader'[16065]: Resetting FPGA Firmware on /dev/bus/usb/005/016 'xpp_fxloader'[16067]: /usr/sbin/fpga_load: ERROR (247): bulk_write failed: error reaping URB: No such device 'xpp_fxloader'[16067]: /usr/sbin/fpga_load: ERROR (810): Renumeration to default failed: errno=-19 'xpp_fxloader'[16067]: /usr/sbin/fpga_load: ERROR (203): Releasing interface: usb: could not release intf 0: No such device 'xpp_fxloader'[16342]: fpga_load failed remoivng with status 237 I'm running Fedora 7, Kernel 2.6.21-1.3194.fc7 Will hopefully be upgrading the kernel tonight if I can get some downtime to do so. As for more traces, I can do that, but being reasonably new to this I will need some help getting them for you. On Thu, Jul 05, 2007 at 09:14:12AM +1000, Nathan Dennis wrote:> We have recently install an asterisk solution with about 60 physical > extensions. While the system is running it runs reasonably well (Still > have a few teething problems) but twice now they have experienced a > degradation in voice quality and dropped calls and then finallyasterisk> completely crashes out. Restarting asterisk will work for a littlewhile> and it will crash again, each time less time will pass before a crash > out. The first time I didn't have much logging so I didn't getanything> to work with. I have since turned on debugging and following is thelogs> from the time of the last crash. Can anyone point out where theproblem> may lay, suggested updates or changes? > > > Jul 4 11:56:51 DEBUG[20042] chan_sip.c: Auto destroying call > 'aca7e8d7fc914018 at 192.168.12.164<http://lists.digium.com/mailman/listinfo/asterisk-users> '> Jul 4 11:56:54 DEBUG[20042] chan_sip.c: Stopping retransmission on > '6eeb52b53a414a6975facbc22ca10686 at 192.168.10.12<http://lists.digium.com/mailman/listinfo/asterisk-users> ' of Request 102: Match> Found > Jul 4 11:56:55 VERBOSE[20050] logger.c: -- Accepting voice call > from '' to '40312688' on channel 0/2, span 5 > Jul 4 11:56:55 DEBUG[20050] chan_zap.c: Enabled echo cancellation on > channel 14 > Jul 4 11:56:55 DEBUG[20295] pbx.c: Function result is 'CID withheld' > Jul 4 11:56:55 VERBOSE[20295] logger.c: -- Executing > Set("Zap/14-1", "CALLERID(name)=Old Main Line:CID withheld") in new > stack > Jul 4 11:56:55 VERBOSE[20295] logger.c: -- Executing > Goto("Zap/14-1", "mainq|q|1") in new stack > Jul 4 11:56:55 VERBOSE[20295] logger.c: -- Goto (mainq,q,1) > Jul 4 11:56:55 DEBUG[20295] pbx.c: Function result is 'false' > Jul 4 11:56:55 VERBOSE[20295] logger.c: -- Executing > Set("Zap/14-1", "NightMode=false") in new stack > Jul 4 11:56:55 DEBUG[20295] pbx.c: Expression result is '0' > Jul 4 11:56:55 VERBOSE[20295] logger.c: -- Executing > GotoIf("Zap/14-1", "0?afterhoursq|q|1") in new stack > Jul 4 11:56:55 DEBUG[20295] pbx.c: Not taking any branch > Jul 4 11:56:55 VERBOSE[20295] logger.c: -- Executing > GotoIfTime("Zap/14-1", "8:00-17:30|mon-fri|*|*|?businesshours") in new > stack > Jul 4 11:56:55 VERBOSE[20295] logger.c: -- Goto (mainq,q,5) > Jul 4 11:56:55 VERBOSE[20295] logger.c: -- Executing > Set("Zap/14-1", "__ALERT_INFO=<http://www.example.com<http://www.example.com/> >;info=MainQ") in> new stack > Jul 4 11:56:55 VERBOSE[20295] logger.c: -- Executing > Queue("Zap/14-1", "mainq1|twr|||10") in new stack > Jul 4 11:56:55 DEBUG[20295] chan_zap.c: Requested indication 3 on > channel Zap/14-1 > Jul 4 11:56:55 VERBOSE[20295] logger.c: -- Called > Local/700 at callagents<http://lists.digium.com/mailman/listinfo/asterisk-users>> Jul 4 11:56:55 VERBOSE[20298] logger.c: -- Executing > Set("Local/700 at callagents-bc5a<http://lists.digium.com/mailman/listinfo/asterisk-users> ,2", "Extension=700") in new stack> Jul 4 11:56:55 VERBOSE[20298] logger.c: -- Executing > Set("Local/700 at callagents-bc5a<http://lists.digium.com/mailman/listinfo/asterisk-users> ,2",> "__ALERT_INFO=<http://www.example.com <http://www.example.com/> >;info=MainQ") in new stack > Jul 4 11:56:55 VERBOSE[20298] logger.c: -- Executing > Dial("Local/700 at callagents-bc5a<http://lists.digium.com/mailman/listinfo/asterisk-users> ,2", "SIP/700||tw") in new stack> Jul 4 11:56:55 DEBUG[20298] chan_sip.c: Setting NAT on RTP to 524288 > Jul 4 11:56:55 DEBUG[20298] chan_sip.c: Setting NAT on VRTP to 524288 > Jul 4 11:56:55 DEBUG[20298] chan_sip.c: Outgoing Call for 700 > Jul 4 11:56:55 VERBOSE[20298] logger.c: -- Called 700 > Jul 4 11:56:55 DEBUG[20042] chan_sip.c: (Provisional) Stopping > retransmission (but retaining packet) on > '1dc479bd153590ff41b7f77817106b18 at 192.168.0.12<http://lists.digium.com/mailman/listinfo/asterisk-users> ' Request 102: Found> Jul 4 11:56:55 DEBUG[20042] chan_sip.c: (Provisional) Stopping > retransmission (but retaining packet) on > '1dc479bd153590ff41b7f77817106b18 at 192.168.0.12<http://lists.digium.com/mailman/listinfo/asterisk-users> ' Request 102: Found> Jul 4 11:56:55 VERBOSE[20298] logger.c: -- SIP/700-09530a90 is > ringing > Jul 4 11:56:55 VERBOSE[20295] logger.c: -- > Local/700 at callagents-bc5a<http://lists.digium.com/mailman/listinfo/asterisk-users> ,1 is ringing> Jul 4 11:56:56 DEBUG[20042] chan_sip.c: Auto destroying call > '4c2e1191399110af402dcc5b2c48303a at 192.168.10.12<http://lists.digium.com/mailman/listinfo/asterisk-users> '> Jul 4 11:56:59 DEBUG[20042] chan_sip.c: Acked pending invite 102 > Jul 4 11:56:59 DEBUG[20042] chan_sip.c: Stopping retransmission on > '1dc479bd153590ff41b7f77817106b18 at 192.168.0.12<http://lists.digium.com/mailman/listinfo/asterisk-users> ' of Request 102: Match> Found > Jul 4 11:56:59 DEBUG[20042] chan_sip.c: build_route: Contact hop: > <sip:700 at 192.168.0.64<http://lists.digium.com/mailman/listinfo/asterisk-users> :5060;transport=udp;user=phone>> Jul 4 11:56:59 DEBUG[20027] channel.c: Avoiding initial deadlock for > 'SIP/700-09530a90' > Jul 4 11:56:59 VERBOSE[20298] logger.c: -- SIP/700-09530a90 > answered Local/700 at callagents-bc5a<http://lists.digium.com/mailman/listinfo/asterisk-users> ,2> Jul 4 11:56:59 DEBUG[20295] app_queue.c: Dunno what to do withcontrol> type -1 > Jul 4 11:56:59 VERBOSE[20295] logger.c: -- > Local/700 at callagents-bc5a<http://lists.digium.com/mailman/listinfo/asterisk-users> ,1 answered Zap/14-1> Jul 4 11:56:59 DEBUG[20295] chan_zap.c: Set option TONE VERIFY, mode: > MUTECONF(1) on Zap/14-1 > Jul 4 11:56:59 DEBUG[20295] chan_zap.c: No echo training requested > Jul 4 11:56:59 DEBUG[20298] channel.c: Planning to masquerade channel > SIP/700-09530a90 into the structure of Local/700 at callagents-bc5a<http://lists.digium.com/mailman/listinfo/asterisk-users> ,1> Jul 4 11:56:59 DEBUG[20298] channel.c: Done planning to masquerade > channel SIP/700-09530a90 into the structure of > Local/700 at callagents-bc5a<http://lists.digium.com/mailman/listinfo/asterisk-users> ,1> Jul 4 11:56:59 DEBUG[20295] channel.c: Got clone lock for masqueradeon> 'SIP/700-09530a90' at 0x952ab64 > Jul 4 11:56:59 DEBUG[20298] chan_local.c: Not posting to queue since > already masked on 'Local/700 at callagents-bc5a<http://lists.digium.com/mailman/listinfo/asterisk-users> ,2'> Jul 4 11:56:59 DEBUG[20295] channel.c: Putting channelSIP/700-09530a90> in 64/64 formats > Jul 4 11:56:59 DEBUG[20295] channel.c: Released clone lock on > 'Local/700 at callagents-bc5a<http://lists.digium.com/mailman/listinfo/asterisk-users> ,1<ZOMBIE>'> Jul 4 11:56:59 DEBUG[20295] channel.c: Done Masquerading > SIP/700-09530a90 (6) > > The last entry was just before the crash. > > We also have this in dmesg (Not sure if its related) > > NOTICE-xpd_bri: XBUS-01/XPD-10: D-Chan RX Bad checksum: [FE:28=FC](252)> NOTICE-xpd_bri: XBUS-01/XPD-10: Multibyte Drop: errno=-71 > BUG: warning at kernel/softirq.c:138/local_bh_enable() (Not tainted) > [<c042b0cf>] local_bh_enable+0x45/0x92 > [<c06002bd>] cond_resched_softirq+0x2c/0x42 > [<c059adf3>] release_sock+0x4f/0x9d > [<c05c670d>] tcp_sendmsg+0x90b/0x9f9 > [<c059adb6>] release_sock+0x12/0x9d > [<c05c7755>] tcp_recvmsg+0x8d2/0x9de > [<c04808dc>] core_sys_select+0x218/0x2f3 > [<c05dec95>] inet_sendmsg+0x3b/0x45 > [<c0598731>] sock_aio_write+0xf6/0x102 > [<c045da36>] get_page_from_freelist+0x23b/0x2a5 > [<c04754ee>] do_sync_write+0xc7/0x10a > [<c0436e71>] autoremove_wake_function+0x0/0x35 > [<c0475d47>] vfs_write+0xbc/0x154 > [<c0476342>] sys_write+0x41/0x67 > [<c0404f70>] syscall_call+0x7/0xbThis stack trace is from the networking area. Any chance you could get a more complete trace? A quick search brings out the following Fedora bug: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=240982 <https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=240982> So if this is a Fedora system, maybe try latest kernel. That bug is marked "closed" there.> ======================> NOTICE-xpd_bri: XBUS-01/XPD-08: D-Chan RX Bad checksum: [FA:FA=FC](252)> NOTICE-xpd_bri: XBUS-01/XPD-08: Multibyte Drop: errno=-71 > NOTICE-xpd_bri: XBUS-00/XPD-00: D-Chan RX Bad checksum: [78:3A=FD](253)> NOTICE-xpd_bri: XBUS-00/XPD-00: Multibyte Drop: errno=-71 > NOTICE-xpd_bri: XBUS-00/XPD-00: D-Chan RX Bad checksum: [55:55=FD](253)> NOTICE-xpd_bri: XBUS-00/XPD-00: Multibyte Drop: errno=-71 > NOTICE-xpd_bri: XBUS-00/XPD-00: D-Chan RX Bad checksum: [55:55=FD](253)> NOTICE-xpd_bri: XBUS-00/XPD-00: Multibyte Drop: errno=-71 > NOTICE-xpd_bri: XBUS-00/XPD-00: D-Chan RX Bad checksum: [55:55=FD](253)> NOTICE-xpd_bri: XBUS-00/XPD-00: Multibyte Drop: errno=-71 > NOTICE-xpd_bri: XBUS-00/XPD-00: D-Chan RX Bad checksum: [AA:AA=FD](253)> NOTICE-xpd_bri: XBUS-00/XPD-00: Multibyte Drop: errno=-71 > NOTICE-xpd_bri: XBUS-00/XPD-10: D-Chan RX Bad checksum: [35:22=FC](252)> NOTICE-xpd_bri: XBUS-00/XPD-10: Multibyte Drop: errno=-71 > NOTICE-xpd_bri: XBUS-00/XPD-10: D-Chan RX short frame (idx=3) > NOTICE-xpd_bri: XBUS-00/XPD-10: Multibyte Drop: errno=-71 > NOTICE-xpd_bri: XBUS-00/XPD-10: D-Chan RX Bad checksum: [35:22=FC](252)> NOTICE-xpd_bri: XBUS-00/XPD-10: Multibyte Drop: errno=-71 > NOTICE-xpd_bri: XBUS-00/XPD-10: D-Chan RX Bad checksum: [22:22=FC](252)> NOTICE-xpd_bri: XBUS-00/XPD-10: Multibyte Drop: errno=-71 > NOTICE-xpd_bri: XBUS-00/XPD-10: D-Chan RX Bad checksum: [55:02=FC](252)> NOTICE-xpd_bri: XBUS-00/XPD-10: Multibyte Drop: errno=-71 > zaptel Disabled echo canceller because of tone (rx) on channel 4 > NOTICE-xpd_bri: XBUS-00/XPD-00: D-Chan RX Bad checksum: [94:2D=FD](253)> NOTICE-xpd_bri: XBUS-00/XPD-00: Multibyte Drop: errno=-71I believe that those messages can also be caused by Asterisk suddenly dropping out with parts of the driver still working. But I'm not sure.> > > Once the system becomes unstable the only way to get it up again is to > shutdown (not restart) pull the power and USB on the Xorcom Bri 4 > devices.Are you sure a reset of the Astribank is needed? And if so: can you try: rmmod xpp_usb; /usr/share/zaptel/xpp_fxloader reset instead?> Then plug them back in and start the system up. > If the power and USB is not disconnected a the devices may look like > they are working fine but zaptel will not start stating that it cannot> find span ** what ever it happens to fail on. > > > We are running the following versions > Asterisk - Asterisk 1.2.18-BRIstuffed-0.3.0-PRE-1y-g > Zaptel - zaptel-1.2.18 > XPP - version: trunk-r3965 > srcversion: 723B8A27A7E9750BB039D00What distribution? What kernel version?> > If you need any more information please let me know. > > > > > Nathan Dennis > __________________________________________________________ > Integrated Solutions (QLD)P/L Phone: +61 (7) 4044 0300 > Direct: +61 (7) 4044 > 0302 > 124 Spence Street Fax: +61 (7) 4041 6600 > CAIRNS QLD 4868 Mobile: 0418 608609 > > Australia > > E-mail: nathan.dennis at i-solutions.net.au<http://lists.digium.com/mailman/listinfo/asterisk-users>> <mailto:nathan.dennis at i-solutions.net.au<http://lists.digium.com/mailman/listinfo/asterisk-users> >> Web Site: www.i-solutions.net.au <http://www.i-solutions.net.au/<http://www.i-solutions.net.au/> >> > Offices and agents in Cairns - Brisbane - Melbourne -- Adelaide -- > Sydney > __________________________________________________________ > The information transmitted is intended only for the person or entityto> which > it is addressed and may contain confidential and/or privilegedmaterial.> > Any review, retransmission, dissemination or other use of, or takingof> any > action in reliance upon, this information by persons or entities other > than the > intended recipient is prohibited. If you received this in error,please> contact > the sender and delete the material from any computer. >> _______________________________________________ > --Bandwidth and Colocation Provided by http://www.api-digital.com--<http://www.api-digital.com--/>> > asterisk-users mailing list > To UNSUBSCRIBE or update options visit: > http://lists.digium.com/mailman/listinfo/asterisk-users<http://lists.digium.com/mailman/listinfo/asterisk-users> -- Tzafrir Cohen icq#16849755 jabber:tzafrir at jabber.org <http://lists.digium.com/mailman/listinfo/asterisk-users> +972-50-7952406 mailto:tzafrir.cohen at xorcom.com <http://lists.digium.com/mailman/listinfo/asterisk-users> http://www.xorcom.com <http://www.xorcom.com/> iax:guest at local.xorcom.com <http://lists.digium.com/mailman/listinfo/asterisk-users> /tzafrir Nathan Dennis