Hi, I have a problem that is biting at all my customer sites where they have PRIs taking heavy load. This happens both with the stable code stream and with the current CVS. What happens is that after some running, Asterisk starts reporting strange errors on the PRI, eventually calling the PRI down Starts with this sort of thing: Feb 21 09:39:23 DEBUG[18095]: Didn't get a frame from channel: Zap/68-1 Feb 21 09:39:23 DEBUG[18095]: Bridge stops bridging channels Zap/68-1 and Zap/32-1 Feb 21 09:39:23 DEBUG[22374]: Manager received command 'Command' Feb 21 09:39:25 DEBUG[18103]: Launching 'SetVar' Feb 21 09:39:25 DEBUG[18104]: Launching 'SetVar' Feb 21 09:39:33 DEBUG[22374]: Manager received command 'Command' Feb 21 09:39:43 DEBUG[22374]: Manager received command 'Command' Feb 21 09:39:51 DEBUG[18105]: Launching 'SetVar' Feb 21 09:39:53 DEBUG[22374]: Manager received command 'Command' Feb 21 09:39:56 DEBUG[18052]: Write returned -1 (Resource temporarily unavailable) on channel 31 Feb 21 09:40:03 DEBUG[22374]: Manager received command 'Command' Feb 21 09:40:06 DEBUG[18052]: Write returned -1 (Resource temporarily unavailable) on channel 31 Feb 21 09:40:06 DEBUG[17864]: Got RTCP report of 8 bytes Feb 21 09:40:06 DEBUG[17864]: Got RTCP report of 8 bytes Feb 21 09:40:13 DEBUG[22374]: Manager received command 'Command' Feb 21 09:40:23 DEBUG[22374]: Manager received command 'Command' Feb 21 09:40:33 DEBUG[22374]: Manager received command 'Command' Feb 21 09:40:34 DEBUG[18030]: Write returned -1 (Unknown error 500) on channel 45 Feb 21 09:40:34 DEBUG[18030]: Exception on 56, channel 45 Feb 21 09:40:34 DEBUG[18030]: Got event Alarm(4) on channel 45 (index 0) Feb 21 09:40:34 DEBUG[18099]: Exception on 44, channel 33 Feb 21 09:40:34 DEBUG[18099]: Got event Alarm(4) on channel 33 (index 0) Feb 21 09:40:34 DEBUG[18011]: Exception on 55, channel 44 Feb 21 09:40:34 DEBUG[18011]: Got event Alarm(4) on channel 44 (index 0) Feb 21 09:40:34 DEBUG[18063]: Exception on 57, channel 46 Feb 21 09:40:34 DEBUG[18063]: Got event Alarm(4) on channel 46 (index 0) ...etc... It complains about a bunch (though not all) channels. Then, shortly afterwards: Feb 21 09:40:34 DEBUG[22364]: Monitor doohicky got event Alarm on channel 34 Feb 21 09:40:34 WARNING[22364]: Detected alarm on channel 34: Red Alarm Feb 21 09:40:34 WARNING[22364]: Unable to disable echo cancellation on channel 34 Feb 21 09:40:34 DEBUG[22364]: Monitor doohicky got event Alarm on channel 35 Feb 21 09:40:34 WARNING[22364]: Detected alarm on channel 35: Red Alarm Feb 21 09:40:34 WARNING[22364]: Unable to disable echo cancellation on channel 35 Feb 21 09:40:34 DEBUG[22364]: Monitor doohicky got event Alarm on channel 36 Feb 21 09:40:34 WARNING[22364]: Detected alarm on channel 36: Red Alarm Feb 21 09:40:34 WARNING[22364]: Unable to disable echo cancellation on channel 36 ...etc... And the other PRI: Feb 21 09:41:01 DEBUG[22364]: Monitor doohicky got event Alarm on channel 1 Feb 21 09:41:01 WARNING[22364]: Detected alarm on channel 1: Red Alarm Feb 21 09:41:01 WARNING[22364]: Unable to disable echo cancellation on channel 1 Feb 21 09:41:01 DEBUG[22364]: Monitor doohicky got event Alarm on channel 5 Feb 21 09:41:01 WARNING[22364]: Detected alarm on channel 5: Red Alarm Feb 21 09:41:01 WARNING[22364]: Unable to disable echo cancellation on channel 5 Feb 21 09:41:01 DEBUG[22364]: Monitor doohicky got event Alarm on channel 6 Feb 21 09:41:01 WARNING[22364]: Detected alarm on channel 6: Red Alarm Feb 21 09:41:01 WARNING[22364]: Unable to disable echo cancellation on channel 6 Feb 21 09:41:01 DEBUG[22364]: Monitor doohicky got event Alarm on channel 7 Feb 21 09:41:01 WARNING[22364]: Detected alarm on channel 7: Red Alarm Feb 21 09:41:01 WARNING[22364]: Unable to disable echo cancellation on channel 7 Rapidly followed by the PRI coming back up: Feb 21 09:41:07 DEBUG[22364]: Monitor doohicky got event No more alarm on channel 34 Feb 21 09:41:07 NOTICE[22364]: Alarm cleared on channel 34 Feb 21 09:41:07 DEBUG[22364]: Monitor doohicky got event No more alarm on channel 35 Feb 21 09:41:07 NOTICE[22364]: Alarm cleared on channel 35 ..etc.. and: Feb 21 09:41:35 DEBUG[22364]: Monitor doohicky got event No more alarm on channel 1 Feb 21 09:41:35 NOTICE[22364]: Alarm cleared on channel 1 Feb 21 09:41:35 DEBUG[22364]: Monitor doohicky got event No more alarm on channel 5 Feb 21 09:41:35 NOTICE[22364]: Alarm cleared on channel 5 Feb 21 09:41:35 DEBUG[22364]: Monitor doohicky got event No more alarm on channel 6 Feb 21 09:41:35 NOTICE[22364]: Alarm cleared on channel 6 ..etc.. What I find in the one case is that the PRI keeps getting reported RED alarm and recovering, ad infinitum. In another case, I saw millions of lines logged in this vein: Feb 21 09:46:28 DEBUG[22364]: Monitor doohicky got event Event -1 on channel 12 across all the channels that alarmed and recovered. In either case whe whole system is now toast until it is restarted. At a particularly busy customer, with 4 machines, each machine usually gets into this state once a day or so. Because of the application, we can get away with automatically detecting and restarting the failed machine without much business impact. At another site this bites once a week or two; that's a standard PBX setup where the impact of the failure is obvious and severe, at this stage it seems quite likely that they will kick out the Asterisk system. I'm unable to tell whether the PRI alarm is a symptom or a cause. I think its a symptom of a bug, because once down it doesn't recover, but restart Asterisk and it comes up immediately. Can anyone help shed some light on the problem? Thanks, Steve