We're sporadically having issues with the FXO/FXS ports on our Digium
TDM cards. I'm wondering if anyone else has had these problems, or if
anyone can provide guidance on diagnosing or fixing the issue.
The symptoms are that the FXO and FXS ports "stop working", usually
after 2-4 weeks of server uptime. When this happens, sending a (SIP)
call to an analog phone on an FXS port causes the phone to ring, but
when you answer the phone, the off-hook is not detected. Further rings
sound like clicks in the off-hook handset. If the handset is hung up,
ringing resumes.
If you take the phone off hook to make a call, no dial tone is heard on
the line. Calls which come in on a phone line connected to an FXO port
are not detected as ringing by Asterisk.
When we noticed this happening, I found no lines in the Asterisk
messages log which looked directly related, and nothing out of the
ordinary on the verbose/debug console. However, we don't have a
verbose/debug console transcript from the actual moment the failure
starts; we only have console activity from after the fact, when we
noticed something had gone wrong.
The only odd log lines I've found didn't seem related to the problem:
Sep 27 11:43:56 NOTICE[17782] channel.c: Dropping incompatible voice
frame on Local/228@internal_extensions-9eb8,2 of format ulaw since our
native format has changed to slin
I'm not sure what "slin" is, but I'm not using it.
Local/228@internal_extensions dials Zap channel 28, an FXS port with an
analog phone attached to it. This notice occurred about 5 times when I
don't think we were experiencing the Zap problem, and once when we
definitely were.
Sep 27 13:57:02 WARNING[2711] chan_zap.c: Ring/Off-hook in strange state 6 on
channel 31
Sep 27 13:57:04 WARNING[2711] chan_zap.c: Ring/Off-hook in strange state 6 on
channel 31
These are from about half an hour after the machine was rebooted to fix
the Zap problem, so I don't think they're related to the problem.
Channel 31 is an FXO port with a POTS phone line attached.
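To check whether messages like these cluster around the failures, a small script could tally the chan_zap warnings per channel. This is only a sketch: it assumes each message occupies a single line in the log file (the excerpts above are wrapped for mail), and the pattern is inferred from the three messages quoted here, not from any documented Asterisk log format. The name `tally_zap_warnings` is hypothetical.

```python
import re
from collections import Counter

# Pattern inferred from the messages quoted above (an assumption, not a
# documented format): "Mon DD HH:MM:SS LEVEL[pid] file.c: text"
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ \d\d:\d\d:\d\d) "
    r"(?P<level>[A-Z]+)\[(?P<pid>\d+)\] "
    r"(?P<source>[\w.]+): (?P<text>.*)$"
)
CHANNEL_RE = re.compile(r"on\s+channel (\d+)")


def tally_zap_warnings(lines):
    """Count chan_zap.c WARNING lines per Zap channel number."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not in the assumed format
        if m.group("source") != "chan_zap.c" or m.group("level") != "WARNING":
            continue
        ch = CHANNEL_RE.search(m.group("text"))
        if ch:
            counts[int(ch.group(1))] += 1
    return counts
```

Feeding it the lines of the messages log (for example `tally_zap_warnings(open(path))`, with `path` pointing at wherever the log lives on the system) gives a per-channel count that can be compared between good and bad periods.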
Our hardware and software:
- TE110P with a PRI on Zap channels 1-24 (24 is D-channel)
- TDM400P with 4 FXO ports on channels 25-28
- TDM400P with 4 FXS ports on channels 29-32
- FXO and FXS ports are loopstart
- Asterisk 1.2.0-beta1, libpri 1.2.0-beta1, zaptel 1.2.0-beta1
- we have also experienced this with 1.0.8
- Fedora core 3
The PRI channels are never affected at all by this problem.
So far, the only way I've found to deal with this is a server reboot.
Restarting Asterisk does not help, nor does an explicit ztcfg. zttool
doesn't show anything wrong on any Zap channels.
Prior to deploying the system in production, I experienced the same
problem much more frequently. At that time everything was set to
kewlstart. I changed everything to loopstart, since only one of the 8
ports had an actual phone line on it, and our phones, door phones, and
doorbells are not kewlstart compatible. The problem did not seem to
recur until yesterday, almost 4 weeks after the production deploy and
after 2-3 weeks of server uptime.
The devices connected to the FXO ports are a POTS phone line and 2
Lucent door phones in trunk mode. Prior to production deployment, I
experienced the same problems with nothing connected to the FXO ports.
The devices connected to the FXS ports are a single analog telephone,
and 3 Lucent doorbells (a phone ringer with no handset, basically).
Prior to production deployment, I experienced the same problems with
analog phones connected to all four FXS ports.
Does this sound familiar to anyone? Where else should I look for signs
of the Zap failures? What extra logging should I put in place to make
diagnosis easier in case it happens again?
Thanks,
Alan Ferrency
pair Networks, Inc.
alan@pair.com