On Sunday, January 26, 2020 11:18:36 PM CET Pete Biggs wrote:
> First of all - disclaimer - I'm no network specialist, I just read and
> am interested in it. I may get things wrong!!
>
> > Both physical interfaces show the same. But does this mean it's on as in
> > "rx-checksumming: on" or off as in "tx-checksum-ipv4: off [fixed]"?
>
> As far as I understand it rx-checksum is the underlying wire
> checksumming - and from what I've read about it, disabling that
> disables the UDP checksums.
You mean layer 1 checksumming? Is there such a thing with Ethernet? I think
I read something about encoding being involved in signal transmission when I
was trying to understand what "bandwidth" actually means. I seem to remember
that no checksumming was involved and that the encoding had to do with
identifying signals, which is a requirement for being able to transmit
anything at all.
> > Assuming that I do not receive packets with invalid UDP checksums, then
> > the packets must be somehow altered and their UDP checksums recalculated
> > to arrive here. Does bad hardware etc. do that? Why would the UDP
> > checksums just happen to get recalculated correctly but like randomly
> > without intent?
>
> I'm not sure I understand what you are asking.
It is about VOIP calls via SRTP being interrupted at irregular intervals. The
intervals appear to depend on the time of day: calls last about 5--25 minutes
during the day and up to 1.5 hours at around 3am before being interrupted.
Asterisk says that a packet is being replayed, meaning that libsrtp has
already seen and processed the packet earlier. That can happen a couple of
times until asterisk reports authentication failures. The result is that the
call is interrupted: I cannot hear the other end, while the other end
sometimes can still hear me and sometimes cannot. The interruption can last
for minutes and the audio can come back after that, though usually I either
hang up or the call ends by itself before the audio is back.
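To illustrate what "replayed" means here, this is a minimal sketch of a sliding-window replay check as I understand RFC 3711 describes it. It is not libsrtp's actual code; the class and method names are made up for illustration, and a real implementation only updates the window after the packet has passed authentication.

```python
class ReplayWindow:
    """Sliding-window replay check over packet indices (illustrative only)."""

    def __init__(self, size=64):
        self.size = size    # window width in packets (RFC 3711 asks for >= 64)
        self.highest = -1   # highest packet index accepted so far
        self.bitmap = 0     # bit i set => index (highest - i) was already seen

    def check_and_update(self, index):
        """Return 'ok', 'replay' or 'too_old' for the given packet index."""
        if index > self.highest:
            # Newer than anything seen: slide the window forward.
            shift = index - self.highest
            mask = (1 << self.size) - 1
            self.bitmap = ((self.bitmap << shift) | 1) & mask
            self.highest = index
            return "ok"
        offset = self.highest - index
        if offset >= self.size:
            return "too_old"   # fell out of the window, cannot be checked
        if self.bitmap & (1 << offset):
            return "replay"    # this index was already accepted once
        self.bitmap |= 1 << offset
        return "ok"


window = ReplayWindow()
results = [window.check_and_update(i) for i in (0, 1, 3, 2, 3, 200, 100)]
print(results)
```

A packet index that arrives twice is reported as a replay, which is what the asterisk log seems to be saying about the packets in question.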
IIUC, authentication failures mean that libsrtp finds that the
authentication tag of an SRTP packet does not match the rest of the data in
the packet. The sender computes the authentication tag with a key derived
from the master keys that sender and receiver exchanged initially, and new
session keys are derived from those as needed. The key exchange can go over
SIP (using TLS) when SDES is used, which it is in this case.
The receiver recomputes the authentication tag and verifies that it matches
the tag carried in the packet. Only when the packet has been successfully
authenticated is the RTP payload of the packet decrypted.
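As a rough sketch of that verification step, assuming the default HMAC-SHA1 transform from RFC 3711 with an 80-bit tag and no MKI field: the tag is a truncated HMAC over the authenticated portion of the packet followed by the 32-bit rollover counter (ROC). The function names, the all-zero key and the sample bytes below are made up for illustration; this is not how libsrtp is structured internally.

```python
import hashlib
import hmac
import struct

TAG_LEN = 10  # 80-bit tag, assumed here; a 32-bit variant also exists


def srtp_auth_tag(auth_key, authenticated_portion, roc):
    """Leftmost TAG_LEN bytes of HMAC-SHA1(key, portion || ROC)."""
    message = authenticated_portion + struct.pack("!I", roc)
    return hmac.new(auth_key, message, hashlib.sha1).digest()[:TAG_LEN]


def verify(auth_key, packet, roc):
    """Split off the trailing tag and compare it with a recomputed one."""
    portion, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
    expected = srtp_auth_tag(auth_key, portion, roc)
    return hmac.compare_digest(expected, tag)


# Hypothetical session key and packet contents, for demonstration only.
key = bytes(20)
portion = b"\x80\x00\x00\x01" + b"encrypted RTP payload"
packet = portion + srtp_auth_tag(key, portion, roc=0)

tampered = bytes([packet[0] ^ 0x01]) + packet[1:]
print(verify(key, packet, roc=0))    # untouched packet
print(verify(key, tampered, roc=0))  # one bit flipped in transit
print(verify(key, packet, roc=1))    # receiver's ROC out of sync
```

The check fails not only for damaged data but also when the key or the ROC on the receiving side does not belong to the stream the packet came from.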
The SRTP packet seems to be the entire payload of the UDP packet, so if the
data of the SRTP packet got damaged or were intentionally altered in transit,
the UDP checksum would have to be recalculated afterwards for the packet to
still arrive here with a valid checksum.
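For reference, a small sketch of how the UDP checksum over IPv4 is computed according to RFC 768: a 16-bit one's complement sum over a pseudo-header, the UDP header and the payload. The addresses, ports and payloads are made-up examples, and the helper names are mine.

```python
import socket
import struct


def internet_checksum(data):
    """16-bit one's complement of the one's complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:   # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def udp_checksum(src_ip, dst_ip, src_port, dst_port, payload):
    """UDP checksum over the IPv4 pseudo-header, UDP header and payload."""
    length = 8 + len(payload)
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BBH", 0, 17, length))  # 17 = protocol UDP
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    csum = internet_checksum(pseudo + header + payload)
    return csum or 0xFFFF  # a computed zero is transmitted as all ones


# Hypothetical endpoints; the payload stands in for an SRTP packet.
good = udp_checksum("192.0.2.1", "198.51.100.2", 10000, 20000, b"hello world!")
bad = udp_checksum("192.0.2.1", "198.51.100.2", 10000, 20000, b"hello World!")
print(hex(good), hex(bad), good != bad)
```

A change to the payload changes the checksum in this example, so a packet whose payload was altered can only still pass the receiver's UDP check if something recomputed the checksum after the change.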
Two independent installations of asterisk at physically different locations
are showing the same error messages, both connecting to the same VOIP
provider.
As you can imagine, this is really fun to debug ...
> But it's unlikely (very
> unlikely) that the checksums are randomly correct. But packet checksums
> are recalculated when packets are forwarded by layer 4 switches - the
> contents of the packet are inspected as part of the switching process.
Yes, I thought so. IIRC it's required for routing, and maybe for changing the
TTL.
That someone would intentionally alter the SRTP packets and recalculate
the checksums seems rather unlikely, all the more so since they would need to
do that at two different places.
> > Only when asterisk (i. e. libsrtp) finally verifies the authentication tag
> > of an SRTP packet against the authenticated part of the packet ---
> > which, according to RFC 3711, seems to be the entire payload of the UDP
> > packet --- the verification fails.
> >
> > How is that possible?
>
> If it's SRTP checksum error, then that checksum is part of the packet
> payload at the application level - the UDP checksum is for the whole
> packet. Presumably the contents of the application payload were
> altered after the SRTP checksum was calculated but before the UDP
> packet checksum. It could be a bad layer 4 switch I suppose.
Right --- or the SRTP packet has been created incorrectly by their phone
system because it is overloaded at busy times, or it's buggy.
My favorite theory is that I am sometimes suddenly receiving the wrong SRTP
stream. I think it would fit the symptoms. Perhaps the VOIP provider is
experiencing interesting NAT issues when their connection tracking is getting
messed up at times when there are more connections than they can handle.
That defective hardware is causing the same problem at both places at the same
time seems rather unlikely.
So I've been trying to figure out what the problem might be. After learning
all this, I'm sufficiently sure that the problem is on their side.
> Probably your best bet is to use wireshark to decode the packets to see
> what the raw data looks like.
Hm, I tried that and wireshark doesn't seem to like SRTP packets very much.
Apparently it doesn't have a way to decrypt SRTP packets at all, even if I
could get the initial keys. Maybe someone who is much more proficient with
wireshark could find something. To me, it has been useless so far.
If wireshark could do stuff with SRTP packets, what could it possibly show
other than that some packets either carry a damaged payload, or that the
encryption keys don't fit, which is something I already know? If the problem
were with asterisk or libsrtp, it would be much more common.