When you make a call, it's SIP (the Session Initiation Protocol) that contacts the receiving device, agrees on the nature of the call, and makes the connection. After that, another protocol, RTP, carries the content of the call. When the call is over and the parties disconnect, SIP is again the protocol that terminates the call.
This may not sound like much of a security issue, but in fact, it is. That's because SIP wasn't originally designed to be secure, which means it's easily hacked. SIP is a text-based protocol. RTP, the protocol that carries your call's audio, video, and other media, is a binary protocol, but it is likewise unencrypted by default. Unless an encrypted connection is requested, all of this travels in the clear across the open internet or your office network.
At Telnyx, we provide users with the ability to establish TLS and SRTP/ZRTP with our system for end-to-end SIP and Media encryption. By default, the features are not enabled, but are configurable from your SIP Connection. Additionally, at Telnyx, we leverage our private network to pull your traffic off the public internet and carry the media across our own fiber. By handling the media, we are able to ensure that your packets are exposed to as few public hops as possible.
In this article, we'll talk a little about how certificates work, how TLS sessions are established, and provide a brief overview of SRTP/ZRTP with example SDP payloads.
Certificates for Authentication
Certificates are used for authentication and encryption. They're always used with TLS. There's an SSL (TLS) certificate for sip.telnyx.com:5061 so that SIP-over-TLS traffic between our customers and Telnyx can be secure.
TLS/SSL certificates used for SIP are exactly the same as those used for HTTPS. There's no difference because TLS forms a separate transport layer, and the protocol carried inside it is independent of TLS. When a browser attempts to connect to a secure website using TLS, the web server sends a certificate (and proof that the server is properly using that certificate). The use of certificates is based on chains of trust.
The browser has a list of certificates that it trusts, but it does not have a list of all certificates. If the browser is to trust the certificate from the server, then the certificate must be signed by a CA (Certificate Authority). Every certificate has an issuer (the certificate that signed it), and the chain of trust ends (or starts, depending on perspective) with a Root CA certificate. If the certificate from the web server is signed by a CA certificate that the browser trusts, then the browser will also trust the server certificate. (Root CA certificates are signed by themselves, so they are only ever explicitly trusted.)
Whenever a client connects to TCP:sip.telnyx.com:5061, the server sends a certificate chain: a bundle of the server certificate and the two intermediate certificates. If the client trusts the Root CA certificate that issued the topmost intermediate certificate, then (by the chain of trust) it'll implicitly trust the sip.telnyx.com server certificate.
Modern operating systems (Android, Ubuntu, MacOS, etc) have certificate bundles built in. The user usually does not have to configure certificates. When new Root Certificate Authority (CA) certificates are made or certificates are revoked, OS updates look after keeping the CA bundle updated.
Some customers use appliances for SIP traffic. An example is an SBC (Session Border Controller), such as the Audiocodes M1000 or the Dialogic Bordernet SBC. These appliances route SIP traffic, but they don't run a general-purpose Operating System (OS) and don't have a comprehensive CA bundle built in.
If these devices are to trust the certificate from sip.telnyx.com, then they must either trust the certificate directly or trust (either of) the intermediate certificates. To do this, the administrator must load a certificate into the appliance; this is usually a manual task.
Without loading a certificate to establish the trust, TLS connections cannot be made to sip.telnyx.com:5061 and therefore SIP over TLS cannot work. Please download the crt.sh file from here.
You can also check the certificates, fingerprints, and expiry dates for sip.telnyx.com:5061 using the below example openssl commands on the command line / terminal / prompt.
openssl s_client -servername sip.telnyx.com -connect sip.telnyx.com:5061
openssl s_client -showcerts -connect sip.telnyx.com:7443
echo | openssl s_client -connect sip.telnyx.com:5061 |& openssl x509 -fingerprint -noout
echo | openssl s_client -servername sip.telnyx.com -connect sip.telnyx.com:5061 2>/dev/null | openssl x509 -noout -dates
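The same checks can also be scripted. The following is a minimal Python sketch that uses only the standard library `ssl` and `socket` modules. The function names `build_context` and `fetch_cert_dates` and the optional `cafile` argument are illustrative assumptions, not part of any Telnyx tooling. Passing a `cafile` roughly mirrors what an appliance administrator does when manually loading a trusted certificate.

```python
import socket
import ssl
from typing import Optional


def build_context(cafile: Optional[str] = None) -> ssl.SSLContext:
    """Return a client context that verifies the server certificate chain.

    With cafile=None the operating system's CA bundle is used. Otherwise
    only the certificates in the given PEM file are trusted.
    """
    # create_default_context turns on certificate and hostname checks.
    return ssl.create_default_context(cafile=cafile)


def fetch_cert_dates(host: str, port: int, cafile: Optional[str] = None) -> dict:
    """Connect over TLS and return the validity dates of the server certificate."""
    context = build_context(cafile)
    with socket.create_connection((host, port), timeout=10) as sock:
        # server_hostname sets SNI and is the name checked against the certificate.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    return {"notBefore": cert["notBefore"], "notAfter": cert["notAfter"]}


# Example call (requires network access):
#   fetch_cert_dates("sip.telnyx.com", 5061)
```

If the chain cannot be validated against the trusted certificates, `wrap_socket` raises `ssl.SSLCertVerificationError`, which is the scripted equivalent of an appliance refusing the TLS connection.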
SSL and its newer version, TLS, are cryptographic protocols that provide security on the Internet. TLS with SIP is used to encrypt SIP signaling, whereas SRTP (Secure Real-time Transport Protocol) / ZRTP (Zimmermann Real-time Transport Protocol) is used to encrypt media streams. It is not mandatory to use SRTP/ZRTP when using TLS, but in order to use SRTP effectively, one must use TLS.
TLS/SSL Handshake Protocol
The TLS handshake protocol is used between the SIP client and server to establish trust and then negotiate the secret key that will be used to encrypt and decrypt the traffic.
TLS runs atop TCP, so the first thing performed is the TCP three-way handshake. Once the TCP connection is established, the SSL/TLS handshake can take place.
Step 1: The client initiates with a "Client Hello", which carries information such as the TLS version, cryptographic algorithms, and data compression methods supported by the client.
Steps 2-6: The server sends a "Server Hello", agreeing on a cryptographic algorithm from the list provided by the client, amongst other parameters. It then presents its digital certificate and public key.
Steps 7-13: The client validates the certificate presented by the server. Once the client trusts the server, the key exchange takes place to establish a shared secret key.
Steps 14-15: Encrypted data flows between the two endpoints and can continue until either party chooses to close the connection.
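To see the outcome of the handshake steps above, the following minimal Python sketch performs the TCP connect and TLS handshake and then reports the protocol version and cipher that were agreed. It uses only the standard library. The function names and the choice of TLS 1.2 as a minimum are illustrative assumptions, not Telnyx requirements.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Client context offering TLS 1.2 or newer with certificate checks on."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def tls_session_info(host: str, port: int) -> dict:
    """Do the TCP connect and TLS handshake, then report what was agreed."""
    context = make_client_context()
    # The TCP three-way handshake happens inside create_connection.
    with socket.create_connection((host, port), timeout=10) as sock:
        # wrap_socket sends the Client Hello, validates the server
        # certificate, and completes the key exchange before returning.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            name, _protocol, bits = tls.cipher()
            return {"version": tls.version(), "cipher": name, "bits": bits}


# Example call (requires network access):
#   tls_session_info("sip.telnyx.com", 5061)
```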
Once the TLS session is established, the SIP session takes place inside it, and this is also where the media is negotiated. Note that, with SRTP, the encryption parameters for the RTP are contained within the SIP signaling (which is why TLS should be used for SIP in the first place).
Actually, the parameters for the encryption are in the SDP (the Session Description Protocol), which is a part of the SIP message.
Remember, credential-based SIP Connections must register over TLS if you want inbound calls to be established over TLS. For FQDN and IP authentication based SIP Connections, you can specify the transport of choice in the SIP Connection settings.
What is SRTP?
The Secure Real-Time Transport Protocol. SRTP is a security profile for RTP that adds confidentiality, message authentication, and replay protection to that protocol. It was initially published by the Internet Engineering Task Force (IETF) in RFC 3711 in March 2004.
SRTP uses Advanced Encryption Standard (AES) as the default cipher along with two different cipher modes. Segmented Integer Counter Mode is standard, and is critical for running traffic over an unreliable network with a possible loss of packets. f8-mode is used for 3G mobile networks, and is a variation of output feedback mode designed to be seek-able with an altered initialization function.
While AES-128 is widely regarded as more than adequately secure, some users may be motivated to adopt AES-192 or AES-256 due to a perceived need to pursue a highly conservative security strategy. In the below example, you'll see Counter Mode (CM) and Galois Counter Mode (GCM).
Besides the AES cipher, SRTP can disable encryption outright by using the so-called NULL cipher, which can be treated as an alternate supported cipher. If chosen, the NULL cipher performs no encryption at all. You'll also notice some suites containing the Hash-based Message Authentication Code with Secure Hash Algorithm 1 (HMAC-SHA1), which is used for integrity checks between sender and receiver: the sender attaches an authentication tag to each packet and the receiver recomputes it. If the results don't match, message authentication fails and the packet is discarded.
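The integrity check can be illustrated with a short Python sketch. It computes an HMAC-SHA1 tag truncated to 80 or 32 bits, matching the `_80` and `_32` suffixes in the suite names, and discards data whose tag does not match. This is a simplification: real SRTP computes the tag over the packet together with a rollover counter and uses a session authentication key derived from the master key. The function names here are hypothetical.

```python
import hashlib
import hmac


def auth_tag(auth_key: bytes, data: bytes, tag_bits: int = 80) -> bytes:
    """HMAC-SHA1 over the data, truncated to the leftmost tag_bits bits."""
    full = hmac.new(auth_key, data, hashlib.sha1).digest()
    return full[: tag_bits // 8]


def verify(auth_key: bytes, data: bytes, received_tag: bytes) -> bool:
    """Recompute the tag and compare in constant time; False means discard."""
    expected = auth_tag(auth_key, data, len(received_tag) * 8)
    return hmac.compare_digest(expected, received_tag)


key = b"k" * 20                      # placeholder key for illustration only
tag = auth_tag(key, b"rtp packet bytes")
print(len(tag), verify(key, b"rtp packet bytes", tag))
```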
Example SDP of the media audio parameters in the SIP INVITE from the client:
- In the SDP of the INVITE, the client sends the cipher suites that can be used to encrypt the media.
- In the m=audio line, it also states the port used for the media transfer.
- In the 200 OK sent by the server, the server chooses one of the cipher suites, which will be used to encrypt the media.
Session Description Protocol
Connection Information (c): IN IP4 198.51.100.64
Media Description, name and address (m): audio 51895 RTP/AVP 0 101
a=crypto:1 AEAD_AES_256_GCM_8 inline:URvowj9NlyRUU4Ro4NIUbg/39b1bEdLtxPelsfHpPGD9GtwWuKMMefhd76U=
a=crypto:2 AEAD_AES_128_GCM_8 inline:uATLXs8T8rjZ0QVH15np2ExFaVZixIRvHHXOow==
a=crypto:3 AES_256_CM_HMAC_SHA1_80 inline:JlgkqmY+uVPBLmJairIWLMTzsoP7sq6M+FLsTeZkAJUKobgTjouB0H+ZHV911Q==
a=crypto:4 AES_192_CM_HMAC_SHA1_80 inline:aNNDkHj6o2xl2J6PkjReCqK4dvGFeV3AjZ9Lohiriq5RRu8gMq0=
a=crypto:5 AES_CM_128_HMAC_SHA1_80 inline:TCkr9L8rtn4+aJXoeA1eiQcTAniJOvkYXUnZ5KTw
a=crypto:6 AES_256_CM_HMAC_SHA1_32 inline:98t1wsF9qRKbb+vJO0zldPSX1RvXDI/SSrzNEUl0CHDGv84vqtz3+tJNA2JQdA==
a=crypto:7 AES_192_CM_HMAC_SHA1_32 inline:K5rGraoWN0x2r3OHxKDMxDY4uk4TZC6z2Lu9NIU8YY/BesMyICA=
a=crypto:8 AES_CM_128_HMAC_SHA1_32 inline:CYuxgxVh3SpfdN4uXoR5M709+19SQMuNepdP0Dw8
a=crypto:9 AES_CM_128_NULL_AUTH inline:DzAaqmSiBsFG997HKmMhoFkxOo3c9TY7mO6fFHOn
In the 183 with SDP / 200 OK, Telnyx agrees on one of the client's attributes and presents our media IP address:
Session Description Protocol
Connection Information (c): IN IP4 220.127.116.11
Media Description, name and address (m): audio 29078 RTP/SAVP 0 101
o=root 301488255 301488255 IN IP4 18.104.22.168
s=Asterisk PBX 13.34.0
c=IN IP4 22.214.171.124
m=audio 29078 RTP/SAVP 0 101
a=crypto:5 AES_CM_128_HMAC_SHA1_80 inline:7aNj4ZNv8I94feZgZP2sPmVcyNxTGIHjhkbR1rLz
In this case, the attribute with tag 5 (AES_CM_128_HMAC_SHA1_80) is chosen.
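The offer/answer matching shown above can be sketched in Python. The parser below handles only the simple `a=crypto:<tag> <suite> inline:<key>` form used in these examples and ignores optional lifetime, MKI, and session parameters. The names `CryptoAttr`, `parse_crypto`, and `negotiated_suite` are hypothetical.

```python
import base64
from typing import List, NamedTuple, Optional


class CryptoAttr(NamedTuple):
    tag: int
    suite: str
    key: bytes  # decoded master key followed by master salt


def parse_crypto(line: str) -> Optional[CryptoAttr]:
    """Parse one a=crypto line; return None for any other SDP line."""
    line = line.strip()
    if not line.startswith("a=crypto:"):
        return None
    parts = line[len("a=crypto:"):].split()
    if len(parts) < 3:
        return None
    method, _, value = parts[2].partition(":")
    if method != "inline":
        return None
    key_b64 = value.split("|")[0]  # drop optional lifetime / MKI fields
    return CryptoAttr(int(parts[0]), parts[1], base64.b64decode(key_b64))


def negotiated_suite(offer: List[str], answer: List[str]) -> Optional[str]:
    """Return the suite whose tag and name appear in both offer and answer."""
    offered = {a.tag: a for a in map(parse_crypto, offer) if a}
    for chosen in filter(None, map(parse_crypto, answer)):
        match = offered.get(chosen.tag)
        if match and match.suite == chosen.suite:
            return chosen.suite
    return None


offer = [
    "a=crypto:5 AES_CM_128_HMAC_SHA1_80 inline:TCkr9L8rtn4+aJXoeA1eiQcTAniJOvkYXUnZ5KTw",
    "a=crypto:8 AES_CM_128_HMAC_SHA1_32 inline:CYuxgxVh3SpfdN4uXoR5M709+19SQMuNepdP0Dw8",
]
answer = [
    "m=audio 29078 RTP/SAVP 0 101",
    "a=crypto:5 AES_CM_128_HMAC_SHA1_80 inline:7aNj4ZNv8I94feZgZP2sPmVcyNxTGIHjhkbR1rLz",
]
print(negotiated_suite(offer, answer))
```

Note that each side sends its own key in its own `a=crypto` line, so the key in the answer differs from the key in the offer even though the tag and suite match.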
What is ZRTP?
The Zimmermann Real-Time Transport Protocol. ZRTP is a key exchange protocol designed to enable VoIP devices to agree on keys for encrypting media streams (voice or video) using RTP. The protocol was developed by Phil Zimmermann (who also created PGP), with help from Bryce Wilcox-O'Hearn, Colin Plumb, Jon Callas, and Alan Johnston.
The important distinction between ZRTP and SRTP is that ZRTP is used to set up the encryption, but SRTP is actually the encrypted media.
ZRTP uses the Diffie-Hellman algorithm which enables secure key agreement and avoids the overhead of certificate management or any other prior setup. ZRTP supports two Diffie-Hellman variants, finite field and elliptic curve. The keys agreed by ZRTP are ephemeral which means that they are discarded at the end of a call, avoiding the need for key management. ZRTP is the protocol that the two parties use to negotiate the SRTP session key.
The key agreement algorithm can be divided into 4 steps:
- Discovery - Hello & HelloACK
- Hash commitment - Commit
- Diffie-Hellman exchange and key derivation - DHPart1 DHPart2
- Confirmation - Confirm1, Confirm2 & Conf2ACK
In the discovery phase, the initiator and the responder exchange their ZRTP identifiers, as well as information about each other's supported ZRTP versions, hash functions, ciphers, authentication tag lengths, key agreement types, and SAS algorithms.
The messages exchanged in the first phase are called Hello messages. An acknowledgment, called a HelloACK message, is sent upon receipt of a Hello message.
After compatibility is checked, the initiator chooses which hash function, cipher, authentication tag length, key agreement type, and SAS algorithm should be used, basing its choice on the information carried in the Hello messages. This is sent to the responder via a message called Commit.
The initiator needs to generate its ephemeral key pair before sending the Commit, and the responder generates its key pair before sending a message called DHPart1. The Diffie-Hellman public values are exchanged in the DHPart1 and DHPart2 messages. ZRTP requires new Diffie-Hellman key pairs to be generated for each session.
SRTP cryptographic keys and salts are then calculated.
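The core idea of the Diffie-Hellman exchange and key derivation can be sketched in Python as follows. This is a toy illustration under loud assumptions: the prime is far too small for real use, the key derivation is a plain SHA-256 stand-in rather than the ZRTP key derivation function, and the hash commitment and confirmation messages are omitted. All names are hypothetical.

```python
import hashlib
import secrets

# Toy finite-field parameters for illustration only. 2**127 - 1 is prime,
# but real deployments use much larger groups or elliptic curves.
P = 2**127 - 1
G = 3


def keypair() -> tuple:
    """Generate a fresh ephemeral (private, public) pair for one session."""
    private = secrets.randbelow(P - 3) + 2      # value in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def shared_secret(own_private: int, peer_public: int) -> int:
    """Combine our private value with the peer's public value."""
    return pow(peer_public, own_private, P)


def derive_key(secret: int, label: bytes, length: int = 16) -> bytes:
    """Stand-in derivation: SHA-256 over a label and the shared secret."""
    data = label + secret.to_bytes((P.bit_length() + 7) // 8, "big")
    return hashlib.sha256(data).digest()[:length]


# The responder's public value travels in DHPart1, the initiator's in DHPart2.
responder_private, responder_public = keypair()
initiator_private, initiator_public = keypair()

initiator_key = derive_key(shared_secret(initiator_private, responder_public), b"srtp")
responder_key = derive_key(shared_secret(responder_private, initiator_public), b"srtp")
print(initiator_key == responder_key)
```

Both sides arrive at the same key without it ever crossing the network, and because the key pairs are generated per session, nothing needs to be stored once the call ends.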
Finally, the confirmation phase takes place via Confirm1 and Confirm2 messages. They contain the cache expiration interval for the newly generated retained shared secret.
Note that ZRTP also doesn't rely on the Session Initiation Protocol (SIP) for key management. It doesn't rely on any SIP servers at all. It performs its key agreements and key management purely peer-to-peer over the RTP packet stream. You will not see a=crypto media attributes in the SIP INVITE's SDP; instead, you may see ZRTP "Hello" packets in the media stream.
Note also that ZRTP uses opportunistic encryption; a call that uses ZRTP may start with unencrypted RTP and then switch to SRTP a little later, provided that both sides support ZRTP (if they don't, then the call may proceed without SRTP).
TLS, SRTP, and ZRTP Conclusion
For security reasons, you may encrypt your SIP signalling over TLS to protect the content of the SIP messages. For encrypting media, you have a choice to use either SRTP or ZRTP.
Credential-based SIP Connections must ensure the client specifies TLS when attempting to REGISTER with our system, in order for inbound calls to be established over TLS.
Conversely, to use TLS for outbound calls, configure the transport type on your client (server, phone, etc.) and choose TLS for the SIP signaling.
For IP and FQDN SIP Connections, please make sure you review your inbound settings and specify the transport as TLS. This will ensure Telnyx sets up any inbound calls over TLS.
The media settings for SRTP/ZRTP are also configured in the Connection, under both the Inbound and Outbound tabs.
When Telnyx receives a call destined to a number associated with your SIP Connection, and this SIP Connection has SRTP enabled, we will send the cipher suites in our SIP INVITE's SDP. Please make sure your system is configured to respond with a preferred cipher suite in its 183/200 OK.
NOTE: If you respond with no cipher suite in your 183/200 OK, our system will reject the call establishment. Likewise, if you do not have SRTP enabled on the SIP Connection but respond with cipher suites in your 183/200 OK, our system will reject the call.
The same applies to ZRTP: we'll send you the Hello packet and expect to receive a correct handshake.
NOTE: Once you enable either protocol, please ensure your system is also set up to negotiate with either protocol. Otherwise, you may see Telnyx send back a 488 Not acceptable SIP response.
For instance, if you enable SRTP for your outbound calls but do not send us the cipher suites in your SIP INVITE's SDP, we can't establish an encrypted media channel, and this will also result in a 488 Not acceptable response from our system.
Likewise, if you attempt to send us cipher suites in your SIP INVITE but do not have SRTP specified on your SIP Connection, we will again send back a 488 Not acceptable response.
The same applies to ZRTP; if you do not send a Hello packet to establish the SRTP session key, then we will not be able to establish the call.
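The SRTP rule for outbound calls described above can be modelled with a small Python sketch, which may be handy as a pre-flight check on your own INVITEs. It only illustrates the behaviour as described in this article and is not Telnyx's implementation. The function names are hypothetical.

```python
from typing import Optional


def has_crypto_offer(sdp: str) -> bool:
    """True if the SDP body carries at least one a=crypto attribute."""
    return any(line.strip().startswith("a=crypto:") for line in sdp.splitlines())


def expected_status(srtp_enabled: bool, invite_sdp: str) -> Optional[int]:
    """Return 488 when the Connection setting and the SDP disagree, else None."""
    if srtp_enabled != has_crypto_offer(invite_sdp):
        return 488
    return None


plain_sdp = "m=audio 51895 RTP/AVP 0 101\n"
secure_sdp = (
    "m=audio 51895 RTP/SAVP 0 101\n"
    "a=crypto:5 AES_CM_128_HMAC_SHA1_80 inline:TCkr9L8rtn4+aJXoeA1eiQcTAniJOvkYXUnZ5KTw\n"
)
print(expected_status(True, plain_sdp), expected_status(True, secure_sdp))
```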