Difference between revisions of "Training:How does VOIP Work"

From IPitomy Wiki
 
(36 intermediate revisions by 4 users not shown)
Line 1: Line 1:
 
= How Does VoIP Work? =
  
The purpose of this section will be to provide a very high-level overview of Voice over IP (VoIP).
+
This section provides an introductory overview of Voice over Internet Protocol (VoIP), a technology that enables the transmission of voice communications via IP networks.
  
&nbsp;Many people have used a computer and a microphone to record a human voice or other sounds. The process involves sampling the sound that is heard by the computer at a very high rate (at least 8,000 times per second or more) and storing those "samples" in memory or in a file on the computer. Each sample of sound is just a very tiny bit of the person's voice or other sound recorded by the computer. The computer has the ability to take all of those samples and play them, so that the listener can hear what was recorded.
+
VoIP works on the principle of audio sampling, where a computer records a sound (such as a human voice) at a high rate (typically at least 8,000 times per second) and converts these audio samples into digital data. Unlike traditional recording, where these samples are stored locally, VoIP sends these samples over an IP network to be played back on a different device.
  
VoIP is based on the same idea, but the difference is that the audio samples are not stored locally. Instead, they are sent over the IP network to another device and played there.
+
The process of making VoIP function efficiently involves several key steps. Initially, the computer compresses the recorded sound samples to minimize the space they require, focusing particularly on voice frequencies. This compression and decompression process is handled by a tool known as a CODEC (compressor/decompressor). Numerous CODECs are available, and VoIP uses those optimized for voice compression, significantly reducing the bandwidth required compared to uncompressed audio.
  
Of course, there is much more required in order to make VoIP work. When recording the sound samples, the computer might compress those sounds so that they require less space and will certainly record only a limited frequency range. There are a number of ways to compress audio, the algorithm for which is referred to as a "compressor/de-compressor", or simply CODEC. Many CODECs exist for a variety of applications (e.g., movies and sound recordings) and, for VoIP, the CODECs are optimized for compressing voice, which significantly reduce the bandwidth used compared to an uncompressed audio stream. Speech CODECs are optimized to improve spoken words in the frequency range of human speech.
+
After compression, the samples are grouped into larger units and inserted into data packets ready for transmission over the IP network, a process known as packetization. A typical IP packet can contain 10 or more milliseconds of audio, with 20 or 30 milliseconds being the most common.
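As a rough illustration of packetization, the sketch below works out how much audio one packet carries. It assumes 8-bit samples with no further compression, which is an assumption made here for the arithmetic and not something the text specifies; a voice CODEC would shrink these numbers. The function names are hypothetical.

```python
SAMPLE_RATE_HZ = 8000   # from the text: at least 8,000 samples per second
BYTES_PER_SAMPLE = 1    # assumption: 8-bit samples, no further compression


def payload_bytes(packet_ms: int) -> int:
    """Audio bytes carried by one packet holding packet_ms milliseconds of sound."""
    samples = SAMPLE_RATE_HZ * packet_ms // 1000
    return samples * BYTES_PER_SAMPLE


def packets_per_second(packet_ms: int) -> float:
    """How many packets must be sent each second for a continuous stream."""
    return 1000 / packet_ms


for ms in (10, 20, 30):
    print(ms, "ms:", payload_bytes(ms), "bytes per packet,",
          packets_per_second(ms), "packets per second")
```

Under these assumptions a 20 millisecond packet holds 160 samples, and 50 such packets are needed every second.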
  
Once the sound is recorded by the computer and compressed into very small samples, the samples are collected together into larger chunks and placed into data packets for transmission over the IP network. This process is referred to packetization. Generally, a single IP packet will contain 10 or more milliseconds of audio, with 20 or 30 milliseconds being most common.
+
A comparison could be made to sending postcards through traditional mail. Each postcard (packet) carries a limited amount of information. Sending a lengthy message would require multiple postcards (packets), and to ensure they can be assembled correctly at the destination, they are organized using a sequence number or similar mechanism.
  
A good comparison is to think of a packet as a postcard sent via postal mail. A postcard contains just a limited amount of information. To deliver a very long message, one must send a lot of postcards. Of course, the post office might lose one or more postcards. One also has to assemble the received postcards in order, so some kind of mechanism must be used to properly organize the postcards, such as placing a sequence number on the bottom right corner. One can think of data packets in an IP network as postcards.
 
  
[[File:How voip 1.png|none|How voip 1.png]]
+
[[File:How_voip_1.gif|alt=]]
 +
 
  
 
Packets are sometimes delayed, just as with postcards sent through the post office. This is particularly problematic for VoIP systems, as a delay in delivering a voice packet means the information is too old to play. Such old packets are simply discarded, just as if they had never been received. This is acceptable to a certain degree, as long as the assembled packets do not distort the sound. Too much delay will result in poor sound quality.
 
Line 21: Line 21:
 
  
== Jitter in Packet Voice Networks<br/> ==
+
== Jitter in Packet Voice Networks ==
  
 
Jitter is defined as a variation in the delay of received packets. At the sending side, packets are sent in a continuous stream with the packets spaced evenly apart. Due to network congestion, improper queuing, or configuration errors, this steady stream can become lumpy, or the delay between each packet can vary instead of remaining constant.
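One way to put a number on this variation is the running interarrival jitter estimate described in the RTP specification (RFC 3550). The sketch below is a minimal version of that estimate; the timestamps are made-up sample data, and real implementations work in RTP timestamp units instead of milliseconds.

```python
def interarrival_jitter(send_ms, recv_ms):
    """Running jitter estimate: J += (|D| - J) / 16,
    where D is the change in transit time between consecutive packets."""
    jitter = 0.0
    prev_transit = None
    for sent, received in zip(send_ms, recv_ms):
        transit = received - sent
        if prev_transit is not None:
            d = abs(transit - prev_transit)
            jitter += (d - jitter) / 16.0
        prev_transit = transit
    return jitter


send = [0, 20, 40, 60, 80]          # packets leave evenly spaced, 20 ms apart
steady = [50, 70, 90, 110, 130]     # constant 50 ms delay: no jitter
lumpy = [50, 85, 92, 130, 131]      # varying delay: the stream has become lumpy

print(interarrival_jitter(send, steady))
print(interarrival_jitter(send, lumpy))
```

A constant delay, however large, yields a jitter of zero; only the variation between packets raises the estimate.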
 
  
 
This diagram illustrates how a steady stream of packets is handled.
 
 
+
<div>[[File:HowVOIP2.gif|alt=|none|frame]]</div>When an IP device receives a Real-time Transport Protocol (RTP) audio stream for Voice over IP (VoIP), it must compensate for the jitter that is encountered. The mechanism that handles this function is the playout delay buffer. The playout delay buffer must buffer these packets and then play them out in a steady stream to the digital signal processors (DSPs) to be converted back to an analog audio stream. The playout delay buffer is also sometimes referred to as the de-jitter buffer.
http://wiki.ipitomy.com/images/3/36/HowVOIP2.gif
 
<div><br/></div>
 
When an IP device receives a Real-Time Protocol (RTP) audio stream for Voice over IP (VoIP), it must compensate for the jitter that is encountered. The mechanism that handles this function is the playout delay buffer. The playout delay buffer must buffer these packets and then play them out in a steady stream to the digital signal processors (DSPs) to be converted back to an analog audio stream. The playout delay buffer is also sometimes referred to as the de-jitter buffer.
 
 
 
 
This diagram illustrates how jitter is handled.
 
  
[[File:HowVOIP3.gif]]
+
[[File:HowVOIP3.gif|alt=|none|frame]]
  
 
If the jitter is so large that it causes packets to be received out of the range of this buffer, the out-of-range packets are discarded and dropouts are heard in the audio. For losses as small as one packet, the DSP interpolates what it thinks the audio should be and no problem is audible. When jitter exceeds what the DSP can do to make up for the missing packets, audio problems are heard.
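The discard behavior can be sketched with a toy playout buffer. This is a simplified model, not how any particular device implements it: it assumes 20 millisecond packets, a fixed buffer depth, and made-up arrival times, and it does not model the DSP interpolation of missing audio.

```python
PACKET_MS = 20  # assumption: each packet carries 20 ms of audio


def play_out(arrivals, buffer_ms):
    """arrivals maps sequence number -> arrival time in ms.
    Returns (played, discarded) lists of sequence numbers."""
    first_seq = min(arrivals)
    # Playback of the first packet is held back by the buffer depth.
    start = arrivals[first_seq] + buffer_ms
    played, discarded = [], []
    for seq in sorted(arrivals):
        deadline = start + (seq - first_seq) * PACKET_MS
        if arrivals[seq] <= deadline:
            played.append(seq)      # arrived in time for its playout slot
        else:
            discarded.append(seq)   # too late: outside the range of the buffer
    return played, discarded


arrivals = {0: 100, 1: 125, 2: 190, 3: 165, 4: 180}
print(play_out(arrivals, buffer_ms=40))
print(play_out(arrivals, buffer_ms=60))
```

With a 40 ms buffer, packet 2 misses its slot and is dropped; with a 60 ms buffer, every packet is played, at the cost of extra delay on the whole call.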
 
  
 
This diagram illustrates how excessive jitter is handled.
 
 +
[[File:HowVOIP4.gif|alt=|none|frame]]
  
 
 
&nbsp;
 
  
 
Video works in much the same way as voice. Video information received through a camera is broken into small pieces, compressed with a CODEC, placed into small packets, and transmitted over the IP network. This is one reason why VoIP is promising as a new technology: adding video or other media is relatively simple. Of course, there are certain issues that must be considered that are unique to video (e.g., frame refresh and much higher bandwidth requirements), but the basic principles of VoIP equally apply to video telephony.
 
Line 53: Line 43:
 
VoIP is implemented in a variety of hardware devices, including IP phones, analog terminal adapters (ATAs), and gateways. In short, a large number of devices can enable VoIP communication, some of which allow one to use traditional telephone devices to interface with the IP networks.
 
  
In a good well performing network, VoIP calls should be as clear or clearer that and other type of audio transmissions.&nbsp; VoIP calls are pure digitized sound.&nbsp; Each audio packet contains the pure audio just exactly as it is spoken into the microphone.&nbsp;
+
In a well-performing network, VoIP calls should be as clear as, or clearer than, any other type of audio transmission. VoIP calls are pure digitized sound. Each audio packet contains the audio exactly as it is spoken into the microphone.
  
 
High definition voice contains a wider range of frequencies than typical voice transmissions and will deliver surprisingly good audio that contains a richer sound than most toll quality calls.
 
  
= VoIP Protocols<br/> =
+
= VoIP Protocols =
  
There are a number of protocols that may be employed in order to provide for VoIP communication services. In this section, we will focus on SIP since it is the protocol of choice for most devices now being deployed in the industry.
+
The success of VoIP communication hinges on employing an appropriate set of protocols. Here, we'll focus on the Session Initiation Protocol (SIP), which underlies IPitomy SIP Trunking; it is the preferred protocol for all IPitomy devices currently in circulation as well as many third-party industry offerings.
  
Virtually every device in the world uses a standard called Real-Time Protocol (RTP) for transmitting audio and video packets between communicating computers. RTP is defined by the open standards that are set using various standards documents. RTP also addresses issues like packet order and provides mechanisms (via the Real-Time Control Protocol, to help address delay and jitter.
+
The Real-time Transport Protocol (RTP) is a standard used by nearly every VoIP device worldwide for transmitting audio and video packets between computers. RTP is defined in open standards documents; it handles issues like packet ordering and, through its companion RTP Control Protocol (RTCP), provides mechanisms to help address delay and jitter.
  
Before audio or video media can flow between two devices, various protocols must be employed to find the remote device and to negotiate the means by which media will flow between the two devices. The protocols that are central to this process are referred to as call-signaling protocols, the most popular of which is Session Initiation Protocol (SIP).
+
Before media can flow between two devices, protocols are utilized to locate the remote device and negotiate the media flow methods. These crucial protocols are known as call-signaling protocols, with the Session Initiation Protocol (SIP) being the most widely used.
  
Advantage of SIP
 
  
*SIP is a simple protocol (at least for machines).&nbsp; It is a text base protocol that is designed to be easily read and lends itself well to troubleshooting by being able to see what is in the packets without having to completely decompile the software,
+
Advantages of IPitomy SIP Trunks
*SIP works very similar to email.&nbsp; The addressing is very similar.
 
*SIP calls are pure digital voice.&nbsp; In its native world, there is no distortion, no delay and no echo.
 
*Many different devices can interoperate in a network providing a wide variety of choice for users that extends far beyond what any single vendor can provide.
 
*Location of users is irrelevant.&nbsp; As long as they have access to broadband.
 
  
Anatomy of a SIP Call
+
*SIP is a text-based protocol that is relatively straightforward for machines to understand. Its readability facilitates troubleshooting, because packet contents can be inspected without decompiling the software.
*SIP operates much like email, which makes it familiar and intuitive. Its addressing is particularly similar.
*SIP calls offer pure digital voice, free of distortion, delay, or echo in its native environment.
*Instant scalability: with IPitomy's IP PBX platform, capacity can be easily adjusted via a simple process in the admin GUI.
*Interoperability with a wide range of devices provides users with choices far beyond what any single vendor can offer.
*International dialing is supported.
*User location is inconsequential; remote users can be deployed globally without losing direct connectivity. An adequate internet connection is the only requirement.
  
A SIP Call has a signaling component and a Voice component.&nbsp; The signaling path is different from the actual voice transmission path.&nbsp; The signaling and voice transmission are unique for each call.&nbsp; All call setup and teardown signaling is over port 5060.&nbsp; All voice transmissions are on a range of ports 10,000 to 20,000.&nbsp; The ports are virtual ports and are simply part of the protocol for communicating over TCPIP.&nbsp;
 
  
Setting the Stage
+
Understanding a SIP Call
  
Before a SIP call can be placed, there needs to be SIP endpoints that have the ability to” find” and “be found by” other SIP endpoints in order to make and receive calls.&nbsp; For the purposes of this training guide, we will limit our SIP examples to endpoints that register to an IPitomy IP PBX.&nbsp; While it is possible to call from one SIP endpoint to another directly using a peer-to-peer method, most calls are facilitated through an IP PBX or soft switch to make dialing simple and easy just like a PSTN call.&nbsp; The big advantage of having an IP PBX is that the user does not have to dial an IP address to call another endpoint and all of the call information can be stored for reporting etc.&nbsp; The IP PBX will handle routing all of the calls to their destination on the PSTN or to local and remote extensions.&nbsp; Users simply dial phone numbers and extension numbers just like a legacy PBX system.&nbsp; An IP PBX supports analog PSTN lines, T1/PRI lines, DID’s and SIP Trunks.&nbsp;
+
A SIP call comprises a signaling component and a voice component. The signaling path is separate from the voice transmission path, and both are unique to each call. Setup and teardown signaling for all calls operates over port 5060. Voice transmissions use ports in the range 10,000 to 20,000. These are virtual ports that are simply part of TCP/IP communication.
  
In order to be part of the PBX database, SIP endpoints register with the PBX.&nbsp; Once they are registered, they have the ability to dial phone numbers and be called by other endpoints.&nbsp; The registration can be from phones on the Local Area Network (LAN) or from anywhere on the Internet.&nbsp;
 
  
Now that the phones are registered with the PBX, a call can be initiated or received.&nbsp; To start the call, the endpoint sends an invite to the server to ask the other endpoint if it is available to take a call.&nbsp; This takes place on the signaling port (port 5060).&nbsp; If the other endpoint is ready to accept the call, it sends an acknowledgement back to the initiating phone.&nbsp; The initiating phone then sends the call information telling the other phone which ports to commence the RTP (voice) session on.&nbsp; The RTP session is opened using the ports communicated by the initiating call.
+
Preconditions for a SIP Call
  
When the call is over, one of the endpoints sends a bye message and the call hangs up.&nbsp; That is a pretty simple description of how SIP makes a call.&nbsp; When the call is on the LAN it does not have to go through the router.&nbsp; The PBX will handle telling the endpoints which ports to use to connect the RTP (voice) stream.&nbsp;
+
Before a SIP call can occur, SIP endpoints capable of finding and being found by other SIP endpoints are needed. We'll restrict our SIP examples to endpoints that register to an IPitomy IP PBX for this training guide. While peer-to-peer SIP calls are possible, most are facilitated via an IP PBX or soft switch for a PSTN-like ease of dialing. The advantage of an IP PBX is that it simplifies calling another endpoint, allows call information to be stored for reporting, and manages call routing to local and remote extensions. Users dial phone numbers and extension numbers as they would with a legacy PBX system. Although no longer recommended due to their End-Of-Life status, an IPitomy IP PBX still supports analog PSTN lines and T1/PRI cards. SIP endpoints must register with the PBX to be included in the PBX database. Once registered, they can dial phone numbers and receive calls from other endpoints. These endpoints can be phones on the Local Area Network (LAN) or any location on the internet.
  
When the call is to a remote phone, the PBX knows the phone is outside of the firewall.&nbsp; This is when the router needs to have ports configured for signaling and RTP traffic.&nbsp;
+
 
 +
Starting a SIP Call
 +
 
 +
To start a call, the endpoint sends an invite to the server requesting the other endpoint's availability. This occurs on the signaling port (port 5060). If the other endpoint is ready, it sends an acknowledgement back to the initiating phone, which then sends the call information instructing the other phone on the ports to commence the RTP (voice) session. The RTP session is opened using the communicated ports.
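To make the invite step more concrete, the sketch below builds the text of a minimal SIP INVITE request. In the SIP standard (RFC 3261), the RTP port is typically advertised in a Session Description Protocol (SDP) body attached to the signaling messages, which is what the example shows. The extension numbers, host name, IP address, branch, tag, and Call-ID are all hypothetical values chosen for illustration, and the function name is not part of any IPitomy product.

```python
def build_invite(caller, callee, pbx_host, local_ip, rtp_port, call_id):
    """Return the text of a minimal SIP INVITE carrying an SDP offer."""
    # SDP body: tells the far end where this phone wants to receive audio.
    sdp = "\r\n".join([
        "v=0",
        f"o={caller} 0 0 IN IP4 {local_ip}",
        "s=call",
        f"c=IN IP4 {local_ip}",
        "t=0 0",
        f"m=audio {rtp_port} RTP/AVP 0",   # RTP (voice) port for this call
    ]) + "\r\n"
    headers = [
        f"INVITE sip:{callee}@{pbx_host} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_ip}:5060;branch=z9hG4bK-example",
        "Max-Forwards: 70",
        f"From: <sip:{caller}@{pbx_host}>;tag=1234",
        f"To: <sip:{callee}@{pbx_host}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{caller}@{local_ip}:5060>",
        "Content-Type: application/sdp",
        f"Content-Length: {len(sdp.encode())}",
    ]
    # A blank line separates the headers from the body.
    return "\r\n".join(headers) + "\r\n\r\n" + sdp


invite = build_invite("101", "102", "pbx.example.com",
                      "192.0.2.10", 10000, "abc123@192.0.2.10")
print(invite)
```

Because SIP is plain text, a message like this can be read directly in a packet capture, which is what makes troubleshooting practical.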
 +
 
 +
Call termination is initiated by one of the endpoints sending a "bye" message, causing the call to hang up. This is a simplified explanation of a SIP call's lifecycle. When the call occurs on the LAN, it bypasses the router. The PBX instructs the endpoints on the ports to connect the RTP (voice) stream.
 +
 
 +
For calls to remote phones, the PBX understands that the phone is beyond the firewall. Router port configuration becomes necessary for signaling and RTP traffic at this stage.
  
 
Signaling Port 5060
 
  
When the ports are properly configured, port 5060 is forwarded in the router to the PBX systems IP address on the LAN.&nbsp; This allows the PBX to send signals to the remote phones as well as receive requests from them.
+
Proper port configuration enables port 5060 to be forwarded in the router to the PBX system's LAN IP address, allowing the PBX to send signals to remote phones and receive requests from them.
  
 
RTP Ports 10,000 – 20,000 Port Range Forwarding
 
  
Once the call is setup using the signaling on Port 5060, the RTP is setup using a range of ports that are forwarded to the PBX LAN IP address.&nbsp; Using Port Range Forwarding in the Router, the range of Ports 10,000 – 20,000 is forwarded for this purpose.
+
Once call setup occurs via signaling on Port 5060, RTP is set up using a range of ports forwarded to the PBX LAN IP address. The Port Range Forwarding feature in the Router is used to forward the range of Ports 10,000 – 20,000 for this purpose.
  
Each call requires two ports for RTP; one for sending and one for receiving.&nbsp; SIP sets this up from the initiating phone. The ports in the router are open from the inside of the firewall.&nbsp; The phone on the far end receives the information on what ports to use in the SIP packets.
+
Each call requires two ports for RTP - one for sending and one for receiving. This is organized by the initiating phone. The router ports are open from inside the firewall. The remote phone receives the information about which ports to use from the SIP packets.
  
Local Phone Diagram
+
'''Local Phone Diagram'''
  
+
[[File:5060.png|alt=]]
  
&nbsp;
+
'''Remote Phone Diagram'''
  
+
[[File:Academy - RDP v5.gif|alt=]]
  
&nbsp;
+
== Network Address Translation – NAT ==
  
== Network Address Translation – NAT<br/> ==
+
TCP/IP, short for Transmission Control Protocol/Internet Protocol, is the underlying communication protocol used for data exchange on the internet. It leverages unique IP addresses to deliver data to the correct device on a network. The types and classes of IP addresses play a crucial role in how data is routed across the internet.
  
TCPIP is the protocol for sending data on the Internet.&nbsp; It relies on unique IP addresses in order to get the proper data to the proper computer/device on the network.&nbsp; There are several different types and classes of IP address.&nbsp;
+
If you're online, odds are you're using Network Address Translation (NAT). This becomes increasingly likely given the growing internet user base. As of 2023, the internet is accessed by nearly 5 billion people worldwide, approximately 63% of the global population. This figure continues to rise, with nearly 200 million new users connecting in the year leading up to April 2023.
  
If you are reading this, you are most likely connected to the Internet and there's a very good chance that you are using Network Address Translation (NAT) right now!
+
[[File:InternetUsers2019.png|alt=|700x700px]]
 
 
The Internet has grown larger than anyone ever imagined it could be. Although the exact size is unknown, the current estimate is that there are about 2.267 Billion users actively on the Internet. In fact, the rate of growth has been such that the Internet is effectively doubling in size each year.
 
 
 
 
  
 
So what does the size of the Internet have to do with NAT? Everything! For a computer to communicate with other computers and Web servers on the Internet, it must have an IP address. An IP address (IP stands for Internet Protocol) is a unique 32-bit number that identifies the location of your computer on a network. Basically it works just like your street address: a way to find out exactly where you are and deliver information to you.
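The "32-bit number" can be seen directly with Python's standard `ipaddress` module. The address used below is just a sample private address picked for illustration.

```python
import ipaddress

addr = ipaddress.IPv4Address("192.168.1.10")  # sample address, chosen for illustration

as_number = int(addr)        # the same address as a single 32-bit integer
total_addresses = 2 ** 32    # every possible IPv4 address

print(as_number)             # 3232235786
print(total_addresses)       # 4294967296
print(addr.is_private)       # True: reserved for use inside local networks
```

Roughly 4.3 billion possible addresses is the ceiling that the growth of the Internet runs into.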
 
Line 125: Line 111:
 
With the explosion of the Internet and the increase in home networks and business networks, the number of available IP addresses is simply not enough. The obvious solution is to redesign the address format to allow for more possible addresses. This is being developed (IPv6) but will take several years to fully implement because it requires modification of the entire infrastructure of the Internet.
 
  
+
[[File:PNPN.png|alt=|700x700px]]
  
NAT Diagram – One Public IP Address is used by many Devices/Users
+
In our current internet landscape, dominated by IPv4 addressing, there's a finite number of unique IP addresses available. This limitation makes it challenging to provide every internet-connected device with its own IP address. The solution to this conundrum lies in the technique of Network Address Translation (NAT). The NAT process, employed by routers, enables a multitude of devices (like PCs, smartphones, and more) to share a single public IP address. This technique effectively extends the life of the current IPv4 addressing system until the broader implementation of IPv6, which promises a virtually limitless pool of IP addresses.
  
Under the current IP addressing scenario (IPv4) there are a finite number of IP addresses available on the Internet.&nbsp; There are not enough IP addresses available for each device to have their own unique IP address.&nbsp; To solve this problem, all routers have the ability to send data to devices through a Network Address Translation (NAT) process.&nbsp; This process allows a group of devices (like PC’s and Phones, etc.) to all share one Internet IP address.&nbsp; This process has stretched out the usefulness of the current IP address scheme until the next numbering scheme (IPv6) is fully deployed.
+
NAT operates by allowing the router to relay data to devices on the local area network (LAN) because it recognizes each device's unique internal IP address and Media Access Control (MAC) ID. When an external device needs to communicate with a device on the LAN via the internet, a specific route is required.
  
NAT works by the router passing data to devices because it is aware of the address of the specific devices on the local area network.&nbsp; The information you download to your PC comes directly to your PC because you have a unique internal IP address and a unique MAC ID.&nbsp;
+
Consider the case of a remote IP phone initiating a call through a PBX system on the LAN. The phone needs to send packets to the PBX, and the router must be informed where to route these packets. This is achieved by forwarding port 5060 to the PBX on the LAN, meaning all traffic arriving at this port is directed to the PBX.
  
When a device from outside of the local area network, wants to communicate from the Internet to a device on the LAN, it needs a path to guide it to the specific device (like the PBX).&nbsp; In the case of a remote IP phone, when the remote phone wants to make a call, it needs to send some packets to the PBX.&nbsp; In order to do that, the router needs to be instructed on where to send the IP phone packets.&nbsp; When port 5060 is forwarded to the PBX on the LAN, all traffic that comes in on port 5060 gets directed to the PBX.
+
Once the call is established, the Real-Time Protocol (RTP) traffic is directed to designated ports for transmission and reception. These ports are assigned based on instructions in the SIP packets used in the call setup. Misconfiguration of port forwarding can cause issues, the most common of which is "one-way audio," typically resulting from improper configuration of the RTP ports in the router. Note that some routers support an Application Layer Gateway (ALG) functionality, which often hampers packet delivery, despite its seeming compatibility with SIP, and should thus be disabled.
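The forwarding decision a router makes can be modeled as a simple lookup table. The sketch below is an illustration only, not router configuration: the PBX LAN address is a hypothetical value, and the rules mirror the signaling port (5060) and RTP range (10,000 to 20,000) discussed in this guide.

```python
from typing import Optional

PBX_LAN_IP = "192.168.1.50"  # hypothetical LAN address of the PBX

# (first port, last port, LAN device that receives the traffic)
FORWARD_RULES = [
    (5060, 5060, PBX_LAN_IP),     # SIP signaling
    (10000, 20000, PBX_LAN_IP),   # RTP (voice) port range
]


def forward_target(dst_port: int) -> Optional[str]:
    """Return the LAN address that inbound traffic on dst_port is sent to,
    or None when no rule matches and the router has nowhere to send it."""
    for low, high, target in FORWARD_RULES:
        if low <= dst_port <= high:
            return target
    return None


print(forward_target(5060))    # signaling reaches the PBX
print(forward_target(15000))   # voice reaches the PBX
print(forward_target(25000))   # None: outside the forwarded range
```

If the RTP rule were missing, signaling would still arrive and the call would set up, but inbound voice packets would have no destination, which matches the one-way audio symptom.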
  
Once the call is setup, the RTP traffic is directed to ports for sending and receiving.&nbsp; These ports are determined through instructions in the call setup SIP packets.&nbsp; If the port forwarding is not configured properly the remote phone will not function properly.&nbsp; The symptom most often associated to “one way audio” is almost always caused by improper configuration of the RTP ports in the router.&nbsp; Some routers support Application Layer Gateway(ALG) functionality.&nbsp; While this usually appears to be designed for SIP, it most often interferes with packet delivery and must be turned off.
+
Disruptions in the RTP stream can arise when voice packets cannot reach their intended destination due to router configuration errors or an inability of the router to correctly perform NAT operations. Some routers are outright incapable of NAT, making them incompatible with remote IP phones.
  
It is easy to see how the RTP stream can be disrupted if the voice packets cannot reach the proper destination.&nbsp; Sometimes this is caused by the router configuration.&nbsp; Sometimes it can be the inability of the router to properly perform NAT functions.&nbsp; Some routers are simply not capable of NAT and therefore will not work with remote IP phones.
+
Having port forwarding enabled is critical, especially when considering remote access for maintenance, remote phones, and branch office connectivity. If a third party manages the router, all involved parties benefit from having these ports forwarded and the ALG disabled before the IP PBX is installed. Failure to follow these steps can lead to delays in implementation and should be factored in when providing price estimates to customers.
  
It is essential to be in a position to have port forwarding enabled for remote access for maintenance, remote phones and branch office connectivity.&nbsp; If a third party is in control of the router, it is in everyone’s best interest to have these ports forwarded and the ALG turned off and confirmed before the IP PBX is installed.&nbsp; Failure to have these ports forwarded will result in implementation delays and must be a consideration when proposing a price for the end customer.&nbsp;
+
In IP telephony, TCP/IP and the SIP protocol are employed to generate pure digitized sound. Any distortions such as "static," echo, hiss, or hum are not introduced by the PBX but arise from analog elements or packet loss. To troubleshoot these issues, check the analog connections (like the handset and its cable) and conduct a packet loss test.
  
IP Telephony over TCPIP using the SIP protocol produces pure digitized sound.&nbsp; There are no functions inside the PBX to add sounds like “static”, echo, hiss or hum.&nbsp; All of these sounds if present are produced in the analog world or are the result of packet loss.&nbsp; In order to troubleshoot issues on a TCPIP packetized network, it is necessary to look for the solutions in the most likely places.&nbsp;
+
IP phones are intelligent and circuit-independent devices. You can simply unplug a problematic phone and plug it into a different Ethernet connection for troubleshooting. If the problem persists, try using a phone known to function properly on the problematic Ethernet connection. If the same issues arise, inspect the cables and connections. Ensure that Ethernet cables are not draped over fluorescent lights or close to other devices that could introduce distortion into the packet delivery process.
 +
 
 +
== Implementing Quality of Service (QOS) is Critical in Your VoIP Installation<br/> ==
 +
 
 +
Implementing Quality of Service (QoS) is crucial for ensuring optimal performance in your VoIP installation. Underestimating the importance of proper QoS configuration can result in a poor user experience and increased support costs. Let's explore the significance of QoS and how to set it up effectively.
  
If a customer complains of static, it is most often packet loss in an IP network or an analog entry point like a handset.&nbsp; To identify the source of the problem, first check the analog connections e.g. handset, handset cable etc.&nbsp; Try a known good handset and cord.&nbsp; If that doesn’t solve the problem, run a test for packet loss.&nbsp;
 
  
IP Phones are intelligent devices and are not dependent on a circuit.&nbsp; It is easy to simply unplug the phone and plug it into another Ethernet connection.&nbsp; If that fixes the problem, plug a known good phone into the Ethernet connection of the phone that had issues. If a known good phone is plugged in to the Ethernet connection and exhibits the same problem, check the cables and connections for problems.&nbsp; Make sure the Ethernet cables are not draped over fluorescent lights are other devices that can induce distortion into the packet delivery process.
+
'''What Does QoS Do?'''
  
== Implementing Quality of Service (QOS) is Critical in Your VoIP Installation<br/> ==
+
QoS determines the priority of data packets on your Local Area Network (LAN). Since the available bandwidth on the LAN is shared by various applications, it is essential to prioritize voice packets for timely delivery. Voice packets are time-sensitive, and any interruption or delay can significantly degrade the audio quality of a call.
  
Implementing QOS has huge benefits for your VoIP application. Don’t underestimate the importance of setting this up properly. Proper configuration can save customers from a difficult experience as well as keep your support costs down.
 
  
=== What Does QOS Do?<br/> ===
+
'''Voice Packets vs. Regular Data Packets'''
  
QOS sets the priority for data packets on your LAN. The LAN has packets from a diverse set of applications all traveling through a limited amount of bandwidth. Voice occupies a very small portion of the bandwidth. Since the voice packets are delivered in a time sensitive manner, it is important that they do not get interrupted or delayed. If they do, the audio quality on the call can deteriorate to a noticeable degree.
+
Voice packets are distinguished from regular data packets by a designated field in their header. This distinction allows the data switch to prioritize voice packets, ensuring they are not delayed or interrupted. In a network, data packets are typically delivered on a best-effort basis, utilizing available bandwidth. However, if the network becomes congested, voice packets may be momentarily blocked by other data packets. Even a brief delay can cause noticeable audio interruptions in phone calls. By prioritizing voice packets, you guarantee uninterrupted voice communication. Since voice traffic occupies a minimal percentage of the total bandwidth, prioritizing voice packets does not have a noticeable impact on other data packets.
  
=== Voice Packets vs. Regular Data Packets<br/> ===
+
For instance, consider a scenario where 10 people on the LAN are simultaneously downloading a 20-megabyte file. In a standard 100Base-T network, this heavy data traffic could potentially block all other data temporarily. By prioritizing voice packets over data packets, voice communication experiences no delay because the downloaded file makes room for voice packets with minimal or imperceptible delay to the ongoing downloads.
  
Voice Packets are distinguished from other data packets by a designation in the voice packet Header. This allows the data switch to know how to prioritize the individual packets to avoid delaying voice packets. Networks always try to deliver data on a best efforts basis. If there is bandwidth available, the data switch will try to pass all of the packets through as soon as it gets them using all of the available bandwidth. If this happens, the voice packets can be momentarily blocked by all of the other data. Even though this may only take a few seconds, it is enough of a delay to cause the phone call to experience audio interruptions as packets are delivered too late to be able to be used. By prioritizing the voice packets, you insure that the voice will never be interrupted. Since the voice is a very small percentage of total bandwidth, there is no noticeable effect on all of the other data packets.
 
  
An example would be that 10 people on the LAN are trying to download a 20 meg file at the same time. In a normal 100 base T network that could completely block all data traffic for a brief time. By prioritizing the voice packets to always take priority over the data packets, the voice is delivered without delay because the downloaded file makes room for the voice packets with little or no perceptible delay to the downloads.
+
'''How to Set up QoS'''
  
=== How Do I Set up QOS?<br/> ===
+
QoS configuration is performed on the data switch. The IPitomy server uses specific settings to identify voice packets, which are set to CS3 by default. To ensure the highest priority for these packets, the data switch needs to be configured accordingly. Different switches employ various QoS labels, so you should determine the switch's specification to proceed. Since IPitomy utilizes the DSCP Class label, match that label in the switch to its highest priority setting (this could be a numerical value or "Highest," such as in the case of the Netgear FS728TP switch). It's important to ensure that no other devices on the network are using the same Class ID. If any other devices are utilizing it, either change their settings or modify the Class ID used by the IPitomy PBX under PBX Setup/SIP/Advanced. It is crucial to reserve the Class ID exclusively for voice traffic and not allocate it to other non-voice data devices.
  
QOS is set up in the data switch. The IPitomy server will have settings that it uses to identify the data packets. These settings are set to CS3 by default. The data switch will need to be configured to give the highest possible priority to these data packets. Switches use a variety of QOS labels so you will have to determine the scheme (specification) of the switch to be used. Since IPitomy uses the DSCP Class label, just match that label in the switch to the switch’s highest priority (this may be a digit or “Highest” as in the Netgear FS728TP). It’s important to know that no other devices on the network are utilizing that Class ID. If there are, change them or the IPitomy PBX under PBX Setup/SIP/Advanced. The Class ID used for voice traffic must not be used by other, non-voice data devices.
+
Note: QoS can only be set on the LAN, specifically in the data switch(es). It is not relevant for the Wide Area Network (WAN) or internet traffic since those routes are determined by network hops that are beyond your control. However, in private WANs like MPLS, the network provider may have the capability to configure QoS for point-to-point connections.
  
Note: QOS can only be set on the LAN [in the data switch(es)], it is not relevant on the WAN (Internet) since this media is routed by “hops” for which you have no control. The exception to this is private WAN’s like MPLS where the network provider may be able to configure QOS point-to-point.
+
VoIP (RTP) performs optimally when QoS is configured on the LAN. As a general rule, it is highly recommended to implement QoS for VoIP installations.
  
VOIP (RTP) works best when QOS is set on the LAN. As a rule, always implement QOS.
+
For further information on setting up QoS and its specific configuration for your system, please consult the relevant documentation or contact the IPitomy support team.
  
For more information on setting up QOS, see the article in IPitomy’s [http://wiki.ipitomy.com/index.php/QOS_Setup_Guide WIKI].
+
For more information on setting up QOS: [http://wiki.ipitomy.com/index.php/QOS_Setup_Guide Click Here].
 +
[[Category:Training]]

Latest revision as of 19:12, 12 June 2023

How Does VoIP Work?

This section provides an introductory overview of Voice over Internet Protocol (VoIP), a technology that enables the transmission of voice communications via IP networks.

VoIP works on the principle of audio sampling, where a computer records a sound (such as a human voice) at a high rate (typically at least 8,000 times per second) and converts these audio samples into digital data. Unlike traditional recording, where these samples are stored locally, VoIP sends these samples over an IP network to be played back on a different device.

The process of making VoIP function efficiently involves several key steps. Initially, the computer compresses the recorded sound samples to minimize the space they require, focusing particularly on voice frequencies. This compression and decompression process is handled by a tool known as a CODEC (compressor/decompressor). Numerous CODECs are available, and VoIP uses those optimized for voice compression, significantly reducing the bandwidth required compared to uncompressed audio.

After compression, the samples are grouped into larger units and inserted into data packets ready for transmission over the IP network, a process known as packetization. A typical IP packet can contain 10 or more milliseconds of audio, with 20 or 30 milliseconds being the most common.
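As a rough illustration of packetization, the sketch below works out how much audio fits in one packet. It assumes an uncompressed-style voice codec that uses 8 bits per sample at 8,000 samples per second (G.711 is one such codec); the function name `packet_payload` is hypothetical and is not part of any IPitomy software.

```python
def packet_payload(sample_rate_hz: int, bits_per_sample: int, packet_ms: int):
    """Return (audio bytes per packet, whole packets per second)."""
    # Number of audio samples captured during one packet interval.
    samples_per_packet = sample_rate_hz * packet_ms // 1000
    # Convert samples to bytes (assumed: a fixed number of bits per sample).
    payload_bytes = samples_per_packet * bits_per_sample // 8
    # Whole packets sent each second at this packet interval.
    packets_per_second = 1000 // packet_ms
    return payload_bytes, packets_per_second


if __name__ == "__main__":
    for ms in (10, 20, 30):
        size, pps = packet_payload(8000, 8, ms)
        print(f"{ms} ms packets: {size} bytes of audio, about {pps} packets per second")
```

Shorter packets reduce delay but mean more packets per second, and every packet carries its own header overhead, which is one reason 20 ms is a common compromise.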

A comparison could be made to sending postcards through traditional mail. Each postcard (packet) carries a limited amount of information. Sending a lengthy message would require multiple postcards (packets), and to ensure they can be assembled correctly at the destination, they are organized using a sequence number or similar mechanism.



Packets are sometimes delayed, just as postcards sent through the post office can be. This is particularly problematic for VoIP systems, because a voice packet that arrives late contains audio that is too old to play. Such late packets are simply discarded, just as if they had never been received. A small number of discarded packets is acceptable, as long as the gaps do not audibly distort the sound. Too much delay will noticeably degrade the sound quality.

IP devices generally measure the packet delay and expect it to remain relatively constant, though delay can increase and decrease during the course of a conversation. Variation in delay is called jitter. Delay by itself just means that it takes longer for the voice of the first person to be heard by the user on the far end. In general, good networks have an end-to-end delay of less than 100ms, though delay up to 400ms is considered acceptable (especially when using satellite systems). Jitter can result in choppy voice or temporary glitches, so VoIP devices implement jitter buffer algorithms to compensate for it. Essentially, a certain number of packets are queued before play-out, and the queue length may be increased or decreased over time to reduce the number of discarded, late-arriving packets or to reduce "mouth to ear" delay. Such "adaptive jitter buffer" schemes are also used by a wide variety of other devices that deal with variable delay.

 

Jitter in Packet Voice Networks

Jitter is defined as a variation in the delay of received packets. At the sending side, packets are sent in a continuous stream with the packets spaced evenly apart. Due to network congestion, improper queuing, or configuration errors, this steady stream can become lumpy, or the delay between each packet can vary instead of remaining constant.
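One common way to put a number on jitter is the running interarrival jitter estimate defined in the RTP specification (RFC 3550). The sketch below is a simplified version that works on millisecond timestamps rather than RTP timestamp units; the function name and the sample values are illustrative assumptions only.

```python
def interarrival_jitter(send_times_ms, recv_times_ms):
    """Running jitter estimate in the style of RFC 3550, in milliseconds."""
    jitter = 0.0
    for i in range(1, len(send_times_ms)):
        # Difference between how far apart two packets arrived
        # and how far apart they were sent.
        d = (recv_times_ms[i] - recv_times_ms[i - 1]) - (
            send_times_ms[i] - send_times_ms[i - 1]
        )
        # Smooth the estimate: move 1/16 of the way toward the new sample.
        jitter += (abs(d) - jitter) / 16
    return jitter


if __name__ == "__main__":
    sent = [0, 20, 40, 60]
    steady = [50, 70, 90, 110]   # constant 50 ms delay
    lumpy = [50, 70, 100, 110]   # third packet arrives 10 ms late
    print(interarrival_jitter(sent, steady))
    print(interarrival_jitter(sent, lumpy))
```

A constant delay produces an estimate of zero, which matches the point above: it is the variation in delay, not the delay itself, that counts as jitter.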

This diagram illustrates how a steady stream of packets is handled.

When an IP device receives a Real-time Transport Protocol (RTP) audio stream for Voice over IP (VoIP), it must compensate for the jitter that is encountered. The mechanism that handles this function is the playout delay buffer. The playout delay buffer must buffer these packets and then play them out in a steady stream to the digital signal processors (DSPs) to be converted back to an analog audio stream. The playout delay buffer is also sometimes referred to as the de-jitter buffer.

This diagram illustrates how jitter is handled.

If the jitter is so large that it causes packets to be received out of the range of this buffer, the out-of-range packets are discarded and dropouts are heard in the audio. For losses as small as one packet, the DSP interpolates what it thinks the audio should be and no problem is audible. When jitter exceeds what the DSP can do to make up for the missing packets, audio problems are heard.
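The behavior of a fixed playout delay buffer can be sketched in a few lines. This is a deliberately simplified model, not how any particular phone or DSP implements it: it assumes packet 0 arrives on time, that one packet is due every `packet_ms` milliseconds, and that a packet arriving after its playout deadline is discarded. All names and numbers are hypothetical.

```python
def late_packets(arrivals_ms, packet_ms=20, buffer_ms=40):
    """Return the sequence numbers that miss their playout deadline.

    arrivals_ms[i] is the arrival time of packet i in milliseconds.
    """
    first_arrival = arrivals_ms[0]
    discarded = []
    for seq, arrival in enumerate(arrivals_ms):
        # Packet `seq` must be played buffer_ms after the first arrival,
        # plus one packet interval for every packet that precedes it.
        deadline = first_arrival + buffer_ms + seq * packet_ms
        if arrival > deadline:
            discarded.append(seq)
    return discarded


if __name__ == "__main__":
    arrivals = [0, 22, 45, 130, 85, 100]
    print(late_packets(arrivals, buffer_ms=40))  # small buffer
    print(late_packets(arrivals, buffer_ms=80))  # larger buffer
```

With these sample arrivals the 40 ms buffer discards packet 3, while the 80 ms buffer plays every packet at the cost of an extra 40 ms of "mouth to ear" delay. That trade-off is what an adaptive jitter buffer tries to balance.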

This diagram illustrates how excessive jitter is handled.


Video works in much the same way as voice. Video information received through a camera is broken into small pieces, compressed with a CODEC, placed into small packets, and transmitted over the IP network. This is one reason why VoIP is promising as a new technology: adding video or other media is relatively simple. Of course, there are certain issues that must be considered that are unique to video (e.g., frame refresh and much higher bandwidth requirements), but the basic principles of VoIP equally apply to video telephony.

Of course there is much more to VoIP than just sending the audio/video packets over the Internet. There must also be an agreed protocol for how computers find each other and how information is exchanged in order to allow packets to ultimately flow between the communicating devices. There must also be an agreed format (called payload format) for the contents of the media packets.

VoIP is implemented in a variety of hardware devices, including IP phones, analog terminal adapters (ATAs), and gateways. In short, a large number of devices can enable VoIP communication, some of which allow one to use traditional telephone devices to interface with the IP networks.

In a well-performing network, VoIP calls should be as clear as, or clearer than, any other type of audio transmission. VoIP calls are pure digitized sound. Each audio packet contains a digital representation of the audio exactly as it was spoken into the microphone.

High definition voice contains a wider range of frequencies than typical voice transmissions and will deliver surprisingly good audio that contains a richer sound than most toll quality calls.

VoIP Protocols

The success of VoIP communication hinges on employing an appropriate set of protocols. Here, we'll discuss IPitomy SIP Trunking, the preferred protocol for all IPitomy devices currently in circulation as well as many other third-party industry offerings.

The Real-time Transport Protocol (RTP) is a standard used globally by nearly every device for transmitting audio and video packets between computers. RTP is defined by open standards. It handles issues such as packet ordering through sequence numbers and timestamps, and it works with a companion protocol, the RTP Control Protocol (RTCP), to report on delay and jitter.
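To make the role of RTP more concrete, the sketch below builds and parses the 12-byte fixed header that precedes the audio in every RTP packet (version, payload type, sequence number, timestamp, and a stream identifier called the SSRC). The helper names are hypothetical, and optional header features such as padding, extensions, and CSRC lists are deliberately left out.

```python
import struct


def build_rtp_header(seq, timestamp, ssrc, payload_type=0, marker=False):
    """Pack a minimal 12-byte RTP fixed header (no padding, extension, or CSRCs)."""
    first_byte = 2 << 6  # RTP version 2 in the top two bits
    second_byte = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack(
        "!BBHII",               # network byte order: 1+1+2+4+4 = 12 bytes
        first_byte,
        second_byte,
        seq & 0xFFFF,           # 16-bit sequence number, used for ordering
        timestamp & 0xFFFFFFFF, # 32-bit media timestamp, used for playout timing
        ssrc & 0xFFFFFFFF,      # 32-bit identifier of the sending stream
    )


def parse_rtp_header(data):
    """Unpack the fields written by build_rtp_header."""
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", data[:12])
    return {
        "version": b0 >> 6,
        "marker": bool(b1 >> 7),
        "payload_type": b1 & 0x7F,
        "sequence": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


if __name__ == "__main__":
    header = build_rtp_header(seq=1, timestamp=160, ssrc=0x1234ABCD)
    print(len(header), parse_rtp_header(header))
```

Payload type 0 is the static assignment for G.711 mu-law (PCMU) audio; the receiver uses the sequence number to detect loss and reordering, and the timestamp to schedule playout.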

Before media can flow between two devices, protocols are utilized to locate the remote device and negotiate the media flow methods. These crucial protocols are known as call-signaling protocols, with the Session Initiation Protocol (SIP) being the most widely used.


Advantages of IPitomy SIP Trunks

SIP, a text-based protocol, is relatively straightforward for machines to understand. Its easy readability facilitates troubleshooting by allowing the inspection of packet contents without needing to decompile the software entirely.

SIP's operation mirrors that of email, making it familiar and intuitive. Its addressing methods are particularly similar.

SIP calls offer undiluted digital voice quality, free of distortion, delay, or echo in its native environment.

Instant Scalability: Given IPitomy's IP PBX platform, capacity can be easily adjusted via a simple process in the admin GUI.

Interoperability with a wide range of devices provides users with choices far beyond what any single vendor can offer.

International dialing is supported.

User location is inconsequential; remote users can be deployed globally without losing direct connectivity. An adequate internet connection is the only requirement.


Understanding a SIP Call

A SIP call comprises a signaling component and a voice component. The paths for signaling and actual voice transmission differ for each call. Setup and teardown signaling for all calls operates over port 5060. Voice transmissions are conducted within the range of ports 10,000 to 20,000. These are virtual ports integral to TCP/IP communication.


Preconditions for a SIP Call

Before a SIP call can occur, SIP endpoints capable of finding and being found by other SIP endpoints are needed. We'll restrict our SIP examples to endpoints that register to an IPitomy IP PBX for this training guide. While peer-to-peer SIP calls are possible, most are facilitated via an IP PBX or soft switch for a PSTN-like ease of dialing. The advantage of an IP PBX is that it simplifies calling another endpoint, allows call information to be stored for reporting, and manages call routing to local and remote extensions. Users dial phone numbers and extension numbers as they would with a legacy PBX system. Although no longer recommended due to their End-Of-Life status, an IPitomy IP PBX still supports analog PSTN lines and T1/PRI cards. SIP endpoints must register with the PBX to be included in the PBX database. Once registered, they can dial phone numbers and receive calls from other endpoints. These endpoints can be phones on the Local Area Network (LAN) or any location on the internet.


Starting a SIP Call

To start a call, the endpoint sends an INVITE to the server, asking whether the other endpoint is available. This occurs on the signaling port (port 5060). If the other endpoint is ready, it answers with a 200 OK response, which the initiating phone confirms with an ACK. The session description carried in these messages tells each phone which ports to use for the RTP (voice) session, and the RTP session is then opened on those ports.
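Because SIP is text-based, the start of a call is easy to illustrate. The sketch below assembles a minimal INVITE whose body is a session description (SDP) advertising the RTP port, then reads that port back out. Every extension number, host name, IP address, tag, and Call-ID here is a made-up placeholder, and a real phone sends additional headers; treat it as a reading aid rather than a working SIP client.

```python
def build_invite(caller, callee, pbx_host, local_ip, rtp_port):
    """Assemble a minimal SIP INVITE with an SDP body (illustrative values only)."""
    sdp = "\r\n".join([
        "v=0",
        f"o={caller} 0 0 IN IP4 {local_ip}",
        "s=call",
        f"c=IN IP4 {local_ip}",            # where the caller wants to receive audio
        "t=0 0",
        f"m=audio {rtp_port} RTP/AVP 0",   # audio on this port, payload type 0
        "a=rtpmap:0 PCMU/8000",
    ]) + "\r\n"
    headers = [
        f"INVITE sip:{callee}@{pbx_host} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_ip}:5060;branch=z9hG4bK-example",
        "Max-Forwards: 70",
        f"From: <sip:{caller}@{pbx_host}>;tag=1234",
        f"To: <sip:{callee}@{pbx_host}>",
        f"Call-ID: example-call-id@{local_ip}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{caller}@{local_ip}:5060>",
        "Content-Type: application/sdp",
        f"Content-Length: {len(sdp.encode())}",
    ]
    # A blank line separates the SIP headers from the SDP body.
    return "\r\n".join(headers) + "\r\n\r\n" + sdp


def audio_port(message):
    """Return the RTP port advertised on the SDP 'm=audio' line, or None."""
    for line in message.splitlines():
        if line.startswith("m=audio "):
            return int(line.split()[1])
    return None


if __name__ == "__main__":
    invite = build_invite("201", "202", "pbx.example.com", "192.168.1.50", 10000)
    print(invite)
    print("RTP port offered:", audio_port(invite))
```

Reading a captured INVITE in the same way is often the quickest check of which RTP port a phone is asking the far end to use.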

Call termination is initiated by one of the endpoints sending a "bye" message, causing the call to hang up. This is a simplified explanation of a SIP call's lifecycle. When the call occurs on the LAN, it bypasses the router. The PBX instructs the endpoints on the ports to connect the RTP (voice) stream.

For calls to remote phones, the PBX understands that the phone is beyond the firewall. Router port configuration becomes necessary for signaling and RTP traffic at this stage.

Signaling Port 5060

Proper port configuration enables port 5060 to be forwarded in the router to the PBX system's LAN IP address, allowing the PBX to send signals to remote phones and receive requests from them.

RTP Ports 10,000 – 20,000 Port Range Forwarding

Once call setup occurs via signaling on Port 5060, RTP is set up using a range of ports forwarded to the PBX LAN IP address. The Port Range Forwarding feature in the Router is used to forward the range of Ports 10,000 – 20,000 for this purpose.

Each call requires two ports for RTP - one for sending and one for receiving. This is organized by the initiating phone. The router ports are open from inside the firewall. The remote phone receives the information about which ports to use from the SIP packets.

Local Phone Diagram

Remote Phone Diagram

Network Address Translation – NAT

TCP/IP, short for Transmission Control Protocol/Internet Protocol, is the underlying communication protocol used for data exchange on the internet. It leverages unique IP addresses to deliver data to the correct device on a network. The types and classes of IP addresses play a crucial role in how data is routed across the internet.

If you're online, odds are you're using Network Address Translation (NAT). This becomes increasingly likely given the growing internet user base. As of 2023, the internet is accessed by nearly 5 billion people worldwide, approximately 63% of the global population. This figure continues to rise, with nearly 200 million new users connecting in the year leading up to April 2023.

So what does the size of the Internet have to do with NAT? Everything! For a computer to communicate with other computers and Web servers on the Internet, it must have an IP address. An IP address (IP stands for Internet Protocol) is a unique 32-bit number that identifies the location of your computer on a network. Basically it works just like your street address: a way to find out exactly where you are and deliver information to you.

When IP addressing first came out, everyone thought that there were plenty of addresses to cover any need. Theoretically, you could have 4,294,967,296 unique addresses (2^32). The actual number of available addresses is smaller (somewhere between 3.2 and 3.3 billion) because of the way that the addresses are separated into Classes and the need to set aside some of the addresses for multicasting, testing or other specific uses.
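The address arithmetic above is easy to confirm with Python's standard `ipaddress` module. The helper names and the sample addresses are illustrative assumptions; the private-address check is included because such set-aside ranges are part of why the usable total is smaller than the theoretical one.

```python
import ipaddress


def ipv4_address_count():
    """Total number of distinct 32-bit IPv4 addresses."""
    return 2 ** 32


def is_private_address(address):
    """True if the address belongs to a range reserved for private networks."""
    return ipaddress.ip_address(address).is_private


if __name__ == "__main__":
    print(ipv4_address_count())
    # The whole IPv4 space expressed as a single network holds the same number.
    print(ipaddress.ip_network("0.0.0.0/0").num_addresses)
    print(is_private_address("192.168.1.50"))
    print(is_private_address("8.8.8.8"))
```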

With the explosion of the Internet and the increase in home networks and business networks, the number of available IP addresses is simply not enough. The obvious solution is to redesign the address format to allow for more possible addresses. This is being developed (IPv6) but will take several years to fully implement because it requires modification of the entire infrastructure of the Internet.

In our current internet landscape, dominated by IPv4 addressing, there's a finite number of unique IP addresses available. This limitation makes it challenging to provide every internet-connected device with its own IP address. The solution to this conundrum lies in the technique of Network Address Translation (NAT). The NAT process, employed by routers, enables a multitude of devices (like PCs, smartphones, and more) to share a single public IP address. This technique effectively extends the life of the current IPv4 addressing system until the broader implementation of IPv6, which promises a virtually limitless pool of IP addresses.

NAT operates by allowing the router to relay data to devices on the local area network (LAN) because it recognizes each device's unique internal IP address and Media Access Control (MAC) ID. When an external device needs to communicate with a device on the LAN via the internet, a specific route is required.

Consider the case of a remote IP phone initiating a call through a PBX system on the LAN. The phone needs to send packets to the PBX, and the router must be informed where to route these packets. This is achieved by forwarding port 5060 to the PBX on the LAN, meaning all traffic arriving at this port is directed to the PBX.

Once the call is established, the Real-time Transport Protocol (RTP) traffic is directed to designated ports for transmission and reception. These ports are assigned based on instructions in the SIP packets used in the call setup. Misconfiguration of port forwarding can cause issues, the most common of which is "one-way audio," typically resulting from improper configuration of the RTP ports in the router. Note that some routers include an Application Layer Gateway (ALG) feature. Although it is intended to help with SIP, it often hampers packet delivery and should therefore be disabled.
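The forwarding behavior described above can be modeled as a simple lookup table. This is a toy model of a router, not a real router configuration: the PBX LAN address `192.168.1.249` and all function names are hypothetical, and the rules only mirror the ports named in this guide (5060 for signaling, 10,000 to 20,000 for RTP).

```python
PBX_LAN_IP = "192.168.1.249"  # hypothetical LAN address of the PBX

# Each rule: (first port, last port, LAN address that receives the traffic).
FORWARDING_RULES = [
    (5060, 5060, PBX_LAN_IP),     # SIP signaling
    (10000, 20000, PBX_LAN_IP),   # RTP (voice) port range
]


def forward_target(dest_port, rules=FORWARDING_RULES):
    """Return the LAN address an inbound packet is forwarded to, or None if dropped."""
    for first, last, lan_ip in rules:
        if first <= dest_port <= last:
            return lan_ip
    return None


if __name__ == "__main__":
    for port in (5060, 15000, 80):
        print(port, "->", forward_target(port))

    # With only the signaling rule in place, calls set up but inbound RTP is dropped.
    signaling_only = [(5060, 5060, PBX_LAN_IP)]
    print(15000, "->", forward_target(15000, signaling_only))
```

The last case is the one-way audio scenario: signaling on 5060 gets through, so the call connects, but inbound voice packets have nowhere to go.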

Disruptions in the RTP stream can arise when voice packets cannot reach their intended destination due to router configuration errors or an inability of the router to correctly perform NAT operations. Some routers are outright incapable of NAT, making them incompatible with remote IP phones.

Having port forwarding enabled is critical, especially when considering remote access for maintenance, remote phones, and branch office connectivity. If a third party manages the router, all involved parties benefit from having these ports forwarded and the ALG disabled before the IP PBX is installed. Failure to follow these steps can lead to delays in implementation and should be factored in when providing price estimates to customers.

In IP telephony, TCP/IP and the SIP protocol are employed to generate pure digitized sound. Any distortions such as "static," echo, hiss, or hum are not introduced by the PBX but arise from analog elements or packet loss. To troubleshoot these issues, check the analog connections (like the handset and its cable) and conduct a packet loss test.
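One simple way to estimate packet loss is to look for gaps in the sequence numbers of the packets that actually arrived, for example from a packet capture. The sketch below is a minimal version of that idea under stated assumptions: the function name is hypothetical, and it ignores the 16-bit wraparound of real RTP sequence numbers as well as packets lost before the first or after the last one seen.

```python
def packet_loss_percent(received_seqs):
    """Estimate loss from the sequence numbers of the packets that arrived."""
    if not received_seqs:
        return 0.0
    # Packets that should have arrived between the lowest and highest number seen.
    expected = max(received_seqs) - min(received_seqs) + 1
    # Count each sequence number once, so duplicates do not hide loss.
    lost = expected - len(set(received_seqs))
    return 100.0 * lost / expected


if __name__ == "__main__":
    clean = [1, 2, 3, 4, 5]
    gappy = [1, 2, 4, 5, 7, 8, 9, 10]  # packets 3 and 6 never arrived
    print(packet_loss_percent(clean))
    print(packet_loss_percent(gappy))
```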

IP phones are intelligent and circuit-independent devices. You can simply unplug a problematic phone and plug it into a different Ethernet connection for troubleshooting. If the problem persists, try using a phone known to function properly on the problematic Ethernet connection. If the same issues arise, inspect the cables and connections. Ensure that Ethernet cables are not draped over fluorescent lights or close to other devices that could introduce distortion into the packet delivery process.

Implementing Quality of Service (QoS) is Critical in Your VoIP Installation

Implementing Quality of Service (QoS) is crucial for ensuring optimal performance in your VoIP installation. Underestimating the importance of proper QoS configuration can result in a poor user experience and increased support costs. Let's explore the significance of QoS and how to set it up effectively.


What Does QoS Do?

QoS determines the priority of data packets on your Local Area Network (LAN). Since the available bandwidth on the LAN is shared by various applications, it is essential to prioritize voice packets for timely delivery. Voice packets are time-sensitive, and any interruption or delay can significantly degrade the audio quality of a call.


Voice Packets vs. Regular Data Packets

Voice packets are distinguished from regular data packets by a designated field in their header. This distinction allows the data switch to prioritize voice packets, ensuring they are not delayed or interrupted. In a network, data packets are typically delivered on a best-effort basis, utilizing available bandwidth. However, if the network becomes congested, voice packets may be momentarily blocked by other data packets. Even a brief delay can cause noticeable audio interruptions in phone calls. By prioritizing voice packets, you guarantee uninterrupted voice communication. Since voice traffic occupies a minimal percentage of the total bandwidth, prioritizing voice packets does not have a noticeable impact on other data packets.

For instance, consider a scenario where 10 people on the LAN are simultaneously downloading a 20-megabyte file. In a standard 100Base-T network, this heavy data traffic could potentially block all other data temporarily. By prioritizing voice packets over data packets, voice communication experiences no delay because the downloaded file makes room for voice packets with minimal or imperceptible delay to the ongoing downloads.
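The numbers in this example can be checked with some quick arithmetic. The sketch assumes a G.711-style call sending 160 bytes of audio 50 times per second with 40 bytes of RTP, UDP, and IPv4 headers per packet (Ethernet framing is ignored), and it treats the 100Base-T link as an ideal 100 Mbit/s pipe. The function names are hypothetical.

```python
def voice_call_bps(payload_bytes=160, packets_per_second=50, overhead_bytes=40):
    """One-way bandwidth of a single call in bits per second (IP level)."""
    return (payload_bytes + overhead_bytes) * 8 * packets_per_second


def transfer_seconds(file_megabytes, copies, link_mbps):
    """Time to move `copies` files over the link at full line rate."""
    total_bits = file_megabytes * 1_000_000 * 8 * copies
    return total_bits / (link_mbps * 1_000_000)


if __name__ == "__main__":
    call = voice_call_bps()
    share = 100 * call / 100_000_000
    print(f"One call: {call} bit/s, {share:.2f}% of a 100 Mbit/s link")
    print(f"10 x 20 MB downloads: {transfer_seconds(20, 10, 100):.0f} s at line rate")
```

Under these assumptions a call uses well under one percent of the link, while the downloads can saturate it for many seconds, which is why giving voice priority costs the downloads almost nothing.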


How to Set up QoS

QoS configuration is performed on the data switch. The IPitomy server uses specific settings to identify voice packets, which are set to CS3 by default. To ensure the highest priority for these packets, the data switch needs to be configured accordingly. Different switches employ various QoS labels, so you should determine the switch's specification to proceed. Since IPitomy utilizes the DSCP Class label, match that label in the switch to its highest priority setting (this could be a numerical value or "Highest," such as in the case of the Netgear FS728TP switch). It's important to ensure that no other devices on the network are using the same Class ID. If any other devices are utilizing it, either change their settings or modify the Class ID used by the IPitomy PBX under PBX Setup/SIP/Advanced. It is crucial to reserve the Class ID exclusively for voice traffic and not allocate it to other non-voice data devices.
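For reference, the DSCP class CS3 corresponds to the decimal DSCP value 24, which occupies the upper six bits of the IP header's type-of-service byte. The sketch below shows that conversion and how an application could mark its own UDP traffic with it. It is a generic illustration, not IPitomy code: the function names are hypothetical, and whether the marking is honored depends on the operating system and on the switch configuration.

```python
import socket

DSCP_CS3 = 24  # Class Selector 3: binary 011000


def dscp_to_tos(dscp):
    """Shift a 6-bit DSCP value into the upper bits of the 8-bit TOS byte."""
    return (dscp & 0x3F) << 2


def open_marked_udp_socket(dscp=DSCP_CS3):
    """Create a UDP socket whose outgoing packets request the given DSCP marking."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # IP_TOS sets the whole TOS byte; some platforms ignore or restrict this option.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock


if __name__ == "__main__":
    print(f"CS3 -> DSCP {DSCP_CS3} -> TOS byte 0x{dscp_to_tos(DSCP_CS3):02x}")
```

When a switch asks for the value in a different form, the same class may appear as CS3, DSCP 24, or TOS 0x60 (96), so it helps to know all three.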

Note: QoS can only be set on the LAN, specifically in the data switch(es). It is not relevant for the Wide Area Network (WAN) or internet traffic since those routes are determined by network hops that are beyond your control. However, in private WANs like MPLS, the network provider may have the capability to configure QoS for point-to-point connections.

VoIP (RTP) performs optimally when QoS is configured on the LAN. As a general rule, it is highly recommended to implement QoS for VoIP installations.

For further information on setting up QoS and its specific configuration for your system, please consult the relevant documentation or contact the IPitomy support team.

For more information on setting up QoS, see the QOS Setup Guide on the IPitomy wiki: http://wiki.ipitomy.com/index.php/QOS_Setup_Guide