Difference between revisions of "Troubleshooting Network"

From IPitomy Wiki

Revision as of 19:17, 9 November 2012

  1. What has changed? If the network and IP PBX were working fine for a period of time and problems appear suddenly, look at what has changed on the network recently. If you can undo the change and everything returns to normal, you can usually determine what is wrong by evaluating the change. Sometimes something as simple as moving furniture crushes a cable or wall jack, and this is never reported. Ask questions about any changes to the environment.
  2. Isolate the issue. The easiest way to do this is to swap the phone with one that is working properly and has not shown the issue. If the problem follows the phone, the cause is most likely a setting or a fault in the phone. If the problem stays with the location and appears on the other phone in that same location, the problem is probably on that segment of the LAN. After this has been tested, try moving the connection to a different port on the data switch, since the original port could be bad.
  3. Check the switch. You should use switches that implement QoS and provide error reporting. Some switches even have built-in cable testers. If you log into the switch, you should be able to find error counters for each port. Reset the counters to 0 and check back in a few minutes or hours, depending on the severity and frequency of the issue. If you find errors on particular ports, replace the connectors or cabling related to those ports. If there are errors on a particular segment, try removing the devices one at a time. If the errors stop when a device is disconnected, this can indicate a bad network interface card. Occasionally a particular port on a switch malfunctions as well; in this case you can often simply stop using that port. If hardware is failing, however, we recommend replacing it.
    1. Use the error logs in the switch to reduce the time it takes to find the network segment at fault. You can usually restart the log (zero it out) and then return periodically to see how many errors are occurring on each port.
  4. Look at the Monitor screen in the PBX Setup – Reports menu. Is that extension lagged? Does it show about the same number of milliseconds as the other phones on the Monitor screen? The number of milliseconds is the time it takes a packet to travel round trip between the PBX and the phone. If this extension shows a longer time than the other phones, the cause could be a weak connection. Using the diagnostic tools in the Reports section, ping the IP address of the phone. If the ping time is noticeably higher than for other phones, you need to look at the network. Try running the traceroute utility to the IP address of the phone. The traceroute output lists each hop along the path and can show where delay or packet loss occurs.
  5. Put a cable tester on the segment and make sure that it is performing at Cat5 or better. If it does not pass, check all of the cable connections and replace or re-terminate the wall jacks and plugs. If you don’t have a cable tester, try reseating all of the RJ-45 plugs in the segment; they may not be properly connected. If you simply want to do all you can before moving on to the next stop in your day, re-terminate all the wall jacks and test. You might be pleasantly surprised to find that the issue has disappeared.
  6. If you have retested the segment with your cable tester and are confident that the cables are not the problem, look at the Quality of Service (QoS) settings in the data switch. If QoS is not set up properly, large data transfers can occasionally block voice packets on the LAN. QoS is easy to set up; just follow the instructions provided by the data switch manufacturer. If you don’t have a Smart/Managed Switch, you will not be able to set the QoS parameters on the LAN. We recommend installing one if you want to resolve this issue.
  7. If for some reason you are not able to set up QoS on the LAN, you will need to discuss the options with the owner. If they do not want to purchase a Smart/Managed Switch capable of QoS, your only choice is to explain that they must refrain from bandwidth-heavy applications such as video and Internet music. QoS addresses the issue by giving voice packets priority over other data packets.
  8. Packet capture. You can use the packet capture utility to confirm that packets are being sent between the PBX and the phone in both directions. A packet capture is only helpful if it is very small and contains only the roughly 30 seconds of data needed to reproduce the issue. If you cannot reproduce the issue while a short capture is running, analyzing a longer one is like looking for a needle in a haystack.
  9. Ask end users how other network devices are performing. If they tell you that some computers can’t always reach the internet, or that they have to reboot them to get them working, this is a pretty good indicator that something is wrong with the network. It may or may not be related to the particular connections on the machines they describe. The thing to understand about networks is that they are one large electrical system. If one connection on the system is bad and sending out invalid data, any of the machines attached to that system can have their connectivity disrupted. The bad data may be seen as an error, which can cause a switch to spend a lot of time handling each error. It may be misinterpreted as some sort of broadcast packet and distributed to every connection on the network. Or it may be treated as a packet for a specific destination, in which case it could affect that destination (another computer on the LAN).
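
The counter check in step 3 amounts to comparing two readings of the per-port error counters. A minimal Python sketch of that comparison follows. It assumes you have copied the counters from the switch by hand into two dictionaries; the function name and the port labels are hypothetical and not part of any switch or PBX interface.

```python
def ports_with_new_errors(before, after):
    """Rank ports by how many errors they gained between two readings.

    `before` and `after` are hypothetical dicts mapping a port label to its
    error count, copied by hand from the switch's port statistics page.
    """
    increases = {}
    for port, count in after.items():
        previous = before.get(port, 0)
        # A reading lower than the earlier one means the counter was reset
        # in between, so every error in the new reading is treated as new.
        increase = count - previous if count >= previous else count
        if increase > 0:
            increases[port] = increase
    # Worst port first, so the segment to inspect is at the top.
    return sorted(increases.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    first_reading = {"1": 0, "2": 5, "3": 10}
    later_reading = {"1": 0, "2": 47, "3": 10, "4": 3}
    for port, new_errors in ports_with_new_errors(first_reading, later_reading):
        print(f"port {port}: {new_errors} new errors")
```

Ports that appear at the top of the result are the first candidates for re-terminating or replacing cabling.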
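
The comparison in step 4, checking whether one extension shows a longer round-trip time than the other phones, can also be sketched in Python. This is only an illustration: the readings are assumed to be typed in from the Monitor screen, and the function name, the `factor` multiplier, and the `floor_ms` margin are hypothetical choices rather than values taken from the PBX.

```python
from statistics import median


def flag_lagged_extensions(rtt_ms, factor=2.0, floor_ms=5.0):
    """Return extensions whose round-trip time is well above the site median.

    `rtt_ms` is a hypothetical dict mapping an extension to its round-trip
    time in milliseconds. `factor` and `floor_ms` are assumed thresholds.
    """
    if not rtt_ms:
        return []
    baseline = median(rtt_ms.values())
    # Require both a multiple of the median and a minimum absolute gap, so
    # that tiny differences on a very fast LAN are not flagged.
    threshold = max(baseline * factor, baseline + floor_ms)
    return sorted(ext for ext, ms in rtt_ms.items() if ms > threshold)


if __name__ == "__main__":
    readings = {"101": 12.0, "102": 11.0, "103": 14.0, "104": 95.0}
    print(flag_lagged_extensions(readings))
```

The median is used as the baseline so that one badly lagged phone does not shift the reference value it is compared against.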