Detecting errors like dropped packets or retransmissions at the network level is relatively easy. Figuring out whether those errors affect the performance and connectivity of your services is, however, another matter. Some network errors are mitigated and compensated for by network protocols and active networking components, like network interfaces. Meanwhile, other network errors lead to performance problems that negatively affect your services.
Following is an overview of common network errors and root causes, means and approaches of detecting such errors, and suggestions as to how monitoring tools can support you in staying on top of your services’ connectivity and performance.
TCP – Your protocol of choice since 1981
The TCP/IP protocol suite that we all know so well has been around for almost 40 years now. Although some alternatives have been developed over the years, TCP/IP still works well and it’s the foundation of almost all networking as we know it today. One of the reasons this protocol stack is still around is that it’s capable of compensating for many errors on its own. TCP, appropriate to the season, is the Santa Claus of protocols. It knows if your service is sleeping, it knows if it’s awake, it knows if the connections run bad or good, so listen closely to what it says. Your services need not worry about retransmissions or network congestion. TCP/IP does everything in its power to make sure that your stateful connections are reliable and perform well. Nevertheless, anybody running applications in production needs to understand TCP and its basics.
The top five common network errors
Network collisions
This is an oldie but a goodie that’s now almost irrelevant because of full-duplex switches and technology advances. Back in the day, if two devices on the same Ethernet network (e.g., connected through a hub) tried to transmit data at the same time, the network would detect the collision and drop both packets. The CSMA/CD protocol, which made sure that nobody else was transmitting data before a device started transmitting its own data, was a step in the right direction. With full-duplex switches, where communication endpoints can talk to each other at the same time, this potential error is obsolete. Even in wireless networks, which still work basically like hubs, network collisions can be neglected because there are procedures in place to avoid collisions in the first place (e.g., CSMA/CA or RTS/CTS).
Checksum errors
When you download files from the Internet you often have the option of checking a file’s integrity with an MD5 or SHA-1 hash. Checksums on the network level work in a similar way: they let us detect whether a bit was toggled, lost, or duplicated during data transmission. Checksums help ensure that the received data is identical to the transmitted data.

Packets with incorrect checksums aren’t processed by the receiving host. If the Ethernet checksum (CRC) is wrong the Ethernet frame is silently dropped by the network interface and is never seen by the operating system, not even with packet capturing tools. With the IP checksum and TCP checksum in the respective headers there are two additional supervisory bodies that can detect integrity errors. Be aware that despite the efforts of checksumming, there are some errors that can’t be detected.
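To make the last point concrete, here is a minimal Python sketch of a 16-bit one’s-complement checksum in the style of RFC 1071, the kind of checksum used in the IP and TCP headers. The function name and the sample byte strings are made up for illustration. The sketch shows that a single toggled bit changes the checksum, while two swapped 16-bit words go unnoticed.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


# Illustrative payloads, not taken from a real capture.
original = b"\x12\x34\xab\xcd"
bit_flipped = b"\x12\x35\xab\xcd"  # one bit toggled in transit
reordered = b"\xab\xcd\x12\x34"    # two 16-bit words swapped

print(internet_checksum(original) == internet_checksum(bit_flipped))  # False
print(internet_checksum(original) == internet_checksum(reordered))    # True
```

Because the checksum is a plain sum, any change that leaves the sum intact slips through. The Ethernet CRC is considerably stronger, which is one reason several layers of checking are stacked on top of each other.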
Full queues
If the processing queue on a switch or router is overloaded, incoming packets are dropped. Likewise, if the queue for incoming packets on the host you’re trying to connect to is full, the packets are dropped. This behavior is actively exploited during DoS/DDoS attacks. So while it’s actually a good thing that a host only accepts as many packets as it can process, this behavior can be used to take down your service.
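The dropping behavior can be pictured with a small Python sketch. It is a deliberately simplified model that assumes a fixed-size queue which discards every packet arriving while it is full. Real switches, routers, and kernels may use more elaborate policies, and the function name and capacity here are hypothetical.

```python
from collections import deque


def enqueue_with_drop(queue, capacity, packets):
    """Append packets to the queue and return how many had to be dropped."""
    dropped = 0
    for packet in packets:
        if len(queue) >= capacity:
            dropped += 1          # queue is full: the packet is lost
        else:
            queue.append(packet)
    return dropped


# A burst of 10 packets hits a queue that can hold only 4.
queue = deque()
dropped = enqueue_with_drop(queue, capacity=4, packets=range(10))
print(len(queue), dropped)  # 4 6
```

A flood of packets fills the queue faster than it is drained, so legitimate packets end up among the dropped ones.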
Time to live exceeded
The Time to live (TTL) field in the IPv4 header has a misleading name. Every router that forwards an IP packet decreases the value of the field by one; it actually has nothing to do with time at all. In the IPv6 header this field is called “hop limit”. If the TTL value hits 0, an ICMP message “time to live exceeded” is sent to the sender of the packet. Some network components, however, silently drop packets with a TTL of zero. This mechanism is useful for preventing packets from becoming caught up in an endless routing loop within your network. The observant reader and network veteran is familiar with this technique because traceroute uses it to identify all hops that a packet makes on its route to its destination.
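The way traceroute exploits the TTL field can be sketched in a few lines of Python. This is a pure simulation with no network access; the hop names are made up, and it assumes that every router reports an expired TTL instead of dropping the packet silently.

```python
def forward(ttl, path):
    """Simulate sending one packet with the given TTL along a path of hops."""
    for hop in path[:-1]:      # every router in front of the destination
        ttl -= 1               # each forwarding router decrements the TTL
        if ttl == 0:
            return f"time exceeded reported by {hop}"
    return f"delivered to {path[-1]}"


def trace(path):
    """Send probes with TTL 1, 2, 3, ... the way traceroute does."""
    return [forward(ttl, path) for ttl in range(1, len(path) + 1)]


for line in trace(["router-a", "router-b", "server"]):
    print(line)
# time exceeded reported by router-a
# time exceeded reported by router-b
# delivered to server
```

Each probe dies one hop further along the route, so the sequence of error messages reveals the routers on the path.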
Packet retransmissions
First off, retransmissions are essential for assuring reliable end-to-end communication in networks. Retransmissions are a sure sign that the self-healing powers of the TCP protocol are working. They are the symptom of a problem, not a problem in themselves. Common reasons for retransmissions include network congestion where packets are dropped (either a TCP segment is lost on its way to the destination, or the associated ACK is lost on the way back to the sender), tight router QoS rules that give preferential treatment to certain protocols, and TCP segments that arrive out of order at their destination because they were reordered on the way from the sender. The retransmission rate of traffic from and to the Internet should not exceed 2%. If the rate is higher, the user experience of your service may be affected.
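As a rough illustration of the 2% guideline, the following Python sketch computes a retransmission rate and compares it with the threshold. It assumes the rate is defined as retransmitted segments divided by all segments sent, which the text above does not spell out, and the counter values are invented.

```python
THRESHOLD_PERCENT = 2.0  # guideline from the text above


def retransmission_rate(segments_sent, segments_retransmitted):
    """Return retransmitted segments as a percentage of segments sent."""
    if segments_sent <= 0:
        raise ValueError("segments_sent must be positive")
    return 100.0 * segments_retransmitted / segments_sent


# Invented counters for illustration.
rate = retransmission_rate(segments_sent=12_000, segments_retransmitted=54)
status = "worth a look" if rate > THRESHOLD_PERCENT else "within the guideline"
print(f"{rate:.2f}% retransmitted, {status}")  # 0.45% retransmitted, within the guideline
```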
The three commands you need to know to gather information about network errors
Now we know about common errors – let’s take a look at network troubleshooting. The good news is that most of the problems are findable using standard tools that are usually part of your operating system.
ifconfig
The first place to go to find basic information about your network interfaces is good old ifconfig.

Besides the MAC address and the IP address information for v4 and v6 you’ll find detailed statistics about received and transmitted packets. The line that starts with RX contains information about received packets. The TX lines contain information about transmitted packets.
RX information

- packets shows the number of successfully received packets.
- errors can result from faulty network cables, faulty hardware (e.g., NICs, switch ports), CRC errors, or a speed or duplex mismatch between computer and switch, which would also manifest itself in a high number of collisions (CSMA/CD sends its regards). You can check the configuration on your computer using ethtool <device> to find out at which speed your network interface is operating and whether the connection is full duplex.
- dropped can indicate that your system can’t process incoming packets or send outgoing packets fast enough, you’re receiving or sending packets with bad VLAN tags, you’re using unknown protocols, or you’re receiving IPv6 packets and your computer doesn’t support IPv6. You can counter the first cause by increasing the RX ring buffer of your network interface using ethtool. The ring buffer is the buffer that the NIC transfers frames to before raising an IRQ in the kernel.
- overruns displays the number of FIFO overruns, which indicates that the ring buffer fills up faster than the kernel can empty it.
- frame counts the number of received misaligned Ethernet frames.
TX information

- packets shows the number of successfully transmitted packets.
- errors shows the number of errors that occurred while transmitting packets due to carrier errors (duplex mismatch, faulty cable), FIFO errors, heartbeat errors, and window errors.
- dropped indicates network congestion, e.g., the queue on the switch port your computer is connected to is full and packets are dropped because it can’t transmit data fast enough.
- overruns indicates that the ring buffer of the network interface is full and the network interface doesn’t seem to get any kernel time to send out the frames stuck in the ring buffer. Again, increasing the TX buffer using ethtool may help.
- carrier shows the number of carrier errors, indicating a duplex mismatch or faulty hardware.
- collisions shows the number of collisions that occurred while transmitting packets which, in modern networks, should be zero.
- txqueuelen controls the length of the transmission buffer of the network interface. This parameter is relevant only for some queueing disciplines and can be overridden using the tc command. For more information about queueing disciplines, take a look at this deep dive into Queueing in the Linux Network Stack and the tc-pfifo man page.
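If you want to read the RX and TX counters described above from a script, Linux exposes the same kernel counters in /proc/net/dev. The following Python sketch parses that format. The sample text and its numbers are invented for illustration, and the function name is hypothetical. In that file, errs, drop, fifo, and colls correspond to what ifconfig labels errors, dropped, overruns, and collisions.

```python
RX_FIELDS = ["bytes", "packets", "errs", "drop", "fifo", "frame", "compressed", "multicast"]
TX_FIELDS = ["bytes", "packets", "errs", "drop", "fifo", "colls", "carrier", "compressed"]

# Invented sample in the layout of /proc/net/dev (two header lines, one line per interface).
SAMPLE_PROC_NET_DEV = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
  eth0: 9876543   12000    2    5    0     1          0         0  1234567    8000    0    0    0     0       0          0
"""


def parse_proc_net_dev(text):
    """Return {interface: {"rx": {...}, "tx": {...}}} from /proc/net/dev content."""
    stats = {}
    for line in text.splitlines()[2:]:       # skip the two header lines
        if ":" not in line:
            continue
        name, values = line.split(":", 1)
        numbers = [int(value) for value in values.split()]
        stats[name.strip()] = {
            "rx": dict(zip(RX_FIELDS, numbers[:8])),
            "tx": dict(zip(TX_FIELDS, numbers[8:16])),
        }
    return stats


stats = parse_proc_net_dev(SAMPLE_PROC_NET_DEV)
print(stats["eth0"]["rx"]["errs"], stats["eth0"]["rx"]["drop"])  # 2 5
```

On a Linux host you could pass the content of /proc/net/dev instead of the sample string.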
netstat
To see more detailed network statistics for the protocols TCP, UDP, IP, and ICMP, you can use netstat -s. This returns a lot of information in a human-readable format, like the number of retransmitted and dropped packets, sorted by protocol. If you want to focus on TCP retransmissions, you can filter for the relevant information.

netstat shows that there are 54 retransmitted segments, meaning that for 54 TCP segments the corresponding ACK was not received within the timeout. Three TCP segments were “fast retransmitted” following the fast retransmit algorithm in RFC 2581. TCP SYN retransmissions can happen if you want to connect to a remote host and the port on the remote host isn’t open (see example below).
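Filtering the netstat -s output can also be done in a script. The Python sketch below pulls the retransmission counters out of text shaped like that output. The sample is invented: it reuses the 54 retransmitted and 3 fast-retransmitted segments mentioned above, while the number of sent segments is made up. The exact wording of the lines differs between netstat versions, so the patterns are an assumption you may need to adapt.

```python
import re

# Invented sample shaped like an excerpt of `netstat -s` output on Linux.
SAMPLE_NETSTAT = """\
Tcp:
    12000 segments sent out
    54 segments retransmitted
TcpExt:
    3 fast retransmits
"""

# The patterns tolerate spelling variants seen in different netstat versions.
PATTERNS = {
    "sent": r"(\d+) segments sen[dt] out",
    "retransmitted": r"(\d+) segments retransmit+ed",
    "fast_retransmits": r"(\d+) fast retransmits",
}


def tcp_retransmission_counters(netstat_output):
    """Return the counters found in the output; missing ones are left out."""
    counters = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, netstat_output)
        if match:
            counters[name] = int(match.group(1))
    return counters


print(tcp_retransmission_counters(SAMPLE_NETSTAT))
# {'sent': 12000, 'retransmitted': 54, 'fast_retransmits': 3}
```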

ethtool
This tool allows you to query and control the settings of the network interface and the network driver, as seen before. It shows you a detailed list of all errors that can occur on the network interface level, like CRC errors and carrier errors. If you have no retransmissions on the TCP layer but ifconfig still shows you a lot of erroneous packets, this is the place to look for the specifics. If a lot of errors show up in the ethtool output, it usually means that there is something wrong with the hardware (NIC, cable, switch port).
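To spot the interesting lines in a long list of interface statistics, a small filter helps. The Python sketch below assumes text in the "name: value" layout printed by ethtool -S <device>. The counter names in the sample are illustrative, because they differ from driver to driver, and the keyword list used to recognize error counters is a heuristic of this sketch.

```python
# Invented sample in the layout of `ethtool -S <device>`; counter names vary by driver.
SAMPLE_ETHTOOL_STATS = """\
NIC statistics:
     rx_packets: 12000
     tx_packets: 8000
     rx_crc_errors: 2
     tx_carrier_errors: 0
     rx_missed_errors: 0
"""

ERROR_KEYWORDS = ("err", "drop", "miss")  # heuristic, not an official list


def nonzero_error_counters(ethtool_output):
    """Return counters whose name looks like an error counter and whose value is > 0."""
    found = {}
    for line in ethtool_output.splitlines():
        name, separator, value = line.strip().partition(":")
        value = value.strip()
        if not separator or not value.isdigit():
            continue  # skips the "NIC statistics:" header and malformed lines
        if any(keyword in name for keyword in ERROR_KEYWORDS) and int(value) > 0:
            found[name] = int(value)
    return found


print(nonzero_error_counters(SAMPLE_ETHTOOL_STATS))  # {'rx_crc_errors': 2}
```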

Some might still want to dig deeper to find out everything about those errors. The next step would be to read the Linux Device Drivers book, digest it, and then start reading through the kernel source code (e.g., linux/netdevice.h) and network driver code (e.g., Intel e1000 driver).
tcpretrans
tcpretrans is part of the perf-tools package. It offers you a live ticker of retransmitted TCP segments, including source and destination address and port, and TCP state information. If you suspect that more than one application or service is responsible for TCP retransmissions, you can call your services in isolation from each other and watch the output of tcpretrans to debug your network connections.

tcpdump
tcpdump is a command-line network analyzer that shows the traffic specified by filters directly on the command line. With a command-line parameter you can write the output to a file for future analysis. tcpdump is available in almost every *nix distribution out of the box and is therefore the tool of choice for a quick, pragmatic network analysis.
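If you start captures from a script, it helps to assemble the tcpdump command line in one place. The Python sketch below only builds and prints the command; it doesn’t run it. It relies on the tcpdump options -i (interface) and -w (write packets to a file), while the interface name, file name, and filter expression are placeholder assumptions.

```python
import shlex


def tcpdump_command(interface, capture_file, capture_filter):
    """Build the argument list for a capture that is written to a file."""
    return ["tcpdump", "-i", interface, "-w", capture_file, capture_filter]


# Placeholder values; adjust them to your environment.
command = tcpdump_command("eth0", "capture.pcap", "tcp port 8080")
print(shlex.join(command))  # tcpdump -i eth0 -w capture.pcap 'tcp port 8080'
```

The list can be handed to subprocess.run as is. Capturing usually requires elevated privileges.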
Wireshark
Wireshark, formerly Ethereal, is the Swiss Army knife of network and protocol analyzer tools for Windows and Unix when it comes to analyzing TCP sessions, identifying failed connections, and seeing all network traffic that travels to and from your computer. You can configure it to listen on a specific network interface, specify filters to, for example, concentrate on a certain protocol, host, or port, and you can dump captured traffic to a file for future analysis. Also, Wireshark can read tcpdump files, so you can capture traffic on one host on the command line and open the file for analysis in Wireshark on your computer. Another feature of Wireshark is that it knows a lot of common application protocols (e.g., HTTP and FTP). Thus you can see what’s going on above layer 4 and get insight into the payloads that are sent using TCP.

Following is an overview that shows which OSI layers the tools mentioned above cover and on which OSI layer the above-mentioned network errors occur.

Now you know how and where to find information about network errors. But what can you learn from this information? For starters, you can learn what type of errors you’re dealing with, which will guide your further investigation. Though do you really need to investigate anything at all? After all, investigating each retransmitted or dropped packet is pointless—the network protocol stack has self-healing powers and some of the alleged errors are simply part of the game.
What really counts
Usually, networking involves more than one computer, along with switches and routers. When you have several hosts and detect a problem in your network, it’s not efficient to ssh into each computer and perform all these exercises to find out what’s going on. Ultimately, in more complex environments, you need tool support to stay on top of things.
You need a monitoring tool that monitors all the hosts that are part of your infrastructure and notifies you when something out of the ordinary occurs. The tool should automatically create performance baselines for all running services, as well as incoming/outgoing network traffic, average response time to service calls, and the availability of the service from the network’s point of view. You need to be notified if any of these measurements deviate noticeably from the baseline.
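What a baseline check boils down to can be sketched in a few lines of Python. This is only one simple way to do it, assuming the baseline is the mean of recent measurements and "out of the ordinary" means more than three standard deviations away. It is not a description of how any particular monitoring product computes its baselines, and the response times are invented.

```python
from statistics import mean, stdev


def deviates_from_baseline(history, current, tolerance=3.0):
    """Return True if current lies more than `tolerance` standard deviations from the mean."""
    baseline = mean(history)
    spread = stdev(history)
    return abs(current - baseline) > tolerance * spread


# Invented response times in milliseconds.
history = [3.1, 2.9, 3.0, 3.2, 2.8]
print(deviates_from_baseline(history, 3.2))   # False
print(deviates_from_baseline(history, 12.0))  # True
```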
Although network errors may be the root cause of why your services aren’t available or are performing poorly, as a service provider in the real world you shouldn’t need to focus on networking errors. Your main concern should be providing high-performance services that are easy to use and always available. In general, you don’t want to be notified about all errors that occur in the network layer (or anywhere else in the application for that matter).
There are a number of networking and service-related metrics you can measure and evaluate. The following three are a good starting point.
Network traffic
Measuring network traffic provides a good overview of the overall usage and performance of your service. It’s also a good indicator of whether or not you need to upscale your infrastructure (e.g., your one server may no longer be enough to handle all the load).
Responsiveness
Responsiveness measures the time from the last request packet that the service receives to the first response packet that the service sends. It measures the time a process needs to produce a response to a given request and should be watched in correlation with hardware resources.
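Expressed as code, the definition above is a single subtraction. The Python sketch assumes you already have capture timestamps (in seconds) for the packets of one request and its response; the function name and the timestamps are invented.

```python
def responsiveness_ms(request_packet_times, response_packet_times):
    """Time from the last request packet to the first response packet, in milliseconds."""
    last_request = max(request_packet_times)
    first_response = min(response_packet_times)
    return (first_response - last_request) * 1000.0


# Invented timestamps: a two-packet request followed by a two-packet response.
value = responsiveness_ms([10.000, 10.001], [10.004, 10.006])
print(f"{value:.1f} ms")  # 3.0 ms
```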
Connectivity
Connectivity shows the percentage of properly established TCP connections compared to TCP connections that were refused or timed out. It shows when services were available to clients and when they were not, over time.
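A minimal Python sketch of this metric follows. The probe function classifies a single connection attempt with the standard socket module, and the second function turns the counts into a percentage. The names and the sample counts are assumptions for illustration; a monitoring tool would observe real client connections rather than send its own probes.

```python
import socket


def probe(host, port, timeout=1.0):
    """Try one TCP connection and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "established"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timed_out"


def connectivity_percent(established, refused, timed_out):
    """Share of properly established connections among all attempts, in percent."""
    attempts = established + refused + timed_out
    if attempts == 0:
        raise ValueError("no connection attempts recorded")
    return 100.0 * established / attempts


# Invented counts for one measurement interval.
print(connectivity_percent(established=97, refused=2, timed_out=1))  # 97.0
```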
My tool of choice for network analysis in the datacenter is Dynatrace, but I’m obviously a bit biased. In analyzing the network health of one of my Tomcat servers (see the example below), I found out that my service had a responsiveness time of about 3 ms, not much traffic, and 100% availability over the last two hours.

Now the interesting part is how you can relate network errors to actual service response times. If response times or service availability deviate from the baseline you’ll see a summary of the resulting problems that shows how many users are affected and what the root cause of the issue is. The really neat thing about Dynatrace is how well it integrates all this information to help me assess and fix this problem.

If you take a close look at the problem view you’ll see that this problem affected real users, 688 user actions per minute to be specific. Furthermore, you can see that the JavaScript error rate increased and that the root cause of this problem is a crashed CouchDB process (i.e., the TCP connectivity rate for the process decreased to 0%). If you click on the process name you’ll see the following screen, where you can clearly see that the TCP connections were refused and the connectivity dropped to 0% while the process was restarted. This is what a common network error looks like from a service’s point of view.

Conclusion
Assessing the quality of your services on physical hosts with an underlying network consisting of physical switches and physical routers is a piece of cake with the right tools in place. However, monitoring connectivity and performance in more complex infrastructures, with network overlays and encapsulation, virtual switches that run as applications, and intra-VM traffic that you never see on any physical network interface, adds additional layers of complexity. But that’s another story. So stay tuned!
If you’re curious about taking Dynatrace network monitoring for a test drive, you should definitely go for it. There is a free usage tier so you can walk through all the functionality described here and see for yourself how well it works in your own environment.