How to Diagnose Network Latency Issues Systematically

Latency Is Not Bandwidth

The most common mistake in network troubleshooting is confusing latency with bandwidth. A 1 Gbps link can have 500ms latency. A 10 Mbps DSL line can have 8ms latency. Bandwidth is how much water fits through the pipe. Latency is how long the water takes to travel the pipe. They are different problems with different causes and different fixes.

Latency problems are harder to diagnose than bandwidth problems because they are often intermittent and their cause is invisible — sitting on a router three hops away that you do not control. This guide gives you a systematic method for finding where latency lives and what is causing it.

Step 1: Characterize the Problem

Before running any tools, answer three questions:

Constant or intermittent? Constant latency usually means a physical distance problem, a satellite link, or a misconfigured QoS policy. Intermittent latency points to congestion, route flaps, or failing hardware.
Symmetric or asymmetric? Is upload slow but download fast? That usually means a saturated uplink or a half-duplex link somewhere in the path.
All traffic or specific? If only HTTP is slow but ICMP is fast, the problem is probably above the IP layer: a slow server response, a TCP or TLS issue, or slow DNS, rather than the path itself.

Write these answers down. The rest of your diagnosis depends on getting this right.

Step 2: Baseline with ping

Start with the OpsCheck Ping Test or run one from your location:

# Extended ping with timestamps for pattern detection
# date's %N (nanoseconds) is a GNU extension; drop it on BSD/macOS
ping -c 100 -i 0.2 target.example.com | while IFS= read -r line; do
  printf '%s %s\n' "$(date +%H:%M:%S.%N)" "$line"
done | tee ping-baseline.log

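Look at the mdev value in the summary line. ping labels it mean deviation, but iputils computes it like a standard deviation of the round-trip times, and it works as a rough measure of jitter. A high mdev means inconsistent latency, which is worse for real-time applications than consistently high latency. A VoIP call can handle 200ms round-trip if it is stable. It copes far worse with a 20ms baseline that keeps spiking by 50ms.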
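To see jitter over the whole log rather than just the final summary line, the per-reply times can be pulled out of ping-baseline.log. The sketch below is one way to do it, assuming iputils-style reply lines containing a time=<ms> field. The function name ping_stats is made up for this example, and "jitter" here is defined as the mean absolute difference between consecutive replies, which is not the same formula ping uses for mdev.

```shell
# Summarize RTT samples from ping output read on stdin.
# Prints count, min, avg, max and jitter, all in milliseconds.
# Jitter = mean absolute difference between consecutive samples
# (an assumption made for this sketch, not ping's mdev formula).
ping_stats() {
  awk '
    {
      # Find the "time=<number>" field that ping prints for each reply.
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^time=[0-9.]+$/) {
          rtt = substr($i, 6) + 0
          n++
          sum += rtt
          if (n == 1 || rtt < min) min = rtt
          if (n == 1 || rtt > max) max = rtt
          if (n > 1) { d = rtt - prev; if (d < 0) d = -d; jsum += d }
          prev = rtt
        }
      }
    }
    END {
      if (n == 0) { print "no samples"; exit 1 }
      jitter = (n > 1) ? jsum / (n - 1) : 0
      printf "n=%d min=%.1f avg=%.1f max=%.1f jitter=%.1f\n", n, min, sum / n, max, jitter
    }'
}
```

Run it as ping_stats < ping-baseline.log. The timestamp prefix added in the loop above does not get in the way, because the function only looks for the time= field.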

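If ping shows 0% loss and stable latency, the problem is probably not at the IP layer. Move up the stack, keeping in mind that small ICMP packets do not always get the same treatment as real traffic.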

Step 3: Measure Per-Hop Latency with MTR

ping tells you end-to-end. MTR tells you where in the path the latency lives:

mtr -r -c 100 target.example.com

The output shows loss% and latency (avg/best/worst) per hop. The key insight: loss on one hop that does not propagate to subsequent hops is usually a router that deprioritizes ICMP — not a real problem. Loss that cascades to every hop after it is a real issue at that hop.

Example: hop 5 shows 30% loss. Hops 6 through 10 also show 30% loss. The problem is at hop 5. But if hop 5 shows 30% loss and hop 6 shows 0%, hop 5 is just dropping ICMP replies — the traffic is passing through.
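The cascade rule can be applied mechanically to a saved report. This is a rough sketch, assuming the standard mtr -r layout where each hop line starts with something like "5.|--" followed by the host and the loss percentage. The function name mtr_classify_loss and the 1% default threshold are choices made for this example, and "cascades" is simplified to "every later hop also shows loss at or above the threshold".

```shell
# Classify lossy hops in an "mtr -r" report read on stdin.
# $1 = loss threshold in percent (default 1).
mtr_classify_loss() {
  awk -v thr="${1:-1}" '
    BEGIN { thr += 0 }
    /^ *[0-9]+\.\|--/ {
      n++
      hop[n]  = $1 + 0   # "5.|--" converts to 5
      host[n] = $2
      loss[n] = $3 + 0   # "30.0%" converts to 30
    }
    END {
      for (i = 1; i <= n; i++) {
        if (loss[i] < thr) continue
        persists = 1
        for (j = i + 1; j <= n; j++) if (loss[j] < thr) persists = 0
        if (persists) {
          # Every later hop is lossy too, so report the first one and stop.
          printf "hop %d (%s) loss %.1f%%: persists to destination, investigate this hop\n", hop[i], host[i], loss[i]
          break
        }
        printf "hop %d (%s) loss %.1f%%: not seen downstream, likely ICMP rate limiting\n", hop[i], host[i], loss[i]
      }
    }'
}
```

Usage: mtr -r -c 100 target.example.com | mtr_classify_loss. Treat the verdict as a hint, not proof; a hop can both rate-limit ICMP and have a real problem.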

Step 4: Check for Bufferbloat

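Bufferbloat is excessive buffering in routers and switches. When a link is congested, oversized buffers fill up instead of dropping packets, and every packet has to wait behind the queue, so latency spikes into the hundreds or thousands of milliseconds. Loss-based TCP congestion control relies on drops to detect congestion, so without them the sender keeps the buffer full for as long as the transfer lasts. The result: high latency under load, and poor performance for anything interactive that shares the link.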
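The size of the spike follows from simple arithmetic: a full buffer adds roughly buffer size divided by link rate. The helper below is only a back-of-the-envelope sketch; the name queue_delay_ms and the example figures are illustrative, not measurements from any particular device.

```shell
# Worst-case queuing delay of a full buffer, in milliseconds.
# $1 = buffer size in bytes, $2 = link rate in Mbit/s
queue_delay_ms() {
  awk -v bytes="$1" -v mbps="$2" 'BEGIN {
    # bits in the buffer / bits per second = seconds; times 1000 = ms
    printf "%.0f\n", bytes * 8 / (mbps * 1000000) * 1000
  }'
}
```

For example, queue_delay_ms 1000000 10 prints 800: a 1 MB buffer in front of a 10 Mbit/s link can hold 800ms worth of packets.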

# Test bufferbloat with a simultaneous ping and large download
# Terminal 1: ping running (intervals below 0.2s need root on some Linux systems)
ping -c 200 -i 0.2 8.8.8.8 > ping-during-load.txt

# Terminal 2: saturate the link
curl -o /dev/null http://speedtest.example.com/100mb.bin

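If ping latency jumps from 15ms to 800ms during the download, you have bufferbloat. The fix is on your router: enable SQM (Smart Queue Management) with fq_codel or CAKE. Some consumer routers expose this as "QoS" or "Bufferbloat Protection" in the admin panel; on OpenWrt it is available through the sqm-scripts package. The curl command above only loads the download direction, so repeat the test during a large upload to check the uplink.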
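To put a number on the jump, compare an idle ping log with the one captured during the download. The following is a minimal sketch: the function names, the idle log file name in the usage note, and the 100ms default limit are all assumptions for illustration, so pick a limit that suits your own traffic.

```shell
# Average RTT in ms from a ping log file ($1); fails if no replies are found.
avg_rtt() {
  awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^time=[0-9.]+$/) { sum += substr($i, 6); n++ } }
       END { if (n == 0) exit 1; printf "%.1f\n", sum / n }' "$1"
}

# Compare idle and loaded ping logs.
# $1 = idle log, $2 = log taken under load, $3 = allowed added latency in ms (default 100)
bloat_check() {
  local idle loaded
  idle=$(avg_rtt "$1") || return 1
  loaded=$(avg_rtt "$2") || return 1
  awk -v idle="$idle" -v loaded="$loaded" -v limit="${3:-100}" 'BEGIN {
    added = loaded - idle
    printf "idle=%.1fms loaded=%.1fms added=%.1fms %s\n", idle, loaded, added, (added > limit + 0 ? "BLOAT" : "ok")
  }'
}
```

Record the idle log with the same ping command before starting the download (say, into ping-idle.txt), then run bloat_check ping-idle.txt ping-during-load.txt.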

Step 5: TCP-Level Diagnostics

If ping is clean but applications are slow, the problem is likely at TCP or above:

# Measure connection phases (each curl timer is cumulative from the start of the request)
curl -s -o /dev/null -w 'DNS lookup done: %{time_namelookup}s\nTCP connected: %{time_connect}s\nTLS done: %{time_appconnect}s\nTotal: %{time_total}s\n' https://target.example.com

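# Check TCP retransmissions (ss prints addresses, not hostnames, so filter by IP)
# 203.0.113.10 is a placeholder; replace it with the target's IP address
ss -ti dst 203.0.113.10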

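High time_connect with low ping latency means connection setup is slow. Because curl measures time_connect from the start of the request, it includes DNS resolution, so first rule out a slow lookup (time_namelookup). If DNS is fast, suspect a full SYN backlog on the server or a firewall dropping the first SYN. A large gap between time_connect and time_appconnect means TLS negotiation is slow. Check your SSL configuration with the OpsCheck SSL Certificate Checker.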
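Since curl's timers are cumulative, the time spent in each phase is the difference between neighbouring values. A small helper can do the subtraction; this is a sketch, and the name phase_breakdown and its output format are invented for the example.

```shell
# Convert curl's cumulative timers (in seconds) into per-phase durations (ms).
# Arguments: time_namelookup time_connect time_appconnect time_total
# Note: for plain http:// URLs time_appconnect is 0, so the tls figure is meaningless.
phase_breakdown() {
  awk -v dns="$1" -v tcp="$2" -v tls="$3" -v total="$4" 'BEGIN {
    printf "dns=%.0fms tcp=%.0fms tls=%.0fms rest=%.0fms\n",
      dns * 1000, (tcp - dns) * 1000, (tls - tcp) * 1000, (total - tls) * 1000
  }'
}

# Example (left unquoted on purpose so the four values split into arguments):
# phase_breakdown $(curl -s -o /dev/null \
#   -w '%{time_namelookup} %{time_connect} %{time_appconnect} %{time_total}' \
#   https://target.example.com)
```

The "rest" figure lumps together the request, the server's processing time, and the response transfer.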

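TCP retransmissions show in ss -ti as retrans:0/5. The first number is retransmitted segments that are still unacknowledged; the second is the total number of retransmissions over the life of the connection. Compare that total against segs_out on the same line. A retransmission rate above roughly 2% indicates packet loss that ICMP ping did not catch, often because the loss affects larger packets.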
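The rate can be estimated from the ss output itself by dividing the lifetime retransmission count by segs_out. The sketch below assumes the usual two-line ss -ti layout (a connection line followed by an indented detail line) and a kernel recent enough to report segs_out. The function name retrans_rate is hypothetical. Connections that never retransmitted have no retrans field and are reported as 0%.

```shell
# Estimate per-connection retransmission rate from "ss -ti" output on stdin.
retrans_rate() {
  awk '
    # Connection line: remember the peer address (last column).
    !/segs_out:/ && NF >= 4 { peer = $NF; next }
    # Detail line: pull out segs_out and the lifetime retransmission total.
    /segs_out:/ {
      out = 0; total = 0
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^segs_out:[0-9]+$/) out = substr($i, 10) + 0
        if ($i ~ /^retrans:[0-9]+\/[0-9]+$/) { split($i, p, "/"); total = p[2] + 0 }
      }
      if (out > 0)
        printf "%s segs_out=%d retrans_total=%d rate=%.2f%%\n", peer, out, total, 100 * total / out
    }'
}
```

Usage: ss -ti | retrans_rate. The figure is approximate, since segs_out counts every segment sent, including the retransmissions themselves.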

Step 6: Check DNS Resolution Time

Slow DNS resolution looks exactly like network latency:

# Measure DNS resolution time for your target
time dig +short target.example.com

# Check which resolver you are using
dig +short whoami.akamai.net

If DNS takes 500ms and the actual connection takes 10ms, the problem is your DNS resolver, not the network. Switch to a faster resolver (1.1.1.1 or 8.8.8.8) or set up a local caching resolver like systemd-resolved or unbound.
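Before switching, it helps to measure the candidates side by side. dig prints a "Query time" line in its full output, which the sketch below extracts for each resolver. The function names are made up for this example, and the resolver list in the usage note is only a suggestion.

```shell
# Extract the query time in ms from dig output on stdin; fails if none is present.
query_time_ms() {
  awk '/Query time:/ { print $4; found = 1 } END { exit !found }'
}

# Print "resolver<TAB>ms" for a name queried against each resolver given.
# $1 = name to resolve, remaining arguments = resolver addresses
compare_resolvers() {
  local name=$1 r ms
  shift
  for r in "$@"; do
    # One try with a 2 second timeout so a dead resolver does not stall the loop.
    ms=$(dig "@$r" "$name" +tries=1 +time=2 | query_time_ms) || ms="timeout"
    printf '%s\t%s\n' "$r" "$ms"
  done
}
```

Usage: compare_resolvers target.example.com 1.1.1.1 8.8.8.8, adding your current resolver's address to the list. Run it more than once, because the first query often pays for a cache miss that later queries do not.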

Real Scenario: The 2-Second TLS Handshake

A hosting company migrated their customer portal to a new data center. Users reported the site loading slowly — 3-4 seconds per page. ping showed 22ms. MTR showed clean routing. The bandwidth test showed the full 10 Gbps available.

The culprit was found with curl's timing breakdown:

curl -w '@curl-format.txt' -o /dev/null -s https://portal.example.com

TCP handshake: 22ms. TLS handshake: 2,100ms. The TLS negotiation was slow because the server was configured to validate client certificates against an OCSP responder that was unreachable after the migration. Each TLS handshake waited for OCSP to time out (2 seconds) before falling back. The fix was updating the OCSP URL in the web server config.

Without curl timing breakdowns, this would have looked like "network latency" and wasted hours chasing the wrong layer.
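The curl-format.txt file used in that command is not shown in the write-up. One plausible version, offered as an assumption and not as the file that team actually used, can be generated like this. The labels are arbitrary; the %{...} names are curl's standard write-out variables, all cumulative from the start of the request.

```shell
# Write a curl -w format file ($1, default: curl-format.txt).
# curl strips the file's real line breaks, so each line ends with a literal \n.
write_curl_format() {
  cat > "${1:-curl-format.txt}" <<'EOF'
DNS lookup done: %{time_namelookup}s\n
TCP connected:   %{time_connect}s\n
TLS done:        %{time_appconnect}s\n
First byte:      %{time_starttransfer}s\n
Total:           %{time_total}s\n
EOF
}
```

After running write_curl_format once, the command from the scenario works as written: curl -w '@curl-format.txt' -o /dev/null -s https://portal.example.com.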

Diagnostic Flowchart Summary

ping clean? → YES → check TCP handshake time (curl -w)
            → NO  → run MTR to find the hop

MTR shows loss at hop N → does it cascade? → YES → problem at hop N
                                          → NO  → deprioritized ICMP, move on

TCP handshake slow but ping fast? → DNS timeout or server SYN backlog
TLS handshake slow?               → OCSP/CRL timeout or cipher negotiation
High jitter but low loss?         → bufferbloat or QoS misconfiguration

Verify with: OpsCheck Ping Test and Port Scanner

Network latency diagnosis is about eliminating layers. Start at IP, move to TCP, then check application. Most problems are found at the layer above where you are looking.
