Network Latency

What is network latency and how does it affect you?

Overview

This document discusses the following questions:

What is packet latency and how is it measured?
What does latency do to networks, client PCs, servers, and their associated traffic?
What is an acceptable level of latency?
How do you measure your latency?
What do you do if your latency is too high?

What is Network Latency?

Definition of latency

For the purpose of this document, network latency is defined as the amount of time it takes for a packet to cross the network from a device that created the packet to the destination device. This is also known as end-to-end latency. Delay could be considered a synonym for latency, but the word latency will be used throughout this document.

Latency can be more complicated. For example, Latency can be measured starting with the first bit that leaves the transmitting host and ending when the last bit of the packet enters the destination device. Fortunately, the amount of time it takes for a device to read a packet from or write a packet to the network doesn’t contribute significantly to the overall latency.

Many things can contribute to the overall end-to-end latency. Also, there are other network performance measurements that are related to latency. One such measurement is jitter, or the change in end-to-end packet latencies over time. This can have an effect on certain types of traffic.

What Causes Latency?

End-to-end latency is a cumulative effect of the individual latencies along the end-to-end network path. Listed are some of the typical components of latency from a workstation to a servers:

workstation LAN
WAN (if applicable)
access router
ISP link
ISP network
Path from ISP to Host
Host internal network

Network routers are the devices that create the most latency of any device on the end-to-end path. Routers can be found in each of the above network segments. Packet queuing due to link congestion is most often the culprit for large amounts of latency through a router. Some types of network technology such as satellite communications add large amounts of latency because of the time it takes for a packet to travel across the link. Since latency is cumulative, the more links and router hops there are, the larger end-to-end latency will be.

How latency is measured

Here are the steps to measure latency.

Note the time when the packet is transmitted – transmitted time
Note the time when the packet reaches the destination – arrival time
Subtract the transmitted time from the arrival time

Some difficulties with this process maybe evident:

What device is used for a universal clock to read the time for both the device that sends the packet and the device that receives the packet?
Tracking the time that a particular packet is sent and when it is received.
How useful is measuring end-to-end latency with only one packet through a complex network where there is ample opportunity for the latency to change over time?
What happens to latency under various amounts of traffic loads?

Round-trip latency

To solve the universal clock problem, some test equipment has the ability to sync up with a GPS clock and place a timestamp inside the packet that is sent to measure latency. The receiving device is a similar piece of equipment that can also sync up with a GPS clock. This device then compares the time that the packet is received with the timestamp in the received packet to obtain an end-to-end latency measurement for that packet.

This option is very expensive. Fortunately, there is a cost efficient method that provides acceptable accuracy. If the sender to the receiver path is the same as the path from the receiver to the sender, the round-trip latency (latency from the sender to the receiver and back) can be measured and assumed that the end-to-end latency is half of this result.

Measuring round-trip latency is easy and details of this will be covered later in this document. Measuring round-trip latency means that all time comparisons are made from the same device, which removes the need for devices to sync to a common clock. It also solves the problem of keeping up with the send and receive times for each packet since these times are all associated with one packet in one device.

Statistical significance

Measuring the round-trip latency of one packet is not useful because latency changes frequently. One good way to handle this variation in results is to measure the round-trip latency for a number of packets and calculate an average, maximum, and minimum for all these values.

Second-order latency values such as jitter (difference in latency values) or standard deviation (difference from the average latency) can offer more information as to the end-to-end network performance. However these values are considered insignificant compared with the average latency for the purposes of this document.

The more packets that are measured to obtain an average, the better the measurement. For practical reasons, measuring a minute or so of latency values to obtain an average latency should be adequate for each end-to-end latency measurement.

Latency and traffic load

Latency can change as the traffic load changes. As load increases, it is possible that latency will increase since buffers may begin to populate on the path between the sender and receiver. Measuring latency while considering network load can get complicated. To fully characterize the latency versus load, measurements must be made at various network loads. Controlling the network load during the measurement is difficult.

Fortunately, there are ways to simplify this process. Most enterprise networks have a fairly predictable bandwidth usage pattern. This pattern will change as new network applications are deployed and as more people use the network, but from one day to the next there is little change in this pattern. Typically the network bandwidth used is low from late afternoon until the beginning of the business day. Network traffic will increase during the workday and there may be slight decline in utilization during lunch.

Knowing these patterns makes it possible to test network latency with known levels of network utilization. To systematically measure latency, it is good to sample latency throughout the day in regular intervals.

Effect of latency on networks

The overwhelming majority of network traffic falls into one of two types of traffic – UDP (User Datagram Protocol) and TCP (Transmission Control Protocol). The majority of this traffic tends to be TCP.

UDP latency effects

UDP is a protocol that defines how to form messages that are sent over IP. A device that sends UDP packets assumes that they reach the destination, so there is no mechanism to alert senders that the packet has arrived. UDP traffic is typically used for streaming media applications where an occasional lost packet does not matter.

Since the sender of UDP packets does not require any knowledge that the destination received the packets, UDP is relatively immune to latency. The only effect that latency has on a UDP stream is an increased delay of the entire stream. Second-order effects such as jitter may affect some UDP applications in a negative way, but these issues are outside the scope of this document.

It is important to note that latency and throughput are completely independent with UDP traffic. In other words, if latency goes up or down, UDP throughput remains the same. This concept has more meaning on the effects of latency on TCP traffic.

There are no effects of latency on the sending device with UDP traffic. The receiving device may have to buffer the UDP packets longer with high amounts of jitter to help the application run better.

TCP latency effects

TCP is more complicated than UDP. TCP is a guaranteed delivery protocol, which means that the device that sends the packets is told that the packet did or did not arrive at the destination. To make this work, a device that needs to send packets to a destination must set up a session with the destination. Once this session has been set up, the receiver tells the sender which packets were received by sending an acknowledgement packet back to the sender. If the sender does not receive an acknowledgement packet for some packets after a length of time, the packets are resent.

In addition to providing guaranteed delivery of packets, TCP has the ability to adjust to the network capacity by adjusting the ‘window size’. The TCP window is the number of packets a sender will transmit before waiting for an acknowledgement. As acknowledgements arrive the window size is increased. As the window size increases, the sender may begin sending traffic at a rate that the end-to-end path can’t handle resulting in packet loss. Once packet loss is detected, the sender will react by cutting the sending packet rate in half. Then the process for increasing the window size begins again as more acknowledgements are received.

As end-to-end latency increases, the sender may spend lots of time waiting on acknowledgements instead of sending packets. In addition, the process of adjusting the window size becomes slower since this process is dependent on receiving acknowledgements.

Considering these inefficiencies, latency has a profound effect on TCP bandwidth. Unlike UDP, TCP has a direct inverse relationship between latency and throughput. As end-to-end latency increases, TCP throughput decreases. The following table shows what happens to TCP throughput as round trip latency increases. This data was generated by using a latency generator between two PCs connected via fast Ethernet (full duplex). Note the drastic reduction in TCP throughput as the latency increases.

Round trip latency	TCP Throughput
0ms	93.5 Mbps
30ms	16.2 Mbps
60ms	8.07 Mbps
90ms	5.32 Mbps

Table 1 - Effect of Latency on TCP Throughput

As latency increases, the sender may sit idle while waiting on acknowledgements from the receiver. The receiver however, must buffer packets until all the packets can be assembled into a complete TCP message. If the receiver is a server, this buffering effect can be complicated by the large number of sessions that the server may be terminating. This increased use of buffer memory can cause performance degradation in the server.

With all the problems that latency creates for TCP, packet loss compounds these problems. Packet loss causes the TCP window size to shrink, which may cause the sender to sit idle longer while waiting for acknowledgements with high latency. Also, acknowledgements may be lost which causes the sender to wait until a timeout occurs for the lost acknowledgement. If this happens, the associated packets will be retransmitted even though they may have been transmitted properly. The result is packet loss can further decrease TCP throughput.

The following table illustrates the effect of latency and packet loss on TCP throughput. This data was generated by using a latency and packet loss generator between two PCs connected via fast Ethernet (full duplex). The packet loss rate was set to 2%, which means that 2% of packets were discarded by the test equipment. Note that the TCP throughput values are much lower in the presence of packet loss.

Round trip latency	TCP Throughput with no packet loss	TCP Throughput with 2% packet loss
0 ms	93.50 Mbps	3.72 Mbps
30 ms	16.20 Mbps	1.63 Mbps
60 ms	8.07 Mbps	1.33 Mbps
90 ms	5.32 Mbps	0.85 Mbps

Table 2 - Effect of Latency and 2% Packet Loss on TCP Throughput

Some packet loss is unavoidable. If a network runs perfectly and does not drop any packets, an assumption cannot be made that other networks operate as well.

Regardless of the situation, keep in mind that packet loss and latency have a profoundly negative effect on TCP bandwidth and should be minimized as much as possible.

Measuring latency with ping

Most operating systems include a network connectivity test tool called ‘ping’. Here is some example output from running ping on a Mac running OS X. Other implementations of ping will produce similar output[macosxpowerbook] user% ping 152.1.1.1

PING 152.1.1.1 (152.1.1.1): 56 data bytes

64 bytes from 152.1.1.1: icmp_seq=0 ttl=254 time=35.068 ms

64 bytes from 152.1.1.1: icmp_seq=1 ttl=254 time=2.92 ms

64 bytes from 152.1.1.1: icmp_seq=2 ttl=254 time=3.45 ms

64 bytes from 152.1.1.1: icmp_seq=3 ttl=254 time=7.409 ms

64 bytes from 152.1.1.1: icmp_seq=4 ttl=254 time=3.319 ms

64 bytes from 152.1.1.1: icmp_seq=5 ttl=254 time=9.072 ms

64 bytes from 152.1.1.1: icmp_seq=6 ttl=254 time=14.982 ms

64 bytes from 152.1.1.1: icmp_seq=7 ttl=254 time=4.495 ms

64 bytes from 152.1.1.1: icmp_seq=8 ttl=254 time=3.193 ms

64 bytes from 152.1.1.1: icmp_seq=9 ttl=254 time=10.613 ms

--- 152.1.1.1 ping statistics ---

10 packets transmitted, 10 packets received, 0% packet loss

round-trip min/avg/max = 2.92/9.452/35.068 ms

The following is a list of useful information that can be inferred from the output::

The average round trip latency is 9.452ms (summary info in last line).
Ten 64 byte packets were sent.
The latency values ranged from 2.92ms to 35.068ms (summary info in last line).
No packets were lost (0% packet loss in the next to last line).

Pinging can be run manually at any time to measure latency and is fairly non-intrusive, so it should not affect any users. The tool is easily automated in scripts since it can run from the command line.

Latency standards

Defining a level of latency that is deemed acceptable is difficult since it’s hard to determine a threshold for user productivity based on application response times. However, it is important to define typical end-to-end latency values so you have a reasonable goal for latency.

Monitoring is important since it is helpful to know typical latencies between each remote and the servers. An increase in end-to-end latency may be an indicator of a network problem.

With these issues in mind, here are some “rules of thumb” for end-to-end latency (between workstation and servers):

A round trip end-to-end latency of 30ms or less for LEAs is healthy. This measurement should be monitored to track any changes.
Round trip latencies between 30ms and 50ms should be monitored. If there are ways to lower end-to-end latency, they should be considered.
Round trip latencies greater than 50ms require immediate attention to determine the cause of the latency and possible remedies to lower the end-to-end latency. Monitor this measurement to track the improvements.

What to do if your latency is too high

Review the ‘What Causes Latency?’ of this document. Note the components of latency that can be controlled. This typically includes the following items:

workstation LAN
WAN (if applicable)
access router
ISP link

Use ping to measure the latency through each of these components independently. If one or more components appear to be contributing significantly to latency, begin designing a strategy to reduce the latency in that component. There are too many possibilities to list here.

If high latency is outside of your control, report the problem to the parties responsible for those network components. If this is the case, report the issue to the ISP to see if they can relieve the problem and/or look into other options for an ISP.