Common Network Performance Problems
Transferring data from one computer to another involves passing that data
through dozens, sometimes hundreds, of software and hardware components.
Performance may be limited within a computer or within the network. Sometimes components that function normally on their own interact in ways that create unexpected problems.
Workstation or Server
A file transfer may pass through the following components before it ever
leaves the system: Hard disk drive, system disk cache, transfer
application, network stack, software VPN, software firewalls and filters,
network drivers, and the hardware network adapter. There are many opportunities for the data to become delayed or blocked. Following is a list of the computer components most likely to impact network performance.
The operating system manages the resources and settings for all of the other
system components. Driver configuration, CPU prioritization, IP stack
tuning, virtual memory, and many other factors can impede performance.
Generally speaking, older operating system versions will have poorer performance than newer
versions. Windows systems will have poorer performance than unix systems of the same vintage. Most users stick with default operating system settings, so it is rare to find problems beyond the choice of system itself. However, some unix distributions may require UDP buffer tuning as described in Tech Note 0024.
The CPU(s) must be shared by all of the software running on a system. In
modern operating systems, there are typically twenty to sixty programs
running even when the user isn't doing anything. All of these should be
idle most of the time. But if just one of those programs tries to "hog" the CPU, it can seriously impede performance. Systems with multiple CPUs are not necessarily faster than those with single CPUs. Windows is sometimes slower on multiple CPU systems. Windows users can view CPU usage by displaying the Task Manager (control-alt-delete) and sorting by CPU.
The CPU can also constrain performance if you are using compression in conjunction with your data transfer. ExpeDat, for example, allows you to apply ZLIB compression to file transfers. Disabling compression may improve throughput, especially on very fast networks.
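One rough way to judge whether compression is likely to be the limiting factor is to time zlib on the local CPU and compare the result with the link speed. The following is a minimal sketch, not part of any MTP/IP application: the function names, the compression level, and the sample data are assumptions chosen for illustration, and real files may compress faster or slower.

```python
import os
import time
import zlib


def compression_rate_mbps(sample: bytes, level: int = 6) -> float:
    """Return how fast this CPU compresses `sample` with zlib, in megabits
    of input data per second."""
    start = time.perf_counter()
    zlib.compress(sample, level)
    # Guard against a zero reading on very coarse timers.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return len(sample) * 8 / elapsed / 1_000_000


def compression_limits_transfer(cpu_mbps: float, link_mbps: float) -> bool:
    """True when the CPU compresses input more slowly than the link could
    carry the same data uncompressed."""
    return cpu_mbps < link_mbps


if __name__ == "__main__":
    # Half random, half repetitive data as a rough stand-in for a real file.
    sample = os.urandom(4 * 1024 * 1024) + bytes(4 * 1024 * 1024)
    rate = compression_rate_mbps(sample)
    print(f"zlib input rate: {rate:.0f} Mbit/s")
    for link in (100, 1_000, 10_000):
        limited = compression_limits_transfer(rate, link)
        print(f"{link} Mbit/s link: compression is the limit -> {limited}")
```

If the measured rate is well below the network speed, testing with compression disabled is a reasonable next step.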
Hard Drive Usage
Most data transfer begins and ends with a hard drive. This is usually the
slowest component within the machine. The speed of hard drive data transfer
varies tremendously, often without obvious feedback to the user. In
particular, if two or more programs try to access the hard drive at the same
time, data throughput will drop sharply. Even if the system appears
idle, one of the many background processes can still cause drive access.
Even intermittent access can cause data transfers to freeze for seconds at a
time, leading to greatly reduced network performance.
Hard drive limitations become dominant when the network is very fast or when
there are consistently multiple processes trying to access the drive. For
example, trying to download two different files at once from a gigabit
ethernet LAN to a single consumer hard drive may be much slower than
downloading one after the other. High-end drives or RAIDs may improve overall performance, but are still highly variable and severely
impacted by multiple accesses. RAIDs may require special configuration to perform well with network data. See Tech Note 0018 if you are using a RAID.
Firewalls and Filters
Many operating systems now include software which performs network functions
traditionally done in outside hardware. This includes firewalls, which
attempt to block or alter network traffic based on network characteristics,
and filters which attempt to block or alter network traffic based on
content. Modern Windows systems enable a software firewall by default. Most
other operating systems do not.
All such software works by stopping each data packet, checking it against a
set of rules, and then deciding what to do with that packet. Depending on
the amount of checking, this can greatly delay the packet and consume
substantial CPU resources. This can be especially problematic when there
are multiple layers, such as combining firewall, content filter, router, and
network address translation software. The best way to minimize problems with
these components is to disable those that are not necessary and to set rules in the rest that explicitly exempt MTP/IP traffic. See technical note 0002, "Configuring Firewalls". The Windows Vista firewall is particularly problematic if it is not precisely configured. See technical note 0006, "Configuring Vista Firewall".
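The per-packet cost of rule checking can be illustrated with a small first-match evaluator. This is a simplified sketch and does not reflect how any particular firewall is implemented: the `Packet` and `Rule` types are hypothetical, and the UDP port shown is only a placeholder for whatever port your MTP/IP application is configured to use.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Placeholder only: substitute the port your MTP/IP application actually uses.
EXAMPLE_MTP_PORT = 8080


@dataclass(frozen=True)
class Packet:
    protocol: str
    dst_port: int


@dataclass(frozen=True)
class Rule:
    action: str                      # "allow" or "deny"
    protocol: Optional[str] = None   # None matches any protocol
    dst_port: Optional[int] = None   # None matches any port

    def matches(self, packet: Packet) -> bool:
        if self.protocol is not None and self.protocol != packet.protocol:
            return False
        if self.dst_port is not None and self.dst_port != packet.dst_port:
            return False
        return True


def evaluate(rules: List[Rule], packet: Packet, default: str = "deny") -> Tuple[str, int]:
    """Return the action of the first matching rule and how many rules were checked."""
    for checked, rule in enumerate(rules, start=1):
        if rule.matches(packet):
            return rule.action, checked
    return default, len(rules)


if __name__ == "__main__":
    rules = [
        Rule("allow", "udp", EXAMPLE_MTP_PORT),  # explicit exemption, checked first
        Rule("deny", "tcp", 23),
        Rule("deny"),                            # catch-all
    ]
    print(evaluate(rules, Packet("udp", EXAMPLE_MTP_PORT)))
    print(evaluate(rules, Packet("tcp", 80)))
```

Placing the explicit exemption ahead of the other rules means exempted traffic is decided after a single comparison, while other traffic pays for every rule it falls through.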
Virtual Private Network (VPN) Software
VPN software is similar to a firewall or filter in that it stops each data
packet and processes it at significant cost. This processing may
include compression or encryption, which is especially costly.
Datagram based VPNs (particularly IPsec) process one datagram at a time,
adding a small amount of data to each one before sending it on its
way. This introduces some CPU load and may, in rare cases, cause MTU problems. Datagram based VPNs are unlikely to hurt performance much, except on very fast LANs. Low quality VPNs "tunnel" all
datagrams across a single TCP/IP connection. This includes so-called
SSL VPNs. These introduce substantial network overhead and
greatly worsen TCP flow control and congestion problems. SSL and
other tunneling VPNs exhibit poor network performance under all
circumstances. MTP/IP cannot be used with SSL or other tunneling VPNs.
Although network data transfer does not typically involve much direct memory usage, the amount of available memory can significantly affect the
performance of other operating system components. Inadequate system memory will cause the system to access the hard drive more often, leading to poor performance when reading or writing to disk.
Microsoft Windows XP, Vista, Server 2003, and Server 2008 have internal limits on how quickly data can be read from or written to files based on the fixed size of the Windows "paged pool" buffer. If an MTP application produces an error such as "Insufficient system resources exist to complete the requested service" then you may need to adjust the Windows paged pool. The following Microsoft support note explains how to do this in the "RESOLUTION" section: http://support.microsoft.com/kb/304101. Windows Server 2012 or later is strongly recommended for speeds above a few hundred megabits per second.
Network Adapter / Network Interface Card (NIC)
The hardware component that moves data from the operating system to the physical media depends on correct drivers and settings to operate efficiently. Most modern NICs are rated at 1 gigabit per second. Some offer advanced features such as IP offloading. All have options to set a Maximum Transmit Unit (MTU).
For multigigabit networks, a very common problem is a NIC which is slower than the network speed. For example, a 1 gigabit NIC on a 10 gigabit network will only be able to move data at 1 gigabit per second. Likewise, a 10 gigabit NIC will not be able to fill a 40 gigabit data path. See Tech Note 0032 for more considerations on multigigabit data paths.
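The NIC examples above follow from a simple rule: a path can go no faster than its slowest component. The sketch below makes that arithmetic explicit. The component names and rates are hypothetical inputs, not measurements.

```python
from typing import Dict, List


def path_limit_gbps(component_rates_gbps: Dict[str, float]) -> float:
    """Return the highest rate the whole path can sustain: the slowest component."""
    if not component_rates_gbps:
        raise ValueError("at least one component is required")
    return min(component_rates_gbps.values())


def bottlenecks(component_rates_gbps: Dict[str, float]) -> List[str]:
    """Return the names of the components that set the path limit."""
    limit = path_limit_gbps(component_rates_gbps)
    return [name for name, rate in component_rates_gbps.items() if rate == limit]


if __name__ == "__main__":
    path = {"sender NIC": 1.0, "LAN switch": 10.0, "receiver NIC": 10.0}
    print(path_limit_gbps(path), bottlenecks(path))
```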
IP stack offloading attempts to perform checksum and other calculations in the NIC hardware, freeing the operating system kernel for other I/O tasks. This requires that compatible drivers be installed and correctly configured. Driver compatibility problems can cause IP offloading to make throughput slower, especially at gigabit and faster speeds. When experiencing performance problems, test with offloading both enabled and disabled.
The MTU set in the NIC should match that of the network path. For ethernet below one gigabit per second, this MTU should be at least 1500. For true multigigabit paths, this MTU should be at least 9000 (Jumbo). See the MTU section for further information.
MTP/IP depends heavily on the ability to accurately measure the timing of
data transmission and arrival. It relies on the operating system to
provide this timing information. Anything which disrupts the system
timing will adversely affect MTP/IP. For example, radically changing
the system clock, or putting the system to sleep for a while, will likely
cause any running MTP/IP transactions to fail. More subtle problems
may arise if the system clock is faulty, such as due to a failing
motherboard battery or incorrectly configured network time management
software (NTP). Such problems are rare on newer systems, but battery
failure is a distinct possibility on hardware over five years old.
Users must avoid situations which disrupt the timing of MTP/IP applications
or the system as a whole.
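The effect of a clock change on interval timing can be demonstrated in a few lines. This sketch says nothing about how MTP/IP itself obtains timing information. It only illustrates, using a hypothetical `SteppedClock` that is set back 60 seconds partway through a measurement, why a measured interval becomes meaningless when the clock is adjusted underneath it.

```python
import time
from typing import Callable, Iterable


def measure_interval(work: Callable[[], None],
                     clock: Callable[[], float] = time.monotonic) -> float:
    """Return the elapsed time reported by `clock` while `work` runs."""
    start = clock()
    work()
    return clock() - start


class SteppedClock:
    """Hypothetical clock that returns a scripted series of readings,
    standing in for a system clock that is changed during a transfer."""

    def __init__(self, readings: Iterable[float]) -> None:
        self._readings = iter(readings)

    def __call__(self) -> float:
        return next(self._readings)


if __name__ == "__main__":
    # Clock reads 1000.0 at the start, then is set back to 940.0.
    bad = measure_interval(lambda: None, clock=SteppedClock([1000.0, 940.0]))
    print("interval seen across a clock change:", bad)   # -60.0
    good = measure_interval(lambda: time.sleep(0.01))
    print("interval from a monotonic clock:", good)
```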
Any user interface activity may disrupt the system in a way that impacts network performance. For example, viewing a web page can substantially reduce hard drive, CPU, memory, and network performance. Windows in particular gives higher priority to user actions than to other processes: clicking the desktop just once per second during a Windows Server 2012 RDP session can cause a 25% drop in network throughput. The Windows Activity Monitor can cause up to a 50% drop in network throughput.
During testing, users should avoid any other activity. On Windows servers, all users should log out and disconnect from RDP whenever possible.
Machine virtualization allows a "guest" operating system to be run inside a simulated environment controlled by a program running on a real "host" system. Because the guest is a full operating system, much of what it does is redundant with the operations of the host operating system. Thus a virtualized system more than doubles the amount of processing that must occur with each network datagram. In particular, all of the other problems that may affect a system are at least doubled in a virtualized system.
Virtualized server environments often involve multiple guest systems, all contending for limited CPU, memory, and hard drive access. These different redundant systems can interact in unexpected ways. Network performance may be severely reduced for some traffic at some times, while it is improved for other traffic at other times. Generally speaking, virtualization involves substantial performance costs. This effect can be mitigated by testing in the host, rather than the guest, system.
See the "Virtual & Cloud Machines" section of Tech Note 0023 for more virtual machine setup recommendations.
A typical Internet path involves between ten and twenty "hops", which are declared routers or other network nodes. For example, a datagram might travel through the source computer, a wireless gateway, an ethernet hub, a DSL or cable modem, an ATM switch, many fiber optic routers, a T1 line and router, several ethernet hubs, and the destination computer. Additional devices may be hidden between the hops, at the telecom level.
Following is a list of the components and factors most likely to impact
network performance. Note that several different components may be combined into a single network device. For example, a typical consumer DSL router includes a wireless gateway, ethernet switch, firewall, NAT, VPN, and modem in a single box.
Hubs, Switches, Routers, and Gateways
These devices relay network traffic amongst network links. By itself, this
functionality is typically very fast and rarely causes any problems.
However, devices with these names often have one or more of the following
components, which can be more problematic.
A firewall examines each datagram it receives and applies a set of rules to
determine whether or not it should be allowed to pass through. Hardware firewalls typically do this very quickly with little or no impact on network performance. However, this depends on the rules being correctly configured.
Hardware firewalls must be configured to explicitly allow MTP/IP traffic to
pass through without interference. If this is not done, then the
firewall may block or degrade MTP/IP's performance. It is possible for
a firewall to initially allow traffic to pass through, but then reduce or
cut-off that traffic after several minutes. Users may be accustomed to
their firewall correctly guessing what to do without explicit configuration,
but those guesses may not be correct when MTP is introduced. See technical note 0002, "Configuring Firewalls", for
advice on configuring firewalls.
Network emulators use statistical models that are designed around TCP and TCP-like network traffic. While emulators can be useful, they must be carefully configured using statistics appropriate to the traffic being tested. DEI strongly recommends testing MTP in real-world environments whenever possible. If you must use an emulator, carefully read Tech Note 0022 for information on programming it with the best possible data.
Network Address Translation (NAT)
Machines on a Local Area Network may use private IP addresses to communicate
with each other while sharing a single public IP address to communicate with
the rest of the Internet. This is done both for security and to conserve
scarce public addresses.
When a private machine seeks to talk to the public Internet, the NAT device
must translate the private address into the public one. Because many
private machines may be sharing the same public address, the NAT device has
to keep track of which traffic belongs to which private machine. This
involves keeping track of incoming and outgoing port numbers, and sometimes
involves changing those port numbers.
If a private machine initiates an MTP/IP transaction, the NAT device should
take note of the outgoing traffic and automatically route returning traffic
back to the correct machine. However, some NAT devices may forget this
information, causing an ongoing transaction to fail after several minutes.
The private machine will not be able to receive transactions initiated by
outside systems unless the NAT is specifically instructed to "map" external
ports to the internal machine. NAT port mapping is a common task when
setting up any server behind a NAT device.
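The behavior described in this section can be sketched as a small translation table. This is an illustrative model only: the class, its method names, and the timeout value are assumptions, and real NAT devices differ (for example, in whether returning traffic refreshes an entry; this sketch refreshes only on outgoing traffic).

```python
from typing import Dict, List, Optional, Tuple

Endpoint = Tuple[str, int]  # (private IP address, private port)


class NatTable:
    """Toy NAT table with idle expiry and static port mapping."""

    def __init__(self, idle_timeout_s: float, first_public_port: int = 40000) -> None:
        self.idle_timeout_s = idle_timeout_s
        self._next_port = first_public_port
        self._by_private: Dict[Endpoint, List[float]] = {}  # -> [public_port, last_seen]
        self._by_public: Dict[int, Endpoint] = {}
        self._static: Dict[int, Endpoint] = {}               # never expires

    def map_port(self, public_port: int, private_ip: str, private_port: int) -> None:
        """Explicitly map an external port to an internal machine."""
        self._static[public_port] = (private_ip, private_port)

    def outbound(self, private_ip: str, private_port: int, now: float) -> int:
        """Record outgoing traffic and return the public port used for it."""
        self._expire(now)
        key = (private_ip, private_port)
        entry = self._by_private.get(key)
        if entry is None:
            entry = [self._next_port, now]
            self._next_port += 1
            self._by_private[key] = entry
            self._by_public[int(entry[0])] = key
        entry[1] = now
        return int(entry[0])

    def inbound(self, public_port: int, now: float) -> Optional[Endpoint]:
        """Return the private endpoint for returning traffic, or None if unknown."""
        if public_port in self._static:
            return self._static[public_port]
        self._expire(now)
        return self._by_public.get(public_port)

    def _expire(self, now: float) -> None:
        """Forget dynamic entries that have been idle for too long."""
        stale = [key for key, (_, seen) in self._by_private.items()
                 if now - seen > self.idle_timeout_s]
        for key in stale:
            port, _ = self._by_private.pop(key)
            del self._by_public[int(port)]


if __name__ == "__main__":
    nat = NatTable(idle_timeout_s=120)
    port = nat.outbound("192.168.1.10", 5000, now=0)
    print(nat.inbound(port, now=60))    # still remembered
    print(nat.inbound(port, now=200))   # forgotten: the transaction would fail
```

In this model, a transaction whose outgoing side is quiet for longer than the timeout loses its entry, which corresponds to the mid-transfer failures described above. A static mapping is unaffected.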
Virtual Private Network (VPN)
VPN hardware is similar to a gateway or router in that it stops each data
packet it receives and decides where to send it next. The difference
with a VPN is that it simulates virtual network paths by encapsulating the
data packets within other packets or data streams. One common example
of this is allowing machines on different private LANs to communicate using
private IP addresses. VPNs often add authentication and encryption to
provide additional security to the traffic they handle.
This encapsulation can be performed either by datagrams or by TCP
streams. Datagram based VPNs (particularly IPsec) process one datagram
at a time, adding a small amount of data to each one before sending it on
its way. This may, in rare cases, cause MTU problems, but
is otherwise unlikely to cause performance problems.
Low quality, and some very old, VPNs "tunnel" all datagrams across a single
TCP/IP connection. This includes so-called SSL VPNs. These introduce
substantial network overhead and greatly worsen TCP flow control and
congestion problems. SSL and other tunneling VPNs exhibit poor network
performance under all circumstances. MTP/IP cannot be used with SSL or
other tunneling VPNs.
Maximum Transmit Unit (MTU)
Internet Protocol (IP) networks transmit data in discrete packets called
datagrams. Each datagram has source and destination addresses, some
descriptive information, and the data payload. IP allows datagrams to
be up to 65535 bytes in total size. However, most network media limit
datagrams to much smaller sizes. Ethernet, for example, typically
limits datagrams to just 1500 bytes. IP provides a mechanism for
network devices to divide up datagrams that are too large for a particular
link. This process is called fragmentation. Fragmentation
introduces extra overhead per-datagram, but bigger payloads mean fewer datagrams and usually result in improved throughput.
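The cost of fragmentation can be estimated with some simple arithmetic. The sketch below assumes IPv4 with a 20 byte header and no options, and relies on the rule that every fragment except the last must carry a multiple of 8 payload bytes. The function names are illustrative.

```python
import math

IP_HEADER = 20  # assumption: IPv4 header with no options


def fragment_count(datagram_bytes: int, mtu: int) -> int:
    """Return how many IP datagrams are sent for one datagram of
    `datagram_bytes` total size (header included) on a link with this MTU."""
    if mtu < IP_HEADER + 8:
        raise ValueError("MTU too small to carry any fragment payload")
    if datagram_bytes <= mtu:
        return 1
    payload = datagram_bytes - IP_HEADER
    # Fragment payloads (except the last) must be a multiple of 8 bytes.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    return math.ceil(payload / per_fragment)


def bytes_on_wire(datagram_bytes: int, mtu: int) -> int:
    """Return total IP bytes sent, counting one header per fragment."""
    payload = datagram_bytes - IP_HEADER
    return payload + fragment_count(datagram_bytes, mtu) * IP_HEADER


if __name__ == "__main__":
    for size in (1500, 9000, 65535):
        print(size, fragment_count(size, 1500), bytes_on_wire(size, 1500))
```

For example, a 9000 byte datagram crossing a 1500 byte link becomes seven fragments and 9120 bytes on the wire, so the added header overhead is small, but every fragment must arrive for the datagram to be reassembled.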
Most devices limit datagram sizes, even with fragmentation. But due to a lack of standards conformance, it is not possible to know for certain how large an MTU will be supported by any given network path. In particular, some devices may support large datagrams at a severe performance penalty, while others may give improved performance with large datagrams, and still others may silently discard large datagrams. This situation may be further complicated by VPNs or other tunneling protocols (such as PPPoE), which add to the size of datagrams without telling the computers at either end. Thus a datagram which is transmitted at a proper size can grow along the path to become too large.
For this reason, MTP/IP usually limits its datagrams to a total of 1480 bytes, including UDP and IP headers. This leaves 20 bytes for MPLS, PPPoE, or simple IPsec overhead. MTP/IP will attempt to detect when smaller datagrams are required. However, in some environments, it may be helpful to tell MTP/IP not to exceed a specific limit. In most MTP/IP applications, this is done with the MaxDatagram configuration option. This value should be set to about 56 bytes less than the known MTU limit.
If you are using MTP/IP with a VPN, PPPoE, or other tunneling protocol and performance is poor, try setting MaxDatagram to a value of 1280. If this improves performance, you may try increasing the value in increments of 16 until performance no longer improves.
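The MaxDatagram guidance above can be written down as a short procedure. This is a sketch under stated assumptions: `measure` is a hypothetical callable that you would supply, which runs a test transfer with MaxDatagram set to the given value and returns the observed throughput. The default ceiling of 1480 is taken from the usual MTP/IP limit mentioned above, not from any documented tuning rule.

```python
from typing import Callable


def suggested_max_datagram(known_mtu: int) -> int:
    """Return a MaxDatagram value about 56 bytes below a known MTU limit."""
    return known_mtu - 56


def tune_max_datagram(measure: Callable[[int], float],
                      start: int = 1280,
                      step: int = 16,
                      ceiling: int = 1480) -> int:
    """Start at `start` and raise the size in increments of `step`
    until throughput no longer improves. Returns the best size found."""
    best_size = start
    best_rate = measure(start)
    size = start + step
    while size <= ceiling:
        rate = measure(size)
        if rate <= best_rate:
            break  # performance no longer improves
        best_size, best_rate = size, rate
        size += step
    return best_size


if __name__ == "__main__":
    # Fake measurement for demonstration: throughput collapses above 1424 bytes,
    # as it might when a tunnel silently adds overhead.
    def fake_measure(size: int) -> float:
        return float(size) if size <= 1424 else 10.0

    print(suggested_max_datagram(1500))
    print(tune_max_datagram(fake_measure))
```

Real throughput measurements are noisy, so each `measure` call should cover a long enough transfer to give a stable result.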
Some network equipment supports Jumbo ethernet frames (9000 MTU) or larger. Larger datagram sizes greatly reduce overhead for every device on the path and are essential for multigigabit performance. However, if any device in the path does not support at least Jumbo frames, then increasing MTP's datagram size may severely reduce network performance or cause a loss of connectivity. Where Jumbo frames are supported by every device in the path, substantial performance improvements can be achieved by setting MTP's MinDatagram to 8192. Even larger values may be used for paths which support Super Jumbo frames.
10 gigabit or faster network equipment will support Jumbo ethernet frames (9000 MTU) or larger. Conversely, a network path which fragments datagrams larger than 1500 bytes almost certainly contains at least one component which is not capable of 10 gigabit speeds. This is a common problem with bonded links. See Tech Note 0032 for more about working with 10 gigabit networks.
Wireless communications, including 802.11 WiFi, Bluetooth, satellite, and
various cellular mechanisms, present numerous additional challenges.
Wireless media is subject to highly variable conditions which can degrade
performance. Different media types react to problems in different
ways. Some will attempt to correct for lost or corrupted data. This
results in data delays and can cause transfers to "freeze up" for
significant periods of time. Others simply drop corrupted data,
allowing the transport and application layers to handle recovery.
MTP/IP is very efficient at error recovery, so disabling hardware error correction may improve MTP/IP performance.
Some wireless media, particularly some satellite and cellular links, may
entail tunneling across TCP/IP connections. MTP/IP will perform very poorly
in such environments. Unfortunately, there is no easy way to know what sort
of loss policies a given network is implementing, except by experimentation.
Multiplexed or Bonded Lines
Sometimes multiple low-speed data links can be combined to form a virtual high-speed link. For example, three T1 lines might be bonded to provide 4.5 megabits per second instead of the single 1.5. Proper multiplexing requires that devices be installed at both ends to both split apart and recombine the data flows. The devices must correctly balance the flow of data packets across all available lines.
A bonded line is not the same as a single line of the same speed.
Performance of multiplexed or bonded lines varies tremendously. Lines bonded at the telecom level may perform almost as well as a real line of the same total speed. But lines which are bonded with end-user multiplexing hardware often limit individual data flows to the speed of just a single line. Bonded lines may have difficulty distributing data during high loads and may exhibit erratic behavior with different traffic types. Proper configuration of end-user multiplexers is essential to good performance.
To achieve maximum performance with MTP/IP over a bonded or multiplexed data path, all lines must converge to a single IP address at each end and the multiplexing hardware must be configured to distribute UDP/IP traffic in round-robin or even distribution. Multiplexers which claim to perform "load-balancing" often do not properly handle UDP/IP or have separate configurations for UDP/IP, so examine their configurations carefully.
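The difference between a bonded line and a single line of the same speed, and the effect of round-robin distribution, can be shown with some simple arithmetic. The function names and the per-flow restriction flag are illustrative assumptions. Actual multiplexer behavior depends on the hardware and its configuration.

```python
from typing import List, Tuple


def bonded_rates(line_mbps: float, lines: int, per_flow_limited: bool) -> Tuple[float, float]:
    """Return (aggregate rate, best rate for a single data flow) in Mbit/s.

    `per_flow_limited` models multiplexers that pin each flow to one line."""
    aggregate = line_mbps * lines
    single_flow = line_mbps if per_flow_limited else aggregate
    return aggregate, single_flow


def distribute_round_robin(datagrams: int, lines: int) -> List[int]:
    """Return how many datagrams each line carries under round-robin distribution."""
    counts = [0] * lines
    for i in range(datagrams):
        counts[i % lines] += 1
    return counts


if __name__ == "__main__":
    # Three T1 lines, using the rounded 1.5 Mbit/s figure from the text.
    print(bonded_rates(1.5, 3, per_flow_limited=True))    # (4.5, 1.5)
    print(bonded_rates(1.5, 3, per_flow_limited=False))   # (4.5, 4.5)
    print(distribute_round_robin(10, 3))                  # [4, 3, 3]
```

A single MTP/IP transfer is one flow, so a multiplexer that limits each flow to one line will hold it to the speed of that line regardless of the aggregate.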
WAN Acceleration Appliances
Some devices attempt to capture and modify network traffic for the purpose of improving performance. This typically involves compression, caching, or de-duplication. Most such devices ignore UDP datagrams. Some, notably SilverPeak, will attempt to capture and modify UDP datagrams.
Even if it does not modify MTP/IP's UDP datagrams, the overhead created by the device may slow or limit network throughput. Legacy devices may have performance limits which are below that of the WAN network path. Devices which do modify UDP datagrams may cause corrupt packets and severely reduce or block MTP/IP throughput.
WAN Acceleration Appliances should be bypassed for MTP/IP traffic unless tests show a clear benefit.
Bandwidth Management Devices
This is a broad category which includes any device that actively interferes
with the datagrams flowing through it in an effort to change performance. A
common function is prioritization, in which a set of rules is applied to
determine whether performance of some traffic should be degraded in favor of
other traffic. Some devices attempt to compress or cache traffic in order
to reduce the amount of data traveling on the network. Some observe network
latency and will degrade the performance of any traffic which appears to
cause latency to exceed some threshold.
The functioning of such devices is heavily dependent on their correct
configuration. In particular, they must be properly programmed with the
priorities of the users and network managers. If those priorities, or the
network itself, change over time, the devices may begin to greatly hinder
performance.
MTP/IP should work well with correctly configured devices that obey Internet
Protocol standards. Other devices may severely degrade MTP/IP performance
and should be investigated to ensure that they are functioning as intended.
Though not a component of the network, third-party traffic flow often has
the largest impact on network performance. Any data traveling across any
part of the same network path will reduce the performance of any other data
traveling that path. Ideally, network resources would be fairly divided
amongst all users. But variation in capacities, use patterns, and
requirements often makes it impossible to know what "fairly" means, let
alone enforce it. Worse, oscillations in traffic flows can cause some
traffic to create interference out of proportion to the bandwidth it uses.
All else being equal, MTP/IP will perform much better than TCP/IP when third-party traffic is a factor. MTP/IP avoids the types of oscillations that cause some TCP/IP flows to excessively interfere with each other. However, networking is ultimately a zero-sum game: once the total data flow reaches the capacity of the network, no flow can gain without another one losing. Ideally, tests should be performed when third-party traffic is absent or at least steady. See technical note 0003, "Analyzing Network Performance", for additional testing advice.