Data Expedition, Inc. ®

Move Data Faster

Page Index:
Operating System
CPU Usage
Hard Drive Usage
Software Firewalls
Software VPNs
Memory
Network Adapter
System Clock
User Interaction
Virtual Machines
Routers
Hardware Firewalls
NAT
Hardware VPNs
MTU
Wireless
Multiplexers
Bandwidth Managers
Other Traffic
Tech Note History
Nov. 08 2007: Compression
June 19 2007: First Post
 

Common Network Performance Problems

Transferring data from one computer to another involves passing that data through dozens, sometimes hundreds, of software and hardware components.  Problems at any one of these steps may severely impede the overall performance of the transfer.  Sometimes components which function normally by themselves interact in unexpected ways to create new problems.  As the number of components increases, the likelihood that at least one of them will cause a problem rises quickly.
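To illustrate how the likelihood of trouble grows with the number of components, the following sketch assumes that every component independently has the same small chance of causing a problem.  Both the independence and the 1% figure are simplifying assumptions chosen for illustration, not measurements.

```python
def chance_of_any_problem(per_component_chance: float, components: int) -> float:
    """Probability that at least one of `components` independent parts misbehaves."""
    if not 0.0 <= per_component_chance <= 1.0:
        raise ValueError("per_component_chance must be between 0 and 1")
    if components < 0:
        raise ValueError("components must not be negative")
    # All parts behave with probability (1 - p) ** n; anything else is a problem.
    return 1.0 - (1.0 - per_component_chance) ** components


if __name__ == "__main__":
    # Illustrative 1% chance per component (an assumption, not a measurement).
    for count in (1, 10, 50, 100, 200):
        print(f"{count:4d} components: {chance_of_any_problem(0.01, count):.1%}")
```

Under these assumptions, a path of 100 components has roughly a 63% chance of containing at least one misbehaving component, even though each one is 99% reliable on its own.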


Workstation or Server

A file transfer may pass through the following components before it ever leaves the system: Hard disk drive, system disk cache, transfer application, network stack, software VPN, software firewalls and filters, network drivers, and the hardware network adapter.

Each stage involves multiple passes through the computer's CPU and memory systems.  In some configurations, the data may encounter multiple layers at each stage or even pass through some stages multiple times.  Thus a single piece of data may be read and written many times, and interact with nearly every component of the system, before it ever leaves the computer.

As a result, there are many opportunities for the data to become delayed, blocked, or even corrupted.  Following is a list of the components and factors most likely to impact network performance.

Operating System

The operating system manages the resources and settings for all of the other system components.  Driver configuration, CPU prioritization, IP stack tuning, virtual memory, and many other factors can impede performance. Generally speaking, older operating system versions will have poorer performance than newer ones.  Windows systems will have poorer performance than Unix systems of the same vintage.  Most users stick with default operating system settings, so it is rare to find problems beyond the choice of system itself.

CPU Usage

The CPU(s) must be shared by all of the software running on a system.  In modern operating systems, there are typically twenty to sixty programs running even when the user isn't doing anything.  All of these should be idle most of the time.  But if just one of those programs tries to "hog" the CPU, it can seriously impede performance.  Systems with multiple CPUs are not necessarily faster than those with single CPUs.  Windows is sometimes slower on multiple CPU systems.  Windows users can view CPU usage by displaying the Task Manager (control-alt-delete) and sorting by CPU.

The CPU can also constrain performance if you are using compression in conjunction with your data transfer.  ExpeDat, for example, allows you to apply ZLIB compression to file transfers.  Disabling compression may improve throughput, especially on very fast networks.
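One rough way to judge whether compression could become the bottleneck is to measure how fast the CPU can compress representative data and compare that rate with the network speed.  The sketch below uses Python's standard zlib module as a stand-in; it does not reproduce how ExpeDat applies compression, and the sample data and compression level are assumptions.

```python
import time
import zlib
from typing import Tuple


def compression_rate(data: bytes, level: int = 6) -> Tuple[float, float]:
    """Return (megabits of input compressed per second, compressed/original size)."""
    if not data:
        raise ValueError("data must not be empty")
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    # Guard against a zero reading from a coarse timer.
    elapsed = max(time.perf_counter() - start, 1e-9)
    megabits_per_second = len(data) * 8 / elapsed / 1_000_000
    return megabits_per_second, len(compressed) / len(data)


if __name__ == "__main__":
    # Highly repetitive sample data; real files may compress far less well.
    sample = b"sample log line 12345\n" * 200_000
    rate, ratio = compression_rate(sample)
    print(f"compressed {rate:.0f} Mbit/s of input, output is {ratio:.1%} of original")
```

If the measured rate is below the speed of the network path, compression is likely to limit throughput rather than help it.  Results vary widely with the data, the CPU, and whatever else the system is doing at the time.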

Hard Drive Usage

Most data transfers begin and end with a hard drive.  This is usually the slowest component within the machine.  The speed of hard drive data transfer varies tremendously, often without obvious feedback to the user.  In particular, if two or more programs try to access the hard drive at the same time, data throughput can drop sharply.  Even if the system appears idle, one of the many background processes can still cause drive access. Even intermittent access can cause data transfers to freeze for seconds at a time, leading to greatly reduced network performance.

Hard drive limitations become dominant when the network is very fast or when there are consistently multiple processes trying to access the drive.  For example, trying to download two different files at once from a gigabit ethernet LAN to a single consumer hard drive may be much slower than downloading one after the other.  High-end drives or RAID configurations may improve overall performance, but are still highly variable and severely impacted by multiple accesses.

Firewalls and Filters

Many operating systems now include software which performs network functions traditionally done in outside hardware.  This includes firewalls, which attempt to block or alter network traffic based on network characteristics, and filters which attempt to block or alter network traffic based on content.  Modern Windows systems enable a software firewall by default.  Most other operating systems do not.

All such software works by stopping each data packet, checking it against a set of rules, and then deciding what to do with that packet.  Depending on the amount of checking, this can greatly delay the packet and consume substantial CPU resources.  This can be especially problematic when there are multiple layers, such as combining firewall, content filter, router, and network address translation software.  The best way to minimize problems with these components is to disable those that are not necessary and to set rules in the rest that explicitly exempt MTP/IP traffic.  See technical note 0002, "Configuring Firewalls".  The Windows Vista firewall is particularly problematic if it is not precisely configured.  See technical note 0006, "Configuring Vista Firewall".
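The per-packet rule checking described above can be sketched as a first-match rule list.  This is a toy model rather than the behavior of any particular firewall product: the Rule fields, the first-match policy, and the port number are all assumptions for illustration, and MTP/IP traffic is assumed here to be UDP datagrams on a placeholder port.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Placeholder only: substitute the port your MTP/IP application actually uses.
EXAMPLE_MTP_PORT = 5000


@dataclass(frozen=True)
class Rule:
    action: str               # "allow" or "block"
    protocol: Optional[str]   # "udp", "tcp", or None to match any protocol
    port: Optional[int]       # destination port, or None to match any port


def decide(rules: List[Rule], protocol: str, port: int,
           default: str = "block") -> Tuple[str, int]:
    """Return (action, rules_checked) using first-match-wins evaluation."""
    for checked, rule in enumerate(rules, start=1):
        if rule.protocol not in (None, protocol):
            continue
        if rule.port not in (None, port):
            continue
        return rule.action, checked
    # No rule matched, so every rule was examined before falling back.
    return default, len(rules)


if __name__ == "__main__":
    exemption = Rule("allow", "udp", EXAMPLE_MTP_PORT)
    others = [Rule("block", "tcp", p) for p in range(1000, 1050)]
    print(decide([exemption] + others, "udp", EXAMPLE_MTP_PORT))
    print(decide(others + [exemption], "udp", EXAMPLE_MTP_PORT))
```

In this model, an explicit exemption placed at the top of the list is matched after a single check, while the same exemption at the bottom is reached only after every other rule has been examined.  Real firewalls differ in how they order and evaluate rules, so consult the documentation for the product in use.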

Virtual Private Network (VPN)

VPN software is similar to a firewall or filter in that it stops each data packet and processes it at significant cost.  This processing may include compression or encryption, which is especially costly.  Datagram based VPNs (particularly IPsec) process one datagram at a time, adding a small amount of data to each one before sending it on its way.  This introduces some CPU load and, in rare cases, MTU problems.  Datagram based VPNs are unlikely to hurt performance much, except on very fast LANs.  Low quality VPNs "tunnel" all datagrams across a single TCP/IP connection.  This includes so-called SSL VPNs.  These introduce substantial network overhead and greatly worsen TCP flow control and congestion problems.  SSL and other tunneling VPNs exhibit poor network performance under all circumstances.  MTP/IP cannot be used with SSL or other tunneling VPNs.

Memory

Although network data transfer does not typically involve much direct memory usage, the amount of available memory can significantly affect the performance of other operating system components.  Inadequate system memory will cause the system to access the hard drive more often, leading to poor performance reading or writing to disk.

Network Adapter

The hardware component that moves data from the operating system to the physical media depends on correct drivers and settings to operate efficiently.  Network hardware and driver problems are rare these days.  They are also difficult to isolate because they may be intermittent and may not affect all traffic equally.

System Clock

MTP/IP depends heavily on the ability to accurately measure the timing of data transmission and arrival.  It relies on the operating system to provide this timing information.  Anything which disrupts the system timing will adversely affect MTP/IP.  For example, radically changing the system clock, or putting the system to sleep for a while, will likely cause any running MTP/IP transactions to fail.  More subtle problems may arise if the system clock is faulty, such as due to a failing motherboard battery or incorrectly configured network time management software (NTP).  Such problems are rare on newer systems, but battery failure is a distinct possibility on hardware over five years old.  Users must avoid situations which disrupt the timing of MTP/IP applications or the system as a whole.
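When running tests, it can be useful to check whether the system clock was disturbed while a transfer was in progress.  The sketch below compares the wall clock against a monotonic clock and reports any disagreement.  It is a generic diagnostic, not a description of how MTP/IP measures time; the ClockWatch name and the half-second tolerance are assumptions.

```python
import time


class ClockWatch:
    """Detects wall-clock jumps by comparing against a monotonic clock."""

    def __init__(self, wall=time.time, steady=time.monotonic):
        # The clock functions are injectable so the logic can be tested.
        self._wall = wall
        self._steady = steady
        self._wall_start = wall()
        self._steady_start = steady()

    def drift_seconds(self) -> float:
        """Wall-clock elapsed time minus monotonic elapsed time since creation."""
        wall_elapsed = self._wall() - self._wall_start
        steady_elapsed = self._steady() - self._steady_start
        return wall_elapsed - steady_elapsed

    def disrupted(self, tolerance: float = 0.5) -> bool:
        """True if the two clocks disagree by more than `tolerance` seconds."""
        return abs(self.drift_seconds()) > tolerance


if __name__ == "__main__":
    watch = ClockWatch()
    time.sleep(1.0)  # stand-in for the duration of a test transfer
    print("clock disrupted during test:", watch.disrupted())
```

A large drift suggests that the clock was reset or adjusted during the test, so the results of that run should be treated with suspicion.  Whether the monotonic clock advances while a system is asleep depends on the platform, so a sleep may or may not show up as drift.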

User Interaction

Any action performed by the user may disrupt the system in a way that impacts network performance.  For example, viewing a web page will substantially reduce hard drive, CPU, memory, and network performance. Windows in particular gives higher priority to user actions than to other processes.  During testing, users should avoid any other activity.

Virtual Machines

Machine virtualization allows a "guest" operating system to be run inside a simulated environment controlled by a program running on a real "host" system.  Because the guest is a full operating system, much of what it does is redundant with the operations of the host operating system.  Thus a virtualized system more than doubles the amount of processing that must occur with each network datagram.  In particular, all of the other problems that may affect a system are at least doubled in a virtualized system.

Virtualized server environments often involve multiple guest systems, all contending for limited CPU, memory, and hard drive access.  These different redundant systems can interact in unexpected ways.  Network performance may be severely reduced for some traffic at some times, while it is improved for other traffic at other times.  Generally speaking, virtualization involves substantial performance costs.  This effect can be mitigated by testing in the host, rather than the guest, system.


Network Devices

When traveling between computers, network datagrams will encounter many different systems.  A typical Internet path involves between ten and twenty "hops", which are declared routers or other network nodes.  For example, a datagram might travel through the source computer, a wireless gateway, an ethernet hub, a DSL or cable modem, an ATM switch, many fiber optic routers, a T1 line and router, several ethernet hubs, and the destination computer.

Additional devices may be hidden between the hops, particularly when multiple media are used along the path.  Thus dozens of devices will be involved in most wide area network transactions, most of them outside the control of the end-user.  As a result, there are many opportunities for the data to become delayed, blocked, or even corrupted.

Following is a list of the components and factors most likely to impact network performance.  Note that several different components may be combined into a single network device.  For example, a typical consumer DSL router includes a wireless gateway, ethernet switch, firewall, NAT, VPN, and modem in a single box.

Hubs, Switches, Routers, and Gateways

These devices relay network traffic amongst network links.  By itself, this functionality is typically very fast and rarely causes any problems. However, devices with these names often have one or more of the following components, which can be more problematic.

Hardware Firewalls

A firewall examines each datagram it receives and applies a set of rules to determine whether or not it should be allowed to pass through.  Hardware firewalls typically do this very quickly with little or no impact on network performance.  However, this depends on the rules being correctly configured.

Hardware firewalls must be configured to explicitly allow MTP/IP traffic to pass through without interference.  If this is not done, then the firewall may block or degrade MTP/IP's performance.  It is possible for a firewall to initially allow traffic to pass through, but then reduce or cut off that traffic after several minutes.  Users may be accustomed to their firewall correctly guessing what to do without explicit configuration, but those guesses may not be correct when MTP is introduced.  See technical note 0002, "Configuring Firewalls", for advice on configuring firewalls.

Network Address Translation (NAT)

Machines on a Local Area Network may use private IP addresses to communicate with each other while sharing a single public IP address to communicate with the rest of the Internet.  This is done both for security and to conserve scarce public addresses.

When a private machine seeks to talk to the public Internet, the NAT device must translate the private address into the public one.  Because many private machines may be sharing the same public address, the NAT device has to keep track of which traffic belongs to which private machine.  This involves keeping track of incoming and outgoing port numbers, and sometimes involves changing those port numbers.

If a private machine initiates an MTP/IP transaction, the NAT device should take note of the outgoing traffic and automatically route returning traffic back to the correct machine.  However, some NAT devices may forget this information, causing an ongoing transaction to fail after several minutes.

The private machine will not be able to receive transactions initiated by outside systems unless the NAT is specifically instructed to "map" external ports to the internal machine.  NAT port mapping is a common task when setting up any server behind a NAT device.
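The bookkeeping described above can be sketched as a small translation table.  This is a toy model, not the behavior of any real NAT device: the ToyNat class, its idle timeout, and its port numbering are assumptions, and an idle timeout is only one plausible reason a device might "forget" a mapping.  The addresses are reserved example values.

```python
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


class ToyNat:
    """Toy NAT table with dynamic mappings, idle expiry, and static port maps."""

    def __init__(self, public_ip: str, idle_timeout: float = 120.0,
                 first_port: int = 40000):
        self.public_ip = public_ip
        self.idle_timeout = idle_timeout
        self._next_port = first_port
        # public port -> (private endpoint, time last used)
        self._dynamic: Dict[int, Tuple[Endpoint, float]] = {}
        self._by_private: Dict[Endpoint, int] = {}
        self._static: Dict[int, Endpoint] = {}

    def map_port(self, public_port: int, private: Endpoint) -> None:
        """Permanently map an external port to an internal machine."""
        self._static[public_port] = private

    def outbound(self, private: Endpoint, now: float) -> Endpoint:
        """Translate an outgoing datagram's source and remember the mapping."""
        self._expire(now)
        port = self._by_private.get(private)
        if port is None:
            port = self._next_port
            self._next_port += 1
            self._by_private[private] = port
        self._dynamic[port] = (private, now)
        return (self.public_ip, port)

    def inbound(self, public_port: int, now: float) -> Optional[Endpoint]:
        """Find the private endpoint for returning traffic, or None to drop it."""
        if public_port in self._static:
            return self._static[public_port]
        self._expire(now)
        entry = self._dynamic.get(public_port)
        if entry is None:
            return None
        private, _ = entry
        self._dynamic[public_port] = (private, now)  # traffic keeps the entry alive
        return private

    def _expire(self, now: float) -> None:
        """Forget dynamic mappings that have been idle longer than the timeout."""
        stale = [port for port, (_, last) in self._dynamic.items()
                 if now - last > self.idle_timeout]
        for port in stale:
            private, _ = self._dynamic.pop(port)
            del self._by_private[private]
```

In this model, returning traffic is delivered only while a dynamic mapping is still remembered, and traffic initiated from outside is dropped unless map_port has been called for the port it arrives on.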

Virtual Private Network (VPN)

VPN hardware is similar to a gateway or router in that it stops each data packet it receives and decides where to send it next.  The difference with a VPN is that it simulates virtual network paths by encapsulating the data packets within other packets or data streams.  One common example of this is allowing machines on different private LANs to communicate using private IP addresses.  VPNs often add authentication and encryption to provide additional security to the traffic they handle.

This encapsulation can be performed either by datagrams or by TCP streams.  Datagram based VPNs (particularly IPsec) process one datagram at a time, adding a small amount of data to each one before sending it on its way.  In rare cases this causes MTU problems, but it is otherwise unlikely to cause performance problems.

Low quality, and some very old, VPNs "tunnel" all datagrams across a single TCP/IP connection.  This includes so-called SSL VPNs.  These introduce substantial network overhead and greatly worsen TCP flow control and congestion problems.  SSL and other tunneling VPNs exhibit poor network performance under all circumstances.  MTP/IP cannot be used with SSL or other tunneling VPNs.

Maximum Transmit Unit (MTU)

Internet Protocol (IP) networks transmit data in discrete packets called datagrams.  Each datagram has source and destination addresses, some descriptive information, and the data payload.  IP allows datagrams to be up to 65535 bytes in total size.  However, most network media limit datagrams to much smaller sizes.  Ethernet, for example, typically limits datagrams to just 1500 bytes.  IP provides a mechanism for network devices to divide up datagrams that are too large for a particular link.  This process is called fragmentation.  Fragmentation introduces extra overhead, but sometimes larger datagrams provide improved performance despite this.
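The overhead of fragmentation can be estimated with a short calculation.  The sketch below assumes IPv4 with a 20-byte header and no IP options, and relies on the IPv4 rule that every fragment except the last must carry a payload that is a multiple of 8 bytes.  The function name is illustrative.

```python
from typing import List

IPV4_HEADER_BYTES = 20  # assumes no IP options are present


def fragment_sizes(datagram_bytes: int, mtu: int) -> List[int]:
    """Return the total size (header plus payload) of each IPv4 fragment."""
    if datagram_bytes < IPV4_HEADER_BYTES:
        raise ValueError("datagram is smaller than an IPv4 header")
    if mtu < IPV4_HEADER_BYTES + 8:
        raise ValueError("MTU is too small to carry any fragment payload")
    if datagram_bytes <= mtu:
        return [datagram_bytes]  # fits as-is, no fragmentation needed
    payload = datagram_bytes - IPV4_HEADER_BYTES
    # Payload of every fragment but the last must be a multiple of 8 bytes.
    per_fragment = (mtu - IPV4_HEADER_BYTES) // 8 * 8
    sizes = []
    while payload > 0:
        chunk = min(per_fragment, payload)
        sizes.append(chunk + IPV4_HEADER_BYTES)
        payload -= chunk
    return sizes


if __name__ == "__main__":
    print(fragment_sizes(4000, 1500))  # a large datagram crossing ethernet
    print(fragment_sizes(1500, 1492))  # a full ethernet datagram crossing PPPoE
```

Each extra fragment repeats the IP header, and the loss of any one fragment means the whole datagram must be sent again.  The second example shows how a datagram that just fits ethernet is split into one nearly full fragment and one tiny fragment when the link MTU is only slightly smaller.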

Most devices limit datagram sizes, even with fragmentation.  But due to a lack of standards conformance, it is not possible to know for certain how large an MTU will be supported by any given network path.  In particular, some devices may support large datagrams at a severe performance penalty, while others may give improved performance with large datagrams, and still others may silently discard large datagrams.  This situation may be further complicated by VPNs or other tunneling protocols (such as PPPoE), which add to the size of datagrams without telling the computers at either end.  Thus a datagram of the proper size can grow to become too large.

For this reason, TCP/IP usually limits its datagrams to about 1444 bytes. MTP/IP has sophisticated algorithms to detect when it is advantageous to use larger datagrams.  However, in some environments, it may be necessary to tell MTP not to exceed a given limit.  In most MTP applications, this is done with the MaxDatagram configuration option.  This value should be set to about 56 bytes less than the known MTU limit.
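The arithmetic behind the suggested limit is simple, as the following sketch shows.  The 56-byte margin comes from this note; the function name and the tunnel_overhead parameter are illustrative assumptions.  How the resulting value is entered as MaxDatagram depends on the application, so consult its documentation for the exact syntax.

```python
MTP_HEADROOM_BYTES = 56  # margin below the MTU suggested in this note


def suggested_max_datagram(path_mtu: int, tunnel_overhead: int = 0) -> int:
    """Return a MaxDatagram value for a path with a known MTU limit.

    tunnel_overhead is the number of bytes that a VPN or other tunnel adds
    to each datagram after it leaves the sending computer.
    """
    usable = path_mtu - tunnel_overhead
    if usable <= MTP_HEADROOM_BYTES:
        raise ValueError("MTU is too small to leave room for any data")
    return usable - MTP_HEADROOM_BYTES


if __name__ == "__main__":
    print(suggested_max_datagram(1500))                     # plain ethernet
    print(suggested_max_datagram(1500, tunnel_overhead=8))  # ethernet with PPPoE
```

For plain ethernet this gives 1444 bytes.  PPPoE adds 8 bytes to each datagram, which lowers the result to 1436 bytes.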

Wireless

Wireless communications, including 802.11 WiFi, Bluetooth, satellite, and various cellular mechanisms, present numerous additional challenges.  Wireless media are subject to highly variable conditions which can degrade performance.  Different media types react to problems in different ways. Some will attempt to correct for lost or corrupted data.  This results in data delays and can cause transfers to "freeze up" for significant periods of time.  Others simply drop corrupted data, allowing the transport and application layers to handle recovery.  MTP/IP will perform best when it is allowed to manage its own error recovery.

Some wireless media, particularly some satellite and cellular links, may entail tunneling across TCP/IP connections.  MTP/IP will perform very poorly in such environments.  Unfortunately, there is no easy way to know what sort of loss policies a given network is implementing, except by experimentation.

Multiplexers

A bandwidth or line multiplexer attempts to combine the capacities of multiple data paths into a single large pipe.  For example, three independent T1 lines might be multiplexed in an attempt to provide 4.5 megabits per second instead of the single 1.5.  Multiplexing requires that devices be installed at both ends to both split apart and recombine the data flows.  The devices must correctly balance the flow of data packets across all available lines.

Performance of multiplexers varies greatly.  Some are able to efficiently distribute traffic, while others cause severe performance problems.  Many are sensitive to MTU (large datagram) issues.  Performance will often vary depending on the systems communicating.  MTP/IP should perform very well over a correctly functioning multiplexer.  But experience has shown that many multiplexers are not properly configured or do not correctly handle MTU issues.

Bandwidth Management Devices

This is a broad category which includes any device that actively interferes with the datagrams flowing through it in an effort to change performance.  A common function is prioritization, in which a set of rules is applied to determine whether performance of some traffic should be degraded in favor of other traffic.  Some devices attempt to compress or cache traffic in order to reduce the amount of data traveling on the network.  Some observe network latency and will degrade the performance of any traffic which appears to cause latency to exceed some threshold.

The functioning of such devices is heavily dependent on their correct configuration.  In particular, they must be properly programmed with the priorities of the users and network managers.  If those priorities, or the network itself, change over time, the devices may begin to greatly hinder performance.

MTP/IP should work well with correctly configured devices that obey Internet Protocol standards.  Other devices may severely degrade MTP/IP performance and should be investigated to ensure that they are functioning as intended.

Other Traffic

Though not a component of the network, third-party traffic flow often has the largest impact on network performance.  Any data traveling across any part of the same network path will reduce the performance of any other data traveling that path.  Ideally, network resources would be fairly divided amongst all users.  But variation in capacities, use patterns, and requirements often makes it impossible to know what "fairly" means, let alone enforce it.  Worse, oscillations in traffic flows can cause some traffic to create interference out of proportion to the bandwidth it consumes.

All else being equal, MTP/IP will perform much better than TCP/IP when third-party traffic is a factor.  MTP avoids the types of oscillations that cause some TCP flows to excessively interfere with each other.  However, networking is ultimately a zero-sum game: once the total data flow reaches the capacity of the network, no flow can gain without another one losing. Ideally, tests should be performed when third-party traffic is absent or at least steady.  See technical note 0003, "Analyzing Network Performance", for additional testing advice.

info@DataExpedition.com 877-292-2280
Copyright © 2000-2008 Data Expedition, Inc.  All Rights Reserved.