Tech Note 0035

Linux Performance Tuning

Optimizing network throughput for Linux systems

Most Linux distributions include default configurations which are not optimized for high performance.  For network speeds above a few hundred megabits per second, these legacy settings can severely impair throughput.  Check the settings below whenever installing MTP/IP software on a system where such speeds are expected.  This advice is specific to Linux systems.  Advice for improving performance on all platforms can be found in Tech Note 0023.

UDP Buffers

MTP/IP software uses the UDP/IP packet format to provide network and operating system compatibility.  The UDP/IP buffer size determines how much data the operating system will store while handling other I/O operations.  Many Linux distributions limit UDP buffer sizes to just 128 kilobytes: enough for only 1 millisecond of data on a gigabit network.  Such small buffers can lead to high packet loss and limited network throughput.
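The 1 millisecond figure follows from dividing the buffer size by the link rate. As a rough sketch, the helper below (buffer_fill_us is a name invented for this illustration, not part of any MTP/IP tool) does the arithmetic in whole microseconds and ignores packet overhead:

```shell
#!/bin/sh
# buffer_fill_us BYTES BITS_PER_SEC
# Print how many microseconds of line-rate traffic a buffer of BYTES can hold.
# Integer arithmetic only; the result is truncated, not rounded.
buffer_fill_us() {
    echo $(( $1 * 8 * 1000000 / $2 ))
}

buffer_fill_us 131072 1000000000     # 128 kilobyte buffer at 1 gigabit per second
buffer_fill_us 2097152 1000000000    # 2 megabyte buffer at 1 gigabit per second
```

The two calls print 1048 and 16777, or roughly 1 millisecond and 17 milliseconds of data.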

Check the current UDP/IP buffer limit by typing the following commands:

sysctl net.core.wmem_max
sysctl net.core.rmem_max

If the values are less than 2097152 bytes, you should add the following lines to the /etc/sysctl.conf file:

net.core.wmem_max=2097152
net.core.rmem_max=2097152

Changes to /etc/sysctl.conf do not take effect until reboot.  To update the values immediately, type the following commands:

sudo sysctl -w net.core.wmem_max=2097152
sudo sysctl -w net.core.rmem_max=2097152

Increasing these limits will not affect most applications.  Only applications which specifically request larger buffers are affected.  MTP/IP will request the largest buffer size available, up to 2 megabytes.  For more information about UDP buffer tuning, see Tech Note 0024.
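The check described in this section can also be scripted. The following is a minimal sketch, not a supported tool: the function names limit_status and report_udp_limits are invented for this example, and the script only reports values without changing anything.

```shell
#!/bin/sh
# Minimum limit recommended in this note (2097152 bytes).
MIN_BYTES=2097152

# limit_status VALUE
# Print "ok" if VALUE is at least MIN_BYTES, otherwise "too small".
limit_status() {
    if [ "$1" -ge "$MIN_BYTES" ]; then
        echo "ok"
    else
        echo "too small"
    fi
}

# report_udp_limits
# Read both UDP buffer limits with sysctl and report the status of each.
report_udp_limits() {
    for key in net.core.wmem_max net.core.rmem_max; do
        value=$(sysctl -n "$key") || return 1
        echo "$key=$value ($(limit_status "$value"))"
    done
}
```

After loading the functions, run report_udp_limits. Any limit reported as "too small" should be raised as described above.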

File Write Cache

By default, Linux delays writing data to storage until 10% of RAM is filled and will freeze all storage access for flushing when 20% of RAM is filled.  Systems with large amounts of RAM may experience extremely inconsistent file performance: very fast until the cache fills up, then crippling freezes while it empties.  This is especially problematic when network speeds are much faster than storage speeds, because gigabytes of data can become cached and then require many seconds or even minutes to flush.  During flushes, network performance may be impaired or suspended, resulting in poor overall throughput and dropped connections.  To ensure consistent high performance, these caches should be reduced so that storage can flush them quickly.

Storage caching is controlled by the sysctl variables vm.dirty_background_bytes and vm.dirty_bytes.  For networks with a high bandwidth delay product, these should be set to two and four times the bandwidth delay product, respectively.  The following values are good for most high-speed networks.  Avoid going much lower unless RAM is extremely limited.

sudo sysctl -w vm.dirty_background_bytes=125000000
sudo sysctl -w vm.dirty_bytes=250000000

To ensure that these values persist across restarts, add the following lines to /etc/sysctl.conf:

vm.dirty_background_bytes=125000000
vm.dirty_bytes=250000000

Changing these values may reduce performance for local applications which infrequently write amounts of data larger than vm.dirty_background_bytes but smaller than 10% of RAM.  However, it will greatly reduce the chance of data loss in case of a system crash.  See Tech Note 0023 for advice on improving the performance of the storage hardware.
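For a path with a known bandwidth and round-trip time, the two and four times rule can be worked out directly. The sketch below assumes a hypothetical 10 gigabit per second path with a 50 millisecond round trip. That example is an assumption chosen for illustration; this note does not state how its recommended values were derived. The function names are likewise invented here.

```shell
#!/bin/sh
# bdp_bytes BITS_PER_SEC RTT_MS
# Print the bandwidth delay product in bytes:
# bits per second * seconds / 8, with the delay given in milliseconds.
bdp_bytes() {
    echo $(( $1 * $2 / 8000 ))
}

# dirty_settings BITS_PER_SEC RTT_MS
# Print sysctl.conf lines set to two and four times the product.
dirty_settings() {
    bdp=$(bdp_bytes "$1" "$2")
    echo "vm.dirty_background_bytes=$(( bdp * 2 ))"
    echo "vm.dirty_bytes=$(( bdp * 4 ))"
}

# Hypothetical path: 10 gigabits per second, 50 millisecond round trip.
dirty_settings 10000000000 50
```

For this example path the product is 62500000 bytes, so the script prints vm.dirty_background_bytes=125000000 and vm.dirty_bytes=250000000, the same values suggested above.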

NFS Mount Options

Performance for the NFS network attached storage protocol varies greatly depending on the version and configuration.  In some cases, there may be trade-offs between performance and reliability.  When using NFS with a NAS device, consult the device vendor for performance advice.  The following are general guidelines for both NFS devices and servers:

Update to the most recent operating system and NFS versions available.  Legacy NFS implementations may contain serious performance and file integrity bugs.  Verify that all systems are running an adequate number of biod and nfsd processes.

Use NFS over UDP when possible.

Use the following mount options:

async,rsize=32K,wsize=32K,soft,intr

Older systems may limit rsize and wsize to 8K.  Use the largest value available, up to 32K.  For some NFS servers, adding the no_wdelay option may further improve performance, so test with and without it.
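As an illustration of where these options go, the sketch below prints an /etc/fstab entry that uses them. The server name, export path, and mount point are hypothetical, and nfs_fstab_line is a helper invented for this example. Sizes are written in bytes (32768 for 32K, 8192 for 8K), on the assumption that a plain byte count is accepted by the client in use.

```shell
#!/bin/sh
# nfs_fstab_line SERVER EXPORT MOUNTPOINT [SIZE_BYTES]
# Print an /etc/fstab entry using the mount options recommended above.
# SIZE_BYTES defaults to 32768 (32K); pass 8192 on systems limited to 8K.
nfs_fstab_line() {
    size=${4:-32768}
    echo "$1:$2 $3 nfs async,rsize=$size,wsize=$size,soft,intr 0 0"
}

# Hypothetical server, export, and mount point, for illustration only.
nfs_fstab_line nas.example.com /export/data /mnt/nas
```

Review the printed line before adding it to /etc/fstab, and test the mount by hand first, since supported options vary between NFS versions.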

Some NFS servers and devices have limited support or poor performance for file locks.  The ExpeDat and SyncDat server, servedat, uses such locks by default when running on Linux.  If you are receiving otherwise unexplained Permission Denied errors when uploading to NFS, try setting the NoEXCL option.

Tuning NFS performance can be a complex endeavor, especially when dealing with legacy or highly customized systems, or specialty hardware.  See Tech Note 0029 for general guidance on tuning network attached storage.

CPU Performance

Modern CPUs are able to adjust their performance to conserve energy.  Most Linux distributions enable that energy saving mode by default.  This can impair network throughput, especially for multigigabit networks.  To ensure maximum performance, scaling_governor must be set to performance for every CPU core.  This must be done on the bare-metal operating system as virtual machines do not have direct access to these hardware settings.

While some distributions include utilities for easily managing CPU performance, the instructions below use the most common technique of adjusting each core separately via /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor.  To view the current settings, type:

cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

To change to performance mode, you must type a separate command for each CPU core.  These commands must be run as root.  For example:

ls -l /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
...

To ensure that these changes persist across reboots, add the echo commands to /etc/rc.local, creating that file if necessary.  Verify the settings after reboot as the requirements for different Linux distributions may vary.

Setting scaling_governor to performance may increase energy use and heat output of the host hardware.  Adjusting scaling_governor is not usually necessary for network speeds of 1 gigabit per second or slower.
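On systems with many cores, the per-core commands can be replaced by a loop. The following is a minimal sketch under the same assumptions as the manual commands: it must be run as root on the bare-metal operating system. The function name set_governor and its optional directory argument are invented for this example. The directory argument exists only so that the function can be tried against a scratch directory.

```shell
#!/bin/sh
# set_governor MODE [CPU_DIR]
# Write MODE to the scaling_governor file of every CPU core under CPU_DIR,
# then print the number of cores updated.
# CPU_DIR defaults to the path used in this note.
set_governor() {
    mode=$1
    cpu_dir=${2:-/sys/devices/system/cpu}
    count=0
    for gov in "$cpu_dir"/cpu*/cpufreq/scaling_governor; do
        # Skip entries without a governor file (no match, or no cpufreq support).
        [ -e "$gov" ] || continue
        echo "$mode" > "$gov" || return 1
        count=$(( count + 1 ))
    done
    echo "$count"
}
```

Calling set_governor performance as root updates every core and prints how many were changed. A result of 0 suggests that frequency scaling is not exposed on that system, as is typical inside a virtual machine.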

Tech Note History

Jan 28, 2020  Clarified recommended vm.dirty values
Nov 15, 2019  Updated NFS options
Feb 23, 2018  First post