We've seen many features of TCP/IP that we've had to describe with the qualifier "it depends on the configuration." Typical examples are whether or not UDP checksums are enabled (Section 11.3), whether destination IP addresses with the same network ID but a different subnet ID are local or nonlocal (Section 18.4), and whether directed broadcasts are forwarded or not (Section 12.3). Indeed, many operating characteristics of a given TCP/IP implementation can be modified by the system administrator.
This appendix lists some of the configurable options for the various TCP/IP implementations that have been used throughout the text. As you might expect, every vendor does things differently from all others. Nevertheless, this appendix gives an idea of the types of parameters different implementations allow one to modify. A few options that are highly implementation specific, such as the low-water mark for the memory buffer pool, are not described.
These variables are described for informational purposes only. Their names, default values, or interpretation can change from one release to the next. Always check your vendor's documentation (or bug them for adequate documentation) for the final word on these variables.
This appendix does not cover the initialization that
takes place every time the system is bootstrapped: the initialization
of each network interface using ifconfig
(setting the IP address, the subnet mask, etc.), entering static
routes into the routing table, and the like. Instead, this appendix
focuses on the configuration options that affect how TCP/IP operates.
E.1 BSD/386 Version 1.0
This system is an example of the "classical" BSD configuration that has been used since 4.2BSD. Since the source code is distributed with the system, configuration options are specified by the administrator, and the kernel is recompiled. There are two types of options: constants that are defined in the kernel configuration file (see the config(8) manual page), and variable initializations in various C source files. Brave and knowledgeable administrators can also change the values of these C variables in either the running kernel or the kernel's disk image, using a debugger, to avoid rebuilding the kernel. Here are the constants that can be changed in the kernel's configuration file.
IPFORWARDING
The value of this constant initializes the kernel
variable ipforwarding. If 0 (default),
IP datagrams are not forwarded. If 1, forwarding is always enabled.
GATEWAY
If defined, causes IPFORWARDING
to be set to 1. Additionally, defining this constant causes certain
system tables (the ARP cache and the routing table) to be larger.
SUBNETSARELOCAL
The value of this constant initializes the kernel
variable subnetsarelocal. If 1 (default),
a destination IP address with the same network ID as the sending
host but a different subnet ID is considered local. If 0, only
destination IP addresses on an attached subnet are considered
local. This is summarized in Figure E.1.
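Network ID | Subnet ID | subnetsarelocal = 1 | subnetsarelocal = 0 | Comment |
same | same | local | local | always local |
same | different | local | nonlocal | depends on configuration |
different | - | nonlocal | nonlocal | always nonlocal |
Figure E.1 Local and nonlocal destinations, depending on subnetsarelocal.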
This affects the MSS selected by TCP. When sending to local destinations, TCP chooses the MSS based on the MTU of the outgoing interface. When sending to nonlocal destinations, TCP uses the variable tcp_mssdflt as the MSS.
IPSENDREDIRECTS
The value of this constant initializes the kernel
variable ipsendredirects. If 1 (default),
the host will send ICMP redirects when forwarding IP datagrams.
If 0, ICMP redirects are not sent.
DIRECTED_BROADCAST
If 1 (default), received datagrams whose destination
address is the directed broadcast address of an attached interface
are forwarded as a link-layer broadcast. If 0, these datagrams
are silently discarded.
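The local/nonlocal rule of Figure E.1 and the MSS choice that follows from it can be sketched in a few lines. This is an illustrative model, not kernel code: the function names, the classful derivation of the network ID, and the "MTU minus 40 bytes" calculation (a 20-byte IP header plus a 20-byte TCP header, no options) are assumptions made for the example.

```python
import ipaddress

TCP_MSSDFLT = 512      # tcp_mssdflt: MSS used for nonlocal destinations
HEADER_BYTES = 40      # assumption: 20-byte IP header + 20-byte TCP header


def classful_prefix(addr):
    """Length in bits of the network ID implied by the address class."""
    first_octet = int(addr) >> 24
    if first_octet < 128:
        return 8       # class A
    if first_octet < 192:
        return 16      # class B
    return 24          # class C (classes D and E are ignored in this sketch)


def is_local(dst, iface, subnet_prefix, subnetsarelocal=1):
    """Classify dst as local (True) or nonlocal (False) per Figure E.1."""
    dst_addr = ipaddress.IPv4Address(dst)
    if_addr = ipaddress.IPv4Address(iface)
    network = ipaddress.ip_network(f"{iface}/{classful_prefix(if_addr)}", strict=False)
    subnet = ipaddress.ip_network(f"{iface}/{subnet_prefix}", strict=False)
    if dst_addr in subnet:
        return True                   # same network ID, same subnet ID
    if dst_addr in network:
        return subnetsarelocal == 1   # same network ID, different subnet ID
    return False                      # different network ID


def choose_mss(dst, iface, subnet_prefix, mtu, subnetsarelocal=1):
    """MSS from the interface MTU for local peers, tcp_mssdflt otherwise."""
    if is_local(dst, iface, subnet_prefix, subnetsarelocal):
        return mtu - HEADER_BYTES
    return TCP_MSSDFLT
```

For a hypothetical interface 140.252.13.33 with a 27-bit subnet mask and an MTU of 1500, a destination on another subnet of 140.252 gets an MSS of 1460 when subnetsarelocal is 1 and 512 when it is 0.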
The following variables can also be modified. These variables are spread throughout different files in the /usr/src/sys/netinet directory.
tcprexmtthresh | The number of consecutive duplicate ACKs that triggers the fast retransmit and fast recovery algorithms. The default value is 3. |
tcp_ttl | The default value for the TTL field for TCP segments. Default value is 60. |
tcp_mssdflt | The default TCP MSS for nonlocal destinations. Default value is 512. |
tcp_keepidle | Number of 500-ms clock ticks before sending a keepalive probe. Default value is 14400 (2 hours). |
tcp_keepintvl | Number of 500-ms clock ticks between successive keepalive probes, when no response is received. Default value is 150 (75 seconds). |
tcp_sendspace | The default size of the TCP send buffer. Default value is 4096. |
tcp_recvspace | The default size of the TCP receive buffer. This affects the window size that is offered. Default value is 4096. |
udpcksum | If nonzero, UDP checksums are calculated for outgoing UDP datagrams, and incoming UDP datagrams containing nonzero checksums have their checksum verified. If 0, outgoing UDP datagrams do not contain a checksum, and no checksum verification is performed on incoming UDP datagrams, even if the sender calculated a checksum. Default is 1. |
udp_ttl | The default value for the TTL field in UDP datagrams. Default value is 30. |
udp_sendspace | The default size of the UDP send buffer. Defines the maximum UDP datagram that can be sent. Default is 9216. |
udp_recvspace | The default size of the UDP receive buffer. The default is 41600, allowing for 40 1024-byte datagrams. |
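Because the keepalive timers above are expressed in 500-ms clock ticks rather than seconds, it is easy to misread them. The following sketch converts between the two; the helper names are hypothetical. It also checks the udp_recvspace arithmetic under the assumption that each of the 40 queued datagrams carries 16 bytes of overhead for the sender's socket address, which is why the default is 41600 rather than 40960.

```python
TICK_MS = 500  # length of one clock tick in milliseconds, as used by the timers above


def ticks_to_seconds(ticks):
    """Convert a count of 500-ms ticks to seconds."""
    return ticks * TICK_MS / 1000


def seconds_to_ticks(seconds):
    """Convert seconds to the nearest whole number of 500-ms ticks."""
    return round(seconds * 1000 / TICK_MS)


# Assumption: 16 bytes of per-datagram overhead for the sender's address.
UDP_RECVSPACE = 40 * (1024 + 16)

print(ticks_to_seconds(14400) / 3600)   # tcp_keepidle in hours
print(ticks_to_seconds(150))            # tcp_keepintvl in seconds
print(UDP_RECVSPACE)
```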
E.2 SunOS 4.1.3
The method used with SunOS 4.1.3 is similar to what we saw with BSD/386. Since most of the kernel sources are not distributed, all the C variable initializations are contained in a single C source file that is provided.
The administrator's kernel configuration file (see the config(8) manual page) can define the following variables. After modifying your configuration file, a new kernel must be made and rebooted.
IPFORWARDING
The value of this constant initializes the kernel
variable ip_forwarding. If -1, IP
datagrams are never forwarded. Furthermore, the variable is never
changed. If 0 (default), IP datagrams are not forwarded, but the
variable's value is changed to 1 if multiple interfaces are up.
If 1, forwarding is always enabled.
SUBNETSARELOCAL
The value of the kernel variable ip_subnetsarelocal
is initialized from this constant. If 1 (default), a destination
IP address with the same network ID as the sending host but a
different subnet ID is considered local. If 0, only destination
IP addresses on an attached subnet are considered local. This
is summarized in Figure E.1. When sending to local destinations,
TCP chooses the MSS based on the MTU of the outgoing interface.
When sending to nonlocal destinations, TCP uses the variable tcp_default_mss.
IPSENDREDIRECTS
The value of this constant initializes the kernel
variable ip_sendredirects. If 1 (default),
the host will send ICMP redirects when forwarding IP datagrams.
If 0, ICMP redirects are not sent.
DIRECTED_BROADCAST
The value of this constant initializes the kernel
variable ip_dirbroadcast. If 1 (default),
received datagrams whose destination address is the directed
broadcast address of an attached interface are forwarded as a
link-layer broadcast. If 0, these datagrams are silently discarded.
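The three-valued IPFORWARDING constant described above is easy to misread, so here is a small model of it. The function names are hypothetical, and "multiple interfaces" is taken to mean more than one interface that is up; whether the loopback interface counts is not stated in the text and is not modeled.

```python
def effective_ip_forwarding(ipforwarding, interfaces_up):
    """Value ip_forwarding holds once the given number of interfaces is up."""
    if ipforwarding == -1:
        return -1              # never forward, and the variable never changes
    if ipforwarding == 0 and interfaces_up > 1:
        return 1               # switched on automatically
    return ipforwarding        # 0 with one interface, or 1 (always forward)


def forwards(ipforwarding, interfaces_up):
    """True if IP datagrams would be forwarded under this model."""
    return effective_ip_forwarding(ipforwarding, interfaces_up) == 1
```

Under this model a host configured with -1 never forwards, however many interfaces it has, while the default of 0 turns into a router as soon as a second interface comes up.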
The file /usr/kvm/sys/netinet/in_proto.c defines the following variables that can be changed. Once these variables are changed, a new kernel must be made and rebooted.
tcp_default_mss | The default TCP MSS for nonlocal destinations. Default value is 512. |
tcp_sendspace | The default size of the TCP send buffer. Default value is 4096. |
tcp_recvspace | The default size of the TCP receive buffer. This affects the window size that is offered. Default value is 4096. |
tcp_keeplen | A keepalive probe to a 4.2BSD host must contain a single byte of data to get a response. Set the variable to 1 for compatibility with these older implementations. Default value is 1. |
tcp_ttl | The default value for the TTL field for TCP segments. Default value is 60. |
tcp_nodelack | If nonzero, ACKs are not delayed. Default value is 0. |
tcp_keepidle | Number of 500-ms clock ticks before sending a keepalive probe. Default value is 14400 (2 hours). |
tcp_keepintvl | Number of 500-ms clock ticks between successive keepalive probes, when no response is received. Default value is 150 (75 seconds). |
udp_cksum | If nonzero, UDP checksums are calculated for outgoing UDP datagrams, and incoming UDP datagrams containing nonzero checksums have their checksum verified. If 0, outgoing UDP datagrams do not contain a checksum, and no checksum verification is performed on incoming UDP datagrams, even if the sender calculated a checksum. Default is 0. |
udp_ttl | The default value for the TTL field in UDP datagrams. Default value is 60. |
udp_sendspace | The default size of the UDP send buffer. Defines the maximum UDP datagram that can be sent. Default is 9000. |
udp_recvspace | The default size of the UDP receive buffer. The default is 18000, allowing for two 9000-byte datagrams. |
E.3 SVR4
The TCP/IP configuration of SVR4 is similar to the previous two systems, but fewer options are available. In the file /etc/conf/pack.d/ip/space.c two constants can be defined, and the kernel must then be rebuilt and rebooted.
IPFORWARDING
The value of this constant initializes the kernel
variable ipforwarding. If 0 (default),
IP datagrams are not forwarded. If 1, forwarding is always enabled.
IPSENDREDIRECTS
The value of this constant initializes the kernel
variable ipsendredirects. If 1 (default),
the host will send ICMP redirects when forwarding IP datagrams.
If 0, ICMP redirects are not sent.
Many of the variables that we've described in the
previous two sections are defined in the kernel, but one must
patch the kernel to modify them. For example, there is a variable
named tcp_keepidle with a value of
14400.
E.4 Solaris 2.2
Solaris 2.2 is typical of the newer Unix systems that provide a program for the administrator to run to change the configuration options of the TCP/IP system. This allows reconfiguration without having to modify source files and rebuild a kernel.
The configuration program is ndd(1). We can run the program to see what parameters we can examine or modify in the UDP module:
solaris % ndd /dev/udp \?
udp_wroff_extra               (read and write)
udp_def_ttl                   (read and write)
udp_first_anon_port           (read and write)
udp_trust_optlen              (read and write)
udp_do_checksum               (read and write)
udp_status                    (read only)
There are five modules we can specify: /dev/ip, /dev/icmp, /dev/arp, /dev/udp, and /dev/tcp. The question mark argument (which we have to prevent the shell from interpreting by preceding it with a backslash) tells the program to list all the parameters for that module. An example that queries the value of a variable is:
solaris % ndd /dev/tcp tcp_mss_def
536
To change the value of a variable we need superuser privilege and type:
solaris # ndd -set /dev/ip ip_forwarding 0
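When many parameters need to be audited, the listing printed for the question mark argument can be processed by a script. The sketch below parses text in the form shown for /dev/udp into a dictionary that records whether each parameter is writable. The exact column layout of the listing and the function name are assumptions; the sample text is simply the output reproduced above.

```python
import re

SAMPLE = """\
udp_wroff_extra               (read and write)
udp_def_ttl                   (read and write)
udp_first_anon_port           (read and write)
udp_trust_optlen              (read and write)
udp_do_checksum               (read and write)
udp_status                    (read only)
"""

# A parameter name, whitespace, then the access mode in parentheses.
LINE_RE = re.compile(r"^(\S+)\s+\((read and write|read only)\)$")


def parse_ndd_listing(text):
    """Map each parameter name to True if it is writable, False if read only."""
    params = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:                       # silently skip lines in another format
            params[match.group(1)] = match.group(2) == "read and write"
    return params


read_only = [name for name, writable in parse_ndd_listing(SAMPLE).items() if not writable]
print(read_only)
```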
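These variables can be divided into three categories:
1. Configuration variables that the system administrator can change.
2. Status variables that can only be displayed.
3. Debug variables.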
We now describe the parameters in each module. All parameters are read-write, unless marked "(Read only)." The read-only parameters are the status variables from case 2 above. We also mark the "(Debug)" variables from case 3. Unless otherwise noted, all the timing variables are specified in milliseconds, which differs from the other systems that normally specify times as some number of 500-ms clock ticks.
ip_cksum_choice
(Debug) Selects between two independent implementations
of the IP checksum algorithm.
ip_debug
(Debug) Enables printing of debug output by
the kernel, if greater than 0. Larger values generate more output.
Default is 0.
ip_def_ttl
Default TTL for outgoing IP datagrams, if not
specified by transport layer. Default is 255.
ip_forward_directed_broadcasts
If 1 (default), received datagrams whose destination
address is the directed broadcast address of an attached interface
are forwarded as a link-layer broadcast. If 0, these datagrams
are silently discarded.
ip_forward_src_routed
If 1 (default), received datagrams containing
a source route option are forwarded. If 0, these datagrams are
discarded.
ip_forwarding
Specifies whether the system forwards incoming
IP datagrams: 0 means never forward, 1 means always forward, and
2 (default) means only forward when two or more interfaces are
up.
ip_icmp_return_data_bytes
The number of bytes of data beyond the IP header
that are returned in an ICMP error. Default is 64.
ip_ignore_delete_time
(Debug) Minimum lifetime of an IP routing table
entry (IRE). Default is 30 seconds. (This parameter is in seconds,
not milliseconds.)
ip_ill_status
(Read only) Displays the status of each IP lower
layer data structure. There is one lower layer structure for each
interface.
ip_ipif_status
(Read only) Displays the status of each IP interface
data structure (IP address, subnet mask, etc.). There is one of
these structures for each interface.
ip_ire_cleanup_interval
(Debug) The interval at which the IP routing
table entries are scanned for possible deletions. Default is 30000
ms (30 seconds).
ip_ire_flush_interval
The interval at which ARP information is unconditionally
flushed from the IP routing table. Default is 1200000 ms (20 minutes).
ip_ire_pathmtu_interval
The interval at which the path MTU discovery
algorithm tries to increase the MTU. Default is 30000 ms (30 seconds).
ip_ire_redirect_interval
The interval at which IP routing table entries
that are from ICMP redirects are deleted. Default is 60000 ms
(60 seconds).
ip_ire_status
(Read only) Displays all the IP routing table
entries.
ip_local_cksum
If 0 (default), IP does not calculate the IP
checksum or the higher layer protocol checksum (i.e., TCP, UDP,
ICMP, or IGMP) for datagrams sent or received through the loopback
interface. If 1, these checksums are calculated.
ip_mrtdebug
(Debug) Enables printing of debug output concerning
multicast routing by the kernel, if 1. Default is 0.
ip_path_mtu_discovery
If 1 (default), path MTU discovery is performed
by IP. If 0, IP never sets the "don't fragment" bit
in outgoing datagrams.
ip_respond_to_address_mask
If 0 (default), the host does not respond to
ICMP address mask requests. If 1, it does respond.
ip_respond_to_echo_broadcast
If 1 (default), the host responds to ICMP echo
requests that are sent to a broadcast address. If 0, it does not
respond.
ip_respond_to_timestamp
If 0 (default), the host does not respond to
ICMP timestamp requests. If 1, the host responds.
ip_respond_to_timestamp_broadcast
If 0 (default), the host does not respond to
ICMP timestamp requests that are sent to a broadcast address.
If 1, it does respond.
ip_rput_pullups
(Debug) Count of number of buffers from the
network interface driver that needed to be pulled up to access
the full IP header. Initialized to 0 at bootstrap time, and can
be reset to 0.
ip_send_redirects
If 1 (default), the host sends ICMP redirects
when acting as a router. If 0, these are not sent.
ip_send_source_quench
If 1 (default), the host generates ICMP source
quench errors when incoming datagrams are discarded. If 0, these
are not generated.
ip_wroff_extra
(Debug) Number of bytes of extra space to allocate
in buffers for IP headers. Default is 32.
icmp_bsd_compat
(Debug) If 1 (default), the length field in
the IP header of received datagrams is adjusted to exclude the
length of the IP header. This is compatible with Berkeley-derived
implementations and is for applications reading raw IP or raw
ICMP packets. If 0, the length field is not changed.
icmp_def_ttl
The default TTL for outgoing ICMP messages.
Default is 255.
icmp_wroff_extra
(Debug) Number of bytes of extra space to allocate
in buffers for IP options and data-link headers. Default is 32.
arp_cache_report
(Read only) The ARP cache.
arp_cleanup_interval
The interval after which ARP entries are discarded
from ARP's cache. Default is 300000 ms (5 minutes). (IP maintains
its own cache of completed ARP translations; see ip_ire_flush_interval.)
arp_debug
(Debug) If 1, enables printing of debug output
by the ARP driver. Default is 0.
udp_def_ttl
The default TTL for outgoing UDP datagrams.
Default value is 255.
udp_do_checksum
If 1 (default), UDP checksums are calculated
for outgoing UDP datagrams. If 0, outgoing UDP datagrams do not
contain a checksum. (Unlike most other implementations, this UDP
checksum flag does not affect incoming datagrams. If a received
datagram has a nonzero checksum, it is always verified.)
udp_largest_anon_port
Largest port number to allocate for UDP ephemeral
ports. Default is 65535.
udp_smallest_anon_port
Starting port number to allocate for UDP ephemeral
ports. Default is 32768.
udp_smallest_nonpriv_port
A process requires superuser privilege to assign
itself a port number less than this. Default is 1024.
udp_status
(Read only) The status of all local UDP end
points: local IP address and port, foreign IP address and port.
udp_trust_optlen
(Debug) No longer used.
udp_wroff_extra
(Debug) Number of bytes of extra space to allocate
in buffers for IP options and data-link headers. Default is 32.
tcp_close_wait_interval
The 2MSL value: the time spent in the TIME_WAIT
state. Default is 240000 ms (4 minutes).
tcp_conn_grace_period
(Debug) Additional time added to the timer interval
when sending a SYN. Default is 500 ms.
tcp_conn_req_max
The maximum number of pending connection requests
queued for any listening end point. Default is 5.
tcp_cwnd_max
The maximum value of the congestion window.
Default is 32768.
tcp_debug
(Debug) If 1, enables printing of debug output
by TCP. Default is 0.
tcp_deferred_ack_interval
The time to wait before sending a delayed ACK.
Default is 50 ms.
tcp_dupack_fast_retransmit
The number of consecutive duplicate ACKs that
triggers the fast retransmit, fast recovery algorithm. Default
is 3.
tcp_eager_listeners
(Debug) If 1 (default), TCP completes the three-way
handshake before returning a new connection to an application
with a pending passive open. This is the way most TCP implementations
operate. If 0, TCP passes an incoming connection request (received
SYN) to the application, and does not complete the three-way handshake
until the application accepts the connection. (Setting this to
0 might break many existing applications.)
tcp_ignore_path_mtu
(Debug) If 1, path MTU discovery ignores received
ICMP fragmentation needed messages. If 0 (default), path MTU discovery
is enabled for TCP.
tcp_ip_abort_cinterval
The total retransmit timeout value when TCP
is performing an active open. Default is 240000 ms (4 minutes).
tcp_ip_abort_interval
The total retransmit timeout value for a TCP
connection after it is established. Default is 120000 ms (2 minutes).
tcp_ip_notify_cinterval
The timeout value when TCP is performing an
active open after which TCP notifies IP to find a new route. Default
is 10000 ms (10 seconds).
tcp_ip_notify_interval
The timeout value for an established connection
after which TCP notifies IP to find a new route. Default is 10000
ms (10 seconds).
tcp_ip_ttl
The TTL to use for outgoing TCP segments. Default
is 255.
tcp_keepalive_interval
The time that a connection must be idle before
a keepalive probe is sent. Default is 7200000 ms (2 hours).
tcp_largest_anon_port
Largest port number to allocate for TCP ephemeral
ports. Default is 65535.
tcp_maxpsz_multiplier
(Debug) Specifies the multiple of the MSS into
which the stream head packetizes the application's write data.
Default is 1.
tcp_mss_def
Default MSS for nonlocal destinations. Default
is 536.
tcp_mss_max
The maximum MSS. Default is 65495.
tcp_mss_min
The minimum MSS. Default is 1.
tcp_naglim_def
(Debug) Maximum value of the per-connection
Nagle algorithm threshold. Default is 65535. The per-connection
value starts out as the minimum of the MSS or this value. The
per-connection value is set to 1 by the TCP_NODELAY socket option,
which disables the Nagle algorithm.
tcp_old_urp_interpretation
(Debug) If 1 (default), the older (but more
common) BSD interpretation of the urgent pointer is used: it points
1 byte beyond the last byte of urgent data. If 0, the Host Requirements
RFC interpretation is used; it points to the last byte of urgent
data.
tcp_rcv_push_wait
(Debug) Maximum number of bytes received without
the PUSH flag set before the data is passed to the application.
Default is 16384.
tcp_rexmit_interval_initial
(Debug) Initial retransmit timeout interval.
Default is 500 ms.
tcp_rexmit_interval_max
(Debug) Maximum retransmit timeout interval.
Default is 60000 ms (60 seconds).
tcp_rexmit_interval_min
(Debug) Minimum retransmit timeout interval.
Default is 200 ms.
tcp_rwin_credit_pct
(Debug) Percentage of receive window that must
be buffered before flow control is checked on every received segment.
Default is 50%.
tcp_smallest_anon_port
Starting port number to allocate for TCP ephemeral
ports. Default is 32768.
tcp_smallest_nonpriv_port
A process requires superuser privilege to assign
itself a port number less than this. Default is 1024.
tcp_snd_lowat_fraction
(Debug) If nonzero, the send buffer low-water
mark is the send buffer size divided by this value. Default is
0 (disabled).
tcp_status
(Read only) Information on all TCP connections.
tcp_sth_rcv_hiwat
(Debug) If nonzero, the value to set the stream
head high-water mark to. Default is 0.
tcp_sth_rcv_lowat
(Debug) If nonzero, the value to set the stream
head low-water mark to. Default is 0.
tcp_wroff_xtra
(Debug) Number of bytes of extra space to allocate
in buffers for IP options and data-link headers. Default is 32.
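The description of tcp_naglim_def above explains how the per-connection Nagle threshold is derived, and a short model makes the effect of TCP_NODELAY easier to see. The function names are hypothetical. The send test in may_send_now is the textbook form of the Nagle rule (send if the pending data reaches the threshold or nothing is outstanding) and is an assumption about how the threshold is applied, not a description of the Solaris code.

```python
TCP_NAGLIM_DEF = 65535   # default upper bound on the per-connection threshold


def nagle_threshold(mss, nodelay=False, naglim_def=TCP_NAGLIM_DEF):
    """Per-connection threshold: min(MSS, tcp_naglim_def), or 1 with TCP_NODELAY."""
    if nodelay:
        return 1
    return min(mss, naglim_def)


def may_send_now(pending_bytes, unacked_bytes, threshold):
    """Assumed Nagle test: send at once if enough data is pending or nothing is unacked."""
    return pending_bytes >= threshold or unacked_bytes == 0
```

With a threshold of 1, any nonempty write passes the test immediately, which is how setting the per-connection value to 1 disables the Nagle algorithm in this model.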
E.5 AIX 3.2.2
AIX 3.2.2 allows network options to be set at runtime using the no command. It can display the value of an option, set the value of an option, or set an option value back to its default. For example, to display an option we type:
aix % no -o udp_ttl
udp_ttl = 30
The following options can be modified.
arpt_killc
The time (in minutes) before an inactive completed
ARP entry is removed. Default is 20.
ipforwarding
If 1 (default), IP datagrams are always forwarded.
If 0, forwarding is disabled.
ipfragttl
The time to live (in seconds) for IP fragments
awaiting reassembly. Default is 60.
ipsendredirects
If 1 (default), the host will send ICMP redirects
when forwarding IP datagrams. If 0, ICMP redirects are not sent.
loop_check_sum
If 1 (default), the IP checksum is calculated
for datagrams sent through the loop-back interface. If 0, this
checksum is not calculated.
nonlocsrcroute
If 1 (default), received datagrams containing
a source route option are forwarded. If 0, these datagrams are
discarded.
subnetsarelocal
If 1 (default), a destination IP address with
the same network ID as the sending host but a different subnet
ID is considered local. If 0, only destination IP addresses on
an attached subnet are considered local. This is summarized in
Figure E.1. When sending to local destinations, TCP chooses the
MSS based on the MTU of the outgoing interface. When sending to
nonlocal destinations, TCP uses the default (536) as the MSS.
tcp_keepidle
Number of 500-ms clock ticks before sending
a keepalive probe. Default value is 14400 (2 hours).
tcp_keepintvl
Number of 500-ms clock ticks between successive
keepalive probes, when no response is received. Default value
is 150 (75 seconds).
tcp_recvspace
The default size of the TCP receive buffer.
This affects the window size that is offered. Default value is
16384.
tcp_sendspace
The default size of the TCP send buffer. Default
value is 16384.
tcp_ttl
The default value for the TTL field for TCP
segments. Default value is 60.
udp_recvspace
The default size of the UDP receive buffer.
The default is 41600, allowing for 40 1024-byte datagrams.
udp_sendspace
The default size of the UDP send buffer. Defines
the maximum UDP datagram that can be sent. Default is 9216.
udp_ttl
The default value for the TTL field in UDP datagrams.
Default value is 30.
E.6 4.4BSD
4.4BSD is the first of the Berkeley releases to provide dynamic configuration for numerous kernel parameters. The sysctl(8) command is used. The names for the parameters were chosen to look like MIB names from SNMP. To examine a parameter we type:
vangogh % sysctl net.inet.ip.forwarding
net.inet.ip.forwarding = 1
To change a parameter we need superuser privilege and then type:
vangogh # sysctl -w net.inet.ip.ttl=128
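Both sysctl here and the AIX no command print settings as a name, an equals sign, and a value, so one small parser can handle either when settings are collected from several hosts. The sketch below is a minimal example; the function names are hypothetical, and it assumes integer-valued parameters like the ones listed in this appendix.

```python
def parse_setting(line):
    """Split a 'name = value' line into (name, integer value)."""
    name, sep, value = line.partition("=")
    if not sep:
        raise ValueError(f"not a 'name = value' line: {line!r}")
    return name.strip(), int(value.strip())


def mib_components(name):
    """Split a MIB-style sysctl name into its dot-separated components."""
    return name.split(".")


name, value = parse_setting("net.inet.ip.forwarding = 1")
print(mib_components(name), value)
```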
The following parameters can be changed.
net.inet.ip.forwarding
If 0 (default), IP datagrams are not forwarded.
If 1, forwarding is enabled.
net.inet.ip.redirect
If 1 (default), the host will send ICMP redirects
when forwarding IP datagrams. If 0, ICMP redirects are not sent.
net.inet.ip.ttl
The default TTL for both TCP and UDP. The default
is 64.
net.inet.icmp.maskrepl
If 0 (default), the host does not respond to
ICMP address mask requests. If 1, it does respond.
net.inet.udp.checksum
If 1 (default), UDP checksums are calculated
for outgoing UDP datagrams, and incoming UDP datagrams containing
nonzero checksums have their checksum verified. If 0, outgoing
UDP datagrams do not contain a checksum, and no checksum verification
is performed on incoming UDP datagrams, even if the sender calculated
a checksum.
Additionally, numerous variables that we've described earlier in this appendix are scattered among various source files (tcp_keepidle, subnetsarelocal, etc.) and can be modified.
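The UDP checksum flag behaves differently across the systems in this appendix: on the BSD-derived systems it controls both sending and receiving, whereas the Solaris udp_do_checksum flag affects only outgoing datagrams. The following sketch models the receive side of that difference. The function and parameter names are hypothetical; a checksum field of 0 means the sender did not calculate a checksum.

```python
def udp_receive_action(rx_checksum, checksum_flag, flag_covers_input=True):
    """Return 'verify' or 'accept' for a received UDP datagram.

    flag_covers_input=True models the BSD-style flag (udpcksum, udp_cksum,
    net.inet.udp.checksum); False models Solaris udp_do_checksum.
    """
    if rx_checksum == 0:
        return "accept"        # sender did not calculate a checksum
    if flag_covers_input and checksum_flag == 0:
        return "accept"        # verification is skipped along with generation
    return "verify"            # nonzero checksum is checked
```

Note that with the BSD-style flag set to 0, a corrupted datagram is accepted even though the sender supplied a checksum, which is the behavior the descriptions above warn about.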