Addressing, Routing, and Multiplexing
To deliver data between two Internet hosts, it is necessary to move data across the network to the correct host, and within that host to the correct user or process.
TCP/IP uses three schemes to accomplish these tasks:
Addressing : IP addresses deliver data to the correct host.
Routing : Gateways deliver data to the correct network.
Multiplexing : Protocol and port numbers deliver data to the correct software module within the host.
Each of these functions is necessary to send data between two co-operating applications across the Internet.
IP Host Address:
The Internet Protocol identifies hosts with a 32-bit number called an IP address or a host address. To avoid confusion with MAC addresses, which are machine or station addresses, the term IP address will be used to designate this kind of address. IP addresses are written as four dot-separated decimal numbers, each between 0 and 255.
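The notation above can be sketched in code. The following Python is a minimal illustration, not part of any TCP/IP implementation: the function names and the sample address 172.16.12.2 are assumptions chosen for the example. It shows that a dotted-decimal address is simply a readable spelling of one 32-bit number.

```python
def dotted_to_int(address: str) -> int:
    """Pack four dot-separated decimal octets into one 32-bit integer."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError(f"not a valid dotted-decimal address: {address!r}")
    value = 0
    for octet in octets:
        value = (value << 8) | octet  # shift in one byte at a time
    return value


def int_to_dotted(value: int) -> str:
    """Unpack a 32-bit integer into dotted-decimal notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


number = dotted_to_int("172.16.12.2")
print(number, int_to_dotted(number))
```

Converting to the integer form and back returns the original string, which is a quick check that no bits are lost in either direction.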
IP addresses must be unique among all connected machines. Connected machines are any hosts that you can reach over a network or connected set of networks, including your local area network, remote offices joined by the company's wide-area network, or even the entire Internet community.
The Internet Protocol moves data between hosts in the form of datagrams. Each datagram is delivered to the address contained in the destination address field of the datagram's header. The destination address is a standard 32-bit IP address that contains sufficient information to uniquely identify a network and a specific host on that network.
If your network is connected to the Internet, you have to get a range of IP addresses assigned to your machines through a central network administration authority. The uniqueness requirement for IP addresses differs from that for MAC addresses. IP addresses are unique only on connected networks, but MAC addresses are unique in the world, independent of any connectivity. Part of the reason for the difference is that IP addresses are 32 bits long, while MAC addresses are 48 bits long, so mapping every possible MAC address into an IP address requires some overlap. Of course, not every machine on an Ethernet is running IP protocols, so the many-to-one mapping isn't as bad as the numbers might indicate. There are a variety of reasons why the IP address is only 32 bits while the MAC address is 48 bits, most of which are historical.
Since the network and data link layers use different addressing schemes, some system is needed to convert or map IP addresses to MAC addresses. Transport-layer services and user processes use IP addresses to identify hosts, but packets that go out on the network need MAC addresses. The Address Resolution Protocol (ARP) is used to convert the 32-bit IP address of a host into its 48-bit MAC address. When a host wants to map an IP address to a MAC address, it broadcasts an ARP request on the network, asking the host that uses the IP address to respond. The host that sees its own IP address in the request returns its MAC address to the sender. With a MAC address, the sending host can transmit a packet on the Ethernet and know that the receiving host will recognise it.
IP Address Classes:
An IP address contains a network part and a host part, but the format of these parts is not the same in every IP address.
Figure 87 shows the IP address classes.
Not all network addresses or host addresses are available for use. The class A network addresses 0 and 127 are reserved for special use. Network 0 designates the default route, which is used to simplify the routing information that IP must handle, and network 127 is the loopback address, which simplifies network applications by allowing the local host to be addressed in the same manner as a remote host. These special network addresses are used when configuring a host.
There are also some host addresses reserved for special use. In all network classes, the host numbers with all bits set to zero and all bits set to one are reserved (0 and 255 in a class C network). An IP address with all host bits set to zero identifies the network itself. Addresses in this form are used in routing table listings to refer to entire networks. An IP address with all host bits set to one is a broadcast address, which is used to address every host on a network simultaneously. A datagram sent to this address is delivered to every individual host on that network.
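The reserved addresses described above can be computed mechanically. The sketch below assumes the standard classful layout (class A addresses begin with a 0 bit and have 8 network bits, class B with bits 10 and 16 network bits, class C with bits 110 and 24 network bits), which the text refers to through Figure 87 but does not spell out. The function names and the sample address are illustrative.

```python
import ipaddress

# Length of the network part, in bits, for the classes that carry host addresses.
NETWORK_BITS = {"A": 8, "B": 16, "C": 24}


def address_class(address: str) -> str:
    """Classify an address by the high-order bits of its first octet."""
    first_octet = int(ipaddress.IPv4Address(address)) >> 24
    if first_octet < 128:   # 0xxxxxxx
        return "A"
    if first_octet < 192:   # 10xxxxxx
        return "B"
    if first_octet < 224:   # 110xxxxx
        return "C"
    if first_octet < 240:   # 1110xxxx, multicast
        return "D"
    return "E"              # 1111xxxx, reserved


def network_and_broadcast(address: str) -> tuple:
    """Return the all-zeros (network) and all-ones (broadcast) host addresses."""
    cls = address_class(address)
    if cls not in NETWORK_BITS:
        raise ValueError(f"class {cls} addresses have no host part")
    value = int(ipaddress.IPv4Address(address))
    host_mask = (1 << (32 - NETWORK_BITS[cls])) - 1  # ones in every host bit
    network = value & ~host_mask                      # host bits cleared
    broadcast = network | host_mask                   # host bits all set
    return str(ipaddress.IPv4Address(network)), str(ipaddress.IPv4Address(broadcast))


print(address_class("172.16.12.1"), network_and_broadcast("172.16.12.1"))
```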
IP uses the network portion of the address to route the datagram between networks. The full address, including the host information, is used to make final delivery when the datagram reaches the destination network.
Figure 88 shows host communication on a local network.
The standard structure of an IP address can be locally modified by using host address bits as additional network address bits. Essentially, the dividing line between network address bits and host bits is moved, creating additional networks but reducing the maximum number of hosts that can belong to each network. These newly designated network bits define a network within the larger network, called a subnet. Subnetting allows decentralised management of host addressing. With the standard addressing scheme, a single administrator is responsible for managing host addresses for the entire network. By subnetting, the administrator can delegate address assignment to smaller organisations within the overall organisation.
Subnetting can also be used to overcome hardware differences and distance limitations. IP routers can link dissimilar physical networks together, but only if each physical network has its own unique network address. Subnetting divides a single network address into many unique subnet addresses, so that each physical network can have its own unique address.
Figure 89 shows IP addresses with and without subnetting.
A subnet is defined by applying a bitmask, the subnet mask, to the IP address. If a bit is on in the mask, the equivalent bit in the address is interpreted as a network bit. If the bit in the mask is off, the bit belongs to the host part of the address. The subnet is only known locally. To the rest of the Internet, the address is still interpreted as a standard IP address.
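Applying a subnet mask is a bitwise AND. The following sketch, with a hypothetical function name and sample values, splits one address under two different masks to show how moving the dividing line changes the network and host parts.

```python
import ipaddress


def split_with_mask(address: str, mask: str) -> tuple:
    """Return (network part as dotted decimal, host part as an integer)."""
    addr_bits = int(ipaddress.IPv4Address(address))
    mask_bits = int(ipaddress.IPv4Address(mask))
    network = addr_bits & mask_bits             # bits where the mask is on
    host = addr_bits & ~mask_bits & 0xFFFFFFFF  # bits where the mask is off
    return str(ipaddress.IPv4Address(network)), host


# Same address, interpreted without and with subnetting.
print(split_with_mask("172.16.12.1", "255.255.0.0"))
print(split_with_mask("172.16.12.1", "255.255.255.0"))
```

With the longer mask, the third octet moves from the host part into the network part, so 172.16.12.0 becomes a subnet of the larger 172.16.0.0 network.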
Figure 90 shows host communication with subnetting.
As networks grow in size, so does the traffic imposed on the wire, which in turn degrades overall network performance, including response times. To alleviate such degradation, network specialists resort to breaking the network into multiple networks that are interconnected by specialised devices, including routers, bridges, and switches.
The routing approach calls for the implementation of various co-operative processes, in both routers and workstations, whose main concern is the intelligent delivery of data to its ultimate destination. Data exchange can take place between any two workstations, whether or not both belong to the same network.
Figure 91 shows a view of routing.
Figure 91 emphasises that the underlying physical networks that a datagram travels through may be different and even incompatible. Host A1 on the Token Ring network routes the datagram through gateway G1 to reach host B1 on the Ethernet. Gateway G1 forwards the data through the X.25 network to gateway G2, for delivery to B1. The datagram traverses three physically different networks, but eventually arrives intact at B1.
A good place to start when discussing routers is with a thorough discussion of addresses, including MAC addresses, network addresses, and complete addresses.
The Routing Table:
To perform its function reliably, the routing process is equipped with the capability to maintain a road map depicting the entire internetwork of which it is part. This road map is commonly referred to as the routing table, and it includes routing information depicting where every known network is and how it can be reached. The routing process builds and maintains the routing table by employing a route discovery process such as the Routing Information Protocol (RIP).
Routers should be capable of selecting the shortest path connecting two networks. Routers discover the road map of the internetwork by dynamically exchanging routing information among themselves or by being statically configured by network installers, or both. The dynamic exchange of routing information is handled by yet another process besides the routing process itself. In the case of TCP/IP, IP handles the routing process, whereas RIP handles the route discovery process.
Internet Routing Architecture:
When a hierarchical structure is used, routing information about all of the networks in the internet is passed into the core gateways (a central delivery medium to carry long distance traffic). The core gateways process this information and then exchange it among themselves using the Gateway-to-Gateway Protocol (GGP). The processed routing information is then passed back out to the external gateways.
Figure 92 shows the Internet Routing Architecture.
Outside of the Internet core are groups of independent networks called Autonomous Systems (AS). An autonomous system is a collection of networks and gateways with its own internal mechanism for collecting routing information and passing it to other network systems.
The Routing Table:
Gateways route data between networks, but all network devices, hosts as well as gateways, must make routing decisions.
For most hosts, the routing decisions are simple:
If the destination is on the local network, the data is delivered to the destination host.
If the destination is on a remote network, the data is forwarded to a local gateway.
Because routing is network oriented, IP makes routing decisions based on the network portion of the address. The IP module determines the network part of the destination's IP address by checking the high-order bits of the address to determine the address class. The address class determines the portion of the address that IP uses to identify the network. If the destination network is the local network, the local subnet mask is applied to the destination address.
After determining the destination network, the IP module looks up the network in the local routing table. Packets are routed toward their destination as directed by the routing table. The routing table may be built by the system administrator or by routing protocols, but the end result is the same: IP routing decisions are simple table look-ups.
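The table look-up described above can be sketched as follows. This is a simplified illustration under stated assumptions: the routing table contents are hypothetical, the look-up order (host route, then network route, then default route) is an assumption about a typical implementation, and subnet masks on the local network are ignored.

```python
import ipaddress


def classful_network(address: str) -> str:
    """Derive the network part from the address class (high-order bits)."""
    value = int(ipaddress.IPv4Address(address))
    first_octet = value >> 24
    if first_octet < 128:
        mask = 0xFF000000   # class A
    elif first_octet < 192:
        mask = 0xFFFF0000   # class B
    elif first_octet < 224:
        mask = 0xFFFFFF00   # class C
    else:
        raise ValueError("class D and E addresses have no network part")
    return str(ipaddress.IPv4Address(value & mask))


# Hypothetical table: destination -> gateway (None means directly connected).
ROUTING_TABLE = {
    "172.16.0.0": None,
    "192.168.5.0": "172.16.12.3",
    "default": "172.16.12.1",
}


def next_hop(destination: str, table: dict) -> str:
    """Return the address the datagram should be handed to next."""
    for key in (destination, classful_network(destination), "default"):
        if key in table:
            gateway = table[key]
            # A directly connected network means final delivery to the host itself.
            return destination if gateway is None else gateway
    raise LookupError(f"no route to {destination}")


for target in ("172.16.12.9", "192.168.5.20", "10.1.1.1"):
    print(target, "->", next_hop(target, ROUTING_TABLE))
```

Note that the function only ever returns the next hop, never a complete path, which matches the way a real routing table stores one step toward each destination.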
Figure 93 shows a flowchart depiction of the IP routing algorithm.
You can display the routing table's contents with the netstat -r command.
The netstat command displays a routing table containing the following fields:
Destination : The destination network or host.
Gateway : The gateway to use to reach the specified destination.
Flags : The flags describe certain characteristics of this route.
U: Indicates that the route is up and operational.
H: Indicates this is a route to a specific host.
G: Means the route uses a gateway.
D: Means that this route was added because of an ICMP redirect.
Refcnt : Shows the number of times the route has been referenced to establish a connection.
Use : Shows the number of packets transmitted via this route.
Interface : The name of the network interface used by this route.
All of the gateways that appear in a routing table are on networks directly connected to the local system. A routing table does not contain end-to-end routes. A route only points to the next gateway, called the next hop, along the path to the destination network. The host relies on the local gateway to deliver the data, and the gateway relies on other gateways. As a datagram moves from one gateway to another, it should eventually reach one that is directly connected to its destination network. It is this last gateway that finally delivers the data to the destination host.
The IP address and the routing table direct a datagram to a specific physical network, but when the data travels across a network, it must obey the physical layer protocol used by that network. The physical networks that underlie the TCP/IP network do not understand IP addressing. Physical networks have their own addressing schemes, and there are as many different addressing schemes as there are different types of physical networks. One task of the network access protocols is to map IP addresses to physical network addresses.
Figure 94 shows the operation of ARP.
The most common example of this network access layer function is the translation of IP addresses to Ethernet addresses. The protocol that performs this function is Address Resolution Protocol (ARP).
Figure 95 shows the layout of an ARP request or ARP reply.
In figure 95, when an ARP request is sent, all fields in the layout are used except the Recipient Hardware Address (which the request is trying to identify). In an ARP reply, all the fields are used. The fields in the ARP request and reply can have several values.
The ARP software maintains a table of translations between IP addresses and Ethernet addresses. This table is built dynamically. When ARP receives a request to translate an IP address, it checks for the address in its table. If the address is found, it returns the Ethernet address from its table. If the address is not found in the table, ARP broadcasts a packet to every host on the Ethernet. The packet contains the IP address for which an Ethernet address is sought. If a receiving host identifies the IP address as its own, it responds by sending its Ethernet address back to the requesting host. The response is then cached in the ARP table.
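The look-up, broadcast, and cache steps above can be sketched as a small simulation. Nothing here touches a real network: the ArpTable class, the simulated_broadcast function, and the addresses are all hypothetical stand-ins for the behaviour described in the text.

```python
class ArpTable:
    """Cache of IP-to-Ethernet translations, filled on demand."""

    def __init__(self, broadcast):
        self._entries = {}
        self._broadcast = broadcast  # callable: IP address -> MAC address or None

    def resolve(self, ip_address: str) -> str:
        if ip_address in self._entries:            # already in the table
            return self._entries[ip_address]
        mac_address = self._broadcast(ip_address)  # ask every host on the segment
        if mac_address is None:
            raise LookupError(f"no host answered for {ip_address}")
        self._entries[ip_address] = mac_address    # cache the reply
        return mac_address


# Hypothetical hosts on one Ethernet segment.
HOSTS_ON_SEGMENT = {
    "172.16.12.2": "08:00:20:0b:4a:71",
    "172.16.12.3": "08:00:5a:09:c4:49",
}
broadcasts_sent = []


def simulated_broadcast(ip_address):
    """Stand-in for an ARP request: only the owner of the address replies."""
    broadcasts_sent.append(ip_address)
    return HOSTS_ON_SEGMENT.get(ip_address)


arp_table = ArpTable(simulated_broadcast)
first = arp_table.resolve("172.16.12.2")
second = arp_table.resolve("172.16.12.2")  # answered from the cache
print(first, second, len(broadcasts_sent))
```

The second call is answered from the table, so only one broadcast is recorded for the two look-ups.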
The arp -a command displays all the contents of the ARP table.
Figure 96 shows Routing Domains.
The Reverse Address Resolution Protocol (RARP) is a variant of the Address Resolution Protocol. RARP also translates addresses, but in the opposite direction: it converts Ethernet addresses to IP addresses. The RARP protocol really has nothing to do with routing data from one system to another. RARP helps configure diskless systems by allowing diskless workstations to learn their IP addresses. The diskless workstation uses the Ethernet broadcast facility to ask which IP address maps to its Ethernet address. When a server on the network sees the request, it looks up the Ethernet address in its table. If it finds a match, the server replies with the workstation's IP address.
Figure 97 shows the interrelationship between IP and Ethernet MAC address as reflected in the Ethernet data frame.
In figure 97, the shaded fields correspond to the source and destination addresses of Host A (the sender) and Host B (the receiver).
Protocols, Ports, and Sockets:
Once data is routed through the network and delivered to a specific host, it must be delivered to the correct user or process. As the data moves up or down the layers of TCP/IP, a mechanism is needed to deliver data to the correct protocols in each layer. The system must be able to combine data from many applications into a few transport protocols, and from the transport protocols into the Internet Protocol. Combining many sources of data into a single data stream is called multiplexing. Data arriving from the network must be demultiplexed, divided for delivery to multiple processes. To accomplish this, IP uses protocol numbers to identify transport protocols, and the transport protocols use port numbers to identify applications.
Figure 98 shows Protocol and Port Numbers.
Figure 99 shows the protocol interdependency between Application level protocols and Transport level protocols.
The protocol number is a single byte in the header of the datagram. The value identifies the protocol in the layer above IP to which the data should be passed.
A host may have many TCP and UDP connections at any time. Connections to a host are distinguished by a port number, which serves as a sort of mailbox number for incoming datagrams. There may be many processes using TCP and UDP on a single machine, and the port numbers distinguish these processes for incoming packets. When a user program opens a TCP or UDP socket, it gets connected to a port on the local host. The application may specify the port, usually when trying to reach some service with a well-defined port number, or it may allow the operating system to fill in the port number with the next available free port number.
After IP passes incoming data to the transport protocol, the transport protocol passes data to the correct application process. Application processes are identified by port numbers, which are 16-bit values. The source port number, which identifies the process that sent the data, and the destination port number, which identifies the process that is to receive the data, are contained in the header of each TCP segment and UDP packet.
Port numbers are not unique between transport layer protocols; the numbers are only unique within a specific transport protocol. It is the combination of protocol and port numbers that uniquely identifies the specific process the data should be delivered to.
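As a small illustration of where the two port numbers sit, the sketch below builds a sample header and reads the ports back. It relies on the fact that both TCP and UDP headers begin with the 16-bit source port followed by the 16-bit destination port, in network byte order. The function name and the sample values (ports 3044 and 23, with a UDP-style length and checksum) are chosen for illustration only.

```python
import struct


def ports_from_header(header: bytes) -> tuple:
    """Read the source and destination ports from the first four header bytes."""
    source_port, destination_port = struct.unpack("!HH", header[:4])
    return source_port, destination_port


# Illustrative UDP-style header: source port, destination port, length, checksum.
sample_header = struct.pack("!HHHH", 3044, 23, 8, 0)
print(ports_from_header(sample_header))
```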
Figure 100 shows data packets multiplexed via TCP or UDP through port addresses and onto the targeted TCP/IP applications.
In figure 100, if a data packet arrives specifying a transport protocol of 6, it is forwarded to the TCP implementation. If the packet specifies 17 as the required protocol, the IP layer would forward the packet to the programs implementing UDP.
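The two-stage delivery described above, first by protocol number and then by port number, can be sketched as a pair of table look-ups. The protocol numbers 6 (TCP) and 17 (UDP) come from the text; the deliver function and the listener table are hypothetical.

```python
# Protocol numbers carried in the IP header, as given in the text.
TRANSPORT_BY_PROTOCOL_NUMBER = {6: "TCP", 17: "UDP"}

# Hypothetical listeners, keyed by (transport, port). The same port number
# can appear under both transports and still name different processes.
LISTENERS = {
    ("TCP", 23): "telnet server",
    ("TCP", 53): "name server (TCP)",
    ("UDP", 53): "name server (UDP)",
}


def deliver(protocol_number: int, destination_port: int, listeners: dict) -> str:
    """Pick the transport from the protocol number, then the process from the port."""
    transport = TRANSPORT_BY_PROTOCOL_NUMBER.get(protocol_number)
    if transport is None:
        raise LookupError(f"no transport registered for protocol {protocol_number}")
    try:
        return listeners[(transport, destination_port)]
    except KeyError:
        raise LookupError(f"nothing listening on {transport} port {destination_port}") from None


print(deliver(6, 23, LISTENERS))
print(deliver(17, 53, LISTENERS))
```

Keying the table on the pair rather than on the port alone reflects the point that port numbers are unique only within one transport protocol.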
Figure 101 shows the exchange of port numbers during the TCP handshake.
In figure 101, the source host randomly generates a source port, in this example 3044. It sends out a segment with a source port of 3044 and a destination port of 23. The destination host receives the segment and responds using 23 as its source port and 3044 as its destination port.
Well-known ports are standardised port numbers that enable remote computers to know which port to connect to for a particular network service. This simplifies the connection process because both the sender and the receiver know in advance that data bound for a specific process will use a specific port.
There is a second type of port number called a dynamically allocated port. As the name implies, these ports are not pre-assigned. They are assigned to processes when needed. The system ensures that it does not assign the same port number to two processes, and that the numbers assigned are above the range of standard port numbers. They provide the flexibility needed to support multiple users.
The combination of an IP address and a port number is called a socket. A socket uniquely identifies a single network process within the entire internet. One pair of sockets, one socket for the receiving host and one for the sending host, define the connection for connection-oriented protocols such as TCP.
Names and Addresses:
Every network interface attached to a TCP/IP network is defined by a unique 32-bit IP address. A name, called a host name, can be assigned to any device that has an IP address. Names are assigned to devices because, compared to numeric Internet addresses, names are easier to remember and type correctly. The network software doesn't require names, but they do make it easier for humans to use the network. In most cases, host names and numeric addresses can be used interchangeably. Whether a command is entered with an address or a host name, the network connection always takes place based on the IP address. The system converts the host name to an address before the network connection is made. The network administrator is responsible for assigning names and addresses and storing them in the database used for the conversion. There are two methods for translating names into addresses. The older method simply looks up the host name in a table called the host table. The newer technique uses a distributed database system called Domain Name Service (DNS) to translate names to addresses.
The Host Table:
The host table is a simple text file that associates IP addresses with host names. Most systems have a small host table containing name and address information about the important hosts on the local network. This small table is used when DNS is not running, such as during the initial system start-up. Even if you use DNS, you should create a small host file containing entries for your host, for localhost, and for the gateways and servers on your local net. Sites that use NIS use the host table as input to the NIS host database. You can use NIS in conjunction with DNS, but even when they are used together, most NIS sites create host tables that have an entry for every host on the local network. Hosts connected to the Internet should use DNS.
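A host table look-up can be sketched in a few lines. The example assumes the common /etc/hosts layout, in which each line holds an IP address followed by one or more names and a # starts a comment. The sample entries and the function name are hypothetical.

```python
def parse_host_table(text: str) -> dict:
    """Build a name -> address mapping from host-table text."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table.setdefault(name, address)   # the first entry for a name wins
    return table


SAMPLE_HOST_TABLE = """\
# illustrative entries only
127.0.0.1     localhost
172.16.12.2   peanut.example.com peanut
172.16.12.1   gateway    # local gateway
"""

hosts = parse_host_table(SAMPLE_HOST_TABLE)
print(hosts["peanut"], hosts["gateway"])
```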
The Network Information Centre (NIC) Host Table:
The NIC maintains a large table of Internet hosts, which is stored on a NIC host. The NIC places host names and addresses into the file for all sites on the Internet. The NIC table contains three types of entries: network records, gateway records, and host records.
Figure 102 shows the format of the Host.txt records.
In figure 102, each record begins with a keyword (NET, HOST or GATEWAY) that identifies the record type, followed by an IP address and one or more names associated with the address. The IP addresses and host names from the HOST records are extracted to construct the /etc/hosts file. The network addresses and names from the NET records are used to create the /etc/networks file.
Domain Name Service (DNS):
DNS is a distributed database system that doesn't bog down as the database grows. It guarantees that new host information will be disseminated to the rest of the network as it is needed. If a DNS server receives a request for information about a host for which it has no information, it passes the request on to an authoritative server, which is any server responsible for maintaining accurate information about the domain being queried. When the authoritative server answers, the local server saves (caches) the answer for future use. The next time the local server receives a request for this information, it answers the request itself. The ability to control host information from an authoritative source and to automatically disseminate accurate information makes DNS superior to the host table, even for small networks not connected to the Internet.
Figure 103 shows resolution of a DNS query.
The Domain Hierarchy:
DNS is a distributed hierarchical system for resolving host names into IP addresses. Under DNS, there is no central database with all of the Internet host information. The information is distributed among thousands of name servers organised into a hierarchy. DNS has a root domain at the top of the domain hierarchy that is served by a group of name servers called the root servers. Information about a domain is found by tracing pointers from the root domain, through subordinate domains, to the target domain. Directly under the root domain are the top-level domains. There are two basic types of top-level domains, geographic and organisational.
Figure 104 shows Domain Hierarchy.
Creating Domains and Subdomains:
The Network Information Centre has the authority to allocate domains. To obtain a domain, you apply to the NIC for authority to create a domain under one of the top-level domains. Once the authority to create a domain is granted, you can create additional domains, called subdomains, under your domain.
Domain names reflect the domain hierarchy. Domain names are written from most specific, a host name, to least specific, a top-level domain, with each part of the domain name separated by a dot (<host name>.<subdomain>.<domain>).
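Because names are written from most specific to least specific, tracing a name from the root means reading its labels from right to left. The sketch below models the hierarchy as nested dictionaries in a single process. Real DNS spreads this data across many name servers, and the tree contents, names, and addresses here are hypothetical.

```python
# Toy domain tree: dictionaries are domains, strings are host addresses.
ROOT = {
    "com": {
        "example": {
            "peanut": "172.16.12.2",
            "sales": {"almond": "172.16.6.4"},
        },
    },
}


def resolve(name: str, root: dict) -> str:
    """Walk from the root domain down to the host named by a dotted name."""
    labels = name.rstrip(".").lower().split(".")
    node = root
    for label in reversed(labels):  # least specific label first
        if not isinstance(node, dict) or label not in node:
            raise LookupError(f"no such name: {name}")
        node = node[label]
    if isinstance(node, dict):
        raise LookupError(f"{name} names a domain, not a host")
    return node


print(resolve("almond.sales.example.com", ROOT))
```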
Figure 105 shows organisation of the DNS name space.
Network Information Service (NIS):
NIS is an administrative database system that provides central control and automatic dissemination of important administrative files. NIS can be used in conjunction with DNS, or as an alternative to it. NIS and DNS have some similarities and some differences. Like DNS, NIS overcomes the problem of accurately distributing the host table, but unlike DNS, it only provides service for local area networks. NIS is not intended as a service for the Internet as a whole. Another difference is that NIS provides access to a wider range of information than DNS. As its name implies, NIS provides much more than name-to-address conversion. It converts several standard UNIX files into databases that can be queried over the network. These databases are called NIS maps.
NIS provides a distributed database system for common configuration files. NIS servers manage copies of the database files, and NIS clients request information from the servers instead of using their own local copies of these files. Once NIS is running, simply updating the NIS server ensures that all machines will be able to retrieve the new configuration file information.
A major problem in running a distributed computing environment is maintaining separate copies of common configuration files such as the password, group, and hosts files. Ideally, the network should be consistent in its configuration, so that users don't have to worry about where they have accounts or whether they'll be able to find a new machine on the network. Preserving consistency, however, means that every change to one of these common files must be propagated to every host on the network. The Network Information Service (NIS) addresses these problems. It is a distributed database system that replaces copies of commonly replicated configuration files with a centralised management facility. Instead of having to manage each host's files, you maintain one database for each file on one central server. Machines that are using NIS retrieve information as needed from these databases. If you add a new system to the network, you can modify one file on a central server and propagate this change to the rest of the network, rather than changing the hosts file for each individual host on the network. Because NIS enforces consistent views of files on the network, it is suited for files that have no host-specific information in them. Files that are generally the same on all hosts in a network fit the NIS model of a distributed database nicely. NIS provides all hosts with information from its global database.
Master, Slaves, and Clients:
NIS is built on the client-server model. An NIS server is a host that contains NIS data files, called maps. Clients are hosts that request information from these maps. Servers are further divided into master and slave servers. The master server is the true single owner of the map data. Slave NIS servers handle client requests, but they do not modify the NIS maps. The master server is responsible for all map maintenance and distribution to its slave servers. Once an NIS map is rebuilt on the master to include a change, the new map file is distributed to all slave servers. NIS clients see these changes when they perform queries on the map file. It doesn't matter whether the clients are talking to a master or a slave server, because once the map data is distributed, all NIS servers have the same information.
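The division of labour between master and slave servers can be sketched with two small classes. This is a toy model of the behaviour described above, not of the actual NIS software: the class and method names are hypothetical, and the map name hosts.byname and its entry are used only as an example.

```python
class NisServer:
    """Any NIS server: answers client queries from its copy of the maps."""

    def __init__(self):
        self.maps = {}

    def match(self, map_name: str, key: str) -> str:
        return self.maps[map_name][key]


class NisMaster(NisServer):
    """The single owner of the map data: the only place maps are changed."""

    def __init__(self, slaves):
        super().__init__()
        self.slaves = list(slaves)

    def update(self, map_name: str, key: str, value: str) -> None:
        self.maps.setdefault(map_name, {})[key] = value
        for slave in self.slaves:
            # Distribute a complete copy of the rebuilt map to each slave.
            slave.maps[map_name] = dict(self.maps[map_name])


slave = NisServer()
master = NisMaster([slave])
master.update("hosts.byname", "peanut", "172.16.12.2")

# A client gets the same answer from either server.
print(master.match("hosts.byname", "peanut"), slave.match("hosts.byname", "peanut"))
```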
Figure 106 shows NIS masters, slaves, and clients.
With the distinction between NIS servers and clients firmly established, we can see that each system fits into the NIS scheme in one of three ways:
Client only: This is typical of desktop workstations, where the system administrator tries to minimise the amount of host-specific tailoring required to bring a system onto the network. As an NIS client, the host gets all of its common configuration information from an extant server.
Server only: While the host services client requests for map information, it does not use NIS for its own operation. A server-only configuration may be useful when a server must provide global host and password information for the NIS clients, but security concerns prohibit the server from using these same files. However, bypassing the central configuration scheme opens some of the same loopholes that NIS was intended to close. Although it is possible to configure a system to be an NIS server only, we don't recommend it.
Client and server: In most cases, an NIS server also functions as an NIS client so that its management is streamlined with that of other client-only hosts.
Most precisely, a domain is a set of NIS maps. A client can refer to a map from any of several different domains. Most of the time, however, any given host will only look up data from one set of NIS maps. Therefore, it's common to use the term domain to mean the group of systems that share a set of NIS maps. All systems that need to share common configuration information are put into an NIS domain. Although each system can potentially look up information in any NIS domain, each system is assigned to a default domain, meaning that the system, by default, looks up information from a particular set of NIS maps. It is up to the administrator to decide how many different domains are needed.
An interruption in NIS service affects all NIS clients if no other servers are available. Even if another server is available, clients will suffer periodic slowdowns as they recognise that the current server is down and hunt for a new one.
A second imperative for NIS servers is synchronisation. Clients may get their NIS information from any server, so all servers must have copies of every map file to ensure proper NIS operation. Furthermore, the data in each map file on the slave servers must agree with that on the master server, so that NIS clients cannot get out-of-date or stale data. NIS contains several mechanisms for making changes to map files and distributing these changes to all NIS servers on a regular basis.
Remote Procedure Call (RPC):
RPC provides a mechanism for one host to make a procedure call that appears to be part of the local process but is really executed on another machine on the network. Typically, the host on which the procedure call is executed has resources that are not available on the calling host. This distribution of computing services imposes a client/server relationship on the two hosts: the host owning the resource is a server for that resource, and the calling host becomes a client of the server when it needs access to the resource. The resource might be a centralised configuration file (NIS) or a shared filesystem (NFS).
Instead of executing the procedure on the local host, the RPC system bundles up the arguments passed to the procedure into a network datagram. The exact bundling method is determined by the presentation layer, described in the next section. The RPC client creates a session by locating the appropriate server and sending the datagram to a process on the server that can execute the RPC. On the server, the arguments are unpacked, and the server executes the procedure, packages the result (if any), and sends it back to the client. Back on the client side, the reply is converted into a return value for the procedure call, and the user application is re-entered as if a local procedure call had completed. RPC services may be built on either TCP or UDP transports, although most are UDP-oriented because they are centred on short-lived requests. Using UDP also forces the RPC call to contain enough context information for its execution independent of any other RPC request, since UDP packets may arrive in any order, if at all.
When an RPC call is made, the client may specify a time-out period in which the call must complete. If the server is overloaded or has crashed, or if the request is lost in transit to the server, the remote call may not be executed before the time-out period expires. The action taken when an RPC times out varies by application: some resend the RPC call, while others may look for another server.
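The two time-out strategies mentioned above, resending the call and looking for another server, can be combined in one small sketch. No real RPC library is used: the servers are plain Python callables standing in for remote hosts, a raised TimeoutError stands in for an expired time-out period, and all names are hypothetical.

```python
def call_remote(servers, procedure, args, attempts_per_server=2):
    """Try each server in turn, resending a fixed number of times before moving on."""
    for server in servers:
        for _ in range(attempts_per_server):
            try:
                return server(procedure, args)
            except TimeoutError:
                continue  # the time-out expired: resend the same request
    raise TimeoutError(f"no server completed {procedure!r} in time")


def unreachable_server(procedure, args):
    """Stand-in for a crashed or overloaded server: every call times out."""
    raise TimeoutError("no reply before the time-out period expired")


def adding_server(procedure, args):
    """Stand-in for a healthy server that implements one procedure."""
    if procedure != "add":
        raise ValueError(f"unknown procedure {procedure!r}")
    return sum(args)


print(call_remote([unreachable_server, adding_server], "add", (2, 3)))
```

Because each request carries the procedure name and all of its arguments, any attempt can be resent or redirected without reference to an earlier one, which is the independence property the text attributes to UDP-based RPC.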
Remote Procedure Call Execution:
Figure 107 shows Remote Procedure Call Execution.
External Data Representation (XDR):
XDR is built on the notion of an immutable network byte ordering, called the canonical form. It isn't really important what the canonical form is; your systems may or may not use the same byte ordering and structure packing conventions. This form simply allows network hosts to exchange structured data independently of any peculiarities of a particular machine. All data structures are converted into the network byte ordering and padded appropriately.
The rule of XDR is "sender makes local canonical, receiver makes canonical local". Any data that goes over the network is in canonical form. A host sending data on the network converts it to canonical form, and the host that receives the data converts it back into its local representation. A different way to implement the presentation layer might be "receiver makes local". In this case, the sender does nothing to the local data, and the receiver must deduce the packing and encoding technique and convert it into the local equivalent. While this scheme may send less data over the network, it places the burden of incorporating a new hardware architecture on the receiving side, rather than on the new machine.
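The "sender makes local canonical, receiver makes canonical local" rule can be illustrated with Python's struct module. The sketch assumes the conventions of the XDR standard, which the text does not spell out: integers are sent as four bytes in big-endian order, and a string is sent as a four-byte length followed by its bytes, padded with zeros to a multiple of four. The function names are hypothetical.

```python
import struct


def xdr_encode_int(value: int) -> bytes:
    """Sender side: local integer -> 4 bytes in canonical (big-endian) order."""
    return struct.pack(">i", value)


def xdr_decode_int(data: bytes) -> int:
    """Receiver side: canonical 4 bytes -> local integer."""
    return struct.unpack(">i", data[:4])[0]


def xdr_encode_string(text: str) -> bytes:
    """Length prefix, then the bytes, then zero padding to a 4-byte boundary."""
    raw = text.encode("ascii")
    padding = (-len(raw)) % 4
    return struct.pack(">I", len(raw)) + raw + b"\x00" * padding


def xdr_decode_string(data: bytes) -> str:
    (length,) = struct.unpack(">I", data[:4])
    return data[4:4 + length].decode("ascii")


wire = xdr_encode_int(1) + xdr_encode_string("abc")
print(wire.hex())
```

The encoded bytes are the same whatever the byte order of the sending machine, so the receiver only ever needs to understand one format.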