Nov 00 Getting Started
Volume Number: 16 (2000)
Issue Number: 11
Column Tag: Getting Started
Networks 201 pt. 5
by John C. Welch
Layer 3: The Network Layer
From the first article in our series, we recall that Layer 3, the Network Layer, is responsible for handling connections that extend beyond the next device in line. In other words, Layer 3 is the routing layer. This is the layer that handles packet transmission over subnets, and between different types of networks. Layer 3 is not required in all circumstances; if you are using a network that does not route, or does not need routing information, Layer 3 may be very thin, or nonexistent. This is also the lowest layer of the OSI model that communicates in an end-to-end fashion, which means that as far as Layer 3 is concerned, there are no Layers 1 or 2, only other machines running Layer 3 protocols. This is what we will talk about this month, so into the fray!
As we noted above, the Network Layer deals with a bigger scope than the Data Link Layer. Where the Data Link Layer is concerned with getting frames from wire end A to wire end B, the Network Layer is concerned with getting the packet from the source to the destination, regardless of how many wires, routers, or other points lie between the two. Like all the other layers, the Network Layer provides services to the layer above it, in this case, the Transport Layer. This interface between the two layers is often the boundary of the network subnet, or the boundary between the customer (the Transport Layer and up) and the carrier (the Network Layer and down). To do this, the Network Layer services were designed with three primary goals:
- Network Layer services need to be independent of the subnet technology. That is, the services provided by the layer should not care whether the subnet runs TCP/IP, AppleTalk, or any other protocol.
- It needs to shield the Transport Layer from the number, type, and topology of the subnets present. The Transport Layer does not need to know any of this, as this is what the Network Layer does. All the Transport Layer needs to do is hand off information and data to the Network Layer, and let the Network Layer do its job. This is in keeping with the general idea of each layer having a specific purpose within the OSI model.
- The network addresses used by the Transport Layer should be part of a uniform numbering plan, regardless of the scope of the network. In other words, the transport layer shouldn't have to deal with how the network is addressed, or the scope of those addresses. Just that the addresses are there, and apply across the network.
To accomplish these goals, there are two points of view, and both work well within their areas. The first is that of the Internet community, which says that the only thing the subnet, and by extension Layer 3, should be doing is pushing and getting bits. This argument holds that the subnet is inherently unreliable, and that any error control and flow control must be handled by the endpoints, or hosts. The Network Layer here should be connectionless, and use only the smallest set of primitives (SEND PACKET, RECEIVE PACKET, and not much else). The reason the layer should do no flow or error control is that the hosts are going to do that anyway, and besides, who knows with any reliability where the packets really go between points A and B. To support the multiple paths packets may take, each packet needs to carry the full addresses of the source and destination.
In the other corner is the point of view of the telecommunications industry. This says that the subnet should be reliable, and should be connection-oriented. There should be some error and flow control in the subnet, and all data transfers should have certain basic properties along the following lines:
- Before sending or receiving data, a connection is set up between the source and the destination. This connection creates a path between the two, and is a temporarily static path that encompasses any midpoint devices. This connection has a unique identifier that helps route packets.
- Once the connection is set up, then the two ends negotiate parameters, quality, cost, etc.
- All communications are bi-directional, and packets are delivered in sequence.
- Flow control is provided automatically to keep from overloading one or both ends.
- Once it is no longer needed, the connection is torn down, and all used buffers are flushed.
The real difference between the connectionless and connection-oriented arguments is where the complexity of the layer is handled. In a connectionless protocol, the end points deal with all the complexities of the network. This is because computing power is cheap, and it is easier to upgrade end nodes than major intermediary devices. Also, some functions, such as real-time applications, are far more concerned with speed of delivery than with accuracy of delivery. The connection-oriented folks argue that the subnet should provide reliable, trouble-free service, so that the end nodes don't have to run complex Transport Layer protocols. In addition, there is a point to be made that real-time data does just as well over a reliable connection as over a connectionless service, and that it is easier to provide certain real-time services atop a reliable connection-oriented protocol.
In the end, both are used, depending on the application's needs. File transfers want a reliable connection, to avoid data corruption, whereas live video feeds prefer to drop a frame or two, while still keeping the stream running, without the overhead of resending multiple packets.
Connection-oriented services
These work primarily by creating virtual circuits that act as temporary paths between two end nodes. The idea here is to avoid having to create, or even look for, a new route for every packet that is transferred. Instead, when a connection is established, a route between the two end nodes is created and stored, to be used for all traffic for the duration of that connection. Once the connection is taken down, the virtual circuit is also terminated. This has the effect of requiring a lot more of the intermediary devices on the subnet. Routers must maintain an entry for every virtual circuit passing through them, and must check every packet for the virtual circuit number, so they can determine where the packet goes next.
When a new connection is created, the first unused virtual circuit (VC) number is used. It is important to note that these numbers are of local significance, not global. This avoids having to synchronize every connection with every other connection to avoid VC number conflicts. Another issue with VC numbers arises when a connection is initiated by both ends at once. This can lead to two adjacent routers creating a duplex circuit with conflicting (identical) VC numbers, at which point the routers have no way to tell which way a packet is moving. One of the ways this is avoided is to use simplex connections.
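To make the idea of locally significant VC numbers concrete, here is a minimal sketch of a per-router circuit table. All the names (`VCRouter`, the line labels) are hypothetical, not from any real router implementation; the point is only that each router maps an (incoming line, incoming VC) pair to an (outgoing line, outgoing VC) pair, and picks its own first unused number on the outgoing line.

```python
# A toy per-router virtual-circuit table. Illustrative only.

class VCRouter:
    def __init__(self, name):
        self.name = name
        # Maps (incoming line, incoming VC number) to
        # (outgoing line, outgoing VC number). VC numbers are only
        # locally significant, so this router picks its own.
        self.table = {}
        self._next_vc = {}  # first unused VC number, per outgoing line

    def setup(self, in_line, in_vc, out_line):
        """Install a circuit: take the first unused VC number on out_line."""
        out_vc = self._next_vc.get(out_line, 0)
        self._next_vc[out_line] = out_vc + 1
        self.table[(in_line, in_vc)] = (out_line, out_vc)
        return out_vc

    def forward(self, in_line, in_vc):
        """Look up where a packet tagged with in_vc should go next."""
        return self.table[(in_line, in_vc)]

r = VCRouter("A")
r.setup("west", 0, "east")   # first circuit gets VC 0 on the east line
r.setup("south", 0, "east")  # same incoming VC number on another line: no clash
print(r.forward("west", 0))   # ('east', 0)
print(r.forward("south", 0))  # ('east', 1)
```

Note that two circuits arriving with the same VC number on different lines coexist happily, because the lookup key includes the incoming line; this is exactly why the numbers need only local, not global, agreement.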
The advantages to VCs are that the addressing is much simpler, relying on VC numbers rather than full-blown addresses. Routing is also simpler, because once the connection is established, that is the route all packets will take for the duration of the connection. VCs also help with bandwidth needs, because part of the connection process is quality negotiation, so if need be, bandwidth can be reserved by the connection before the first packet is moved.
However, if the data needs of the connection are small, the overhead of setting up the VC is often not worth the effort. Also, if one of the routers on the VC goes down, then the connection is broken, and has to be re-established. In fact, all the connections being serviced by that router are dropped, and have to be re-established.
Connectionless services

These are also known as datagram networks, as that is the name used for the packets in this type of network. Each datagram contains the complete addresses of its sender and recipient. There is no connection establishment, nor is a route established for the data. Indeed, each datagram can take a different path than the datagrams in front of or behind it.
This has the advantage of being a more robust method of data delivery when the subnet quality is unknown or unreliable. Since each datagram is independently routed, no single device failure can destroy the entire delivery. The downside is that since every datagram is independently routed, the routing becomes much more complicated than for a VC. This also makes congestion and flow control difficult.
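The contrast with the VC approach can be sketched in a few lines: each datagram carries its full source and destination addresses, and the router makes a fresh next-hop decision per packet from whatever its table says at that moment. The table contents and host names here are made up for illustration.

```python
# Toy per-datagram forwarding: no connection state, one independent
# lookup per packet. Names and entries are hypothetical.

routing_table = {          # destination -> next hop, for one router
    "hostB": "router2",
    "hostC": "router3",
}

def forward_datagram(packet):
    """Each datagram is looked up on its own; no circuit is remembered."""
    return routing_table[packet["dst"]]

p1 = {"src": "hostA", "dst": "hostB", "data": "hello"}
p2 = {"src": "hostA", "dst": "hostB", "data": "world"}
# Two consecutive datagrams to the same destination could take different
# paths if the table changed between the two lookups.
print(forward_datagram(p1))  # router2
print(forward_datagram(p2))  # router2
```

Because every packet pays for a full address and a full lookup, the per-packet cost is higher than a VC number check, which is exactly the trade-off the article describes.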
We said earlier that one of the primary functions of the Network Layer is that of routing, or getting packets from source to destination, regardless of network types and the number of nodes in between. The methods and algorithms involved in routing are numerous and complex, so we will deal with the simplest, so as to give you an idea of how they work, without going into too much detail. (There are books written on routing algorithms, so if you would like to get into more detail, a visit to the computer section of a well-stocked bookstore can get you all the detail you would wish for, and then some.)
The routing algorithm is what decides how a packet will travel from a given router. If datagrams are used, this decision is made for every packet. If VCs are used, then this decision is only made during the connection establishment, and the packets follow this route. This type of VC routing is also called session routing, as the route is used for the entire session. No matter which type of routing is used, there are certain goals for any routing algorithm: correctness, simplicity, robustness, stability, fairness, and optimality.
The first two items are fairly obvious. The algorithm must be correct, otherwise the packets will never be delivered correctly. It must also be as simple as possible, so that it can be fast enough to handle the loads placed upon it. The third property, robustness, is not as obvious, but consider that some routers are in place for years at a time. The algorithm used by a router must be able to handle failures by the other devices it directly deals with, changes in topology, protocol, numbering scheme, etc. It must be able to do this without requiring human intervention or attention as well. Stability is also somewhat obvious. The algorithm must not cause problems due to the way it functions, otherwise it is not useful.
The final two are harder to reconcile with each other. Fairness dictates that no one part of the subnet be used to the point of saturation, yet choosing a route based solely on the optimal route may indeed cause this to happen. Even optimization can result in conflict, as minimizing packet delay does not always maximize network throughput. To help with this, and to deal with fairness, most algorithms concentrate on minimizing the number of hops a packet must make. This helps minimize delay while maximizing utilization.
While there are many algorithms, they all fall into two basic camps, static and adaptive algorithms. Static algorithms are decided outside of the router, and either downloaded to the router when it is booted, or manually entered on the router. If you have ever manually entered routes on products such as IPNetRouter, or SoftRouter, that is a type of static routing. Adaptive algorithms change routes based on information received from adjacent devices that inform them of the opening of a new route, or the closing of an existing one. These maintain their own routing tables, and do not require manual intervention to update themselves.
Of the static algorithms, flooding is probably the simplest. In a flood routing setup, an incoming packet is sent out on every single line the router has, except for the one it came in on. Obviously, this has the potential to bring down a network under a potentially infinite number of duplicate packets, so there are some techniques to avoid this, such as inserting a hop counter in the header of each packet, decrementing it each time it passes through a router, and discarding the packet once the hop count reaches zero. Another technique is to give each flooded packet a sequence number. The source router then provides the subsequent routers with a packet list, so they know which packets have already been flooded, and do not re-flood them. Another variation is selective flooding, where packets are only flooded in the appropriate direction (i.e., a westbound packet is not flooded back east). Although flooding may seem to be of little use, for the military, or other organizations that need to be able to bypass dead or destroyed routers, flooding is a quick, simple method to do just that. As well, flooding always chooses the shortest path, because it chooses every path. Consequently, if the flooding overhead is ignored, flooding actually produces the smallest delay of any algorithm.
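The hop-counter technique above can be sketched in a few lines. This is a deliberately simplified model (the line names and packet fields are invented): a router receiving a packet decrements its hop count, discards it if the count is exhausted, and otherwise repeats it on every line except the arrival line.

```python
# Toy flooding step with a hop counter. Illustrative only.

def flood(packet, arrived_on, lines):
    """Return the (line, packet) pairs to transmit, or [] if the
    packet's hop count is used up and it should be discarded."""
    if packet["hops"] <= 0:
        return []                               # hop count exhausted: drop
    out = dict(packet, hops=packet["hops"] - 1)  # decrement on each pass
    # Repeat on every line except the one the packet arrived on.
    return [(line, out) for line in lines if line != arrived_on]

pkt = {"dst": "hostZ", "hops": 2}
sends = flood(pkt, "north", ["north", "south", "east", "west"])
print([line for line, _ in sends])  # ['south', 'east', 'west']
print(sends[0][1]["hops"])          # 1
```

Set the initial hop count to the (known or estimated) diameter of the subnet and the flood is guaranteed to die out, which is what keeps the packet population finite.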
Another static algorithm is shortest path routing. Simply put, with this algorithm, the subnet is represented as a graph, with each point on the graph representing a router or end node, and each segment a communications line between points. The algorithm then determines the shortest path, and sends the packet on its way. There are a number of ways to determine exactly what is meant by 'shortest'. The most common is to find the path with the fewest hops. However, this can break down, especially when a two-hop path covers a hundred miles, and a four-hop path only fifty. To avoid this, shortest path routers actually combine hop count, geographic distance, queuing and transmission delays, etc. to find the true shortest path. Each factor is given a weight, and those weights are used to find the shortest path.
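A common way to compute such a weighted shortest path is Dijkstra's algorithm, sketched below. The graph and its weights are invented to mirror the example above: the two-hop route totals 100 units of metric, while the four-hop route totals only 50, so the weighted search prefers the four-hop route that a naive hop count would reject.

```python
import heapq

# Weighted shortest-path search (Dijkstra's algorithm) over a made-up
# subnet graph. Edge weights stand in for the combined hop/distance/
# delay metric described in the text.

def shortest_path(graph, src, dst):
    """graph: {node: {neighbor: weight}}. Returns (total cost, path)."""
    queue = [(0, src, [src])]          # (cost so far, node, path taken)
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nbr, w in graph[node].items():
            if nbr not in seen:
                heapq.heappush(queue, (cost + w, nbr, path + [nbr]))
    return float("inf"), []            # destination unreachable

graph = {
    "A": {"B": 50, "C": 15},
    "B": {"A": 50, "D": 50},           # two hops A-B-D: total weight 100
    "C": {"A": 15, "E": 15},
    "E": {"C": 15, "F": 10},
    "F": {"E": 10, "D": 10},
    "D": {"B": 50, "F": 10},           # four hops A-C-E-F-D: total 50
}
print(shortest_path(graph, "A", "D"))  # (50, ['A', 'C', 'E', 'F', 'D'])
```

With all weights set to 1, the same function degenerates into plain minimum-hop routing, which is one way to see that hop count is just a special case of the weighted metric.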
The disadvantage of static routing is, of course, that it's static. It cannot take advantage of improved conditions, or handle worse ones; it can only route the way it knows. So many of today's routers use dynamic algorithms, which can adapt to current conditions on the network without human intervention. Since these are much more complex than static routing, we will only look at one of them, distance vector routing.
Distance vector routing algorithms function by having each router keep a table, or vector, with the best known distance to each destination, along with the associated lines. The routers update these vectors by exchanging information with their neighbors. This type of routing is one of the oldest, being not only the original ARPANET routing algorithm, but also the basis of RIP, and of routing in DECnet, IPX, AppleTalk, and Cisco routers.
The vector tables maintain certain parameters about each route. The entry for each destination has the line to be used to reach it, and the estimate of the distance to it. This metric can be a measure of the hops to the destination, time delays, queue lengths, etc. The router is also assumed to know the distance to each of its neighbors. If the metric is hops, the distance to a neighbor is simply one hop. If queue length is used, the router examines each queue. If delay is used, the router measures it directly.
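One round of the exchange can be sketched as follows. The router names, link costs, and advertised vectors below are invented for illustration: the router adds its own measured cost to each neighbor onto that neighbor's advertised cost to every destination, and keeps the cheapest total per destination.

```python
# One distance-vector update round for a single router. Illustrative only.

def dv_update(my_name, link_cost, neighbor_vectors):
    """
    link_cost: {neighbor: measured cost to reach that neighbor}
    neighbor_vectors: {neighbor: {destination: neighbor's advertised cost}}
    Returns the new table: {destination: (total cost, next hop)}.
    """
    table = {my_name: (0, my_name)}    # reaching ourselves costs nothing
    for nbr, vector in neighbor_vectors.items():
        for dest, cost in vector.items():
            total = link_cost[nbr] + cost
            # Keep whichever neighbor offers the cheapest total route.
            if dest not in table or total < table[dest][0]:
                table[dest] = (total, nbr)
    return table

# Router J measures delays of 8 to neighbor A and 10 to neighbor I.
link_cost = {"A": 8, "I": 10}
vectors = {
    "A": {"A": 0, "K": 40},            # A says: K is 40 away from me
    "I": {"A": 20, "K": 14},           # I says: K is only 14 away from me
}
table = dv_update("J", link_cost, vectors)
print(table["K"])  # (24, 'I')  -- K is cheapest via I: 10 + 14
print(table["A"])  # (8, 'A')   -- A is cheapest via the direct link
```

Each router runs this same merge whenever a neighbor's vector arrives, which is how good news ripples outward one exchange at a time.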
Although distance vector routing works well on paper, real world implementations can have problems, particularly where updates are concerned. Although distance vector routing reacts quickly to improvements in the subnet, it can take much longer to react to bad news. Especially if time delays are used and a node or router goes down (giving it a time delay of infinity), propagating that news throughout the subnet can take an extremely long time, hence the name for the problem: 'count to infinity'.
There are a lot of uses for the Network Layer, most of which I have avoided, as they tend to get into specific protocol or network types, and I wanted to stay away from any one protocol. But if there is any sort of routing going on, regardless of protocol or network type, it is most likely being done at the Network Layer. I hope that you now have an idea of the differences between connection-oriented and connectionless services, and also a basic understanding of routing and routing algorithms. Again, I avoided getting into the math of the algorithms, as that could easily take up an entire magazine, and is of more use to those folks writing router software. If, as a network manager, you understand what a router is trying to do, and why, you will find that troubleshooting and designing networks becomes noticeably easier, and the reasons why networks need to be set up in a given fashion will probably make a lot more sense to you. Our next article will deal with the Transport Layer, which is at the heart not only of the OSI model, but of most other protocols as well. As always, I encourage you to delve into these things on your own, using not just my bibliography sources, but any other books you may find on the subject.
Bibliography and References
- Tanenbaum, Andrew S. Computer Networks, Third Edition. Prentice Hall, 1996.
John Welch <firstname.lastname@example.org> is the Mac and PC Administrator for AER Inc., a weather and atmospheric science company in Cambridge, Mass. He has over fifteen years of experience at making computers work. His specialties are figuring out ways to make the Mac do what nobody thinks it can, and showing that the Mac is the superior administrative platform.