3.4 Routing
So far in this chapter we have assumed that the switches and routers have enough knowledge of the network topology so they can choose the right port onto which each packet should be output. In the case of virtual circuits, routing is an issue only for the connection request packet; all subsequent packets follow the same path as the request. In datagram networks, including IP networks, routing is an issue for every packet. In either case, a switch or router needs to be able to look at a destination address and then determine which of the output ports is the best choice to get a packet to that address. As we saw in an earlier section, the switch makes this decision by consulting a forwarding table. The fundamental problem of routing is how switches and routers acquire the information in their forwarding tables.
Key Takeaway
We restate an important distinction, which is often neglected, between forwarding and routing. Forwarding consists of receiving a packet, looking up its destination address in a table, and sending the packet in a direction determined by that table. We saw several examples of forwarding in the preceding section. It is a simple and well-defined process performed locally at each node, and is often referred to as the network's data plane. Routing is the process by which forwarding tables are built. It depends on complex distributed algorithms, and is often referred to as the network's control plane.
While the terms forwarding table and routing table are sometimes used interchangeably, we will make a distinction between them here. The forwarding table is used when a packet is being forwarded and so must contain enough information to accomplish the forwarding function. This means that a row in the forwarding table contains the mapping from a network prefix to an outgoing interface and some MAC information, such as the Ethernet address of the next hop. The routing table, on the other hand, is the table that is built up by the routing algorithms as a precursor to building the forwarding table. It generally contains mappings from network prefixes to next hops. It may also contain information about how this information was learned, so that the router will be able to decide when it should discard some information.
Whether the routing table and forwarding table are actually separate data structures is something of an implementation choice, but there are numerous reasons to keep them separate. For example, the forwarding table needs to be structured to optimize the process of looking up an address when forwarding a packet, while the routing table needs to be optimized for the purpose of calculating changes in topology. In many cases, the forwarding table may even be implemented in specialized hardware, whereas this is rarely if ever done for the routing table.
Table 14 gives an example of a row from a routing table, which tells us that network prefix 18/8 is to be reached by a next hop router with the IP address 171.69.245.10.
Prefix/Length | Next Hop |
---|---|
18/8 | 171.69.245.10 |
In contrast, Table 15 gives an example of a row from a forwarding table, which contains the information about exactly how to forward a packet to that next hop: Send it out interface number 0 with a MAC address of 8:0:2b:e4:b:1:2. Note that the last piece of information is provided by the Address Resolution Protocol.
Prefix/Length | Interface | MAC Address |
---|---|---|
18/8 | if0 | 8:0:2b:e4:b:1:2 |
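To make the difference between the two kinds of rows concrete, here is a minimal sketch in C. The type names `RoutingEntry` and `ForwardingEntry` and the helpers `ip4` and `prefix_matches` are our own assumptions for illustration; they are not taken from any particular router implementation.

```c
#include <stdint.h>

/* Hypothetical routing-table row: prefix -> next-hop IP address. */
typedef struct {
    uint32_t prefix;     /* network prefix, host byte order */
    uint8_t  length;     /* prefix length in bits (0..32) */
    uint32_t next_hop;   /* IP address of the next-hop router */
} RoutingEntry;

/* Hypothetical forwarding-table row: prefix -> interface + MAC address. */
typedef struct {
    uint32_t prefix;
    uint8_t  length;
    int      interface;  /* outgoing interface number, e.g. 0 for if0 */
    uint8_t  mac[6];     /* link-layer address of the next hop */
} ForwardingEntry;

/* Pack four octets into a 32-bit address. */
uint32_t ip4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | d;
}

/* Return 1 if addr falls inside prefix/length, 0 otherwise. */
int prefix_matches(uint32_t addr, uint32_t prefix, uint8_t length)
{
    uint32_t mask = (length == 0) ? 0 : 0xFFFFFFFFu << (32 - length);
    return (addr & mask) == (prefix & mask);
}
```

With these definitions, any address whose first octet is 18 matches the prefix 18/8 from the tables above. A real forwarding table would also have to pick the longest matching prefix, which this sketch does not attempt.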
Before getting into the details of routing, we need to remind ourselves of the key question we should be asking anytime we try to build a mechanism for the Internet: "Does this solution scale?" The answer for the algorithms and protocols described in this section is "not so much." They are designed for networks of fairly modest size (up to a few hundred nodes, in practice). However, the solutions we describe do serve as a building block for a hierarchical routing infrastructure that is used in the Internet today. Specifically, the protocols described in this section are collectively known as intradomain routing protocols, or interior gateway protocols (IGPs). To understand these terms, we need to define a routing domain. A good working definition is an internetwork in which all the routers are under the same administrative control (e.g., a single university campus, or the network of a single Internet Service Provider). The relevance of this definition will become apparent in the next chapter when we look at interdomain routing protocols. For now, the important thing to keep in mind is that we are considering the problem of routing in the context of small to midsized networks, not for a network the size of the Internet.
3.4.1 Network as a Graph
Routing is, in essence, a problem of graph theory. Figure 84 shows a graph representing a network. The nodes of the graph, labeled A through F, may be hosts, switches, routers, or networks. For our initial discussion, we will focus on the case where the nodes are routers. The edges of the graph correspond to the network links. Each edge has an associated cost, which gives some indication of the desirability of sending traffic over that link. A discussion of how edge costs are assigned is given in a later section.
Note that the example networks (graphs) used throughout this chapter have undirected edges that are assigned a single cost. This is actually a slight simplification. It is more accurate to make the edges directed, which typically means that there would be a pair of edges between each pair of connected nodes, one flowing in each direction, and each with its own edge cost.
The basic problem of routing is to find the lowest-cost path between any two nodes, where the cost of a path equals the sum of the costs of all the edges that make up the path. For a simple network like the one in Figure 84, you could imagine just calculating all the shortest paths and loading them into some nonvolatile storage on each node. Such a static approach has several shortcomings:
- It does not deal with node or link failures.
- It does not consider the addition of new nodes or links.
- It implies that edge costs cannot change, even though we might reasonably wish to have link costs change over time (e.g., assigning high cost to a link that is heavily loaded).
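The definition of path cost given above (the sum of the costs of the edges on the path) can be sketched in a few lines of C. The cost-matrix representation, the four-node size, and the names `NO_LINK` and `path_cost` are assumptions made for this illustration only.

```c
#include <limits.h>

#define NUM_NODES 4
#define NO_LINK   INT_MAX   /* stands in for an infinite edge cost */

/* Sum the edge costs along path[0..len-1].
 * Returns NO_LINK if two consecutive nodes on the path share no edge. */
int path_cost(int cost[NUM_NODES][NUM_NODES], const int *path, int len)
{
    int total = 0;
    for (int i = 0; i + 1 < len; ++i) {
        int c = cost[path[i]][path[i + 1]];
        if (c == NO_LINK)
            return NO_LINK;
        total += c;
    }
    return total;
}
```

Because the matrix is indexed as `cost[from][to]`, it can also hold directed edges with a different cost in each direction, as discussed earlier.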
For these reasons, routing is achieved in most practical networks by running routing protocols among the nodes. These protocols provide a distributed, dynamic way to solve the problem of finding the lowest-cost path in the presence of link and node failures and changing edge costs. Note the word distributed in the previous sentence; it is difficult to make centralized solutions scalable, so all the widely used routing protocols use distributed algorithms.
The distributed nature of routing algorithms is one of the main reasons why this has been such a rich field of research and development: there are a lot of challenges in making distributed algorithms work well. For example, distributed algorithms raise the possibility that two routers will at one instant have different ideas about the shortest path to some destination. In fact, each one may think that the other one is closer to the destination and decide to send packets to the other one. Clearly, such packets will be stuck in a loop until the discrepancy between the two routers is resolved, and it would be good to resolve it as soon as possible. This is just one example of the type of problem routing protocols must address.
To begin our analysis, we assume that the edge costs in the network are known. We will examine the two main classes of routing protocols: distance vector and link state. In a later section, we return to the problem of calculating edge costs in a meaningful way.
3.4.2 Distance-Vector (RIP)
The idea behind the distance-vector algorithm is suggested by its name. (The other common name for this class of algorithm is Bellman-Ford, after its inventors.) Each node constructs a one-dimensional array (a vector) containing the "distances" (costs) to all other nodes and distributes that vector to its immediate neighbors. The starting assumption for distance-vector routing is that each node knows the cost of the link to each of its directly connected neighbors. These costs may be provided when the router is configured by a network manager. A link that is down is assigned an infinite cost.
Node | A | B | C | D | E | F | G |
---|---|---|---|---|---|---|---|
A | 0 | 1 | 1 | ∞ | 1 | 1 | ∞ |
B | 1 | 0 | 1 | ∞ | ∞ | ∞ | ∞ |
C | 1 | 1 | 0 | 1 | ∞ | ∞ | ∞ |
D | ∞ | ∞ | 1 | 0 | ∞ | ∞ | 1 |
E | 1 | ∞ | ∞ | ∞ | 0 | ∞ | ∞ |
F | 1 | ∞ | ∞ | ∞ | ∞ | 0 | 1 |
G | ∞ | ∞ | ∞ | 1 | ∞ | 1 | 0 |
To see how a distance-vector routing algorithm works, it is easiest to consider an example like the one depicted in Figure 85. In this example, the cost of each link is set to 1, so that a least-cost path is simply the one with the fewest hops. (Since all edges have the same cost, we do not show the costs in the graph.) We can represent each node's knowledge about the distances to all other nodes as a table like Table 16. Note that each node knows only the information in one row of the table (the one that bears its name in the left column). The global view that is presented here is not available at any single point in the network.
We may consider each row in Table 16 as a list of distances from one node to all other nodes, representing the current beliefs of that node. Initially, each node sets a cost of 1 to its directly connected neighbors and ∞ to all other nodes. Thus, A initially believes that it can reach B in one hop and that D is unreachable. The routing table stored at A reflects this set of beliefs and includes the name of the next hop that A would use to reach any reachable node. Initially, then, A's routing table would look like Table 17.
Destination | Cost | NextHop |
---|---|---|
B | 1 | B |
C | 1 | C |
D | ∞ | - |
E | 1 | E |
F | 1 | F |
G | ∞ | - |
The next step in distance-vector routing is that every node sends a message to its directly connected neighbors containing its personal list of distances. For example, node F tells node A that it can reach node G at a cost of 1; A also knows it can reach F at a cost of 1, so it adds these costs to get the cost of reaching G by means of F. This total cost of 2 is less than the current cost of infinity, so A records that it can reach G at a cost of 2 by going through F. Similarly, A learns from C that D can be reached from C at a cost of 1; it adds this to the cost of reaching C (1) and decides that D can be reached via C at a cost of 2, which is better than the old cost of infinity. At the same time, A learns from C that B can be reached from C at a cost of 1, so it concludes that the cost of reaching B via C is 2. Since this is worse than the current cost of reaching B (1), this new information is ignored. At this point, A can update its routing table with costs and next hops for all nodes in the network. The result is shown in Table 18.
Destination | Cost | NextHop |
---|---|---|
B | 1 | B |
C | 1 | C |
D | 2 | C |
E | 1 | E |
F | 1 | F |
G | 2 | F |
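The update step that A performs here can be sketched as follows. This is a simplified illustration, not the routine given later in this section: the names `DistanceVector` and `dv_merge` are ours, infinity is represented by the assumed constant `INF` (16), and only the "strictly cheaper route" case is handled.

```c
#define N_NODES 7
#define INF     16   /* assumed stand-in for an infinite cost */

enum { A, B, C, D, E, F, G };

typedef struct {
    int cost[N_NODES];       /* believed distance to each node */
    int next_hop[N_NODES];   /* neighbor used to reach it, -1 if none */
} DistanceVector;

/* Merge the vector advertised by `neighbor` (reachable at `link_cost`)
 * into `mine`, keeping any route that is strictly cheaper. */
void dv_merge(DistanceVector *mine, int neighbor, int link_cost,
              const int advertised[N_NODES])
{
    for (int dest = 0; dest < N_NODES; ++dest) {
        int via = advertised[dest] + link_cost;
        if (via > INF)
            via = INF;               /* never count past "infinity" */
        if (via < mine->cost[dest]) {
            mine->cost[dest] = via;
            mine->next_hop[dest] = neighbor;
        }
    }
}
```

Starting from A's initial row in Table 16 and merging the rows advertised by F and C gives the entries for D and G shown in Table 18, while the existing one-hop route to B is left unchanged.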
In the absence of any topology changes, it takes only a few exchanges of information between neighbors before each node has a complete routing table. The process of getting consistent routing information to all the nodes is called convergence. Table 19 shows the final set of costs from each node to all other nodes when routing has converged. We must stress that there is no one node in the network that has all the information in this table; each node only knows about the contents of its own routing table. The beauty of a distributed algorithm like this is that it enables all nodes to achieve a consistent view of the network in the absence of any centralized authority.
Node | A | B | C | D | E | F | G |
---|---|---|---|---|---|---|---|
A | 0 | 1 | 1 | 2 | 1 | 1 | 2 |
B | 1 | 0 | 1 | 2 | 2 | 2 | 3 |
C | 1 | 1 | 0 | 1 | 2 | 2 | 2 |
D | 2 | 2 | 1 | 0 | 3 | 2 | 1 |
E | 1 | 2 | 2 | 3 | 0 | 2 | 3 |
F | 1 | 2 | 2 | 2 | 2 | 0 | 1 |
G | 2 | 3 | 2 | 1 | 3 | 1 | 0 |
There are a few details to fill in before our discussion of distance-vector routing is complete. First we note that there are two different circumstances under which a given node decides to send a routing update to its neighbors. One of these circumstances is the periodic update. In this case, each node automatically sends an update message every so often, even if nothing has changed. This serves to let the other nodes know that this node is still running. It also makes sure that they keep getting information that they may need if their current routes become unviable. The frequency of these periodic updates varies from protocol to protocol, but it is typically on the order of several seconds to several minutes. The second mechanism, sometimes called a triggered update, happens whenever a node notices a link failure or receives an update from one of its neighbors that causes it to change one of the routes in its routing table. Whenever a node's routing table changes, it sends an update to its neighbors, which may lead to a change in their tables, causing them to send an update to their neighbors.
Now consider what happens when a link or node fails. The nodes that notice first send new lists of distances to their neighbors, and normally the system settles down fairly quickly to a new state. As to the question of how a node detects a failure, there are a couple of different answers. In one approach, a node continually tests the link to another node by sending a control packet and seeing if it receives an acknowledgment. In another approach, a node determines that the link (or the node at the other end of the link) is down if it does not receive the expected periodic routing update for the last few update cycles.
To understand what happens when a node detects a link failure, consider what happens when F detects that its link to G has failed. First, F sets its new distance to G to infinity and passes that information along to A. Since A knows that its 2-hop path to G is through F, A would also set its distance to G to infinity. However, with the next update from C, A would learn that C has a 2-hop path to G. Thus, A would know that it could reach G in 3 hops through C, which is less than infinity, and so A would update its table accordingly. When it advertises this to F, node F would learn that it can reach G at a cost of 4 through A, which is less than infinity, and the system would again become stable.
Unfortunately, slightly different circumstances can prevent the network from stabilizing. Suppose, for example, that the link from A to E goes down. In the next round of updates, A advertises a distance of infinity to E, but B and C advertise a distance of 2 to E. Depending on the exact timing of events, the following might happen: Node B, upon hearing that E can be reached in 2 hops from C, concludes that it can reach E in 3 hops and advertises this to A; node A concludes that it can reach E in 4 hops and advertises this to C; node C concludes that it can reach E in 5 hops; and so on. This cycle stops only when the distances reach some number that is large enough to be considered infinite. In the meantime, none of the nodes actually knows that E is unreachable, and the routing tables for the network do not stabilize. This situation is known as the count to infinity problem.
There are several partial solutions to this problem. The first one is to use some relatively small number as an approximation of infinity. For example, we might decide that the maximum number of hops to get across a certain network is never going to be more than 16, and so we could pick 16 as the value that represents infinity. This at least bounds the amount of time that it takes to count to infinity. Of course, it could also present a problem if our network grew to a point where some nodes were separated by more than 16 hops.
One technique to improve the time to stabilize routing is called split horizon. The idea is that when a node sends a routing update to its neighbors, it does not send those routes it learned from each neighbor back to that neighbor. For example, if B has the route (E, 2, A) in its table, then it knows it must have learned this route from A, and so whenever B sends a routing update to A, it does not include the route (E, 2) in that update. In a stronger variation of split horizon, called split horizon with poison reverse, B actually sends that route back to A, but it puts negative information in the route to ensure that A will not eventually use B to get to E. For example, B sends the route (E, ∞) to A. The problem with both of these techniques is that they only work for routing loops that involve two nodes. For larger routing loops, more drastic measures are called for. Continuing the above example, if B and C had waited for a while after hearing of the link failure from A before advertising routes to E, they would have found that neither of them really had a route to E. Unfortunately, this approach delays the convergence of the protocol; speed of convergence is one of the key advantages of its competitor, link-state routing, the subject of a later section.
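As a rough illustration of the two variants, the sketch below builds the list of costs that a node would advertise to one particular neighbor. The names `RouteEntry` and `build_update`, the use of -1 to mark an omitted route, and the value 16 for infinity are all assumptions made for this example.

```c
#define N_NODES 7
#define INF     16   /* assumed value representing infinity */

typedef struct {
    int cost;       /* distance to the destination */
    int next_hop;   /* neighbor the route was learned from, -1 if none */
} RouteEntry;

/* Fill out[] with the costs advertised to `neighbor`.
 * Plain split horizon (poison_reverse == 0) omits routes learned from
 * that neighbor, marking them -1; poison reverse advertises them as INF. */
void build_update(const RouteEntry table[N_NODES], int neighbor,
                  int poison_reverse, int out[N_NODES])
{
    for (int dest = 0; dest < N_NODES; ++dest) {
        if (table[dest].next_hop == neighbor)
            out[dest] = poison_reverse ? INF : -1;
        else
            out[dest] = table[dest].cost;
    }
}
```

With B's converged table, the route to E (learned from A) is left out of the update sent to A under plain split horizon, is sent as cost 16 under poison reverse, and is sent unchanged to C.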
Implementation
The code that implements this algorithm is very straightforward; we give only some of the basics here. Structure `Route` defines each entry in the routing table, and constant `MAX_TTL` specifies how long an entry is kept in the table before it is discarded.

    #define MAX_ROUTES      128     /* maximum size of routing table */
    #define MAX_TTL         120     /* time (in seconds) until route expires */

    typedef struct {
        NodeAddr  Destination;    /* address of destination */
        NodeAddr  NextHop;        /* address of next hop */
        int       Cost;           /* distance metric */
        u_short   TTL;            /* time to live */
    } Route;

    int      numRoutes = 0;
    Route    routingTable[MAX_ROUTES];
The routine that updates the local node's routing table based on a new route is given by `mergeRoute`. Although not shown, a timer function periodically scans the list of routes in the node's routing table, decrements the `TTL` (time to live) field of each route, and discards any routes that have a time to live of 0. Notice, however, that the `TTL` field is reset to `MAX_TTL` any time the route is reconfirmed by an update message from a neighboring node.

    void
    mergeRoute (Route *new)
    {
        int i;

        for (i = 0; i < numRoutes; ++i)
        {
            if (new->Destination == routingTable[i].Destination)
            {
                if (new->Cost + 1 < routingTable[i].Cost)
                {
                    /* found a better route: */
                    break;
                } else if (new->NextHop == routingTable[i].NextHop) {
                    /* metric for current next-hop may have changed: */
                    break;
                } else {
                    /* route is uninteresting; just ignore it */
                    return;
                }
            }
        }
        if (i == numRoutes)
        {
            /* this is a completely new route; is there room for it? */
            if (numRoutes < MAX_ROUTES)
            {
                ++numRoutes;
            } else {
                /* can't fit this route in table so give up */
                return;
            }
        }
        routingTable[i] = *new;
        /* reset TTL */
        routingTable[i].TTL = MAX_TTL;
        /* account for hop to get to next node */
        ++routingTable[i].Cost;
    }
Finally, the procedure `updateRoutingTable` is the main routine that calls `mergeRoute` to incorporate all the routes contained in a routing update that is received from a neighboring node.

    void
    updateRoutingTable (Route *newRoute, int numNewRoutes)
    {
        int i;

        for (i = 0; i < numNewRoutes; ++i)
        {
            mergeRoute(&newRoute[i]);
        }
    }
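The timer function mentioned above is not shown in the text. One way it might look is sketched below; the name `ageRoutes`, the `elapsed` parameter, the choice to compact the table in place, and the definition of `NodeAddr` as `int` are all assumptions on our part rather than part of the original code.

```c
#define MAX_ROUTES 128
#define MAX_TTL    120

typedef int NodeAddr;   /* assumption: the text does not define NodeAddr */

typedef struct {
    NodeAddr       Destination;
    NodeAddr       NextHop;
    int            Cost;
    unsigned short TTL;
} Route;

int   numRoutes = 0;
Route routingTable[MAX_ROUTES];

/* Age every route by `elapsed` seconds and discard those whose time to
 * live would reach 0. Surviving routes are compacted toward the front. */
void ageRoutes(unsigned short elapsed)
{
    int kept = 0;
    for (int i = 0; i < numRoutes; ++i) {
        if (routingTable[i].TTL > elapsed) {
            routingTable[i].TTL -= elapsed;
            routingTable[kept++] = routingTable[i];
        }
    }
    numRoutes = kept;
}
```

A route that is never reconfirmed by an update from a neighbor would therefore disappear after `MAX_TTL` seconds' worth of calls.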
Routing Information Protocol (RIP)
One of the more widely used routing protocols in IP networks is the Routing Information Protocol (RIP). Its widespread use in the early days of IP was due in no small part to the fact that it was distributed along with the popular Berkeley Software Distribution (BSD) version of Unix, from which many commercial versions of Unix were derived. It is also extremely simple. RIP is the canonical example of a routing protocol built on the distance-vector algorithm just described.
Routing protocols in internetworks differ very slightly from the idealized graph model described above. In an internetwork, the goal of the routers is to learn how to forward packets to various networks. Thus, rather than advertising the cost of reaching other routers, the routers advertise the cost of reaching networks. For example, in Figure 86, router C would advertise to router A the fact that it can reach networks 2 and 3 (to which it is directly connected) at a cost of 0, networks 5 and 6 at cost 1, and network 4 at cost 2.
We can see evidence of this in the RIP (version 2) packet format in Figure 87. The majority of the packet is taken up with `(address, mask, distance)` triples. However, the principles of the routing algorithm are just the same. For example, if router A learns from router B that network X can be reached at a lower cost via B than via the existing next hop in the routing table, A updates the cost and next hop information for the network number accordingly.
RIP is in fact a fairly straightforward implementation of distance-vector routing. Routers running RIP send their advertisements every 30 seconds; a router also sends an update message whenever an update from another router causes it to change its routing table. One point of interest is that it supports multiple address families, not just IP; that is the reason for the `Family` part of the advertisements. RIP version 2 (RIPv2) also introduced the subnet masks described in an earlier section, whereas RIP version 1 worked with the old classful addresses of IP.
As we will see below, it is possible to use a range of different metrics or costs for the links in a routing protocol. RIP takes the simplest approach, with all link costs being equal to 1, just as in our example above. Thus, it always tries to find the minimum hop route. Valid distances are 1 through 15, with 16 representing infinity. This also limits RIP to running on fairly small networks: those with no paths longer than 15 hops.
3.4.3 Link State (OSPF)
Link-state routing is the second major class of intradomain routing protocol. The starting assumptions for link-state routing are rather similar to those for distance-vector routing. Each node is assumed to be capable of finding out the state of the link to its neighbors (up or down) and the cost of each link. Again, we want to provide each node with enough information to enable it to find the least-cost path to any destination. The basic idea behind link-state protocols is very simple: Every node knows how to reach its directly connected neighbors, and if we make sure that the totality of this knowledge is disseminated to every node, then every node will have enough knowledge of the network to build a complete map of the network. This is clearly a sufficient condition (although not a necessary one) for finding the shortest path to any point in the network. Thus, link-state routing protocols rely on two mechanisms: reliable dissemination of link-state information, and the calculation of routes from the sum of all the accumulated link-state knowledge.
Reliable Flooding
Reliable flooding is the process of making sure that all the nodes participating in the routing protocol get a copy of the link-state information from all the other nodes. As the term flooding suggests, the basic idea is for a node to send its link-state information out on all of its directly connected links; each node that receives this information then forwards it out on all of its links. This process continues until the information has reached all the nodes in the network.
More precisely, each node creates an update packet, also called a link-state packet (LSP), which contains the following information:
- The ID of the node that created the LSP
- A listing of directly connected neighbors of that node, with the cost of the link to each one
- A sequence number
- A time to live for this packet
The first two items are needed to enable route calculation; the last two are used to make the process of flooding the packet to all nodes reliable. Reliability includes making sure that you have the most recent copy of the information, since there may be multiple, contradictory LSPs from one node traversing the network. Making the flooding reliable has proven to be quite difficult. (For example, an early version of link-state routing used in the ARPANET caused that network to fail in 1981.)
Flooding works in the following way. First, the transmission of LSPs between adjacent routers is made reliable using acknowledgments and retransmissions just as in the reliable link-layer protocol. However, several more steps are necessary to reliably flood an LSP to all nodes in a network.
Consider a node X that receives a copy of an LSP that originated at some other node Y. Note that Y may be any other router in the same routing domain as X. X checks to see if it has already stored a copy of an LSP from Y. If not, it stores the LSP. If it already has a copy, it compares the sequence numbers; if the new LSP has a larger sequence number, it is assumed to be the more recent, and that LSP is stored, replacing the old one. A smaller (or equal) sequence number would imply an LSP older (or not newer) than the one stored, so it would be discarded and no further action would be needed. If the received LSP was the newer one, X then sends a copy of that LSP to all of its neighbors except the neighbor from which the LSP was just received. The fact that the LSP is not sent back to the node from which it was received helps to bring an end to the flooding of an LSP. Since X passes the LSP on to all its neighbors, who then turn around and do the same thing, the most recent copy of the LSP eventually reaches all nodes.
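The store-or-discard decision that X makes can be sketched as below. The names `StoredLsp`, `lsp_db`, and `lsp_receive`, and the idea of indexing the database directly by a small integer node ID, are assumptions made to keep the example short; the sketch records only the sequence number, not the LSP contents.

```c
#include <stdint.h>

#define MAX_NODES 16

typedef struct {
    int      valid;   /* 1 once an LSP from this origin has been stored */
    uint64_t seq;     /* sequence number of the stored LSP */
} StoredLsp;

static StoredLsp lsp_db[MAX_NODES];   /* indexed by originating node ID */

/* Process an LSP from `origin` carrying sequence number `seq`.
 * Returns 1 if it was stored (and so should be flooded to every neighbor
 * except the one it arrived from), 0 if it was discarded. */
int lsp_receive(int origin, uint64_t seq)
{
    StoredLsp *slot = &lsp_db[origin];
    if (slot->valid && seq <= slot->seq)
        return 0;     /* older than, or a duplicate of, the stored copy */
    slot->valid = 1;
    slot->seq = seq;
    return 1;
}
```

A caller would forward the LSP on all links other than the incoming one only when `lsp_receive` returns 1, which is what eventually brings the flood to an end.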
Figure 88 shows an LSP being flooded in a small network. Each node becomes shaded as it stores the new LSP. In Figure 88(a) the LSP arrives at node X, which sends it to neighbors A and C in Figure 88(b). A and C do not send it back to X, but send it on to B. Since B receives two identical copies of the LSP, it will accept whichever arrived first and ignore the second as a duplicate. It then passes the LSP onto D, which has no neighbors to flood it to, and the process is complete.
Just as in RIP, each node generates LSPs under two circumstances. Either the expiry of a periodic timer or a change in topology can cause a node to generate a new LSP. However, the only topology-based reason for a node to generate an LSP is if one of its directly connected links or immediate neighbors has gone down. The failure of a link can be detected in some cases by the link-layer protocol. The demise of a neighbor or loss of connectivity to that neighbor can be detected using periodic "hello" packets. Each node sends these to its immediate neighbors at defined intervals. If a sufficiently long time passes without receipt of a "hello" from a neighbor, the link to that neighbor will be declared down, and a new LSP will be generated to reflect this fact.
One of the important design goals of a link-state protocol's flooding mechanism is that the newest information must be flooded to all nodes as quickly as possible, while old information must be removed from the network and not allowed to circulate. In addition, it is clearly desirable to minimize the total amount of routing traffic that is sent around the network; after all, this is just overhead from the perspective of those who actually use the network for their applications. The next few paragraphs describe some of the ways that these goals are accomplished.
One easy way to reduce overhead is to avoid generating LSPs unless absolutely necessary. This can be done by using very long timers (often on the order of hours) for the periodic generation of LSPs. Given that the flooding protocol is truly reliable when topology changes, it is safe to assume that messages saying "nothing has changed" do not need to be sent very often.
To make sure that old information is replaced by newer information, LSPs carry sequence numbers. Each time a node generates a new LSP, it increments the sequence number by 1. Unlike most sequence numbers used in protocols, these sequence numbers are not expected to wrap, so the field needs to be quite large (say, 64 bits). If a node goes down and then comes back up, it starts with a sequence number of 0. If the node was down for a long time, all the old LSPs for that node will have timed out (as described below); otherwise, this node will eventually receive a copy of its own LSP with a higher sequence number, which it can then increment and use as its own sequence number. This will ensure that its new LSP replaces any of its old LSPs left over from before the node went down.
LSPs also carry a time to live. This is used to ensure that old link-state information is eventually removed from the network. A node always decrements the TTL of a newly received LSP before flooding it to its neighbors. It also "ages" the LSP while it is stored in the node. When the TTL reaches 0, the node refloods the LSP with a TTL of 0, which is interpreted by all the nodes in the network as a signal to delete that LSP.
Route Calculation
Once a given node has a copy of the LSP from every other node, it is able to compute a complete map for the topology of the network, and from this map it is able to decide the best route to each destination. The question, then, is exactly how it calculates routes from this information. The solution is based on a well-known algorithm from graph theory: Dijkstra's shortest-path algorithm.
We first define Dijkstra's algorithm in graph-theoretic terms. Imagine that a node takes all the LSPs it has received and constructs a graphical representation of the network, in which N denotes the set of nodes in the graph, l(i, j) denotes the nonnegative cost (weight) associated with the edge between nodes i, j in N, and l(i, j) = ∞ if no edge connects i and j. In the following description, we let s in N denote this node, that is, the node executing the algorithm to find the shortest path to all the other nodes in N. Also, the algorithm maintains the following two variables: M denotes the set of nodes incorporated so far by the algorithm, and C(n) denotes the cost of the path from s to each node n. Given these definitions, the algorithm is defined as follows:

    M = {s}
    for each n in N - {s}
        C(n) = l(s,n)
    while (N != M)
        M = M + {w} such that C(w) is the minimum for all w in (N-M)
        for each n in (N-M)
            C(n) = MIN(C(n), C(w) + l(w,n))
Basically, the algorithm works as follows. We start with M containing this node s and then initialize the table of costs (the array C(n)) to other nodes using the known costs to directly connected nodes. We then look for the node that is reachable at the lowest cost (w) and add it to M. Finally, we update the table of costs by considering the cost of reaching nodes through w. In the last line of the algorithm, we choose a new route to node n that goes through node w if the total cost of going from the source to w and then following the link from w to n is less than the old route we had to n. This procedure is repeated until all nodes are incorporated in M.
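The graph-theoretic definition translates almost line for line into code. The following is a minimal Python sketch rather than a production implementation; the function name `dijkstra`, the dictionary representation of l(i,j), and the three-node example graph are illustrative assumptions, not part of the original text.

```python
import math


def dijkstra(N, l, s):
    """Return C, the least cost from s to every node in N.

    N: set of node names.
    l: dict mapping (i, j) to a nonnegative link cost; a missing pair
       is treated as infinity (no edge connects i and j).
    """
    def cost(i, j):
        return l.get((i, j), math.inf)

    M = {s}
    C = {n: cost(s, n) for n in N - {s}}
    C[s] = 0  # convenience entry; the pseudocode leaves C(s) implicit
    while M != N:
        # Pick the node outside M that is reachable at the lowest cost.
        w = min(N - M, key=lambda n: C[n])
        M.add(w)
        # Would going through w beat the old route to n?
        for n in N - M:
            C[n] = min(C[n], C[w] + cost(w, n))
    return C


# Hypothetical example graph: s-a costs 1, a-b costs 2, s-b costs 5.
links = {("s", "a"): 1, ("a", "b"): 2, ("s", "b"): 5}
l = {}
for (i, j), c in links.items():
    l[(i, j)] = c
    l[(j, i)] = c  # links are assumed to be symmetric

costs = dijkstra({"s", "a", "b"}, l, "s")
```

In this example the direct s-b link (cost 5) loses to the two-hop path through a (cost 3), which is the update performed by the last line of the algorithm.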
In practice, each switch computes its routing table directly from the LSPs it has collected using a realization of Dijkstra's algorithm called the forward search algorithm. Specifically, each switch maintains two lists, known as Tentative and Confirmed. Each of these lists contains a set of entries of the form (Destination, Cost, NextHop). The algorithm works as follows:
1. Initialize the Confirmed list with an entry for myself; this entry has a cost of 0.
2. For the node just added to the Confirmed list in the previous step, call it node Next and select its LSP.
3. For each neighbor (Neighbor) of Next, calculate the cost (Cost) to reach this Neighbor as the sum of the cost from myself to Next and from Next to Neighbor.
   - If Neighbor is currently on neither the Confirmed nor the Tentative list, then add (Neighbor, Cost, NextHop) to the Tentative list, where NextHop is the direction I go to reach Next.
   - If Neighbor is currently on the Tentative list, and the Cost is less than the currently listed cost for Neighbor, then replace the current entry with (Neighbor, Cost, NextHop), where NextHop is the direction I go to reach Next.
4. If the Tentative list is empty, stop. Otherwise, pick the entry from the Tentative list with the lowest cost, move it to the Confirmed list, and return to step 2.
This will become a lot easier to understand when we look at an example. Consider the network depicted in Figure 89. Note that, unlike our previous example, this network has a range of different edge costs. Table 20 traces the steps for building the routing table for node D. We denote the two outputs of D by using the names of the nodes to which they connect, B and C. Note the way the algorithm seems to head off on false leads (like the 11-unit cost path to B that was the first addition to the Tentative list) but ends up with the least-cost paths to all nodes.
Step | Confirmed | Tentative | Comments |
---|---|---|---|
1 | (D,0,–) | | Since D is the only new member of the confirmed list, look at its LSP. |
2 | (D,0,–) | (B,11,B) (C,2,C) | D's LSP says we can reach B through B at cost 11, which is better than anything else on either list, so put it on Tentative list; same for C. |
3 | (D,0,–) (C,2,C) | (B,11,B) | Put lowest-cost member of Tentative (C) onto Confirmed list. Next, examine LSP of newly confirmed member (C). |
4 | (D,0,–) (C,2,C) | (B,5,C) (A,12,C) | Cost to reach B through C is 5, so replace (B,11,B). C's LSP tells us that we can reach A at cost 12. |
5 | (D,0,–) (C,2,C) (B,5,C) | (A,12,C) | Move lowest-cost member of Tentative (B) to Confirmed, then look at its LSP. |
6 | (D,0,–) (C,2,C) (B,5,C) | (A,10,C) | Since we can reach A at cost 5 through B, replace the Tentative entry. |
7 | (D,0,–) (C,2,C) (B,5,C) (A,10,C) | | Move lowest-cost member of Tentative (A) to Confirmed, and we are all done. |
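The trace above can be reproduced with a short program. The sketch below is one possible Python rendering of the forward search steps, not the implementation used by any particular router. The function name `forward_search` and the dictionary layout are assumptions, and the link costs are inferred from the trace itself (D-B 11, D-C 2, C-B 3, C-A 10, B-A 5), since the figure is not reproduced here.

```python
def forward_search(lsps, me):
    """Build the Confirmed list for node `me`.

    lsps: dict mapping each node to its LSP, here {neighbor: link cost}.
    Returns a dict mapping destination to (cost, next_hop).
    """
    confirmed = {me: (0, None)}  # step 1: entry for myself at cost 0
    tentative = {}
    nxt = me                     # step 2: the node just confirmed
    while True:
        next_cost, next_hop = confirmed[nxt]
        for neighbor, link_cost in lsps[nxt].items():  # step 3
            if neighbor in confirmed:
                continue
            cost = next_cost + link_cost
            # For my own neighbors the next hop is the neighbor itself;
            # otherwise it is the direction I go to reach Next.
            hop = neighbor if nxt == me else next_hop
            if neighbor not in tentative or cost < tentative[neighbor][0]:
                tentative[neighbor] = (cost, hop)
        if not tentative:        # step 4: nothing left to confirm
            return confirmed
        nxt = min(tentative, key=lambda d: tentative[d][0])
        confirmed[nxt] = tentative.pop(nxt)


# Link costs inferred from the trace table (an assumption).
lsps = {
    "A": {"B": 5, "C": 10},
    "B": {"A": 5, "C": 3, "D": 11},
    "C": {"A": 10, "B": 3, "D": 2},
    "D": {"B": 11, "C": 2},
}
routes = forward_search(lsps, "D")
```

With these costs, `routes` ends up holding the same four entries as the final row of the table: every destination is reached through C, including B, whose direct link from D costs 11.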
The link-state routing algorithm has many nice properties: It has been proven to stabilize quickly, it does not generate much traffic, and it responds rapidly to topology changes or node failures. On the downside, the amount of information stored at each node (one LSP for every other node in the network) can be quite large. This is one of the fundamental problems of routing and is an instance of the more general problem of scalability. Some solutions to both the specific problem (the amount of storage potentially required at each node) and the general problem (scalability) will be discussed in the next section.
Key Takeaway
Distance-vector and link-state are both distributed routing algorithms, but they adopt different strategies. In distance-vector, each node talks only to its directly connected neighbors, but it tells them everything it has learned (i.e., distance to all nodes). In link-state, each node talks to all other nodes, but it tells them only what it knows for sure (i.e., only the state of its directly connected links). In contrast to both of these algorithms, we will consider a more centralized approach to routing in Section 3.5 when we introduce Software Defined Networking (SDN).
The Open Shortest Path First Protocol (OSPF)¶
One of the most widely used link-state routing protocols is OSPF. The first word, "Open," refers to the fact that it is an open, nonproprietary standard, created under the auspices of the Internet Engineering Task Force (IETF). The "SPF" part comes from an alternative name for link-state routing. OSPF adds quite a number of features to the basic link-state algorithm described above, including the following:
- Authentication of routing messages: One feature of distributed routing algorithms is that they disperse information from one node to many other nodes, and the entire network can thus be impacted by bad information from one node. For this reason, it's a good idea to be sure that all the nodes taking part in the protocol can be trusted. Authenticating routing messages helps achieve this. Early versions of OSPF used a simple 8-byte password for authentication. This is not a strong enough form of authentication to prevent dedicated malicious users, but it alleviates some problems caused by misconfiguration or casual attacks. (A similar form of authentication was added to RIP in version 2.) Strong cryptographic authentication was later added.
- Additional hierarchy: Hierarchy is one of the fundamental tools used to make systems more scalable. OSPF introduces another layer of hierarchy into routing by allowing a domain to be partitioned into areas. This means that a router within a domain does not necessarily need to know how to reach every network within that domain; it may be able to get by knowing only how to get to the right area. Thus, there is a reduction in the amount of information that must be transmitted to and stored in each node.
- Load balancing: OSPF allows multiple routes to the same place to be assigned the same cost and will cause traffic to be distributed evenly over those routes, thus making better use of the available network capacity.
There are several different types of OSPF messages, but all begin with the same header, as shown in Figure 90. The Version field is currently set to 2, and the Type field may take the values 1 through 5. The SourceAddr identifies the sender of the message, and the AreaId is a 32-bit identifier of the area in which the node is located. The entire packet, except the authentication data, is protected by a 16-bit checksum using the same algorithm as the IP header. The Authentication type is 0 if no authentication is used; otherwise, it may be 1, implying that a simple password is used, or 2, which indicates that a cryptographic authentication checksum is used. In the latter cases, the Authentication field carries the password or cryptographic checksum.
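To make the header description concrete, here is a small Python sketch that packs, parses, and checksums such a header. The byte layout (1-byte version and type, 2-byte length, 4-byte source address and area ID, 2-byte checksum and authentication type, 8-byte authentication field) is an assumption based on the OSPF version 2 specification, since the figure is not reproduced here. The helper names `build_packet`, `parse_header`, and `checksum_ok` are hypothetical.

```python
import struct
from collections import namedtuple

OspfHeader = namedtuple(
    "OspfHeader",
    "version type length source_addr area_id checksum auth_type authentication",
)
HEADER_FORMAT = "!BBHIIHH8s"  # network byte order, 24 bytes (assumed layout)


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum, the same algorithm as the IP header."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_packet(version, type_, source_addr, area_id, auth_type, body=b""):
    header = struct.pack(HEADER_FORMAT, version, type_, 24 + len(body),
                         source_addr, area_id, 0, auth_type, bytes(8))
    # The checksum covers everything except the 8-byte Authentication field.
    csum = internet_checksum(header[:16] + body)
    return header[:12] + struct.pack("!H", csum) + header[14:] + body


def parse_header(packet: bytes) -> OspfHeader:
    return OspfHeader(*struct.unpack(HEADER_FORMAT, packet[:24]))


def checksum_ok(packet: bytes) -> bool:
    # Summing the covered bytes, checksum included, yields 0 if intact.
    return internet_checksum(packet[:16] + packet[24:]) == 0


# Hypothetical type 1 message from 10.0.0.1 in area 0, no authentication.
pkt = build_packet(2, 1, 0x0A000001, 0, 0, body=b"\x01\x02\x03\x04")
hdr = parse_header(pkt)
```

Leaving the Authentication field out of the checksum means a password or cryptographic digest can be filled in without recomputing the checksum.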
Of the five OSPF message types, type 1 is the "hello" message, which a router sends to its peers to notify them that it is still alive and connected as described above. The remaining types are used to request, send, and acknowledge the receipt of link-state messages. The basic building block of link-state messages in OSPF is the link-state advertisement (LSA). One message may contain many LSAs. We provide a few details of the LSA here.
Like any internetwork routing protocol, OSPF must provide information about how to reach networks. Thus, OSPF must provide a little more information than the simple graph-based protocol described above. Specifically, a router running OSPF may generate link-state packets that advertise one or more of the networks that are directly connected to that router. In addition, a router that is connected to another router by some link must advertise the cost of reaching that router over the link. These two types of advertisements are necessary to enable all the routers in a domain to determine the cost of reaching all networks in that domain and the appropriate next hop for each network.
Figure 91 shows the packet format for a type 1 link-state advertisement. Type 1 LSAs advertise the cost of links between routers. Type 2 LSAs are used to advertise networks to which the advertising router is connected, while other types are used to support additional hierarchy as described in the next section. Many fields in the LSA should be familiar from the preceding discussion. The LS Age is the equivalent of a time to live, except that it counts up and the LSA expires when the age reaches a defined maximum value. The Type field tells us that this is a type 1 LSA.
In a type 1 LSA, the Link state ID and the Advertising router field are identical. Each carries a 32-bit identifier for the router that created this LSA. While a number of assignment strategies may be used to assign this ID, it is essential that it be unique in the routing domain and that a given router consistently uses the same router ID. One way to pick a router ID that meets these requirements would be to pick the lowest IP address among all the IP addresses assigned to that router. (Recall that a router may have a different IP address on each of its interfaces.)
The LS sequence number is used exactly as described above to detect old or duplicate LSAs. The LS checksum is similar to others we have seen in other protocols; it is, of course, used to verify that data has not been corrupted. It covers all fields in the packet except LS Age, so it is not necessary to recompute a checksum every time LS Age is incremented. Length is the length in bytes of the complete LSA.
Now we get to the actual link-state information. This is made a little complicated by the presence of TOS (type of service) information. Ignoring that for a moment, each link in the LSA is represented by a Link ID, some Link Data, and a metric. The first two of these fields identify the link; a common way to do this would be to use the router ID of the router at the far end of the link as the Link ID and then use the Link Data to disambiguate among multiple parallel links if necessary. The metric is of course the cost of the link. Type tells us something about the link, for example, whether it is a point-to-point link.
The TOS information is present to allow OSPF to choose different routes for IP packets based on the value in their TOS field. Instead of assigning a single metric to a link, it is possible to assign different metrics depending on the TOS value of the data. For example, if we had a link in our network that was very good for delay-sensitive traffic, we could give it a low metric for the TOS value representing low delay and a high metric for everything else. OSPF would then pick a different shortest path for those packets that had their TOS field set to that value. It is worth noting that, at the time of writing, this capability has not been widely deployed.
3.4.4 Metrics¶
The preceding discussion assumes that link costs, or metrics, are known when we execute the routing algorithm. In this section, we look at some ways to calculate link costs that have proven effective in practice. One example that we have seen already, which is quite reasonable and very simple, is to assign a cost of 1 to all links; the least-cost route will then be the one with the fewest hops. Such an approach has several drawbacks, however. First, it does not distinguish between links on a latency basis. Thus, a satellite link with 250-ms latency looks just as attractive to the routing protocol as a terrestrial link with 1-ms latency. Second, it does not distinguish between routes on a capacity basis, making a 1-Mbps link look just as good as a 10-Gbps link. Finally, it does not distinguish between links based on their current load, making it impossible to route around overloaded links. It turns out that this last problem is the hardest because you are trying to capture the complex and dynamic characteristics of a link in a single scalar cost.
The ARPANET was the testing ground for a number of different approaches to link-cost calculation. (It was also the place where the superior stability of link-state over distance-vector routing was demonstrated; the original mechanism used distance vector while the later version used link state.) The following discussion traces the evolution of the ARPANET routing metric and, in so doing, explores the subtle aspects of the problem.
The original ARPANET routing metric measured the number of packets that were queued waiting to be transmitted on each link, meaning that a link with 10 packets queued waiting to be transmitted was assigned a larger cost weight than a link with 5 packets queued for transmission. Using queue length as a routing metric did not work well, however, since queue length is an artificial measure of load: it moves packets toward the shortest queue rather than toward the destination, a situation all too familiar to those of us who hop from line to line at the grocery store. Stated more precisely, the original ARPANET routing mechanism suffered from the fact that it did not take either the bandwidth or the latency of the link into consideration.
A second version of the ARPANET routing algorithm took both link bandwidth and latency into consideration and used delay, rather than just queue length, as a measure of load. This was done as follows. First, each incoming packet was timestamped with its time of arrival at the router (ArrivalTime); its departure time from the router (DepartTime) was also recorded. Second, when the link-level ACK was received from the other side, the node computed the delay for that packet as

    Delay = (DepartTime - ArrivalTime) + TransmissionTime + Latency

where TransmissionTime and Latency were statically defined for the link and captured the link's bandwidth and latency, respectively. Notice that in this case, DepartTime - ArrivalTime represents the amount of time the packet was delayed (queued) in the node due to load. If the ACK did not arrive, but instead the packet timed out, then DepartTime was reset to the time the packet was retransmitted. In this case, DepartTime - ArrivalTime captures the reliability of the link: the more frequent the retransmission of packets, the less reliable the link, and the more we want to avoid it. Finally, the weight assigned to each link was derived from the average delay experienced by the packets recently sent over that link.
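As a worked example of the delay formula, the following Python sketch computes per-packet delays and a link weight. It assumes all times are in milliseconds and that the weight is the plain arithmetic mean of recent delays; the text says only that the weight was derived from the average, so the exact derivation, the function names, and the sample numbers are illustrative.

```python
def packet_delay(arrival_time, depart_time, transmission_time, latency):
    """Delay = (DepartTime - ArrivalTime) + TransmissionTime + Latency."""
    return (depart_time - arrival_time) + transmission_time + latency


def link_weight(recent_delays):
    """Assumed derivation: the mean delay of recently sent packets."""
    return sum(recent_delays) / len(recent_delays)


# Static per-link values (hypothetical): 10 ms to transmit, 5 ms latency.
TRANSMISSION_TIME = 10
LATENCY = 5

# Packet 1 waited 2 ms in the queue and was ACKed on the first attempt.
d1 = packet_delay(0, 2, TRANSMISSION_TIME, LATENCY)

# Packet 2 timed out, so DepartTime was reset to its retransmission time
# (50 ms), which inflates the queueing term to 40 ms.
d2 = packet_delay(10, 50, TRANSMISSION_TIME, LATENCY)

weight = link_weight([d1, d2])
```

The retransmitted packet contributes a delay of 55 ms against 17 ms for the first, so a lossy link drives up its own weight even when its queue is short.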
Although an improvement over the original mechanism, this approach also had a lot of problems. Under light load, it worked reasonably well, since the two static factors of delay dominated the cost. Under heavy load, however, a congested link would start to advertise a very high cost. This caused all the traffic to move off that link, leaving it idle, so then it would advertise a low cost, thereby attracting back all the traffic, and so on. The effect of this instability was that, under heavy load, many links would in fact spend a great deal of time being idle, which is the last thing you want under heavy load.
Another problem was that the range of link values was much too large. For example, a heavily loaded 9.6-kbps link could look 127 times more costly than a lightly loaded 56-kbps link. (Keep in mind, we're talking about the ARPANET circa 1975.) This means that the routing algorithm would choose a path with 126 hops of lightly loaded 56-kbps links in preference to a 1-hop 9.6-kbps path. While shedding some traffic from an overloaded line is a good idea, making it look so unattractive that it loses all its traffic is excessive. Using 126 hops when 1 hop will do is in general a bad use of network resources. Also, satellite links were unduly penalized, so that an idle 56-kbps satellite link looked considerably more costly than an idle 9.6-kbps terrestrial link, even though the former would give better performance for high-bandwidth applications.
A third approach addressed these problems. The major changes were to compress the dynamic range of the metric considerably, to account for the link type, and to smooth the variation of the metric with time.
The smoothing was achieved by several mechanisms. First, the delay measurement was transformed to a link utilization, and this number was averaged with the last reported utilization to suppress sudden changes. Second, there was a hard limit on how much the metric could change from one measurement cycle to the next. By smoothing the changes in the cost, the likelihood that all nodes would abandon a route at once is greatly reduced.
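The two smoothing mechanisms can be sketched in a few lines of Python. This is an illustration only: the equal-weight average, the clamp, the function names, and the sample numbers are assumptions, since the text does not give the exact weights or limits that were used.

```python
def smoothed_utilization(measured, last_reported):
    """Average the new measurement with the last reported utilization.

    Equal weighting is an assumption made for this sketch.
    """
    return (measured + last_reported) / 2


def limit_change(new_metric, old_metric, max_step):
    """Clamp the metric so it moves at most max_step per measurement cycle."""
    low, high = old_metric - max_step, old_metric + max_step
    return max(low, min(high, new_metric))


# A sudden jump from 25% to 75% utilization is reported as only 50%.
reported = smoothed_utilization(0.75, 0.25)

# A metric that would jump from 60 to 90 is held to 70 for this cycle
# (hypothetical units and step size).
metric = limit_change(90, 60, max_step=10)
```

Both functions damp the feedback loop described earlier: a link that becomes congested raises its cost gradually, so traffic drains away from it in stages rather than all at once.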
The compression of the dynamic range was achieved by feeding the measured utilization, the link type, and the link speed into a function that is shown graphically in Figure 92 below. Observe the following:
- A highly loaded link never shows a cost of more than three times its cost when idle.
- The most expensive link is only seven times the cost of the least expensive.
- A high-speed satellite link is more attractive than a low-speed terrestrial link.
- Cost is a function of link utilization only at moderate to high loads.
All of these factors mean that a link is much less likely to be universally abandoned, since a threefold increase in cost is likely to make the link unattractive for some paths while letting it remain the best choice for others. The slopes, offsets, and breakpoints for the curves in Figure 92 were arrived at by a great deal of trial and error, and they were carefully tuned to provide good performance.
Despite all these improvements, it turns out that in the majority of real-world network deployments, metrics change rarely if at all and only under the control of a network administrator, not automatically as described above. The reason for this is partly that conventional wisdom now holds that dynamically changing metrics are too unstable, even though this probably need not be true. Perhaps more significantly, many networks today lack the great disparity of link speeds and latencies that prevailed in the ARPANET. Thus, static metrics are the norm. One common approach to setting metrics is to use a constant multiplied by (1/link_bandwidth).
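A static metric of this form is simple to compute. In the Python sketch below, the constant of 100 Mbps is a hypothetical choice made for illustration; the text does not specify a value, and an administrator would pick one suited to the fastest links in the network.

```python
def static_metric(link_bandwidth_bps, constant=100_000_000):
    """Metric = constant * (1 / link_bandwidth); the constant is assumed."""
    return constant / link_bandwidth_bps


slow = static_metric(10_000_000)    # 10-Mbps link
fast = static_metric(100_000_000)   # 100-Mbps link
```

Because the metric is inversely proportional to bandwidth, the 10-Mbps link costs ten times as much as the 100-Mbps link, and the shortest-path computation prefers higher-capacity links without any dynamic measurement.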
Key Takeaway
Why do we still tell the story about a decades-old algorithm that's no longer in use? Because it perfectly illustrates two valuable lessons. The first is that computer systems are often designed iteratively based on experience. We seldom get it right the first time, so it's important to deploy a simple solution sooner rather than later, and expect to improve it over time. Staying stuck in the design phase indefinitely is usually not a good plan. The second is the well-known KISS principle: Keep it Simple, Stupid. When building a complex system, less is often more. Opportunities to invent sophisticated optimizations are plentiful, and it's a tempting opportunity to pursue. While such optimizations sometimes have short-term value, it is shocking how often a simple approach proves best over time. This is because when a system has many moving parts, as the Internet most certainly does, keeping each part as simple as possible is usually the best approach.
Source: https://book.systemsapproach.org/internetworking/routing.html