Lecture 05: Link Layer to Network Layer
Goals
- explain the purpose and behavior of binary exponential backoff
- compare and contrast the behavior of hubs, switches, and bridges
- compare and contrast circuit-based networks to packet-switching networks
- explain why link-layer protocols alone are insufficient
- define broadcast and broadcast domain
- define network and explain why they are necessary
- define internetwork
we ended up last time with the idea of binary exponential backoff
this is an algorithm that Ethernet stations use when they detect that a sent frame caused a collision
each station that notices a collision will pick a random whole number between 0 and 1 (inclusive) and wait that amount of time before trying to send again
if the second attempt causes another collision, the station picks a random number between 0 and 3 for its delay |
each subsequent collision sending this frame will double the size of the range |
the range stops growing after 10 collisions (at 0 to 1023), and after 16 failed attempts the link layer gives up and reports an error to the upper layer
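the backoff rule above can be sketched in a few lines of Python. this is a minimal sketch, not a real driver: the function name `backoff_slots` is made up for illustration, and the two limits (range stops doubling after 10 collisions, frame abandoned after 16 attempts) are assumptions taken from classic Ethernet that are easy to change

```python
import random

BACKOFF_LIMIT = 10   # assumption: the range stops doubling after this many collisions
ATTEMPT_LIMIT = 16   # assumption: the frame is abandoned after this many attempts


def backoff_slots(collisions, rng=random):
    """Return how many time units to wait after `collisions` collisions on one frame.

    Raises RuntimeError once the attempt limit is reached, which stands in for
    "report an error to the upper layer".
    """
    if collisions < 1:
        raise ValueError("backoff only applies after at least one collision")
    if collisions >= ATTEMPT_LIMIT:
        raise RuntimeError("too many collisions; giving up on this frame")
    exponent = min(collisions, BACKOFF_LIMIT)
    # After 1 collision the range is 0..1, after 2 it is 0..3, and so on.
    return rng.randint(0, 2 ** exponent - 1)
```

the return value is a count of time units, not seconds; what one unit means is a separate question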
the lingering question was "what time units?"
the answer is kind of weird: we take the random number and multiply it by the amount of time required to send 512 bits (a unit known as the slot time, sized with a maximum-length link in mind)
for example, 10 Mbit Ethernet (which is old and slow) can transmit 10 million bits per second
that’s one bit every tenth of a microsecond
sending 512 bits would take 51.2 microseconds
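the arithmetic above is easy to check in code. a small sketch, with made-up function names; the only inputs are the link speed in bits per second and the 512-bit unit from the notes

```python
def slot_time_seconds(bits_per_second, slot_bits=512):
    """Time needed to send `slot_bits` bits at the given link speed."""
    return slot_bits / bits_per_second


def backoff_delay_seconds(random_number, bits_per_second):
    """Convert the random backoff number into an actual waiting time."""
    return random_number * slot_time_seconds(bits_per_second)


ten_mbit = 10_000_000
print(slot_time_seconds(ten_mbit))          # time for 512 bits on 10 Mbit Ethernet
print(backoff_delay_seconds(3, ten_mbit))   # waiting time if the station picked 3
```

on a faster link the same 512 bits take less time, so the waiting times shrink along with it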
binary exponential backoff (BEB) is effective for a few reasons
it allows every station to make its own decision in isolation (important because the communication medium is contested and therefore cannot be used by stations to coordinate between themselves)
if the medium is relatively lightly used, BEB will quickly result in a successful transmission because different stations are unlikely to come up with the same random number
if the medium is relatively heavily used, BEB will quickly back off (because exponential) and likely find an unused timeslot, and eventually succeed at sending
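to put a number on "unlikely to come up with the same random number", here is a small sketch. it assumes the simplest case: exactly two stations, both of which have seen the same number of collisions and both of which pick uniformly from the same range. the function name is made up for illustration

```python
from fractions import Fraction


def same_choice_probability(collisions):
    """Chance that two stations pick the same number after `collisions` collisions."""
    choices = 2 ** collisions          # 2 choices after 1 collision, 4 after 2, ...
    # Whatever the first station picks, the second matches it 1 time in `choices`.
    return Fraction(1, choices)


for n in range(1, 6):
    print(n, same_choice_probability(n))
```

each extra collision halves the chance of the two stations colliding yet again, which is why the backoff settles quickly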
now, recall the notion of thick Ethernet, in which stations are all literally spliced onto the same wire
this is suboptimal for many reasons
it requires us literally splicing into the wire to add a station, which is a fraught operation that can cause everything to fail
it requires that the wire already be laid in places where we might want to place stations
if there is a problem at any point on the cable, the entire thing needs to be replaced and all stations re-attached
in practice, this style of network has not been used in decades
instead, we use a hub (sometimes called a repeater, for reasons that will become clear shortly)
a hub is a little box with a bunch of ports into which we can plug ethernet cables
and it emulates the behavior of thick Ethernet: that is, any frame received by the hub will be sent out all of the other ports (hence the alternate name: it repeats the frame to all connected ports)
so the functionality we’ve discussed for Ethernet is the same, it’s just that the topology (the physical arrangement of devices/hardware) is a bit different
instead of a single wire into which every machine splices, we have a single hub to which every individual machine is connected by a patch cable
now, if a single patch cable goes faulty, only a single machine is affected and we just need to replace that one cable
allowing a machine to join the party is as simple as connecting it via patch cable to another port on the hub
what do collisions look like in this context?
let’s imagine we’ve got a hub with machines A, B, C, and D each connected to a different port
if machine A sends a frame to machine B, that frame will also be sent down the ports to which machines C and D are connected
if machine C sends a frame to machine D at the exact same time, it will collide with the frame from A to B
in fact, any two machines sending simultaneously will cause a collision
the hub will notice this, drop both frames, and put the appropriate signal on the wire so that the senders know there was a collision
thus, all four machines (A, B, C, and D: the hub doesn’t count as a machine here) are part of the same collision domain: that is, the set of machines whose frames might collide with one another
on the topic of collisions, then, the hub is mostly no different than the thick Ethernet setup: any two machines sending a frame simultaneously will cause a collision (though now it’s the hub that detects the collision rather than the individual stations)
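a toy model of one instant on a hub, as described above. this is only a sketch: the function name, the "ok"/"collision" strings, and the idea of treating time as a single instant are all simplifications made up for illustration

```python
def hub_transmit(attached, sending):
    """Model one instant on a hub.

    attached: set of machine names plugged into the hub.
    sending:  dict mapping sender name -> frame being sent right now.

    Returns ("collision", {}) if two or more machines send at once,
    otherwise ("ok", deliveries) where deliveries maps every other
    machine to the frame it sees.
    """
    if len(sending) > 1:
        return "collision", {}
    deliveries = {}
    for sender, frame in sending.items():
        for machine in attached:
            if machine != sender:        # the hub repeats to every other port
                deliveries[machine] = frame
    return "ok", deliveries


machines = {"A", "B", "C", "D"}
print(hub_transmit(machines, {"A": "frame for B"}))
print(hub_transmit(machines, {"A": "frame for B", "C": "frame for D"}))
```

note that the destination of the frame never appears in the logic: the hub does not look at it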
that said, in the specific case of all 4 machines being connected to a hub and A sending to B and C sending to D at the same time, it does seem a bit unnecessary for a collision to occur because (at least superficially) the two streams of communication travel mutually exclusive paths
relatedly, if A sends a frame to B, it seems wasteful to also send that frame down the wires to C and D because it’s not intended for them
therefore, it would be nice if the hub were smart: if it knew which machines were attached to which ports and only sent a frame to the port on which the destination machine was attached
this is precisely what a switch does
when attached to a switch, if machine A sends a frame to machine B, the switch will only send that frame out the port to which machine B is connected
likewise, if machine C sends a frame to machine D at the exact same time, the switch will happily send that frame out the port to which machine D is connected, also at the exact same time
therefore, using a switch rather than a hub potentially results in many fewer collisions because it is smart about where it sends the frames
this is a good time to review the notion of collision domain, because switches change the game there
recall that a collision domain is a set of machines whose frames could collide with one another
using a hub, all attached machines are part of the same collision domain because any two machines sending at the same time will cause a collision
using a switch, the notion of collision domain… kind of doesn’t make sense
yes, two frames with the same destination could collide
but unrelated streams of frames won’t collide
so the probability of a collision happening with a switch is much, much lower
relatedly, a switch has much higher aggregate bandwidth than a hub
aggregate bandwidth is the sum of the bandwidth of all connected devices
since a hub does nothing special to improve beyond the shared medium situation, the total bandwidth of all connected devices is just the bandwidth of the shared medium
with a switch, however, the situation is different: with a 4-port switch, machines on ports 1 and 2 can send frames to each other at the full speed of the medium and, separately, machines on ports 3 and 4 can also send frames to each other at the full speed of the medium
in the 4-port case, the maximum aggregate bandwidth of a switch is double that of a hub
(it is tempting to assume that the aggregate bandwidth of a switch is always "speed of communication medium times number of ports divided by 2" but the computational hardware inside can’t always support this, so it’s not a reliable rule of thumb)
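the comparison can be written down as arithmetic. a sketch only: the function names are made up, and the switch figure is the optimistic upper bound from the notes (every port paired off with another), which real hardware may not reach

```python
def hub_aggregate_mbps(link_mbps, ports):
    """A hub is one shared medium, so the port count does not matter."""
    return link_mbps


def switch_aggregate_upper_bound_mbps(link_mbps, ports):
    """Optimistic bound: every pair of ports carries one full-speed conversation."""
    return link_mbps * (ports // 2)


print(hub_aggregate_mbps(100, 4))                 # shared medium
print(switch_aggregate_upper_bound_mbps(100, 4))  # two independent conversations
```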
but how does a switch know which port a machine is connected to?
it doesn’t!
at least not until the first time a machine sends a frame
at which point the switch makes a record of the port where that frame originated
the switch has memory in which it remembers this
(when looking at switch specifications, you may see something like "16K MAC Table" which means that the switch can remember 16,384 different MAC addresses and which port each is connected to)
let’s imagine four machines (A, B, C, and D) all connected to a switch
the switch has just been powered on and its MAC table is empty
machine A sends a frame to machine B
the switch can’t know which port machine B is connected to, so it sends the frame out all of the other ports (while recording which port machine A is on)
only when machine B sends a frame of its own does the switch learn which port B is on and become able to make smart decisions about frames destined for B
hold on though: that "16K MAC Table" mentioned above seems a bit ridiculous
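the learning behavior walked through above can be sketched as a small class. this is a minimal sketch under assumptions: the class and method names are made up, ports are numbered from 0, machine names stand in for MAC addresses, and table size limits and entry expiry are ignored

```python
class LearningSwitch:
    """Toy switch that learns which port each address lives on."""

    def __init__(self, num_ports):
        self.num_ports = num_ports
        self.mac_table = {}              # address -> port number

    def receive(self, in_port, src, dst):
        """Handle a frame arriving on in_port; return the list of output ports."""
        self.mac_table[src] = in_port    # remember where the sender is
        if dst in self.mac_table:
            out_port = self.mac_table[dst]
            # Nothing to do if the destination is on the port the frame came from.
            return [] if out_port == in_port else [out_port]
        # Unknown destination: send out every port except the one it arrived on.
        return [p for p in range(self.num_ports) if p != in_port]


switch = LearningSwitch(4)
print(switch.receive(0, "A", "B"))   # B unknown: goes out all other ports
print(switch.receive(1, "B", "A"))   # A was learned from the first frame
print(switch.receive(0, "A", "B"))   # B is now known too
```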
how on earth could one possibly connect 16,384 different machines to a single switch?
answer: one can’t
this only makes sense if we daisy-chain switches
that is, we connect switches to each other, with the intention that all connected machines can send frames to each other
in such a situation, though, there is still the possibility that some links will become super contested (i.e., lots of frames need to be sent along them)
arranging and configuring switches to best accommodate the needs of the machines is the purview of network engineers and architects
let us consider a modification to the architecture
conceptually, we can set aside hubs and switches because they don’t incur fundamental changes to understood behavior: they just make things more convenient/efficient
so we’ll start with the situation where we have four machines connected to the wire
          +---+
          | A |
          +-+-+
            |
        +---+---+
        |       |
+---+   |       |   +---+
| D +---+       +---+ B |
+---+   |       |   +---+
        |       |
        +---+---+
            |
          +-+-+
          | C |
          +---+
then let us imagine that machine B has a second NIC, and that NIC is attached to a different cable, to which several other machines are connected:
          +---+               +---+
          | A |               | E |
          +-+-+               +-+-+
            |                   |
        +---+---+           +---+---+
        |       |           |       |
+---+   |       |   +---+   |       |   +---+
| D +---+       +---+ B |---+       +---+ F |
+---+   |       |   +---+   |       |   +---+
        |       |           |       |
        +---+---+           +---+---+
            |                   |
          +-+-+               +-+-+
          | C |               | G |
          +---+               +---+
now let us imagine that machine D sends out a frame
machines A and C will do the normal thing of checking whether the frame is for them
but machine B could do something different
machine B could choose to pass that frame along to the wire on the right
if it chooses to do so, then machine B is acting as a bridge
(note that, for this to work, machine B must disable the normal "is this frame for me?" logic: it must pass all frames through to the other interface)
there’s another piece of the puzzle that introduces problems here
recall that the Ethernet frame has a 6-octet Destination address field that identifies the intended recipient
and that each NIC is assigned a (hopefully unique) address
so that when a frame arrives, the NIC will compare the destination address in the frame to the address of the NIC; if it doesn’t match, the frame will be silently dropped; otherwise, the frame will be delivered to the upper layers
there is also a special MAC address called the broadcast address (FF:FF:FF:FF:FF:FF) that every NIC is supposed to accept and deliver to the upper layers
no NIC is assigned this address, but if a frame is sent with this address as its destination, every receiving NIC is supposed to pay attention to it
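the receive-side check described above fits in one small function. a sketch only: the function name and the `pass_everything` flag are made up for illustration (the flag models a bridge's interface, which skips the "is this frame for me?" test), and addresses are plain strings

```python
BROADCAST = "FF:FF:FF:FF:FF:FF"


def nic_accepts(nic_address, frame_destination, pass_everything=False):
    """Decide whether a NIC hands a frame up (True) or silently drops it (False)."""
    if pass_everything:
        return True                      # bridge-style: every frame goes through
    destination = frame_destination.upper()
    return destination == BROADCAST or destination == nic_address.upper()


mine = "02:00:00:00:00:0A"               # made-up example address
print(nic_accepts(mine, "02:00:00:00:00:0a"))                        # for me
print(nic_accepts(mine, "02:00:00:00:00:0B"))                        # for someone else
print(nic_accepts(mine, BROADCAST))                                  # broadcast
print(nic_accepts(mine, "02:00:00:00:00:0B", pass_everything=True))  # bridge
```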
the set of machines that will receive each other’s broadcast frames is called a broadcast domain
in the context of hubs and switches, both happily pass along broadcast frames, so every machine attached to the same hub or switch (including those that are daisy-chained) is part of the same broadcast domain
and bridges, of course, also happily pass broadcast frames through—because they pass all frames through
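one way to picture a broadcast domain is as "everything reachable without leaving hubs, switches, and bridges". here is a small sketch of that idea using the two-wire picture with machines A through G. the function name and the `segments` layout are made up, and it assumes that any machine appearing on two segments is acting as a bridge between them

```python
def broadcast_domain(segments, start):
    """Return the set of machines that see a broadcast frame sent by `start`.

    segments: dict mapping a segment name (a wire, hub, or switch) to the
    set of machines attached to it.
    """
    domain = {start}
    changed = True
    while changed:
        changed = False
        for members in segments.values():
            # If the broadcast has reached any machine on this segment,
            # it reaches every machine on this segment.
            if domain & members and not members <= domain:
                domain |= members
                changed = True
    return domain


wires = {
    "left": {"A", "B", "C", "D"},
    "right": {"B", "E", "F", "G"},       # B sits on both wires and bridges them
}
print(sorted(broadcast_domain(wires, "D")))
```

with B bridging, a broadcast from D reaches all seven machines; a segment with no machine in common with the others would form its own separate domain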
another name for a broadcast domain is a network
so every machine in a particular network can a) send frames to every other machine and b) expect that every other machine will see its broadcast frames
and this is where we start to see some limits
we might want to have a huge network—perhaps even a world-spanning network
but a huge network would mean that a single broadcast frame has to be sent to EVERY machine on the network
this very quickly gets unwieldy
(it’s also why a 16K MAC table is quite sufficient for the vast majority of cases)
the alternative is to have many separate networks (broadcast domains) and interconnect them
which gives us an internetwork: a collection of interconnected networks
Things (tm) have evolved such that humanity has developed a single, enormous internetwork
we call it the Internet