Lecture 03: Link Layer I – SLIP and PPP
Goals
- explain the role of the link layer in the abstraction hierarchy
- define MTU
- define octet
- define opaque data and encapsulation and explain how they relate to separation of concerns
- explain the role of the controller, driver, interrupts, and interfaces
last time, we talked about the physical layer and the task it performs: it permits transmitting and receiving bits over a communication medium
we talked about the hardware involved (transmitter, receiver, and medium) and its characteristics (bandwidth, latency)
when transmitting, the PHY is given bits by the kernel, converts them to an analog signal, and sends that signal over the medium
when receiving, the PHY gets an analog signal from the medium, converts it into bits, and hands those bits off to the kernel
we therefore have the ability to send and receive bits: great!
now what?
to motivate where we’re going, let us consider the physical layer communication medium
the assumption is that a communication medium (eg, a wire) will be connected to our host and then elsewhere
this wire could be connected to zero other hosts, one other host, or many other hosts
these possibilities introduce different complications into how we can use the ability to send bits over that wire
the simplest case is that the wire is connected nowhere else
this may seem nonsensical, but the idea will have a very real use, which we’ll get to in due time
let’s not really consider this case now
the next case, where the wire is connected to a single other host, gets more interesting
especially for a half-duplex medium, because the hosts will need to agree who gets to send at what time
one way to achieve this is to break up communications into separate chunks, and each end gets to send a chunk, after which it waits to see if the other side needs to send anything of its own
breaking up data into chunks like this is called framing and the chunks are called frames
and just like the unit of communication at the physical layer is bits, the unit of communication at the link layer is frames
the final case is the most complex: there are many hosts connected to a (shared) communication medium (wire)
the issue of deciding who gets to send when is amplified because now all those hosts might want to send, but only one at a time can put a signal on the wire
another issue that arises here is that one host might want to send data to exactly one other host, and not bother the rest
we therefore need a notion of addressing: that is, having a way to identify a host, so that the data sent can be delivered to the right place
–
and indeed this sums up what the link layer does:
it builds on top of the physical layer (with its ability to send and receive bits)
and introduces the concepts of framing and addressing
it can also do things like error detection and error correction, but those are less universal among link layer implementations
one weird aspect of the link layer is that, frequently, part of it is implemented in hardware (that talks to the PHY, which is a physical layer thing) and another part of it is implemented in software
there are certainly dedicated devices that do all link-layer operations in hardware, but all the computers we use (including phones) implement some of those operations in software
(the link layer is also referred to as "Layer 2")
it is also at the link layer where we start to think in terms of bytes, not bits
and here we encounter a bit of a problem
when I say "byte", you very likely think "8 bits" but that was not always the case
there are some older architectures where a byte is 7 bits and others where it’s 9 bits
the more precise, unambiguous term for an 8-bit quantity is octet
you will see this word pop up a lot in the protocol specification documents we’ll be reading
we want to end up talking about Ethernet, which is among the most popular link layers these days
but I want to start with a couple of simpler link layers, both so you can see some variety and also to build up the concepts more deliberately
the first link layer we’ll look at is SLIP: Serial Line Internet Protocol, which was created to allow IP (a network-layer protocol) to work over serial lines (a physical layer)
it was designed for point-to-point connections—that is, a single medium with exactly two hosts connected
it has since been superseded, but it is historically meaningful, and it will provide our first view of an RFC
here it is: RFC 1055
the first thing to note is that the RFC is identified by number: all of them have one, and often protocols are talked about by the number rather than the name
the next thing to note is that this RFC is non-standard
its purpose is to document a historical protocol rather than to provide a blueprint for implementation
the format is surprisingly readable and it even includes sample code at the end
the details are given in the PROTOCOL section, which is quite short
(note that it uses the term "packet", which we have not encountered yet: for now, consider it equivalent to "frame")
to send a frame, the sending host just dumps the bits on the wire
to indicate that the frame is done, the sending host sends a special END byte, which has value 192
(I find it easier to think in hex, so it’s the byte 0xc0)
this raises a complication: what if one of the bytes in the middle of the frame has value 0xc0?
we escape it, just like we backslash escape special characters in strings in Python, C, or Java (eg, the "\n" in "hello world\n")
in SLIP, the ESC character has value 219 (hex 0xdb) and is used to escape the END byte if it appears in the body of the frame
that is, if the SLIP link layer is handed data to send, it will scan over that data and replace every byte with value 0xc0 with two bytes: 0xdb 0xdc
the receiving host will then scan all the bytes it gets, and if it finds the two-byte sequence 0xdb 0xdc, it will replace them both with the single byte 0xc0
you will note that 0xdb will also need to be escaped: it is replaced with 0xdb 0xdd
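here is a minimal sketch of the escaping rules above, in Python
the byte values come from the RFC; the function names slip_encode and slip_decode are my own, and the decoder assumes it is handed exactly one frame

```python
END = 0xC0      # marks the end of a frame
ESC = 0xDB      # introduces a two-byte escape sequence
ESC_END = 0xDC  # ESC ESC_END stands for a literal 0xc0 in the data
ESC_ESC = 0xDD  # ESC ESC_ESC stands for a literal 0xdb in the data


def slip_encode(payload: bytes) -> bytes:
    """Escape END and ESC bytes in the payload, then append the END byte."""
    out = bytearray()
    for b in payload:
        if b == END:
            out += bytes([ESC, ESC_END])
        elif b == ESC:
            out += bytes([ESC, ESC_ESC])
        else:
            out.append(b)
    out.append(END)
    return bytes(out)


def slip_decode(frame: bytes) -> bytes:
    """Undo the escaping, stopping at the first unescaped END byte."""
    out = bytearray()
    it = iter(frame)
    for b in it:
        if b == END:
            break
        if b == ESC:
            nxt = next(it, None)
            if nxt == ESC_END:
                out.append(END)
            elif nxt == ESC_ESC:
                out.append(ESC)
            else:
                raise ValueError("malformed escape sequence")
        else:
            out.append(b)
    return bytes(out)


data = bytes([0x01, 0xC0, 0xDB, 0x02])
wire = slip_encode(data)
print(wire.hex(" "))
print(slip_decode(wire) == data)
```

note that the encoded frame can be up to twice as long as the original data, plus one byte for END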
another new idea here is that of maximum frame size
the RFC says that implementations should not produce frames (which now they call "datagrams") larger than 1006 bytes (by which they mean octets)
while it doesn’t use the term, this parameter is the MTU: the "maximum transmission unit"
most link layers specify one
when we get to Ethernet, we’ll talk about why its MTU was chosen
and when we reach the higher layers, we’ll have to deal with its implications
the RFC also notes limitations of SLIP
since we haven’t talked about IP yet, the addressing one is difficult to internalize, so don’t worry about it
type identification is interesting, though: SLIP has no way to tell the receiver what kind of data it is delivering
the assumption is that it will always be IP data, but this basically kills any chance for SLIP to be multipurpose
SLIP also doesn’t perform error detection or correction, which can be a bummer over the kinds of physical media it has been used for
nor does it have any support for compression
on to PPP, which fixes many of these problems
we’re not going to focus on how, because PPP is surprisingly complex and flexible and it would take a while
we’re instead going to look at facets of PPP which will show up in other protocols as we go up the stack
PPP is defined in RFC 1661
this is a very long document, which I do not expect or even want you to read
I want instead to look at RFC 1662, which describes the structure of frames in PPP
first, look at section 1.1, which talks about wording
this is important, because these words can have varying implications when used in informal human-to-human discussion, but they have very precise meanings in RFCs
for an implementation to be RFC-compliant it is required to implement all functionalities noted with MUST
and to not implement any functionalities labeled MUST NOT
whereas SHOULD is only recommended
and MAY is purely optional
these terms and their meanings are used in all RFCs
down in section 3.1, we see the structure of a PPP frame
it starts with a very specific sequence of bits: 0111 1110 (hex 0x7e)
then a hard-coded address: 1111 1111 (0xff)
then a hard-coded control value: 0000 0011 (0x03)
then a field called "Protocol" that could be either 8 bits or 16 bits long (more on this in a moment)
then the Information being transmitted, of variable length (an integer number of octets, indicated by the asterisk)
then a variable amount of Padding (also an integer number of octets)
then a 16- or 32-bit Frame Check Sequence (FCS)
and finally another of the flag octets: 0111 1110 (0x7e)
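to make the layout concrete, here is a rough Python sketch that lays the fields out in the order just listed
it makes several simplifying assumptions: a 16-bit Protocol field, a 16-bit FCS, no Padding, and none of the escaping that RFC 1662 applies before the octets hit the wire
the function name build_frame is hypothetical, and the FCS value is just passed in as a placeholder rather than computed

```python
FLAG = 0x7E     # 0111 1110, opens and closes every frame
ADDRESS = 0xFF  # 1111 1111, hard-coded
CONTROL = 0x03  # 0000 0011, hard-coded


def build_frame(protocol: int, information: bytes, fcs: int) -> bytes:
    """Concatenate the fields of a PPP frame (simplified, see assumptions above)."""
    return (
        bytes([FLAG, ADDRESS, CONTROL])
        + protocol.to_bytes(2, "big")  # Protocol: most significant octet first
        + information                  # the Information field
        + fcs.to_bytes(2, "little")    # FCS: least significant octet first
        + bytes([FLAG])
    )


# 0x0021 is the Protocol value for IP; 0x1234 is a made-up FCS for illustration
frame = build_frame(0x0021, b"hi", 0x1234)
print(frame.hex(" "))
```

the two different byte orders are not a typo: RFC 1662 sends the Protocol field most significant octet first and the FCS least significant octet first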
PPP frames are intended to carry arbitrary data
but the recipient will still want to know what kind of data it’s getting, because one type of data might be handled one way, while another type might be handled a different way
the Protocol field is used to identify the kind of data in the Information field, so that the recipient can decide the right thing to do with it
some values for the Protocol field are defined in section 2 of RFC 1661, though more have been added since
note that PPP doesn’t really care about the meaning of the data it is transmitting: it defines no limitations or assumptions about the octets in the Information field
nor does PPP make any decisions based on the Information field or have to have any understanding of its content
thus the Information field is opaque to PPP
another term for this field is the payload of the frame, referring to the data being delivered, as opposed to the other octets that describe how to perform the delivery
(in that regard, we can think of the payload as data and everything outside the payload as metadata)
we also say that the payload is encapsulated inside a PPP frame
we will see these ideas return several times in the coming weeks
the Frame Check Sequence is our first look at error detection
when preparing the frame, the sender reads in all the octets (address, control, protocol, information, and padding), performs a mathematical operation on them, and out pops a 16- or 32-bit value, which is put into the frame following the padding
then, when the recipient gets the frame, they perform the exact same algorithm on the exact same bits and check to see if the result matches the FCS value in the frame
if there is any difference, this indicates that some of the octets have been corrupted
when encountering such corruption, many link layers will silently drop the frame, and leave it up to the higher layers to notice a frame went missing
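as a sketch of what that "mathematical operation" can look like, here is the 16-bit case in Python
the FCS-16 in RFC 1662 is a cyclic redundancy check; the RFC's own sample code uses a lookup table, while the bit-at-a-time loop below is my compact rendering of the same calculation
the names fcs16 and frame_is_intact are hypothetical

```python
def fcs16(octets: bytes) -> int:
    """Compute a 16-bit frame check sequence over the given octets."""
    fcs = 0xFFFF                          # initial value
    for octet in octets:
        fcs ^= octet
        for _ in range(8):                # process one bit at a time
            if fcs & 1:
                fcs = (fcs >> 1) ^ 0x8408 # generator polynomial, bit-reversed
            else:
                fcs >>= 1
    return fcs ^ 0xFFFF                   # complement before putting it in the frame


def frame_is_intact(octets: bytes, received_fcs: int) -> bool:
    """Receiver side: redo the calculation and compare with the FCS in the frame."""
    return fcs16(octets) == received_fcs


# address, control, protocol (0x0021), then the information field
covered = bytes([0xFF, 0x03, 0x00, 0x21]) + b"hello"
fcs = fcs16(covered)

corrupted = bytearray(covered)
corrupted[5] ^= 0x01                      # flip a single bit in transit

print(frame_is_intact(covered, fcs))
print(frame_is_intact(bytes(corrupted), fcs))
```

the first check succeeds and the second fails, since a CRC catches every single-bit error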
stepping back a bit, the frame structure is "several metadata octets at the beginning, the data (payload), then some octets at the end"
stuff that precedes the payload is called a header
and stuff that follows the payload is called a footer
PPP frames have both; some protocols only have one or the other (if so, in my experience, it’s pretty much always a header)
now let’s step back a little further and consider the implications
PPP is a link-layer implementation
it will be handed a payload from the next layer up
compose a frame containing the payload
and then pass the bits to the physical layer (PHY) for sending over the medium
on the other end, the receiver’s PHY will get the bits, and pass them up to the link layer
which will then recognize a frame
it will extract the payload and, using the Protocol field, pass the payload on to the specific higher-layer software that deals with payloads of that type
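the receive path can be sketched in Python as a lookup from Protocol value to handler function
this assumes the flag octets have already been removed, the FCS has already been checked, the Protocol field is 16 bits, and the FCS is 16 bits
the handler functions and the name deliver are made up for illustration; 0x0021 (IP) and 0xc021 (Link Control Protocol) are real Protocol values

```python
def handle_ip(payload: bytes) -> str:
    return f"IP handler got {len(payload)} octets"


def handle_lcp(payload: bytes) -> str:
    return f"LCP handler got {len(payload)} octets"


# maps a Protocol field value to the higher-layer code that understands it
HANDLERS = {
    0x0021: handle_ip,
    0xC021: handle_lcp,
}


def deliver(frame: bytes) -> str:
    """frame = address, control, protocol (2 octets), information, FCS (2 octets)."""
    protocol = int.from_bytes(frame[2:4], "big")
    payload = frame[4:-2]  # opaque to the link layer: never inspected here
    handler = HANDLERS.get(protocol)
    if handler is None:
        return f"dropped: unknown protocol 0x{protocol:04x}"
    return handler(payload)


print(deliver(bytes([0xFF, 0x03, 0x00, 0x21]) + b"abc" + bytes(2)))
print(deliver(bytes([0xFF, 0x03, 0x12, 0x34]) + b"abc" + bytes(2)))
```

notice that deliver never looks inside the payload: it only reads the header to decide who should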
the first thing we need to clear up is what link-layer hardware is involved in this process
(remember I said that part of the link layer is implemented in hardware)
the chip that takes the frame and hands it off to the PHY is called a controller
so the software part of the link layer hands the frame to the controller
and the controller hands the constituent bits to the PHY
that software is called a driver, and is part of the kernel
there will be different drivers for different controller chips
and the notion of driver and controller is relevant well outside networking: USB, audio hardware, disk drives, etc all operate using drivers and controllers
one word you will also hear in this context is interface
this is kind of synonymous with the PHY/controller combo
it usually shows up more in the upper (software) layers of abstraction
the second thing to clear up is how the receiving controller tells the driver that a new frame has arrived and needs to be handled
this is complicated, because the CPU is ostensibly executing instructions for a running process
and the controller wants the CPU to deal with the new frame
what happens is that the controller interrupts the CPU: it tells the CPU to stop doing what it’s doing and handle the new frame instead
it does so by asserting (putting a signal on) a wire running from the controller to the CPU—this is called "raising an interrupt"
when the CPU is about to start executing a new instruction, it checks this wire and, if it’s on, the CPU will execute a different instruction instead: specifically, the first instruction of the function that handles interrupts
that function will identify the correct driver and execute its "a new frame has arrived" function
when that function is done, the process originally running on the CPU will be resumed
(it is, unsurprisingly, more nuanced and complicated than this, but that’s the gist)
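the check-the-wire-before-each-instruction idea can be mimicked with a toy simulation in Python
this is only a model of the control flow described above, not how real hardware or a kernel is programmed; every name in it (run, frame_arrives_before, driver_handler) is invented for illustration

```python
def run(program, frame_arrives_before, driver_handler):
    """Execute a list of 'instructions', servicing interrupts between them.

    frame_arrives_before is a set of step numbers at which the (simulated)
    controller asserts the interrupt wire.
    """
    log = []
    interrupt_line = False
    for step, instruction in enumerate(program):
        if step in frame_arrives_before:
            interrupt_line = True        # controller raises an interrupt
        if interrupt_line:               # CPU checks the wire before each instruction
            interrupt_line = False
            log.append(driver_handler()) # run the driver's "new frame" function
        log.append(instruction)          # then resume the original process
    return log


def new_frame_arrived():
    return "driver: handling new frame"


trace = run(["load", "add", "store"], {1}, new_frame_arrived)
for line in trace:
    print(line)
```

the trace shows the driver's function running between "load" and "add", after which the original instructions carry on as if nothing happened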