Lecture 15: TCP in action

Goals


last time, we started looking at TCP and saw the high-level idea of sequence numbers that allow both ends of a connection to know whether any data has gone missing

(an additional use of sequence numbers is to correctly rearrange data that has arrived out of order)

our goal today is to see how this all works in reality

to do so, we will need a program that can create TCP connections: for that purpose, we will use netcat, which is usually available as the program nc

recall that every transmission is going to contain a sequence number

and the recipient will send back an acknowledgement that notes the sequence number

we’ll start by getting a general sense from the RFC and then look at actual packets on the wire


RFC 793: Transmission Control Protocol

the background material in the RFC is worth reading because it provides a lot of interesting and helpful context

but the important part is the TCP header format diagram on page 15

just like all the other protocols we’ve looked at, TCP uses a fixed(ish) size header that precedes its payload

just like UDP datagrams, a TCP segment will be encapsulated inside an IP packet

in looking at the TCP header, we first see 16-bit source and destination port numbers, which have the same use/meaning as in UDP

(though port numbers are not interchangeable between the two protocols: a TCP segment addressed to port 22 has NO RELATION to a UDP datagram addressed to port 22)

following these, we have a 32-bit sequence number and a 32-bit acknowledgement number, which look most promising

then a data offset (which is the equivalent of IP’s header length field)

and then several flags, including ACK and SYN, which are short for Acknowledgement and Synchronize, respectively

when a process wants to communicate its initial sequence number, it will send a segment with the SYN flag set and the sequence number in the sequence number field

when a process wants to acknowledge receipt of data having a particular sequence number, it will send a segment with the ACK flag set and the sequence number in the acknowledgement number field
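
to make the header layout concrete, here is a rough Python sketch (mine, not from the RFC) that unpacks the fixed 20-octet part of a TCP header; parse_tcp_header and the hand-built example segment are made up for illustration, and any TCP options after the fixed part are ignored

```python
import struct

# fixed 20-octet part of the TCP header (RFC 793, figure 3), network byte order:
# source port, destination port, sequence number, acknowledgement number,
# data offset + reserved bits + flags (16 bits total), window, checksum, urgent pointer
TCP_FIXED_HEADER = struct.Struct("!HHIIHHHH")


def parse_tcp_header(segment: bytes) -> dict:
    """Hypothetical helper: pull out the fields discussed in these notes."""
    (src_port, dst_port, seq_num, ack_num,
     offset_and_flags, window, checksum, urgent) = TCP_FIXED_HEADER.unpack_from(segment)
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq_num": seq_num,
        "ack_num": ack_num,
        # the data offset is the top 4 bits and counts 32-bit words
        "header_len": (offset_and_flags >> 12) * 4,
        "syn": bool(offset_and_flags & 0x002),
        "ack": bool(offset_and_flags & 0x010),
        "window": window,
    }


# hand-built SYN segment: every value here is chosen for illustration
# (data offset 5 words, only the SYN bit set, checksum left at zero)
example_syn = TCP_FIXED_HEADER.pack(12483, 4000, 3004226853, 0,
                                    (5 << 12) | 0x002, 65535, 0, 0)
print(parse_tcp_header(example_syn))
```

a real segment would be the payload of an IP packet, so you would need to strip the IP header first before handing the bytes to something like this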


if we imagine the initial sequence numbers in both directions are zero, we can re-write the ladder diagram above like so:

  Process A |                                   | Process B
            |                                   |
            |                                   | accept(2)
            |                                   |
            |  SYN, seq_num=0                   |
  connect() | --------------------------------> |
            |                                   |
            |  SYN/ACK, seq_num=0, ack_num=1    |
            | <-------------------------------- |
            |                                   |
            | ACK, ack_num=1                    |
            | --------------------------------> |
            |                                   |

the "ack_num=1" part may look a bit strange

the way to think about the acknowledgement number is that it indicates the sequence number expected next

so if ack_num is 1, that means the recipient has received all data up to and including sequence number 0 (which, in this case, is the SYN)
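
the "expected next" rule can be written as a tiny calculation; this is a sketch under my own naming (next_expected is not a real API), relying on the fact that a SYN occupies one sequence number even though it carries no data (the same is true of the FIN flag, which shows up at teardown)

```python
SEQ_SPACE = 2 ** 32  # sequence numbers are 32 bits wide, so they wrap around


def next_expected(seq_num: int, payload_len: int = 0,
                  syn: bool = False, fin: bool = False) -> int:
    """Acknowledgement number a receiver would send for one received segment."""
    # SYN and FIN each consume one sequence number, like one octet of data
    return (seq_num + payload_len + int(syn) + int(fin)) % SEQ_SPACE


# B receives A's SYN carrying seq_num=0 and acknowledges with ack_num=1
print(next_expected(0, syn=True))
```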


before we look at the packets in detail, a word on the setup

since Process A is playing the active role in establishing the connection, we’re going to call it the client

since Process B is playing the passive role in establishing the connection, we’re going to call it the server

remember, though, that once the connection is established, either end can send data to the other: from the perspective of TCP, there is no longer any notion of who started the connection

for the purposes of the following examples, the client will be a netcat process running on FreeBSD

and the server will be a netcat process running on Linux

annoyingly, the two versions of netcat use slightly different syntax (I could avoid this by installing gnu-netcat on FreeBSD, but I prefer using the native version, so that’s what I’m going to inflict on you—plus, being able to deal with ambiguity is healthy)

in the following examples, shell commands will be prefixed with "server$" (Linux) or "client$" (FreeBSD) to indicate which machine/OS they’re running on

in the Wireshark captures below, the server (Linux) has IP address 10.3.72.235 and the client (FreeBSD) has IP address 10.0.2.15


now the reality:

first, we start Wireshark, capturing on FreeBSD interface em0 (which is its interface to the outside world, and the interface over which the connection to Linux will transmit its packets)

then, we run netcat such that it behaves as Process B in the above diagram, listening for incoming connection requests on TCP port 4000:

server$ nc -l -p 4000

(remember that the above command works only for the Linux version of netcat; to do the same thing on FreeBSD, drop the "-p")

finally, in a separate terminal, we ask a client to connect to that server process:

client$ nc 10.3.72.235 4000
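
if you want to see the same two roles without netcat, here is a rough Python sketch that plays both ends over the loopback interface; it only mimics the connection setup (not netcat's copying of stdin and stdout), handshake_on_loopback is a made-up name, and port 0 is used so the kernel picks a free port instead of 4000

```python
import socket
import threading


def handshake_on_loopback() -> dict:
    # passive side: bind + listen, roughly what "nc -l" does
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 asks the kernel to choose a free port
    listener.listen(1)
    server_port = listener.getsockname()[1]

    result = {"server_port": server_port}

    def serve():
        # accept() returns once a three-step handshake has completed
        conn, peer = listener.accept()
        result["client_port_seen_by_server"] = peer[1]
        conn.close()

    worker = threading.Thread(target=serve)
    worker.start()

    # active side: connect() is what causes the SYN to be sent
    client = socket.create_connection(("127.0.0.1", server_port))
    result["client_port"] = client.getsockname()[1]  # chosen by the kernel

    worker.join()
    client.close()
    listener.close()
    return result


print(handshake_on_loopback())
```

capturing on the loopback interface while this runs should show the same SYN, SYN/ACK, ACK pattern, though I have only described that here, not captured it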

the result is this sequence of packets: tcp-three-step-handshake.pcapng


first, there are three packets! this is encouraging, because we expect a three-step handshake

the Source and Destination columns tell us that the packets are being sent in the directions we expect: first from the client to the server, then the server to the client, and finally again from the client to the server

in the Info column, on the left end, we first see that the port numbers line up as well: the port on the server is indeed 4000 and the port on the client is 12483 (this was chosen random-ish-ly by the kernel because it doesn’t matter which port the client uses, so long as no other process uses the same one)

also in the Info column, we see "[SYN]" for the first packet, "[SYN, ACK]" for the second, and "[ACK]" for the third: this represents the flags

if we click on the first packet and expand the "Transmission Control Protocol" part of the packet dissection display, and then expand the "Flags" part, we can cross-reference the bit vector with the RFC

same for the other two packets
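
the cross-referencing can also be done mechanically; this sketch maps the flag bits from the RFC 793 header diagram to names (flag_names is a made-up helper, later RFCs define additional flag bits that are ignored here, and the low-bit-first ordering is my choice so the output reads like the Info column strings quoted above)

```python
# the six flag bits defined by RFC 793, listed from the lowest-order bit upward
FLAG_BITS = [
    ("FIN", 0x01),
    ("SYN", 0x02),
    ("RST", 0x04),
    ("PSH", 0x08),
    ("ACK", 0x10),
    ("URG", 0x20),
]


def flag_names(flags: int) -> list:
    """Return the names of the flags whose bits are set in `flags`."""
    return [name for name, bit in FLAG_BITS if flags & bit]


print(flag_names(0x02))  # first packet of the handshake
print(flag_names(0x12))  # second packet
print(flag_names(0x10))  # third packet
```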


where things get interesting is the sequence numbers

click on the first packet, expand the TCP part, and we see a few lines that mention sequence number:

Sequence number: 0    (relative sequence number)
Sequence number (raw): 3004226853
[Next Sequence Number: 1    (relative sequence number)]

the first line makes sense: it matches our understanding that, since this is the initial segment of the connection, it should have sequence number zero

but if we click on that line, then the bytes b3 10 dd 25 are highlighted in the hex representation, which clearly indicates that the actual content of the field in the TCP header is not the value zero

this is because the initial sequence number isn’t actually zero: it’s random (it isn’t truly "random", but at this point, for our needs, it may as well be; to be elaborated upon later)

the real initial sequence number is indeed b3 10 dd 25 which, when interpreted as an unsigned integer, has value 3004226853, which is what Wireshark is giving in the "raw" line above

the "relative sequence number" line is calculated and shown because it’s often helpful to have a sense of where in the lifetime of a connection a given segment resides, and it’s annoying to have to constantly subtract the (random) initial sequence number

(when working with data, the term "raw" is used to describe data that is shown exactly as it arrived; likewise, the term "cooked" is used to describe data that has been modified in some way, to be more clear/relevant/helpful/whathaveyou—in this case, the "relative sequence number" is cooked data)
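
the raw-to-cooked conversion is easy to reproduce by hand; this sketch is my guess at what the arithmetic amounts to (I have not checked Wireshark's actual code), and relative_seq is a made-up name

```python
SEQ_SPACE = 2 ** 32  # 32-bit sequence numbers wrap around


def relative_seq(raw: int, initial: int) -> int:
    """Subtract the initial sequence number, allowing for wraparound."""
    return (raw - initial) % SEQ_SPACE


raw_field = bytes.fromhex("b310dd25")    # the four octets highlighted in the hex view
isn = int.from_bytes(raw_field, "big")   # TCP header fields are big-endian

print(isn)                               # the "raw" value
print(relative_seq(isn, isn))            # the relative value for the SYN itself
print(relative_seq(3004226854, isn))     # one past the SYN
```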


still in the first packet, the acknowledgement number is zero (both raw and cooked):

Acknowledgement number: 0
Acknowledgement number (raw): 0

this makes sense because, when it creates that first packet, the client doesn’t know the sequence number of the data it should be acknowledging

and also the ACK flag on this packet is not set, so this field should be ignored anyway


on to the second segment (the SYN/ACK), we see that it has a much different sequence number:

Sequence number (raw): 1492800001

this makes sense: recall that a connection is bi-directional and each direction maintains its own notion of sequence numbers

the (raw) acknowledgement number, however, looks familiar:

Acknowledgement number: 1
Acknowledgement number (raw): 3004226854

it is exactly one greater than the sequence number sent in the first packet

this tracks with my claim about acknowledgement numbers above: the acknowledgement number is one greater than the sequence number of the data being acknowledged


finally, in the third packet, only the ACK flag is set and the acknowledgement number is what we expect: one greater than the sequence number of the just-received packet:

Acknowledgement number: 1
Acknowledgement number (raw): 1492800002

so if we were to re-draw the ladder diagram from before with the real sequence numbers used in this particular three-step handshake, it would look like this:

  Process A |                                                       | Process B
            |                                                       |
            |                                                       | accept(2)
            |                                                       |
            |  SYN, seq_num=3004226853                              |
  connect() | ----------------------------------------------------> |
            |                                                       |
            |  SYN/ACK, seq_num=1492800001, ack_num=3004226854      |
            | <---------------------------------------------------- |
            |                                                       |
            | ACK, ack_num=1492800002                               |
            | ----------------------------------------------------> |
            |                                                       |

that’s the three-step handshake in action

the subtle part to keep in mind is that the acknowledgement number identifies the data expected next

so Process B expects to receive data with sequence number 3004226854

and Process A expects to receive data with sequence number 1492800002


now let’s send some data

in the terminal running the client (Process A, FreeBSD), I type "hello there" and press Enter

in Wireshark, we see two more packets (capture file here: tcp-data-a-to-b.pcapng)

this makes sense: one packet containing the data sent from Process A to Process B and a second packet containing the corresponding ACK sent back from B to A

before we look at sequence numbers and ack numbers, first note the flags:

the first packet of this pair has both PSH and ACK flags set

the latter kind of makes sense: the space for the acknowledgement number is always set aside in the TCP header, so it might as well contain a valid acknowledgement number (this can be useful if an ACK gets lost in transit, obviating the need to re-send it)


the PSH flag is new though

consider what Process B is likely doing at this point: it has probably called recv(2) to retrieve data sent to it by Process A and that syscall is blocking until it has data to deliver to the process

that call will likely look something like this:

bytes_read = recv(fd, buffer, buffer_length, 0);

what if Process A sends way fewer bytes than buffer_length? when the kernel underneath Process B gets that data over the wire, should it wait for more data or return right away?

the PSH flag says "when you get this, deliver it to the process immediately", the effect being that recv(2) will return immediately after this segment arrives

this is why we see "hello there" printed out right away on the terminal running the server
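
the visible effect (recv returning with far fewer bytes than the buffer could hold) can be observed from user space; this loopback sketch does not show the PSH flag itself, which needs a packet capture, and first_recv_size is a made-up name

```python
import socket


def first_recv_size(message: bytes, buffer_length: int = 4096) -> int:
    """Send `message` over a loopback connection and report how many
    bytes the receiver's first recv() call returns."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # kernel picks a free port
    listener.listen(1)
    sender = socket.create_connection(listener.getsockname())
    receiver, _ = listener.accept()
    try:
        sender.sendall(message)
        # blocks until some data is available, then returns what is there;
        # it does not wait for buffer_length bytes to accumulate
        data = receiver.recv(buffer_length)
        return len(data)
    finally:
        sender.close()
        receiver.close()
        listener.close()


print(first_recv_size(b"hello there\n"))
```

the return value can in principle be smaller than the message length, which is why real code calls recv in a loop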


now, sequence numbers

for the first packet, which contains the data:

Sequence number: 1    (relative sequence number)
Sequence number (raw): 3004226854
[Next Sequence Number: 13    (relative sequence number)]

let’s look at the last line first: it says the next sequence number is 13

that’s odd, we only sent 1 segment

but that segment contained 12 octets of data: the string "hello", a space, the string "there", and a newline

lesson: sequence numbers refer to octets, not segments

(this is helpful because, especially way back in the day before all the hardware was super standardized and shockingly reliable, as it is now, a single segment could get divided into parts in transit, each part transmitted in a separate IP packet, and therefore need to be reassembled by the recipient)

so, to translate the above Wireshark information into English, I would say: "this segment contains data starting with (relative) octet #1" and "since the segment contains 12 octets of data, the next segment sent in this direction will have (relative) sequence number 13"

(note that the "Next Sequence Number" information is not sent in the TCP packet: Wireshark infers it from the rest of the information)
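
because sequence numbers count octets, a receiver can put data back in order no matter how it was split up; here is a deliberately simplified sketch (no wraparound, no duplicates or overlaps) using a hypothetical split of "hello there\n" into two segments, which is not what happened in the capture

```python
def reassemble(segments, first_seq):
    """segments: iterable of (seq_num, payload) pairs in any arrival order.

    Returns the contiguous octets starting at first_seq, plus the sequence
    number expected next (in other words, the acknowledgement number).
    """
    stream = b""
    expected = first_seq
    for seq_num, payload in sorted(segments):
        if seq_num != expected:
            break  # a gap: the octets starting at `expected` have not arrived
        stream += payload
        expected += len(payload)
    return stream, expected


# the second half arrives before the first half (relative sequence numbers)
arrived = [(7, b"there\n"), (1, b"hello ")]
print(reassemble(arrived, first_seq=1))
```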


the acknowledgement number in this segment is identical to that sent by the final ACK of the three-step handshake:

Acknowledgement number: 1
Acknowledgement number (raw): 1492800002

this makes sense because no data has been sent from Process B to Process A yet


now armed with the understanding of what sequence numbers mean, the next packet isn’t surprising:

Acknowledgement number: 13
Acknowledgement number (raw): 3004226866

it says "I have received all octets up to #12 and I am now ready to receive #13"


and here’s the ladder diagram for those two packets:

  Process A |                                                       | Process B
            |                                                       |
            |  data="hello there\n"                                 |
            |  PSH/ACK, seq_num=3004226854, ack_num=1492800002      |
            | ----------------------------------------------------> |
            |                                                       |
            |  ACK, seq_num=1492800002, ack_num=3004226866          |
            | <---------------------------------------------------- |
            |                                                       |

the next pair of packets demonstrates the same thing in the opposite direction

because the meme demands it, in the terminal window running Process B, I type "general kenobi" and press Enter

Wireshark capture: tcp-data-b-to-a.pcapng

the corresponding ladder diagram follows

I leave it to you to correlate the Wireshark trace to the diagram

  Process A |                                                       | Process B
            |                                                       |
            |  data="general kenobi\n"                              |
            |  PSH/ACK, seq_num=1492800002, ack_num=3004226866      |
            | <---------------------------------------------------- |
            |                                                       |
            |  ACK, seq_num=3004226866, ack_num=1492800017          |
            | ----------------------------------------------------> |
            |                                                       |

further transmissions follow the same pattern
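
as a cross-check on the ladder diagrams, here is a sketch that replays the whole exchange so far with the bookkeeping each end would need; the Endpoint class and transmit function are made up for illustration, and loss, retransmission, and wraparound are all ignored

```python
class Endpoint:
    """Hypothetical per-connection bookkeeping for one end."""

    def __init__(self, initial_seq):
        self.next_seq = initial_seq  # sequence number of the next octet to send
        self.next_ack = None         # next octet expected from the peer

    def send(self, payload=b"", syn=False):
        # a SYN occupies one sequence number; a bare ACK occupies none
        seq_len = len(payload) + int(syn)
        segment = {"seq_num": self.next_seq, "ack_num": self.next_ack,
                   "seq_len": seq_len}
        self.next_seq += seq_len
        return segment

    def receive(self, segment):
        self.next_ack = segment["seq_num"] + segment["seq_len"]


a = Endpoint(3004226853)  # initial sequence numbers taken from the capture
b = Endpoint(1492800001)
trace = []


def transmit(sender, receiver, **kwargs):
    segment = sender.send(**kwargs)
    receiver.receive(segment)
    trace.append((segment["seq_num"], segment["ack_num"]))


transmit(a, b, syn=True)                      # SYN
transmit(b, a, syn=True)                      # SYN/ACK
transmit(a, b)                                # ACK
transmit(a, b, payload=b"hello there\n")      # PSH/ACK with data
transmit(b, a)                                # ACK
transmit(b, a, payload=b"general kenobi\n")   # PSH/ACK with data
transmit(a, b)                                # ACK

for seq_num, ack_num in trace:
    print(seq_num, ack_num)
```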


now I press Control-C in the terminal running Process A

four packets appear: tcp-teardown.pcapng

Process A sends a FIN packet to Process B, which gets ACK’d

and then Process B sends a separate FIN packet to Process A, which also gets ACK’d

this is the TCP four-step teardown

the reason it’s four steps is that, technically, one could choose to half-close a connection: that is, a process could send a FIN to indicate that it will no longer be sending data, while remaining open to receiving data
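
a half-close is available to programs through the shutdown call; this loopback sketch (half_close_demo is a made-up name) has one end stop sending and then still receive a reply from the other end

```python
import socket


def read_until_eof(sock) -> bytes:
    """Collect data until recv() returns empty, meaning the peer sent a FIN."""
    data = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            return data
        data += chunk


def half_close_demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # kernel picks a free port
    listener.listen(1)
    a = socket.create_connection(listener.getsockname())
    b, _ = listener.accept()
    try:
        a.sendall(b"hello there\n")
        a.shutdown(socket.SHUT_WR)   # A is done sending; A can still receive
        received_by_b = read_until_eof(b)

        # the B-to-A direction is still open, so B can reply after A's FIN
        b.sendall(b"general kenobi\n")
        b.shutdown(socket.SHUT_WR)
        received_by_a = read_until_eof(a)
        return received_by_b, received_by_a
    finally:
        a.close()
        b.close()
        listener.close()


print(half_close_demo())
```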


you might ask yourself how software can keep track of all of this

stuff like "I’ve sent a SYN, now I’m waiting for a SYN/ACK" or "I’ve sent a SYN/ACK and now I’m waiting for an ACK"

you might also recall our previous discussion of state machines

in fact, the RFC contains a state machine that encapsulates all this information: it’s on page 23

this state machine describes the states pertaining to a single connection

the square boxes represent states

the lines between boxes represent transitions

these transitions have two-part labels: a top part and a bottom part, separated by a line

for example, the transition from the LISTEN state to the SYN RCVD state has this label:

  rcv SYN
------------
snd SYN, ACK

these labels should all be interpreted in the same way:

the thing above the line is the event that happened to cause the transition to take place

the thing below the line is the action that should be taken as the transition is being made

so the way to read the above transition label is "if the connection is currently in the LISTEN state and a SYN packet is received, we should send a SYN/ACK packet and move to state SYN RCVD"


every connection begins in the CLOSED state

from there, we either passively open (ie, call listen(2)) or actively open (ie, call connect(2))

you can trace the state transitions from there and see that they mirror the exchanges of packets in the preceding ladder diagrams

once in the ESTAB state (which is short for "established"), either side may send segments containing data to the other, which will get acknowledged
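
a state machine like this translates naturally into a lookup table; the sketch below encodes only a subset of the transitions in the RFC 793 diagram (the ones exercised by the captures in these notes), the event strings are my paraphrases of the diagram's labels, and run is a made-up helper

```python
# (current state, event) -> (action, next state)
# subset of the RFC 793 connection state diagram; None means "no action"
TRANSITIONS = {
    ("CLOSED",     "passive OPEN"):   ("create TCB",          "LISTEN"),
    ("CLOSED",     "active OPEN"):    ("create TCB, snd SYN", "SYN SENT"),
    ("LISTEN",     "rcv SYN"):        ("snd SYN, ACK",        "SYN RCVD"),
    ("SYN SENT",   "rcv SYN, ACK"):   ("snd ACK",             "ESTAB"),
    ("SYN RCVD",   "rcv ACK of SYN"): (None,                  "ESTAB"),
    ("ESTAB",      "CLOSE"):          ("snd FIN",             "FIN WAIT-1"),
    ("ESTAB",      "rcv FIN"):        ("snd ACK",             "CLOSE WAIT"),
    ("FIN WAIT-1", "rcv ACK of FIN"): (None,                  "FIN WAIT-2"),
    ("FIN WAIT-2", "rcv FIN"):        ("snd ACK",             "TIME WAIT"),
    ("CLOSE WAIT", "CLOSE"):          ("snd FIN",             "LAST-ACK"),
    ("LAST-ACK",   "rcv ACK of FIN"): (None,                  "CLOSED"),
    ("TIME WAIT",  "timeout"):        ("delete TCB",          "CLOSED"),
}


def run(state, events):
    """Feed events through the table; return the final state and actions taken."""
    actions = []
    for event in events:
        action, state = TRANSITIONS[(state, event)]
        if action:
            actions.append(action)
    return state, actions


# Process A (the client) over the lifetime of the connection
print(run("CLOSED", ["active OPEN", "rcv SYN, ACK", "CLOSE",
                     "rcv ACK of FIN", "rcv FIN", "timeout"]))
# Process B (the server)
print(run("CLOSED", ["passive OPEN", "rcv SYN", "rcv ACK of SYN",
                     "rcv FIN", "CLOSE", "rcv ACK of FIN"]))
```

an event that has no entry for the current state raises KeyError here; a real implementation has to decide what to do in that case, which the rest of the RFC spells out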


here’s the Wireshark capture file for the full connection: tcp-full-connection.pcapng
