Prelab 3

Due: Beginning of lab on 2020-02-26

For the lab this week, we’re going to be experimenting with basic cryptography, which is the study of techniques for securely transmitting and receiving information. For many cryptographic systems, we need to support two related functions: encryption, where we take a message (the plaintext) and transform it to produce the ciphertext; and decryption, where we take a ciphertext and recover the original plaintext. To get you ready for lab, we’re going to introduce some key concepts here and experiment with some of the encryption schemes by hand.

There are 6 questions below which you will need to answer on Gradescope by the due date at the top of the page.

Many, many encryption schemes have been developed over time. In this lab, we’ll be focusing on what are called symmetric-key encryption schemes. If there is someone you’d like to communicate with securely (i.e., without anyone else being able to read your messages), the two of you decide beforehand on a secret key that you both know. Then, when you want to send a message, you encrypt it using the secret key and send the encrypted message to the other person. The other person can then use that same key to decrypt the message. This is called a symmetric-key system because the same key is used for both encryption and decryption. It is sometimes also called a pre-shared key system because both parties must have the key before it can work.

Caesar’s scheme

One of the simpler and more famous encryption schemes is credited to Julius Caesar. This scheme is representative of “substitution ciphers” in that each character in the message is replaced by a different character. To specify a substitution cipher, we need to specify the alphabet of characters that our message can contain and then, for each of these characters, a corresponding substitution character. Caesar’s approach was to substitute each letter in the original alphabet with the letter a fixed number of positions further up in the alphabet. For example, if you chose 2 as your fixed number:

alphabet: a b c d e f g h i j k l m n o p q r s t u v w x  y  z ' '
key:      c d e f g h i j k l m n o p q r s t u v w x y z ' ' a  b

or, if you chose 4:

alphabet: a b c d e f g h i j k l m n o p q r s t u v  w  x y z ' '
key:      e f g h i j k l m n o p q r s t u v w x y z ' ' a b c  d 

Notice that we include the space as a character (written as ' ') and that we wrap around when we reach the end of the alphabet. When encrypting, you simply replace each character in your message with the encrypted character (i.e., the character the fixed number up in the alphabet) and to decrypt it you reverse the process. Do the following based on Caesar’s method. Submit all your answers via Gradescope.

Question 1: Encrypt ‘this is a test’ with a shift of 2 (to encrypt, substitute letters from “alphabet” with the corresponding letters in “key”).

Question 2: Decrypt ‘kbnqxgbeubencuu’ with a shift of 2 (to decrypt, substitute the letters from “key” with the letters in “alphabet”).

Question 3: Decrypt ‘stenewjfqqceit’ with a shift of 5.
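
As an aside, the “key” rows in the two tables earlier in this section can be generated rather than written out by hand. The following is a minimal sketch, assuming the 27-character alphabet used in this prelab (the letters a through z followed by the space). The function name caesar_key is a hypothetical helper introduced only for illustration; it is not something the lab requires you to write.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz "  # 26 letters plus the space


def caesar_key(shift):
    """Return ALPHABET rotated left by `shift` positions (hypothetical helper)."""
    shift = shift % len(ALPHABET)  # shifts larger than the alphabet wrap around
    # Characters from position `shift` onward come first; the characters
    # that "fell off" the front are appended at the end (the wrap-around).
    return ALPHABET[shift:] + ALPHABET[:shift]


print(caesar_key(2))  # the key row of the shift-2 table
print(caesar_key(4))  # the key row of the shift-4 table
```

Slicing keeps the wrap-around explicit: whatever is cut from the front of the alphabet reappears at the end of the key.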

General substitution ciphers

As mentioned, Caesar’s scheme is a specific example of a substitution cipher. In general, a substitution cipher can substitute any letter for any other letter. For example:

alphabet: a b c d e f g h i j k l m n o p q r s t u v w x  y  z ' '
key:      h v i e k s y r b d a j q w n c x m g u f l t p ' ' o  z

Question 4: Decrypt the following message with the substitution above: ‘urkzak zbgzvhwhwhg’.

The basic idea when programming something like this is to store both the original alphabet and the encryption key as strings. To encrypt a character, we find its index in the alphabet string and use that same index in the key string to look up the corresponding encrypted character. To decrypt, we do the reverse: we find the character’s index in the key string and look up that index in the alphabet string. Specifically:

alphabet: a b c d e f g h i j k  l  m  n  o  p  q  r  s  t  u  v  w  x  y  z  ' '
          0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
key:      h v i e k s y r b d a  j  q  w  n  c  x  m  g  u  f  l  t  p ' ' o  z
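
To see the table above in action, here is a small sketch that encrypts and then decrypts a short message with this key. Note that it deliberately uses a different technique from the index-based approach just described: it pairs up the characters of the two strings in Python dictionaries. The names encrypt_table, decrypt_table, and substitute are hypothetical and chosen only for this illustration.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz "
key = "hvieksyrbdajqwncxmgufltp oz"

# Pair each alphabet character with the key character in the same position.
encrypt_table = dict(zip(ALPHABET, key))
# Decryption uses the same key, with the pairing reversed.
decrypt_table = dict(zip(key, ALPHABET))


def substitute(message, table):
    """Replace every character of message using table (hypothetical helper)."""
    return "".join(table[ch] for ch in message)


ciphertext = substitute("hello", encrypt_table)
plaintext = substitute(ciphertext, decrypt_table)
print(ciphertext)  # rkjjn
print(plaintext)   # hello
```

Because the same key string builds both tables, this also illustrates why the scheme is called symmetric: anyone who can encrypt with the key can decrypt with it as well.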

Answer the following questions:

Question 5: If we have a variable ALPHABET initialized as follows:

ALPHABET = "abcdefghijklmnopqrstuvwxyz "

and we have a letter stored in a variable letter, write a one-line Python expression (use the method find) that will find the index in ALPHABET where that letter occurs. For example, if letter contained the character e then your expression would evaluate to 4.

Question 6: If we stored that index in a variable index and the encrypted letters corresponding to ALPHABET are stored in a variable key, e.g.,

key = "hvieksyrbdajqwncxmgufltp oz"

write a one-line Python expression that evaluates to the encrypted character corresponding to index. For example, if index contained the value 4 then your expression would evaluate to k. (To clarify, in answering this question you assume that index already contains a value, such as the index determined by the previous question.)

Something to think about (you don’t actually need to answer this): Why aren’t substitution ciphers very secure, i.e., why are they relatively easy to break?