CS 105 - Understanding Our Algorithmic World

CS 105 - Exercise Seven

Goals

  • Learn a classic algorithm
  • Learn some techniques for manipulating text
  • Get an introduction to working with lists

Prerequisite

There is no starter code for this exercise, so go ahead and visit https://snap.berkeley.edu/snap/snap.html and make sure you are logged in.

Objective

As you heard, the problem of breaking codes played a big role in the birth of computers. We are going to use encryption to learn about working with lists and text. The code we are going to implement is called the Caesar cipher. Caesar apparently used this to hide his messages, but in all honesty, it isn't the strongest code. Okay -- it is so weak that it isn't really considered a code any more... But it is a good start for us.

The idea is that apply a constant offset to every letter in the message. Do, for example, if we chose a right shift of 3, "don't panic" will be transformed into "grq'w sdqif". Decoding is pretty easy, we just need to apply the shift in the opposite direction.

In order to do this, we are going to need to learn a little bit about how to work with text and lists in Snap!

Text

Computers can only store and work with numbers.

Actually, this is just a convenient lie...

If we want to store something non-numeric, like the letter "A", we could pick a number, like 1, and say, "when you see the number 1, show me an 'A'".

We also need to have a common system (i.e., a standard) so that when I write these web pages, your computer interprets the collection of numbers the same way and you can read what I wrote. There are a couple of different standards (the programs on your computer figure out which to use based on context and other information that might have come with the text).

By and large, the system you will encounter these days is called unicode.

The details of unicode are not terribly important right now -- all you really need to know is that it provides a mapping from numbers to text characters. I say "text characters" because it includes everything that you could include in a block of text: letters, punctuation, digits, even emoji (😲).

Unicode and Snap!

In the Operators palette, you will find two blocks: letter to unicode block and unicode to letter block. The first will report the unicode value of a character, and the second will report the letter of a particular unicode value.

Drag these out into your script area and give them a try.

Other text functions in Snap!

There are a couple of other built-in blocks for working with text in the Operators palette. We are going to play with split text block and join text block.

Split

Pull split text block out into the script area. Change the menu on the right hand side to read "letter". Click the block to run it. You should see something like this:

split text result

Whoa! What is that?!

That is a new type of data called a list. So, we now have numbers, text, booleans, and lists.

The list is about what you would expect -- a list of other pieces of data. As you can see, each element in the array is associated with a number -- this allows us to access individual items in the list. You will find some other list blocks in the Variables palette.

Pull item block out into the script area. This block will allow us to read individual items out of the list. Note that it has a different looking input that indicates that it takes a list.

Combine the two blocks together like so:

combined blocks

This will allow you to report a single value from the split text. Try in out by entering a couple of different values.

Join

The second text block, join text block is the opposite of split text block. It takes a collection of text items and glues them together into a single piece of text (more technically, it concatenates the text). Click it to see what it reports.

The interface on this block is a little weird. If you click the black arrows on the right, you can change how many inputs the block has. For our purposes, we want to give the block a packaged list, rather than multiple inputs. Click the left arrow until you only have a single input. Then drag your split text block into the input.

joining split text

When you click this, you should see that the text is now back together.

Encode

Okay, now we are ready to encode some text with the Caesar cipher. We are going to do a fixed right shift of three characters (e.g., 'A' will become 'D', 'S' will become 'V', etc... ).

Create a new block encode block that takes a single input. note that the extra wide input is because I set it to be a text input. In the block editor, click the input name. In the dialog box that appears, click the black triangle on the right which gives you the expanded input types. You will find Text in the center column of options.

Our algorithm will be as follows

  • split the input text into a list (put this in a variable so we can use it later)
  • iterate over each letter of the text
    • convert the letter to unicode
    • add 3 to the value
    • convert the result back to a letter
    • replace the old letter with the new shifted letter in the list
  • join all of the converted letter together into a single text
  • report the text

Use the following blocks:

add block for i block item block join text block length block replace item block report item block script variables block set variable block split text block unicode as block unicode of block

Note that I included some blocks in there that you have not used and I have not discussed. (I think you can figure out how they work...).

Build up to the encode block. Start small. For instance, convert a single letter to unicode, shift it and convert it back to text. This will be an important building block.

Decode

Once you have encode working, you should create a new block to decode messages as well. However, once you have encoding working, decoding is trivial. Decoding just means shifting every letter back to the left by the same amount.

Duplicate encode block and rename it decode. Then just switch the direction of the shift.

Test it out by using encode block as an input to decode block. The combination should report back the original text (which admittedly, while correct, does seem a little anti-climatic).

(Optional) Improvements

There are two major things that we could do to improve the encoder. These aren't required, but they would be good practice (and might come up again some time).

First, we could make the shift an input. Of course, if we do that, we won't need a separate decoder (why not?).

Second, the result would look nicer if the only things that were converted were letters, and all of the spaces, punctuations, and other characters were left alone. This is a little tricker, because we also need to wrap around (e.g., when we shift 'Z', it should map to 'C'), and we need to handle upper and lowercase separately.

The secret to this modification is that (a) we can compare characters using the normal comparison operators (it is the underlying numerical representation that is used for comparing), and (b) once we have converted letters to unicode, they are numbers and can be used for math (e.g., you could subtract the unicode of 'A' to figure out how many characters from the start of the alphabet a particular capital letter was).

What I will be looking for

  • There should be an encode block that shifts the letters of the input text by three characters to the right.
  • The block should use the blocks specified above
  • There should be a decode block which reverses the shift of encode block.

Submitting

Share the project using the instructions from exercise 1.

Only one person from each group needs to submit. When you submit the assignment you wil have the opportunity to create a group -- please remember to include your partner(s) in the group.

Visit the exercise page on Canvas to submit the URL.