BFOIT - Introduction to Computer Programming

# Cryptography - BFOIT Saturday - October 13, 2007

## Caesar's Cipher

The idea for this lesson came from the book:
In Code, Sarah Flannery with David Flannery.

### Introduction

Caesar's cipher is a simple substitution algorithm where ciphertext characters are substituted for plaintext characters. In Caesar's Cipher, a ciphertext character is just the plaintext character shifted through the alphabet by an offset number.

Example (with shift offset of 5):

 ciphertext F G H I J K L M N O P Q R S T U V W X Y Z A B C D E plaintext A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

So, what do we need to know to write a program that encrypts text in this manner?

### ASCII Characters

As I talk about in the very first lesson in BFOIT's curriculum, Programming - What Is It, everything inside a computer is represented as numbers.  For symbols, characters in this case, one standard representation is the ASCII character set.  Here is a sample from it:

 ASCIICharacter inOctal inDecimal ASCIICharacter inOctal inDecimal ASCIICharacter inOctal inDecimal ASCIICharacter inOctal inDecimal `space` `040` `32` `0` `060` `48` `A` `101` `65` `a` `141` `97` `!` `041` `33` `1` `061` `49` `B` `102` `66` `b` `142` `98` `(` `050` `40` `2` `062` `50` `C` `103` `67` `c` `143` `99` `)` `051` `41` `3` `063` `51` `D` `104` `68` `d` `144` `100`

### Logo Primitives ASCII and CHAR

So, what do we need to encrypt and decrypt characters, to play with them mathematical?

If you try to add a number to a character in Logo, you get an error.  If you type in the instruction 'SUM "A 5', you get ERROR: Can't convert 'A' to a number.

We need to be able to convert a character into a number and then, given some number, convert it into a character.  Luckily, even though Logo will not do this conversion automatically, Logo does have two primitive operators that do exactly what we want, ASCII and CHAR.

 jLogo Character Procedures Name Input(s) Description ASCII character Outputs the integer (between 0 and 127) that represents the character in the ASCII code. CHAR number outputs a single character word that is the ASCII code corresponding to the input (an integer between 0 and 127).
```    ? println ascii "A
65
? println ascii "Z
90
? println char 67
C
? println char 97
a
?
```

Now, back to the math stuff we need.

```    ? println char sum (ascii "A) 10
K
? println difference (ascii "E) (ascii "A)
4
? println difference (ascii "W) (ascii "A)
22
```

So now we know how to convert the letters of the alphabet (characters in Logo) to numbers.  From this point on we will talk about characters and character indices.  A character index will have the association: A=0, B=1, C=2, etc..., X=23, Y=24, and Z=25.  Here is an operator that produces a character index, given an uppercase character as its input.

```    ;output the index of the uppercase character
;       A=0, B=1, C=2, ...  X=23, Y=24, Z=25
to chIdx :uppercaseChar
output difference (ascii :uppercaseChar) (ascii "A)
end
```

### Modulo Math

As with the QUOTIENT operator, given one number (the dividend) to be divided by another (the divisor), the modulo operator produces what's left over after the division.  For example, when 5 is divided by 2, the remainder is 1, when 8 is divided by 4, the remainder is 0.  In Logo, the modulo operator is the REMAINDER operator.

```    ? println remainder 5 2
1
? println remainder 8 4
0
? println remainder 12345 100
45
?
```

So why is the modulo operator important in cryptology?

Think about what happens when we try to shift the last characters.  Take a shift amount of 5.  Five characters after A in the alphabet is F.  Five characters after M is R.  But... five characters after Y?  Ah oh...  We have to wrap-around!  Five characters after Y is D where A follows Z.

This is where the modulo operator comes in; it does just what we need it to do.  Given any character index and any shift amount, when they are added together and the result divided by the size of our alphabet, the remainder of this division is a legal character index.

If we supply the remainder operator with a character index as its first input and the alphabet size (26 in our case) as the second input, its output is the encrypted character index, with the wrap around covered!

Think about a very simple case, a shift amount of one (1).  For any character up to, but not including, Z, when you add one to the character's index and divide by twenty six (26) you get the correct encrypted character.  When you have a Z and you add one (1) to its character index (25), you get 26; when this is divided by 26 (the alphabet size), you get zero.  And zero is A, which is the correct answer for Z shifted by one.  So, let's look at the code to encrypt Logo sentences.

### Caesar's Cipher Logo Code

```    ;number of characters in the encrypted text alphabet
to alphabetSize
output difference (sum (ascii "Z) 1) ascii "A
end

;output the index of the uppercase character
;       A=0, B=1, C=2, ...  X=23, Y=24, Z=25
to chIdx :uppercaseChar
output difference (ascii :uppercaseChar) (ascii "A)
end

;amount (modula alphabetSize) a character is shifted
to shiftAmount
output 5
end

;output the uppercase equivalent of :character, if any
;       otherwise :character itself
to toUpperCase :character
if less? (ascii :character) (ascii "a) [output :character]
if greater? (ascii :character) (ascii "z) [output :character]
output char difference (ascii :character) (difference (ascii "a) (ascii "A))
end

;output the encrypted representation of a character
;the character is shifted by shiftAmount
to encryptChar :character
make "character toUpperCase :character
if less? (ascii :character) (ascii "A) [output :character]
if greater? (ascii :character) (ascii "Z) [output :character]
output char sum (ascii "A) (remainder (sum (chIdx :character) shiftAmount) alphabetSize)
end

;output the encrypted representation of a word
to encryptWord :wd
if empty? :wd [output " ]
output word (encryptChar first :wd) (encryptWord butfirst :wd)
end

to encryptSentence :sent
if empty? :sent [output [] ]
output sentence (encryptWord first :sent) (encryptSentence butfirst :sent)
end
```

Extend what I've given you; write code to decrypt a Logo sentence.  Write the procedures decryptChar, decryptWord, and decryptSentence which mirror their encrypt counterparts.

So what's the problem with Caesar's Cipher?

The problem is that there are only twenty six alternatives to arriving at a solution.  But, the same technique can be used with a greater number of possible cases.  What if instead of encrypting only uppercase characters, we encrypted uppercase, lowercase, digits, and special symbols?  This increases the number of possible substitutions.

And then there is the possibility of encrypting pairs of characters.  This really expands the possibilities. 