Primary Articles Learning Environment
NSDL Timely Teaching, Issue 1. October 2009.
Cryptography: The Math and Science of Secrecy
by Dr. Mike May
??Cryptography, the science of encrypting and decrypting information, is often thought of in terms of soldiers and spies clandestinely sending and intercepting secret messages. However within the past generation as more and more of our communication is electronic, it has become part of normal business, so that it is hard to find anyone who does not routinely use commercial grade cryptography. ?At its most fundamental level, the goal of cryptography is to send a message that can only be read by the desired recipient. One system of ancient times is the Spartan scytale where a message was written across a strip of paper wrapped around a stick of specified length and width. When the strip was unwound the letters were scrambled or transposed. A second system is the Caesar cipher, named after Julius Caesar, where each letter was shifted by three letters, replacing A by D and B by E. Variations of this monoalphabetic substitution survive today in jumbles in many newspapers.?? An important variation of the Caesar cipher is the Vigenere cipher which uses a longer key for the shift so that different substitution are used at depending on the position of the character in the message. For example, if the key is the 6 letter word "secret," then the 1st, 7th, and 13th characters of the message are shifted by s, or 19 characters, while the 2nd, 8th, and 14th characters of the message are shifted by e or 5 characters. A big advantage of such a system is that each letter can be encoded as several different letters, reducing the ability of an enemy to decode the message by looking for frequently used letters. An extreme variation of the same idea uses a long book to control the shift, so the key becomes a location in the book used to start encryption. ??
A central problem with Vigenere and its variants is that English (and any human language) is far from random. This allows attackers to use statistical methods to break a code. One response was to create a long string of randomly generated shifts. This method is called using a one time pad and is theoretically unbreakable. In practice the one time pad shifts the problem of sending the message to the problem of securely transmitting the key, which is now as long as the message. The solution was to design a machine that would produce a string of characters that seem to be random, with the string started by a shorter key. The most famous of attempt using this method is the Enigma machine used by the Germans in WWII.
The other famous method for secret communication from WWII was the Code Talkers of the Allied Forces, which used the Navaho language and the correct assumption that the obscurity of language would prevent the enemy from understanding the communications. This method of encryption violated a central principle of modern cryptography, formulated by Kerckhoff, and often stated as "no security through obscurity", which assumes that the enemy knows your system of encryption. ??The breaking of the German Enigma machine and the Japanese Purple Machine highlighted the fact that being able to securely send information and intercept the enemy's communications had become an important tool of warfare. The US government created the National Security Agency, or NSA, centered in Fort Meade, in 1952. For years, this agency was so secretive that it was said the initials stood for "No Such Agency."?
?The second half of the twentieth century and the availability of modern computers and the increasing amount of financial data transmitted electronically changed the typical setting for practical cryptography. Cryptography went from the realm of soldiers and spies to that of bankers and ordinary commerce. Today it is hard to find anyone who does not routinely use commercial grade cryptography. We typically use encryption of some form or another whenever we use an ATM machine, have a credit card electronically verified, visit a secure website, use a wireless computer connection, or use a cell phone. Cryptography has become part of our ordinary life. (To see some systems you may be using without knowing it, launch your favorite Internet browser and open the preferences. The location of the security certificates may vary. On Firefox, open the advanced tab of the preference and select encryption, then view the certificates.) Rather than encrypting letters, all modern systems work with the bits obtained by using the normal ASCII [link] representation used by all computers. For some systems, we think of the string of bits as representing a number.
In the 1970's Horst Feistal, while working at IBM labs, invented a series of algorithms for encrypting a block of plaintext, the most famous of which was the Lucifer (cipher). As electronic communication of banking and commercial records became more important to the economy, the National Bureau of Standards (NSA) sent out a solicitation for an encryption algorithm that would be a commercial standard. When the responses to the solicitation were inadequate, they asked the National Security Agency (NSA) to help create a standard for encryption. Instead, the NSA worked with IBM, and helped them develop Lucifer into the Data Encryption Standard, or DES. For years there was concern that NSA had adjusted the S-Boxes, a feature of the algorithm, to introduce a back door into the code for easy eavesdropping. Years later when differential cryptoanalysis was discovered as a way of attacking this class of ciphers, it was realized the NSA had optimized the S-Boxes to be resistant to this kind of attack. The NSA also wanted to reduce the key length from 64 to 48 bits to produce a cope that would be commercially secure, but would allow law enforcement to decrypt a message when needed. The tension between the needs of commercial security and law enforcement recurs regularly in cryptography. With the compromise key length of 56 bits, DES has 2^56 possible keys.
In the 1990's the increase in available computer power rendered DES susceptible to a brute force attack where all 2^56 possible keys are checked. In 1997, the government announced an open competition for a replacement encryption algorithm, with the competition run by the National Institute of Standards and Technology, or NIST. In 2002, Rijndael, a code designed by Belgian cryptographers Vincent Rijmen and Joan Daemen became the Advanced Encryption Standard, or AES. AES was designed to work with keys of 128, 196, or 256 bits, so that it can be adjusted for future increases in computer power. The open competition for an encryption standard went so well, the NIST is currently holding a similar competition for a new standard for a hash function [internal link] .?
?Another indication of the changing role of cryptography comes from the laws concerning cryptographic software. Into the 1990's, cryptographic software was formally a munition under US law. This led to the creation of two versions of the Netscape browser, one with 40-bit encryption for export and a US version with 128-bit encryption. These laws were significantly relaxed in 1996, with oversight for many cryptographic tools now under the Department of Commerce.
Public Key Cryptography
Both DES and AES are symmetric ciphers. The same key is used both for encryption and decryption. A problem related to the use of symmetric ciphers is one of key distribution. The keys need to be securely transferred to be used. That is not such a big problem between institutions like banks that routinely communicate with each other, but it becomes a problem with electronic commerce where a new customer does not want to wait for secure delivery of a pin before ordering online. By contrast, public key cryptography uses a system with two keys, so a locking key can be transferred publicly, while an unlocking key is kept secret. In 1976, Whitfield Diffie and Martin Hellman published a practical method for secretly sharing keys over an unprotected channel.
Public key systems depend on mathematical operations that are straightforward in one direction and impossibly hard in the other direction. The easiest example is the multiplication of two prime numbers and the factoring of a number into its prime factors. With a piece of paper or a basic calculator, it is easy to see that 307 times 967 is 296,869. However, it is much harder to guess that 557,993 is the product of 751 and 743. A second one-way function takes powers of numbers in modular or clock arithmetic. (In modular and clock arithmetic, we are only interested in remainders. Thus 5 hours after 10:00 o'clock is 3:00 o'clock. The difference between the two systems is that in modular arithmetic, noon would be 0:00 o'clock.) It uses the difference in difficulty of modular exponentiation and discrete logarithms. (Given a, b, c, and n, with a^b=c, mod n, going from using a and b to compute c, is modular exponentiation whereas going from a and c to b is solving a logarithm.) If we are looking at powers of a number mod 10, we only care about the remainder after we divide by 10. Thus the powers of 3 are, in order, 3 then 9 (or 3*3), then 7 (3*9=27, forget the 20), then 1 (7*3=21, forget the 20), then 3. The number of multiplications needed to compute a power can be reduced to about twice the length of the exponent base 2, while solving the discrete logarithm is much harder. It should be noted that when your browser gets a key for secure page it does one of these operations with a prime number that is between 100 and 300 digits long. For a 100-digit exponent we need 600 multiplications to find the power, and 10^100 multiplications to find the discrete log by guessing and checking. With the fastest computer currently available, the sun will burn out before the discrete log problem is solved by a naive approach.
The Diffie-Hellman Key Exchange
In explaining the Diffie-Hellman key exchange, we follow a tradition in writing about cryptography where we assume Alice is trying to send a message to Bob and Eve is trying to listen in. (If the third party is trying to be malicious and send fake messages, the identity changes to Mallory.) Alice and Bob agree on a base g and a modulus n, which is a 100-digit prime number. They don't care if Eve learns the values of g and n and may even publish these numbers. Alice picks a secret number a and computes g^a mod n. Bob picks a secret number b and computes g^b mod n. Alice and Bob tell each other the values of g^a and g^b respectively, while keeping a and b secret. Once again, they make no effort to keep g^a and g^b secret from Eve. Alice and Bob can then each compute k=g^(a+b)=(g^a)^b=(g^b)^a. They can agree to use the last 128 bits of k for their AES key. Because Eve does not know either a or b, and cannot solve the logarithm problem to deduce them from g^a and g^b she cannot compute k. Alice and Bob have thus exchanged a secret key while communicating openly on a channel with Eve listening in.
If you look at the security certificates on your web browser, it is likely that the public key algorithm used will be RSA, a method invented by Ronald L. Rivest, Adi Shamir, and Leonard M. Adleman. You have probably used it if you have done online banking or bought something online at a secure site, like iTunes. For RSA, the vendor creates two keys, one of which is public and the other is private. Anyone can use the public key to encrypt a message, but only the vendor knows the private key needed to read the message. A typical implementation of RSA randomly chooses 2 prime numbers, p and q, that are each 300 digits long, and multiplies them to get n=p*q. We will also want to compute n1=(p-1)(q-1). I choose an encryption key E and use the extended Euclidean algorithm to find a decryption key D such that DE-1 is a multiple of n1. If my plaintext message (which has to be a number) is m, then the encrypted ciphertext is c=m^E mod n. For decryption, I note that due to a theorem of Fermat, m=c^D mod n. For technical reasons, almost everyone always uses the same encryption key, E=65,537=2^16+1. Given n and E, trying to find D is equivalent to finding the prime factors p and q of n. Simply guessing one of the factors would be as hard as guessing the winning numbers on the Powerball contest, 28 weeks in a row. (There are more efficient methods of finding the factors, but for numbers this size, they are still "practically impossible". A group of mathematicians announced in January 2010 that they had factored the product of two 166 digit numbers. The effort involved about 1,500 computer years of computation.)
Any discussion of cryptography should include comments on the related topics of authenticity (was the message sent by the person who claims to have sent the message) and message integrity (is the message received the same as the message that was sent). A simple form of authentication or digital signature can be done with RSA. If I encrypt a message with my private decryption key and send the two messages together, then anyone can use the public encryption key to verify that I was the one who did the encryption. One problem with this scheme is that RSA only encrypts messages shorter than n and 300 digits only gives me room for about 120 characters.
Cryptographers use hash functions to reduce a long message to a short "digest" of a fixed length. Hash functions are designed so it is practically impossible to find two messages that make sense with the same digest. As mentioned above, the current attacks on the standard hash functions look like they will be able to generate messages with the same hash within a few years, so the National Institute of Standards and Technology (NIST) is sponsoring a competition to design a new standard hash function.
How the Pieces Fit Together
It is worthwhile to put all the pieces of modern cryptography together in a typical arrangement. A public key system (RSA) is used to establish communications, but public key systems are computationally intensive and comparatively slow, so the only thing communicated with the public key cipher is a session key for a symmetric cipher (AES). This is analogous to putting a collection of books in a case, locked with AES, and putting the AES key in an envelope locked with RSA. Additionally we use a hash algorithm (SHA-1) to reduce the communication to a hash that we then sign with a signature algorithm (DSA). To continue the analogy we also produce a single sheet of paper that can be used to verify the collection of books is unchanged from what I intended to send and I sign that sheet of paper with a signature that cannot be forged by anyone else. I add that paper to the locked case of books. I have efficiently insured secrecy, integrity, and authenticity. There are a variety of algorithms to choose from or each of the four steps mentioned, with the most popular chosen in each category.
??The hidden technical issues and future trends
In an attempt to make cryptography accessible, there is a danger that readers may be left with the impression it is easy and anyone can create a good cipher or that a cryptographic algorithm is secure if it can't be broken quickly by your friends. For a code to be secure it needs to resist the amount of effort a government can bring to bear over a period of years. RSA with a pair of 65 digit primes was considered insecure when such a code was broken by a coalition of 600 volunteers using 1600 computers working together for 6 months. The work in public key cryptography looks at the rather sophisticated mathematics behind efforts to efficiently find long primes and ways to factor a product of two larger prime numbers. Variants of methods based on the discrete logarithm problem using integer points on elliptic curves are making their way into the commercial market. Attacks on symmetric ciphers use a lot of computer power to find places where the relationship between the plaintext and the encoded ciphertext does not appear random. Advances in attacks can come from a broad range of the branches of mathematics. Cryptographers are also working on using the principles of quantum mechanics to produce practical codes where even listening in will be detected.
"Cryptography." Wikipedia, The Free Encyclopedia. Retrieved 16 Aug 2009, <http://en.wikipedia.org/w/index.php?title=Cryptography&oldid=307147660>. Wikipedia has several good entries on cryptography. The links let you continue exploration as you want to know more details.
"RSA Laboratories' Frequently Asked Questions About Today's Cryptography, Version 4.1". RSA Laboratories Retrieved 16 Aug 2009, <http://www.rsa.com/rsalabs/node.asp?id=2152> RSA notes that they stopped updating this FAQ page in 2000, so it is somewhat out of date. It is still useful in getting an overview.
"Timeline of cryptography." Wikipedia, The Free Encyclopedia. Retrieved 16 Aug 2009 <http://en.wikipedia.org/w/index.php?title=Timeline_of_cryptography&oldid=307201786>. This timeline is much fuller than the one given in this article.
Mihir Bellare and Phillip Rogaway, (2009) Introduction to Modern Cryptography, course notes for UCSD course CSE107. <http://cseweb.ucsd.edu/users/mihir/cse107/classnotes.html>. Effectively an undergraduate textbook for a course on cryptography.
Gary C. Kessler, An Overview of Cryptography. May 1998, updated 4 June 2009. Retrieved 16 Aug 2009. <http://www.garykessler.net/library/crypto.html >. A good introduction and overview with nice slides.
Menezes, van Oorschot, and Vanstone 1996 Handbook of Applied Cryptography available online at <http://world.std.com/~franl/crypto.html>. Retrieved 16 Aug 2009. This is a textbook that goes into the details of how to practically implement cryptographic algorithms. Since it is over 10 years old it is not cutting edge, but it gives a wealth of details. It is not at a beginner's level.
Bruce Schneier (2009) Cryptogram newsletter. Retrieved 16 Aug 2009 <http://www.schneier.com/crypto-gram.html >. Schneier is one of the leading figures of professional cryptography. The newsletter covers current developments.
Practical Cryptography, Niels Ferguson and Bruce Schneier, 2003 Hoboken, N.J., Wiley and Sons. <http://www.schneier.com/book-practical.html>. A very readable book and a good source.
Selection of Research and Summary Articles
Following is a brief selection of articles from the technical literature. These works provide insight into some of the issues and research questions facing mathematicians and information scientists and the computational methods they enlist to generate theory and practice. Teaching faculty are encouraged to use the technical materials to inform classroom discussions and relate contemporary investigations to concepts covered in lectures. Students can use the works for self-directed study in conjunction with their textbooks, the Mathematics Digital Library , and/or the NSDL Central Collection .
Preceding the technical papers are two topical overviews from Plus Magazine that provide accessible, non-technical summaries cryptography as well as the 2008 Chauvenet Prize winning essay from the Bulletin of the American Mathematical Society. This elaboration of a lecture given at the Current Events special session of the 2004 American Mathematical Society annual meeting explains the Agrawal-Kayal-Saxena (AKS) primality test--an algorithm that determines whether a number is prime or composite within polynomial time--with complete proofs, and to put the result and ideas in appropriate historical context.
Artur Ekert (2005) Cracking codes Plus Magazine. Issue 34.
A description of classic cryptography and some classic attacks.
Artur Ekert (2005) Cracking codes, part II Plus Magazine. Issue 35
An introduction to quantum cryptography.
American Mathematical Society Lecture
Andrew Granville (2004) It Is Easy To Determine Whether a Given Integer is Prime Bulletin of the American Mathematical Society. Volume 42, number 1.
A semi-technical essay in pure mathematics on prime numbers and methods for determining primality.
1. Daniele Micciancio (2010) The RSA Group is Pseudo-Free Journal of Cryptography. Volume 23, number 2.
Journal Home Page: Journal of Cryptography
Current Table of Contents: Current Volume 23
Motivation: Free groups--groups where if there is a subset S of a group G such that any element of G could be written in one and only one way as a product of finitely many elements of S and their inverses--are widely used in computer science, and most modern cryptography relies on the hardness of computational problems over finite groups. So, as argued by Rivest , pseudo-free groups are a very interesting notion from a cryptographic perspective.
Methods: The main question left open by Rivest's work in  is: do pseudo-free groups exist? In  Rivest suggested the RSA group (where N = PQ is the product of two large primes) as a possible candidate pseudo-free Abelian group and nicknamed the corresponding conjecture the super-strong RSA assumption. In this paper we resolve Rivest's conjecture and prove that is pseudo-free under the strong RSA assumption, at least when N = PQ is the product of two "safe primes" (i.e., odd primes such that p = (P - 1)/2 and q = (Q - 1)/2 are also prime1), a special class of prime numbers widely used in cryptography*.
Results/Conclusions: The author has offered the first example of provably secure pseudo-free group under standard cryptographic assumptions. In particular, the paper proved that the RSA group where N is the product of two safe primes is pseudo-free, assuming the hardness of the strong RSA problem.
 R.L. Rivest, On the notion of pseudo-free groups. In: Theory of Cryptography Conference-Proceedings of TCC'04. LNCS, vol. 2951, pp. 505-521 (2004)
- Equivalently, using more standard mathematical terminology, P = 2p +1 and Q= 2q +1 where p and q are Sophie Germain primes.
2. Chin-Wen Chou, Julien Laurat, Hui Deng, Kyung Soo Choi, Hugues de Riedmatten, Daniel Felinto, H. Jeff Kimble (2007) Functional Quantum Nodes for Entanglement Distribution over Scalable Quantum Networks. Science. Vol. 316. no. 5829, pp. 1316 - 1320 Article available from Science in full text after free registration
Journal Home Page: Science
Current Table of Contents: Current Volume 324
Motivation: A developing area of cryptography using quantum physics where measuring a state changes it. That allows a communication system where listening in is easily observable.
Methods: This paper looks at the issues of sending quantum bits along a network. In particular the experiment describes the entanglement of quantum states at nodes 3 meters apart. In theory, this allows for a larger scale network.
Results/Conclusions: The method of secure communication is not currently practical for broad usage.
3. Stephanie Wehner, Andreas Winter, Harry Buhrman, Matthias Christandl, Falk Unger (2006) Implications of superstrong non-locality for cryptography. Proceedings of the Royal Society A 462 no. 2071 1919-1932
Journal Home Page: Proceedings of the Royal Society A
Current Table of Contents: July 8, 2009; 465 (2107)
Publisher: Royal Society Publishing
Motivation: This study looks at using non-local boxes-- hypothetical 'machines' that give rise to superstrong non-local correlations--as a method of getting stronger cryptographic results than can be obtained through quantum mechanics alone.
Methods: The workers demonstrate how to circumvent the problem of delay implicit in earlier work*, which suggested that a 1-2 oblivious transfer (OT)--a protocol where a sender sends information to a receiver, but remains oblivious as to what is received--could be constructed using one non-local box alone. The investigators construct a protocol for bit commitment (BC)--a commitment scheme that allows one to commit to a value while keeping it hidden, with the ability to reveal the committed value later--and 1-2 OT based on non-local boxes. This shows that superstrong non-local correlations in the form of non-local boxes enable the researchers to solve cryptographic problems otherwise known to be impossible.
Results/Conclusions: This paper should be considered in contrast to Chou et al., paper 2 above, which looks what can currently be achieved in cryptography through quantum mechanics. This paper considers what could be achieved with an ideal device that mimics some of the characteristics of entangled states in quantum mechanics.
- Wolf, S. & Wullschleger, J. 2005 Oblivious transfer and quantum non-locality. In Proc. Int. Symp. on Information Theory (ISIT), pp. 1745-1748.
a. If I want to keep a message secret for 20 years, how big a safety factor should I include for a brute force attack? (Give your best guess, how much faster will computers be in 20 years? How much faster is a typical computer today compared with one sold 20 years ago?)
b. A brute force attack on AES checks 2^128 or about 10^38 keys. How many keys would a computer need to be able to solve? per second to do a brute force attack in 10 years? How does that compare with the current speed of the fastest supercomputer in existence?
c. Many of the public key cryptosystems start by picking a number that is "probably prime", understanding that the system will be much weaker if the chosen number is not a prime. What is an acceptable failure rate for cryptography?
d. In a typical day, how often do you communicate electronically with the hope or expectation that your communication is secret? For what percentage of these communications can you find the encryption method used?
e. Cryptography is one of the few branches of mathematics that is restricted as a matter of law. Strong cryptography protects you from identity theft but also allows criminals and spies to hide their activities. What is the proper role of government regulation in this area? (Proposed answers range from escrow accounts for keys where the government can open any encrypted material to no regulation at all.)