RSA Public Key Cryptography

note

For centuries, one of the hazards of cryptography was that a copy of your code book might fall into enemy hands, so that all your encrypted transmissions could then be intercepted and decoded. Worse yet, you might have no way of knowing whether one of your communications stations had been taken over by the enemy: The enemy might be masquerading as one of your own troops. All this changed in 1976 when Whitfield Diffie and Martin Hellman proposed a form of encryption that should be easy to perform but would be difficult to break, even if the encryption procedure were made public. The scheme works like this: Suppose that Bob wants to receive a message from Alice without observers being able to read the message. Bob chooses a very large integer $m$ , say $m \approx 10^{200}$ , and defines a permutation $\pi$ of the numbers $1,2, \ldots, m$ . The algorithm for computing $\pi$ is made public, and in particular is given to Alice. The characters of Alice's message can be associated with digits in a standard way, and the digits can be broken into blocks of length not exceeding 200, so that Alice's message is equivalent to one or more integers $t$ , each one in the interval $[1, m]$ . Thus $t$ is the plaintext. Alice computes $c=\pi(t)$ ; this is the cryptotext; it is also an integer in the interval $[1, m]$ . Alice sends $c$ to Bob. Since an observer may also gain access to $c$ , for the security of the communication it is essential that there be no quick algorithm for computing the inverse permutation $\pi^{-1}$ , since $t=\pi^{-1}(c)$ . However, Bob possesses some secret information concerning $\pi$ that allows him to compute $\pi^{-1}$ quickly, and hence read Alice's message. A permutation with the peculiar property that $\pi$ is easy to compute while $\pi^{-1}$ is difficult (i.e., would take centuries on the fastest computers) is called a trap door function.

The success of the Diffie-Hellman scheme depends on being able to find trap door functions. This was achieved in 1977 by Ron Rivest, Adi Shamir, and Len Adleman. Their RSA method depends on the number theory that we have been investigating: Bob secretly chooses two 100 -digit primes $p_{1}, p_{2}$ , and sets $m=p_{1} p_{2}$ . Bob also chooses a large positive integer $k$ with the property that $\gcd(k, \phi(m))=1$ . Among the reduced residue classes $(\bmod m)$ , the map $\pi(x) \equiv x^{k} \quad(\bmod m)$ is a permutation. Bob makes $m$ and $k$ public, and Alice sends him $c \equiv t^{k}(\bmod m)$ . (Recall that we have a powering algorithm that makes this easy.) Since Bob knows how to factor $m$ , Bob knows the value of $\phi(m)$ . Hence Bob can find a positive integer $k^{\prime}$ such that $k k^{\prime} \equiv 1(\bmod \phi(m))$ . (We use the extended Euclidean algorithm to solve linear congruences, so this is also fast.) We now show that the map $x \mapsto x^{k^{\prime}} \quad(\bmod m)$ is the inverse permutation that we need. To this end, choose $q$ so that $k k^{\prime}=1+q \phi(m)$ , and recall Euler's congruence, which asserts that if $\gcd(x, m)=1$ then $x^{\phi(m)} \equiv 1 \quad(\bmod m)$ . Hence

\left(x^{k}\right)^{k^{\prime}}=x^{k k^{\prime}}=x^{1+q \phi(m)}=x\left(x^{\phi(m)}\right)^{q} \equiv x(1)^{q}=x \quad(\bmod m) .

Thus the decryption process for Bob is similar to Alice's encryption, but with the parameter $k$ replaced by $k^{\prime}$ . Note that only Bob can calculate $k^{\prime}$ . Even Alice can't read her own message, once she's encoded it!

In the RSA method, the permutation being employed constitutes a trap door function only to the extent that large composite integers are difficult to factor. In the present state of knowledge one can factor a number of size $10^{150}$ , but there is no guarantee that there does not exist some factoring method yet to be discovered by which even much larger numbers could be factored quickly. One could imagine that such a method might be taken as a State Secret. Indeed, when Rivest, Shamir, and Adleman published their work in 1978, the Director of the National Security Agency (General Odum) gave serious consideration to going to Congress asking for legislation that would make all research in number theory "born classified" as is the case with atomic research. He was dissuaded from this, but in any case any lingering impression that number theory is the purest of the pure, totally devoid of practical application, has been forever dispelled.

Rivest, Shamir and Adleman patented their method, and formed the company RSA Data Systems to market RSA-based products. To emphasize the security of their system, they offered a prize of $100 for the first decryption of the message

\begin{aligned} c= & 968696137546220614771409222543558829057599911245743198746951209308162 \\ & 98225145708356931476622883989628013391990551829945157815154, \end{aligned}

which was encrypted using the 129-digit modulus

\begin{gathered} m=1143816257578888676692357799761466120102182967212423625625618429357 \\ 06935245733897830597123563958705058989075147599290026879543541 \end{gathered}

and the public exponent

k=9007 .

The estimate at that time was that it would take 40 trillion years to factor this $m$ . However, on 29 April, 1994, Derek Atkins, Michael Graff, Arjen Lenstra, and Paul Leyland announced that $m=p_{1} p_{2}$ where

\begin{aligned} & p_{1}=3490529510847650949147849619903898133417764638493387843990820577, \\ & p_{2}=32769132993266709549961988190834461413177642967992942539798288533 . \end{aligned}

This enabled them to determine the secret exponent,

\begin{aligned} k^{\prime}= & 106698614368578024442868771328920154780709906633937862801226224496631 \\ & 063125911774470873340168597462306553968544513277109053606095, \end{aligned}

and consequently the plaintext

\begin{aligned} t= & 20080500130107090300231518041900011805001917210501130919080015191909 \\ & 0618010705 . \end{aligned}

After conversion back to alphabetic characters, this reads

THE MAGIC WORDS ARE SQUEAMISH OSSIFRAGE

Lenstra (at Bellcore) and his team used the double large prime variation of the multiple polynomial quadratic sieve factoring method. The calculation took more than 5000 mips years, and was executed over a period of 8 months on over 600 different computers that were made available for the purpose by volunteers in more than 20 countries, on all continents except Antarctica. The final stage of the computation took 45 hours on a 16 K MasPar MP-1 massively parallel computer. The relatively short time that it took to factor RSA129 is partly due to increased speed and power of computer hardware, but it is mainly due to progress that has been made in developing faster factoring algorithms.

The RSA-129 modulus was factored by combining the latest factoring algorithms with enormous computing resources. With larger moduli, the RSA method is considered to be secure, and is widely used. In the spring of 1996, Rivest (a mathematician at MIT) sold his interest in the company to a venture capital firm for $\$ 50,000,000$. So being a mathematician is not only fun, but occasionally also profitable!

The program RSA automates the arithmetic operations that arise when executing the RSA algorithm. To use this program you will need a public modulus, a public exponent, and a secret exponent. Once Bob has the Alice's public key, he can send her an encrypted message by using the RSA program. This program has no word-processing capabilities, so Bob must use the TextToCode program to convert his message to a numerical value. After this, he invokes RSA, where he can and set the variables. Each letter of the text needs to be converted to a two-digit Code; the codes are then concatenated to form a sequence of residues. Each residue is taken to the power $k$ modulo $m$ to form a new sequence of residues. This is the encryption. In turn, Alice can load the cipher text and her Variables, including her secret decrypting exponent $k^{\prime}$ . She can then decrypt to recover the original plaintext sequence of residues. These can be separated to form codes, and finally text, which can be decoded using TextToCode program.

Before proceeding further we consider how to convert characters into numbers. This can be done in many ways. For example, we could let A correspond to 1 , B to $2, \ldots$ , and Z to 26. Alternatively, computers store alphanumeric characters by their ASCII codes. (ASCII is an abbreviation for American Standard Code for Information Interchange.) The first of these methods makes no provision for punctuation, numerals, or lower case letters. The second provides all printable characters, but is inefficient because each character requires three digits (in base 10). The characters that can be typed in the standard keyboard have ASCII codes between 32 (to denote a space ' ') and 126 (for ~). As a compromise between the two systems described above, we subtract 32 from each ASCII code to obtain a 2-digit number. These numbers run from 00 to 94 . In order to preserve the line breaks in a file we need an end of line marker; we assign the code 95 for this purpose. Thus we have the codes opposite.

Problem 1

Suppose that Bob took the (ridiculously small) modulus $m=91$ , and proposed the public exponent $k=17$ . Suppose that Alice sent him the encrypted message $c=51$ . Use the programs Factor, Phi, LinCon, and Power appropriately to recover her plaintext $t$ .

Problem 2

The proof above that $x^{k k^{\prime}} \equiv x(\bmod m)$ assumed that $\gcd (x, m)=1$ . If $m=p_{1} p_{2}$ where $p_{1}$ and $p_{2}$ are distinct primes, what is the probability that $\gcd (x, m)>1$ when $x$ is randomly chosen?

Problem 3

Show that if $m$ is squarefree then the restriction to $\gcd(x, m)=1$ is unnecessary. That is, if $m$ is squarefree and $k k^{\prime} \equiv 1(\bmod \phi(m))$ , then $x^{k k^{\prime}} \equiv x(\bmod m)$ for all integers $x$ .

Problem 4

The encrypted message

355456249 475197422 636832086 601788838

was created using the modulus $m=670726081$ and the public exponent $k=663599161$ . The program RSA will assist in decrypting this, but first you must determine the value of $\phi(m)$ , and then solve the congruence $k k^{\prime} \equiv 1(\bmod \phi(m))$ . (Use Factor and/or Phi, and then LinCon.) Then use TextToCode to decrypt the message. The resulting residues can be separated into 2-digit codes, which may be read as text. What was the message?

Problem 5

Although Bob is the only person who can decrypt a message encrypted with his parameters, he has no way of knowing that the message actually came from Alice, since anyone can use his parameters. To overcome this defect, suppose that Bob has a trap door function $\pi_{B}$ and that Alice also has a trap door function $\pi_{A}$ . Suppose that Alice sends $c=\pi_{B}\left(\pi_{A}^{-1}(t)\right)$ to Bob. What should Bob do, to decrypt this? Can anyone else decrypt it? Can Bob now be sure that the message came from Alice?

Problem 6

In the preceding problem there was a tacit assumption that the trap door functions $\pi_{A}$ and $\pi_{B}$ act on the same set of numbers. Suppose now that $\pi_{A}$ permutes the residue classes modulo $m_{A}$ , and that $\pi_{B}$ permutes the residue classes modulo $m_{B}$ . If $m_{A} \leq m_{B}$ then we may still proceed as above, since we may consider $\pi_{A}^{-1}(t)$ as lying in the interval $\left[0, m_{A}\right)$ , which defines a unique residue class $m_{B}$ . How would you modify the above procedure if $m_{A}>m_{B}$ ?

Problem 7

In formulating their challenge, Rivest, Shamir and Adleman did not use the system in the ASCII table to convert from alphanumeric characters to a residue class $t$ . By comparing $t$ with the stated text can you infer the system that they used

Problem 8

Let $m=854937209155735099$ , and suppose that you are given the information that $m$ has at most two prime factors, and that $\phi(m)=854937207303842520$ . Can you find the primes?

Problem 9

Suppose that $m=1247=29 \cdot 43$ , so that $\phi(m)=1176$ . In order that $x^{k k^{\prime}} \equiv x(\bmod$ $m$ ) for all $x$ , it is sufficient that $k k^{\prime} \equiv 1(\bmod \phi(m))$ , but is it necessary? Suppose that $k=5$ . How many $k^{\prime}$ are there, $0 \leq k^{\prime}<\phi(m)$ , such that $x^{k k^{\prime}} \equiv x(\bmod m)$ for all $x$ ? What if you take instead $k=11$ ? Why is the number of admissible $k^{\prime}$ so large? To achieve security, the acceptable $k^{\prime}$ should be very rare. How should the prime factors of $m$ be chosen, to achieve this?

An RSA code can be broken if the value of $\phi(m)$ is known. One way to determine $\phi(m)$ is to factor $m$ , but one could conceive that possibly $\phi(m)$ might be found more quickly, without having to factor $m$ . However, we argue now that this is not the case: If the value of $\phi(m)$ is known then the factorization of $m$ can be recovered with little work. Hence any quick method of evaluating $\phi(m)$ would yield a quick method of factorization.

If $\phi(m)=m-1$ then $m$ is prime, and we are done. (Of course such an $m$ would not be used for RSA encryption.) Hence we may suppose that $\phi(m)<m-1$ , i.e. that $m$ is composite.

If $m$ is a product of two distinct primes, say $m=p q$ , and if $\phi(m)$ is known, then the primes can easily be found. To see this, note that $p+q=m-\phi(m)+1$ . Since the sum $p+q$ and the product $p q$ are both known, the values of $(p-q)^{2}=(p+q)^{2}-4 p q$ can be determined. By taking square roots one obtains $|p-q|$ , and the primes are ( $p+q \pm|p-q|$ )/2. (This is, after all, how one solves for the roots of a quadratic polynomial.) This procedure may be attempted whenever $m$ and $\phi(m)$ are both known. If it fails then we know that $m$ is not the product of two distinct primes. To eliminate the (unlikely) possibility that $m=p^{2}$ , we may compute $\sqrt{m}$ .

Now we come to the heart of the matter: $m$ is the product of 3 or more primes. Suppose that a number $c$ is known with the property that $a^{c} \equiv 1(\bmod m)$ whenever $\gcd(a, m)=1$ . For example, $\phi(m)$ is such a number $c$ . It is enough to find a proper divisor of $m$ , since the method may then be applied to the divisors, repeatedly, until all the factors are prime. The number $c$ may be hard to factor, but at least we can determine the power of 2 in it, say $c=2^{j} \cdot k$ with $k$ odd. Choose a number $a$ at random, $0<a<m$ . If 1 ; $\gcd(a, m)<m$ then we have found a proper divisor of $m$ . If $\gcd(a, m)=1$ then put $b=a^{k}$ . The value of $b(\bmod m)$ is quickly found by the powering algorithm. By repeated squaring, compute $b^{2}, b^{4}, \ldots, b^{2^{j}}(\bmod m)$ . Actually, there is no need to compute the last term, since this number is $\equiv 1(\bmod m)$ . In the sequence of powers of $b$ computed, suppose that the first 1 is preceded by a number other than -1 . Then we have an $x$ such that $x \not \equiv \pm 1 \quad(\bmod m)$ , but $x^{2} \equiv 1(\bmod m)$ , and hence $\gcd(x-1, m)$ is a proper divisor of $m$ . This procedure is not guaranteed to work for every $a$ , but should work for a large proportion of $a$ 's modulo $m$ , provided that $m$ is divisible by two or more odd primes. (We may assume that $m$ is odd.) In the one remaining case, $m=p^{k}$ with $k>1$ , the prime $p$ can be found quickly, since $p=m /\gcd (m, \phi(m))$ .

Problem 10

Let $m=308557669718497477$ . This number is the product of two distinct primes, and $\phi(m)=308557668607386336$ . Find the primes.

Problem 11

Let $m=144145168546451$ . Given that $\phi(m)=144136398922632$ , show that $m$ is not a prime, and is not a product of two primes. (I.e. verify that $\phi(m)<m-1$ , that $m$ is not a perfect square, and that $(m-\phi(m)+1)^{2}-4 m$ is not a perfect square.) Use the procedure described above to find the prime factorization of $m$ . How many different bases a do you need to consider?

Problem 1​

Problem 2​

Problem 3​

Problem 4​

Problem 5​

Problem 6​

Problem 7​

Problem 8​

Problem 9​

Problem 10​

Problem 11​