Skip to main content

RSA Public Key Cryptography

note

This is Lab 9 from Clint by Hugh Montgomery, edited by Elad Zelingher. All rights reserved to Hugh Montgomery.

For centuries, one of the hazards of cryptography was that a copy of your code book might fall into enemy hands, so that all your encrypted transmissions could then be intercepted and decoded. Worse yet, you might have no way of knowing whether one of your communications stations had been taken over by the enemy: The enemy might be masquerading as one of your own troops. All this changed in 1976 when Whitfield Diffie and Martin Hellman proposed a form of encryption that should be easy to perform but would be difficult to break, even if the encryption procedure were made public. The scheme works like this: Suppose that Bob wants to receive a message from Alice without observers being able to read the message. Bob chooses a very large integer mm, say m10200m \approx 10^{200}, and defines a permutation π\pi of the numbers 1,2,,m1,2, \ldots, m. The algorithm for computing π\pi is made public, and in particular is given to Alice. The characters of Alice's message can be associated with digits in a standard way, and the digits can be broken into blocks of length not exceeding 200, so that Alice's message is equivalent to one or more integers tt, each one in the interval [1,m][1, m]. Thus tt is the plaintext. Alice computes c=π(t)c=\pi(t); this is the cryptotext; it is also an integer in the interval [1,m][1, m]. Alice sends cc to Bob. Since an observer may also gain access to cc, for the security of the communication it is essential that there be no quick algorithm for computing the inverse permutation π1\pi^{-1}, since t=π1(c)t=\pi^{-1}(c). However, Bob possesses some secret information concerning π\pi that allows him to compute π1\pi^{-1} quickly, and hence read Alice's message. A permutation with the peculiar property that π\pi is easy to compute while π1\pi^{-1} is difficult (i.e., would take centuries on the fastest computers) is called a trap door function.

The success of the Diffie-Hellman scheme depends on being able to find trap door functions. This was achieved in 1977 by Ron Rivest, Adi Shamir, and Len Adleman. Their RSA method depends on the number theory that we have been investigating: Bob secretly chooses two 100 -digit primes p1,p2p_{1}, p_{2}, and sets m=p1p2m=p_{1} p_{2}. Bob also chooses a large positive integer kk with the property that gcd(k,ϕ(m))=1\gcd(k, \phi(m))=1. Among the reduced residue classes (modm)(\bmod m), the map π(x)xk(modm)\pi(x) \equiv x^{k} \quad(\bmod m) is a permutation. Bob makes mm and kk public, and Alice sends him ctk(modm)c \equiv t^{k}(\bmod m). (Recall that we have a powering algorithm that makes this easy.) Since Bob knows how to factor mm, Bob knows the value of ϕ(m)\phi(m). Hence Bob can find a positive integer kk^{\prime} such that kk1(modϕ(m))k k^{\prime} \equiv 1(\bmod \phi(m)). (We use the extended Euclidean algorithm to solve linear congruences, so this is also fast.) We now show that the map xxk(modm)x \mapsto x^{k^{\prime}} \quad(\bmod m) is the inverse permutation that we need. To this end, choose qq so that kk=1+qϕ(m)k k^{\prime}=1+q \phi(m), and recall Euler's congruence, which asserts that if gcd(x,m)=1\gcd(x, m)=1 then xϕ(m)1(modm)x^{\phi(m)} \equiv 1 \quad(\bmod m). Hence

(xk)k=xkk=x1+qϕ(m)=x(xϕ(m))qx(1)q=x(modm).\left(x^{k}\right)^{k^{\prime}}=x^{k k^{\prime}}=x^{1+q \phi(m)}=x\left(x^{\phi(m)}\right)^{q} \equiv x(1)^{q}=x \quad(\bmod m) .

Thus the decryption process for Bob is similar to Alice's encryption, but with the parameter kk replaced by kk^{\prime}. Note that only Bob can calculate kk^{\prime}. Even Alice can't read her own message, once she's encoded it!

In the RSA method, the permutation being employed constitutes a trap door function only to the extent that large composite integers are difficult to factor. In the present state of knowledge one can factor a number of size 1015010^{150}, but there is no guarantee that there does not exist some factoring method yet to be discovered by which even much larger numbers could be factored quickly. One could imagine that such a method might be taken as a State Secret. Indeed, when Rivest, Shamir, and Adleman published their work in 1978, the Director of the National Security Agency (General Odum) gave serious consideration to going to Congress asking for legislation that would make all research in number theory "born classified" as is the case with atomic research. He was dissuaded from this, but in any case any lingering impression that number theory is the purest of the pure, totally devoid of practical application, has been forever dispelled.

Rivest, Shamir and Adleman patented their method, and formed the company RSA Data Systems to market RSA-based products. To emphasize the security of their system, they offered a prize of $100 for the first decryption of the message

c=96869613754622061477140922254355882905759991124574319874695120930816298225145708356931476622883989628013391990551829945157815154,\begin{aligned} c= & 968696137546220614771409222543558829057599911245743198746951209308162 \\ & 98225145708356931476622883989628013391990551829945157815154, \end{aligned}

which was encrypted using the 129-digit modulus

m=114381625757888867669235779976146612010218296721242362562561842935706935245733897830597123563958705058989075147599290026879543541\begin{gathered} m=1143816257578888676692357799761466120102182967212423625625618429357 \\ 06935245733897830597123563958705058989075147599290026879543541 \end{gathered}

and the public exponent

k=9007.k=9007 .

The estimate at that time was that it would take 40 trillion years to factor this mm. However, on 29 April, 1994, Derek Atkins, Michael Graff, Arjen Lenstra, and Paul Leyland announced that m=p1p2m=p_{1} p_{2} where

p1=3490529510847650949147849619903898133417764638493387843990820577,p2=32769132993266709549961988190834461413177642967992942539798288533.\begin{aligned} & p_{1}=3490529510847650949147849619903898133417764638493387843990820577, \\ & p_{2}=32769132993266709549961988190834461413177642967992942539798288533 . \end{aligned}

This enabled them to determine the secret exponent,

k=106698614368578024442868771328920154780709906633937862801226224496631063125911774470873340168597462306553968544513277109053606095,\begin{aligned} k^{\prime}= & 106698614368578024442868771328920154780709906633937862801226224496631 \\ & 063125911774470873340168597462306553968544513277109053606095, \end{aligned}

and consequently the plaintext

t=200805001301070903002315180419000118050019172105011309190800151919090618010705.\begin{aligned} t= & 20080500130107090300231518041900011805001917210501130919080015191909 \\ & 0618010705 . \end{aligned}

After conversion back to alphabetic characters, this reads

THE MAGIC WORDS ARE SQUEAMISH OSSIFRAGE

Lenstra (at Bellcore) and his team used the double large prime variation of the multiple polynomial quadratic sieve factoring method. The calculation took more than 5000 mips years, and was executed over a period of 8 months on over 600 different computers that were made available for the purpose by volunteers in more than 20 countries, on all continents except Antarctica. The final stage of the computation took 45 hours on a 16 K MasPar MP-1 massively parallel computer. The relatively short time that it took to factor RSA129 is partly due to increased speed and power of computer hardware, but it is mainly due to progress that has been made in developing faster factoring algorithms.

The RSA-129 modulus was factored by combining the latest factoring algorithms with enormous computing resources. With larger moduli, the RSA method is considered to be secure, and is widely used. In the spring of 1996, Rivest (a mathematician at MIT) sold his interest in the company to a venture capital firm for \ 50,000,000$. So being a mathematician is not only fun, but occasionally also profitable!

The program RSA automates the arithmetic operations that arise when executing the RSA algorithm. To use this program you will need a public modulus, a public exponent, and a secret exponent. Once Bob has the Alice's public key, he can send her an encrypted message by using the RSA program. This program has no word-processing capabilities, so Bob must use the TextToCode program to convert his message to a numerical value. After this, he invokes RSA, where he can and set the variables. Each letter of the text needs to be converted to a two-digit Code; the codes are then concatenated to form a sequence of residues. Each residue is taken to the power kk modulo mm to form a new sequence of residues. This is the encryption. In turn, Alice can load the cipher text and her Variables, including her secret decrypting exponent kk^{\prime}. She can then decrypt to recover the original plaintext sequence of residues. These can be separated to form codes, and finally text, which can be decoded using TextToCode program.

Before proceeding further we consider how to convert characters into numbers. This can be done in many ways. For example, we could let A correspond to 1 , B to 2,2, \ldots, and Z to 26. Alternatively, computers store alphanumeric characters by their ASCII codes. (ASCII is an abbreviation for American Standard Code for Information Interchange.) The first of these methods makes no provision for punctuation, numerals, or lower case letters. The second provides all printable characters, but is inefficient because each character requires three digits (in base 10). The characters that can be typed in the standard keyboard have ASCII codes between 32 (to denote a space ' ') and 126 (for ~). As a compromise between the two systems described above, we subtract 32 from each ASCII code to obtain a 2-digit number. These numbers run from 00 to 94 . In order to preserve the line breaks in a file we need an end of line marker; we assign the code 95 for this purpose. Thus we have the codes opposite.

Problem 1

Suppose that Bob took the (ridiculously small) modulus m=91m=91, and proposed the public exponent k=17k=17. Suppose that Alice sent him the encrypted message c=51c=51. Use the programs Factor, Phi, LinCon, and Power appropriately to recover her plaintext tt.

Problem 2

The proof above that xkkx(modm)x^{k k^{\prime}} \equiv x(\bmod m) assumed that gcd(x,m)=1\gcd (x, m)=1. If m=p1p2m=p_{1} p_{2} where p1p_{1} and p2p_{2} are distinct primes, what is the probability that gcd(x,m)>1\gcd (x, m)>1 when xx is randomly chosen?

Problem 3

Show that if mm is squarefree then the restriction to gcd(x,m)=1\gcd(x, m)=1 is unnecessary. That is, if mm is squarefree and kk1(modϕ(m))k k^{\prime} \equiv 1(\bmod \phi(m)), then xkkx(modm)x^{k k^{\prime}} \equiv x(\bmod m) for all integers xx.

Problem 4

The encrypted message

355456249 475197422 636832086 601788838

was created using the modulus m=670726081m=670726081 and the public exponent k=663599161k=663599161. The program RSA will assist in decrypting this, but first you must determine the value of ϕ(m)\phi(m), and then solve the congruence kk1(modϕ(m))k k^{\prime} \equiv 1(\bmod \phi(m)). (Use Factor and/or Phi, and then LinCon.) Then use TextToCode to decrypt the message. The resulting residues can be separated into 2-digit codes, which may be read as text. What was the message?

Problem 5

Although Bob is the only person who can decrypt a message encrypted with his parameters, he has no way of knowing that the message actually came from Alice, since anyone can use his parameters. To overcome this defect, suppose that Bob has a trap door function πB\pi_{B} and that Alice also has a trap door function πA\pi_{A}. Suppose that Alice sends c=πB(πA1(t))c=\pi_{B}\left(\pi_{A}^{-1}(t)\right) to Bob. What should Bob do, to decrypt this? Can anyone else decrypt it? Can Bob now be sure that the message came from Alice?

Problem 6

In the preceding problem there was a tacit assumption that the trap door functions πA\pi_{A} and πB\pi_{B} act on the same set of numbers. Suppose now that πA\pi_{A} permutes the residue classes modulo mAm_{A}, and that πB\pi_{B} permutes the residue classes modulo mBm_{B}. If mAmBm_{A} \leq m_{B} then we may still proceed as above, since we may consider πA1(t)\pi_{A}^{-1}(t) as lying in the interval [0,mA)\left[0, m_{A}\right), which defines a unique residue class mBm_{B}. How would you modify the above procedure if mA>mBm_{A}>m_{B}?

Problem 7

In formulating their challenge, Rivest, Shamir and Adleman did not use the system in the ASCII table to convert from alphanumeric characters to a residue class tt. By comparing tt with the stated text can you infer the system that they used

Problem 8

Let m=854937209155735099m=854937209155735099, and suppose that you are given the information that mm has at most two prime factors, and that ϕ(m)=854937207303842520\phi(m)=854937207303842520. Can you find the primes?

Problem 9

Suppose that m=1247=2943m=1247=29 \cdot 43, so that ϕ(m)=1176\phi(m)=1176. In order that xkkx(modx^{k k^{\prime}} \equiv x(\bmod mm ) for all xx, it is sufficient that kk1(modϕ(m))k k^{\prime} \equiv 1(\bmod \phi(m)), but is it necessary? Suppose that k=5k=5. How many kk^{\prime} are there, 0k<ϕ(m)0 \leq k^{\prime}<\phi(m), such that xkkx(modm)x^{k k^{\prime}} \equiv x(\bmod m) for all xx ? What if you take instead k=11k=11 ? Why is the number of admissible kk^{\prime} so large? To achieve security, the acceptable kk^{\prime} should be very rare. How should the prime factors of mm be chosen, to achieve this?

An RSA code can be broken if the value of ϕ(m)\phi(m) is known. One way to determine ϕ(m)\phi(m) is to factor mm, but one could conceive that possibly ϕ(m)\phi(m) might be found more quickly, without having to factor mm. However, we argue now that this is not the case: If the value of ϕ(m)\phi(m) is known then the factorization of mm can be recovered with little work. Hence any quick method of evaluating ϕ(m)\phi(m) would yield a quick method of factorization.

If ϕ(m)=m1\phi(m)=m-1 then mm is prime, and we are done. (Of course such an mm would not be used for RSA encryption.) Hence we may suppose that ϕ(m)<m1\phi(m)<m-1, i.e. that mm is composite.

If mm is a product of two distinct primes, say m=pqm=p q, and if ϕ(m)\phi(m) is known, then the primes can easily be found. To see this, note that p+q=mϕ(m)+1p+q=m-\phi(m)+1. Since the sum p+qp+q and the product pqp q are both known, the values of (pq)2=(p+q)24pq(p-q)^{2}=(p+q)^{2}-4 p q can be determined. By taking square roots one obtains pq|p-q|, and the primes are ( p+q±pqp+q \pm|p-q| )/2. (This is, after all, how one solves for the roots of a quadratic polynomial.) This procedure may be attempted whenever mm and ϕ(m)\phi(m) are both known. If it fails then we know that mm is not the product of two distinct primes. To eliminate the (unlikely) possibility that m=p2m=p^{2}, we may compute m\sqrt{m}.

Now we come to the heart of the matter: mm is the product of 3 or more primes. Suppose that a number cc is known with the property that ac1(modm)a^{c} \equiv 1(\bmod m) whenever gcd(a,m)=1\gcd(a, m)=1. For example, ϕ(m)\phi(m) is such a number cc. It is enough to find a proper divisor of mm, since the method may then be applied to the divisors, repeatedly, until all the factors are prime. The number cc may be hard to factor, but at least we can determine the power of 2 in it, say c=2jkc=2^{j} \cdot k with kk odd. Choose a number aa at random, 0<a<m0<a<m. If 1 ; gcd(a,m)<m\gcd(a, m)<m then we have found a proper divisor of mm. If gcd(a,m)=1\gcd(a, m)=1 then put b=akb=a^{k}. The value of b(modm)b(\bmod m) is quickly found by the powering algorithm. By repeated squaring, compute b2,b4,,b2j(modm)b^{2}, b^{4}, \ldots, b^{2^{j}}(\bmod m). Actually, there is no need to compute the last term, since this number is 1(modm)\equiv 1(\bmod m). In the sequence of powers of bb computed, suppose that the first 1 is preceded by a number other than -1 . Then we have an xx such that x≢±1(modm)x \not \equiv \pm 1 \quad(\bmod m), but x21(modm)x^{2} \equiv 1(\bmod m), and hence gcd(x1,m)\gcd(x-1, m) is a proper divisor of mm. This procedure is not guaranteed to work for every aa, but should work for a large proportion of aa 's modulo mm, provided that mm is divisible by two or more odd primes. (We may assume that mm is odd.) In the one remaining case, m=pkm=p^{k} with k>1k>1, the prime pp can be found quickly, since p=m/gcd(m,ϕ(m))p=m /\gcd (m, \phi(m)).

Problem 10

Let m=308557669718497477m=308557669718497477. This number is the product of two distinct primes, and ϕ(m)=308557668607386336\phi(m)=308557668607386336. Find the primes.

Problem 11

Let m=144145168546451m=144145168546451. Given that ϕ(m)=144136398922632\phi(m)=144136398922632, show that mm is not a prime, and is not a product of two primes. (I.e. verify that ϕ(m)<m1\phi(m)<m-1, that mm is not a perfect square, and that (mϕ(m)+1)24m(m-\phi(m)+1)^{2}-4 m is not a perfect square.) Use the procedure described above to find the prime factorization of mm. How many different bases a do you need to consider?