Factorization and Prime Numbers
(Lab 2 from Clint By Hugh Montgomery, Edited by Elad Zelingher, all rights reserved to Hugh Montgomery)
The program FacTab produces a table of least prime divisors of odd numbers, up to 10^{9}. The values are calculated by dividing small primes into the numbers in the desired range, until the only numbers for which a least prime divisor has not been found are prime. Let p be a given prime number. The least composite integer n such that p is the least prime factor of n is n=p^{2}. (In this connection, recall Problem 24. on p. 30 of NZM.) Thus if one is to prepare a table of least prime factors of integers in an interval [a, b], then it is useful to have on hand a table of all primes p \leq b^{1 / 2}. In the case of FacTab, the intervals considered are of the form [10 N, 10 N+200] with N \leq 10^{8}. Since 31607 and 31627 are consecutive primes, and since
it suffices to have a table of primes through the prime 31607. Such a table of primes may be constructed as follows: Consider a sequence a_{1}, a_{2}, \ldots, a_{31607} of 0's and 1's. Initially we take a_{1}=0, and a_{i}=1 for all i>1. We operate on this sequence so that eventually a_{i}=1 if i is prime, a_{i}=0 otherwise. Start with p=2, and while p \leq 173 perform the following operations: Put j=p^{2}. This is the least composite integer such that a_{j} is still 1. Put a_{j}=0. Replace j by j+p, and set a_{j}=0. Replace j by j+p. Continue in this manner until j>31607. By examining the numbers a_{p+1}, a_{p+2}, \ldots, find the least integer q such that q>p and a_{q}=1. Then q is the least prime number greater than p. Replace p by q, and start over. This method of generating primes is known as the Sieve of Eratosthenes. It suffices to sieve only to p=173, since 173 is the largest prime \leq \sqrt{31607}. FacTab constructs a table of small primes in this way when the program is first loaded, with one modification: Since the even numbers are immediately eliminated, FacTab saves time and memory by applying the sieve only to the odd integers.
Problem 1
Use FacTab to view the least prime factors of the odd numbers in an interval. You will note that the least prime factor of numbers of the form 10 k+5 display a certain pattern. Describe this pattern, and prove that it persists.
Problem 2
Can you find any other patterns similar to the one noted above?
Problem 3
For 5 \leq k \leq 20, how many primes lie between e^{k} and e^{k}+100? How do these numbers compare with 100 / k?
Problem 4
For several values of x and h (with h small compared with x ), count the primes between x and x+h, and compare the result with h / \log x.
Problem 5
How many primes lie between 20831330 and 20831530? By changing the value of Start, determine whether this is typical of similar intervals in this vicinity. For a report of a more extensive study of the gaps between primes, see D. Shanks, On Maximal Gaps between Successive Primes, Math. Comp. 18 (1964), 646-651.
Problem 6
For how many integers n \leq x is the least prime factor of n greater than 2? Greater than 3? Than 5? Than 7? How do these numbers increase with x? Formulate a conjecture concerning the asymptotic behavior. Can you prove your conjecture? (Theorem 8.8(e) and Theorem 8.29 of NZM are relevant here.)
Problem 7
A prime number p is called a twin prime if p+2 is also prime. Repeat Problem 3 above, but this time counting only twin primes. How do the counts compare with 100 / k^{2}? Do you conjecture that there are infinitely many twin primes, or do you conjecture that there are only finitely many?
Problem 8
The program Factor determines the canonical factorization of an integer n by trial division. Suppose that prime factors <d have been found, and removed, leaving an integer m yet to be factored. If m=1 then we are done. If 1<m<d^{2} then m is prime, and we are done in this case also. Otherwise, we divide d into m. If d \mid m then d is prime, and we repeatedly divide by d until d no longer divides the remaining number. Then we replace d by d+1 and repeat the process. To save time, after powers of 2 have been removed, only odd d are considered. Further savings can be obtained by noting that after removing powers of 2 and of 3 it suffices to consider d of the forms d=6 k-1, d=6 k+1.
Factor takes this a step further: After powers of 2, 3 and 5 have been removed, only those d of the eight forms 30k+1, 30k+7, 30k+11, 30k+13, 30k+17, 30k+19, 30k+23, 30k+29 are considered. Thus d is replaced by d+30 after only 8 trial divisions. This method is in principle slightly wasteful, because it would be enough to consider prime values of d, but in practice it seems to take longer to generate a table of primes. (Try it, if you like to write programs.) Instead of generating a table of primes, one could construct a permanent file listing primes, and then call the needed primes from that file, but this seems to take still longer.
Use FacTab to find the largest prime <10^{k} for k=1,2, \ldots, 9. Apply Factor to each of these nine primes, and note the time required to perform the calculations. Is the time roughly c \sqrt{n}? What values of c do you observe?
Problem 9
Apply Factor to each of the numbers 10^{18}-k for k=1,2, \ldots, 9. In some of these cases, you will tire of waiting for a complete resolution. To interrupt the program, simply press a key, and note how the program reports its partial results.
Problem 10
Apply factor to 20 randomly-chosen numbers \approx 10^{9}. Make a record of these numbers, and note which ones are square-free. What proportion of them are square-free? How close is this proportion to 6 / \pi^{2}=0.6079 \ldots? How many integers n \leq x are not divisible by 4? How many are not divisible by 9? How many by neither 4 nor 9? The proportion of n \leq x for which 4 \nmid n and 9 \nmid n tends to a limit as x tends to infinity. What is this limit?
Can you guess how this limit would change if we also require that 25 \nmid n? (See Theorems 8.25 and 8.29 of NZM.)
Problem 11
The program GetNextP yields the least prime p greater than a given number a, provided that a \leq 10^{9}. For a in this range, GetNextP uses the same sieving procedure as found in FacTab. For larger values of a, 10^{9}<a \leq 10^{18}, the program GetNextP locates the least integer q>a that is likely to be prime. That is, the interval (a, q) contains no prime number and q is "probably" prime (in the sense that it passes several strong pseudoprime tests; this is discussed on pp. 77, 78 of NZM).
Use GetNextP to find the least prime p>x, for several x \approx 10^{8}. How are the differences p-x distributed? What is their mean? If you like to program, you could conduct a larger study, and a more detailed statistical analysis.
The asyptotic situation remains a matter of conjecture, but it is expected that as x tends to infinity, the mean lies between (1-\epsilon) \log x and (1+\epsilon) \log x. Also, for any fixed c>0, it is predicted that the proportion of integers x \leq X such that p-x>c \log x tends to e^{-c} as X tends to infinity.
Problem 12
Write a program that deletes from a given sequence of integers those that are divisible by the square of a prime. In this way, count the number of square-free integers in various short intervals, and also the number of square-free integers not exceeding 10^{4}.
In Theorem 8.25 of NZM it is shown that the number Q(x) of square-free integers not exceeding x is c x+O(\sqrt{x}) where c=6 / \pi^{2}. (The O-notation is defined on p. 365 of NZM.) It is conjectured that the error term here is actually O\left(x^{\theta}\right) for any \theta>1 / 4. Although such a strong upper bound for the magnitude of the error term has not yet been proved, it is known that the error term does achieve the order of x^{1 / 4} infinitely often. Does the numerical evidence generated by your program support the stronger conjecture? Still less is known concerning the variation in the number of square-free numbers in short intervals.