The choice of Diffie-Hellman parameters
* Background
Diffie-Hellman key exchange uses two parameters, a prime p and a
generator g, which are used to derive the public parameters
y1 = g^x1 (mod p) and y2 = g^x2 (mod p), and then the shared secret
z = y1^x2 = (g^x1)^x2 = g^(x1*x2) = (g^x2)^x1 = y2^x1 (mod p).
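As a concrete illustration of the exchange just described, here is a
minimal sketch with toy numbers. The values p = 23, g = 5 and the two
exponents are chosen here purely for illustration and are far too small
for real use.

```python
# Toy parameters for illustration only; a real prime has hundreds of digits.
p = 23            # (23 - 1) / 2 = 11 is also prime
g = 5

x1, x2 = 6, 15    # the two secret exponents

y1 = pow(g, x1, p)    # public value sent by the first party
y2 = pow(g, x2, p)    # public value sent by the second party

z1 = pow(y2, x1, p)   # shared secret as computed by the first party
z2 = pow(y1, x2, p)   # shared secret as computed by the second party

# Both sides arrive at g^(x1*x2) mod p.
assert z1 == z2 == pow(g, x1 * x2, p)
```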
For the computation to be secure, several conditions must be true.
The exponent must be big enough, for there is a square-root search
algorithm to find the exponent. (E.g. a 16-bit exponent can be found
in about 2^8 = 256 steps.) And then the modulus must be chosen so
as to make the general discrete log problem difficult.
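The square-root search mentioned above can be sketched with the
well-known baby-step giant-step method. The function name, the modulus
2^61-1 and the base 3 below are assumptions made for this example, not
values taken from the text.

```python
from math import isqrt

def bsgs(g, y, p, bound):
    """Find x in [0, bound) with g^x = y (mod p), or None.

    Uses about sqrt(bound) multiplications and table entries.
    """
    m = isqrt(bound - 1) + 1          # smallest m with m*m >= bound
    baby = {}
    e = 1
    for j in range(m):                # baby steps: store g^j -> j
        baby.setdefault(e, j)
        e = e * g % p
    step = pow(g, -m, p)              # g^(-m) mod p (needs Python 3.8+)
    gamma = y % p
    for i in range(m):                # giant steps: y * g^(-i*m)
        if gamma in baby:
            return i * m + baby[gamma]
        gamma = gamma * step % p
    return None

# A 16-bit exponent needs m = 256 baby steps and at most 256 giant steps.
p = 2**61 - 1                         # a prime picked only for the demo
g = 3
x = 0xBEEF                            # secret 16-bit exponent
y = pow(g, x, p)
found = bsgs(g, y, p, 1 << 16)
assert found is not None and pow(g, found, p) == y
```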
The general discrete log problem can be solved for each prime-power
factor of p-1 independently, so if all of the factors of p-1 are small,
this is easy to do. Since p-1 is even, it must have a factor of 2, but
the remaining portion q = (p-1)/2 can be chosen to be prime, making the
problem as difficult as possible. Finding such numbers is computationally
expensive, but as they are parameters which are only computed once, this
is a reasonable up-front cost.
* Number theory
A second advantage of prime moduli of this form is that all generators
g are good. This is because the generator must have a large order in
the group Z*_p. But that group is of size p-1 = 2*q, and the order of
any element of a group must divide the size of the group. The only
divisors this has are 1, 2, q and 2*q. The only element of order 1 is
1, and the only element of order 2 is -1. All other elements, from 2
through -2, have orders of either p-1 or (p-1)/2, which are both large.
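The claim about element orders can be checked directly for a small
prime of this form. The sketch below uses p = 23 (q = 11), chosen here
only as an example, and finds each order by brute force.

```python
def order(a, p):
    """Multiplicative order of a modulo the prime p, by direct search."""
    k, e = 1, a % p
    while e != 1:
        e = e * a % p
        k += 1
    return k

p = 23                    # example prime with q = (p-1)/2 = 11 also prime
q = (p - 1) // 2
orders = {a: order(a, p) for a in range(1, p)}

assert orders[1] == 1             # the only element of order 1
assert orders[p - 1] == 2         # -1 is the only element of order 2
# Everything from 2 through -2 has order q or 2*q.
assert all(orders[a] in (q, 2 * q) for a in range(2, p - 1))
```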
If the generator g has order p-1, it is a generator of the group Z*_p,
and this is generally how one is advised to generate Diffie-Hellman
parameters. This explains the similarity in names. However, if g is
of order p-1, then it must be a quadratic non-residue modulo p. That
is, it must not be a square of another number. If it were a square,
then since the size of the group is even, no power of it would ever
equal its square root, so it could not be a generator.
If g is indeed of order p-1, then even powers of g are quadratic
residues (squares, modulo p), and odd powers are quadratic non-residues
(non-squares). Given a number y and a prime p, the Legendre symbol
(y/p) is straightforward to compute, and this tells you if y is a
quadratic residue. If it is, and y = g^x, then x must be even. If not,
then x would have to be odd. In this way, for a generator which is a
quadratic non-residue, the low-order bit of the exponent x is easily
computed.
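To make the leak concrete, the following sketch computes the Legendre
symbol by Euler's criterion and recovers the low-order bit of x from
y alone. The parameters p = 23 and g = 5 (an element of order p-1) and
the helper names are assumptions for the example.

```python
def legendre(y, p):
    """Legendre symbol (y/p) for an odd prime p: returns 1, -1 or 0."""
    r = pow(y, (p - 1) // 2, p)       # Euler's criterion
    return -1 if r == p - 1 else r

def leaked_low_bit(y, p):
    """Low-order bit of x, given y = g^x for a non-residue generator g."""
    return 0 if legendre(y, p) == 1 else 1

p, g = 23, 5                          # 5 has order 22 modulo 23
for x in range(1, p - 1):
    y = pow(g, x, p)
    assert leaked_low_bit(y, p) == x & 1
```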
If g is a quadratic residue, then the useful range of exponents x is
more limited, since only the value of the exponent x modulo q = (p-1)/2
has any effect on the output y = g^x (mod p), but generally exponents x
are much less than p in any case, so this limitation on range is not an
issue.
Essentially, in either case, only the value of x modulo q is secret,
but if g is a quadratic non-residue, the low-order bit of x is
available to an attacker, while if it is a quadratic residue, the
high-order bit is known to be 0.
Thus, it does not really matter whether g is a quadratic residue or
not, but if it is not, the exponent x should be chosen one bit larger.
This adds a trivial amount of work to the computation of y, and for
that reason it may be preferable to choose g to be a quadratic
residue. This is not currently done, however.
* Choice of generator g
Because any g will do, and the choice of g does not affect the difficulty
of performing the discrete log computation, choosing it for convenience
of computation is best, and g = 2 is simplest to compute with.
If in fact it is desirable to choose a generator which is a quadratic
residue, then g = 2 can still be used if the prime p is suitably
chosen. If p = +/-1 (mod 8), then 2 is a quadratic residue. If
p = +/-3 (mod 8), then 2 is a non-residue.
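The rule for 2 can be confirmed numerically. This sketch compares
Euler's criterion against the residue of p modulo 8 for all odd primes
below 2000; the bound and the helper names are arbitrary choices made
for the example.

```python
def is_prime(n):
    """Trial division; adequate for the tiny range checked here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def two_is_residue(p):
    """For an odd prime p, 2 is a square mod p iff 2^((p-1)/2) = 1 (mod p)."""
    return pow(2, (p - 1) // 2, p) == 1

for p in range(3, 2000, 2):
    if is_prime(p):
        # p = +/-1 (mod 8) means p mod 8 is 1 or 7.
        assert two_is_residue(p) == (p % 8 in (1, 7))
```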
* Choice of prime-generation technique
There may be additional primes of special form for which the discrete
logarithm problem is particularly easy. The authors are not aware of
any, but theoretical advances are possible and it would be nice to
assure users of the system that the prime was not chosen to have any
hidden special properties: only the published criteria were used.
David Kravitz of the NSA has suggested a technique for generating
"kosherized" primes for DSS which has been adapted to generate
Diffie-Hellman primes.
The technique uses a string of bytes as a seed to a cryptographically
strong one-way hash function. This generator produces the initial
value for a search for a suitable prime.
David Kravitz' technique generates random numbers from successive seeds
until one is found to be a suitable prime. This is unbearably slow for
primes of the special form being sought, but it can be sped up, at a
negligible cost in uniformity of the chosen primes, by generating only a
starting position for a linear search for a suitable prime. Such a
search can be carried out particularly efficiently.
* Details of the technique
The generator is based on SHA-1, the FIPS 180-1 secure hash algorithm.
This takes the given seed as input and produces a 160-bit output
sequence in 20 bytes. These bytes are taken as a big-endian number to
produce a number n0 from 0 to 2^160-1.
(I.e. n0 = 2^152 * byte0 + 2^144 * byte1 + ... + 2^8 * byte18 + byte19.)
Then, the seed is incremented, as a big-endian array of bytes, modulo its
size (i.e. the last byte is incremented, propagating carry if necessary),
and hashed again to produce n1, then n2, etc.
A number of arbitrary size may be constructed by concatenating these
values: N = n0 + 2^160 * n1 + 2^320 * n2 + .... To get a number less
than 2^k, take the low-order k bits of N, N mod 2^k. Obviously,
if k is 1024, it is only necessary to compute n0 through n6.
To generate a k-bit prime p (2^k > p >= 2^(k-1)), take t = N mod 2^(k-2),
i.e. a number with at most k-2 significant bits. Then add 2^(k-1),
to force the number into the desired range, and 2^(k-2), to force it
into the high half of the range. This extra refinement makes an attack
more expensive, without affecting the time required to do computations
mod p. Additional high-order 1 bits could be forced, but the incremental
benefit rapidly diminishes.
The resultant number t is used as the starting point in a search for a
suitable prime p. p is chosen to be the first number >= t such that p
is prime and (p-1)/2 is prime.
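The derivation of the starting point t can be sketched as follows.
This follows the description above and has not been checked against
the output of the original program; the function names and the seed
used in the usage line are hypothetical.

```python
import hashlib

def increment(seed):
    """Increment a bytearray in place as a big-endian number, wrapping
    modulo its size (carry propagates from the last byte upward)."""
    for i in reversed(range(len(seed))):
        seed[i] = (seed[i] + 1) & 0xFF
        if seed[i] != 0:              # no carry out of this byte
            break

def expand(seed, k):
    """Return N mod 2^k, where N = n0 + 2^160*n1 + 2^320*n2 + ..."""
    buf = bytearray(seed)
    n, shift = 0, 0
    while shift < k:
        digest = hashlib.sha1(bytes(buf)).digest()   # 20 bytes
        n |= int.from_bytes(digest, "big") << shift
        shift += 160
        increment(buf)
    return n & ((1 << k) - 1)

def start_point(seed, k):
    """Starting point t for a k-bit prime: top two bits forced to 1."""
    t = expand(seed, k - 2)
    return t + (1 << (k - 1)) + (1 << (k - 2))

t = start_point(b"hypothetical seed", 1024)
assert t.bit_length() == 1024 and (t >> 1022) == 3
```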
* Choice of seed
Because SHA-1 is a cryptographic hash, it is computationally infeasible
to find an input which has a given output. Indeed, there is no known
technique better than brute-force search to find an input which
produces an output with any special properties. Assuming that there is
an unknown class of primes which are easy to solve the discrete
logarithm problem for, this ensures that the chance of choosing a prime
p which is a member of that class is no better than random chance,
regardless of malice on the part of the designer.
Any seed would serve, so this one was chosen for aesthetic reasons.
It is the 79 bytes of the ASCII representation of a quote by Mahatma
Gandhi:
Whatever you do will be insignificant, but it is very important that you do it.
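For anyone repeating the computation, the seed can be written out as a
byte string and its length checked. The variable names are
illustrative, and the value of n0 is deliberately not reproduced here.

```python
import hashlib

seed = (b"Whatever you do will be insignificant, "
        b"but it is very important that you do it.")
assert len(seed) == 79                # 79 ASCII bytes, no terminator

# First 160-bit block, read as a big-endian number.
n0 = int.from_bytes(hashlib.sha1(seed).digest(), "big")
```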
* Implementation details
Obviously, a program was written to find a prime according to these
rules. To aid anyone who wishes to repeat the search to confirm that
the published primes were indeed generated in this way, here is a
description of how it was done. The primes of the desired form have a
density of about (ln p)^-2. E.g. for 1024-bit p, about one out of
every 503791 numbers meets these criteria, so a considerable amount of
searching is required. The following techniques can make the
computation tolerable.
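The density figure quoted above is easy to reproduce. This sketch
takes ln p to be k * ln 2 for a k-bit prime, which is an approximation
adopted for the example.

```python
import math

k = 1024
ln_p = k * math.log(2)        # ln(2^k), close to ln p for a k-bit p
one_in = ln_p ** 2            # reciprocal of the density (ln p)^-2

print(math.floor(one_in))     # candidates per suitable prime, roughly
```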
First, note that q must be odd and not congruent to 0, modulo 3. Thus,
q must be congruent to +/-1, modulo 6. Thus p = 2*q+1 must be
congruent to 2*1+1 = 3 or 2*-1+1 = -1 modulo 12. But p congruent to 3
mod 12 would be divisible by 3, and not prime, so p must be congruent
to 11 mod 12.
Thus, the initial search point t can first be increased until it is
congruent to 11 modulo 12. Searching from this point forward, only
every 12th number, t+12*i, needs to be considered.
If it is desired to choose p so that 2 is a quadratic residue (meaning
that p is congruent to +/-1 modulo 8), then this additional constraint
can be met with no additional difficulty by beginning at the next
number which is congruent to 23 mod 24 and searching in steps of 24.
But in the following discussion, a step size of 12 is assumed.
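Advancing t to the required residue class is a one-line computation.
The sketch below also checks, for small values, that every prime p > 7
with (p-1)/2 prime is congruent to 11 mod 12. The helper names and the
search bound are assumptions for the example.

```python
def next_congruent(t, r, m):
    """Smallest n >= t with n = r (mod m)."""
    return t + (r - t) % m

def is_prime(n):
    """Trial division; fine for the small range checked here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

# p = 5 and p = 7 are excluded: their q values are 2 and 3.
for p in range(8, 5000):
    if is_prime(p) and is_prime((p - 1) // 2):
        assert p % 12 == 11

start12 = next_congruent(1000, 11, 12)    # then step by 12
start24 = next_congruent(1008, 23, 24)    # then step by 24
```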
Then, a sieve is built for trial division by a number of small primes
for a range of following i values. For large primes, a large search
space is required, so a large sieve is desirable. The value used was
65536 bits (8K bytes). It may be necessary to rebuild the sieve
beginning at t+12*65536 if no suitable prime is found before then, but
this sieve is large enough that the refilling is infrequent and the
overhead is negligible.
Initially, every position in the sieve is marked as a potential prime.
Then, for the small primes s from 5 through 65521, position i in the
sieve is marked as unsuitable if t+12*i is divisible by s, i.e.
definitely not prime. To do this cheaply, consider that t+12*i = 0
(mod s) if i = -12^-1 * t (mod s). So finding t mod s, then 12^-1 (mod
s), multiplying (mod s) and negating will produce the first i value for
which t+12*i is known to be divisible by s, and then every s positions
thereafter in the sieve will be divisible as well. This does the
equivalent of a great deal of trial division with minimal effort.
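A minimal sketch of this sieve for p follows. It uses a Python list of
booleans rather than a packed bit array, and a tiny list of small
primes; both are simplifications made here, as are the names.

```python
def sieve_p(t, size, small_primes):
    """sieve[i] stays True while t + 12*i has no divisor in small_primes.

    Each s must be a prime >= 5 so that 12 is invertible mod s. The
    sketch assumes t is much larger than every s, so a candidate is
    never equal to s itself.
    """
    sieve = [True] * size
    for s in small_primes:
        inv12 = pow(12, -1, s)              # 12^-1 mod s (Python 3.8+)
        first = (-(t % s) * inv12) % s      # first i with s | t + 12*i
        for i in range(first, size, s):     # then every s positions
            sieve[i] = False
    return sieve

t = 10007                                   # example start, 11 mod 12
primes = [5, 7, 11, 13]
marks = sieve_p(t, 200, primes)

# Cross-check against plain trial division.
for i in range(200):
    assert marks[i] == all((t + 12 * i) % s != 0 for s in primes)
```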
Positions in the sieve are also marked as unsuitable if (t-1)/2+6*i
= 0 (mod s), because these positions will have (p-1)/2 divisible by s
and thus non-prime. This works similarly, and (t-1)/2 mod s can be
derived from t mod s without actually doing another full division.
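The corresponding sieve for q = (p-1)/2 can be sketched the same way.
Here (t-1)/2 mod s is obtained from t mod s by multiplying by
2^-1 = (s+1)/2 (mod s); the text does not say how the original program
performed this step, so that detail and the names are assumptions.

```python
def sieve_q(t, size, small_primes):
    """sieve[i] stays True while (t-1)/2 + 6*i has no divisor in
    small_primes. t must be odd and each s a prime >= 5."""
    sieve = [True] * size
    for s in small_primes:
        r = t % s                           # residue already needed for p
        inv2 = (s + 1) // 2                 # 2^-1 mod s for odd s
        half = (r - 1) * inv2 % s           # (t-1)/2 mod s
        inv6 = pow(6, -1, s)                # 6^-1 mod s (Python 3.8+)
        first = (-half * inv6) % s          # first i with s | (t-1)/2 + 6*i
        for i in range(first, size, s):
            sieve[i] = False
    return sieve

t = 10007                                   # example start, 11 mod 12
primes = [5, 7, 11, 13]
marks = sieve_q(t, 200, primes)

# Cross-check against plain trial division on q.
for i in range(200):
    q = (t - 1) // 2 + 6 * i
    assert marks[i] == all(q % s != 0 for s in primes)
```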
This sieve filters out all but 1/591 of the possible values of i as
obviously composite, leaving an expected 852 numbers to be checked by
stronger means before a suitable prime p is found.
After these two sieving operations have removed all numbers from
consideration where p or q = (p-1)/2 have small divisors, the remaining
candidates are subjected to a fast optimized Fermat test, to the base 2,
once for p and once for q. This eliminates, for practical purposes,
all composite numbers.
Special composite numbers can be chosen which pass this test and yet
are not primes - they are called pseudoprimes - but they are so rare in
the ranges considered that the chances of finding one without
deliberate search are utterly negligible. And the starting value for
the search was carefully chosen to have no hidden special properties.
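The Fermat test to the base 2 is a single modular exponentiation.
The sketch below applies it to p and q and shows 341 = 11 * 31, a
well-known base-2 pseudoprime, slipping through. Function names are
chosen for the example.

```python
def fermat_base2(n):
    """True if odd n > 2 satisfies 2^(n-1) = 1 (mod n)."""
    return pow(2, n - 1, n) == 1

def passes(p):
    """Apply the test once to p and once to q = (p-1)/2."""
    return fermat_base2(p) and fermat_base2((p - 1) // 2)

assert passes(23)                 # 23 and 11 are both prime
assert not passes(35)             # 35 = 5 * 7 fails immediately

# A composite that passes anyway: a base-2 pseudoprime.
assert 341 == 11 * 31
assert fermat_base2(341)
```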
If p and q are found to be prime by this test, some additional
pseudoprimality tests are performed to confirm the conclusion, and
p is returned as the result.
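Putting the steps together, a simplified search might look like the
following. It omits the sieve for brevity and uses Miller-Rabin with
fixed small bases as the extra confirmation; the text does not say
which confirmation tests the original program used, so that choice and
all names here are assumptions.

```python
BASES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)

def is_probable_prime(n):
    """Miller-Rabin with fixed bases (assumed confirmation test)."""
    if n < 2:
        return False
    for b in BASES:
        if n % b == 0:
            return n == b
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for a in BASES:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False              # a witnesses that n is composite
    return True

def find_prime(t):
    """First p >= t with p = 11 (mod 12), p prime and (p-1)/2 prime."""
    p = t + (11 - t) % 12             # advance to 11 mod 12
    while True:
        q = (p - 1) // 2
        # Cheap Fermat tests first, then the confirmation tests.
        if (pow(2, p - 1, p) == 1 and pow(2, q - 1, q) == 1
                and is_probable_prime(p) and is_probable_prime(q)):
            return p
        p += 12

p = find_prime(1000)
assert p % 12 == 11
```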