This is a draft post. There may be some rough edges, but the core content is here. Feedback is welcome!
Many cryptographic schemes rely on large prime numbers for their security. RSA needs two of them. Diffie-Hellman needs one. DSA, ElGamal, Paillier — primes everywhere. Cryptographic libraries need to generate these primes efficiently and securely when producing keypairs. In this post, we’ll learn how these primes are generated, and implement our own in Rust.
The Needle in a (Predictably Dense) Haystack
There is no known formula for deterministically producing the $n$-th prime. So cryptographic libraries do something wonderfully pragmatic: generate random numbers and test them.
This isn’t as hopeless as it sounds. The prime number theorem tells us that primes near $N$ have density roughly $\frac{1}{\ln N}$. For a 1024-bit number, that’s about 1 in 710. Since we only test odd candidates, we expect to find a prime within about 355 trials — fast enough for real-time key generation.
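That estimate is easy to sanity-check with a few lines of throwaway code (not part of the generator itself):

```rust
fn main() {
    // The prime number theorem: density near N is roughly 1 / ln N.
    // For a 1024-bit number, ln(2^1024) = 1024 * ln 2.
    let ln_n = 1024.0 * std::f64::consts::LN_2;
    println!("density near 2^1024: about 1 in {:.0}", ln_n); // about 1 in 710
    println!("expected odd candidates tried: about {:.0}", ln_n / 2.0); // about 355
}
```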
Fermat’s Little Theorem: The Foundation
The mathematical bedrock is Fermat’s Little Theorem:
$$a^{p-1} \equiv 1 \pmod{p}$$
for any prime $p$ and any $a$ not divisible by $p$. The contrapositive gives us a compositeness test: if we find any $a$ such that $a^{n-1} \not\equiv 1 \pmod{n}$, then $n$ is definitely composite.
This is the Fermat primality test, and it almost works. The problem is the word “almost.”
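To make the test concrete, here is a minimal sketch of one Fermat round over plain u64 arithmetic (the helper names are mine; a real implementation would use big integers):

```rust
/// Square-and-multiply modular exponentiation. Small moduli only:
/// intermediate products must fit in a u64.
fn mod_pow(mut base: u64, mut exp: u64, modulus: u64) -> u64 {
    let mut result = 1;
    base %= modulus;
    while exp > 0 {
        if exp & 1 == 1 {
            result = result * base % modulus;
        }
        base = base * base % modulus;
        exp >>= 1;
    }
    result
}

/// One Fermat round: false means n is proven composite; true means "maybe prime".
fn fermat_round(n: u64, a: u64) -> bool {
    mod_pow(a, n - 1, n) == 1
}

fn main() {
    assert!(fermat_round(101, 2));  // 101 is prime: 2^100 ≡ 1 (mod 101)
    assert!(!fermat_round(91, 2));  // 91 = 7 × 13 is caught: 2^90 ≢ 1 (mod 91)
}
```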
Carmichael Numbers: Why Fermat Isn’t Enough
Carmichael numbers are composite numbers that satisfy Fermat’s condition for every base coprime to them. The smallest is $561 = 3 \times 11 \times 17$. No matter which base you pick (as long as it’s coprime to 561), you’ll get $a^{560} \equiv 1 \pmod{561}$. The Fermat test is completely fooled.
There are infinitely many Carmichael numbers (Alford, Granville & Pomerance, 1994), so any purely Fermat-based test has a fundamental gap. We need something stronger.
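For a number as small as 561 we can verify the claim exhaustively (a sketch with u64 arithmetic; gcd and mod_pow are illustrative helpers, not from the post's library):

```rust
fn mod_pow(mut base: u64, mut exp: u64, modulus: u64) -> u64 {
    let mut result = 1;
    base %= modulus;
    while exp > 0 {
        if exp & 1 == 1 {
            result = result * base % modulus;
        }
        base = base * base % modulus;
        exp >>= 1;
    }
    result
}

fn gcd(a: u64, b: u64) -> u64 {
    if b == 0 { a } else { gcd(b, a % b) }
}

fn main() {
    // Every base coprime to 561 satisfies a^560 ≡ 1 (mod 561),
    // even though 561 = 3 × 11 × 17 is composite.
    let n = 561;
    let fooled = (2..n).filter(|&a| gcd(a, n) == 1).all(|a| mod_pow(a, n - 1, n) == 1);
    assert!(fooled);
}
```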
Miller-Rabin: Exploiting the Square Root of Unity
Miller-Rabin’s insight is to extract more information from the same computation. Instead of just checking whether $a^{n-1} \equiv 1$, we examine how the computation arrives at 1.
The Decomposition
Write $n - 1 = 2^r \times d$ where $d$ is odd. This factors out all powers of 2 from the exponent. For $n = 221$:
$$220 = 2^2 \times 55 \quad \Rightarrow \quad r = 2,\ d = 55$$
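The decomposition itself is a few lines of code. Here is a sketch over u64 (the implementation later in the post does the same thing on BigInt):

```rust
/// Factor n - 1 as 2^r * d with d odd.
fn decompose(n: u64) -> (u32, u64) {
    let mut d = n - 1;
    let mut r = 0;
    while d % 2 == 0 {
        d /= 2;
        r += 1;
    }
    (r, d)
}

fn main() {
    assert_eq!(decompose(221), (2, 55));  // 220 = 2^2 × 55
    assert_eq!(decompose(561), (4, 35));  // 560 = 2^4 × 35
}
```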
The Witness Chain
For a random witness $a$:
- Compute $x = a^d \bmod n$
- If $x \equiv 1$ or $x \equiv n-1$, this round is inconclusive (probably prime)
- Otherwise, square $x$ up to $r-1$ times:
  - If we ever hit $n-1$, this round is inconclusive
  - If we reach the end without hitting $n-1$, $n$ is definitely composite
Why Does This Work?
The key fact: if $n$ is prime, then $x^2 \equiv 1 \pmod{n}$ implies $x \equiv \pm 1 \pmod{n}$. The only square roots of 1 modulo a prime are 1 and $-1 \equiv n-1$.
So as we square our way from $a^d$ up to $a^{n-1}$, the sequence must pass through $n-1$ before reaching 1 (unless it starts at 1). If it jumps straight to 1 from some value other than $n-1$, that value is a non-trivial square root of unity — which can’t exist modulo a prime. So $n$ must factor.
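A tiny numeric check makes the asymmetry vivid (plain u64 arithmetic):

```rust
fn main() {
    // Modulo the composite 221 = 13 × 17, unity has extra square roots:
    // 103 ≡ -1 (mod 13) and 103 ≡ 1 (mod 17), so 103^2 ≡ 1 (mod 221) by CRT.
    assert_eq!(103u64 * 103 % 221, 1);

    // Modulo a prime, only 1 and p-1 square to 1.
    let roots: Vec<u64> = (1..13).filter(|&x| x * x % 13 == 1).collect();
    assert_eq!(roots, vec![1, 12]);
}
```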
Here’s a prime passing the test: take $n = 13$, so $12 = 2^2 \times 3$ gives $r = 2$, $d = 3$. With witness $a = 2$ we get $x = 2^3 \equiv 8 \pmod{13}$, and the first squaring gives $8^2 = 64 \equiv 12 = n - 1$. We hit $n-1$, so the round is inconclusive, exactly what a prime should produce.
And a composite getting caught: take $n = 221 = 13 \times 17$, so $r = 2$, $d = 55$. With witness $a = 137$ we get $x = 137^{55} \equiv 188 \pmod{221}$, and the single allowed squaring gives $188^2 \equiv 205 \pmod{221}$. We ran out of squarings without ever seeing $220$, so 221 is definitely composite.
Carmichael numbers can’t hide from this either. While $a^{n-1} \equiv 1$ for every coprime base, the path to 1 reveals the composite structure through these forbidden square roots. For $n = 561$ we have $560 = 2^4 \times 35$, and witness $a = 2$ produces the chain $2^{35} \equiv 263$, then $263^2 \equiv 166$, $166^2 \equiv 67$, $67^2 \equiv 1 \pmod{561}$. The chain reaches 1 without ever passing through 560, so 67 is a non-trivial square root of unity and 561 is unmasked as composite.
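These witness chains are easy to reproduce. Here is a sketch using plain u64 arithmetic (fine for moduli this small; the helper names are mine):

```rust
/// Square-and-multiply modular exponentiation (small moduli only).
fn mod_pow(mut base: u64, mut exp: u64, modulus: u64) -> u64 {
    let mut result = 1;
    base %= modulus;
    while exp > 0 {
        if exp & 1 == 1 {
            result = result * base % modulus;
        }
        base = base * base % modulus;
        exp >>= 1;
    }
    result
}

/// The Miller-Rabin witness chain: a^d, a^(2d), ..., a^(n-1) mod n.
fn witness_chain(n: u64, a: u64) -> Vec<u64> {
    let (mut r, mut d) = (0, n - 1);
    while d % 2 == 0 {
        d /= 2;
        r += 1;
    }
    let mut x = mod_pow(a, d, n);
    let mut chain = vec![x];
    for _ in 0..r {
        x = x * x % n;
        chain.push(x);
    }
    chain
}

fn main() {
    println!("{:?}", witness_chain(13, 2));    // [8, 12, 1]: passes through n-1
    println!("{:?}", witness_chain(221, 137)); // [188, 205, 35]: never hits 220
    println!("{:?}", witness_chain(561, 2));   // [263, 166, 67, 1, 1]: 67^2 ≡ 1
}
```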
The Implementation
Let’s build this in Rust. We’ll use the curv crate for big integer arithmetic.
Candidate Generation
First, we need random odd numbers of the right bit length:
```rust
pub fn generate_prime_candidate(bit_length: usize) -> BigInt {
    let mut rng = rand::thread_rng();
    let mut bytes = vec![0u8; (bit_length + 7) / 8];
    rng.fill_bytes(&mut bytes);
    // Clear excess high bits so the value spans exactly bit_length bits
    // (only matters when bit_length isn't a multiple of 8)
    let top_bits = ((bit_length - 1) % 8) + 1;
    if top_bits < 8 {
        bytes[0] &= (1 << top_bits) - 1;
    }
    let last = bytes.len() - 1;
    bytes[last] |= 1;                // LSB = 1 → odd
    bytes[0] |= 1 << (top_bits - 1); // MSB = 1 → correct bit length
    BigInt::from_bytes(&bytes)
}
```
Setting the MSB ensures we get exactly bit_length bits (not fewer). Setting the LSB ensures the candidate is odd (even numbers > 2 aren’t prime, so why waste a test).
Witness Generation
We need random witnesses in $[2, n-2]$. The range has size $n - 3$, so we sample uniformly from $[0, n-4]$ and shift by 2:
```rust
pub fn generate_random_witness(n: &BigInt, rng: &mut impl RngCore) -> BigInt {
    let n_bytes = n.to_bytes().len();
    let two = BigInt::from(2);
    let n_minus_three = n - &BigInt::from(3);
    loop {
        let bytes = rng_bytes(n_bytes, rng);
        let a = BigInt::from_bytes(&bytes).mod_floor(&n_minus_three) + &two;
        if a >= two && a <= (n - &two) {
            return a;
        }
    }
}
```
Why avoid $a = 0, 1, n-1$? Zero is useless, $a = 1$ always gives 1 (tells us nothing), and $a = n-1 \equiv -1$ always gives $x = (-1)^d \equiv n-1$ on the very first step (since $d$ is odd), so that round is inconclusive no matter what $n$ is. We want witnesses that can actually distinguish primes from composites.
The Core Test
```rust
pub fn miller_rabin(n: &BigInt, k: u32) -> bool {
    // Handle small cases
    let one = BigInt::one();
    let two = BigInt::from(2);
    let n_minus_one = n - &one;
    match n {
        _ if n <= &one => return false,
        _ if n == &two => return true,
        _ if n == &BigInt::from(3) => return true,
        _ if n.mod_floor(&two) == BigInt::zero() => return false,
        _ => {}
    }

    // Decompose n-1 = 2^r * d with d odd
    let mut r = 0u32;
    let mut d = n_minus_one.clone();
    while d.mod_floor(&two) == BigInt::zero() {
        d = d.div_floor(&two);
        r += 1;
    }

    // Witness loop
    let mut rng = rand::thread_rng();
    'outer: for _ in 0..k {
        let a = generate_random_witness(n, &mut rng);
        let mut x = BigInt::mod_pow(&a, &d, n);
        if x == one || x == n_minus_one {
            continue;
        }
        for _ in 0..(r - 1) {
            x = BigInt::mod_pow(&x, &two, n);
            if x == n_minus_one {
                continue 'outer;
            }
        }
        return false; // Definitely composite
    }
    true // Probably prime
}
```
The 'outer label and continue 'outer are a Rust pattern for breaking out of nested loops: when we hit $n-1$ in the inner squaring loop, we want to skip to the next witness, not just break the inner loop.
The Generator
The top-level loop is almost trivially simple:
```rust
impl PrimeGenerator {
    pub fn generate(&self) -> BigInt {
        loop {
            let candidate = generate_prime_candidate(self.bit_length);
            if miller_rabin(&candidate, self.rounds) {
                return candidate;
            }
        }
    }
}
```
Generate, test, repeat. With 40 rounds of Miller-Rabin, the probability of accepting a composite is at most $4^{-40} \approx 8.3 \times 10^{-25}$ — smaller than the probability that a cosmic ray flips a bit in your hardware during the computation.
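The bound itself is quick to check numerically (a throwaway sketch):

```rust
fn main() {
    // Probability that k rounds all miss, at the worst-case 1/4 per round
    let bound = |k: i32| 0.25f64.powi(k);
    println!("k = 40: {:e}", bound(40)); // about 8.27e-25
    println!("k = 64: {:e}", bound(64)); // about 2.94e-39
}
```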
Testing
Good tests for a primality tester should cover the obvious (small primes, small composites, edge cases) and the adversarial:
```rust
#[test]
fn carmichael_numbers() {
    let carmichaels = [561, 1105, 1729, 2465, 6601];
    for c in carmichaels {
        assert!(
            !miller_rabin(&BigInt::from(c), 40),
            "{c} is a Carmichael number, should be composite"
        );
    }
}
```
These are the numbers that would fool a Fermat test but should be caught by Miller-Rabin. If our implementation passes these, the square-root-of-unity check is working correctly.
```text
running 5 tests
test tests::edge_cases ... ok
test tests::small_composites ... ok
test tests::carmichael_numbers ... ok
test tests::small_primes ... ok
test tests::generate_large_prime ... ok

test result: ok. 5 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.02s

Doc-tests miller_rabin

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
```
Error Bounds and Practical Considerations
The $4^{-k}$ bound is actually a worst case. For random composites (as opposed to adversarially chosen ones), the probability of a false positive is much lower. Most composites are caught by the very first witness.
In practice, cryptographic libraries often combine Miller-Rabin with trial division against small primes (2, 3, 5, 7, …, up to a few hundred) as a fast pre-filter. A candidate divisible by 3 doesn’t need 40 rounds of modular exponentiation to reject.
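Such a pre-filter might look like this (a sketch over u64 with a deliberately short prime table; a production version would work on BigInt and sieve a few hundred small primes):

```rust
/// Small primes used as a fast pre-filter before Miller-Rabin.
const SMALL_PRIMES: [u64; 12] = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37];

/// Returns false if n has a small prime factor (unless n is that prime itself).
/// Survivors still need the full Miller-Rabin treatment.
fn passes_trial_division(n: u64) -> bool {
    for &p in &SMALL_PRIMES {
        if n == p {
            return true;
        }
        if n % p == 0 {
            return false;
        }
    }
    true
}

fn main() {
    assert!(!passes_trial_division(561));  // divisible by 3: rejected without any mod_pow
    assert!(passes_trial_division(10403)); // 101 × 103 slips through: Miller-Rabin still needed
}
```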
The algorithm we’ve built here is essentially what OpenSSL, GnuPG, and other production libraries use under the hood — it’s the same mathematical core, just layered with engineering optimizations on top.