Feeding a PRF its own tail

A while back, while reviewing code for a customer, I stumbled on a custom RNG. It was used for pretty much everything in the application, including generating supposedly unique identifiers.

A custom RNG is usually a red flag, so I took a closer look.

But before getting into how it works, we need to talk about pseudorandom functions.

On PRFs

A pseudorandom function (PRF) takes a key and some input data, and produces a fixed-size output:

PRF(key, data) -> output

Informally, in a secure PRF, given any number of (data, output) pairs, recovering the key is as hard as a brute-force search.

More importantly, no matter how many distinct (data, output) pairs you have already seen, for a new data input any possible output is equally likely. Seeing an output before doesn’t mean a different input can’t produce it again. Unlike a permutation, the security of the function doesn’t degrade as the number of inputs grows.

A common example is SipHash: it takes a 128-bit key and a variable-length input, and produces a 64- or 128-bit output.

The custom random number generator

The custom RNG worked like this:

SipHash(key, state) -> new_state, output

The key is generated from the kernel RNG when the application starts, and never changes afterwards.

There’s also a special case with a static key, where callers of the public API can set the initial state themselves.

The state is initialized to a different value (the seed) for each thread.

Each call splits the function’s 128-bit output into two 64-bit words: the first half overwrites the state, the second half is handed back to the caller as random bytes.

At first sight, the generator may feel reasonable. A 2⁶⁴ domain is borderline, but for most protocols it might be acceptable in practice. The PRF is indistinguishable from random under its key, the state never sits still, and every output depends on the full secret. What could go wrong?

Quite a lot, as it turns out.

The rho-shaped graph

A pseudorandom function is not a permutation. Two different inputs are perfectly free to map to the same output.

And on average, they do: about 37% of the codomain is never hit at all.

What does that mean for our random number generator?

It means the sequence of states is a walk on a graph where every node has exactly one outgoing edge and an unpredictable number of incoming edges. That walk always has the same shape: a tail of states that are never revisited, leading into a cycle that loops forever. Like the Greek letter ρ.

With a 2⁶⁴ state, on average we fall into a loop after about 2³² steps. That’s very small. It’s entirely practical to generate, store, and index the whole cycle on a laptop, and then, given any output, predict the ones that follow.

Feeding a random function its own output gives you about √N outputs, not N. Every doubling of state size buys you one extra bit of effective period, not two.

Threads end up agreeing with each other

The RNG has another failure mode. All threads share a common key, but each starts from a different seed.

So they start at different nodes of the functional graph. But the graph is fixed, and every node eventually leads into one of a small number of cycles. How small?

A random function on N elements has, on average, about ½ ln N distinct cycles. For N = 2⁶⁴, that’s roughly twenty-two. Just twenty-two attractors, into which the entire 2⁶⁴-element state space drains.

So no matter how carefully you seeded your threads with independent entropy, after each one walks its tail of a few billion steps, it ends up trapped on one of those twenty-two cycles.

From that point on, the threads are no longer producing independent streams. They are producing rotations of one of a handful of fixed sequences.

The long-run behavior of the system collapses onto a tiny shared set of orbits.

For applications where the threads’ streams are supposed to be independent, this is bad. The threads end up generating the same sequences.

Short cycles in PRFs

With a PRF and a 64-bit input, about 2³² is the longest cycle you can hope for, and the variance around it is large. Depending on the seed and the exact function, the period can be far shorter.

For example, directly feeding SipHash with a 64-bit output back as the next state, and with key (0, 1) and seed 0x0000004de9886527 the loop closes after only 16,933,653 steps, more than two hundred times shorter than the 2³² you’d expect on average.

Nothing stops a pseudorandom function from cycling faster still, all the way down to a fixed point: a state that maps to itself, so the generator emits the same value forever. Fortunately, SipHash has only one fixed point: the all-zero state, unreachable for any key. Hard to hit by accident.

Backtracking resistance

If the key and some outputs leak, it’s not only possible to predict every future output. It’s also possible to recompute the previous ones.

No preimage search needed: the cycles are short enough that it’s more efficient to just compute them.

A minimal fix

This random number generator is obviously flawed and must not be used in any security context.

It’s also very inefficient, producing only 64 bits per call.

A minimal fix would be to use per-thread keys and a counter:

SipHash(thread_key, counter) -> output

Incrementing the counter after every produced output builds a sequence that won’t repeat until the counter wraps, at which point you should rekey, and you’ll be safe for about 2⁹⁶ outputs. As a bonus, it’s twice as fast as the original generator: 128 bits per call instead of 64.

That doesn’t buy any backtracking resistance, though. The keys still need frequent updates.

I’m also not convinced SipHash is a good fit here, given the input size. AES would be faster, and fine as long as it’s rekeyed before 2³² outputs. Or, if you can use it, AES-PRF (1, 2) lets you go up to 2⁶⁴ output blocks before rekeying, keeping the generic distinguishing advantage around 2^-128, at no performance cost.

A better fix

It’s generally hard to justify building a custom random number generator. These days, the standard library of your favorite language probably ships efficient functions to safely generate random numbers. Use them.

If you really need to build an RNG, be very careful. Use at least a 128-bit or 256-bit state.

Daniel Bernstein’s post on fast-key-erasure random-number generators remains an excellent reference.

Want something fancier and faster than AES-CTR? Use ButterKnife in counter mode (similar to Skye). It outputs 1024 bits per call, and the first 256 can be used as the next tweak and key. PoC.

Want something based on a public permutation? I like Reverie a lot.

But realistically, you probably don’t need any of that.

What happened next

I reported all of this. Years later, the generator itself was never touched; the only change was a line added to the documentation noting that it “should not be used for cryptographic purposes”.

Sad.