Item 44158824

How does this compare with Abseil's random library? https://abseil.io/docs/cpp/guides/random

bhickey • 4 days ago

This library appears to be insecure by default. I think there are vanishingly few use cases for non-crypto RNGs. We made absl random secure by default using randen: https://arxiv.org/abs/1810.02227

The algorithm is provably secure, so long as AES is secure. It is also backtracking resistant: an adversary with the current RNG state cannot step backwards.

On hardware with AES primitives, it's faster than MT, though slower than pcg64.

5 replies

SideQuark • 4 days ago

By far the vast majority of RNG calls in use need to be non-crypto. They are so many orders of magnitude too slow for so many uses that it's ludicrous to claim there are " vanishingly few use cases for non-crypto RNGs," unless you're trying to scare people into using your work.

Science, neural networks, simulation, gaming, rendering, weather, nuclear, robotics, signal processing, engineering, finance, and more industries require fast rngs to get billions to trillions of them quickly.

Very few things actually need secure - only the things that need a secure endpoint, and most of those simply use the secure rng to do a private key transfer algo, after which there is no more rngs.

Use the right tool for the job. Widen your view of what things are used for. Etc.

1 reply

bhickey • 3 days ago

> They are so many orders of magnitude too slow for so many uses

On hardware with AES primitives this is simply false. Yes, embedded cases are different. There's some neat work on probabilistic computing that uses a xorshift RNG in hardware. These are specialized use cases that probably aren't your use case. Use the right tool for the job and try to be less condescending.

1 reply

SideQuark • 3 days ago

> On hardware with AES primitives this is simply false.

????

Do you even code these things?

I'm unaware of any CPU with AES hardware that lets you compute SIMD under AES, and AES instructions on every CPU I've used (quite a few) are again slower even for single data paths. Once you switch to PRNGs using SIMD that alone is an order of magnitude faster than any AES method. It's not hard to check (say via Agner Fog's CPU instruction tables if you want Intel/AMD) that no AES method is going to beat the fastest non-crypto RNGS. Or simply code and profile a few.

AES primitives are generally used for AES, and even for RNG, they are slow. No one is using crypto rand, even with AES, for all the uses I mentioned, which is orders of magnitude more rands than all the crypto in the world. Between physical simulation, video game uses, and no data centers and AI running PRNGs up the wazoo, all the crypto rand in the world is a tiny, tiny, fraction of the uses - and even there they're used a handful of times to set up a private key system with no more PRNGs needed (and this is why AES instructions were built into CPUs in the first place - for symmetric key encryption - no PRNGS there...)

And once you stop with the simple single rng call mentality, the world runs on GPUs and parallel HW where you get 1000 to ~20,000+ cores, all of which are doing non-crypto rand a lot for the uses above.

There's a reason places like CERN publish RNG algorithms suitable for their needs, which are often adopted in many industries that need actual high speed, high quality PRNGs, and none of them are crypto RNGs. Not one.

So go ahead and show me this fast AES PRNG that is as fast as the fastest non-crpyto RNGs. I'll wait......

I've worked in this space a long time, done DARPA HPC projects, written articles on PRNGs, designed specialized PRNGs, have a math PhD (my advisor did crypto, so I've done quite a bit of that), and have written numerical software for just about every application mentioned above. Read my posting history. I know what I'm taking about.

> Use the right tool for the job ....

Yes, indeed.

> ... and try to be less condescending.

Then don't make false claims, then double down with more. I'm not the only one calling you out on your claims in this thread; probably there's a reason.

spacechild1 • 4 days ago

> I think there are vanishingly few use cases for non-crypto RNGs.

There are quite a few applications:

- audio applications

- computer games

- simulations

- etc.

2 replies

bhickey • 4 days ago

Years ago a player figured out how to decode the RNG state in dungeon crawl. This allowed them to essentially God mode the game because they could determine when a monster would land a hit.

Night_Thastus • 4 days ago

Out of curiosity, how do audio applications use RNG?

1 reply

magicalhippo • 4 days ago

Noise. Either directly as a source, which is then filtered to produce say a cymbal sound, or indirectly by creating timing variance, ie notes not playing exactly at the beat each time, which makes things sound less artificial and more pleasing.

It can also be used to dither[1] the result when doing a final rendering, to improve the noise floor.

[1]: https://en.wikipedia.org/wiki/Noise_shaping

1 reply

Night_Thastus • 4 days ago

Ah, ok. I was thinking playback. Noise makes total sense. Thanks!

1 reply

magicalhippo • 4 days ago

I guess I should have pointed out the noise shaping could be useful if you're doing 16bit playback of a 24bit source, so possibly useful during playback as well.

thomasmg • 4 days ago

So I'm a bit confused, the paper says "'Randen' ... outperforming ... PCG" but now you wrote "though slower than pcg64"? In the paper itself, PCG is faster but "these microbenchmark results are quite irrelevant in practice — which actual application repeatedly calls a random generator and ignores the results?" I wonder, was the paper peer-reviewed? (I'm not saying it should have been..., just wondering.)

My guess is that yes, Randen is fast, but it is not quite as fast as PCG or other pseudo-random number generators.

Whether or not it is relevant, that's another story. I can not agree to your claim "there are vanishingly few use cases" for insecure random number generators.

3 replies

jeffbee • 4 days ago

If you look inside Abseil it seems that the authors generally agree on performance. The default random in Abseil is Randen, but the faster variant the use of which is discouraged is PCG.

SideQuark • 2 days ago

The paper doesn't seem peer reviewed or published. It doesn't show up as published anywhere on Google scholar, and papers that are published years later still reference the arxiv link, not a publication link. This is pretty strong evidence of not being reviewed or published.

You can check via: google scholar, search randen random number, note there is no version with a journal, click citations and look through the few articles mentioning it, you can see which of those are published, can see a few, not one has a journal ref to it.

bhickey • 4 days ago

I was citing the micro benchmarks because they provide a pessimistic comparison in favor of other RNGs. My recollection is that it appeared as a conference paper, but I'd need to ask the author.

1 reply

SideQuark • 2 days ago

> the micro benchmarks because they provide a pessimistic comparison in favor of other RNGs

Why would one claim they're fast, then simply fiddle with benchmarks so they find a place where they look fast?

When someone wants a fast RNG, they're not calling it 10 times per sec. They're hammering it for actual needs.

adastra22 • 4 days ago

Every time you use a hash map you use a (PR)RNG. There are an abundant number of use cases for non-CS RNGs.

3 replies

jeffbee • 4 days ago

Well, when you use the Abseil hash maps the "random" number is actually a thread-local variable that monotonically increments, and the Abseil hash functions use static "random" data.

spacechild1 • 4 days ago

That would be a very strange hash map :) Could it be that you are confusing hash functions with PRNGs?

2 replies

SideQuark • 2 days ago

A many hash tables use PRNG sequences for collision resolution by moving around to other buckets. That PRNG seed comes from the hash value. PRNGs are chosen for good uniform spread and repeatable sequence and speed.

The naive collision resolution ideas start like this: chained items (slow, memory costs), linear probing (results in clustering), quadratic probing (better, still has problems), double hashing (good), PRNG sequences with good properties (so any item hitting the same spot checks the same seq of overflow slots. One can alternate linear and PRNG or other combos.

Start with the basic wiki article https://en.wikipedia.org/wiki/Hash_table, check through many of the modern has table types linked from there.

adastra22 • 4 days ago

Why do you think they are different? That's what a hash function is--mapping input to outputs with a pseudo-random distribution. Different words for literally the same thing.

1 reply

spacechild1 • 4 days ago

> That's what a hash function is--mapping input to outputs with a pseudo-random distribution

That's not what a PRNG does. A PRNG produces a sequence of pseudo-random values (with a particular distribution) based on an initial seed value.

A hash function is stateless, a PRNG is not.

1 reply

adastra22 • 4 days ago

When used as a hasher, the 'seed' is the input. When used as a PRNG tool, the seed is an incrementing counter or something. It's the same function though.

1 reply

spacechild1 • 4 days ago

I see what you mean. You can use a hash function to build a PRNG, but the hash function itself it not a PRNG.

In a hash table, the hash function is not used to produce a series of random numbers, that's why it's a bit silly to call it a PRNG. Instead, the purpose is to reduce a (possibly larger) input value to a fixed size output value, often with the requirement that similar input values should result in very different output values to avoid clustering.

However, your hash table example is still useful as an analogy.

bhickey • 4 days ago

You've made yourself susceptible to hash flooding.

duped • 4 days ago

I mean the only use case for cryptographically secure RNGs is cryptography. I can't think of non-crypto use cases that need it, but all of those need performance (and determinism is sometimes desirable!).