thr3ads.net - Linux Virtualization - getrandom waits for a long time when /dev/random is insufficiently read from [Jul 2016]

Stephan Mueller

2016-Jul-29 17:03 UTC

getrandom waits for a long time when /dev/random is insufficiently read from

On Friday, 29 July 2016, 10:14:07 CEST, Alex Xu wrote:

Hi Alex,
> On Fri, 29 Jul 2016 15:12:30 +0200
> 
> Stephan Mueller <smueller at chronox.de> wrote as excerpted:
> > On Friday, 29 July 2016, 09:03:45 CEST, Alex Xu wrote:
> > > In my opinion, assuming I am not doing something terribly wrong,
> > > this constitutes a bug in the kernel's handling of getrandom calls
> > > at boot, possibly only when the primary source of entropy is
> > > virtio.
> > 
> > Nope, I do not think that this is true:
> > 
> > - /dev/random returns one byte for one byte of entropy received, but
> > it has a lower limit of 64 bits
> > 
> > - getrandom behaves like /dev/urandom (i.e. nonblocking) except
> > during boot where it waits until the RNG has collected 128 bits
> > before operating like a DRNG that is seeded once in a while when
> > entropy comes in.
> > 
> > 
> > Ciao
> > Stephan
> 
> I don't follow. Assuming you are correct and this is the issue, then
> reading 128 bits (16 bytes) from /dev/random should "exhaust the
> supply" and then both reads from /dev/random and calling getrandom
> should block.
You assume that getrandom works like /dev/random. This is not the case. It is
a full deterministic RNG like /dev/urandom (which is reseeded during its
operation as entropy becomes available).

getrandom differs from /dev/*u*random *only* in that it initially waits until
the system has collected 128 bits of entropy.

But you point to a real issue: when /dev/random is read before getrandom is
called (while insufficient entropy is still present), the getrandom caller is
woken up once the input_pool has received 128 bits. But those 128 bits are
transferred from the input_pool to the blocking_pool on behalf of the
/dev/random reader. This implies that the getrandom caller will NOT be able to
move data from the input_pool into the nonblocking_pool, because the transfer
operation will not succeed. The nonblocking_pool therefore remains unseeded,
and yet getrandom returns data to the caller.

> That, however, is not the behavior I observed, which is that reading
> any amount from /dev/random will never block (since it is fed
> from /dev/urandom on the host side) whereas calling getrandom will
> always block unless /dev/random is read from first.
That is a different issue, which I did not gather from your initial
explanation.

I need to look into it a bit deeper.

> Moreover, as long as virtio-rng is available (and fed
> from /dev/urandom), /proc/sys/kernel/random/entropy_avail is always 961
> immediately after booting, which is more than enough to satisfy a
> one-byte read. After reading 1 byte, the estimate decreases to 896 or
> 897, but after reading 29 more bytes it increases to 1106.
> 
> Again, these observations are consistent with the conjecture that the
> issue arises since virtio-rng is a "pull" source of entropy whereas
> most other methods (e.g. interrupt timing) are "push" sources. I
> suspect that a similar issue occurs if RDRAND is the only source of
> entropy.
> 
> I also tried running rngd in the guest which resolved the issue but
> seems entirely stupid to me, even more so since
> http://rhelblog.redhat.com/2015/03/09/red-hat-enterprise-linux-virtual-machines-access-to-random-numbers-made-easy/
> says that "The use of rngd is now not required and the guest kernel
> itself fetches entropy from the host when the available entropy falls
> below a specific threshold.".
Right -- the kernel now has an in-kernel link that makes rngd superfluous in
this case.



Ciao
Stephan
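
For anyone wanting to reproduce the entropy_avail readings quoted in this message, a minimal Python sketch follows. It assumes a Linux guest with Python 3; the function names are illustrative and do not come from the thread. It reads from /dev/urandom by default so that it cannot stall; passing device="/dev/random" mirrors the reads described in the quoted text, but may block on an affected kernel.

```python
from pathlib import Path

# Path quoted in the thread; it exists only on Linux.
ENTROPY_AVAIL = Path("/proc/sys/kernel/random/entropy_avail")


def read_entropy_avail(path=ENTROPY_AVAIL):
    """Return the kernel's current entropy estimate in bits, or None if unreadable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def estimate_after_reads(read_sizes, device="/dev/urandom"):
    """Sample the estimate once up front, then again after each read from `device`.

    Returns a list of (bytes_just_read, estimate_in_bits) pairs; the first
    pair has bytes_just_read == 0 and is the baseline before any read.
    """
    samples = [(0, read_entropy_avail())]
    # buffering=0 so that Python requests exactly `size` bytes from the device.
    with open(device, "rb", buffering=0) as dev:
        for size in read_sizes:
            dev.read(size)
            samples.append((size, read_entropy_avail()))
    return samples
```

Calling estimate_after_reads([1, 29]) samples the estimate before any read, after a 1-byte read, and after a further 29-byte read. The values depend on the kernel version and the entropy sources present, so they need not match the figures quoted in this message.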

Alex Xu

2016-Jul-29 17:31 UTC

On Fri, 29 Jul 2016 19:03:51 +0200
Stephan Mueller <smueller at chronox.de> wrote as excerpted:
> On Friday, 29 July 2016, 10:14:07 CEST, Alex Xu wrote:
> > I don't follow. Assuming you are correct and this is the issue, then
> > reading 128 bits (16 bytes) from /dev/random should "exhaust the
> > supply" and then both reads from /dev/random and calling getrandom
> > should block.
> 
> You assume that getrandom works like /dev/random. This is not the
> case. It is a full deterministic RNG like /dev/urandom (which is
> seeded during its operation as entropy is available).
My understanding was that all three methods of obtaining entropy from
userspace all receive data from the CSPRNG in the kernel, and that the
only difference is that /dev/random and getrandom may block depending
on the kernel's estimate of the currently available entropy.
> getrandom *only* differs from /dev/*u*random in that it waits
> initially such that the system collected 128 bits of entropy.
I agree, this is the documented behavior of getrandom.
> But you point to a real issue: when /dev/random is pulled before
> getrandom (and yet insufficient entropy is present), then the
> getrandom call will be woken up when the input_pool received 128
> bits. But those 128 bits are fed from the input_pool to the
> blocking_pool based on the caller at the /dev/random device. This
> implies that the reader for getrandom will NOT be able to obtain data
> from the input_pool and the nonblocking_pool because the transfer
> operation will not succeed. This implies that the nonblocking_pool
> remains unseeded and yet getrandom returns data to the caller.
I don't understand what this means. For my use case, hwrng is fed from
the host's urandom, so none of /dev/random, /dev/hwrng, /dev/urandom,
or getrandom with any flags in the guest should ever block except
possibly for very large amounts requested (megabytes at least).
> > That, however, is not the behavior I observed, which is that reading
> > any amount from /dev/random will never block (since it is fed
> > from /dev/urandom on the host side) whereas calling getrandom will
> > always block unless /dev/random is read from first.  
> 
> That is a different issue that I did not read from your initial
> explanation.
> 
> I need to look into it a bit deeper.
I have been trying to explain the same problem the entire time. Let me
be clear what the problem is as I see it:

When qemu is started with -object rng-random,filename=/dev/urandom, and
immediately (i.e. with no initrd and as the first thing in init):

1. the guest runs dd if=/dev/random, there is no blocking and tons of
data goes to the screen. the data appears to be random.

2. the guest runs getrandom with any requested amount (tested 1 byte
and 16 bytes) and no flags, it blocks for 90-110 seconds while the
"non-blocking pool is initialized". the returned data appears to be
random.

3. the guest runs getrandom with GRND_RANDOM with any requested amount,
it returns the desired amount or possibly less, but in my experience at
least 10 bytes. the returned data appears to be random.

I believe that the difference between cases 1 and 2 is a bug, since
based on my previous statement, in this scenario, getrandom should
never block.
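
The three cases above can be timed from a script. The following is a minimal Python sketch, assuming a Linux guest with Python 3.6 or later (where os.getrandom and os.GRND_RANDOM are available); the helper names are illustrative and are not part of the original report. On an affected kernel, case 2 is expected to stall for as long as the kernel takes to initialize the non-blocking pool.

```python
import os
import time


def timed_getrandom(nbytes, flags=0):
    """Call getrandom(2) through os.getrandom and time it.

    Returns (seconds_elapsed, bytes_returned). bytes_returned is None when
    GRND_NONBLOCK was passed and the call would otherwise have blocked.
    """
    start = time.monotonic()
    try:
        data = os.getrandom(nbytes, flags)
    except BlockingIOError:
        data = None
    elapsed = time.monotonic() - start
    return elapsed, (None if data is None else len(data))


def timed_dev_random(nbytes, device="/dev/random"):
    """Read up to nbytes from the device and return (seconds_elapsed, bytes_returned)."""
    start = time.monotonic()
    with open(device, "rb", buffering=0) as dev:
        data = dev.read(nbytes)
    return time.monotonic() - start, len(data)


def run_cases(nbytes=16):
    """Time the three cases in the order listed in the message. May block."""
    return [
        ("1. read /dev/random", timed_dev_random(nbytes)),
        ("2. getrandom, no flags", timed_getrandom(nbytes)),
        ("3. getrandom, GRND_RANDOM", timed_getrandom(nbytes, os.GRND_RANDOM)),
    ]
```

Running `for label, (secs, count) in run_cases(): print(label, count, f"{secs:.3f}s")` as the first thing in init gives one line per case. Passing os.GRND_NONBLOCK to timed_getrandom turns the stall into an immediate None result, which is a safer way to check a system that must not hang.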

Theodore Ts'o

2016-Jul-30 22:09 UTC

On Fri, Jul 29, 2016 at 01:31:14PM -0400, Alex Xu wrote:
> My understanding was that all three methods of obtaining entropy from
> userspace all receive data from the CSPRNG in the kernel, and that the
> only difference is that /dev/random and getrandom may block depending
> on the kernel's estimate of the currently available entropy.
This is incorrect.

/dev/random is a legacy interface which dates back to a time when
people didn't have as much trust in the cryptographic primitives ---
when there were concerns that the NSA might have put a back door into
SHA-1, for example.  (As it turns out, we were wrong.  NSA put the
back door into Dual EC DRBG.)  So it uses a strategy of an extremely
conservative entropy estimator, and will allow N bytes to be drawn
from the /dev/random pool only once the entropy estimator believes
that it has gathered at least N bytes of entropy from environmental
noise.

/dev/urandom uses a different output pool from /dev/random (the random
and urandom pools both draw from a common input pool).  Originally
the /dev/urandom pool drew from the input pool as needed, but it
wouldn't block if there was insufficient entropy.  Over time, it has
gained limits on how quickly it can draw from the input pool, and it
behaves more and more like a CSPRNG.  In fact, in the most recent set
of patches which Linus has accepted for v4.8-rc1, the urandom pool has
been replaced by an actual CSPRNG using the ChaCha-20 stream cipher.

The getrandom(2) system call uses the same output pool (4.7 and
earlier) or CSPRNG (starting with v4.8-rc1) as /dev/urandom.  The big
difference is that it blocks until we know for sure that the output
pool or CSPRNG has been seeded with 128 bits of entropy.  We don't do
this with /dev/urandom for backwards compatibility reasons.  (For
example, if we did make /dev/urandom block until it was seeded, it
would break systemd, because systemd and programs run by systemd draw
from /dev/urandom before it has been initialized, and if /dev/urandom
were to block, the boot would hang, and with the system quiescent, we
wouldn't get much environmental noise, and the system would hang for
hours.)
> When qemu is started with -object rng-random,filename=/dev/urandom, and
> immediately (i.e. with no initrd and as the first thing in init):
> 
> 1. the guest runs dd if=/dev/random, there is no blocking and tons of
> data goes to the screen. the data appears to be random.
> 
> 2. the guest runs getrandom with any requested amount (tested 1 byte
> and 16 bytes) and no flags, it blocks for 90-110 seconds while the
> "non-blocking pool is initialized". the returned data appears to be
> random.
> 
> 3. the guest runs getrandom with GRND_RANDOM with any requested amount,
> it returns the desired amount or possibly less, but in my experience at
> least 10 bytes. the returned data appears to be random.
> 
> I believe that the difference between cases 1 and 2 is a bug, since
> based on my previous statement, in this scenario, getrandom should
> never block.
This is correct; and it has been fixed in the patches in v4.8-rc1.
The patch which fixes this has been marked for backporting to stable
kernels:

commit 3371f3da08cff4b75c1f2dce742d460539d6566d
Author: Theodore Ts'o <tytso at mit.edu>
Date:   Sun Jun 12 18:11:51 2016 -0400

    random: initialize the non-blocking pool via add_hwgenerator_randomness()

    If we have a hardware RNG and are using the in-kernel rngd, we should
    use this to initialize the non-blocking pool so that getrandom(2)
    doesn't block unnecessarily.

    Cc: stable at kernel.org
    Signed-off-by: Theodore Ts'o <tytso at mit.edu>

Basically, the urandom pool (now CSPRNG) wasn't getting initialized
from the hardware random number generator.  Most people didn't notice
because very few people actually *use* hardware random number
generators (although it's much more common in VMs, which is how
you're using it), and use of getrandom(2) is still relatively rare,
given that glibc hasn't yet seen fit to support it.

Cheers,

					- Ted
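
The seeding rule described in this message (getrandom(2) blocks until the urandom pool or CSPRNG has been seeded) can be checked from userspace without risking a hang. The following is a minimal Python sketch, assuming Linux and Python 3.6 or later; it relies on os.getrandom raising BlockingIOError when GRND_NONBLOCK is set and the pool is not yet initialized. The function names are illustrative, and the sketch does not show how the kernel itself tracks seeding.

```python
import os
import time


def urandom_pool_seeded():
    """Report whether getrandom(2) would return without blocking.

    Returns True if the pool is initialized, False if a blocking call would
    stall, and None if os.getrandom is unavailable on this platform.
    """
    if not hasattr(os, "getrandom"):
        return None
    try:
        os.getrandom(1, os.GRND_NONBLOCK)
    except BlockingIOError:
        return False
    return True


def wait_until_seeded(timeout=120.0, poll_interval=0.5):
    """Poll until the pool is seeded, the state is unknown, or timeout expires.

    Returns (state, seconds_waited), where state is the last value returned
    by urandom_pool_seeded().
    """
    start = time.monotonic()
    while True:
        state = urandom_pool_seeded()
        waited = time.monotonic() - start
        # Stop on True (seeded) or None (cannot tell); keep polling on False.
        if state is not False or waited >= timeout:
            return state, waited
        time.sleep(poll_interval)
```

Run early in boot, wait_until_seeded() gives a rough measure of the delay discussed in this thread; on a kernel carrying the fix and a working hardware RNG, the expectation is that it returns almost immediately.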
