thr3ads.net - freebsd security - Collecting entropy from device

If this information is useful, please help other people find it:
Share via:

Pawel Jakub Dawidek

2012-Sep-18 21:23 UTC

Collecting entropy from device_attach() times.

Hi.

I experimented a bit with collecting entropy from the time it takes for
device_attach() to run (in CPU cycles). It seems that those times have
enough variation that we can use it for entropy harvesting. It happens
even before root is mounted, so pretty early.

On the machine I'm testing it, which has minimal kernel plus NIC driver
I see 75 device_attach() calls. I'm being very careful and advertising
to yarrow that each call has only 4 bits of entropy (most of the time
there is much more). This gives 300 bits of entropy on this machine
before we even start init.

For real hardware like sound card it takes between 34647162 and 35548675
cycles to run device_attach(), so the difference here is 901513. If all
the times are more or less equally probable in this range we have more
than 19 bits of entropy from this one call, but I reduced if to four
bits only, because there are devices that are much faster to attach.
We could make the code more complex by assuming 0.01% of the time
varies, which should still be safe and will allow to collect more
entropy from those long calls.

The patch is here:

http://people.freebsd.org/~pjd/patches/harvest_device_attach.patch

Comments?

--
Pawel Jakub Dawidek http://www.wheelsystems.com
FreeBSD committer http://www.FreeBSD.org
Am I Evil? Yes, I Am! http://tupytaj.pl
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 196 bytes
Desc: not available
Url :
http://lists.freebsd.org/pipermail/freebsd-security/attachments/20120918/64884bfb/attachment.pgp

Dag-Erling Smørgrav

2012-Sep-19 16:52 UTC

head link

Collecting entropy from device_attach() times.

Pawel Jakub Dawidek <pjd@FreeBSD.org> writes:> I experimented a bit with collecting entropy from the time it takes for
> device_attach() to run (in CPU cycles). It seems that those times have
> enough variation that we can use it for entropy harvesting. It happens
> even before root is mounted, so pretty early.
Excellent idea :)
> On the machine I'm testing it, which has minimal kernel plus NIC driver
> I see 75 device_attach() calls. I'm being very careful and advertising
> to yarrow that each call has only 4 bits of entropy (most of the time
> there is much more). This gives 300 bits of entropy on this machine
> before we even start init.
Virtual machines (and even some physical hardware) can have as few as 40
devices.  I have a VirtualBox instance running 9.1-RC1 that has only 36
devices (based on `sysctl dev | cut -d. -f2-3 | sort -u | wc -l`), and a
soekris net5501 that only has 43.  This does not count network
interfaces, though.
> For real hardware like sound card it takes between 34647162 and 35548675
> cycles to run device_attach(), [...]
You can't rely on the existence of a TSC.  I would suggest using the
fractional part of binuptime instead.

I would also suggest modifying yarrow to block reseeding as long as
possible, ideally right up until the first time something asks for a
random number, since reseeding throws away all accumulated entropy.  I'd
suggest delaying reseeding until right before we start the scheduler,
but if I understand correctly, geom_geli may need randomness before
that?

DES
-- 
Dag-Erling Sm?rgrav - des@des.no

2012-Sep-19 18:28 UTC

head link

Collecting entropy from device_attach() times.

On Tue, 18 Sep 2012 23:14:22 +0200
Pawel Jakub Dawidek wrote:
> Hi.
> 
> The patch is here:
> 
> 	http://people.freebsd.org/~pjd/patches/harvest_device_attach.patch
> 
> Comments?
> 
+	attachtime = get_cyclecount() - attachtime;

the above line is redundant since random_harvest() already contains a
call to get_cyclecount().



On Wed, 19 Sep 2012 17:28:46 +0200
Dag-Erling Sm?rgrav wrote:
> You can't rely on the existence of a TSC.  I would suggest using the
> fractional part of binuptime instead.
get_cyclecount() is supposed to be platform independent and should
fall-back to nanotime(9) if TSC or equivalent is absent.

Jonathan Anderson

2012-Sep-19 18:58 UTC

head link

Collecting entropy from device_attach() times.

On Tuesday, 18 September 2012 at 22:14, Pawel Jakub Dawidek
wrote:> I experimented a bit with collecting entropy from the time it takes for
> device_attach() to run (in CPU cycles). It seems that those times have
> enough variation that we can use it for entropy harvesting. It happens
> even before root is mounted, so pretty early.
>  
That sounds really great.
  > If all the times are more or less equally probable in this range [?]
They're very unlikely to be equally probable. It would make sense to do some
characterization of these times and their statistics: a highly non-uniform
distribution would mean that we don't actually get many bits per attach.

  > [?] we have more
> than 19 bits of entropy from this one call, but I reduced if to four
> bits only, because there are devices that are much faster to attach.
>  
Another reason for doing the above characterization is that, if a particular
device_attach() really does provide 12 bits of uncertainty, it's a shame to
drop eight of them on the floor.
> We could make the code more complex by assuming 0.01% of the time
> varies, which should still be safe and will allow to collect more
> entropy from those long calls.
>  
I'm a bit leery of assuming that things "should still be safe" for
the above reasons. Again, some hard numbers would really help here. Maybe we
should even convince a student to do a project. :)


Jon
--  
Jonathan Anderson

Research Associate
Computer Laboratory
University of Cambridge

jonathan.anderson@cl.cam.ac.uk
+44 1223 763 747

David O'Brien

2012-Sep-19 22:35 UTC

head link

Collecting entropy from device_attach() times.

On Tue, Sep 18, 2012 at 11:14:22PM +0200, Pawel Jakub Dawidek
wrote:> I experimented a bit with collecting entropy from the time it takes for
> device_attach() to run (in CPU cycles). It seems that those times have
> enough variation that we can use it for entropy harvesting. It happens
> even before root is mounted, so pretty early.
I like it.  Microsoft harvests from something like 900 events/things.
The more good things like this we find improves our security.
> The patch is here:
> 	http://people.freebsd.org/~pjd/patches/harvest_device_attach.patch
> Comments?
Embelishments:

Index: sys/dev/random/randomdev_soft.c
==================================================================---
sys/dev/random/randomdev_soft.c	(revision 240694)
+++ sys/dev/random/randomdev_soft.c	(working copy)
@@ -158,6 +185,11 @@ random_yarrow_init(void)
 	    "Harvest serial net entropy");
 	SYSCTL_ADD_PROC(&random_clist,
 	    SYSCTL_CHILDREN(random_sys_harvest_o),
+	    OID_AUTO, "devprobe", CTLTYPE_INT | CTLFLAG_RW,
+	    &harvest.devprobe, 1, random_check_boolean, "I",
+	    "Harvest Device Probe entropy");
+	SYSCTL_ADD_PROC(&random_clist,
+	    SYSCTL_CHILDREN(random_sys_harvest_o),
 	    OID_AUTO, "interrupt", CTLTYPE_INT | CTLFLAG_RW,
 	    &harvest.interrupt, 1, random_check_boolean, "I",
 	    "Harvest IRQ entropy");
@@ -303,7 +341,7 @@ random_harvest_internal(u_int64_t someco
 	KASSERT(origin == RANDOM_START || origin == RANDOM_WRITE ||
             origin == RANDOM_KEYBOARD || origin == RANDOM_MOUSE ||
             origin == RANDOM_NET || origin == RANDOM_INTERRUPT ||
-            origin == RANDOM_PURE,
+            origin == RANDOM_PURE || origin == RANDOM_DEVICE,
 	    ("random_harvest_internal: origin %d invalid\n", origin));
 
 	/* Lockless read to avoid lock operations if fifo is full. */
Index: sys/dev/random/harvest.c
==================================================================---
sys/dev/random/harvest.c	(revision 240694)
+++ sys/dev/random/harvest.c	(working copy)
@@ -48,7 +48,13 @@ __FBSDID("$FreeBSD$");
 static int read_random_phony(void *, int);
 
 /* Structure holding the desired entropy sources */
-struct harvest_select harvest = { 1, 1, 1, 0 };
+struct harvest_select harvest = {
+	1,	/*ethernet*/
+	1,	/*pt2pt*/
+	1,	/*intr*/
+	0,	/*swi*/
+	1,	/*devprobe*/
+};
 static int warned = 0;
 
 /* hold the address of the routine which is actually called if
Index: sys/sys/random.h
==================================================================---
sys/sys/random.h	(revision 240495)
+++ sys/sys/random.h	(working copy)
@@ -45,6 +45,7 @@ enum esource {
 	RANDOM_NET,
 	RANDOM_INTERRUPT,
 	RANDOM_PURE,
+	RANDOM_DEVICE,
 	ENTROPYSOURCE
 };
 void random_harvest(void *, u_int, u_int, u_int, enum esource);
@@ -57,6 +58,7 @@ struct harvest_select {
 	int point_to_point;
 	int interrupt;
 	int swi;
+	int device;
 };
 
 extern struct harvest_select harvest;
Index: sys/kern/subr_bus.c
==================================================================---
sys/kern/subr_bus.c	(revision 240495)
+++ sys/kern/subr_bus.c	(working copy)
@@ -44,6 +44,7 @@ __FBSDID("$FreeBSD$");
 #include <sys/condvar.h>
 #include <sys/queue.h>
 #include <machine/bus.h>
+#include <sys/random.h>
 #include <sys/rman.h>
 #include <sys/selinfo.h>
 #include <sys/signalvar.h>
@@ -53,6 +54,7 @@ __FBSDID("$FreeBSD$");
 #include <sys/bus.h>
 #include <sys/interrupt.h>
 
+#include <machine/cpu.h>
 #include <machine/stdarg.h>
 
 #include <vm/uma.h>
@@ -2760,8 +2762,10 @@ device_probe_and_attach(device_t dev)
 int
 device_attach(device_t dev)
 {
+	uint64_t attachtime;
 	int error;
 
+	attachtime = get_cyclecount();
 	device_sysctl_init(dev);
 	if (!device_is_quiet(dev))
 		device_print_child(dev->parent, dev);
@@ -2784,6 +2788,10 @@ device_attach(device_t dev)
 		dev->state = DS_ATTACHED;
 	dev->flags &= ~DF_DONENOMATCH;
 	devadded(dev);
+	if (harvest.devprobe)
+		random_harvest(&attachtime, sizeof(attachtime), 4, 0,
+		    RANDOM_DEVICE);
+
 	return (0);
 }
 
Index: etc/defaults/rc.conf
==================================================================---
etc/defaults/rc.conf	(revision 239610)
+++ etc/defaults/rc.conf	(working copy)
@@ -642,6 +642,7 @@ entropy_file="/entropy"	# Set to NO to d
 entropy_dir="/var/db/entropy" # Set to NO to disable caching entropy
via cron.
 entropy_save_sz="2048"	# Size of the entropy cache files.
 entropy_save_num="8"	# Number of entropy cache files to save.
+harvest_devprobe="YES"	# Entropy device harvests device probe
randomness
 harvest_interrupt="YES"	# Entropy device harvests interrupt
randomness
 harvest_ethernet="YES"	# Entropy device harvests ethernet randomness
 harvest_p_to_p="YES"	# Entropy device harvests point-to-point
randomness
Index: etc/rc.d/initrandom
==================================================================---
etc/rc.d/initrandom	(revision 239610)
+++ etc/rc.d/initrandom	(working copy)
@@ -41,6 +63,12 @@ initrandom_start()
 	if [ \! -z "${soft_random_generator}" ] ; then
 
 		if [ -w /dev/random ]; then
+			if checkyesno harvest_devprobe; then
+				${SYSCTL} kern.random.sys.harvest.devprobe=1 >/dev/null
+				echo -n ' interrupts'
+			else
+				${SYSCTL} kern.random.sys.harvest.devprobe=0 >/dev/null
+			fi
 			if checkyesno harvest_interrupt; then
 				${SYSCTL} kern.random.sys.harvest.interrupt=1 >/dev/null
 				echo -n ' interrupts'

Pawel Jakub Dawidek

2012-Sep-22 09:18 UTC

head link

Collecting entropy from device_attach() times.

On Wed, Sep 19, 2012 at 07:28:36PM +0100, RW wrote:> On Tue, 18 Sep 2012 23:14:22 +0200
> Pawel Jakub Dawidek wrote:
> 
> > Hi.
> > 
> 
> > The patch is here:
> > 
> > 	http://people.freebsd.org/~pjd/patches/harvest_device_attach.patch
> > 
> > Comments?
> > 
> 
> +	attachtime = get_cyclecount() - attachtime;
> 
> the above line is redundant since random_harvest() already contains a
> call to get_cyclecount().
Agreed, although in more recent patch I need total time, so I can
calculate how many bits it has, so I can estimate how much entropy there
is.

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
FreeBSD committer                         http://www.FreeBSD.org
Am I Evil? Yes, I Am!                     http://tupytaj.pl
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 196 bytes
Desc: not available
Url :
http://lists.freebsd.org/pipermail/freebsd-security/attachments/20120922/a87315e7/attachment.pgp

Pawel Jakub Dawidek

2012-Sep-23 15:17 UTC

head link

Collecting entropy from device_attach() times.

On Sun, Sep 23, 2012 at 02:37:48AM +0200, Mariusz Gromada
wrote:> W dniu 2012-09-22 21:53, Pawel Jakub Dawidek pisze:
> > Mariusz, can you confirm my findings? 
> 
> Pawel,
> 
> Your conclusions can be easily confirmed by shape analysis of the EDF. 
> Usually maximum quantile difference (called D-statistic) gives you a 
> kind of overview, function shape gives you a strong feeling, p-value 
> gives you a formal proof.
> D-statistic values (your data):
> 
>   6bit:   0.33%
>   7bit:   0.29%
>   8bit:   0.27%
>   9bit:   0.21%
> 10bit:   6.34%
> 11bit:  19.07%
> 12bit:  54.80%
> 
> What I would say: increasing the number of bits from 6 to 9 does not 
> affect distribution "uniformity", reaching the tenth bit results
in
> sudden increase in the difference measure -  the more bits, the more 
> difference is observed. Distribution shape analysis for the 10th bit 
> shows non-linear function. Lack of "randomness" in the quntile 
> difference curve - chart  shows completely lack of noise (pure 
> functional relation).  These are very strong indicators that starting 
> from 10th bit distribution was changed and is no longer uniform.
> 
> To formally confirm above conclusion for i.e. 5% significance level, 
> which means that confidence level is 95%, I need some extra data 
> regarding sample sizes. Please pass to me number of collected 
> observations in each 6-12 bit experiment.
Total number of observations was 162833.

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
FreeBSD committer                         http://www.FreeBSD.org
Am I Evil? Yes, I Am!                     http://tupytaj.pl
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 196 bytes
Desc: not available
Url :
http://lists.freebsd.org/pipermail/freebsd-security/attachments/20120923/d3fc7f89/attachment.pgp

Mariusz Gromada

2012-Sep-24 22:03 UTC

head link

Collecting entropy from device_attach() times.

W dniu 2012-09-23 17:17, Pawel Jakub Dawidek pisze:> On Sun, Sep 23, 2012 at 02:37:48AM +0200, Mariusz Gromada wrote:
>> W dniu 2012-09-22 21:53, Pawel Jakub Dawidek pisze:
>>> Mariusz, can you confirm my findings?
>>
>> Pawel,
>>
>> Your conclusions can be easily confirmed by shape analysis of the EDF.
>> Usually maximum quantile difference (called D-statistic) gives you a
>> kind of overview, function shape gives you a strong feeling, p-value
>> gives you a formal proof.
>> D-statistic values (your data):
>>
>>    6bit:   0.33%
>>    7bit:   0.29%
>>    8bit:   0.27%
>>    9bit:   0.21%
>> 10bit:   6.34%
>> 11bit:  19.07%
>> 12bit:  54.80%
>>
>> What I would say: increasing the number of bits from 6 to 9 does not
>> affect distribution "uniformity", reaching the tenth bit
results in
>> sudden increase in the difference measure -  the more bits, the more
>> difference is observed. Distribution shape analysis for the 10th bit
>> shows non-linear function. Lack of "randomness" in the
quntile
>> difference curve - chart  shows completely lack of noise (pure
>> functional relation).  These are very strong indicators that starting
>> from 10th bit distribution was changed and is no longer uniform.
>>
>> To formally confirm above conclusion for i.e. 5% significance level,
>> which means that confidence level is 95%, I need some extra data
>> regarding sample sizes. Please pass to me number of collected
>> observations in each 6-12 bit experiment.
>
> Total number of observations was 162833.
>
Ok, finally I have some formal results. To be completely honest I need 
to point out that, in fact, we have a discrete data (for example 
integers 0, 1, ..., 63, but not continues numbers spread across 0 and 
63). That is way  I am going to use two sample Kolmogorov-Smirnov test. 
  Methodology is simple:

- Pawel?s data will be called empirical one
- Theoretical data will be generated as a sequence of unique integer 
numbers from 0 to 2**n -1, where n is the number of bits. Assumption - 
each number appears in theoretical data only once representing ideal 
uniform distribution.

Calculations will be done in the R-cran package

Loading empirical data form files:

 > e6 = read.table("E:\\pawel\\dhr2_6bit_sorted.txt")
 > e7 = read.table("E:\\pawel\\dhr2_7bit_sorted.txt")
 > e8 = read.table("E:\\pawel\\dhr2_8bit_sorted.txt")
 > e9 = read.table("E:\\pawel\\dhr2_9bit_sorted.txt")
 > e10 = read.table("E:\\pawel\\dhr2_10bit_sorted.txt")
 > e11 = read.table("E:\\pawel\\dhr2_11bit_sorted.txt")
 > e12 = read.table("E:\\pawel\\dhr2_12bit_sorted.txt")

Generating ideal theoretical data:

 > t6 = c(0:(2**6-1))
 > t7 = c(0:(2**7-1))
 > t8 = c(0:(2**8-1))
 > t9 = c(0:(2**9-1))
 > t10 = c(0:(2**10-1))
 > t11 = c(0:(2**11-1))
 > t12 = c(0:(2**12-1))

Performing KS tests:

 > ks.test(e6, t6)
D = 0.0032, p-value = 1

 > ks.test(e7, t7)
D = 0.0029, p-value = 1

 > ks.test(e8, t8)
D = 0.0027, p-value = 1

 > ks.test(e9, t9)
D = 0.0022, p-value = 1

 > ks.test(e10, t10)
D = 0.0634, p-value = 0.0005562

 > ks.test(e11, t11)
D = 0.1907, p-value < 2.2e-16

 > ks.test(e12, t12)
D = 0.5479, p-value < 2.2e-16

As you can see D-statistics are almost the same as calculated by Pawel 
(considering roundings). P-values are very interesting due to very high 
number of observations generated by Pawel. Between 6 bits and 9 bits 
estimated p-values are equal to 1, so it means that it is impossible (at 
any significance level) to reject null hypothesis stating that compared 
distributions are equal. Final conclusion: it has to be random, and for 
sure it is random!

Additionally starting form 10 bits we can observe dramatic decrease of 
p-value (from 100% to c.a. 0,06% and much less for the 11-12 bits). So 
low p-value means that it is impossible not to reject null hypothesis 
stating that compared distributions are equal. Final conclusion: it 
cannot be random, and for sure it is not random.

I did the same comparison for the previous real device attach data (2081 
obs.). R code and the results are below:

 > e16 = read.table("E:\\pawel\\device_attach_16bit.log")
 > t16 = c(0:(2**16-1))
 > ks.test(e16, t16)
D = 0.0178, p-value = 0.5422

Again, D-statistic an p-value are almost the same as previously 
calculated "manually". P-value is very high (it is not as high as in
the
6-12 bits tests, but consider much lower number of observations: 2081 vs 
  162833), giving almost sureness that you have captured real 16-bits 
entropy!

Regards,
Mariusz

Mariusz Gromada

2012-Sep-24 22:35 UTC

head link

Collecting entropy from device_attach() times.

W dniu 2012-09-24 23:56, Mariusz Gromada pisze:
> Ok, finally I have some formal results. To be completely honest I need
> to point out that, in fact, we have a discrete data (for example
> integers 0, 1, ..., 63, but not continues numbers spread across 0 and
> 63). That is way  I am going to use two sample Kolmogorov-Smirnov test.
Another clarification is needed. KS test in general (and in theory) 
should be used for continuous distributions. But in our case we can 
easily say that we observe our distribution in integers only (rounding), 
and the whole rest is easily estimated.

Regards,
Mariusz

Seemingly Similar Threads

Search for more apparently analagous threads

freebsd security - Sep 2012 - Collecting entropy from device_attach() times.

Collecting entropy from device_attach() times.

Collecting entropy from device_attach() times.

Collecting entropy from device_attach() times.

Collecting entropy from device_attach() times.

Collecting entropy from device_attach() times.

Collecting entropy from device_attach() times.

Collecting entropy from device_attach() times.

Collecting entropy from device_attach() times.

Collecting entropy from device_attach() times.

Seemingly Similar Threads