Ajay Shah
2004-Apr-22 05:34 UTC
Evidence from Debian's package tracking (Was Re: [R] Size of R user base.)
I have watched the discussions about the size of the R user base with
much interest. One more source of data that might help is the
voluntary data capture in Debian. If you are a Debian user, you should
volunteer information. It's very easy: as root, say:
# apt-get install popularity-contest
The results are found at:
http://popcon.debian.org/main/math/by_inst
This shows that of the 4800 people who volunteered information, 1631
had installed gnuplot -- which suggests that perhaps one third of
Debian installs are by numerate people. R-base was installed by
roughly one-tenth of the sample.
So that's one useful fact: Roughly one in ten of Debian users is an R
user. Roughly one in three of the numerate users is an R user.
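As a rough check of the ratios above, here is a minimal Python sketch. The counts 4800 and 1631 are the figures quoted above; `r_share` is only the "roughly one-tenth" stated above, not an exact popcon count, and the variable names are made up for illustration. The last ratio also assumes that every R user has gnuplot installed, which need not hold.

```python
# Figures quoted in the post (popcon sample at the time of writing).
respondents = 4800        # people who volunteered information
gnuplot_installs = 1631   # of those, how many had gnuplot installed

# Assumption: "roughly one-tenth" taken as exactly 0.10; the post gives no exact count.
r_share = 0.10

# Share of the sample with gnuplot, used in the post as a proxy for "numerate" users.
gnuplot_share = gnuplot_installs / respondents

# Share of R users among gnuplot users, assuming every R user also has gnuplot.
r_among_numerate = r_share / gnuplot_share

print(f"gnuplot share of sample: {gnuplot_share:.3f}")
print(f"R share among gnuplot users: {r_among_numerate:.3f}")
```

Both numbers come out a little under and a little over 0.3, which is where "one third" and "one in three" come from.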
I would take this one-in-ten fact quite seriously, except for the
extent to which R users are perhaps more likely (as compared
with the population) to volunteer information about what packages they
use.
Now let's engage in some wild guesswork.
* It is believed that there are roughly 2e7 desktops in the world
today, running a freeware Unix system.
* Debian is undoubtedly a biased source of data, in having the more
geeky users. Let's knock off a factor of 10 in order to correct for
this.
* If we therefore think that 1%, rather than 10%, of all freeware Unix users are R users, then we
get to an estimate of 200,000 users of R in the freeware Unix
world. There would be more using Mac OS X, Solaris, etc.
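The guesswork in the list above can be written out as a short Python sketch. Every input is a guess taken from the bullets (2e7 desktops, the one-in-ten Debian share, the factor-of-10 correction); the variable names are hypothetical and nothing here is measured data.

```python
# All inputs are the post's guesses, not measurements.
freeware_unix_desktops = 2e7   # believed number of freeware Unix desktops worldwide
debian_r_share = 0.10          # roughly one in ten of the Debian popcon sample has R
geek_correction = 10           # factor knocked off because Debian users are geekier

# Corrected share of R users among all freeware Unix users (10% / 10 = 1%).
assumed_r_share = debian_r_share / geek_correction

# Resulting estimate of R users in the freeware Unix world.
estimated_r_users = freeware_unix_desktops * assumed_r_share

print(f"Estimated R users on freeware Unix: {estimated_r_users:,.0f}")
```

This reproduces the 200,000 figure. The estimate scales linearly with each input, so halving the correction factor or the desktop count changes the result by the same factor.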
Google data shows that 1% of Google hits are from Linux while 4% are
from Mac users. So for each Linux user, there are 4 Mac OS X
users. But then, a lot of them are Aunt Tillie, and are unlikely to
need anything more than a calculator.
--
Ajay Shah Consultant
ajayshah at mayin.org Department of Economic Affairs
http://www.mayin.org/ajayshah Ministry of Finance, New Delhi
Ajay Shah <ajayshah at mayin.org> writes:

> This shows that of the 4800 people who volunteered information, 1631
> had installed gnuplot -- which suggests that perhaps one third of
> Debian installs are by numerate people. R-base was installed by
> roughly one-tenth of the sample.
>
> So that's one useful fact: Roughly one in ten of Debian users is an R
> user. Roughly one in three of the numerate users is an R user.

The dangers with generalization...

I don't have gnuplot on most of my machines. So either I'm not
numerate, or that assumption is wrong. (however, given my counting
skills yesterday, the former might be true).

best,
-tony

--
rossini at u.washington.edu
http://www.analytics.washington.edu/
Biomedical and Health Informatics    University of Washington
Biostatistics, SCHARP/HVTN           Fred Hutchinson Cancer Research Center
UW (Tu/Th/F): 206-616-7630 FAX=206-543-3461 | Voicemail is unreliable
FHCRC (M/W): 206-667-7025 FAX=206-667-4812 | use Email