Ajay Shah
2004-Apr-22 05:34 UTC
Evidence from Debian's package tracking (Was Re: [R] Size of R user base.)
I have watched the discussions about the size of the R user base with much interest. One more source of data that might help is the voluntary data capture in Debian. If you are a Debian user, you should volunteer information. It's very easy: as root, say:

    # apt-get install popularity-contest

The results are found at:

    http://popcon.debian.org/main/math/by_inst

This shows that of the 4800 people who volunteered information, 1631 had installed gnuplot -- which suggests that perhaps one third of Debian installs are by numerate people. R-base was installed by roughly one-tenth of the sample.

So that's one useful fact: Roughly one in ten Debian users is an R user. Roughly one in three of the numerate users is an R user.

I would take this one-in-ten fact quite seriously, except that R users are perhaps more likely than the general population to volunteer information about what packages they use.

Now let's engage in some wild guesswork.

* It is believed that there are roughly 2e7 desktops in the world today running a freeware Unix system.

* Debian is undoubtedly a biased source of data, in having the more geeky users. Let's knock off a factor of 10 to correct for this, which takes the one-in-ten share among Debian users down to 1%.

* If we think that 1% of all freeware Unix users are R users, then we get an estimate of 200,000 users of R in the freeware Unix world.

There would be more using Mac OS X, Solaris, etc. Google data shows that 1% of Google hits are from Linux while 4% are from Mac users. So for each Linux user, there are 4 Mac OS X users. But then, a lot of them are Aunt Tillie, and are unlikely to need anything more than a calculator.

-- 
Ajay Shah                           Consultant
ajayshah at mayin.org              Department of Economic Affairs
http://www.mayin.org/ajayshah       Ministry of Finance, New Delhi
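The arithmetic behind the estimate in the message above can be checked with a few lines of Python. This is only a sketch of the guesswork as stated: the sample size, the gnuplot count, the 2e7 desktop figure and the factor-of-10 correction are the post's own numbers, the R share is taken as exactly 0.10 because the post says only "roughly one-tenth", and the function and variable names are made up for illustration.

```python
def estimate_r_users(sample_size=4800,
                     gnuplot_installs=1631,
                     r_share_debian=0.10,      # "roughly one-tenth"; exact count not given
                     free_unix_desktops=2e7,   # the post's guess at freeware Unix desktops
                     geek_bias_factor=10):     # the post's correction for Debian's bias
    """Reproduce the back-of-the-envelope estimate from the post."""
    # Share of the popcon sample with gnuplot installed ("numerate" users).
    numerate_share = gnuplot_installs / sample_size
    # R users as a fraction of the numerate users.
    r_share_of_numerate = r_share_debian / numerate_share
    # R share among all freeware Unix users after the bias correction.
    r_share_all = r_share_debian / geek_bias_factor
    # Implied number of R users on freeware Unix desktops.
    r_users = free_unix_desktops * r_share_all
    return numerate_share, r_share_of_numerate, r_share_all, r_users


if __name__ == "__main__":
    numerate, r_of_numerate, r_all, users = estimate_r_users()
    print(f"numerate share of sample:  {numerate:.2f}")
    print(f"R share of numerate users: {r_of_numerate:.2f}")
    print(f"R share after correction:  {r_all:.2%}")
    print(f"estimated R users:         {users:,.0f}")
```

Changing `geek_bias_factor` or `free_unix_desktops` shows how sensitive the final figure is to the two guessed inputs; the result scales linearly with each.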
Ajay Shah <ajayshah at mayin.org> writes:

> This shows that of the 4800 people who volunteered information, 1631
> had installed gnuplot -- which suggests that perhaps one third of
> Debian installs are by numerate people. R-base was installed by
> roughly one-tenth of the sample.
>
> So that's one useful fact: Roughly one in ten of Debian users is an R
> user. Roughly one in three of the numerate users is an R user.

The dangers with generalization... I don't have gnuplot on most of my machines. So either I'm not numerate, or that assumption is wrong. (However, given my counting skills yesterday, the former might be true.)

best,
-tony

-- 
rossini at u.washington.edu    http://www.analytics.washington.edu/
Biomedical and Health Informatics    University of Washington
Biostatistics, SCHARP/HVTN    Fred Hutchinson Cancer Research Center
UW (Tu/Th/F): 206-616-7630 FAX=206-543-3461 | Voicemail is unreliable
FHCRC (M/W): 206-667-7025 FAX=206-667-4812 | use Email