On Mon, Nov 24, 2014 at 12:34 PM, Sarah Goslee <sarah.goslee at gmail.com> wrote:
> I took a look at apparent gender among list participants a few years ago:
> https://stat.ethz.ch/pipermail/r-help/2011-June/280272.html
>
> Same general thing: very few regular participants on the list were
> women. I don't see any sign that that has changed in the last three
> years. The bar to participation in the R-help list is much, much lower
> than that to become a developer.

I plotted the gender of posters on r-help over time. The plot is here:
https://twitter.com/scottkosty/status/449933971644633088

The code to reproduce that plot is here:
https://github.com/scottkosty/genderAnalysis
The R file there will call devtools::install_github to install a
package from Github used for guessing the gender based on the first
name (https://github.com/scottkosty/gender).

Note also on that tweet that Gabriela de Queiroz posted it, who is the
founder of R-ladies; and that David Smith showed interest in
discussing the topic. So there is definitely demand for some data
analysis and discussion on the topic.

> It would be interesting to look at the stats for CRAN packages as well.
>
> The very low percentage of regular female participants is one of the
> things that keeps me active on this list: to demonstrate that it's not
> only men who use R and participate in the community.

Thank you for that!

Scott

--
Scott Kostyshak
Economics PhD Candidate
Princeton University

> (If you decide to do the stats for 2014, be aware that I've been out
> on medical leave for the past two months, so the numbers are even
> lower than usual.)
>
> Sarah
>
> On Mon, Nov 24, 2014 at 10:10 AM, Maarten Blaauw
> <maarten.blaauw at qub.ac.uk> wrote:
>> Hi there,
>>
>> I can't help to notice that the gender balance among R developers and
>> ordinary members is extremely skewed (as it is with open source software in
>> general).
>>
>> Have a look at http://www.r-project.org/foundation/memberlist.html - at most
>> a handful of women are listed among the 'supporting members', and none at
>> all among the 29 'ordinary members'.
>>
>> On the other hand I personally know many happy R users of both genders.
>>
>> My questions are thus: Should R developers (and users) be worried that the
>> 'other half' is excluded? If so, how could female R users/developers be
>> persuaded to become more visible (e.g. added as supporting or ordinary
>> members)?
>>
>> Thanks,
>>
>> Maarten
>>
> --
> Sarah Goslee
> http://www.functionaldiversity.org
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Nice graph, Scott, thanks!

Based on your code I plotted not the absolute numbers but the ratios,
which show slowly increasing relative participation of female Rhelpers
over time (red = women, blue = men, black = unknown). After a c. 5%
female contribution in 1998, this has grown to about 15% now. At this
rate we'll reach parity around AD 2080.

My code:

  # install the name-based gender package from GitHub if it is missing
  if (!require(gender)) {
    library(devtools)
    install_github("scottkosty/gender")
    library(gender)
  }
  rHelp <- rHelpNames
  rHelp[is.na(rHelp$gender), "gender"] <- "unknown"

  yr <- unique(rHelp$year)

  # per-year counters, one slot for each year in yr
  helpers <- list(yr=yr, M=rep(0, length(yr)), F=rep(0, length(yr)),
    unkn=rep(0, length(yr)))

  # add each poster to the counter for their year and guessed gender
  for(i in 1:nrow(rHelp)) {
    j <- which(yr == rHelp$year[i])
    gender <- rHelp$gender[i]
    if(gender == "M") {
      helpers$M[j] <- helpers$M[j]+1
    } else if(gender == "F") {
      helpers$F[j] <- helpers$F[j]+1
    } else if(gender == "unknown") {
      helpers$unkn[j] <- helpers$unkn[j]+1
    }
  }

  # proportions per year: blue = men, red = women, black = unknown
  total <- helpers$M + helpers$F + helpers$unkn
  plot(yr, helpers$M / total, type="l", col=4,
    ylim=c(0,1), ylab="proportions", yaxs="i")
  lines(yr, helpers$F / total, col=2)
  lines(yr, helpers$unkn / total)

Cheers,

Maarten

--
| Dr. Maarten Blaauw
| Lecturer in Chronology
|
| School of Geography, Archaeology & Palaeoecology
| Queen's University Belfast, UK
|
| www http://www.chrono.qub.ac.uk/blaauw
| tel +44 (0)28 9097 3895

-------------- next part --------------
A non-text attachment was scrubbed...
Name: gendeR.pdf
Type: application/pdf
Size: 4671 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20141125/0c5dc54f/attachment.pdf>
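The per-year tally done by the loop in the message above can also be sketched with base R's table() and prop.table(). This is only an illustration: the small data frame is made up, and the only things taken from the message are the assumed column names `year` and `gender` and the colour convention.

```r
# Toy stand-in for rHelpNames (hypothetical data; only the column
# names `year` and `gender` come from the code in the message).
posters <- data.frame(
  year   = c(1998, 1998, 1998, 1998, 2014, 2014, 2014, 2014),
  gender = c("M", "M", "F", NA, "M", "F", "F", NA),
  stringsAsFactors = FALSE
)
posters$gender[is.na(posters$gender)] <- "unknown"

# Cross-tabulate year by gender, then turn each row into proportions.
counts <- table(posters$year, posters$gender)
props  <- unclass(prop.table(counts, margin = 1))  # plain numeric matrix

# One line per category: blue = men, red = women, black = unknown.
yrs <- as.numeric(rownames(props))
plot(yrs, props[, "M"], type = "l", col = 4, ylim = c(0, 1),
     xlab = "year", ylab = "proportions", yaxs = "i")
lines(yrs, props[, "F"], col = 2)
lines(yrs, props[, "unknown"], col = 1)
```

Because table() counts every year and category in one pass, no explicit loop or per-year index lookup is needed.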
On 11/25/2014 04:11 AM, Scott Kostyshak wrote:
> I plotted the gender of posters on r-help over time. The plot is here:
> https://twitter.com/scottkosty/status/449933971644633088
>
> The code to reproduce that plot is here:
> https://github.com/scottkosty/genderAnalysis
> The R file there will call devtools::install_github to install a
> package from Github used for guessing the gender based on the first
> name (https://github.com/scottkosty/gender).

It would be great to include in your package the script that scraped
author names from R-help archives (I guess that's what you did?).
Presumably it easily applies to other mailing lists hosted at the same
location (R-devel, further along the ladder from user to developer, and
Bioconductor / Bioc-devel, in a different domain and perhaps confounded
with a different 'feel' to the list). Also the R community is definitely
international, so finding more versatile gender-assignment approaches
seems important.

It might be interesting to ask about participation in mailing list
forums versus other venues, and in particular the recent Bioconductor
transition from mailing list to 'StackOverflow' style support forum
(https://support.bioconductor.org) -- on the one hand the 'gamification'
elements might seem to only entrench male participation, while on the
other we have already seen increased (quantifiable) and broader
(subjective) participation from the Bioconductor community. I'd be happy
to make support site usage data available, and am interested in
collaborating in an academically well-founded analysis of this data; any
interested parties please feel free to contact me off-list.

Martin Morgan
Bioconductor

--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793
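As a rough idea of what pulling author names out of a list archive could involve, here is a minimal sketch in base R. It is not the script from the genderAnalysis repository. It assumes header lines of the form "From: address (Display Name)", and the sample lines and names below are invented for illustration.

```r
# Hypothetical archive header lines (made up; the "From:" layout is
# an assumption about how the archive text is formatted).
archive_lines <- c(
  "From: alice at example.org (Alice Example)",
  "Date: (date line, ignored here)",
  "Subject: [R] example subject",
  "From: bob at example.org (Bob Sample)",
  "From: alice at example.org (Alice Example)"
)

# Keep only the "From:" header lines, then extract the text inside the
# trailing parentheses, assumed to hold the poster's display name.
from_lines <- grep("^From: ", archive_lines, value = TRUE)
full_names <- sub("^From: .*\\((.*)\\)[[:space:]]*$", "\\1", from_lines)

# Take the first whitespace-separated token as the first name; this is
# the piece a name-based gender guess would work from.
first_names <- sub("[[:space:]].*$", "", full_names)

# Posts per first name.
posts_per_name <- table(first_names)
```

The same steps should carry over to any list whose archive uses the same header layout, though names that do not put the given name first would need separate handling.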
On Tue, Nov 25, 2014 at 8:24 AM, Maarten Blaauw <maarten.blaauw at qub.ac.uk> wrote:
> Nice graph, Scott, thanks!
>
> Based on your code I plotted not the absolute numbers but the ratios, which
> show slowly increasing relative participation of female Rhelpers over time
> (red = women, blue=men, black=unknown). After a c. 5% female contribution in
> 1998, this has grown to about 15% now. At this rate we'll reach parity
> around AD 2080.

Interesting forecasts, Maarten! Let's hope for a trend break to make
them wrong.

Scott
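A parity forecast of this kind depends heavily on how the trend is fitted. As a rough check, the sketch below draws a straight line through only the two rounded figures quoted in the message (about 5% in 1998 and about 15% in 2014) and extrapolates it to 50%. This two-point calculation is an assumption made for illustration; it is not necessarily how the quoted year was obtained, which may rest on the full yearly series.

```r
# Rounded shares quoted in the message (approximate values).
share_1998 <- 0.05
share_2014 <- 0.15

# Straight-line trend through the two points, in share per year.
slope <- (share_2014 - share_1998) / (2014 - 1998)

# Year at which that line would reach a 50% share.
parity_year <- 2014 + (0.50 - share_2014) / slope
print(round(parity_year))
```

With these two inputs the line crosses 50% around 2070, so changing the starting share or the fitting window by a few points moves the forecast by a decade or more.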
On Tue, Nov 25, 2014 at 1:15 PM, Martin Morgan <mtmorgan at fredhutch.org> wrote:
> It would be great to include in your package the script that scraped author
> names from R-help archives (I guess that's what you did?). Presumably it
> easily applies to other mailing lists hosted at the same location (R-devel,
> further along the ladder from user to developer, and Bioconductor /
> Bioc-devel, in a different domain and perhaps confounded with a different
> 'feel' to the list). Also the R community is definitely international, so
> finding more versatile gender-assignment approaches seems important.

I just put the script up on
https://github.com/scottkosty/genderAnalysis

I don't have much time at the moment to generalize it, but a pull
request is always welcome. Alternatively, anyone is welcome (at least as
far as I'm concerned) to take the script and modify it for any purpose.

> it might be interesting to ask about participation in mailing list forums
> versus other, and in particular the recent Bioconductor transition from
> mailing list to 'StackOverflow' style support forum
> (https://support.bioconductor.org) -- on the one hand the 'gamification'
> elements might seem to only entrench male participation, while on the other
> we have already seen increased (quantifiable) and broader (subjective)
> participation from the Bioconductor community. I'd be happy to make support
> site usage data available, and am interested in collaborating in an
> academically well-founded analysis of this data; any interested parties
> please feel free to contact me off-list.

I would also be interested in collaborating on such a project in the
future.

Scott