Hi all,

I have a pretty easy indexing question, at least I believe so. The main reason I am posting it here is that brackets and $ are hard to google.

How do I index correctly if I just want to display the whole dataset, conditioned on some particular column being equal to one?

I know I can do something like data$somecolumn[data$particularcol == 1]. That shows all "somecolumn" values where the particular column is 1. Unfortunately, something like data[data$particularcol == 1] does not work to get the whole matrix.

Is there some easy way, other than the %in% stuff?

Thanks in advance

Have a look at ?"[.data.frame". What you need is the following:

    dat <- data.frame(a = rbinom(20, 1, 0.5), x = rnorm(20), y = rnorm(20))
    dat
    dat[dat$a == 1, ]

I hope it helps.

Best,
Dimitris

Bunny, lautloscrew.com wrote:
> Unfortunately something like : data[data$particularcol ==1] does not
> work to get the whole matrix.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014

Dear Bunny,

You need to add a comma:

    data[data$particularcol == 1, ]

The general form is data[conditions for rows, conditions for columns]. Leaving the position after the comma empty keeps all columns.

HTH,

Thierry

----------------------------------------------------------------------------
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

-----Original message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On behalf of Bunny, lautloscrew.com
Sent: Tuesday 13 January 2009 10:26
To: r-help at r-project.org
Subject: [R] indexing question

> Unfortunately something like : data[data$particularcol ==1] does not
> work to get the whole matrix.

The views expressed in this message and any annex are purely those of the writer and may not be regarded as stating an official position of INBO, as long as the message is not confirmed by a duly signed document.

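The rows-before-the-comma, columns-after-the-comma rule from Thierry's reply can be sketched as follows. This is a minimal illustration only: the data frame `dataset` and its column names are made up for the example and do not come from the original poster's data.

```r
# Hypothetical example data; names are assumptions for illustration.
dataset <- data.frame(
  particularcol = c(1, 0, 1, 0),
  somecolumn    = c("a", "b", "c", "d"),
  other         = c(10, 20, 30, 40)
)

# Rows where particularcol equals 1, all columns (note the trailing comma).
rows_only <- dataset[dataset$particularcol == 1, ]

# Same rows, restricted to two named columns.
rows_and_cols <- dataset[dataset$particularcol == 1, c("somecolumn", "other")]

# A single index with no comma is treated as a COLUMN selector,
# which is why the comma-less attempt did not return the matching rows.
cols_only <- dataset[c(TRUE, FALSE, TRUE)]

print(rows_only)
print(rows_and_cols)
print(cols_only)
```

The comma is what tells R that the logical vector refers to rows. With a single index, a data frame behaves like a list of columns.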
You can also look at subset():

    my.data.frame <- data.frame(a = rnorm(10),
                                b = factor(sample(letters[1:4], 10, replace = TRUE)))
    str(my.data.frame)
    my.data.frame[my.data.frame$b == "a", ]
    subset(my.data.frame, b == "a")

By the way, it is probably safer not to use "data" as a variable name, as it is also a function.

Hope this helps,

baptiste

On 13 Jan 2009, at 09:33, Dimitris Rizopoulos wrote:
> have a look at ?"[.data.frame"; what you need is the following:
>
> dat[dat$a == 1, ]

_____________________________
Baptiste Auguié
School of Physics
University of Exeter
Stocker Road, Exeter, Devon, EX4 4QL, UK
Phone: +44 1392 264187
http://newton.ex.ac.uk/research/emag

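One practical difference between the bracket form and subset() shows up when the filter column contains missing values. The sketch below uses a small made-up data frame (`my_df` is an assumption, not the data from the thread) to show it.

```r
# Hypothetical data with one missing value in the filter column.
my_df <- data.frame(
  a = c(1.5, 2.5, 3.5, 4.5),
  b = factor(c("a", "b", NA, "a"))
)

# Bracket indexing: an NA in the condition produces a row of NAs.
by_bracket <- my_df[my_df$b == "a", ]

# subset(): rows where the condition evaluates to NA are dropped.
by_subset <- subset(my_df, b == "a")

# which() converts the logical vector to row positions and skips NA,
# so the bracket form then returns the same rows as subset().
by_which <- my_df[which(my_df$b == "a"), ]

print(by_bracket)
print(by_subset)
print(by_which)
```

If the column is known to have no missing values, the two forms select the same rows and the choice is mostly a matter of taste.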
Dear Thierry,

Thanks for your help, this is exactly what I was looking for.

For other beginners reading this: don't name your data "data". That was my mistake. Use "dataset" or something similar instead.

On 13.01.2009 at 10:38, ONKELINX, Thierry wrote:
> You need to add a comma: data[data$particularcol ==1, ]
>
> data[put here the conditions for rows, put here conditions for
> columns].

From: baptiste auguie <ba208@exeter.ac.uk>
To: Dimitris Rizopoulos <d.rizopoulos@erasmusmc.nl>
Date: Tue, 13 Jan 2009 09:38:09 +0000
Subject: Re: [R] indexing question

> by the way, it is probably safer not to use "data" as a variable name as it
> is also a function.

I've often wondered about this. The thing is, I've never run into a problem with it. For example:

    > ls()
    character(0)
    > data(ToothGrowth)
    > ls()
    [1] "ToothGrowth"
    > rm(ToothGrowth)
    > ls()
    character(0)
    > data <- data.frame(1:10, 101:110)
    > data(ToothGrowth) #works just the same
    > ls()
    [1] "data"        "ToothGrowth"

In this example the data() call works just the same the second time, even though I have a data.frame named data. Can someone give an example where this causes a problem?

Thanks,
Ista
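
One possible answer to Ista's question, offered as a sketch rather than a definitive account: when a name is used in call position, R looks for a function and skips non-function objects, which is why data(ToothGrowth) keeps working. The confusion tends to appear when the data frame does not exist (never created, or removed), because the name then silently refers to the function. The example assumes a fresh session with the default packages attached.

```r
# Assumption: fresh session; "data" is not yet defined by the user.
data <- data.frame(x = 1:10, y = 101:110)

# In call position R finds the function and ignores the data frame.
data(ToothGrowth)
loaded_ok <- exists("ToothGrowth", envir = globalenv(), inherits = FALSE)

# Used as a value, the name refers to the data frame.
value_is_df <- is.data.frame(data)

# Once the data frame is gone, "data" falls back to the function,
# and subsetting it fails (typically reported as
# "object of type 'closure' is not subsettable").
rm(data)
falls_back_to_function <- is.function(data)
result <- tryCatch(data$x, error = function(e) e)
subsetting_failed <- inherits(result, "error")

print(c(loaded_ok, value_is_df, falls_back_to_function, subsetting_failed))
```

With a name such as "dataset", the same mistake would instead raise an "object not found" error, which points more directly at the cause.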