Dear all,

I hope this message finds you well. I am trying to subset my data by two variables; so far, I have tried two different ways to stratify participants into groups. I would like to use the "summary" and "table" functions to characterise participants based on the presence of two variables, and to summarise this subset against a third variable.

I have used this method:

    dgb001<-subset(data,data$variable==1 & data,data$variable)

However, I get the following error: "Error: cannot allocate vector of size 16.0 Gb". Is there another method I can try?

Kind regards,

Jamie Burgess
PhD Student Endocrinology and Diabetes
University of Liverpool
Aintree University Hospital & The Walton Centre
Institute of Ageing & Chronic Disease
0151 529 5936
I think the syntax you are looking for is

    datasubset <- data[ data$A == 1 & data$B == 1 , ]

This gives the subset of your original data in which variable A has value 1 and variable B has value 1.

On Mon, May 25, 2020 at 12:57 PM Burgess, Jamie <Jamie.Burgess at liverpool.ac.uk> wrote:
> I have used this method:
>
> dgb001<-subset(data,data$variable==1 & data,data$variable)
>
> However, I get the following error: "Error: cannot allocate vector of size 16.0 Gb". Is there another method I can try?

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
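For anyone following along, here is a minimal sketch of the bracket-indexing approach on made-up data. The column names A, B and outcome are assumptions for illustration only and do not come from the original post. It also shows one detail worth knowing: if a condition evaluates to NA, plain logical indexing returns an all-NA row, whereas wrapping the condition in which() drops such rows.

```r
# Toy data; the column names A, B and outcome are hypothetical.
dat <- data.frame(
  A       = c(1, 1, 2, NA, 1),
  B       = c(1, 2, 1, 1, 1),
  outcome = c(5.1, 6.3, 4.8, 7.0, 5.9)
)

# One logical value per row: TRUE only where both conditions hold.
cond <- dat$A == 1 & dat$B == 1

# Plain logical indexing keeps an all-NA row where cond is NA.
sub_with_na <- dat[cond, ]

# which() returns only the positions where cond is TRUE, so NA rows are dropped.
sub <- dat[which(cond), ]

nrow(sub_with_na)
nrow(sub)
summary(sub$outcome)
```

subset(dat, A == 1 & B == 1) also drops the rows where the condition is NA, so it should give the same rows as the which() version here.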
Yes. In particular,

    data$variable==1 & data

makes no sense (data is a data frame). A typo, perhaps? Or, as Richard indicated, consult references/tutorials to learn the proper syntax for (vectorized) predicates.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Mon, May 25, 2020 at 10:20 AM Richard M. Heiberger <rmh at temple.edu> wrote:
> I think the syntax you are looking for is
>
> datasubset <- data[ data$A == 1 & data$B == 1 , ]
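To illustrate the point about vectorized predicates, the sketch below compares the shape of the intended condition with the shape of the expression as posted. The toy data frame and its column names are assumptions. A row filter needs one TRUE/FALSE per row; combining a logical vector with a whole data frame using & is instead applied cell by cell. Whether that is what caused the memory error in the original post is not confirmed anywhere in this thread.

```r
# Toy data with hypothetical numeric columns.
dat <- data.frame(A = c(1, 2, 1), B = c(1, 1, 0), Z = c(0.5, 1.5, 2.5))

# Intended: a logical vector with one element per row.
good <- dat$A == 1 & dat$B == 1
length(good)

# As posted: `&` against the whole data frame works cell by cell,
# so the result has one element per cell rather than one per row.
bad <- dat$A == 1 & dat
dim(bad)
```

Only the first form is suitable as a row index in dat[good, ] or as the condition in subset().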
Hello,

Inline.

At 13:26 on 25/05/20, Burgess, Jamie wrote:
> I am currently trying to subset my data by two variables, so far, I have tried two different ways to stratify participants into groups.

I don't understand what you mean by this. Do you want to split the data set into sub-data frames by two variables? If so, try

    df_groups <- split(data, list(data$Var1, data$Var2), drop = TRUE)

This produces a list of sub-data frames. To get the group with Var1 == 1 and Var2 == 1:

    grp_name <- paste(1, 1, sep = '.')
    df_groups[[grp_name]]

But if you only want the sub-data frame with Var1 == 1 and Var2 == 1, either of the following will do it.

    data[data$Var1 == 1 & data$Var2 == 1, ]
    subset(data, Var1 == 1 & Var2 == 1)

Hope this helps,

Rui Barradas
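A small runnable version of the split() approach, on invented data, may make the group naming clearer. Var1 and Var2 follow the names used above; the third column Z and the values are assumptions for illustration.

```r
# Toy data: Var1 and Var2 define the groups, Z is a hypothetical third variable.
set.seed(1)
dat <- data.frame(
  Var1 = rep(1:2, each = 4),
  Var2 = rep(1:2, times = 4),
  Z    = rnorm(8)
)

# One data frame per combination of Var1 and Var2; the list names are
# built as "<Var1>.<Var2>", and drop = TRUE omits empty combinations.
groups <- split(dat, list(dat$Var1, dat$Var2), drop = TRUE)
names(groups)

# The group with Var1 == 1 and Var2 == 1.
g11 <- groups[[paste(1, 1, sep = ".")]]

# Summarise the third variable within every group.
lapply(groups, function(g) summary(g$Z))
```

split() is convenient when every group is needed; for a single group, direct indexing or subset() avoids building the whole list.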
Hi Jamie,

You seem to want some descriptive statistic applied to subsets of your data frame "data" (maybe a more imaginative name would help). I'll guess that your data frame contains variables X, Y and Z, among others. Further, I'll guess that you want summaries of variable Z for the subsets defined by Y and X.

    data<-data.frame(X=sample(1:2,100,TRUE),Y=sample(1:2,100,TRUE),
     Z=rnorm(100))
    by(data,data[,c("X","Y")],summary)

Jim
Oops, that should have been:

    by(data$Z,data[,c("X","Y")],summary)

Jim

On Tue, May 26, 2020 at 9:00 AM Jim Lemon <drjimlemon at gmail.com> wrote:
> by(data,data[,c("X","Y")],summary)
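Since the original question mentioned both "summary" and "table", here is a hedged sketch that combines them on simulated data of the same shape as above. The names X, Y and Z are the guessed variable names from this thread, not the real ones, and using aggregate() for group means is just one possible choice.

```r
# Simulated data with the guessed variable names X, Y and Z.
set.seed(42)
dat <- data.frame(
  X = sample(1:2, 100, TRUE),
  Y = sample(1:2, 100, TRUE),
  Z = rnorm(100)
)

# Cross-tabulated counts of participants in each X-by-Y group.
counts <- table(X = dat$X, Y = dat$Y)
counts

# Summary of Z for the single group with X == 1 and Y == 1.
summary(dat$Z[dat$X == 1 & dat$Y == 1])

# Mean of Z for every X-by-Y combination, returned as a data frame.
means <- aggregate(Z ~ X + Y, data = dat, FUN = mean)
means
```

The by() call above gives the full summary() output per group; aggregate() is handy when a single statistic per group in tabular form is enough.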