Hello,

I use R 2.10 on Windows XP, and I am new to R (I used to use SAS and, lately, Stata).

I have a data set stored as a data.frame called x.df (read from a csv file). I want to extract from it the observations for which the variable "Code" starts with an "R". I put all the matching codes into a vector:

vec<-grep("R[A-Z][A-Z]",x.df$Code,value=TRUE)

Then I wrote a function that is supposed to take all the rows of x.df for which "Code" equals one of the values in "vec". See the code below, where I use a loop to do that.

> myfunc<-function(data,var2,var1)
+ {
+   i=1
+   while (i<632){
+     line<-subset(data,var2==var1[i])
+     if (i==1){
+       df<-line
+       df<-data.frame(df)
+     }
+     else {
+       line<-data.frame(line)
+       df<-rbind(df,line)
+     }
+     i<-i+1
+   }
+   fix(df)
+ }
>

The results of my program depend heavily on its last few lines. If I put "fix(df)" at the end, as above, the function opens a window with my data and the result seems sensible (I have not checked in detail, but I have roughly what I am supposed to get).

> myfunc<-function(data,var2,var1)
...
+   }
+   df<-data.frame(df)
+   print(is.data.frame(df))
+ }
> myfunc(x.df,x.df$Code,vec)
[1] TRUE
> print(is.data.frame(df))
[1] FALSE

In the case above, I ask inside the function whether "df" is a data.frame and the answer is TRUE; when the function has ended, I ask again and the answer is FALSE.

Could anyone tell me how to get this data, and why the results differ?

> as.data.frame(df)
Error in as.data.frame.default(df) :
  cannot coerce class "function" to a data.frame
>
Hi Jean-Baptiste,

three points:

1) Your variable "df" is a *local* variable which you define in your function myfunc(), so it is not known outside myfunc(). When you ask is.data.frame(df) at the prompt, R looks at the global definition of df, which is the density function of the F distribution. Making your function work (especially interactively) will require a major rewrite.

2) Generally, using variable names that are already used for R objects (like "df" in your example) is a bad idea. For an example of the problems you can run into, see 1) above.

3) Loops are not "the R way". Depending on what you want to do with the subset of your data.frame, you may want to do something like this:

x.df[substr(x.df$Code,1,1)=="R",]

Look at ?substr to learn more. This function is "vectorized", meaning that it takes a vector as input and returns a vector as output. Look at section 2.7 in "An Introduction to R".

Good luck!
Stephan

Jean-Baptiste Combes wrote:
> [...]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
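Points 1) and 2) above can be illustrated with a minimal sketch. The toy data frame, its Value column, and the names keep_r_codes and selected are assumptions made up for illustration; they do not come from the original post. The idea is that the function returns its result, and the caller assigns it to a name that does not clash with an existing R object.

```r
# Toy stand-in for x.df (hypothetical values; the real data come from a csv file).
x.df <- data.frame(Code  = c("RAB", "XYZ", "RCD", "ABC"),
                   Value = c(1, 2, 3, 4),
                   stringsAsFactors = FALSE)

# Objects created inside a function are local to that call.
# To use the result afterwards, return it and assign it at the call site.
keep_r_codes <- function(data) {
  result <- data[substr(data$Code, 1, 1) == "R", ]
  result   # the value of the last expression is what the function returns
}

selected <- keep_r_codes(x.df)
is.data.frame(selected)   # TRUE
nrow(selected)            # 2

# The name "df" is avoided on purpose: at top level it refers to stats::df,
# the F density, which is why is.data.frame(df) gave FALSE in the original post.
is.function(stats::df)    # TRUE
```

Here "result" exists only while keep_r_codes() is running; "selected" is the copy that survives the call.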
Hi!

Jean-Baptiste Combes wrote:
> [...] I want to extract from it the observations for which the variable
> "Code" starts with an "R". I put all the matching codes into a vector:
> vec<-grep("R[A-Z][A-Z]",x.df$Code,value=TRUE)

I am not sure if I understood you correctly, but could a simple

subset(x.df, substring(Code,1,1)=="R")

be an appropriate solution?

HTH,
Kimmo
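A small sketch of the subset() suggestion, run on made-up data (the values of Code and the Value column are assumptions for illustration only). Inside subset(), the column name Code is looked up in the data frame itself, so there is no need to write x.df$Code.

```r
# Toy stand-in for x.df (hypothetical values).
x.df <- data.frame(Code  = c("RAB", "XYZ", "RCD", "rab"),
                   Value = 1:4,
                   stringsAsFactors = FALSE)

# Keep the rows whose Code has "R" as its first character.
r_rows <- subset(x.df, substring(Code, 1, 1) == "R")

r_rows$Code    # "RAB" "RCD"
r_rows$Value   # 1 3
```

The comparison is case-sensitive, so the row with "rab" is not selected.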
On Jan 12, 2010, at 6:17 AM, Jean-Baptiste Combes wrote:
> [...] I put all the matching codes into a vector:
> vec<-grep("R[A-Z][A-Z]",x.df$Code,value=TRUE)

Without value=TRUE, vec is going to be a vector of row numbers that can be used to address the data.frame. (With value=TRUE, grep returns the matching strings instead.)

> Then I wrote a function that is supposed to take all the rows of x.df
> for which "Code" equals one of the values in "vec". See the code below,
> where I use a loop to do that.

That seems to be a very short R one-liner:

data[vec, ]

?"["

--
David.

>> myfunc<-function(data,var2,var1)
> + {
> + i=1
> + while (i<632){   #where does that come from ?
> [...]
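The indexing one-liner depends on what grep() returns, so here is a minimal sketch on made-up data (the toy x.df, the pattern "^R", and the names idx and r_rows are assumptions for illustration). By default grep() returns the positions of the matches, which can be used as row numbers; with value=TRUE it returns the matching strings themselves.

```r
# Toy stand-in for x.df (hypothetical values).
x.df <- data.frame(Code  = c("RAB", "XYZ", "RCD"),
                   Value = c(10, 20, 30),
                   stringsAsFactors = FALSE)

grep("^R", x.df$Code)                 # 1 3           (positions)
grep("^R", x.df$Code, value = TRUE)   # "RAB" "RCD"   (matching strings)

# Positions can index the rows of the data frame directly.
idx    <- grep("^R", x.df$Code)
r_rows <- x.df[idx, ]
r_rows$Value                          # 10 30
```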
See help(grepl). Using the built-in data frame CO2, this gets the rows whose Plant column starts with "Qn":

subset(CO2, grepl("^Qn", Plant))

On Tue, Jan 12, 2010 at 6:17 AM, Jean-Baptiste Combes <jeanbaptiste.combes.abdn at googlemail.com> wrote:
> [...]
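A short sketch of why the "^" anchor matters, followed by the CO2 example. The vector codes and the names unanchored, anchored and qn are assumptions made up for illustration. The pattern from the original post, "R[A-Z][A-Z]", has no anchor, so it also matches codes that merely contain an "R" followed by two capital letters.

```r
# Hypothetical codes: only the first one actually starts with "R".
codes <- c("RAB", "XRAB")

unanchored <- grepl("R[A-Z][A-Z]", codes)   # TRUE TRUE
anchored   <- grepl("^R", codes)            # TRUE FALSE

# grepl() returns a logical vector, so it can be used directly in subset().
qn <- subset(CO2, grepl("^Qn", Plant))
unique(as.character(qn$Plant))              # the Plant labels that were kept
```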