Typhenn Brichieri-Colombi
2015-Mar-04 22:02 UTC
[R] R 3.1.2 using a custom function in aggregate() function on Windows 7 OS 64bit
Hello,

I am trying to use the following custom function in an aggregate function, but cannot get R to recognize my data. I've read the help on function() and on aggregate() but am unable to solve my problem. How can I get R to recognize the data inputs for the custom function nested within aggregate()?

My custom function is found below, as well as the error message I get when I run it on a test data set (I will be using this function on a much larger dataset, over 600,000 rows).

Thank you for your time and your help!

    d_rule<-function(a,x){
      i<-which(a==max(a))
      out<-ifelse(length(i)==1, x[i], min(x))
      return(out)
    }

    a<-c(2,2,1,4,2,5,2,3,4,4)
    x<-c(1:10)
    g<-c(1,1,2,2,3,3,4,4,5,5)
    dat<-as.data.frame(cbind(x,g))

    test<-aggregate(dat, by=list(g), FUN=d_rule,dat$a, dat$x)
    Error in dat$x : $ operator is invalid for atomic vectors
Bert Gunter
2015-Mar-05 05:15 UTC
[R] R 3.1.2 using a custom function in aggregate() function on Windows 7 OS 64bit
What do you think dat$a is?

I recommend that you spend some time with an R tutorial if you plan to use R. Your code is pretty bad. Examples: use of the ifelse construction instead of if ... else; return()

Cheers,
Bert
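To make the two style points above concrete, here is a minimal sketch of how ifelse() differs from if ... else, and of why an explicit return() is optional. The helper name `pick` is hypothetical and used only for illustration.

```r
# ifelse() is vectorised: its result has the shape of the test,
# so a length-1 test can only ever give back a length-1 result.
first_only <- ifelse(TRUE, c(10, 20, 30), 0)
first_only        # 10, the rest of the "yes" vector is dropped

# if ... else evaluates exactly one branch and hands it back unchanged.
# The value of the last evaluated expression is the function's value,
# so no explicit return() is needed.
pick <- function(flag) {
  if (flag) c(10, 20, 30) else 0
}
pick(TRUE)        # 10 20 30
```

In the posted d_rule the test length(i)==1 is always a single value, so ifelse() happens to give the intended answer there, but if ... else states the intent more directly.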
Jeff Newmiller
2015-Mar-05 15:54 UTC
[R] R 3.1.2 using a custom function in aggregate() function on Windows 7 OS 64bit
The aggregate function applies FUN to vectors, not data frames. For example, the default "mean" function accepts a vector such as a column in a data frame and returns a scalar (well, a vector of length 1). Aggregate then calls this function once for each piece of the column(s) you give it. Your function wants two vectors, but aggregate does not understand how to give two inputs.

(In the future, please follow R-help mailing list guidelines and post using plain text so your code does not get messed up.)

You could use split to break your data frame into a list of data frames, and then sapply to extract the results you are looking for. I prefer to use the plyr or dplyr or data.table packages to do all this for me.

    d_rule <- function( DF ) {
      i <- which( DF$a==max( DF$a ) )
      if ( length( i ) == 1 ){
        DF[ i, "x" ]
      } else {
        min( DF[ , "x" ] ) # did you mean min( DF$x[i] ) ?
      }
    }

    dat <- data.frame( a=c(2,2,1,4,2,5,2,3,4,4)
                     , x = c(1:10)
                     , g = c(1,1,2,2,3,3,4,4,5,5)
                     )
    # note that cbind on vectors creates a matrix
    # in a matrix all columns must be of the same type
    # but data frames generally have a variety of types
    # so don't use cbind when making a data frame

    library( dplyr )

    result <- dat %>% group_by( g ) %>% do( answer = d_rule( . ) ) %>% as.data.frame

Jeff Newmiller
Sent from my phone. Please excuse my brevity.
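The split() and sapply() route mentioned in this reply is described but not shown. A minimal base-R sketch of it, reusing the data-frame version of d_rule and the test data from the thread, might look like this. The intermediate names `pieces` and `answer` are assumptions made for the example.

```r
# d_rule as rewritten in the reply: takes one data frame per group
d_rule <- function(DF) {
  i <- which(DF$a == max(DF$a))
  if (length(i) == 1) DF[i, "x"] else min(DF[, "x"])
}

dat <- data.frame(
  a = c(2, 2, 1, 4, 2, 5, 2, 3, 4, 4),
  x = 1:10,
  g = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5)
)

# split() returns a list with one data frame per value of g
pieces <- split(dat, dat$g)

# sapply() calls d_rule on each piece and simplifies to a named vector
answer <- sapply(pieces, d_rule)

# assemble a result table; names(answer) holds the group labels as text
result <- data.frame(g = names(answer), answer = unname(answer))
result
```

Groups 1 and 5 have tied maxima in a, so they fall through to min(x); the other groups return the x value at the unique maximum.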
Bert Gunter
2015-Mar-05 16:12 UTC
[R] R 3.1.2 using a custom function in aggregate() function on Windows 7 OS 64bit
Sorry, Jeff. aggregate() is generic. From ?aggregate:

    ## S3 method for class 'data.frame'
    aggregate(x, by, FUN, ..., simplify = TRUE)

Cheers,
Bert

On Thu, Mar 5, 2015 at 7:54 AM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
> The aggregate function applies FUN to vectors, not data frames.
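The two statements in this exchange can both hold: aggregate() dispatches on the class of its first argument, while the data-frame method still hands FUN one column piece at a time. A small sketch that checks what FUN actually receives, using the thread's test data (the result name `seen` is an assumption for the example):

```r
dat <- data.frame(
  a = c(2, 2, 1, 4, 2, 5, 2, 3, 4, 4),
  x = 1:10,
  g = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5)
)

# Ask FUN to report whether its input is a plain atomic vector
# rather than a data frame. It is called once per column per group.
seen <- aggregate(
  dat[c("a", "x")],
  by  = list(g = dat$g),
  FUN = function(v) is.atomic(v) && !is.data.frame(v)
)
seen
```

Every cell of the a and x columns in `seen` is TRUE, which suggests that a two-column rule such as d_rule cannot be passed to aggregate() directly, whichever method is dispatched.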
David Winsemius
2015-Mar-05 18:09 UTC
[R] R 3.1.2 using a custom function in aggregate() function on Windows 7 OS 64bit
On Mar 4, 2015, at 2:02 PM, Typhenn Brichieri-Colombi via R-help wrote:

> test<-aggregate(dat, by=list(g), FUN=d_rule,dat$a, dat$x)
>
> Error in dat$x : $ operator is invalid for atomic vectors

That message makes no sense to me because it suggests that the 'dat' object was not a data frame. I get a different error, which I think I can explain:

    > test<-aggregate(dat, by=list(g), FUN=d_rule,dat$a, dat$x)
    Error in FUN(X[[1L]], ...) : unused argument (c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10))

I believe the aggregate.data.frame function is positionally matching the x and g columns, one at a time, to the a parameter of d_rule, and would be attempting to use dat$a (which as far as I can tell doesn't exist and would throw an error if the interpreter ever got to that step) as the match to the x parameter of d_rule. Before it does that, however, it tries to match dat$x to some parameter in d_rule, and because both parameters already have candidate objects matched up, it fails, producing the error message I see.

It's also very bad practice to use the construction as.data.frame(cbind(x,g)). It will mangle the results if there are either factor or character variables inside the cbind().

-- 
David Winsemius
Alameda, CA, USA
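The argument matching described here can be checked directly. The sketch below rebuilds the poster's objects, confirms that dat has no column a (for a data frame, `$` on a missing column gives NULL rather than an error), and captures the error from the original call. The final step is a workaround that is not proposed anywhere in the thread: aggregating over row numbers so that FUN can look up both a and x itself. The names `msg`, `rows` and `fixed` are assumptions for the example.

```r
d_rule <- function(a, x) {
  i <- which(a == max(a))
  if (length(i) == 1) x[i] else min(x)
}

a <- c(2, 2, 1, 4, 2, 5, 2, 3, 4, 4)
x <- 1:10
g <- c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5)
dat <- as.data.frame(cbind(x, g))   # as posted: columns x and g only

is.null(dat$a)                      # TRUE: there is no column a in dat

# FUN is called as d_rule(<column piece>, dat$a, dat$x): three arguments
# for a two-parameter function, so the call fails. Capture the message.
msg <- tryCatch(
  aggregate(dat, by = list(g), FUN = d_rule, dat$a, dat$x),
  error = function(e) conditionMessage(e)
)
msg

# Workaround (an assumption, not from the thread): aggregate row numbers,
# then index both vectors inside FUN.
fixed <- aggregate(
  data.frame(answer = seq_len(nrow(dat))),
  by  = list(g = g),
  FUN = function(rows) d_rule(a[rows], x[rows])
)
fixed
```

Because aggregate() only ever passes one vector to FUN, the row-number trick keeps the two-argument d_rule unchanged while still grouping by g.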