Kelly Thompson
2022-Apr-09 01:26 UTC
[R] "apply" a function that takes two or more vectors as arguments, such as cor(x, y), over a "category" or "grouping variable" or "index"?
#Q. How can I "apply" a function that takes two or more vectors as arguments, such as cor(x, y), over a "category" or "grouping variable" or "index"?
#I'm using cor() as an example, I'd like to find a way to do this for any function that takes 2 or more vectors as arguments.

#create example data
my_category <- rep(c("a", "b", "c"), 4)

set.seed(12345)
my_x <- rnorm(12)

set.seed(54321)
my_y <- rnorm(12)

my_df <- data.frame(my_category, my_x, my_y)

#review data
my_df

#If I wanted to get the correlation of x and y grouped by category, I could use this code and loop:
my_category_unique <- unique(my_category)

my_results <- vector("list", length(my_category_unique))
names(my_results) <- my_category_unique

#start i loop
for (i in 1:length(my_category_unique)) {
  my_criteria_i <- my_category == my_category_unique[i]
  my_x_i <- my_x[which(my_criteria_i)]
  my_y_i <- my_y[which(my_criteria_i)]
  my_correl_i <- cor(x = my_x_i, y = my_y_i)
  my_results[i] <- list(my_correl_i)
} # end i loop

#review results
my_results

#Q. Is there a better or more "elegant" way to do this, using by(), aggregate(), apply(), or some other function?

#This does not work and results in this error message: "Error in FUN(dd[x, ], ...) : incompatible dimensions"
by(data = my_x, INDICES = my_category, FUN = cor, y = my_y)

#This does not work and results in this error message: "Error in cor(my_df$x, my_df$y) : ... supply both 'x' and 'y' or a matrix-like 'x' "
by(data = my_df, INDICES = my_category, FUN = function(x, y) { cor(my_df$x, my_df$y) })

#if I wanted the mean of x by category, I could use by() or aggregate():
by(data = my_x, INDICES = my_category, FUN = mean)

aggregate(x = my_x, by = list(my_category), FUN = mean)

#Thanks!
Richard M. Heiberger
2022-Apr-09 01:37 UTC
[R] [External] "apply" a function that takes two or more vectors as arguments, such as cor(x, y), over a "category" or "grouping variable" or "index"?
Look at ?mapply ("Apply a Function to Multiple List or Vector Arguments") to see if that meets your needs.
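One way the mapply() suggestion could be applied to the example data is sketched below. It assumes the vectors from the original post and pairs mapply() with split(), which is a choice made here for illustration and not something the reply spells out. The names x_by_group, y_by_group, and cor_by_group are introduced only for this sketch.

```r
# Example data reproduced from the original post
my_category <- rep(c("a", "b", "c"), 4)
set.seed(12345)
my_x <- rnorm(12)
set.seed(54321)
my_y <- rnorm(12)

# split() breaks each vector into a named list with one piece per group
x_by_group <- split(my_x, my_category)
y_by_group <- split(my_y, my_category)

# mapply() calls cor() on the matching pieces of the two lists, so each
# call sees only the x and y values that belong to one category
cor_by_group <- mapply(cor, x_by_group, y_by_group)
cor_by_group
```

The same pattern should extend to functions of three or more vectors by passing one split() list per argument. Arguments that stay constant across groups can go in MoreArgs.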
Deepayan Sarkar
2022-Apr-09 05:16 UTC
[R] "apply" a function that takes two or more vectors as arguments, such as cor(x, y), over a "category" or "grouping variable" or "index"?
On Sat, Apr 9, 2022 at 6:56 AM Kelly Thompson wrote:
> #Q. Is there a better or more "elegant" way to do this, using by(),
> aggregate(), apply(), or some other function?

split() is another generally useful function to know about: e.g.,

s <- split(my_df, ~ my_category)
lapply(s, function(d) with(d, cor(my_x, my_y)))

Best,
-Deepayan
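Since the question asks about any function of two or more vectors, the split()/lapply() idea can be wrapped in a small helper. The sketch below is one possible way to do that. group_apply() and its arguments are hypothetical names made up for this example and are not part of base R.

```r
# Example data reproduced from the original post
my_category <- rep(c("a", "b", "c"), 4)
set.seed(12345)
my_x <- rnorm(12)
set.seed(54321)
my_y <- rnorm(12)
my_df <- data.frame(my_category, my_x, my_y)

# group_apply() is a hypothetical helper, not a base R function.
# It splits `data` by the column named in `group`, then calls `fun` on
# each piece, passing the columns named in `cols` as positional arguments.
# Anything in `...` is forwarded to `fun` unchanged.
group_apply <- function(data, group, cols, fun, ...) {
  pieces <- split(data, data[[group]])
  lapply(pieces, function(d) {
    do.call(fun, c(unname(as.list(d[cols])), list(...)))
  })
}

# Pearson correlation of my_x and my_y within each category
cor_list <- group_apply(my_df, "my_category", c("my_x", "my_y"), cor)

# Extra arguments are passed through, here method = "spearman"
cor_spearman <- group_apply(my_df, "my_category", c("my_x", "my_y"),
                            cor, method = "spearman")

# Collapse the list to a named numeric vector when each result is a scalar
unlist(cor_list)
```

The columns are passed by position, so the order of `cols` has to match the argument order of the function being applied.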
Rui Barradas
2022-Apr-09 12:32 UTC
[R] "apply" a function that takes two or more vectors as arguments, such as cor(x, y), over a "category" or "grouping variable" or "index"?
Hello,

Another option is ?by.

by(my_df[-1], my_df$my_category, cor)
by(my_df[-1], my_df$my_category, \(x) cor(x)[1,2])

Hope this helps,

Rui Barradas
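For comparison with the two by() attempts in the original post, here is a sketch of a by() call that names the columns explicitly. The first attempt appears to fail because by() subsets only my_x per group while y = my_y is passed at its full length of 12. The second fails because the function body ignores the subset it receives and refers to my_df$x and my_df$y, which do not exist (the columns are my_x and my_y). The variable names res and d below are chosen for this example.

```r
# Example data reproduced from the original post
my_category <- rep(c("a", "b", "c"), 4)
set.seed(12345)
my_x <- rnorm(12)
set.seed(54321)
my_y <- rnorm(12)
my_df <- data.frame(my_category, my_x, my_y)

# my_df has no column called x, so my_df$x is NULL; this is what
# triggered the "supply both 'x' and 'y'" error in the second attempt
is.null(my_df$x)

# FUN receives one data-frame subset per group (called d here), so the
# function should use the columns of d, not the full my_df
res <- by(my_df, my_df$my_category, function(d) cor(d$my_x, d$my_y))
res

# Strip the "by" class to get a plain numeric vector, ordered a, b, c
as.numeric(res)
```

Unlike passing a two-column data frame to cor(), this form also works for functions that do not accept a matrix-like first argument.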