Hi guys,

I am new to R and I am currently trying to do a regression: I have two matrices with 200 time series each. In order to achieve a loop, I used the following command:

sapply(1:200, function(x) summary(lm(formula=matrix1[,x]~matrix2[,x])))

Each column/time series has a unique name: in Matrix 1 I have 200 cities, in Matrix 2 I have 200 stocks. However, if I run the command I get the following result:

[[1]]

Call:
lm(formula = matrix1[, x] ~ matrix2[, x])

Residuals:
   Min     1Q Median     3Q    Max
-134.9  -68.6  -32.8   33.2  261.2

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  525.2356    69.8059    7.52  9.1e-10 ***
matrix2[, x]   0.0640     0.0161    3.98  0.00023 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 113 on 50 degrees of freedom
  (41 observations deleted due to missingness)
Multiple R-squared: 0.24,    Adjusted R-squared: 0.225
F-statistic: 15.8 on 1 and 50 DF,  p-value: 0.000226

[[2]]

Call:
lm(formula = matrix1[, x] ~ matrix2[, x])

Residuals:
   Min     1Q Median     3Q    Max
-914.9 -393.3  -76.9  243.3 1304.7

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  3.33e+03   1.88e+02   17.70  < 2e-16 ***
matrix2[, x] 4.10e-01   7.87e-02    5.21  3.4e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 531 on 51 degrees of freedom
  (40 observations deleted due to missingness)
Multiple R-squared: 0.348,    Adjusted R-squared: 0.335
F-statistic: 27.2 on 1 and 51 DF,  p-value: 3.4e-06

Instead of the X's in the call response, I'd like to have the column names (city and stock). Is this by any means possible?

Thanks in advance for your help.

Best
Tom

--
View this message in context: http://r.789695.n4.nabble.com/Regression-Column-names-instead-of-numbers-tp4672904.html
Sent from the R help mailing list archive at Nabble.com.

It depends on how fancy you want to get. A quick fix would be

pairs <- paste0(colnames(matrix1), ".", colnames(matrix2))
# lapply will be faster since you are returning a list
results <- lapply(1:200, function(x) summary(lm(formula=matrix1[,x]~matrix2[,x])))
names(results) <- pairs
results

Each result is now printed under a label of the form city.stock.

-------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of TMiller
Sent: Friday, August 2, 2013 10:17 AM
To: r-help at r-project.org
Subject: [R] Regression Column names instead of numbers

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
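
For anyone who wants to try the quick fix without the original data, here is a minimal sketch on made-up matrices. The three city and stock names and the random numbers are assumptions for illustration only; seq_len(ncol(matrix1)) stands in for the hard-coded 1:200 so the loop follows the actual number of columns.

```r
# Hypothetical stand-ins for matrix1 (cities) and matrix2 (stocks); made-up data.
set.seed(1)
matrix1 <- matrix(rnorm(30), ncol = 3,
                  dimnames = list(NULL, c("Zurich", "Geneva", "Basel")))
matrix2 <- matrix(rnorm(30), ncol = 3,
                  dimnames = list(NULL, c("StockA", "StockB", "StockC")))

# One label per regression, built from the paired column names.
pairs <- paste0(colnames(matrix1), ".", colnames(matrix2))

# Column i of matrix1 is regressed on column i of matrix2.
results <- lapply(seq_len(ncol(matrix1)), function(x)
  summary(lm(matrix1[, x] ~ matrix2[, x])))
names(results) <- pairs

names(results)                            # the city.stock labels
results[["Zurich.StockA"]]$coefficients   # pick one summary by its label
```

Note that this only labels the list elements. The Call line and the coefficient row inside each summary still read matrix1[, x] and matrix2[, x].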

You could try:

set.seed(25)
mt1 <- matrix(sample(c(NA,1:40),20*200,replace=TRUE),ncol=200)
colnames(mt1) <- paste0("X",1:200)
set.seed(487)
mt2 <- matrix(sample(c(NA,1:80),20*200,replace=TRUE),ncol=200)
colnames(mt2) <- colnames(mt1)

res <- lapply(colnames(mt1), function(x) {
  # put the two columns in a data frame and name them after the source columns
  x1 <- data.frame(mt1[,x], mt2[,x])
  colnames(x1) <- paste0(c("mt1","mt2"), x)
  summary(lm(as.formula(paste(colnames(x1)[1],"~",colnames(x1)[2],sep="")), data=x1))
})
res

[[1]]

Call:
lm(formula = as.formula(paste(colnames(x1)[1], "~", colnames(x1)[2],
    sep = "")), data = x1)

Residuals:
    Min      1Q  Median      3Q     Max
-16.799  -8.821  -1.059   8.414  19.544

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  14.7292     6.2952    2.34    0.031 *
mt2X1         0.1302     0.1342    0.97    0.345
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 11.53 on 18 degrees of freedom
Multiple R-squared:  0.04967,    Adjusted R-squared:  -0.003127
F-statistic: 0.9408 on 1 and 18 DF,  p-value: 0.3449

[[2]]

Call:
lm(formula = as.formula(paste(colnames(x1)[1], "~", colnames(x1)[2],
    sep = "")), data = x1)

Residuals:
    Min      1Q  Median      3Q     Max
-17.641  -6.809  -2.255   5.235  19.684

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   3.7745     5.1715   0.730   0.4754
mt2X2         0.2635     0.1155   2.283   0.0356 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 10.15 on 17 degrees of freedom
  (1 observation deleted due to missingness)
Multiple R-squared:  0.2346,    Adjusted R-squared:  0.1896
F-statistic:  5.21 on 1 and 17 DF,  p-value: 0.0356

A.K.
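
In the output above the coefficient row carries the column name (mt2X1), but the Call line still shows the as.formula(paste(...)) expression. One possible variation, not taken from the replies in this thread, is to build the formula first and splice it into the lm() call with bquote(), so that the call itself records the names. This is a sketch only: the helper fit_pair, the city and stock names and the data are made up, and it assumes the column names are syntactically valid R names and differ between the two matrices.

```r
# Hypothetical stand-ins for the two matrices; made-up data.
set.seed(1)
matrix1 <- matrix(rnorm(30), ncol = 3,
                  dimnames = list(NULL, c("Zurich", "Geneva", "Basel")))
matrix2 <- matrix(rnorm(30), ncol = 3,
                  dimnames = list(NULL, c("StockA", "StockB", "StockC")))

# fit_pair is an illustrative helper, not part of any package.
fit_pair <- function(i) {
  city  <- colnames(matrix1)[i]
  stock <- colnames(matrix2)[i]
  # Two-column data frame whose columns are named after the city and the stock.
  d <- data.frame(matrix1[, i], matrix2[, i])
  names(d) <- c(city, stock)
  # Formula of the form city ~ stock, built from the names.
  f <- reformulate(stock, response = city)
  # bquote() substitutes the finished formula into the call before lm() runs,
  # so the stored call contains city ~ stock rather than the variable f.
  summary(eval(bquote(lm(.(f), data = d))))
}

results <- lapply(seq_len(ncol(matrix1)), fit_pair)
names(results) <- paste0(colnames(matrix1), ".", colnames(matrix2))

results[[1]]$call                     # the call as stored in the summary
rownames(results[[1]]$coefficients)   # "(Intercept)" and the stock name
```

If the names contain spaces or other special characters, they would need backticks in the formula, or cleaning with make.names() first.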

----- Original Message -----
From: TMiller <thomas.mueller at student.unisg.ch>
To: r-help at r-project.org
Sent: Friday, August 2, 2013 11:16 AM
Subject: [R] Regression Column names instead of numbers