Hi guys
I am new to R and am trying to run a set of regressions.
I have two matrices with 200 time series each.
To loop over the columns, I used the following command:
sapply(1:200, function(x) summary(lm(formula=matrix1[,x]~matrix2[,x])))
Each column/time series has a unique name: matrix1 holds 200 cities and
matrix2 holds 200 stocks.
However, when I run the command I get the following result:
[[1]]
Call:
lm(formula = matrix1[, x] ~ matrix2[, x])
Residuals:
   Min     1Q Median     3Q    Max 
-134.9  -68.6  -32.8   33.2  261.2 
Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)          525.2356    69.8059    7.52  9.1e-10 ***
matrix2[, x]             0.0640     0.0161    3.98  0.00023 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
Residual standard error: 113 on 50 degrees of freedom
  (41 observations deleted due to missingness)
Multiple R-squared: 0.24,	Adjusted R-squared: 0.225 
F-statistic: 15.8 on 1 and 50 DF,  p-value: 0.000226 
[[2]]
Call:
lm(formula = matrix1[, x] ~ matrix2[, x])
Residuals:
   Min     1Q Median     3Q    Max 
-914.9 -393.3  -76.9  243.3 1304.7 
Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)          3.33e+03   1.88e+02   17.70  < 2e-16 ***
matrix2[, x]            4.10e-01   7.87e-02    5.21  3.4e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
Residual standard error: 531 on 51 degrees of freedom
  (40 observations deleted due to missingness)
Multiple R-squared: 0.348,	Adjusted R-squared: 0.335 
F-statistic: 27.2 on 1 and 51 DF,  p-value: 3.4e-06 
Instead of the x's in the Call line and the coefficient names, I'd like to
see the column names (city and stock).
Is this possible by any means?
Thanks in advance for your help.
Best
Tom
--
View this message in context:
http://r.789695.n4.nabble.com/Regression-Column-names-instead-of-numbers-tp4672904.html
Sent from the R help mailing list archive at Nabble.com.
It depends on how fancy you want to get. A quick fix would be
pair_names <- paste0(colnames(matrix1), ".", colnames(matrix2))
# lapply suits this better than sapply: the result is a list anyway
results <- lapply(seq_len(ncol(matrix1)), function(x)
  summary(lm(formula = matrix1[, x] ~ matrix2[, x])))
names(results) <- pair_names
results
Each element of results is now labelled city.stock, so the printout is
headed by those labels instead of [[1]], [[2]], and so on. The Call line
inside each summary still shows matrix1[, x] ~ matrix2[, x].
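As a small illustration of the naming approach above, here is a sketch on toy data. The matrices, city names, and stock names are made up for the example and are not the poster's data. Once the list carries names, a single fit can be pulled out by its label.

```r
# Toy stand-ins for matrix1/matrix2 (hypothetical names and random values)
set.seed(42)
matrix1 <- matrix(rnorm(30), ncol = 3,
                  dimnames = list(NULL, c("Zurich", "Geneva", "Basel")))
matrix2 <- matrix(rnorm(30), ncol = 3,
                  dimnames = list(NULL, c("StockA", "StockB", "StockC")))

# Label each regression with its city.stock pair
pair_names <- paste0(colnames(matrix1), ".", colnames(matrix2))
results <- lapply(seq_len(ncol(matrix1)), function(x)
  summary(lm(matrix1[, x] ~ matrix2[, x])))
names(results) <- pair_names

# Look up one regression by its city.stock label
zurich_fit <- results[["Zurich.StockA"]]
coef(zurich_fit)   # coefficient table for that pair only
```

This assumes the columns of the two matrices are already in matching order, as in the original loop.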
-------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of TMiller
Sent: Friday, August 2, 2013 10:17 AM
To: r-help at r-project.org
Subject: [R] Regression Column names instead of numbers
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
You could try:
set.seed(25)
mt1 <- matrix(sample(c(NA, 1:40), 20 * 200, replace = TRUE), ncol = 200)
colnames(mt1) <- paste0("X", 1:200)
set.seed(487)
mt2 <- matrix(sample(c(NA, 1:80), 20 * 200, replace = TRUE), ncol = 200)
colnames(mt2) <- colnames(mt1)
res <- lapply(colnames(mt1), function(x) {
  # one two-column data frame per pair, with the column names as variable names
  x1 <- data.frame(mt1[, x], mt2[, x])
  colnames(x1) <- paste0(c("mt1", "mt2"), x)
  summary(lm(as.formula(paste(colnames(x1)[1], "~", colnames(x1)[2], sep = "")),
             data = x1))
})
res
[[1]]
Call:
lm(formula = as.formula(paste(colnames(x1)[1], "~", colnames(x1)[2], 
    sep = "")), data = x1)
Residuals:
    Min      1Q  Median      3Q     Max 
-16.799  -8.821  -1.059   8.414  19.544 
Coefficients:
            Estimate Std. Error t value Pr(>|t|)  
(Intercept)  14.7292     6.2952    2.34    0.031 *
mt2X1         0.1302     0.1342    0.97    0.345  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 11.53 on 18 degrees of freedom
Multiple R-squared:  0.04967,    Adjusted R-squared:  -0.003127 
F-statistic: 0.9408 on 1 and 18 DF,  p-value: 0.3449
[[2]]
Call:
lm(formula = as.formula(paste(colnames(x1)[1], "~", colnames(x1)[2], 
    sep = "")), data = x1)
Residuals:
    Min      1Q  Median      3Q     Max 
-17.641  -6.809  -2.255   5.235  19.684 
Coefficients:
            Estimate Std. Error t value Pr(>|t|)  
(Intercept)   3.7745     5.1715   0.730   0.4754  
mt2X2         0.2635     0.1155   2.283   0.0356 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 10.15 on 17 degrees of freedom
  (1 observation deleted due to missingness)
Multiple R-squared:  0.2346,    Adjusted R-squared:  0.1896 
F-statistic:  5.21 on 1 and 17 DF,  p-value: 0.0356
A.K.
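In the output above, the coefficient row carries the column name, but the Call line still shows the as.formula(paste(...)) expression. One possible way to get the names into the Call as well is to splice the finished formula into the lm() call with bquote() before evaluating it. The following is only a sketch under assumptions: the toy matrices and their city and stock names are invented, fit_pair is a hypothetical helper, and the column names are assumed to be syntactically valid R names (no spaces or leading digits).

```r
# Toy stand-ins for the two matrices (hypothetical names and random values)
set.seed(1)
matrix1 <- matrix(rnorm(60), ncol = 3,
                  dimnames = list(NULL, c("Zurich", "Geneva", "Basel")))
matrix2 <- matrix(rnorm(60), ncol = 3,
                  dimnames = list(NULL, c("StockA", "StockB", "StockC")))

# Hypothetical helper: regress column i of matrix1 on column i of matrix2
fit_pair <- function(i) {
  city  <- colnames(matrix1)[i]
  stock <- colnames(matrix2)[i]
  d <- data.frame(matrix1[, i], matrix2[, i])
  names(d) <- c(city, stock)
  f <- reformulate(stock, response = city)   # e.g. Zurich ~ StockA
  # bquote() substitutes the formula itself into the call, so the call
  # stored by lm() shows the names rather than an expression that builds them
  summary(eval(bquote(lm(.(f), data = d))))
}

results <- lapply(seq_len(ncol(matrix1)), fit_pair)
names(results) <- paste(colnames(matrix1), colnames(matrix2), sep = ".")
results[[1]]$call   # the Call line now contains Zurich ~ StockA
```

Names that are not valid R identifiers would need to be wrapped in backticks or cleaned with make.names() before building the formula.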
----- Original Message -----
From: TMiller <thomas.mueller at student.unisg.ch>
To: r-help at r-project.org
Cc: 
Sent: Friday, August 2, 2013 11:16 AM
Subject: [R] Regression Column names instead of numbers