Dear readers I would like to make General Linear Model (GLM) for the following data set http://louise.hoffman.googlepages.com/fuel.csv The code I have written is fuelData<-read.table('fuel.csv',header=TRUE, sep=',') n<-dim(fuelData)[1] xOnes<- matrix(1,nrow=n,ncol=1) x<-cbind(xOnes,fuelData[,3]) y<-fuelData[,4] theta<-((t(x)%*%x)^(-1))%*%t(x)%*%y which gives> theta[,1] [1,] 215.8374077 [2,] 0.1083491 When I do it in Matlab I get theta to be 79.69 0.18 which is correct. ~79 is the crossing of the y-axis. Have I perhaps written theta wrong? The formula for theta is (alpha,beta)^T = (x^T * x)^(-1) * x^T * Y where ^T means transposed. Can someone see where the mistake is? Hugs, Louise
On 3/03/2008, at 9:18 AM, Louise Hoffman wrote:> Dear readers > > I would like to make General Linear Model (GLM) for the following > data set > http://louise.hoffman.googlepages.com/fuel.csv > > The code I have written is > > fuelData<-read.table('fuel.csv',header=TRUE, sep=',') > n<-dim(fuelData)[1] > xOnes<- matrix(1,nrow=n,ncol=1) > x<-cbind(xOnes,fuelData[,3]) > y<-fuelData[,4] > theta<-((t(x)%*%x)^(-1))%*%t(x)%*%y > > which gives > >> theta > [,1] > [1,] 215.8374077 > [2,] 0.1083491 > > When I do it in Matlab I get theta to be > 79.69 > 0.18 > > which is correct. ~79 is the crossing of the y-axis.This is certainly ***NOT*** correct. (If you really got those numbers from Matlab, then Matlab is up to Puttee.) Have you plotted your data? (1) Fitting a straight line is ridiculous. (2) If you are so foolish as to fit a straight line, you get theta to have entries -4197.96 (intercept) and 2.16 (slope). The line y = 79.69 + 0.18*x is off the edge of the graph and does not even appear.> > Have I perhaps written theta wrong?Yes. The expression (t(x)%*%x)^(-1) is the matrix of entry by entry reciprocals of the entries of t(x)%*%x. You want: theta <- solve(t(x)%*%x))%*%t(x)%*%y Anyhow, if you're going to use R, why not ***use R***? fit <- lm(fpi ~ rtime,data=fuelData) theta <- coef(fit) This gives an answer identical to that from the corrected version of your ``from scratch'' expression. (That expression, while theoretically correct, is numerically ill-advised. The cognoscenti use either the Choleski or the ``qr'' decomposition of t(x)%*%x to effect the calculations. One of these is what is going on in the bowels of lm(). But here I speak of that of which I know little.) cheers, Rolf Turner ###################################################################### Attention:\ This e-mail message is privileged and confid...{{dropped:9}}
On 02-Mar-08 20:18:29, Louise Hoffman wrote:> Dear readers > > I would like to make General Linear Model (GLM) for the > following data set http://louise.hoffman.googlepages.com/fuel.csv > > The code I have written is > > fuelData<-read.table('fuel.csv',header=TRUE, sep=',') > n<-dim(fuelData)[1] > xOnes<- matrix(1,nrow=n,ncol=1) > x<-cbind(xOnes,fuelData[,3]) > y<-fuelData[,4] > theta<-((t(x)%*%x)^(-1))%*%t(x)%*%y > > which gives > >> theta > [,1] > [1,] 215.8374077 > [2,] 0.1083491 > > When I do it in Matlab I get theta to be > 79.69 > 0.18 > > which is correct. ~79 is the crossing of the y-axis. > > Have I perhaps written theta wrong? The formula for theta is > (alpha,beta)^T = (x^T * x)^(-1) * x^T * Y > > where ^T means transposed.Unfortunately, x^(-1) is not the inverse of x: x<-matrix(c(2,4,4,5),nrow=2) x # [,1] [,2] # [1,] 2 4 # [2,] 4 5 x^(-1) # [,1] [,2] # [1,] 0.50 0.25 # [2,] 0.25 0.20 i.e. it is the matrix which you get by applying the operation (...)^(-1) to each element of x. In R, the inverse of a non-singular matrix is (somewhat obscurely) denoted by solve(x): solve(x) # [,1] [,2] # [1,] -0.8333333 0.6666667 # [2,] 0.6666667 -0.3333333 solve(x)%*%x # [,1] [,2] # [1,] 1 1.110223e-16 # [2,] 0 1.000000e+00 (Note the slight rounding error); whereas (x^(-1))%*%x # [,1] [,2] # [1,] 2.0 3.25 # [2,] 1.3 2.00 Best wishes, Ted. -------------------------------------------------------------------- E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk> Fax-to-email: +44 (0)870 094 0861 Date: 02-Mar-08 Time: 21:05:50 ------------------------------ XFMail ------------------------------
> Unfortunately, x^(-1) is not the inverse of x: > > x<-matrix(c(2,4,4,5),nrow=2) > x > # [,1] [,2] > # [1,] 2 4 > # [2,] 4 5 > > x^(-1) > # [,1] [,2] > # [1,] 0.50 0.25 > # [2,] 0.25 0.20 > > i.e. it is the matrix which you get by applying the operation > (...)^(-1) to each element of x. > > In R, the inverse of a non-singular matrix is (somewhat obscurely) > denoted by solve(x): > > solve(x) > # [,1] [,2] > # [1,] -0.8333333 0.6666667 > # [2,] 0.6666667 -0.3333333 > > solve(x)%*%x > # [,1] [,2] > # [1,] 1 1.110223e-16 > # [2,] 0 1.000000e+00 > (Note the slight rounding error); whereas > > (x^(-1))%*%x > # [,1] [,2] > # [1,] 2.0 3.25 > # [2,] 1.3 2.00Thanks for the very detailed explanation! I will never make that mistake again =)