Aaron Barzilai
2007-Dec-27 23:22 UTC
[R] Help with lm and multiple linear regression? (Plain Text version)
(Apologies, the previous version was sent as rich text.)

Hello,

I'm new to R, but I've read the intro to R and successfully connected it to an instance of MySQL. I'm trying to perform multiple linear regression, but I'm having trouble using the lm function. To start, I have read in a simple y matrix of values (the dependent variable) and an x matrix of independent variables. R reports that both are data frames, but lm gives me an error saying that my y variable is a list.

Any suggestions on how to do this? It's not clear to me what the problem is, as they're both data frames. My actual problem will use a much wider matrix of coefficients; I've only included two columns for illustration.

Additionally, I'd actually like to weight the observations. How would I go about doing that? I also have the weights as a separate column vector.

Thanks,
Aaron

Here's my session:

> margin
    margin
1    66.67
2   -58.33
3   100.00
4   -33.33
5   200.00
6   -83.33
7  -100.00
8     0.00
9   100.00
10  -18.18
11  -55.36
12 -125.00
13  -33.33
14 -200.00
15    0.00
16 -100.00
17   75.00
18    0.00
19 -200.00
20   35.71
21  100.00
22   50.00
23  -86.67
24  165.00
> personcoeff
   Person1 Person2
1       -1       1
2       -1       1
3       -1       1
4       -1       1
5       -1       1
6       -1       1
7        0       0
8        0       0
9        0       1
10      -1       1
11      -1       1
12      -1       1
13      -1       1
14      -1       0
15       0       0
16       0       0
17       0       1
18      -1       1
19      -1       1
20      -1       1
21      -1       1
22      -1       1
23      -1       1
24      -1       1
> class(margin)
[1] "data.frame"
> class(personcoeff)
[1] "data.frame"
> lm(margin~personcoeff)
Error in model.frame(formula, rownames, variables, varnames, extras, extranames, :
  invalid type (list) for variable 'margin'
Tim Calkins
2007-Dec-27 23:55 UTC
[R] Help with lm and multiple linear regression? (Plain Text version)
consider merging everything into a single data frame. i haven't tried it, but something like the following could work:

> reg.data <- cbind(margin, personcoeff)
> names(reg.data) <- c('margin', 'p1', 'p2')
> lm(margin~p1+p2, data = reg.data)

the idea here is that by specifying the data frame with the data argument in lm, R looks up the names used in the formula as columns of that data frame.

for weights, see ?lm and look for the weights argument.

cheers,
tc
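The suggestion in this reply can be sketched as follows, using the data from the original session. The weights column w is invented purely for illustration, since the thread never shows the actual weights; the other names follow the reply. This is a sketch under those assumptions, not a tested recipe for the real data set.

```r
# Data copied from the session in the original post.
margin <- data.frame(margin = c(
   66.67, -58.33,  100.00,  -33.33, 200.00,  -83.33, -100.00,   0.00,
  100.00, -18.18,  -55.36, -125.00, -33.33, -200.00,    0.00, -100.00,
   75.00,   0.00, -200.00,   35.71, 100.00,   50.00,  -86.67,  165.00))
personcoeff <- data.frame(
  Person1 = c(rep(-1, 6), rep(0, 3), rep(-1, 5), rep(0, 3), rep(-1, 7)),
  Person2 = c(rep(1, 6), 0, 0, 1, rep(1, 4), 0, 0, 0, 1, rep(1, 7)))

# Merge response and predictors into one data frame and rename the columns.
reg.data <- cbind(margin, personcoeff)
names(reg.data) <- c("margin", "p1", "p2")

# Hypothetical observation weights (assumption: not given in the thread).
reg.data$w <- rep(c(1, 2), length.out = nrow(reg.data))

# The formula names are looked up as columns of reg.data, and so is w.
fit <- lm(margin ~ p1 + p2, data = reg.data, weights = w)
print(coef(fit))
```

The original call failed because a data frame is a list of columns, while lm expects each variable in the formula to be a vector or a matrix. Naming the columns in the formula and passing the data frame through the data argument avoids that.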
Aaron Barzilai
2007-Dec-28 02:20 UTC
[R] Help with lm and multiple linear regression? (Plain Text version)
Tim (and others who responded privately),

Thanks for the help; this approach did work. I have also reread ?lm more closely, and I now see the weights argument.

I have one last question. Now that I understand how to call this function and review the results, I want to extend it to my much larger real problem, which has hundreds of columns. Is there a way to call the function with more of a matrix-algebra syntax, where I would pass the matrix (e.g. personcoeff) rather than listing the individual column names? It seems like I might need to use lm.wfit, but based on its help page I'd rather use lm.

Thanks,
Aaron
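Two ways this might be done while staying with lm are sketched below, again with the data from the original session. Neither comes from the thread itself: the weights vector w is hypothetical, and the object names fit_dot and fit_mat are made up for this example.

```r
# Data copied from the session in the original post.
margin <- data.frame(margin = c(
   66.67, -58.33,  100.00,  -33.33, 200.00,  -83.33, -100.00,   0.00,
  100.00, -18.18,  -55.36, -125.00, -33.33, -200.00,    0.00, -100.00,
   75.00,   0.00, -200.00,   35.71, 100.00,   50.00,  -86.67,  165.00))
personcoeff <- data.frame(
  Person1 = c(rep(-1, 6), rep(0, 3), rep(-1, 5), rep(0, 3), rep(-1, 7)),
  Person2 = c(rep(1, 6), 0, 0, 1, rep(1, 4), 0, 0, 0, 1, rep(1, 7)))

# Hypothetical weights (assumption). They are kept outside the data frame,
# because "." below would otherwise treat a weights column as a predictor.
w <- rep(c(1, 2), length.out = nrow(margin))

# Option 1: "." in the formula stands for every column of the data
# argument other than the response, however many there are.
reg.data <- cbind(margin, personcoeff)
fit_dot <- lm(margin ~ ., data = reg.data, weights = w)

# Option 2: pass a numeric vector as the response and a matrix on the
# right-hand side; each column of the matrix becomes one predictor.
y <- margin$margin
X <- as.matrix(personcoeff)
fit_mat <- lm(y ~ X, weights = w)

print(coef(fit_dot))
print(coef(fit_mat))
```

Both calls fit the same model, so the coefficients should agree; only the coefficient names differ. With the matrix form, the names are prefixed with the matrix name, for example XPerson1.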