thr3ads.net - R help - [R] Unsuccessful beginner's struggle with lm [Aug 2013]

David Epstein

2013-Aug-29 12:23 UTC

[R] Unsuccessful beginner's struggle with lm

I have two data frames, "train" and "response". Here is my attempt to do a
linear regression. All entries of both data frames are numeric. I am
expecting the intercept value to lie between 2 and 3 (in particular,
non-zero).

Here is a record of my interaction with R:
> class(response)
[1] "data.frame"
> c(nrow(response),ncol(response))
[1] 1389    1
> class(train)
[1] "data.frame"
> c(nrow(train),ncol(train))
[1] 1389  256
> beta.lm <- lm(response ~ train)
Error in model.frame.default(formula = response ~ train, drop.unused.levels = TRUE) :
  invalid type (list) for variable 'response'

What elementary syntax error am I making in my call to lm? And why does R
think at first that the class of "response" is data.frame, but that its
class is "list" when I call lm?

Thanks
David

Duncan Murdoch

2013-Aug-29 12:39 UTC

On 13-08-29 8:23 AM, David Epstein wrote:
> I have two data frames, "train" and "response". Here is my attempt to do a
> linear regression. All entries of both data frames are numeric. I am
> expecting the intercept value to lie between 2 and 3 (in particular,
> non-zero).

lm expects the variables in the formula to be numeric vectors (or
factors).  They are often columns of a data frame, but they cannot be
data frames themselves.

>
> Here is a record of my interaction with R:
>
>> class(response)
> [1] "data.frame"
>> c(nrow(response),ncol(response))
> [1] 1389    1
>> class(train)
> [1] "data.frame"
>> c(nrow(train),ncol(train))
> [1] 1389  256
>> beta.lm <- lm(response ~ train)
> Error in model.frame.default(formula = response ~ train, drop.unused.levels
> = TRUE) :
>    invalid type (list) for variable 'response'
>
> What elementary syntax error am I making in my call to lm? And why does R
> think at first that the class of "response" is data.frame, but that its
> class is "list" when I call lm?

Data frames are lists with some extra rules added.  lm() is just
reporting the low-level type rather than the high-level class.
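
The difference between the high-level class and the low-level type can be seen directly at the prompt. This is only a small sketch: the three-row data frame and its column name "y" are made up for illustration and are not the poster's data.

```r
# Hypothetical stand-in for the poster's one-column data frame
response <- data.frame(y = c(2.1, 2.7, 3.0))

class(response)    # high-level class, as reported at the prompt
typeof(response)   # low-level storage type, which is what the lm() error names
is.list(response)  # a data frame passes the list test

# A single column extracted with [[ ]] is an ordinary numeric vector,
# which is the kind of object a formula variable should be
typeof(response[["y"]])
```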

The way to do what you want is to include the response as a column in
the same dataframe that includes the predictor variables.  If you call
the dataframe "df" and the response column name "response", then the lm
call would look like

lm(response ~ ., data=df)

The "." here means "all the other columns".  You could also list them
explicitly, but 256 of them sounds like a lot...
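
As a rough sketch of the whole recipe, assuming simulated data in place of the real "train" and "response" (the sizes, column names x1..x3 and y, and the true coefficients are all invented here), the two data frames can be combined with cbind() before calling lm:

```r
set.seed(1)
n <- 50

# Made-up predictors standing in for the 256 columns of "train"
train <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))

# Made-up one-column response with a true intercept of 2.5
response <- data.frame(y = 2.5 + 1.0 * train$x1 - 0.5 * train$x2 +
                           rnorm(n, sd = 0.1))

# Pull the single response column out as a vector and attach it to the
# predictors under the name "response"
df <- cbind(train, response = response[[1]])

# "." expands to every column of df other than the response
fit <- lm(response ~ ., data = df)
coef(fit)
```

This assumes no column of "train" is already called "response"; if one is, pick a different name for the new column and use that name on the left of the formula.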

Duncan Murdoch
