I'm trying to develop a linear model for crop productivity based on variables published as part of the SSURGO database released by the USDA. My default is to just run lm() with continuous predictor variables as numeric and discrete predictor variables as factors. However, some of the discrete variables are ordinal (e.g. drainage class, which ranges from excessively drained to excessively poorly drained), and treating them as plain factors doesn't make use of the fact that their levels have a known order.

How do I correctly set up a regression model (with lm or similar) to detect the influence of ordinal variables?

How will the output differ from the dummy-variable output for unordered categorical variables?

Thanks,
Alex
I would consider this a question for a statistics forum such as stats.stackexchange.com, not R-help, which is about R programming. They do sometimes intersect, as here, but I think you need to *understand what you're doing* before you write the R code to do it.

Obviously, IMO.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Thu, Oct 5, 2017 at 10:54 AM, Alexandra Thorn <alexandra.thorn at gmail.com> wrote:
> How do I correctly set up a regression model (with lm or similar) to
> detect the influence of ordinal variables?
>
> How will the output differ compared to the dummy variable outputs for
> unordered categorical variables.
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
This article may be helpful, at least to get you started:

https://www.r-bloggers.com/ordinal-data/

Cheers,
Boris

> On Oct 5, 2017, at 3:35 PM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
>
> I would consider this a question for a statistics forum such as
> stats.stackexchange.com, not R-help, which is about R programming. They do
> sometimes intersect, as here, but I think you need to *understand what
> you're doing* before you write the R code to do it.
>
> Obviously, IMO.
>
> Cheers,
> Bert
Try looking at the help page for factor
?factor
for something to start with.
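
As a rough illustration of where ?factor leads (a sketch on simulated data, not a recommendation for the actual SSURGO analysis): the data frame `soil`, the columns `drainage` and `yield`, the level labels, and the effect sizes below are all made up for the example. It fits the same lm() twice, once with drainage as an unordered factor and once as an ordered factor, to show how the coefficient labels differ.

```r
set.seed(1)

# Hypothetical drainage classes, listed from driest to wettest (illustrative labels)
classes <- c("excessively drained", "well drained", "moderately well drained",
             "poorly drained", "very poorly drained")

# Simulated data: yield declines as drainage gets poorer (made-up effect sizes)
n <- 200
idx <- sample(seq_along(classes), n, replace = TRUE)
soil <- data.frame(
  drainage = classes[idx],
  yield    = 10 - 0.8 * idx + rnorm(n)
)

# Unordered factor: lm() uses treatment (dummy) contrasts by default,
# giving one coefficient per non-reference level
soil$drain_f <- factor(soil$drainage, levels = classes)
fit_f <- lm(yield ~ drain_f, data = soil)

# Ordered factor: lm() uses orthogonal polynomial contrasts by default,
# giving linear (.L), quadratic (.Q), cubic (.C) and 4th-order (^4) terms
soil$drain_o <- factor(soil$drainage, levels = classes, ordered = TRUE)
fit_o <- lm(yield ~ drain_o, data = soil)

names(coef(fit_f))
names(coef(fit_o))

# Both codings span the same model space, so the fitted values agree
all.equal(fitted(fit_f), fitted(fit_o))
```

The two fits are the same model in different parameterizations. With the ordered factor, a large .L term alongside small higher-order terms would suggest a roughly linear trend across the levels, under the default contrasts' assumption that the levels are equally spaced.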
--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
Lab cell 925-724-7509