thr3ads.net - R help - [R] Working With Variables Having Different Lengths [Oct 2011]

Rich Shepard

2011-Oct-21 16:13 UTC

[R] Working With Variables Having Different Lengths

Because of regulatory requirement changes over several decades and weather
conditions preventing site access, the variables in my data set have
different lengths. I'd like guidance on how to perform linear regressions
and other models with these variables.

   For example, there are 2206 rows for the parameter "TDS" but only 1191
rows for the parameter "Cond." Such discrepancies are common in these
data.

   Is there a reference I can read to learn how to analyze such data?

Rich
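
To picture the situation described above, here is a minimal base-R sketch of how parameters with different row counts can be lined up on shared sampling keys so the gaps show up as explicit NA values. The column names (site, date) and all of the numbers are made up for illustration; the original post does not say how the data are keyed.

```r
# Hypothetical records: TDS was measured on every visit, Cond on only some.
tds <- data.frame(
  site = c("A", "A", "B", "B"),
  date = as.Date(c("2001-05-01", "2001-06-01", "2001-05-01", "2001-06-01")),
  TDS  = c(310, 295, 420, 405)
)
cond <- data.frame(
  site = c("A", "B"),
  date = as.Date(c("2001-05-01", "2001-06-01")),
  Cond = c(480, 610)
)

# Full outer join on the assumed keys; visits without a Cond reading get NA.
wide <- merge(tds, cond, by = c("site", "date"), all = TRUE)
print(wide)

# Count the missing values per column.
print(colSums(is.na(wide)))
```

Once the variables share one data frame, the "different lengths" become an ordinary missing-data pattern that modelling functions can handle through their na.action argument.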

Weidong Gu

2011-Oct-21 16:39 UTC

It sounds like you are dealing with a missing-data problem. By default, lm
and glm keep only observations with complete records (complete-case
analysis). This can be problematic if many variables have missing values
and the values are not missing completely at random (i.e., missingness
depends on other measured or unmeasured variables, or on the missing
values themselves). Imputation is a common tool for analyzing incomplete
data. In R, you can find packages that perform single or multiple
imputation, e.g. randomForest, norm, mice, and mi.
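
As a rough illustration of the complete-case behaviour described above, the sketch below fits lm to simulated data with some Cond values deleted. The data, the 0.65 slope, and the number of deleted values are assumptions chosen only for the demonstration.

```r
set.seed(1)
n <- 50

# Simulated (made-up) data: TDS roughly proportional to conductivity.
cond_sim <- rnorm(n, mean = 500, sd = 50)
dat <- data.frame(
  TDS  = 0.65 * cond_sim + rnorm(n, sd = 10),
  Cond = cond_sim
)

# Delete 20 Cond values to mimic visits where it was not measured.
dat$Cond[sample(n, 20)] <- NA

# With the default na.action (na.omit), rows containing NA are dropped.
fit <- lm(TDS ~ Cond, data = dat)
print(nobs(fit))                 # number of rows actually used
print(sum(complete.cases(dat)))  # number of complete rows in the data

# na.exclude fits the same model but pads residuals back to full length.
fit_padded <- lm(TDS ~ Cond, data = dat, na.action = na.exclude)
print(length(residuals(fit_padded)))
```

Comparing nobs(fit) with nrow(dat) is a quick way to see how much data a complete-case fit has silently discarded.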

There is no easy way out with missing-data problems: all imputation
methods rest on strong and untestable assumptions.

Weidong Gu

B77S

2011-Oct-21 16:56 UTC

In my experience, "Cond" (conductivity?) doesn't vary much within a
stream except during high-flow events, and I would imagine the same is
true for TDS.  If these are all low-flow values, you could possibly
determine a mean or median value to use for the missing data points.
Obviously this will be different if you are sampling storm events.  If you
have stage data and lots of data points, you may be able to model the
parameters as a function of stage.
HTH
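
The two suggestions above (fill gaps with a median, or model the parameter as a function of stage) can be sketched in base R as follows. This is only an illustration under stated assumptions: the stage values, the linear Cond ~ stage relationship, and the number of gaps are all simulated, and a straight line may not suit real stream data.

```r
set.seed(2)
n <- 40

# Simulated (made-up) stage readings and a hypothetical linear relationship.
stage <- runif(n, min = 0.5, max = 3)
dat <- data.frame(
  stage = stage,
  Cond  = 600 - 80 * stage + rnorm(n, sd = 15)
)
dat$Cond[sample(n, 12)] <- NA   # 12 visits without a Cond reading

# Option 1: replace missing values with the median of the observed ones.
med <- median(dat$Cond, na.rm = TRUE)
dat$Cond_median <- ifelse(is.na(dat$Cond), med, dat$Cond)

# Option 2: fit Cond on stage using complete rows, then predict the gaps.
fit <- lm(Cond ~ stage, data = dat)
dat$Cond_stage <- ifelse(is.na(dat$Cond),
                         predict(fit, newdata = dat),
                         dat$Cond)

# Median filling shrinks the spread of the variable; compare the variances.
print(c(observed    = var(dat$Cond, na.rm = TRUE),
        median_fill = var(dat$Cond_median)))
```

Either option is a single imputation, so standard errors from a later model fitted to the filled-in column will tend to be too small; that is one reason the multiple-imputation packages mentioned earlier in the thread exist.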
