Hi all,

I have got two datasets, one of them is rainfall data and the other one is groundwater level data.

I would like to see whether there is a correlation between these two datasets and if there is, to what extent they are correlated.

My stats background is limited, therefore any advice on which command I should use in R would be greatly appreciated.

Thanks in advance.
Chris

--
View this message in context: http://www.nabble.com/Statistical-analysis-tp25531331p25531331.html
Sent from the R help mailing list archive at Nabble.com.

Chris Li wrote:
> I would like to see whether there is a correlation between these two
> datasets and if there is, to what extent they are correlated.

Supposing you have two variables -- precipitation, p, and groundwater potential, h -- a simple check for linear correlation is to produce a scatterplot of h vs. p:

    plot( h ~ p )

If it looks linear, then it may be worthwhile to have R estimate the coefficient of correlation for the data:

    cor( p, h )

If the correlation coefficient is close to +/- 1, then your data is exhibiting a strong linear trend and a linear model may be appropriate. The abline() call adds the fitted line to the scatterplot:

    linModel <- lm( h ~ p )
    abline( linModel )

Good luck!

-Charlie

-----
Charlie Sharpsteen
Undergraduate
Environmental Resources Engineering
Humboldt State University
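
To put a number on "to what extent", the steps above can be sketched with cor.test(), which reports a confidence interval and a p-value alongside the coefficient. This is only a sketch: the data below are simulated stand-ins, and the shape, slope and noise values are assumptions made up for illustration, not anything from the original question.

```r
# Simulated stand-in data; every number here is hypothetical.
set.seed(1)
p <- rgamma(200, shape = 2, rate = 0.5)     # made-up rainfall totals
h <- 10 + 0.3 * p + rnorm(200, sd = 0.5)    # made-up groundwater level

r  <- cor(p, h)          # Pearson correlation coefficient
ct <- cor.test(p, h)     # test of zero correlation, with a 95% confidence interval
linModel <- lm(h ~ p)    # least-squares line, as in the message above

print(r)
print(ct$conf.int)
print(coef(linModel))
```

For a straight-line fit with one predictor, the squared correlation equals the R-squared reported by summary(linModel), so the two outputs can be cross-checked against each other.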

Chris Li wrote:
> My stats background is limited, therefore any advice on which command I
> should use in R would be greatly appreciated.

Hi,

My advice would be to get an introductory statistics book and start with that. There is an introductory stats book by Dalgaard that uses R. That kills two birds with one stone.

http://www.amazon.com/Introductory-Statistics-R-Peter-Dalgaard/dp/0387954759

cheers,
Paul

--
Drs. Paul Hiemstra
Department of Physical Geography
Faculty of Geosciences
University of Utrecht
Heidelberglaan 2
P.O. Box 80.115
3508 TC Utrecht
Phone: +3130 274 3113 Mon-Tue
Phone: +3130 253 5773 Wed-Fri
http://intamap.geo.uu.nl/~paul

Hi Chris,

If I understand your question correctly, what you want is both easy and hard.

Easy:

    # making a reproducible example, as asked in the posting guide
    set.seed(1)  # fix the random seed so the numbers can be reproduced
    # two vectors
    water <- rnorm(1000)
    rain <- rgamma(1000, .5)
    # the following does everything you mention and more
    summary(lm(water ~ rain))
    cor(water, rain)

Hard:

The usual inference from lm() and cor() assumes independence of observations, linearity of the relation, and normality of the residuals. Are these assumptions valid for your problem?

Are your datasets time series? There will be ??autocorrelation in both datasets. There may be a ?lag. Decide whether to estimate and correct for those.
Are there multiple sample locations? There may be dependence.
Would you rather assume rain and change in groundwater level are related?
Etc.

Cheers,
Arien

Chris Li wrote:
> I would like to see whether there is a correlation between these two
> datasets and if there is, to what extent they are correlated.

--
drs. H.A. (Arien) Lam (Ph.D. student)
Department of Physical Geography
Faculty of Geosciences
Utrecht University, The Netherlands
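
The autocorrelation and lag questions raised above can be explored with acf() and ccf() from base R. The following is a rough sketch on simulated series; the two-step delay, the 0.5 memory coefficient and the noise level are assumptions invented for the example and say nothing about real groundwater behaviour.

```r
# Simulated stand-in series; all parameters are hypothetical.
set.seed(2)
n <- 500
rain <- rgamma(n, shape = 0.5)          # made-up daily rainfall

# Made-up groundwater level: responds to rain two steps earlier
# and carries over half of its previous value.
water <- numeric(n)
for (i in 3:n) {
  water[i] <- 0.5 * water[i - 1] + 0.5 * rain[i - 2] + rnorm(1, sd = 0.1)
}

a  <- acf(water, plot = FALSE)                       # autocorrelation of the water series
cc <- ccf(rain, water, lag.max = 10, plot = FALSE)   # cross-correlation at lags -10..10

# ccf(x, y) at lag k estimates cor(x[t + k], y[t]),
# so a negative k here means rain leads water.
best_lag <- cc$lag[which.max(cc$acf)]

print(a$acf[2])    # lag-1 autocorrelation of water
print(best_lag)
```

Dropping plot = FALSE draws the usual correlograms, which is often the quickest way to see whether independence is a tenable assumption.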

Rainfall data may behave like a random walk process, in which case it is non-stationary. If a correlation or regression coefficient is then measured on the raw data, you may land in the world of spurious measures. I would suggest first checking whether there is a unit root in your data. If there is, estimate the correlation or any other statistical measure on the differenced data.

Best,

cls59 wrote:
> If it looks linear, then it may be worthwhile to have R estimate the
> coefficient of correlation for the data:
>
>     cor( p, h )
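
A minimal sketch of that check, using only base R. The message above does not name a particular test; PP.test() (Phillips-Perron, in the stats package) is swapped in here as one option, and adf.test() from the tseries package would be another. The two series are independent simulated random walks, which is an assumption chosen to show how raw correlations can mislead.

```r
# Two independent simulated random walks; purely illustrative.
set.seed(4)
n <- 500
x <- cumsum(rnorm(n))
y <- cumsum(rnorm(n))

cor_raw <- cor(x, y)       # can be far from zero although x and y are unrelated
pp_raw  <- PP.test(x)      # null hypothesis: x has a unit root

dx <- diff(x)              # first differences remove the unit root
dy <- diff(y)
pp_diff  <- PP.test(dx)    # the differenced series should reject the unit root
cor_diff <- cor(dx, dy)    # correlation measured on the differenced data

print(cor_raw)
print(pp_raw$p.value)
print(pp_diff$p.value)
print(cor_diff)
```

A large p-value from the test on the raw series is a sign that differencing is worth doing before any correlation is reported.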

Since today's groundwater may be influenced by yesterday's rainfall, you may want to look at the dynlm package, and possibly lag.plot and the zoo package.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Chris Li
> Sent: Wednesday, September 23, 2009 5:37 PM
> To: r-help at r-project.org
> Subject: [R] Statistical analysis
>
> I would like to see whether there is a correlation between these two
> datasets and if there is, to what extent they are correlated.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
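
The idea of regressing today's level on yesterday's rainfall can be sketched in base R by building the lagged predictor by hand. This does not use dynlm or zoo, which handle the alignment for you; the variable names and the 0.6 effect size below are hypothetical, and the data are simulated.

```r
# Simulated stand-in data; all numbers are made up.
set.seed(3)
n <- 200
rain <- rgamma(n, shape = 0.5)

rain_lag1 <- c(NA, rain[-n])                    # yesterday's rainfall, aligned with today
water <- 5 + 0.6 * rain_lag1 + rnorm(n, sd = 0.1)

# lm() drops the first row, where the lagged value is NA, by default.
fit <- lm(water ~ rain + rain_lag1)
print(coef(fit))

# In an interactive session, lag.plot(rain, lags = 4) shows scatterplots of a
# series against its own lagged values.
```

Including both rain and rain_lag1 lets the fit show which of the two carries the signal; with real data, more lags may be needed.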