Dear R-help community,

I have two arrays of precipitation data, array1 and array2, each with the same dimensions [longitude, latitude, time], dim=[30,32,43]. I need to correlate them. This is the code I used to get one overall correlation value for the whole area of interest:

> result <- cor(array1,array2,use="complete.obs")
> result

This gives me a single value, but I'm not convinced it is actually a correlation value for the total area over the total time period of 43 years. Can anybody tell me whether I am wrong in my coding and/or in my limited knowledge of the statistics of correlation?

Also, I wanted to produce a correlation map over the 43 years. Could you also advise me whether this is correct? I am more confident in this than in the code above:

> result <- array(NA, c(30,32))
>
> for(i in 1:30){
>   for(j in 1:32){
>     array1.ts <- array1[i,j,]
>     array2.ts <- array2[i,j,]
>     result[i,j] <- cor(array1.ts,array2.ts,use= "complete.obs")
>   }
> }

I appreciate your time very much. If I don't iron out this problem now, the groundwork for my entire PhD will not be stable at all.

Many thanks for reading my problem, happy 2007 :-)

Jenny Barnes

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Jennifer Barnes
PhD student - long range drought prediction
Climate Extremes
Department of Space and Climate Physics
University College London
Holmbury St Mary, Dorking
Surrey RH5 6NT
01483 204149
Web: http://climate.mssl.ucl.ac.uk
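For anyone following along without the data, here is a minimal sketch of the two calculations in the question, run on simulated arrays (the random data and the names `pooled` and `cor_map` are assumptions for illustration, not Jenny's data or variable names). As far as I can tell, cor() treats a three-dimensional array as a plain vector, which would explain the single value; flattening explicitly with as.vector() makes that reading unambiguous. That single number pools every (longitude, latitude, year) value as one observation, whereas the loop gives one correlation over the 43 years per grid cell.

```r
# Simulated stand-ins for the real data (assumption: same [lon, lat, time] layout)
set.seed(42)
dims <- c(30, 32, 43)
array1 <- array(rnorm(prod(dims)), dim = dims)
array2 <- 0.5 * array1 + array(rnorm(prod(dims)), dim = dims)
array1[1, 1, 5] <- NA   # one missing value, to exercise use = "complete.obs"

# Pooled correlation: all 30 * 32 * 43 values treated as one long sample
pooled <- cor(as.vector(array1), as.vector(array2), use = "complete.obs")

# Correlation map: one correlation over the 43 years for each grid cell
cor_map <- array(NA_real_, dim = dims[1:2])
for (i in seq_len(dims[1])) {
  for (j in seq_len(dims[2])) {
    cor_map[i, j] <- cor(array1[i, j, ], array2[i, j, ], use = "complete.obs")
  }
}

print(pooled)
print(dim(cor_map))
```

One thing to watch with real data: if a grid cell has no complete pairs at all, use = "complete.obs" stops with an error, while use = "na.or.complete" returns NA for that cell instead.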
Hi Jenny!

So if I understand your data correctly, you have 960 cases per year, and you have 43 years. Yes?

I'm not sure you should use correlation in this situation, because of the autocorrelation in the data. Spatial data like yours are strongly autocorrelated, and there is also very strong autocorrelation in time series data. I think you have to decompose your time series and remove the trend (and maybe some kind of seasonality), and then compute the correlation on the residuals. You also have to filter out the autocorrelation in the spatial data in some way. And because of the above problems, don't calculate a correlation for the entire datasets!

Bye,
Zoltan
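One possible reading of the "remove the trend, then correlate the residuals" suggestion is sketched below on simulated data. The linear trend, the helper names (`detrend`, `detrend_array`, `cor_map`) and the random arrays are all assumptions for illustration; a different trend model may suit real precipitation data better, and the sketch does nothing about spatial autocorrelation.

```r
# Simulated arrays sharing a linear trend in time (assumption, for illustration)
set.seed(1)
dims <- c(30, 32, 43)
years <- seq_len(dims[3])
trend <- array(rep(0.05 * years, each = dims[1] * dims[2]), dim = dims)
array1 <- trend + array(rnorm(prod(dims)), dim = dims)
array2 <- trend + array(rnorm(prod(dims)), dim = dims)

# Residuals of a linear fit against time; NAs stay in place via na.exclude
detrend <- function(x) {
  t <- seq_along(x)
  if (sum(!is.na(x)) < 3) return(rep(NA_real_, length(x)))
  as.numeric(residuals(lm(x ~ t, na.action = na.exclude)))
}

# apply() returns [time, lon, lat]; aperm() restores [lon, lat, time]
detrend_array <- function(a) aperm(apply(a, c(1, 2), detrend), c(2, 3, 1))

# Per-grid-cell correlation over time
cor_map <- function(a, b) {
  out <- matrix(NA_real_, dim(a)[1], dim(a)[2])
  for (i in seq_len(dim(a)[1])) for (j in seq_len(dim(a)[2]))
    out[i, j] <- cor(a[i, j, ], b[i, j, ], use = "na.or.complete")
  out
}

res1 <- detrend_array(array1)
res2 <- detrend_array(array2)
raw_map <- cor_map(array1, array2)  # inflated by the shared trend
res_map <- cor_map(res1, res2)      # correlation of the detrended residuals

print(c(raw = mean(raw_map), detrended = mean(res_map)))
```

In this simulation the only link between the two arrays is the common trend, so the raw map shows positive correlations on average while the detrended map averages close to zero.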
Hi Zoltan,

Right, I have 30x32=960 data points per year (it is actually the mean February precipitation total, in case you were wondering), one at each grid point over the world, so I have 960 data points for each of the 43 years. Can I therefore do anything with a trend and residuals? I don't think I can if it's just mean February precipitation, with one data point per grid square per year.

I appreciate your help very much, although I do still need to perform a spatial correlation, if anyone else can help.

Many thanks,
Jenny
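If "spatial correlation" here means comparing the two fields year by year, one hedged sketch is a pattern correlation: for each of the 43 years, correlate the 960 grid values of one array with the 960 grid values of the other. This is only one interpretation of the request; the simulated arrays and the name `pattern_cor` are assumptions for illustration, and the spatial autocorrelation raised earlier in the thread still affects how such values should be interpreted.

```r
# Simulated stand-ins for the real data (assumption: [lon, lat, time] layout)
set.seed(7)
dims <- c(30, 32, 43)
array1 <- array(rnorm(prod(dims)), dim = dims)
array2 <- 0.5 * array1 + array(rnorm(prod(dims)), dim = dims)

# One correlation per year, computed across the 30 * 32 = 960 grid cells
n_years <- dim(array1)[3]
pattern_cor <- vapply(seq_len(n_years), function(k) {
  cor(as.vector(array1[, , k]), as.vector(array2[, , k]),
      use = "complete.obs")
}, numeric(1))

print(length(pattern_cor))
print(round(range(pattern_cor), 2))
```

The result is a vector of 43 values, one per year, rather than a 30x32 map.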