Hello, First of all, let me apologize that my statistics background is modest at best. I am doing some extreme value analysis on model output (WRF) which have the following dimensions: speed(time,lat,lon) I am trying to fit the GPD (gpd.fit) to each point (time,lat,lon) to get a return level plot with values at each grid point. (Map with return level by location.) Here is some code I tried, following similar structure to languages I'm more familiar with, but it isn't working. Y = as.matrix(time,lat,lon) for (t in 1:67) + for (j in 1:106) + for (i in 1:193) + fit(t,j,i)<-gpd.fit(speed(t,j,i), threshold=17,ydat = Y) I receive errors at this point, and cant figure out how to get individual fits at each grid point. Thanks, Michelle -- View this message in context: http://r.789695.n4.nabble.com/Return-level-plots-tp4643847.html Sent from the R help mailing list archive at Nabble.com.
On Fri, Sep 21, 2012 at 3:17 PM, MichelleNCSU <mlcipull at ncsu.edu> wrote:> Hello, > > First of all, let me apologize that my statistics background is modest at > best. > > I am doing some extreme value analysis on model output (WRF) which have the > following dimensions: > > speed(time,lat,lon) > > I am trying to fit the GPD (gpd.fit) to each point (time,lat,lon) to get a > return level plot with values at each grid point. (Map with return level by > location.) > > Here is some code I tried, following similar structure to languages I'm more > familiar with, but it isn't working. > > Y = as.matrix(time,lat,lon) > > for (t in 1:67) > + for (j in 1:106) > + for (i in 1:193) > + fit(t,j,i)<-gpd.fit(speed(t,j,i), threshold=17,ydat = Y)On first blush, your problem seems to be a case of MATLAB-itis ;-) R subsetting is done with square brackets (like x[3,4] or so), with parentheses being reserved for grouping and function calls. Moving ahead, you're really going to want to use braces for your loops to clarify scope (see basically every programmer's guide ever)" for(t in 1:67){ for(j in 1:106){ for(i in 1:193){ fit[t,j,i] <- gpd.fit(speed[t,j,i], threshold = 17, ydat = Y) } } } Isn't that clearer? Finally, you should see if the function gpd.fit() is (in R speak) vectorized: if so, you will be able to drop the explicit loops and make things much much faster. Cheers, Michael PS -- Just a heads up: most of us don't read R help through Nabble (R help is really a mailing list) so it would make us all much happier if you could explicitly include context in any follow up questions.> > I receive errors at this point, and cant figure out how to get individual > fits at each grid point. > > Thanks, > > Michelle > >
On Sep 21, 2012, at 7:17 AM, MichelleNCSU wrote:> Hello, > > First of all, let me apologize that my statistics background is modest at > best. > > I am doing some extreme value analysis on model output (WRF) which have the > following dimensions: > > speed(time,lat,lon)How is this object structured? Are there multiple time layers where speed is measured at lat-lon points are successive times?> > I am trying to fit the GPD (gpd.fit) to each point (time,lat,lon) to get a > return level plot with values at each grid point. (Map with return level by > location.)I'm not really sure conducting extreme value analysis is a safe procedure when your stats background is "modest at best".> > Here is some code I tried, following similar structure to languages I'm more > familiar with, but it isn't working. > > Y = as.matrix(time,lat,lon)This suggests to me that you come from a different computing universe where the as.matrix() function allows multiple objects (presumably vectors) to be placed "side-by-side" and have a matrix object returned. (In this computing universe that role is played by the cbind function.) Perhaps: Y = cbind(time,lat,lon) # "c" standing for column> > for (t in 1:67) > + for (j in 1:106) > + for (i in 1:193) > + fit(t,j,i)<-gpd.fit(speed(t,j,i), threshold=17,ydat = Y) > > I receive errors at this point, and cant figure out how to get individual > fits at each grid point.It's a puzzle to me why you think that passing a single point to a regression function will allow any solution. I did a "??" search and find that there is `gpd.fit` function in the 'ismev' package, but it (like all other regression functions of which I am aware) appears to take full data objects rather than single points. You would not need to use for-loops to pass the object. Perhaps: fit <-gpd.fit(Y, threshold=17,ydat = Y) # Not sure where "speed" entered the picture. # as noted before there is ambiguity in the problem statement # Or did you do a prior differentiation operation? # perhaps a 1/ first difference on time? # perhaps ... if gpd.fit follows the usual R conventions # return the first point i=1,j=1,k=1 pred.ijk <- predict(fit, data,frame(time=i, lat,=j,lon=k) ) ? expand.grid # to cover a range Your placement of a functional form on the LHS of a formula also suggests recent migration from another statistical universe where assignment is done into functions, i.e. forms using parentheses, rather than the extraction/insertion operators: "[" and "[<-", to put values into structures with dimensions, like matrices and dataframes. I think you really need to do some more self-study of the introductory R material rather than making wild guesses at what "might" work based on experience with Python, Perl, or (less likely in view of that effort to assign into a function) Matlab. You should also read the Posting Guide. -- David Winsemius, MD Alameda, CA, USA
Hi Michelle, Please cc the list on your replies so others (far more knowledgeable than I) can answer. On Mon, Sep 24, 2012 at 12:12 AM, Michelle Cipullo <mlcipull at ncsu.edu> wrote:>>> I am doing some extreme value analysis on model output (WRF) which have the >>> following dimensions: >>> >>> speed(time,lat,lon) >>> >>> I am trying to fit the GPD (gpd.fit) to each point (time,lat,lon) to get a >>> return level plot with values at each grid point. (Map with return level by >>> location.) >>> >> Finally, you should see if the function gpd.fit() is (in R speak) >> vectorized: if so, you will be able to drop the explicit loops and >> make things much much faster. > > This function is from the ismev package: > http://www.maths.lth.se/matstat/staff/nader/stint/R_Manuals/ismev.pdf > It doesn't say explicitly (that I saw) if it is or isn't vectorized. > When I tried doing the indexing the correct way (thanks!), I get the > following error: > > Error in optim(init, gpd.lik, hessian = TRUE, method = method, control > = list(maxit = maxit, : > non-finite value supplied by optimUntested in the absence of a reproducible example: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example If I understand the model you want to give correctly, you are fitting multiple GPDs at each spatial point where data is collected over time. If that's the case, you likely should place the entire time "slice" to gpd.fit() all at once. Cheers, Michael> > I've tried googling this to find a solution, but most resources i > have found don't list a solution. > > The problem I have with not looping it over individual points is that > the value in the fit should vary due to the fact that at each > location, the terrain and factors driving the wind speed will vary > greatly (some stuff over water, some stuff over mountains, etc.) so > the plan is to do a gpd.fit for each individual point so that i can > plot a return level map similar to what can be seen on page 8 of this > paper (http://www.stat.duke.edu/~es112/Research/ims-MannshardtShamseldinSmithSain.pdf > ). > This paper takes into account spatial statistics, which is beyond the > level of what I am trying to do, which is simply generating a return > level at each individual point based on the data points of that > individual place, not taking into account the neighbors of that point. > Let me know if this doesn't make sense.