Jenny Barnes
2006-Nov-20 13:03 UTC
[R] problem with loop to put data into array with missing data for some files
Dear R-help community, My main goal of this message is to find a way of skipping a file of a month/year in a loop that does not exist (and making it's output into an data.out array would be NA) and moving onto the next year/month in the loop to carry on filling data.out with real precipitation data. The situation so far: I downloaded 50 years worth of GRIB data files from the NCEP data site http://nomad3.ncep.noaa.gov/pub/reanalysis-1/month/grb2d.gau/ I then created a loop in R to read each month of each of the 50 years worth of files and only extract the precipitation records using wgrib and grep as show in the code at the end of this message. I had to use grep to extract the PRATE records each time, as unfortunately with these grib files precipitation is not always the same record number each time. This worked fine until Feb 1972 (which is file /home/jmb/sst_precip/gribdata/197202.grib). This file did not download completely and as a result the file that saved to my computer did not include the record for precipitation - only the first 13 records. At this point in the loop it just stopped with error message: Error in seq.default(lat.beg, lat.end, length = ny) : length must be non-negative number Which is obviously because ny does not actually exist for this file as it cannot be extracted from source as this record does not exist for this file! My question is: Does anybody have any suggestions as to how I could code it so that my loop will allow this file to just output NA into the data.out array in data.out$data and maybe keep FALSE in data.out$date for the month&years where this has occured and then continue to loop around all the other months and years in succession? Things that don't work that I have tried: 1)I cannot just restart the loop after any anomalous years as this "resets" data.out and changes its dimensions depending on the number of years looped (nyrs) etc. 2)The first point of trouble is when I call upon ny in the line>lats <- seq(lat.beg, lat.end, length=ny)as ny is NA in this record - so I tried putting the line>test <- dims[15] >if(test != "NA"){with the closing } at the end of the rest of the loop ...this gives the error message>Error in if (test != "NA") { : missing value where TRUE/FALSE needed3) My thought was that maybe I would need something like>if (parameter == "PRATE" ){then continue with loop; but this will obviously not work by itself and gives the error message>Error in if (test != "NA") { : missing value where TRUE/FALSE needed- would need to specify what to do if parameter is not =="PRATE". I encourage anybody with another idea to contact me with their suggestions! Thank you for taking the time to read my problem, Sincerely, Jennifer Barnes r-help at stat.math.ethz.ch Here is my code: #takes precip data for every year and month needed #from file saved on computer already #and sorts it into a list (array) #set some control parameters startyr <-1954 #note needs to be year before data wanted nyrs <- 50 months <- nyrs*12 k <- 0 library(chron) library(utils) data.out <- list(lats=seq(88.542, -88.542, length=94), lons=seq(0, 360-1.875, length=192), date=vector(length=months), data=array(NA, c(months, 94*192)) ) #the loop for(yr in 1:nyrs){ for(mon in 1:12){ i<- startyr+yr if(mon < 10) {j <- 0} if(mon < 10) {mon2 <- mon} if(mon == 10) {(j <- 1)} if(mon == 10) {mon2 <- 0} if(mon == 11) {(j <- 1)} if(mon == 11) {mon2 <- 1} if(mon == 12) {(j <- 1)} if(mon == 12) {mon2 <- 2} outfile <- paste("/home/jmb/sst_precip/gribdata/",i, j, mon2, ".grib", sep="") #decode the grib cmd <- paste("/usr/local/bin/wgrib ", outfile, "| grep \":PRATE:\" | /usr/local/bin/wgrib", outfile, "-i -V -text -o /tmp/wgrib.output") status <- system(cmd, intern=T) #extract metadata needed for working with julian dates: dt <- unlist(strsplit(status[1], " "))[3] dt.yr <- as.numeric(substr(dt, 1,4)) dt.mon <- as.numeric(substr(dt, 5,6)) dt.day <- as.numeric(substr(dt, 7,8)) dt.hr <- as.numeric(substr(dt, 9,10)) dt.jul <- julian(dt.mon, dt.day+(dt.hr/24), dt.yr) parameter <- unlist(strsplit(status[1], " "))[4] param <- status[2] dims <- unlist(strsplit(status[3], " ")) nx <- as.numeric(dims[13]) ny <- as.numeric(dims[15]) temp <- unlist(strsplit(status[5]," ")) lat.beg <- as.numeric(temp[6]) lat.end <- as.numeric(temp[8]) lats <- seq(lat.beg, lat.end, length=ny) temp <- unlist(strsplit(status[6]," ")) lon.beg <- as.numeric(temp[14]) lon.end <- as.numeric(temp[16]) if(lon.beg > lon.end){lon.beg <- (lon.beg -360)} #since sp prefers -180 to 180 encoding lons <- seq(lon.beg, lon.end, length=nx) lons[lons > 180] <- lons[lons > 180]-360 #since sp prefers -180 to 180 encoding #read flat file into R (header first) dims <- scan("/tmp/wgrib.output", nlines=1, quiet=T) nrec <- dims[1]*dims[2] indata <- paste("indata.", i, j, mon2, sep="") indata <- scan("/tmp/wgrib.output", 'numeric', nlines=nrec, skip=1, quiet=T) indata <- matrix(as.numeric(indata), nrow=dims[1], ncol=dims[2], byrow=F) #to get precip into mm/month, not mm/sec: if((dt.mon==02)&(dt.yrendnumber%%4==0)){days.mon <- 29} else {days.mon <-28} #to take into account leap-years!!! if(dt.mon==01){days.mon <- 31} if(dt.mon==03){days.mon <- 31} if(dt.mon==05){days.mon <- 31} if(dt.mon==07){days.mon <- 31} if(dt.mon==08){days.mon <- 31} if(dt.mon==10){days.mon <- 31} if(dt.mon==12){days.mon <- 31} if(dt.mon==04){days.mon <- 30} if(dt.mon==06){days.mon <- 30} if(dt.mon==09){days.mon <- 30} if(dt.mon==11){days.mon <- 30} sec.mon <- days.mon*24*60*60 newindata <- paste("newindata.",i, j, mon2, sep="") newindata <- sec.mon*indata k<-k+1 data.out$date[k] <- paste(i, j, mon2, sep="") data.out$data[k,] <- newindata } } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Jennifer Barnes PhD student - long range drought prediction Climate Extremes Department of Space and Climate Physics University College London Holmbury St Mary, Dorking Surrey RH5 6NT Web: http://climate.mssl.ucl.ac.uk
john seers (IFR)
2006-Nov-20 13:49 UTC
[R] problem with loop to put data into array with missing data forsome files
Hi Jenny I guess this is a silly question so sorry in advance. But why don't you just re-download the file that failed to download completely? John --- John Seers Institute of Food Research Norwich Research Park Colney Norwich NR4 7UA tel +44 (0)1603 251497 fax +44 (0)1603 507723 e-mail john.seers at bbsrc.ac.uk e-disclaimer at http://www.ifr.ac.uk/edisclaimer/ Web sites: www.ifr.ac.uk www.foodandhealthnetwork.com -----Original Message----- From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Jenny Barnes Sent: 20 November 2006 13:03 To: r-help at stat.math.ethz.ch Subject: [R] problem with loop to put data into array with missing data forsome files Dear R-help community, My main goal of this message is to find a way of skipping a file of a month/year in a loop that does not exist (and making it's output into an data.out array would be NA) and moving onto the next year/month in the loop to carry on filling data.out with real precipitation data. The situation so far: I downloaded 50 years worth of GRIB data files from the NCEP data site http://nomad3.ncep.noaa.gov/pub/reanalysis-1/month/grb2d.gau/ I then created a loop in R to read each month of each of the 50 years worth of files and only extract the precipitation records using wgrib and grep as show in the code at the end of this message. I had to use grep to extract the PRATE records each time, as unfortunately with these grib files precipitation is not always the same record number each time. This worked fine until Feb 1972 (which is file /home/jmb/sst_precip/gribdata/197202.grib). This file did not download completely and as a result the file that saved to my computer did not include the record for precipitation - only the first 13 records. At this point in the loop it just stopped with error message: Error in seq.default(lat.beg, lat.end, length = ny) : length must be non-negative number Which is obviously because ny does not actually exist for this file as it cannot be extracted from source as this record does not exist for this file! My question is: Does anybody have any suggestions as to how I could code it so that my loop will allow this file to just output NA into the data.out array in data.out$data and maybe keep FALSE in data.out$date for the month&years where this has occured and then continue to loop around all the other months and years in succession? Things that don't work that I have tried: 1)I cannot just restart the loop after any anomalous years as this "resets" data.out and changes its dimensions depending on the number of years looped (nyrs) etc. 2)The first point of trouble is when I call upon ny in the line>lats <- seq(lat.beg, lat.end, length=ny)as ny is NA in this record - so I tried putting the line>test <- dims[15] >if(test != "NA"){with the closing } at the end of the rest of the loop ...this gives the error message>Error in if (test != "NA") { : missing value where TRUE/FALSE needed3) My thought was that maybe I would need something like>if (parameter == "PRATE" ){then continue with loop; but this will obviously not work by itself and gives the error message>Error in if (test != "NA") { : missing value where TRUE/FALSE needed- would need to specify what to do if parameter is not =="PRATE". I encourage anybody with another idea to contact me with their suggestions! Thank you for taking the time to read my problem, Sincerely, Jennifer Barnes r-help at stat.math.ethz.ch Here is my code: #takes precip data for every year and month needed #from file saved on computer already #and sorts it into a list (array) #set some control parameters startyr <-1954 #note needs to be year before data wanted nyrs <- 50 months <- nyrs*12 k <- 0 library(chron) library(utils) data.out <- list(lats=seq(88.542, -88.542, length=94), lons=seq(0, 360-1.875, length=192), date=vector(length=months), data=array(NA, c(months, 94*192)) ) #the loop for(yr in 1:nyrs){ for(mon in 1:12){ i<- startyr+yr if(mon < 10) {j <- 0} if(mon < 10) {mon2 <- mon} if(mon == 10) {(j <- 1)} if(mon == 10) {mon2 <- 0} if(mon == 11) {(j <- 1)} if(mon == 11) {mon2 <- 1} if(mon == 12) {(j <- 1)} if(mon == 12) {mon2 <- 2} outfile <- paste("/home/jmb/sst_precip/gribdata/",i, j, mon2, ".grib", sep="") #decode the grib cmd <- paste("/usr/local/bin/wgrib ", outfile, "| grep \":PRATE:\" | /usr/local/bin/wgrib", outfile, "-i -V -text -o /tmp/wgrib.output") status <- system(cmd, intern=T) #extract metadata needed for working with julian dates: dt <- unlist(strsplit(status[1], " "))[3] dt.yr <- as.numeric(substr(dt, 1,4)) dt.mon <- as.numeric(substr(dt, 5,6)) dt.day <- as.numeric(substr(dt, 7,8)) dt.hr <- as.numeric(substr(dt, 9,10)) dt.jul <- julian(dt.mon, dt.day+(dt.hr/24), dt.yr) parameter <- unlist(strsplit(status[1], " "))[4] param <- status[2] dims <- unlist(strsplit(status[3], " ")) nx <- as.numeric(dims[13]) ny <- as.numeric(dims[15]) temp <- unlist(strsplit(status[5]," ")) lat.beg <- as.numeric(temp[6]) lat.end <- as.numeric(temp[8]) lats <- seq(lat.beg, lat.end, length=ny) temp <- unlist(strsplit(status[6]," ")) lon.beg <- as.numeric(temp[14]) lon.end <- as.numeric(temp[16]) if(lon.beg > lon.end){lon.beg <- (lon.beg -360)} #since sp prefers -180 to 180 encoding lons <- seq(lon.beg, lon.end, length=nx) lons[lons > 180] <- lons[lons > 180]-360 #since sp prefers -180 to 180 encoding #read flat file into R (header first) dims <- scan("/tmp/wgrib.output", nlines=1, quiet=T) nrec <- dims[1]*dims[2] indata <- paste("indata.", i, j, mon2, sep="") indata <- scan("/tmp/wgrib.output", 'numeric', nlines=nrec, skip=1, quiet=T) indata <- matrix(as.numeric(indata), nrow=dims[1], ncol=dims[2], byrow=F) #to get precip into mm/month, not mm/sec: if((dt.mon==02)&(dt.yrendnumber%%4==0)){days.mon <- 29} else {days.mon <-28} #to take into account leap-years!!! if(dt.mon==01){days.mon <- 31} if(dt.mon==03){days.mon <- 31} if(dt.mon==05){days.mon <- 31} if(dt.mon==07){days.mon <- 31} if(dt.mon==08){days.mon <- 31} if(dt.mon==10){days.mon <- 31} if(dt.mon==12){days.mon <- 31} if(dt.mon==04){days.mon <- 30} if(dt.mon==06){days.mon <- 30} if(dt.mon==09){days.mon <- 30} if(dt.mon==11){days.mon <- 30} sec.mon <- days.mon*24*60*60 newindata <- paste("newindata.",i, j, mon2, sep="") newindata <- sec.mon*indata k<-k+1 data.out$date[k] <- paste(i, j, mon2, sep="") data.out$data[k,] <- newindata } } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Jennifer Barnes PhD student - long range drought prediction Climate Extremes Department of Space and Climate Physics University College London Holmbury St Mary, Dorking Surrey RH5 6NT Web: http://climate.mssl.ucl.ac.uk ______________________________________________ R-help at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.