Hello, I'm new to R with a (probably elementary) question.

Suppose I have a dataset called A with n locations, and each location contains 3 time series of different weather variables (temperature, precipitation, and pressure), each 100 years long. For instance, location 1 has a temperature1 time series, a precip1 time series, and a pressure1 time series; location 2 has temperature2, precip2, and pressure2 time series, and so on. That is, there are 100 rows and (n*3)+1 columns. The extra column is the time.

I want to load in this dataset and declare a variable for each time series. The columns are in order of location, so it goes temp1, precip1, pressure1, temp2, ... and so forth in increasing column order. There are always 100 rows. Manually, I'd have to do:

temp1=A[,1]
precip1=A[,2]
pressure1=A[,3]
temp2=A[,4]
precip2=A[,5]
pressure2=A[,6]
temp3=A[,7]
and so forth.....

The problem is that n is large, so I don't want to repeat this pattern forever. I figure I need a loop both for the variable name (i.e., the variable at a particular location) and for the column it reads from.

Any help...?

--
View this message in context: http://r.789695.n4.nabble.com/Loading-in-Large-Dataset-variables-via-loop-tp4636501.html
Sent from the R help mailing list archive at Nabble.com.
Hello,

Why do you need 9 variables in your environment if they are time series that correspond to the same period? You should use time series functions.

#install.packages('zoo')
library(zoo)

# Make up a dataset
Year <- seq(from=as.Date("1901-01-01"), by="year", length.out=100)
dat <- data.frame(matrix(rnorm(100*9), ncol=9), Year)

# assign names.
varNames <- expand.grid(c("temp", "precip", "pressure"), 1:3, stringsAsFactors=FALSE)
varNames <- as.vector(apply(colNames, 1, paste, collapse=""))
varNames <- c(varNames, "Year")
names(dat) <- varNames
head(dat)

# and transform it into a time series of class 'zoo'
z <- zoo(dat[, 1:9], order.by=dat$Year)
str(z)
head(z)

Another way would be, as you say, to use a loop to put the variables in a list. Something like

lst <- list()
for(i in 1:9) lst[[i]] <- dat[, i]
# varNames has a 10th element, "Year", so use only the first 9 names
names(lst) <- varNames[1:9]

Note that I've used a dataset called 'dat' in place of your 'A'. You should post a data example using dput(), as the posting guide says.

Hope this helps,

Rui Barradas

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
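For the general case of n locations, the list approach from the reply above might be sketched as follows. This is only a sketch under stated assumptions: the data frame A is simulated here, the time column is assumed to be the last column and is called Year, and vars, n_loc, and series_names are illustrative names that do not come from the thread.

```r
# Simulated stand-in for the real dataset (assumption: time is the last column).
set.seed(1)
n <- 4                                   # hypothetical number of locations
vars <- c("temp", "precip", "pressure")
A <- data.frame(matrix(rnorm(100 * 3 * n), ncol = 3 * n), Year = 1901:2000)

# Work out the number of locations from the column count: (n*3)+1 columns.
n_loc <- (ncol(A) - 1) %/% length(vars)

# Build temp1, precip1, pressure1, temp2, ... in column order.
series_names <- paste0(rep(vars, times = n_loc),
                       rep(seq_len(n_loc), each = length(vars)))
names(A)[seq_along(series_names)] <- series_names

# One named list element per series, instead of one global variable each.
lst <- as.list(A[series_names])

# lst$temp2 now holds the same values as column 4 of A.
all.equal(lst$temp2, A[, 4])
```

If separate variables really are wanted, `list2env(lst, envir = globalenv())` would create them from the list, though keeping the series together in the list or the data frame is usually easier to loop over.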
Hello,

Right, it should be 'varNames' in the apply. I guess I had something called colNames in my environment. I've just run rm(list=ls()) and rerun the corrected code. No errors this time.

varNames is the result of expand.grid, so it does have a dim attribute. The faulty instruction, corrected, is:

varNames <- as.vector(apply(varNames, 1, paste, collapse=""))

Rui Barradas

On 15-07-2012 18:12, arun wrote:
> Hi Rui,
>
> Getting some error messages:
>
> > varNames <- as.vector(apply(colNames, 1, paste, collapse=""))
> Error in apply(colNames, 1, paste, collapse = "") :
>   dim(X) must have a positive length
>
> A.K.
>
> ----- Original Message -----
> From: Rui Barradas <ruipbarradas at sapo.pt>
> To: cmc0605 <colose21 at gmail.com>
> Cc: r-help at r-project.org
> Sent: Sunday, July 15, 2012 8:12 AM
> Subject: Re: [R] Loading in Large Dataset + variables via loop
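To see why the corrected line works and how the generated names line up with the column order, here is a small check in base R. The names vars, n, and grid are illustrative and not from the thread. expand.grid() returns a data frame, so dim() is defined for it, and it varies its first argument fastest, which yields temp1, precip1, pressure1, temp2, and so on. Pasting the two columns directly with paste0() is an alternative to apply(); as far as I know, apply() first converts the data frame to a character matrix, which can pad the location numbers with spaces once n reaches 10, so paste0() may be the safer choice when n is large.

```r
vars <- c("temp", "precip", "pressure")
n <- 12                                  # hypothetical number of locations

# All variable/location combinations; the first argument varies fastest.
grid <- expand.grid(var = vars, loc = seq_len(n), stringsAsFactors = FALSE)

# A data frame has dimensions (n*3 rows, 2 columns), which apply() requires.
dim(grid)

# Paste the two columns element by element, with no matrix conversion.
varNames <- paste0(grid$var, grid$loc)
head(varNames, 4)
```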