Naidraug
2012-Aug-05 14:08 UTC
[R] trouble with looping for effect of sampling interval increase
I've looked everywhere and tinkered for three days now, so I figure asking might be good.

Here's a general rundown of what I am trying to get my code to do. I am giving you the whole rundown because I need a solution that retains certain ways of doing things, because they give me the information I need.

I want to examine the effect of increasing my sampling interval on my data. Example: what if instead of sampling every hour I sampled every two hours, or every three, etc., ad nauseam? How I want to do this is to take the data I have now and add an index to it that contains counters. Those counters will look something like 1,2,1,2,... for the first one, 1,2,3,1,2,3,... for the next one. I have a lot of them, say a thousand.

Then for each column in the index my loops should start in the first column, run only the ones, store that, then run the twos, and store that in the same column of output in a different row. Then move to the next column, run the ones, store in the next column of output, run the twos, store in the next row of that column, run the threes, and so on until there are no more.

I want to use this index for a number of reasons. The first is that after this I will be going back through and using a different method for sub-sampling but keeping all else the same, so all I have to do there is change the way I generate the index. The second is that it allows me to run many subsamples and see their range.

The code I have made generates my index and does the heavy lifting correctly, as well as my averages and quartiles, but a look at the head() of my key output (IntervalBetas) shows that something has gone amiss. You have to look close to catch it. The values generated for each row of output are identical. This should not be the case, as row one of the first output column should be generated from all values indexed by a one in the first column, whereas in column two there are different values indexed by the number one.
I've checked about everything I can think of, done print() on my loop sequence things (those little i and j) and wiggled about everything. I am flummoxed. I think the bit that is messing up is in here:

#Here is the loop for betas from sampling interval increase
c <- WHOLESIZE[2]-1
for (i in 1:c)
{
  x <- length(unique(index[,i]))

  for (j in 1:x)
  {

    data <- WHOLE [WHOLE[,x]==j,1]

But also here is the whole code in case I am wrong that that is the problem area:

#loop for making index

#clean dataset of empty cells
dataset <- na.omit (datasetORIGINAL)
#how messed up was the data?
holeyDATA <- datasetORIGINAL - dataset

D <- dim(dataset)

#what is the smallest sample?
tinysample <- 100

#how long is the dataset?
datalength <- length (dataset)

#MD <- how many divisions
MD <- datalength/tinysample

#clear things up for the index loop
WHOLE <- NULL
index <- NULL
#do the index loop
for (a in 1:MD)
{
  index <- cbind (index, rep (1:a, length = D[1]))
}
index <- subset(index, select = -c(1) )

#merge dataset and index loop
WHOLE <- cbind (dataset, index)

WHOLESIZE <- dim (WHOLE)

#Housekeeping before loops
IntervalBetas <- NULL

IntervalBetas <- c(NA,NA)
IntervalBetas <- as.data.frame (IntervalBetas)
IntervalLowerQ <- NULL
IntervalUpperQ <- NULL
IntervalMean <- NULL
IntervalMedian <- NULL

#Here is the loop for betas from sampling interval increase
c <- WHOLESIZE[2]-1
for (i in 1:c)
{
  x <- length(unique(index[,i]))

  for (j in 1:x)
  {

    data <- WHOLE [WHOLE[,x]==j,1]

    #get power spectral density
    PSDPLOT <- spectrum (data, detrend = TRUE, plot = FALSE)
    frequency <- PSDPLOT$freq
    PSD <- PSDPLOT$spec
    #log transform the power spectral density
    Logfrequency <- log(frequency)
    LogPSD <- log(PSD)
    #fit my line to the data
    Line <- lm (LogPSD ~ Logfrequency)
    #store the slope of the line
    Betas <- rbind (Betas, -coef(Line)[2])

    #Get values on the curve shape
    BSkew <- skew (Betas)
    BMean <- mean (Betas)
    BMedian <- median (Betas)
    Q <- quantile (Betas)

    #store curve shape values
    IntervalLowerQ <- rbind (IntervalLowerQ , Q[2])
    IntervalUpperQ <- rbind (IntervalUpperQ , Q[4])
    IntervalSkew <- rbind (IntervalSkew , BSkew)
    IntervalMean <- rbind (IntervalMean , BMean)
    IntervalMedian <- rbind (IntervalMedian , BMedian)

    #Store the Betas
    #This is a pain
    BetaSave <- Betas
    no.r <- nrow(IntervalBetas)
    l.v <- length(BetaSave)
    difer <- no.r - l.v
    difers <- abs(difer)
    if (no.r < l.v){
      IntervalBetas <- rbind(IntervalBetas,rep(NA,difers))
    } else {
      (BetaSave <- rbind(BetaSave,rep(NA,difers)))
    }

    IntervalBetas <- cbind (IntervalBetas, BetaSave)

  }

}

#That ends the loop within a loop for how sampling interval
#changes beta
head (IntervalBetas)

--
View this message in context: http://r.789695.n4.nabble.com/trouble-with-looping-for-effect-of-sampling-interval-increase-tp4639213.html
Sent from the R help mailing list archive at Nabble.com.
Jean V Adams
2012-Aug-06 17:33 UTC
[R] trouble with looping for effect of sampling interval increase
You would make it much easier for R-help readers to solve your problem if you provided a small example data set with your code, so that we could reproduce your results and troubleshoot the issues.

Jean

Naidraug <white.232@wright.edu> wrote on 08/05/2012 09:08:25 AM:

> I've looked everywhere and tinkered for three days now, so I figure asking
> might be good.
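
For illustration, a small reproducible example of the kind requested might look like the sketch below. It is only a sketch under stated assumptions: simulated white noise stands in for the real data (which is not shown in the thread), there is a single data column, and only sampling intervals 2 to 4 are tried. The names x, max_interval, index, and betas are hypothetical and do not come from the posted code. The sketch also writes each slope into its own cell of a pre-sized matrix rather than appending to one ever-growing Betas vector; since the posted loop never resets Betas, that accumulation is one plausible reason the first row of IntervalBetas repeats across columns.

```r
# Simulated stand-in for the real series (assumption: one numeric column).
set.seed(1)
n <- 1200
x <- rnorm(n)

# Index columns: 1,2,1,2,... then 1,2,3,1,2,3,... up to max_interval.
max_interval <- 4
index <- sapply(2:max_interval, function(k) rep(1:k, length.out = n))

# One output column per sampling interval, one row per subsample,
# padded with NA where an interval has fewer subsamples.
betas <- matrix(NA_real_, nrow = max_interval, ncol = ncol(index))

for (i in seq_len(ncol(index))) {
  k <- max(index[, i])              # number of subsamples for this interval
  for (j in seq_len(k)) {
    sub <- x[index[, i] == j]       # every k-th value, starting at offset j
    psd <- spectrum(sub, detrend = TRUE, plot = FALSE)
    fit <- lm(log(psd$spec) ~ log(psd$freq))
    betas[j, i] <- -coef(fit)[[2]]  # negative slope of the log-log fit
  }
}

print(betas)
```

Because the index is built separately from the loop, swapping in a different sub-sampling scheme should only require changing the line that creates index, which matches the design goal described in the original post.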