Hello List,

I am working on creating periodograms from IP network traffic logs using the Fast Fourier Transform. The FFT requires all the data points to be evenly spaced in the time domain (constant delta-T), so I have a step where I zero-pad the data.

Lately I've been wondering if there is a faster way to do this. Here's what I've got:

* data1 is a data frame consisting of a timestamp, in seconds from the beginning of the network log, and the number of network events that fell on that timestamp. Example:

time,events
0,1
1,30
5,14
10,4

* data2 is the zero-padded data frame. Its length equals the greatest value of "time" in data1:

time,events
1,0
2,0
3,0
4,0
5,0
6,0
7,0
8,0
9,0
10,0

So I run this for loop:

for(i in 1:length(data1[,1])) {
    data2[data1[i,1],2] <- data1[i,2]
}

which goes to each row in data1, reads the timestamp, and writes the "events" value to the corresponding row in data2. The result is:

time,events
0,1
1,30
2,0
3,0
4,0
5,14
6,0
7,0
8,0
9,0
10,4

For a 24-hour log (86,400 seconds) this can take a while... Any advice on how to speed it up would be appreciated.

Thanks,
Pete Cap
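A side note not raised in the post itself, but relevant to the replies below: since "time" starts at 0 while R indexing starts at 1, the assignment data2[data1[i,1], 2] silently skips the time-0 row, because assigning to row 0 of a data frame selects zero rows and is a no-op. A minimal illustration:

data2 <- data.frame(time = 1:10, events = 0)
data2[0, 2] <- 99                        # selects zero rows; nothing is assigned
identical(data2$events, rep(0, 10))      # TRUE -- data2 is unchanged

Both replies below handle this by shifting the index by one.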
How about starting your time from 1 instead of 0 to make indexing easier (you can always subtract one later)? If so:

> x
  time events
1    1      1
2    2     30
3    6     14
4   11      4
> y <- data.frame(time=seq(max(x$time)), events=rep(0, max(x$time)))
> y
   time events
1     1      0
2     2      0
3     3      0
4     4      0
5     5      0
6     6      0
7     7      0
8     8      0
9     9      0
10   10      0
11   11      0
> y$events[x$time] <- x$events
> y
   time events
1     1      1
2     2     30
3     3      0
4     4      0
5     5      0
6     6     14
7     7      0
8     8      0
9     9      0
10   10      0
11   11      4

On 5/30/06, Pete Cap <peteoutside@yahoo.com> wrote:
> [quoted original post snipped]

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390 (Cell)
+1 513 247 0281 (Home)

What is the problem you are trying to solve?
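For readers who want to paste and run the above, here is the same idea as a self-contained script, with the data values taken from Jim's transcript:

x <- data.frame(time = c(1, 2, 6, 11), events = c(1, 30, 14, 4))
y <- data.frame(time = seq(max(x$time)), events = rep(0, max(x$time)))
y$events[x$time] <- x$events    # one vectorized assignment replaces the whole loop
y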
Try this:

Lines <- "time,events
0,1
1,30
5,14
10,4"

library(zoo)
data1 <- read.zoo(textConnection(Lines), header = TRUE, sep = ",")
data2 <- as.ts(data1)        # gaps in "time" become NA
data2[is.na(data2)] <- 0     # omit this line if NAs in the extra positions are OK

On 5/30/06, Pete Cap <peteoutside at yahoo.com> wrote:
> [quoted original post snipped]
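To connect this back to the original goal: the zero-filled series can go straight into fft() or spec.pgram(). The continuation below is a sketch, not part of Gabor's reply:

pad   <- as.numeric(data2)   # plain numeric vector, constant delta-T of 1 second
spec  <- fft(pad)
power <- Mod(spec)^2         # raw periodogram ordinates
# or let spec.pgram() handle the bookkeeping (it plots by default):
# spec.pgram(pad)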
Why not something simple like:

# Toy example:
data1 <- data.frame(time=c(0,1,5,10), events=c(1,30,14,4))
data2 <- rep(0, 11)
# Or more generally:
data2 <- rep(0, 1 + max(data1$time))

# You don't need a for loop!  Use the indexing capabilities of R!
data2[data1$time + 1] <- data1$events   # The ``+1'' is to allow for 0-origin.
data2 <- ts(data2, start=0)

cheers,

Rolf Turner
rolf at math.unb.ca
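As a rough check of what the vectorized fill buys at the 24-hour scale mentioned in the original post, here is a timing sketch on synthetic data (the timestamps and event counts are illustrative, not from the thread):

set.seed(1)
big <- data.frame(time = sort(sample(0:86399, 10000)),
                  events = rpois(10000, 5))

system.time({
    pad <- rep(0, 1 + max(big$time))
    pad[big$time + 1] <- big$events
})
# effectively instantaneous; the row-by-row loop copies the 86,400-row
# data frame on every assignment, so it scales far worse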