Stathis Kamperis
2012-Jul-08 14:31 UTC
[R] How to replace a column in a data frame with another one with a different size
Hello everyone,

I have a dataframe with 1 column and I'd like to replace that column with a moving average. Example:

> library('zoo')
> mydat <- seq_len(10)
> mydat
 [1]  1  2  3  4  5  6  7  8  9 10
> df <- data.frame("V1" = mydat)
> df
   V1
1   1
2   2
3   3
4   4
5   5
6   6
7   7
8   8
9   9
10 10
> df[df$V1 <- rollapply(df$V1, 3, mean)]
Error in `$<-.data.frame`(`*tmp*`, "V1", value = c(2, 3, 4, 5, 6, 7, 8, :
  replacement has 8 rows, data has 10

I could use a temporary variable to store the results of rollapply() and then reconstruct the data frame, but I was wondering if there is a one-liner that can achieve the same thing.

Best regards,
Stathis

P.S. If you don't mind, cc me at your reply because I'm not subscribed to the list (but I will check the archive anyway).
Michael Weylandt
2012-Jul-08 17:10 UTC
[R] How to replace a column in a data frame with another one with a different size
On Jul 8, 2012, at 9:31 AM, Stathis Kamperis <ekamperi at gmail.com> wrote:

> I have a dataframe with 1 column and I'd like to replace that column
> with a moving average.
> [...]
> > df[df$V1 <- rollapply(df$V1, 3, mean)]
> Error in `$<-.data.frame`(`*tmp*`, "V1", value = c(2, 3, 4, 5, 6, 7, 8, :
>   replacement has 8 rows, data has 10

I'm not sure you need the outer df[...] -- I think you just want

df$V1 <- rollapply(df$V1,3,mean)

However, this will still give you the error message you're seeing because rollapply() only returns 8 values here (you don't get the "endpoints" by default). To get the right number of rows, you want

rollapply(df$V1, 3, mean, fill = NA) # Change NA if desired

which will put NA's on each end and give you a length 10 result, as needed.

Best,
Michael
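For readers of the archive, the two pieces of Michael's reply can be combined into the one-liner the original question asked for. This is only a minimal sketch using the data from the thread; the variable name `short` is introduced here for illustration.

```r
library(zoo)

df <- data.frame(V1 = seq_len(10))

# Without fill, a width-3 window over 10 values yields 10 - 3 + 1 = 8 means,
# which is why the direct assignment fails
short <- rollapply(df$V1, 3, mean)
length(short)   # 8

# fill = NA pads both ends, so the result lines up with the 10 rows
df$V1 <- rollapply(df$V1, 3, mean, fill = NA)
df$V1           # NA 2 3 4 5 6 7 8 9 NA
```

The assignment now succeeds because the right-hand side has exactly as many elements as the data frame has rows.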
arun
2012-Jul-08 17:37 UTC
[R] How to replace a column in a data frame with another one with a different size
Hi,

As the error says, the replacement must have the same number of rows as the data. You can either do it by adding NAs,

df$V1<-c(NA,rollapply(df$V1,3,mean),NA)
df
   V1
1  NA
2   2
3   3
4   4
5   5
6   6
7   7
8   8
9   9
10 NA

#or,
#use any of these
#as Michael suggested
df2<-rollapply(df,3,mean,fill=NA)
df3<-rollapply(df,3,mean,na.pad=TRUE)
identical(df2,df3)
[1] TRUE

A.K.
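The two approaches in this reply can be checked against each other. The sketch below assumes the default centered alignment and the width of 3 used in the thread; the names `manual` and `filled` are illustrative. As far as the zoo documentation goes, `na.pad` is marked as deprecated in favour of `fill`, so only `fill` is used here.

```r
library(zoo)

x <- seq_len(10)

# Manual padding: one NA per side matches a centered window of width 3
manual <- c(NA, rollapply(x, 3, mean), NA)

# Letting rollapply() pad the result itself
filled <- rollapply(x, 3, mean, fill = NA)

# Both give NA 2 3 4 5 6 7 8 9 NA
isTRUE(all.equal(manual, filled))   # TRUE
```

With a different window width or alignment the number of NAs needed on each side changes, so hand-written padding has to be adjusted while `fill = NA` does not.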
R. Michael Weylandt
2012-Jul-08 17:52 UTC
[R] How to replace a column in a data frame with another one with a different size
On Sun, Jul 8, 2012 at 12:22 PM, Stathis Kamperis <ekamperi at gmail.com> wrote:
> 2012/7/8 Michael Weylandt <michael.weylandt at gmail.com>:
>> [...]
>> To get the right number of rows, you want
>>
>> rollapply(df$V1, 3, mean, fill = NA) # Change NA if desired
>>
>> which will put NA's on each end and give you a length 10 result, as needed.
>
> Thanks Michael (and arun@)!
>
> If I would do that, then (in my particular case), I'd need to
> eliminate NA's, with something like:
> df$V1 <- df$V1[!is.na(df$V1)]
>
> which would still fail with the same error message :-P

You're getting tripped up (again) by trying to sub-assign something that's too small. df is a rectangular array of data: on the RHS of that expression, you are selecting out a subset of it of say 8 rows and telling R to replace the 10-row V1 column with those 8 elements. This cannot be done with the fixed rectangular structure and hence the error message.

What you want to do is something like this:

df[!is.na(df$V1), ]

Let's walk through that:

df$V1 -- take the V1 column of df
is.na() -- get a logical vector saying where NAs are
!is.na() -- identify the rows where there _aren't_ NAs
df[ !is.na(), ] -- (the important one) take the rows of df (all columns) where there aren't NAs

What you might be wanting to do is

df <- df[!is.na(df$V1), ]

This is much better than what you are trying to do (working on the whole array at a time and trusting R to keep it all together, rather than trying to manipulate slices individually).

But even more idiomatic would be to index the rows with

complete.cases(df)

Take a look at some introductory material and try to wrap your head around indexing rows and columns together again: it's a fantastic paradigm and will be of much more use to you in the long run than trying to work on individual columns for subsetting/data-cleaning.

Best,
Michael
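The row-wise filtering described in this reply can be sketched as follows, using base R only. The padded column is rebuilt by hand here, and the names `by_column`, `by_cases` and `by_omit` are illustrative. One caveat worth flagging: with a single-column data frame like the one in this thread, `df[rows, ]` collapses to a plain vector by default, so the sketch passes `drop = FALSE` to keep a data frame.

```r
# The NA-padded column from earlier in the thread, built by hand
df <- data.frame(V1 = c(NA, 2:9, NA))

# Select whole rows where V1 is not NA; drop = FALSE keeps the
# one-column result as a data frame instead of a vector
by_column <- df[!is.na(df$V1), , drop = FALSE]

# complete.cases() returns a logical vector, so it is used as a row index
by_cases <- df[complete.cases(df), , drop = FALSE]

# na.omit() removes incomplete rows directly
by_omit <- na.omit(df)

nrow(by_column)   # 8
```

All three variants drop rows 1 and 10 and leave the remaining eight rows intact, which is the "work on the whole array at a time" idea from the reply.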