Ad Flan
2011-Feb-09 06:10 UTC
[R] merge, rbind, merge.zoo? adding new rows to existing data.frame
Hi all,
I'm trying to add updated data to an existing time series where an overlap exists. I need to give priority to the update data.
My script runs every morning to collect the updated data. The updates often vary in length, so one-off solutions that identify specific rows to solve this example won't work.

I've experimented with merge, rbind, merge.zoo, but to no avail.

An example

existing <- data.frame(
    date = c("17-01-2011", "18-01-2011", "19-01-2011", "20-01-2011", "21-01-2011"),
    data = c(5, 5, 5, 5, 23))
existing$date <- as.Date(existing$date, "%d-%m-%Y")

update <- data.frame(
    date = c("20-01-2011", "21-01-2011", "22-01-2011"),
    data = c(6, 22, 6))
update$date <- as.Date(update$date, "%d-%m-%Y")

merge(existing, update, all.x = TRUE)
# This will only keep existing values

# structure is
> str(existing)
'data.frame':   5 obs. of  2 variables:
 $ date:Class 'Date'  num [1:5] 14991 14992 14993 14994 14995
 $ data: num  5 5 5 5 23

> str(update)
'data.frame':   3 obs. of  2 variables:
 $ date:Class 'Date'  num [1:3] 14994 14995 14996
 $ data: num  6 22 6

# The output should be:
#       date data
# 2011-01-17    5    #(from existing)
# 2011-01-18    5    #(from existing)
# 2011-01-19    5    #(from existing)
# 2011-01-20    6    #(from update)
# 2011-01-21   22    #(from update)
# 2011-01-22    6    #(from update)

Any ideas?

Many thanks,
Adam Flanagan
Gabor Grothendieck
2011-Feb-09 08:04 UTC
[R] merge, rbind, merge.zoo? adding new rows to existing data.frame
On Wed, Feb 9, 2011 at 1:10 AM, Ad Flan <aj.flan at gmail.com> wrote:
> Hi all,
> I'm trying to add updated data to an existing time series where an
> overlap exists. I need to give priority to the update data.
> My script runs every morning to collect the updated data. The updates
> often vary in length, so one-off solutions that identify specific rows
> to solve this example won't work.
>
> I've experimented with merge, rbind, merge.zoo, but to no avail.
>
> An example
>
> existing <- data.frame(
>     date = c("17-01-2011", "18-01-2011", "19-01-2011", "20-01-2011", "21-01-2011"),
>     data = c(5, 5, 5, 5, 23))
> existing$date <- as.Date(existing$date, "%d-%m-%Y")
>
> update <- data.frame(
>     date = c("20-01-2011", "21-01-2011", "22-01-2011"),
>     data = c(6, 22, 6))
> update$date <- as.Date(update$date, "%d-%m-%Y")
>
> merge(existing, update, all.x = TRUE)
> # This will only keep existing values
>
> # structure is
> > str(existing)
> 'data.frame':   5 obs. of  2 variables:
>  $ date:Class 'Date'  num [1:5] 14991 14992 14993 14994 14995
>  $ data: num  5 5 5 5 23
>
> > str(update)
> 'data.frame':   3 obs. of  2 variables:
>  $ date:Class 'Date'  num [1:3] 14994 14995 14996
>  $ data: num  6 22 6
>
> # The output should be:
> #       date data
> # 2011-01-17    5    #(from existing)
> # 2011-01-18    5    #(from existing)
> # 2011-01-19    5    #(from existing)
> # 2011-01-20    6    #(from update)
> # 2011-01-21   22    #(from update)
> # 2011-01-22    6    #(from update)
>

Try this:

# data.x comes from existing, data.y from update
both <- merge(existing, update, by = 1, all = TRUE)
# take the update value where one exists, otherwise keep the existing value
transform(both, data.x.updated = ifelse(is.na(data.y), data.x, data.y))

If you want to use merge.zoo then it's nearly the same:

library(zoo)
existing.z <- read.zoo(existing)
update.z <- read.zoo(update)
both.z <- merge(existing.z, update.z)
# again prefer the update series and fall back to the existing one
both.z$updated <- ifelse(is.na(both.z$update.z), both.z$existing.z, both.z$update.z)

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
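
For comparison, the same result can be reached without merging at all. The following is only a sketch in base R, not something posted in the thread: it drops every existing row whose date also appears in the update, then appends the update with rbind. The helper name apply_update is hypothetical, and the sketch assumes both data frames have the same columns and that dates are unique within each frame.

```r
# Example data from the question
existing <- data.frame(
  date = as.Date(c("17-01-2011", "18-01-2011", "19-01-2011",
                   "20-01-2011", "21-01-2011"), "%d-%m-%Y"),
  data = c(5, 5, 5, 5, 23))
update <- data.frame(
  date = as.Date(c("20-01-2011", "21-01-2011", "22-01-2011"), "%d-%m-%Y"),
  data = c(6, 22, 6))

# Hypothetical helper (an assumption, not from the thread): rows of `new`
# replace rows of `old` that share a date; all other rows are kept.
apply_update <- function(old, new) {
  kept <- old[!(old$date %in% new$date), ]  # existing rows not superseded
  out <- rbind(kept, new)                   # append every update row
  out <- out[order(out$date), ]             # restore chronological order
  rownames(out) <- NULL                     # tidy row names after reordering
  out
}

result <- apply_update(existing, update)
print(result)
```

Because the overlap is found by matching dates rather than by row position, this should cope with updates of any length, which is the constraint stated in the question.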