thr3ads.net - R help - [R] Can R fill in missing values? [Jul 2008]

If this information is useful, please help other people find it:
Share via:

Jia Ying Mei

2008-Jul-24 03:18 UTC

[R] Can R fill in missing values?

Hi,

I know this can be done in Stata (which is quite messy) but I wanted to 
know if it can be done in R. So lets say I have a merged data set (I 
used the merge function by date for the attached two files), where all 
the missing values are filled with NAs (which is what the all.x=TRUE does).

Is there any way to replace those NAs with the value of the latest row 
that contains a value?

For example:

 > Date<-read.table("Desktop/R/Testdate.txt", head=T,
sep="\t")
 > Data<-read.table("Desktop/R/Testinput.txt", head=T,
sep="\t")
 > Merged<-merge(Date, Data, all.x=TRUE)
 > Merged
     Date France Germany
1 3/10/07      2       4
2 3/11/07     NA      NA
3 3/12/07     NA      NA
4 3/13/07     NA      NA
5 3/14/07     NA      NA
6 3/15/07      1       2

Given this Merged data, is there a way to replace every NA value from 
3/11 to 3/14 with that of 3/15? But then say there are multiple 
intervals with NAs that I want to fill with the last given value?

Technically, I know I could just manually edit the data, but the real 
files I need to work with has thousands of rows, so I was wondering if 
there was a way to do this (probably a loop?). Thanks a bunch.

Jia Ying Mei
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: Testdate.txt
URL:
<https://stat.ethz.ch/pipermail/r-help/attachments/20080723/aa0e5d29/attachment.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: Testinput.txt
URL:
<https://stat.ethz.ch/pipermail/r-help/attachments/20080723/aa0e5d29/attachment-0001.txt>

Erik Iverson

2008-Jul-24 03:34 UTC

head link

[R] Can R fill in missing values?

Hello -

Jia Ying Mei wrote:> Hi,
> 
> I know this can be done in Stata (which is quite messy) but I wanted to 
> know if it can be done in R. So lets say I have a merged data set (I 
> used the merge function by date for the attached two files), where all 
> the missing values are filled with NAs (which is what the all.x=TRUE does).
> 
> Is there any way to replace those NAs with the value of the latest row 
> that contains a value?
> 
> For example:
> 
>  > Date<-read.table("Desktop/R/Testdate.txt", head=T,
sep="\t")
>  > Data<-read.table("Desktop/R/Testinput.txt", head=T,
sep="\t")
>  > Merged<-merge(Date, Data, all.x=TRUE)
>  > Merged
>     Date France Germany
> 1 3/10/07      2       4
> 2 3/11/07     NA      NA
> 3 3/12/07     NA      NA
> 4 3/13/07     NA      NA
> 5 3/14/07     NA      NA
> 6 3/15/07      1       2
> 
> Given this Merged data, is there a way to replace every NA value from 
> 3/11 to 3/14 with that of 3/15? But then say there are multiple 
> intervals with NAs that I want to fill with the last given value?
> 
Yes, no loop necessary.

# begin R code
install.packages("zoo")
library(zoo)

Merged$France <- na.locf(Merged$France, fromLast = TRUE)

# end R code


This will of course work on the 'France' column.  Use of lapply in 
conjuction with this idea will lead to solving this problem for N 
columns in a couple lines of R.  Not messy at all!

Best Regards,
Erik Iverson

Gabor Grothendieck

2008-Jul-24 04:08 UTC

head link

[R] Can R fill in missing values?

This can be done readily with zoo. See zoo's 3 vignettes
and its help files for commands used below.

# setwd("...directory where we have write permission...")
# can alternately manually download these two files
download.file("https://stat.ethz.ch/pipermail/r-help/attachments/20080723/aa0e5d29/attachment.txt",
"Testdate.txt", method = "wget")
download.file("https://stat.ethz.ch/pipermail/r-help/attachments/20080723/aa0e5d29/attachment-0001.txt",
"Testinput.txt", method = "wget")

fmt <- "%m/%d/%y"

dd <- read.csv("Testdate.txt")
dd$Date <- as.Date(dd$Date, fmt)

library(zoo)
z <- read.zoo("Testinput.txt", header = TRUE, format = fmt)
z2 <- merge(z, zoo(,dd$Date))
na.locf(z2, fromLast = TRUE)

# alternately if its only needed to fill in the gaps in z
# then we can eliminate Testdate.txt and calculate z2
# (call it z2a this time) directly from z as:

rng <- range(time(z))
z2a <- merge(z, zoo(, seq(rng[1], rng[2], "day")))
# continue as previously

2008/7/23 Jia Ying Mei <jiamei at princeton.edu>:> Hi,
>
> I know this can be done in Stata (which is quite messy) but I wanted to
know
> if it can be done in R. So lets say I have a merged data set (I used the
> merge function by date for the attached two files), where all the missing
> values are filled with NAs (which is what the all.x=TRUE does).
>
> Is there any way to replace those NAs with the value of the latest row that
> contains a value?
>
> For example:
>
>> Date<-read.table("Desktop/R/Testdate.txt", head=T,
sep="\t")
>> Data<-read.table("Desktop/R/Testinput.txt", head=T,
sep="\t")
>> Merged<-merge(Date, Data, all.x=TRUE)
>> Merged
>    Date France Germany
> 1 3/10/07      2       4
> 2 3/11/07     NA      NA
> 3 3/12/07     NA      NA
> 4 3/13/07     NA      NA
> 5 3/14/07     NA      NA
> 6 3/15/07      1       2
>
> Given this Merged data, is there a way to replace every NA value from 3/11
> to 3/14 with that of 3/15? But then say there are multiple intervals with
> NAs that I want to fill with the last given value?
>
> Technically, I know I could just manually edit the data, but the real files
> I need to work with has thousands of rows, so I was wondering if there was
a
> way to do this (probably a loop?). Thanks a bunch.
>
> Jia Ying Mei
>
> Date
> 3/15/07
> 3/14/07
> 3/13/07
> 3/12/07
> 3/11/07
> 3/10/07
>
> Date    France  Germany
> 3/15/07 1       2
> 3/10/07 2       4
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>

Reasonably Related Threads

Search for more maybe matching threads

R help - Jul 2008 - Can R fill in missing values?

[R] Can R fill in missing values?

[R] Can R fill in missing values?

[R] Can R fill in missing values?

Reasonably Related Threads