thr3ads.net - R help - [R] Replacing N.A values in a data frame [Oct 2010]


Santosh Srinivas

2010-Oct-14 08:17 UTC

[R] Replacing N.A values in a data frame

Hello, I have a data frame as below. In cases where I have N.A., I want
to use the average of the past date and the next date. Any help?

13/10/2010	A	23
13/10/2010	B	12
13/10/2010	C	124
14/10/2010	A	43
14/10/2010	B	54
14/10/2010	C	65
15/10/2010	A	43
15/10/2010	B	N.A.
15/10/2010	C	65

----------------------------------------------------------------------------
--------------------------
Thanks R-Helpers.

Henrique Dallazuanna

2010-Oct-14 12:38 UTC


If I understand correctly, you can use approxfun:

DF <- read.table(textConnection("
13/10/2010      A       23
13/10/2010      B       12
13/10/2010      C       124
14/10/2010      A       43
14/10/2010      B       54
14/10/2010      C       65
15/10/2010      A       43
15/10/2010      B       N.A.
15/10/2010      C       65"), na.strings = "N.A.")

f <- approxfun(1:nrow(DF), DF$V3)
DF$V4 <- sapply(seq(nrow(DF)), f)
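
Note that the approxfun call above interpolates along row order, so the gap in row 8 (ticker B) is filled from the neighbouring rows, which belong to tickers A and C. If each ticker is meant to be its own series, a minimal base-R sketch of per-ticker interpolation might look like the following. The column names Date, Ticker, Value and Filled, the helper fill_gaps, and the extra 16/10/2010 rows are all made up for illustration, so that the missing value has a "next date" to average with.

```r
# Illustrative data: same layout as the thread, plus a hypothetical
# 16/10/2010 so that the missing B value has a following date.
DF <- data.frame(
  Date   = as.Date(rep(c("13/10/2010", "14/10/2010", "15/10/2010", "16/10/2010"),
                       each = 3), format = "%d/%m/%Y"),
  Ticker = rep(c("A", "B", "C"), times = 4),
  Value  = c(23, 12, 124, 43, 54, 65, 43, NA, 65, 45, 60, 70)
)

# Linear interpolation within one ticker. approx() drops NA pairs before
# interpolating; rule = 1 leaves leading/trailing gaps as NA.
fill_gaps <- function(value, date) {
  approx(x = as.numeric(date), y = value, xout = as.numeric(date), rule = 1)$y
}

# Apply the helper per ticker, keeping the original row order.
DF$Filled <- NA_real_
for (tk in unique(DF$Ticker)) {
  idx <- DF$Ticker == tk
  DF$Filled[idx] <- fill_gaps(DF$Value[idx], DF$Date[idx])
}

# B on 15/10/2010 becomes (54 + 60) / 2 = 57
print(DF[DF$Ticker == "B", ])
```

With equally spaced dates, linear interpolation of a single missing point is the same as averaging the previous and next values, which is what the original question asked for.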



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O


Gabor Grothendieck

2010-Oct-14 12:51 UTC


On Thu, Oct 14, 2010 at 4:17 AM, Santosh Srinivas
<santosh.srinivas at gmail.com> wrote:
> Hello, I have a data frame as below ... in cases where I have N.A. I want
> to use an average of the past date and next date .. any help?
>
> 13/10/2010      A       23
> 13/10/2010      B       12
> 13/10/2010      C       124
> 14/10/2010      A       43
> 14/10/2010      B       54
> 14/10/2010      C       65
> 15/10/2010      A       43
> 15/10/2010      B       N.A.
> 15/10/2010      C       65

Assuming A, B and C refer to separate time series you can use
na.approx in zoo.

Lines <- "13/10/2010      A       23
13/10/2010      B       12
13/10/2010      C       124
14/10/2010      A       43
14/10/2010      B       54
14/10/2010      C       65
15/10/2010      A       43
15/10/2010      B       N.A.
15/10/2010      C       65"

library(zoo)

# z <- read.zoo("myfile.dat", format = "%d/%m/%Y", split = 2,
#               na.strings = "N.A.")
z <- read.zoo(textConnection(Lines), format = "%d/%m/%Y", split = 2,
              na.strings = "N.A.")

na.approx(z)  # or na.approx(z, rule = 2)

which gives this multivariate time series in zoo:
> na.approx(z)
            A  B   C
2010-10-13 23 12 124
2010-10-14 43 54  65
2010-10-15 43 NA  65
> # or
> na.approx(z, rule = 2)
            A  B   C
2010-10-13 23 12 124
2010-10-14 43 54  65
2010-10-15 43 54  65
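
The difference between the two results above comes from the rule argument, which, as far as I know, na.approx hands on to approx() in base R. The sketch below uses approx() directly on the B column rather than zoo itself, so treat it as an illustration of the rule semantics only. The integer days and the b_mid variant with the gap in the middle are assumptions added for the example.

```r
days <- 1:3               # stand-ins for 13, 14 and 15 Oct 2010
b    <- c(12, 54, NA)     # series B from the thread: gap on the last day

# rule = 1: points outside the observed range stay NA
trailing_rule1 <- approx(days, b, xout = days, rule = 1)$y
# rule = 2: the nearest observed value is carried outward
trailing_rule2 <- approx(days, b, xout = days, rule = 2)$y

# Hypothetical variant with the gap on the middle day instead:
# an interior gap is filled by linear interpolation under either rule.
b_mid    <- c(12, NA, 54)
interior <- approx(days, b_mid, xout = days)$y

print(trailing_rule1)   # 12 54 NA
print(trailing_rule2)   # 12 54 54
print(interior)         # 12 33 54
```

So for the data in this thread there is no "next date" after 15/10/2010, and rule = 2 simply repeats the last observed value of B rather than averaging anything.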





-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

Santosh Srinivas

2010-Oct-14 13:59 UTC


Wow! That's amazing! Many thanks!

When I do the below ... why do the column names get thrown off? Ticker is a
factor / character ... I tried both
> temp <- head(MF_Data_Sub)
> temp
        Date Ticker   Price
1 2008-04-01 106270 10.3287
2 2008-04-01 106269 10.3287
3 2008-04-01 102767 12.6832
4 2008-04-01 102766 10.5396
5 2008-04-01 102855  9.7833
6 2008-04-01 102856 12.1485
> tZoo <- read.zoo(temp,split=2)
> tZoo
           X102766 X102767 X102855 X102856 X106269 X106270
2008-04-01 10.5396 12.6832  9.7833 12.1485 10.3287 10.3287


Also, is there an easy way to do a return profile on the data below after it
is transformed?

Thanks very much!
S




Gabor Grothendieck

2010-Oct-14 22:59 UTC


On Thu, Oct 14, 2010 at 9:59 AM, Santosh Srinivas
<santosh.srinivas at gmail.com> wrote:
> Wow! That's amazing! Many thanks!
>
> When I do the below ... why do the column names get thrown off? Ticker is a
> factor / character ... I tried both
>
>> temp <- head(MF_Data_Sub)
>> temp
>         Date Ticker   Price
> 1 2008-04-01 106270 10.3287
> 2 2008-04-01 106269 10.3287
> 3 2008-04-01 102767 12.6832
> 4 2008-04-01 102766 10.5396
> 5 2008-04-01 102855  9.7833
> 6 2008-04-01 102856 12.1485
>> tZoo <- read.zoo(temp,split=2)
>> tZoo
>            X102766 X102767 X102855 X102856 X106269 X106270
> 2008-04-01 10.5396 12.6832  9.7833 12.1485 10.3287 10.3287

read.zoo automatically turns the names into valid R variable names, as
data.frame does. If you do not want that behavior, add the
check.names = FALSE argument to read.zoo.

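The renaming can be reproduced in base R without zoo. This is a small sketch of what data.frame does with names that start with a digit; it does not show read.zoo's internals, and the two ticker columns are just copied from the output quoted above.

```r
tickers <- c("102766", "102767", "A")

# make.names() prefixes an "X" to names that start with a digit
print(make.names(tickers))   # "X102766" "X102767" "A"

# data.frame() applies the same fix by default ...
fixed <- data.frame(`102766` = 10.5396, `102767` = 12.6832)
print(names(fixed))          # "X102766" "X102767"

# ... and leaves the names alone with check.names = FALSE
raw <- data.frame(`102766` = 10.5396, `102767` = 12.6832, check.names = FALSE)
print(names(raw))            # "102766" "102767"
```

Columns with non-syntactic names then need backticks or [[ ]] to be accessed, for example raw[["102766"]].
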
> Also, is there an easy way to do a return profile on the data below after it
> is transformed?

What is a "profile on the data"? These all work: str(z); summary(z);
View(z)


Gabor Grothendieck

2010-Oct-15 01:35 UTC


Based on offline discussion, what was wanted was the returns, i.e.

diff(log(z))

Also for other manipulations of financial data see the xts, quantmod
and PerformanceAnalytics packages.
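
To show what diff(log(z)) computes, here is a base-R sketch on a plain matrix with one column per ticker. The prices are hypothetical; on a zoo object the same expression works column by column and keeps the date index.

```r
# Hypothetical prices for two tickers on three consecutive dates
prices <- cbind(A = c(100, 110, 121), B = c(50, 50, 55))

# Log returns: log(p_t) - log(p_{t-1}), one row fewer than prices
log_returns <- diff(log(prices))

# Convert back to simple returns, where 0.10 means +10%
simple_returns <- exp(log_returns) - 1

print(round(simple_returns, 4))
```

Log returns add up across periods, which is one reason they are often preferred over simple returns for this kind of series.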

