Dear R users,

I would like to investigate methods for filling in missing data. I start with a complete data series and introduce missing values into it. I then use some method to fill in the missing values and compare the result with the original data to see how well the method performs. My question is: how do I introduce missing values into my complete data systematically, for example so that every 10th value is erased and treated as missing? Here are some rainfall data:

125 130.3 327.2 252.2 33.8 6.1 5.1 0.5 0.5 0 2.3 0 0 0 0 0 0 0 0 0 0.8 5.1 0 0.3 0 0 0 0 0 0 45.7 43.4 0 0 0 0 0

Thank you so much for any help given. I hope my question is clear.
Hello,
Something like this?
x <- scan(text = "
125
130.3
327.2
252.2
33.8
6.1
5.1
0.5
0.5
0
2.3
0
0
0
0
0
0
0
0
0
0.8
5.1
0
0.3
0
0
0
0
0
0
45.7
43.4
0
0
0
0
0
")
# Set every `by`-th element of x to NA (positions by, 2*by, ...).
putMissing <- function(x, by){
  # Multiples of `by` that do not exceed length(x)
  idx <- by * seq_len(length(x) %/% by)
  x[idx] <- NA
  x
}
putMissing(x, 10)
putMissing(x, 5)
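
To connect this with the stated goal (mask values, fill them in, then compare with the originals), here is a minimal sketch of the whole round trip. Linear interpolation with `approx()` is only an illustrative assumption for the fill step, not a recommendation for rainfall data, and the names `masked`, `filled` and `rmse` are hypothetical.

```r
# Rainfall series from the original post
x <- c(125, 130.3, 327.2, 252.2, 33.8, 6.1, 5.1, 0.5, 0.5, 0,
       2.3, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0.8, 5.1, 0, 0.3, 0, 0, 0, 0, 0, 0,
       45.7, 43.4, 0, 0, 0, 0, 0)

# Blank out every 10th value (positions 10, 20, 30)
idx <- 10 * seq_len(length(x) %/% 10)
masked <- x
masked[idx] <- NA

# Fill the gaps by linear interpolation between the neighbouring
# observed values (assumed method, chosen only for illustration);
# rule = 2 carries the nearest value outward if an end point is missing
filled <- approx(seq_along(masked), masked,
                 xout = seq_along(masked), rule = 2)$y

# Score the fill at the masked positions only
rmse <- sqrt(mean((filled[idx] - x[idx])^2))
rmse
```

Scoring only at `idx` matters here: the untouched positions are reproduced exactly, so including them would make any method look better than it is.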
Hope this helps,
Rui Barradas
On 25-04-2013 07:41, Roslina Zakaria wrote:
> [...]
I read your data into a data frame

  x <- read.table( "clipboard" )

and renamed the only column

  colnames( x )[1] <- "orig"

With a loop, I created a second column "miss" in which the observation in every 10th row is set to NA:

  for( i in 1 : length( x$orig ) ) {
    if( as.integer( rownames( x )[ i ] ) %% 10 == 0 ) {
      x$miss[i] <- NA
    } else {
      x$miss[i] <- x$orig[i]
    }
  }

This is probably the least elegant of all possible solutions, but it works...

Rgds,
Rainer

On Wednesday 24 April 2013 23:41:21 Roslina Zakaria wrote:
> [...]
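
The same "miss" column can also be built without a loop. The following is a minimal sketch, assuming the data frame has a single column "orig" as above; the helper name `every10th` is hypothetical, and the data are typed in directly so the example does not depend on the clipboard.

```r
# Rainfall series from the original post, as a one-column data frame
x <- data.frame(orig = c(125, 130.3, 327.2, 252.2, 33.8, 6.1, 5.1, 0.5, 0.5, 0,
                         2.3, 0, 0, 0, 0, 0, 0, 0, 0, 0,
                         0.8, 5.1, 0, 0.3, 0, 0, 0, 0, 0, 0,
                         45.7, 43.4, 0, 0, 0, 0, 0))

# TRUE for rows 10, 20, 30, ...; uses row positions rather than row names
every10th <- seq_len(nrow(x)) %% 10 == 0

# Copy "orig" and overwrite the flagged rows with NA
x$miss <- replace(x$orig, every10th, NA)
```

Using row positions instead of `rownames()` keeps the result correct even if the data frame has been subset or its row names are not the integers 1, 2, 3, ...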