Dear r-users,

I would like to investigate methods for filling in missing data. I start with a complete data series and introduce missing values into it. I would then use some method to fill in the missing values and compare the result with the original data to see how well the method performs.

My question is: how do I introduce missing values into my complete data systematically, for example so that every 10th value is erased and treated as missing?

Here are some rainfall data:

125
130.3
327.2
252.2
33.8
6.1
5.1
0.5
0.5
0
2.3
0
0
0
0
0
0
0
0
0
0.8
5.1
0
0.3
0
0
0
0
0
0
45.7
43.4
0
0
0
0
0

Thank you so much for any help given. I hope my question is clear.

Hello,

Something like this?

    x <- scan(text = "
    125 130.3 327.2 252.2 33.8 6.1 5.1 0.5 0.5 0
    2.3 0 0 0 0 0 0 0 0 0
    0.8 5.1 0 0.3 0 0 0 0 0 0
    45.7 43.4 0 0 0 0 0
    ")

    # Set every by-th element of x to NA
    putMissing <- function(x, by){
        idx <- by * seq_along(x)        # by, 2*by, 3*by, ...
        idx <- idx[idx <= length(x)]    # keep only positions that exist in x
        x[idx] <- NA
        x
    }

    putMissing(x, 10)
    putMissing(x, 5)

Hope this helps,

Rui Barradas

On 25-04-2013 07:41, Roslina Zakaria wrote:
> [...] My question is, how do I introduce missing data in my complete data systematically like for example every 10th data will be erased and assumed as missing. [...]

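The index vector in putMissing() above can also be built directly with seq(). What follows is only a minimal sketch under the same "every by-th value" assumption; the name put_missing_seq and the guard for by > length(x) are illustrative additions and are not part of the reply above.

```r
# Rainfall series from the original post (37 values)
x <- c(125, 130.3, 327.2, 252.2, 33.8, 6.1, 5.1, 0.5, 0.5, 0,
       2.3, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0.8, 5.1, 0, 0.3, 0, 0, 0, 0, 0, 0,
       45.7, 43.4, 0, 0, 0, 0, 0)

# Hypothetical variant of putMissing(); the name is made up for illustration
put_missing_seq <- function(x, by) {
  stopifnot(by >= 1)
  n <- length(x)
  # seq(from = by, to = n, by = by) errors when by > n,
  # so return x unchanged in that case
  if (by > n) return(x)
  x[seq(from = by, to = n, by = by)] <- NA
  x
}

y <- put_missing_seq(x, 10)
which(is.na(y))   # positions that were set to NA
```

The explicit guard is needed because seq() refuses a start value beyond the end value, whereas the filtering in putMissing() handles that case without a special branch.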
I read your data into a data frame

    x <- read.table( "clipboard" )

and renamed the only column

    colnames( x )[1] <- "orig"

With a loop, I created a 2nd column "miss" in which the observation in every 10th row is set to NA:

    for( i in 1 : length( x$orig ) ) {
        # default row names are "1", "2", ..., so this tests the row number
        if( as.integer( rownames( x )[ i ] ) %% 10 == 0 ) {
            x$miss[i] <- NA
        } else {
            x$miss[i] <- x$orig[i]
        }
    }

This is probably the least elegant of all possible solutions but it works...

Rgds,
Rainer

On Wednesday 24 April 2013 23:41:21 Roslina Zakaria wrote:
> [...] My question is, how do I introduce missing data in my complete data systematically like for example every 10th data will be erased and assumed as missing. [...]
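The loop above can also be written without iteration, and the result can then feed the fill-and-compare step described in the original question. The sketch below is only an illustration: it masks rows by position rather than by row name, and it uses linear interpolation via approx() purely as an example of "some method". The thread does not say which fill method is intended, and the column name "filled" and the RMSE comparison are assumptions.

```r
# Rainfall series from the original post (37 values)
orig <- c(125, 130.3, 327.2, 252.2, 33.8, 6.1, 5.1, 0.5, 0.5, 0,
          2.3, 0, 0, 0, 0, 0, 0, 0, 0, 0,
          0.8, 5.1, 0, 0.3, 0, 0, 0, 0, 0, 0,
          45.7, 43.4, 0, 0, 0, 0, 0)
x <- data.frame(orig = orig)

# Copy the column, then blank every 10th row by position
x$miss <- x$orig
x$miss[seq_len(nrow(x)) %% 10 == 0] <- NA

# Fill the gaps by linear interpolation (illustrative choice only)
pos <- seq_len(nrow(x))
obs <- !is.na(x$miss)
x$filled <- approx(x = pos[obs], y = x$miss[obs], xout = pos)$y

# Compare filled values with the originals at the masked positions
masked <- !obs
rmse <- sqrt(mean((x$filled[masked] - x$orig[masked])^2))
rmse
```

Note that approx() with its default settings returns NA for positions outside the range of observed points, so a masked first or last value would stay missing; that does not occur here because rows 10, 20 and 30 all have observed neighbours.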