thr3ads.net - R help - [R] Adding NA values in random positions in a dataframe [Nov 2013]

arun

2013-Nov-28 17:57 UTC

[R] Adding NA values in random positions in a dataframe

Hi,
One way would be:
# Example data: 50 values in 5 columns, each NA with probability 0.10
set.seed(42)
dat1 <- as.data.frame(matrix(sample(c(1:5,NA),50,replace=TRUE,prob=c(10,15,15,20,30,10)),ncol=5))
# Set a random 20% of the non-NA values to NA
set.seed(49)
dat1[!is.na(dat1)][match(sample(seq(dat1[!is.na(dat1)]),length(dat1[!is.na(dat1)])*(0.20)),seq(dat1[!is.na(dat1)]))] <- NA
# Overall proportion of NA values
length(dat1[is.na(dat1)])/length(unlist(dat1))
#[1] 0.28

A.K.


Hello, I'm quite new to R, so I don't know the most efficient
way to write a function that I could write easily in other languages.

This is my problem: I have a data frame with a certain number of
NA values (approximately 10%). I want to add more NA values in random
positions of the data frame until the overall proportion of NA
values reaches 30% (the positions that already hold NA must not
change). I tried looking at iterative functions in R such as apply or sapply,
but I can't figure out how to use them in this case. Thank you.

arun

2013-Nov-29 18:09 UTC


[R] Adding NA values in random positions in a dataframe

Hi,
I used 0.20 because about 10% of the values in the data were already NA.
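
To spell out that arithmetic, here is a small sketch. The 10% starting share is taken from the question rather than measured on real data, and the variable names are only illustrative. Blanking 20% of the remaining 90% adds 18 percentage points, which matches the 0.28 reported above; landing on 30% would need a slightly larger fraction of the non-NA values.

```r
# Illustrative values only (assumed from the question, not measured)
p_now   <- 0.10                      # current share of NA values
frac    <- 0.20                      # share of the non-NA values set to NA
p_after <- p_now + frac * (1 - p_now)
p_after                              # 0.10 + 0.20 * 0.90

# Fraction of the non-NA values needed to reach a target share instead
p_target    <- 0.30
frac_needed <- (p_target - p_now) / (1 - p_now)
frac_needed                          # 0.20 / 0.90, a little above 0.22
```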


You are right. Sorry, match() is unnecessary. I was trying another solution
with match() that didn't work out, and I forgot to check whether it was
still needed.
set.seed(49)
dat1[!is.na(dat1)][sample(seq(dat1[!is.na(dat1)]),length(dat1[!is.na(dat1)])*(0.20))] <- NA
A.K.
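
For readers who want the 30% target from the original question rather than a fixed 20% of the non-NA values, the same idea can be sketched as a small function. This is not from the thread: `add_na_to_target` and its `target` argument are hypothetical names, and the sketch assumes the target is rounded to the nearest whole number of cells.

```r
# Hypothetical helper (not from the thread): set random non-NA cells to NA
# until the overall share of NA cells reaches `target`.
add_na_to_target <- function(df, target = 0.30) {
  is_missing <- is.na(df)                 # logical matrix, same shape as df
  n_total    <- length(is_missing)
  n_needed   <- round(target * n_total) - sum(is_missing)
  if (n_needed <= 0) return(df)           # already at or above the target
  candidates <- which(!is_missing)        # linear indices of non-NA cells
  # sample.int() sidesteps the sample(x) surprise when x has length one
  chosen <- candidates[sample.int(length(candidates), n_needed)]
  is_missing[chosen] <- TRUE
  df[is_missing] <- NA                    # cells that were NA stay NA
  df
}

# Usage on data built the same way as in the thread
set.seed(42)
dat <- as.data.frame(matrix(sample(c(1:5, NA), 50, replace = TRUE,
                                   prob = c(10, 15, 15, 20, 30, 10)), ncol = 5))
set.seed(49)
out <- add_na_to_target(dat, target = 0.30)
mean(is.na(out))                          # overall share of NA cells
```

Counting the cells to blank up front, instead of multiplying by a fixed fraction, means the result does not depend on knowing the starting share of NA values in advance.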


Thanks for the reply. I don't understand the 0.20 multiplied by the number of
non-NA values; where did you take it from?

Furthermore, why do we have to use the match function? Wouldn't it be enough
to use the sample function?


