Chris Howden
2010-Sep-01 13:11 UTC
[R] how to replace NA with a specific score that is dependant on another indicator variable
Hi everyone,

I'm looking for a clever bit of code to replace NAs with a specific score depending on an indicator variable. I can see how to do it using lots of if statements, but I'm sure there must be a neater, better way of doing it. Any ideas at all will be much appreciated; I'm dreading coding up all those if statements!

My problem is as follows. I have a data set with lots of missing data:

EG Raw Data Set

Category  variable1  variable2  variable3
1         5          NA         NA
1         NA         3          4
2         NA         7          NA
etc

Now I want to replace the NAs with the average for each category, so if these averages were:

EG Averages

Category  variable1  variable2  variable3
1         4.5        3.2        2.5
2         3.5        7.4        5.9

then I'd like my data set to look like the following once I've replaced the NAs with the appropriate category average:

EG Imputed Data Set

Category  variable1  variable2  variable3
1         5          3.2        2.5
1         4.5        3          4
2         3.5        7          5.9
etc

Any ideas would be very much appreciated!

Thank you,

Chris Howden
Founding Partner
Tricky Solutions
Tricky Solutions 4 Tricky Problems
Evidence Based Strategic Development, IP development, Data Analysis, Modelling, and Training
(mobile) 0410 689 945
(fax / office) (+618) 8952 7878
chris@trickysolutions.com.au
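[Editorial note: one compact way to do this without any if statements, not taken from the thread itself, is R's ave(), which returns a per-group statistic aligned with the original rows. The data frame name dat and the extra fourth row are made up for illustration so that every category has at least one non-missing value per variable.]

```r
# Toy data in the shape of the post (hypothetical object name 'dat');
# a fourth row is added so Category 2 has data in every column
dat <- data.frame(Category  = c(1, 1, 2, 2),
                  variable1 = c(5, NA, NA, 3.5),
                  variable2 = c(NA, 3, 7, NA),
                  variable3 = c(NA, 4, NA, 5.9))

# For each variable column, replace each NA with the mean of the
# non-missing values in the same Category
for (v in c("variable1", "variable2", "variable3")) {
  grp.mean <- ave(dat[[v]], dat$Category,
                  FUN = function(x) mean(x, na.rm = TRUE))
  dat[[v]][is.na(dat[[v]])] <- grp.mean[is.na(dat[[v]])]
}

dat
```

Note that if every value of a variable within a category is NA, mean(x, na.rm = TRUE) is NaN, so that case needs handling separately.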
David Winsemius
2010-Sep-01 13:55 UTC
[R] how to replace NA with a specific score that is dependant on another indicator variable
On Sep 1, 2010, at 9:20 AM, Chris Howden wrote:

> I'm looking for a clever bit of code to replace NAs with a specific score
> depending on an indicator variable.
> ...
> EG Raw Data Set
>
> Category variable1 variable2 variable3
> 1        5         NA        NA
> 1        NA        3         4
> 2        NA        7         NA

This does not do its work by category (since I got tired of fixing mangled htmlized datasets), but it seems to me that a tapply "wrap" could do either of these operations within categories:

> egraw
  Category variable1 variable2 variable3
1        1         5        NA        NA
2        1        NA         3         4
3        2        NA         7        NA

> lapply(egraw, function(x) {
+     mnx <- mean(x, na.rm=TRUE)
+     sapply(x, function(z) if (is.na(z)) {mnx} else {z})
+ })
$Category
[1] 1 1 2

$variable1
[1] 5 5 5

$variable2
[1] 5 3 7

$variable3
[1] 4 4 4

> sapply(egraw, function(x) {
+     mnx <- mean(x, na.rm=TRUE)
+     sapply(x, function(z) if (is.na(z)) {mnx} else {z})
+ })
     Category variable1 variable2 variable3
[1,]        1         5         5         4
[2,]        1         5         3         4
[3,]        2         5         7         4

> Now I want to replace the NAs with the average for each category ...
> Any ideas would be very much appreciated!!!!!

You might add reading the Posting Guide and setting up your mailer to post in plain text to your TODO list.

> thankyou
>
> Chris Howden

-- 
David Winsemius, MD
West Hartford, CT
On Aug 9, 2011, at 11:38 PM, Chris Howden wrote:

> Hi,
>
> I'm trying to do a hierarchical cluster analysis in R with a Big Data set.
> I'm running into problems using the dist() function.
>
> I've been looking at a few threads about R's memory and have read the
> memory limits section in R help. However I'm no computer expert, so I'm
> hoping I've misunderstood something and R can handle my Big Data set
> somehow. Although at the moment I think my dataset is simply too big and
> there is no way around it, but I'd like to be proved wrong!
>
> My data set has 90523 rows of data and 24 columns.
>
> My understanding is that this means the distance matrix has a min of
> 90523^2 elements, which is 8194413529. Which roughly translates as 8GB of
> memory being required (if I assume each entry requires 1 bit). I only have
> 4GB on a 32-bit build of Windows and R. So there is no way that's going to
> work.
>
> So then I thought of getting access to a more powerful computer, and maybe
> using cloud computing.
>
> However the R memory limit help mentions "On all builds of R, the maximum
> length (number of elements) of a vector is 2^31 - 1 ~ 2*10^9". Now as the
> distance matrix I require has more elements than this, does this mean it's
> too big for R no matter what I do?

Yes. Vector indexing is done with 4-byte integers.

-- 
David Winsemius, MD
West Hartford, CT
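[Editorial note: the arithmetic behind the question and this reply can be checked in R itself. The full matrix has 90523^2 elements, the lower triangle that dist() actually stores has n*(n-1)/2, and both exceed the 2^31 - 1 vector-length limit that applied to all builds of R at the time of this thread.]

```r
n <- 90523
full  <- n^2                # full distance matrix: ~8.19e9 elements
lower <- n * (n - 1) / 2    # what dist() stores:   ~4.10e9 elements
limit <- 2^31 - 1           # max vector length in R at the time

full > limit                # TRUE
lower > limit               # TRUE: even the symmetric half is too long

# at 8 bytes per double, the dist object alone would need roughly:
lower * 8 / 2^30            # about 30.5 GiB
```

So the 8 GB estimate in the question is actually optimistic: dist() stores doubles (8 bytes each), not bits, although it only stores half the matrix.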
On Wed, 10 Aug 2011, David Winsemius wrote:

> On Aug 9, 2011, at 11:38 PM, Chris Howden wrote:
>
>> My understanding is that this means the distance matrix has a min of
>> 90523^2 elements which is 8194413529. Which roughly translates as 8GB of

A bit less than half that: it is symmetric.

>> memory being required (if I assume each entry requires 1 bit).

Hmm, that would be a 0/1 distance: there are simpler methods to cluster such distances.

>> I only have 4GB on a 32bit build of windows and R. So there is no
>> way that's going to work.
>>
>> So then I thought of getting access to a more powerful computer, and maybe
>> using cloud computing.
>>
>> However the R memory limit help mentions "On all builds of R, the maximum
>> length (number of elements) of a vector is 2^31 - 1 ~ 2*10^9". Now as the
>> distance matrix I require has more elements than this does this mean it's
>> too big for R no matter what I do?
>
> Yes. Vector indexing is done with 4 byte integers.

Assuming you need the full distance matrix at one time (which you do not for hierarchical clustering, itself a highly dubious method for more than a few hundred points).

> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford,     Tel: +44 1865 272861 (self)
1 South Parks Road,            +44 1865 272866 (PA)
Oxford OX1 3TG, UK        Fax: +44 1865 272595
Sorry if this is a duplicate... my email is giving me trouble this evening...

On Tue, Aug 9, 2011 at 8:38 PM, Chris Howden <chris at trickysolutions.com.au> wrote:

> I'm trying to do a hierarchical cluster analysis in R with a Big Data set.
> I'm running into problems using the dist() function.
> ...
> However the R memory limit help mentions "On all builds of R, the maximum
> length (number of elements) of a vector is 2^31 - 1 ~ 2*10^9". Now as the
> distance matrix I require has more elements than this, does this mean it's
> too big for R no matter what I do?

You have understood correctly.

> Any ideas would be welcome.

You have a couple of options, some more involved than others. If you want to stick with R, I would suggest using a two-step clustering approach in which you first use k-means (assuming your distance is Euclidean) or a modification (for example, for correlation-based distances, the WGCNA package contains a function called projectiveKMeans) to pre-cluster your 90k+ variables into "blocks" of about 8-10k each (that's about as much as your computer will handle). The k-means algorithm only requires memory storage of order n*k, where k is the number of clusters (or blocks), which can be small, say 500, and n is the number of your variables. Then you do hierarchical clustering in each block separately. Make sure you install and load the package flashClust or fastcluster to make the hierarchical clustering run reasonably fast (the stock R implementation of hclust is horribly slow with large data sets). The mentioned WGCNA package contains a function called blockwiseModules that does just such a procedure, but there the distance is based on correlations, which may or may not suit your problem.

HTH,
Peter
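[Editorial note: the two-step procedure Peter describes can be sketched as below. This is a minimal illustration on made-up data, not code from the thread; the matrix size, block count k, and use of stats::kmeans/hclust are arbitrary choices made small enough to run quickly.]

```r
set.seed(1)
# stand-in for the poster's 90523 x 24 matrix (smaller so it runs quickly)
x <- matrix(rnorm(5000 * 24), ncol = 24)

# Step 1: k-means pre-clustering into a manageable number of blocks
# (in practice k would be a few hundred, per Peter's note)
k  <- 10
km <- kmeans(x, centers = k, iter.max = 50)

# Step 2: hierarchical clustering within each block separately,
# so no single dist() call ever sees all 5000 rows at once
trees <- lapply(seq_len(k), function(b) {
  block <- x[km$cluster == b, , drop = FALSE]
  hclust(dist(block))   # flashClust::hclust is a faster drop-in here
})

length(trees)           # one dendrogram per block
```

The memory saving comes from Step 2: each dist() call is over roughly n/k rows, so its storage is about (n/k)^2/2 doubles per block instead of n^2/2 for the whole data set.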
> Assuming you need the full distance matrix at one time (which you do not
> for hierarchical clustering, itself a highly dubious method for more than
> a few hundred points).

Apologies if this hijacks the thread, but why is hierarchical clustering "highly dubious for more than a few hundred points"?

Peter
On Tue, 9 Aug 2011, Peter Langfelder wrote:

>> Assuming you need the full distance matrix at one time (which you do not
>> for hierarchical clustering, itself a highly dubious method for more than
>> a few hundred points).
>
> Apologies if this hijacks the thread, but why is hierarchical
> clustering "highly dubious for more than a few hundred points"?

That is off-topic for R-help: see the posting guide.

-- 
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford,     Tel: +44 1865 272861 (self)
1 South Parks Road,            +44 1865 272866 (PA)
Oxford OX1 3TG, UK        Fax: +44 1865 272595
Łukasz Ręcławowicz
2011-Aug-12 05:16 UTC
[R] Can R handle a matrix with 8 billion entries?
On 12 August 2011 at 05:19, Chris Howden <chris@trickysolutions.com.au> wrote:

> Thanks for the suggestion, I'll look into it

It seems to work! :)

library(multiv)
data(iris)
iris <- as.matrix(iris[,1:4])
h  <- hierclust(iris, method=2)
d  <- dist(iris)
hk <- hclust(d)

> str(hk)
List of 7
 $ merge      : int [1:149, 1:2] -102 -8 -1 -10 -129 -11 -5 -20 -30 -58 ...
 $ height     : num [1:149] 0 0.1 0.1 0.1 0.1 ...
 $ order      : int [1:150] 108 131 103 126 130 119 106 123 118 132 ...
 $ labels     : NULL
 $ method     : chr "complete"
 $ call       : language hclust(d = d)
 $ dist.method: chr "euclidean"
 - attr(*, "class")= chr "hclust"

> str(h)
List of 3
 $ merge : int [1:149, 1:2] -102 -8 -1 -10 -129 -11 -41 -5 -20 7 ...
 $ height: num [1:149] 0 0.01 0.01 0.01 0.01 ...
 $ order : int [1:150] 42 23 15 16 45 34 33 17 21 32 ...

test.mat <- matrix(rnorm(90523*24), ncol=24)
out <- hierclust(test.mat, method = 1, bign = TRUE)

> print(object.size(out), u="Mb")
1.7 Mb

> str(out)
List of 3
 $ merge : int [1:90522, 1:2] -35562 -19476 -60344 -66060 -38949 -14537 -3322 -20248 -19464 -78693 ...
 $ height: num [1:90522] 1.93 1.94 1.96 1.98 2 ...
 $ order : int [1:90523] 24026 61915 71685 16317 85828 11577 36034 37324 65754 55381 ...

> R.version$os
[1] "mingw32"

-- 
Have a nice day