thr3ads.net - R help - [R] recode problem - unexplained values [Sep 2006]

bgreen at dyson.brisnet.org.au

2006-Sep-28 02:27 UTC

[R] recode problem - unexplained values

I am hoping for some advice on difficulties I have been having recoding
variables contained in a csv file. Table 1 (below) shows that there are
two types of blanks, as reported in the first two columns. I am using
Windows XP and the latest version of R.

When blank cells are replaced with a value of n using the syntax:
> affect[affect==""] <- "n"
there are still 3 blank values (Table 2). When as.numeric is applied,
this also causes problems because values of 2, 3 & 4 are generated rather
than just 1 & 2.

TABLE 1

table(group,actions)
     actions
group           n   y
    1 100   2   0   3
    2  30   1   1   0
    3  24   0   0   0



TABLE 2
>  table(group,actions)
     actions
group           n   y
    1   0   2 100   3
    2   0   1  31   0
    3   0   0  24   0


Below is another example - for some reason there are 2 types of 'aobh'
values.

> table(group, type)
     type
group aobh aobh   gbh   m  uw
    1  104      1   0   0   0
    2    0      0  15   0  17
    3    0      0   0  24   0


Any assistance is much appreciated,


Bob Green

Richard M. Heiberger

2006-Sep-28 04:28 UTC


I can propose a strategy.

This example shows that there are different types of blanks when you
look at character data.

    as.character(c("", " ", "  ", "  
"))

Your test for "" found only one of them.
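
A minimal sketch of why the test misses some blanks, using a made-up
vector x (an assumption for illustration, not Bob's data). Only the
zero-length string equals ""; strings made of spaces or a tab do not.
trimws() (in base R from version 3.2.0) removes leading and trailing
whitespace, so after trimming all of them compare equal to "".

```r
# Hypothetical values: empty string, one space, two spaces, a tab, and "y"
x <- c("", " ", "  ", "\t", "y")

# Only the truly empty string matches ""
is_empty <- x == ""
is_empty
# TRUE FALSE FALSE FALSE FALSE

# nchar() reveals the hidden characters
nchar(x)
# 0 1 2 1 1

# Trim whitespace first, then recode the blanks
x_clean <- trimws(x)
x_clean[x_clean == ""] <- "n"
x_clean
# "n" "n" "n" "n" "y"
```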

Look at the data as read.csv produces it.  That will probably give you
some clues.

mydata <- read.csv("filename")

mydata

as.character(mydata)
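
One possible way to do this inspection column by column is sketched
below. The inline CSV text and the column names group and actions are
assumptions standing in for the real file. dput() prints each distinct
value in quotes, which makes blanks of different lengths visible.

```r
# Hypothetical CSV content standing in for the real file; the second data
# row has an empty field and the third has a single space
csv_text <- "group,actions\n1,y\n1,\n2, \n2,n\n"
mydata <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Quoted output shows exactly which characters each distinct value holds
distinct_values <- unique(mydata$actions)
dput(distinct_values)

# Count each distinct value, including the blank ones
table(mydata$actions, useNA = "ifany")

# Factor codes number every distinct level, blanks included, which is why
# as.numeric() on such a factor can return more than two distinct codes
codes <- as.numeric(factor(mydata$actions))
codes
```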




Rich

Marc Schwartz (via MN)

2006-Sep-28 15:12 UTC


On Thu, 2006-09-28 at 12:27 +1000, bgreen at dyson.brisnet.org.au wrote:
> I am hoping for some advice on difficulties I have been having recoding
> variables contained in a csv file. Table 1 (below) shows that there are
> two types of blanks, as reported in the first two columns. I am using
> Windows XP and the latest version of R.
> 
> When blank cells are replaced with a value of n using the syntax:
> > affect[affect==""] <- "n"
> there are still 3 blank values (Table 2). When as.numeric is applied,
> this also causes problems because values of 2, 3 & 4 are generated rather
> than just 1 & 2.
> 
> TABLE 1
> 
> table(group,actions)
>      actions
> group           n   y
>     1 100   2   0   3
>     2  30   1   1   0
>     3  24   0   0   0
> 
> 
> 
> TABLE 2
> 
> >  table(group,actions)
>      actions
> group           n   y
>     1   0   2 100   3
>     2   0   1  31   0
>     3   0   0  24   0
> 
> 
> Below is another example - for some reason there are 2 types of 'aobh'
> values.
> 
> 
> > table(group, type)
>      type
> group aobh aobh   gbh   m  uw
>     1  104      1   0   0   0
>     2    0      0  15   0  17
>     3    0      0   0  24   0
> 
> 
> Any assistance is much appreciated,
> 
> 
> Bob Green

Bob,

A quick heads up: my presumption is that "aobh" and "aobh " are
different values simply as a consequence of leading/trailing spaces in
the source data file within the delimited fields. This is also the
likely reason for there being multiple missing/blank values in your
imported data set.

Presuming that you used one of the read.table() family of functions
(e.g. read.csv()), take note of the 'strip.white' argument in
?read.table, which defaults to FALSE. If you change it to TRUE, the
function will strip leading and trailing blanks from unquoted character
fields, likely resolving this issue.
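
A small sketch of the effect of strip.white, assuming made-up inline CSV
text in place of the actual file (the values mimic the 'type' column but
are not Bob's data). The same text is read twice, once with the default
and once with strip.white = TRUE.

```r
# Hypothetical CSV text: one "aobh" has a trailing space, one "uw" a leading space
csv_text <- "group,type\n1,aobh\n1,aobh \n2,gbh\n2, uw\n3,m\n"

# Default behaviour keeps the surrounding spaces
raw <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# strip.white = TRUE removes leading and trailing blanks from the fields
clean <- read.csv(text = csv_text, stringsAsFactors = FALSE,
                  strip.white = TRUE)

dput(unique(raw$type))    # "aobh" and "aobh " show up as separate values
dput(unique(clean$type))  # a single "aobh" remains

table(clean$group, clean$type)
```

If the data have already been read in, trimming the affected columns
after the fact is an alternative to re-reading the file.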

HTH,

Marc Schwartz
