On Wed, 2006-11-08 at 23:16 +0100, Marco Boks wrote:
> Dear All,
> I am lost about the following. I have got a large dataframe (largeset)
> with in the first column identification numbers as factors
> largeset$ID
> p000345
> p000356
> p000569
> etc
> in order to use them to merge with another dataframe with numerical
> values (000345, 000356) I want to convert them to numerical.
> >as.numeric(as.character(largeset$ID)) gives NA's
> >as.numeric(strsplit(as.character(largeset[,1]), "p")) also fails:
> Error in as.double.default(strsplit(as.character(largeset[, 1]),
> "p")) :
> unimplemented type 'character' in 'asReal'
> Any suggestions would be very appreciated
>
> Marco
Two approaches, depending upon whether you need actual numeric values
(which will not retain the leading zeroes) or simply want to strip the
'p' and retain the leading zeroes. Both presume that the leading 'p' is
the only non-numeric character in each ID.
# Presuming that ID is a factor
> ID
[1] p000345 p000356 p000569
Levels: p000345 p000356 p000569
# Retain leading zeroes as a character vector
> sub("p", "", ID)
[1] "000345" "000356" "000569"
# Convert to a numeric vector
> as.numeric(sub("p", "", ID))
[1] 345 356 569
See ?sub for more information.
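Since the end goal is a merge, here is a rough sketch of how the conversion might feed into merge(). The data frames below are made-up stand-ins: the column names x and y, the name ID_num, and the values in 'other' are assumptions for illustration, not taken from your data. The pattern "^p" is anchored so that only a leading 'p' is removed. The last part shows one way the strsplit() attempt could be made to work: strsplit() returns a list, which as.numeric() cannot convert directly, so the second piece of each split is extracted first.

```r
# Hypothetical stand-ins for the two data frames in the question
largeset <- data.frame(ID = factor(c("p000345", "p000356", "p000569")),
                       x  = c(1.5, 2.5, 3.5))
other    <- data.frame(ID = c(345, 356, 570),
                       y  = c(10, 20, 30))

# Strip only a leading "p", then convert to numeric
# (ID_num is an illustrative column name)
ids <- as.character(largeset$ID)
largeset$ID_num <- as.numeric(sub("^p", "", ids))

# Merge on the numeric key; only IDs present in both frames are kept
merged <- merge(largeset, other, by.x = "ID_num", by.y = "ID")

# strsplit() returns a list with one character vector per ID, e.g.
# c("", "000345"), so pull out the second piece before converting
pieces   <- strsplit(ids, "p", fixed = TRUE)
from_str <- as.numeric(vapply(pieces, function(parts) parts[2], character(1)))
```

By default merge() keeps only the matching rows; all.x = TRUE would retain every row of largeset if that is what you need.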
HTH,
Marc Schwartz