thr3ads.net - R help - [R] Character (1a, 1b) to numeric [Jul 2020]

Fox, John

2020-Jul-10 18:10 UTC

[R] Character (1a, 1b) to numeric

Dear Jean-Louis,

There must be many ways to do this. Here's one simple way (with no claim of
optimality!):
> xc <- c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c")
> xn <- c(1, 1.3, 1.5, 1.7, 2, 2.3, 2.5, 2.7)
> 
> set.seed(123) # for reproducibility
> x <- sample(xc, 20, replace=TRUE) # "data"
> 
> names(xn) <- xc
> z <- xn[x]
> 
> data.frame(z, x)
     z  x
1  2.5 2b
2  2.5 2b
3  1.5 1b
4  2.3 2a
5  1.5 1b
6  1.3 1a
7  1.3 1a
8  2.3 2a
9  1.5 1b
10 2.0  2
11 1.7 1c
12 2.3 2a
13 2.3 2a
14 1.0  1
15 1.3 1a
16 1.5 1b
17 2.7 2c
18 2.0  2
19 1.5 1b
20 1.5 1b
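
One thing the lookup above does not show is what happens when the data contain a stage that is missing from the table: indexing a named vector by an unknown name quietly returns NA. A minimal sketch of a guard follows. The helper name stage_to_numeric and the stage "3a" are hypothetical and are not taken from the thread.

```r
# Lookup table from the post: names are the stage codes, values the plotting positions.
xc <- c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c")
xn <- c(1, 1.3, 1.5, 1.7, 2, 2.3, 2.5, 2.7)
names(xn) <- xc

# Hypothetical helper: convert codes, but stop on any code absent from the table.
stage_to_numeric <- function(x, table) {
  unknown <- setdiff(unique(x), names(table))
  if (length(unknown) > 0) {
    stop("unknown stage code(s): ", paste(unknown, collapse = ", "))
  }
  unname(table[x])  # drop the names so the result is a plain numeric vector
}

# Codes present in the table convert as before.
z <- stage_to_numeric(c("2b", "1", "1a"), xn)

# A code outside the table raises an error instead of producing NA;
# tryCatch() captures the message here so the script keeps running.
z_bad <- tryCatch(stage_to_numeric(c("1a", "3a"), xn),
                  error = function(e) conditionMessage(e))
```

Whether an error or an NA is preferable probably depends on how the plot should treat unclassified stages; the error is only one possible choice.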

I hope this helps,
 John

  -----------------------------
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox
> On Jul 10, 2020, at 1:50 PM, Jean-Louis Abitbol <abitbol at sent.com> wrote:
> 
> Dear All
> 
> I have a character vector representing histology stages, for example:
> xc <- c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c")
> 
> and this goes on to 3, 3a, etc., in various orders for each patient. I do, of course, have a pre-established classification available, which changes according to the histology criteria under assessment.
> 
> For plotting purposes, I would like to convert xc to a numeric vector such as
> 
> xn <- c(1, 1.3, 1.5, 1.7, 2, 2.3, 2.5, 2.7)
> 
> Unfortunately I have no clue how to do that.
> 
> Thanks for any help, and apologies if I am missing the obvious way to do it.
> 
> JL
> -- 
> Verif30042020
> 
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Carlson

2020-Jul-10 19:28 UTC

[R] Character (1a, 1b) to numeric

Here is a different approach:

xc <- c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c")
xn <- as.numeric(gsub("a", ".3", gsub("b", ".5", gsub("c", ".7", xc))))
xn
# [1] 1.0 1.3 1.5 1.7 2.0 2.3 2.5 2.7
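
If there are more than three suffixes, the nested gsub() calls get hard to read. A possible variation, sketched under the assumption that every code is a number followed by at most one trailing letter, keeps the suffix-to-decimal mapping in a named vector and loops over it. The name repl is made up for this illustration.

```r
xc <- c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c")

# Suffix-to-decimal table; the values are the ones used in the thread.
# Assumption: each code has at most one trailing letter.
repl <- c(a = ".3", b = ".5", c = ".7")

xs <- xc
for (code in names(repl)) {
  # Replace the letter only where it ends the string, e.g. "1a" becomes "1.3".
  xs <- sub(paste0(code, "$"), repl[[code]], xs)
}
xn <- as.numeric(xs)
```

Anchoring the pattern with "$" is a small safeguard: it leaves any letter that is not at the end of the code untouched, so an unexpected code turns into NA in as.numeric() rather than into a wrong number.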

David L Carlson
Professor Emeritus of Anthropology
Texas A&M University


Bert Gunter

2020-Jul-10 19:54 UTC

[R] Character (1a, 1b) to numeric

... and continuing with this cute little thread...

I found the OP's specification a little imprecise -- are your values always
a string that begins with "some sort" of numeric value followed by "some
sort" of alpha code? That is, could the numeric value be several digits and
the alpha code several letters? Probably not, and the existing solutions
you have been provided are almost certainly all you need. But for fun,
assuming this more general specification, here is a general way to split
your alphanumeric codes up into numeric and alpha parts and then convert by
using a couple of sub()'s.
> set.seed(131)
> xc <- sample(c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c"), 15, replace = TRUE)
> nums <- sub("[[:alpha:]]+", "", xc)  ## extract numeric part
> alph <- sub("\\d+", "", xc)   ## extract alpha part
> codes <- letters[1:3] ## whatever alpha codes are used
> vals <- setNames(c(.3,.5,.7), codes) ## whatever numeric values to convert codes to
> xnew <- as.numeric(nums) + ifelse(alph == "",0, vals[alph])
> data.frame (xc = xc, xnew = xnew)
   xc xnew
1  1a  1.3
2   2  2.0
3  1c  1.7
4  1c  1.7
5  1b  1.5
6  1a  1.3
7   2  2.0
8   2  2.0
9  1a  1.3
10 1a  1.3
11 2c  2.7
12 1b  1.5
13 1b  1.5
14  1  1.0
15 1c  1.7
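
The split-and-add steps above could be wrapped in a small function so they can be reused for each classification. This is only a sketch: the function name code_to_numeric and the test inputs "12b", "3c" and "2d" are hypothetical, chosen to exercise the multi-digit and unknown-letter cases.

```r
# Hypothetical wrapper around the two sub() calls shown above.
code_to_numeric <- function(x, vals) {
  nums <- sub("[[:alpha:]]+$", "", x)  # leading digits
  alph <- sub("^\\d+", "", x)          # trailing letters, "" if there are none
  offset <- unname(vals[alph])         # NA where alph is "" or not a name in vals
  offset[alph == ""] <- 0              # a bare number gets no offset
  as.numeric(nums) + offset
}

# Same letter values as in the thread.
vals <- c(a = 0.3, b = 0.5, c = 0.7)

# "12b" has a multi-digit numeric part; "2d" uses a letter that is not in vals.
res <- code_to_numeric(c("1", "1a", "12b", "3c", "2d"), vals)
```

In this sketch a letter that is missing from vals yields NA rather than an error, which mirrors how named-vector indexing behaves in the code above.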

Echoing others, no claim for optimality in any sense.

Cheers,
Bert



Fox, John

2020-Jul-10 20:02 UTC

[R] Character (1a, 1b) to numeric

Hi,

We've had several solutions, and I was curious about their relative
efficiency. Here's a test with a moderately large data vector:
> library("microbenchmark")
> set.seed(123) # for reproducibility
> x <- sample(xc, 1e4, replace=TRUE) # "data"
> microbenchmark(John = John <- xn[x],
+                Rich = Rich <- xn[match(x, xc)],
+                Jeff = Jeff <- {
+                 n <- as.integer( sub( "[a-i]$", "", x ) )
+                 d <- match( sub( "^\\d+", "", x ), letters[1:9] )
+                 d[ is.na( d ) ] <- 0
+                 n + d / 10
+                 },
+                David = David <- as.numeric(gsub("a", ".3",
+                                      gsub("b", ".5",
+                                           gsub("c", ".7", x)))),
+                times=1000L
+                )
Unit: microseconds
  expr       min        lq       mean     median         uq       max neval cld
  John   228.816   345.371   513.5614   503.5965   533.0635  10829.08  1000 a  
  Rich   217.395   343.035   534.2074   489.0075   518.3260  15388.96  1000 a  
  Jeff 10325.471 13070.737 15387.2545 15397.9790 17204.0115 153486.94  1000  b 
 David 14256.673 18148.492 20185.7156 20170.3635 22067.6690  34998.95  1000   c
> all.equal(John, Rich)
[1] TRUE
> all.equal(John, David)
[1] "names for target but not for current"
> all.equal(John, Jeff)
[1] "names for target but not for current" "Mean relative difference: 0.1498243"

Of course, efficiency isn't the only consideration, and aesthetically (and
no doubt subjectively) I prefer Rich Heiberger's solution. OTOH, Jeff's
solution is more general in that it generates the correspondence between letters
and numbers. The argument for Jeff's solution would, however, be stronger if
it gave the desired answer.
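
The mismatch reported by all.equal() appears to come from the d / 10 step, which maps a, b, c to .1, .2, .3 rather than to the .3, .5, .7 asked for in the original question. A hedged sketch of one way to adjust it follows; the names offsets and Jeff_adj are invented here, and this is not a version that Jeff posted.

```r
xc <- c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c")
xn <- c(1, 1.3, 1.5, 1.7, 2, 2.3, 2.5, 2.7)
names(xn) <- xc

set.seed(123) # for reproducibility
x <- sample(xc, 1e4, replace = TRUE) # "data"

# Named-vector lookup, as in the benchmark above.
John <- xn[x]

# Same split into number and letter, but with an explicit offset per letter
# instead of d / 10. The offsets are the ones implied by the original question.
offsets <- c(0.3, 0.5, 0.7)
n <- as.integer(sub("[a-c]$", "", x))
d <- match(sub("^\\d+", "", x), letters[1:3])  # NA when there is no letter
Jeff_adj <- n + ifelse(is.na(d), 0, offsets[d])

# xn[x] carries names and Jeff_adj does not, so drop them before comparing.
same <- all.equal(unname(John), Jeff_adj)
```

Dropping the names with unname() also avoids the "names for target but not for current" message, which reflects an attribute difference rather than a numeric one.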

Best,
 John
