thr3ads.net - R help - [R] A More efficient method? [Jul 2007]

Keith Alan Chamberlain

2007-Jul-04 13:44 UTC

[R] A More efficient method?

Dear Rhelpers,

Is there a faster way than below to set a vector based on values from
another vector? I'd like to call a pre-existing function for this, but one
which can also handle an arbitrarily large number of categories. Any ideas?

Cat=c('a','a','a','b','b','b','a','a','b')	# Categorical variable
C1=vector(length=length(Cat))	# New vector for numeric values

# Cycle through each column and set C1 to corresponding value of Cat.
for(i in 1:length(C1)){
	if(Cat[i]=='a') C1[i]=-1 else C1[i]=1
}

C1
[1] -1 -1 -1  1  1  1 -1 -1  1
Cat
[1] "a" "a" "a" "b" "b" "b" "a" "a" "b"

Sincerely,
KeithC.
Psych Undergrad, CU Boulder (US)
RE McNair Scholar

Ken Knoblauch

2007-Jul-04 15:11 UTC

 ifelse(Cat == "a", -1, 1)
[1] -1 -1 -1  1  1  1 -1 -1  1

HTH

Benilton Carvalho

2007-Jul-04 15:12 UTC

C1 <- rep(-1, length(Cat))
C1[Cat == "b"] <- 1

b


Stefan Grosse

2007-Jul-04 15:16 UTC

How about:

Cat<-c('a','a','a','b','b','b','a','a','b')
c1<- -2*(Cat=="a")+1



-=-=-
... Time is an illusion, lunchtime doubly so. (Ford Prefect)

ONKELINX, Thierry

2007-Jul-04 15:17 UTC

Cat <- c('a','a','a','b','b','b','a','a','b')
C1 <- ifelse(Cat == 'a', -1, 1)

----------------------------------------------------------------------------
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
Thierry.Onkelinx op inbo.be
www.inbo.be 

Do not put your faith in what statistics say until you have carefully
considered what they do not say.  ~William W. Watt
A statistical analysis, properly conducted, is a delicate dissection of
uncertainties, a surgery of suppositions. ~M.J.Moroney

 

Gabor Grothendieck

2007-Jul-04 15:29 UTC

Here are two ways.  The second way is more than 10x faster.

> set.seed(1)
> C <- sample(c("a", "b"), 100000, replace = TRUE)
> system.time(s1 <- ifelse(C == "a", 1, -1))
   user  system elapsed
   0.37    0.01    0.38
> system.time(s2 <- 2 * (C == "a") - 1)
   user  system elapsed
   0.02    0.00    0.02
> identical(s1, s2)
[1] TRUE
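
[Editor's sketch, not part of the original message.] The comparison above can be rerun locally. The version below also times a named-vector lookup, the recoding idea that appears elsewhere in this thread. The variable names t_ifelse, t_arith, t_lookup and s3 are made up for illustration, and timings will differ by machine and R version, so no expected timings are shown.

```r
# Rerun the comparison; absolute timings depend on hardware and R version.
set.seed(1)
C <- sample(c("a", "b"), 100000, replace = TRUE)

t_ifelse <- system.time(s1 <- ifelse(C == "a", 1, -1))
t_arith  <- system.time(s2 <- 2 * (C == "a") - 1)

# Named-vector lookup; unname() drops the element names so that the
# result can be compared to s1 with identical().
t_lookup <- system.time(s3 <- unname(c(a = 1, b = -1)[C]))

# Show user, system and elapsed time side by side.
print(rbind(ifelse = t_ifelse, arithmetic = t_arith, lookup = t_lookup)[, 1:3])

# All three approaches should produce the same values.
stopifnot(identical(s1, s2), identical(s1, s3))
```

Note that this benchmark codes "a" as 1 and "b" as -1, the reverse of the original question; swapping the two constants gives the original coding.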


(Ted Harding)

2007-Jul-04 15:48 UTC

[Sorry, there were silly typos in the previous version. Corrected below]

> Cat=c('a','a','a','b','b','b','a','a','b')
> Cat=="b"
[1] FALSE FALSE FALSE  TRUE  TRUE  TRUE FALSE FALSE  TRUE
> (Cat=="b") - 0.5
[1] -0.5 -0.5 -0.5  0.5  0.5  0.5 -0.5 -0.5  0.5
> 2*((Cat=="b") - 0.5)
[1] -1 -1 -1  1  1  1 -1 -1  1

to give one example of a way to do it. But you don't say why you
really want to do this. You may really want factors. And what do
you want to see if there is "an arbitrarily large number of
categories"?

For instance:
> factor(Cat,labels=c(-1,1))
[1] -1 -1 -1 1  1  1  -1 -1 1

but this is a "factor" object, not a numeric vector. To get a vector,
you need to convert the factor to its integer codes:
> as.integer(factor(Cat))
[1] 1 1 1 2 2 2 1 1 2

where (unless you've specified otherwise in factor()) the values
will correspond to the elements of Cat in "natural" order, in this
case first "a" (-> 1), then "b" (-> 2).

E.g.
> Cat2<-c("a","a","c","b","a","b")
> as.integer(factor(Cat2))
[1] 1 1 3 2 1 2

so, with C2<-as.integer(factor(Cat2)), you get a vector of distinct
integers (1,2,3) for the distinct levels ("a","b","c") of Cat2.
If you want different integer values for these levels, you can write
a function to change them.
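
[Editor's sketch, not part of the original message.] One possible way to change the values without writing a separate function is to index a vector of target values by the integer codes. The vector "values" and its contents (-1, 0, 1) are hypothetical, chosen only for illustration; they must be listed in the same order as levels(f).

```r
Cat2 <- c("a", "a", "c", "b", "a", "b")
f <- factor(Cat2)        # levels are sorted: "a", "b", "c"

# Hypothetical target values, one per level, in the order of levels(f).
values <- c(-1, 0, 1)

# The integer codes of the factor pick the matching entry for each element.
C2 <- values[as.integer(f)]
C2
```

Here "a" maps to -1, "b" to 0 and "c" to 1, whatever order the categories appear in the data.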

Hoping this helps to break the ice!
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <efh at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 04-Jul-07                                       Time: 16:44:20
------------------------------ XFMail ------------------------------

S Ellison

2007-Jul-04 15:57 UTC

#Given
Cat=c('a','a','a','b','b','b','a','a','b')
# Categorical variable

#and defining 
coding<-array(c(-1,1), dimnames=list(unique(Cat) ))

#(ie an array of values corresponding to your character array levels, and with
# names set to those levels)

coding[Cat]

#does what you want.

Stefan Grosse

2007-Jul-04 16:03 UTC

Gabor Grothendieck wrote:
>> set.seed(1)
>> C <- sample(c("a", "b"), 100000, replace = TRUE)
>> system.time(s1 <- ifelse(C == "a", 1, -1))
>    user  system elapsed
>    0.37    0.01    0.38
>> system.time(s2 <- 2 * (C == "a") - 1)
>    user  system elapsed
>    0.02    0.00    0.02

> system.time(s1 <- ifelse(C == "a", 1, -1))
   user  system elapsed
   0.04    0.01    0.08
> system.time(s2 <- 2 * (C == "a") - 1)
   user  system elapsed
      0       0       0


I am just wondering: how come the user and system times add up to 0.05
while elapsed shows 0.08 on my system? (Vista + R 2.5.1)

Stefan


-=-=-
... Time is an illusion, lunchtime doubly so. (Ford Prefect)

Keith Alan Chamberlain

2007-Jul-04 16:37 UTC

Dear Ted,

You are correct in that factors are probably what I had in mind since I
would be using them as predictors in a regression. I didn't know the syntax
to get R to do the arithmetic.

Many thanks to everyone who replied! 

Sincerely,
KeithC.
Psych Undergrad, CU Boulder (US)
RE McNair Scholar
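
[Editor's sketch, not part of the original message.] If the goal is a -1/1 predictor in a regression, one option is to leave Cat as a factor and attach the coding as a contrast, so that the model matrix carries the numeric column. The contrast name "b_vs_a" and the object names f and X are made up for illustration; this is only one of several ways to set up such a coding.

```r
Cat <- c('a','a','a','b','b','b','a','a','b')
f <- factor(Cat)

# One contrast column: level "a" coded -1, level "b" coded 1.
contrasts(f) <- matrix(c(-1, 1), ncol = 1,
                       dimnames = list(levels(f), "b_vs_a"))

# The model matrix that a call such as lm(y ~ f) would build:
# an intercept column plus the -1/1 column for f.
X <- model.matrix(~ f)
X
```

The second column of X holds the same -1/1 values as the loop in the original question.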

François Pinard

2007-Jul-04 17:51 UTC

[Keith Alan Chamberlain]
>Is there a faster way than below to set a vector based on values
>from another vector? I'd like to call a pre-existing function for
>this, but one which can also handle an arbitrarily large number of
>categories. Any ideas?
For handling an arbitrarily large number of categories, one may go
through a recoding vector, like this for the example above:

> Cat <- c('a', 'a', 'a', 'b', 'b', 'b', 'a', 'a', 'b')
> C1 <- c(a=-1, b=1)[Cat]
> C1
 a  a  a  b  b  b  a  a  b
-1 -1 -1  1  1  1 -1 -1  1
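
[Editor's sketch, not part of the original message.] The same recoding-vector idea extends to more than two categories. The data in Cat3 and the mapping in "recode" are hypothetical; "d" is deliberately left out of the mapping to show what happens to a category that has no entry.

```r
Cat3 <- c("a", "c", "b", "d", "a")

# Hypothetical mapping from category label to numeric value.
recode <- c(a = -1, b = 1, c = 2)

# Index by name; unname() drops the labels from the result.
C3 <- unname(recode[Cat3])
C3

# Categories missing from the mapping come back as NA, so it is worth
# checking for them before using the result.
unmapped <- unique(Cat3[is.na(C3)])
unmapped
```

Checking for NA after the lookup catches misspelled or unexpected categories that would otherwise pass silently into later calculations.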

-- 
François Pinard   http://pinard.progiciels-bpi.ca
