thr3ads.net - R help - [R] Vector indexing question [Mar 2007]

Paul Lynch

2007-Mar-29 23:55 UTC

[R] Vector indexing question

Suppose you have 4 related vectors:

a.id<-c(1:25, 1:25, 1:25)
a.vals <- c(101:175)        # same length as a.id (the values for those IDs)
a.id.levels <- c(1:25)
a.id.ratings <- rep(letters[1:5], times=5)    # same length as a.id.levels

What I would like to do is specify a rating from a.id.ratings (e.g. "e"),
get the vector of corresponding IDs from a.id.levels (via
a.id.levels[a.id.ratings=='e']) and then somehow use those IDs in a.id
to get the corresponding values from a.vals.

I think I can probably write a loop to construct a vector of
ratings of the same length as a.id so that the ratings match the ID,
and then go from there.  Is there a better way?  Perhaps using factors
or levels or something?

Thanks,
      --Paul

-- 
Paul Lynch
Aquilent, Inc.
National Library of Medicine (Contractor)

Adaikalavan Ramasamy

2007-Mar-30 00:39 UTC

Sounds like you have two different tables and are trying to mine one 
based on the other. Try

ref <- data.frame( levels  = 1:25,
                    ratings = rep(letters[1:5], times=5) )

db <- data.frame( vals=101:175, levels=c(1:25, 1:25, 1:25) )

levels.of.interest <- ref$levels[ ref$ratings=="a" ]
db$vals[ which(db$levels %in% levels.of.interest) ]

  [1] 101 106 111 116 121 126 131 136 141 146 151 156 161 166 171


Or, more intuitively, merge the two tables and proceed as follows:

out <- merge( db, ref, by="levels", all.x=TRUE )
out <- out[ order(out$vals), ] # little cleanup
subset( out, ratings=="a" )   # ignore the rownames

    levels vals ratings
1       1  101       a
16      6  106       a
31     11  111       a
46     16  116       a
61     21  121       a
3       1  126       a
17      6  131       a
32     11  136       a
47     16  141       a
62     21  146       a
2       1  151       a
18      6  156       a
33     11  161       a
48     16  166       a
63     21  171       a

Then you can do cool things using the apply() family like
   tapply( out$vals, out$ratings, mean )
     a   b   c   d   e
   136 137 138 139 140

Check out the help pages for %in%, merge and the apply() family (e.g. ?tapply).
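
Applied directly to the four vectors from the original question, with no data frames, the %in% approach can be sketched as follows. The names ids.of.interest and e.vals are made up for illustration and do not appear in the thread.

```r
# Vectors as defined in the original question
a.id         <- c(1:25, 1:25, 1:25)
a.vals       <- 101:175
a.id.levels  <- 1:25
a.id.ratings <- rep(letters[1:5], times = 5)

# Step 1: IDs that carry the rating of interest
ids.of.interest <- a.id.levels[a.id.ratings == "e"]

# Step 2: values whose ID is one of those IDs
e.vals <- a.vals[a.id %in% ids.of.interest]
e.vals
```

This keeps the two-step logic of the question (rating to IDs, IDs to values) without building a per-row vector of ratings.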

Regards, Adai




Marc Schwartz

2007-Mar-30 00:51 UTC

On Thu, 2007-03-29 at 19:55 -0400, Paul Lynch wrote:
> Suppose you have 4 related vectors:
> 
> a.id<-c(1:25, 1:25, 1:25)
> a.vals <- c(101:175)        # same length as a.id (the values for those IDs)
> a.id.levels <- c(1:25)
> a.id.ratings <- rep(letters[1:5], times=5)    # same length as a.id.levels
> 
> What I would like to do is specify a rating from a.id.ratings (e.g. "e"),
> get the vector of corresponding IDs from a.id.levels (via
> a.id.levels[a.id.ratings=='e']) and then somehow use those IDs in a.id
> to get the corresponding values from a.vals.
> 
> I think I can probably write a loop to construct a vector of
> ratings of the same length as a.id so that the ratings match the ID,
> and then go from there.  Is there a better way?  Perhaps using factors
> or levels or something?
> 
> Thanks,
>       --Paul

Is this what you want?

> DF <- data.frame(a.id, a.vals, a.id.levels, a.id.ratings)
> DF[DF$a.id.ratings == "e", "a.vals"]
 [1] 105 110 115 120 125 130 135 140 145 150 155 160 165 170 175

or

> subset(DF, a.id.ratings == "e", select = a.vals)
   a.vals
5     105
10    110
15    115
20    120
25    125
30    130
35    135
40    140
45    145
50    150
55    155
60    160
65    165
70    170
75    175

See ?subset
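
Note that the data.frame() call above lines up only because data.frame() recycles the two 25-element vectors across the 75 rows, and a.id happens to repeat 1:25 in order. A minimal sketch, assuming the IDs might arrive in some other order (the shuffled vectors and the column name rating are hypothetical, not from the thread), looks up each row's rating by ID with match() before building the data frame:

```r
a.id.levels  <- 1:25
a.id.ratings <- rep(letters[1:5], times = 5)

# Hypothetical variation: same ID/value pairs as in the thread, but shuffled
set.seed(1)
idx    <- sample(75)
a.id   <- c(1:25, 1:25, 1:25)[idx]
a.vals <- (101:175)[idx]

# Look up each row's rating by its ID instead of relying on recycling
DF <- data.frame(a.id, a.vals,
                 rating = a.id.ratings[match(a.id, a.id.levels)])

# Same 15 values as before, whatever the row order
e.vals <- sort(DF[DF$rating == "e", "a.vals"])
e.vals
```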

HTH,

Marc Schwartz

Charles C. Berry

2007-Mar-30 03:55 UTC

On Thu, 29 Mar 2007, Paul Lynch wrote:
> Suppose you have 4 related vectors:
>
> a.id<-c(1:25, 1:25, 1:25)
> a.vals <- c(101:175)        # same length as a.id (the values for those IDs)
> a.id.levels <- c(1:25)
> a.id.ratings <- rep(letters[1:5], times=5)    # same length as a.id.levels
>
> What I would like to do is specify a rating from a.id.ratings (e.g. "e"),
> get the vector of corresponding IDs from a.id.levels (via
> a.id.levels[a.id.ratings=='e']) and then somehow use those IDs in a.id
> to get the corresponding values from a.vals.
see

 	?factor
 	?match (in case a.id.levels does not actually index a.id.ratings)
 	?split

> a.ratings.factor <- factor( a.id.ratings[ match(a.id, a.id.levels) ])
> a.vals[ a.ratings.factor == 'e' ]
 [1] 105 110 115 120 125 130 135 140 145 150 155 160 165 170 175
> split( a.vals, a.ratings.factor ) # more generally
$a
 [1] 101 106 111 116 121 126 131 136 141 146 151 156 161 166 171

$b
 [1] 102 107 112 117 122 127 132 137 142 147 152 157 162 167 172
[output truncated]

> lm( a.vals ~ a.ratings.factor - 1 ) # means of a.vals
Call:
lm(formula = a.vals ~ a.ratings.factor - 1)

Coefficients:
a.ratings.factora  a.ratings.factorb  a.ratings.factorc  a.ratings.factord  a.ratings.factore
              136                137                138                139                140
>
> I think I can probably write a loop to construct a vector of
> ratings of the same length as a.id so that the ratings match the ID,
> and then go from there.  Is there a better way?  Perhaps using factors
> or levels or something?
A warning: using factor() in this way

 	a.ratings.factor <- factor( a.id, levels=a.id.levels, labels=a.id.ratings )

will work in this case:

 	a.vals[ a.ratings.factor == 'e' ]

but will generally get you into trouble, as it creates a factor with 25
non-unique levels. So

 	split( a.vals, a.ratings.factor )

ends up giving a list of 25 (non-uniquely labelled) components.
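
A minimal sketch of the safer construction, reusing the vectors from the question: look the ratings up with match() first, then let factor() derive its levels from the result. How factor() treats duplicated labels may differ between R versions, so the sketch does not rely on it. The names ratings.by.row and groups are illustrative only.

```r
a.id         <- c(1:25, 1:25, 1:25)
a.vals       <- 101:175
a.id.levels  <- 1:25
a.id.ratings <- rep(letters[1:5], times = 5)

# One rating per element of a.id, looked up by ID
ratings.by.row <- a.id.ratings[match(a.id, a.id.levels)]

# factor() derives the unique levels from the data itself
a.ratings.factor <- factor(ratings.by.row)
nlevels(a.ratings.factor)

# One list component per rating, then per-rating means
groups <- split(a.vals, a.ratings.factor)
names(groups)
sapply(groups, mean)
```

With this construction split() returns one component per distinct rating, and the per-rating means agree with the tapply() and lm() results shown earlier in the thread.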

HTH,

Chuck
Charles C. Berry                        (858) 534-2098
                                          Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu	         UC San Diego
http://biostat.ucsd.edu/~cberry/         La Jolla, San Diego 92093-0901
