Dear R People:
I have a question about a "sorting" problem, please.
I have a vector xx:
> xx
 [1] -2.0  1.4 -1.2 -2.2  0.4  1.5 -2.2  0.2 -0.4 -0.9
and a vector of breaks:
> xx.y
[1] -2.2000000 -0.9666667  0.2666667  1.5000000
I want to produce another vector z which contains the number of the class
that each data point is in.
for instance, xx[1] is between xx.y[1] and xx.y[2], so z[1] == 1
this can be accomplished via loops, but I was wondering if there is a more
efficient method, please.
By the way, eventually, there will be many more data points and more
classes.
thank you for any help!
sincerely,
Erin Hodgess
mailto: hodgesse at uhd.edu
Version 1.7.0 R for Windows
Erin Hodgess wrote:
> I want to produce another vector z which contains the number of the class
> that each data point is in.
>
> for instance, xx[1] is between xx.y[1] and xx.y[2], so z[1] == 1
>
I think what you're looking for is ?cut:
R> xx = c(-2.0, 1.4, -1.2, -2.2, 0.4, 1.5, -2.2, 0.2, -0.4, -0.9)
R> cut(xx, breaks = c(-Inf, -2.2, -0.97, 0.27, 1.5, Inf))
 [1] (-2.2,-0.97] (0.27,1.5]   (-2.2,-0.97] (-Inf,-2.2]  (0.27,1.5]
 [6] (0.27,1.5]   (-Inf,-2.2]  (-0.97,0.27] (-0.97,0.27] (-0.97,0.27]
Levels: (-Inf,-2.2] (-2.2,-0.97] (-0.97,0.27] (0.27,1.5] (1.5,Inf]
R>
Regards,
Sundar
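For the integer class numbers asked for in the original post, rather than interval labels, cut() can return integer codes directly. The following is a minimal sketch, assuming the breaks are used exactly as given and that a value equal to the lowest break should fall in the first class (hence include.lowest = TRUE). The name z is taken from the question.

```r
xx   <- c(-2.0, 1.4, -1.2, -2.2, 0.4, 1.5, -2.2, 0.2, -0.4, -0.9)
xx.y <- c(-2.2, -0.9666667, 0.2666667, 1.5)

# labels = FALSE returns integer codes instead of a factor;
# include.lowest = TRUE puts values equal to the first break in class 1
z <- cut(xx, breaks = xx.y, labels = FALSE, include.lowest = TRUE)
z
# [1] 1 3 1 1 3 3 1 2 2 2
```

Values outside the range of the breaks come back as NA with this call, so the -Inf and Inf end points shown above are only needed if such values can occur.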
t1 <- outer(data, breaks + c(rep(0, length(breaks)-1), 1e-5), "<")
apply(t1, 1, function(x){min(which(x))}) - 1
Adding a small amount to the final break point makes sure that every data
point, including one equal to the last break, will be less than some break
point.
So:
> xx <- c(-2, 1.4, -1.2, -2.2, .4, 1.5, -2.2, 0.2, -.4, -.9)
> xx.y <- c(-2.2, -0.967, 0.2667, 1.5)
> t1 <- outer(xx, xx.y + c(rep(0, length(xx.y)-1), 1), "<")
> apply(t1, 1, function(x){min(which(x))}) - 1
 [1] 1 3 1 1 3 3 1 2 2 2
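The same logical-matrix idea can also be written without apply(), by counting for each data point how many lower class edges it has reached. This is a sketch of a variant, not the code posted above; it assumes each class is closed on the left and that the last break serves only as an upper bound. The name lower is introduced here for illustration.

```r
xx   <- c(-2, 1.4, -1.2, -2.2, .4, 1.5, -2.2, 0.2, -.4, -.9)
xx.y <- c(-2.2, -0.967, 0.2667, 1.5)

# Drop the last break: it is only the upper edge of the final class
lower <- xx.y[-length(xx.y)]

# Row i of the matrix flags which lower edges xx[i] has reached;
# the number of TRUEs in that row is the class number
z <- rowSums(outer(xx, lower, ">="))
z
# [1] 1 3 1 1 3 3 1 2 2 2
```

Like the outer()/apply() version, this builds a matrix with one cell per data point and break, so memory use grows with the product of the two lengths.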
Hope this helps,
Matt
-----Original Message-----
From: Erin Hodgess [mailto:hodgess at uhddx01.dt.uh.edu]
Sent: Thursday, June 12, 2003 2:34 PM
To: r-help at stat.math.ethz.ch
Subject: [R] breaks
I want to produce another vector z which contains the number of the class
that each data point is in.
for instance, xx[1] is between xx.y[1] and xx.y[2], so z[1] == 1
______________________________________________
R-help at stat.math.ethz.ch mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Erin, even though you've already summarized,
I think the optimal answer to your question is
findInterval()
{there's also an R-C API for it that you can use from your own C/C++ code}
Martin
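As a rough illustration of the findInterval() suggestion on the data from the original post (a sketch, assuming a value equal to the last break should count as part of the last class, which is what rightmost.closed = TRUE does):

```r
xx   <- c(-2.0, 1.4, -1.2, -2.2, 0.4, 1.5, -2.2, 0.2, -0.4, -0.9)
xx.y <- c(-2.2, -0.9666667, 0.2666667, 1.5)

# findInterval() expects the breaks in non-decreasing order.
# rightmost.closed = TRUE puts a value equal to the last break
# (1.5 here) into the last class instead of one past it.
z <- findInterval(xx, xx.y, rightmost.closed = TRUE)
z
# [1] 1 3 1 1 3 3 1 2 2 2
```

Note that findInterval() treats classes as closed on the left, whereas cut() by default closes them on the right, so the two can differ for a data point that equals an interior break exactly.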
>>>>> "Erin" == Erin Hodgess <hodgess at uhddx01.dt.uh.edu>
>>>>> on Thu, 12 Jun 2003 13:33:52 -0500 (CDT) writes:
Erin> I want to produce another vector z which contains the
Erin> number of the class that each data point is in.
Erin> for instance, xx[1] is between xx.y[1] and xx.y[2], so
Erin> z[1] == 1