Dear R People:

I have a question about a "sorting" problem, please.

I have a vector xx:

> xx
 [1] -2.0  1.4 -1.2 -2.2  0.4  1.5 -2.2  0.2 -0.4 -0.9

and a vector of breaks:

> xx.y
[1] -2.2000000 -0.9666667  0.2666667  1.5000000

I want to produce another vector z which contains the number of the class that each data point is in.

For instance, xx[1] is between xx.y[1] and xx.y[2], so z[1] == 1.

This can be accomplished via loops, but I was wondering if there is a more efficient method, please.

By the way, eventually, there will be many more data points and more classes.

Thank you for any help!

Sincerely,
Erin Hodgess
mailto: hodgesse at uhd.edu

Version 1.7.0
R for Windows

Erin Hodgess wrote:
> I want to produce another vector z which contains the number of the class
> that each data point is in.
>
> for instance, xx[1] is between xx.y[1] and xx.y[2], so z[1] == 1

I think what you're looking for is ?cut:

R> xx = c(-2.0, 1.4, -1.2, -2.2, 0.4, 1.5, -2.2, 0.2, -0.4, -0.9)
R> cut(xx, breaks = c(-Inf, -2.2, -0.97, 0.27, 1.5, Inf))
 [1] (-2.2,-0.97] (0.27,1.5]   (-2.2,-0.97] (-Inf,-2.2]  (0.27,1.5]
 [6] (0.27,1.5]   (-Inf,-2.2]  (-0.97,0.27] (-0.97,0.27] (-0.97,0.27]
Levels: (-Inf,-2.2] (-2.2,-0.97] (-0.97,0.27] (0.27,1.5] (1.5,Inf]

Regards,
Sundar
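
If integer class numbers are wanted rather than factor labels, cut() can return them directly. The following is only a sketch, assuming the break points from the original post are used unchanged and that the smallest value should land in class 1. Note that cut() treats intervals as right-closed by default, so a value lying exactly on an interior break is assigned to the lower class.

```r
xx   <- c(-2.0, 1.4, -1.2, -2.2, 0.4, 1.5, -2.2, 0.2, -0.4, -0.9)
xx.y <- c(-2.2, -0.9666667, 0.2666667, 1.5)

# labels = FALSE returns integer codes instead of a factor.
# include.lowest = TRUE closes the first interval on the left, so the
# minimum (-2.2) is assigned to class 1 instead of NA.
z <- cut(xx, breaks = xx.y, labels = FALSE, include.lowest = TRUE)
z
```

Values outside the range of the breaks come back as NA with this call, which may or may not be what is wanted for larger data sets.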

t1 <- outer(data, breaks + c(rep(0, length(breaks)-1), 1e-5), "<")
apply(t1, 1, function(x){min(which(x))}) - 1

Adding a small positive amount (1e-5 above, 1 in the example below) to the final break point makes sure that every data point will be less than some break point. So:

> xx <- c(-2, 1.4, -1.2, -2.2, .4, 1.5, -2.2, 0.2, -.4, -.9)
> xx.y <- c(-2.2, -0.967, 0.2667, 1.5)
> t1 <- outer(xx, xx.y + c(rep(0, length(xx.y)-1), 1), "<")
> apply(t1, 1, function(x){min(which(x))}) - 1
 [1] 1 3 1 1 3 3 1 2 2 2

Hope this helps,
Matt

-----Original Message-----
From: Erin Hodgess [mailto:hodgess at uhddx01.dt.uh.edu]
Sent: Thursday, June 12, 2003 2:34 PM
To: r-help at stat.math.ethz.ch
Subject: [R] breaks

> I want to produce another vector z which contains the number of the class
> that each data point is in.

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

Erin,

even though you've already summarized, I think the optimal answer to your question is findInterval() {there's also an R-C API you can use from your C/C++ code}.

Martin

>>>>> "Erin" == Erin Hodgess <hodgess at uhddx01.dt.uh.edu>
>>>>>     on Thu, 12 Jun 2003 13:33:52 -0500 (CDT) writes:

    Erin> I want to produce another vector z which contains the
    Erin> number of the class that each data point is in.
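
For reference, a minimal sketch of the findInterval() suggestion applied to the data from the original post. It assumes the breaks are sorted in increasing order (findInterval() requires this) and that the maximum value should belong to the last class rather than start a new one, hence rightmost.closed = TRUE.

```r
xx   <- c(-2.0, 1.4, -1.2, -2.2, 0.4, 1.5, -2.2, 0.2, -0.4, -0.9)
xx.y <- c(-2.2, -0.9666667, 0.2666667, 1.5)

# findInterval() uses left-closed intervals [xx.y[i], xx.y[i+1]).
# rightmost.closed = TRUE treats the last interval as closed on the right,
# so the value 1.5 is counted in class 3 instead of class 4.
z <- findInterval(xx, xx.y, rightmost.closed = TRUE)
z
```

Unlike cut(), findInterval() does not return NA for out-of-range values: anything below the first break gets 0 and anything above the last break gets length(xx.y). It also differs in which side of each interval is closed, so values lying exactly on an interior break are assigned to the upper class.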