Dear R People:

I have the following data:

> ail.df[,1]
[1] 47677 47602 47678 47905 47909 47906 47605 47673 47607
> cut(ail.df[,1],breaks=3)
[1] (4.76e+04,4.77e+04] (4.76e+04,4.77e+04] (4.76e+04,4.77e+04]
[4] (4.78e+04,4.79e+04] (4.78e+04,4.79e+04] (4.78e+04,4.79e+04]
[7] (4.76e+04,4.77e+04] (4.76e+04,4.77e+04] (4.76e+04,4.77e+04]
Levels: (4.76e+04,4.77e+04] (4.77e+04,4.78e+04] (4.78e+04,4.79e+04]
>

So I have cut ail.df[,1] into 3 levels. However, the second level never appears in the data set.

Is there a way to set cut such that every level appears, please?

Thanks in advance,
Sincerely,
Erin

--
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodgess at gmail.com
You could use quantile() to create the breakpoints:

> x <- c(47677, 47602, 47678, 47905, 47909, 47906, 47605, 47673, 47607)
> cutX <- cut(x, breaks=quantile(x, (0:3)/3), include.lowest=TRUE)
> cutX
[1] (4.77e+04,4.78e+04] [4.76e+04,4.77e+04] (4.77e+04,4.78e+04] (4.78e+04,4.79e+04]
[5] (4.78e+04,4.79e+04] (4.78e+04,4.79e+04] [4.76e+04,4.77e+04] (4.77e+04,4.78e+04]
[9] [4.76e+04,4.77e+04]
Levels: [4.76e+04,4.77e+04] (4.77e+04,4.78e+04] (4.78e+04,4.79e+04]
> table(cutX)
cutX
[4.76e+04,4.77e+04] (4.77e+04,4.78e+04] (4.78e+04,4.79e+04]
                  3                   3                   3

This will fail if there are only 2 distinct values in the dataset.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Erin Hodgess
> Sent: Wednesday, December 07, 2011 7:15 PM
> To: R help
> Subject: [R] a weird "cut" question
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
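The caveat about too few distinct values can be checked before calling cut(). The following is only a sketch: cut_quantile() is a hypothetical helper name, not something in base R, and it assumes that stopping with an error is the desired behaviour when the quantile breakpoints coincide. Note that the check only catches duplicated breakpoints; it does not by itself guarantee that every bin ends up occupied (for example, when there are fewer observations than bins).

```r
# Hypothetical helper (not part of base R): quantile-based cut() that
# checks for duplicated breakpoints before handing them to cut().
cut_quantile <- function(x, n = 3) {
  # n + 1 breakpoints at probabilities 0, 1/n, ..., 1
  brks <- quantile(x, probs = (0:n) / n, names = FALSE)
  if (anyDuplicated(brks) > 0) {
    stop("quantile breakpoints are not unique for n = ", n)
  }
  cut(x, breaks = brks, include.lowest = TRUE)
}

# Data from the original question: each of the three bins gets three values.
x <- c(47677, 47602, 47678, 47905, 47909, 47906, 47605, 47673, 47607)
counts <- table(cut_quantile(x))
print(counts)

# Only two distinct values: the breakpoints collapse and the helper stops.
two_vals <- c(1, 1, 1, 2, 2, 2)
res <- try(cut_quantile(two_vals), silent = TRUE)
print(inherits(res, "try-error"))
```

Without the explicit check, cut() itself would stop with an error about non-unique breaks; the helper just makes the reason visible at the point where the breakpoints are computed.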
You picked a data set and asked whether the function parameters can fix the partition. Perhaps you should consider a different data set: rep(47677,9).

I think you cannot "fix" the partition without also considering the data. Given that, you may be able to go back and manually identify a partition that works with a particular data set (e.g. use quantiles), but you may be asking too much to automate that for all data sets.

---------------------------------------------------------------------------
Jeff Newmiller
DCN: <jdnewmil at dcn.davis.ca.us>
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.
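The rep(47677,9) counterexample can be run directly. This is a small sketch (the variable names are illustrative only) showing that neither the equal-width nor the quantile approach can populate three levels when the data contain a single distinct value.

```r
# Counterexample from the message above: nine identical values.
x_const <- rep(47677, 9)

# Equal-width bins: cut() still builds 3 levels, but all nine values
# necessarily land in the same one.
tab <- table(cut(x_const, breaks = 3))
print(tab)

# Quantile breakpoints: all four breakpoints are identical, so cut()
# refuses them. tryCatch() captures the error message instead of stopping.
brks <- quantile(x_const, probs = (0:3) / 3)
res <- tryCatch(
  cut(x_const, breaks = brks, include.lowest = TRUE),
  error = function(e) conditionMessage(e)
)
print(res)
```

Either way, the number of occupied levels is bounded by the number of distinct values in the data, which is the point being made: no choice of cut() arguments alone can guarantee that every level appears.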