Jose Claudio Faria
2010-Apr-21 14:24 UTC
[R] Help: formatting the result of 'cut' function
Dear list,

I would like to format the result of the 'cut' function so that a subsequent
frequency distribution table (fdt) is suitable for publication.
Below is a reproducible example:

set.seed(1)
x <- c(rnorm(1e3, mean=10, sd=1), 50, 100)

start <- 0
end   <- 110
h     <- 10

c1 <- cut(x, breaks=seq(start, end, h), right=TRUE)
levels(c1)
# I get:
# [1] "(0,10]"    "(10,20]"   "(20,30]"   "(30,40]"
# [5] "(40,50]"   "(50,60]"   "(60,70]"   "(70,80]"
# [9] "(80,90]"   "(90,100]"  "(100,110]"

# I need (note the zero-padded digits and the space after the comma):
# [1] "(000, 010]"  "(010, 020]"  "(020, 030]"  "(030, 040]"
# [5] "(040, 050]"  "(050, 060]"  "(060, 070]"  "(070, 080]"
# [9] "(080, 090]"  "(090, 100]"  "(100, 110]"

c2 <- cut(x, breaks=seq(start, end, h), right=FALSE)
levels(c2)
# I get:
# [1] "[0,10)"    "[10,20)"   "[20,30)"   "[30,40)"
# [5] "[40,50)"   "[50,60)"   "[60,70)"   "[70,80)"
# [9] "[80,90)"   "[90,100)"  "[100,110)"

# I need (note the zero-padded digits and the space after the comma):
# [1] "[000, 010)"  "[010, 020)"  "[020, 030)"  "[030, 040)"
# [5] "[040, 050)"  "[050, 060)"  "[060, 070)"  "[070, 080)"
# [9] "[080, 090)"  "[090, 100)"  "[100, 110)"

# Making the fdt:
table(c1)
# I get:
# c1
#    (0,10]   (10,20]   (20,30]   (30,40]   (40,50]   (50,60]
#       518       482         0         0         1         0
#   (60,70]   (70,80]   (80,90]  (90,100] (100,110]
#         0         0         0         1         0

# I need (note the zero-padded digits and the space after the comma):
# c1
# (000, 010] (010, 020] (020, 030] (030, 040] (040, 050] (050, 060]
#        518        482          0          0          1          0
# (060, 070] (070, 080] (080, 090] (090, 100] (100, 110]
#          0          0          0          1          0

table(c2)
# I get:
# c2
#    [0,10)   [10,20)   [20,30)   [30,40)   [40,50)   [50,60)
#       518       482         0         0         0         1
#   [60,70)   [70,80)   [80,90)  [90,100) [100,110)
#         0         0         0         0         1

# I need (note the zero-padded digits and the space after the comma):
# c2
# [000, 010) [010, 020) [020, 030) [030, 040) [040, 050) [050, 060)
#        518        482          0          0          0          1
# [060, 070) [070, 080) [080, 090) [090, 100) [100, 110)
#          0          0          0          0          1

Is it possible? Any tip will be welcome!

Thanks in advance,
--
Jose Claudio Faria
Estatistica - prof. Titular
UESC/DCET/Brasil
joseclaudio.faria at gmail.com
Gabor Grothendieck
2010-Apr-21 16:15 UTC
[R] Help: formatting the result of 'cut' function
gsubfn is like gsub, except that instead of a replacement string it takes a
replacement function: the function's input is the string matched by the
regular expression, and its output replaces the match. The replacement
function can optionally be specified as a formula, as we do here. If the
formula has no left-hand side, the function's arguments are taken to be the
free variables on the right-hand side; here that is just x. Finally, we use
sub to replace the comma in each level with a comma followed by a space.
See http://gsubfn.googlecode.com for more.

library(gsubfn)

# zero-pad every run of digits in the level labels to three places
levels(c1) <- gsubfn("\\d+", ~ sprintf("%03d", as.numeric(x)), levels(c1))

# insert a space after the comma
levels(c1) <- sub(",", ", ", levels(c1))
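For readers who do not have gsubfn installed, the same match-and-replace idea can be sketched in base R with gregexpr and regmatches. This is only a sketch: the helper name pad_levels and its width argument are hypothetical and not part of the thread, and it assumes the breaks are non-negative whole numbers, as in the example (a label such as "(0.5,1]" would have each digit run padded separately).

```r
# Hypothetical helper (not from the thread): zero-pad every run of digits
# in a vector of interval labels and put a space after the comma.
# Assumes non-negative integer breaks.
pad_levels <- function(lv, width = 3) {
  fmt <- paste0("%0", width, "d")            # e.g. "%03d"
  m <- gregexpr("[0-9]+", lv)                # locate each run of digits
  regmatches(lv, m) <- lapply(regmatches(lv, m),
                              function(d) sprintf(fmt, as.integer(d)))
  sub(",", ", ", lv, fixed = TRUE)           # "a,b" -> "a, b"
}

set.seed(1)
x  <- c(rnorm(1e3, mean = 10, sd = 1), 50, 100)
c1 <- cut(x, breaks = seq(0, 110, 10), right = TRUE)
c2 <- cut(x, breaks = seq(0, 110, 10), right = FALSE)

# The same helper works for both closures because it only touches digits
# and the comma, never the brackets.
levels(c1) <- pad_levels(levels(c1))
levels(c2) <- pad_levels(levels(c2))
levels(c1)
levels(c2)
```

Because only the level labels are rewritten, table(c1) and table(c2) keep the same counts and simply print with the new labels.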
Jim Holtman
[R] Help: formatting the result of 'cut' function
Try this:

> set.seed(1)
> x <- c(rnorm(1e3, mean=10, sd=1), 50, 100)
>
> start <- 0
> end   <- 110
> h     <- 10
>
> c1 <- cut(x, breaks=seq(start, end, h), right=TRUE)
> levels(c1)
 [1] "(0,10]"    "(10,20]"   "(20,30]"   "(30,40]"   "(40,50]"   "(50,60]"   "(60,70]"   "(70,80]"
 [9] "(80,90]"   "(90,100]"  "(100,110]"
> # split each level at the comma
> x.str <- strsplit(levels(c1), ',')
> # reformat the numbers
> x.fmt <- lapply(x.str, function(.vals){
+     .vals[1L] <- sprintf("(%03d", as.integer(substring(.vals[1L], 2)))
+     .vals[2L] <- sprintf("%03d]", as.integer(substring(.vals[2L], 1, nchar(.vals[2L]) - 1)))
+     paste(.vals, collapse=',')
+ })
> levels(c1) <- unlist(x.fmt)
> levels(c1)
 [1] "(000,010]" "(010,020]" "(020,030]" "(030,040]" "(040,050]" "(050,060]" "(060,070]" "(070,080]"
 [9] "(080,090]" "(090,100]" "(100,110]"

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
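The reply above hard-codes the "(" and "]" brackets, so it covers right=TRUE only, and it joins the two numbers without the requested space. A different route, not proposed in the thread, is to skip the post-processing and hand cut() ready-made labels through its labels argument. The sketch below assumes whole-number breaks; the helper name cut_padded and its width argument are hypothetical.

```r
# Hypothetical helper (not from the thread): build zero-padded interval
# labels directly from the breaks and pass them to cut() via 'labels'.
# Assumes the breaks are whole numbers.
cut_padded <- function(x, breaks, right = TRUE, width = 3) {
  lo  <- as.integer(breaks[-length(breaks)])   # lower bound of each interval
  hi  <- as.integer(breaks[-1])                # upper bound of each interval
  num <- paste0("%0", width, "d")              # e.g. "%03d"
  fmt <- if (right) {
    paste0("(", num, ", ", num, "]")           # "(000, 010]"
  } else {
    paste0("[", num, ", ", num, ")")           # "[000, 010)"
  }
  cut(x, breaks = breaks, labels = sprintf(fmt, lo, hi), right = right)
}

set.seed(1)
x  <- c(rnorm(1e3, mean = 10, sd = 1), 50, 100)
br <- seq(0, 110, 10)

c1 <- cut_padded(x, br, right = TRUE)
c2 <- cut_padded(x, br, right = FALSE)
table(c1)
table(c2)
```

The design trade-off is that the labels are derived from the breaks rather than from whatever cut() would have printed, so the bracket style has to be kept in step with the right argument by hand, as the if/else above does.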