thr3ads.net - R help - [R] Help: formatting the result of 'cut' function [Apr 2010]


Jose Claudio Faria

2010-Apr-21 14:24 UTC

[R] Help: formatting the result of 'cut' function

Dear list,

I would like to format the result of the 'cut' function so that I can then build a
frequency distribution table (fdt) suitable for publication.
Below is a reproducible example:

set.seed(1)
x <- c(rnorm(1e3, mean=10, sd=1), 50, 100)

start <- 0
end   <- 110
h     <- 10

c1 <- cut(x, br=seq(start, end, h), right=TRUE)
levels(c1)
# I get:
# [1] "(0,10]"    "(10,20]"   "(20,30]"   "(30,40]"
# [5] "(40,50]"   "(50,60]"   "(60,70]"   "(70,80]"
# [9] "(80,90]"   "(90,100]"  "(100,110]"

# I need (observe digits and space after the comma):
# [1] "(000, 010]"  "(010, 020]"  "(020, 030]"  "(030, 040]"
# [5] "(040, 050]"  "(050, 060]"  "(060, 070]"  "(070, 080]"
# [9] "(080, 090]"  "(090, 100]"  "(100, 110]"

c2 <- cut(x, br=seq(start, end, h), right=FALSE)
levels(c2)
# I get:
# [1] "[0,10)"    "[10,20)"   "[20,30)"   "[30,40)"
# [5] "[40,50)"   "[50,60)"   "[60,70)"   "[70,80)"
# [9] "[80,90)"   "[90,100)"  "[100,110)"

# I need (observe digits and space after the comma):
# [1] "[000, 010)"  "[010, 020)"  "[020, 030)"  "[030, 040)"
# [5] "[040, 050)"  "[050, 060)"  "[060, 070)"  "[070, 080)"
# [9] "[080, 090)"  "[090, 100)"  "[100, 110)"

# Making fdt:
table(c1)
# I get:
# c1
#    (0,10]   (10,20]   (20,30]   (30,40]   (40,50]   (50,60]
#       518       482         0         0         1         0
#   (60,70]   (70,80]   (80,90]  (90,100] (100,110]
#         0         0         0         1         0

# I need (observe digits and space after the comma):
# c1
#  (000, 010]  (010, 020]  (020, 030]  (030, 040]  (040, 050]  (050, 060]
#         518         482           0           0           1           0
#  (060, 070]  (070, 080]  (080, 090]  (090, 100]  (100, 110]
#           0           0           0           1           0

table(c2)
# I get:
# c2
#    [0,10)   [10,20)   [20,30)   [30,40)   [40,50)   [50,60)
#       518       482         0         0         0         1
#   [60,70)   [70,80)   [80,90)  [90,100) [100,110)
#         0         0         0         0         1

# I need (observe digits and space after the comma):
# c2
#  [000, 010)  [010, 020)  [020, 030)  [030, 040)  [040, 050)  [050, 060)
#         518         482           0           0           0           1
#  [060, 070)  [070, 080)  [080, 090)  [090, 100)  [100, 110)
#           0           0           0           0           1


Is it possible? Any tip will be welcome!

Thanks in advance,
-- 
///\\\///\\\///\\\///\\\///\\\///\\\///\\\///\\\
Jose Claudio Faria
Estatistica - prof. Titular
UESC/DCET/Brasil
joseclaudio.faria at gmail.com
///\\\///\\\///\\\///\\\///\\\///\\\///\\\///\\\

Gabor Grothendieck

2010-Apr-21 16:15 UTC


gsubfn is like gsub, except that instead of a replacement string it takes a
replacement function: the function receives the string matched by the regular
expression, and its return value replaces the match. The replacement
function can optionally be specified as a formula, as we do here. If
the formula has no left-hand side, the function's arguments are taken
to be the free variables on the right-hand side, here just x. Finally,
we use sub to replace the comma in each label with a comma followed by
a space. See http://gsubfn.googlecode.com for more.

library(gsubfn)
levels(c1) <- gsubfn("\\d+", ~ sprintf("%03d", as.numeric(x)), levels(c1))
levels(c1) <- sub(",", ", ", levels(c1))
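
For readers without gsubfn installed, roughly the same effect can be sketched in base R with gregexpr and the replacement form of regmatches. This is an alternative technique, not the gsubfn code above. The name pad_labels is a hypothetical helper, and the sketch assumes the labels contain only non-negative integer break points (decimal or scientific-notation labels would need a different pattern).

```r
# Base-R sketch: zero-pad every run of digits in a vector of interval labels.
# pad_labels is a hypothetical helper name, not part of any package.
pad_labels <- function(labs, width = 3) {
  fmt <- paste0("%0", width, "d")          # e.g. "%03d"
  m   <- gregexpr("[0-9]+", labs)          # locate each run of digits
  regmatches(labs, m) <- lapply(
    regmatches(labs, m),
    function(d) sprintf(fmt, as.integer(d))
  )
  sub(",", ", ", labs, fixed = TRUE)       # add a space after the comma
}

labs <- c("(0,10]", "(10,20]", "(100,110]", "[90,100)")
pad_labels(labs)
# [1] "(000, 010]" "(010, 020]" "(100, 110]" "[090, 100)"
```

Applied to the factors in this thread, that would read levels(c1) <- pad_labels(levels(c1)), and likewise for c2.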



jim holtman

2010-Apr-21 16:15 UTC


Try this:

> set.seed(1)
> x <- c(rnorm(1e3, mean=10, sd=1), 50, 100)
>
> start <- 0
> end   <- 110
> h     <- 10
>
> c1 <- cut(x, br=seq(start, end, h), right=TRUE)
> levels(c1)
 [1] "(0,10]"    "(10,20]"   "(20,30]"   "(30,40]"   "(40,50]"   "(50,60]"   "(60,70]"   "(70,80]"
 [9] "(80,90]"   "(90,100]"  "(100,110]"
> # I get:
> # [1] "(0,10]"    "(10,20]"   "(20,30]"   "(30,40]"
> # [5] "(40,50]"   "(50,60]"   "(60,70]"   "(70,80]"
> # [9] "(80,90]"   "(90,100]"  "(100,110]"
> x.str <- strsplit(levels(c1), ',')
> # reformat the numbers
> x.fmt <- lapply(x.str, function(.vals){
+     .vals[1L] <- sprintf("(%03d", as.integer(substring(.vals[1L], 2)))
+     .vals[2L] <- sprintf("%03d]", as.integer(substring(.vals[2L], 1, nchar(.vals[2L]) - 1)))
+     paste(.vals, collapse=',')
+ })
> levels(c1) <- unlist(x.fmt)
> levels(c1)
 [1] "(000,010]" "(010,020]" "(020,030]" "(030,040]" "(040,050]" "(050,060]" "(060,070]" "(070,080]"
 [9] "(080,090]" "(090,100]" "(100,110]"
>

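The transcript above yields zero-padded labels, but without the space after the comma that the original post asked for, and it covers only the right=TRUE form. Another route, sketched here under stated assumptions rather than taken from the thread, is to build the labels from the break points and pass them to cut() through its labels argument. The name make_labels is a hypothetical helper. It assumes integer-valued breaks and does not account for include.lowest.

```r
# Sketch: build padded interval labels from the breaks, then give them to cut().
# make_labels is a hypothetical helper; it assumes integer-valued break points.
make_labels <- function(breaks, right = TRUE, width = 3) {
  fmt   <- paste0("%0", width, "d")                    # e.g. "%03d"
  n     <- length(breaks)
  lower <- sprintf(fmt, as.integer(breaks[-n]))        # left end of each interval
  upper <- sprintf(fmt, as.integer(breaks[-1]))        # right end of each interval
  if (right) {
    paste0("(", lower, ", ", upper, "]")
  } else {
    paste0("[", lower, ", ", upper, ")")
  }
}

set.seed(1)
x   <- c(rnorm(1e3, mean = 10, sd = 1), 50, 100)
brk <- seq(0, 110, 10)

c1 <- cut(x, breaks = brk, right = TRUE,  labels = make_labels(brk, right = TRUE))
c2 <- cut(x, breaks = brk, right = FALSE, labels = make_labels(brk, right = FALSE))

levels(c2)[1:2]
# [1] "[000, 010)" "[010, 020)"
table(c1)
```

Because the labels are set when the factor is created, table(c1) and table(c2) print with the padded headings directly and no later edit of levels() is needed.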


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

