thr3ads.net - R help - [R] Creating a vector of categories [Mar 2010]


Christoffer Karlsson

2010-Mar-26 09:41 UTC

[R] Creating a vector of categories

Hi,

I have a column in a data frame looking something like:

$sex $language $count
male  english  0
male  english  0
female  english  32
male  spanish  154
female  english  11
female  norweigan 7

and so on.
What I want to do is to group these into categories, for instance one
category where count>=0 & count<10 and so on.

I want my data to turn out looking something like:

male english 0-10 1324
male english 11-20 756
.....
male spanish 0-10 354
...
female english 0-10 1557
...

and so on, where the right hand is the count of the number of people in each
category.
Up until now I've been subsetting the data frame into each category, and
then counting number of rows in each subset. However I now have a large
amount of different factor combinations which makes this process tedious.

Any help would be appreciated!
Chris


Jim Lemon

2010-Mar-26 11:02 UTC


[R] Creating a vector of categories

On 03/26/2010 08:41 PM, Christoffer Karlsson wrote:
> Hi,
>
> I have a column in a data frame looking something like:
>
> $sex $language $count
> male  english  0
> male  english  0
> female  english  32
> male  spanish  154
> female  english  11
> female  norweigan 7
>
> and so on.
> What I want to do is to order these in to categories, for instance one
> category where count>=0 & count<10 and so on..
>
> I want my data to turn out looking something like:
>
> male english 0-10 1324
> male english 11-20 756
> .....
> male spanish 0-10 354
> ...
> female english 0-10 1557
> ...
>
> and so on, where the right hand is the count of the number of people in
> each category.
> Up until now I've been subsetting the data frame into each category, and
> then counting number of rows in each subset. However I now have a large
> amount of different factor combinations which makes this process tedious.
>
> Any help would be appreciated!
Hi Chris,
As luck would have it, I have been working on a very similar problem, 
that of graphically representing multi-level summaries. What you could 
do is to create a new factor variable with the "cut" function (say, 
"countcut"), then call the "by" function like this:

by(mydf, list(mydf$sex, mydf$language, mydf$countcut), nrow)

You will not get the format you have specified, but you will get the 
numbers that can be reformatted.

Jim

Petr PIKAL

2010-Mar-26 11:05 UTC


[R] Re: Creating a vector of categories

Hi

r-help-bounces at r-project.org wrote on 26.03.2010 10:41:29:
> Hi,
> 
> I have a column in a data frame looking something like:
> 
> $sex $language $count
> male  english  0
> male  english  0
> female  english  32
> male  spanish  154
> female  english  11
> female  norweigan 7
> 
> and so on.
> What I want to do is to order these in to categories, for instance one
> category where count>=0 & count<10 and so on..
Break your counts into desired levels, 
see ?cut
cut(1:100, breaks=10)
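
As a small illustration of the two ways of calling cut(): a single number for breaks asks for that many equal-width intervals over the observed range, while a vector of break points gives bins with fixed edges. This is only a sketch; the count vector is copied from the sample rows in the question, and the upper limit of 160 and the label strings are assumptions made up for the example.

```r
# Counts taken from the six sample rows in the question
count <- c(0, 0, 32, 154, 11, 7)

# breaks = 10 splits the observed range into 10 equal-width intervals
equal_width <- cut(count, breaks = 10)

# An explicit vector of break points gives bins of width 10;
# right = FALSE makes each bin closed on the left: [0,10), [10,20), ...
fixed <- cut(count, breaks = seq(0, 160, by = 10), right = FALSE)

# Optional custom labels (one fewer than the number of break points)
labelled <- cut(count, breaks = c(0, 10, 20, Inf), right = FALSE,
                labels = c("0-9", "10-19", "20+"))

print(table(labelled))
```

The explicit break points are usually the better fit for a question like this one, because the bin edges do not move when the data change.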

> 
> I want my data to turn out looking something like:
> 
> male english 0-10 1324
> male english 11-20 756
> .....
> male spanish 0-10 354
> ...
> female english 0-10 1557
> ...
Aggregate your data:

with(your.data, aggregate(count, list(sex, language, cutted.count), length))

Regards
Petr


Sharpie

2010-Mar-26 11:40 UTC


[R] Creating a vector of categories

Christoffer Karlsson wrote:
>
> Hi,
> 
> I have a column in a data frame looking something like:
> 
> $sex $language $count
> male  english  0
> male  english  0
> female  english  32
> male  spanish  154
> female  english  11
> female  norweigan 7
> 
> and so on.
> What I want to do is to order these in to categories, for instance one
> category where count>=0 & count<10 and so on..
> 
> I want my data to turn out looking something like:
> 
> male english 0-10 1324
> male english 11-20 756
> .....
> male spanish 0-10 354
> ...
> female english 0-10 1557
> ...
> 
> and so on, where the right hand is the count of the number of people in
> each category.
> Up until now I've been subsetting the data frame into each category, and
> then counting number of rows in each subset. However I now have a large
> amount of different factor combinations which makes this process tedious.
> 
> Any help would be appreciated!
> Chris
> 
You can quickly assign a category to each row in your data frame with the
cut() function:

  testData <- structure(list(
    sex = structure(c(2L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 2L),
      .Label = c("female", "male"), class = "factor"),
    language = structure(c(1L, 1L, 1L, 3L, 1L, 2L, 3L, 3L, 1L),
      .Label = c("english", "norweigan", "spanish"), class = "factor"),
    count = c(0L, 0L, 32L, 154L, 11L, 7L, 3L, 5L, 2L)),
    .Names = c("sex", "language", "count"),
    class = "data.frame", row.names = c(NA, -9L))

  binMax <- ceiling( max(testData$count) / 10 ) * 10
  binBreaks <- seq( 0, binMax, by = 10 )

  testData$bin <- cut( testData$count, binBreaks, include.lowest = TRUE )

And then as Petr said:

  with( testData, aggregate(count, list(sex, language, bin), length))
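
For reference, here is a self-contained sketch of the whole sequence, with the same nine rows rebuilt via data.frame() so it can be pasted on its own. The formula call to aggregate() and the table() variant are alternatives added for illustration, not part of the replies above; the names result and all_cells are made up.

```r
# Same nine rows as testData, built with data.frame() for readability
testData <- data.frame(
  sex      = c("male", "male", "female", "male", "female",
               "female", "male", "female", "male"),
  language = c("english", "english", "english", "spanish", "english",
               "norweigan", "spanish", "spanish", "english"),
  count    = c(0L, 0L, 32L, 154L, 11L, 7L, 3L, 5L, 2L)
)

# Bin edges at 0, 10, ..., up to the next multiple of 10 above the maximum
binMax    <- ceiling(max(testData$count) / 10) * 10
binBreaks <- seq(0, binMax, by = 10)
testData$bin <- cut(testData$count, binBreaks, include.lowest = TRUE)

# Formula interface: keeps the column names and returns one row per
# combination that actually occurs; 'count' holds the number of rows
result <- aggregate(count ~ sex + language + bin, data = testData,
                    FUN = length)
print(result)

# table() also counts rows and keeps the empty combinations as zeros
all_cells <- as.data.frame(with(testData, table(sex, language, bin)),
                           responseName = "n")
```

The aggregate() result is closest to the layout asked for in the question; the table() version may be preferable if combinations with zero people should be listed too.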


Hope this helps!

-Charlie

-----
Charlie Sharpsteen
Undergraduate-- Environmental Resources Engineering
Humboldt State University

Sharpie

2010-Mar-26 11:44 UTC


[R] Creating a vector of categories

Sharpie wrote:
>
>   testData$bin <- cut( testData$count, binBreaks, include.lowest = TRUE )
> 
I also made a slight mistake: you will want to replace include.lowest = TRUE
with right = FALSE in the call to cut() to preserve the
greater-than-or-equal boundary at the lower end of each bin.
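
A short sketch of how the two arguments treat values that sit exactly on a break point. The vector x and the two-bin breaks are made up for illustration. One caveat that seems worth noting: with right = FALSE alone, a value equal to the largest break falls outside every bin and becomes NA, which could happen whenever the maximum count is an exact multiple of 10.

```r
x      <- c(0, 9, 10, 11, 20)
breaks <- seq(0, 20, by = 10)

# Right-closed bins (the default), with the lowest break included:
# levels are [0,10] and (10,20], so 10 is grouped with 0 and 9
right_closed <- cut(x, breaks, include.lowest = TRUE)

# Left-closed bins, i.e. count >= lower & count < upper:
# levels are [0,10) and [10,20), so 10 starts the second bin,
# but 20 is not covered and becomes NA
left_closed <- cut(x, breaks, right = FALSE)

# With right = FALSE, include.lowest = TRUE closes the last bin instead,
# giving [0,10) and [10,20], so 20 is kept
left_closed_all <- cut(x, breaks, right = FALSE, include.lowest = TRUE)

print(data.frame(x, right_closed, left_closed, left_closed_all))
```

Passing both right = FALSE and include.lowest = TRUE is one way to keep the lower-bound behaviour while not losing a maximum that lands on the last break.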

Sorry if that caused any confusion!

-Charlie

