thr3ads.net - R help - [R] aggregate by part of a field [Mar 2011]

Hui Du

2011-Mar-10 02:02 UTC

[R] aggregate by part of a field

Hi All,

I have a data frame like this:

a = data.frame(date = c(20081201, 20081202, 20081201),
               product = c("a b c d e", "a b c g h t", "d e h a c e h g"),
               sales = c(1, 2, 3))

Now I want to aggregate the sales by part of a$product. 'product' is
the product name, a string of items separated by spaces. The key for my
aggregate call is the first three items of the 'product' field; in my
example the keys are "a b c", "a b c" and "d e h", respectively. Do you
know how to do this? I thought of an awkward way that needs several
function calls (strsplit, lapply, paste, etc.) to manipulate the string
in the 'product' field. I guess there is a more elegant way to do it.

                Thanks in advance.


HXD


Dennis Murphy

2011-Mar-10 07:29 UTC


[R] aggregate by part of a field

Hi:

Here's one approach, although I imagine there are more efficient ways.

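# A function to strip the spaces and return the first three remaining
# characters of a string (these equal the first three items only when
# every item is a single character, as in the example data)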
keyset <- function(x) substr(gsub(' ', '', x)[1], 1, 3)

# Apply the function to the data frame to generate the key:
a$key <- sapply(a$product, keyset)
> a
      date         product sales key
1 20081201       a b c d e     1 abc
2 20081202     a b c g h t     2 abc
3 20081201 d e h a c e h g     3 deh

# Use aggregate to sum sales by key:
aggregate(sales ~ key, data = a, FUN = sum)
  key sales
1 abc     3
2 deh     3
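
keyset() keys on the first three characters left after the spaces are removed, so it matches "the first three items" only because each item in the example is one character long. Below is a minimal sketch of a more general key, assuming items can be longer than one character. first_items() is a hypothetical helper name, not something from the original post, and trimws() assumes R 3.2.0 or later.

```r
# Hypothetical helper: build a key from the first n space-separated items.
first_items <- function(x, n = 3) {
  # as.character() guards against factor input; trimws() drops stray
  # leading/trailing blanks so strsplit() yields no empty first item
  parts <- strsplit(trimws(as.character(x)), " +")
  vapply(parts, function(p) paste(head(p, n), collapse = " "), character(1))
}

a <- data.frame(
  date    = c(20081201, 20081202, 20081201),
  product = c("a b c d e", "a b c g h t", "d e h a c e h g"),
  sales   = c(1, 2, 3)
)

a$key <- first_items(a$product)

# Sum sales within each key
res <- aggregate(sales ~ key, data = a, FUN = sum)
print(res)
```

For the example data this groups rows 1 and 2 under "a b c" and row 3 under "d e h", the same grouping as the keyset() version, but the keys keep their spaces and do not depend on item length.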

HTH,
Dennis
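
On the original request for something shorter than a strsplit/lapply/paste chain, one possible alternative is a single regular-expression substitution. This is only a sketch under stated assumptions: the pattern and the variable names key and totals are illustrative, not from the thread. Strings with fewer than three items are returned unchanged, and any repeated spaces between the first three items are kept in the key.

```r
a <- data.frame(
  date    = c(20081201, 20081202, 20081201),
  product = c("a b c d e", "a b c g h t", "d e h a c e h g"),
  sales   = c(1, 2, 3)
)

# Capture the first three whitespace-separated items and drop the rest.
# If the pattern does not match (fewer than three items), sub() returns
# the input unchanged.
key <- sub("^((\\S+\\s+){2}\\S+).*$", "\\1", as.character(a$product), perl = TRUE)

# Sum sales within each key; tapply() returns a named vector
totals <- tapply(a$sales, key, sum)
print(totals)
```

tapply() is used here only to keep the sketch short; aggregate() with the same key would produce a data frame instead of a named vector.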

