Forgot to cc: to list
----- Forwarded Message -----
From: arun <smartpink111 at yahoo.com>
To: Marc Schwartz <marc_schwartz at me.com>
Cc: Barry King <barry.king at qlx.com>; Cc: Barry King <barry.king at
qlx.com>
Sent: Friday, March 15, 2013 3:41 PM
Subject: Re: [R] Help finding first value in a BY group
Thanks Marc for catching that.
You could also use ?ave()?
#unsorted
PeriodSKUForecast[as.logical(with(PeriodSKUForecast,ave(Period,SKU,FUN=function(x)
x==min(x)))),-1]
#? SKU Forecast
#1? A1?????? 99
#4? X4?????? 63
#9? K2????? 207
#sorted
NewDF[as.logical(with(NewDF,ave(Period,SKU,FUN=function(x) x==min(x)))),-1]
#SKU Forecast
#1? A1?????? 99
#9? K2????? 207
#4? X4?????? 63
A.K.
----- Original Message -----
From: Marc Schwartz <marc_schwartz at me.com>
To: arun <smartpink111 at yahoo.com>
Cc: Barry King <barry.king at qlx.com>; R help <r-help at
r-project.org>
Sent: Friday, March 15, 2013 2:56 PM
Subject: Re: [R] Help finding first value in a BY group
Hi,
There is a potential gotcha with the approach of using head(..., 1) in each of
the solutions that Arun has below, which is the assumption that the data is
sorted, as is the case in the example data. It seems reasonable to consider that
the real data at hand may not be entered in order or presorted.
If the data is not sorted (switching the order of the two K2 related entries):
Period <- c(1, 2, 3, 1, 2, 3, 4, 2, 1)
Forecast <- c(99, 103, 128, 63, 69, 72, 75, 201, 207)
SKU <-
c("A1","A1","A1","X4","X4","X4","X4","K2","K2")
PeriodSKUForecast <- data.frame(Period, SKU, Forecast)
> PeriodSKUForecast
? Period SKU Forecast
1? ? ? 1? A1? ? ?? 99
2? ? ? 2? A1? ? ? 103
3? ? ? 3? A1? ? ? 128
4? ? ? 1? X4? ? ?? 63
5? ? ? 2? X4? ? ?? 69
6? ? ? 3? X4? ? ?? 72
7? ? ? 4? X4? ? ?? 75
8? ? ? 2? K2? ? ? 201
9? ? ? 1? K2? ? ? 207
> with(PeriodSKUForecast,tapply(Forecast,SKU,head,1))
A1? K2? X4
99 201? 63
> aggregate(Forecast~SKU,data=PeriodSKUForecast,head,1)
? SKU Forecast
1? A1? ? ?? 99
2? K2? ? ? 201
3? X4? ? ?? 63
Note that the wrong value for K2 is returned.
You would either have to pre-sort the data frame before using these approaches:
NewDF <- PeriodSKUForecast[with(PeriodSKUForecast, order(SKU, Period)), ]
> NewDF
? Period SKU Forecast
1? ? ? 1? A1? ? ?? 99
2? ? ? 2? A1? ? ? 103
3? ? ? 3? A1? ? ? 128
9? ? ? 1? K2? ? ? 207
8? ? ? 2? K2? ? ? 201
4? ? ? 1? X4? ? ?? 63
5? ? ? 2? X4? ? ?? 69
6? ? ? 3? X4? ? ?? 72
7? ? ? 4? X4? ? ?? 75
> with(NewDF,tapply(Forecast,SKU,head,1))
A1? K2? X4
99 207? 63
Or consider an approach that does not depend upon the sort order, but which
subsets based upon the minimum value of Period for each SKU:
do.call(rbind, lapply(split(PeriodSKUForecast, PeriodSKUForecast$SKU),
? ? ? ? ? ? ? ? ? ? ? function(x) x[which.min(x$Period), ]))
?? Period SKU Forecast
A1? ? ? 1? A1? ? ?? 99
K2? ? ? 1? K2? ? ? 207
X4? ? ? 1? X4? ? ?? 63
or remove the Period column if you don't want it:
> do.call(rbind, lapply(split(PeriodSKUForecast, PeriodSKUForecast$SKU),
? ? ? ? ? ? ? ? ? ? ? ? function(x) x[which.min(x$Period), -1]))
?? SKU Forecast
A1? A1? ? ?? 99
K2? K2? ? ? 207
X4? X4? ? ?? 63
Regards,
Marc Schwartz
On Mar 15, 2013, at 12:37 PM, arun <smartpink111 at yahoo.com> wrote:
> Hi,
> Try:
> data.frame(Forecast=with(PeriodSKUForecast,tapply(Forecast,SKU,head,1)))
> #?? Forecast
> #A1? ? ?? 99
> #K2? ? ? 207
> #X4? ? ?? 63
>
> #or
>? aggregate(Forecast~SKU,data=PeriodSKUForecast,head,1)
> #? SKU Forecast
> #1? A1? ? ?? 99
> #2? K2? ? ? 207
> #3? X4? ? ?? 63
>
> #or
> library(plyr)
> ddply(PeriodSKUForecast,.(SKU),summarise, Forecast=head(Forecast,1))
> #? SKU Forecast
> #1? A1? ? ?? 99
> #2? K2? ? ? 207
> #3? X4? ? ?? 63
> A.K.
>
>
>
>
> ----- Original Message -----
> From: Barry King <barry.king at qlx.com>
> To: r-help at r-project.org
> Cc:
> Sent: Friday, March 15, 2013 1:30 PM
> Subject: [R] Help finding first value in a BY group
>
> I have a large Excel file with SKU numbers (stock keeping units) and
> forecasts which can be mimicked with the following:
>
> Period <- c(1, 2, 3, 1, 2, 3, 4, 1, 2)
> SKU <-
c("A1","A1","A1","X4","X4","X4","X4","K2","K2")
> Forecast <- c(99, 103, 128, 63, 69, 72, 75, 207, 201)
> PeriodSKUForecast <- data.frame(Period, SKU, Forecast)
> PeriodSKUForecast
>
>?? Period SKU Forecast
> 1? ? ? 1? A1? ? ?? 99
> 2? ? ? 2? A1? ? ? 103
> 3? ? ? 3? A1? ? ? 128
> 4? ? ? 1? X4? ? ?? 63
> 5? ? ? 2? X4? ? ?? 69
> 6? ? ? 3? X4? ? ?? 72
> 7? ? ? 4? X4? ? ?? 75
> 8? ? ? 1? K2? ? ? 207
> 9? ? ? 2? K2? ? ? 201
>
> I need to create a matrix with only the first forecast for each SKU:
>
> A1 99
> X4 63
> K2 207
>
> The Period for the first forecast will always be the minimum value
> for an SKU.
>
> Can anyone suggest how I might accomplish this?
>
> Thank you,